Using warcit through API rather than command line #9

Open
gshag opened this issue Jul 3, 2018 · 2 comments
Comments

@gshag

gshag commented Jul 3, 2018

Hi!
I would like to use warcit as a library, through an API rather than the command line. As it stands, warcit only walks through files on disk and treats them as web pages, but I want to pass in a dictionary of pages with the URLs as keys and the page contents as values. Is there any way to do this with the library as it is? If not, I am working on a revision of the library that adds this functionality. Would you be open to accepting it if I submit it as a pull request once it is complete?

@gshag gshag changed the title Using warcit through API rather than terminal Using warcit through API rather than command line Jul 3, 2018
@despens
Contributor

despens commented Jul 4, 2018

Hey @gshag, I have submitted a pull request that allows a CSV file to be supplied that maps files to URLs; see #2. Would that help with what you're trying to accomplish?

@gshag
Author

gshag commented Jul 5, 2018

Hey @despens - thank you for your suggestion! While the CSV file feature is useful in general for specifying properties of individual files, it isn't what I'm looking for. I have a dictionary containing elements of the form {URL:fileContents}, and my main aim is to avoid writing to and reading from files, since those would be expensive operations at the scale of data I'm working with. That's why I was hoping for a feature that lets me pass the dictionary directly, without the intermediate step of creating files. The CSV feature still assumes that I have a set of files on disk containing the web page data, so it is not quite what I want.
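
To make the request concrete, here is a minimal sketch of the kind of in-memory conversion described above. It does not use warcit's own code: the function name `write_warc_from_dict`, its parameters, and the example URLs are all hypothetical, and it writes plain WARC/1.0 `resource` records by hand using only the Python standard library. A real implementation would more likely go through a WARC library such as warcio, and would need per-page content types rather than a single default.

```python
import io
import uuid
from datetime import datetime, timezone


def write_warc_from_dict(pages, out, content_type="text/html"):
    """Write one WARC 'resource' record per {url: content} entry.

    Hypothetical helper, not part of warcit. `out` is any binary stream,
    so nothing has to touch the filesystem.
    """
    for url, content in pages.items():
        # Accept either str or bytes as the page content.
        body = content.encode("utf-8") if isinstance(content, str) else content
        now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        headers = [
            "WARC/1.0",
            "WARC-Type: resource",
            f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
            f"WARC-Date: {now}",
            f"WARC-Target-URI: {url}",
            f"Content-Type: {content_type}",
            f"Content-Length: {len(body)}",
        ]
        # Header block, blank line, record body, then two CRLFs to end the record.
        out.write("\r\n".join(headers).encode("utf-8") + b"\r\n\r\n")
        out.write(body)
        out.write(b"\r\n\r\n")
    return len(pages)


# Example input in the {URL: fileContents} shape described above (made-up data).
pages = {
    "http://example.com/": "<html><body>home</body></html>",
    "http://example.com/about": "<html><body>about</body></html>",
}
buf = io.BytesIO()
count = write_warc_from_dict(pages, buf)
warc_bytes = buf.getvalue()
```

Because the output target is just a binary stream, the same function could write to an `io.BytesIO` buffer, a gzip stream, or an open file, which is the flexibility the dictionary-based API is meant to provide.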
