Hi!
I would like to use the warcit library as an API rather than through the command line. Right now warcit only walks through files on disk and parses them as web pages, but I want to pass in a dictionary of pages with the URLs as keys and the page contents as values. Is there any way to do this with the library as it is? If not, I am working on a revision of the library that adds this functionality. Would you be open to accepting it if I submit it as a pull request once it is completed?
gshag changed the title from "Using warcit through API rather than terminal" to "Using warcit through API rather than command line" on Jul 3, 2018
Hey @gshag, I have submitted a pull request that allows for the input of a CSV file that maps files to URLs, see #2 — would that help with what you're trying to accomplish?
Hey @despens - thank you for your suggestion! While the CSV file feature is useful in general for specifying properties of individual files, it isn't what I'm looking for. I have a dictionary containing elements of the form {URL: fileContents}, and my main aim is to avoid writing to and reading from files, since those would be expensive operations at the scale of data I'm working with. That's why I was hoping for a feature that lets me pass the dictionary directly, without the intermediate step of creating files. The CSV feature still assumes that I have a set of files on disk containing the web page information, so it's not quite what I want.
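To make the request concrete, here is a rough sketch of the kind of interface I have in mind. It is not warcit's API: `build_warc_record` and `pages_to_warc` are hypothetical names, it uses only the Python standard library, and it assumes each page can be stored as an uncompressed WARC/1.0 `resource` record with a `text/html` content type. The point is only that the dictionary is serialized straight into an in-memory buffer, with no intermediate files.

```python
import io
import uuid
from datetime import datetime, timezone


def build_warc_record(url, content, content_type="text/html"):
    """Serialize one page as an uncompressed WARC/1.0 'resource' record.

    Hypothetical helper for illustration; not part of warcit.
    """
    # Accept either str or bytes page contents.
    payload = content.encode("utf-8") if isinstance(content, str) else content
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    header_lines = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {now}",
        f"WARC-Target-URI: {url}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Headers end with a blank line; each record ends with two CRLFs.
    head = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    return head + payload + b"\r\n\r\n"


def pages_to_warc(pages, out):
    """Write every {url: contents} entry to the binary stream `out`.

    Returns the number of records written. Hypothetical helper.
    """
    for url, content in pages.items():
        out.write(build_warc_record(url, content))
    return len(pages)


# Example data; in practice this would be the large in-memory dictionary.
pages = {
    "http://example.com/": "<html><body>Hello</body></html>",
    "http://example.com/about": "<html><body>About</body></html>",
}

buffer = io.BytesIO()  # stays in memory, nothing touches the disk
count = pages_to_warc(pages, buffer)
warc_bytes = buffer.getvalue()
```

Because `pages_to_warc` only needs an object with a `write` method, the same call would work with a real file opened in binary mode if the output does need to be persisted at the end.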