Improve usability #2

Open
wants to merge 1 commit into
base: main


@cthoyt cthoyt commented Aug 3, 2021

This PR does two things:

  1. It adds three utility functions to make it easier to load corpus lists from arbitrary data structures (instead of relying on the internal state of a file path):
  • pubtator_loader.from_lines for an arbitrary iterable of strings
  • pubtator_loader.from_path for any path or path-like object
  • pubtator_loader.from_gz for a path or path-like object pointing to a file that needs to be gunzipped.

It also updates the example in the README to reflect new usage.
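
For readers who want a feel for how the three entry points could layer on top of each other, here is a minimal sketch. It is not the code from this PR: the function names match the ones listed above, but the bodies, the dictionary-based return value, and the simplified PubTator parsing (title and abstract lines of the form `ID|t|...` and `ID|a|...`, tab-separated annotation lines, blank line between documents) are assumptions made for illustration.

```python
import gzip
from pathlib import Path
from typing import Iterable, List, Union


def from_lines(lines: Iterable[str]) -> List[dict]:
    """Group PubTator-style lines into documents (simplified, hypothetical parser)."""
    docs: List[dict] = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line ends the current document, if one is open.
            if current is not None:
                docs.append(current)
                current = None
            continue
        if current is None:
            current = {"id": None, "title": None, "abstract": None, "annotations": []}
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[1] in ("t", "a"):
            # Title ("t") or abstract ("a") line.
            current["id"] = parts[0]
            current["title" if parts[1] == "t" else "abstract"] = parts[2]
        else:
            # Anything else is treated as a tab-separated annotation line.
            current["annotations"].append(line.split("\t"))
    if current is not None:
        docs.append(current)
    return docs


def from_path(path: Union[str, Path]) -> List[dict]:
    """Read a plain-text file and delegate to from_lines."""
    with open(path, encoding="utf-8") as file:
        return from_lines(file)


def from_gz(path: Union[str, Path]) -> List[dict]:
    """Read a gzipped file in text mode and delegate to from_lines."""
    with gzip.open(path, mode="rt", encoding="utf-8") as file:
        return from_lines(file)
```

The point of this layering is that `from_lines` accepts any iterable of strings, so the path-based helpers only have to open the file and hand the handle over.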

  2. It updates imports and type annotations for spacy in pubtator_document.py so users can use this code even if spacy isn't installed.
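
One common way to keep type annotations for an optional dependency is to import it only under `typing.TYPE_CHECKING`. The sketch below shows that general pattern and is not the actual diff to pubtator_document.py; the `ExampleDocument` class and its `tokenize` method are hypothetical names used only for illustration.

```python
from typing import TYPE_CHECKING, List

if TYPE_CHECKING:
    # Evaluated only by static type checkers, never at runtime,
    # so importing this module does not require spacy to be installed.
    from spacy.language import Language


class ExampleDocument:
    """Hypothetical stand-in for a document class that can optionally use spacy."""

    def __init__(self, text: str) -> None:
        self.text = text

    def tokenize(self, nlp: "Language") -> List[str]:
        # The annotation is a string, so it is not resolved at runtime.
        # The caller supplies the pipeline, e.g. one created with spacy.load(...).
        return [token.text for token in nlp(self.text)]
```

With this arrangement, only users who actually call the spacy-dependent method need spacy installed; everyone else can import the module freely.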

If you're able to merge this, it would be great to make a new release too so I could clean up the dependencies in my code. Thanks for making a nice library for us!
