reference extraction with HTML #2

Merged Dec 19, 2019 (11 commits)

fchrubasik
Contributor

The extractor now also accepts HTML input. This should fix issue #1, as
discussed here.
HTML handling is off by default. To enable it, pass True as the additional argument
when calling extractor.extract, or change the default value of is_html on line 47 of
extractors.py to True.
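
To illustrate the idea of an opt-in HTML flag, here is a minimal, self-contained sketch. It is not the code from extractors.py: the function name `extract_refs`, both regular expressions, and the tag-stripping approach are assumptions made for illustration only. Only the parameter name `is_html` and its non-HTML default come from this PR.

```python
import re
from html import unescape

# Hypothetical pattern for illustration: simple references such as "§ 3 BGB".
REF_PATTERN = re.compile(r"§\s*\d+[a-z]?\s+[A-ZÄÖÜ][A-Za-zÄÖÜäöü]*")
# Hypothetical, deliberately naive tag matcher (not a real HTML parser).
TAG_PATTERN = re.compile(r"<[^>]+>")


def extract_refs(content, is_html=False):
    """Return the reference strings found in content.

    is_html defaults to False, mirroring the default described in this PR.
    """
    if is_html:
        # Assumption: replace tags with spaces and decode entities before matching.
        content = unescape(TAG_PATTERN.sub(" ", content))
    # Collapse whitespace runs so results look the same for both input kinds.
    return [re.sub(r"\s+", " ", m.group(0)) for m in REF_PATTERN.finditer(content)]


plain_refs = extract_refs("Siehe § 3 BGB und § 12a StGB.")
html_refs = extract_refs("<p>§ <b>3</b> BGB</p>", True)
```

In this sketch, the markup inside the reference hides it from the plain-text path, so the same HTML string yields no match unless the flag is set.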

@malteos
Contributor

malteos commented Dec 9, 2019

Hi @fchrubasik

thank you very much for your contribution!

I would be happy to merge this, but before I do, we need to add a unit test to verify that this feature works correctly. Also, please check the existing unit tests, since they are currently failing:

https://travis-ci.org/openlegaldata/legal-reference-extraction/jobs/621724640

TypeError: extract_law_ref_markers() missing 1 required positional argument: 'is_html'
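
An error like this typically appears when a new required parameter is added to a function while existing callers still use the old argument list. The sketch below reproduces that situation with hypothetical stand-in functions (`extract_markers_strict` and `extract_markers_compatible` are invented names, not the library's code) and shows that giving the new parameter a default value keeps old calls working. Whether the PR resolved the failure this way is an assumption.

```python
def extract_markers_strict(content, is_html):
    """Hypothetical variant: is_html is required, so one-argument calls fail."""
    return {"content": content, "is_html": is_html}


def extract_markers_compatible(content, is_html=False):
    """Hypothetical variant: the default keeps one-argument calls working."""
    return {"content": content, "is_html": is_html}


try:
    # Old-style call that does not know about the new parameter.
    extract_markers_strict("§ 3 BGB")
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

# The same old-style call succeeds once the parameter has a default.
result = extract_markers_compatible("§ 3 BGB")
```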

Let me know if you need any assistance, I'm here to help!

Best,
Malte

@fchrubasik
Contributor Author

Hi @malteos,

The existing unit tests should now pass. I also added new unit tests based on the existing ones.
Let me know whether these new tests are sufficient or whether I forgot something.

Best,
Fabian

Contributor

@malteos malteos left a comment

Looks good! Thank you so much for this Christmas gift. I'll try to deploy this to our production system asap.

@malteos malteos merged commit a4cba7a into openlegaldata:master Dec 19, 2019
@malteos malteos mentioned this pull request Dec 19, 2019
@malteos
Contributor

malteos commented May 4, 2020

Finally deployed to production! Sorry for the delay.
