Suppress less informative extractions #43
Comments
Yes, I think it would be useful to provide a means for suppressing strictly less informative extractions. In addition to an option for filtering, it would help to expose the logic that determines whether one extraction is strictly less informative than another extraction from the same sentence.
Going on a bit of a tangent and expanding a little on this idea, the following extraction comparison methods might be useful:
Perhaps these capabilities do not belong in the core Ollie library but in a separate ollie-utils library, which would provide mechanisms to manipulate and transform extractions (argument and relation string normalizations, and equality under these transformations). Applications can (and in many cases need to) write their own logic to do these things, but having a default implementation that comes with the Ollie library sounds useful to me.
Niranjan, just compare the intervals if you want this functionality. I.e.
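The snippet that originally followed this comment is not preserved here. As a rough illustration of what "compare the intervals" could mean, here is a minimal Python sketch (not Ollie's Scala API). It assumes each part of an extraction maps to one contiguous token interval in the sentence; the names `Interval`, `Extraction`, and `strictly_less_informative` are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Token span within a sentence (hypothetical type, not from Ollie)."""
    start: int  # index of the first token, inclusive
    end: int    # index one past the last token, exclusive

    def contains(self, other: "Interval") -> bool:
        # True when `other` lies entirely inside this interval.
        return self.start <= other.start and other.end <= self.end


class Extraction(NamedTuple):
    """An (arg1, rel, arg2) triple, each part reduced to its token interval."""
    arg1: Interval
    rel: Interval
    arg2: Interval


def strictly_less_informative(a: Extraction, b: Extraction) -> bool:
    """True if every part of `a` lies inside the matching part of `b`
    and at least one part of `a` differs from (is smaller than) `b`'s."""
    parts = list(zip(a, b))
    every_part_covered = all(pb.contains(pa) for pa, pb in parts)
    some_part_differs = any(pa != pb for pa, pb in parts)
    return every_part_covered and some_part_differs
```

Under this assumption two identical extractions are not "strictly" less informative than each other, so exact duplicates would need separate handling.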
But I'd love to have a normalization routine for relations or arguments. If you have some, let's talk about it sometime.
Our original intention with Ollie was to find as many correct extractions as possible. That way each application, with its own specific requirements, could write simple logic to keep only what it is interested in.
However, there often are strictly less informative extractions and these are not useful for many applications. For example, you might have the following:
The last extraction is strictly less informative than the first. While in some applications (e.g. search) we may want all three so that we have results for more queries, in others (e.g. document summarization) we don't want the third because it is redundant. Ollie should have an option for suppressing strictly less informative extractions.
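To make the requested option concrete, here is a small Python sketch of such a filter. It is only an illustration under stated assumptions, not Ollie's implementation: extractions are reduced to `(arg1, rel, arg2)` token spans, the function names and the `keep_all` flag are hypothetical, and the offsets in the usage example are invented rather than taken from the example in this issue.

```python
from typing import List, Tuple

Span = Tuple[int, int]                # (start, end) token offsets, end exclusive
Extraction = Tuple[Span, Span, Span]  # (arg1, rel, arg2)


def covers(outer: Span, inner: Span) -> bool:
    # True when `inner` lies entirely within `outer`.
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def dominated(a: Extraction, b: Extraction) -> bool:
    # `a` is strictly less informative than `b`: every part of `a` is
    # covered by the matching part of `b`, and the two are not identical.
    return a != b and all(covers(pb, pa) for pa, pb in zip(a, b))


def suppress_less_informative(
    extractions: List[Extraction], keep_all: bool = False
) -> List[Extraction]:
    """Drop extractions dominated by another extraction from the same sentence.

    `keep_all=True` models the search-style use case where every
    extraction is wanted, redundant or not.
    """
    if keep_all:
        return list(extractions)
    return [
        a for a in extractions
        if not any(dominated(a, b) for b in extractions)
    ]


# Invented offsets: the third triple repeats the first with a shorter arg2.
sentence_extractions = [
    ((0, 2), (2, 4), (4, 9)),
    ((0, 2), (2, 4), (10, 13)),
    ((0, 2), (2, 4), (4, 6)),
]
filtered = suppress_less_informative(sentence_extractions)
```

With these made-up spans, `filtered` keeps the first two triples and drops the third. The pairwise check is quadratic in the number of extractions per sentence, which should be acceptable since a single sentence rarely yields more than a handful.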