#112 Parse epub instead of a mix of ppub and epub #141
In general, the paper publication date is more relevant. Otherwise, authors whose articles were published in the 1970s suddenly appear to be still publishing. But it would be great to have a new attribute:
Also, thanks for already updating the tests!
Two more changes please.
Let's improve the function one more time.
… different functions. The new date format avoids a trailing '-'. The publication year is now returned as an int (or None).
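As a minimal sketch of the date format described in this commit (not the actual code from the PR), the snippet below joins only the date parts that are present, so a missing month or day never leaves a trailing '-'. The function name `format_pub_date` and its parameters are hypothetical.

```python
def format_pub_date(year, month=None, day=None):
    """Join the available date parts with '-', skipping missing ones.

    Hypothetical helper for illustration; not the function from this PR.
    """
    parts = []
    for part in (year, month, day):
        # Stop at the first missing part so the result is always a valid
        # prefix: "YYYY", "YYYY-MM", or "YYYY-MM-DD".
        if part is None or str(part).strip() == "":
            break
        parts.append(str(part).strip())
    return "-".join(parts)
```

Stopping at the first missing part is a design choice made here for the sketch: a day without a month is dropped rather than producing an ambiguous string.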
Simplify the year to int conversion
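One possible shape for a simplified conversion, assuming the year arrives as text (or None) from the XML, is a single `try`/`except` around `int()`. The name `year_to_int` is hypothetical and not taken from the PR.

```python
def year_to_int(year_text):
    """Return the year as an int, or None if it is missing or not numeric.

    Hypothetical helper for illustration only.
    """
    try:
        return int(year_text)
    except (TypeError, ValueError):
        # TypeError covers None; ValueError covers '' and non-numeric text.
        return None
```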
New commit parses …
…_dict to have empty values if node is None. This commit makes the needed changes.
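The behaviour described here could look roughly like the sketch below, which returns a dict with empty strings when the node is None. The function name `date_node_to_dict` and the child tag names `year`, `month`, and `day` are assumptions for illustration; the actual helper in the PR is only partially named above.

```python
import xml.etree.ElementTree as ET


def date_node_to_dict(node):
    """Map a date node to a dict, using empty strings for missing values.

    Hypothetical helper; the field names are assumed, not taken from the PR.
    """
    fields = ("year", "month", "day")
    if node is None:
        # No node at all: still return every key so callers need no checks.
        return {field: "" for field in fields}
    # findtext returns None for a missing child, so fall back to ''.
    return {field: (node.findtext(field) or "").strip() for field in fields}
```

Returning the same keys in both cases means downstream code can index the dict without first checking whether the node existed.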
…ub. Remark: File 3460867 has collection and epub.
The XML looks like this:
The code was mixing both elements. The new implementation parses the epub
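To illustrate the idea of selecting a single publication-date element instead of mixing several, here is a small sketch using the standard library's `xml.etree.ElementTree`. The sample XML is made up for this example and is not the content of the file mentioned above; the helper name `find_pub_date` is hypothetical, and the `pub-date` / `pub-type` names are assumed from the JATS-style markup used for PMC articles.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; not copied from any real article.
SAMPLE = """
<article-meta>
  <pub-date pub-type="collection"><year>2012</year></pub-date>
  <pub-date pub-type="epub">
    <day>17</day><month>9</month><year>2012</year>
  </pub-date>
</article-meta>
"""


def find_pub_date(root, pub_type="epub"):
    """Return the first pub-date node with the given pub-type, or None."""
    return root.find(f".//pub-date[@pub-type='{pub_type}']")


root = ET.fromstring(SAMPLE)
epub_node = find_pub_date(root)
```

Filtering on the `pub-type` attribute in the path expression keeps values from the `collection` node from leaking into the parsed date.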