crossref: scholix format returning less results #70

Open
2 tasks done
slint opened this issue Aug 13, 2019 · 3 comments

Comments

@slint
Member

slint commented Aug 13, 2019

The following queries against the CrossRef Event Data API, e.g. for the Zenodo DOI prefix (10.5281), report a different number of total results depending on the response format.

  • Investigate what kind of links differ
  • If needed, refactor the harvester to use the non-Scholix endpoint

Example queries:

# Default response format from "/v1/events"
curl "https://api.eventdata.crossref.org/v1/events?obj-id.prefix=10.5281&relation-type=references&source=crossref"
{ ... "total-results": 2380, ... }

# Scholix format from "/v1/events/scholix"
curl "https://api.eventdata.crossref.org/v1/events/scholix?obj-id.prefix=10.5281&relation-type=references&source=crossref"
{ ... "total-results": 2280, ... }
@Glignos
Member

Glignos commented Aug 15, 2019

In the Scholix case, although the payload reports "total-results": 2280, the number of results actually returned seems to be much lower: 257. I can confirm that every result returned by the Scholix endpoint matches one from the events endpoint, so the surplus is only on the non-Scholix side.
As a result, it seems there is a considerable number of events that we end up not harvesting.
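
For anyone who wants to repeat the count check, here is a minimal sketch of comparing the reported "total-results" with the number of items actually returned across all pages. It assumes cursor-based paging via a "next-cursor" field in the response message and an items key such as "link-packages" (Scholix) or "events" (default format); both are assumptions, not confirmed in this thread. `fetch_page` is a hypothetical callable that performs the HTTP request and returns the decoded message dict, so the sketch itself does no network I/O.

```python
from typing import Callable, Dict, Optional, Tuple

Message = Dict[str, object]


def count_results(
    fetch_page: Callable[[Optional[str]], Message],
    items_key: str,
) -> Tuple[Optional[int], int]:
    """Return (reported total, number of items actually received).

    fetch_page(cursor) is a hypothetical helper returning the response
    "message" dict for one page; cursor is None for the first page.
    The "next-cursor" field name is an assumption about the API.
    """
    reported = None
    actual = 0
    cursor = None
    while True:
        message = fetch_page(cursor)
        if reported is None:
            # Remember the total announced on the first page only.
            reported = message.get("total-results")
        items = message.get(items_key) or []
        actual += len(items)
        cursor = message.get("next-cursor")
        # Stop on an empty page or when no further cursor is given.
        if not items or not cursor:
            return reported, actual
```

If the two numbers differ for the Scholix endpoint but agree for the default one, that would point at the endpoint's count rather than at the harvester.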

At this point, I think we should proceed with refactoring the harvester to use the non-Scholix endpoint.
Example of a missing event:

{
    "license": "https://doi.org/10.13003/CED-terms-of-use",
    "obj_id": "https://doi.org/10.5281/zenodo.153937",
    "source_token": "8676e950-8ac5-4074-8ac3-c0a18ada7e99",
    "occurred_at": "2016-09-19T00:00:00Z",
    "subj_id": "https://doi.org/10.12688/f1000research.9259.1",
    "id": "31871305-1a69-447b-82a0-d27cf1d14a00",
    "terms": "https://doi.org/10.13003/CED-terms-of-use",
    "message_action": "create",
    "source_id": "crossref",
    "timestamp": "2017-05-19T13:30:11Z",
    "relation_type_id": "references"
}
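
A rough sketch of how missing events like the one above could be listed: key each event by its (subj_id, obj_id) pair and subtract the pairs seen in the Scholix response. The field names `subj_id` and `obj_id` come from the event above; `scholix_pairs` is a hypothetical iterable of (source DOI, target DOI) tuples that the caller extracts from the Scholix payload, since the layout of that payload is not shown in this thread.

```python
from typing import Dict, Iterable, List, Tuple

# Common resolver prefixes stripped so that URL-style and bare DOIs compare equal.
DOI_RESOLVER_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def normalize_doi(value: str) -> str:
    """Lowercase a DOI (DOIs are case-insensitive) and drop a resolver prefix."""
    value = value.strip().lower()
    for prefix in DOI_RESOLVER_PREFIXES:
        if value.startswith(prefix):
            return value[len(prefix):]
    return value


def event_pair(event: Dict[str, str]) -> Tuple[str, str]:
    """Key an event by its normalized (subject, object) DOI pair."""
    return normalize_doi(event["subj_id"]), normalize_doi(event["obj_id"])


def missing_events(
    events: Iterable[Dict[str, str]],
    scholix_pairs: Iterable[Tuple[str, str]],
) -> List[Dict[str, str]]:
    """Return the events whose (subject, object) pair is absent from scholix_pairs."""
    seen = {(normalize_doi(s), normalize_doi(t)) for s, t in scholix_pairs}
    return [event for event in events if event_pair(event) not in seen]
```

Matching on the DOI pair rather than the event `id` is a design choice made here because it is unclear whether the Scholix format carries the event identifier.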

@Glignos
Member

Glignos commented Aug 16, 2019

Related PR #72

@lnielsen
Collaborator

I think this should rather be reported to CrossRef as a bug (at least it seems like it). The right guy would be @afandian (Joe Wass).
