COL dataset with thousands of constituents #540

Open
mdoering opened this issue Dec 13, 2023 · 0 comments

The new eXtended COL Checklist on UAT lists nearly 20,000 source datasets, both in its data and as bibliographic entries in the EML. When the DwC-A is imported into the GBIF registry, this creates thousands of constituent datasets, each with its own DOI. It also effectively breaks the dataset portal page, because rendering takes minutes, if it ever finishes.

Consider various improvements:

  • aggregate sources in the COL DwC-A by the publisher used in the xcol config, e.g. Plazi, BDJ and a few others
  • do not create any DOI for constituents
  • wait until we integrate checklistbank.org with the GBIF API. Constituents in the registry might then no longer be needed at all.
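
To illustrate the first option, here is a minimal sketch of how sources could be collapsed by publisher before the archive is written. It is not the actual xcol or registry code: the function `aggregate_sources`, the `title`/`publisher` fields, and the sample records are all hypothetical, and the set of publishers to aggregate is assumed to come from the xcol config.

```python
from collections import defaultdict


def aggregate_sources(sources, aggregate_publishers):
    """Collapse sources of the selected publishers into one constituent each.

    `sources` is a list of dicts with hypothetical `title` and `publisher` keys.
    Sources from any other publisher are kept as individual constituents.
    """
    grouped = defaultdict(list)
    constituents = []
    for src in sources:
        publisher = src.get("publisher")
        if publisher in aggregate_publishers:
            # Held back and merged into a single constituent below.
            grouped[publisher].append(src)
        else:
            constituents.append(
                {"title": src["title"], "publisher": publisher, "source_count": 1}
            )
    for publisher, members in grouped.items():
        constituents.append(
            {
                "title": f"{publisher} (aggregated)",
                "publisher": publisher,
                "source_count": len(members),
            }
        )
    return constituents


# Made-up sample records, for illustration only.
sources = [
    {"title": "Treatment A", "publisher": "Plazi"},
    {"title": "Treatment B", "publisher": "Plazi"},
    {"title": "Article 1", "publisher": "BDJ"},
    {"title": "Standalone checklist", "publisher": "Some Institute"},
]

constituents = aggregate_sources(sources, aggregate_publishers={"Plazi", "BDJ"})
for c in constituents:
    print(c["title"], c["source_count"])
```

With this kind of grouping, the number of constituents scales with the number of publishers rather than the number of individual sources, which is the effect the first bullet is after.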