stale doi issue -> sparcron needs to add logic to check discover for publication events #91

Open
tgbugs opened this issue Dec 10, 2022 · 2 comments

Comments

@tgbugs
Collaborator

tgbugs commented Dec 10, 2022

I currently do not embed non-resolving dois in the export. However, because the dataset modified date does not get bumped when a dataset is published (I think it should be), we never detect that there was a change. This means we probably need to start embedding non-resolving dois. There is some old logic lurking that assumes the presence of a doi means that it has resolved at least once and that the dataset has actually been published. That assumption is scattered around in a bunch of places.

This also means that we need to add an additional check during our combination step to see whether a dataset has actually been published before we place it in the curation-export-published.ttl pile.
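A rough sketch of what that extra check could look like. Everything here is hypothetical: `split_by_publication`, the `id` key, and the `is_published` callable are illustrative names, not existing sparcron code, and the lookup itself (probably against discover) is left to the caller.

```python
def split_by_publication(datasets, is_published):
    """Partition dataset records into (published, unpublished).

    datasets: iterable of dicts with an "id" key (assumed shape).
    is_published: callable taking a dataset id and returning bool;
        how it decides is up to the caller (hypothetical helper).
    """
    published, unpublished = [], []
    for dataset in datasets:
        # Do not infer publication from the presence of a doi;
        # ask the explicit check instead.
        if is_published(dataset["id"]):
            published.append(dataset)
        else:
            unpublished.append(dataset)
    return published, unpublished
```

Only the first list would go into the curation-export-published.ttl pile; a dataset with an embedded but non-resolving doi would land in the second.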

Further, it means that we will no longer be able to determine publication status reliably using the json export.

I think this means that the discover database is actually the source of truth for this information. So I think I can set up a way to rerun a dataset export without fetching any new data, which would allow us to run the export again when a publication happens. Then we wouldn't need to change the way we currently handle dois. This seems the safest option.
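Roughly, the detection side could compare what discover reports against what we saw at the last export. This is only a sketch under assumptions: both mappings and the function name are made up for illustration, and how the discover state is actually fetched is not shown.

```python
def datasets_needing_reexport(discover_versions, exported_versions):
    """Return sorted ids of datasets whose published version changed.

    discover_versions: dataset id -> latest published version according
        to discover, or None if never published (assumed shape).
    exported_versions: dataset id -> published version recorded at the
        last export; missing ids mean no published version was seen.
    """
    stale = []
    for dataset_id, version in discover_versions.items():
        if version is None:
            # Not published, so no publication event to react to.
            continue
        if exported_versions.get(dataset_id) != version:
            stale.append(dataset_id)
    return sorted(stale)
```

Each id returned would then get an export rerun from already-fetched data, so the dataset modified date never has to change for us to notice a publication.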

@jgrethe

jgrethe commented Dec 12, 2022

'dataset modified date' tracks changes that have happened to the data itself. Not sure a publication event counts there. For example, there may be many changes to the publication status with no changes to the data. If those bumped the modified date, there would be no reliable timestamp for the update to the data itself.

@tgbugs
Collaborator Author

tgbugs commented Mar 7, 2023

A related issue here is that the -published files do not update as expected.
