Dates covered vs dates collected #4

Open
ruebot opened this issue Dec 30, 2016 · 2 comments

Comments

@ruebot
Member

ruebot commented Dec 30, 2016

Should we make a distinction between the two in the metadata?

@edsu
Member

edsu commented Dec 30, 2016

In b73897d I added an "added" field to each dataset to indicate when the dataset was added to the catalog, which is different from the date the dataset was published.

But it sounds like you are talking about a different kind of date distinction. Are you thinking of a situation where the time period in which data collection ran does not match the time period of the tweets collected? I think this can only happen when running searches right?

@ruebot
Member Author

ruebot commented Dec 30, 2016

> Are you thinking of a situation where the time period in which data collection ran does not match the time period of the tweets collected? I think this can only happen when running searches right?

Exactly!

In the Scholars Portal Dataverse, where I put our datasets, we have the option of adding date ranges for "dates covered" and "dates collected". This works well for all the major collections I've done with twarc, since my strategy is to run both filter and search and then deduplicate the results.
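
To make the distinction concrete, here is a minimal sketch of how the two ranges could be derived for a combined filter and search collection. It assumes each tweet is a dict carrying the `id_str` and `created_at` fields in Twitter's classic timestamp format; the `dates_covered` and `dates_collected` key names and the `describe` helper are made up for illustration and are not part of the catalog or of twarc.

```python
from datetime import date, datetime

# Classic Twitter timestamp format, e.g. "Fri Dec 30 12:00:00 +0000 2016".
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def deduplicate(tweets):
    """Keep the first copy of each tweet, keyed on its id_str."""
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet["id_str"] not in seen:
            seen.add(tweet["id_str"])
            unique.append(tweet)
    return unique


def dates_covered(tweets):
    """Return the earliest and latest tweet creation dates."""
    times = [datetime.strptime(t["created_at"], TWITTER_DATE_FORMAT) for t in tweets]
    return min(times).date(), max(times).date()


def describe(tweets, collection_start, collection_end):
    """Build a hypothetical metadata record holding both date ranges."""
    unique = deduplicate(tweets)
    covered_start, covered_end = dates_covered(unique)
    return {
        "tweets": len(unique),
        "dates_covered": [covered_start.isoformat(), covered_end.isoformat()],
        "dates_collected": [collection_start.isoformat(), collection_end.isoformat()],
    }


# A search run on Dec 30 reaches back to tweets from Dec 23, and one
# tweet shows up twice (as it might when filter and search overlap).
sample = [
    {"id_str": "1", "created_at": "Fri Dec 23 10:00:00 +0000 2016"},
    {"id_str": "2", "created_at": "Fri Dec 30 12:00:00 +0000 2016"},
    {"id_str": "1", "created_at": "Fri Dec 23 10:00:00 +0000 2016"},
]
record = describe(sample, date(2016, 12, 30), date(2016, 12, 31))
```

In this sample the tweets cover Dec 23 to Dec 30, while collection only ran from Dec 30 to Dec 31, which is the mismatch that searching can introduce.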
