Create keyword dataset in the field of environmental sustainability and technology #782

Open
Ly0n opened this issue Aug 15, 2024 · 0 comments
Labels
good first issue Good for newcomers help wanted Extra attention is needed

Ly0n commented Aug 15, 2024

For various future projects and ideas that we are developing, we need a comprehensive dataset of keywords in the field of environmental sustainability and technology.

Our keyword dataset could also be very useful for other projects and organisations.

  1. Create an isolated repository that generates a CSV file of keywords.
  2. Inputs should be a blacklist of words and the ecosyste.ms API. As a first step, keywords could be derived from project descriptions; in the future they could also be drawn from the README or other documentation files.
  3. Create a release process with GitHub Actions that produces an up-to-date artefact of the current keywords.
  4. Integrate other keyword sources so that a combined keyword dataset can be created for further use cases.
  5. Create a blog post about your findings and about the first release of the dataset.
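
Steps 1 and 2 could be sketched roughly as follows. This is a minimal sketch under assumptions, not the project's implementation: the `descriptions` list stands in for project descriptions that would come from the ecosyste.ms API (the fetch is left out because the endpoint is not specified in this issue), and `BLACKLIST`, `extract_keywords`, and `keywords_to_csv` are hypothetical names chosen for illustration.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical blacklist; the real one would be maintained in the repository.
BLACKLIST = {"the", "a", "an", "and", "for", "of", "to", "in", "with", "tool"}


def extract_keywords(descriptions, blacklist=BLACKLIST, min_length=3):
    """Count lowercase word tokens across descriptions, skipping blacklisted words."""
    counts = Counter()
    for text in descriptions:
        # Tokens start with a letter and may contain digits or hyphens.
        tokens = re.findall(r"[a-z][a-z0-9-]*", text.lower())
        counts.update(
            t for t in tokens if len(t) >= min_length and t not in blacklist
        )
    return counts


def keywords_to_csv(counts):
    """Serialise keyword counts as CSV text, most frequent keywords first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["keyword", "count"])
    # Sort by descending count, then alphabetically for a stable output.
    for keyword, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([keyword, count])
    return buffer.getvalue()


# Placeholder descriptions, invented for this example only.
descriptions = [
    "A tool for modelling wind energy",
    "Open data for solar energy and wind forecasting",
]
counts = extract_keywords(descriptions)
csv_text = keywords_to_csv(counts)
print(csv_text)
```

Sorting by count and then by keyword keeps the CSV output deterministic, which makes diffs between releases easier to review.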

Here are some keyword sources we have used so far. Further keyword sources should be investigated:

https://www.earthdata.nasa.gov/learn/find-data/idn/gcmd-keywords
https://blogs.reading.ac.uk/weather-and-climate-at-reading/2021/whats-that-data-why-and-how-the-geoscientific-community-is-forging-metadata-standards/
https://opensustain.tech/#taxonomy-and-ontology
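
Combining keywords from several sources into one dataset could look roughly like this. It is a sketch under assumptions: `merge_keyword_sources` is a hypothetical helper, the source names and sample keywords are placeholders rather than real entries from the lists above, and normalisation is limited to lowercasing and collapsing whitespace.

```python
from collections import defaultdict


def merge_keyword_sources(sources):
    """Map each normalised keyword to the sorted list of sources containing it."""
    merged = defaultdict(set)
    for source_name, keywords in sources.items():
        for keyword in keywords:
            # Lowercase and collapse runs of whitespace so variants merge.
            normalised = " ".join(keyword.lower().split())
            if normalised:
                merged[normalised].add(source_name)
    return {kw: sorted(names) for kw, names in sorted(merged.items())}


# Placeholder inputs, invented for this example only.
sources = {
    "descriptions": ["wind", "Solar", "energy"],
    "external_list": ["solar", "Sea  Ice"],
}
combined = merge_keyword_sources(sources)
for keyword, origins in combined.items():
    print(keyword, origins)
```

Keeping the originating sources next to each keyword makes it possible to filter the combined dataset by provenance later.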

The first experiments investigating this issue can be found here: https://colab.research.google.com/github/protontypes/osta/blob/main/topic_extraction.ipynb

Ly0n added the "help wanted" and "good first issue" labels on Aug 15, 2024
Ly0n changed the title from "Create Keyword dataset in the field of environmental sustainability and technology" to "Create keyword dataset in the field of environmental sustainability and technology" on Aug 15, 2024