-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Augment sarscov2 community builds listing and include datasets in sequence search #266
Conversation
@jameshadfield Is this clear enough or should we explicitly indicate somehow whether datasets in sequence search results are from Nextstrain or the community? (edit: so far you can only get community datasets in search results like this by changing the search results jsonUrl locally, since i havent uploaded this to data.nextstrain.org yet) |
Just an idea, but could we port the same symbols you used for the map view of builds instead of showing the 'folded corner page' symbol? That would also indicate how 'focused' the build in, which might also help people decide which one might be of most help, depending on why/what they're looking for! |
creates an augmented version of static-site/content/allSARS-CoV-2Builds.yaml by fetching the actual datasets to get information about them such as date updated and number of tips to be displayed on the build listing and map.
building on: 081bd55633133e586a3dae86617564e1f2854418 , which fetched community datasets listed in static-site/content/allSARS-CoV-2Builds.yaml in order to augment that list with metadata. This takes the fetched community datasets and includes them in the sarscov2 search results database file. Still to-do is testing this on the front end.
696fee5
to
30d5558
Compare
Good idea @emmahodcroft. I think we could do this although the build level isn't immediately accessible in the strain map created for the search results since it's currently just strain -> list of dataset indexes, but we could match up those datasets names with entries from the build list that has the levels in it if we want to go with those icons. |
@jameshadfield this should be functional so I marked ready for review. However, before we merge, I still need to change a few URLs to data.nextstrain.org from staging.nextstrain.org. Just realized I also need to do this (as we discussed last week): |
Yes, good points @eharkins. I think this is an 'optional extra' rather than a dealbreaker - so if it's a lot of work do to this matching & maintain it - no worries! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @eharkins -- this is looking really good.
I think there's a number of optional extras which I'd be in favor of making issues for rather than solving in this PR:
- augmenting core datasets (as mentioned above)
- changing the search icons, if possible
- cleaning up the displayed URL in search so that we don't display the query part
The functionality and code looks good - tested it a few different ways and it handles problems well (including if the fetch fails).
Things to add / fix before merge:
- dev docs to explain that the manually curated YAML is being augmented, and that this is happening via GitHub actions (it is, right?)
- brief dev docs to explain what extra information's added and what's needed (backend + frontend) to add more here
- URLs to change (as you noted)
- The augmented YAML currently includes “block chomping indicators” (YAML is crazy when it comes to multi-line strings) which should be removed. For instance:
- url: >-
http://auspice.finlaymagui.re/ncov/north-america/canada/ontario?f_division=Ontario
name: Ontario, Canada
geo: ontario
level: division
./scripts/collect-pathogen-resources.js --pathogen seasonal-flu
fails:
Writing search data to file ./data/search_seasonal-flu.json
(node:18117) UnhandledPromiseRejectionWarning: TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string. Received undefined
at validateString (internal/validators.js:124:11)
at Object.basename (path.js:1156:5)
at main (/Users/enigma/projects/nextstrain/nextstrain.org/scripts/collect-pathogen-resources.js:96:43)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (internal/process/task_queues.js:93:5)
...
P.S. I hadn't realised that Nextstrain groups datasets were part of this YAML - it's awesome that they are 😄
scripts/collect-search-results.js reads in a list of builds in a manually maintained pathogen build catalogue yaml file such as static-site/content/allSARS-CoV-2Builds.yaml and produces an augmented version with metadata from each corresponding dataset. (see 82e072d). That augmented yaml file is stored on s3 and fetched here to populate e.g the build-map (front-end manisfestation of that pathogen build catalogue on the page) with metadata about each dataset (e.g. date updated).
this makes this component more generally applicable for when we want to have similar build catalogues for other pathogens
we have a couple of scripts that are utilities for creating and checking the format of data resources behind pages like nextstrain.org/sars-cov-2. This generalizes the code in those scripts so that they make sense when we apply them to other pathogens in the future
also update the github action using scripts/collect-pathogen-resources.js also note that upon merging this we will need to update a link in the following doc: https://docs.nextstrain.org/en/latest/guides/share/sars-cov-2.html ; the link points to: scripts/check-sars-cov-2-builds-yaml.js which is now called: scripts/check-build-catalogue-yaml.js
this takes the augmenting of the build catalogue and abstracts it into a function
fixes bug when running for seasonal-flu introduced by work on this script. Better logic overall for writing outputs and printing messages
ac1ba9e
to
eab57a7
Compare
Thanks @jameshadfield! Cleaned up the commit history a bit and took care of these three:
I also ran Docs PR to address dev docs concerns: nextstrain/docs.nextstrain.org#44 |
this directory contains components which are now generally applicable to any page that wants to display a collection of info about any grouping of builds (e.g. flu, groups, commeunity), not just for sars-cov-2.
Closing this as all commits are in #271 - Eli please reopen if i'm missing something! |
see commit messages for now, just putting this up for review since I came to a good breaking point even though there is more to do.