Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Influenza page #271

Merged
merged 30 commits into from
Mar 11, 2021
Merged

Influenza page #271

merged 30 commits into from
Mar 11, 2021

Conversation

eharkins
Copy link
Contributor

@eharkins eharkins commented Feb 10, 2021

see commit messages. WIP

many commits here should go in separately as #266 but this is a separate concept so I used master as the base branch in anticipation of that getting merged first.

@tsibley tsibley temporarily deployed to nextstrain-s-influenza--0koet5 February 10, 2021 22:15 Inactive
@eharkins
Copy link
Contributor Author

Check out https://nextstrain-s-influenza--0koet5.herokuapp.com/influenza/ if you want to see the page.

@trvrb
Copy link
Member

trvrb commented Feb 10, 2021

Awesome! Some quick notes here @eharkins.

  • https://clades.nextstrain.org/ is SARS-CoV-2-specific at the moment. I think it will work for flu soon enough, but I'd drop this link for now.
  • Likewise I think I'd drop "Nextstrain discussion forum" and "Developer contributions". The SARS-CoV-2 page is a bit special. I think generally these pages want to be pretty streamlined for the pathogen of interest.
  • The current setup with the carats requires a lot of clicking to get to pages of interest (especially for seasonal flu). My inclination here would be to start with everything expanded. Easy to scroll and find the thing of interest. This should work for influenza where the list isn't too long. I see how this wouldn't work for SARS-CoV-2.
  • The "Nextstrain Team" doesn't work as a link. I'm not sure if this was intentional? This is the case here and with the SARS-CoV-2 page.
  • Also it would be nice particular for the avian flu pages if they said maintained by "Louise Moncla" rather than "Nextstrain team"

I really like where this is headed. That said, I do think we eventually want a "flatter" dataset UI (a little like #62 that I keep linking to). Something with columns for "maintainer" and "URL" and includes icon like what would be displayed in the byline. And then these columns could be sorted. Or the whole table could be filtered. Or another thought would be something like the GitHub UI here https://github.com/blab, where each dataset gets its own line that could include title, maintained by, last updated, etc... and most of the interaction is through the filter box. The "nested" approach here works for the very hierarchical flu datasets, but I don't think it would work as well for a grab bag of different pathogens from across core builds and different groups. It would be fun to sit down and sketch out some designs at some point.

@eharkins
Copy link
Contributor Author

Thanks for the quick review @trvrb!

Added some commits to address the bullet points. The last one is actually a bug also live on https://nextstrain.org/sars-cov-2 introduced by me forgetting to remove some fields in #248. There isn't meant to be a link there (unless you think we should link somewhere) for Nextstrain Team. I'll fix this separately for that yaml since it's live.

As for the general direction, I added an agenda point to discuss this more in our meeting tomorrow.

@eharkins eharkins temporarily deployed to nextstrain-s-influenza--0koet5 February 11, 2021 01:00 Inactive
@trvrb
Copy link
Member

trvrb commented Feb 11, 2021

Looks great Eli!

@trvrb trvrb self-requested a review February 11, 2021 20:37
@joverlee521
Copy link
Contributor

Is the Nextstrain.org splash page going to link to the new /influenza page? (Sorry if I missed the discussion on this.)

Currently the splash page has two flu cards under "Explore Pathogens" (1 for seasonal flu and 1 for avian flu) that link directly to default builds. Should these be combined into one "Influenza" card that links to this new /influenza page?

@eharkins
Copy link
Contributor Author

That would make sense to me @joverlee521, thanks I hadn't considered it in this PR. Does that sound good @trvrb?

@eharkins
Copy link
Contributor Author

eharkins commented Feb 16, 2021

Checklist before we can merge this:

@eharkins
Copy link
Contributor Author

I merged #278 and #279 here so we can get ready to merge this. However, this contains everything from #266, which I see as distinct in scope from this PR (though this PR relies on some of the refactoring in 266), so I think we need to decide if we still want the what's in #266 (see below) and if so, re-open it and merge it separately. If not, we can remove those commits from this branch before we merge. Details:

#266 includes:

  • A script fetching the sars-cov-2 datasets in the manually-curated YAML build catalogue in order to collect metadata (e.g. last updated date) about them.
  • A corresponding github action to push this "augmented" YAML with metadata to s3
  • Updating the existing /sars-cov-2 page UI to get the above "augmented" YAML from s3 and display dataset metadata
  • Extending the infrastructure that creates the database behind https://nextstrain.org/search/sars-cov-2 to include groups, community builds (listed in the manually-curated YAML build catalogue) in those search results.
  • some refactoring of react components behind /sars-cov-2

This PR includes:

this fetches the sars-cov-2 build catalogue from
s3 instead of using a local version.

it also refactors the react components behind
/sars-cov-2 to be more flexible for other pathogen
pages like /influenza.
this directory contains components
which are now generally applicable
to any page that wants to display a
collection of info about any grouping
of builds (e.g. flu, groups, commeunity),
not just for sars-cov-2.
in anticipation of using it for any listing of
builds, not specific to sars-cov-2. This
picks up where that left off in
anticipation of using it for a flu builds
page. Specifically this commit allows
the build hierarchy to be defined by custom
field names in a build catalogue yaml (e.g.
static-site/content/allSARS-CoV-2Builds.yaml)
. These field names are passed to the builds
component so it can create the hierarchy
according to them.
This creates a page at /influenza
analogous to /sars-cov-2 but for flu.
Note that this page isn't called /flu
because of conflict with existing redirects.
In addition to creating a useful hub for
gathering resources about flu, this can
serve as an example of how it should
hopefully now be easier to create more
pages like this that show off all that
Nextstrain has to offer on a particular
pathogen or topic spanning pathogens
(e.g.  a /groups page that lists all
nextstrain groups and links to docs
about groups etc). In many cases
these pages can serve as a way to
dedicate more UI real estate to some
things or sections we show off on the
splash page and offer a more customized
landing page for a particular topic.
Decouple build map from build dropdown
menu. Remove spacers and non-specific
stuff from specific components like
hierarchical dropdown menu and sit-rep
listing. Decouple fetching of build
catalogue from dropdown menu.
1: Flatten hierarchy in anticipation of
flatter UI.

2:
Manually add properties for easy parsing
in react. The long-term solution here will
be to generate things like this by parsing
build URLs in collect-pathogen-resources.js
this offers an alternative way to
list builds that is more flexible -
listing them in a flat (non-hierarchical)
interface that can be filtered.
The listing can easily be converted
into a table or other.

see #62 (comment)
for more details
eharkins and others added 15 commits March 11, 2021 10:48
This commit streamlines the dataset selection UI to use generic keywords from filenames. Principally, it:
1. Creates a new light weight collect-datasets.js script from collect-pathogen-resources.js that provisions datasets_influenza.json
2. Modifies influenza.jsx to use this dataset listing
3. Creates a new DatasetSelect component from FilterBuilds that filters by keyword and that implements some row styling
This uses react-styled-flexboxgrid for a responsive grid layout. We use Bootstrap elsewhere in nextstrain.org to accomplish a similar outcome, however, Bootstrap css and className="col-md-1" don't interact well with the use of styled-components. Here, I've opted for a Flexbox solution. This adds a dependency to package.json, but allows for an easy, expressive grid and I think should be generally useful. I compared a few similar options for this and this one was my favorite.
#279 gives more flexibility by depending
on the output of scripts/collect-datasets.js
instead and using a more scalable UI
an earlier commit on this branch changed
this component to fetch the sars-cov-2
builds catalogue that backs /sars-cov-2
from s3. This changes back to the original
way of using a local YAML file instead,
because we are holding off on the s3
implementation for now.
@jameshadfield jameshadfield temporarily deployed to nextstrain-s-influenza--tinqyr March 11, 2021 22:05 Inactive
@eharkins
Copy link
Contributor Author

eharkins commented Mar 11, 2021

I rebased to remove the following from this PR:

A script fetching the sars-cov-2 datasets in the manually-curated YAML build catalogue in order to collect metadata (e.g. last updated date) about them.

A corresponding github action to push this "augmented" YAML with metadata to s3

Updating the existing /sars-cov-2 page UI to get the above "augmented" YAML from s3 and display dataset metadata

Extending the infrastructure that creates the database behind https://nextstrain.org/search/sars-cov-2 to include groups, community builds (listed in the manually-curated YAML build catalogue) in those search results.

So now this represents a more well-defined scope:

  • refactoring of react components used for /sars-cov-2 for use in /influenza and beyond
  • addition of /influenza

Will do some more testing to make sure this is good to go, and will merge locally when it's ready to resolve the conflict github is warning us about.

@eharkins eharkins marked this pull request as ready for review March 11, 2021 22:07
@eharkins
Copy link
Contributor Author

@jameshadfield can you confirm that you no longer get the scroll anchor bug?

Copy link
Member

@jameshadfield jameshadfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Can confirm that the scroll bug is no longer. I think a 5min cadence for the catalogue update isn't problematic at the moment, but we should consider how this works in detail in future work around the catalogue.

I'm in favor of merging this as-is, and I'll start to create issues for future bits of work that are made possible because of this. Thanks @eharkins!

@eharkins eharkins merged commit 6c5ea83 into master Mar 11, 2021
@eharkins eharkins deleted the influenza-page branch March 11, 2021 23:55
@trvrb
Copy link
Member

trvrb commented Mar 12, 2021

Thanks Eli! Things look generally great. Moving Dataset / Contributor / Uploaded date above the scroll area was smart.

The tick could be increased to 15 minutes. Doesn't matter much for flu whether it's 5 minutes or 24 hours (will get the same thing every time with the exception of slightly modified upload dates). Mostly will matter once we start incorporating groups. Just don't want to have a process that takes longer than 5 minutes (or whatever) to update. I'm fine starting with 5 minutes and seeing if it needs adjusting.

My only remaining issue (and something that doesn't block merge but should get some thought) is the text:

Influenza builds
This section is an index of public Nextstrain builds (datasets) for flu, organized by type. See here for more information on what a build is. If you know of a build not listed here, please let us know! Please note that inclusion on this list does not indicate an endorsement by the Nextstrain team.

As the linked doc page explains:

Nextstrain’s focus on providing a real-time snapshot of evolving pathogen populations necessitates a reproducible analysis that can be rerun when new sequences are available. The individual steps necessary to repeat analysis together comprise a “build”.

I think of a "build" as a workflow. seasonal-flu is a build. ncov is a build. I think of the products produced by these builds (eg ncov_global.json) as "datasets". One build may produce multiple datasets. Perhaps this makes the most sense when encountering Auspice JSONs like ncov_global_2020-06-01 where it's clear this a "dataset" distinct from the ongoing day-to-day rebuilds.

I was thinking of this interface as surfacing "datasets", because we might well have situations like https://nextstrain.org/groups/epicovigal/ncov/galicia-in-progress that don't have an obvious associated "build" (at least one that's not publicly up on GitHub).

For flu, I think we have seasonal-flu and avian-flu "builds" (which could well be linked to in the top free-form text) and a larger number of datasets that are continually updated by these two builds.

eharkins added a commit that referenced this pull request Mar 12, 2021
this removes an old workaround made
obsolete by #280 and updates a comment
post- #271
eharkins added a commit that referenced this pull request Mar 18, 2021
this changes a bunch of places where
I had used the word build when I really
meant dataset. For a clear explanation
of the difference, see:
#271 (comment)
eharkins added a commit that referenced this pull request Mar 30, 2021
this changes a bunch of places where
I had used the word build when I really
meant dataset. For a clear explanation
of the difference, see:
#271 (comment)
jameshadfield pushed a commit that referenced this pull request Apr 10, 2021
this removes an old workaround made
obsolete by #280 and updates a comment
post- #271
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants