Add phylogenetic #8

Merged 31 commits on Aug 2, 2024
Commits
53246bf
Move phylogenetic workflow to phylogenetic directory
j23414 Jul 9, 2024
bbb7e77
Add copy example data custom rules
j23414 Jul 9, 2024
4b3c822
Since lassa has S and L segments
j23414 Jul 9, 2024
cf59a92
Update the CI
j23414 Jul 9, 2024
ecb6aa3
Move rules for preparing sequences to its own smk file
j23414 Jul 9, 2024
1fd7d55
Move rules for constructing phylogeny to its own smk file
j23414 Jul 9, 2024
c3fa8f6
Move rules for annotating phylogeny to its own smk file
j23414 Jul 9, 2024
ee0135a
Move rule for exporting auspice json to its own smk file
j23414 Jul 9, 2024
c078718
Move config values to config file
j23414 Jul 9, 2024
05dcd7d
Update augur export v1 to v2
j23414 Jul 9, 2024
5bfd527
Move config to defaults to match pathogen-repo-guide
j23414 Jul 9, 2024
003ecfc
Add description statement
j23414 Jul 9, 2024
4d5aeec
Copy phylogenetic instructions from pathogen-repo-guide
j23414 Jul 9, 2024
d81791c
Download sequences and metadata from data.nextstrain.org
j23414 Jul 10, 2024
d7b5931
Pass curated GenBank data through the rest of pipeline
j23414 Jul 10, 2024
ee21b9f
Bypass duplicate reference strain detected
j23414 Jul 10, 2024
543de0b
Fixup: Add description statement
j23414 Jul 10, 2024
de8645d
Fixup example sequences to ID on accession
j23414 Jul 10, 2024
fa12fbd
Fixup AmbiguousRuleException
j23414 Jul 10, 2024
c5f87ae
Add rule to autogenerate colors
j23414 Jul 10, 2024
8ba2317
Display strain name on tree
j23414 Jul 10, 2024
2553ebc
Attribution
j23414 Jul 10, 2024
689800e
Add phylogenetic automation and deploy
j23414 Jul 10, 2024
f818c4b
Separate files into segment directories
j23414 Jul 29, 2024
e4d25fb
Update description to match https://nextstrain.org/lassa/s
j23414 Jul 30, 2024
ecd6ac9
Fixup: Update description to match https://nextstrain.org/lassa/s
j23414 Jul 30, 2024
3eb4a8d
Update .github/workflows/ingest-to-phylogenetic.yaml
j23414 Jul 31, 2024
7e177ea
ingest: Switch to lowercase segment names
j23414 Jul 31, 2024
072da67
phylogenetic: Switch to lowercase segment names
j23414 Jul 31, 2024
81d1cd1
Stage the phylogenetic build to get feedback from SME before making i…
j23414 Jul 31, 2024
7cde259
Since number of S and L segment sequences are both below 5k, include …
j23414 Aug 2, 2024
4 changes: 4 additions & 0 deletions phylogenetic/defaults/description.md
@@ -1,5 +1,9 @@
We gratefully acknowledge the authors, originating and submitting laboratories of the genetic sequences and metadata for sharing their work. Please note that although data generators have generously shared data in an open fashion, that does not mean there should be free license to publish on this data. Data generators should be cited where possible and collaborations should be sought in some circumstances. Please try to avoid scooping someone else's work. Reach out if uncertain.

This work is made possible by the open sharing of genetic data by research groups, including these groups currently collecting Lassa sequences: [Christian Happi](http://acegid.org/), [Pardis Sabeti](https://www.sabetilab.org/), [Katherine Siddle](https://www.sabetilab.org/katherine-siddle/) and colleagues, whose data was shared via [this virological.org post](http://virological.org/t/new-lassa-virus-genomes-from-nigeria-2015-2016/191). If you intend to use these sequences prior to publication, please contact them directly to coordinate.

The Irrua specialist Teaching Hospital (ISTH) and Institute for Lassa Fever Research and Control (ILFRC), Irrua, Edo State, Nigeria; The Bernhard-Nocht Institute for Tropical Medicine (BNITM), Hamburg, Germany; Public Health England (PHE); African Center of Excellence for Genomics of Infectious Disease (ACEGID ), Redeemer’s University, Ede, Nigeria; Broad Institute of MIT and Harvard University (Cambridge, MA, USA). For further details, including conditions of reuse, please contact [Ephraim Epogbaini](mailto:[email protected]), [Stephan Günther](http://www.who.int/blueprint/about/stephan-gunther/en/), and [Philippe Lemey](https://rega.kuleuven.be/cev/ecv/lab-members/PhilippeLemey.html). Their data was first shared via [this virological.org post](http://virological.org/t/2018-lasv-sequencing-continued/192/8), which is continually updated.

couple nits:

  • fix cap ("specialist" -> "Specialist")
  • remove whitespace in "(ACEGID )"
  • for the other institutions, parenthetical followups are used for institutional abbreviations, not location -- standardize entry for Harvard/The Broad to match

Many of the institutions have active websites (e.g., ISTH = https://www.isth.org.ng/); consider linking to them.

Suggested change (current line first, proposed replacement second):
The Irrua specialist Teaching Hospital (ISTH) and Institute for Lassa Fever Research and Control (ILFRC), Irrua, Edo State, Nigeria; The Bernhard-Nocht Institute for Tropical Medicine (BNITM), Hamburg, Germany; Public Health England (PHE); African Center of Excellence for Genomics of Infectious Disease (ACEGID ), Redeemer’s University, Ede, Nigeria; Broad Institute of MIT and Harvard University (Cambridge, MA, USA). For further details, including conditions of reuse, please contact [Ephraim Epogbaini](mailto:[email protected]), [Stephan Günther](http://www.who.int/blueprint/about/stephan-gunther/en/), and [Philippe Lemey](https://rega.kuleuven.be/cev/ecv/lab-members/PhilippeLemey.html). Their data was first shared via [this virological.org post](http://virological.org/t/2018-lasv-sequencing-continued/192/8), which is continually updated.
The Irrua Specialist Teaching Hospital (ISTH) and Institute for Lassa Fever Research and Control (ILFRC), Irrua, Edo State, Nigeria; The Bernhard-Nocht Institute for Tropical Medicine (BNITM), Hamburg, Germany; Public Health England (PHE); African Center of Excellence for Genomics of Infectious Disease (ACEGID), Redeemer’s University, Ede, Nigeria; Broad Institute of MIT and Harvard University, Cambridge, MA, USA. For further details, including conditions of reuse, please contact [Ephraim Epogbaini](mailto:[email protected]), [Stephan Günther](http://www.who.int/blueprint/about/stephan-gunther/en/), and [Philippe Lemey](https://rega.kuleuven.be/cev/ecv/lab-members/PhilippeLemey.html). Their data was first shared via [this virological.org post](http://virological.org/t/2018-lasv-sequencing-continued/192/8), which is continually updated.


We curate sequence data and metadata from NCBI as the starting point for our analyses. Curated sequences and metadata are available as flat files at:

* [data.nextstrain.org/files/workflows/lassa/L/sequences.fasta.zst](https://data.nextstrain.org/files/workflows/lassa/L/sequences.fasta.zst)