Update usage of "accession" as the ID column #29
ac2b5d5 to 0ef13fa
0ef13fa to e012512
May be able to check this manually, although I'm all for a reasonable automated check!
Or compare
Thanks @j23414! I'll give it a try.
I added CI with example data to get a quick build. Looks like
Oh, thanks for catching! And I like the fixup. For the missing rsv/scripts/set_final_strain_name.py Lines 6 to 7 in 77840b2
to something like:
I haven't tried the above code, so I'm not certain whether adding new attributes to a node would cause downstream errors.
I think it has something to do with
I didn't get around to this yet and might not today, but I'll update here when I do!
Ok, I think I got to the root of this. Reasons:
If this is it, then a proper fix would be to update read_metadata to keep the index column as a column when setting the index. Signing off now, I'll get to this another day.
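To illustrate the idea of keeping the ID column as a regular column after indexing, here is a minimal standard-library sketch. It is not Augur's read_metadata: the helper `index_records` and the example rows are hypothetical. In pandas terms, the equivalent idea is `DataFrame.set_index(column, drop=False)`.

```python
import csv
import io


def index_records(tsv_text, id_column):
    """Index rows by `id_column` while leaving that column in each row.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    indexed = {}
    for row in reader:
        # Copy the row instead of popping the ID out of it, so the ID
        # stays available as an ordinary field for later lookups.
        indexed[row[id_column]] = dict(row)
    return indexed


# Made-up example data, not taken from the workflow.
example = "accession\tstrain\nACC001\tstrainA\nACC002\tstrainB\n"
records = index_records(example, "accession")
```

With this shape, code that looks up a row by its ID can still read the `accession` field from the row itself.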
Add a note where it should be kept.
Checking presence of a key in the config should be done with the membership operator.
Don't set defaults when retrieving strain_id_field so "accession" is only set on the config level.
New options in Augur 22.2.0 allow usage of the "accession" column as the ID column across all subcommands that read metadata. For this workflow in particular, the metadata file can now be used as-is. This removes the need for a modified copy of the metadata. It also allows specifying an original metadata column "strain" as the display strain field, rather than a column "strain_original" generated during the Snakemake workflow.
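The two config-related commit notes above can be sketched roughly as follows. This is an illustration under assumptions, not the workflow's actual code: the `config` dict stands in for the Snakemake config, and the key name `display_strain_field` and both helper functions are hypothetical. Only `strain_id_field` and the value "accession" come from the notes.

```python
# Hypothetical stand-in for the Snakemake `config` dictionary.
config = {"strain_id_field": "accession"}


def get_display_strain_field(config):
    """Return the optional display field, or None if it is not configured."""
    # Membership operator: tests for the key itself, so a key that is
    # present with an empty value still counts as configured.
    if "display_strain_field" in config:
        return config["display_strain_field"]
    return None


def get_strain_id_field(config):
    """Return the ID column name, which must be set in the config."""
    # No fallback such as config.get("strain_id_field", "accession"):
    # the value lives in the config only, and a missing key raises KeyError.
    return config["strain_id_field"]
```

Leaving out the fallback means a missing `strain_id_field` fails loudly instead of silently defaulting, which keeps "accession" defined in one place.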
77840b2 to d123d60
d123d60's PR-triggered CI run results (pr29/d123d60/docker/rsv/a/F, etc.) should be comparable to those from latest master (master/4ecf498/docker/rsv/a/F, etc.).
Ran a manual test; I can see tips named by both strain name and accession. Looks good to me!
Description of proposed changes
New options in Augur 22.2.0 allow usage of the "accession" column as the ID column across all subcommands that read metadata.
For this workflow in particular, the metadata file can now be used as-is. This removes the need for a modified copy of the metadata. It also allows specifying an original metadata column "strain" as the display strain field, rather than a column "strain_original" generated during the Snakemake workflow.
Related issue(s)
Testing