Update BenchmarkSVs workflow(s) with some new features #199
looks good! only question is whether you want to add any tests.
Array[String] base_sample_names
File base_vcf
File base_vcf_index
String base_sample_name
so is this changing to only support benchmarking a single sample at a time?
Yeah I wanted to keep things simpler in the WDL / Terra UI and made this just single sample for now. It would be easy to write a wrapper in the future if there was demand.
I'll put tests on the infinite to-do list and get this merged for now. Thanks for checking!
This PR includes a bunch of refactoring, improvements, and new features for the process of benchmarking SV VCFs. At a high level, this includes:
truvari refine and truvari ga4gh for collapsing (or "harmonizing") similar events in truth/query down to one event to try to improve benchmarking statistics where calls might get mismatched due to extreme fuzziness in the calling step. This includes an alignment step using mafft as outlined in the truvari documentation on the process. The docker image is updated to include newer versions of truvari as well as mafft. This option requires the input files to be phased.
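Since the harmonization option requires phased inputs, a quick pre-flight check on a VCF could look roughly like the sketch below. This is only an illustration and not part of this PR or of truvari: the helper names `open_vcf` and `unphased_genotype_count` are hypothetical, and it assumes that "phased" means no genotype for the sample uses the unphased `/` separator.

```python
import gzip
from typing import Iterable


def open_vcf(path: str):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def unphased_genotype_count(vcf_lines: Iterable[str], sample: str) -> int:
    """Count records whose GT for `sample` uses the unphased '/' separator.

    Assumption: a record is treated as unphased only if its GT contains '/';
    haploid or missing genotypes without a separator are not counted.
    """
    sample_col = None
    unphased = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample columns follow the 9 fixed VCF columns.
            sample_col = fields.index(sample, 9)
            continue
        if sample_col is None:
            raise ValueError("no #CHROM header line before records")
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue  # record carries no genotype field
        gt = fields[sample_col].split(":")[keys.index("GT")]
        if "/" in gt:
            unphased += 1
    return unphased
```

For example, `unphased_genotype_count(open_vcf(path), base_sample_name)` returning a non-zero count would suggest the file is not fully phased for that sample, so the harmonization option should probably be left off for it.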