Generate VCF header when writing, if no header is explicitly supplied #1021
Looks good, but unfortunately I think we need to update any existing headers rather than autogenerate them. All sorts of information is stored in these headers, and autogenerating a minimal header for the data would lose it.
0458a3b to 1b8e57f
OK, I've updated this to follow the approach suggested by @jeromekelleher: use existing VCF header lines if they are available, and generate header lines only if they are not. See the new API doc for details.
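To make the approach concrete, here is a minimal sketch of the "reuse if available, otherwise generate" rule. It is not the code in this PR: the attribute name `vcf_header` and the helpers `generate_minimal_header` and `choose_header` are hypothetical names chosen for illustration, and the generated lines use placeholder `Number`, `Type`, and `Description` values.

```python
from typing import Mapping, Sequence

# Hypothetical attribute name under which an original header might be stored.
VCF_HEADER_ATTR = "vcf_header"


def generate_minimal_header(
    info_fields: Sequence[str],
    format_fields: Sequence[str],
    samples: Sequence[str],
) -> str:
    """Build a bare-bones header that only declares the fields being written."""
    lines = ["##fileformat=VCFv4.3"]
    for name in info_fields:
        # Placeholder metadata: a real writer would infer Number and Type.
        lines.append(f'##INFO=<ID={name},Number=.,Type=String,Description="Generated">')
    for name in format_fields:
        lines.append(f'##FORMAT=<ID={name},Number=.,Type=String,Description="Generated">')
    columns = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]
    if samples:
        columns += ["FORMAT", *samples]
    lines.append("\t".join(columns))
    return "\n".join(lines) + "\n"


def choose_header(
    attrs: Mapping[str, str],
    info_fields: Sequence[str],
    format_fields: Sequence[str],
    samples: Sequence[str],
) -> str:
    """Prefer a stored header, which keeps its extra metadata; else generate one."""
    stored = attrs.get(VCF_HEADER_ATTR)
    if stored is not None:
        return stored
    return generate_minimal_header(info_fields, format_fields, samples)
```

The point of the fallback order is the one raised in review: a stored header can carry metadata that a generated one cannot reconstruct, so generation is only a last resort.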
1b8e57f to 9f20eb6
I've not gone through the gory details, but LGTM. Perhaps @benjeffery could take a closer look?
This PR has conflicts. @tomwhite, please rebase and push an updated version 🙏
9f20eb6 to 15352be
15352be to 2b7d855
@tomwhite I see you're still pushing here, let me know when I should review.
@benjeffery It's ready for a review now (I was just fixing conflicts). Thanks!
Nice! Just a couple of comments.
Should there also be tests for what happens in edge cases? Empty string header arguments, that kind of thing.
    for variant in v:
        assert "NS" not in dict(variant.INFO).keys()
        assert "HQ" not in variant.FORMAT
        assert variant.genotypes is not None
Should this also assert that H3 and GL are retained?
Done
Co-authored-by: Ben Jeffery <[email protected]>
If a header is passed in by the user then it will be used as-is, so an empty string header will result in a headerless VCF. I'm not sure that's something we want to support or encourage though. Perhaps we leave it as undefined at this point?
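As an illustration of the as-is behaviour described here, the following sketch separates "no header supplied" (`None`) from "empty header supplied" (`""`). The function `write_records` and its `default_header` parameter are hypothetical and exist only for this example; they are not the writer's actual API.

```python
import io
from typing import Iterable, Optional, TextIO


def write_records(
    out: TextIO,
    records: Iterable[str],
    header: Optional[str] = None,
    default_header: str = "##fileformat=VCFv4.3\n",
) -> None:
    """Write a header followed by already-formatted record lines."""
    # `header is None` means "not supplied"; "" is a supplied, empty header
    # and is written as-is, which produces a headerless file.
    text = default_header if header is None else header
    out.write(text)
    for record in records:
        out.write(record + "\n")


# Usage: an empty string header yields output with record lines only.
buf = io.StringIO()
write_records(buf, ["20\t1\t.\tA\tT"], header="")
headerless = buf.getvalue()
```

Testing `header is None` rather than truthiness is what makes the empty string pass through unchanged; a check like `if not header` would silently replace it with the default.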
bfe07a1 to 40b95dc
Codecov Report
@@            Coverage Diff            @@
##              main     #1021   +/-   ##
==========================================
  Coverage   100.00%   100.00%
==========================================
  Files           49        49
  Lines         4748      4871   +123
==========================================
+ Hits          4748      4871   +123
Update changelog
40b95dc to fc47c0e
Great, this looks ready to merge now.
changelog.rst
Not ready to merge (but happy to discuss) since it doesn't have full coverage yet.