Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove remaining incorrect zero-length-B-array checks and exception throwing #1669

Closed
wants to merge 2 commits into from

Conversation

jmarshall
Copy link
Member

@jmarshall jmarshall commented Jun 13, 2023

PR #1194 removed a number of Array.getLength(value) == 0 checks when setting B-array tagged fields, as such empty tagged field values had recently been made explicitly valid in SAM/BAM/CRAM files.

Unfortunately the existing such check in setUnsignedArrayAttribute() was omitted from that PR and has now caused the issue raised in samtools/hts-specs#728. Whatever additional problems might (but probably don't) exist elsewhere in the HTSJDK code for zero-length ML tags, this check for a generic zero-length array was incorrect and should be removed. A test case for setUnsignedArrayAttribute() has also been added.

In grepping for other candidates, I noticed one other potentially incorrect looking Array.getLength invocation.

This was in Slice::setAttribute(), which appears to set an optional tag in a Slice header. There are currently no such tags defined, so the question is somewhat moot but — as a separate commit for your consideration — I have also removed that check. (As there are no such tags currently defined, there are no test cases for these fields to which a test exercising this change could be added.)

This function was omitted from PR samtools#1194. Zero-length 'B' arrays
are valid, so remove this check too and add a test case.
Slices' optional tags follow the rules for BAM tagged fields, so
zero-length 'B' arrays are valid. Slice tags are currently unused,
so the question is somewhat moot and there are no test cases.
Copy link
Collaborator

@gileshall gileshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tagging @droazen for additional review. Changes look fine to me.

Since we have a use case in the wild, is there any merit in testing it with a contrived VCF replicating the pathology or is this too cumbersome / overkill?

Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmarshall Thanks for this patch! It looks fine to me, but to make sure that the cram codepath can handle empty arrays I think it would be advisable to add a cram unit test as well.

@@ -745,9 +745,6 @@ public void setReferenceMD5(final CRAMReferenceRegion cramReferenceRegion) {
* @param value tag value
*/
public void setAttribute(final String tag, final Object value) {
if (value != null && value.getClass().isArray() && Array.getLength(value) == 0) {
throw new IllegalArgumentException("Empty value passed for tag " + tag);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a unit test for the cram setAttribute() method with empty arrays as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned, there are no current unit tests for this Slice attribute functionality and the slice tag functionality is currently unused in the format. If you think this Slice.java change is dubious, please discard the second commit.

@droazen droazen self-assigned this Jun 22, 2023
@jmarshall
Copy link
Member Author

jmarshall commented Jun 22, 2023

I am not a Java programmer and I am not familiar with your test suite or how to set up a CRAM unit test in it.

I provided this pull request to get the ball rolling, as there had been no response to my comment on #1194 notifying the HTSJDK maintainers of this problem. If the maintainers think that applying this change requires additional unit tests, they should add such tests to the PR themselves or use this PR as inspiration for their own fix for this problem.

@droazen
Copy link
Contributor

droazen commented Jun 23, 2023

No problem @jmarshall, @gileshall has kindly volunteered to complete this PR. Thanks for getting the ball rolling

@droazen droazen assigned droazen and unassigned droazen Jun 23, 2023
@cmnbroad
Copy link
Collaborator

@gileshall Sorry - I'm a bit late in looking at this. Since this method is unused by the CRAM code, I'd propose that we just remove it rather than adding a test to cover it.

@cmnbroad
Copy link
Collaborator

Also, FWIW, the Slice class has pretty extensive unit test coverage, distributed across a few test files, with the primary ones being in the SliceTests class.

@lbergelson
Copy link
Member

This was finished in #1674. Thank you everyone who contributed.

@lbergelson lbergelson closed this Aug 3, 2023
@jmarshall jmarshall deleted the zero-len-array branch August 3, 2023 22:49
@jmarshall
Copy link
Member Author

Thanks, I'm glad this made it into the release. 🎉

I wouldn't mind if you wanted to put “by @gileshall and @jmarshall” in the release notes. (And this PR's title “Remove remaining incorrect zero-length-B-array checks” is perhaps a better summary one-liner.)

@lbergelson
Copy link
Member

@jmarshall Ah, sorry about that. I made sure it credited you in the actual commit but I just hit the button for "generate some lazy release notes" and it apparently doesn't include co-authors. I've updated it. My recent commit message work has been pretty bad. Zero-length array is better than the accidental string of 6 commits that all say "fix" that I merged before that because I hadn't noticed it was set to rebase and not squash...

@jmarshall
Copy link
Member Author

NBD but thanks — it's good to have things to point to if people ask me what I spend my time on…

Bad luck about your string of “fixes” 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants