Potential problem with writing gz files #8
Comments
@jrm5100 I wonder: how is your tool reading the VCF? I'm fairly sure that io::write_lines() will write a vanilla gzipped file if the extension is .gz, not a bgzipped file. Perhaps the tool you're running expects gzipped files to be bgzipped? If that is the case, I'd also be open to making io::write_lines() write bgzipped instead of plain gzipped files, as that's probably more generally useful.
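One way to check which kind of file was written is to look at the first gzip header. A BGZF block is a gzip member whose header sets the FEXTRA flag and carries an extra subfield with the ID bytes 'B' and 'C' and a 2-byte payload, whereas a plain gzip file usually has no such subfield. The following is a minimal std-only sketch of that check; the function name looks_like_bgzf is made up for illustration and is not part of fgoxide, gzp, or noodles.

```rust
/// Returns true if `bytes` begins with a gzip header that carries the BGZF
/// "BC" extra subfield. `looks_like_bgzf` is a hypothetical helper name.
fn looks_like_bgzf(bytes: &[u8]) -> bool {
    // 10 fixed gzip header bytes plus the 2-byte XLEN field.
    if bytes.len() < 12 {
        return false;
    }
    // gzip magic (0x1f 0x8b) and compression method 8 (deflate).
    if bytes[0] != 0x1f || bytes[1] != 0x8b || bytes[2] != 8 {
        return false;
    }
    // FLG bit 2 (FEXTRA) must be set for an extra field to be present.
    if bytes[3] & 0x04 == 0 {
        return false;
    }
    // XLEN is little-endian and gives the total length of the extra field.
    let xlen = u16::from_le_bytes([bytes[10], bytes[11]]) as usize;
    let end = (12 + xlen).min(bytes.len());
    let mut extra = &bytes[12..end];
    // Walk the subfields: SI1, SI2, SLEN (little-endian u16), then SLEN data bytes.
    while extra.len() >= 4 {
        let slen = u16::from_le_bytes([extra[2], extra[3]]) as usize;
        if extra[0] == b'B' && extra[1] == b'C' && slen == 2 {
            return true;
        }
        if extra.len() < 4 + slen {
            break;
        }
        extra = &extra[4 + slen..];
    }
    false
}
```

Reading the first few dozen bytes of input.vcf.gz and passing them to this function should be enough to tell the two cases apart, since only the first header is inspected.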
Yeah, it just occurred to me the incompatibility might be with noodles-bgzf, and I guess the name might confirm that your hunch is correct. That's what I get for trying to write up a strange issue on a Friday evening without thinking it through. Not sure what the best approach is. Perhaps
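What about the following simple Rust pseudo-code, which could be refactored into its own function? If it's only for testing purposes, that should work. We tend to use this library for compression: https://github.com/sstadick/gzp

// Import paths are assumed from the crate roots; adjust if they differ.
use std::fs::File;
use std::io::{BufWriter, Write};
use gzp::{BgzfSyncWriter, Compression};
use tempfile::TempDir;

let tempdir = TempDir::new().unwrap();
let io = fgoxide::io::Io::default();
let input = tempdir.path().join("input.vcf.gz");
let writer = BufWriter::new(File::create(&input).unwrap());
// Wrap the file in a BGZF writer instead of a plain gzip writer.
let mut bgzf_writer = BgzfSyncWriter::new(writer, Compression::new(3));
bgzf_writer.write_all(b"@NAME\nGATTACA\n+\nIIIIIII\n").unwrap();
bgzf_writer.flush().unwrap();
// Drop the writer so the file is closed before the tool reads it.
drop(bgzf_writer);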
I'm using fgoxide to write a test involving .vcf.gz files. I've defined some input lines like
My test is failing with "Error: bytes remaining on stream", but passing with files I've run manually. If I extract the generated VCF file and re-compress it with bgzip, my tool works. The same issue occurs when running the tool directly on the files (so the test itself isn't the problem, unless I've written the input file incorrectly).
The bgzip version is slightly larger. There's no difference in the vcf version of each.
input.vcf.gz
input2.vcf.gz