blank row crashes R - read.vcfR() #141

Open
TomJamesW opened this issue Jul 9, 2019 · 4 comments

Comments

@TomJamesW

Hello Brian,

I'm using SLiM 3 to generate VCF files, which I am analysing in R. However, read.vcfR() occasionally causes my R session to 'encounter a fatal error' and abort. I followed your advice to determine the problem, and it seems to crash because the VCF file has two completely blank lines in it. I appreciate this may be more of an issue with SLiM, but I was hoping it might be possible for vcfR to accommodate this. Or do you have any other suggestions?

I have attached two files - blankrow346.vcf contains blank lines before POS=346 and causes R to abort when read.vcfR() is run, and clean.vcf is of a similar size and style but works fine. The files are being automatically generated so unfortunately I can't manually edit them. I'm using the most recent GitHub version of vcfR.

Blanklineex.zip

Many thanks,
Tom

@TomJamesW
Author

P.S. Even if there were a way for this to produce a warning or error instead of crashing R, that would be beneficial.

@knausb
Owner

knausb commented Jul 9, 2019

Hi @TomJamesW, thanks for bringing this to my attention! And double thanks for the example files, that really helps. I think the best answer here is that you should let the SLiM people know about this. Section 1 of the VCF specification v4.3 reports that zero length records are not allowed, although your file does appear to report itself as v4.2.

That said, I agree that it would be nice to handle that more elegantly. Your file "clean.vcf" reads in fine, but I see lines 16, 135, 243, and possibly others as blank. In your file "blankrow346.vcf" I see two empty lines before the variant at position 346. If I remove one of these, I can read it in. If I add three empty lines, it crashes. So the issue appears to involve more than one empty line in a row, as you report. We shall add this to the to-do list.
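To check a generated file for this pattern before passing it to read.vcfR(), something like the following Python sketch could list the blank lines. It assumes an uncompressed, plain-text VCF; the function names `blank_line_numbers` and `find_blank_lines` are made up for illustration and are not part of vcfR or SLiM.

```python
from typing import Iterable, List


def blank_line_numbers(lines: Iterable[str]) -> List[int]:
    """Return 1-based numbers of lines that are empty or whitespace-only."""
    return [n for n, line in enumerate(lines, start=1) if not line.strip()]


def find_blank_lines(path: str) -> List[int]:
    """Scan a plain-text (not gzipped) VCF file and report its blank lines."""
    with open(path, "r", encoding="utf-8") as handle:
        return blank_line_numbers(handle)


if __name__ == "__main__":
    # "blankrow346.vcf" is the example file attached to this issue.
    print(find_blank_lines("blankrow346.vcf"))
```

Two adjacent numbers in the result would point at the consecutive blank lines that seem to trigger the crash.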

If you're looking for a quick fix, you could grep out the empty lines.
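On the command line, `grep -v '^$' blankrow346.vcf > blankrow346.noblank.vcf` should do this. If the files are produced inside an automated pipeline, the same filter can be sketched in Python. This is a minimal sketch that assumes an uncompressed VCF; the function names and the output file name are hypothetical.

```python
from typing import Iterable, Iterator


def nonblank_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that contain something other than whitespace."""
    for line in lines:
        if line.strip():
            yield line


def drop_blank_lines(src_path: str, dst_path: str) -> None:
    """Copy a plain-text VCF to dst_path, skipping empty lines."""
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(nonblank_lines(src))


if __name__ == "__main__":
    # Input name is the example file from this issue; output name is made up.
    drop_blank_lines("blankrow346.vcf", "blankrow346.noblank.vcf")
```

The cleaned copy can then be passed to read.vcfR() in place of the original file.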

@TomJamesW
Author

Thanks for your speedy response! I'll let the SLiM people know too. I have a feeling they have been tinkering with the way they (simplify and) output VCF files, so they may not have realised it has had this effect.

I hadn't noticed the blank lines in "clean.vcf" either. I will pass this on as well and try using grep in the meantime.

Thanks for your help.

@knausb knausb reopened this Jul 10, 2019
@knausb
Owner

knausb commented Jul 10, 2019

Oops, I'm going to leave this issue open until I get a chance to address it.
