Extract additional metadata from VCF files #464
Comments
Somewhat related to this, I've been working on code to convert between indices and genotype calls for VCF fields of length `G`. I don't currently have a specific feature to add with these (hence no PR), but I'm likely to be working with genotype posterior distributions in the future (stored in the GP format field).
Thanks @timothymillar, this would be a good addition in the future. What do you mean by "up to the point of overflowing the index"?
A large enough combination of ploidy and n_alleles will result in an index that is too large for an int64. But this shouldn't be a problem for realistic values.
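To make the overflow point concrete, here is a minimal sketch of the index/genotype conversion, assuming the standard VCF ordering for unphased genotypes (index = sum over sorted alleles of C(k_m + m - 1, m)). The function names are hypothetical and this is not the code referred to in the comment above; it only illustrates how quickly the number of genotypes grows with ploidy and allele count.

```python
from math import comb

INT64_MAX = 2**63 - 1


def genotype_to_index(alleles):
    """Index of an unphased genotype call in VCF genotype ordering.

    `alleles` is a sequence of allele indices, e.g. [0, 1] for 0/1.
    """
    # With alleles sorted ascending, the m-th allele (1-based) contributes
    # C(allele + m - 1, m); enumerate() is 0-based, hence the shifted terms.
    return sum(comb(a + m, m + 1) for m, a in enumerate(sorted(alleles)))


def index_to_genotype(index, ploidy):
    """Inverse of genotype_to_index for a known ploidy (greedy decoding)."""
    alleles = []
    for m in range(ploidy, 0, -1):
        a = 0
        # Find the largest allele whose contribution still fits in the index.
        while comb(a + m, m) <= index:
            a += 1
        alleles.append(a)
        index -= comb(a + m - 1, m)
    return sorted(alleles)


def n_genotypes(n_alleles, ploidy):
    """Number of distinct unphased genotypes, i.e. the length of a G field."""
    return comb(n_alleles + ploidy - 1, ploidy)


# Diploid ordering: 0/0, 0/1, 1/1, 0/2, 1/2, 2/2
print([index_to_genotype(i, 2) for i in range(6)])
# Python integers do not overflow, so the comparison with int64 is explicit.
print(n_genotypes(10, 2) <= INT64_MAX, n_genotypes(40, 40) <= INT64_MAX)
```

In a fixed-width implementation (for example NumPy int64 arrays) the last case would overflow, whereas a diploid site with 10 alleles has only 55 genotypes.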
`##INFO`
`variant_`. We can punt cases when `Number` is not 1 for now. `cyvcf2`
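`##FORMAT`
We can punt cases when `Number` is not 1 for now. `cyvcf2` makes it easy to see which FORMAT fields are available for each variant with `v.FORMAT`.
Missing values appear to be encoded differently per accessor: in `v.format('DP')` they are `-2147483648`, and in `v.gt_depths` they are `-1`.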
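The two values quoted above could be normalized to one representation along these lines. This is only a sketch: the sample lists are made-up stand-ins for what `v.format('DP')` and `v.gt_depths` return (cyvcf2 returns NumPy arrays, not lists), and `mask_missing` is a hypothetical helper, not part of cyvcf2.

```python
# Values reported in this issue; -2147483648 is the minimum 32-bit signed integer.
FORMAT_MISSING = -2147483648   # seen in v.format('DP')
GT_DEPTH_MISSING = -1          # seen in v.gt_depths


def mask_missing(values, sentinel):
    """Replace the accessor-specific sentinel with None."""
    return [None if int(v) == sentinel else int(v) for v in values]


# Made-up depths for three samples, the second one missing.
format_dp = [12, FORMAT_MISSING, 30]
gt_depths = [12, GT_DEPTH_MISSING, 30]

print(mask_missing(format_dp, FORMAT_MISSING))
print(mask_missing(gt_depths, GT_DEPTH_MISSING))
```

Whichever accessor is used, converting to a single missing marker early avoids sentinel values leaking into downstream arithmetic.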
`##CONTIG`
- `vcf.seqnames` has contig names
- `vcf.seqlens` has contig lengths
- `vcf['CONTIG']` should get `contig` header lines but it's giving me a `KeyError`; some kind of encoding issue, I guess.
- Using `vcf.header_iter()`, though, it appears that `cyvcf2` is not parsing out the assembly field of the contig header lines, so we will have to use `vcf.raw_header` or patch `cyvcf2`.
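If we go the `vcf.raw_header` route, the contig lines could be parsed from the header text roughly as follows. This is a sketch under assumptions: `parse_contig_lines` is a hypothetical helper, the example header is made up, and it assumes the header is available as a single string with one `##contig=<...>` line per contig (the VCF specification writes the key in lowercase; the match here is case-insensitive).

```python
import re

# key=value pairs inside <...>; values may be double-quoted and contain commas.
_KEY_VALUE = re.compile(r'(\w+)=("[^"]*"|[^,]*)')
_PREFIX = "##contig=<"


def parse_contig_lines(raw_header):
    """Return one dict per contig header line, keeping every field present."""
    contigs = []
    for line in raw_header.splitlines():
        line = line.strip()
        if not (line.lower().startswith(_PREFIX) and line.endswith(">")):
            continue
        body = line[len(_PREFIX):-1]
        contigs.append({k: v.strip('"') for k, v in _KEY_VALUE.findall(body)})
    return contigs


# Made-up header text standing in for vcf.raw_header.
example_header = """##fileformat=VCFv4.3
##contig=<ID=20,length=62435964,assembly=B36,species="Homo sapiens">
##contig=<ID=21,length=46944323>
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
"""

for contig in parse_contig_lines(example_header):
    print(contig["ID"], contig.get("length"), contig.get("assembly"))
```

Using `.get()` for optional fields such as `assembly` keeps the loop working for headers that only declare `ID` and `length`.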
`##SAMPLE`, `##INDIVIDUAL`