Handling indels #45
Comments
The above is for QUILT2_prepare_reference.R. [2024-11-04 23:57:59] Program start
Thanks @PMuchina. Unfortunately, QUILT does not support imputing indels at the moment. To my knowledge, imputing indels or SVs is always challenging.
Hi all,
I'd say a key reason is just historical / lack of time. I wrote much of STITCH (which includes the code to read in and process BAMs at variant sites) back in ~2012-2014 or so, when indels were less of a priority and were called much less accurately. Adding bi-allelic indels to the current code base isn't conceptually hard, but would require time (say 1-2 weeks full time). I'm in industry now and don't have that kind of time. Zilong could do it, but it would need to be weighed against his other responsibilities and projects. Multi-allelic variants (indels or SNPs) are slightly harder, though still doable, especially in QUILT. For STITCH they'd be a bit trickier, I think.
How does QUILT handle indels? It seems indels are skipped when building the binary reference.
[2024-10-31 22:41:27] Program start
[2024-10-31 22:41:27] Begin converting reference haplotypes
[2024-10-31 22:41:27] Using reference information from:/usr/home/ironbank/researchdata/qgg/zexi/BSF/Peter/Chr1/QUILTv1.0.5/Chr1_reference_panel.bcf
[2024-10-31 22:41:27] Using strategy of first imputing using common SNPs and then using all SNPs, with allele frequency threshold:0.001
[2024-10-31 22:41:27] Begin get sites and haplotypes from reference vcf
[2024-10-31 23:07:55] End get sites and haplotypes from reference vcf
[2024-10-31 23:07:55] There were 59759 skipped variants when processing the reference VCF (not bi-allelic, not a SNP, not unique position
[2024-10-31 23:07:56] There are 0 common and 0 rare (0 total) variants in the left buffer region -499999 <= position < 1
[2024-10-31 23:07:56] There are 428384 common and 301 rare (428685 total) variants in the central region 1 <= position <= 4999987
[2024-10-31 23:07:56] There are 33651 common and 2 rare (33653 total) variants in the right buffer region 4999987 < position <= 5499987
[2024-10-31 23:07:56] There are 0 regions out of 14438 below minimum recombination rate, setting them to minimum rate
[2024-10-31 23:07:56] There are 0 regions out of 14438 above maximum recombination rate, setting them to maximum rate
[2024-10-31 23:08:06] Using nMaxDH = 114
[2024-10-31 23:08:11] Build mspbwt indices
[2024-10-31 23:08:13] Done building mspbwt indices
[2024-10-31 23:08:13] Save converted reference haplotypes
[2024-10-31 23:08:14] Done converting reference haplotypes
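For anyone wanting to see which records fall under the three reasons named in the "skipped variants" log line (not bi-allelic, not a SNP, not unique position), here is a rough Python sketch that tallies them from uncompressed VCF text. This is not QUILT's own code: the function names (`classify_site`, `count_skipped`), the order in which the checks are applied, and the choice to treat only already-kept sites as occupying a position are all assumptions made for illustration.

```python
from collections import Counter

BASES = {"A", "C", "G", "T"}


def classify_site(key, ref, alts, kept_keys):
    """Return the reason a site would be skipped, or None if it is kept.

    Hypothetical helper: mirrors the three reasons listed in the log
    message, checked in that order (the order is an assumption).
    """
    if len(alts) != 1:
        return "not bi-allelic"
    if ref not in BASES or alts[0] not in BASES:
        # Covers indels (multi-base REF or ALT) and symbolic alleles such as "*".
        return "not a SNP"
    if key in kept_keys:
        return "not unique position"
    return None


def count_skipped(vcf_lines):
    """Count kept sites and skipped sites per reason from VCF text lines."""
    reasons = Counter()
    kept_keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        key = (chrom, pos)
        reason = classify_site(key, ref.upper(), alt.upper().split(","), kept_keys)
        if reason is None:
            kept_keys.add(key)  # assumption: only kept sites occupy a position
        else:
            reasons[reason] += 1
    return len(kept_keys), reasons


# Made-up records for illustration only (not taken from the panel in this issue).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\t.",    # bi-allelic SNP
    "chr1\t150\t.\tAT\tA\t.\t.\t.",   # deletion
    "chr1\t200\t.\tC\tG,T\t.\t.\t.",  # multi-allelic
    "chr1\t100\t.\tA\tT\t.\t.\t.",    # second SNP at an already-kept position
    "chr1\t300\t.\tG\tGA\t.\t.\t.",   # insertion
]

kept, reasons = count_skipped(example)
print(kept, dict(reasons))
```

To run it on a real panel you would stream the BCF as text first (for example through `bcftools view`) and pass the lines to `count_skipped`. The totals may not match QUILT's count exactly, since the real filtering logic may differ from the assumptions above.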