Output interpretation #55
Comments
I'll update the docs when I get a chance. In the meantime, I hope I can answer some of your questions briefly:
The sample genotype information
Bubbles and breakpoints are two different variant-calling algorithms we have developed. Which is best depends on the quality of your reference, the coverage, the number of samples and the repeat content of the genome in question. Simply:
This explanation is very helpful. Thank you.
Hi Isaac, we are still having issues with the output file. I have sent you a few emails that include the file output. I reran the program after applying your update, and it seems to have resolved the genotyping issue for some samples but not others. Some help with this issue would be greatly appreciated. Thanks
Thank you for providing this software. Sorry if this is a simple question, but we are hoping you could clarify and explain the output. We would like to use this software for our analysis. We have run the pipeline on a few samples, have seen a few different kinds of output, and would like confirmation that we are interpreting the data correctly. The output below is from the bubble.joint.plain.k31.k61.geno.vcf files.
Our first set of output displays this. Would this be interpreted as Ck01 and Ck02 having the same base as the reference whilst Ck03 and Ck04 have the same base as the ALT?
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Ck01 Ck02 Ck03 Ck04
NC_020260.1 220 . G C . PASS BUBBLE=41257;K31 GT:K61R:K61A:GQ 1:57:0:. 1:72:0:. 1:0:31:. 1:0:230:.
NC_020260.1 839 . T C . PASS BUBBLE=15255;K31 GT:K61R:K61A:GQ 1:66:0:. 1:57:0:. 1:0:21:. 1:0:181:.
The second lot of output we are getting is this. What does it mean if there are only dots rather than coverage values?
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Ck01 Ck02 Ck03 Ck04
NC_020260.1 14366 . T C . PASS BUBBLE=2393;K31 GT:K61R:K61A:GQ .:.:.:. .:.:.:. .:.:.:. .:.:.:.
NC_020260.1 14385 . T G . PASS BUBBLE=2393;K31 GT:K61R:K61A:GQ .:.:.:. .:.:.:. .:.:.:. .:.:.:.
And finally, we have some output where the GT is 0. How would this be interpreted? Also, why is a GQ value provided when only one isolate is analysed but not when there are multiple isolates?
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Ck01
NC_020260.1 1103701 . A C . PASS BRKPNT=1507;K31;AC=1;AN=1 GT:K61R:K61A:GQ 0:51:10:20
NC_020260.1 1152696 . T G . PASS BRKPNT=1323;K31;AC=1;AN=1 GT:K61R:K61A:GQ 0:32:8:15
Would you also be able to explain the difference between the breakpoints and bubble vcf files? We have noticed that some sites occur in one file type but are absent from the other. Why does this occur? Also, is the main difference between breakpoints.joint.plain.k31.k61.geno.vcf and breakpoints.join.plain.k31.k61.vcf that the coverage is shown in the geno.vcf and only the GT values are displayed in the other? Does the same apply to the bubble.joint vcf files?
Any help would be greatly appreciated.
Regards,
Alicia
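For anyone reading the excerpts above, it may help to pair each sample column with the keys in the FORMAT column (GT:K61R:K61A:GQ). The following is only a minimal sketch, not part of the pipeline: `parse_samples` is a hypothetical helper name, the lines are split on any whitespace because the rows pasted here are space-separated (real VCF files are tab-separated), and a `.` is read as a missing value, as in standard VCF. It does not attempt to interpret what the GT or coverage values mean.

```python
def parse_samples(header_line, record_line):
    """Map each sample name to a dict of FORMAT key -> value.

    Hypothetical helper for illustration only. A '.' value is
    returned as None (the standard VCF missing-value marker).
    """
    # The first 9 columns are fixed (#CHROM ... FORMAT); samples follow.
    names = header_line.lstrip("#").split()[9:]
    cols = record_line.split()
    keys = cols[8].split(":")  # e.g. ['GT', 'K61R', 'K61A', 'GQ']
    samples = {}
    for name, field in zip(names, cols[9:]):
        values = field.split(":")
        samples[name] = {
            key: (None if value == "." else value)
            for key, value in zip(keys, values)
        }
    return samples


header = "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Ck01 Ck02 Ck03 Ck04"
record = ("NC_020260.1 220 . G C . PASS BUBBLE=41257;K31 GT:K61R:K61A:GQ "
          "1:57:0:. 1:72:0:. 1:0:31:. 1:0:230:.")

for name, fields in parse_samples(header, record).items():
    print(name, fields)
```

Rows such as the one at position 14366, where every sample is `.:.:.:.`, come back with every field set to None, which makes them easy to filter out before comparing samples.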