There are four categories that a read can fall into in GetGeneASEbyReads.py: ref, alt, no snp, or ambiguous. The counts in these four categories do not sum to the number of reads in my input BAM file. What's causing this?
Just noticed this issue. I assume you have either figured this out or given up on it, but I would guess what's going on is one or both of the following:
If multiple genes overlap at the same coordinates, then a given read could in principle count towards each of them. So if you're just taking the sum of each of the four columns across all genes, you would get more reads than you started with.
If a read does not map to any gene in the GFF file, then it won't get counted at all. So if you have a lot of reads not mapping to annotated genes (species-specific exons, unannotated lncRNAs, eRNAs, or DNA contamination are a few possibilities that come to mind), then you could end up with fewer reads than you started with.
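To make the two effects concrete, here is a toy sketch in plain Python. It is not the actual logic of GetGeneASEbyReads.py: the gene names, coordinates, reads, and the helpers `overlapping_genes` and `tally` are all made up for illustration, and it assumes a read is counted once for every gene it overlaps.

```python
from collections import Counter

# Hypothetical annotation: gene -> (start, end), half-open, same chromosome.
genes = {
    "geneA": (100, 200),
    "geneB": (150, 250),  # overlaps geneA on 150-200
}

# Hypothetical reads: (name, start, end), half-open.
reads = [
    ("r1", 110, 140),  # geneA only
    ("r2", 160, 190),  # geneA and geneB -> counted twice
    ("r3", 170, 195),  # geneA and geneB -> counted twice
    ("r4", 210, 240),  # geneB only
    ("r5", 300, 330),  # no annotated gene -> never counted
]


def overlapping_genes(start, end, genes):
    """Return the names of all genes whose interval overlaps [start, end)."""
    return [
        name
        for name, (g_start, g_end) in genes.items()
        if start < g_end and end > g_start
    ]


def tally(reads, genes):
    """Count reads per gene, plus unassigned reads and extra (duplicate) counts."""
    per_gene = Counter()
    unassigned = 0
    extra = 0
    for _name, start, end in reads:
        hits = overlapping_genes(start, end, genes)
        if not hits:
            unassigned += 1
            continue
        extra += len(hits) - 1  # every gene beyond the first is a double count
        for gene in hits:
            per_gene[gene] += 1
    return per_gene, unassigned, extra


per_gene, unassigned, extra = tally(reads, genes)
total_counted = sum(per_gene.values())

print("input reads:    ", len(reads))
print("sum over genes: ", total_counted)
print("unassigned:     ", unassigned)
print("extra counts:   ", extra)
```

In this toy case the sum over genes is 6 for 5 input reads: one read is lost because it hits no gene, and two reads are counted twice. The two effects pull in opposite directions, so the totals can differ either way, or even match by coincidence.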