Reads dropped during ASE calling? #17

Open
rmagoglia opened this issue Jul 20, 2016 · 1 comment

Comments

@rmagoglia

GetGeneASEbyReads.py sorts reads into four categories: ref, alt, no snp, or ambiguous. The counts in these four categories do not sum to the number of reads in my input BAM file. What's causing this?

@petercombs
Contributor

Just noticed this issue. I assume you have either figured this out or given up on it, but I would guess what's going on is one or both of the following:

  • If multiple genes map to the same coordinates, then a given read could in principle count towards both of them. So if you're just summing each of the four columns across all genes, you could get more reads than you started with.
  • If a read does not map to any gene in the GFF file, then it won't get counted. So if you have a lot of reads not mapping to annotated genes (species-specific exons, unannotated lncRNAs, eRNAs, or DNA contamination are a few possibilities that come to mind), then you could end up with fewer reads than you started with.
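
Here is a toy sketch of how those two effects pull the total in opposite directions. It is not the actual logic of GetGeneASEbyReads.py: the gene names, read IDs, coordinates, and helper functions are all made up for illustration, and real reads would come from the BAM file and real genes from the GFF.

```python
from collections import Counter

# Hypothetical toy data. Genes are (name, start, end); reads are (read_id, start, end).
# Intervals are treated as half-open: [start, end).
GENES = [
    ("geneA", 100, 200),
    ("geneB", 150, 250),  # overlaps geneA on 150-200
    ("geneC", 400, 500),
]
READS = [
    ("r1", 110, 140),  # inside geneA only
    ("r2", 160, 190),  # inside both geneA and geneB
    ("r3", 300, 330),  # intergenic, hits no annotated gene
    ("r4", 410, 440),  # inside geneC only
    ("r5", 600, 630),  # intergenic, hits no annotated gene
]


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end


def genes_per_read(reads, genes):
    """Map each read ID to the list of gene names it overlaps."""
    return {
        read_id: [name for name, g_start, g_end in genes
                  if overlaps(r_start, r_end, g_start, g_end)]
        for read_id, r_start, r_end in reads
    }


def summarize(reads, genes):
    """Compare the number of input reads with the sum of per-gene counts."""
    hits = genes_per_read(reads, genes)
    per_gene = Counter(name for names in hits.values() for name in names)
    return {
        "input_reads": len(reads),
        "sum_of_gene_counts": sum(per_gene.values()),
        # Each extra gene a read overlaps adds one count beyond the first.
        "extra_assignments": sum(len(n) - 1 for n in hits.values() if len(n) > 1),
        "multi_gene_reads": sorted(r for r, n in hits.items() if len(n) > 1),
        "unassigned_reads": sorted(r for r, n in hits.items() if not n),
    }


summary = summarize(READS, GENES)
print(summary)
```

In this toy case the per-gene counts sum to 4 even though there are 5 input reads: r2 is counted twice, while r3 and r5 are not counted at all. In general the summed count equals the input reads plus the extra assignments minus the unassigned reads, so the two effects can partly cancel, and comparing only the totals may hide both.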
