dorado for m6A analysis #1170
Comments
Hi @baibhav-bioinfo,
|
Okay @malton-ont thanks for the response. |
Hello,
dorado basecaller [email protected] pod5/ --modified-bases-models [email protected]_m6A_DRACH@v1/ --device cuda:all > sample.bam
dorado basecaller [email protected] pod5/ --modified-bases-models [email protected]_m6A@v1/ --device cuda:all > sample.bam
For the DRACH motif context I got 60,000 sites per sample (10 million reads at ~1000 nt per read). Did something go wrong in my all-context run? Please advise. |
Hi @baibhav-bioinfo, By default dorado outputs predictions in all-context for any site with a >5% chance of being a modification. You can adjust this using the |
Hi, with 0.25 I again got >1 million sites. Am I doing something wrong? $ dorado basecaller [email protected] pod5/ --modified-bases-models |
There doesn't look to be anything wrong with your command as far as I can see. Is 1.6 million sites not reasonable? That's ~30x the number you see for the DRACH context - could it just be that only 1/30 of the A-bases are in a DRACH context? |
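One way to check the "what fraction of A-bases are in a DRACH context" guess is to count it directly on the reference sequences. The snippet below is a minimal sketch, not part of dorado or modkit; `drach_fraction` is a hypothetical helper name and the toy sequence is made up. For orientation, if the four flanking bases were uniformly random, the expected fraction would be (3/4)(2/4)(1/4)(3/4) = 18/256, about 7%; real transcriptomes will differ, and thresholded site counts are not the same thing as counts of A-bases.

```python
import re

# DRACH: D = A/G/U, R = A/G, then A, C, H = A/C/U (T is accepted in place of U).
# The zero-width lookahead lets overlapping motifs be counted.
DRACH = re.compile(r"(?=[AGTU][AG]AC[ACTU])")

def drach_fraction(seq: str) -> float:
    """Fraction of A bases in seq that sit at the centre of a DRACH motif."""
    seq = seq.upper()
    total_a = seq.count("A")
    if total_a == 0:
        return 0.0
    # Each motif start corresponds to exactly one central A (two bases downstream).
    return len(DRACH.findall(seq)) / total_a

# Toy example: two A bases, one of them inside GGACT.
print(drach_fraction("GGACTA"))
```

To apply this to a real reference, the function would be called once per FASTA record and the counts aggregated rather than averaged.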
But the literature consistently reports that the majority of m6A sites are in the DRACH motif, with very few exceptions. Maybe I need to look at other filters using modkit. I will read more and let you know if I find anything. Thank you so much for the prompt responses. |
The all-context model reports modifications at all A-bases, regardless of the context, so I'd expect more sites to be recorded. From your reading it sounds like these should have a relatively low probability outside the DRACH context, which tracks with you seeing few sites when applying a high threshold value. Beyond this, I think you'll get better answers elsewhere, as these are more bioinformatics questions than issues with dorado itself. |
You are right, I will post on the Nanopore community for better explanations. |
Hello, as you mentioned earlier in this thread, the default threshold for m6A site detection in "all context" is 5% (0.05). |
No thresholds are applied in the case of context-based models. This is because there is no way to distinguish via the MM/ML tags between bases that are skipped because they did not match the context and bases that would be skipped because they did not meet the threshold. |
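The ambiguity described above comes from how the MM/ML tags encode calls: MM stores, for each call, how many occurrences of the canonical base to skip before the next reported one, and ML stores one 0-255 value per reported call. A skipped base carries no reason for being skipped. The following is a minimal decoding sketch under simplifying assumptions (forward-strand record, single-code entries such as `A+a?`); `decode_mods` is a hypothetical helper and the read and tag values are invented for illustration.

```python
def decode_mods(seq, mm, ml):
    """Return (read_position, mod_code, probability) for each call in an MM/ML pair.

    Simplification: forward-strand records with single-code entries such as "A+a?".
    """
    calls = []
    ml_iter = iter(ml)
    for entry in filter(None, mm.split(";")):
        header, *deltas = entry.split(",")
        base, code = header[0], header[2:].rstrip(".?")
        # Positions in the read where the canonical base occurs.
        base_positions = [i for i, b in enumerate(seq) if b == base]
        idx = -1
        for delta in deltas:
            idx += int(delta) + 1  # skip `delta` bases, then land on the reported one
            # ML value N covers probabilities in [N/256, (N+1)/256); use the midpoint.
            calls.append((base_positions[idx], code, (next(ml_iter) + 0.5) / 256))
    return calls

# Toy read with A at positions 0, 1, 3 and 5; only positions 0 and 5 are reported.
calls = decode_mods("AACAGA", "A+a?,0,2;", [255, 13])
print(calls)

# A downstream probability filter, e.g. keeping calls at or above 0.25.
print([c for c in calls if c[2] >= 0.25])
```

In the toy read, the A bases at positions 1 and 3 simply do not appear in the output, and nothing in the tags says whether that is due to context or to a cutoff.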
Thanks @malton-ont, I will try to understand what you just said. I also have one question, which might be trivial. These are the commands I ran for both DRACH and all-context. I inspected the BAM files from dorado: the BAM files from step 1 have the MM/ML tags (which tell us about the detected m6A sites), but the aligned BAM files from step 2 don't have them. That leaves me a little confused about where the modification information went. |
|
Thanks @malton-ont , now I can see what is happening there during alignment. |
You'll likely get a better response to paper requests on the Nanopore community forums. |
Hello,
Thank you for developing such valuable and versatile software.
I am trying to use Dorado for m6A analysis (using DRS data), from site identification to differential methylation rate analysis between conditions with replicates (using modkit).
(1) I wanted to know whether
dorado basecaller [email protected] /path/to/pod5 --modified-bases-models [email protected]_m6A_DRACH@v1/ --device cuda:all --reference ref.fasta > calls.bam
is the correct approach for aligning during basecalling.
Also, other tools (e.g. m6Anet) use the transcriptome rather than the genome. Which would be more suitable to align to?
(2) The resulting BED file (after modkit pileup) contains positions with Nmod=0. Why are these in the output when my aim is to detect m6A sites? Do I need to filter them out, keeping only Nmod>=1?
The file also has an Nother_mod column. What is it, and why are we getting other mods?
Is there an option to output only the rows for the relevant modification, in my case on A?
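For the filtering asked about in (2), a post-hoc filter on the pileup output is one possible approach. The sketch below assumes the bedMethyl column layout described in the modkit documentation, with the modification code in column 4 and Nmod in column 12 (1-based), and assumes `a` is the code used for m6A; both should be checked against the actual file. `filter_bedmethyl` is a hypothetical helper, not a modkit function, and the rows are invented.

```python
def filter_bedmethyl(lines, mod_code="a", min_nmod=1):
    """Keep bedMethyl rows for one modification code with at least min_nmod modified calls."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines if present
        fields = line.split()  # handles tab- or space-separated columns
        if len(fields) < 12:
            continue  # not a full bedMethyl row
        # Assumed layout: fields[3] = modification code, fields[11] = Nmod.
        if fields[3] == mod_code and int(fields[11]) >= min_nmod:
            kept.append(line.rstrip("\n"))
    return kept

# Invented example rows: one site with 5 modified calls, one with none.
rows = [
    "tx1\t100\t101\ta\t20\t+\t100\t101\t255,0,0\t20\t25.00\t5\t15\t0\t0\t0\t0\t0\n",
    "tx1\t200\t201\ta\t18\t+\t200\t201\t255,0,0\t18\t0.00\t0\t18\t0\t0\t0\t0\t0\n",
]
for row in filter_bedmethyl(rows):
    print(row)
```

Rows with Nmod=0 still carry coverage information, so whether to drop them depends on whether the downstream comparison needs unmodified sites as well.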