Good morning!
I am working on marine aerobic methanotrophs, a group of microorganisms that oxidize methane in the presence of oxygen, and I used dada2 to analyse pmoA gene (encoding methane monooxygenase) sequences that I took from a study. This study used a specific pmoA gene database to assign the taxonomy, and I wondered IF and HOW a small database can affect the algorithm behind the "assignTaxonomy" function. If it does, can you suggest how to solve this problem?
Many many thanks in advance!
You can only assign to what is in your database, so that is one way. The second important way is that, without outgroup sequences, you can make spurious assignments to what is in the database even when the query sequence is not very similar to anything in the reference database. This is because of the way the naive Bayesian classifier method (implemented by assignTaxonomy) calculates the certainty of an assignment. It subsamples the query sequence and checks how often those subsamples get assigned the same taxonomy. Small databases with no outgroup sequences are much more likely to have the subsamples match the same reference sequence as the full query, simply because no other remotely similar sequences are in the database.
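To make the subsampling effect concrete, here is a minimal toy simulation in Python. It is not dada2's implementation: it swaps the naive Bayesian probability for a simple count of shared 8-mers, and the sequences, genus names, mutation rates, and function names are all made up for illustration. The query is roughly equally distant from GenusA, GenusC, GenusD and GenusE, but the small database contains only GenusA plus one unrelated sequence.

```python
import random

K = 8  # k-mer length; RDP-style classifiers use 8-mers


def random_seq(rng, n):
    """Random DNA sequence of length n."""
    return "".join(rng.choice("ACGT") for _ in range(n))


def mutate(rng, seq, rate):
    """Substitute each base with a different base with probability `rate`."""
    return "".join(
        rng.choice([b for b in "ACGT" if b != base]) if rng.random() < rate else base
        for base in seq
    )


def kmers(seq):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + K] for i in range(len(seq) - K + 1)]


def best_match(query_kmers, ref_kmer_sets):
    """Reference sharing the most k-mers with the query (ties go to the first listed)."""
    return max(
        ref_kmer_sets,
        key=lambda name: sum(k in ref_kmer_sets[name] for k in query_kmers),
    )


def bootstrap_confidence(rng, query, refs, n_boot=100):
    """Assign the full query, then report how often k-mer subsamples agree."""
    ref_kmer_sets = {name: set(kmers(seq)) for name, seq in refs.items()}
    q = kmers(query)
    full_call = best_match(q, ref_kmer_sets)
    agree = sum(
        # each bootstrap draws 1/8 of the query k-mers, with replacement
        best_match(rng.choices(q, k=len(q) // 8), ref_kmer_sets) == full_call
        for _ in range(n_boot)
    )
    return full_call, agree / n_boot


rng = random.Random(1)
ancestor = random_seq(rng, 300)
query = mutate(rng, ancestor, 0.10)  # hypothetical query, about 10% from the ancestor

# Small database: one distant relative and one unrelated sequence, no other relatives.
small_db = {"GenusA": mutate(rng, ancestor, 0.10), "Unrelated": random_seq(rng, 300)}

# Larger database: same entries plus three more relatives at a similar distance.
large_db = dict(small_db)
for name in ("GenusC", "GenusD", "GenusE"):
    large_db[name] = mutate(rng, ancestor, 0.10)

small_call, small_conf = bootstrap_confidence(rng, query, small_db)
large_call, large_conf = bootstrap_confidence(rng, query, large_db)
print("small db:", small_call, small_conf)
print("large db:", large_call, large_conf)
```

In this sketch the small database should give a near-unanimous bootstrap for GenusA, because nothing else in it resembles the query, while the larger database splits the subsamples among the equally distant relatives and the confidence drops. The high confidence in the small-database case says nothing about how similar the query actually is to GenusA.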
I don't know about pmoA references, but you could consider a method like IdTaxa in the DECIPHER package as an alternative approach. This method directly considers sequence similarity between query and best reference match when making assignments, and is more robust to the type of error I described above.