Addition Description
It would be useful to bin reads by primer prior to primer removal. I'd like to separate a single FASTQ-based artifact (containing several different primers) into multiple output artifacts by primer; each output artifact would be characterized by a single primer. This would be helpful for meta-analyses in which sequences with multiple primers/variable regions may be found in a single QIIME artifact.
This is possible with native Cutadapt (as of v4.5) using steps to demultiplex, but not in the QIIME 2 plugin as its inputs are restricted to specific semantic types.
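For reference, here is a rough sketch of how the native Cutadapt call could be assembled from Python. It assumes Cutadapt's named-adapter syntax (`-g NAME=SEQUENCE`), the `{name}` placeholder in the output path, and `--action=none` to leave reads untrimmed, as I understand them from the Cutadapt documentation. The helper name, file names, and primer sequences are placeholders of mine, not part of any existing plugin.

```python
def build_cutadapt_binning_command(input_fastq, primers,
                                   output_template="binned-{name}.fastq.gz"):
    """Assemble a Cutadapt command that bins reads by primer without trimming.

    `primers` maps a primer name to its sequence. This helper is hypothetical
    and only builds the argument list; it does not run Cutadapt.
    """
    # --action=none keeps reads intact; matches are only used for routing.
    command = ["cutadapt", "--action=none"]
    for name, sequence in primers.items():
        # Named 5' adapters: the name fills the {name} placeholder in the output path.
        command += ["-g", f"{name}={sequence}"]
    command += ["-o", output_template, input_fastq]
    return command


# Illustrative primers only (assumed for this example).
example_primers = {
    "V4": "GTGYCAGCMGCCGCGGTAA",
    "V3": "CCTACGGGNGGCWGCAG",
}
example_command = build_cutadapt_binning_command("reads.fastq.gz", example_primers)
```

The resulting list could be passed to `subprocess.run(example_command, check=True)` on a machine with Cutadapt installed. As far as I can tell, reads matching no primer end up in the file where `{name}` is replaced by `unknown`.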
Current Behavior
The QIIME 2 plugin performs a similar function with qiime cutadapt demux (based on adapter sequence), but generates only a single output for demultiplexed sequences. It also requires an input artifact of type MultiplexedSingleEndBarcodeInSequence and does not accept SampleData[Single/PairedEndSequencesWithQuality].
qiime cutadapt trim could technically perform this by running the command once per primer (pair), but that is quite inefficient.
Proposed Behavior
q2-cutadapt would take as input 1) a FASTQ artifact of SampleData[Single/PairedEndSequencesWithQuality], which contains N different primer sequences among its many reads, and 2) a tab-separated metadata file containing the N primer names and corresponding primer sequences.
As output, it would generate N artifacts of SampleData[Single/PairedEndSequencesWithQuality]; each output artifact would contain only the reads carrying one particular primer sequence. There would also be one additional output artifact (also SampleData[Single/PairedEndSequencesWithQuality]) holding the reads that did not contain any of the N primer sequences.
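To make the intended binning concrete, here is a minimal pure-Python sketch of the logic. It is not how q2-cutadapt or Cutadapt works internally: it assumes primers sit at the very start of each read, expands IUPAC ambiguity codes into regular expressions, and allows no mismatches. The function names and the `unmatched` bin label are made up for illustration.

```python
import re
from collections import defaultdict

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Compile a (possibly degenerate) primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def bin_reads_by_primer(reads, primers):
    """Group read IDs by the first primer that matches the start of the read.

    reads:   iterable of (read_id, sequence) pairs
    primers: dict mapping primer name -> primer sequence
    Reads matching no primer are collected under the key "unmatched".
    """
    patterns = {name: primer_to_regex(seq) for name, seq in primers.items()}
    bins = defaultdict(list)
    for read_id, sequence in reads:
        for name, pattern in patterns.items():
            # re.match anchors at the start of the read (exact match only).
            if pattern.match(sequence):
                bins[name].append(read_id)
                break
        else:
            bins["unmatched"].append(read_id)
    return dict(bins)


# Toy input: two primers (N = 2) and three reads.
primers = {"V4": "GTGYCAGCMGCCGCGGTAA", "V3": "CCTACGGGNGGCWGCAG"}
reads = [
    ("r1", "GTGCCAGCAGCCGCGGTAATACG"),
    ("r2", "CCTACGGGAGGCAGCAGTTT"),
    ("r3", "ACGTACGTACGT"),
]
binned = bin_reads_by_primer(reads, primers)
```

Each key of `binned` would correspond to one output artifact in the proposal, with `unmatched` playing the role of the extra artifact for reads that carry none of the N primers. A real implementation would presumably delegate matching to Cutadapt, so that error tolerance and paired-end handling come along for free.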
Questions
Does QIIME 2 allow for variable numbers of output artifacts? I suppose that would be a blocker to implementation.
Great to know, thanks @ebolyen! Yes, I would be more than happy to work on it. Is Collection[...] a semantic type found in q2-types / the base QIIME 2 installation? I'm not seeing much documentation for it at first glance.
Hey @lina-kim, I am actually working on some tutorial content that includes Collection right now. You can see the working draft here. Note that you'll only be able to access this tutorial page through this link, as it's built from a pull request (so you won't find this content if you navigate from https://develop.qiime2.org yet). This link will also break once the corresponding PR is merged.
You can also find the new API docs on Collection here.
Want to take a look at that and let us know if you have questions about how to use Collection?
References
qiime cutadapt trim-paired