Added branched handling of ULI inputs in filter_pacbio #115

tkchafin · 2024-09-10T12:15:20Z

Ultra low-input (ULI) libraries (tracked in the "library" samplesheet column) will now be run through pbmarkdup. Note that nothing is removed in the test file, but I have marked the PB CRAM as "uli" so that the test exercises the new branch.
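For reviewers who want the routing logic in isolation, here is a minimal sketch of the branch in Python. The pipeline itself is Nextflow, so this is only an illustration of the decision, not the code in this PR. The function name `branch_by_library`, the example samplesheet, and every column other than `library` are hypothetical.

```python
import csv
import io


def branch_by_library(rows, uli_label="uli"):
    """Split samplesheet rows into (uli, other) based on the 'library' column.

    Assumption: a row counts as ULI when its 'library' value equals
    `uli_label`, ignoring case and surrounding whitespace.
    """
    uli, other = [], []
    for row in rows:
        library = (row.get("library") or "").strip().lower()
        if library == uli_label:
            uli.append(row)    # would be sent through pbmarkdup
        else:
            other.append(row)  # handled like LI/other prep types
    return uli, other


# Hypothetical samplesheet; only the 'library' column comes from the PR text.
SAMPLESHEET = """sample,datatype,datafile,library
s1,pacbio,reads1.cram,uli
s2,pacbio,reads2.cram,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLESHEET)))
uli_rows, other_rows = branch_by_library(rows)
```

Rows with an empty or missing `library` value fall into the non-ULI group, so existing samplesheets would be routed as they were before.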

Closes #72

PR checklist

This comment contains a description of changes (with reason).
If you've fixed a bug or added code that should be tested, add tests!
If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
Make sure your code lints (nf-core lint).
Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
Usage Documentation in docs/usage.md is updated.
Output Documentation in docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).

muffato · 2024-09-10T19:53:47Z

Shane first runs lima on a database of ULI adapters, and then pbmarkdup

for ULI data, we need to run an extra lima to trim the ULI adapter sequence

https://github.com/sanger-tol/tol-workflows/blob/main/wr/wr-import-pacbio-ccs#L323-L338

Do we need lima here too?

tkchafin · 2024-09-11T07:06:19Z

For Sanger data, this will already have been done (in fact, duplicate marking/removal is done as well), so technically I think we can treat ULI reads the same as LI/other prep types for production purposes.

For full ULI support of external data, special handling of adapter trimming makes sense, although the pipeline as-is generally assumes that most read filtering/QC has been done before running. Maybe we could think about adding an optional subworkflow to take in raw data?

tkchafin · 2024-09-12T09:21:10Z

@reichan1998 Can you review? I am tracking the lima/adapter removal suggestion in a separate ticket on pre-alignment QC, but for now we can merge the pbmarkdup integration if it is all working.

tkchafin added 3 commits September 10, 2024 13:10

set file for testing uli subworkflow

1e529de

added pacbio_pbmarkdup module

7ccf87e

added uli markdup subworkflow

7c50e25

tkchafin requested a review from reichan1998 September 10, 2024 12:15
