Added branched handling of ULI inputs in filter_pacbio #115

Open
wants to merge 3 commits into
base: dev

Conversation

tkchafin
Contributor

Ultra low-input (ULI) libraries, identified via the "library" samplesheet column, will now be run through pbmarkdup. Note that nothing is removed in the test file; I have marked the PB CRAM as "uli" only to trigger the new branch in the test.
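
As a rough illustration of the branching described above (not the pipeline's actual Nextflow code), the routing can be sketched in Python. Only the "library" column and the "uli" value come from this PR; the other column names, the sample rows, the `split_by_library` helper, and the case-insensitive match are assumptions made for the example.

```python
import csv
import io


def split_by_library(rows, column="library", uli_value="uli"):
    """Partition samplesheet rows into (uli, other) by the library column.

    Matching is case-insensitive and tolerates a missing or empty value;
    both behaviours are assumptions of this sketch.
    """
    uli, other = [], []
    for row in rows:
        value = (row.get(column) or "").strip().lower()
        (uli if value == uli_value else other).append(row)
    return uli, other


# Hypothetical samplesheet: only the "library" column is taken from the PR.
SAMPLESHEET = """sample,datafile,library
s1,reads1.cram,uli
s2,reads2.cram,
s3,reads3.cram,ULI
"""

rows = list(csv.DictReader(io.StringIO(SAMPLESHEET)))
uli_rows, other_rows = split_by_library(rows)

# Rows in uli_rows would go through duplicate marking; the rest skip it.
print([r["sample"] for r in uli_rows])
print([r["sample"] for r in other_rows])
```

In the pipeline itself the same split would presumably be expressed as a channel branch, with only the ULI branch passed to pbmarkdup before the two are merged again.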

Closes #72

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@muffato
Member

muffato commented Sep 10, 2024

Shane first runs lima against a database of ULI adapters, and then runs pbmarkdup:

for ULI data, we need to run an extra lima to trim the ULI adapter sequence

https://github.com/sanger-tol/tol-workflows/blob/main/wr/wr-import-pacbio-ccs#L323-L338

Do we need lima here too?

@tkchafin
Contributor Author

For Sanger data, adapter trimming will already have been done (in fact, duplicate marking/removal is done as well), so technically I think we can treat ULI reads the same as LI and other prep types for production purposes.

For full ULI support for external data, special handling of adapter trimming makes sense, although the pipeline as is generally assumes that most read filtering/QC has been done before it is run. Maybe we could think about adding an optional subworkflow to take in raw data?

@tkchafin
Contributor Author

@reichan1998 Can you review? I am tracking the lima/adapter removal suggestion in a separate ticket on pre-alignment QC, but for now we can merge the pbmarkdup integration if it is all working.
