Merge pull request #658 from genomic-medicine-sweden/pad_bed
padding bed file
jemten authored Jan 10, 2025
2 parents 4d97d32 + fc62fb6 commit 41cd7fc
Showing 12 changed files with 231 additions and 6 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -23,7 +23,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### `Fixed`

-- Restrict deepvariant analysis of WES samples to bait regions [#633](https://github.com/nf-core/raredisease/pull/633)
+- Restrict deepvariant analysis of WES samples to bait regions [#633](https://github.com/nf-core/raredisease/pull/633), [#658](https://github.com/nf-core/raredisease/pull/658)
- bcftools annotate declaration in annotate CADD subworkflow [#624](https://github.com/nf-core/raredisease/pull/624)
- Rhocallviz subworkflow will only be invoked once per sample [#621](https://github.com/nf-core/raredisease/pull/621)
- Updated createCaseChannel function to include a check for maternal and paternal ids being set to a numeric 0 [#643](https://github.com/nf-core/raredisease/pull/643)
4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -16,6 +16,10 @@

> Danecek P, Bonfield JK, Liddle J, et al. Twelve years of SAMtools and BCFtools. GigaScience. 2021;10(2):giab008. doi:10.1093/gigascience/giab008
+- [BEDTools](https://academic.oup.com/bioinformatics/article/26/6/841/244688)
+
+> Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841-842. doi:10.1093/bioinformatics/btq033
- [BWA-MEM](https://arxiv.org/abs/1303.3997)

> Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Published online May 26, 2013. Accessed March 14, 2023. http://arxiv.org/abs/1303.3997
10 changes: 10 additions & 0 deletions conf/modules/prepare_references.config
@@ -104,6 +104,16 @@ process {
ext.args2 = '--csi'
}

+withName: '.*PREPARE_REFERENCES:BEDTOOLS_PAD_TARGET_BED' {
+ext.when = { !params.target_bed.equals(null) && params.bait_padding > 0 }
+ext.prefix = { "${meta.id}_pad${params.bait_padding}" }
+ext.args = { "-b ${params.bait_padding}" }
+}
+
+withName: '.*PREPARE_REFERENCES:TABIX_BGZIPINDEX_PADDED_BED' {
+ext.prefix = { "${meta.id}_pad${params.bait_padding}" }
+}
+
withName: '.*PREPARE_REFERENCES:GATK_BILT' {
ext.when = { !params.target_bed.equals(null) }
ext.prefix = { "${meta.id}_target" }
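
Note on the padding step: the `-b ${params.bait_padding}` argument passed to `BEDTOOLS_PAD_TARGET_BED` tells `bedtools slop` to extend every interval by that many bases on both sides, and the genome file (here the `.fai` index) keeps the result within chromosome bounds. The Python sketch below only illustrates that behaviour and is not the pipeline's code; `read_chrom_sizes` and `pad_intervals` are hypothetical helper names.

```python
def read_chrom_sizes(fai_lines):
    """Map chromosome name -> length from the first two columns of a .fai index."""
    sizes = {}
    for line in fai_lines:
        name, length = line.rstrip("\n").split("\t")[:2]
        sizes[name] = int(length)
    return sizes


def pad_intervals(bed_lines, chrom_sizes, padding):
    """Extend each BED interval by `padding` bases on both sides.

    Starts are clamped at 0 and ends at the chromosome length,
    mirroring what `bedtools slop -b <padding>` is documented to do.
    Extra BED columns (name, score, ...) are passed through unchanged.
    """
    padded = []
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        new_start = max(0, start - padding)
        new_end = min(chrom_sizes[chrom], end + padding)
        padded.append("\t".join([chrom, str(new_start), str(new_end)] + fields[3:]))
    return padded


if __name__ == "__main__":
    sizes = read_chrom_sizes(["chr1\t1000\t6\t60\t61"])
    for row in pad_intervals(["chr1\t50\t200\tbaitA", "chr1\t900\t950"], sizes, 100):
        print(row)
```

Padded intervals can overlap their neighbours; this sketch, like `slop` itself, does not merge them.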
5 changes: 5 additions & 0 deletions modules.json
@@ -55,6 +55,11 @@
"git_sha": "bfa8975eefb8df3e480a44ac9e594f23f52b2963",
"installed_by": ["modules"]
},
+"bedtools/slop": {
+"branch": "master",
+"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
+"installed_by": ["modules"]
+},
"bwa/index": {
"branch": "master",
"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
5 changes: 5 additions & 0 deletions modules/nf-core/bedtools/slop/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

49 changes: 49 additions & 0 deletions modules/nf-core/bedtools/slop/main.nf
51 changes: 51 additions & 0 deletions modules/nf-core/bedtools/slop/meta.yml
36 changes: 36 additions & 0 deletions modules/nf-core/bedtools/slop/tests/main.nf.test
35 changes: 35 additions & 0 deletions modules/nf-core/bedtools/slop/tests/main.nf.test.snap
6 changes: 6 additions & 0 deletions modules/nf-core/bedtools/slop/tests/nextflow.config

30 changes: 26 additions & 4 deletions subworkflows/local/prepare_references.nf
@@ -2,6 +2,7 @@
// Prepare reference files
//

+include { BEDTOOLS_SLOP as BEDTOOLS_PAD_TARGET_BED } from '../../modules/nf-core/bedtools/slop/main'
include { BWA_INDEX as BWA_INDEX_GENOME } from '../../modules/nf-core/bwa/index/main'
include { BWA_INDEX as BWA_INDEX_MT } from '../../modules/nf-core/bwa/index/main'
include { BWA_INDEX as BWA_INDEX_MT_SHIFT } from '../../modules/nf-core/bwa/index/main'
@@ -24,6 +25,7 @@ include { SENTIEON_BWAINDEX as SENTIEON_BWAINDEX_GENOME } from '../../modul
include { SENTIEON_BWAINDEX as SENTIEON_BWAINDEX_MT } from '../../modules/nf-core/sentieon/bwaindex/main'
include { SENTIEON_BWAINDEX as SENTIEON_BWAINDEX_MT_SHIFT } from '../../modules/nf-core/sentieon/bwaindex/main'
include { TABIX_BGZIPTABIX as TABIX_PBT } from '../../modules/nf-core/tabix/bgziptabix/main'
+include { TABIX_BGZIPTABIX as TABIX_BGZIPINDEX_PADDED_BED } from '../../modules/nf-core/tabix/bgziptabix/main'
include { TABIX_BGZIPTABIX as TABIX_BGZIPINDEX_VCFANNOEXTRA } from '../../modules/nf-core/tabix/bgziptabix/main'
include { TABIX_TABIX as TABIX_VCFANNOEXTRA } from '../../modules/nf-core/tabix/tabix/main'
include { TABIX_TABIX as TABIX_DBSNP } from '../../modules/nf-core/tabix/tabix/main'
@@ -97,8 +99,18 @@ workflow PREPARE_REFERENCES {
// Vcf, tab and bed indices
TABIX_DBSNP(ch_known_dbsnp)
TABIX_GNOMAD_AF(ch_gnomad_af_tab)
-TABIX_PT(ch_target_bed).tbi.set { ch_tbi }
-TABIX_PBT(ch_target_bed).gz_tbi.set { ch_bgzip_tbi }
+
+// Index target bed file in case of gz input
+TABIX_PT(ch_target_bed)
+ch_target_bed
+.join(TABIX_PT.out.tbi)
+.set{ ch_trgt_bed_tbi }
+// Compress and index target bed file in case of uncompressed input
+TABIX_PBT(ch_target_bed).gz_tbi
+.set { ch_bgzip_tbi }
+ch_target_bed_gz_tbi = Channel.empty()
+.mix(ch_trgt_bed_tbi, ch_bgzip_tbi)
+
ch_vcfanno_extra_unprocessed
.branch { it ->
bgzipindex: !it[1].toString().endsWith(".gz")
@@ -121,6 +133,15 @@ workflow PREPARE_REFERENCES {
.mix(ch_vcfanno_bgzip, ch_vcfanno_index)
.collect()
.set{ch_vcfanno_extra}
+
+// Pad bed file
+BEDTOOLS_PAD_TARGET_BED(
+ch_target_bed,
+ch_fai.map { _meta, fai -> return fai }
+)
+TABIX_BGZIPINDEX_PADDED_BED(BEDTOOLS_PAD_TARGET_BED.out.bed).gz_tbi
+.set { ch_target_bed_gz_tbi }
+
// Generate bait and target intervals
GATK_BILT(ch_target_bed, ch_dict).interval_list
GATK_ILT(GATK_BILT.out.interval_list)
@@ -163,6 +184,8 @@ workflow PREPARE_REFERENCES {
ch_versions = ch_versions.mix(TABIX_BGZIPINDEX_VCFANNOEXTRA.out.versions)
ch_versions = ch_versions.mix(TABIX_VCFANNOEXTRA.out.versions)
ch_versions = ch_versions.mix(TABIX_DBSNP.out.versions)
+ch_versions = ch_versions.mix(BEDTOOLS_PAD_TARGET_BED.out.versions)
+ch_versions = ch_versions.mix(TABIX_BGZIPINDEX_PADDED_BED.out.versions)
ch_versions = ch_versions.mix(GATK_BILT.out.versions)
ch_versions = ch_versions.mix(GATK_ILT.out.versions)
ch_versions = ch_versions.mix(CAT_CAT_BAIT.out.versions)
@@ -190,10 +213,9 @@ workflow PREPARE_REFERENCES {
mtshift_fasta = GATK_SHIFTFASTA.out.shift_fa.collect() // channel: [ val(meta), path(fasta) ]
mtshift_bwa_index = ch_bwa_mtshift // channel: [ val(meta), path(index) ]
mtshift_bwamem2_index = BWAMEM2_INDEX_MT_SHIFT.out.index.collect() // channel: [ val(meta), path(index) ]
-
gnomad_af_idx = TABIX_GNOMAD_AF.out.tbi.collect() // channel: [ val(meta), path(fasta) ]
known_dbsnp_tbi = TABIX_DBSNP.out.tbi.collect() // channel: [ val(meta), path(fasta) ]
-target_bed = Channel.empty().mix(ch_tbi, ch_bgzip_tbi).collect() // channel: [ val(meta), path(bed), path(tbi) ]
+target_bed = ch_target_bed_gz_tbi.collect() // channel: [ val(meta), path(bed), path(tbi) ]
vcfanno_extra = ch_vcfanno_extra.ifEmpty([[]]) // channel: [ [path(vcf), path(tbi)] ]
bait_intervals = CAT_CAT_BAIT.out.file_out.map{ meta, inter -> inter}.collect().ifEmpty([[]]) // channel: [ path(intervals) ]
target_intervals = GATK_BILT.out.interval_list.map{ meta, inter -> inter}.collect() // channel: [ path(interval_list) ]
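
Note on the `join` added for gzipped input: Nextflow's `join` pairs items from two channels that share a key, by default the first element. Each `[meta, bed]` tuple is therefore combined with the `[meta, tbi]` tuple for the same `meta`, which gives the `[ val(meta), path(bed), path(tbi) ]` shape documented for `target_bed`. A rough Python analogue follows, assuming a plain string stands in for `meta` and lists stand in for channels; `join_by_key` is a hypothetical name.

```python
def join_by_key(left, right):
    """Inner-join two lists of tuples on their first element.

    The key appears once in the result, followed by the remaining
    elements of the left tuple and then those of the right tuple.
    Items without a partner on the other side are dropped.
    """
    right_by_key = {item[0]: item[1:] for item in right}
    return [item + right_by_key[item[0]] for item in left if item[0] in right_by_key]


if __name__ == "__main__":
    bed = [("targets", "targets.bed.gz")]
    tbi = [("targets", "targets.bed.gz.tbi")]
    print(join_by_key(bed, tbi))
```

Unlike a real channel, a Python list is ordered and fully materialised, so this only shows how the tuples are paired.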
4 changes: 3 additions & 1 deletion subworkflows/local/utils_nfcore_raredisease_pipeline/main.nf
@@ -302,6 +302,7 @@ def toolCitationText() {
]
other_citation_text = [
"BCFtools (Danecek et al., 2021),",
+"BEDTools (Quinlan & Hall, 2010),",
"GATK (McKenna et al., 2010),",
"MultiQC (Ewels et al. 2016),",
params.skip_peddy ? "" : "Peddy (Pedersen & Quinlan, 2017),",
@@ -432,7 +433,8 @@ def toolBibliographyText() {
params.run_rtgvcfeval ? "<li>Cleary, J. G., Braithwaite, R., Gaastra, K., Hilbush, B. S., Inglis, S., Irvine, S. A., Jackson, A., Littin, R., Rathod, M., Ware, D., Zook, J. M., Trigg, L., & Vega, F. M. D. L. (2015). Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines (p. 023754). bioRxiv. https://doi.org/10.1101/023754</li>" : "",
"<li>Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., & 1000 Genome Project Data Processing Subgroup. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078–2079. https://doi.org/10.1093/bioinformatics/btp352</li>",
(!params.skip_smncopynumbercaller && params.analysis_type.equals("wgs")) ? "<li>Chen, X., Sanchis-Juan, A., French, C. E., Connell, A. J., Delon, I., Kingsbury, Z., Chawla, A., Halpern, A. L., Taft, R. J., Bentley, D. R., Butchbach, M. E. R., Raymond, F. L., & Eberle, M. A. (2020). Spinal muscular atrophy diagnosis and carrier screening from genome sequencing data. Genetics in Medicine, 22(5), 945–953. https://doi.org/10.1038/s41436-020-0754-0</li>" : "",
-"<li>Li, H. (2011). Tabix: Fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics, 27(5), 718–719. https://doi.org/10.1093/bioinformatics/btq671</li>"
+"<li>Li, H. (2011). Tabix: Fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics, 27(5), 718–719. https://doi.org/10.1093/bioinformatics/btq671</li>",
+"<li>Quinlan, A. R., & Hall, I. M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26(6), 841–842. https://doi.org/10.1093/bioinformatics/btq033</li>"
]

def concat_text = align_text +