v1.1.0 - Ancient Aurora

@DLBPointon DLBPointon released this 09 Apr 09:00
· 178 commits to main since this release
dd32db5

[1.1.0] - Ancient Aurora - [2024-04-26]

The second release for sanger-tol, created with the nf-core template.

This builds on the initial release by adding subworkflows which generate Kmer-based coverage tracks and a Kmer spectra graph. There are also a number of updates to the logic used throughout the pipeline, as well as to the resources required by a significant number of modules.

What's Changed

  • Updates to the resource allocation methods used by a number of modules in the base.config.
  • Added a flag to stop the usage of Juicer.
  • Subworkflow to generate a kmer-based coverage track.
  • Subworkflow to generate/update a kmer spectra graph.
  • Subworkflow to use minimap2 for HiC mapping, if selected.
  • Subworkflow to use BWAmem2 for HiC mapping, if selected.
  • Subworkflow to ingest Pretext accessory files into the Pretext file, simplifying post-TreeVal data manipulation.
  • Updated the logic in use throughout the pipeline.
  • Updated the modules.config to include some of the logic, cleaning the code.
  • Updated the HiC subworkflow to include subsampling the HiC data for Juicer due to resource requirements with large amounts of data.
  • Updated the YAML_INPUT subworkflow, which now contains "flags" to change some software options.
  • Updated the data names in the input YAML to reduce confusion.
  • Updated software (Pretext{View, Snapshot, Graph}) to allow for use on large genomes with big data.
    • Added associated patch files and CPU architecture files.
  • Updated the minimap2 align module to remove samtools view in favour of paftools for our use case.
  • Updated the test.yml in line with the above changes.
  • Updated the SELFCOMP subworkflow to allow for the parallelisation of the work on large genomes.
  • Updated the READ_COVERAGE subworkflow to produce the scaffold-based AVG coverage and STND coverage.
  • Updated modules from nf-core; these updates mostly relate to module structure rather than software.
  • Updated the SummaryStats output to include HiC container counts.
  • Added -T / -t flags where possible to minimise the use of the /tmp directory.
  • Replaced CONCAT_MUMMER with CATCAT for simplicity.
  • Removed JUICER from the RAPID entry point.
  • Removed the CSI or TBI logic. CSI is now used by default, which simplifies the workflow and increases the capacity to handle much larger genomes. The logic block previously required was then moved.
  • Added NF-DOWNLOAD to the CI-CD due to an error that causes incomplete downloads when a number of images are downloaded at the same time.
  • Added the RAPID_TOL entry point which is more geared towards the requirements of Sanger.
  • Fixed a bug in build_alignment_blocks.py to avoid indexing errors on large genomes.
  • Changed the BEDGRAPH output from the EXTRACT_TELO module.
  • Fix to the telomere output.
  • Fix to the SelfComp scripts.
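
As background to the CSI-by-default change listed above: a TBI index can only address sequences up to 2^29 bases (512 Mbp), whereas a CSI index can be built with a deeper binning scheme for longer sequences. The Python sketch below illustrates that limit. It is not code from the pipeline; the function names `requires_csi` and `csi_depth` are hypothetical, and the default `min_shift` of 14 is assumed from the usual htslib default.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of TreeVal.

TBI_MAX_LENGTH = 2 ** 29  # 512 Mbp: longest sequence a TBI index can address


def requires_csi(max_scaffold_length: int) -> bool:
    """Return True when the longest scaffold exceeds the TBI coordinate limit."""
    return max_scaffold_length > TBI_MAX_LENGTH


def csi_depth(max_scaffold_length: int, min_shift: int = 14) -> int:
    """Smallest binning depth such that 2**(min_shift + 3*depth) covers the scaffold.

    min_shift=14 is assumed here as the common default for CSI indexes.
    """
    depth = 0
    span = 1 << min_shift  # bases covered by the smallest bin
    while span < max_scaffold_length:
        span <<= 3  # each extra level multiplies the covered span by 8
        depth += 1
    return depth


if __name__ == "__main__":
    longest = 1_200_000_000  # a hypothetical 1.2 Gbp scaffold
    print(requires_csi(longest))  # True: beyond the 512 Mbp TBI limit
    print(csi_depth(longest))     # 6
```

Always using CSI avoids having to inspect scaffold lengths before choosing an index format, which is consistent with the simplification described in the notes.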

New Contributors

  • @gq1 made their first contribution in #186

Full Changelog: 1.0.0...v1.1.0