Sunbeam: a robust, extensible metagenomic sequencing pipeline

Sunbeam is a pipeline written in Snakemake that simplifies and automates many of the steps in metagenomic sequencing analysis. It uses conda to manage dependencies, so it has no pre-existing dependencies, does not require admin privileges, and can be deployed on most Linux workstations and clusters.

Sunbeam currently automates the following tasks:

  • Quality control, including adaptor trimming, host read removal, and quality filtering;
  • Taxonomic assignment of reads to databases using Kraken;
  • Assembly of reads into contigs using Megahit;
  • Contig annotation using BLAST[n/p/x];
  • Mapping of reads to target genomes; and
  • ORF prediction using Prodigal.

Sunbeam was designed to be modular and extensible. Some extensions have been built for:

  • IGV for viewing read alignments
  • KrakenHLL, an alternate read classifier
  • Kaiju, a read classifier that uses Burrows-Wheeler transform matching against protein sequences rather than k-mers
  • Anvi'o, a downstream platform for analyzing and visualizing 'omics data

More extensions can be found at the extension page: https://www.sunbeam-labs.org/

To get started, see our documentation!


Changelog:

v2.0.0 (January 22, 2019)

  • Start a project with data pulled directly from the SRA using sunbeam init --data_acc [SRA ###]. For more information, see the docs.
  • New extension website: https://www.sunbeam-labs.org/
  • Improved documentation
  • Numerous bugfixes and optimizations

v1.2.1 (May 24, 2018)

  • Minor bugfixes

v1.2.0 (May 2, 2018)

  • Low-complexity reads are now removed by default rather than masked
  • Bug fixes related to single-end sequencing experiments
  • Documentation updates
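
The difference between masking and removing low-complexity reads can be illustrated with a small Python sketch. This is not Sunbeam's or komplexity's actual implementation: the scoring function (fraction of distinct k-mers), the values k=4 and threshold=0.55, and all function names are assumptions chosen for illustration, and masking is simplified here to blanking out the whole read.

```python
from typing import List, Tuple

Read = Tuple[str, str]  # (read_id, sequence)


def kmer_complexity(seq: str, k: int = 4) -> float:
    """Fraction of distinct k-mers among all k-mers in seq (hypothetical score)."""
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        return 0.0
    distinct = {seq[i:i + k] for i in range(n_kmers)}
    return len(distinct) / n_kmers


def mask_low_complexity(reads: List[Read], k: int = 4,
                        threshold: float = 0.55) -> List[Read]:
    """Keep every read, but replace low-complexity sequences with N characters."""
    return [
        (rid, seq if kmer_complexity(seq, k) >= threshold else "N" * len(seq))
        for rid, seq in reads
    ]


def remove_low_complexity(reads: List[Read], k: int = 4,
                          threshold: float = 0.55) -> List[Read]:
    """Drop low-complexity reads entirely."""
    return [
        (rid, seq) for rid, seq in reads
        if kmer_complexity(seq, k) >= threshold
    ]


# Toy input: one ordinary read and one homopolymer read.
reads = [
    ("r1", "ACGTTGCAAGCTTAGGCTAA"),
    ("r2", "AAAAAAAAAAAAAAAAAAAA"),
]

masked = mask_low_complexity(reads)     # both reads kept, r2 becomes all N
removed = remove_low_complexity(reads)  # only r1 is kept
```

With masking, downstream steps still see the same number of reads; with removal, read counts drop, which is worth keeping in mind when comparing reports produced before and after v1.2.0.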

v1.1.0 (April 8, 2018)

  • Reports include number of filtered reads per host, rather than in aggregate
  • Static binary dependency for komplexity for easier deployment
  • Remove max length filter for contigs

v1.0.0 (March 22, 2018)

  • First stable release!
  • Support for single-end sequencing experiments
  • Low-complexity read masking via komplexity
  • Support for extensions
  • Documentation on ReadTheDocs.io
  • Better assembler (megahit)
  • Better ORF finder (prodigal)
  • Can remove reads from any number of host/contaminant genomes
  • Semantic versioning checks
  • Integration tests and continuous deployment
