Prymetime is a de novo genome assembly pipeline that uses long reads from Oxford Nanopore Technologies and PacBio and short reads from Illumina. It was designed to produce high-quality genome assemblies from engineered yeast and bacteria strains. Prymetime relies on the long-read de novo assembler Flye for linear contigs and the hybrid assembler Unicycler for circular contigs. Prymetime now also supports assembly from long reads only or short reads only.
All software requirements for Prymetime have been packaged together into a Docker image. Docker is freely available here: https://hub.docker.com/search?offering=community&type=edition
Although it is possible to run the Prymetime Docker image on a desktop computer, we strongly recommend running the pipeline on a server. At the recommended 40X genome coverage for nanopore and Illumina reads, the memory requirements of Flye and Unicycler are likely to exceed what a typical desktop computer can provide.
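To check whether a read library is near the recommended 40X, coverage can be approximated as total sequenced bases divided by genome size. The sketch below is illustrative only: estimate_coverage is a hypothetical helper (not part of Prymetime), it assumes an uncompressed FASTQ with exactly four lines per record, and the genome size must be supplied by the user.

```shell
#!/bin/sh
# Estimate sequencing depth: total bases in a FASTQ divided by genome size.
# Assumes an uncompressed FASTQ with exactly four lines per record.
estimate_coverage() {
    fastq_path="$1"      # path to an uncompressed FASTQ file
    genome_size="$2"     # expected genome size in bases (user-supplied)
    awk -v g="$genome_size" '
        NR % 4 == 2 { bases += length($0) }   # second line of each record is the sequence
        END { printf "%.1f\n", bases / g }    # depth, rounded to one decimal place
    ' "$fastq_path"
}

# Example call (hypothetical file name; 12100000 is an approximate
# Saccharomyces cerevisiae genome size in bases):
# estimate_coverage my_long.fastq 12100000
```

For paired-end Illumina data, the totals of both mate files would need to be added together before dividing.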
Additionally, this Docker image can be wrapped in a Singularity image to run on HPCs for ease of use.
Build Singularity image
singularity build prymetime docker://sjtrauber/prymetime:v2
Run Prymetime assembly pipeline
singularity run \
-B ~/path/to/input:/input \
-B ~/path/to/output:/output \
~/prymetime \
-long ~/path/to/nanopore.fastq \
-illumina_1 ~/path/to/illumina_1.fastq \
-illumina_2 ~/path/to/illumina_2.fastq \
-outdir ~/path/to/output \
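-preferred_assembly short \
-read_type <type> \
-eng_sig ~/path/to/bacterial_signatures.fna \
-ref_genome ~/path/to/output/GCF_001456255.1.fna
Notes on the options (a comment cannot follow a line-continuation backslash, so they are listed here instead of inline):
-preferred_assembly short indicates a bacterial assembly; remove it for yeast.
-read_type <type> is needed only when using long reads alone (long), short reads alone (short), or a pre-assembled genome (assembly) for eng_sig identification.
-eng_sig and -ref_genome are optional.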
The final genome assembly will be the my_directory_final.fasta file.
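In a shell, a comment placed after a trailing backslash ends the command at that line, so per-option notes cannot sit inline in a multi-line command. One way to keep them is to build the argument list step by step. The following is a dry-run sketch under stated assumptions: prymetime_cmd, ENG_SIG, and REF_GENOME are hypothetical names introduced here, and the paths are the placeholders from the command above.

```shell
#!/bin/sh
# Dry-run sketch: add the Prymetime options one per line so each can carry
# a comment, then print the full command instead of executing it.
prymetime_cmd() {
    set -- -long ~/path/to/nanopore.fastq                # long reads
    set -- "$@" -illumina_1 ~/path/to/illumina_1.fastq   # short reads, mate 1
    set -- "$@" -illumina_2 ~/path/to/illumina_2.fastq   # short reads, mate 2
    set -- "$@" -outdir ~/path/to/output
    set -- "$@" -preferred_assembly short                # bacterial assembly; drop for yeast

    # Optional inputs, added only when the hypothetical variables are set.
    if [ -n "${ENG_SIG:-}" ]; then set -- "$@" -eng_sig "$ENG_SIG"; fi
    if [ -n "${REF_GENOME:-}" ]; then set -- "$@" -ref_genome "$REF_GENOME"; fi

    # Print the command for review; delete the leading "echo" to execute it.
    echo singularity run \
        -B ~/path/to/input:/input \
        -B ~/path/to/output:/output \
        ~/prymetime "$@"
}

prymetime_cmd
```

Printing the command first makes it easy to confirm that every option ended up on the same command line before starting a long assembly.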
Download the Prymetime repository
git clone https://github.com/emyounglab/prymetime.git
Build Docker image
docker build --tag prymetimev2 prymetime
Building the image takes around one hour on a desktop computer.
Mount a directory with the -v flag. The path before the : must be an absolute path to a file or directory on the host, and the path after the : is where it will be mounted inside the container.
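Because the host side of a mount must be absolute, a relative path can be resolved before it is handed to -v. The sketch below shows one way to do that; mount_arg is a hypothetical helper name, and it assumes the realpath utility (GNU coreutils) is installed.

```shell
#!/bin/sh
# Build a "host:container" mount specification with an absolute host path.
# mount_arg is a hypothetical helper; realpath is assumed to be available.
mount_arg() {
    # Resolve relative components and symlinks; fail if realpath fails.
    resolved=$(realpath "$1") || return 1
    printf '%s:%s\n' "$resolved" "$2"
}

# Example usage (the rest of the command is omitted here):
# docker run -it --rm -v "$(mount_arg ./reads /input)" ...
```

Quoting the command substitution keeps host paths that contain spaces intact.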
Run Prymetime assembly pipeline
docker run -it --rm \
-v /path/to/input_dir:/input \
-v /path/to/output_dir:/output \
prymetimev2 \
-long /input/my_long.fastq \
-illumina_1 /input/my_illumina_1.fastq \
-illumina_2 /input/my_illumina_2.fastq \
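-outdir /output/my_directory \
-preferred_assembly short \
-read_type <type> \
-eng_sig /input/bacterial_signatures.fna \
-ref_genome /output/GCF_001456255.1.fna
Notes on the options (a comment cannot follow a line-continuation backslash, so they are listed here instead of inline):
-preferred_assembly short indicates a bacterial assembly; remove it for yeast.
-read_type <type> is needed only when using long reads alone (long), short reads alone (short), or a pre-assembled genome (assembly) for eng_sig identification.
-eng_sig and -ref_genome are optional. Their files must be in a mounted directory, because the container cannot see other host paths.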
The final genome assembly will be the my_directory_final.fasta file.
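Once a run finishes, a quick sanity check is to count the sequences and bases in the final FASTA file. This is a minimal sketch: fasta_summary is a hypothetical helper that is not part of Prymetime, and the path in the example comment is a placeholder for wherever the file appears under your mounted output directory.

```shell
#!/bin/sh
# Summarize a FASTA file: number of sequences and total bases.
# fasta_summary is a hypothetical helper, not part of Prymetime.
fasta_summary() {
    awk '
        /^>/ { n++; next }          # each header line marks one sequence
        { bases += length($0) }     # every other line holds sequence characters
        END { printf "%d sequences, %d bases\n", n, bases }
    ' "$1"
}

# Example call (placeholder path):
# fasta_summary /path/to/my_directory_final.fasta
```

A total far from the expected genome size, or an unexpectedly large number of sequences, is a hint that the assembly deserves a closer look.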
The -eng_sig option will also produce a PDF displaying the engineering signatures that were found in the genome assembly.
The eng_sig_felix.fasta file (provided in the PRYMETIME folder) contains all engineering signatures used in this study.
The entrypoint script can be overridden for debugging using the --entrypoint argument to docker run. Using /bin/bash as the entrypoint starts an interactive shell when the Docker image is run. Here is an example:
docker run -it --rm \
-v $(realpath ../data):/input \
-v $(realpath output):/output \
--entrypoint /bin/bash \
prymetimev2
The run time of Prymetime depends heavily on the computer or server used and on the size of the read libraries. On a desktop computer with a small read library at 10X genome coverage, Prymetime took approximately 7 hours.
Prymetime utilizes several software packages, including the Flye and Unicycler assemblers described above.
The Prymetime publication is available at: https://www.nature.com/articles/s41467-021-21656-9