Preprocessing steps for RNA-seq reads and abundance quantification

This is a Snakemake workflow to:

Check the quality of RNA-seq reads using FastQC.
Trim low-quality bases and adapter sequences using Trim Galore.
Assess the post-trimming quality of reads using Trim Galore.
Quantify transcript abundances using Kallisto.
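
As a rough illustration of the steps above, the sketch below builds the command line each stage might run for one paired-end sample. It is not taken from the Snakefile: the function name `stage_commands`, the `reads_dir` argument, and the thread count are hypothetical, and the output folders and index path are simply the ones used elsewhere in this README. It relies on Trim Galore's default naming of paired outputs (`_val_1.fq.gz` and `_val_2.fq.gz`). The `--fastqc` flag asks Trim Galore to run FastQC on the trimmed reads, which would cover the post-trimming quality check.

```python
from pathlib import Path


def stage_commands(sample, reads_dir="reads", threads=8):
    """Return the commands for one paired-end sample, one list per stage.

    Hypothetical helper: reads_dir and threads are assumptions, and the
    actual Snakefile rules may use different options.
    """
    r1 = (Path(reads_dir) / f"{sample}_1.fastq.gz").as_posix()
    r2 = (Path(reads_dir) / f"{sample}_2.fastq.gz").as_posix()
    trim_dir = "result/2.trimming"
    # Trim Galore names validated paired outputs *_val_1.fq.gz / *_val_2.fq.gz
    t1 = f"{trim_dir}/{sample}_1_val_1.fq.gz"
    t2 = f"{trim_dir}/{sample}_2_val_2.fq.gz"
    return {
        # Quality report on the raw reads
        "fastqc": ["fastqc", "--outdir", "result/1.fastqc", r1, r2],
        # Adapter and quality trimming, then FastQC on the trimmed reads
        "trim_galore": ["trim_galore", "--paired", "--fastqc",
                        "--output_dir", trim_dir, r1, r2],
        # Pseudoalignment and quantification against the cDNA index
        "kallisto": ["kallisto", "quant",
                     "-i", "cDNA/Homo_sapiens.GRCh38.cdna.idx",
                     "-o", f"result/3.kallisto/{sample}",
                     "-t", str(threads), t1, t2],
    }


# Print the commands only; nothing is executed here.
for stage, cmd in stage_commands("sample_name").items():
    print(f"{stage}: {' '.join(cmd)}")
```

Printing the commands rather than running them keeps the sketch safe to try before any tool is installed.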

Installation and requirements

This pipeline requires Snakemake, FastQC v0.11.9, Trim Galore v0.6.10, and Kallisto v0.50.1.
If they are not already installed, run the following commands:

# Clone the repository
git clone https://github.com/Ahmedbargheet/Snakemake_RNA_seq.git
cd Snakemake_RNA_seq

# Install Snakemake in a conda environment
conda env create --file envs/env_snakemake.yml

# Alternatively, create the environment manually (Bioconda packages also need the conda-forge channel):
conda create -n snakemake_env -c conda-forge -c bioconda snakemake fastqc=0.11.9 trim-galore=0.6.10 kallisto=0.50.1
conda activate snakemake_env
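
Once the environment is active, a quick way to confirm that the tools are reachable is to look them up on PATH. The following is a minimal sketch, not part of the repository; the helper name `missing_tools` is hypothetical, and the executable names are the usual ones installed by the conda packages above.

```python
import shutil


def missing_tools(tools=("snakemake", "fastqc", "trim_galore", "kallisto")):
    """Return the executables from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools are available.")
```

This only checks that each executable can be found; it does not verify the installed versions.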

Additionally, the human transcriptome (cDNA) must be downloaded from Ensembl.
Use the following steps to download and index it:

mkdir cDNA
cd cDNA
wget https://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh38.cdna.all.fa.gz
gunzip Homo_sapiens.GRCh38.cdna.all.fa.gz
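# Build the index from the decompressed FASTA, using paths relative to cDNA/
kallisto index -i Homo_sapiens.GRCh38.cdna.idx Homo_sapiens.GRCh38.cdna.all.fa -t 16
# Return to the repository root
cd ..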

Overview of the pipeline

How to run the Snakemake pipeline

In the Snakefile, you will find the samples variable. Replace ["sample_name"] with your actual sample names. The pipeline is designed to work with paired-end files named {sample}_1.fastq.gz and {sample}_2.fastq.gz.
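
If you have many samples, the list can be derived from the file names instead of typed by hand. The sketch below is an illustration only: the helper `find_paired_samples` and the `reads` directory are hypothetical, and it uses just the standard library. It keeps a sample only when both the `_1` and `_2` files are present.

```python
from pathlib import Path


def find_paired_samples(reads_dir):
    """Return sorted sample names that have both _1 and _2 FASTQ files."""
    reads_dir = Path(reads_dir)
    if not reads_dir.is_dir():
        return []
    suffix = "_1.fastq.gz"
    samples = []
    for r1 in reads_dir.glob(f"*{suffix}"):
        # Strip the suffix to recover the {sample} part of the file name
        sample = r1.name[: -len(suffix)]
        if (reads_dir / f"{sample}_2.fastq.gz").is_file():
            samples.append(sample)
    return sorted(samples)


# "reads" is a placeholder directory; point it at your FASTQ files.
samples = find_paired_samples("reads")
print(samples)
```

Inside a Snakefile, Snakemake's built-in `glob_wildcards` function offers a similar way to collect wildcard values from file names.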

# run the pipeline
mkdir -p result/1.fastqc/
mkdir -p result/2.trimming/
mkdir -p result/3.kallisto/
snakemake --cores 8
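
After the run finishes, it can be useful to confirm that quantification completed for every sample. Kallisto writes an `abundance.tsv` file into its output directory; the sketch below assumes one subfolder per sample under result/3.kallisto/, which is a guess about the layout rather than something taken from the Snakefile. The helper name `samples_missing_abundance` is hypothetical.

```python
from pathlib import Path


def samples_missing_abundance(samples, kallisto_dir="result/3.kallisto"):
    """Return samples without an abundance.tsv under kallisto_dir/<sample>/.

    The per-sample subfolder layout is an assumption, not taken from
    the Snakefile.
    """
    return [
        sample for sample in samples
        if not (Path(kallisto_dir) / sample / "abundance.tsv").is_file()
    ]


unfinished = samples_missing_abundance(["sample_name"])
print("Samples still lacking abundance.tsv:", unfinished)
```

An empty list means every listed sample has a Kallisto abundance table in the assumed location.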
