This is a repository for the Snakemake version of the bash RNASeq pipeline, compatible with Clemson University's Center for Human Genetics (CUCHG) High Performance Computing (HPC) cluster.
- slurm/config.yaml: config file for HPC architecture and slurm compatibility
- snakemake_submitter.sh: activates the conda environment and submits the snakemake job to slurm
- initiator.sh: sets up the directory and launches snakemake_submitter.sh
- Snakefile: the pipeline
- RNASeq.yaml: environment variables for the pipeline
If you use this pipeline, please cite the following:
- COBRE Grant (P20 GM139767) for support for use of Clemson University Center for Human Genetics Research Core facilities
- Clemson University Center for Human Genetics Bioinformatics and Statistics Core
Only install the following if you are not running the pipeline on CUCHG's HPC:
- Anaconda3/miniconda3
- snakemake
- fastp
- java_jdk/>=1.8
- bbmap/>=38.73
- gmap_gsnap
- SalmonTE/>=0.4
- samtools/>=1.10
- subread/>=1.6.4
- slurm
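If you are installing the dependencies yourself, most of these tools are commonly distributed through the conda-forge and bioconda channels. The sketch below is one hypothetical way to set up an environment; the exact package names (e.g. gmap for gmap_gsnap) and channel availability are assumptions, and SalmonTE is typically installed separately from its GitHub repository:

```shell
# Hypothetical environment setup -- package and channel names are assumptions,
# not CUCHG-specific instructions
conda create -n rnaseq -c conda-forge -c bioconda \
    snakemake fastp "bbmap>=38.73" gmap "samtools>=1.10" "subread>=1.6.4"
conda activate rnaseq
```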
In general, fill in the information enclosed by "<>" in the files below:
- slurm/config.yaml: fill in all "<>" placeholders
- RNASeq.yaml: fill out all information except EXT
- snakemake_submitter.sh:
- sbatch parameters:
- add job name (string)
- partition name (string)
- time (in Hr:Min:Sec format)
- output and error (add path to working directory, same as DEST from RNASeq.yaml, but leave the /log... parts unchanged)
- mail-user (add user email address)
- cd line: add path to working directory (same as DEST from RNASeq.yaml)
- source line: add path to the conda initiation script (conda.sh) to select the correct conda
- conda activate line: add the name of the environment with a working snakemake installation (on Secretariat it is "snakemake")
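As an illustration, the top of snakemake_submitter.sh might look like the following once filled in. All concrete values here are hypothetical placeholders, not CUCHG-specific settings (%j is slurm's substitution for the job ID):

```shell
#!/bin/bash
#SBATCH --job-name=rnaseq_run               # job name (string; hypothetical)
#SBATCH --partition=<partition name>        # partition name (string)
#SBATCH --time=72:00:00                     # time in Hr:Min:Sec (hypothetical)
#SBATCH --output=<DEST>/log/output_%j.txt   # working directory + /log..., unchanged
#SBATCH --error=<DEST>/log/error_%j.txt
#SBATCH --mail-user=<user email address>

cd <DEST>                      # working directory (same as DEST from RNASeq.yaml)
source <path to conda.sh>      # conda initiation script
conda activate snakemake       # environment with a working snakemake installation
```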
- Open an ssh shell (using MobaXterm or PuTTY) on the head/master/login node
- Make a working directory for the analysis and git clone this repository:
git clone https://github.com/chg-bsl/snakemake_rnaseq.git
- Copy Snakefile, snakemake_submitter.sh, RNASeq.yaml, slurm/config.yaml and initiator.sh to working directory
- Make sure the variables encompassed by "<>" in slurm/config.yaml, RNASeq.yaml and snakemake_submitter.sh have been modified to reflect info specific to your run (eg: working directory, raw data location, etc)
- Open an ssh shell and run:
Generate the DAG figure:

##Initialize the correct conda and bring conda into the bash environment
source <path to conda initialization script>
##Activate the conda environment containing the snakemake installation
conda activate <snakemake conda environment>
cd <working directory containing analysis pipeline files>
snakemake -n -p -s Snakefile --configfile RNASeq.yaml --profile slurm --dag | dot | display

Generate the workflow:

snakemake -n -p -s Snakefile --configfile RNASeq.yaml --profile slurm
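If no X11 display is available in your ssh session, the DAG can be written to an image file instead and copied to your local machine. This is a sketch that assumes graphviz's dot is on your PATH:

```shell
# Render the DAG to an SVG file instead of displaying it interactively
snakemake -n -p -s Snakefile --configfile RNASeq.yaml --profile slurm --dag | dot -Tsvg > dag.svg
```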
If step 5 in the test run (the Generate the DAG figure and Generate the workflow commands) did not produce any errors (red text), run:
./initiator.sh
There are three places to check for progress:
- squeue (e.g. squeue -u <username>): lists your queued and running jobs on the cluster
- This pipeline (when run successfully) will create log and logs_slurm directories within the working directory. In the log directory, look for output_<job_ID>.txt and error_<job_ID>.txt for the current status of the run. When the run is successful, the last line should read x of x steps (100%) done.
- In the logs_slurm directory, the most recent log files, whose names include specific rule names, show the current status of each rule.
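The completion check described above can be scripted. This sketch demonstrates the pattern on a sample last line (in a real run you would read the last line of log/output_<job_ID>.txt; the step count here is hypothetical):

```shell
# Sample last line in the form the pipeline writes on success
last_line="42 of 42 steps (100%) done"

# grep -q exits 0 only if the pattern is found
if printf '%s\n' "$last_line" | grep -q "steps (100%) done"; then
    echo "run complete"
else
    echo "still running or failed"
fi
```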
- To do: add a link to an image showing what the directory should look like
- To do: create a rule that greps "module add" lines to summarize session info