chg-bsl/snakemake_rnaseq

Description

This is the repository for the Snakemake version of the bash RNASeq pipeline, compatible with Clemson University's Center for Human Genetics (CUCHG) High Performance Computing (HPC) cluster.

  • slurm/config.yaml: config file for HPC architecture and slurm compatibility
  • snakemake_submitter.sh: activates the conda environment and submits the Snakemake job to the slurm scheduler
  • initiator.sh: sets up the working directory and launches snakemake_submitter.sh
  • Snakefile: the pipeline
  • RNASeq.yaml: configuration variables for the pipeline
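
A cloned checkout containing the files above would look roughly like this (sketch only; config.yaml sits under slurm/ as noted above):

```
snakemake_rnaseq/
├── Snakefile
├── RNASeq.yaml
├── initiator.sh
├── snakemake_submitter.sh
└── slurm/
    └── config.yaml
```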

Citation

If you use this pipeline, please cite the following:

  • COBRE Grant (P20 GM139767) for support for use of Clemson University Center for Human Genetics Research Core facilities
  • Clemson University Center for Human Genetics Bioinformatics and Statistics Core

Prerequisites

Only install the prerequisites if you are not running the pipeline on CUCHG's HPC.
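
The one hard requirement named later in this README is a conda environment with a working Snakemake installation. A minimal sketch, assuming the conda-forge and bioconda channels (the environment name "snakemake" matches the one used on Secretariat, but any name works):

```shell
## Create and activate a conda environment containing Snakemake
## (channels and environment name are assumptions, not part of this repo)
conda create -n snakemake -c conda-forge -c bioconda snakemake
conda activate snakemake
## Confirm the installation
snakemake --version
```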

Instructions

Adding user- and project-specific information

Generally, replace the information enclosed in "<>" in the files below:

  • slurm/config.yaml:
    • add max number of jobs (integer)
    • partition name (string)
    • max number of cpus (integer) and max RAM (integer in MB): contact systems administrators for these values and do not edit once established
  • RNASeq.yaml: fill out all information except EXT
  • snakemake_submitter.sh:
    • sbatch parameters:
      • add job name (string)
      • partition name (string)
      • time (in Hr:Min:Sec format)
      • output and error (add path to working directory, same as DEST from RNASeq.yaml, but leave the /log... parts unchanged)
      • mail-user (add user email address)
    • cd line: add path to working directory (same as DEST from RNASeq.yaml)
    • source line: add path to conda initiation script (conda.sh) to choose the right conda
    • conda activate line: add the name of the environment with a working snakemake installation (on Secretariat it is "snakemake")
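
Taken together, the snakemake_submitter.sh edits above might look like the following sketch. Every value is a placeholder, not CUCHG's actual settings; the output/error paths assume the log/output_<job_ID>.txt naming described under "Tracking progress":

```shell
#!/bin/bash
## Example sbatch header for snakemake_submitter.sh -- all values are placeholders
#SBATCH --job-name=<job name>
#SBATCH --partition=<partition name>
#SBATCH --time=72:00:00
#SBATCH --output=<DEST>/log/output_%j.txt
#SBATCH --error=<DEST>/log/error_%j.txt
#SBATCH --mail-user=<user email>

cd <working directory>         ## same as DEST from RNASeq.yaml
source <path to conda.sh>      ## choose the right conda
conda activate snakemake       ## env with a working Snakemake installation
```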

How to run (don't skip the previous step!)

I. Test run (head/master/login node)

  1. Open an SSH session (using MobaXterm or PuTTY) on the head/master/login node
  2. Make a working directory for the analysis and git clone this repository:
    git clone https://github.com/chg-bsl/snakemake_rnaseq.git
    
  3. Copy Snakefile, snakemake_submitter.sh, RNASeq.yaml, slurm/config.yaml and initiator.sh to working directory
  4. Make sure the variables enclosed in "<>" in slurm/config.yaml, RNASeq.yaml and snakemake_submitter.sh have been modified to reflect information specific to your run (e.g. working directory, raw data location, etc.)
  5. Open an SSH session and run:
    ##Initialize the correct conda and bring conda into bash environment
    source <path to conda initialization script>
    ##Activate the correct conda environment containing snakemake installation
    conda activate <snakemake conda environment>
    
    cd <working directory containing analysis pipeline files>
    
    ##Generate the DAG figure (requires graphviz; pipe --dag through dot)
    snakemake -n -p -s Snakefile --configfile RNASeq.yaml --profile slurm --dag | dot -Tpng > dag.png
    
    ##Dry-run the workflow (-n) to check the planned jobs
    snakemake -n -p -s Snakefile --configfile RNASeq.yaml --profile slurm
    

II. Actual run (head/master/login node)

If step 5 of the test run (the DAG figure and dry-run commands) does not generate any errors (red text), run:

./initiator.sh

Tracking progress

There are three places to check for progress:

  1. squeue
  2. This pipeline (when run successfully) will create log and logs_slurm directories within the working directory. In the log directory, check output_<job_ID>.txt and error_<job_ID>.txt for the current status of the run. When the run is successful, the last line should read "x of x steps (100%) done".
  3. In the logs_slurm directory, the most recent log files, whose file names include the rule names, reflect the current status of each rule.
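
The completion check in point 2 can be scripted. This sketch fabricates a demo log directory, since the real log/output_<job_ID>.txt only exists after a run; the job ID and step counts are made up for illustration:

```shell
## Demo: check the newest output log for the "(100%) done" completion line.
## The log/ layout mirrors the description above; workdir is fabricated.
workdir=$(mktemp -d)
mkdir -p "$workdir/log"
printf '30 of 30 steps (100%%) done\n' > "$workdir/log/output_12345.txt"

## Pick the most recently modified output log and inspect its last line
latest=$(ls -t "$workdir"/log/output_*.txt | head -n 1)
if tail -n 1 "$latest" | grep -q '(100%) done'; then
    echo "run complete: $latest"
else
    echo "run still in progress (or failed): $latest"
fi
```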

Future additions

  • Make a link to image showing what the directory should look like
  • Create a rule that greps "module add" lines to summarize session info
