Download SRA and convert to fastq files automatically

rodrigopsav/sraDownloader

sraDownloader

User manual and guide

sraDownloader uses sratoolkit to download SRA files and convert them into fastq files, and trimmomatic to separate the paired and unpaired reads. Trimmomatic is used only for this separation; it does not trim adapters in this pipeline.

Citation
Download sraDownloader
Install sraDownloader Dependencies
Using conda environments without sraDownloader
Running sraDownloader
Killing sraDownloader
sraDownloader Example
Check log files

Citation

Savegnago, R. P. sraDownloader. 2021. GitHub repository, https://github.com/rodrigopsav/sraDownloader

Download sraDownloader

git clone https://github.com/rodrigopsav/sraDownloader.git 

Install sraDownloader Dependencies

sraDownloader dependencies are installed in a conda environment called sra. Before running sraDownloader, you MUST install the dependencies with the sraDownloader/install_sra_dependencies/install_sra_dependencies.sh script, even if the programs are already installed on your machine. To install the sraDownloader dependencies, run:

cd sraDownloader/install_sra_dependencies
./install_sra_dependencies.sh -d <directory/to/install/sraDownloader/dependencies>

Unfortunately, incompatibilities can occur when the most recent versions of the programs are installed. If you see errors after installing the sraDownloader dependencies, use the following commands to remove the previous installation and reinstall the dependencies with the versions originally used to develop sraDownloader:

conda env remove --name sra
./install_sra_dependencies_versions.sh -d <directory/to/install/sraDownloader/dependencies>

Using conda environments without sraDownloader

You can use all the installed programs without sraDownloader by activating the sra conda environment:

conda activate sra

Try fasterq-dump, prefetch and the other programs installed in this conda environment. To see the complete list of installed programs, type:

conda list
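
After activating the environment, a quick check that the expected programs are on the PATH might look like the sketch below. The helper name check_tools is hypothetical, and the executable name trimmomatic is an assumption about how the conda package exposes it; prefetch and fasterq-dump are the program names mentioned above.

```shell
# Hypothetical helper: report which of the given programs are on PATH.
# Returns non-zero if at least one program is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run after `conda activate sra`; the trimmomatic name is assumed):
# check_tools prefetch fasterq-dump trimmomatic
```

If any program is reported as missing, re-running the dependency installer described above is the first thing to try.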

Running sraDownloader

To run sraDownloader on a local machine, type:

./sraDownloader.sh -s 1 -l <sraList file> -n <number of parallel samples> -o <path/to/output/folder>

and to run on an HPCC with the slurm scheduler, type:

./sraDownloader.sh -s 2 -l <sraList file> -n <number of parallel samples> -o <path/to/output/folder>
   -s: submission system (1: local server; 2: HPCC with slurm)
   -l: path to the list of SRA run accessions
   -n: number of samples to run in parallel (default=5; does not work with HPCC)
   -o: output directory
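
Before submitting, it can help to sanity-check the file passed to -l. The sketch below assumes the list holds one run accession per line, which this manual does not state explicitly, and relies on SRA run accessions starting with SRR, ERR or DRR followed by digits. The function name check_sra_list is hypothetical and is not part of sraDownloader.

```shell
# Hypothetical helper: print entries of a list file that do not look like
# SRA run accessions (SRR, ERR or DRR followed by digits).
# Assumes one accession per line; blank lines are ignored.
# Returns non-zero if any unexpected entry is found.
check_sra_list() {
    bad=$(grep -v -E '^[[:space:]]*$' "$1" | grep -v -E '^[SED]RR[0-9]+$' || true)
    if [ -n "$bad" ]; then
        echo "Unexpected entries in $1:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}

# Example:
# check_sra_list sraList/exampleSRA.txt
```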

Killing sraDownloader

At the end of each submission, sraDownloader prints a message showing how to kill the analysis, like this:

# sraDownloader running in a local server
To kill this SRA analysis, run: kill -- 1463

# sraDownloader running on HPCC with slurm
for job in $(squeue -u $USER | grep 463739 | awk '{print $1}'); do scancel $job; done

NOTE: each analysis shows a different number (like 1463 and 463739 in the example above). Save these command lines somewhere until the analysis finishes, in case you want to stop it earlier.

sraDownloader Example

On a local machine, type:

./sraDownloader.sh -s 1 -l sraList/exampleSRA.txt -n 10 -o ~/outSRA

and to run on an HPCC with the slurm scheduler, type:

./sraDownloader.sh -s 2 -l sraList/exampleSRA.txt -o ~/outSRA

Check log files

When sraDownloader finishes the jobs, go to the samples subfolder of the output folder:

cd ~/outSRA/samples

There are three files:

  • sra_list.txt
  • sra_download_convert_successful.txt
  • sra_download_convert_failed.txt

Failed samples are listed in sra_download_convert_failed.txt. Suppose that sample ERR1742684 failed. To get more details about it, open its log file:

less ~/outSRA/log/ERR1742684.txt
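
For a run with many samples, a short summary of the two result files can be handy. The sketch below assumes each file lists one sample per line, which this manual does not confirm. The function names count_lines and summarize_sra_run are hypothetical and are not part of sraDownloader.

```shell
# Hypothetical helper: count non-empty lines; print 0 if the file is absent.
count_lines() {
    if [ -f "$1" ]; then
        grep -c . "$1" || true
    else
        echo 0
    fi
}

# Hypothetical helper: summarize a finished run.
# $1 is the output folder that was passed to -o.
# Assumes one sample per line in each result file.
summarize_sra_run() {
    samples_dir="$1/samples"
    ok=$(count_lines "$samples_dir/sra_download_convert_successful.txt")
    failed=$(count_lines "$samples_dir/sra_download_convert_failed.txt")
    echo "successful: $ok"
    echo "failed: $failed"
}

# Example:
# summarize_sra_run ~/outSRA
```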
