I need to run TE-Aid in parallel but it causes errors because of using shared resources.
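I tried this command on an HPC cluster (it copies TE-Aid to a temporary directory for each process so that the processes don't use the same database), but it does not work for all processes:
GENOME="../aipysurus_laevis.polished.fa"
TEAID="/hpcfs/users/a1177955/local/TE-Aid/"
parallel --bar --jobs 3 -a fasta_list.txt "mkdir -p ./tmp/{#}/TE-Aid && mkdir -p ./tmp/{#}/output && cp -ar $TEAID/* ./tmp/{#}/TE-Aid/ && ln -sf $(realpath $GENOME) ./tmp/{#}/genome_file && ./tmp/{#}/TE-Aid/TE-Aid -q {} -g ./tmp/{#}/genome_file -o ./tmp/{#}/output && mv ./tmp/{#}/output/* ./" && rm -r ./tmp/
And this is what I got (it worked only for process 1 and gave errors for processes 2 and 3):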
0% 0:3=0s fasta_3.fa
query: fasta_2.fa
ref genome: ./tmp/2/genome_file
TE -> genome blastn e-value: 10e-8
full length min ratio: 0.9
hits transparency: 0.3
full length hits transparency: 0.9
no ORF detected, skipping blastp...
[1] "R: ploting genome blastn results and computing coverage..."
[1] "consensus length: 360 bp"
[1] "R: ploting self dot-plot and orf/protein hits..."
[1] "no orf to plot..."
null device
1
Done! The graph (.pdf) can be found in the output folder: ./tmp/2/output
Warning message:
In file(file, "rt") :
cannot open file './tmp/2/output/orftetable': No such file or directory
33% 1:2=31s fasta_3.fa
query: fasta_1.fa
ref genome: ./tmp/1/genome_file
TE -> genome blastn e-value: 10e-8
full length min ratio: 0.9
hits transparency: 0.3
full length hits transparency: 0.9
RepeatPeps is downloaded and formatted, blastp-ing...
[1] "R: ploting genome blastn results and computing coverage..."
[1] "consensus length: 1582 bp"
[1] "R: ploting self dot-plot and orf/protein hits..."
null device
1
Done! The graph (.pdf) can be found in the output folder: ./tmp/1/output
66% 2:1=11s fasta_3.fa
query: fasta_3.fa
ref genome: ./tmp/3/genome_file
TE -> genome blastn e-value: 10e-8
full length min ratio: 0.9
hits transparency: 0.3
full length hits transparency: 0.9
no ORF detected, skipping blastp...
[1] "R: ploting genome blastn results and computing coverage..."
[1] "consensus length: 541 bp"
[1] "R: ploting self dot-plot and orf/protein hits..."
[1] "no orf to plot..."
null device
1
Done! The graph (.pdf) can be found in the output folder: ./tmp/3/output
Warning message:
In file(file, "rt") :
cannot open file './tmp/3/output/orftetable': No such file or directory
100% 3:0=0s fasta_3.fa
Would you please let me know what the solution is?
Cheers,
Mani
First of all, as far as I know, TE-Aid wasn't made for running in parallel. The basic output of this tool is a PDF plot that you have to inspect manually, which is not feasible for a multitude of TEs. In other words, TE-Aid was designed to work with a specific consensus, to give an overview of its structure and genome representation.
Second, to maximize speed without running TE-Aid in parallel and to avoid potential collisions, you could just loop over your fastas with a bash script while using the same output folder. If your files and the corresponding fasta headers have different names, that should work fine, and you won't download/generate BLAST databases for each fasta. I haven't worked with X laevis, but for Danio, which has a genome two times smaller, it takes ~15 seconds to run TE-Aid once the databases are prepared, so it shouldn't be too bad for your clawed friend either. Anyhoo, I would just submit a bash script to your cluster that loops over your fastas:
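#!/usr/bin/env bash
#SBATCH parameters or whatever HPC control system you have
GENOME=/path/to/genome
# Run TE-Aid on one consensus at a time; all runs share one output folder
# (and therefore the same BLAST databases).
for fa in ./*.fasta
do
    TE-Aid -q "${fa}" -g "${GENOME}" -o output_folder
done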
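Since sharing one output folder relies on the fasta headers being distinct, it may be worth checking that before submitting the job. Below is a minimal sketch of such a check; `duplicate_fasta_headers` is a hypothetical helper name (not part of TE-Aid), and it assumes the sequence name is the first whitespace-delimited word of each header line.

```shell
#!/usr/bin/env bash
# duplicate_fasta_headers: hypothetical helper, not part of TE-Aid.
# Prints every sequence name that occurs more than once across the given FASTA files.
duplicate_fasta_headers() {
    # Collect header lines, keep the first word without the leading '>',
    # then report only the names that are repeated.
    grep -h '^>' "$@" | awk '{ sub(/^>/, "", $1); print $1 }' | sort | uniq -d
}
```

Calling `duplicate_fasta_headers ./*.fasta` prints nothing when all names are unique; any name it does print would need renaming before the loop is run.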
And thirdly, the formatting of the parallel command you wrote in your question is broken. That makes it harder to read and understand.
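For readability, the per-job steps of the parallel command in the question can be moved into a shell function, which GNU parallel can call once it has been exported with `export -f`. The following is only a sketch of that restructuring, assuming bash and GNU parallel; `run_teaid_job` is a hypothetical name, the paths are the ones from the question, and nothing here is claimed to fix the reported warnings.

```shell
#!/usr/bin/env bash
# Paths as given in the question; adjust to your own setup.
GENOME="../aipysurus_laevis.polished.fa"
TEAID="/hpcfs/users/a1177955/local/TE-Aid/"

# run_teaid_job is a hypothetical helper: $1 = job number, $2 = query FASTA.
run_teaid_job() {
    local job_dir="./tmp/$1" query=$2
    mkdir -p "${job_dir}/TE-Aid" "${job_dir}/output" || return 1
    # Private copy of TE-Aid for this job, as in the original command.
    cp -a "${TEAID%/}/." "${job_dir}/TE-Aid/" || return 1
    ln -sf "$(realpath "${GENOME}")" "${job_dir}/genome_file" || return 1
    "${job_dir}/TE-Aid/TE-Aid" -q "${query}" -g "${job_dir}/genome_file" \
        -o "${job_dir}/output" || return 1
    # Collect this job's results in the current directory.
    mv "${job_dir}/output"/* ./
}

# Export the function and variables so the shells started by GNU parallel see them.
export -f run_teaid_job
export GENOME TEAID

# Launch only when GNU parallel and the file list are actually available.
if command -v parallel >/dev/null 2>&1 && [ -f fasta_list.txt ]; then
    parallel --bar --jobs 3 -a fasta_list.txt run_teaid_job '{#}' '{}' && rm -r ./tmp/
fi
```

Each step now sits on its own line and stops the job on failure, so it is easier to see which step a given process got to.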