[bug] [metaMiner] transeq only outputs first contig #6

Open
brymerr921 opened this issue Feb 9, 2023 · 0 comments

Description

When metaMiner is given an un-annotated nucleotide FASTA file containing more than one DNA sequence, it runs transeq to six-frame translate the file before running hmmsearch.

However, in my experience transeq.py only outputs six-frame translations for the first sequence in the FASTA file. This can be reproduced as follows:

printf '>seq1\nATGATGATGATGTAA\n>seq2\nAATGGAAGAAGAATAGAA\n' > test.fasta
python transeq.py test.fasta -o test.out --frame 6 --wide

Now, test.out contains:

>seq1_1
MMMM*
>seq1_2
***CX
>seq1_3
DDDVX
>seq1_4
LHHHH
>seq1_5
TSSSX
>seq1_6
YIIIX
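
On larger inputs, the same symptom can be checked without reading the output by eye. The sketch below is not part of metaMiner: `fasta_ids` and `missing_sources` are hypothetical helpers, and the code assumes output headers follow the `<id>_<frame>` pattern shown above.

```python
def fasta_ids(text):
    """Return the first whitespace-delimited token of every FASTA header line."""
    return [line[1:].split()[0] for line in text.splitlines()
            if line.startswith(">") and line[1:].strip()]


def missing_sources(input_fasta, translated_fasta):
    """List input IDs that have no translated record in the output."""
    # Assumption: output headers are "<id>_<frame>", so drop the trailing "_<frame>".
    translated = {rid.rsplit("_", 1)[0] for rid in fasta_ids(translated_fasta)}
    return [rid for rid in fasta_ids(input_fasta) if rid not in translated]
```

Calling `missing_sources(open("test.fasta").read(), open("test.out").read())` on the files from the reproduction above would be expected to report `seq2` as missing.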

test.out should contain:

>seq1_1
MMMM*
>seq1_2
***CX
>seq1_3
DDDVX
>seq1_4
LHHHH
>seq1_5
TSSSX
>seq1_6
YIIIX
>seq2_1
NGRRIE
>seq2_2
MEEE*X
>seq2_3
WKKNRX
>seq2_4
FYSSSI
>seq2_5
LFFFHX
>seq2_6
SILLPX

Possible solutions

Use gotranseq as a near drop-in replacement. It ships as a single binary, whereas the transeq program is part of EMBOSS.
Caveat: gotranseq wraps its output at 60 characters and has no --wide option, so the output is not in --wide format.
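
Another option would be to make sure the translation loop visits every record rather than only the first. The following is a minimal sketch of that behaviour, not the actual transeq.py code: the function names are hypothetical, it uses the standard genetic code, it writes any incomplete trailing codon as `X`, and it numbers the reverse frames so that they match the expected output above (frame 5 starts at offset 2 of the reverse complement, frame 6 at offset 1). That numbering is only checked against sequences whose length is a multiple of three, as in the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_fasta(text):
    """Yield (id, sequence) for every record, not just the first one."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = (line[1:].split() or [""])[0], []
        elif line and name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def translate(seq):
    """Translate codon by codon; unknown or incomplete codons become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3))


def six_frame_fasta(text):
    """Return wide-format FASTA text with six translations per input record."""
    out = []
    for name, seq in read_fasta(text):
        rev = seq.translate(COMPLEMENT)[::-1]
        # Frames 1-3 are forward; frames 4, 5, 6 use reverse-complement offsets 0, 2, 1.
        frames = [seq, seq[1:], seq[2:], rev, rev[2:], rev[1:]]
        for number, frame in enumerate(frames, start=1):
            out.append(f">{name}_{number}\n{translate(frame)}\n")
    return "".join(out)
```

Running `six_frame_fasta` on the contents of test.fasta gives the twelve records listed under "test.out should contain" above.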
