GFF File Parsing Fails with "No elements found with gene!" #8

Open
Aasesss opened this issue Jan 9, 2025 · 2 comments

@Aasesss

Aasesss commented Jan 9, 2025

I encountered an issue when trying to parse a GFF file using the software. The process fails with the following error message:

No elements found with gene!
Are you specifying a feature present in the gff file?
Error parsing GTF/GFF3/BED file!

I used the following GFF file:

##gff-version 3
A01 EVM gene 124814 125347 . + . ID=BnaA01G0000100ZS;Name=BnaA01G0000100ZS
A01 EVM mRNA 124814 125347 . + . ID=BnaA01T0000100ZS;Parent=BnaA01G0000100ZS;Name=BnaA01T0000100ZS
A01 EVM exon 124814 125347 . + . ID=BnaA01T0000100ZS.exon1;Parent=BnaA01T0000100ZS
A01 EVM CDS 124814 125347 . + 0 ID=cds.BnaA01T0000100ZS;Parent=BnaA01T0000100ZS

Could you confirm if there is a specific requirement for the gene feature, or if there is an issue with the software's parsing logic? Any guidance on resolving this issue would be greatly appreciated.

Best regards,
Minjian Chen

@tobiasrausch
Member

tobiasrausch commented Jan 9, 2025

I think GTF files use -i gene_name, whereas for GFF you need to specify -i ID. The feature can be gene or exon, for instance: sansa annotate -i ID -f gene ....
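
To see which attribute keys a GFF3 file actually offers for a given feature type, a quick check along these lines may help. This is only a sketch: the sample lines are copied from the first comment (with tabs restored, since GFF3 is tab-separated), the helper names `parse_attributes` and `attribute_keys_for_feature` are made up for illustration, and nothing here reflects how sansa parses the file internally.

```python
# Two sample GFF3 lines from the issue, with tab separators restored.
GFF_LINES = [
    "A01\tEVM\tgene\t124814\t125347\t.\t+\t.\tID=BnaA01G0000100ZS;Name=BnaA01G0000100ZS",
    "A01\tEVM\tmRNA\t124814\t125347\t.\t+\t.\tID=BnaA01T0000100ZS;Parent=BnaA01G0000100ZS;Name=BnaA01T0000100ZS",
]


def parse_attributes(column9):
    """Split a GFF3 attribute column ("key=value;key=value") into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def attribute_keys_for_feature(lines, feature):
    """Collect every attribute key seen on rows whose type (column 3) matches."""
    keys = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives such as ##gff-version and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == feature:
            keys.update(parse_attributes(cols[8]))
    return keys


gene_keys = attribute_keys_for_feature(GFF_LINES, "gene")
print(sorted(gene_keys))  # ['ID', 'Name']
```

For the rows shown, the gene feature carries `ID` and `Name` but no `gene_name`, which fits the suggestion to pass -i ID for this file.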

@tobiasrausch tobiasrausch self-assigned this Jan 9, 2025
@tobiasrausch tobiasrausch added the question Further information is requested label Jan 9, 2025
@Aasesss
Author

Aasesss commented Jan 9, 2025

> I think GTF files use -i gene_name, whereas for GFF you need to specify -i ID. The feature can be gene or exon, for instance: sansa annotate -i ID -f gene ....

Thank you for your response!
I followed your suggestion and added the -i ID option, and the program ran successfully. However, the annotation columns in the final output file are all showing NA. I'm not sure what is causing this issue.

Here is the command I ran:

$ sansa annotate -i ID -f gene -g ZS11_v0.norm.gff.gz test.vcf.gz -a test.anno.bcf -o test.anno.tsv.gz -c
[2025-Jan-09 20:07:35] sansa annotate -i ID -f gene -g ZS11_v0.norm.gff.gz test.vcf.gz -a test.anno.bcf -o test.anno.tsv.gz -c
[2025-Jan-09 20:07:35] Parse SV annotation database
[2025-Jan-09 20:07:36] Parsed 4775 out of 6636 VCF/BCF records.
[2025-Jan-09 20:07:36] GFF3 feature parsing
[2025-Jan-09 20:08:10] Query input SVs
[2025-Jan-09 20:08:10] Parsed 4775 out of 6636 VCF/BCF records.
[2025-Jan-09 20:08:10] Done.

And here's an excerpt from my file:

[1]ANNOID query.chr query.start query.chr2 query.end query.id query.qual query.svtype query.ct query.svlen query.startfeature query.endfeature query.containedfeature
id000000000 scaffoldA01 167534 scaffoldA01 167534 SV_9 0 INS NtoN 5864 NA NA NA
id000000001 scaffoldA01 169517 scaffoldA01 173449 SV_8 0 DEL 3to5 3933 NA NA NA
id000000002 scaffoldA01 246232 scaffoldA01 246232 SV_17 0 INS NtoN 815 NA NA NA
id000000003 scaffoldA01 258262 scaffoldA01 259810 SV_19 0 DEL 3to5 1549 NA NA NA
id000000004 scaffoldA01 261111 scaffoldA01 261111 SV_21 0 INS NtoN 348 NA NA NA
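
One thing that stands out when comparing the excerpts: the GFF lines in my first comment use `A01` as the sequence name, while the SV records above use `scaffoldA01`. I have not confirmed that this is the cause, and ZS11_v0.norm.gff.gz may differ from the lines I pasted, but if the annotation step matches on exact sequence names (an assumption on my part), no feature would overlap. A rough way to compare the two name sets, using rows copied from this thread with tabs restored and a hypothetical helper `column_values`:

```python
# One GFF3 row from the first comment and two output rows from above.
GFF_LINES = [
    "A01\tEVM\tgene\t124814\t125347\t.\t+\t.\tID=BnaA01G0000100ZS;Name=BnaA01G0000100ZS",
]
TSV_ROWS = [
    "id000000000\tscaffoldA01\t167534\tscaffoldA01\t167534\tSV_9\t0\tINS\tNtoN\t5864\tNA\tNA\tNA",
    "id000000001\tscaffoldA01\t169517\tscaffoldA01\t173449\tSV_8\t0\tDEL\t3to5\t3933\tNA\tNA\tNA",
]


def column_values(lines, index):
    """Return the distinct values found in one tab-separated column."""
    values = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment/directive lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) > index:
            values.add(cols[index])
    return values


gff_names = column_values(GFF_LINES, 0)  # GFF3 column 1: sequence ID
sv_names = column_values(TSV_ROWS, 1)    # output column 2: query.chr
shared = gff_names & sv_names

print("GFF sequence names:", sorted(gff_names))
print("SV sequence names: ", sorted(sv_names))
print("Names in common:   ", sorted(shared))
```

With these rows the two sets have no name in common. Running the same comparison over the full files would show whether the mismatch holds for the data I actually passed to sansa.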
