GFF parsing in count_reads.R #9

Closed
thomcuddihy opened this issue Jan 15, 2025 · 1 comment

Had an issue where building ref_gene_df failed because locus_tags had twice as many entries as the other features. On investigating, locus_tags contained both "locus_tag=..." and "old_locus_tag=..." entries, since the unanchored pattern "locus_tag" also matches "old_locus_tag".

Manually changed count_reads.R from:

locus_tags <- unlist(lapply(gene_attr, function(x) {
    x[grepl("locus_tag", x)]
}))

to:

locus_tags <- unlist(lapply(gene_attr, function(x) {
    x[grepl("^locus_tag", x)]
}))

Might be worth changing all feature greps ("locus_tag", "gene_biotype" and "gene") to anchored patterns of the form "^feature=", e.g. "^locus_tag=".
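
A minimal sketch of what that could look like, assuming gene_attr is a list with one character vector of "key=value" fields per gene (as the snippets above suggest). The attribute strings and the get_attr helper are made up for illustration and are not taken from count_reads.R:

```r
# Hypothetical attribute strings for two GFF gene records (made-up example data).
gff_attributes <- c(
  "ID=gene-A_0001;Name=dnaA;gene=dnaA;gene_biotype=protein_coding;locus_tag=A_0001;old_locus_tag=OLD_0001",
  "ID=gene-A_0002;Name=dnaN;gene=dnaN;gene_biotype=protein_coding;locus_tag=A_0002;old_locus_tag=OLD_0002"
)

# Assumption: each record is split into its key=value fields on ";".
gene_attr <- strsplit(gff_attributes, ";", fixed = TRUE)

# Hypothetical helper: keep only the fields whose key is exactly `key`,
# by anchoring the pattern at the start and including the "=".
get_attr <- function(attr_list, key) {
  pattern <- paste0("^", key, "=")
  unlist(lapply(attr_list, function(x) x[grepl(pattern, x)]))
}

# Unanchored pattern, as in the original code: also picks up old_locus_tag.
unanchored <- unlist(lapply(gene_attr, function(x) x[grepl("locus_tag", x)]))

# Anchored patterns: one match per gene for each feature.
locus_tags <- get_attr(gene_attr, "locus_tag")
gene_names <- get_attr(gene_attr, "gene")
biotypes   <- get_attr(gene_attr, "gene_biotype")

length(unanchored)  # 4
length(locus_tags)  # 2
```

Including the "=" matters most for "gene": a bare "^gene" would still match "gene_biotype=...", whereas "^gene=" only matches the gene name field.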

Didn't think it was worth a PR, since it could also be considered an issue with the user's input file.

Not sure if this is related to issue #8.

adamd3 added a commit that referenced this issue Jan 16, 2025
Fix #9: Avoid mismatches when grepping gene details from gff
adamd3 closed this as completed in cfc7b90 Jan 16, 2025

adamd3 (Owner) commented Jan 16, 2025

Thanks for reporting and for providing the fix. This is now merged.
