ENSEMBL transcripts not versioned #233
Comments
This issue touches on several problems:
The only official way to get exon coordinates out of Ensembl is to use the Perl API. Unfortunately, when I last tried in May 2016, I discovered that Ensembl and BioPerl didn't work on a modern distribution. In ensembl-dev:
My recollection is that Perl 5.14 was significantly out of date at the time, and that installing it manually had knock-on effects with dependent modules. I gave up. So, to solve this issue, we need a reliable way to get versioned transcripts out of Ensembl.
Would we be able to use the .gff files for this purpose, e.g. http://ftp.ensembl.org/pub/release-104/gff3/homo_sapiens/Homo_sapiens.GRCh38.104.chr_patch_hapl_scaff.gff3.gz? It appears that they have gene/transcript/exon IDs with versions, and earlier releases are also maintained, e.g. release 101: http://ftp.ensembl.org/pub/release-101/gff3/homo_sapiens/Homo_sapiens.GRCh38.101.chr_patch_hapl_scaff.gff3.gz
Yes, those should be usable in principle, but no work has actually gone into that yet.
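To make the GFF3 suggestion concrete, here is a minimal sketch of how versioned transcript accessions and exon coordinates might be read from one of those files. It is not UTA code. It assumes the attribute layout that Ensembl GFF3 files appear to use: transcript rows carry `ID=transcript:ENST...` plus a separate `version=N` attribute, and exon rows carry `Parent=transcript:ENST...`. The function names `parse_attributes`, `versioned_transcripts`, and `read_gff3` are hypothetical, chosen only for illustration.

```python
import gzip
from collections import defaultdict
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = unquote(value)  # GFF3 percent-encodes reserved characters
    return attrs


def versioned_transcripts(lines):
    """Map 'ENST….N' accessions to exon tuples (seqid, start, end, strand).

    Coordinates are kept as GFF3 gives them: 1-based, inclusive.
    Assumes Ensembl-style 'ID=transcript:…' and 'version=…' attributes.
    """
    accessions = {}            # unversioned transcript ID -> versioned accession
    exons = defaultdict(list)  # unversioned transcript ID -> exon tuples
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature row
        seqid, _source, ftype, start, end, _score, strand, _phase, attr_col = cols
        attrs = parse_attributes(attr_col)
        if attrs.get("ID", "").startswith("transcript:"):
            tid = attrs["ID"].split(":", 1)[1]
            version = attrs.get("version")
            accessions[tid] = f"{tid}.{version}" if version else tid
        elif ftype == "exon" and attrs.get("Parent", "").startswith("transcript:"):
            tid = attrs["Parent"].split(":", 1)[1]
            exons[tid].append((seqid, int(start), int(end), strand))
    # Exons are sorted by start coordinate, not by transcript order.
    return {acc: sorted(exons[tid], key=lambda e: e[1]) for tid, acc in accessions.items()}


def read_gff3(path):
    """Read a plain or gzip-compressed GFF3 file from a local path."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return versioned_transcripts(handle)
```

Calling `read_gff3` on a downloaded copy of one of the release files above would give a dict keyed by accessions such as `ENST00000456328.2`, provided the file follows the assumed layout. Loading the result into UTA's schema is a separate step that this sketch does not attempt.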
Hi, I've made cdot, a data provider that includes Ensembl transcripts (see HGVS issue). The GTF parsing code etc. is all under the MIT license if you want to reuse it in UTA. An alternative would be to use the JSON and convert that to SQL (or the data provider).
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been stalled for 7 days with no activity.
When trying to refer to Ensembl transcripts by version, we cannot find them in the 20210129 data release.