The following URL should only be used once the new REST API can handle the larger queries:
https://rest.uniprot.org/uniprotkb/stream?compressed=true&format=list&query=%28reviewed%3Afalse
The pipeline downloads the first file using the following command:
wget -N ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.xml.gz
This file downloads just as before.
The second file could traditionally be downloaded with the following command:
wget -O uniprot-reviewed:no.list.gz "https://www.uniprot.org/uniprot/?query=reviewed:no&format=list&force=true&compress=yes"
Unfortunately, this one no longer works after the update. I believe the new URL that we should be querying is:
wget -O uniprot-file-test.txt.gz "https://rest.uniprot.org/uniprotkb/stream?compressed=true&format=list&query=(reviewed:false)"
From this query, we expect a list of non-reviewed UniProt IDs.
The documentation says that the limit for the stream endpoint is something like 5 million entries (a lot less than the >200 million entries we expect).
Once they make it possible to run this query against the non-legacy site, we should make the switch.
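The query above contains parentheses and a colon, which the shell and some HTTP clients treat specially. A minimal sketch of building the percent-encoded form of the same URL is shown here; the variable names are illustrative, and the `sed` call only handles the three special characters that appear in this particular query, so it is not a general URL encoder.

```shell
# Query string as written in the issue
query='(reviewed:false)'

# Percent-encode only the characters present in this query: ( ) and :
encoded=$(printf '%s' "$query" | sed -e 's/(/%28/g' -e 's/)/%29/g' -e 's/:/%3A/g')

# Assemble the stream URL from the issue with the encoded query
url="https://rest.uniprot.org/uniprotkb/stream?compressed=true&format=list&query=${encoded}"
printf '%s\n' "$url"

# The download itself would then be (quoted so the shell does not interpret '&'):
# wget -O uniprot-file-test.txt.gz "$url"
```

Keeping the URL in a quoted variable avoids the shell backgrounding the command at each `&`.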
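Because the stream limit is far below the expected number of entries, it may be worth checking the size of whatever list comes back before the pipeline uses it. The sketch below counts the accessions in a gzipped list and compares the count with a minimum. The sample file, its three made-up accessions, and the `expected_min` value are all assumptions for illustration; how the endpoint actually behaves when the limit is exceeded is not confirmed here.

```shell
# Hypothetical stand-in for the downloaded list (three made-up accessions)
tmpdir=$(mktemp -d)
list_file="$tmpdir/uniprot-file-test.txt.gz"
printf 'A0A000\nA0A001\nA0A002\n' | gzip > "$list_file"

# Assumed threshold; for the real download this would be on the order of 200000000
expected_min=3

# Count one accession per line in the decompressed list
count=$(gzip -dc < "$list_file" | wc -l | tr -d ' ')

if [ "$count" -lt "$expected_min" ]; then
  status="incomplete"
else
  status="ok"
fi
printf '%s accessions, status: %s\n' "$count" "$status"

rm -rf "$tmpdir"
```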
Another way to do this comes from UniProt's suggestions. Download:
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_trembl.fasta.gz
and use the command below to extract all the accessions:
zgrep '>' uniprot_trembl.fasta.gz | cut -d '|' -f 2
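To show what the extraction command does without downloading the full TrEMBL file, here is a small sketch that runs it on a two-record sample. The sample file name, accessions, and sequences are made up for illustration; only the `zgrep ... | cut ...` pipeline comes from the suggestion above.

```shell
# Build a tiny gzipped FASTA file with made-up TrEMBL-style headers
tmpdir=$(mktemp -d)
printf '%s\n' \
  '>tr|A0A000|A0A000_EXAMPLE Hypothetical protein one' \
  'MKTAYIAKQR' \
  '>tr|A0A001|A0A001_EXAMPLE Hypothetical protein two' \
  'MSDNELLKQA' | gzip > "$tmpdir/sample_trembl.fasta.gz"

# Keep only header lines, then take the second '|'-separated field (the accession)
accessions=$(zgrep '>' "$tmpdir/sample_trembl.fasta.gz" | cut -d '|' -f 2)
printf '%s\n' "$accessions"
# prints:
# A0A000
# A0A001

rm -rf "$tmpdir"
```

The header layout is `>db|accession|entry_name description`, so field 2 after splitting on `|` is the accession.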
Or, while the legacy site is up (https://legacy.uniprot.org/), you could continue to use your previous command, just replacing www with legacy.
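As a sketch of the legacy-site workaround, the previous URL can be rewritten by swapping the host prefix. The variable names are illustrative, and this assumes the legacy site accepts the same query parameters as the old www site, as the suggestion implies.

```shell
# Previous query URL from the original pipeline command
old_url='https://www.uniprot.org/uniprot/?query=reviewed:no&format=list&force=true&compress=yes'

# Replace the leading "www." host prefix with "legacy."
legacy_url=$(printf '%s' "$old_url" | sed 's#^https://www\.#https://legacy.#')
printf '%s\n' "$legacy_url"

# The download would then be:
# wget -O uniprot-reviewed:no.list.gz "$legacy_url"
```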