Viral contigs containing RdRP

Robert Edgar edited this page Jun 27, 2021 · 8 revisions

Download

https://serratus-public.s3.amazonaws.com/rdrp_contigs/rdrp_contigs.tar.gz (1 GB tarball)

The tarball contains two files:
rdrp_contigs.fa (2.9 GB FASTA) Serratus contigs with a palmprint detected by palmscan
rdrp_contigs.tsv (265 MB tab-separated text) Per-contig annotations; see "Fields in rdrp_contigs.tsv"
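
For readers who prefer to script the download, here is a minimal Python sketch. The function names `download` and `extract_expected` are hypothetical. The sketch matches members by base name because the directory layout inside the tarball is not documented here.

```python
import os
import shutil
import tarfile
import urllib.request

URL = "https://serratus-public.s3.amazonaws.com/rdrp_contigs/rdrp_contigs.tar.gz"
EXPECTED = ("rdrp_contigs.fa", "rdrp_contigs.tsv")


def download(url, dest_path):
    """Stream the tarball to disk without holding it in memory."""
    with urllib.request.urlopen(url) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)


def extract_expected(tar_path, dest_dir, names=EXPECTED):
    """Extract only the expected files, flattened into dest_dir.

    Assumption: the two files may sit at any depth inside the archive,
    so members are matched on their base name.
    """
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with tarfile.open(tar_path, "r:gz") as tar:
        for member in tar:
            base = os.path.basename(member.name)
            if member.isfile() and base in names:
                target = os.path.join(dest_dir, base)
                with tar.extractfile(member) as src, open(target, "wb") as out:
                    shutil.copyfileobj(src, out)
                written.append(target)
    return written


# Example (not run on import; the download is about 1 GB):
# download(URL, "rdrp_contigs.tar.gz")
# extract_expected("rdrp_contigs.tar.gz", "rdrp_contigs")
```

Copying each member through `extractfile` rather than calling `extractall` keeps every output path under `dest_dir`, whatever paths the archive stores.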

Classification as viral and known/novel

A contig is classified as viral if (1) it has a high-confidence RdRP according to palmscan, and (2) it has an E-value <= 1e-6 in a diamond search of the named viral subset of PalmDB. Otherwise, it is undetermined (undet).

A contig is classified as known if its palmprint has >= 90% identity in a diamond search of the NCBI non-redundant protein database NR, otherwise it is novel.
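
The two rules above combine into the four category labels. The sketch below is only an illustration of that logic, not the code used to build the dataset. The function and argument names are hypothetical. It assumes a missing hit is passed as `None` and that identity is a percentage on a 0 to 100 scale.

```python
def classify_contig(high_conf_rdrp, nv_evalue, nr_pctid):
    """Return a category label such as 'viral/novel'.

    high_conf_rdrp: True if palmscan reported a high-confidence RdRP.
    nv_evalue: E-value of the diamond hit to named viral PalmDB, or None.
    nr_pctid: percent identity of the diamond hit to NR, or None.
    """
    # Viral requires BOTH a high-confidence RdRP and a significant NV hit.
    viral = bool(high_conf_rdrp) and nv_evalue is not None and nv_evalue <= 1e-6
    # Known requires at least 90% identity to an NR protein.
    known = nr_pctid is not None and nr_pctid >= 90.0
    return ("viral" if viral else "undet") + "/" + ("known" if known else "novel")
```

For example, `classify_contig(True, 1e-20, 45.0)` gives `viral/novel`, and `classify_contig(False, 1e-20, 97.0)` gives `undet/known`.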

Contig counts by category

1016347 viral/novel
326942 viral/known
96359 undet/novel
6197 undet/known
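
The counts above add up to 1,445,845 contigs. The short sketch below derives the total and each category's share from the four figures listed. The variable names are illustrative.

```python
counts = {
    "viral/novel": 1016347,
    "viral/known": 326942,
    "undet/novel": 96359,
    "undet/known": 6197,
}

total = sum(counts.values())  # 1445845
shares = {category: n / total for category, n in counts.items()}

for category, n in counts.items():
    print(f"{category}\t{n}\t{100 * shares[category]:.1f}%")
```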

Tentative taxonomy assignment

Tentative taxonomies were predicted by a simple consensus method. The usearch_global command in usearch was used to search the named viral (NV) subset of PalmDB, release 2021-03-14 (named.fa.gz). The top 10 hits were considered for each palmprint, and the majority name was assigned at each rank. If there was no majority, no name was assigned. Identity thresholds were applied: phylum=0%, class=30%, order=30%, family=40%, genus=70%, species=90%. If a hit had identity below the threshold for a rank, its name at that rank was excluded.
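
The consensus rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the original implementation. It takes "majority" to mean more than half of the names left at a rank once hits below that rank's threshold are excluded. The text does not say whether the majority is counted over these remaining names or over all 10 hits. The function name, the input layout, and the placeholder taxon names are hypothetical.

```python
from collections import Counter

# Minimum percent identity a hit needs for its name to count at each rank.
THRESHOLDS = {
    "phylum": 0.0,
    "class": 30.0,
    "order": 30.0,
    "family": 40.0,
    "genus": 70.0,
    "species": 90.0,
}


def consensus_taxonomy(hits, max_hits=10):
    """hits: list of (pctid, {rank: name}) tuples, best hit first.

    Returns {rank: name or None}. None means no name was assigned.
    """
    top = hits[:max_hits]
    result = {}
    for rank, min_id in THRESHOLDS.items():
        # Keep only names from hits that pass this rank's identity threshold.
        names = [tax[rank] for pctid, tax in top
                 if pctid >= min_id and tax.get(rank)]
        if not names:
            result[rank] = None
            continue
        name, count = Counter(names).most_common(1)[0]
        # Assumption: strict majority among the names that remain.
        result[rank] = name if 2 * count > len(names) else None
    return result


hits = [
    (95.0, {"phylum": "P1", "family": "F1", "genus": "G1"}),
    (60.0, {"phylum": "P1", "family": "F1", "genus": "G2"}),
    (35.0, {"phylum": "P2", "family": "F2", "genus": "G2"}),
]
taxonomy = consensus_taxonomy(hits)
```

In this example only the 95% hit passes the genus threshold, so the genus is `G1` even though `G2` appears more often among the three hits.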

Fields in rdrp_contigs.tsv

1 Contig FASTA label of contig
2 SRA SRA accession
3 Length Contig length
4 Depth Mean coverage (read depth)
5 Category One of viral/novel, viral/known, undet/novel, undet/known
6 NR_label Label of top hit to non-redundant protein (NR).
7 NR_pctid Identity of top hit in NR.
8 NR_evalue E-value of top hit in NR.
9 NV_label Label of top hit to named viral (NV).
10 NV_pctid Identity of top hit in NV.
11 NV_evalue E-value of top hit in NV.
12 PalmDB_label Label of top hit to PalmDB species-like OTU.
13 PalmDB_pctid Identity to PalmDB sOTU.
14 phylum Tentative phylum
15 class Tentative class
16 order Tentative order
17 family Tentative family
18 genus Tentative genus
19 species Tentative species
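
The field list above maps directly onto a reader built with Python's standard `csv` module. The sketch assumes the file has no header row, so check the first line of the real file before relying on this. It leaves every value as a string because the encoding of missing values is not documented here. The function name `read_rdrp_tsv` is hypothetical.

```python
import csv

# Column names in file order, taken from the field list above.
FIELDS = [
    "Contig", "SRA", "Length", "Depth", "Category",
    "NR_label", "NR_pctid", "NR_evalue",
    "NV_label", "NV_pctid", "NV_evalue",
    "PalmDB_label", "PalmDB_pctid",
    "phylum", "class", "order", "family", "genus", "species",
]


def read_rdrp_tsv(handle):
    """Yield one dict per contig from an open text handle.

    Assumption: no header row. If the file has one, skip the first record.
    """
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row


# Example: count contigs per category while streaming the 265 MB file.
# from collections import Counter
# with open("rdrp_contigs.tsv", newline="") as fh:
#     print(Counter(row["Category"] for row in read_rdrp_tsv(fh)))
```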
