Understand the hierarchical structure of Pango Lineage names.
The Pango Lineage Translator App was inspired by a series of similar tweets making fun of the growing length of full SARS-CoV-2 Pango Lineage names.
The short and the long name of a lineage, for example BQ.1.1 and B.1.1.529.5.3.1.1.1.1.1.1, describe the same Pango Lineage and are synonyms; the short form simply uses defined aliases to abbreviate the long form.
With the evolution of SARS-CoV-2, a structured solution to designate and
name new lineages was required.
The idea and implementation of naming lineages using the Pango
nomenclature system are well described on the Pango Network
website.
Such a nomenclature system has two side effects:
- the ancestry of a lineage is not easy to trace from its abbreviated name, and
- the non-abbreviated names get pretty long.
This is where the App comes in: it illustrates the ancestry of an abbreviated Pango Lineage name by highlighting every alias contained in its full name.
The first step is to translate a given lineage name by replacing every alias with the lineage it stands for, which yields the full name.
(pango_lineage_full <- translate_lineage("BQ.1.1"))
#> [1] "B.1.1.529.5.3.1.1.1.1.1.1"
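To make the translation step concrete, here is a minimal sketch of alias expansion in base R. It is not the package's implementation of `translate_lineage()`: the function name `expand_alias_sketch` and the small `alias_key` vector are hypothetical, and the table holds only the three aliases that occur in this example.

```r
# Hypothetical alias table restricted to the aliases used in this example.
alias_key <- c(
  BA = "B.1.1.529",
  BE = "B.1.1.529.5.3.1",
  BQ = "B.1.1.529.5.3.1.1.1.1"
)

# Sketch only: replace the leading alias of a lineage name by its expansion.
expand_alias_sketch <- function(lineage, key = alias_key) {
  parts <- strsplit(lineage, ".", fixed = TRUE)[[1]]
  prefix <- parts[1]
  # A prefix without an entry (e.g. "B") is treated as already unabbreviated.
  if (!prefix %in% names(key)) {
    return(lineage)
  }
  # Glue the expansion and the remaining numeric suffixes back together.
  paste(c(key[[prefix]], parts[-1]), collapse = ".")
}

expand_alias_sketch("BQ.1.1")
#> [1] "B.1.1.529.5.3.1.1.1.1.1.1"
```

Because each entry of the table already stores the fully expanded name, a single lookup is enough in this sketch; a table that defines aliases in terms of other aliases would need repeated expansion.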
Next, aliases are assigned again to the relevant parts of the full lineage name. A remainder that cannot be abbreviated further may be left at the end.
(pango_lineage_full_tibble <- divide_lineage(pango_lineage_full))
#> # A tibble: 4 × 2
#>   pango_short pango_long_relevant
#>   <chr>       <chr>
#> 1 BA          B.1.1.529
#> 2 BE          .5.3.1
#> 3 BQ          .1.1.1
#> 4 no_alias    .1.1
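The division into alias segments can be sketched as follows. This is an illustration under assumptions, not the code behind `divide_lineage()`: the name `divide_lineage_sketch` and the three-entry `alias_key` are hypothetical, and a plain data frame stands in for the tibble.

```r
# Hypothetical alias table restricted to the aliases used in this example.
alias_key <- c(
  BA = "B.1.1.529",
  BE = "B.1.1.529.5.3.1",
  BQ = "B.1.1.529.5.3.1.1.1.1"
)

# Sketch only: cut a full lineage name into the part each alias contributes.
divide_lineage_sketch <- function(full_name, key = alias_key) {
  # Keep aliases whose expansion is a dot-delimited prefix of the full name.
  is_prefix <- full_name == key | startsWith(full_name, paste0(key, "."))
  hits <- key[is_prefix]
  hits <- hits[order(nchar(hits))]
  # Each alias covers the characters between the previous expansion and its own;
  # whatever follows the longest expansion is the unabbreviated remainder.
  starts <- unname(c(0L, nchar(hits)))
  ends <- unname(c(nchar(hits), nchar(full_name)))
  out <- data.frame(
    pango_short = c(names(hits), "no_alias"),
    pango_long_relevant = substring(full_name, starts + 1L, ends),
    stringsAsFactors = FALSE
  )
  # Drop the remainder row when the full name ends exactly on an alias.
  out[nzchar(out$pango_long_relevant), ]
}

divide_lineage_sketch("B.1.1.529.5.3.1.1.1.1.1.1")
```

Sorting the matching expansions by length puts the aliases in order of ancestry, so each row holds only what that alias adds to its parent.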
Different color schemes are used to highlight the different variants of concern defined by the WHO, such as Omicron.
(color_vector <- VOC_color(pango_lineage_full))
#>                  font background_level_base
#>             "#3A0301"             "#E69000"
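One way to picture the color lookup is a table keyed by the root lineage of each variant of concern. The sketch below is an assumption about the general approach, not the package's `VOC_color()`: the names `voc_colors`, `fallback_colors` and `voc_color_sketch` are hypothetical, only the Omicron colors are taken from the output above, and the fallback colors are placeholders.

```r
# Only the Omicron entry (root lineage B.1.1.529) comes from the output above.
voc_colors <- list(
  "B.1.1.529" = c(font = "#3A0301", background_level_base = "#E69000")
)

# Placeholder colors for lineages outside the table; not the package's values.
fallback_colors <- c(font = "#000000", background_level_base = "#FFFFFF")

# Sketch only: pick the color pair of the first root the full name descends from.
voc_color_sketch <- function(full_name) {
  for (root in names(voc_colors)) {
    if (full_name == root || startsWith(full_name, paste0(root, "."))) {
      return(voc_colors[[root]])
    }
  }
  fallback_colors
}

voc_color_sketch("B.1.1.529.5.3.1.1.1.1.1.1")
```

Matching on the root plus a trailing dot avoids treating a lineage such as B.1.1.5290 as a descendant of B.1.1.529.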
Lastly, a gt table is put together with nested spanner column labels for each alias and, as a footnote, a direct link to cov-spectrum, where more details about the selected lineage can be found.
create_pango_lineage_table(pango_lineage_full_tibble, color_vector)