Robersy Sanchez
Department of Biology. Eberly College of Science.
Pennsylvania State University, University Park, PA 16802
[email protected]
ORCID: orcid.org/0000-0002-5246-1453
This is an R package to compute the automorphisms between pairwise aligned DNA sequences represented as elements of a genomic Abelian group, as described in the paper Genomic Abelian Finite Groups. In a general scenario, whole chromosomes or genomic regions from a population (from any species or closely related species) can be algebraically represented as a direct sum of cyclic groups or, more specifically, Abelian p-groups. We propose representing a multiple sequence alignment (MSA) of length N as a finite Abelian group formed by the direct sum of homocyclic Abelian groups of prime-power order:
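$$G = (\mathbb{Z}_{p_1^{\alpha_1}})^{n_1} \oplus (\mathbb{Z}_{p_1^{\alpha_2}})^{n_2} \oplus \dots \oplus (\mathbb{Z}_{p_k^{\alpha_k}})^{n_k}$$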
where the $p_i$ are prime numbers, $\alpha_i \in \mathbb{N}$, and $\mathbb{Z}_{p_i^{\alpha_i}}$ is the group of integers modulo $p_i^{\alpha_i}$.
For the purpose of computing automorphisms between two aligned DNA sequences, $p_i^{\alpha_i} \in \{5, 2^6, 5^3\}$.
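To make the idea concrete, here is a minimal base-R sketch for the smallest case, the group of integers modulo 5. It does not use the package's API. The base-to-integer encoding (`-` = 0, A = 1, C = 2, G = 3, T = 4) and the helper names `mod_inverse` and `aut_z5` are assumptions made only for this illustration. An automorphism of a cyclic group of prime order is multiplication by a nonzero factor k, so for each aligned pair of bases (x, y) the sketch looks for k such that y = k·x (mod 5).

```r
# Illustrative encoding on Z_5 (an assumption for this sketch, not the
# package's internal representation): gap -> 0, A -> 1, C -> 2, G -> 3, T -> 4.
base_code <- c("-" = 0L, A = 1L, C = 2L, G = 3L, T = 4L)

# Hypothetical helper: modular inverse of x modulo a prime p, by brute force.
mod_inverse <- function(x, p) {
  candidates <- seq_len(p - 1L)
  candidates[(candidates * x) %% p == 1L]
}

# Hypothetical helper: for each aligned position, find k with y = k * x (mod p).
# Positions with a gap in either sequence are returned as NA in this sketch.
aut_z5 <- function(seq1, seq2, p = 5L) {
  x <- base_code[strsplit(seq1, "")[[1]]]
  y <- base_code[strsplit(seq2, "")[[1]]]
  k <- rep(NA_integer_, length(x))
  ok <- x != 0L & y != 0L
  inv <- vapply(x[ok], mod_inverse, integer(1), p = p)
  k[ok] <- (y[ok] * inv) %% p
  unname(k)
}

aut_z5("ACGT", "ACTT")
# 1 1 3 1
```

A factor of 1 marks a conserved position. At the third position, G (3) is mapped to T (4) by k = 3, since 3 · 3 = 9 ≡ 4 (mod 5).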
This application is currently available in Bioconductor (version 3.18) https://doi.org/doi:10.18129/B9.bioc.GenomAutomorphism. Watch this repo or check for updates.
There are several tutorials on how to use the package at the GenomAutomorphism website:
- Get started with GenomAutomorphism
- Analysis of Automorphisms on a DNA Multiple Sequence Alignment
- Analysis of Automorphisms on a MSA of Primate BRCA1 Gene
- A Short Introduction to Algebraic Taxonomy on Genes Regions
- Automorphism analysis on COVID-19 data
- Modular Matrix Operations of Mutational Events
This package depends, so far, on: Biostrings, GenomicRanges, numbers, and S4Vectors.
if (!requireNamespace("BiocManager")) install.packages("BiocManager")
BiocManager::install(c("Biostrings", "GenomicRanges", "S4Vectors",
"BiocParallel", "GenomeInfoDb", "BiocGenerics", "numbers", "devtools",
"doParallel", "data.table", "foreach","parallel"), dependencies = TRUE)
BiocManager::install('genomaths/GenomAutomorphism')
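If the installation fails, it can help to check which dependencies are actually available in the current R library. The following is a small base-R sketch; the helper name `missing_packages` is hypothetical and is not part of GenomAutomorphism or BiocManager.

```r
# Dependencies named in the installation step above.
deps <- c("Biostrings", "GenomicRanges", "S4Vectors", "BiocParallel",
          "GenomeInfoDb", "BiocGenerics", "numbers")

# Hypothetical helper: return the packages from `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing_packages(deps)
```

An empty result means every listed dependency can be loaded. Any names returned can be passed back to `BiocManager::install()`.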
- Sanchez R, Morgado E, Grau R. Gene algebra from a genetic code algebraic structure. J Math Biol. 2005;51(4):431–457. doi:10.1007/s00285-005-0332-8. PMID: 16012800.
- Sanchez R, Grau R, Morgado E. A novel Lie algebra of the genetic code over the Galois field of four DNA bases. Math Biosci. 2006;202:156–174. doi:10.1016/j.mbs.2006.03.017.
- Sanchez R, Grau R. An algebraic hypothesis about the primeval genetic code architecture. Math Biosci. 2009;221:60–76. doi:10.1016/j.mbs.2009.07.001.
- Sanchez R, Barreto J. Genomic Abelian Finite Groups. 2021. doi:10.1101/2021.06.01.446543.
- José MV, Morgado ER, Sánchez R, Govezensky T. The 24 possible algebraic representations of the standard genetic code in six or in three dimensions. Adv Stud Biol. 2012;4:119–152.
- Sanchez R. Symmetric Group of the Genetic-Code Cubes. Effect of the Genetic-Code Architecture on the Evolutionary Process. MATCH Commun Math Comput Chem. 2018;79:527–560.
- Sanchez R. Evolutionary Analysis of DNA-protein-coding regions based on a genetic code cube metric. Curr Top Med Chem. 2014;14(3):407–417. doi:10.2174/1568026613666131204110022.