Algorithms to compute DNA complexity
caballero/SeqComplex
== SeqComplex ==

This is a collection of methods to compute the composition and complexity of one or more DNA sequences from a Fasta file.

SeqComplex.pm is a Perl module containing an implementation of each complexity measure. Several tools that use this module are also provided:

(1) compSeq.pl: computes the methods in windowed mode.
(2) profileComplexSeq.pl: computes the methods using the whole sequence.
(3) gatherStats.pl: example script that runs all methods in windowed mode and saves the raw data for later processing.
(4) displayStats.pl: example script that reads the raw data from gatherStats.pl and displays it as either a table or a Google Charts HTML file.

Computed methods
* gc: C+G content
* gcs: C+G skew
* cpg: CpG skew
* cwf: Complexity by Wootton & Federhen
* ce: Entropy
* cz: Complexity as compression ratio (using Gzip)
* cmN: Complexity as Markov model size of N
* ctN: Trifonov's complexity with order N
* clN: Linguistic complexity with order N

Additional methods
* ats: A+T skew
* ket: Keto skew
* pur: Purine skew

Citation
* Caballero J, Smit AFA, Hood L, Glusman G. Realistic artificial DNA sequences as negative controls for computational genomics. Nucleic Acids Research, 2014. https://doi.org/10.1093/nar/gku356

Copyright (C) 2009-2015 by Juan Caballero [[email protected]]

All code is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.
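
As a rough illustration of what a few of the measures listed under "Computed methods" look like in windowed mode, here is a minimal Python sketch. It is not taken from SeqComplex.pm: the function names (`gc_content`, `gc_skew`, `shannon_entropy`, `compression_ratio`, `profile`) are hypothetical, the formulas are common textbook definitions assumed for illustration, and zlib stands in for Gzip. The module's own definitions and normalizations may differ.

```python
import math
import zlib
from collections import Counter


def gc_content(seq):
    """Fraction of bases that are G or C (assumed definition of 'gc')."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq):
    """(G - C) / (G + C); returns 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def shannon_entropy(seq):
    """Shannon entropy of the single-base composition, in bits per base."""
    seq = seq.upper()
    if not seq:
        return 0.0
    n = len(seq)
    return -sum((k / n) * math.log2(k / n) for k in Counter(seq).values())


def compression_ratio(seq):
    """Compressed size divided by raw size, using zlib (DEFLATE)."""
    raw = seq.upper().encode("ascii")
    if not raw:
        return 0.0
    return len(zlib.compress(raw)) / len(raw)


def windows(seq, size, step):
    """Yield (start, subsequence) for every full window of the sequence."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def profile(seq, size, step):
    """Compute the sketched measures for each window and return one dict per window."""
    return [
        {
            "start": start,
            "gc": gc_content(win),
            "gcs": gc_skew(win),
            "ce": shannon_entropy(win),
            "cz": compression_ratio(win),
        }
        for start, win in windows(seq, size, step)
    ]
```

Calling `profile(sequence, 100, 50)` would return one row per 100-base window, stepping 50 bases at a time. Low-complexity stretches such as long homopolymer runs should show low `ce` and `cz` values under these assumed definitions.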