usage: run_sepp.py [-h] [-v] [-A N] [-P N] [-F N] [-D DISTANCE] [-M DIAMETER]
[-S DECOMP] [-p DIR] [-o OUTPUT] [-d OUTPUT_DIR]
[-c CONFIG] [-t TREE] [-r RAXML] [-a ALIGN] [-f FRAG]
[-m MOLECULE] [-x N] [-cp CHCK_FILE] [-cpi N] [-seed N]
This script runs the SEPP algorithm on an input tree, alignment, fragment
file, and RAxML info file.
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
DECOMPOSITION OPTIONS:
These options determine the alignment decomposition size and taxon
insertion size. If neither is given, the default is to align and place
at 10% of the total taxa. The alignment decomposition size must be less
than the taxon insertion size.
-A N, --alignmentSize N
max alignment subset size of N [default: 10% of the
total number of taxa or the placement subset size if
given]
-P N, --placementSize N
max placement subset size of N [default: 10% of the
total number of taxa or the alignment length
(whichever is bigger)]
-F N, --fragmentChunkSize N
maximum fragment chunk size of N. Helps control
memory usage. [default: 5000]
-D DISTANCE, --distance DISTANCE
minimum p-distance before stopping the
decomposition [default: 1]
-M DIAMETER, --diameter DIAMETER
maximum tree diameter before stopping the
decomposition [default: None]
-S DECOMP, --decomp_strategy DECOMP
decomposition strategy [default: using tree branch
length]
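The default rules for -A and -P above can be read as the following sketch. It is not taken from the SEPP source: `resolve_subset_sizes` is a hypothetical helper, "alignment length" in the -P default is interpreted here as the alignment subset size, and equal sizes are accepted because the documented defaults set both values to 10% of the taxa.

```python
def resolve_subset_sizes(total_taxa, alignment_size=None, placement_size=None):
    """Fill in -A/-P defaults as described in the help text (assumed reading)."""
    ten_percent = max(1, total_taxa // 10)  # 10% of the total number of taxa

    # -A default: the placement size if given, otherwise 10% of the taxa.
    if alignment_size is None:
        alignment_size = placement_size if placement_size is not None else ten_percent

    # -P default: the bigger of 10% of the taxa and the alignment size
    # (assumption: "alignment length" means the alignment subset size).
    if placement_size is None:
        placement_size = max(ten_percent, alignment_size)

    # The alignment subset must not exceed the placement subset.
    if alignment_size > placement_size:
        raise ValueError("alignment size must not exceed placement size")
    return alignment_size, placement_size


print(resolve_subset_sizes(1000))                      # neither option given
print(resolve_subset_sizes(1000, alignment_size=50))   # only -A given
print(resolve_subset_sizes(1000, placement_size=200))  # only -P given
```

Under these assumptions, passing only -P makes the alignment subsets as large as the placement subsets, so -A usually needs to be set as well when smaller alignment subsets are wanted.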
OUTPUT OPTIONS:
These options control output.
-p DIR, --tempdir DIR
Temporary files will be written to DIR. Full path
required. [default: /tmp/sepp]
-o OUTPUT, --output OUTPUT
output files with prefix OUTPUT. [default: output]
-d OUTPUT_DIR, --outdir OUTPUT_DIR
output to OUTPUT_DIR directory. Full path required.
[default: .]
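Because -p and -d expect full paths, a wrapper script can expand relative directories before passing them on. The sketch below uses only the flags listed above; `output_args` and the directory names are hypothetical.

```python
import os


def output_args(tempdir, outdir, prefix="output"):
    """Return -p/-d/-o arguments with both directories expanded to full paths."""
    return [
        "-p", os.path.abspath(tempdir),  # temporary files directory
        "-d", os.path.abspath(outdir),   # output directory
        "-o", prefix,                    # prefix for output file names
    ]


# Hypothetical relative directories, resolved against the current directory.
print(output_args("tmp/sepp", "results", prefix="run1"))
```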
INPUT OPTIONS:
These options control input. Running SEPP requires the following: a
backbone tree (in newick format), a RAxML_info file (the file generated
by RAxML during estimation of the backbone tree; pplacer uses this info
file to set model parameters), a backbone alignment file (in fasta
format), and a fasta file containing the fragments. The input sequences
are assumed to be DNA unless specified otherwise.
-c CONFIG, --config CONFIG
A config file, including options used to run SEPP.
Options provided as command line arguments overwrite
config file values for those options. [default: None]
-t TREE, --tree TREE Input tree file (newick format) [default: None]
-r RAXML, --raxml RAXML
RAxML_info file including model parameters, generated
by RAxML. [default: None]
-a ALIGN, --alignment ALIGN
Aligned fasta file [default: None]
-f FRAG, --fragment FRAG
fragment file [default: None]
-m MOLECULE, --molecule MOLECULE
Molecule type of sequences. Can be amino, dna, or rna
[default: dna]
OTHER OPTIONS:
These options control how SEPP is run.
-x N, --cpu N Use N cpus [default: number of cpus available on the
machine]
-cp CHCK_FILE, --checkpoint CHCK_FILE
checkpoint file [default: no checkpointing]
-cpi N, --interval N Interval (in seconds) between checkpoint writes. Has
an effect only when -cp is provided. [default: 3600]
-seed N, --randomseed N
random seed number. [default: 297834]
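Putting the options together, a complete invocation can be assembled as an argument list. This is a minimal sketch: `build_sepp_command` is a hypothetical helper, all file names are placeholders, and run_sepp.py is assumed to be in the current directory. Only flags shown in the help text above are used.

```python
import sys


def build_sepp_command(tree, raxml_info, alignment, fragments,
                       cpus=None, seed=None, checkpoint=None, interval=None):
    """Build an argument list for run_sepp.py from the four required inputs."""
    cmd = [sys.executable, "run_sepp.py",
           "-t", tree, "-r", raxml_info, "-a", alignment, "-f", fragments]
    if cpus is not None:
        cmd += ["-x", str(cpus)]
    if seed is not None:
        cmd += ["-seed", str(seed)]
    if checkpoint is not None:
        cmd += ["-cp", checkpoint]
        # -cpi has an effect only together with -cp, so add it only here.
        if interval is not None:
            cmd += ["-cpi", str(interval)]
    return cmd


# Placeholder file names; substitute real input files.
cmd = build_sepp_command("backbone.tre", "RAxML_info.backbone",
                         "backbone.fasta", "fragments.fasta",
                         cpus=4, seed=297834,
                         checkpoint="sepp.chk", interval=1800)
print(" ".join(cmd))
# To launch the run: import subprocess; subprocess.run(cmd, check=True)
```

Fixing the seed and CPU count explicitly makes repeated runs easier to compare, and the checkpoint file lets a long run be resumed.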