MAGstats: A Jupyter notebook to visualize the completeness and redundancy of MAGs and draft genomes
To visualize the completeness and redundancy of your MAGs, you need to prepare two files (see the example_data folder for examples):
- A Newick tree file (MAG_tree.nwk)
- A metadata file (MAG_metadata.tsv)

The metadata file must be tab-delimited and contain these columns in this order: MAG_ID, Length, Completion, Redundancy, GC_Content.
You can launch this Jupyter notebook on Binder by clicking the Binder launch badge, then upload your files to the example_data folder using the upload button in your project directory (where index.ipynb is located). Before running the whole notebook, you will probably need to edit these two lines in the first code block so that they match your file names:
nwk_file <- "MAG_tree.nwk"
bin_metadata_file <- "MAG_metadata.tsv"
- For the tree file, you can use GToTree to extract single-copy marker genes and produce the concatenated multiple sequence alignment, then use RAxML-NG to build the maximum-likelihood phylogenomic tree. GToTree also generates a tree file itself by running fasttree.
- For the metadata file, you can obtain the values from CheckM or anvi'o, format the table in Excel, and export it as a TSV file.
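
Because a spreadsheet export can silently change delimiters or column order, it may help to check the metadata file before uploading it. The following is a minimal standalone Python sketch, not part of the notebook itself. The function name check_metadata is hypothetical, the example row contains made-up values, and the check assumes that the five required columns must come first (whether the notebook tolerates extra trailing columns is not stated here).

```python
import csv
import io

# Column order required by the notebook, as listed above.
REQUIRED_COLUMNS = ["MAG_ID", "Length", "Completion", "Redundancy", "GC_Content"]


def check_metadata(handle):
    """Return a list of problems found in a tab-delimited metadata table.

    `handle` is any open text file object. An empty list means the header
    starts with the required columns and every row has as many fields as
    the header. (Hypothetical helper; assumes extra trailing columns are OK.)
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        return ["file is empty"]

    problems = []
    if header[: len(REQUIRED_COLUMNS)] != REQUIRED_COLUMNS:
        problems.append(
            f"header is {header}; expected it to start with {REQUIRED_COLUMNS}"
        )

    # Data rows start on line 2 of the file.
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: {len(row)} fields, header has {len(header)}"
            )
    return problems


# Made-up single-row table, used only to demonstrate the check.
example = (
    "MAG_ID\tLength\tCompletion\tRedundancy\tGC_Content\n"
    "MAG_001\t2500000\t95.2\t1.3\t45.1\n"
)
print(check_metadata(io.StringIO(example)))  # prints []
```

To check a real file, open it in text mode and pass the handle, for example check_metadata(open("MAG_metadata.tsv", newline="")). A comma-separated export would be reported as a header mismatch, since the whole header line would be read as a single field.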