Output mash tree or equivalent? #50

flashton2003 · 2020-05-12T08:57:20Z

Hi,

It's nice that mob-suite tells me the plasmid in the reference database which is closest to each plasmid it identifies in my sample, but it would be good if it output a tree of mash distances so I can see how it relates to multiple plasmids in the database. Possibly within some mash distance threshold so that I don't end up with a tree with 12000 tips.

Or even just output the mash distance matrix of my plasmid vs everything in the database, so I can easily see how far it is from another plasmid of interest.

I can roll my own using mashtree, but others might find this useful.

Just a thought, thanks for the nice tool.

Best,

Phil

jrober84 · 2020-05-22T17:34:54Z

I will label this one as an enhancement for future versions. The clusters.txt file in the databases/ directory contains the typing information for all of the plasmids in the reference database. Each plasmid has a primary cluster designation, which aggregates similar plasmids at a mash distance of 0.06, and a secondary cluster designation at a distance of 0.025, which should capture near-duplicate sequences. You can select members of the same cluster from that file and build a tree with mash to see larger patterns. In our experience, draft and complete versions of the same plasmid can differ by up to 0.025 in mash distance, so if plasmids share the same cluster, you will want to use a more sensitive technique such as SNP typing to distinguish them further.
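As a rough illustration of the thresholds above, here is a minimal Python sketch that reads the tab-separated output of `mash dist` (reference ID, query ID, distance, p-value, shared hashes) and keeps only the references within a chosen distance of the query. The function name `classify_hits`, the example IDs and distances, and the use of `<=` at each threshold are assumptions made for illustration; this is not how MOB-suite itself assigns clusters.

```python
import csv
import io

PRIMARY_THRESHOLD = 0.06     # primary cluster distance quoted in this thread
SECONDARY_THRESHOLD = 0.025  # secondary (near-duplicate) distance quoted in this thread


def classify_hits(mash_dist_lines, max_distance=PRIMARY_THRESHOLD):
    """Return (reference_id, distance, label) for hits within max_distance, closest first.

    mash_dist_lines is any iterable of tab-separated `mash dist` output lines.
    The labels and the inclusive comparisons are illustrative assumptions.
    """
    hits = []
    for row in csv.reader(mash_dist_lines, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        reference_id, distance = row[0], float(row[2])
        if distance > max_distance:
            continue
        label = "secondary" if distance <= SECONDARY_THRESHOLD else "primary"
        hits.append((reference_id, distance, label))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical example rows; real input would come from a `mash dist` run.
example = io.StringIO(
    "refA\tmy_plasmid\t0.01\t0\t800/1000\n"
    "refB\tmy_plasmid\t0.05\t0\t300/1000\n"
    "refC\tmy_plasmid\t0.2\t1e-10\t10/1000\n"
)
for hit in classify_hits(example):
    print(hit)
```

Filtering by distance before building a tree keeps the number of tips manageable, which addresses the concern about ending up with thousands of tips.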

kbessonov1984 · 2020-05-22T17:53:56Z

Dear Phil, although it is not exactly what you need in terms of a distance matrix against all database entries, you can try our previous version (2.1.0) of MOB-Suite, which has a plasmid host-range phylogenetic tree reconstruction feature. It builds a phylogenetic tree based on plasmid features (replicon and cluster ID) and overlays it against all plasmid sequences and the corresponding taxonomy information in our database.

Thank you for the feature suggestion.

$ mob_typer -i plasmid.fasta -o mob-typer --host_range_detailed

flashton2003 · 2020-05-28T12:59:09Z

Thanks both!

jrober84 · 2022-05-26T17:42:15Z

I am thinking of creating a series of single-linkage flat clusters based on the mash distances between your input and the reference database, and providing basic summary statistics on the average pairwise distance within the primary mob_cluster. This will constrain the number of samples and make the comparisons sensible.
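To make the idea concrete, here is a minimal sketch of single-linkage flat clustering at a fixed distance threshold, followed by the mean pairwise distance within a cluster. At a fixed cutoff, single linkage is equivalent to finding the connected components of the graph whose edges are pairs at or below the threshold, so a small union-find is enough. The function names, the example distances, and the choice of 0.06 as the cutoff are assumptions for illustration only, not the planned MOB-suite implementation.

```python
from itertools import combinations
from statistics import mean


def single_linkage_clusters(names, distances, threshold):
    """Group names into flat single-linkage clusters at the given threshold.

    distances maps frozenset({a, b}) to the distance between a and b.
    """
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the root, compressing the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for pair, distance in distances.items():
        if distance <= threshold:
            a, b = tuple(pair)
            parent[find(a)] = find(b)  # merge the two components

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


def mean_pairwise_distance(members, distances):
    """Average distance over all pairs in a cluster (0.0 for a singleton)."""
    pairs = [distances[frozenset(pair)] for pair in combinations(members, 2)]
    return mean(pairs) if pairs else 0.0


# Hypothetical distances between a query plasmid and three references.
names = ["query", "p1", "p2", "p3"]
distances = {
    frozenset({"query", "p1"}): 0.02,
    frozenset({"p1", "p2"}): 0.05,
    frozenset({"query", "p2"}): 0.07,
    frozenset({"query", "p3"}): 0.30,
    frozenset({"p1", "p3"}): 0.30,
    frozenset({"p2", "p3"}): 0.30,
}
for members in single_linkage_clusters(names, distances, threshold=0.06):
    print(members, round(mean_pairwise_distance(members, distances), 4))
```

Note that single linkage chains: in the example, query and p2 land in the same cluster even though their direct distance is above the cutoff, because both are close to p1. Reporting the mean pairwise distance alongside each cluster helps flag that kind of chaining.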

jrober84 added the enhancement label May 22, 2020

jrober84 mentioned this issue Oct 26, 2020

Mash Dists wihtin and between clusters #71

Closed
