Ideas

Update 08/05/17

We decided not to merge contigs
Top priority for Mohamed: finish the plots. (1) Sort genomes by coverage and report the total number of reads. (2) Color reads according to fidelity.
Use mock datasets and subsample them to obtain genomes covered by only a few reads
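
A minimal sketch of the subsampling step, assuming reads are held in memory as plain sequence strings and that "coverage" means total read bases divided by genome length. The function name, the seed parameter, and the target coverage value are hypothetical and only for illustration; a real run would work on FASTQ records.

```python
import random


def subsample_reads(reads, target_coverage, genome_length, seed=0):
    """Randomly keep roughly enough reads to reach target_coverage.

    Coverage is approximated as (total read bases) / genome_length.
    If the dataset is already at or below the target, all reads are returned.
    """
    reads = list(reads)
    total_bases = sum(len(r) for r in reads)
    current_coverage = total_bases / genome_length
    if current_coverage <= target_coverage:
        return reads
    # Keep the same fraction of reads as the fraction of coverage we want.
    fraction = target_coverage / current_coverage
    n_keep = max(1, round(len(reads) * fraction))
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return rng.sample(reads, n_keep)


# Example: 1000 reads of 100 bp on a 10 kb genome is 10x coverage;
# asking for 0.05x leaves only a handful of reads.
mock_reads = ["A" * 100 for _ in range(1000)]
few_reads = subsample_reads(mock_reads, target_coverage=0.05, genome_length=10_000)
```

Sampling by read count assumes reads of similar length; with very uneven read lengths the achieved coverage will drift from the target.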

Prepare the database by mapping bacterial substrings (sliding window) onto fungi. Also take all of RefSeq except fungi and map it onto the fungi, to mask the fungal genomes.
If a read maps entirely within a masked region, ignore it. If it spans both masked and non-masked regions, keep it if at least 30 bp(?) overlap the non-masked region
Does it make sense to do this masking inside the database? Between viruses, fungi, and plasmids?
Maybe consider LCA (lowest common ancestor) instead of just assigning multi-mapped reads (maybe for a future release, when we do bacteria)
If we do stringent masking, we can trust several reads and detect rare organisms. This is not available now?
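
A rough sketch of the read-filtering rule above, assuming masked regions and read alignments are half-open (start, end) coordinates on the same reference. The 30 bp threshold is the tentative value from these notes and is left as a parameter; the function names are hypothetical. Reads that touch no masked region are kept regardless of length, which is an assumption the notes do not spell out.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def unmasked_overlap(read_start, read_end, masked):
    """Number of bases of the read that fall outside all masked regions."""
    covered = 0
    for start, end in merge_intervals(masked):
        covered += max(0, min(read_end, end) - max(read_start, start))
    return (read_end - read_start) - covered


def keep_read(read_start, read_end, masked, min_unmasked=30):
    """Apply the rule: drop fully masked reads, keep spanning reads
    only if at least min_unmasked bases lie in non-masked sequence."""
    length = read_end - read_start
    unmasked = unmasked_overlap(read_start, read_end, masked)
    if unmasked == length:
        return True  # read does not touch any masked region (assumed: keep)
    return unmasked >= min_unmasked


masked_regions = [(100, 200), (150, 300)]  # overlapping masks merge to (100, 300)
```

With these regions, a read at (120, 220) is dropped as fully masked, a read at (50, 150) is kept with 50 unmasked bases, and a read at (280, 320) is dropped with only 20.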
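
To make the LCA idea concrete, here is a small sketch that assigns a multi-mapped read to the lowest common ancestor of all taxa it hits. It assumes the taxonomy is a tree given as a child-to-parent dictionary with the root having no entry; the taxon names below are placeholders, not real taxonomy.

```python
def lineage(taxon, parent):
    """Path from a taxon up to the root, starting with the taxon itself."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Lowest node shared by the lineages of all given taxa."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("need at least one taxon")
    first = lineage(taxa[0], parent)
    shared = set(first)
    for taxon in taxa[1:]:
        shared &= set(lineage(taxon, parent))
    # 'first' runs from leaf to root, so the first shared node is the lowest.
    for node in first:
        if node in shared:
            return node
    return None  # taxa belong to disconnected trees


# Placeholder taxonomy: two genera under one family.
toy_parent = {
    "species_a": "genus_x",
    "species_b": "genus_x",
    "species_c": "genus_y",
    "genus_x": "family_f",
    "genus_y": "family_f",
    "family_f": "root",
}
```

A read hitting species_a and species_b would be reported at genus_x, while one hitting species_a and species_c would only be reported at family_f.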
Make interactive graph
Properties of the graph: take only reads which are UNIQ, have a certain fidelity, etc.
Explore all technical parameters
Report % genome coverage separately for UNIQ, multi-mapped within, and multi-mapped across
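
One possible way to compute the per-category coverage report, assuming each alignment is already labelled with its category and given as half-open (start, end) coordinates on a single genome. The labels MULTI_WITHIN and MULTI_ACROSS are hypothetical names for the two multi-mapped classes; "% genome coverage" is taken here to mean breadth (fraction of positions covered at least once).

```python
CATEGORIES = ("UNIQ", "MULTI_WITHIN", "MULTI_ACROSS")


def covered_bases(intervals):
    """Number of positions covered by at least one half-open interval."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def coverage_report(alignments, genome_length):
    """Percent of the genome covered, reported separately per category.

    alignments: iterable of (category, start, end) tuples.
    """
    by_category = {c: [] for c in CATEGORIES}
    for category, start, end in alignments:
        by_category.setdefault(category, []).append((start, end))
    return {
        category: 100.0 * covered_bases(intervals) / genome_length
        for category, intervals in by_category.items()
    }


example = [
    ("UNIQ", 0, 100),
    ("UNIQ", 50, 150),            # overlaps the previous read
    ("MULTI_WITHIN", 200, 250),
    ("MULTI_ACROSS", 900, 1000),
    ("MULTI_ACROSS", 900, 1000),  # duplicate position, counted once
]
report = coverage_report(example, genome_length=1000)
```

Here the report would come out as 15% for UNIQ, 5% for MULTI_WITHIN, and 10% for MULTI_ACROSS.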