Release 1.1.0 · CGATOxford/UMI-tools

A long overdue release covering some minor functionality updates and bugfixes:

Write out reads that fail regex matching with extract/whitelist (see options --filtered-out and --filtered-out2). See #328 for motivation.
Ignore template length with paired-end dedup/group (see option --ignore-tlen). See #357 for motivation. Thanks @skitcattCRUKMI.
Ignore read pair suffixes, e.g. /1 or /2, with extract/whitelist (see option --ignore-read-pair-suffixes). See #325, #391, #418 and PierreBSC/Viral-Track#9 for motivation.
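
To illustrate the read pair suffix case, here is a minimal Python sketch. It assumes the point of the option is to let the identifiers of two mates compare equal when they differ only by a trailing /1 or /2. The helper names `strip_pair_suffix` and `mates_match` are hypothetical and are not taken from the UMI-tools code base.

```python
def strip_pair_suffix(read_name: str) -> str:
    """Drop a trailing /1 or /2 from a read identifier (hypothetical helper)."""
    if read_name.endswith(("/1", "/2")):
        return read_name[:-2]
    return read_name


def mates_match(name1: str, name2: str, ignore_suffixes: bool = False) -> bool:
    """Check whether two read identifiers belong to the same pair.

    With ignore_suffixes=True, a trailing /1 or /2 is removed from both
    names before they are compared (illustrative behaviour only).
    """
    if ignore_suffixes:
        name1 = strip_pair_suffix(name1)
        name2 = strip_pair_suffix(name2)
    return name1 == name2


# Without the suffix handling the two mates look like different reads.
print(mates_match("read42/1", "read42/2"))                        # False
print(mates_match("read42/1", "read42/2", ignore_suffixes=True))  # True
```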

Sped up error correction mapping for cell barcodes in whitelist by using a BK-tree. Thanks @redst4r. Note that this adds a new Python dependency (pybktree), which is available via pip and conda-forge.
Very slight reduction in memory usage for dedup/group via a bugfix that reduces the number of reads retained in the buffer. Thanks to @mitrinh1 for spotting this (#428). The bug was equivalent to hardcoding the option --buffer-whole-contig on. That option ensures all reads with the same start position are grouped together for deduplication, at the cost of not yielding reads until the end of each contig, which increases memory usage. As such, the bug did not affect the results.
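
For readers unfamiliar with BK-trees, the sketch below shows why they help with barcode error correction: a query only descends into subtrees whose edge distance could still contain a match, so most barcodes are never compared. This is a plain-Python illustration of the data structure under the assumption of Hamming distance on equal-length barcodes. It is not the pybktree API and not the UMI-tools implementation, and the class and function names are chosen for this example only.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions (assumes equal-length barcodes)."""
    return sum(x != y for x, y in zip(a, b))


class BKTree:
    """Minimal BK-tree; each node is a tuple (item, {edge_distance: child})."""

    def __init__(self, distance):
        self.distance = distance
        self.root = None

    def add(self, item):
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            d = self.distance(item, node[0])
            if d == 0:
                return  # item already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (item, {})
                return
            node = child

    def find(self, query, threshold):
        """Return sorted (distance, item) pairs within threshold of query."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            item, children = stack.pop()
            d = self.distance(query, item)
            if d <= threshold:
                results.append((d, item))
            # Triangle inequality: only these edges can lead to a match.
            for edge, child in children.items():
                if d - threshold <= edge <= d + threshold:
                    stack.append(child)
        return sorted(results)


tree = BKTree(hamming)
for barcode in ["AAAA", "CCCC", "GGGG"]:  # made-up whitelist barcodes
    tree.add(barcode)

print(tree.find("AAAT", 1))  # [(1, 'AAAA')]
```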

Unmapped mates were not properly discarded with dedup and group. Thanks @Daniel-Liu-c0deb0t for rectifying this.
