Releases: CGATOxford/UMI-tools

v1.1.6: Update version.py

03 Oct 10:15
3f2cb4f

UMI-tools output is now deterministic with --random-seed

Many users have had trouble making UMI-tools output deterministic, which previously required both --random-seed and the environment variable PYTHONHASHSEED to be set. From v1.1.6, only --random-seed is required.

Please note that in some cases the implemented solution may make the output from v1.1.6 differ from that of previous versions, even if --random-seed is set to the same value. The differences will be very slight, and the different outputs represent equally sensible UMI grouping/deduplication since they relate only to how ties are broken.
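To illustrate why hash order can affect tie-breaking, here is a minimal sketch. It is not the UMI-tools implementation: the function name `pick_representative` and the example counts are hypothetical. The idea is that if tied candidates are visited in set or hash order, the "first" one can change between runs unless PYTHONHASHSEED is fixed. Sorting the tied candidates and choosing with a locally seeded generator removes that dependency.

```python
import random


def pick_representative(umi_counts, seed):
    """Return the UMI with the highest count, breaking ties reproducibly.

    Hypothetical helper for illustration only; not part of UMI-tools.
    """
    top = max(umi_counts.values())
    # Sort the tied UMIs so the candidate order never depends on
    # set/dict iteration order (and hence not on PYTHONHASHSEED).
    tied = sorted(umi for umi, n in umi_counts.items() if n == top)
    # A locally seeded generator makes the random choice repeatable.
    return random.Random(seed).choice(tied)


counts = {"ATAT": 5, "GCGC": 5, "TTTT": 2}
first = pick_representative(counts, seed=42)
# Same data inserted in the opposite order gives the same answer.
second = pick_representative(dict(reversed(list(counts.items()))), seed=42)
```

With this approach the result depends only on the data and the seed, which is the property the release describes for --random-seed.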

Thank you @TyberiusPrime, @christianbioinf and others for their suggestions for how to remove the dependency on PYTHONHASHSEED for deterministic output.

New features

Bugfix

Documentation

  • FAQ entry regarding identification of possible duplicate reads/pairs - @TomSmithCGAT in #631
  • Improved docs regarding chimeric/unmapped/unpaired read pairs - @TomSmithCGAT in #629

Other

New Contributors

Full Changelog: 1.1.5...v1.1.6

1.1.5

14 Feb 11:12
cac1f00

New features

  • Enables read suffixes to be removed from single end data: @IanSudbery in #591. See #580 for motivating issue
  • Adds a script to prepare umi_tools dedup output for use with RSEM: @IanSudbery in #609. See #465 and #607 for motivating issues

Bugfix

Documentation

  • Fixed docs for dedup stats filenames: @msto in #604

New Contributors

  • @msto made their first contribution in #604

Full Changelog: 1.1.4...1.1.5

1.1.4

02 Mar 12:08
d98ebac

Debug to support python 3.11. Thank you @sjaenick for bringing this to our attention and testing (#563)

1.1.3

01 Mar 14:18
2472edd

New features

  • Adds '--umi-separator' option to umi_tools extract to specify UMI separator. Thanks @opplatek (#548)
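As a rough illustration of what the separator controls: extract moves the UMI off the read sequence and onto the read name, joined by a separator character. The helpers below (`append_umi`, `split_umi`) are hypothetical and are not UMI-tools code. They assume the UMI is the last separator-delimited field of the read name.

```python
def append_umi(read_name, umi, separator="_"):
    """Join a read name and its UMI with the chosen separator."""
    return f"{read_name}{separator}{umi}"


def split_umi(read_name, separator="_"):
    """Split on the LAST separator, so earlier ones in the name are kept."""
    name, _, umi = read_name.rpartition(separator)
    return name, umi


tagged = append_umi("read_001", "ACGTAC", separator=":")
name, umi = split_umi(tagged, separator=":")
```

A non-default separator is mainly useful when read names already contain the default character in a way that confuses downstream parsing.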

Optimisation

  • Speeds up read pair mate writing. Significant benefit for transcriptome alignments (#543)

Bugfix

1.1.2

06 May 23:12
0c7e86b

Bugfix

  • whitelist --filtered-out with SE reads threw an unassigned error. Thanks @yech1990 for rectifying this (#453)

Also includes a very minor update of syntax (#455)

1.1.1

18 Nov 19:58
124f1dc

Updates requirements for pysam version to >0.16.0.1. Thanks @sunnymouse25 (#444)

1.1.0

04 Nov 00:12
f65dfbf

A long overdue release covering some minor functionality updates and bugfixes:

Additional functionality:

  • Write out reads failing regex matching with extract/whitelist (see options --filtered-out, --filtered-out2). See #328 for motivation
  • Ignore template length with paired-end dedup/group (see option --ignore-tlen). See #357 for motivation. Thanks @skitcattCRUKMI
  • Ignore read pair suffixes with extract/whitelist, e.g. /1 or /2 (see option --ignore-read-pair-suffixes). See #325, #391, #418, PierreBSC/Viral-Track#9 for motivation
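The read pair suffix case can be sketched as follows. This is an assumption-laden illustration, not the UMI-tools code: the helper name `strip_pair_suffix` and the example read names are hypothetical. The point is that two mates named with /1 and /2 only compare equal once the suffix is removed.

```python
import re

# Matches a trailing "/1" or "/2" only; other suffixes are left alone.
_PAIR_SUFFIX = re.compile(r"/[12]$")


def strip_pair_suffix(read_name):
    """Remove a trailing /1 or /2 so both mates share one identifier."""
    return _PAIR_SUFFIX.sub("", read_name)


r1 = strip_pair_suffix("SRR123.1/1")
r2 = strip_pair_suffix("SRR123.1/2")
unchanged = strip_pair_suffix("SRR123.1/3")
```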

Performance

  • Sped up error correction mapping for cell barcodes in whitelist by using BKTree. Thanks @redst4r. Note that this adds a new python dependency (pybktree) which is available via pip and conda-forge.
  • Very slight reduction in memory usage for dedup/group via a bugfix that reduces the number of reads retained in the buffer. Thanks to @mitrinh1 for spotting this (#428). The bug was equivalent to hardcoding the option --buffer-whole-contig on, which ensures all reads with the same start position are grouped together for deduplication, but at the cost of not yielding reads until the end of each contig, thus increasing memory usage. As such, the bug was not detrimental to the results output.
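For readers unfamiliar with BK-trees, the following is a minimal standalone sketch of the data structure. It is not pybktree's code and not how whitelist calls it: the class, the choice of Hamming distance, and the example barcodes are all assumptions made for illustration. A BK-tree stores items by their distance to a parent node, which lets a search skip whole subtrees that cannot contain a match within the allowed distance.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length barcodes."""
    return sum(x != y for x, y in zip(a, b))


class BKTree:
    """Tiny BK-tree; each node is a tuple (item, {distance: child_node})."""

    def __init__(self, distance):
        self.distance = distance
        self.root = None

    def add(self, item):
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            parent, children = node
            d = self.distance(item, parent)
            if d == 0:
                return  # item already stored
            if d not in children:
                children[d] = (item, {})
                return
            node = children[d]

    def find(self, query, max_dist):
        """Return sorted (distance, item) pairs within max_dist of query."""
        if self.root is None:
            return []
        hits, stack = [], [self.root]
        while stack:
            item, children = stack.pop()
            d = self.distance(query, item)
            if d <= max_dist:
                hits.append((d, item))
            # Triangle inequality: matches can only sit in subtrees whose
            # edge label lies within [d - max_dist, d + max_dist].
            for edge, child in children.items():
                if d - max_dist <= edge <= d + max_dist:
                    stack.append(child)
        return sorted(hits)


whitelist = ["AAAA", "AAAT", "CCCC", "GGGG"]
tree = BKTree(hamming)
for barcode in whitelist:
    tree.add(barcode)
matches = tree.find("AAAC", 1)
```

In this example the search never visits CCCC or GGGG, which is where the speed-up over comparing the query against every whitelisted barcode comes from.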

Bugfixes:

  • Unmapped mates were not properly discarded with dedup and group. Thanks @Daniel-Liu-c0deb0t for rectifying this.

1.0.1: Merge pull request #385 from CGATOxford/{TS}-DebugCellTag

06 Dec 12:39
289b9cc

Debug for KeyError when some reads are missing a cell barcode tag and stats output is required from umi_tools dedup. See comments from @ZHUwj0 in #281

1.0.0

14 Feb 20:58
1207720

This release is intended to be a stable release with no plans for significant updates to UMI-tools functionality in the near future. As part of this release, much of the code base has been refactored. It is possible that this has introduced bugs which were not picked up by the regression testing. If so, please raise an issue and we'll try to rectify it with a minor release update ASAP.

Documentation

UMI-tools documentation is now available online: https://umi-tools.readthedocs.io/en/latest/index.html

Along with the previous documentation, the readthedocs pages also include new pages:

  • FAQ
  • Making use of our Algorithms: The API

New knee method for whitelist

  • The method to detect the "knee" in whitelist has been updated (#317). This method should always identify a threshold and is now set as the default method. Note that this knee method appears to be slightly more conservative (fewer cells above threshold) but having identified the knee, one can always re-run whitelist and use --set-cell-number to expand the whitelist if desired
  • The old method is still available via --knee-method=density
  • In addition, to run the old knee method but allow whitelist to exit without error even if a suitable knee point isn't identified, use the new --allow-threshold-error option (#249)
  • Putative errors in CBs above the knee can be detected using --ed-above-threshold (#309)
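To give a feel for what "finding the knee" means, here is a generic distance-based heuristic. It is only a sketch under stated assumptions and should not be read as the method whitelist actually uses: the function name `knee_index` and the example counts are hypothetical. Barcodes are ranked by read count, and the knee is taken as the rank where the cumulative fraction of reads is furthest above the diagonal.

```python
def knee_index(counts):
    """Return how many top-ranked barcodes lie at or above the knee.

    Generic heuristic for illustration only; not UMI-tools code.
    """
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("need at least one non-zero count")
    n = len(ordered)
    best_i, best_gap, running = 0, float("-inf"), 0
    for i, c in enumerate(ordered):
        running += c
        # Vertical gap between the cumulative curve and the diagonal;
        # it is proportional to the perpendicular distance from y = x.
        gap = running / total - (i + 1) / n
        if gap > best_gap:
            best_i, best_gap = i, gap
    return best_i + 1


# Three abundant barcodes followed by a tail of low-count ones.
counts = [1000, 950, 900, 5, 4, 3, 2, 1, 1, 1]
n_cells = knee_index(counts)
```

A heuristic of this shape always returns some threshold, which matches the behaviour described above. If the result looks too conservative, --set-cell-number lets the user widen the whitelist explicitly.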

Explicit options for handling chimeric & improper read pairs (#312)

The behaviour for chimeric read pairs, improper read pairs and unmapped reads can now be explicitly set with the --chimeric-pairs, --unpaired-reads and --unmapped-reads options.

New options

  • --temp-dir: Set the directory for temporary files (#254)
  • --either-read & --either-read-resolve: Extract the UMI from either read (#175)

Misc

  • Updates python testing version to 3.6.7 and drops python 2 testing
  • Replace deprecated imp import (#318)
  • Debug error with pysam <0.14 (#319)
  • Refactor module files
  • Moves documentation into dedicated module

0.5.5

16 Nov 17:23
442dea4

Mainly minor debugs and improved detection of incorrect command line options. Minor updates to documentation.

  • Resolves issues with correctly skipping reads which have not been assigned (#191 & #273). This involves the addition of the --assigned-status-tag option

Testing for OSX has been dropped due to unresolved issues with travis. We hope to resurrect this in the future!

In line with major python packages (e.g. https://www.numpy.org/neps/nep-0014-dropping-python2.7-proposal.html), support for python 2 will be dropped from January 1st 2019.