Releases: HKU-BAL/ClairS

v0.4.0

11 Oct 14:38

This version is a major update. The new features and benchmarks are explained in a technical note titled “Improving the performance of ClairS and ClairS-TO with new real cancer cell-line datasets and PoN”. A summary of changes:

  1. Starting from this version, ClairS provides two model types. An ssrs model is trained initially with synthetic samples and then augmented with real samples (e.g., ont_r10_dorado_sup_5khz_ssrs); an ss model is trained from synthetic samples only (e.g., ont_r10_dorado_sup_5khz_ss). The ssrs model provides better performance and fits most usage scenarios. The ss model can be used when a cancer type missing from model training is a concern. In v0.4.0, four real cancer cell-line datasets (HCC1937/BL, HCC1954/BL, H1437/BL, and H2009/BL) covering two cancer types (breast cancer and lung cancer), published by Park et al., were used for ssrs model training.
  2. Added BQ jittering in model training to address the difference in base quality (BQ) distribution between the training and calling datasets, which leads to a performance drop.
  3. Added the --indel_min_af option and adjusted the default minimum allelic fraction requirement for Indels to 0.1 on the ONT platform.
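
As an illustration of how the new model names and the --indel_min_af option fit into a run, the Python sketch below assembles a ClairS command line without executing it. The option --indel_min_af and the platform names come from the notes above. The run_clairs entry point and the remaining flags (--tumor_bam_fn, --normal_bam_fn, --ref_fn, --threads, --platform, --output_dir) are assumed to match the project README, and the function name and file paths are hypothetical; verify them against the README of your installed version.

```python
import shlex


def build_clairs_command(tumor_bam, normal_bam, ref_fa, output_dir,
                         platform="ont_r10_dorado_sup_5khz_ssrs",
                         indel_min_af=None, threads=8):
    """Assemble a ClairS command as an argument list (nothing is executed)."""
    cmd = [
        "run_clairs",                      # assumed entry point name
        "--tumor_bam_fn", tumor_bam,       # assumed flag names, see lead-in
        "--normal_bam_fn", normal_bam,
        "--ref_fn", ref_fa,
        "--threads", str(threads),
        "--platform", platform,            # ssrs or ss model name from the notes
        "--output_dir", output_dir,
    ]
    if indel_min_af is not None:
        # An allelic fraction only makes sense between 0 and 1.
        if not 0.0 <= indel_min_af <= 1.0:
            raise ValueError("indel_min_af must be between 0 and 1")
        cmd += ["--indel_min_af", str(indel_min_af)]
    return cmd


# Hypothetical paths; switch to the ss model and set the Indel AF explicitly.
cmd = build_clairs_command(
    "tumor.bam", "normal.bam", "ref.fa", "clairs_out",
    platform="ont_r10_dorado_sup_5khz_ss",
    indel_min_af=0.1,
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps paths with spaces intact when the list is later passed to a process launcher.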

v0.3.1

16 Aug 14:43
54e0d7a
  1. Added four options: i. --use_heterozygous_snp_in_tumor_sample_and_normal_bam_for_intermediate_phasing, ii. --use_heterozygous_snp_in_normal_sample_and_normal_bam_for_intermediate_phasing, iii. --use_heterozygous_snp_in_tumor_sample_and_tumor_bam_for_intermediate_phasing, and iv. --use_heterozygous_snp_in_normal_sample_and_tumor_bam_for_intermediate_phasing. Option iii is equivalent to --use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing added in v0.2.0. Option iv is equivalent to --use_heterozygous_snp_in_normal_sample_for_intermediate_phasing added in v0.2.0. Using the normal BAM for intermediate phasing was requested by @Sergey Aganezov. In our experiments with HCC1395/BL, when the coverage of the normal and tumor samples is similar, using the normal BAM for intermediate phasing made a negligible difference compared with using the tumor BAM.
  2. Added --haplotagged_tumor_bam_provided_so_skip_intermediate_phasing_and_haplotagging to use the haplotype information provided in the tumor BAM directly and skip intermediate phasing and haplotagging. This option is useful when using ClairS in a pipeline in which the phasing of the tumor BAM is done before running ClairS. BAMs haplotagged by WhatsHap or LongPhase are accepted.
  3. Bumped up the Clair3 dependency to version 1.0.10 and LongPhase to version 1.7.3.
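
Before skipping intermediate phasing and haplotagging, a pipeline may want to confirm that the tumor BAM really carries haplotype information. The sketch below is not part of ClairS. It assumes the haplotagging tool wrote the standard SAM `HP:i:` tag, and it works on SAM text lines (for example, the output of `samtools view -h`) so that it needs no third-party library. The function name and the sample records are hypothetical.

```python
def haplotagged_fraction(sam_lines):
    """Return the fraction of alignment records carrying an HP:i: tag."""
    total = 0
    tagged = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        total += 1
        # Optional tags start at the 12th column (index 11) of a SAM record.
        if any(field.startswith("HP:i:") for field in fields[11:]):
            tagged += 1
    return tagged / total if total else 0.0


# Hypothetical records: two haplotagged reads and one untagged read.
records = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tHP:i:1\tPS:i:100",
    "read2\t16\tchr1\t120\t60\t4M\t*\t0\t0\tACGT\tIIII\tHP:i:2\tPS:i:100",
    "read3\t0\tchr1\t150\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
print(f"haplotagged fraction: {haplotagged_fraction(records):.2f}")
```

A fraction near zero suggests the BAM was not haplotagged and that the skip option should not be used for it.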

v0.3.0

08 Jul 02:19
  1. Added a module called “verdict” (option --enable_verdict) to statistically classify a called variant as germline, somatic, or subclonal somatic based on the CNV profile and tumor purity estimation.
  2. Improved model training speed, cutting model training time roughly threefold.
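
To give a feel for why copy number and tumor purity help separate these classes, the sketch below computes the textbook expected variant allele fraction (VAF) for each case. This is not the verdict module's actual statistic, which these notes do not describe. It assumes a diploid normal genome, and the function name, parameters, and example values are all hypothetical.

```python
def expected_vaf(purity, tumor_cn, mult, *, germline=False, ccf=1.0):
    """Expected VAF of a variant in a tumor sample mixed with normal cells.

    purity   -- fraction of tumor cells in the sample (0..1)
    tumor_cn -- total copy number of the locus in tumor cells
    mult     -- number of tumor copies carrying the variant
    germline -- True for a heterozygous germline variant (one copy in normal cells)
    ccf      -- fraction of tumor cells carrying a somatic variant (1.0 = clonal)
    """
    # Average number of copies of the locus per cell (normal cells assumed diploid).
    total_copies = purity * tumor_cn + 2.0 * (1.0 - purity)
    if total_copies <= 0:
        raise ValueError("no copies of the locus are present in the sample")
    if germline:
        variant_copies = purity * mult + (1.0 - purity) * 1.0
    else:
        variant_copies = purity * ccf * mult
    return variant_copies / total_copies


# Hypothetical locus: 80% purity, copy-neutral tumor, one variant copy.
for label, kwargs in [
    ("germline het", {"germline": True}),
    ("clonal somatic", {}),
    ("subclonal somatic (CCF 0.5)", {"ccf": 0.5}),
]:
    print(label, round(expected_vaf(0.8, 2, 1, **kwargs), 3))
```

The three cases give different expected VAFs at the same locus, which is the kind of signal a statistical classifier can test the observed read counts against.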

v0.2.0

04 May 08:16
  1. Added the --use_heterozygous_snp_in_normal_sample_for_intermediate_phasing/--use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing options to support using heterozygous SNPs in either the normal sample or the tumor sample for intermediate phasing. Previous versions used in_tumor_sample for phasing. In this new version, when testing with ONT 4khz HCC1395/BL and using in_normal_sample for intermediate phasing, SNV precision improved by ~2% while recall remained unchanged. in_normal_sample becomes the default from this version. However, if the coverage of the normal sample is low, please consider switching back to in_tumor_sample (#22, idea contributed by the longphase team @sloth-eat-pudding).
  2. Added --use_heterozygous_indel_for_intermediate_phasing to include high-quality heterozygous Indels for intermediate phasing. With this new option, the number of haplotagged tumor reads increased by ~3% in ONT 4khz HCC1395/BL. The option becomes the default from this version.
  3. Added a model that might provide slightly better performance for liquid tumors. In this release, only the ONT Dorado 5khz HAC model for liquid tumors (-p ont_r10_dorado_hac_5khz_liquid) is provided. The model was trained with slightly higher normal contamination. We are testing the new model with collaborators.
  4. Added the --use_longphase_for_intermediate_haplotagging option to replace WhatsHap haplotagging with LongPhase haplotagging, which speeds up the read haplotagging process. The option becomes the default from this version.
  5. Bumped up the Clair3 dependency to version 1.0.7 and LongPhase to version 1.7.

v0.1.7

26 Jan 07:50
d1c5096
  1. Added ONT Dorado 5khz HAC (-p ont_r10_dorado_hac_5khz) and Dorado 4khz HAC (-p ont_r10_dorado_hac_4khz) models, and renamed all ONT Dorado SUP models; check here for more details.
  2. Enabled somatic variant calling in sex chromosomes.
  3. Added FAU, FCU, FGU, FTU, RAU, RCU, RGU, and RTU tags.
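
A downstream script might read the new tags as follows. This sketch makes two assumptions that these notes do not state: that the eight tags are per-sample FORMAT fields with integer values, and, judging only from their names, that they hold forward-strand (F) and reverse-strand (R) counts of each base. The VCF record is made up for illustration and the function names are hypothetical.

```python
STRAND_TAGS = ["FAU", "FCU", "FGU", "FTU", "RAU", "RCU", "RGU", "RTU"]


def parse_sample_fields(vcf_line, sample_index=0):
    """Map FORMAT keys to the values of one sample column in a VCF record."""
    cols = vcf_line.rstrip("\n").split("\t")
    keys = cols[8].split(":")                    # FORMAT is the 9th column
    values = cols[9 + sample_index].split(":")   # sample columns follow FORMAT
    return dict(zip(keys, values))


def strand_counts(vcf_line, sample_index=0):
    """Return the per-strand tags present in the record as integers."""
    fields = parse_sample_fields(vcf_line, sample_index)
    return {
        tag: int(fields[tag])
        for tag in STRAND_TAGS
        if tag in fields and fields[tag] != "."  # '.' marks a missing value
    }


# Hypothetical record, not taken from real ClairS output.
record = (
    "chr1\t1000\t.\tA\tG\t20.5\tPASS\t.\t"
    "GT:GQ:DP:AF:FAU:FCU:FGU:FTU:RAU:RCU:RGU:RTU\t"
    "0/1:20:40:0.25:15:0:5:0:15:0:5:0"
)
counts = strand_counts(record)
forward_g, reverse_g = counts["FGU"], counts["RGU"]
print(counts)
print("G support per strand:", forward_g, reverse_g)
```

Per-strand counts like these are typically used to spot strand bias, where the alternative allele is supported almost entirely by reads from one strand.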

v0.1.6

18 Sep 11:45
  1. Fixed an output bug that caused no VCF output if no Indel candidate was found (contributor @Khi Pin).
  2. Fixed showing incorrect reference allele depth at a deletion region.
  3. Added PacBio HiFi quick demo.

v0.1.5

02 Aug 07:08
  1. Updated SNV calling using ONT Dorado 4kHz data with a new model trained using multiple sample pairs (HG003/HG004).
  2. Updated SNV calling using ONT Dorado 5kHz data with a new model trained using multiple sample pairs (HG001/HG002, HG003/HG004).
  3. Added support for somatic Indel calling using ONT Dorado 4kHz data.
  4. Added support for somatic Indel calling using ONT Dorado 5kHz data.

v0.1.4

16 Jul 13:32
74f2e34
  1. Added reference depth in the AD tag.
  2. Added HiFi Sequel II Indel model.

v0.1.3

05 Jul 08:33

Added ONT Dorado 4khz (-p ont_r10_dorado_4khz) and 5khz (-p ont_r10_dorado_5khz) models; check here for more details.
Renamed platform options ont_r10 to ont_r10_guppy and ont_r9 to ont_r9_guppy.

v0.1.2

17 May 08:02

Added a HiFi Revio model and renamed the HiFi Sequel II model from hifi to hifi_sequel2.
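
Scripts written against older releases may still pass the pre-rename platform names. A minimal sketch of a compatibility shim, covering only the renames spelled out in the v0.1.2 and v0.1.3 notes, could look like this. The function and dictionary names are hypothetical, and the ONT Dorado SUP renames from v0.1.7 are left out because these notes do not list the new names.

```python
# Old platform name -> new platform name, as stated in the v0.1.2 and v0.1.3 notes.
RENAMED_PLATFORMS = {
    "ont_r10": "ont_r10_guppy",
    "ont_r9": "ont_r9_guppy",
    "hifi": "hifi_sequel2",
}


def normalize_platform(name):
    """Translate a pre-rename platform name; other names pass through unchanged."""
    return RENAMED_PLATFORMS.get(name, name)


for old in ["ont_r10", "hifi", "ont_r10_dorado_hac_5khz"]:
    print(old, "->", normalize_platform(old))
```

Passing unknown names through unchanged keeps the shim safe to apply to names that were never renamed.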