Commit

updated readme
zhengzhenxian committed Apr 30, 2024
1 parent f2606f6 commit 7be913e
Showing 2 changed files with 8 additions and 3 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -91,7 +91,7 @@ For somatic variant calling using tumor only sample, please try [ClairS-TO](http
------

## Latest Updates
-*v0.2.0 (Apr 29)* : 1. Added "--use_heterozygous_snp_in_normal_sample_for_intermediate_phasing"/"--use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing" option to support using either heterozygous SNPs in the normal sample or tumor sample for intermediate phasing. The previous versions used in_tumor_sample for phasing. In this new version, when testing with ONT 4kkz HCC1395/BL and using in_normal_sample for intermediate phasing, the SNV precision improved ~2%, while recall remained unchanged. in_normal_sample becomes the default from this version. However, if the coverage of normal sample is low, please consider switching back to using in_tumor_sample ([#22](https://github.com/HKU-BAL/ClairS/issues/22), contributor @[sloth-eat-pudding](https://github.com/sloth-eat-pudding)). 2. Added "--use_heterozygous_indel_for_intermediate_phasing" to include high quality heterozygous Indels for intermediate phasing. With this new option, the haplotagged tumor reads increased by ~3% in ONT 4khz HCC1395/BL, the option becomes default from this version. 3. Added a model that might provide a slightly better performance for liquid tumor. In this release, only ONT Dorado 5khz HAC for liquid tumor (`-p ont_r10_dorado_hac_5khz_liquid`) is provided. The model was trained with slightly higher normal contamination. We are testing out the new model with collaborator. 4. Bumped up Clair3 dependency to version 1.0.7, LongPhase to version 1.7.
+*v0.2.0 (Apr 29)* : 1. Added `--use_heterozygous_snp_in_normal_sample_for_intermediate_phasing`/`--use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing` option to support using either heterozygous SNPs in the normal sample or tumor sample for intermediate phasing. The previous versions used in_tumor_sample for phasing. In this new version, when testing with ONT 4kkz HCC1395/BL and using in_normal_sample for intermediate phasing, the SNV precision improved ~2%, while recall remained unchanged. in_normal_sample becomes the default from this version. However, if the coverage of normal sample is low, please consider switching back to using in_tumor_sample ([#22](https://github.com/HKU-BAL/ClairS/issues/22), idea contributed by the longphase team @[sloth-eat-pudding](https://github.com/sloth-eat-pudding)). 2. Added `--use_heterozygous_indel_for_intermediate_phasing` to include high quality heterozygous Indels for intermediate phasing. With this new option, the haplotagged tumor reads increased by ~3% in ONT 4khz HCC1395/BL, the option becomes default from this version. 3. Added a model that might provide a slightly better performance for liquid tumor. In this release, only ONT Dorado 5khz HAC for liquid tumor (`-p ont_r10_dorado_hac_5khz_liquid`) is provided. The model was trained with slightly higher normal contamination. We are testing out the new model with collaborator. 4. Added `--use_longphase_for_intermediate_haplotagging` option to replace WhatsHap haplotagging by LongPhase haplotagging to speed up read haplotagging process, the option becomes default from this version. 5. Bumped up Clair3 dependency to version 1.0.7, LongPhase to version 1.7.

*v0.1.7 (Jan 25, 2024)* : 1. Added ONT Dorado 5khz HAC (`-p ont_r10_dorado_hac_5khz`) and Dorado 4khz HAC (`-p ont_r10_dorado_hac_4khz`) model, renamed all ONT Dorado SUP model, check [here](https://github.com/HKU-BAL/ClairS/blob/main/README.md#pre-trained-models) for more details. 2. Enabled somatic variant calling in sex chromosomes. 3. Added `FAU`, `FCU`, `FGU`, `FTU`, `RAU`, `RCU`, `RGU`, and `RTU` tags.

@@ -345,6 +345,8 @@ docker run -it hkubal/clairs:latest /opt/bin/run_clairs --help
                         EXPERIMENTAL: Use the heterozygous SNPs in tumor VCF called by Clair3 for intermediate phasing. Option: {True, False}. Default: False.
   --use_heterozygous_indel_for_intermediate_phasing USE_HETEROZYGOUS_INDEL_FOR_INTERMEDIATE_PHASING
                         EXPERIMENTAL: Use the heterozygous Indels in normal and tumor VCFs called by Clair3 for intermediate phasing. Option: {True, False}. Default: True.
+  --use_longphase_for_intermediate_haplotagging USE_LONGPHASE_FOR_INTERMEDIATE_HAPLOTAGGING
+                        EXPERIMENTAL: Use the longphase instead of whatshap for intermediate haplotagging. Option: {True, False}. Default: True.
   --indel_output_prefix INDEL_OUTPUT_PREFIX
                         Prefix for Indel output VCF filename. Default: indel.
   --indel_pileup_model_path INDEL_PILEUP_MODEL_PATH
7 changes: 5 additions & 2 deletions run_clairs
@@ -663,6 +663,9 @@ def check_args(args):
     if args.use_heterozygous_indel_for_intermediate_phasing is None:
         args.use_heterozygous_indel_for_intermediate_phasing = True

+    if args.use_longphase_for_intermediate_haplotagging is None:
+        args.use_longphase_for_intermediate_haplotagging = True
+
     if args.genotyping_mode_vcf_fn is not None or args.hybrid_mode_vcf_fn is not None:
         logging(log_warning("[INFO] Enable --print_ref_calls and --print_germline_calls options in genotyping mode!"))
         args.print_ref_calls = True
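
The pattern in this hunk, where a flag is declared with `default=None` and `check_args` later fills in the effective default, can be sketched in isolation as follows. This is a minimal illustration, not the code in `run_clairs`: the `str2bool` helper below is a hypothetical stand-in for the project's own function of the same name, and `build_parser` and `resolve_defaults` are names invented for this example.

```python
import argparse


def str2bool(value):
    """Hypothetical stand-in: parse common true/false spellings."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(
        "Boolean value expected, got {!r}".format(value)
    )


def build_parser():
    parser = argparse.ArgumentParser()
    # default=None records "the user did not pass this flag".
    parser.add_argument(
        "--use_longphase_for_intermediate_haplotagging",
        type=str2bool,
        default=None,
    )
    return parser


def resolve_defaults(args):
    # Apply the effective default only when the flag was left unset.
    if args.use_longphase_for_intermediate_haplotagging is None:
        args.use_longphase_for_intermediate_haplotagging = True
    return args


if __name__ == "__main__":
    flag = "--use_longphase_for_intermediate_haplotagging"
    for argv in ([], [flag, "False"]):
        args = resolve_defaults(build_parser().parse_args(argv))
        print(argv, "->", args.use_longphase_for_intermediate_haplotagging)
```

Keeping `None` distinct from `True` and `False` lets any code that runs before the defaults are resolved tell an explicit user choice from an implied default.
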
@@ -805,6 +808,7 @@ def print_command_line(args):
         cmdline += '--use_heterozygous_snp_in_normal_sample_for_intermediate_phasing {}'.format(args.use_heterozygous_snp_in_normal_sample_for_intermediate_phasing) if args.use_heterozygous_snp_in_normal_sample_for_intermediate_phasing is not None else ""
         cmdline += '--use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing {}'.format(args.use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing) if args.use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing is not None else ""
         cmdline += '--use_heterozygous_indel_for_intermediate_phasing {}'.format(args.use_heterozygous_indel_for_intermediate_phasing) if args.use_heterozygous_indel_for_intermediate_phasing is not None else ""
+        cmdline += '--use_longphase_for_intermediate_haplotagging {}'.format(args.use_longphase_for_intermediate_haplotagging) if args.use_longphase_for_intermediate_haplotagging is not None else ""
         cmdline += '--conda_prefix {} '.format(args.conda_prefix) if args.conda_prefix is not None else ""
         args.cmdline = cmdline
     except:
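
This hunk rebuilds the invocation as a string with one `+=` per option, so each fragment has to carry its own trailing separator. As a hedged alternative sketch (not the approach `run_clairs` takes), the fragments can be collected in a list and joined once, so spacing no longer depends on each format string. `render_cmdline` and the `names` tuple are hypothetical, introduced only for illustration.

```python
from argparse import Namespace


def render_cmdline(args, option_names):
    """Return '--name value' pairs for every option whose value is not None."""
    parts = []
    for name in option_names:
        value = getattr(args, name, None)
        if value is not None:
            parts.append("--{} {}".format(name, value))
    # A single join puts exactly one space between fragments.
    return " ".join(parts)


if __name__ == "__main__":
    args = Namespace(
        use_heterozygous_indel_for_intermediate_phasing=True,
        use_longphase_for_intermediate_haplotagging=True,
        conda_prefix=None,  # unset options are skipped
    )
    names = (
        "use_heterozygous_indel_for_intermediate_phasing",
        "use_longphase_for_intermediate_haplotagging",
        "conda_prefix",
    )
    print(render_cmdline(args, names))
```
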
@@ -1752,12 +1756,11 @@ def somatic_parser():
         help="EXPERIMENTAL: Use the heterozygous Indels in normal and tumor VCFs called by Clair3 for intermediate phasing. Option: {True, False}. Default: True."
     )

-    #EXPERIMENTAL: Use the heterozygous Indels in normal and tumor VCFs called by Clair3 for intermediate phasing. Option: {True, False}. Default: True.
     optional_params.add_argument(
         "--use_longphase_for_intermediate_haplotagging",
         type=str2bool,
         default=None,
-        help=SUPPRESS
+        help="EXPERIMENTAL: Use the longphase instead of whatshap for intermediate haplotagging. Option: {True, False}. Default: True."
     )

     # options for internal process control
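
The change from `help=SUPPRESS` to a help string is what makes the option appear in `--help` output. A minimal sketch of that `argparse` behavior, assuming `SUPPRESS` in `run_clairs` refers to `argparse.SUPPRESS`; the `demo` program name, `--example_option`, and `build_parser` are hypothetical:

```python
import argparse


def build_parser(expose_option):
    parser = argparse.ArgumentParser(prog="demo")
    # argparse.SUPPRESS hides the option from usage and help text;
    # the option itself is still accepted on the command line.
    help_text = (
        "EXPERIMENTAL: example help text."
        if expose_option
        else argparse.SUPPRESS
    )
    parser.add_argument("--example_option", default=None, help=help_text)
    return parser


if __name__ == "__main__":
    hidden = build_parser(expose_option=False)
    shown = build_parser(expose_option=True)
    print("listed when suppressed:", "--example_option" in hidden.format_help())
    print("listed when documented:", "--example_option" in shown.format_help())
    # A suppressed option still parses normally.
    print(hidden.parse_args(["--example_option", "x"]).example_option)
```
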