Runtime is on HG003 (all chromosomes).
Stage | Time (minutes) |
---|---|
make_examples | ~110m |
call_variants | ~185m |
postprocess_variants (with gVCF) | ~80m |
total | ~375m = ~6.3 hours |
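
The total row is the sum of the three stage times, converted to hours. A minimal sketch of that arithmetic, assuming `awk` is available; the helper name `stage_total` is hypothetical and not part of DeepVariant:

```bash
# Sum three stage times (in minutes) and also report the total in hours.
# stage_total is a hypothetical helper used only for illustration.
stage_total() {
  awk -v a="$1" -v b="$2" -v c="$3" \
    'BEGIN { t = a + b + c; printf "%d min = %.2f h\n", t, t / 60 }'
}

# make_examples, call_variants, postprocess_variants from the table above.
stage_total 110 185 80
```

The table rounds the hour figure to one decimal place.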
hap.py results on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out while training.
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 501523 | 2978 | 1207 | 0.994097 | 0.997696 | 0.995893 |
SNP | 3306397 | 21099 | 4556 | 0.993659 | 0.998625 | 0.996136 |
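
As a sanity check, recall and F1 in the table above can be recomputed from the other columns: recall is TRUTH.TP / (TRUTH.TP + TRUTH.FN), and F1 is the harmonic mean of precision and recall. The sketch below assumes `awk` is available; the helper names `recall` and `f1_score` are hypothetical. Precision is not recomputed because, as far as we understand hap.py, it is based on query-side counts that this table does not list.

```bash
# recall = TP / (TP + FN), printed to six decimals like the table.
recall() {
  awk -v tp="$1" -v fneg="$2" 'BEGIN { printf "%.6f\n", tp / (tp + fneg) }'
}

# F1 = 2 * P * R / (P + R), the harmonic mean of precision and recall.
f1_score() {
  awk -v p="$1" -v r="$2" 'BEGIN { printf "%.6f\n", 2 * p * r / (p + r) }'
}

recall 501523 2978          # INDEL row: TRUTH.TP, TRUTH.FN
f1_score 0.997696 0.994097  # INDEL row: METRIC.Precision, METRIC.Recall
```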
Runtime is on HG003 (all chromosomes).
Stage | Time (minutes) |
---|---|
make_examples | ~10m |
call_variants | ~2m |
postprocess_variants (with gVCF) | ~1m |
total | ~13m |
hap.py results on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out while training.
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 1020 | 31 | 14 | 0.970504 | 0.986717 | 0.978544 |
SNP | 24938 | 341 | 58 | 0.986511 | 0.997680 | 0.992064 |
Runtime is on HG003 (all chromosomes).
Stage | Time (minutes) |
---|---|
make_examples | ~125m |
call_variants | ~170m |
postprocess_variants (with gVCF) | ~75m |
total | ~370m = ~6.2 hours |
hap.py results on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out while training.
(The input BAM is phased already and DeepVariant was run with `--use_hp_information=true`.)
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 501805 | 2696 | 2661 | 0.994656 | 0.994935 | 0.994795 |
SNP | 3323555 | 3940 | 1642 | 0.998816 | 0.999507 | 0.999161 |
Runtime is on HG003 (all chromosomes).
Stage | Time (minutes) |
---|---|
make_examples | ~155m |
call_variants | ~170m |
postprocess_variants (with gVCF) | ~55m |
total | ~380m = ~6.3 hours |
Evaluating on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out while training the hybrid model.
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 503228 | 1273 | 1990 | 0.997477 | 0.996249 | 0.996863 |
SNP | 3323696 | 3799 | 1710 | 0.998858 | 0.999486 | 0.999172 |
For simplicity and consistency, we report runtime on a CPU instance with 64 CPUs. This is NOT the fastest or cheapest configuration. For more scalable execution of DeepVariant, see the External Solutions section.
Use `gcloud compute ssh` to log in to the newly created instance.
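
The login step might look like the sketch below. The instance name and zone are hypothetical placeholders, not values from this document; substitute the ones used when the instance was created. The helper `ssh_command` is also hypothetical and only prints the command so it can be reviewed before running.

```bash
# Hypothetical values; replace them with your own instance name and zone.
INSTANCE_NAME="deepvariant-case-study"
ZONE="us-west1-b"

# Build the login command as text instead of executing it right away.
ssh_command() {
  printf 'gcloud compute ssh %s --zone %s\n' "$1" "$2"
}

ssh_command "$INSTANCE_NAME" "$ZONE"
```

Run the printed command once the values look right.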
Download and run any of the following case study scripts:
# Get the script.
curl -O https://raw.githubusercontent.com/google/deepvariant/r1.2/scripts/inference_deepvariant.sh
# WGS
bash inference_deepvariant.sh --model_preset WGS
# WES
bash inference_deepvariant.sh --model_preset WES
# PacBio
bash inference_deepvariant.sh --model_preset PACBIO
# Hybrid
bash inference_deepvariant.sh --model_preset HYBRID_PACBIO_ILLUMINA
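
To keep each run's output for later inspection, the script output can be piped through `tee` into a per-preset log file. This is a sketch, not part of the case study: the `<preset>.log` file names and the `print_runs` helper are hypothetical, and the commands are only printed here so they can be reviewed before being run.

```bash
# Presets as used in the commands above.
PRESETS="WGS WES PACBIO HYBRID_PACBIO_ILLUMINA"

# Print one command per preset; each command sends stdout and stderr
# through tee into a hypothetical <preset>.log file.
print_runs() {
  for preset in $PRESETS; do
    echo "bash inference_deepvariant.sh --model_preset $preset 2>&1 | tee $preset.log"
  done
}

print_runs
```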
Runtime metrics are taken from the resulting log after each stage of DeepVariant, and the accuracy metrics come from the hap.py summary.csv output file.
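
The accuracy columns can be pulled out of the hap.py summary CSV with a short `awk` filter. This is a minimal sketch under stated assumptions: the file is comma-separated with a header row, the column names match the table headers used in this document, and a `Filter` column marks the PASS rows (if no such column exists, every row is kept). The helper name `summarize_happy` and the file name in the usage comment are hypothetical.

```bash
# Print Type, recall, precision, and F1 from a hap.py summary CSV.
# Columns are looked up by header name, so their order does not matter.
summarize_happy() {
  awk -F',' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Assumption: a "Filter" column marks PASS rows; without it, keep all rows.
    !("Filter" in col) || $(col["Filter"]) == "PASS" {
      print $(col["Type"]), $(col["METRIC.Recall"]), \
            $(col["METRIC.Precision"]), $(col["METRIC.F1_Score"])
    }
  ' "$1"
}

# Usage (hypothetical file name):
#   summarize_happy happy.output.summary.csv
```

Looking columns up by name rather than by position keeps the filter working if the CSV carries more columns than the tables in this document show.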