Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convergence Error running GATK GermlineCNVCaller cohort mode #8952

Open
snakesch opened this issue Aug 14, 2024 · 5 comments
Open

Convergence Error running GATK GermlineCNVCaller cohort mode #8952

snakesch opened this issue Aug 14, 2024 · 5 comments

Comments

@snakesch
Copy link

Dear GATK developers,

Thank you very much for the tool! I recently encountered an issue while running GATK gCNV, specifically with the step GermlineCNVCaller in cohort mode. I ran the workflow using WDL and the GATK v4.3.0.0 docker image. I am not sure how the error should be interpreted and thus would like to seek your advice on debugging. Since the entire cohort contains 75 samples, I omitted some verbose debug messages in the error log below.

Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/paedyl01/disk1/louisshe/work/NGS/wdl/test_workflow_cnv/germline/cromwell-executions/CNVGermlineCohortWorkflow/d53c0a12-6b07-4f74-acb
21:05:38.275 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:05:38.297 DEBUG NativeLibraryLoader - Extracting libgkl_compression.so to /paedyl01/disk1/louisshe/tmp/gatk/libgkl_compression8230524266824482022.so
21:05:38.388 INFO  GermlineCNVCaller - ------------------------------------------------------------
21:05:38.388 INFO  GermlineCNVCaller - The Genome Analysis Toolkit (GATK) v4.3.0.0
21:05:38.388 INFO  GermlineCNVCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
21:05:38.388 INFO  GermlineCNVCaller - Executing as louisshe@paedyl02 on Linux v3.10.0-1160.11.1.el7.x86_64 amd64
21:05:38.389 INFO  GermlineCNVCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
21:05:38.389 INFO  GermlineCNVCaller - Start Date/Time: August 13, 2024 9:05:38 PM GMT
21:05:38.389 INFO  GermlineCNVCaller - ------------------------------------------------------------
21:05:38.389 INFO  GermlineCNVCaller - ------------------------------------------------------------
21:05:38.390 INFO  GermlineCNVCaller - HTSJDK Version: 3.0.1
21:05:38.390 INFO  GermlineCNVCaller - Picard Version: 2.27.5
21:05:38.390 INFO  GermlineCNVCaller - Built for Spark Version: 2.4.5
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.BUFFER_SIZE : 131072
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.CREATE_INDEX : false
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.CREATE_MD5 : false
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.CUSTOM_READER_FACTORY :
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.DISABLE_SNAPPY_COMPRESSOR : false
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.EBI_REFERENCE_SERVICE_URL_MASK : https://www.ebi.ac.uk/ena/cram/md5/%s
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.NON_ZERO_BUFFER_SIZE : 131072
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.REFERENCE_FASTA : null
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.SAM_FLAG_FIELD_FORMAT : DECIMAL
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:05:38.391 INFO  GermlineCNVCaller - HTSJDK Defaults.USE_CRAM_REF_DOWNLOAD : false
21:05:38.392 DEBUG ConfigFactory - Configuration file values:
21:05:38.395 DEBUG ConfigFactory -      gcsMaxRetries = 20
21:05:38.395 DEBUG ConfigFactory -      gcsProjectForRequesterPays =
21:05:38.395 DEBUG ConfigFactory -      gatk_stacktrace_on_user_exception = false
21:05:38.395 DEBUG ConfigFactory -      samjdk.use_async_io_read_samtools = false
21:05:38.395 DEBUG ConfigFactory -      samjdk.use_async_io_write_samtools = true
21:05:38.395 DEBUG ConfigFactory -      samjdk.use_async_io_write_tribble = false
21:05:38.395 DEBUG ConfigFactory -      samjdk.compression_level = 2
21:05:38.395 DEBUG ConfigFactory -      spark.kryoserializer.buffer.max = 512m
21:05:38.395 DEBUG ConfigFactory -      spark.driver.maxResultSize = 0
21:05:38.395 DEBUG ConfigFactory -      spark.driver.userClassPathFirst = true
21:05:38.395 DEBUG ConfigFactory -      spark.io.compression.codec = lzf
21:05:38.395 DEBUG ConfigFactory -      spark.executor.memoryOverhead = 600
21:05:38.395 DEBUG ConfigFactory -      spark.driver.extraJavaOptions =
21:05:38.395 DEBUG ConfigFactory -      spark.executor.extraJavaOptions =
21:05:38.395 DEBUG ConfigFactory -      codec_packages = [htsjdk.variant, htsjdk.tribble, org.broadinstitute.hellbender.utils.codecs]
21:05:38.395 DEBUG ConfigFactory -      read_filter_packages = [org.broadinstitute.hellbender.engine.filters]
21:05:38.395 DEBUG ConfigFactory -      annotation_packages = [org.broadinstitute.hellbender.tools.walkers.annotator]
21:05:38.395 DEBUG ConfigFactory -      cloudPrefetchBuffer = 40
21:05:38.395 DEBUG ConfigFactory -      cloudIndexPrefetchBuffer = -1
21:05:38.395 DEBUG ConfigFactory -      createOutputBamIndex = true
21:05:38.396 INFO  GermlineCNVCaller - Deflater: IntelDeflater
21:05:38.396 INFO  GermlineCNVCaller - Inflater: IntelInflater
21:05:38.396 INFO  GermlineCNVCaller - GCS max retries/reopens: 20
21:05:38.396 INFO  GermlineCNVCaller - Requester pays: disabled
21:05:38.396 INFO  GermlineCNVCaller - Initializing engine
21:05:38.399 DEBUG ScriptExecutor - Executing:
21:05:38.399 DEBUG ScriptExecutor -   python
21:05:38.399 DEBUG ScriptExecutor -   -c
21:05:38.399 DEBUG ScriptExecutor -   import gcnvkernel
21:06:10.792 DEBUG ScriptExecutor - Result: 0
21:06:10.792 INFO  GermlineCNVCaller - Done initializing engine
21:06:10.826 INFO  GermlineCNVCaller - Intervals specified...
log4j:WARN No appenders could be found for logger (org.broadinstitute.hdf5.HDF5Library).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
21:06:12.479 INFO  FeatureManager - Using codec IntervalListCodec to read file file:///paedyl01/disk1/louisshe/work/NGS/wdl/test_workflow_cnv/germline/cromwell-executions/CNVGuts/-947966988/Homo_sapiens_assembly38.bed.preprocessed.filtered.scattered.0154.interval_list
21:06:12.640 DEBUG FeatureDataSource - Cache statistics for FeatureInput /paedyl01/disk1/louisshe/work/NGS/wdl/test_workflow_cnv/germline/cromwell-executions/CNVGermlineCohort8/Homo_sapiens_assembly38.bed.preprocessed.filtered.scattered.0154.interval_list:/paedyl01/disk1/louisshe/work/NGS/wdl/test_workflow_cnv/germline/cromwell-executions/CNVGermli947966988/Homo_sapiens_assembly38.bed.preprocessed.filtered.scattered.0154.interval_list:
21:06:12.640 DEBUG FeatureCache - Cache hit rate  was 0.00% (0 out of 0 total queries)
21:06:12.645 INFO  IntervalArgumentCollection - Processing 4999155 bp from intervals
21:06:12.656 INFO  GermlineCNVCaller - Reading and validating annotated intervals...
21:06:18.914 WARN  GermlineCNVCaller - Sequence dictionary in annotated-intervals file does not match the master sequence dictionary.
21:06:19.130 INFO  GermlineCNVCaller - GC-content annotations for intervals found; explicit GC-bias correction will be performed...
21:06:19.200 INFO  GermlineCNVCaller - Running the tool in COHORT mode...
21:06:19.200 INFO  GermlineCNVCaller - Validating and aggregating data from input read-count files...
21:07:11.897 DEBUG ScriptExecutor - Executing:
21:07:11.897 DEBUG ScriptExecutor -   python
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/cohort_denoising_calling.418897092082188314.py
21:07:11.897 DEBUG ScriptExecutor -   --ploidy_calls_path=/paedyl01/disk1/louisshe/work/NGS/wdl/test_workflow_cnv/germline/cromwell-executions/CNVGermlineCohortWorkflow/d53c0a
21:07:11.897 DEBUG ScriptExecutor -   --output_calls_path=/paedyl01/disk1/louisshe/out/NMD/batch1_2023/batch1_all/cnv/cohort_calls/batch1_all-calls
21:07:11.897 DEBUG ScriptExecutor -   --output_tracking_path=/paedyl01/disk1/louisshe/out/NMD/batch1_2023/batch1_all/cnv/cohort_calls/batch1_all-tracking
21:07:11.897 DEBUG ScriptExecutor -   --random_seed=1984
21:07:11.897 DEBUG ScriptExecutor -   --modeling_interval_list=/paedyl01/disk1/louisshe/tmp/gatk/intervals6744296186531223263.tsv
21:07:11.897 DEBUG ScriptExecutor -   --output_model_path=/paedyl01/disk1/louisshe/out/NMD/batch1_2023/batch1_all/cnv/cohort_calls/batch1_all-model
21:07:11.897 DEBUG ScriptExecutor -   --enable_explicit_gc_bias_modeling=True
21:07:11.897 DEBUG ScriptExecutor -   --read_count_tsv_files
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A140019A.rc7689938763423585433.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A140227A.rc5534074413391544082.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A140228A.rc8415746856369873408.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A150001A.rc6707204643753825321.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A160688B.rc867291074142408537.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A180282.rc2654253522563750156.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A180285.rc7225441130334195352.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A180287.rc9197457766389224118.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A190117A.rc3193929221414883945.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A190118A.rc710110938611804899.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A190119A.rc5863858387543609595.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A200126.rc224011369846433499.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A200137.rc158441496946272590.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A200256.rc4161183391298828848.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/A200301.rc7021293169640503435.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/NM145A.rc1138236372314543774.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/NM146A.rc4638504608748835640.tsv
21:07:11.897 DEBUG ScriptExecutor -   /paedyl01/disk1/louisshe/tmp/gatk/NM156.rc4899167012623615948.tsv
21:07:11.897 DEBUG ScriptExecutor -   --psi_s_scale=1.000000e-02
21:07:11.897 DEBUG ScriptExecutor -   --mapping_error_rate=1.000000e-02
21:07:11.897 DEBUG ScriptExecutor -   --depth_correction_tau=1.000000e+04
21:07:11.897 DEBUG ScriptExecutor -   --q_c_expectation_mode=hybrid
21:07:11.897 DEBUG ScriptExecutor -   --num_samples_copy_ratio_approx=200
21:07:11.897 DEBUG ScriptExecutor -   --max_bias_factors=6
21:07:11.897 DEBUG ScriptExecutor -   --psi_t_scale=1.000000e-02
21:07:11.897 DEBUG ScriptExecutor -   --log_mean_bias_std=1.000000e-01
21:07:11.897 DEBUG ScriptExecutor -   --init_ard_rel_unexplained_variance=1.000000e-01
21:07:11.897 DEBUG ScriptExecutor -   --num_gc_bins=20
21:07:11.897 DEBUG ScriptExecutor -   --gc_curve_sd=1.000000e+00
21:07:11.897 DEBUG ScriptExecutor -   --active_class_padding_hybrid_mode=50000
21:07:11.897 DEBUG ScriptExecutor -   --enable_bias_factors=True
21:07:11.897 DEBUG ScriptExecutor -   --disable_bias_factors_in_active_class=False
21:07:11.897 DEBUG ScriptExecutor -   --p_alt=5.000000e-04
21:07:11.897 DEBUG ScriptExecutor -   --cnv_coherence_length=1.000000e+04
21:07:11.897 DEBUG ScriptExecutor -   --max_copy_number=5
21:07:11.897 DEBUG ScriptExecutor -   --p_active=0.100000
21:07:11.897 DEBUG ScriptExecutor -   --class_coherence_length=10000.000000
21:07:11.897 DEBUG ScriptExecutor -   --learning_rate=5.000000e-02
21:07:11.897 DEBUG ScriptExecutor -   --adamax_beta1=9.000000e-01
21:07:11.897 DEBUG ScriptExecutor -   --adamax_beta2=9.900000e-01
21:07:11.897 DEBUG ScriptExecutor -   --log_emission_samples_per_round=50
21:07:11.897 DEBUG ScriptExecutor -   --log_emission_sampling_rounds=10
21:07:11.897 DEBUG ScriptExecutor -   --log_emission_sampling_median_rel_error=5.000000e-03
21:07:11.897 DEBUG ScriptExecutor -   --max_advi_iter_first_epoch=5000
21:07:11.897 DEBUG ScriptExecutor -   --max_advi_iter_subsequent_epochs=100
21:07:11.897 DEBUG ScriptExecutor -   --min_training_epochs=10
21:07:11.897 DEBUG ScriptExecutor -   --max_training_epochs=100
21:07:11.897 DEBUG ScriptExecutor -   --initial_temperature=2.000000e+00
21:07:11.897 DEBUG ScriptExecutor -   --num_thermal_advi_iters=2500
21:07:11.897 DEBUG ScriptExecutor -   --convergence_snr_averaging_window=500
21:07:11.897 DEBUG ScriptExecutor -   --convergence_snr_trigger_threshold=1.000000e-01
21:07:11.897 DEBUG ScriptExecutor -   --convergence_snr_countdown_window=10
21:07:11.897 DEBUG ScriptExecutor -   --max_calling_iters=10
21:07:11.897 DEBUG ScriptExecutor -   --caller_update_convergence_threshold=1.000000e-03
21:07:11.897 DEBUG ScriptExecutor -   --caller_internal_admixing_rate=7.500000e-01
21:07:11.897 DEBUG ScriptExecutor -   --caller_external_admixing_rate=1.000000e+00
21:07:11.897 DEBUG ScriptExecutor -   --disable_caller=false
21:07:11.897 DEBUG ScriptExecutor -   --disable_sampler=false
21:07:11.897 DEBUG ScriptExecutor -   --disable_annealing=false
Traceback (most recent call last):
  File "/paedyl01/disk1/louisshe/tmp/gatk/cohort_denoising_calling.418897092082188314.py", line 179, in <module>
    warm_up_task.engage()
  File "/opt/miniconda/envs/gatk/lib/python3.6/site-packages/gcnvkernel/tasks/inference_task_base.py", line 346, in engage
    converged_continuous = self._update_continuous_posteriors()
  File "/opt/miniconda/envs/gatk/lib/python3.6/site-packages/gcnvkernel/tasks/inference_task_base.py", line 403, in _update_continuous_posteriors
    raise ConvergenceError
gcnvkernel.tasks.inference_task_base.ConvergenceError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/paedyl01/disk1/louisshe/tmp/gatk/cohort_denoising_calling.418897092082188314.py", line 184, in <module>
    sys.exit(gcnvkernel.io_consts.diverged_inference_exit_code)
NameError: name 'sys' is not defined
21:09:00.147 INFO  GermlineCNVCaller - Shutting down engine
[August 13, 2024 9:09:00 PM GMT] org.broadinstitute.hellbender.tools.copynumber.GermlineCNVCaller done. Elapsed time: 3.36 minutes.
Runtime.totalMemory()=3168796672
org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException:
python exited with 1

Much appreciated for your help indeed!

Best regards,
Louis

@gokalpcelik
Copy link
Contributor

Can you provide the whole log as an attachment? It is really hard to debug with missing information.

@snakesch
Copy link
Author

stderr.log

@gokalpcelik
Copy link
Contributor

How did you install the GATK Conda environment? Looks like a problem with the conda environment configuration. One possible reason could be that conda environment is not the version 4.3.0.0 is requesting.

@snakesch
Copy link
Author

I did not install the GATK conda environment. Instead, I created a singularity container using the GATK v4.3.0.0 docker image. Also, the GATK version I have outside container is v4.6.0.0, so the fact that GATK v4.3.0.0 got called means it was using the GATK came with the image.

@gokalpcelik
Copy link
Contributor

We don't support singularity so I am not sure how to help if that's the version you are using.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants