Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When using QUILT2.R to impute downsampled data with the gold standard file as the reference panel, an issue occurs. #46

Open
bbdragon1 opened this issue Nov 5, 2024 · 4 comments

Comments

@bbdragon1
Copy link

Hi,
The program only generates chunked panel data as RData files but does not proceed to the imputation step. When using a different reference panel with the same code, there are no issues; only the panel file is different. I have confirmed that the buffer, imputation start, and end position information are correct. I would like to understand why this happens and seek assistance. Here is my code:
QUILT2_PATH="QUILT2.R"
$QUILT2_PATH
--outputdir="$Output_dir"
--chr="$chrom"
--nCores="$Threads_num"
--regionStart="$regionStart"
--regionEnd="$regionEnd"
--buffer="$buffer"
--nGen=100
--bamlist="$Bamlist"
--reference_vcf_file="$Panel_file"
--save_prepared_reference=TRUE

@Zilong-Li
Copy link
Collaborator

Hey,

Can you show the log? Also, make sure you are re-running the command with different outputdir.

@bbdragon1
Copy link
Author

Hello, I have noticed the following error when running files, but I am not sure what the reason for this error is :
Error in mtm[!x, ] : (subscript) logical subscript too long
Calls: QUILT ... select_new_haps_mspbwt_v3 -> lapply -> FUN -> lapply -> FUN
Execution halted
Command exited with non-zero status 1. I would like to ask for your help.
The following is the log content:
[2024-11-10 10:01:37] Running QUILT(outputdir = gold_82_25_panel_preared_chr1_quilt2_output, chr = 1, method = diploid, regionStart = 1, regionEnd = 8000000, buffer = 500000, fflist = , bamlist = /..../bam/lcWGS_downsample_bam.list, cramlist = , sampleNames_file = , reference = , nCores = 1, nGibbsSamples = 7, n_seek_its = 3, n_burn_in_seek_its = NA, Ksubset = 600, Knew = 600, K_top_matches = 5, output_gt_phased_genotypes = TRUE, heuristic_match_thin = 0.1, output_filename = NULL, RData_objects_to_save = NULL, output_RData_filename = NULL, prepared_reference_filename = gold_82_25_panel_preared_chr1_quilt2_output/RData/QUILT_prepared_reference.1.1.8000000.RData, save_prepared_reference = FALSE, tempdir = NA, bqFilter = 17, panel_size = NA, posfile = , genfile = , phasefile = , maxDifferenceBetweenReads = 1e+10, make_plots = FALSE, make_plots_block_gibbs = FALSE, verbose = TRUE, shuffle_bin_radius = 5000, iSizeUpperLimit = 1e+06, record_read_label_usage = FALSE, record_interim_dosages = FALSE, use_bx_tag = TRUE, bxTagUpperLimit = 50000, addOptimalHapsToVCF = FALSE, estimate_bq_using_truth_read_labels = FALSE, override_default_params_for_small_ref_panel = TRUE, gamma_physically_closest_to = NA, seed = NA, hla_run = FALSE, downsampleToCov = 30, minGLValue = 1e-10, minimum_number_of_sample_reads = 2, nGen = 100, reference_vcf_file = , reference_haplotype_file = , reference_legend_file = , reference_sample_file = , reference_populations = NA, reference_phred = 30, reference_exclude_samplelist_file = , region_exclude_file = , genetic_map_file = , nMaxDH = NA, make_fake_vcf_with_sites_list = FALSE, output_sites_filename = NA, expRate = 1, maxRate = 100, minRate = 0.1, print_extra_timing_information = FALSE, small_ref_panel_block_gibbs_iterations = c(3, 6, 9), small_ref_panel_gibbs_iterations = 20, plot_per_sample_likelihoods = FALSE, use_small_eHapsCurrent_tc = FALSE, mspbwtL = 3, mspbwtM = 1, use_mspbwt = TRUE, mspbwt_nindices = 4, use_splitreadgl = FALSE, override_use_eMatDH_special_symbols = NA, use_hapMatcherR = TRUE, shard_check_every_pair = TRUE, use_eigen = TRUE, impute_rare_common = TRUE, rare_af_threshold = 0.001, make_heuristic_plot = FALSE, heuristic_approach = A, use_list_of_columns_of_A = TRUE, calculate_gamma_on_the_fly = TRUE)
[2024-11-10 10:01:37] Auto-set n_burn_in_seek_its to 2 i.e. only sample one dosage per Gibbs sample
[2024-11-10 10:01:37] Overriding default parameters for small reference panel
[2024-11-10 10:01:37] Observing number of reference haplotypes K=164
[2024-11-10 10:01:37] Reset n_seek_its from 3 to 1
[2024-11-10 10:01:37] Reset n_burn_in_seek_its from 2 to 0
[2024-11-10 10:01:37] Set Ksubset to 164 (no longer necessary)
[2024-11-10 10:01:37] Set Knew to 164 (no longer necessary)
[2024-11-10 10:01:37] Get BAM sample names
[2024-11-10 10:01:37] Done getting BAM sample names
[2024-11-10 10:01:37] There are 359 common SNPs and 359 SNPs overall in this region
[2024-11-10 10:01:37] There are 0 regions out of 11 below minimum recombination rate, setting them to minimum rate
[2024-11-10 10:01:37] There are 0 regions out of 11 above maximum recombination rate, setting them to maximum rate
[2024-11-10 10:01:37] Imputing sample: 1
[W::hts_idx_load3] The index file is older than the data file: /home/zhangz_group/zhangz4/data/USER/zhongzm/Me2/LcWGS/bam/P67C08.downsample.bam.bai
[W::hts_idx_load3] The index file is older than the data file: /home/zhangz_group/zhangz4/data/USER/zhongzm/Me2/LcWGS/bam/P67C08.downsample.bam.bai
[2024-11-10 10:01:50] The average depth of this sample is:1.41442521399119
[2024-11-10 10:01:50] There are 166 reads under consideration
[2024-11-10 10:01:50] i_gibbs=1, i_it = 1 small gibbs
[2024-11-10 10:01:50] i_gibbs=1, i_it = 1 full
Error in mtm[!x, ] : (subscript) logical subscript too long
Calls: QUILT ... select_new_haps_mspbwt_v3 -> lapply -> FUN -> lapply -> FUN
Execution halted
Command exited with non-zero status 1
Command being timed: "QUILT2.R --prepared_reference_filename=gold_82_25_panel_preared_chr1_quilt2_output/RData/QUILT_prepared_reference.1.1.8000000.RData --bamlist=/..../bam/lcWGS_downsample_bam.list --chr=1 --method=diploid --nCores=1 --regionStart=1 --regionEnd=8000000 --buffer=500000 --nGen=100 --outputdir=gold_82_25_panel_preared_chr1_quilt2_output"
User time (seconds): 13.19
System time (seconds): 0.37
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.58
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 691164
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 135628
Voluntary context switches: 299
Involuntary context switches: 53
Swaps: 0
File system inputs: 337752
File system outputs: 16
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 1

@Zilong-Li
Copy link
Collaborator

Hi,

Seems you have a small reference panel of 164 haplotypes. And we recommend to use QUILT1 for small reference panel. Also, we had a new release v2.0.1 which may fix some errors for small reference panel. Can you also try it? Let us know it helps.

Best,
Zilong.

@bbdragon1
Copy link
Author

Thank you for answering my questions. We will continue testing with version 2.0.1 next. Thank you for your help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants