PGGB takes more than 96 hours of walltime on HPC #407

kiratalreja3 · 2024-09-02T23:34:31Z

Hi team,

I am trying to replicate the HPRC year1v2 PGGB steps described here: https://github.com/pangenome/HPRCyear1v2genbank

I am using all of the HPRC assemblies, 20 haplotypes from my own data, and the CHM13 and GRCh38 references, which brings the dataset to a total of 116 assemblies.

I followed the steps to divide the dataset into chromosome-specific fasta files (partition), making a combined file for sex chromosomes and acrocentric chromosomes as mentioned in the link above.

Quoting the methods of the draft human pangenome paper:
“We then applied PGGB (v.0.2.0+531f85f) to each partition to build a chromosome-specific graph. Run in parallel over 6 PowerEdge R6515 AMD EPYC 7402P 24-core nodes with 384 GB of RAM, this process requires 22.49 system days, or around 3.7 days wallclock.”

However, in my case, the acrocentric chromosome community is exceeding 96 hours of walltime (limit of my shared HPC), even when given significantly more resources - a full node with 48 cores & 1440GB RAM.
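For scale, the quoted figures can be converted to hours and set against the 96-hour limit. This is only a rough sketch: it assumes the work was spread evenly across the six nodes, and the variable names are illustrative. Note that the quoted runtime covers all partitions together, not the acrocentric partition alone.

```shell
# Figures quoted from the paper's methods text above.
system_days=22.49
nodes=6
walltime_limit_hours=96   # limit on the shared HPC described above

# Ideal per-node share, assuming perfectly even load across nodes (an assumption).
ideal_hours=$(awk -v d="$system_days" -v n="$nodes" 'BEGIN { printf "%.1f", d / n * 24 }')

# The paper's reported wallclock of "around 3.7 days", expressed in hours.
reported_hours=$(awk 'BEGIN { printf "%.1f", 3.7 * 24 }')

echo "ideal: ${ideal_hours} h, reported: ${reported_hours} h, limit: ${walltime_limit_hours} h"
```

By this estimate, the whole-genome build in the paper fits within about 90 hours on six nodes, so a single partition running past 96 hours on a larger node is well outside the reported figures.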

This is the PGGB command I am launching:
pggb -I chrAcrocentric.fasta.gz -o "${PBS_JOBFS}/chrAcrocentric.pggb.out" -n 116 -p 98 -s 100000 -k 331 -O 0.03 -m -A -S -V chm13,grch38 -t 48 -T 48

Am I doing something wrong here? With these resources, I would not expect it to exceed 96 hours.
Would masking out the ribosomal DNA of these chromosomes and building it as a separate graph be a way to go?

AndreaGuarracino · 2024-09-04T22:16:36Z

Hi @kiratalreja3, are you using the latest version of PGGB? A lot has changed in the last year or so, including the sensitivity of the mapping phase in WFMASH. Higher sensitivity can lead to graphs that represent more variation, which is harder to handle, especially with acrocentric chromosomes. Which PGGB step is taking up the majority of your runtime?
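One rough way to see which step a running job is in is to list the tool processes PGGB has launched, together with their elapsed time. The following is a minimal sketch, not part of PGGB itself: it assumes a procps-style `ps`, that the stages show up under the process names listed, and `filter_stages` is a hypothetical helper name.

```shell
# Process names of the tools PGGB calls (assumed to match the `comm` column of ps).
pggb_stages='wfmash|seqwish|smoothxg|gfaffix|odgi|vg'

filter_stages() {
  # Keep only lines whose last field (the command name) is a known stage.
  awk -v re="^(${pggb_stages})$" '$NF ~ re'
}

# Print elapsed time and command name for every matching process on this node.
ps -eo etime=,comm= | filter_stages
```

Run on the compute node while the job is active, this prints one line per matching process, for example a long-running `wfmash` or `smoothxg`. If nothing is printed, no process with one of those names is currently running.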
