Enable running --str (or other subcomponents of the pipeline) more modularly #169

ilivyatan · 2024-04-04T08:13:20Z

Is your feature related to a problem?

Yes
Maintaining consistency of analyses in a reasonable timeframe.

Describe the solution you'd like

I ran the full pipeline on my samples, but forgot to set the --sex parameter for the --str analysis.
So I want to just run the --str part again, and have it use the necessary inputs that it has already generated.
Yet, running the pipeline again, designating only --str, starts rerunning the --snv analysis and haplotagging the BAM file, which takes a long time.
It would be great if it could locate the files it needs in the 'output' folder and just run the specific analysis.

Describe alternatives you've considered

I've run straglr independently. It takes less than 30 seconds for a human genome at 30x coverage, plus another 15 minutes for phasing with longphase and annotation with stranger. This works, but it is less streamlined and doesn't produce the nice reports that epi2me does. Only some of the samples need to be repeated, so keeping the analysis uniform across samples is important.
Another alternative would be to make the reporting tools available from the command line.

Additional context

I run epi2me via the Nextflow command line on a PromethION 24 machine.
Since snv analysis is a precursor to the other types of analyses, maybe there can be an option to designate whether snv analysis has already run, and supply the result files, so that the pipeline can continue with additional analyses.
For example, a routine could look like this:
First run only --snv, check out the results, and then run again with --sv.
(Phasing can be run at the end to connect everything up.)

RenzoTale88 · 2024-04-04T08:17:39Z

Hi @ilivyatan, this should be intrinsically possible with Nextflow if you keep your work directory and provide -resume. You can run the analysis again, adding or changing parameters as you wish, and the workflow should recognise what has already been run (in this case the --snp analysis) and execute only the new steps. However, if you change the parameters for the --snp analysis, the workflow will have to repeat all or some of those steps as well.
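
As a rough illustration of the rerun described above, here is a minimal Python sketch that assembles the command line for a resumed run. Only `nextflow run`, `-resume`, `--str` and `--sex` come from this thread or from Nextflow itself; `PIPELINE_NAME`, `SEX_VALUE` and the helper `build_resume_command` are hypothetical placeholders, and the list of original arguments has to be filled in with whatever the first run actually used. It also assumes the command is launched from the same directory as the first run, so that Nextflow can find its cache and the work directory.

```python
import shlex


def build_resume_command(pipeline, original_args, extra_args):
    """Build a `nextflow run` argument list for a resumed run.

    The original arguments are repeated unchanged so that cached tasks
    still match, the new arguments are appended, and -resume is added.
    """
    return ["nextflow", "run", pipeline, *original_args, *extra_args, "-resume"]


# Placeholders only: substitute the real pipeline name, the arguments of
# the first run (inputs, reference, analysis flags) and the real --sex value.
pipeline = "PIPELINE_NAME"
original_args = ["--str"]
extra_args = ["--sex", "SEX_VALUE"]

cmd = build_resume_command(pipeline, original_args, extra_args)

# Print the command for inspection instead of executing it.
print(shlex.join(cmd))
# nextflow run PIPELINE_NAME --str --sex SEX_VALUE -resume
```

Printing the command rather than running it keeps the sketch safe to try; once the placeholders are replaced, the printed line can be pasted into the shell.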

MemoonaRasheed · 2024-10-17T18:02:51Z

Hi @RenzoTale88, is it possible to run --str alone without -resume? I ran v1.2 earlier for SNP, SV, and mod calling, and now I want to do STR calling. I only specified --str, but according to the logs the sample is running SNP calling as well.
