Expose resource files and reference genome to users #7

cgpu · 2019-11-25T16:30:09Z

This is an enhancement request that would make it possible to run GenomeChronicler with any reference genome.

It would also help avoid GATK conflicts with the Sarek resources, especially for the non-canonical chromosomes.

The idea is to keep all the current files (listed below) as defaults, but also expose them to the user as parameters. This will allow me to expose them as parameters in the GenomeChronicler Nextflow process and keep them in sync with the resource files used by the preceding processes.

Going further, we could split each step of the main Perl script into its own process, giving N processes in a fully Nextflow-ified version. As a first grooming step, we could fill in the table below with the GenomeChronicler step that uses each reference file or resource, and continue from there.

{process placeholder}	resource file
process	1kGP_GRCh38_exome.bed
process	1kGP_GRCh38_exome.bim
process	1kGP_GRCh38_exome.fam
process	GRCh38_full_analysis_set_plus_decoy_hla_noChr.dict
process	GRCh38_full_analysis_set_plus_decoy_hla_noChr.fa
process	GRCh38_full_analysis_set_plus_decoy_hla_noChr.fa.fai
process	clinvar.db
process	genosetDependencies.txt
process	getevidence.db
process	gnomad.db
process	parsedGenosets.txt
process	snpedia.db
process	snps.19-114.unique.nochr.bed
process	snps.19-114.unique.nochr.bed.gz
process	snps.19-114.unique.nochr.bed.gz.tbi
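
As a rough illustration of "defaults that are also exposed as parameters", here is a minimal Python sketch. It is not GenomeChronicler code: the key names (`reference_fasta`, `snp_bed`, and so on), the `resolve_resources` helper, and the `reference` directory are all assumptions made for the example. Only the file names come from the list above.

```python
from pathlib import Path

# Default resource files, taken from the list above.
# The key names are hypothetical and chosen only for this sketch.
DEFAULT_RESOURCES = {
    "reference_fasta": "GRCh38_full_analysis_set_plus_decoy_hla_noChr.fa",
    "reference_dict": "GRCh38_full_analysis_set_plus_decoy_hla_noChr.dict",
    "snp_bed": "snps.19-114.unique.nochr.bed.gz",
    "clinvar_db": "clinvar.db",
}


def resolve_resources(overrides=None, resource_dir="reference"):
    """Return a name -> Path mapping; user overrides replace the defaults.

    `resource_dir` is an assumed location for the bundled default files.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_RESOURCES)
    if unknown:
        # Fail early on typos instead of silently ignoring a parameter.
        raise KeyError(f"Unknown resource parameter(s): {sorted(unknown)}")

    resolved = {}
    for name, default in DEFAULT_RESOURCES.items():
        if name in overrides:
            resolved[name] = Path(overrides[name])
        else:
            resolved[name] = Path(resource_dir) / default
    return resolved


resources = resolve_resources({"reference_fasta": "/data/custom_genome.fa"})
```

Anything the user does not override keeps its default, so existing runs would behave as before while a different reference genome can be swapped in per parameter.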

@afonsoguerra feel free to add the enhancement label, since this is not a bug but a nice-to-have addition.


afonsoguerra · 2019-11-26T16:54:48Z

Moving forward, I think the different scripts need to be linked by a per-run configuration file that includes all the parameters, instead of passing them on the command line. That way an arbitrary number of parameters can be set and passed only once. Once the manuscript is out of the door I'll look into doing that.
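
A per-run configuration file of this kind could look roughly like the Python sketch below. The JSON format, the file name `run_config.json`, the key names, and the `load_run_config` helper are all assumptions for illustration; the thread does not specify a format.

```python
import json
import tempfile
from pathlib import Path


def load_run_config(path, defaults):
    """Read a per-run JSON config and layer it over the defaults."""
    with open(path) as handle:
        user_settings = json.load(handle)
    config = dict(defaults)       # copy, so the defaults stay untouched
    config.update(user_settings)  # values from the run file win
    return config


# Hypothetical defaults; the file names are from the resource list in this issue.
defaults = {"gnomad_db": "gnomad.db", "snpedia_db": "snpedia.db"}

with tempfile.TemporaryDirectory() as tmp:
    # Write an example run file that overrides a single resource.
    run_file = Path(tmp) / "run_config.json"
    run_file.write_text(json.dumps({"gnomad_db": "/data/gnomad_custom.db"}))
    config = load_run_config(run_file, defaults)
```

Each script would then read the same file once, rather than receiving a growing list of command-line arguments.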

cgpu · 2019-11-27T10:59:34Z

Sounds good to me! Since you are already familiar with Sarek, we can work together to implement this the pure Nextflow way: a nextflow.config with defaults that also lets the user specify custom values, either completely freely or from sensible options (e.g. reference and resource bundles curated by us).
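
The "curated bundle or fully custom" choice can be sketched in Python as follows. This only illustrates the selection logic, not Nextflow configuration syntax. The bundle name `GRCh38`, the key names, and the `select_resources` helper are hypothetical; only the file names come from this issue.

```python
# Curated bundles keyed by a hypothetical bundle name.
# Only the GRCh38 file names are taken from this issue.
BUNDLES = {
    "GRCh38": {
        "reference_fasta": "GRCh38_full_analysis_set_plus_decoy_hla_noChr.fa",
        "snp_bed": "snps.19-114.unique.nochr.bed.gz",
    },
}


def select_resources(bundle="GRCh38", **custom):
    """Start from a curated bundle, then apply any free-form custom paths."""
    if bundle not in BUNDLES:
        raise ValueError(f"Unknown bundle {bundle!r}; choose from {sorted(BUNDLES)}")
    resources = dict(BUNDLES[bundle])  # copy so the curated bundle is not mutated
    resources.update(custom)           # user-specified paths take precedence
    return resources


# Curated defaults with one user-supplied file.
resources = select_resources("GRCh38", snp_bed="/data/my_snps.bed.gz")
```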

cgpu changed the title ~~Expose resource file to user~~ Expose resource files and reference genome to users Nov 25, 2019
