This folder contains:
- a folder scripts with the R scripts for the deconvolution in each setting (class/omic),
- the Snakefile: adapt this file to run the deconvolution on in vitro/in vivo data or new datasets, or to run the pipeline with new methods (see below),
- the definition file container2.def for the Apptainer container, along with the directions to build the container file (.sif) from it. To test new methods, add the required packages to the definition file.
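As a rough illustration of the build step, here is a minimal sketch, assuming the image should be named after the definition file (container2.def gives container2.sif). The helper name build_container is hypothetical, and the directions shipped with this folder take precedence if they differ.

```shell
# build_container [DEF_FILE]
# Hypothetical helper: builds a .sif image next to the given definition file.
build_container() {
    def_file="${1:-container2.def}"
    sif_file="${def_file%.def}.sif"   # container2.def -> container2.sif (assumed naming)

    if [ ! -f "$def_file" ]; then
        echo "definition file not found: $def_file" >&2
        return 1
    fi

    # Depending on your system, an unprivileged build may need extra
    # options such as --fakeroot.
    apptainer build "$sif_file" "$def_file"
}
```

Calling build_container with no argument from this folder would build container2.sif from container2.def.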
To install Snakemake in a dedicated conda environment, run:
conda install -n base -c conda-forge mamba
mamba create -c conda-forge -c bioconda -n YOUR_ENV 'snakemake==7.22.0'
conda activate YOUR_ENV
- The parameter 'input_path' in the R scripts needs to be changed to the path where you stored the dataset to be deconvoluted.
- In the Snakefile, the path YOUR/PROJECT/ROOT needs to be changed to the root path of your project in the shell command of the deconvolution rules.
- Run the Snakefile:
snakemake --latency-wait 60 --cores 1 --jobs 50
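The placeholder replacement described above can be scripted rather than done by hand. The following is a minimal sketch, assuming the placeholder appears literally as YOUR/PROJECT/ROOT in the Snakefile; the helper name set_project_root is hypothetical.

```shell
# set_project_root ROOT [SNAKEFILE]
# Hypothetical helper: replaces every YOUR/PROJECT/ROOT placeholder with ROOT.
# ROOT must not contain the characters '|' or '&', which are special to sed here.
set_project_root() {
    root="$1"
    snakefile="${2:-Snakefile}"

    # Edit in place and keep the original as <snakefile>.bak.
    sed -i.bak "s|YOUR/PROJECT/ROOT|${root}|g" "$snakefile"

    # Succeed only if no placeholder is left.
    ! grep -q "YOUR/PROJECT/ROOT" "$snakefile"
}
```

For example, set_project_root /path/to/project would update the Snakefile in the current directory. A dry run with snakemake -n then lists the jobs that would be executed without running them, which is a cheap way to check the paths before launching the full pipeline.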
To add new datasets, modify the following parts of the pipeline:
- for the supervised deconvolution with InstaPrism, the function "prism.states" in scripts should be modified to include the variable cell types (in our case tumor types) of the new dataset: sections ##Define variable types and ##Deaggregate variable types,
- the list featselec_K in the scripts has to include the expected number of cell types k in the new dataset, in the format "New_Data"=k, for the feature selection step,
- in the Snakefile, the parameter DATA_OMIC should be modified as well to include the new dataset.
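To locate the places listed above before editing, a quick search can help. This is a minimal sketch, assuming the identifiers appear verbatim in the files; the helper name find_dataset_edit_points and its default paths are hypothetical.

```shell
# find_dataset_edit_points [SCRIPTS_DIR] [SNAKEFILE]
# Hypothetical helper: prints file, line number and content for each place
# that mentions the identifiers to adapt when adding a dataset.
find_dataset_edit_points() {
    scripts_dir="${1:-scripts}"
    snakefile="${2:-Snakefile}"

    # -F treats the patterns as fixed strings, so the dot in prism.states is literal.
    grep -rnF -e 'featselec_K' -e 'prism.states' "$scripts_dir"
    grep -nF 'DATA_OMIC' "$snakefile"
}
```

Run from this folder, find_dataset_edit_points prints the matching lines in scripts and in the Snakefile.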
To add new methods, modify:
- the corresponding R script, depending on whether the method is supervised or unsupervised and whether it applies to RNA or DNAm data,
- the parameter METHOD_OMIC_CLASS in the preamble of the Snakefile,
- the container definition file container2.def, if the new method requires additional packages.
To run the RNA methods that require TPM normalization (OLS, NNLS, SVR), you need the gene length file "human_lengths.rds", which is available upon request from the authors.
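For context on why gene lengths are needed, the standard definition of TPM is given below. The symbols are notation introduced here for illustration: c_g is the read count of gene g and \ell_g its length. The R scripts may differ in implementation details.

```latex
\mathrm{TPM}_g = 10^{6} \cdot \frac{c_g / \ell_g}{\sum_{h} c_h / \ell_h}
```

Each count is first divided by the gene length, and the length-normalized values are then rescaled so that they sum to one million within each sample.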