Update CADD to version 1.7 #513
Comments
I've done some work on trying to package CADD v1.7 and have not succeeded. I'd just like to post the information here, in case it may be helpful.

The approach with CADD v1.6.post1 was to run the documented command to download the conda environments into the docker container (https://github.com/BioContainers/containers/blob/60ba043b6e419b33b385d9cc4f22375a69890d84/cadd-scripts-with-envs/1.6.post1/Dockerfile#L45). In version 1.7 that command didn't download all of the necessary environments, so I couldn't use the same approach.

Recently, CADD released v1.7.1. CADD now supports using singularity images for the snakemake pipeline instead of conda environments. It fixes some other bugs, and also adds a docker image with the

Attempt 1: Based on CADD's docker image: https://github.com/fa2k/BioContainers-fork/blob/cadd-1.7/cadd-scripts-with-envs/1.7.1/Dockerfile - does not successfully load the conda environments because they exist at the wrong path.

Attempt 2: Create conda environments manually in a loop: https://github.com/fa2k/BioContainers-fork/blob/cadd-1.7/cadd-scripts-with-envs/1.7.1/Dockerfile-full

Attempt 2 produces a 24GB docker image that can successfully execute some CADD commands when combined with the modified cadd module here: https://github.com/fa2k/raredisease/blob/caddtest/modules/nf-core/cadd/main.nf

The linked CADD module contains some additional workarounds. The current iteration crashes in snakemake rule annotate_regseq on command:
with error message:
The input VCF to this rule is missing the FORMAT column.

I'm about to give up for a while on CADD, because there are too many problems. But I thought it may help to share this progress, and maybe someone has some tips for how to continue trying.
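For anyone who wants to check an intermediate VCF for the missing FORMAT column mentioned above, here is a minimal sketch in Python. It is not part of CADD or the nf-core module. The function names `vcf_header_columns` and `has_format_column` are hypothetical, and it assumes a tab-separated VCF (plain or gzip/bgzip compressed) with a standard `#CHROM` header line.

```python
import gzip


def vcf_header_columns(path):
    """Return the column names from the '#CHROM' header line, or [] if absent."""
    # Assumption: files ending in .gz are gzip/bgzip compressed.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Drop the leading '#' and split the tab-separated column names.
                return line.rstrip("\r\n").lstrip("#").split("\t")
            if not line.startswith("#"):
                # Reached the data records without seeing a column header line.
                break
    return []


def has_format_column(path):
    """True if the VCF column header line lists a FORMAT column."""
    return "FORMAT" in vcf_header_columns(path)
```

Calling `has_format_column("input.vcf.gz")` on the file passed to the failing rule (the file name here is only a placeholder) would show whether the header declares FORMAT. In the VCF format, FORMAT is only present when per-sample genotype columns are included, so a sites-only VCF would return False.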
Thanks for taking the time to test @fa2k. We'll try to pick this up after the release of the next version.
CADD version 1.7 has been released. Among other updates, the scoring now also uses information from protein language models. See the paper here
Pre-computed scores can be found here: https://cadd.bihealth.org/download