Getting snake error #120
Update: this seems to resolve when changing to a config.yaml file instead of a .config... but I still get an error **VPIPE_BASEDIR = /opt/V-dock/V-pipe/workflow all 1 1 1 [Sat Jan 22 10:37:15 2022] Job stats: all 1 1 1 This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.** I have a folder called 'samples' in the directory I'm working in; why doesn't it see it? |
Regarding the first part:
vpipe.config files use the old INI-like config format that is standard in Python's config parser. Instead of the YAML style:
general:
  x: y
you would have had to use the INI style:
[general]
x=y
But modern Snakemake encourages using YAML or JSON files instead, so renaming your configuration to config.yaml and keeping the YAML format is the most standard-compliant option. You can find out more about the configuration in the config's README.md, and in the manual found in config/config.html in your local installation. |
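The difference between the two formats can be reproduced outside V-pipe with a minimal sketch using only Python's standard `configparser` module. The helper names `parse_as_ini` and `describe` are made up for this illustration and are not part of V-pipe; the point is only that YAML-style text handed to the INI parser fails on its first line, as in the traceback from the original report.

```python
import configparser

# The two snippets discussed in the thread.
YAML_STYLE = "general:\n  x: y\n"
INI_STYLE = "[general]\nx=y\n"


def parse_as_ini(text):
    """Parse text with Python's INI parser and return a nested dict.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def describe(text):
    """Return the parsed sections, or a short note if the INI parser rejects the text."""
    try:
        return parse_as_ini(text)
    except configparser.MissingSectionHeaderError as err:
        return f"rejected: {type(err).__name__}"


if __name__ == "__main__":
    print(describe(INI_STYLE))   # parsed as one section named "general"
    print(describe(YAML_STYLE))  # rejected, no [section] header on line 1
```

A file that starts with `general:` has no `[section]` header, so the INI parser stops at line 1, which is why renaming the file to config.yaml (so it is read as YAML) resolves that error.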
thank you! That makes sense. Any thoughts on the samples directory issue? |
Regarding the latter part:
Using base configuration virus HIV
WARNING: No samples found in samples/. Not generating config/samples.tsv.
WARNING: Sample list file config/samples.tsv not found.
First note:
Second note:
Please check using Again the README.md inside |
Hmm ok, thanks for that- I'll alter the directories. I'm using the docker version- does this still apply? Should I be using docker in ubuntu or can I use it in the terminal? |
As of a few days ago, the latest version contains a few utilities that might help you build this structure automatically and give you a TSV that you can append to your samples.tsv file. Check the samples mass importers section of the utils' README.md.
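To check by hand what such a TSV would contain, here is a rough sketch. It is not the mass importer itself. It assumes the two-level layout `samples/<sample>/<date>/raw_data/` (an assumption based on V-pipe's documentation, since the description did not survive in this thread), and the function names `list_samples` and `to_tsv` are hypothetical.

```python
import tempfile
from pathlib import Path


def list_samples(datadir):
    """Return sorted (sample, date) pairs for every <datadir>/<sample>/<date>/raw_data/ directory.

    Assumption: the two-level layout described in the lead-in.
    """
    rows = []
    for raw in Path(datadir).glob("*/*/raw_data"):
        if raw.is_dir():
            date_dir = raw.parent
            rows.append((date_dir.parent.name, date_dir.name))
    return sorted(rows)


def to_tsv(rows):
    """Format (sample, date) pairs as tab-separated lines, one sample per line."""
    return "".join(f"{sample}\t{date}\n" for sample, date in rows)


if __name__ == "__main__":
    # Build a throwaway directory tree to show what gets picked up.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "patient1" / "20100113" / "raw_data").mkdir(parents=True)
        (Path(tmp) / "patient2" / "20200101").mkdir(parents=True)  # no raw_data: skipped
        print(to_tsv(list_samples(tmp)), end="")
```

If this prints nothing for your own `samples/` folder, the directory structure is the likely reason no samples are found.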
Yes, the config, the
I am not 100% sure I understand your question. I've tested the docker command from a terminal as written in the README.md |
Thank you very much for that- Still getting errors tho :( VPIPE_BASEDIR = /opt/V-dock/V-pipe/workflow |
Update: I created an empty folder 'config' and it worked! Thank you very much |
Though you have managed to make it work, I would be interested in your YAML and TSV files (if you don't mind sharing them) so I can see what was triggering the error message, with the aim to eventually fix it or make the message more expressive. |
It runs but I now get this error (sorry for all this) Error in rule hmm_align:
Shutting down, this might take some time. The Alignment error file gives this: Warning: en_US.UTF-8 could not be imbued, this is likely due to a missing locale on your system |
I'm using this config file: general: input: output: |
Also getting a lot of 'bad' sequences Input and filter stats: |
Okay, I got what is happening (though I would need the content of the
Your files are paired-end and contain 100 bp per read. By default, V-pipe assumes a length of 250 bp. (And if you look a few lines higher in the log you should see something along the lines of:)
Because every single read is shorter than the cut-off, every single one is considered bad by PrinSeq. (You can find a similar problem to yours, but with more logging, in #119.)
You should specify the expected read length, either by adding a third column to your TSV file, or by setting the default parameter in the configuration, section "input:" -> property "read_length:".
There is a README.md in
An older walk-through about configuring V-pipe can be found in the sars-cov-2 tutorial. It's a bit older but gives an overview of the needed steps (there's a section in the config's README.md pointing out the changes in more recent versions of V-pipe).
If you use the mass-importers in the |
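To make the effect of the expected read length concrete, here is a small sketch. It is not V-pipe's or PrinSeq's actual code: it assumes the length cut-off is a fixed fraction of the expected read length, and the value 0.8 is only an illustrative guess. `parse_samples_tsv` and `fraction_kept` are hypothetical names.

```python
DEFAULT_READ_LENGTH = 250  # default length mentioned in the thread
CUTOFF_FRACTION = 0.8      # assumption: illustrative value, not taken from the thread


def parse_samples_tsv(text, default_read_length=DEFAULT_READ_LENGTH):
    """Return (sample, date, read_length) tuples; the third TSV column is optional."""
    samples = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        sample, date = fields[0], fields[1]
        if len(fields) > 2 and fields[2].strip():
            read_length = int(fields[2])
        else:
            read_length = default_read_length
        samples.append((sample, date, read_length))
    return samples


def fraction_kept(read_lengths, expected_length, fraction=CUTOFF_FRACTION):
    """Fraction of reads that pass a minimum-length filter derived from expected_length."""
    cutoff = int(fraction * expected_length)
    kept = [n for n in read_lengths if n >= cutoff]
    return len(kept) / len(read_lengths)


if __name__ == "__main__":
    reads = [100] * 1000  # every read is 100 bp, as in the reported data
    print(fraction_kept(reads, expected_length=250))  # default: nothing passes
    print(fraction_kept(reads, expected_length=100))  # declared length: everything passes
    print(parse_samples_tsv("patient1\t20100113\npatient2\t20200101\t100\n"))
```

With the 250 bp default the cut-off lands above 100 bp, so every read is rejected. Declaring the real length, per sample in the third column or globally via `read_length`, brings the cut-off back below the actual read length.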
Hi, did you create this folder in your working directory? I'm getting an error message like the one you described |
Hi, we have updated the config's readme section about the samples TSV to make this more clear. |
How are you running V-pipe? It should look something like this:
(We're basically following the Distribution and Reproducibility recommendations of Snakemake ) |
Hi,
name: SARS-CoV-2 input: frameshift_deletions_checks: snv: lofreq: |
Yes, especially since you're setting the read length in your TSV:
input:
  datadir: "{VPIPE_BASEDIR}/../samples/"
Oh, sorry, this is my fault: my instructions weren't clear enough. The thing is that all the virus base configs we provide use files (references, etc.) which are stored in V-pipe, in its subdirectory
You should instead have written something like:
general:
  aligner: "bwa"
input:
  datadir: "samples/"
  read_length: 151
  reference: "references/MT3502821.fasta"
  metainfo_file: "references/metainfo.yaml"
  gff_directory: "references/gffs/"
  primers_file: "references/primers/nCoV-2019.tsv"
  phylogeny_data: "{VPIPE_BASEDIR}/../resources/sars-cov-2/phylogeny/selected_covid_sequences.fasta"
frameshift_deletions_checks:
  genes_gff: "references/gffs/MT350282.1.gb"
snv:
  consensus: false
lofreq:
  consensus: false
An alternative would be to load the SARS-CoV-2 virus base config, and then replace only the properties that you need to change, leaving the defaults from the virus base config for the other options. The following will use the files that you have in your working directory's subdirectory
general:
  virus_base_config: "sars-cov-2"
input:
  read_length: 151
  reference: "references/MT3502821.fasta"
  metainfo_file: "references/metainfo.yaml"
  gff_directory: "references/gffs/"
  primers_file: "references/primers/nCoV-2019.tsv"
frameshift_deletions_checks:
  genes_gff: "references/gffs/MT350282.1.gb" |
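The "load the base config, then override only what you need" idea can be pictured as a recursive dictionary overlay. The sketch below is a generic illustration and not necessarily how V-pipe merges its configuration internally; `overlay` is a hypothetical helper, and `BASE` is a made-up stand-in for a virus base config whose only real value is the `phylogeny_data` path quoted in this thread.

```python
import copy


def overlay(base, override):
    """Return a new dict: values from override win, nested dicts are merged key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical stand-in for the defaults a virus base config would provide.
BASE = {
    "general": {"aligner": "bwa"},
    "input": {
        "reference": "<reference shipped with V-pipe>",
        "phylogeny_data": "{VPIPE_BASEDIR}/../resources/sars-cov-2/phylogeny/selected_covid_sequences.fasta",
    },
}

# The user's own config: only the properties that differ from the defaults.
USER = {
    "general": {"virus_base_config": "sars-cov-2"},
    "input": {"read_length": 151, "reference": "references/MT3502821.fasta"},
}

if __name__ == "__main__":
    merged = overlay(BASE, USER)
    print(merged["input"]["reference"])       # user's file replaces the default
    print(merged["input"]["phylogeny_data"])  # untouched default is kept
```

Anything the user config leaves out keeps pointing at the defaults stored inside V-pipe, while overridden paths are resolved relative to the working directory.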
Hi, I managed to start running the program with the changes you suggested to the config file (thanks!). I see that in my results/SNVs folder the shorah.log files show no error yet, but the pipeline has been showing the message "Activating conda environment: /home/geninfo/msisco/V-pipe/workflow/.snakemake/conda/6b4b4f9b39457eac0d31ae59ec8c6ce6" for a few hours without any change. Is this normal? The REGION_1 folder shows the reads-support.fast files but is not showing any new files or the final snvs.vcf file. Again, thank you for taking the time to answer all these questions. |
Yes, ShoRAH doesn't output much on the console.
Indeed, that file is the best place to track the output of ShoRAH. Note that there is currently an issue in ShoRAH: it might lose SNVs that occur close to the boundaries of amplicons if all their reads start at exactly the same position. This will be fixed in a version of ShoRAH to be released in the coming month. In the meantime, lofreq can be an alternative for more speed, or to circumvent the above-mentioned bug. Another tool that might be interesting to experiment with is cojac, though it's not (yet) integrated into V-pipe and needs to be run separately (you can provide it with a V-pipe sample TSV file and an output directory which contains alignment BAMs). Try the dev branch, as we haven't released version 0.2 on bioconda yet. This is the tool that we use in our wastewater surveillance. As it relies only on alignments, it is much faster than running an SNV caller. (Though as it only relies on co-occurrence of select mutations, it isn't as high-confidence as ShoRAH, which relies on entire local haplotypes. In our practical experience with the surveillance, this is good enough for early detection of variants.) |
Hello, I ran the pipeline successfully, it lasted around 18 days. |
Yes, it can take time. For future runs, I would suggest running it on a cluster or on a very large workstation.
Keep in mind that SARS-CoV-2 doesn't have that many mutations, and as Illumina sequencing is based on short reads, you might not have enough information to distinguish haplotypes and match neighbouring reads (i.e. there are whole regions where all the variants have similar reads). Building global haplotypes might not be possible.
So, the rules themselves are indeed called
If you want to invoke the rule directly, you need to give the exact output that it must generate, e.g.: ./v-pipe results/patient1/20100113/variants/global/quasispecies.fasta
You could obtain something roughly similar by turning on haplotype reconstruction, adding these options to the relevant sections of your configuration YAML file (as usual, check config/config.html for reference):
general:
  haplotype_reconstruction: haploclique
output:
  global: True
And then use ./v-pipe --until haploclique |
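Since invoking a rule directly means naming its exact output file, a tiny helper can generate those paths per sample. This is only a sketch: the path pattern is copied from the example in this thread, and `global_haplotype_target` is a hypothetical name, not a V-pipe function.

```python
def global_haplotype_target(sample, date, outdir="results"):
    """Build the output path for one sample, following the pattern quoted in the thread."""
    return f"{outdir}/{sample}/{date}/variants/global/quasispecies.fasta"


if __name__ == "__main__":
    # (sample, date) pairs as they would appear in the first two columns of samples.tsv.
    samples = [("patient1", "20100113")]
    for sample, date in samples:
        print(global_haplotype_target(sample, date))
```

Each printed path can then be passed as the target on the command line, in place of the example path above.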
Hi, thanks for your kind reply. I tried to run savage but it gives me this error: I read that a possible workaround is to use "--no-build-id" but I'm not sure how to. |
VPIPE_BASEDIR = /opt/V-dock/V-pipe/workflow
Importing legacy configuration file vpipe.config
MissingSectionHeaderError in line 107 of /opt/V-dock/V-pipe/workflow/rules/common.smk:
File contains no section headers.
file: 'vpipe.config', line: 1
'general:\n'
File "/opt/V-dock/V-pipe/workflow/Snakefile", line 12, in
File "/opt/V-dock/V-pipe/workflow/rules/common.smk", line 259, in
File "/opt/V-dock/V-pipe/workflow/rules/common.smk", line 170, in process_config
File "/opt/V-dock/V-pipe/workflow/rules/common.smk", line 107, in load_legacy_ini
File "/opt/conda/envs/snakemake/lib/python3.10/configparser.py", line 698, in read
File "/opt/conda/envs/snakemake/lib/python3.10/configparser.py", line 1086, in _read