Fix remaining broken boxes since our checker stopped working
hexylena committed Sep 26, 2024
1 parent 12a17e9 commit 6ec5d13
Showing 12 changed files with 190 additions and 182 deletions.
137 changes: 69 additions & 68 deletions topics/assembly/tutorials/vgp_genome_assembly/tutorial.md
@@ -548,34 +548,34 @@ Let's use gfastats to get a basic idea of what our assembly looks like. We'll ru
>
> 2. Rename the outputs of the `gfastats` step as `Hap1 stats` and `Hap2 stats`
>
-> > > This would generate summary files that look like this (only the first six rows are shown):
-> > >
-> > > ```
-> > > Expected genome size 11747160
-> > > # scaffolds 0
-> > > Total scaffold length 0
-> > > Average scaffold length nan
-> > > Scaffold N50 0
-> > > Scaffold auN 0.00
-> > > ```
-> > >
-> > > Because we ran `gfastats` on hap1 and hap2 outputs of `hifiasm` we need to join the two outputs together for easier interpretation:
+> This would generate summary files that look like this (only the first six rows are shown):
+>
+> ```
+> Expected genome size 11747160
+> # scaffolds 0
+> Total scaffold length 0
+> Average scaffold length nan
+> Scaffold N50 0
+> Scaffold auN 0.00
+> ```
+>
+> Because we ran `gfastats` on hap1 and hap2 outputs of `hifiasm` we need to join the two outputs together for easier interpretation:
>
> 3. Run {% tool [Column join](toolshed.g2.bx.psu.edu/repos/iuc/collection_column_join/collection_column_join/0.0.3) %} with the following parameters:
> - {% icon param-files %} *"Input file"*: select `Hap1 stats` and the `Hap2 stats` datasets. Keep all other settings as they are.
>
> 4. Rename the output as `gfastats on hap1 and hap2 (full)`
>
-> > > This would generate a joined summary file that looks like this (only the first five rows are shown):
-> > >
-> > > ```
-> > > # gaps 0 0
-> > > # gaps in scaffolds 0 0
-> > > # paths 0 0
-> > > # segments 17 16
-> > > ```
-> > >
-> > > Now let's extract only relevant information by excluding all lines containing the word `scaffold` since there are no scaffolds at this stage of the assembly process (only contigs):
+> This would generate a joined summary file that looks like this (only the first four rows are shown):
+>
+> ```
+> # gaps 0 0
+> # gaps in scaffolds 0 0
+> # paths 0 0
+> # segments 17 16
+> ```
+>
+> Now let's extract only relevant information by excluding all lines containing the word `scaffold`, since there are no scaffolds at this stage of the assembly process (only contigs):
>
> 5. Run {% tool [Search in textfiles](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1) %} with the following parameters:
> - {% icon param-files %} *"Input file"*: select `gfastats on hap1 and hap2 (full)`
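
Outside Galaxy, this filtering step amounts to a one-line `grep -v`. A minimal sketch, assuming the joined table was saved locally as `gfastats_full.tsv` (a hypothetical filename):

```bash
# Drop every line mentioning "scaffold", keeping only contig-level statistics
grep -v 'scaffold' gfastats_full.tsv > gfastats_contigs.tsv
```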
@@ -756,35 +756,35 @@ Let's use gfastats to get a basic idea of what our assembly looks like. We'll ru
>
> 2. Rename the outputs of the `gfastats` step as `Primary stats` and `Alternate stats`
>
-> > > This would generate summary files that look like this (only the first six rows are shown):
-> > >
-> > > ```
-> > > Expected genome size 11747160
-> > > # scaffolds 25
-> > > Total scaffold length 18519764
-> > > Average scaffold length 740790.56
-> > > Scaffold N50 813311
-> > > Scaffold auN 913050.77
-> > > ```
-> > >
-> > > Because we ran `gfastats` on Primary and Alternate outputs of `hifiasm` we need to join the two outputs together for easier interpretation:
+> This would generate summary files that look like this (only the first six rows are shown):
+>
+> ```
+> Expected genome size 11747160
+> # scaffolds 25
+> Total scaffold length 18519764
+> Average scaffold length 740790.56
+> Scaffold N50 813311
+> Scaffold auN 913050.77
+> ```
+>
+> Because we ran `gfastats` on Primary and Alternate outputs of `hifiasm` we need to join the two outputs together for easier interpretation:
>
> 3. Run {% tool [Column join](toolshed.g2.bx.psu.edu/repos/iuc/collection_column_join/collection_column_join/0.0.3) %} with the following parameters:
> - {% icon param-files %} *"Input file"*: select `Primary stats` and the `Alternate stats` datasets (these are from **Step 2** above). Keep all other settings as they are.
>
> 4. Rename the output as `gfastats on Pri and Alt (full)`
>
-> > > This would generate a joined summary file that looks like this (only five rows are shown):
-> > >
-> > > ```
-> > > # contigs 25 10
-> > > # dead ends . 16
-> > > # disconnected components . 7
-> > > # edges . 6
-> > > # gaps 0 0
-> > > ```
-> > >
-> > > Now let's extract only relevant information by excluding all lines containing the word `scaffold` since there are no scaffolds at this stage of the assembly process (only contigs):
+> This would generate a joined summary file that looks like this (only the first five rows are shown):
+>
+> ```
+> # contigs 25 10
+> # dead ends . 16
+> # disconnected components . 7
+> # edges . 6
+> # gaps 0 0
+> ```
+>
+> Now let's extract only relevant information by excluding all lines containing the word `scaffold`, since there are no scaffolds at this stage of the assembly process (only contigs):
>
> 5. Run {% tool [Search in textfiles](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1) %} with the following parameters:
> - {% icon param-files %} *"Input file"*: select `gfastats on Pri and Alt (full)`
@@ -876,7 +876,7 @@ Despite BUSCO being robust for species that have been widely studied, it can be
> - {% icon param-file %} *"First genome assembly"*: `Primary contigs FASTA`
> - {% icon param-file %} *"Second genome assembly"*: `Alternate contigs FASTA`
>
-> > > (REMINDER: `Primary contigs FASTA` and `Alternate contigs FASTA` were generated [earlier](#gfa2fasta_solo))
+> (REMINDER: `Primary contigs FASTA` and `Alternate contigs FASTA` were generated [earlier](#gfa2fasta_solo))
>
{: .hands_on}
@@ -913,23 +913,24 @@ The first relevant parameter is the `Estimated genome size`.
> <hands-on-title>Get estimated genome size</hands-on-title>
>
> 1. Look at the `GenomeScope summary` output (generated during *k*-mer profiling [step](#genome-profiling-with-genomescope2)). The file should have content that looks like this (it may not be exactly like this):
-> > ```
-> > GenomeScope version 2.0
-> > input file = ....
-> > output directory = .
-> > p = 2
-> > k = 31
-> > TESTING set to TRUE
-> >
-> > property min max
-> > Homozygous (aa) 99.4165% 99.4241%
-> > Heterozygous (ab) 0.575891% 0.583546%
-> > Genome Haploid Length 11,739,321 bp 11,747,160 bp
-> > Genome Repeat Length 722,921 bp 723,404 bp
-> > Genome Unique Length 11,016,399 bp 11,023,755 bp
-> > Model Fit 92.5159% 96.5191%
-> > Read Error Rate 0.000943206% 0.000943206%
-> > ```
+>
+> ```
+> GenomeScope version 2.0
+> input file = ....
+> output directory = .
+> p = 2
+> k = 31
+> TESTING set to TRUE
+>
+> property min max
+> Homozygous (aa) 99.4165% 99.4241%
+> Heterozygous (ab) 0.575891% 0.583546%
+> Genome Haploid Length 11,739,321 bp 11,747,160 bp
+> Genome Repeat Length 722,921 bp 723,404 bp
+> Genome Unique Length 11,016,399 bp 11,023,755 bp
+> Model Fit 92.5159% 96.5191%
+> Read Error Rate 0.000943206% 0.000943206%
+> ```
>
> 2. Copy the number value for the maximum Genome Haploid Length to your clipboard (CTRL + C on Windows; CMD + C on macOS).
> 3. Click on "Upload Data" in the toolbox on the left.
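
If you would rather pull that number out programmatically, here is a minimal sketch, assuming the summary was downloaded locally as `genomescope_summary.txt` (a hypothetical filename):

```bash
# Print the maximum Genome Haploid Length estimate (second-to-last field),
# with the thousands separators removed: 11747160
grep 'Genome Haploid Length' genomescope_summary.txt \
  | awk '{gsub(",", "", $(NF-1)); print $(NF-1)}'
```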
@@ -992,7 +993,7 @@ Now let's parse the `transition between haploid & diploid` and `upper bound for
> >
> {: .question}
>
-> > Now let's get the transition parameter.
+> Now let's get the transition parameter.
>
> 5. Run {% tool [Advanced Cut](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cut_tool/1.1.0) %} with the following parameters:
> - {% icon param-file %} *"File to cut"*: `Parsing purge parameters`
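
On the command line, this step reduces to a simple `cut`. A sketch only, since the exact field settings are collapsed in this view — both the filename and the column choice below are hypothetical:

```bash
# Keep only the column holding the transition value (assumed here to be column 2)
cut -f 2 purge_parameters.tsv
```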
@@ -1318,11 +1319,11 @@ Before we begin, we need to upload BioNano data:
>
> 1. Copy the following URL to your clipboard. You can do this by clicking on the {% icon copy %} button in the upper right corner of the box below. It will appear when you mouse over the box.
>
-> > ```
-> > https://zenodo.org/records/5887339/files/bionano.cmap
-> > ```
+> ```
+> https://zenodo.org/records/5887339/files/bionano.cmap
+> ```
>
-> 2. Upload datasets into Galaxy
+> 2. Upload datasets into Galaxy
> - set the datatype to `cmap`
>
> {% snippet faqs/galaxy/datasets_import_via_link.md format="cmap" %}
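
Outside Galaxy, the same file can be fetched directly with the URL above:

```bash
# Download the BioNano optical map locally
wget https://zenodo.org/records/5887339/files/bionano.cmap
```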
11 changes: 7 additions & 4 deletions topics/dev/tutorials/community-tool-table/tutorial.md
@@ -130,14 +130,17 @@ an example of the file that is used to manually filter the tools for a community
> 1. Download the `tools.tsv` file in `results/<your community>`.
> 2. Open `tools.tsv` with a spreadsheet program.
> 3. Review each line corresponding to a tool
-You can also just review some tools. Those tools that are not reviewed will be have `FALSE` in the `Reviewed` columns the updated table.
+>
+> You can also just review some tools. Tools that are not reviewed will have `FALSE` in the `Reviewed` column of the updated table.
>
> 1. Change the value in the `Reviewed` column from `FALSE` to `TRUE` (this will be done automatically if an entry for the tool already exists in `tools_status.tsv`).
> 2. Add `TRUE` to the `To keep` column if the tool should be kept, and `FALSE` if not.
> 3. Add `TRUE` or `FALSE` also to the `Deprecated` column.
>
> 4. Copy the `Galaxy wrapper id`, `To keep`, and `Deprecated` columns into a new table (in that order).
-This can also be done using the reference function of your Spreadsheet Software.
+>
+> This can also be done using the reference function of your spreadsheet program, or on the command line (see the sketch after this list).
>
> 5. Export the new table as TSV (without header).
> 6. Submit the TSV as `tools_status.tsv` in your community folder.
> 7. Wait for the Pull Request to be merged
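
A minimal command-line sketch of steps 4–6, assuming the headers in `tools.tsv` exactly match the column labels above (an assumption, not the canonical layout):

```bash
# Locate the three columns by header name, then emit them as a header-less TSV
awk -F'\t' '
  NR==1 { for (i=1; i<=NF; i++) { if ($i=="Galaxy wrapper id") a=i; if ($i=="To keep") b=i; if ($i=="Deprecated") c=i }; next }
  { print $a "\t" $b "\t" $c }
' tools.tsv > tools_status.tsv
```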
4 changes: 2 additions & 2 deletions topics/dev/tutorials/tool-annotation/tutorial.md
@@ -188,7 +188,7 @@ To link a Galaxy tool to its corresponding bio.tools entry, we need to first fin
> 2. Search your tool
> 3. Expand the row
> 4. Open the link shown in the `Galaxy wrapper parsed folder` column
>
{: .hands_on}

Now we have the wrapper, and can add the bio.tools entry.
@@ -214,4 +214,4 @@ Now we have the wrapper, and can add the bio.tools entry.
>
{: .hands_on}

-# Conclusion
\ No newline at end of file
+# Conclusion
2 changes: 1 addition & 1 deletion topics/dev/tutorials/tool-from-scratch/tutorial.md
@@ -462,7 +462,7 @@ Note that for using `planemo` from a new shell you will need to activate the pyth
> > ```bash
> > planemo, version 0.74.3
> > ```
-> {: .code-out}
+> {: .code-in}
>
> 2. `planemo --help` will show the available commands with a short description (lint, test, and serve will be part of this tutorial)
> 3. `planemo SUBCOMMAND --help` will show the usage information for the corresponding subcommand. Try to obtain the information for the `lint` subcommand.
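
For instance, step 3 for the `lint` subcommand looks like this:

```bash
# Show usage information for the lint subcommand
planemo lint --help
```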
4 changes: 2 additions & 2 deletions topics/fair/tutorials/fair-data-registration/tutorial.md
@@ -109,7 +109,7 @@ Discipline-specific repositories cater for communities and datatypes, and typica
> <question-title></question-title>
>
> An example of a discipline-specific repository is the [ArrayExpress](https://www.ebi.ac.uk/biostudies/arrayexpress) database. ArrayExpress stores data from high-throughput functional genomics assays, such as RNAseq, ChIPseq and expression microarrays.
-The data submission interface of ArrayExpress is called [Annotare](https://www.ebi.ac.uk/fg/annotare/login/). Without creating a login, what help is given to a person looking to submit a dataset for the first time?
+> The data submission interface of ArrayExpress is called [Annotare](https://www.ebi.ac.uk/fg/annotare/login/). Without creating a login, what help is given to a person looking to submit a dataset for the first time?
>
> > <solution-title></solution-title>
> >
@@ -127,7 +127,7 @@ The data submission interface of ArrayExpress is called [Annotare](https://www.e
> > <solution-title></solution-title>
> >
> > Open the **Findability** pulldown on the left hand banner to find recipes for the following:
-[Depositing to generic repositories - Zenodo use case](https://faircookbook.elixir-europe.org/content/recipes/findability/zenodo-deposition.html) and [Registering Datasets in Wikidata](https://faircookbook.elixir-europe.org/content/recipes/findability/registeringDatasets.html).
+> > [Depositing to generic repositories - Zenodo use case](https://faircookbook.elixir-europe.org/content/recipes/findability/zenodo-deposition.html) and [Registering Datasets in Wikidata](https://faircookbook.elixir-europe.org/content/recipes/findability/registeringDatasets.html).
> >
> {: .solution}
{: .question}
@@ -34,8 +34,7 @@ Providing documentation is also important to help understand the workflow's purp

> <agenda-title></agenda-title>
>
-> In this tutorial, you will learn about the best practices that the Galaxy community
-has created for workflows.
+> In this tutorial, you will learn about the best practices that the Galaxy community has created for workflows.
>
> 1. TOC
> {:toc}
@@ -104,7 +104,7 @@ E.g. the workflow could be combined with metagenomic workflows, that allow to sc
> > <comment-title> Genome download </comment-title>
> >
> > This downloads the `Streptomyces coelicolor A3(2) complete genome`,
-which should be a great source for biosynthetic gene clusters (BGCs).
+> > which should be a great source for biosynthetic gene clusters (BGCs).
> {: .comment}
>
{: .hands_on}
@@ -783,6 +783,7 @@ To prepare the **ABRicate**{% icon tool %} output tabulars of both samples for f

<div class="Long-Version" markdown="1">

+> <hands-on-title>Antimicrobial Resistance Genes Identification</hands-on-title>
> 1. {% tool [Replace](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.4) %} with the following parameters:
> - {% icon param-file %} *"File to process"*: `report` (output of **ABRicate** {% icon tool %})
> - In *"Find and Replace"*:
@@ -803,6 +804,7 @@ To prepare the **ABRicate**{% icon tool %} output tabulars of both samples for f
>
> 2. Rename the output collection `AMRs`
+{: .hands_on}

</div>
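
In shell terms, each of these find-and-replace steps is a `sed` substitution. A sketch with placeholder patterns, since the actual ones are collapsed in this view — the patterns and filenames below are hypothetical:

```bash
# Hypothetical patterns: tidy the ABRicate report before downstream joining
sed 's/OLD_PATTERN/NEW_PATTERN/g' abricate_report.tsv > amrs.tsv
```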

> <question-title></question-title>
@@ -874,6 +876,7 @@ To prepare the **ABRicate**{% icon tool %} output tabulars of both samples for f

<div class="Long-Version" markdown="1">

+> <hands-on-title>Replace Text</hands-on-title>
> 1. {% tool [Replace](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.4) %} with the following parameters:
> - {% icon param-file %} *"File to process"*: `report` (output of **ABRicate** {% icon tool %})
> - In *"Find and Replace"*:
Expand Up @@ -169,11 +169,11 @@ You can access the data for this tutorial in multiple ways:
> - {% icon param-file %} *"Annotated data matrix"*: `N701-400k`
> - *"Function to manipulate the object"*: `Concatenate along the observations axis`
> - {% icon param-file %} *"Annotated data matrix to add"*: `Select all the other matrix files from bottom to top, N702 to N707`
->
-> <comment-title></comment-title>
-> >If you imported files from Zenodo instead of using the input history, yours might not be in the same order as ours. Since the files will be concatenated in the order that you click, it will be helpful if you click them in the same order, from N702 to N707. This will ensure your samples are given the same batch numbers as we got in this tutorial, which will help when we're adding in metadata later!
-{: .comment}
+>
+>
+> > <comment-title></comment-title>
+> > If you imported files from Zenodo instead of using the input history, yours might not be in the same order as ours. Since the files will be concatenated in the order that you click, it will be helpful if you click them in the same order, from N702 to N707. This will ensure your samples are given the same batch numbers as we got in this tutorial, which will help when we're adding in metadata later!
+> {: .comment}
+>
> > <warning-title>Don't add N701!</warning-title>
> > You are adding files to N701, so do not add N701 to itself!
> {: .warning}
Expand Up @@ -131,10 +131,12 @@ Next we will retrieve the remaining datasets.
>
{: .hands_on}

-<!-- > <details-title>Dataset subsampling</details-title>
+<!--
+> <details-title>Dataset subsampling</details-title>
>
> As indicated above, for this tutorial the depth of the samples was reduced in order to speed up the time needed to carry out the analysis. This was done as follows: all reads mapping to chromosome 10 were kept and a random fraction of 0.003 was added.
-{: .details} -->
+{: .details}
+-->

# Quality assessment

48 changes: 24 additions & 24 deletions topics/variant-analysis/tutorials/beacon_cnv_query/tutorial.md
@@ -86,30 +86,30 @@ Those parameters are "CHROMOSOME", "Start", and "End".
> >
> > What types of information can be extracted from records?
> >
-> > > ```json
-> > >{'_id': ObjectId('66c466431ea6cb4184ee0f2f'),
-> > > 'assemblyId': 'GRCh38',
-> > > 'biosampleId': 'MP2PRT-PARNFH-TMP1-A, MP2PRT-PARNFH-NM1-A',
-> > > 'definitions': {'Location': {'chromosome': '17',
-> > > 'end': 43170245,
-> > > 'start': 43044295}},
-> > > 'diseaseType': 'acute lymphoblastic leukemia',
-> > > 'gene': 'BRCA1',
-> > > 'geneID': 'ENSG00000012048.23',
-> > > 'id': 'refvar-66c466431ea6cb4184ee0f2f',
-> > > 'info': {'caseID': 'MP2PRT-PARNFH, MP2PRT-PARNFH',
-> > > 'cnCount': 3,
-> > > 'fileName': 'f11b7fb7-a610-4978-b5c4-523450a0fd5f.wgs.ASCAT.gene_level.copy_number_variation.tsv',
-> > > 'legacyId': 'DUP:chr17:43044295-43170245',
-> > > 'projectID': 'MP2PRT-ALL',
-> > > 'sampleType': 'Blood Derived Cancer - Bone Marrow, Blood Derived '
-> > > 'Cancer - Bone Marrow, Post-treatment'},
-> > > 'primarySite': 'hematopoietic and reticuloendothelial systems',
-> > > 'updated': '2024-08-19T21:23:09.374531',
-> > > 'variantInternalId': '17:43044295-43170245:EFO:0030071',
-> > > 'variantState': {'id': 'EFO:0030071', 'label': 'low-level gain'},
-> > > 'variantType': 'DUP'}
-> > > ```
+> > ```json
+> > {'_id': ObjectId('66c466431ea6cb4184ee0f2f'),
+> > 'assemblyId': 'GRCh38',
+> > 'biosampleId': 'MP2PRT-PARNFH-TMP1-A, MP2PRT-PARNFH-NM1-A',
+> > 'definitions': {'Location': {'chromosome': '17',
+> > 'end': 43170245,
+> > 'start': 43044295}},
+> > 'diseaseType': 'acute lymphoblastic leukemia',
+> > 'gene': 'BRCA1',
+> > 'geneID': 'ENSG00000012048.23',
+> > 'id': 'refvar-66c466431ea6cb4184ee0f2f',
+> > 'info': {'caseID': 'MP2PRT-PARNFH, MP2PRT-PARNFH',
+> > 'cnCount': 3,
+> > 'fileName': 'f11b7fb7-a610-4978-b5c4-523450a0fd5f.wgs.ASCAT.gene_level.copy_number_variation.tsv',
+> > 'legacyId': 'DUP:chr17:43044295-43170245',
+> > 'projectID': 'MP2PRT-ALL',
+> > 'sampleType': 'Blood Derived Cancer - Bone Marrow, Blood Derived '
+> > 'Cancer - Bone Marrow, Post-treatment'},
+> > 'primarySite': 'hematopoietic and reticuloendothelial systems',
+> > 'updated': '2024-08-19T21:23:09.374531',
+> > 'variantInternalId': '17:43044295-43170245:EFO:0030071',
+> > 'variantState': {'id': 'EFO:0030071', 'label': 'low-level gain'},
+> > 'variantType': 'DUP'}
+> > ```
> >
> > > <solution-title></solution-title>
> > > 1. Identifiers and IDs: