Annotation of Mass2Motif mass fragments and neutral losses with MAGMa

The figure below shows a schematic representation of the whole pipeline. Red blocks with rounded edges represent the final input and output, blue rectangular blocks represent tools or websites used in the pipeline, and yellow blocks with a wavy bottom represent intermediate inputs and outputs. The picture files generated by some scripts are not shown in the schematic.

Input

MS2LDA was run through the GNPS website on the three .mzML files containing the measured MS2 spectra to generate Mass2Motifs. Running MS2LDA on GNPS first requires a classical molecular network, which was also generated on GNPS. After the MS2LDA analysis on GNPS finished, the resulting .dict file (containing e.g. the Mass2Motifs and the Mass2Motif fragments or losses) was uploaded to MS2LDA.org using the upload tab in the create experiment option. Two .csv files were then downloaded from MS2LDA.org: one containing the fragments and losses of each Mass2Motif, and one containing all fragmentation spectra with their Mass2Motif matching details. The pipeline takes three inputs: the consensus spectra in .mgf format from the classical molecular network, and the two .csv files from MS2LDA, which together hold the Mass2Motifs, the Mass2Motif fragments or losses, and the spectrum identifiers of the experimental spectra that contained those fragments or losses.
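To make the .mgf input more concrete, the following is a minimal sketch of how consensus spectra in .mgf format could be read with plain Python. It is not taken from the scripts in this repository; the function name `parse_mgf` and the example spectrum (scan number, precursor mass and peaks) are made up for illustration, and the sketch assumes the usual `BEGIN IONS` / `END IONS` block layout with `KEY=VALUE` metadata lines followed by `m/z intensity` peak lines.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Spectrum = Tuple[Dict[str, str], List[Tuple[float, float]]]


def parse_mgf(lines: Iterable[str]) -> Iterator[Spectrum]:
    """Yield (metadata, peaks) for every BEGIN IONS ... END IONS block."""
    metadata: Dict[str, str] = {}
    peaks: List[Tuple[float, float]] = []
    in_block = False
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            metadata, peaks, in_block = {}, [], True
        elif line == "END IONS":
            if in_block:
                yield metadata, peaks
            in_block = False
        elif in_block and line:
            if "=" in line and not line[0].isdigit():
                # Metadata line such as SCANS=1 or PEPMASS=301.14
                key, value = line.split("=", 1)
                metadata[key.upper()] = value
            else:
                # Peak line: m/z followed by intensity
                mz, intensity = line.split()[:2]
                peaks.append((float(mz), float(intensity)))


# Hypothetical example spectrum, invented for this sketch
example = """BEGIN IONS
SCANS=1
PEPMASS=301.14
120.08 55.0
283.13 100.0
END IONS
""".splitlines()

spectra = list(parse_mgf(example))
for metadata, peaks in spectra:
    print(metadata.get("SCANS"), len(peaks))
```

For a real file, the same function could be called on an open file handle, e.g. `parse_mgf(open(path))`. Dedicated libraries exist for reading .mgf files and are probably the better choice in practice; the sketch only illustrates the structure of the input.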

Extra information

Tutorial on classical molecular networking on GNPS: https://www.youtube.com/watch?v=PqTuex0nsGk&t=3s

Tutorial on MS2LDA on GNPS: https://www.youtube.com/watch?v=0wKUmjPy40s

Documentation MS2LDA on GNPS: https://ccms-ucsd.github.io/GNPSDocumentation/ms2lda/

MS2Query

See https://github.com/iomega/ms2query for installation and run instructions.

A separate conda environment was made to run this script. This environment included the following packages:

However, it should be noted that not all of these packages are necessary to run the script.

Select Mass2Motifs

Prepare environment

A separate conda environment was made to run this script. This environment included the following packages:

However, it should be noted that not all of these packages are necessary to run the script.

Install tools

conda install -c conda-forge rdkit

Run script

e.g. python3 select_Mass2Motif_frag_and_loss.py /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/select_Mass2Motifs/input/MS2Query_output.csv /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/select_Mass2Motifs/input/MS2LDA_spectra_and_motif.csv /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/select_Mass2Motifs/input/MS2LDA_motif_and_fragments.csv /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/select_Mass2Motifs/input/consensus_spectra_from_GNPS_classical_molecular_network.mgf /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/select_Mass2Motifs/output

MassQL

Prepare environment

A separate conda environment was made to run this script. This environment included the following packages:

However, it should be noted that not all of these packages are necessary to run the script.

Install tools

https://pypi.org/project/massql/

Run script

e.g. python3 massql.py /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/MassQL/input/motif_massql_querries.txt /mnt/LTR_userdata/hooft001/mass_spectral_embeddings/datasets/GNPS_15_12_21/ALL_GNPS_15_12_2021_positive_annotated.pickle /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/MassQL/output/out_spectrum /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/MassQL/output/out_files /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/MassQL/output/json_enzo/GNPS.mgf /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/MassQL/output/json_enzo/GNPS.json
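As an illustration of what a MassQL query for a Mass2Motif could look like, the sketch below builds a query string that requires a set of product ions (`MS2PROD`) and neutral losses (`MS2NL`) with an m/z tolerance (`TOLERANCEMZ`), following the syntax described in the MassQL documentation. The helper name `build_massql_query`, the default tolerance and the example masses are assumptions made for this sketch; the actual format of the query file passed to massql.py is not described here and may differ.

```python
def build_massql_query(fragments=(), losses=(), tolerance_mz=0.01):
    """Return one MassQL query requiring all given product ions and neutral losses."""
    conditions = [f"MS2PROD={mz}:TOLERANCEMZ={tolerance_mz}" for mz in fragments]
    conditions += [f"MS2NL={mz}:TOLERANCEMZ={tolerance_mz}" for mz in losses]
    if not conditions:
        raise ValueError("need at least one fragment or loss")
    return "QUERY scaninfo(MS2DATA) WHERE " + " AND ".join(conditions)


# Hypothetical masses, chosen only to show the shape of the query
query = build_massql_query(fragments=[85.03], losses=[162.05])
print(query)
```

A query built this way can be tried out in the MassQL sandbox before it is used in the pipeline.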

Extra information

MassQL documentation: https://mwang87.github.io/MassQueryLanguage_Documentation/

MassQL sandbox (try-out queries): https://msql.ucsd.edu/

GNPS public spectral libraries: https://gnps.ucsd.edu/ProteoSAFe/libraries.jsp

MAGMa

Prepare environment

A separate conda environment was made to run this script. This environment included the following packages:

However, it should be noted that not all of these packages are necessary to run the script.

Install tools

See https://github.com/NLeSC/MAGMa/tree/master/job for installation instructions.

conda install -c conda-forge rdkit

Run script

e.g. python3 MAGMa_final.py /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/MassQL/output/out_spectrum /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/MAGMa/output/MAGMa_results_database_for_every_spectrum_from_massql /home/seele006/thesis/motif_massql_querries.txt /lustre/BIF/nobackup/seele006/MSc_thesis_annotation_Mass2Motif_fragments_data/MAGMa/output/pic_mass2Motif_frag

Extra information

Pipeline developed by Rogers et al.: https://github.com/iomega/motif_annotation/blob/master/annotate_motifs.py

Output

The annotations for each Mass2Motif mass fragment and neutral loss were combined in a tab-separated (.tsv) output file. The .tsv file also records how often each SMILES annotation was obtained for a Mass2Motif fragment or loss. If the molecular weight of the SMILES annotated to a neutral loss did not match the mass of the Mass2Motif neutral loss to one decimal place, the molecular weight of the SMILES structure was also written to the .tsv file.
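The bookkeeping described above can be sketched roughly as follows. This is not the code from MAGMa_final.py: the column names, the `loss_` / `fragment_` naming of features, the example annotations and the interpretation of "one decimal place" as comparing both values after rounding to one decimal are all assumptions made for this illustration.

```python
import csv
import io
from collections import Counter


def differs_at_one_decimal(smiles_mass, loss_mass):
    """Assumed interpretation: compare both masses after rounding to one decimal."""
    return round(smiles_mass, 1) != round(loss_mass, 1)


def write_annotation_tsv(annotations, handle):
    """annotations: iterable of (feature_name, feature_mass, smiles, smiles_mass)."""
    counts = Counter()
    masses = {}
    for name, mass, smiles, smiles_mass in annotations:
        counts[(name, smiles)] += 1  # how often this SMILES was obtained
        masses[(name, smiles)] = (mass, smiles_mass)

    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["feature", "smiles", "count", "smiles_mass_if_different"])
    for (name, smiles), count in counts.items():
        mass, smiles_mass = masses[(name, smiles)]
        is_loss = name.startswith("loss_")
        # Only report the SMILES mass for neutral losses whose mass does not match
        differs = is_loss and differs_at_one_decimal(smiles_mass, mass)
        writer.writerow([name, smiles, count, smiles_mass if differs else ""])


# Hypothetical annotations, invented for this sketch
annotations = [
    ("loss_18.01", 18.01, "O", 18.02),
    ("loss_18.01", 18.01, "O", 18.02),
    ("loss_46.01", 46.01, "CCO", 46.07),
]

buffer = io.StringIO()
write_annotation_tsv(annotations, buffer)
tsv_lines = buffer.getvalue().splitlines()
print("\n".join(tsv_lines))
```

In this sketch the last column stays empty when the masses agree, so that mismatching neutral loss annotations stand out when the file is inspected.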

Anna-MarieSeelen/Thesis_Bioinformatics