This repository has been archived by the owner on Jan 12, 2022. It is now read-only.
R implementation of parsers for the Grammars on succinct lipid nomenclature (Goslin).

lifs-tools/rgoslin-old

R implementation for parsing of lipid shorthand nomenclature names, version 2.0

This project is a parser, validator and normalizer implementation for shorthand lipid nomenclatures, based on the Grammar of Succinct Lipid Nomenclatures project.

https://github.com/lifs-tools/goslin defines multiple grammars for different sources of shorthand lipid nomenclature. Parsers can be generated from these grammars, and they provide immediate feedback on whether a given lipid shorthand notation string complies with a particular grammar.

NOTE: Please report any issues you find to help improve this project!

Here, rgoslin 2.0 uses the Goslin grammars and the cppgoslin parser to support the following general tasks:

  1. Facilitate the parsing of shorthand lipid name dialects.
  2. Provide a structural representation of the shorthand lipid after parsing.
  3. Use the structural representation to generate normalized names, following the latest shorthand nomenclature.

Related Projects

Changes in Version 2.0

  • The data frames returned from the parse* methods now use column names with dots instead of spaces. This makes it easier to use the column names unquoted within other R expressions.
  • All parse* methods now return data frames.
  • The Messages column has been added to capture parser messages. If parsing succeeds, this will contain NA and Normalized.Name will contain the normalized lipid shorthand name.
  • Parser implementations have been updated to reflect the latest lipid shorthand nomenclature changes. Please see the Goslin repository for more details.
  • Exceptions in the C++ part of the library are captured as warnings in R. As a result, when you parse multiple lipid names, an exception raised for one name does not stop the parsing of the remaining names.
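
The effect of the dotted column names and the Messages column can be sketched with a hand-built data frame. This is only an illustration: the data frame below stands in for a real parse result, its values are made up, and the Original.Name column is an assumption (only Normalized.Name and Messages are named above).

```r
# Hand-built stand-in for a parse result. Real results contain more columns;
# Original.Name and all values here are illustrative assumptions.
parsed <- data.frame(
  Original.Name   = c("PC 32:1", "not a lipid"),
  Normalized.Name = c("PC 32:1", NA),
  Messages        = c(NA, "hypothetical parser message"),
  stringsAsFactors = FALSE
)

# Dotted column names can be used unquoted in expressions such as subset().
# Rows with NA in Messages are the ones that parsed successfully.
ok     <- subset(parsed, is.na(Messages))
failed <- subset(parsed, !is.na(Messages))

# Normalized names of the successfully parsed entries.
ok$Normalized.Name

# Inputs that failed, together with their parser messages.
failed[, c("Original.Name", "Messages")]
```

With column names containing spaces, the same expressions would need backtick quoting, for example parsed$`Normalized Name`.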

Installation

Prerequisites

Install the devtools package with the following command.

if(!require(devtools)) { install.packages("devtools") }

Adjusting Makevars for more performance

To apply platform-specific optimizations, you can edit your user Makevars file. This file is located at ~/.R/Makevars, where ~ is your user directory. If it does not exist, you may need to create the directory and the file. To apply optimizations, put the following lines into your Makevars file.

CFLAGS = -O3 -Wall -mtune=native -march=native
CXXFLAGS = -O3 -Wall -mtune=native -march=native
CXX1XFLAGS = -O3 -Wall -mtune=native -march=native
CXX11FLAGS = -O3 -Wall -mtune=native -march=native

Depending on the number of available cores, you can speed up compilation by redefining MAKE in Makevars (here for 4 CPU cores):

MAKE = make -j4

Please note that these settings will apply to all R packages that require compilation from this point on! Also, -O3 may have a detrimental effect on some code. If you run into problems, replace it with R's default, -O2.
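
If the Makevars file or its directory does not exist yet, it can be created from within R. The following is a minimal sketch using base R only; append_makevars is a hypothetical helper name and not part of rgoslin or devtools. The example uses the more conservative -O2 setting and writes to a temporary location so that it does not touch your real configuration.

```r
# Hypothetical helper (not part of rgoslin): append settings to a Makevars
# file, creating the directory and the file if they are missing.
append_makevars <- function(lines, path = file.path("~", ".R", "Makevars")) {
  path <- path.expand(path)
  # Create the containing directory if needed; silent if it already exists.
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  # Opening in append mode creates the file when it does not exist.
  con <- file(path, open = "a")
  on.exit(close(con))
  writeLines(lines, con)
  invisible(path)
}

# detectCores() can return NA on some platforms, so fall back to one job.
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L

settings <- c(
  "CXXFLAGS = -O2 -Wall",
  sprintf("MAKE = make -j%d", n_cores)
)

# Demo target inside tempdir(); omit `path` to target ~/.R/Makevars instead.
demo_path <- append_makevars(settings, file.path(tempdir(), ".R", "Makevars"))
readLines(demo_path)
```

Because the helper appends, check an existing Makevars file for earlier definitions of the same variables before running it against your real configuration.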

Installing rgoslin

Run

  install_github("lifs-tools/rgoslin")

to install from the GitHub repository.

This will install the latest, potentially unstable development version of the package with all required dependencies into your local R installation.

If you want to use a proper release version, referenced by a Git tag (here: v2.0.0), install the package as follows:

  install_github("lifs-tools/rgoslin", ref="v2.0.0")

If you want to work off of a specific branch (here: adding_masses), install the package as follows:

  install_github("lifs-tools/rgoslin", ref="adding_masses")

If you also want to build the manual and the vignettes, add the following arguments:

  install_github("lifs-tools/rgoslin", ref="adding_masses", build_manual = TRUE, build_vignettes = TRUE)

If you have cloned the code locally, use devtools as follows. Make sure the working directory is set to the root of the cloned package source. Then execute

library(devtools)
install(".")

To run the tests, execute

library(devtools)
test()

Usage

To load the package, start an R session and type

  library(rgoslin)

Type the following to see the package vignette / tutorial:

  vignette('introduction', package = 'rgoslin')
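
Once the package is loaded, a typical session might look like the sketch below. It assumes that parseLipidNames() is one of the parse* functions mentioned under "Changes in Version 2.0" and that it accepts a character vector; this README does not name the functions, so check the package help or the vignette if the call is not found. The lipid names are arbitrary examples.

```r
# Example inputs: two shorthand names and one string that should not parse.
lipid_names <- c("PC 32:1", "LPC 34:1", "not a lipid")

if (requireNamespace("rgoslin", quietly = TRUE)) {
  # ASSUMPTION: parseLipidNames() exists and takes a character vector.
  # Parser failures are expected to surface as R warnings, not errors.
  parsed <- rgoslin::parseLipidNames(lipid_names)

  # Rows with NA in Messages parsed successfully.
  succeeded <- is.na(parsed$Messages)
  print(parsed$Normalized.Name[succeeded])

  # Parser messages for the names that failed.
  print(parsed$Messages[!succeeded])
} else {
  message("rgoslin is not installed; see the Installation section.")
}
```

The requireNamespace() guard only keeps the sketch from failing when the package is missing; in an interactive session you would normally call library(rgoslin) as shown above.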

Adding cppgoslin as a Git subtree

In the root of your git project, run the git subtree command, with <PREFIX> replaced by the subdirectory path where you want the subtree to live (src/cppgoslin/):

git subtree add --prefix=<PREFIX> https://github.com/lifs-tools/cppgoslin.git master

For rgoslin, we use the following command:

git subtree add --prefix=src/cppgoslin/ https://github.com/lifs-tools/cppgoslin.git master

Note: instead of the https URL to the Git repository, you can also use the ssh location, e.g. git@github.com:lifs-tools/cppgoslin.git.

Instead of master, you can choose any other branch or tag to clone. For more information on git subtree, see Git Subtree or this article.

Pulling and pushing of a Git subtree

For pulling and pushing, you have to change into the root directory of the host repository and execute the following commands:

Pulling

git subtree pull --prefix=<PREFIX> https://github.com/lifs-tools/cppgoslin.git master

Pushing

git subtree push --prefix=<PREFIX> https://github.com/lifs-tools/cppgoslin.git master

Alternatively, you can create shortcuts/aliases in your repository's .git/config file:

[alias]
    # pulls the latest upstream cppgoslin changes into the subtree
    cppgoslin-pull = "!f() { git subtree pull --prefix <PREFIX> git@github.com:lifs-tools/cppgoslin.git master; }; f"
    # pushes local commits on the subtree to the upstream cppgoslin repository
    cppgoslin-push = "!f() { git subtree push --prefix <PREFIX> git@github.com:lifs-tools/cppgoslin.git master; }; f"

Make sure to replace <PREFIX> with the path from your repository root directory to the directory where you placed your subtree!

This allows you to run git cppgoslin-pull to pull the latest master version, or git cppgoslin-push to push your latest local commits on the cppgoslin subtree to the upstream repository.