Schema and generated objects for biolink data model and upper ontology

amykglen/biolink-model

 
 

Biolink Model

Quickstart docs:

For a good overview of the biolink-model, watch Chris Mungall's talk at ICBO 2020.

Refer to the following resources for a quick introduction to the Biolink Model:

See also Biolink Model Guidelines for help understanding, curating, and working with the model.

Introduction

The purpose of the Biolink Model is to provide a high-level data model of biological entities (genes, diseases, phenotypes, pathways, individuals, substances, etc.), their properties and relationships, and to enumerate the ways in which they can be associated.

The representation is independent of storage technology or metamodel (Solr documents, Neo4j/property graphs, RDF/OWL, JSON, CSVs, etc.). Mappings to each of these are provided.

The specification of the Biolink Model is a single YAML file built using LinkML. The basic elements of the YAML are:

  • Class Definitions: definitions of upper-level classes representing both 'named thing' and 'association'.
  • Slot Definitions: definitions of slots (also known as properties) that relate members of these classes to other classes or data types. The term "slot" covers predicates, node properties, and edge properties.
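
To make the distinction between classes and slots concrete, here is a minimal Python sketch. The dictionary mirrors the general shape of a LinkML schema (a `classes` section and a `slots` section, with `is_a` for inheritance), but its entries are a simplified illustration and not an excerpt from biolink-model.yaml. The helper `slots_for` is hypothetical and exists only for this example.

```python
# Simplified, illustrative schema fragment; NOT copied from biolink-model.yaml.
SCHEMA = {
    "classes": {
        "named thing": {"slots": ["id", "name"]},
        "gene": {"is_a": "named thing", "slots": ["symbol"]},
        "association": {"slots": ["subject", "predicate", "object"]},
    },
    "slots": {
        "id": {"range": "string"},
        "name": {"range": "string"},
        "symbol": {"range": "string"},
        "subject": {"range": "named thing"},
        "predicate": {"range": "string"},
        "object": {"range": "named thing"},
    },
}


def slots_for(class_name, schema=SCHEMA):
    """Collect slots declared on a class plus those inherited through is_a."""
    collected = []
    current = class_name
    while current is not None:
        definition = schema["classes"][current]
        for slot in definition.get("slots", []):
            if slot not in collected:
                collected.append(slot)
        # Walk up to the parent class, if any.
        current = definition.get("is_a")
    return collected


print(slots_for("gene"))
```

In this toy fragment, `gene` declares one slot of its own and inherits the rest from `named thing`, while `association` carries the subject/predicate/object slots that describe an edge.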

The model itself is being used in the following projects:

Organization

The main source of truth is biolink-model.yaml. This is a YAML file that is intended to be relatively simple to view and edit in its native form.

The YAML definition is currently used to derive:

Make and build instructions

Prerequisites: Python 3.7+ and pipenv

To install pipenv,

pip3 install pipenv

To install the project,

make install

To regenerate artifacts from the Biolink Model YAML,

make

Note: the Makefile requires the following dependencies to be installed:

jsonschema

It can generally be installed using pip:

pip3 install jsonschema

jsonschema2pojo

If you are on a Mac, it can be installed using brew:

brew install jsonschema2pojo

For other operating systems, download the latest release and extract it into a directory on your execution path. For example:

wget https://github.com/joelittlejohn/jsonschema2pojo/releases/download/jsonschema2pojo-1.0.2/jsonschema2pojo-1.0.2.tar.gz
tar -xvzf jsonschema2pojo-1.0.2.tar.gz
export PATH=$PATH:`pwd`/jsonschema2pojo-1.0.2/bin

GraphViz

See the Graphviz site for installation instructions for your operating system.
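
Before running make, it can help to confirm that these dependencies are visible from your environment. The following is a small sketch, not part of the repository. It assumes that the Graphviz installation exposes the `dot` executable and that jsonschema2pojo is invoked by that name from the PATH; the function `check_prerequisites` is a hypothetical helper.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Return a mapping of dependency name -> whether it was found."""
    return {
        # Python package installed via pip3 install jsonschema
        "jsonschema (Python package)": importlib.util.find_spec("jsonschema") is not None,
        # Assumed executable name for jsonschema2pojo
        "jsonschema2pojo (executable)": shutil.which("jsonschema2pojo") is not None,
        # Assumed: Graphviz provides the dot executable
        "dot (Graphviz executable)": shutil.which("dot") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        status = "ok" if found else "MISSING"
        print(f"{status:8}{name}")
```

Any entry reported as MISSING points back to the corresponding installation step above.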

How do I use Biolink Model YAML programmatically?

For operations such as looking up CURIEs, finding a class by its synonyms, or getting the parents and ancestors of a class, please use biolink-model-toolkit. It provides convenience methods for traversing the Biolink Model.
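
To illustrate what such traversal operations do, here is a plain-Python sketch over a tiny in-memory hierarchy. It does not use the biolink-model-toolkit API. The hierarchy, the synonyms, and the function names (`get_parent`, `get_ancestors`, `find_by_synonym`) are all assumptions made up for this example, and the real model has more intermediate classes.

```python
# Toy hierarchy: child -> parent. Hypothetical and heavily simplified.
PARENTS = {
    "named thing": None,
    "biological entity": "named thing",
    "gene": "biological entity",
    "disease": "biological entity",
}

# Hypothetical synonyms, for illustration only.
SYNONYMS = {
    "gene": ["locus"],
    "disease": ["disorder", "condition"],
}


def get_parent(name):
    """Return the direct parent of a class, or None for the root."""
    return PARENTS[name]


def get_ancestors(name):
    """Return all ancestors of a class, nearest first."""
    ancestors = []
    parent = get_parent(name)
    while parent is not None:
        ancestors.append(parent)
        parent = get_parent(parent)
    return ancestors


def find_by_synonym(term):
    """Return the class whose name or synonym matches term, else None."""
    wanted = term.lower()
    for name in PARENTS:
        if wanted == name or wanted in SYNONYMS.get(name, []):
            return name
    return None


print(get_ancestors("gene"))
print(find_by_synonym("disorder"))
```

For real work, prefer the toolkit: it reads the actual model, so the results stay in step with the released YAML.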

Citing Biolink Model

Unni DR, Moxon SAT, Bada M, Brush M, Bruskiewich R, Caufield JH, Clemons PA, Dancik V, Dumontier M, Fecho K, Glusman G, Hadlock JJ, Harris NL, Joshi A, Putman T, Qin G, Ramsey SA, Shefchek KA, Solbrig H, Soman K, Thessen AE, Haendel MA, Bizon C, Mungall CJ, The Biomedical Data Translator Consortium (2022). Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science. Clin Transl Sci. Wiley; 2022 Jun 6; https://onlinelibrary.wiley.com/doi/10.1111/cts.13302
