
openBIS at IMV

Osvaldo Zagordi edited this page Dec 5, 2017 · 11 revisions

openBIS is an information management system developed to organise and analyse experimental data, especially from high-volume experiments.

Scope

Since spring/summer 2017 we've been using openBIS to store and organise data from our MiSeq machines. Our first goal was to have an automatic way of registering and annotating sequencing data. Further, we wanted to automatically trigger the start of some analysis pipelines for a subset of our sequencing samples.

The concept

In the following paragraphs, we describe the basic concepts of openBIS, starting from the simplest entity (the sample) and making concrete references to our implementation. Please also refer to the openBIS user documentation.

Sample

The primary object in openBIS is a sample. While in common language a sample is a small quantity extracted from a whole, or a specimen, in openBIS lingo a sample is anything that can be measured, analysed, annotated, or described. In our implementation, for example, a MiSeq run is a sample. We annotate it with information taken from the sample sheet, such as investigator name, chemistry used, and PhiX concentration. A full list of properties can be found in the metadata model.
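To make the annotation step more concrete, here is a minimal sketch of how the key/value pairs in the `[Header]` section of a MiSeq sample sheet could be collected into a dictionary of properties. It is an illustration, not our registration code: the function name `read_header_properties` and the example values are hypothetical, and the mapping from header keys to the property codes of the metadata model is not shown.

```python
import csv
import io


def read_header_properties(sample_sheet_text):
    """Return the key/value pairs of the [Header] section as a dict."""
    properties = {}
    in_header = False
    for row in csv.reader(io.StringIO(sample_sheet_text)):
        # Skip empty lines and lines that only contain separators.
        if not row or not row[0].strip():
            continue
        first = row[0].strip()
        # A line such as "[Header]" or "[Data]" starts a new section.
        if first.startswith("["):
            in_header = first == "[Header]"
            continue
        if in_header:
            properties[first] = row[1].strip() if len(row) > 1 else ""
    return properties


# Made-up sample sheet content, for illustration only.
EXAMPLE_SHEET = """[Header]
Investigator Name,Jane Doe
Chemistry,Amplicon
[Reads]
251
[Data]
Sample_ID,Sample_Name
1,patient_A
"""

header = read_header_properties(EXAMPLE_SHEET)
print(header["Investigator Name"])
```

Only the header section is read here; the rows under `[Data]` describe individual sequencing samples rather than the run itself.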

In our implementation we have three sample types:

  • MISEQ_RUN,
  • MISEQ_SAMPLE,
  • RESISTANCE_TEST

Dataset and attachments

When a data file must be stored in openBIS, it can either be attached to a sample or added to it as a dataset (see this FAQ to learn about the difference).

In our implementation, each fastq file is stored as the dataset of a sample of type MISEQ_SAMPLE. The sample is annotated with the sample ID (a number), the sample name (the column "name" in the sample sheet) and other properties.

Attachments, on the other hand, are added to samples of type RESISTANCE_TEST after they are analysed, as described on the pybis page.

Experiments and projects

An experiment is a special attribute of a sample that can be used to organise samples, while projects are essentially containers of experiments. This is especially convenient when accessing openBIS via a browser, because one can easily navigate to the desired sample or set of samples.

The picture below shows the experiments that are found in the project RESISTANCE (selected in the left column).

Browser view of openBIS

The next picture shows the samples present in the experiment MISEQ_SAMPLES in the project RESISTANCE.

Browser view of openBIS

We defined our model so that each experiment contains samples of a single type, and we used a very simple naming scheme: each experiment is named after the sample type it holds. As a result,

  • MISEQ_SAMPLES only contains samples of type MISEQ_SAMPLE,
  • MISEQ_RUNS only contains samples of type MISEQ_RUN,
  • RESISTANCE_TESTS only contains samples of type RESISTANCE_TEST.
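The naming scheme above can be written down as a one-line rule. The snippet below is only a sketch of the convention; the helper `experiment_code` is a hypothetical name and is not part of openBIS or pybis.

```python
# The three sample types used in our implementation.
SAMPLE_TYPES = ("MISEQ_RUN", "MISEQ_SAMPLE", "RESISTANCE_TEST")


def experiment_code(sample_type):
    """Return the name of the experiment that holds samples of this type."""
    if sample_type not in SAMPLE_TYPES:
        raise ValueError("unknown sample type: " + sample_type)
    # The experiment name is simply the plural of the sample type.
    return sample_type + "S"


for sample_type in SAMPLE_TYPES:
    print(sample_type, "->", experiment_code(sample_type))
```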

Mapping

A very important feature of openBIS is the possibility to establish a parent/child relationship between samples. In openBIS lingo, this linking is called mapping. We use two such links among our three sample types:

  • each MISEQ_SAMPLE is a child of a MISEQ_RUN,
  • each RESISTANCE_TEST is a child of a MISEQ_SAMPLE (this applies to project RESISTANCE only).

The relationship between RESISTANCE_TEST and MISEQ_SAMPLE is used to automatically analyse the fastq files with MinVar and attach the results (see the pybis page for more information).
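The parent/child structure can be sketched with a small in-memory model. This is plain Python and does not use the openBIS or pybis API: the `Sample` class, the `link` and `fastq_for` helpers, and all codes and file names are hypothetical. It also assumes a single parent per sample, which matches our two links but is a simplification of what openBIS allows.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Which parent type each child type may be linked to in our model.
ALLOWED_PARENT = {
    "MISEQ_SAMPLE": "MISEQ_RUN",
    "RESISTANCE_TEST": "MISEQ_SAMPLE",
}


@dataclass
class Sample:
    code: str
    sample_type: str
    parent: Optional["Sample"] = None
    datasets: List[str] = field(default_factory=list)     # e.g. fastq files
    attachments: List[str] = field(default_factory=list)  # e.g. reports


def link(child, parent):
    """Set parent as the parent of child if the model allows this link."""
    if ALLOWED_PARENT.get(child.sample_type) != parent.sample_type:
        raise ValueError(
            child.sample_type + " cannot be a child of " + parent.sample_type
        )
    child.parent = parent


def fastq_for(test):
    """Follow the parent link of a RESISTANCE_TEST to its fastq datasets."""
    if test.sample_type != "RESISTANCE_TEST" or test.parent is None:
        raise ValueError("expected a RESISTANCE_TEST linked to a MISEQ_SAMPLE")
    return list(test.parent.datasets)


# Made-up codes and file names, for illustration only.
run = Sample("RUN-1", "MISEQ_RUN")
miseq_sample = Sample("SAMPLE-1", "MISEQ_SAMPLE", datasets=["sample1.fastq.gz"])
resistance_test = Sample("TEST-1", "RESISTANCE_TEST")

link(miseq_sample, run)
link(resistance_test, miseq_sample)

print(fastq_for(resistance_test))
```

In this picture, an analysis step would start from a RESISTANCE_TEST, find the fastq file through the parent link, and store its results in the `attachments` of the same RESISTANCE_TEST.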


For further information, see also the openBIS FAQ.