openBIS at IMV
openBIS is an information management system developed to organise and analyse data from experiments, especially large-volume ones.
Since spring/summer 2017 we've been using openBIS to store and organise data from our MiSeq machines. Our first goal was to have an automatic way of registering and annotating sequencing data. Further, we wanted to automatically trigger the start of some analysis pipelines for a subset of our sequencing samples.
In the following paragraphs, we describe the basic concepts of openBIS, starting from the simplest entity (the sample) and making concrete references to our implementation.
The primary object in openBIS is a sample. While in common language a sample is a small quantity extracted from a whole or a specimen, in openBIS lingo a sample is anything that can be measured, analysed, annotated or described. In our implementation, for example, a MiSeq run is a sample. We annotate it with information taken from the sample sheet, such as investigator name, chemistry used, phiX concentration and so on.
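To illustrate how such annotations could be collected, here is a minimal sketch that reads the key/value pairs of the `[Header]` section of a sample sheet into a dictionary. It assumes the usual comma-separated layout with bracketed section markers; the function name `read_header_properties` and the example values are hypothetical and not taken from our production code.

```python
import csv
import io


def read_header_properties(sample_sheet_text):
    """Return the key/value pairs of the [Header] section as a dict."""
    properties = {}
    in_header = False
    for row in csv.reader(io.StringIO(sample_sheet_text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        first = row[0].strip()
        if first.startswith("["):
            # Section markers look like [Header], [Reads], [Data], ...
            in_header = first.lower() == "[header]"
            continue
        if in_header:
            value = row[1].strip() if len(row) > 1 else ""
            properties[first] = value
    return properties


# Made-up sample sheet fragment, for illustration only.
example_sheet = """[Header]
Investigator Name,Jane Doe
Experiment Name,example_run
Chemistry,Amplicon

[Reads]
251
"""

print(read_header_properties(example_sheet))
```

The resulting dictionary could then be mapped onto the properties of a MISEQ_RUN sample; the exact property names depend on how the sample type is defined in openBIS.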
In our implementation we have three sample types:
- MISEQ_RUN,
- MISEQ_SAMPLE,
- RESISTANCE_TEST
When a data file must be stored in openBIS, it can either be attached to a sample or added to it as a dataset (see this FAQ to learn about the difference).
In our implementation, a sample of type MISEQ_SAMPLE has a fastq file as its dataset. The sample is annotated with the sample ID (a number), the sample name (the column "name" in the sample sheet) and other properties.
On the other hand, attachments are added to samples of type RESISTANCE_TEST after they are analysed, as described in the pybis page.
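The difference between a dataset and an attachment can be sketched with pybis as follows. This is only an illustration under stated assumptions: the space name `IMV`, the dataset type `FASTQ` and the property codes `sample_id` and `sample_name` are hypothetical placeholders, and `o` is expected to be a logged-in pybis `Openbis` session created elsewhere.

```python
def register_miseq_sample(o, sample_id, sample_name, fastq_path,
                          space="IMV",
                          experiment="/IMV/RESISTANCE/MISEQ_SAMPLES",
                          dataset_type="FASTQ"):
    """Create a MISEQ_SAMPLE and store the fastq file as its dataset.

    `o` is a logged-in pybis Openbis session. The space, dataset type and
    property codes are assumptions made for this sketch.
    """
    sample = o.new_sample(
        type="MISEQ_SAMPLE",
        space=space,
        experiment=experiment,
        props={"sample_id": sample_id, "sample_name": sample_name},
    )
    sample.save()

    # The fastq file is registered as a dataset linked to the sample.
    dataset = o.new_dataset(
        type=dataset_type,
        experiment=experiment,
        sample=sample.identifier,
        files=[fastq_path],
    )
    dataset.save()
    return sample, dataset


def attach_report(sample, report_path):
    """Add a result file to an existing sample as an attachment."""
    sample.add_attachment(report_path)
    sample.save()


# Possible usage (server URL and credentials are placeholders):
#   from pybis import Openbis
#   o = Openbis("https://openbis.example.org")
#   o.login("username", "password")
#   sample, dataset = register_miseq_sample(o, 1, "example", "example.fastq")
```

Passing the session in as a parameter keeps the functions independent of how the connection is established.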
Experiment is a special attribute of samples that can be used to organise them, while projects are, essentially, containers of experiments. This is especially convenient when accessing openBIS via a browser because one can easily navigate to the desired sample or set of samples.
The picture below shows the experiments that are found in the project RESISTANCE (selected in the left column).
The next picture shows the samples present in the experiment MISEQ_SAMPLES in the project RESISTANCE.
It is worth noting that we defined our model such that an experiment will only contain samples of the same type, and we also used a very simple naming scheme. As a result,
- MISEQ_SAMPLES only contains samples of type MISEQ_SAMPLE,
- MISEQ_RUNS only contains samples of type MISEQ_RUN,
- RESISTANCE_TESTS only contains samples of type RESISTANCE_TEST.
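The naming scheme above is simple enough to be written down as a lookup table. The sketch below builds an experiment identifier of the form `/SPACE/PROJECT/EXPERIMENT` from a sample type; the space name `IMV` and the helper name `experiment_identifier` are assumptions made for illustration.

```python
# One experiment per sample type, named after the type in the plural.
EXPERIMENT_FOR_TYPE = {
    "MISEQ_RUN": "MISEQ_RUNS",
    "MISEQ_SAMPLE": "MISEQ_SAMPLES",
    "RESISTANCE_TEST": "RESISTANCE_TESTS",
}


def experiment_identifier(sample_type, project="RESISTANCE", space="IMV"):
    """Return the experiment that holds samples of the given type."""
    # The space name "IMV" is a placeholder for this sketch.
    return f"/{space}/{project}/{EXPERIMENT_FOR_TYPE[sample_type]}"


print(experiment_identifier("MISEQ_SAMPLE"))
```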
A very important feature in openBIS is the possibility to establish a parent/child relationship between samples. In openBIS lingo, this linking is called mapping. We use links between two types of samples:
- each MISEQ_SAMPLE is a child of a MISEQ_RUN,
- each RESISTANCE_TEST is a child of a MISEQ_SAMPLE (this applies to project RESISTANCE only).
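The two link rules can be summarised in a small, self-contained sketch. It models only the sample types, not openBIS itself; the helper names `is_valid_link` and `ancestry` are hypothetical.

```python
# Allowed parent type for each child type in our model.
PARENT_TYPE = {
    "MISEQ_SAMPLE": "MISEQ_RUN",
    "RESISTANCE_TEST": "MISEQ_SAMPLE",
}


def is_valid_link(child_type, parent_type):
    """Check whether a child of one type may be linked to the given parent type."""
    return PARENT_TYPE.get(child_type) == parent_type


def ancestry(sample_type):
    """Follow the parent links from a sample type up to the top of the chain."""
    chain = [sample_type]
    while chain[-1] in PARENT_TYPE:
        chain.append(PARENT_TYPE[chain[-1]])
    return chain


print(ancestry("RESISTANCE_TEST"))
```

When samples are created with pybis, the actual link can be set by passing the parent sample through the `parents` argument of `new_sample`.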
The relationship between RESISTANCE_TEST and MISEQ_SAMPLE is used to automatically analyse the fastq files with MinVar and attach the results (see the pybis page for more information).
For further information, see also the openBIS FAQ.