Project 12: Empowering the community with notebooks for bespoke microbiome analyses

Abstract

MGnify is EMBL-EBI’s metagenomics resource and is part of the ELIXIR Metagenomics Community. We recently launched a Notebook Server that provides an online Jupyter Lab environment in which users can explore programmatic access to MGnify’s datasets using Python or R. This ready-to-use environment and its example analysis notebooks bridge the gap between browsing the MGnify website, which is easy but limited, and installing a local environment to work with MGnify data, which is flexible but complex. Particular goals of the Notebook Server include reproducible downstream analyses, user empowerment through best-practice examples and fast workflows from datasets to publication-ready graphics, and code-as-documentation training materials for users of MGnify.

We have three objectives for the BioHackathon:

First, to increase the breadth of example notebooks to cover the entire MGnify API surface. This means users will be able to jump from any resource on the MGnify website into a Jupyter Notebook ready to read and analyse that dataset.
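
As a rough illustration of what "reading a resource into a notebook" can look like, here is a minimal Python sketch using only the standard library. It assumes the public MGnify API root at `https://www.ebi.ac.uk/metagenomics/api/v1` and JSON:API-style responses (`data`, `attributes`, `links.next`); the helper names (`build_url`, `flatten`, `fetch_all`) and the `page_size` parameter shown in the usage note are illustrative choices, not part of the Notebook Server itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed API root; adjust if the service is hosted elsewhere.
API_BASE = "https://www.ebi.ac.uk/metagenomics/api/v1"


def build_url(resource, **params):
    """Return the URL of a list endpoint such as 'samples' or 'analyses'."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{API_BASE}/{resource}{query}"


def flatten(page):
    """Turn one JSON:API page into a list of flat dicts, one per record."""
    rows = []
    for record in page.get("data", []):
        row = {"id": record["id"], "type": record["type"]}
        row.update(record.get("attributes", {}))
        rows.append(row)
    return rows


def fetch_all(resource, max_pages=1, **params):
    """Follow 'links.next' for up to max_pages pages (needs network access)."""
    url, rows = build_url(resource, **params), []
    for _ in range(max_pages):
        with urlopen(url) as response:
            page = json.load(response)
        rows.extend(flatten(page))
        url = (page.get("links") or {}).get("next")
        if not url:
            break
    return rows
```

With network access, `fetch_all("samples", max_pages=2, page_size=25)` would return a list of dicts that can be passed directly to `pandas.DataFrame` for tabulation or plotting.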

Second, to showcase examples using SIAMCAT’s statistical and machine-learning frameworks for comparative metagenomics (https://siamcat.embl.de/). This builds upon the existing integration of MGnifyR (https://github.com/beadyallen/MGnifyR) and moves towards our vision of curating a repository of exemplary packages and workflows from collaborators and the community.

Third, to explore integration with the Galaxy Europe project. Galaxy supports a broad range of tools, including Jupyter Notebooks. Serving the MGnify Notebook experience from Galaxy Europe infrastructure can unlock a wider set of possibilities for users, as well as provide a persistent analysis workbench.

Topics

Biodiversity, Data Platform, Interoperability Platform, Marine Metagenomics

Project Number: 12

Lead(s)

Martin Beracochea - [email protected]
Sandy Rogers - [email protected]

Expected outcomes

BioHackathon outcomes:

  • Expand the MGnify Notebook Server to include example Jupyter Notebooks, in both Python and R, to read in and tabulate or visualise: Samples, Runs, Analyses, Publications, and MAGs
  • Integrate the SIAMCAT package, and create an example of association testing from MGnify API data
  • Create a proof-of-concept for serving MGnify Notebooks via the Galaxy Europe platform
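
To make the idea of association testing concrete, the sketch below runs a simple permutation test on the relative abundance of one feature in two sample groups. This is not SIAMCAT's API (SIAMCAT is an R package with its own functions and methods); the function name `permutation_p_value`, the choice of a difference-in-means statistic, and the example values are all assumptions made for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference in group means.

    Returns the fraction of random relabellings whose difference is at least
    as large as the observed one (with a +1 correction so p is never zero).
    """
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Made-up relative abundances of one taxon in two groups of samples.
case = [0.30, 0.28, 0.35, 0.31, 0.33, 0.29]
control = [0.05, 0.07, 0.04, 0.06, 0.08, 0.05]
p_value = permutation_p_value(case, control)
```

In a real analysis the test would be repeated per feature and corrected for multiple testing, which is the kind of bookkeeping a dedicated package is meant to handle.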

Long-term expected outcomes:

  • Integration of MGnify Notebooks in the Galaxy Europe platform. (6 months)
  • Establish a catalogue of notebooks that allows scientists to share their approaches, which can be used either to ensure reproducibility or to apply the same methods to different datasets. (8 months)

Expected audience

5

Number of expected hacking days: 4


Some outcomes:

[Image: grafik (not preserved)]