Award details

MetaboLights: Creating the missing Metabolomics community resource

Principal Investigator / Supervisor Dr Chris C Steinbeck
Co-Investigators Professor Julian Griffin
Institution EMBL - European Bioinformatics Institute
Department Chemoinformatics and Metabolism
Funding type Research
Value (£) 863,382
Type Research Grant
Start date 01/11/2010
End date 30/04/2014
Duration 42 months


This project will instantiate, at the European Bioinformatics Institute (EBI) in Hinxton, Cambridge, UK, the MetaboLights metabolomics database, with several components focussed on both data standards and primary experimental data. The resource will be cross-species and cross-application, and will cover all relevant analytical methods. To date, metabolomic databases have focused either on spectra and metadata associated with the analysis of standards, or on data associated with a small number of species or one analytical tool. MetaboLights will characterize metabolites in several respects: 1. through their chemical structure, names and related information, 2. through their spectral (NMR, MS) and chromatographic information (retention times) and 3. through their occurrence, concentration or absence in a particular species, organ, tissue or cell type. All of this information will further be linked to other resources of interest at the EBI or elsewhere, including the Reactome database for pathway analysis, our chemogenomics database ChEMBL to account for drug interactions, and UniProt and PDB to link metabolites to their respective processing enzymes or to other biomacromolecules with which they are known to interact. There will further be 4. a database for metabolomics experiment management. This is a vital resource still needed by the metabolomics community to encourage multi-lab collaborative projects, and to aid peer review and software development. For the success of MetaboLights it will also be essential to work on a data input pipeline that supports the user to the greatest possible extent. The types of data used in metabolomics are an order of magnitude more diverse than those in proteomics, and the task of creating such a pipeline will be complex. Here, too, we plan to cooperate closely with the community in creating open standards and to encourage vendors to implement the production of open formats in their instrument software.


Metabolomics studies the occurrence and change of concentrations of small molecular weight chemical compounds (metabolites) in organisms, organs, tissues, cells and ultimately cell compartments in the context of environmental changes, disease or other boundary conditions. It does this by means of spectroscopic and chromatographic techniques, observing at once not just a few but all compounds visible to the particular technique used. To understand what a change in concentration of one or more of the signals in a spectrum or chromatogram ultimately means, the identity and role of the chemical compounds underlying the respective signal need to be revealed. This requires curated databases with reference data on the chemical structures of biological metabolites assigned to their spectral data. It also requires knowledge of reference concentrations of metabolites in the biological system of interest under given conditions. In genome and protein science, large resources exist to document which genes or proteins are found or expressed under certain conditions in the system of interest. In metabolomics, however, a general system with such information has not yet been instantiated. Instead, a number of resources exist which specialize in certain 'kingdoms of life', species or diseases, or analytical devices. We therefore propose to establish a general metabolomics database resource at the European Bioinformatics Institute in Hinxton, Cambridge, UK, which serves this crucial information to the biological community in the UK and worldwide. This resource, with the working title 'MetaboLights', will serve information about metabolites and their reference spectra and chromatographic data, their occurrence and concentrations in organisms, tissues, cells, etc., under well-defined conditions, and, last but not least, documentation of how metabolomics experiments were conducted.
Like all other EBI resources, the MetaboLights databases will be completely open to the public, including open access to the data. Data will be made available in publicly accepted open standards. The software will be open source. MetaboLights is not meant to replace specialist resources for metabolomics. Rather, it will build on prior art and collaborate. We are dedicated to close collaboration with all major parties involved in the creation of this prior art, such as the Metabolomics Society, Metabomeeting and the Metabolomics Standards Initiative. In the molecular biology universe as it exists today, none of the large database efforts, whether in genomics or in any other area, lives an isolated life. Rather, there is a regulated ecosystem of resources, such as GenBank, Ensembl and DDBJ for the genome sciences, which interchange data freely and compete on how to present and analyse this data. We aim to reach similar data sharing agreements with major resources such as the Human Metabolome Database, the Golm Metabolome Database and the RIKEN Metabolomics Platform. However, MetaboLights will be the first comprehensive, cross-species, cross-technique database which combines curated reference data of pure metabolites and curated information about their occurrence and concentration in species, organs, tissues and cell types under various conditions with data characterizing the experiment which led to these findings.

Impact Summary

The MetaboLights resource, designed to become the third, missing pillar of large Omics resources and to complement the EBI's proteomics and genomics resources, will benefit a number of significant communities performing biological research and development in metabolomics and functional genomics. This is in congruence with a number of strategic research priorities of the BBSRC. In systems approaches to biological research, metabolomics allows us to study, with high time resolution, how the metabolic system reacts to changes in the environment, to stress, to disease and to other boundary conditions. These data can then be mapped to biological pathway models to give a dynamic view of the system. For ageing research, metabolomics is used to study and characterize states and dynamics of the ageing organism with no (urine) or low (blood) invasiveness, or through tissue analysis. Both in bioenergy research and in crop science, metabolomics is used to study how plants or microbes used for energy harvesting react to environmental changes (robustness) or how their energy metabolism reacts to genetic manipulation or other perturbations (flexibility). Generally, the field is of major importance for our understanding of how biological systems, most notably metabolic networks, behave under various conditions, and for developing personalized medicine, because metabolites, as end products of cellular regulatory processes, provide insights into the response of biological systems to genetic or environmental changes as well as diseases. Metabolomics is also widely used in UK industry, including the drug safety assessment process in the pharmaceutical industry, pesticide toxicology in agrochemicals, biomarker discovery for medical diagnostics and plant fitness for crop development. Metabolic profiles are therefore ideal a) as a diagnostic technique and b) for classifying organisms (including humans) by their phenotype.
According to Goodacre and coauthors, in order to deal with the torrent of data from metabolomics, it 'is clear [...] that we shall need good databases, very good data and even better algorithms [... and that] curation of these databases is essential if they are to be useful to the wider community'. To government agencies and ministries, this resource could become a portal of information about small molecule biomarkers and their significance for a particular diagnostic tool or method, thereby aiding in decision making and setting new strategic priorities in medical diagnostics, an area of great interest to the NHS. This work will also provide bioinformatic resources for a number of major companies in the UK such as GlaxoSmithKline, Syngenta, AstraZeneca and Unilever. The work of all of the beneficiaries listed above will benefit because the MetaboLights resource will be the first comprehensive, cross-species, cross-technique database which combines curated reference data of pure metabolites and curated information about their occurrence and concentration in organs, tissues and cell types under various conditions with data characterizing the experiment which led to these findings. Considerable synergy for the understanding of metabolism can be leveraged by cross-species analogy when metabolomics information is put into its genomic and transcriptomic context. We aim to create a unique and general metabolomics resource, based in the UK, which will interact with all relevant international databases in the field. The biological community will immediately benefit from MetaboLights because, like all other EBI resources, the MetaboLights databases will be completely open to the public, including open access to the data. Data will be made available in publicly accepted open standards. As for all EBI resources, we will provide training material and courses about MetaboLights which will be open and accessible to everyone.
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme