Gene discovery has become highly efficient through the combination of deep sequencing and the exploitation of natural variation. In Arabidopsis alone, hundreds of genetic loci have been identified as influencing a wide variety of processes. We aim to go from gene of interest to characterized protein product using approaches that “take a picture” of the comprehensive metabolome of the plant.

The IPB is currently operating a wide range of NMR and mass spectrometry instruments for metabolomics across all four departments, which are integrated into our Metabolomics Platform.

The experimental work is complemented by extensive Cheminformatics and Bioinformatics research to process and interpret the huge amounts of data. The IPB operates the first European MassBank server and hosts several online tools for metabolite identification.

The contact for all inquiries concerning the Metabolomics Platform is Dr. Steffen Neumann.

Publications by Tag: Metabolomics




Altenburger, R., Ait-Aissa, S., Antczak, P., Backhaus, T., Barceló, D., Seiler, T.-B., Brion, F., Busch, W., Chipman, K., López de Alda, M., de Aragão Umbuzeiro, G., Escher, B. I., Falciani, F., Faust, M., Focks, A., Hilscherova, K., Hollender, J., Hollert, H., Jäger, F., Jahnke, A., Kortenkamp, A., Krauss, M., Lemkine, G. F., Munthe, J., Neumann, S., Schymanski, E. L., Scrimshaw, M., Segner, H., Slobodnik, J., Smedes, F., Kughathas, S., Teodorovic, I., Tindall, A. J., Tollefsen, K. E., Walz, K.-H., Williams, T. D., Van den Brink, P. J., van Gils, J., Vrana, B., Zhang, X. & Brack, W. Future water quality monitoring — Adapting tools to deal with mixtures of pollutants in water resource management Sci Total Environ 512–513, 540–551, (2015) DOI: 10.1016/j.scitotenv.2014.12.057

Environmental quality monitoring of water resources is challenged with providing the basis for safeguarding the environment against adverse biological effects of anthropogenic chemical contamination from diffuse and point sources. While current regulatory efforts focus on monitoring and assessing a few legacy chemicals, many more anthropogenic chemicals can be detected simultaneously in our aquatic resources. However, exposure to chemical mixtures does not necessarily translate into adverse biological effects nor clearly shows whether mitigation measures are needed. Thus, the question which mixtures are present and which have associated combined effects becomes central for defining adequate monitoring and assessment strategies. Here we describe the vision of the international, EU-funded project SOLUTIONS, where three routes are explored to link the occurrence of chemical mixtures at specific sites to the assessment of adverse biological combination effects. First of all, multi-residue target and non-target screening techniques covering a broader range of anticipated chemicals co-occurring in the environment are being developed. By improving sensitivity and detection limits for known bioactive compounds of concern, new analytical chemistry data for multiple components can be obtained and used to characterise priority mixtures. This information on chemical occurrence will be used to predict mixture toxicity and to derive combined effect estimates suitable for advancing environmental quality standards. Secondly, bioanalytical tools will be explored to provide aggregate bioactivity measures integrating all components that produce common (adverse) outcomes even for mixtures of varying compositions. The ambition is to provide comprehensive arrays of effect-based tools and trait-based field observations that link multiple chemical exposures to various environmental protection goals more directly and to provide improved in situ observations for impact assessment of mixtures. Thirdly, effect-directed analysis (EDA) will be applied to identify major drivers of mixture toxicity. Refinements of EDA include the use of statistical approaches with monitoring information for guidance of experimental EDA studies. These three approaches will be explored using case studies at the Danube and Rhine river basins as well as rivers of the Iberian Peninsula. The synthesis of findings will be organised to provide guidance for future solution-oriented environmental monitoring and explore more systematic ways to assess mixture exposures and combination effects in future water quality monitoring.


Moreno, P., Beisken, S., Harsha, B., Muthukrishnan, V., Tudose, I., Dekker, A., Dornfeldt, S., Taruttis, F., Grosse, I., Hastings, J., Neumann, S. & Steinbeck, C. BiNChE: A web tool and library for chemical enrichment analysis based on the ChEBI ontology BMC Bioinformatics 16, 56, (2015) DOI: 10.1186/s12859-015-0486-3

Background: Ontology-based enrichment analysis aids in the interpretation and understanding of large-scale biological data. Ontologies are hierarchies of biologically relevant groupings. Using ontology annotations, which link ontology classes to biological entities, enrichment analysis methods assess whether there is a significant over- or under-representation of entities for ontology classes. While many tools exist that run enrichment analysis for protein sets annotated with the Gene Ontology, there are only a few that can be used for small molecules enrichment analysis.
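As a rough illustration of the over-representation test mentioned above, the following Python sketch computes a one-sided hypergeometric p-value for a single ontology class. It is a generic textbook test and not BiNChE's implementation; the function name and the example counts are hypothetical.

```python
from math import comb


def overrepresentation_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of entities in the background set
    annotated:  background entities annotated with the ontology class
    sample:     number of entities in the set under study
    hits:       study-set entities annotated with the ontology class
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of observing `hits` or more annotated entities.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background molecules, 4 of them in the class,
# 3 molecules in the study set, 2 of which fall into the class.
p_value = overrepresentation_p(population=10, annotated=4, sample=3, hits=2)
print(round(p_value, 4))
```

A small p-value suggests the class is over-represented in the study set. In practice the test is repeated for many classes, so a multiple-testing correction would normally follow.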

Results: We describe BiNChE, an enrichment analysis tool for small molecules based on the ChEBI Ontology. BiNChE displays an interactive graph that can be exported as a high-resolution image or in network formats. The tool provides plain, weighted and fragment analysis based on either the ChEBI Role Ontology or the ChEBI Structural Ontology.

Conclusions: BiNChE aids in the exploration of large sets of small molecules produced within Metabolomics or other Systems Biology research contexts. The open-source tool provides easy and highly interactive web access to enrichment analysis with the ChEBI ontology tool and is additionally available as a standalone library.


González-Beltrán, A., Neumann, S., Maguire, E., Sansone, S.-A. & Rocca-Serra, P. The Risa R/Bioconductor package: integrative data analysis from experimental metadata and back again. BMC Bioinformatics 15 (Suppl 1), S:11, (2014) DOI: 10.1186/1471-2105-15-S1-S11


matic and accessible format that abstracts away common constructs for describing Investigations, Studies and Assays, ISA is increasingly popular. To attract further interest towards the format and extend support to ensure reproducible research and reusable data, we present the Risa package, which delivers a central component to support the ISA format by enabling effortless integration with R, the popular, open source data crunching environment.


The Risa package bridges the gap between the metadata collection and curation in an ISA-compliant way and the data analysis using the widely used statistical computing environment R. The package offers functionality for: i) parsing ISA-Tab datasets into R objects; ii) augmenting annotation with extra metadata not explicitly stated in the ISA syntax; iii) interfacing with domain-specific R packages; iv) suggesting potentially useful R packages available in Bioconductor for subsequent processing of the experimental data described in the ISA format; and finally v) saving back to ISA-Tab files augmented with analysis-specific metadata from R. We demonstrate these features by presenting use cases for mass spectrometry data and DNA microarray data.


The Risa package is open source (with LGPL license) and freely available through Bioconductor. By making Risa available, we aim to facilitate the task of processing experimental data, encouraging a uniform representation of experimental information and results while delivering tools for ensuring traceability and provenance tracking.

Software availability

The Risa package has been available since Bioconductor 2.11 (version 1.0.0), and version 1.2.1 appeared in Bioconductor 2.12, both along with documentation and examples. The latest version of the code is at the development branch in Bioconductor and can also be accessed from GitHub at https://github.com/ISA-tools/Risa, where the issue tracker allows users to report bugs or request features.


Tautenhahn, R., Böttcher, C. & Neumann, S. Highly sensitive feature detection for high resolution LC/MS BMC Bioinformatics 9, 504, (2008) DOI: 10.1186/1471-2105-9-504


Liquid chromatography coupled to mass spectrometry (LC/MS) is an important analytical technology for, e.g., metabolomics experiments. Determining the boundaries, centres and intensities of the two-dimensional signals in the LC/MS raw data is called feature detection. For the subsequent analysis of complex samples such as plant extracts, which may contain hundreds of compounds corresponding to thousands of features, reliable feature detection is mandatory.


We developed a new feature detection algorithm centWave for high-resolution LC/MS data sets, which collects regions of interest (partial mass traces) in the raw data, and applies continuous wavelet transformation and optionally Gauss-fitting in the chromatographic domain. We evaluated our feature detection algorithm on dilution series and mixtures of seed and leaf extracts, and estimated recall, precision and F-score of seed and leaf specific features in two experiments of different complexity.
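The wavelet step can be illustrated with a minimal Python sketch, assuming a single, noise-free intensity trace. It convolves the trace with Mexican hat wavelets at several scales and reports local maxima of the strongest response. This is a simplified stand-in and not the centWave code from XCMS: region-of-interest detection and Gauss-fitting are omitted, and all names, scales and thresholds are assumptions chosen for illustration.

```python
import numpy as np


def mexican_hat(scale):
    """Mexican hat (Ricker) wavelet sampled on an odd-length, centred grid."""
    half = int(5 * scale)
    t = np.arange(-half, half + 1, dtype=float)
    norm = 2.0 / (np.sqrt(3.0 * scale) * np.pi ** 0.25)
    return norm * (1.0 - (t / scale) ** 2) * np.exp(-0.5 * (t / scale) ** 2)


def cwt_peaks(intensity, scales=(2, 4, 8), min_coeff=10.0):
    """Return scan indices where the best wavelet response has a local maximum.

    Assumes the trace is longer than the widest wavelet (10 * max(scales) + 1).
    """
    signal = np.asarray(intensity, dtype=float)
    # One row of wavelet coefficients per scale.
    coeffs = np.array(
        [np.convolve(signal, mexican_hat(s), mode="same") for s in scales]
    )
    best = coeffs.max(axis=0)  # strongest response over all scales
    peaks = []
    for i in range(1, len(best) - 1):
        is_local_max = best[i] > best[i - 1] and best[i] >= best[i + 1]
        if is_local_max and best[i] >= min_coeff:
            peaks.append(i)
    return peaks


# Synthetic trace (hypothetical data): two Gaussian peaks at scans 60 and 140.
scan = np.arange(200)
trace = (
    100.0 * np.exp(-0.5 * ((scan - 60) / 4.0) ** 2)
    + 50.0 * np.exp(-0.5 * ((scan - 140) / 4.0) ** 2)
)
print(cwt_peaks(trace))
```

Using several scales lets narrow and broad chromatographic peaks respond at the wavelet width that matches them best, which is the main reason for preferring a wavelet transform over a single fixed-width filter.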


The new feature detection algorithm meets the requirements of current metabolomics experiments. centWave can detect close-by and partially overlapping features and has the highest overall recall and precision values compared to the other algorithms, matchedFilter (the original algorithm of XCMS) and the centroidPicker from MZmine. The centWave algorithm was integrated into the Bioconductor R-package XCMS and is available from http://www.bioconductor.org/
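For readers unfamiliar with the evaluation measures, recall, precision and F-score can be computed from a set of detected features and a set of expected features as in the Python sketch below. The feature identifiers are hypothetical, and the sketch assumes features can be matched by exact identity, which sidesteps the tolerance-based matching a real evaluation would need.

```python
def precision_recall_f(detected, expected):
    """Return (precision, recall, F-score) for two collections of feature IDs."""
    detected, expected = set(detected), set(expected)
    true_positives = len(detected & expected)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F-score is the harmonic mean of precision and recall.
    f_score = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical example: 4 detected features, 5 expected, 3 in common.
precision, recall, f_score = precision_recall_f(
    detected={"f1", "f2", "f3", "f4"},
    expected={"f2", "f3", "f4", "f5", "f6"},
)
print(precision, recall, round(f_score, 3))
```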


Lange, E., Tautenhahn, R., Neumann, S. & Gröpl, C. Critical assessment of alignment procedures for LC-MS proteomics and metabolomics measurements BMC Bioinformatics 9, 375, (2008) DOI: 10.1186/1471-2105-9-375


Liquid chromatography coupled to mass spectrometry (LC-MS) has become a prominent tool for the analysis of complex proteomics and metabolomics samples. In many applications multiple LC-MS measurements need to be compared, e.g. to improve reliability or to combine results from different samples in a statistical comparative analysis. As in all physical experiments, LC-MS data are affected by uncertainties, and variability of retention time is encountered in all data sets. It is therefore necessary to estimate and correct the underlying distortions of the retention time axis to search for corresponding compounds in different samples. To this end, a variety of so-called LC-MS map alignment algorithms have been developed during the last four years. Most of these approaches are well documented, but they are usually evaluated on very specific samples only. So far, no publication has assessed different alignment algorithms using a standard LC-MS sample along with commonly used quality criteria.
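To make the idea of estimating and correcting a retention time distortion concrete, the Python sketch below fits a linear warp from a few landmark features that are assumed to be already matched between a sample and a reference map. It is a deliberately simple illustration and not one of the algorithms compared in the study; published methods typically use nonlinear warps and find the correspondences themselves. All names and values are hypothetical.

```python
import numpy as np


def fit_linear_warp(sample_rt, reference_rt):
    """Least-squares fit of reference_rt ~ slope * sample_rt + intercept."""
    slope, intercept = np.polyfit(sample_rt, reference_rt, deg=1)
    return slope, intercept


def apply_warp(rt, slope, intercept):
    """Map retention times of the sample onto the reference time axis."""
    return slope * np.asarray(rt, dtype=float) + intercept


# Hypothetical landmark pairs (seconds): the sample elutes slightly early.
sample_landmarks = [100.0, 200.0, 300.0, 400.0]
reference_landmarks = [102.5, 204.5, 306.5, 408.5]

slope, intercept = fit_linear_warp(sample_landmarks, reference_landmarks)
corrected = apply_warp([150.0, 350.0], slope, intercept)
print(slope, intercept, corrected)
```

After the correction, features from different runs can be grouped by comparing retention times within a tolerance window.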


We propose two LC-MS proteomics as well as two LC-MS metabolomics data sets that represent typical alignment scenarios. Furthermore, we introduce a new quality measure for the evaluation of LC-MS alignment algorithms. Using the four data sets to compare six freely available alignment algorithms proposed for the alignment of metabolomics and proteomics LC-MS measurements, we found significant differences with respect to alignment quality, running time, and usability in general.


The multitude of available alignment methods necessitates the generation of standard data sets and quality measures that allow users as well as developers to benchmark and compare their map alignment tools on a fair basis. Our study represents a first step in this direction. Currently, the installation and evaluation of the "correct" parameter settings can be quite a time-consuming task, and the success of a particular method is still highly dependent on the experience of the user. Therefore, we propose to continue and extend this type of study to a community-wide competition. All data as well as our evaluation scripts are available at http://msbi.ipb-halle.de/msbi/caap.


Kuhn, S., Egert, B., Neumann, S. & Steinbeck, C. Building blocks for automated elucidation of metabolites: Machine learning methods for NMR prediction BMC Bioinformatics 9, 400, (2008) DOI: 10.1186/1471-2105-9-400


Current efforts in Metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identification of substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to leverage this lack of knowledge. Indispensable for CASE are modules to predict spectra for hypothetical structures. This paper evaluates different statistical and machine learning methods to perform predictions of proton NMR spectra based on data from our open database NMRShiftDB.


A mean absolute error of 0.18 ppm was achieved for the prediction of proton NMR shifts ranging from 0 to 11 ppm. Random forest, J48 decision tree and support vector machines achieved similar overall errors. HOSE codes, a notably simple method, achieved a comparatively good result of 0.17 ppm mean absolute error.
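The error measure quoted above is the mean of the absolute differences between predicted and observed shifts. A minimal Python sketch follows; the shift values are invented for illustration and are not taken from NMRShiftDB.

```python
def mean_absolute_error(predicted, observed):
    """Mean absolute difference between paired predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - o) for p, o in zip(predicted, observed))
    return total / len(predicted)


# Hypothetical proton shifts in ppm.
predicted_shifts = [7.30, 2.10, 1.25]
observed_shifts = [7.26, 2.35, 1.20]
mae = mean_absolute_error(predicted_shifts, observed_shifts)
print(round(mae, 3))
```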


NMR prediction methods applied in the course of this work delivered precise predictions which can serve as a building block for Computer-Assisted Structure Elucidation for biological metabolites.

This page was last modified on 10.03.2014.
