Books and chapters

Schober, D., Salek, R. M. & Neumann, S. Towards standardized evidence descriptors for metabolite annotations. In: Proceedings of the 7th Workshop on Ontologies and Data in Life Sciences, organized by the GI Workgroup Ontologies in Biomedicine and Life Sciences (OBML), Halle (Saale), Germany, September 29-30, 2016. (Loebe, F.; Boeker, M.; Herre, H.; Jansen, L.; Schober, ). CEUR Workshop Proceedings 1692, E 5, (2016)

Motivation: Data on measured abundances of small molecules from biomaterial is currently accumulating in the literature and in online repositories. Unless formal machine-readable evidence assertions for such metabolite identifications are provided, quality assessment based re-use will be sparse. Existing annotation schemes are not universally adopted, nor granular enough to be of practical use in evidence-based quality assessment. Results: We review existing evidence schemes for metabolite identifications of variant semantic expressivity and derive requirements for a 'compliance-optimized' yet traceable annotation model. We present a pattern-based, yet simple taxonomy of intuitive and self-explaining descriptors that allow annotation of metabolomics assay results, both in the literature and in databases, with evidence information on small molecule analytics gained via technologies such as mass spectrometry or NMR. We present example annotations for typical mass spectrometry molecule assignments and outline next steps for integration with existing ontologies and metabolomics data exchange formats. Availability: An initial draft and documentation of the metabolite identification evidence code ontology is available at


Wohlgemuth, G., Mehta, S. S., Mejia, R. F., Neumann, S., Pedrosa, D., Pluskal, T., Schymanski, E. L., Willighagen, E. L., Wilson, M., Wishart, D. S., Arita, M., Dorrestein, P. C., Bandeira, N., Wang, M., Schulze, T., Salek, R. M., Steinbeck, C., Nainala, V. C., Mistrik, R., Nishioka, T. & Fiehn, O. SPLASH, a hashed identifier for mass spectra. Nat Biotech 34, 1099-1101, (2016) DOI: 10.1038/nbt.3689


Treutler, H. & Neumann, S. Prediction, detection, and validation of isotope clusters in mass spectrometry data. Metabolites 6, (2016) DOI: 10.3390/metabo6040037

Mass spectrometry is a key analytical platform for metabolomics. The precise quantification and identification of small molecules is a prerequisite for elucidating the metabolism, and the detection, validation, and evaluation of isotope clusters in LC-MS data is important for this task. Here, we present an approach for the improved detection of isotope clusters using chemical prior knowledge and the validation of detected isotope clusters depending on the substance mass using database statistics. We find remarkable improvements regarding the number of detected isotope clusters and are able to predict the correct molecular formula in the top three ranks in 92% of the cases. We make our methodology freely available as part of the Bioconductor packages xcms version 1.50.0 and CAMERA version 1.30.0.


Treutler, H., Tsugawa, H., Porzel, A., Gorzolka, K., Tissier, A., Neumann, S. & Balcke, G. U. Discovering regulated metabolite families in untargeted metabolomics studies. Anal Chem 88, 8082-8090, (2016) DOI: 10.1021/acs.analchem.6b01569

The identification of metabolites by mass spectrometry constitutes a major bottleneck which considerably limits the throughput of metabolomics studies in biomedical or plant research. Here, we present a novel approach to analyze metabolomics data from untargeted, data-independent LC-MS/MS measurements. By integrated analysis of MS1 abundances and MS/MS spectra, the identification of regulated metabolite families is achieved. This approach offers a global view on metabolic regulation in comparative metabolomics. We implemented our approach in the web application “MetFamily”, which is freely available at http://msbi.ipb-halle.de/MetFamily/. MetFamily provides a dynamic link between the patterns based on MS1-signal intensity and the corresponding structural similarity at the MS/MS level. Structurally related metabolites are annotated as metabolite families based on a hierarchical cluster analysis of measured MS/MS spectra. Joint examination with principal component analysis of MS1 patterns, where this annotation 


Mönchgesang, S., Strehmel, N., Trutschel, D., Westphal, L., Neumann, S. & Scheel, D. Plant-to-plant variability in root metabolite profiles of 19 Arabidopsis thaliana accessions is substance-class-dependent. Int J Mol Sci 17, (2016) DOI: 10.3390/ijms17091565

Natural variation of secondary metabolism between different accessions of Arabidopsis thaliana (A. thaliana) has been studied extensively. In this study, we extended the natural variation approach by including biological variability (plant-to-plant variability) and analysed root metabolic patterns as well as their variability between plants and naturally occurring accessions. To screen 19 accessions of A. thaliana, comprehensive non-targeted metabolite profiling of single plant root extracts was performed using ultra performance liquid chromatography/electrospray ionization quadrupole time-of-flight mass spectrometry (UPLC/ESI-QTOF-MS) and gas chromatography/electron ionization quadrupole mass spectrometry (GC/EI-QMS). Linear mixed models were applied to dissect the total observed variance. All metabolic profiles pointed towards a larger plant-to-plant variability than natural variation between accessions and variance of experimental batches. Ratios of plant-to-plant to total variability were high and distinct for certain secondary metabolites. None of the investigated accessions displayed a specifically high or low biological variability for these substance classes. This study provides recommendations for future natural variation analyses of glucosinolates, flavonoids, and phenylpropanoids and also reference data for additional substance classes.

Mönchgesang, S., Strehmel, N., Schmidt, S., Westphal, L., Taruttis, F., Müller, E., Herklotz, S., Neumann, S. & Scheel, D. Natural variation of root exudates in Arabidopsis thaliana - linking metabolomic and genomic data. Sci Rep 6, 29033, (2016) DOI: 10.1038/srep29033

Many metabolomics studies focus on aboveground parts of the plant, while metabolism within roots and the chemical composition of the rhizosphere, as influenced by exudation, are not deeply investigated. In this study, we analysed exudate metabolic patterns of Arabidopsis thaliana and their variation in genetically diverse accessions. For this project, we used the 19 parental accessions of the Arabidopsis MAGIC collection. Plants were grown in a hydroponic system, their exudates were harvested before bolting and subjected to UPLC/ESI-QTOF-MS analysis. Metabolite profiles were analysed together with the genome sequence information. Our study uncovered distinct metabolite profiles for root exudates of the 19 accessions. Hierarchical clustering revealed similarities in the exudate metabolite profiles, which were partly reflected by the genetic distances. An association of metabolite absence with nonsense mutations was detected for the biosynthetic pathways of an indolic glucosinolate hydrolysis product, a hydroxycinnamic acid amine and a flavonoid triglycoside. Consequently, a direct link between metabolic phenotype and genotype was detected without using segregating populations. Moreover, genomics can help to identify biosynthetic enzymes in metabolomics experiments. Our study elucidates the chemical composition of the rhizosphere and its natural variation in A. thaliana, which is important for the attraction and shaping of microbial communities.


Hoehenwarter, W., Mönchgesang, S., Neumann, S., Majovsky, P., Abel, S. & Müller, J. Comparative expression profiling reveals a role of the root apoplast in local phosphate response. BMC Plant Biol. 16, 106, (2016) DOI: 10.1186/s12870-016-0790-8

Plant adaptation to limited phosphate availability comprises a wide range of responses to conserve and remobilize internal phosphate sources and to enhance phosphate acquisition. Vigorous restructuring of root system architecture provides a developmental strategy for topsoil exploration and phosphate scavenging. Changes in external phosphate availability are locally sensed at root tips and adjust root growth by modulating cell expansion and cell division. The functionally interacting Arabidopsis genes, LOW PHOSPHATE RESPONSE 1 and 2 (LPR1/LPR2) and PHOSPHATE DEFICIENCY RESPONSE 2 (PDR2), are key components of root phosphate sensing. We recently demonstrated that the LOW PHOSPHATE RESPONSE 1 - PHOSPHATE DEFICIENCY RESPONSE 2 (LPR1-PDR2) module mediates apoplastic deposition of ferric iron (Fe3+) in the growing root tip during phosphate limitation. Iron deposition coincides with sites of reactive oxygen species generation and triggers cell wall thickening and callose accumulation, which interfere with cell-to-cell communication and inhibit root growth.


Vinaixa, M., Schymanski, E. L., Neumann, S., Navarro, M., Salek, R. M. & Yanes, O. Mass spectral databases for LC/MS and GC/MS-based metabolomics: state of the field and future prospects. Trends Analyt. Chem. 78, 23-35, (2016) DOI: 10.1016/j.trac.2015.09.005

Mass spectrometry-based metabolomics is now widely used to obtain new insights into human, plant and microbial biochemistry, drug and biomarker discovery, nutrition research and food control. Despite this great shared interest, identifying and characterizing the structure of metabolites has become a major bottleneck for converting raw mass spectrometric data into biological knowledge. In this regard, comprehensive and well-annotated MS-based spectral databases play a key role towards converting raw spectral data into metabolite annotations and thus biological knowledge. The main characteristics of the mass spectral databases currently used in MS-based metabolomics are reviewed in this paper, underlining the advantages and limitations of each. Extending this, the overlap of compounds with MSn (n≥2) spectra from authentic chemical standards in most public and commercial databases has been calculated for the first time. Finally, future prospects for mass spectral databases are discussed in terms of the needs posed by novel applications and instrumental advancements.


Nettling, M., Treutler, H., Cerquides, J. & Grosse, I. Detecting and correcting the binding-affinity bias in ChIP-seq data using inter-species information. BMC Genomics 17:347, (2016) DOI: 10.1186/s12864-016-2682-6


Transcriptional gene regulation is a fundamental process in nature, and the experimental and computational investigation of DNA binding motifs and their binding sites is a prerequisite for elucidating this process. ChIP-seq has become the major technology to uncover genomic regions containing those binding sites, but motifs predicted by traditional computational approaches using these data are distorted by a ubiquitous binding-affinity bias. Here, we present an approach for detecting and correcting this bias using inter-species information.


We find that the binding-affinity bias caused by the ChIP-seq experiment in the reference species is stronger than the indirect binding-affinity bias in orthologous regions from phylogenetically related species. We use this difference to develop a phylogenetic footprinting model that is capable of detecting and correcting the binding-affinity bias. We find that this model improves motif prediction and that the corrected motifs are typically softer than those predicted by traditional approaches.


These findings indicate that motifs published in databases and in the literature are artificially sharpened compared to the native motifs. These findings also indicate that our current understanding of transcriptional gene regulation might be blurred, but that it is possible to advance this understanding by taking into account inter-species information available today and even more in the future.


Rocca-Serra, P., Salek, R. M., Arita, M., Correa, E., Dayalan, S., Gonzalez-Beltran, A., Ebbels, T., Goodacre, R., Hastings, J., Haug, K., Koulman, A., Nikolski, M., Oresic, M., Sansone, S.-A., Schober, D., Smith, J., Steinbeck, C., Viant, M. R. & Neumann, S. Data standards can boost metabolomics research, and if there is a will, there is a way. Metabolomics 12:14, (2016) DOI: 10.1007/s11306-015-0879-3

Thousands of articles using metabolomics approaches are published every year. With the increasing amounts of data being produced, mere description of investigations as text in manuscripts is not sufficient to enable re-use anymore: the underlying data needs to be published together with the findings in the literature to maximise the benefit from public and private expenditure and to take advantage of an enormous opportunity to improve scientific reproducibility in metabolomics and cognate disciplines. Reporting recommendations in metabolomics started to emerge about a decade ago and were mostly concerned with inventories of the information that had to be reported in the literature for consistency. In recent years, metabolomics data standards have developed extensively, to include the primary research data, derived results and the experimental description and importantly the metadata in a machine-readable way. This includes vendor independent data standards such as mzML for mass spectrometry and nmrML for NMR raw data that have both enabled the development of advanced data processing algorithms by the scientific community. Standards such as ISA-Tab cover essential metadata, including the experimental design, the applied protocols, association between samples, data files and the experimental factors for further statistical analysis. Altogether, they pave the way for both reproducible research and data reuse, including meta-analyses. Further incentives to prepare standards compliant data sets include new opportunities to publish data sets, but also require a little “arm twisting” in the author guidelines of scientific journals to submit the data sets to public repositories such as the NIH Metabolomics Workbench or MetaboLights at EMBL-EBI. In the present article, we look at standards for data sharing, investigate their impact in metabolomics and give suggestions to improve their adoption.

