



Witting, M., Ruttkies, C., Neumann, S. & Schmitt-Kopplin, P. LipidFrag: Improving reliability of in silico fragmentation of lipids and application to the Caenorhabditis elegans lipidome. PLoS ONE 12, e0172311, (2017) DOI: 10.1371/journal.pone.0172311

Lipid identification is a major bottleneck in high-throughput lipidomics studies. However, tools for the analysis of lipid tandem MS spectra are rather limited. While the comparison against spectra in reference libraries is one of the preferred methods, these libraries are far from being complete. In order to improve identification rates, the in silico fragmentation tool MetFrag was combined with Lipid Maps and lipid-class specific classifiers which calculate probabilities for lipid class assignments. The resulting LipidFrag workflow was trained and evaluated on different commercially available lipid standard materials, measured with data dependent UPLC-Q-ToF-MS/MS acquisition. The automatic analysis was compared against manual MS/MS spectra interpretation. With the lipid class specific models, identification of the true positives was improved especially for cases where candidate lipids from different lipid classes had similar MetFrag scores by removing up to 56% of false positive results. This LipidFrag approach was then applied to MS/MS spectra of lipid extracts of the nematode Caenorhabditis elegans. Fragments explained by LipidFrag match known fragmentation pathways, e.g., neutral losses of lipid headgroups and fatty acid side chain fragments. Based on prediction models trained on standard lipid materials, high probabilities for correct annotations were achieved, which makes LipidFrag a good choice for automated lipid data analysis and reliability testing of lipid identifications.
Publications in press

Meier, R., Ruttkies, C., Treutler, H. & Neumann, S. Bioinformatics can boost metabolomics research. J. Biotechnol. (2017) DOI: 10.1016/j.jbiotec.2017.05.018

Metabolomics is the modern term for the field of small molecule research in biology and biochemistry. Currently, metabolomics is undergoing a transition where the classic analytical chemistry is combined with modern cheminformatics and bioinformatics methods, paving the way for large-scale data analysis. We give some background on past developments, highlight current state-of-the-art approaches, and give a perspective on future requirements.

Schymanski, E. L., Ruttkies, C., Krauss, M., Brouard, C., Kind, T., Dührkop, K., Allen, F., Vaniya, A., Verdegem, D., Böcker, S., Rousu, J., Shen, H., Tsugawa, H., Sajed, T., Fiehn, O., Ghesquière, B. & Neumann, S. Critical assessment of small molecule identification 2016: automated methods. J. Cheminformatics 9, 22, (2017) DOI: 10.1186/s13321-017-0207-1

The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest (www.casmi-contest.org) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for future contests and small molecule annotation/identification.

The Input Output Kernel Regression (CSI:IOKR) machine learning approach performed best in “Category 2: Best Automatic Structural Identification—In Silico Fragmentation Only”, won by Team Brouard with 41% challenge wins. The winner of “Category 3: Best Automatic Structural Identification—Full Information” was Team Kind (MS-FINDER), with 76% challenge wins. The best methods were able to achieve over 30% Top 1 ranks in Category 2, with all methods ranking the correct candidate in the Top 10 in around 50% of challenges. This success rate rose to 70% Top 1 ranks in Category 3, with candidates in the Top 10 in over 80% of the challenges. The machine learning and chemistry-based approaches are shown to perform in complementary ways.

The improvement in (semi-)automated fragmentation methods for small molecule identification has been substantial. The achieved high rates of correct candidates in the Top 1 and Top 10, despite large candidate numbers, open up great possibilities for high-throughput annotation of untargeted analysis for “known unknowns”. As more high quality training data becomes available, the improvements in machine learning methods will likely continue, but the alternative approaches still provide valuable complementary information. Improved integration of experimental context will also improve identification success further for “real life” annotations. The true “unknown unknowns” remain to be evaluated in future CASMI contests.
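The Top 1 and Top 10 rates quoted in this abstract can be read as the fraction of challenges in which the correct candidate appears at or above a given rank. The following is a minimal sketch of that calculation, not the contest's own evaluation code; the function name `top_k_rate` and the example ranks are hypothetical and made up for illustration.

```python
from typing import Iterable, Optional


def top_k_rate(ranks: Iterable[Optional[int]], k: int) -> float:
    """Fraction of challenges whose correct candidate is ranked within the top k.

    A rank of None means the correct candidate was missing from the candidate list.
    """
    ranks = list(ranks)
    if not ranks:
        raise ValueError("need at least one challenge")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Hypothetical ranks of the correct candidate in five challenges (made-up data).
example_ranks = [1, 3, 12, None, 1]
top1 = top_k_rate(example_ranks, 1)
top10 = top_k_rate(example_ranks, 10)
print(f"Top 1: {top1:.0%}, Top 10: {top10:.0%}")
```

Counting a missing candidate as a miss is an assumption of this sketch; other conventions, such as excluding those challenges, would give different rates.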
Publications in press

Al Shweiki, M. H. D. R., Mönchgesang, S., Majovsky, P., Thieme, D., Trutschel, D. & Hoehenwarter, W. Assessment of Label-Free quantification in discovery proteomics and impact of technological factors and natural variability of protein abundance. J Proteome Res. (2017) DOI: 10.1021/acs.jproteome.6b00645

We evaluated the state of label-free discovery proteomics focusing especially on technological contributions and contributions of naturally occurring differences in protein abundance to the intersample variability in protein abundance estimates in this highly peptide-centric technology. First, the performance of popular quantitative proteomics software, Proteome Discoverer, Scaffold, MaxQuant, and Progenesis QIP, was benchmarked using their default parameters and some modified settings. Beyond this, the intersample variability in protein abundance estimates was decomposed into variability introduced by the entire technology itself and variable protein amounts inherent to individual plants of the Arabidopsis thaliana Col-0 accession. The technical component was considerably higher than the biological intersample variability, suggesting an effect on the degree and validity of reported biological changes in protein abundance. Surprisingly, the biological variability, protein abundance estimates, and protein fold changes were recorded differently by the software used to quantify the proteins, warranting caution in the comparison of discovery proteomics results. As expected, ∼99% of the proteome was invariant in the isogenic plants in the absence of environmental factors; however, few proteins showed substantial quantitative variability. This naturally occurring variation between individual organisms can have an impact on the causality of reported protein fold changes.


Mönchgesang, S., Strehmel, N., Trutschel, D., Westphal, L., Neumann, S. & Scheel, D. Plant-to-plant variability in root metabolite profiles of 19 Arabidopsis thaliana accessions is substance-class-dependent. Int J Mol Sci 17, (2016) DOI: 10.3390/ijms17091565

Natural variation of secondary metabolism between different accessions of Arabidopsis thaliana (A. thaliana) has been studied extensively. In this study, we extended the natural variation approach by including biological variability (plant-to-plant variability) and analysed root metabolic patterns as well as their variability between plants and naturally occurring accessions. To screen 19 accessions of A. thaliana, comprehensive non-targeted metabolite profiling of single plant root extracts was performed using ultra performance liquid chromatography/electrospray ionization quadrupole time-of-flight mass spectrometry (UPLC/ESI-QTOF-MS) and gas chromatography/electron ionization quadrupole mass spectrometry (GC/EI-QMS). Linear mixed models were applied to dissect the total observed variance. All metabolic profiles pointed towards a larger plant-to-plant variability than natural variation between accessions and variance of experimental batches. Ratios of plant-to-plant to total variability were high and distinct for certain secondary metabolites. None of the investigated accessions displayed a specifically high or low biological variability for these substance classes. This study provides recommendations for future natural variation analyses of glucosinolates, flavonoids, and phenylpropanoids and also reference data for additional substance classes.


Treutler, H., Tsugawa, H., Porzel, A., Gorzolka, K., Tissier, A., Neumann, S. & Balcke, G. U. Discovering regulated metabolite families in untargeted metabolomics studies. Anal Chem 88, 8082-8090, (2016) DOI: 10.1021/acs.analchem.6b01569

The identification of metabolites by mass spectrometry constitutes a major bottleneck which considerably limits the throughput of metabolomics studies in biomedical or plant research. Here, we present a novel approach to analyze metabolomics data from untargeted, data-independent LC-MS/MS measurements. By integrated analysis of MS1 abundances and MS/MS spectra, the identification of regulated metabolite families is achieved. This approach offers a global view on metabolic regulation in comparative metabolomics. We implemented our approach in the web application “MetFamily”, which is freely available at http://msbi.ipb-halle.de/MetFamily/. MetFamily provides a dynamic link between the patterns based on MS1-signal intensity and the corresponding structural similarity at the MS/MS level. Structurally related metabolites are annotated as metabolite families based on a hierarchical cluster analysis of measured MS/MS spectra. Joint examination with principal component analysis of MS1 patterns, where this annotation 

Books and book chapters

Schober, D., Salek, R. M. & Neumann, S. Towards standardized evidence descriptors for metabolite annotations. In: Proceedings of the 7th Workshop on Ontologies and Data in Life Sciences, organized by the GI Workgroup Ontologies in Biomedicine and Life Sciences (OBML), Halle (Saale), Germany, September 29-30, 2016. (Loebe, F.; Boeker, M.; Herre, H.; Jansen, L.; Schober, ). CEUR Workshop Proceedings 1692, E 5, (2016)

Motivation: Data on measured abundances of small molecules from biomaterial is currently accumulating in the literature and in online repositories. Unless formal machine-readable evidence assertions for such metabolite identifications are provided, quality assessment based re-use will be sparse. Existing annotation schemes are not universally adopted, nor granular enough to be of practical use in evidence-based quality assessment. Results: We review existing evidence schemes for metabolite identifications of variant semantic expressivity and derive requirements for a 'compliance-optimized' yet traceable annotation model. We present a pattern-based, yet simple taxonomy of intuitive and self-explaining descriptors that allow annotating metabolomics assay results both in literature and databases with evidence information on small molecule analytics gained via technologies such as mass spectrometry or NMR. We present example annotations for typical mass spectrometry molecule assignments and outline next steps for integration with existing ontologies and metabolomics data exchange formats. Availability: An initial draft and documentation of the metabolite identification evidence code ontology is available at


Wohlgemuth, G., Mehta, S. S., Mejia, R. F., Neumann, S., Pedrosa, D., Pluskal, T., Schymanski, E. L., Willighagen, E. L., Wilson, M., Wishart, D. S., Arita, M., Dorrestein, P. C., Bandeira, N., Wang, M., Schulze, T., Salek, R. M., Steinbeck, C., Nainala, V. C., Mistrik, R., Nishioka, T. & Fiehn, O. SPLASH, a hashed identifier for mass spectra. Nat Biotech 34, 1099-1101, (2016) DOI: 10.1038/nbt.3689


Treutler, H. & Neumann, S. Prediction, detection, and validation of isotope clusters in mass spectrometry data. Metabolites 6, 37, (2016) DOI: 10.3390/metabo6040037

Mass spectrometry is a key analytical platform for metabolomics. The precise quantification and identification of small molecules is a prerequisite for elucidating the metabolism and the detection, validation, and evaluation of isotope clusters in LC-MS data is important for this task. Here, we present an approach for the improved detection of isotope clusters using chemical prior knowledge and the validation of detected isotope clusters depending on the substance mass using database statistics. We find remarkable improvements regarding the number of detected isotope clusters and are able to predict the correct molecular formula in the top three ranks in 92% of the cases. We make our methodology freely available as part of the Bioconductor packages xcms version 1.50.0 and CAMERA version 1.30.0.
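The basic idea behind an isotope cluster can be sketched as grouping peaks whose m/z values are spaced by the 13C/12C mass difference (about 1.00335 Da) divided by the charge. The sketch below is only a simplified illustration under that assumption; it is not the method implemented in xcms or CAMERA, which according to the abstract additionally uses chemical prior knowledge and database statistics. The function name, the tolerance, and the peak list are hypothetical.

```python
C13_C12_DELTA = 1.0033548  # mass difference between 13C and 12C in Da


def group_isotope_clusters(mzs, charge=1, tol=0.005):
    """Greedily chain peaks spaced by the 13C-12C difference divided by the charge.

    Intensities are ignored; tol is an assumed absolute m/z tolerance.
    """
    spacing = C13_C12_DELTA / charge
    peaks = sorted(mzs)
    used = set()
    clusters = []
    for i, mz in enumerate(peaks):
        if i in used:
            continue
        cluster = [mz]
        used.add(i)
        last = mz
        for j in range(i + 1, len(peaks)):
            if j in used:
                continue
            diff = peaks[j] - last
            if abs(diff - spacing) <= tol:
                # Peak sits one isotope step above the previous cluster member.
                cluster.append(peaks[j])
                used.add(j)
                last = peaks[j]
            elif diff > spacing + tol:
                # Peaks are sorted, so nothing further can extend this cluster.
                break
        clusters.append(cluster)
    return clusters


# Hypothetical singly charged peak list (made-up m/z values).
peaks = [180.0634, 181.0668, 182.0701, 250.1000]
clusters = group_isotope_clusters(peaks)
print(clusters)
```

A spacing-only grouping like this accepts any peaks that happen to lie one isotope step apart, which is why validating candidate clusters against expected isotope intensity ratios, as the abstract describes, matters in practice.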


Rocca-Serra, P., Salek, R. M., Arita, M., Correa, E., Dayalan, S., Gonzalez-Beltran, A., Ebbels, T., Goodacre, R., Hastings, J., Haug, K., Koulman, A., Nikolski, M., Oresic, M., Sansone, S.-A., Schober, D., Smith, J., Steinbeck, C., Viant, M. R. & Neumann, S. Data standards can boost metabolomics research, and if there is a will, there is a way. Metabolomics 12, 14, (2016) DOI: 10.1007/s11306-015-0879-3

Thousands of articles using metabolomics approaches are published every year. With the increasing amounts of data being produced, mere description of investigations as text in manuscripts is not sufficient to enable re-use anymore: the underlying data needs to be published together with the findings in the literature to maximise the benefit from public and private expenditure and to take advantage of an enormous opportunity to improve scientific reproducibility in metabolomics and cognate disciplines. Reporting recommendations in metabolomics started to emerge about a decade ago and were mostly concerned with inventories of the information that had to be reported in the literature for consistency. In recent years, metabolomics data standards have developed extensively, to include the primary research data, derived results and the experimental description and importantly the metadata in a machine-readable way. This includes vendor independent data standards such as mzML for mass spectrometry and nmrML for NMR raw data that have both enabled the development of advanced data processing algorithms by the scientific community. Standards such as ISA-Tab cover essential metadata, including the experimental design, the applied protocols, association between samples, data files and the experimental factors for further statistical analysis. Altogether, they pave the way for both reproducible research and data reuse, including meta-analyses. Further incentives to prepare standards compliant data sets include new opportunities to publish data sets, but also require a little “arm twisting” in the author guidelines of scientific journals to submit the data sets to public repositories such as the NIH Metabolomics Workbench or MetaboLights at EMBL-EBI. In the present article, we look at standards for data sharing, investigate their impact in metabolomics and give suggestions to improve their adoption.
