

Publications

Schober, D., Jacob, D., Wilson, M., Cruz, J. A., Marcu, A., Grant, J. R., Moing, A., Deborde, C., de Figueiredo, L. F., Haug, K., Rocca-Serra, P., Easton, J., Ebbels, T. M. D., Hao, J., Ludwig, C., Günther, U. L., Rosato, A., Klein, M. S., Lewis, I. A., Luchinat, C., Jones, A. R., Grauslys, A., Larralde, M., Yokochi, M., Kobayashi, N., Porzel, A., Griffin, J. L., Viant, M. R., Wishart, D. S., Steinbeck, C., Salek, R. M. & Neumann, S. nmrML: A community supported open data standard for the description, storage, and exchange of NMR data. Anal Chem. 90, 649–656, (2018) DOI: 10.1021/acs.analchem.7b02795

NMR is a widely used analytical technique with a growing number of repositories available. As a result, demands for a vendor-agnostic, open data format for long-term archiving of NMR data have emerged with the aim to ease and encourage sharing, comparison, and reuse of NMR data. Here we present nmrML, an open XML-based exchange and storage format for NMR spectral data. The nmrML format is intended to be fully compatible with existing NMR data for chemical, biochemical, and metabolomics experiments. nmrML can capture raw NMR data, spectral data acquisition parameters, and where available spectral metadata, such as chemical structures associated with spectral assignments. The nmrML format is compatible with pure-compound NMR data for reference spectral libraries as well as NMR data from complex biomixtures, i.e., metabolomics experiments. To facilitate format conversions, we provide nmrML converters for Bruker, JEOL and Agilent/Varian vendor formats. In addition, easy-to-use Web-based spectral viewing, processing, and spectral assignment tools that read and write nmrML have been developed. Software libraries and Web services for data validation are available for tool developers and end-users. The nmrML format has already been adopted for capturing and disseminating NMR data for small molecules by several open source data processing tools and metabolomics reference spectral libraries, e.g., serving as storage format for the MetaboLights data repository. The nmrML open access data standard has been endorsed by the Metabolomics Standards Initiative (MSI), and we here encourage user participation and feedback to increase usability and make it a successful standard.
Printed publications

Döll, S., Kuhlmann, M., Rutten, T., Mette, M. F., Scharfenberg, S., Petridis, A., Berreth, D.-C. & Mock, H.-P. Accumulation of the coumarin scopolin under abiotic stress conditions is mediated by the Arabidopsis thaliana THO/TREX complex. Plant J. (2018) DOI: 10.1111/tpj.13797

Secondary metabolites are involved in the plant stress response. Among these are scopolin and its active form scopoletin, which are coumarin derivatives associated with reactive oxygen species scavenging and pathogen defence. Here we show that scopolin accumulation can be induced in the root by osmotic stress and in the leaf by low-temperature stress in Arabidopsis thaliana. A genetic screen for altered scopolin levels in A. thaliana revealed a mutant compromised in scopolin accumulation in response to stress; the lesion was present in a homologue of THO1 coding for a subunit of the THO/TREX complex. The THO/TREX complex contributes to RNA silencing, supposedly by trafficking precursors of small RNAs. Mutants defective in THO, AGO1, SDS3 and RDR6 were impaired with respect to scopolin accumulation in response to stress, suggesting a mechanism based on RNA silencing such as the trans-acting small interfering RNA pathway, which requires THO/TREX function.
Printed publications

Al Shweiki, M. H. D. R., Mönchgesang, S., Majovsky, P., Thieme, D., Trutschel, D. & Hoehenwarter, W. Assessment of label-free quantification in discovery proteomics and impact of technological factors and natural variability of protein abundance. J Proteome Res. 16, 1410–1424, (2017) DOI: 10.1021/acs.jproteome.6b00645

We evaluated the state of label-free discovery proteomics focusing especially on technological contributions and contributions of naturally occurring differences in protein abundance to the intersample variability in protein abundance estimates in this highly peptide-centric technology. First, the performance of popular quantitative proteomics software, Proteome Discoverer, Scaffold, MaxQuant, and Progenesis QIP, was benchmarked using their default parameters and some modified settings. Beyond this, the intersample variability in protein abundance estimates was decomposed into variability introduced by the entire technology itself and variable protein amounts inherent to individual plants of the Arabidopsis thaliana Col-0 accession. The technical component was considerably higher than the biological intersample variability, suggesting an effect on the degree and validity of reported biological changes in protein abundance. Surprisingly, the biological variability, protein abundance estimates, and protein fold changes were recorded differently by the software used to quantify the proteins, warranting caution in the comparison of discovery proteomics results. As expected, ∼99% of the proteome was invariant in the isogenic plants in the absence of environmental factors; however, few proteins showed substantial quantitative variability. This naturally occurring variation between individual organisms can have an impact on the causality of reported protein fold changes.

Publications

Meier, R., Ruttkies, C., Treutler, H. & Neumann, S. Bioinformatics can boost metabolomics research. J. Biotechnol. 261, 137–141, (2017) DOI: 10.1016/j.jbiotec.2017.05.018

Metabolomics is the modern term for the field of small molecule research in biology and biochemistry. Currently, metabolomics is undergoing a transition where the classic analytical chemistry is combined with modern cheminformatics and bioinformatics methods, paving the way for large-scale data analysis. We give some background on past developments, highlight current state-of-the-art approaches, and give a perspective on future requirements.
Publications

Witting, M., Ruttkies, C., Neumann, S. & Schmitt-Kopplin, P. LipidFrag: Improving reliability of in silico fragmentation of lipids and application to the Caenorhabditis elegans lipidome. PLoS ONE 12, e0172311, (2017) DOI: 10.1371/journal.pone.0172311

Lipid identification is a major bottleneck in high-throughput lipidomics studies. However, tools for the analysis of lipid tandem MS spectra are rather limited. While the comparison against spectra in reference libraries is one of the preferred methods, these libraries are far from being complete. In order to improve identification rates, the in silico fragmentation tool MetFrag was combined with Lipid Maps and lipid-class specific classifiers which calculate probabilities for lipid class assignments. The resulting LipidFrag workflow was trained and evaluated on different commercially available lipid standard materials, measured with data dependent UPLC-Q-ToF-MS/MS acquisition. The automatic analysis was compared against manual MS/MS spectra interpretation. With the lipid class specific models, identification of the true positives was improved especially for cases where candidate lipids from different lipid classes had similar MetFrag scores by removing up to 56% of false positive results. This LipidFrag approach was then applied to MS/MS spectra of lipid extracts of the nematode Caenorhabditis elegans. Fragments explained by LipidFrag match known fragmentation pathways, e.g., neutral losses of lipid headgroups and fatty acid side chain fragments. Based on prediction models trained on standard lipid materials, high probabilities for correct annotations were achieved, which makes LipidFrag a good choice for automated lipid data analysis and reliability testing of lipid identifications.
Publications

Schymanski, E. L., Ruttkies, C., Krauss, M., Brouard, C., Kind, T., Dührkop, K., Allen, F., Vaniya, A., Verdegem, D., Böcker, S., Rousu, J., Shen, H., Tsugawa, H., Sajed, T., Fiehn, O., Ghesquière, B. & Neumann, S. Critical assessment of small molecule identification 2016: automated methods. J. Cheminformatics 9, 22, (2017) DOI: 10.1186/s13321-017-0207-1

Background
The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest (www.casmi-contest.org) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for future contests and small molecule annotation/identification.

Results
The Input Output Kernel Regression (CSI:IOKR) machine learning approach performed best in “Category 2: Best Automatic Structural Identification—In Silico Fragmentation Only”, won by Team Brouard with 41% challenge wins. The winner of “Category 3: Best Automatic Structural Identification—Full Information” was Team Kind (MS-FINDER), with 76% challenge wins. The best methods were able to achieve over 30% Top 1 ranks in Category 2, with all methods ranking the correct candidate in the Top 10 in around 50% of challenges. This success rate rose to 70% Top 1 ranks in Category 3, with candidates in the Top 10 in over 80% of the challenges. The machine learning and chemistry-based approaches are shown to perform in complementary ways.

Conclusions
The improvement in (semi-)automated fragmentation methods for small molecule identification has been substantial. The achieved high rates of correct candidates in the Top 1 and Top 10, despite large candidate numbers, open up great possibilities for high-throughput annotation of untargeted analysis for “known unknowns”. As more high quality training data becomes available, the improvements in machine learning methods will likely continue, but the alternative approaches still provide valuable complementary information. Improved integration of experimental context will also improve identification success further for “real life” annotations. The true “unknown unknowns” remain to be evaluated in future CASMI contests.
Publications

Treutler, H., Tsugawa, H., Porzel, A., Gorzolka, K., Tissier, A., Neumann, S. & Balcke, G. U. Discovering regulated metabolite families in untargeted metabolomics studies. Anal Chem 88, 8082–8090, (2016) DOI: 10.1021/acs.analchem.6b01569

The identification of metabolites by mass spectrometry constitutes a major bottleneck which considerably limits the throughput of metabolomics studies in biomedical or plant research. Here, we present a novel approach to analyze metabolomics data from untargeted, data-independent LC-MS/MS measurements. By integrated analysis of MS1 abundances and MS/MS spectra, the identification of regulated metabolite families is achieved. This approach offers a global view on metabolic regulation in comparative metabolomics. We implemented our approach in the web application “MetFamily”, which is freely available at http://msbi.ipb-halle.de/MetFamily/. MetFamily provides a dynamic link between the patterns based on MS1-signal intensity and the corresponding structural similarity at the MS/MS level. Structurally related metabolites are annotated as metabolite families based on a hierarchical cluster analysis of measured MS/MS spectra. Joint examination with principal component analysis of MS1 patterns, where this annotation 

Publications

Treutler, H. & Neumann, S. Prediction, detection, and validation of isotope clusters in mass spectrometry data. Metabolites 6, 37, (2016) DOI: 10.3390/metabo6040037

Mass spectrometry is a key analytical platform for metabolomics. The precise quantification and identification of small molecules is a prerequisite for elucidating the metabolism and the detection, validation, and evaluation of isotope clusters in LC-MS data is important for this task. Here, we present an approach for the improved detection of isotope clusters using chemical prior knowledge and the validation of detected isotope clusters depending on the substance mass using database statistics. We find remarkable improvements regarding the number of detected isotope clusters and are able to predict the correct molecular formula in the top three ranks in 92% of the cases. We make our methodology freely available as part of the Bioconductor packages xcms version 1.50.0 and CAMERA version 1.30.0.

Publications

Rocca-Serra, P., Salek, R. M., Arita, M., Correa, E., Dayalan, S., Gonzalez-Beltran, A., Ebbels, T., Goodacre, R., Hastings, J., Haug, K., Koulman, A., Nikolski, M., Oresic, M., Sansone, S.-A., Schober, D., Smith, J., Steinbeck, C., Viant, M. R. & Neumann, S. Data standards can boost metabolomics research, and if there is a will, there is a way. Metabolomics 12, 14, (2016) DOI: 10.1007/s11306-015-0879-3

Thousands of articles using metabolomics approaches are published every year. With the increasing amounts of data being produced, mere description of investigations as text in manuscripts is not sufficient to enable re-use anymore: the underlying data needs to be published together with the findings in the literature to maximise the benefit from public and private expenditure and to take advantage of an enormous opportunity to improve scientific reproducibility in metabolomics and cognate disciplines. Reporting recommendations in metabolomics started to emerge about a decade ago and were mostly concerned with inventories of the information that had to be reported in the literature for consistency. In recent years, metabolomics data standards have developed extensively, to include the primary research data, derived results and the experimental description and importantly the metadata in a machine-readable way. This includes vendor independent data standards such as mzML for mass spectrometry and nmrML for NMR raw data that have both enabled the development of advanced data processing algorithms by the scientific community. Standards such as ISA-Tab cover essential metadata, including the experimental design, the applied protocols, association between samples, data files and the experimental factors for further statistical analysis. Altogether, they pave the way for both reproducible research and data reuse, including meta-analyses. Further incentives to prepare standards compliant data sets include new opportunities to publish data sets, but also require a little “arm twisting” in the author guidelines of scientific journals to submit the data sets to public repositories such as the NIH Metabolomics Workbench or MetaboLights at EMBL-EBI. In the present article, we look at standards for data sharing, investigate their impact in metabolomics and give suggestions to improve their adoption.

Publications

Brack, W., Ait-Aissa, S., Burgess, R. M., Busch, W., Creusot, N., Di Paolo, C., Escher, B. I., Hewitt, L. M., Hilscherova, K., Hollender, J., Hollert, H., Jonker, W., Kooli, J., Lamoree, M., Muschket, M., Neumann, S., Rostkowski, P., Ruttkies, C., Schollee, J., Schymanski, E. L., Schulze, T., Seiler, T.-B., Tindall, A. J., De Aragão Umbuzeiron, G., Vrana, B. & Krauss, M. Effect-directed analysis supporting monitoring of aquatic environments — An in-depth overview. Sci Total Environ 544, 1073–1118, (2016) DOI: 10.1016/j.scitotenv.2015.11.102

Aquatic environments are often contaminated with complex mixtures of chemicals that may pose a risk to ecosystems and human health. This contamination cannot be addressed with target analysis alone but tools are required to reduce this complexity and identify those chemicals that might cause adverse effects. Effect-directed analysis (EDA) is designed to meet this challenge and faces increasing interest in water and sediment quality monitoring. Thus, the present paper summarizes current experience with the EDA approach and the tools required, and provides practical advice on their application. The paper highlights the need for proper problem formulation and gives general advice for study design. As the EDA approach is directed by toxicity, basic principles for the selection of bioassays are given as well as a comprehensive compilation of appropriate assays, including their strengths and weaknesses. A specific focus is given to strategies for sampling, extraction and bioassay dosing since they strongly impact prioritization of toxicants in EDA. Reduction of sample complexity mainly relies on fractionation procedures, which are discussed in this paper, including quality assurance and quality control. Automated combinations of fractionation, biotesting and chemical analysis using so-called hyphenated tools can enhance the throughput and might reduce the risk of artifacts in laboratory work. The key to determining the chemical structures causing effects is analytical toxicant identification. The latest approaches, tools, software and databases for target-, suspect and non-target screening as well as unknown identification are discussed together with analytical and toxicological confirmation approaches. A better understanding of optimal use and combination of EDA tools will help to design efficient and successful toxicant identification studies in the context of quality monitoring in multiply stressed environments.
