Publications
Mass spectral libraries are collections of reference spectra, usually associated with specific analytes from which the spectra were generated, that are used for further downstream analysis of new spectra. There are many different formats used for encoding spectral libraries, but none have undergone a standardization process to ensure broad applicability to many applications. As part of the Human Proteome Organization Proteomics Standards Initiative (PSI), we have developed a standardized format for encoding spectral libraries, called mzSpecLib (https://psidev.info/mzSpecLib). It is primarily a data model that flexibly encodes metadata about the library entries using the extensible PSI-MS controlled vocabulary and can be encoded in and converted between different serialization formats. We have also developed a standardized data model and serialization for fragment ion peak annotations, called mzPAF (https://psidev.info/mzPAF). It is defined as a separate standard, since it may be used for other applications besides spectral libraries. The mzSpecLib and mzPAF standards are compatible with existing PSI standards such as ProForma 2.0 and the Universal Spectrum Identifier. The mzSpecLib and mzPAF standards have been primarily defined for peptides in proteomics applications with basic small molecule support. They could be extended in the future to other fields that need to encode spectral libraries for nonpeptidic analytes.
Mass spectrometry (MS) is one of the primary techniques used for large-scale analysis of small molecules in metabolomics studies. To date, there has been little data format standardization in this field, as different software packages export results in different formats represented in XML or plain text, making data sharing, database deposition, and reanalysis highly challenging. Working within the consortia of the Metabolomics Standards Initiative, Proteomics Standards Initiative, and the Metabolomics Society, we have created mzTab-M to act as a common output format from analytical approaches using MS on small molecules. The format has been developed over several years, with input from a wide range of stakeholders. mzTab-M is a simple tab-separated text format, but importantly, the structure is highly standardized through the design of a detailed specification document, tightly coupled to validation software, and a mandatory controlled vocabulary of terms to populate it. The format is able to represent final quantification values from analyses, as well as the evidence trail in terms of features measured directly from MS (e.g., LC-MS, GC-MS, DIMS, etc.) and different types of approaches used to identify molecules. mzTab-M allows for ambiguity in the identification of molecules to be communicated clearly to readers of the files (both people and software). There are several implementations of the format available, and we anticipate widespread adoption in the field.
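Because mzTab-M is tab-separated and each line starts with a short section code, a reader can be sketched in a few lines. The following is a minimal sketch, not the reference implementation or a validator: it assumes the line prefixes used by the mzTab family (`MTD` for metadata, `SMH`/`SML` for the small molecule summary header and rows, `COM` for comments), and the tiny example fragment, including its column selection, is illustrative only and far from a complete, valid file.

```python
import csv
import io
from collections import defaultdict


def split_mztab_sections(text):
    """Group the rows of an mzTab-style text by their line prefix."""
    sections = defaultdict(list)
    # QUOTE_NONE: treat quote characters as ordinary data, split on tabs only.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or not row[0]:
            continue  # skip blank lines
        prefix, fields = row[0], row[1:]
        if prefix == "COM":
            continue  # comment lines carry no data
        sections[prefix].append(fields)
    return dict(sections)


def table_as_dicts(sections, header_prefix, row_prefix):
    """Pair a header line (e.g. SMH) with its data rows (e.g. SML)."""
    header = sections[header_prefix][0]
    return [dict(zip(header, row)) for row in sections.get(row_prefix, [])]


# Illustrative fragment only (assumed columns, not a complete file).
EXAMPLE = "\n".join([
    "COM\tillustrative fragment, not a valid complete file",
    "MTD\tmzTab-version\t2.0.0-M",
    "MTD\tmzTab-ID\texample-1",
    "SMH\tSML_ID\tchemical_formula\tabundance_assay[1]",
    "SML\t1\tC6H12O6\t1500.5",
    "SML\t2\tC16H32O2\t320.0",
])

sections = split_mztab_sections(EXAMPLE)
small_molecules = table_as_dicts(sections, "SMH", "SML")
```

Real files should be checked against the specification with the validation software mentioned above; this sketch performs no validation of required fields or controlled vocabulary terms.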
Metabolic fingerprinting is a powerful analytical technique, giving access to high-throughput identification and relative quantification of multiple metabolites. Because of short analysis times, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) is the preferred instrumental platform for fingerprinting, although its power in analysis of free fatty acids (FFAs) is limited. However, these metabolites are biomarkers of human pathologies and indicators of food quality. Hence, a high-throughput method for their fingerprinting is required. Therefore, here we propose a MALDI-TOF-MS method for identification and relative quantification of FFAs in biological samples of different origins. Our approach relies on formation of monomolecular Langmuir films (LFs) at the interface of aqueous barium acetate solution, supplemented with low amounts of 2,5-dihydroxybenzoic acid, and hexane extracts of biological samples. This resulted in detection limits of 10⁻¹³ to 10⁻¹⁴ mol and an overall method linear dynamic range of at least 4 orders of magnitude, with accuracy and precision within 2 and 17%, respectively. The method precision was verified with eight sample series of different taxonomies, which indicates a universal applicability of our approach. Thereby, 31 and 22 FFA signals were annotated by exact mass and identified by tandem MS, respectively. Among 20 FFAs identified in Fucus algae, 14 could be confirmed by gas chromatography-mass spectrometry.
NMR is a widely used analytical technique with a growing number of repositories available. As a result, demands for a vendor-agnostic, open data format for long-term archiving of NMR data have emerged with the aim to ease and encourage sharing, comparison, and reuse of NMR data. Here we present nmrML, an open XML-based exchange and storage format for NMR spectral data. The nmrML format is intended to be fully compatible with existing NMR data for chemical, biochemical, and metabolomics experiments. nmrML can capture raw NMR data, spectral data acquisition parameters, and where available spectral metadata, such as chemical structures associated with spectral assignments. The nmrML format is compatible with pure-compound NMR data for reference spectral libraries as well as NMR data from complex biomixtures, i.e., metabolomics experiments. To facilitate format conversions, we provide nmrML converters for Bruker, JEOL and Agilent/Varian vendor formats. In addition, easy-to-use Web-based spectral viewing, processing, and spectral assignment tools that read and write nmrML have been developed. Software libraries and Web services for data validation are available for tool developers and end-users. The nmrML format has already been adopted for capturing and disseminating NMR data for small molecules by several open source data processing tools and metabolomics reference spectral libraries, e.g., serving as storage format for the MetaboLights data repository. The nmrML open access data standard has been endorsed by the Metabolomics Standards Initiative (MSI), and we here encourage user participation and feedback to increase usability and make it a successful standard.
The identification of metabolites by mass spectrometry constitutes a major bottleneck which considerably limits the throughput of metabolomics studies in biomedical or plant research. Here, we present a novel approach to analyze metabolomics data from untargeted, data-independent LC-MS/MS measurements. By integrated analysis of MS1 abundances and MS/MS spectra, the identification of regulated metabolite families is achieved. This approach offers a global view on metabolic regulation in comparative metabolomics. We implemented our approach in the web application “MetFamily”, which is freely available at http://msbi.ipb-halle.de/MetFamily/. MetFamily provides a dynamic link between the patterns based on MS1-signal intensity and the corresponding structural similarity at the MS/MS level. Structurally related metabolites are annotated as metabolite families based on a hierarchical cluster analysis of measured MS/MS spectra. Joint examination with principal component analysis of MS1 patterns, where this annotation is preserved in the loadings, facilitates the interpretation of comparative metabolomics data at the level of metabolite families. As a proof of concept, we identified two trichome-specific metabolite families from wild-type tomato Solanum habrochaites LA1777 in a fully unsupervised manner and validated our findings based on earlier publications and with NMR.
Demands in research investigating small molecules by applying untargeted approaches have been a key motivator for the development of repositories for mass spectrometry spectra and automated tools to aid compound identification. Comparatively little attention has been afforded to using retention times (RTs) to distinguish compounds, and for liquid chromatography there are currently no coordinated efforts to share and exploit RT information. We therefore present PredRet, the first tool that makes community sharing of RT information possible across laboratories and chromatographic systems (CSs). At http://predret.org, a database of RTs from different CSs is available, and users can upload their own experimental RTs and download predicted RTs for compounds that they have not determined experimentally themselves. For each possible pair of CSs in the database, the RTs are used to construct a projection model between the RTs in the two CSs. The number of compounds for which RTs can be predicted and the accuracy of the predictions depend on the compound coverage overlap between the CSs used for construction of projection models. At the moment, it is possible to predict up to 400 RTs with a median error between 0.01 and 0.28 min depending on the CS, and with the median width of the prediction interval ranging from 0.08 to 1.86 min. By comparing experimental and predicted RTs, the user can thus prioritize which isomers to target for further characterization and potentially exclude some structures completely. As the database grows, the number and accuracy of predictions will increase.
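The idea of a projection model between two chromatographic systems can be illustrated with a deliberately simplified stand-in. The sketch below uses plain piecewise-linear interpolation through the compounds measured in both systems; this is an assumption made for illustration and is not the model PredRet fits, and it produces no prediction intervals. All compound names and retention times are hypothetical.

```python
from bisect import bisect_right


def build_projection(rt_a, rt_b):
    """Return a function mapping an RT in system A to an estimated RT in system B.

    rt_a, rt_b: dicts of compound name -> retention time in minutes.
    Only compounds present in both systems anchor the mapping.
    """
    shared = sorted(set(rt_a) & set(rt_b), key=lambda c: rt_a[c])
    if len(shared) < 2:
        raise ValueError("need at least two shared compounds")
    xs = [rt_a[c] for c in shared]
    ys = [rt_b[c] for c in shared]
    if any(x1 <= x0 for x0, x1 in zip(xs, xs[1:])):
        raise ValueError("anchor RTs in system A must be strictly increasing")

    def project(rt):
        # Refuse to extrapolate outside the range covered by shared compounds.
        if rt < xs[0] or rt > xs[-1]:
            return None
        i = min(bisect_right(xs, rt), len(xs) - 1)
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (rt - x0) / (x1 - x0)

    return project


# Hypothetical retention times (minutes) in two systems.
system_a = {"caffeine": 2.0, "tryptophan": 4.0, "quercetin": 8.0, "unknown_1": 5.0}
system_b = {"caffeine": 1.0, "tryptophan": 3.0, "quercetin": 5.0}

project = build_projection(system_a, system_b)
predicted = project(system_a["unknown_1"])  # estimate for a compound not measured in B
```

The sketch also shows why coverage overlap matters: with few shared compounds the anchors are sparse, and nothing can be predicted outside the RT range they span.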
Metabolomic data are frequently acquired using chromatographically coupled mass spectrometry (MS) platforms. For such datasets, the first step in data analysis relies on feature detection, where a feature is defined by a mass and retention time. While a feature typically is derived from a single compound, a spectrum of mass signals is a more accurate representation of the mass spectrometric signal for a given metabolite. Here, we report a novel feature grouping method that operates in an unsupervised manner to group signals from MS data into spectra without relying on predictability of in-source phenomena. We additionally address a fundamental bottleneck in metabolomics, annotation of MS level signals, by incorporating indiscriminant MS/MS (idMS/MS) data implicitly: feature detection is performed on both MS and idMS/MS data, and feature–feature relationships are determined simultaneously from the MS and idMS/MS data. This approach facilitates identification of metabolites using in-source MS and/or idMS/MS spectra from a single experiment, reduces quantitative analytical variation compared to single-feature measures, and decreases false positive annotations of unpredictable phenomena as novel compounds. This tool is released as a freely available R package, called RAMClustR, and is sufficiently versatile to group features from any chromatographic-spectrometric platform or feature-finding software.
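Unsupervised feature grouping of this kind can be sketched as follows. This is not RAMClustR's code (the package itself is written in R): the Gaussian-style similarity score, the `sigma_t`, `sigma_r`, and `min_similarity` values, and the single-linkage grouping with union-find are all assumptions chosen for illustration. The point is only that features which co-elute and co-vary across samples end up in one spectrum, with no adduct or fragment rules involved.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def similarity(f, g, sigma_t=0.05, sigma_r=0.5):
    """Score in [0, 1]: high only if two features co-elute AND co-vary."""
    rt_term = math.exp(-((f["rt"] - g["rt"]) ** 2) / (2 * sigma_t ** 2))
    r = pearson(f["intensities"], g["intensities"])
    cor_term = math.exp(-((1 - r) ** 2) / (2 * sigma_r ** 2))
    return rt_term * cor_term


def group_features(features, min_similarity=0.5):
    """Single-linkage grouping of feature indices via union-find."""
    parent = list(range(len(features)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if similarity(features[i], features[j]) >= min_similarity:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(features)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical features: retention time (min) and intensities across three samples.
features = [
    {"mz": 181.07, "rt": 5.00, "intensities": [100, 200, 300]},
    {"mz": 203.05, "rt": 5.01, "intensities": [10, 20, 30]},     # co-elutes, co-varies
    {"mz": 250.10, "rt": 5.00, "intensities": [300, 200, 100]},  # co-elutes, anti-correlated
    {"mz": 181.07, "rt": 7.00, "intensities": [100, 200, 300]},  # co-varies, elutes later
]
spectra = group_features(features)
```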
Lectin binding has been studied using the particle plasmon light-scattering properties of gold nanoparticles printed into an array format. Performance of the kinetic assay is evaluated from a detailed analysis of the binding of concanavalin A (ConA) and wheat germ agglutinin (WGA) to their target monosaccharides, indicating affinity constants on the order of KD ∼10 nM for the lectin-monosaccharide interaction. The detection limits for the lectins following a 200 s injection time were determined as 10 ng/mL or 0.23 nM and 100 ng/mL or 0.93 nM, respectively. Subsequently, a nine-lectin screen was performed on the porcine and human fibrinogen glycoproteins. The observed spectra of lectin-protein specific binding rates result in characteristic patterns that evidently correlate with the structure of the glycans and allow one to distinguish between glycosylation of the porcine and human fibrinogens. The array technology has the potential to perform a multilectin screen of large numbers of proteins, providing information on protein glycosylation and its microheterogeneity.
Liquid chromatography coupled to mass spectrometry is routinely used for metabolomics experiments. In contrast to the fairly routine and automated data acquisition steps, subsequent compound annotation and identification require extensive manual analysis and thus form a major bottleneck in data interpretation. Here we present CAMERA, a Bioconductor package integrating algorithms to extract compound spectra, annotate isotope and adduct peaks, and propose the accurate compound mass even in highly complex data. To evaluate the algorithms, we compared the annotation of CAMERA against a manually defined annotation for a mixture of known compounds spiked into a complex matrix at different concentrations. CAMERA successfully extracted accurate masses for 89.7% and 90.3% of the annotatable compounds in positive and negative ion modes, respectively. Furthermore, we present a novel annotation approach that combines spectral information of data acquired in opposite ion modes to further improve the annotation rate. We demonstrate the utility of CAMERA in two different, easily adoptable plant metabolomics experiments, where the application of CAMERA drastically reduced the amount of manual analysis.
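How adduct annotation leads to a proposed compound mass can be sketched with a toy example. This is not CAMERA's algorithm or API (CAMERA is an R/Bioconductor package with its own rule tables and scoring); it is a minimal sketch assuming singly charged positive-mode adducts of a single molecule, with commonly tabulated mass shifts, a hypothetical tolerance, and a hypothetical peak list. Two peaks are annotated when different adduct hypotheses imply the same neutral mass.

```python
from itertools import combinations

# Mass shift (Da) added to the neutral monoisotopic mass M for singly charged
# positive-mode adducts. Commonly tabulated values; treat as assumptions here.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def propose_neutral_masses(mzs, tol=0.005):
    """Find peak pairs whose adduct hypotheses agree on one neutral mass.

    Returns a list of (proposed_mass, {mz: adduct_name}) tuples.
    """
    # Every peak under every adduct hypothesis implies a candidate neutral mass.
    hypotheses = [
        (mz - shift, mz, name)
        for mz in mzs
        for name, shift in ADDUCT_SHIFTS.items()
    ]
    proposals = []
    for (m1, mz1, a1), (m2, mz2, a2) in combinations(hypotheses, 2):
        # Two different peaks, two different adducts, same implied mass.
        if mz1 != mz2 and a1 != a2 and abs(m1 - m2) <= tol:
            proposals.append((round((m1 + m2) / 2, 4), {mz1: a1, mz2: a2}))
    return proposals


# Hypothetical peaks from one extracted compound spectrum (m/z values).
peaks = [181.0707, 203.0526, 150.0]
proposals = propose_neutral_masses(peaks)
```

In this example the first two peaks are mutually consistent as the protonated and sodiated forms of a compound near 180.063 Da, while the third peak stays unannotated. Isotope peaks, multiply charged ions, and the combination of positive and negative mode data described above are outside the scope of the sketch.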
This article explores consensus structure elucidation on the basis of GC/EI-MS, structure generation, and calculated properties for unknown compounds. Candidate structures were generated using the molecular formula and substructure information obtained from GC/EI-MS spectra. Calculated properties were then used to score candidates according to a consensus approach, rather than filtering or exclusion. Two mass spectral match calculations (MOLGEN-MS and MetFrag), retention behavior (Lee retention index/boiling point correlation, NIST Kovats retention index), octanol–water partitioning behavior (log Kow), and finally steric energy calculations were used to select candidates. A simple consensus scoring function was developed and tested on two unknown spectra detected in a mutagenic subfraction of a water sample from the Elbe River using GC/EI-MS. The top candidates proposed using the consensus scoring technique were purchased and confirmed analytically using GC/EI-MS and LC/MS/MS. Although the compounds identified were not responsible for the sample mutagenicity, the structure-generation-based identification for GC/EI-MS using calculated properties and consensus scoring was demonstrated to be applicable to real-world unknowns and suggests that the development of a similar strategy for multidimensional high-resolution MS could improve the outcomes of environmental and metabolomics studies.
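The difference between consensus scoring and filtering can be made concrete with a small sketch. The scoring function developed in the article is not given in the abstract, so the weighted mean below is an assumption chosen for illustration, as are the criterion names, the per-criterion scores (taken to be already normalized to the range 0 to 1), and the candidate labels.

```python
def consensus_score(scores, weights=None):
    """Weighted mean of per-criterion scores, each assumed to lie in [0, 1]."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total_weight


def rank_candidates(candidates, weights=None):
    """Order candidate names from highest to lowest consensus score."""
    return sorted(
        candidates,
        key=lambda name: consensus_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical candidates with hypothetical normalized scores.
candidates = {
    "candidate_A": {"spectral_match": 0.90, "retention_index": 0.80, "log_kow": 0.70},
    "candidate_B": {"spectral_match": 0.95, "retention_index": 0.20, "log_kow": 0.60},
}
ranking = rank_candidates(candidates)
```

A hard retention index filter would have discarded `candidate_B` outright; with consensus scoring it stays in the list, ranked lower, so a single unreliable calculated property cannot eliminate the correct structure on its own.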