Publications
Flavor is the main factor driving consumer acceptance of food products. However, tracking the biochemistry of flavor is a formidable challenge due to the complexity of food composition. Current methodologies for linking individual molecules to flavor in foods and beverages are expensive and time-consuming. Predictive models based on machine learning (ML) are emerging as an alternative to speed up this process. Nonetheless, the optimal approach to predict flavor features of molecules remains elusive. In this work we present FlavorMiner, an ML-based multilabel flavor predictor. FlavorMiner integrates different combinations of algorithms and mathematical representations, augmented with class balance strategies to address the inherent class imbalance of the input dataset. Notably, Random Forest and K-Nearest Neighbors combined with Extended Connectivity Fingerprint and RDKit molecular descriptors consistently outperform other combinations in most cases. Resampling strategies surpass weight balance methods in mitigating bias associated with class imbalance. FlavorMiner exhibits remarkable accuracy, with an average ROC AUC score of 0.88. This algorithm was used to analyze cocoa metabolomics data, unveiling its potential to help extract valuable insights from intricate food metabolomics data. FlavorMiner can be used for flavor mining in any food product, drawing from a diverse training dataset that spans over 934 distinct food products.
Scientific Contribution: FlavorMiner is an advanced machine learning (ML)-based tool designed to predict molecular flavor features with high accuracy and efficiency, addressing the complexity of food metabolomics. By leveraging robust algorithmic combinations paired with mathematical representations, FlavorMiner achieves high predictive performance. Applied to cocoa metabolomics, FlavorMiner demonstrated its capacity to extract meaningful insights, showcasing its versatility for flavor analysis across diverse food products.
This study underscores the transformative potential of ML in accelerating flavor biochemistry research, offering a scalable solution for the food and beverage industry.
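As an illustration of the resampling idea mentioned in the abstract, the following is a minimal sketch of random oversampling for a single binary flavor label. It is not FlavorMiner's implementation; the function name `random_oversample`, the toy feature values, and the label meaning are all assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (plain random oversampling)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data: label 1 = "sweet", label 0 = "not sweet".
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In a multilabel setting such as the one described, a step like this would presumably be applied per flavor label and only to the training split, so that duplicated rows never leak into the evaluation data.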
Publications
Mapping the chemical space of compounds to chemical structures remains a challenge in metabolomics. Despite the advancements in untargeted liquid chromatography-mass spectrometry (LC-MS) to achieve a high-throughput profile of metabolites from complex biological resources, only a small fraction of these metabolites can be annotated with confidence. Many novel computational methods and tools, such as in silico generated spectra and molecular networking, have been developed to enable chemical structure annotation of known and unknown compounds. Here, we present an automated and reproducible Metabolome Annotation Workflow (MAW) for untargeted metabolomics data to further facilitate and automate the complex annotation by combining tandem mass spectrometry (MS2) input data pre-processing, spectral and compound database matching with computational classification, and in silico annotation. MAW takes the LC-MS2 spectra as input and generates a list of putative candidates from spectral and compound databases. The databases are integrated via the R package Spectra and the metabolite annotation tool SIRIUS as part of the R segment of the workflow (MAW-R). The final candidate selection is performed using the cheminformatics tool RDKit in the Python segment (MAW-Py). Furthermore, each feature is assigned a chemical structure and can be imported into a chemical structure similarity network. MAW follows the FAIR (Findable, Accessible, Interoperable, Reusable) principles and has been made available as the Docker images maw-r and maw-py. The source code and documentation are available on GitHub (https://github.com/zmahnoor14/MAW). The performance of MAW is evaluated on two case studies. MAW can improve candidate ranking by integrating spectral databases with annotation tools like SIRIUS, which contributes to an efficient candidate selection procedure. The results from MAW are also reproducible and traceable, compliant with the FAIR guidelines.
Taken together, MAW could greatly facilitate automated metabolite characterization in diverse fields such as clinical metabolomics and natural product discovery.
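The chemical structure similarity network mentioned in the abstract can be pictured with a small sketch. The code below is not MAW code: it assumes each annotated feature already has a fingerprint represented as a set of on-bit indices (in MAW these would come from RDKit), and the feature names, bit values, and the 0.5 threshold are hypothetical.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_edges(fingerprints, threshold=0.5):
    """Return (name_1, name_2, score) for every pair at or above the threshold."""
    edges = []
    for n1, n2 in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[n1], fingerprints[n2])
        if score >= threshold:
            edges.append((n1, n2, round(score, 3)))
    return edges


# Hypothetical fingerprints for three annotated features.
fps = {
    "feature_1": {1, 4, 7, 9},
    "feature_2": {1, 4, 7, 12},
    "feature_3": {20, 21},
}
edges = similarity_edges(fps)
print(edges)
```

Each returned tuple is one edge of the network; features with no sufficiently similar neighbor, such as `feature_3` here, remain isolated nodes.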
Publications
For several sesquiterpene lactones (STLs) found in Asteraceae plants, very interesting biomedical activities have been demonstrated. Chicory roots accumulate the guaianolide STLs 8-deoxylactucin, lactucin, and lactucopicrin predominantly in oxalated forms in the latex. In this work, a supercritical fluid extract fraction of chicory STLs containing 8-deoxylactucin and 11β,13-dihydro-8-deoxylactucin was shown to have anti-inflammatory activity in an inflamed intestinal mucosa model. To increase the accumulation of these two compounds in chicory taproots, the lactucin synthase that takes 8-deoxylactucin as the substrate for the regiospecific hydroxylation to generate lactucin needs to be inactivated. Three candidate cytochrome P450 enzymes of the CYP71 clan were identified in chicory. Their targeted inactivation using the CRISPR/Cas9 approach identified CYP71DD33 to have lactucin synthase activity. The analysis of the terpene profile of the taproots of plants with edits in CYP71DD33 revealed a nearly complete elimination of the endogenous chicory STLs lactucin and lactucopicrin and their corresponding oxalates. Indeed, in the same lines, the interruption of biosynthesis resulted in a strong increase of 8-deoxylactucin and its derivatives. The enzyme activity of CYP71DD33 to convert 8-deoxylactucin to lactucin was additionally demonstrated in vitro using yeast microsome assays. The identified chicory lactucin synthase gene is predominantly expressed in the chicory latex, indicating that the late steps in the STL biosynthesis take place in the latex. This study contributes to further elucidation of the STL pathway in chicory and shows that root chicory can be positioned as a crop from which different health products can be extracted.
Publications
Compound (or chemical) databases are an invaluable resource for many scientific disciplines. Exposomics researchers need to find and identify relevant chemicals that cover the entirety of potential (chemical and other) exposures over entire lifetimes. This daunting task, with over 100 million chemicals in the largest chemical databases, coupled with broadly acknowledged knowledge gaps in these resources, leaves researchers faced with too much—yet not enough—information at the same time to perform comprehensive exposomics research. Furthermore, the improvements in analytical technologies and computational mass spectrometry workflows coupled with the rapid growth in databases and increasing demand for high throughput “big data” services from the research community present significant challenges for both data hosts and workflow developers. This article explores how to reduce candidate search spaces in non-target small molecule identification workflows, while increasing content usability in the context of environmental and exposomics analyses, so as to profit from the increasing size and information content of large compound databases, while increasing efficiency at the same time. In this article, these methods are explored using PubChem, the NORMAN Network Suspect List Exchange and the in silico fragmentation approach MetFrag. A subset of the PubChem database relevant for exposomics, PubChemLite, is presented as a database resource that can be (and has been) integrated into current workflows for high resolution mass spectrometry. Benchmarking datasets from earlier publications are used to show how experimental knowledge and existing datasets can be used to detect and fill gaps in compound databases to progressively improve large resources such as PubChem, and topic-specific subsets such as PubChemLite. 
PubChemLite is a living collection, updating as annotation content in PubChem is updated, and exported to allow direct integration into existing workflows such as MetFrag. The source code and files necessary to recreate or adjust this are jointly hosted between the research parties (see data availability statement). This effort shows that enhancing the FAIRness (Findability, Accessibility, Interoperability and Reusability) of open resources can mutually enhance several resources for whole community benefit. The authors explicitly welcome additional community input on ideas for future developments.
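To make the idea of reducing the candidate search space concrete, here is a minimal sketch of a mass-window query that can optionally be restricted to a curated subset of a larger database. It does not use the PubChem, PubChemLite, or MetFrag interfaces; the record IDs, the `select_candidates` function, and the rounded monoisotopic masses are illustrative assumptions.

```python
def within_ppm(measured, theoretical, ppm):
    """True if the measured mass lies within +/- ppm of the theoretical mass."""
    return abs(measured - theoretical) / theoretical * 1e6 <= ppm


def select_candidates(query_mass, database, ppm=5.0, allowed_ids=None):
    """Return names of entries matching the query mass, optionally restricted
    to a subset of record IDs (a stand-in for a topic-specific subset)."""
    hits = []
    for entry in database:
        if allowed_ids is not None and entry["id"] not in allowed_ids:
            continue
        if within_ppm(query_mass, entry["mass"], ppm):
            hits.append(entry["name"])
    return hits


# Hypothetical mini-database; masses are approximate monoisotopic values.
database = [
    {"id": "c1", "name": "caffeine", "mass": 194.0804},
    {"id": "c2", "name": "glucose", "mass": 180.0634},
    {"id": "c3", "name": "theobromine", "mass": 180.0647},
]

all_hits = select_candidates(180.0634, database, ppm=10.0)
subset_hits = select_candidates(180.0634, database, ppm=10.0, allowed_ids={"c1", "c2"})
print(all_hits, subset_hits)
```

With the full list, two near-isobaric candidates fall inside the 10 ppm window; restricting the search to the hypothetical subset leaves one, which is the kind of reduction the abstract describes at a much larger scale.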
Publications
We report the major conclusions of the online open-access workshop “Computational Applications in Secondary Metabolite Discovery (CAiSMD)” that took place from 08 to 10 March 2021. Invited speakers from academia and industry and about 200 registered participants from five continents (Africa, Asia, Europe, South America, and North America) took part in the workshop. The workshop highlighted the potential applications of computational methodologies in the search for secondary metabolites (SMs) or natural products (NPs) as potential drugs and drug leads. Over 3 days, the participants of this online workshop received an overview of modern computer-based approaches for exploring NP discovery in the “omics” age. The invited experts gave keynote lectures, trained participants in hands-on sessions, and held round table discussions. This was followed by oral presentations with much interaction between the speakers and the audience. Selected applicants (early-career scientists) were offered the opportunity to give oral presentations (15 min) and present posters in the form of flash presentations (5 min) upon submission of an abstract. The final program, available on the workshop website (https://caismd.indiayouth.info/), comprised 4 keynote lectures (KLs), 12 oral presentations (OPs), 2 round table discussions (RTDs), and 5 hands-on sessions (HSs). This meeting report also references internet resources for computational biology in the area of secondary metabolites that are of use beyond the workshop topics and will constitute a valuable long-term resource for the community. The workshop concluded with an online survey to be completed by speakers and participants, with the goal of improving any subsequent editions.
Publications
Late blight, caused by the oomycete Phytophthora infestans, is economically the most important foliar disease of potato. To assess the importance of the leaf surface, as the site of the first encounter of pathogen and host, we performed untargeted profiling by liquid chromatography–mass spectrometry of leaf surface metabolites of the susceptible cultivated potato Solanum tuberosum and the resistant wild potato species Solanum bulbocastanum. Hydroxycinnamic acid amides, typical phytoalexins of potato, were abundant on the surface of S. tuberosum, but not on S. bulbocastanum. One of the metabolites accumulating on the surface of the wild potato was identified as lysophosphatidylcholine carrying heptadecenoic acid, LPC17:1. In vitro assays revealed that both spore germination and mycelial growth of P. infestans were efficiently inhibited by LPC17:1, suggesting that leaf surface metabolites from wild potato species could contribute to early defense responses against P. infestans.
Publications
Seeds of domesticated Vicia (vetch) species (family Fabaceae-Faboideae) are produced and consumed worldwide for their nutritional value. Seed accessions belonging to 16 different species of Vicia—both domesticated and wild taxa—were subjected to a chemotaxonomic study using ultraperformance liquid chromatography–mass spectrometry (UPLC-MS) analyzed by chemometrics. A total of 89 metabolites were observed in the examined Vicia accessions. Seventy-eight out of the 89 detected metabolites were annotated. Metabolites quantified belonged to several classes, viz., flavonoids, procyanidins, prodelphinidins, anthocyanins, stilbenes, dihydrochalcones, phenolic acids, coumarins, alkaloids, jasmonates, fatty acids, terpenoids, and cyanogenics, with flavonoids and fatty acids amounting to the major classes. Flavonoids, fatty acids, and anthocyanins showed up as potential chemotaxonomic markers in Vicia species discrimination. Fatty acids were more enriched in Vicia faba specimens, while the abundance of flavonoids was the highest in Vicia parviflora. Anthocyanins allowed for discrimination between Vicia hirsuta and Vicia sepium. To the best of our knowledge, this is the first report on employing UPLC-MS metabolomics to discern the diversity of metabolites at the intrageneric level among Vicia species.
Publications
Rosemary and sage species from Lamiaceae contain high amounts of structurally related but diverse abietane diterpenes. A number of substances from this compound family have potential pharmacological activities and are used in the food and cosmetic industry. This has raised interest in their biosynthesis. Investigations in Rosmarinus officinalis and some sage species have uncovered two main groups of cytochrome P450 oxygenases that are involved in the oxidation of the precursor abietatriene. CYP76AHs produce ferruginol and 11-hydroxyferruginol, while CYP76AKs catalyze oxidations at the C20 position. Using a modular Golden-Gate-compatible assembly system for yeast expression, these enzymes were systematically tested either alone or in combination. A total of 14 abietane diterpenes could be detected, 8 of which have not been reported thus far. We demonstrate here that yeast is a valid system for engineering and reconstituting the abietane diterpene network, allowing for the discovery of novel compounds with potential bioactivity.
Publications
Chemical database searching has become a fixture in many non-targeted identification workflows based on high-resolution mass spectrometry (HRMS). However, the form of a chemical structure observed in HRMS does not always match the form stored in a database (e.g., the neutral form versus a salt; one component of a mixture rather than the mixture form used in a consumer product). Linking the form of a structure observed via HRMS to its related form(s) within a database will enable the return of all relevant variants of a structure, as well as the related metadata, in a single query. A Konstanz Information Miner (KNIME) workflow has been developed to produce structural representations observed using HRMS (“MS-Ready structures”) and link them to those stored in a database. These MS-Ready structures, and associated mappings to the full chemical representations, are surfaced via the US EPA’s Chemistry Dashboard (https://comptox.epa.gov/dashboard/). This article describes the workflow for the generation and linking of ~700,000 MS-Ready structures (derived from ~760,000 original structures) as well as download, search and export capabilities to serve structure identification using HRMS. The importance of this form of structural representation for HRMS is demonstrated with several examples, including integration with the in silico fragmentation software application MetFrag. The structures, search, download and export functionality are all available through the CompTox Chemistry Dashboard, while the MetFrag implementation can be viewed at https://msbi.ipb-halle.de/MetFragBeta/.
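The linking step described above can be sketched in a few lines. This is not the KNIME workflow from the article: it only splits multi-component SMILES strings at the `.` separator, drops a tiny hypothetical counterion list, and maps each remaining component back to the records that contain it. The record IDs are invented, and a real workflow would also need structure standardization (for example neutralization and canonical SMILES) before strings could be compared reliably.

```python
from collections import defaultdict

# Hypothetical and far from complete; for illustration only.
COUNTERIONS = {"[Na+]", "[K+]", "[Cl-]", "Cl"}


def components(smiles):
    """Split a SMILES string into disconnected components, dropping counterions."""
    return [part for part in smiles.split(".") if part and part not in COUNTERIONS]


def link_forms(records):
    """Map each remaining component to the IDs of all records that contain it."""
    links = defaultdict(list)
    for record_id, smiles in records.items():
        for part in components(smiles):
            if record_id not in links[part]:
                links[part].append(record_id)
    return dict(links)


# Hypothetical records: metformin and its hydrochloride salt.
records = {
    "rec_1": "CN(C)C(=N)NC(N)=N",
    "rec_2": "CN(C)C(=N)NC(N)=N.Cl",
}
links = link_forms(records)
print(links)
```

A query for the component observed by HRMS then returns both the neutral record and the salt record in a single lookup, which is the behavior the abstract describes for MS-Ready structures.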
Publications
Lens culinaris and several Lupinus species are legumes regarded as potential protein resources, aside from their richness in phytochemicals. Consequently, characterization of their metabolite composition seems warranted if they are to be considered as sustainable commercial functional foods. This study presents a discriminatory holistic approach for metabolite profiling in accessions of four lentil cultivars and four Lupinus species via gas chromatography/mass spectrometry. A total of 107 metabolites were identified, encompassing organic and amino acids, sugars, and sterols, along with antinutrients, viz., alkaloids and sugar phosphates. Among the examined specimens, four nutritionally valuable accessions ought to be prioritized for future breeding: Lupinus hispanicus, enriched in organic (ca. 11.7%) and amino acids (ca. 5%), and Lupinus angustifolius, rich in sucrose (ca. 40%), along with two dark-colored lentil cultivars, ‘verte du Puy’ and ‘Black Beluga’, enriched in peptides. Antinutrient chemicals were observed in Lupinus polyphyllus, owing to its high alkaloid content. Several species-specific markers were also revealed using multivariate data analyses.