Preprints
Medina-Ortiz, D.; Khalifeh, A.; Anvari-Kazemabad, H.; Davari, M. D.; Interpretable and explainable predictive machine learning models for data-driven protein engineering bioRxiv (2024) DOI: 10.1101/2024.02.18.580860
Protein engineering using directed evolution and (semi)rational design has emerged as a powerful strategy for optimizing and enhancing enzymes or proteins with desired properties. Integrating artificial intelligence methods has further enhanced and accelerated protein engineering through predictive models developed in data-driven strategies. However, the lack of explainability and interpretability in these models poses challenges. Explainable Artificial Intelligence addresses the interpretability and explainability of machine learning models, providing transparency and insights into predictive processes. Nonetheless, there is a growing need to incorporate explainable techniques in predicting protein properties in machine learning-assisted protein engineering. This work explores incorporating explainable artificial intelligence in predicting protein properties, emphasizing its role in trustworthiness and interpretability. It assesses different machine learning approaches, introduces diverse explainable methodologies, and proposes strategies for seamless integration, improving trustworthiness. Practical cases demonstrate the explainable model’s effectiveness in identifying DNA binding proteins and optimizing Green Fluorescent Protein brightness. The study highlights the utility of explainable artificial intelligence in advancing computationally assisted protein design, fostering confidence in model reliability.
Preprints
Herrera-Rocha, F.; Fernández-Niño, M.; Duitama, J.; Cala, M. P.; Chica, M. J.; Wessjohann, L. A.; Davari, M. D.; González Barrios, A. F.; FlavorMiner: A Machine Learning Platform for Extracting Molecular Flavor Profiles from Structural Data ChemRxiv (2024) DOI: 10.26434/chemrxiv-2024-821xm
Flavor is the main factor driving consumer acceptance of food products. However, tracking the biochemistry of flavor is a formidable challenge due to the complexity of food composition. Current methodologies for linking individual molecules to flavor in foods and beverages are expensive and time-consuming. Predictive models based on machine learning (ML) are emerging as an alternative to speed up this process. Nonetheless, the optimal approach to predict flavor features of molecules remains elusive. In this work, we present FlavorMiner, an ML-based multilabel flavor predictor. FlavorMiner seamlessly integrates different combinations of algorithms and mathematical representations, augmented with class balance strategies to address the inherent class imbalance of the input dataset. Notably, Random Forest and K-Nearest Neighbors combined with Extended Connectivity Fingerprint and RDKit molecular descriptors consistently outperform other combinations in most cases. Resampling strategies surpass weight balance methods in mitigating bias associated with class imbalance. FlavorMiner exhibits remarkable accuracy, with an average ROC AUC score of 0.88. This algorithm was used to analyze cocoa metabolomics data, unveiling its profound potential to help extract valuable insights from intricate food metabolomics data. FlavorMiner can be used for flavor mining in any food product, drawing from a diverse training dataset that spans over 934 distinct food products.
Preprints
Balcke, G.; Saoud, M.; Grau, J.; Rennert, R.; Mueller, T.; Yousefi, M.; Davari, M. D.; Hause, B.; Csuk, R.; Rashan, L.; Grosse, I.; Tissier, A.; Wessjohann, L.; Machine learning-based metabolic pattern recognition predicts mode of action for anti-cancer drug candidates Research Square (2024) DOI: 10.21203/rs.3.rs-3494185/v1
A bottleneck in the development of new anti-cancer drugs is the recognition of their mode of action (MoA). We combined metabolomics and machine learning to predict MoAs of novel anti-proliferative drug candidates, focusing on human prostate cancer cells (PC-3). As proof of concept, we studied 38 drugs with known effects on 16 key processes of cancer metabolism, profiling low molecular weight intermediates of the central carbon and cellular energy metabolism (CCEM) by LC-MS/MS. These metabolic patterns unveiled distinct MoAs, enabling accurate MoA predictions for novel agents by machine learning. We validate the transferability of MoA predictions from PC-3 to two other cancer cell models and show that correct predictions are still possible, but at the expense of prediction quality. Furthermore, metabolic profiles of treated cells yield insights into intracellular processes, exemplified for drugs inducing different types of mitochondrial dysfunction. Specifically, we predict that pentacyclic triterpenes inhibit oxidative phosphorylation and affect phospholipid biosynthesis, as supported by respiration parameters, lipidomics, and molecular docking. Using biochemical insights from individual drug treatments, our approach offers new opportunities, including the optimization of combinatorial drug applications.
Preprints
Arndt, H.; Bachurski, M.; Yuanxiang, P.; Franke, K.; Wessjohann, L. A.; Kreutz, M. R.; Grochowska, K. M.; A screen of plant-based natural products revealed that quercetin prevents amyloid-β uptake in astrocytes as well as resulting astrogliosis and synaptic dysfunction Research Square (2024) DOI: 10.21203/rs.3.rs-4155455/v1
Two connected histopathological hallmarks of Alzheimer’s disease (AD) are chronic neuroinflammation and synaptic dysfunction. The accumulation of the most prevalent posttranslationally modified form of Aβ1–42, pyroglutamylated amyloid-β (Aβ3(pE)-42) in astrocytes is directly linked to glial activation and the release of proinflammatory cytokines that in turn contribute to early synaptic dysfunction in AD. At present, the mechanisms of Aβ3(pE)-42 uptake into astrocytes are unknown and pharmacological interventions that interfere with this process are not available. Here we developed a simple screening assay to identify substances from a plant extract library that prevent astroglial Aβ3(pE)-42 uptake. We first show that this approach yields valid and reproducible results. Second, we show endocytosis of Aβ3(pE)-42 oligomers by astrocytes and that quercetin, a plant flavonol, specifically blocks astrocytic buildup of oligomeric Aβ3(pE)-42. Importantly, quercetin does not induce a general impairment of endocytosis. However, it efficiently protects against early synaptic dysfunction following exogenous Aβ3(pE)-42 application.
Printed publications
Arndt, H.; Bachurski, M.; Yuanxiang, P.; Franke, K.; Wessjohann, L. A.; Kreutz, M. R.; Grochowska, K. M.; A screen of plant-based natural products revealed that quercetin prevents pyroglutamylated amyloid-β (Aβ3(pE)-42) uptake in astrocytes as well as resulting astrogliosis and synaptic dysfunction Mol. Neurobiol. (2024) DOI: 10.1007/s12035-024-04509-6
Two connected histopathological hallmarks of Alzheimer’s disease (AD) are chronic neuroinflammation and synaptic dysfunction. The accumulation of the most prevalent posttranslationally modified form of Aβ1–42, pyroglutamylated amyloid-β (Aβ3(pE)-42) in astrocytes is directly linked to glial activation and the release of proinflammatory cytokines that in turn contribute to early synaptic dysfunction in AD. At present, the mechanisms of Aβ3(pE)-42 uptake into astrocytes are unknown and pharmacological interventions that interfere with this process are not available. Here we developed a simple screening assay to identify substances from a plant extract library that prevent astroglial Aβ3(pE)-42 uptake. We first show that this approach yields valid and reproducible results. Second, we show endocytosis of Aβ3(pE)-42 oligomers by astrocytes and that quercetin, a plant flavonol, specifically blocks astrocytic buildup of oligomeric Aβ3(pE)-42. Importantly, quercetin does not induce a general impairment of endocytosis. However, it efficiently protects against early synaptic dysfunction following exogenous Aβ3(pE)-42 application.
Publications
Moeller, M.; Dhar, D.; Dräger, G.; Özbasi, M.; Struwe, H.; Wildhagen, M.; Davari, M. D.; Beutel, S.; Kirschning, A.; Sesquiterpene cyclase BcBOT2 promotes the unprecedented Wagner-Meerwein rearrangement of the methoxy group J. Am. Chem. Soc. 146, 17838-17846, (2024) DOI: 10.1021/jacs.4c03386
Presilphiperfolan-8β-ol synthase (BcBOT2), a substrate-promiscuous sesquiterpene cyclase (STC) of fungal origin, is capable of converting two new farnesyl pyrophosphate (FPP) derivatives modified at C7, bearing either a hydroxymethyl group or a methoxymethyl group. These substrates were chosen based on a computationally generated model. Biotransformations yielded five new oxygenated terpenoids. Remarkably, the formation of one of these tricyclic products can only be explained by a cationically induced migration of the methoxy group, presumably via a Meerwein-salt intermediate, unprecedented in synthetic chemistry and biosynthesis. The results show the great general potential of terpene cyclases for mechanistic studies of unusual cation chemistry and for the creation of new terpene skeletons.
Publications
Méndez, Y.; Vasco, A. V.; Ebensen, T.; Schulze, K.; Yousefi, M.; Davari, M. D.; Wessjohann, L. A.; Guzmán, C. A.; Rivera, D. G.; Westermann, B.; Diversification of a novel α‐galactosyl ceramide hotspot boosts the adjuvant properties in parenteral and mucosal vaccines Angew. Chem. Int. Ed. 63, e202310983, (2024) DOI: 10.1002/anie.202310983
The development of potent adjuvants is an important step for improving the performance of subunit vaccines. CD1d agonists, such as the prototypical α‐galactosyl ceramide (α‐GalCer), are of special interest due to their ability to activate iNKT cells and trigger rapid dendritic cell maturation and B‐cell activation. Herein, we introduce a novel derivatization hotspot at the α‐GalCer skeleton, namely the N‐substituent at the amide bond. The multicomponent diversification of this previously unexplored glycolipid chemotype space permitted the introduction of a variety of extra functionalities that can either potentiate the adjuvant properties or serve as handles for further conjugation to antigens toward the development of self‐adjuvanting vaccines. This strategy led to the discovery of compounds eliciting enhanced antigen‐specific T cell stimulation and a higher antibody response when delivered by either the parenteral or the mucosal route, as compared to a known potent CD1d agonist. Notably, various functionalized α‐GalCer analogues showed a more potent adjuvant effect after intranasal immunization than a PEGylated α‐GalCer analogue previously optimized for this purpose. Ultimately, this work could open multiple avenues of opportunity for the use of mucosal vaccines against microbial infections.
Publications
Mejía-Manzano, L. A.; Ortiz-Alcaráz, C. I.; Parra Daza, L. E.; Suarez Medina, L.; Vargas-Cortez, T.; Fernández-Niño, M.; González Barrios, A. F.; González-Valdez, J.; Saccharomyces cerevisiae biofactory to produce naringenin using a systems biology approach and a bicistronic vector expression strategy in flavonoid production Microbiology Spectrum 12, e03374-23, (2024) DOI: 10.1128/spectrum.03374-23
Naringenin is the central flavonoid in the biosynthesis of several bioactive compounds and is in growing demand for its nutraceutical properties. Naringenin extraction from plants is non-viable due to low yields, and microbial platforms could represent a controlled and sustained alternative to produce it using several metabolic engineering tools. This study demonstrates naringenin production in Saccharomyces cerevisiae from glucose through a combined approach of systems biology, enzyme criteria selection, and a molecular engineering strategy. In silico prediction using a mixed integer linear programming (MILP) algorithm showed that the phenylpropanoid pathway was the shortest and most viable metabolic pathway. Two bicistronic constructs were generated using the PTV-1 2A peptide sequence, and a naringenin biofactory was assembled with the phenylalanine ammonia-lyase/tyrosine ammonia-lyase genes encoding phenylalanine/tyrosine ammonia-lyase (Rhodobacter capsulatus), 4-coumaroyl (4CL) encoding a p-coumaroyl-CoA ligase (Solanum lycopersicum), CHS encoding chalcone synthase (Hypericum androsaemum), and CHI encoding a chalcone isomerase (Glycine max). Naringenin productivity in batch fermentation was about 40.67 ± 3.47 µg/Lh with a 6.10 ± 0.52 mg/L titer (22.41 ± 1.91 µM) and a 3.26 ± 1.36 mg/g yield (YP/S) with the detection of additional flavonoids. The obtained concentration is better than those reported in other related works with diverse engineered microorganisms. The results suggest a successful and optimizable alternative for the heterologous flavanone production in yeast combined with bicistronic expression mediated by a 2A peptide sequence for the first time. This strategy supports the production of extensive routes for other nutraceutical compounds.
IMPORTANCE
Flavonoids are a group of compounds generally produced by plants with proven biological activity, which have recently been recommended for the treatment and prevention of diseases and ailments with diverse causes. In this study, naringenin was produced in adequate amounts in yeast after in silico design. The four genes of the involved enzymes from several organisms (bacteria and plants) were multi-expressed in two vectors, each carrying two genes linked by a short viral peptide sequence. The batch kinetic behavior of the product, substrate, and biomass was described at lab scale. The engineered strain might be used in a more affordable and viable bioprocess for industrial naringenin procurement.
Publications
Manoilenko, S.; Dippe, M.; Fuchs, T.; Eisenschmidt-Bönn, D.; Ziegler, J.; Bauer, A.-K.; Wessjohann, L. A.; Enzymatic one-step synthesis of natural 2-pyrones and new-to-nature derivatives from coenzyme A esters J. Biotechnol. 388, 72-82, (2024) DOI: 10.1016/j.jbiotec.2024.04.006
The 2-pyrone moiety is present in a wide range of structurally diverse natural products with various biological activities. The plant biosynthetic routes towards these compounds mainly depend on the activity of either type III polyketide synthase-like 2-pyrone synthases or hydroxylating 2-oxoglutarate dependent dioxygenases. In the present study, the substrate specificity of these enzymes is investigated by a systematic screening using both natural and artificial substrates with the aims of efficiently forming (new) products and understanding the underlying catalytic mechanisms. In this framework, we focused on the in vitro functional characterization of a 2-pyrone synthase Gh2PS2 from Gerbera x hybrida and two dioxygenases AtF6’H1 and AtF6’H2 from Arabidopsis thaliana using a set of twenty aromatic and aliphatic CoA esters as substrates. UHPLC-ESI-HRMSn based analyses of reaction intermediates and products revealed a broad substrate specificity of the enzymes, enabling the facile "green" synthesis of this important class of natural products and derivatives in a one-step/one-pot reaction in aqueous environment without the need for halogenated or metal reagents and protective groups. Using protein modelling and substrate docking we identified amino acid residues that seem to be important for the observed product scope.
Publications
Liu, Y.; Esposto, D.; Mahdi, L. K.; Porzel, A.; Stark, P.; Hussain, H.; Scherr-Henning, A.; Isfort, S.; Bathe, U.; Acosta, I. F.; Zuccaro, A.; Balcke, G. U.; Tissier, A.; Hordedane diterpenoid phytoalexins restrict Fusarium graminearum infection but enhance the colonization by Bipolaris sorokiniana of barley roots Mol. Plant 17, 1307-1327, (2024) DOI: 10.1016/j.molp.2024.07.006
Plant immunity is a multilayered process that includes recognition of patterns or effectors from pathogens to elicit defense responses. These include the induction of a cocktail of defense metabolites that typically restrict pathogen virulence. Here, we investigate the interaction between barley roots and the fungal pathogens Bipolaris sorokiniana (Bs) and Fusarium graminearum (Fg) at the metabolite level. We identify hordedanes, a previously undescribed set of labdane-related diterpenoids with antimicrobial properties, as critical players in these interactions. Infection of barley roots by Bs and Fg elicits hordedane synthesis from a 600-kb gene cluster. Heterologous reconstruction of the biosynthesis pathway in yeast and Nicotiana benthamiana produced several hordedanes, including one of the most functionally decorated products, 19-β-hydroxy-hordetrienoic acid (19-OH-HTA). Barley mutants in the diterpene synthase genes of this cluster are unable to produce hordedanes but, unexpectedly, show reduced Bs colonization. By contrast, colonization by Fusarium graminearum, another fungal pathogen of barley and wheat, is 4-fold higher in the mutants completely lacking hordedanes. Accordingly, 19-OH-HTA enhances both germination and growth of Bs, whereas it inhibits other pathogenic fungi, including Fg. Analysis of microscopy and transcriptomics data suggests that hordedanes delay the necrotrophic phase of Bs. Taken together, these results show that adapted pathogens such as Bs can subvert plant metabolic defenses to facilitate root colonization.