Publication
Protein engineering through directed evolution and (semi)rational approaches is routinely applied to optimize protein properties for a broad range of applications in industry and academia. However, the multitude of possible variants, combined with limited screening throughput, hampers efficient protein engineering. Data-driven strategies have emerged as a powerful tool for modeling the protein fitness landscape, which can then be explored in silico, significantly accelerating protein engineering campaigns. Such methods, however, need enough labeled data to generate a reliable model of the fitness landscape, and that amount often cannot be provided. Here, we introduce MERGE, a method that combines direct coupling analysis (DCA) and machine learning (ML). MERGE enables data-driven protein engineering when only limited data are available for training, typically 50 to 500 labeled sequences. Our method demonstrates strong performance in predicting a protein’s fitness value and rank from its sequence across diverse proteins and properties. Notably, MERGE outperforms state-of-the-art methods when only small data sets are available for modeling while requiring fewer computational resources, which makes it particularly promising for protein engineers who have access to limited amounts of data.
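To make the small-data setting concrete, the sketch below trains a sequence-to-fitness regressor on a few hundred labeled sequences and scores it by rank correlation. It is not MERGE and contains no DCA step: it assumes a plain one-hot encoding, a closed-form ridge regression, and synthetic sequences with an invented additive fitness. All function names, the sequence length, and the regularization strength `alpha` are hypothetical choices made for illustration only.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequences):
    """Flattened one-hot encoding for equal-length sequences (assumed encoding)."""
    length = len(sequences[0])
    X = np.zeros((len(sequences), length * len(AMINO_ACIDS)))
    for n, seq in enumerate(sequences):
        for pos, aa in enumerate(seq):
            X[n, pos * len(AMINO_ACIDS) + AA_INDEX[aa]] = 1.0
    return X


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on centered data; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(gram, Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


def predict(X, w, b):
    return X @ w + b


def spearman(a, b):
    """Spearman rank correlation; ties are not handled (fine for continuous values)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]


# Synthetic stand-in for a labeled data set: 300 random sequences of length 6
# with an invented additive fitness plus noise. Real data would replace this.
rng = np.random.default_rng(0)
seq_len, n_total, n_train = 6, 300, 250
seqs = ["".join(rng.choice(list(AMINO_ACIDS), size=seq_len)) for _ in range(n_total)]
X = one_hot_encode(seqs)
y = X @ rng.normal(size=X.shape[1]) + rng.normal(scale=0.1, size=n_total)

# Train on 250 labeled sequences, rank the 50 held-out ones.
w, b = fit_ridge(X[:n_train], y[:n_train], alpha=1.0)
rho = spearman(predict(X[n_train:], w, b), y[n_train:])
print(f"Spearman rank correlation on held-out sequences: {rho:.2f}")
```

Rank correlation is used as the score because, in an engineering campaign, ordering candidate variants correctly often matters more than predicting their exact fitness values. A purely additive one-hot model like this one ignores interactions between positions; that limitation is the kind of gap coupling-based features are meant to address.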