Taste is predictable: with FlavorMiner. FlavorMiner is the name of the tool recently developed by IPB chemists and partners from Colombia. Using machine learning (AI), the program can, based on the…
Since February 2021, Wolfgang Brandt, former head of the Computational Chemistry research group at the IPB, has been offering his citizen science project on mushroom identification. For this, he has regularly given public lectures on the diversity…
Herres-Pawlis, S.; Bach, F.; Bruno, I. J.; Chalk, S. J.; Jung, N.; Liermann, J. C.; McEwen, L. R.; Neumann, S.; Steinbeck, C.; Razum, M.; Koepler, O. Minimum information standards in chemistry: A call for better research data management practices. Angew. Chem. Int. Ed. 61, e202203038 (2022). DOI: 10.1002/anie.202203038
Research data management (RDM) is needed to assist experimental advances and data collection in the chemical sciences. Many funders require RDM because experiments are often paid for by taxpayers and the resulting data should be deposited sustainably for posterity. However, paper notebooks are still common in laboratories and research data is often stored in proprietary and/or dead-end file formats without experimental context. Data must mature beyond a mere supplement to a research paper. Electronic lab notebooks (ELN) and laboratory information management systems (LIMS) allow researchers to manage data better and they simplify research and publication. Thus, an agreement is needed on minimum information standards for data handling to support structured approaches to data reporting. As digitalization becomes part of curricular teaching, future generations of digital native chemists will embrace RDM and ELN as an organic part of their research.
Publication
Moreno, P.; Beisken, S.; Harsha, B.; Muthukrishnan, V.; Tudose, I.; Dekker, A.; Dornfeldt, S.; Taruttis, F.; Grosse, I.; Hastings, J.; Neumann, S.; Steinbeck, C. BiNChE: A web tool and library for chemical enrichment analysis based on the ChEBI ontology. BMC Bioinformatics 16, 56 (2015). DOI: 10.1186/s12859-015-0486-3
Background: Ontology-based enrichment analysis aids in the interpretation and understanding of large-scale biological data. Ontologies are hierarchies of biologically relevant groupings. Using ontology annotations, which link ontology classes to biological entities, enrichment analysis methods assess whether there is a significant over- or under-representation of entities for ontology classes. While many tools exist that run enrichment analysis for protein sets annotated with the Gene Ontology, there are only a few that can be used for small-molecule enrichment analysis.
Results: We describe BiNChE, an enrichment analysis tool for small molecules based on the ChEBI Ontology. BiNChE displays an interactive graph that can be exported as a high-resolution image or in network formats. The tool provides plain, weighted and fragment analysis based on either the ChEBI Role Ontology or the ChEBI Structural Ontology.
Conclusions: BiNChE aids in the exploration of large sets of small molecules produced within Metabolomics or other Systems Biology research contexts. The open-source tool provides easy and highly interactive web access to enrichment analysis with the ChEBI ontology tool and is additionally available as a standalone library.
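The over- or under-representation test mentioned in the abstract can be illustrated with a hypergeometric test, which is a common choice for enrichment analysis. The abstract does not state which statistic BiNChE applies, so the following is only a minimal sketch under that assumption; the function names and the example counts are hypothetical and not taken from the tool.

```python
from math import comb


def hypergeom_pmf(i, population, annotated, sample):
    """Probability of drawing exactly i annotated entities in the sample."""
    return (comb(annotated, i) * comb(population - annotated, sample - i)
            / comb(population, sample))


def enrichment_p_values(hits, population, annotated, sample):
    """Return (p_over, p_under) for one ontology class.

    population: all entities in the background set
    annotated:  background entities annotated with the class
    sample:     size of the submitted set
    hits:       submitted entities annotated with the class
    """
    # Smallest and largest number of hits that are possible at all.
    lowest = max(0, sample - (population - annotated))
    highest = min(sample, annotated)
    if not lowest <= hits <= highest:
        raise ValueError("hits is impossible for these counts")
    # Upper tail: at least `hits` annotated entities in the sample.
    p_over = sum(hypergeom_pmf(i, population, annotated, sample)
                 for i in range(hits, highest + 1))
    # Lower tail: at most `hits` annotated entities in the sample.
    p_under = sum(hypergeom_pmf(i, population, annotated, sample)
                  for i in range(lowest, hits + 1))
    return p_over, p_under


# Hypothetical counts: 10 background molecules, 4 of them in the class,
# 3 molecules submitted, 2 of those in the class.
p_over, p_under = enrichment_p_values(hits=2, population=10, annotated=4, sample=3)
print(f"over-representation p = {p_over:.4f}")
print(f"under-representation p = {p_under:.4f}")
```

When many ontology classes are tested at once, the resulting p-values would normally be corrected for multiple testing before being interpreted.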
Publication
Kuhn, S.; Egert, B.; Neumann, S.; Steinbeck, C. Building blocks for automated elucidation of metabolites: Machine learning methods for NMR prediction. BMC Bioinformatics 9, 400 (2008). DOI: 10.1186/1471-2105-9-400
Background: Current efforts in Metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identification of substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to leverage this lack of knowledge. Indispensable for CASE are modules to predict spectra for hypothetical structures. This paper evaluates different statistical and machine learning methods to perform predictions of proton NMR spectra based on data from our open database NMRShiftDB.
Results: A mean absolute error of 0.18 ppm was achieved for the prediction of proton NMR shifts ranging from 0 to 11 ppm. Random forest, J48 decision tree and support vector machines achieved similar overall errors. HOSE codes, being a notably simple method, achieved a comparatively good result of 0.17 ppm mean absolute error.
Conclusion: NMR prediction methods applied in the course of this work delivered precise predictions which can serve as a building block for Computer-Assisted Structure Elucidation for biological metabolites.
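The mean absolute error reported in the abstract is the average of the absolute differences between predicted and observed shifts. The sketch below shows that calculation only; the function name is hypothetical and the shift values are invented for illustration, not taken from the paper or from NMRShiftDB.

```python
def mean_absolute_error(predicted, observed):
    """Average absolute difference between paired values (here: shifts in ppm)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Invented proton shifts in ppm, for illustration only.
predicted_shifts = [1.20, 3.50, 7.30]
observed_shifts = [1.00, 3.60, 7.00]

mae = mean_absolute_error(predicted_shifts, observed_shifts)
print(f"MAE = {mae:.2f} ppm")
```

For these invented values the individual errors are about 0.2, 0.1 and 0.3 ppm, so the mean is about 0.2 ppm.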