Publications
Liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) is increasingly popular for the non-targeted exploration of complex samples, where tandem mass spectrometry (MS/MS) is used to characterize the structure of unknown compounds. However, mass spectra do not always contain sufficient information to unequivocally identify the correct structure. This study investigated how much additional information can be gained using hydrogen-deuterium exchange (HDX) experiments. Exchange of the “easily exchangeable” hydrogen atoms (those connected to heteroatoms) was observed, with predominantly [M+D]+ ions in positive mode and [M-D]− ions in negative mode. To enable high-throughput processing, new scoring terms were incorporated into the in silico fragmenter MetFrag. These were initially developed on small datasets and then tested on 762 compounds of environmental interest. Pairs of spectra (normal and deuterated) were found for 593 of these substances (506 positive mode, 155 negative mode spectra). The new scoring terms resulted in 29 additional correct identifications (78 vs 49) in positive mode and an increase in top 10 rankings from 80 to 106 in negative mode. Compounds with dual functionality (polar head group, long apolar tail) exhibited dramatic retention time (RT) shifts of up to several minutes, compared with an average RT shift of 0.04 min. For a smaller dataset of 80 metabolites, top 10 rankings improved from 13 to 24 (positive mode, 57 spectra) and from 14 to 31 (negative mode, 63 spectra) when HDX information was included. The results of standard measurements were confirmed using targets and tentatively identified surfactant species in an environmental sample collected from the river Danube near Novi Sad (Serbia). The changes to MetFrag have been integrated into the command line version available at http://c-ruttkies.github.io/MetFrag, and all resulting spectra and compounds are available in online resources and in the Electronic Supplementary Material (ESM).
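As a rough illustration of the ion types mentioned above, the following sketch estimates the m/z of a fully exchanged [M+D]+ or [M-D]− ion from a neutral monoisotopic mass and the number of exchangeable hydrogens. It assumes complete exchange of every heteroatom-bound hydrogen and singly charged ions; the function name `deuterated_mz` and its parameters are hypothetical and are not part of MetFrag.

```python
# Monoisotopic masses in unified atomic mass units (u).
H_MASS = 1.00782503207
D_MASS = 2.01410177785
ELECTRON_MASS = 0.00054857991

# Mass gained by the neutral molecule for each H replaced by D.
H_TO_D_SHIFT = D_MASS - H_MASS
# Mass of a deuteron (D+), i.e. a deuterium atom minus one electron.
DEUTERON_MASS = D_MASS - ELECTRON_MASS


def deuterated_mz(neutral_mass: float, exchangeable_h: int, positive: bool = True) -> float:
    """Estimate m/z of a singly charged, fully exchanged ion.

    Assumption (not from the source): all `exchangeable_h` hydrogens
    are replaced by deuterium before ionization.
    """
    if exchangeable_h < 0:
        raise ValueError("exchangeable_h must be non-negative")
    exchanged_mass = neutral_mass + exchangeable_h * H_TO_D_SHIFT
    if positive:
        # [M+D]+: the exchanged molecule gains a deuteron.
        return exchanged_mass + DEUTERON_MASS
    if exchangeable_h < 1:
        raise ValueError("[M-D]- needs at least one exchangeable hydrogen")
    # [M-D]-: the exchanged molecule loses a deuteron.
    return exchanged_mass - DEUTERON_MASS
```

Under these assumptions, a compound with n exchangeable hydrogens shifts by (n + 1) × 1.00628 u between [M+H]+ and [M+D]+, and by (n − 1) × 1.00628 u between [M-H]− and [M-D]−, which is the kind of information a scoring term can compare against candidate structures.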
Publications
Background: Molecule identification is a crucial step in metabolomics and environmental sciences. Besides in silico fragmentation, as performed by MetFrag, machine learning and statistical methods have also evolved, showing an improvement in molecule annotation based on MS/MS data. In this work we present a new statistical scoring method in which annotations of m/z fragment peaks to fragment structures are learned in a training step. Based on a Bayesian model, two additional scoring terms are integrated into the new MetFrag2.4.5 and evaluated on the test data set of the CASMI 2016 contest. Results: The results on the 87 MS/MS spectra from positive and negative mode show a substantial improvement compared to the submissions made with the former MetFrag approach. Top1 rankings increased from 5 to 21 and Top10 rankings from 39 to 55, both higher than for CSI:IOKR, the winner of the CASMI 2016 contest. For the negative mode spectra, MetFrag’s statistical scoring outperforms all other participants that submitted results for this type of spectra. Conclusions: This study shows how statistical learning can improve molecular structure identification based on MS/MS data compared with the same method using combinatorial in silico fragmentation only. MetFrag2.4.5 performs better than the other participating approaches, especially in negative mode.
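The Top1 and Top10 figures quoted above are counts of spectra whose correct structure appears within the first k candidates. A minimal sketch of that tally follows; the helper name `count_top_k` is hypothetical, and how tied scores are converted into ranks is left to the caller, since the abstract does not specify it.

```python
from typing import Iterable, Optional


def count_top_k(ranks: Iterable[Optional[int]], k: int) -> int:
    """Count spectra whose correct candidate is ranked within the top k.

    `ranks` holds the 1-based rank of the correct structure for each
    spectrum, or None when the correct structure was not among the
    candidates at all (an assumption made for this sketch).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return sum(1 for rank in ranks if rank is not None and rank <= k)


# Illustrative ranks only; these are not data from the study.
example_ranks = [1, 3, 12, None, 10, 1]
top1 = count_top_k(example_ranks, 1)
top10 = count_top_k(example_ranks, 10)
```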
Publications
Background: Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism's metabolism. The research field is dynamic and expanding, with applications across biomedical, biotechnological, and many other applied biological domains. Its computationally intensive nature has driven requirements for open data formats, data repositories, and data analysis tools. However, the rapid progress has resulted in a mosaic of independent, and sometimes incompatible, analysis methods that are difficult to connect into a useful and complete data analysis solution. Findings: PhenoMeNal (Phenome and Metabolome aNalysis) is an advanced and complete solution to set up Infrastructure-as-a-Service (IaaS) that brings workflow-oriented, interoperable metabolomics data analysis platforms into the cloud. PhenoMeNal seamlessly integrates a wide array of existing open-source tools that are tested and packaged as Docker containers through the project's continuous integration process and deployed based on a Kubernetes orchestration framework. It also provides a number of standardized, automated, and published analysis workflows in the user interfaces Galaxy, Jupyter, Luigi, and Pachyderm. Conclusions: PhenoMeNal constitutes a keystone solution in cloud e-infrastructures available for metabolomics. PhenoMeNal is a unique and complete solution for setting up cloud e-infrastructures through easy-to-use web interfaces that can be scaled to any custom public and private cloud environment. By harmonizing and automating software installation and configuration, and through ready-to-use scientific workflow user interfaces, PhenoMeNal has succeeded in providing scientists with workflow-driven, reproducible, and shareable metabolomics data analysis platforms that are interfaced through standard data formats and representative datasets, are versioned, and have been tested for reproducibility and interoperability. The elastic implementation of PhenoMeNal further allows easy adaptation of the infrastructure to other application areas and ‘omics research domains.
Publications
Per- and polyfluoroalkyl substances (PFASs) are a large and diverse class of chemicals of great interest due to their wide commercial applicability, as well as increasing public concern regarding their adverse impacts. A common terminology for PFASs was recommended in 2011, including broad categorization and detailed naming for many PFASs with rather simple molecular structures. Recent advancements in chemical analysis have enabled identification of a wide variety of PFASs that are not covered by this common terminology. The resulting inconsistency in the categorization and naming of PFASs is preventing efficient assimilation of reported information. This article explores how a combination of expert knowledge and cheminformatics approaches could help address this challenge in a systematic manner. First, the “splitPFAS” approach was developed to systematically subdivide PFASs that follow a CnF2n+1–X–R pattern into their various parts (for eventual categorization), with a particular focus on four PFAS categories where X is CO, SO2, CH2 or CH2CH2. Then, the open, ontology-based “ClassyFire” approach was tested for potential applicability to categorizing and naming PFASs using five scenarios of original and simplified structures based on the “splitPFAS” output. This workflow was applied to a set of 770 PFASs from the latest OECD PFAS list. While splitPFAS categorized PFASs as intended, the ClassyFire results were mixed. These results reveal that open cheminformatics approaches have the potential to assist in categorizing PFASs in a consistent manner, while much development is needed for future systematic naming of PFASs. The “splitPFAS” tool and related code are publicly available, and include options to extend this proof-of-concept to encompass further PFASs in the future.
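To make the CnF2n+1–X–R pattern concrete, here is a toy sketch that splits a condensed, hyphen-delimited formula string into its chain, linker and remainder. It is not the splitPFAS implementation, which works on chemical structures; the input notation (ASCII hyphens, for example `C8F17-SO2-NH2`), the function name `split_condensed` and the `SplitResult` type are all assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional

# Linker groups X named in the abstract.
LINKERS = {"CO", "SO2", "CH2", "CH2CH2"}

# Matches a perfluoroalkyl chain written as C<n>F<m>; a missing n means 1.
_CHAIN = re.compile(r"^C(\d*)F(\d+)$")


class SplitResult(NamedTuple):
    chain_length: int  # n in CnF2n+1
    linker: str        # X
    rest: str          # R, kept as written


def split_condensed(formula: str) -> Optional[SplitResult]:
    """Split 'CnF2n+1-X-R' into (n, X, R), or return None if it does not fit."""
    parts = formula.split("-", 2)  # R may itself contain hyphens
    if len(parts) != 3:
        return None
    chain, linker, rest = parts
    match = _CHAIN.match(chain)
    if match is None or linker not in LINKERS:
        return None
    n = int(match.group(1) or 1)
    fluorines = int(match.group(2))
    if fluorines != 2 * n + 1:
        # Not a fully fluorinated CnF2n+1 chain.
        return None
    return SplitResult(n, linker, rest)
```

For instance, `split_condensed("C8F17-SO2-NH2")` yields a chain length of 8, the linker `SO2` and the remainder `NH2`, whereas a chain such as `C8F15` or a linker outside the four listed groups is rejected.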
Publications
Motivation: Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator. Results: We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and the development of scalable and interoperable workflows for metabolomics data analysis. The environment can be launched on demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validated our method in the field of metabolomics on two mass spectrometry studies, one nuclear magnetic resonance spectroscopy study and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability by integrating the major software suites, resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics, including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens up new types of large-scale integrative science. Availability and implementation: The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects.