

Showing results 1 to 10 of 22.

Publication

Zulfiqar, M.; Crusoe, M. R.; König-Ries, B.; Steinbeck, C.; Peters, K.; Gadelha, L.; Implementation of FAIR practices in computational metabolomics workflows—A case study Metabolites 14, 118, (2024) DOI: 10.3390/metabo14020118

Scientific workflows facilitate the automation of data analysis tasks by integrating various software and tools executed in a particular order. To enable transparency and reusability in workflows, it is essential to implement the FAIR principles. Here, we describe our experiences implementing the FAIR principles for metabolomics workflows using the Metabolome Annotation Workflow (MAW) as a case study. MAW is specified using the Common Workflow Language (CWL), allowing for the subsequent execution of the workflow on different workflow engines. MAW is registered using a CWL description on WorkflowHub. During the submission process on WorkflowHub, a CWL description is used for packaging MAW using the Workflow RO-Crate profile, which includes metadata in Bioschemas. Researchers can use this narrative discussion as a guideline to commence using FAIR practices for their bioinformatics or cheminformatics workflows while incorporating necessary amendments specific to their research area.
Publication

Zulfiqar, M.; Stettin, D.; Schmidt, S.; Nikitashina, V.; Pohnert, G.; Steinbeck, C.; Peters, K.; Sorokina, M.; Untargeted metabolomics to expand the chemical space of the marine diatom Skeletonema marinoi Front. Microbiol. 14, 1295994, (2023) DOI: 10.3389/fmicb.2023.1295994

Diatoms (Bacillariophyceae) are aquatic photosynthetic microalgae with an ecological role as primary producers in the aquatic food web. They account substantially for global carbon, nitrogen, and silicon cycling. Elucidating the chemical space of diatoms is crucial to understanding their physiology and ecology. To expand the known chemical space of a cosmopolitan marine diatom, Skeletonema marinoi, we performed High-Resolution Liquid Chromatography-Tandem Mass Spectrometry (LC-MS2) for untargeted metabolomics data acquisition. The spectral data from LC-MS2 was used as input for the Metabolome Annotation Workflow (MAW) to obtain putative annotations for all measured features. A suspect list of metabolites previously identified in the Skeletonema spp. was generated to verify the results. These known metabolites were then added to the putative candidate list from LC-MS2 data to represent an expanded catalog of 1970 metabolites estimated to be produced by S. marinoi. The most prevalent chemical superclasses, based on the ChemONT ontology in this expanded dataset, were organic acids and derivatives, organoheterocyclic compounds, lipids and lipid-like molecules, and organic oxygen compounds. The metabolic profile from this study can aid the bioprospecting of marine microalgae for medicine, biofuel production, agriculture, and environmental conservation. The proposed analysis can be applicable for assessing the chemical space of other microalgae, which can also provide molecular insights into the interaction between marine organisms and their role in the functioning of ecosystems.
Publication

Zulfiqar, M.; Gadelha, L.; Steinbeck, C.; Sorokina, M.; Peters, K.; MAW: the reproducible Metabolome Annotation Workflow for untargeted tandem mass spectrometry J. Cheminform. 15, 32, (2023) DOI: 10.1186/s13321-023-00695-y

Mapping the chemical space of compounds to chemical structures remains a challenge in metabolomics. Despite the advancements in untargeted liquid chromatography-mass spectrometry (LC–MS) to achieve a high-throughput profile of metabolites from complex biological resources, only a small fraction of these metabolites can be annotated with confidence. Many novel computational methods and tools have been developed to enable chemical structure annotation to known and unknown compounds such as in silico generated spectra and molecular networking. Here, we present an automated and reproducible Metabolome Annotation Workflow (MAW) for untargeted metabolomics data to further facilitate and automate the complex annotation by combining tandem mass spectrometry (MS2) input data pre-processing, spectral and compound database matching with computational classification, and in silico annotation. MAW takes the LC-MS2 spectra as input and generates a list of putative candidates from spectral and compound databases. The databases are integrated via the R package Spectra and the metabolite annotation tool SIRIUS as part of the R segment of the workflow (MAW-R). The final candidate selection is performed using the cheminformatics tool RDKit in the Python segment (MAW-Py). Furthermore, each feature is assigned a chemical structure and can be imported to a chemical structure similarity network. MAW is following the FAIR (Findable, Accessible, Interoperable, Reusable) principles and has been made available as the docker images, maw-r and maw-py. The source code and documentation are available on GitHub (https://github.com/zmahnoor14/MAW). The performance of MAW is evaluated on two case studies. MAW can improve candidate ranking by integrating spectral databases with annotation tools like SIRIUS which contributes to an efficient candidate selection procedure. The results from MAW are also reproducible and traceable, compliant with the FAIR guidelines. 
Taken together, MAW could greatly facilitate automated metabolite characterization in diverse fields such as clinical metabolomics and natural product discovery.
Preprints

Zulfiqar, M.; Gadelha, L.; Steinbeck, C.; Sorokina, M.; Peters, K.; MAW - The reproducible Metabolome Annotation Workflow for untargeted tandem mass Spectrometry bioRxiv (2022) DOI: 10.1101/2022.10.17.512224

Mapping the chemical space of compounds to chemical structures remains a challenge in metabolomics. Despite the advancements in untargeted liquid chromatography-mass spectrometry (LC-MS) to achieve a high-throughput profile of metabolites from complex biological resources, only a small fraction of these metabolites can be annotated with confidence. Many novel computational methods and tools have been developed to enable chemical structure annotation to known and unknown compounds such as in silico generated spectra and molecular networking. Here, we present an automated and reproducible Metabolome Annotation Workflow (MAW) for untargeted metabolomics data to further facilitate and automate the complex annotation by combining tandem mass spectrometry (MS2) input data pre-processing, spectral and compound database matching with computational classification, and in silico annotation. MAW takes the LC-MS2 spectra as input and generates a list of putative candidates from spectral and compound databases. The databases are integrated via the R package Spectra and the metabolite annotation tool SIRIUS as part of the R segment of the workflow (MAW-R). The final candidate selection is performed using the cheminformatics tool RDKit in the Python segment (MAW-Py). Furthermore, each feature is assigned a chemical structure and can be imported to a chemical structure similarity network. MAW is following the FAIR (Findable, Accessible, Interoperable, Reusable) principles and has been made available as the docker images, maw-r and maw-py. The source code and documentation are available on GitHub. The performance of MAW is evaluated on two case studies. We found that MAW can improve candidate ranking by integrating spectral databases with annotation tools like SIRIUS which contributes to an efficient candidate selection procedure. The results from MAW are also reproducible and traceable, compliant with the FAIR guidelines. 
Taken together, MAW could greatly facilitate automated metabolite characterization in diverse fields such as clinical metabolomics and natural product discovery.
Publication

Herres-Pawlis, S.; Bach, F.; Bruno, I. J.; Chalk, S. J.; Jung, N.; Liermann, J. C.; McEwen, L. R.; Neumann, S.; Steinbeck, C.; Razum, M.; Koepler, O.; Minimum information standards in chemistry: A call for better research data management practices Angew. Chem. Int. Ed. 61, e202203038, (2022) DOI: 10.1002/anie.202203038

Research data management (RDM) is needed to assist experimental advances and data collection in the chemical sciences. Many funders require RDM because experiments are often paid for by taxpayers and the resulting data should be deposited sustainably for posterity. However, paper notebooks are still common in laboratories and research data is often stored in proprietary and/or dead-end file formats without experimental context. Data must mature beyond a mere supplement to a research paper. Electronic lab notebooks (ELN) and laboratory information management systems (LIMS) allow researchers to manage data better and they simplify research and publication. Thus, an agreement is needed on minimum information standards for data handling to support structured approaches to data reporting. As digitalization becomes part of curricular teaching, future generations of digital native chemists will embrace RDM and ELN as an organic part of their research.
Publication

Steinbeck, C.; Koepler, O.; Bach, F.; Herres-Pawlis, S.; Jung, N.; Liermann, J. C.; Neumann, S.; Razum, M.; Baldauf, C.; Biedermann, F.; Bocklitz, T. W.; Boehm, F.; Broda, F.; Czodrowski, P.; Engel, T.; Hicks, M. G.; Kast, S. M.; Kettner, C.; Koch, W.; Lanza, G.; Link, A.; Mata, R. A.; Nagel, W. E.; Porzel, A.; Schlörer, N.; Schulze, T.; Weinig, H.-G.; Wenzel, W.; Wessjohann, L. A.; Wulle, S.; NFDI4Chem - Towards a National Research Data Infrastructure for Chemistry in Germany Res. Ideas Outcomes 6, e55852, (2020) DOI: 10.3897/rio.6.e55852

The vision of NFDI4Chem is the digitalisation of all key steps in chemical research to support scientists in their efforts to collect, store, process, analyse, disclose and re-use research data. Measures to promote Open Science and Research Data Management (RDM) in agreement with the FAIR data principles are fundamental aims of NFDI4Chem to serve the chemistry community with a holistic concept for access to research data. To this end, the overarching objective is the development and maintenance of a national research data infrastructure for the research domain of chemistry in Germany, and to enable innovative and easy to use services and novel scientific approaches based on re-use of research data. NFDI4Chem intends to represent all disciplines of chemistry in academia. We aim to collaborate closely with thematically related consortia. In the initial phase, NFDI4Chem focuses on data related to molecules and reactions including data for their experimental and theoretical characterisation.
This overarching goal is achieved by working towards a number of key objectives:
Key Objective 1: Establish a virtual environment of federated repositories for storing, disclosing, searching and re-using research data across distributed data sources. Connect existing data repositories and, based on a requirements analysis, establish domain-specific research data repositories for the national research community, and link them to international repositories.
Key Objective 2: Initiate international community processes to establish minimum information (MI) standards for data and machine-readable metadata as well as open data standards in key areas of chemistry. Identify and recommend open data standards in key areas of chemistry, in order to support the FAIR principles for research data. Finally, develop standards, if there is a lack.
Key Objective 3: Foster cultural and digital change towards Smart Laboratory Environments by promoting the use of digital tools in all stages of research and promote subsequent Research Data Management (RDM) at all levels of academia, beginning in undergraduate studies curricula.
Key Objective 4: Engage with the chemistry community in Germany through a wide range of measures to create awareness for and foster the adoption of FAIR data management. Initiate processes to integrate RDM and data science into curricula. Offer a wide range of training opportunities for researchers.
Key Objective 5: Explore synergies with other consortia and promote cross-cutting development within the NFDI.
Key Objective 6: Provide a legally reliable framework of policies and guidelines for FAIR and open RDM.
Publication

Emami Khoonsari, P.; Moreno, P.; Bergmann, S.; Burman, J.; Capuccini, M.; Carone, M.; Cascante, M.; de Atauri, P.; Foguet, C.; Gonzalez-Beltran, A. N.; Hankemeier, T.; Haug, K.; He, S.; Herman, S.; Johnson, D.; Kale, N.; Larsson, A.; Neumann, S.; Peters, K.; Pireddu, L.; Rocca-Serra, P.; Roger, P.; Rueedi, R.; Ruttkies, C.; Sadawi, N.; Salek, R. M.; Sansone, S.-A.; Schober, D.; Selivanov, V.; Thévenot, E. A.; van Vliet, M.; Zanetti, G.; Steinbeck, C.; Kultima, K.; Spjuth, O.; Interoperable and scalable data analysis with microservices: applications in metabolomics Bioinformatics 35, 3752-3760, (2019) DOI: 10.1093/bioinformatics/btz160

Motivation: Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator.
Results: We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and developing scalable and interoperable workflows for performing metabolomics data analysis. The environment can be launched on-demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry, one nuclear magnetic resonance spectroscopy and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability using integration of the major software suites resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens up for new types of large-scale integrative science.
Availability and implementation: The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects.
Publication

Peters, K.; Bradbury, J.; Bergmann, S.; Capuccini, M.; Cascante, M.; de Atauri, P.; Ebbels, T. M. D.; Foguet, C.; Glen, R.; Gonzalez-Beltran, A.; Günther, U. L.; Handakas, E.; Hankemeier, T.; Haug, K.; Herman, S.; Holub, P.; Izzo, M.; Jacob, D.; Johnson, D.; Jourdan, F.; Kale, N.; Karaman, I.; Khalili, B.; Emami Khoonsari, P.; Kultima, K.; Lampa, S.; Larsson, A.; Ludwig, C.; Moreno, P.; Neumann, S.; Novella, J. A.; O'Donovan, C.; Pearce, J. T. M.; Peluso, A.; Piras, M. E.; Pireddu, L.; Reed, M. A. C.; Rocca-Serra, P.; Roger, P.; Rosato, A.; Rueedi, R.; Ruttkies, C.; Sadawi, N.; Salek, R. M.; Sansone, S.-A.; Selivanov, V.; Spjuth, O.; Schober, D.; Thévenot, E. A.; Tomasoni, M.; van Rijswijk, M.; van Vliet, M.; Viant, M. R.; Weber, R. J. M.; Zanetti, G.; Steinbeck, C.; PhenoMeNal: processing and analysis of metabolomics data in the cloud GigaScience 8, giy149, (2019) DOI: 10.1093/gigascience/giy149

Background: Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism's metabolism. The research field is dynamic and expanding with applications across biomedical, biotechnological, and many other applied biological domains. Its computationally intensive nature has driven requirements for open data formats, data repositories, and data analysis tools. However, the rapid progress has resulted in a mosaic of independent, and sometimes incompatible, analysis methods that are difficult to connect into a useful and complete data analysis solution.
Findings: PhenoMeNal (Phenome and Metabolome aNalysis) is an advanced and complete solution to set up Infrastructure-as-a-Service (IaaS) that brings workflow-oriented, interoperable metabolomics data analysis platforms into the cloud. PhenoMeNal seamlessly integrates a wide array of existing open-source tools that are tested and packaged as Docker containers through the project's continuous integration process and deployed based on a kubernetes orchestration framework. It also provides a number of standardized, automated, and published analysis workflows in the user interfaces Galaxy, Jupyter, Luigi, and Pachyderm.
Conclusions: PhenoMeNal constitutes a keystone solution in cloud e-infrastructures available for metabolomics. PhenoMeNal is a unique and complete solution for setting up cloud e-infrastructures through easy-to-use web interfaces that can be scaled to any custom public and private cloud environment. By harmonizing and automating software installation and configuration and through ready-to-use scientific workflow user interfaces, PhenoMeNal has succeeded in providing scientists with workflow-driven, reproducible, and shareable metabolomics data analysis platforms that are interfaced through standard data formats, representative datasets, versioned, and have been tested for reproducibility and interoperability. The elastic implementation of PhenoMeNal further allows easy adaptation of the infrastructure to other application areas and ‘omics research domains.
Preprints

Peters, K.; Bradbury, J.; Bergmann, S.; Capuccini, M.; Cascante, M.; de Atauri, P.; Ebbels, T. M. D.; Foguet, C.; Glen, R.; Gonzalez-Beltran, A.; Guenther, U.; Handakas, E.; Hankemeier, T.; Haug, K.; Herman, S.; Holub, P.; Izzo, M.; Jacob, D.; Johnson, D.; Jourdan, F.; Kale, N.; Karaman, I.; Khalili, B.; Emami Khoonsari, P.; Kultima, K.; Lampa, S.; Larsson, A.; Ludwig, C.; Moreno, P.; Neumann, S.; Novella, J. A.; O'Donovan, C.; Pearce, J. T. M.; Peluso, A.; Pireddu, L.; Piras, M. E.; Reed, M. A. C.; Rocca-Serra, P.; Roger, P.; Rosato, A.; Rueedi, R.; Ruttkies, C.; Sadawi, N.; Salek, R.; Sansone, S.-A.; Selivanov, V.; Spjuth, O.; Schober, D.; Thévenot, E. A.; Tomasoni, M.; van Rijswijk, M.; van Vliet, M.; Viant, M. R.; Weber, R. J. M.; Zanetti, G.; Steinbeck, C.; PhenoMeNal: Processing and analysis of Metabolomics data in the Cloud bioRxiv (2018) DOI: 10.1101/409151

Background: Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism’s metabolism. The research field is dynamic and expanding with applications across biomedical, biotechnological and many other applied biological domains. Its computationally-intensive nature has driven requirements for open data formats, data repositories and data analysis tools. However, the rapid progress has resulted in a mosaic of independent – and sometimes incompatible – analysis methods that are difficult to connect into a useful and complete data analysis solution.
Findings: The PhenoMeNal (Phenome and Metabolome aNalysis) e-infrastructure provides a complete, workflow-oriented, interoperable metabolomics data analysis solution for a modern infrastructure-as-a-service (IaaS) cloud platform. PhenoMeNal seamlessly integrates a wide array of existing open source tools which are tested and packaged as Docker containers through the project’s continuous integration process and deployed based on a kubernetes orchestration framework. It also provides a number of standardized, automated and published analysis workflows in the user interfaces Galaxy, Jupyter, Luigi and Pachyderm.
Conclusions: PhenoMeNal constitutes a keystone solution in cloud infrastructures available for metabolomics. It provides scientists with a ready-to-use, workflow-driven, reproducible and shareable data analysis platform harmonizing the software installation and configuration through user-friendly web interfaces. The deployed cloud environments can be dynamically scaled to enable large-scale analyses which are interfaced through standard data formats, versioned, and have been tested for reproducibility and interoperability. The flexible implementation of PhenoMeNal allows easy adaptation of the infrastructure to other application areas and ‘omics research domains.
Publication

Schober, D.; Jacob, D.; Wilson, M.; Cruz, J. A.; Marcu, A.; Grant, J. R.; Moing, A.; Deborde, C.; de Figueiredo, L. F.; Haug, K.; Rocca-Serra, P.; Easton, J.; Ebbels, T. M. D.; Hao, J.; Ludwig, C.; Günther, U. L.; Rosato, A.; Klein, M. S.; Lewis, I. A.; Luchinat, C.; Jones, A. R.; Grauslys, A.; Larralde, M.; Yokochi, M.; Kobayashi, N.; Porzel, A.; Griffin, J. L.; Viant, M. R.; Wishart, D. S.; Steinbeck, C.; Salek, R. M.; Neumann, S.; nmrML: A Community Supported Open Data Standard for the Description, Storage, and Exchange of NMR Data Anal. Chem. 90, 649-656, (2018) DOI: 10.1021/acs.analchem.7b02795

NMR is a widely used analytical technique with a growing number of repositories available. As a result, demands for a vendor-agnostic, open data format for long-term archiving of NMR data have emerged with the aim to ease and encourage sharing, comparison, and reuse of NMR data. Here we present nmrML, an open XML-based exchange and storage format for NMR spectral data. The nmrML format is intended to be fully compatible with existing NMR data for chemical, biochemical, and metabolomics experiments. nmrML can capture raw NMR data, spectral data acquisition parameters, and where available spectral metadata, such as chemical structures associated with spectral assignments. The nmrML format is compatible with pure-compound NMR data for reference spectral libraries as well as NMR data from complex biomixtures, i.e., metabolomics experiments. To facilitate format conversions, we provide nmrML converters for Bruker, JEOL and Agilent/Varian vendor formats. In addition, easy-to-use Web-based spectral viewing, processing, and spectral assignment tools that read and write nmrML have been developed. Software libraries and Web services for data validation are available for tool developers and end-users. The nmrML format has already been adopted for capturing and disseminating NMR data for small molecules by several open source data processing tools and metabolomics reference spectral libraries, e.g., serving as storage format for the MetaboLights data repository. The nmrML open access data standard has been endorsed by the Metabolomics Standards Initiative (MSI), and we here encourage user participation and feedback to increase usability and make it a successful standard.