
Publications - Cell and Metabolic Biology

Books and Book Chapters

Hinneburg, A.; Porzel, A.; Wolfram, K. An Evaluation of Text Retrieval Methods for Similarity Search of Multi-dimensional NMR-Spectra. Lecture Notes in Computer Science 4414, 424-438 (2007). ISBN: 978-3-540-71233-6. DOI: 10.1007/978-3-540-71233-6_33

Searching and mining nuclear magnetic resonance (NMR)-spectra of naturally occurring substances is an important task in the investigation of new, potentially useful chemical compounds. Multi-dimensional NMR-spectra are relational objects like documents, but consist of continuous multi-dimensional points called peaks instead of words. We develop several mappings from continuous NMR-spectra to discrete text-like data. With the help of those mappings, any text retrieval method can be applied. We evaluate the performance of two retrieval methods, namely the standard vector space model and probabilistic latent semantic indexing (PLSI). PLSI learns hidden topics in the data, which, in the case of 2D-NMR data, is interesting in its own right. Additionally, we develop and evaluate a simple direct similarity function, which can detect duplicates of NMR-spectra. Our experiments show that the vector space model as well as PLSI, which are both designed for text data created by humans, can effectively handle the mapped NMR-data originating from natural products. Additionally, PLSI is able to find meaningful "topics" in the NMR-data.
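
As a rough illustration of the idea (not the authors' published mappings), a 2D peak list can be turned into text-like tokens by snapping each peak to a grid cell. A vector space comparison then reduces to the cosine similarity of token counts. The function names, grid step sizes, and peak values below are hypothetical and chosen only for the example.

```python
import math
from collections import Counter


def peaks_to_words(peaks, h_step=0.5, c_step=10.0):
    """Map each (1H ppm, 13C ppm) peak to a discrete grid-cell token.

    The step sizes are illustrative assumptions, not values from the paper.
    """
    words = []
    for h_ppm, c_ppm in peaks:
        h_bin = math.floor(h_ppm / h_step)
        c_bin = math.floor(c_ppm / c_step)
        words.append(f"h{h_bin}_c{c_bin}")
    return words


def cosine_similarity(words_a, words_b):
    """Cosine similarity between raw term-frequency vectors of two token lists."""
    tf_a, tf_b = Counter(words_a), Counter(words_b)
    dot = sum(tf_a[w] * tf_b[w] for w in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical peak lists (1H shift, 13C shift), invented for this sketch.
spectrum_a = [(1.20, 22.5), (3.70, 55.1), (7.30, 128.4)]
spectrum_b = [(1.25, 23.0), (3.65, 56.0), (6.90, 115.2)]

words_a = peaks_to_words(spectrum_a)
words_b = peaks_to_words(spectrum_b)
similarity = cosine_similarity(words_a, words_b)
```

Once spectra are expressed as token lists like this, they can in principle be passed to any text retrieval pipeline. A fixed grid has the drawback that two nearly identical peaks on opposite sides of a cell boundary receive different tokens.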

Wolfram, K.; Porzel, A.; Hinneburg, A. Similarity Search for Multi-dimensional NMR-Spectra of Natural Products. Lecture Notes in Computer Science 4213, 650-658 (2006). ISBN: 978-3-540-46048-0. DOI: 10.1007/11871637_67

Searching and mining nuclear magnetic resonance (NMR)-spectra of naturally occurring products is an important task in the investigation of new, potentially useful chemical compounds. We develop a set-based similarity function, which, however, does not sufficiently capture more abstract aspects of similarity. NMR-spectra are like documents, but consist of continuous multi-dimensional points instead of words. Probabilistic latent semantic indexing (PLSI) is a retrieval method that learns hidden topics. We develop several mappings from continuous NMR-spectra to discrete text-like data. The new mappings introduce redundancies into the discrete data, which proves helpful for the PLSI model applied afterwards. Our experiments show that PLSI, which is designed for text data created by humans, can effectively handle the mapped NMR-data originating from natural products. Additionally, PLSI combined with the new mappings is able to find meaningful "topics" in the NMR-data.
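
The abstract does not spell out the set-based similarity function, so the following is only a minimal sketch of one plausible form, under stated assumptions. It treats each spectrum as a set of peaks and scores the fraction of peaks that have a counterpart within a tolerance in the other spectrum. The function names, tolerances, and peak values are hypothetical.

```python
def count_matched(peaks_a, peaks_b, h_tol, c_tol):
    """Count peaks in peaks_a that have at least one peak of peaks_b within tolerance."""
    return sum(
        1
        for h_a, c_a in peaks_a
        if any(
            abs(h_a - h_b) <= h_tol and abs(c_a - c_b) <= c_tol
            for h_b, c_b in peaks_b
        )
    )


def set_similarity(peaks_a, peaks_b, h_tol=0.1, c_tol=1.0):
    """Symmetric share of matched peaks, between 0.0 and 1.0.

    Tolerances (in ppm) are illustrative assumptions, not values from the paper.
    """
    if not peaks_a or not peaks_b:
        return 0.0
    matched_a = count_matched(peaks_a, peaks_b, h_tol, c_tol)
    matched_b = count_matched(peaks_b, peaks_a, h_tol, c_tol)
    return (matched_a + matched_b) / (len(peaks_a) + len(peaks_b))


# Hypothetical peak lists (1H shift, 13C shift), invented for this sketch.
query = [(1.20, 22.5), (3.70, 55.1), (7.30, 128.4)]
candidate = [(1.25, 23.0), (3.65, 56.0), (6.90, 115.2)]

score = set_similarity(query, candidate)
```

A measure of this kind compares peaks only position by position. That is consistent with the abstract's remark that a set-based function does not capture more abstract aspects of similarity, which motivates the topic-based PLSI approach.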