MolDiA: A Novel Molecular Diversity Analysis Tool. 1. Principles and Architecture

Bibliographic Details
Published in Journal of Chemical Information and Modeling, Vol. 47, no. 6, pp. 2197-2207
Main Authors Maldonado, Ana G; Doucet, Jean-Pierre; Petitjean, Michel; Fan, Bo-Tao
Format Journal Article
Language English
Published United States: American Chemical Society, 01.11.2007
Summary: We introduce the principles and architecture of MOLDIA (Molecular Diversity Analysis), a user-friendly software tool for comparing diverse molecular data sets through an XML-structured database of predefined fragments. The MOLDIA descriptors are complex fingerprint-like structures that encode not only structural information but also physicochemical property data. The system architecture supports customizable weights on molecular descriptors and a choice of similarity/diversity measures for analyzing the given data sets. Intermolecular comparisons using Ullmann's algorithm were optimized through fuzzy logic, generic atoms, and a full system of chemical graph analysis. We found that customizing the similarity/diversity computation with structural and/or property weights, and choosing the level of fuzziness of the molecular comparison, allows the user to adapt the tool to particular needs and broadens the range of MolDiA applications. The use of XML Web technologies has been shown to improve and ease the extraction, processing, and analysis of chemical information.
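The summary mentions customizable weights on descriptors and a choice of similarity/diversity measures but does not give a formula. As an illustration only, the sketch below assumes a weighted Tanimoto coefficient over a dictionary-based fingerprint. The feature names, the "frag:" and "prop:" prefixes, and the function names are all hypothetical and are not taken from MolDiA.

```python
from typing import Dict


def weighted_tanimoto(
    a: Dict[str, float],
    b: Dict[str, float],
    structural_weight: float = 1.0,
    property_weight: float = 1.0,
) -> float:
    """Weighted Tanimoto similarity between two fingerprint-like descriptors.

    Assumption (not from the source): features are non-negative values keyed
    by name; keys starting with "prop:" are physicochemical property bins,
    and every other key is treated as a structural fragment.
    """
    numerator = 0.0
    denominator = 0.0
    for key in set(a) | set(b):
        weight = property_weight if key.startswith("prop:") else structural_weight
        x = a.get(key, 0.0)
        y = b.get(key, 0.0)
        numerator += weight * min(x, y)    # shared part of the feature
        denominator += weight * max(x, y)  # union part of the feature
    # With nothing to compare (empty descriptors or all-zero weights),
    # this sketch reports zero similarity by convention.
    return numerator / denominator if denominator > 0 else 0.0


def diversity(similarity: float) -> float:
    """Diversity taken here as the complement of similarity."""
    return 1.0 - similarity


# Hypothetical descriptors: two fragments each plus one shared property bin.
mol_a = {"frag:benzene": 1.0, "frag:carboxyl": 1.0, "prop:logP_high": 1.0}
mol_b = {"frag:benzene": 1.0, "frag:amine": 1.0, "prop:logP_high": 1.0}

uniform = weighted_tanimoto(mol_a, mol_b)
property_heavy = weighted_tanimoto(mol_a, mol_b, property_weight=2.0)
print(uniform, property_heavy)
```

In this toy case, doubling the property weight raises the similarity because the two descriptors share their only property bin, which is the kind of user-controlled emphasis the summary describes.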
Bibliography: istex:6FB5367A083F19BBC1D95E380AA026A7A71EEEB4
ark:/67375/TPS-40L9XKS1-1
ISSN: 1549-9596, 1549-960X
DOI: 10.1021/ci700120v