MolDiA: A Novel Molecular Diversity Analysis Tool. 1. Principles and Architecture
Published in | Journal of chemical information and modeling Vol. 47; no. 6; pp. 2197 - 2207 |
---|---|
Format | Journal Article |
Language | English |
Published | United States: American Chemical Society, 01.11.2007 |
Summary: | We introduce the principles and architecture of MolDiA (Molecular Diversity Analysis), a user-friendly software tool that compares diverse molecular data sets through an XML-structured database of predefined fragments. The MolDiA descriptors are complex fingerprint-like structures that encode not only structural information but also physicochemical property data. The system architecture supports customizable weights on molecular descriptors and a choice of similarity/diversity measures for analyzing the given data sets. Intermolecular comparisons using Ullmann's algorithm were optimized by the use of fuzzy logic, generic atoms, and a complete system of chemical graph analysis. We have found that customizing the similarity/diversity computation with structural and/or property weights, and choosing the level of fuzziness of the molecular comparison, allows the user to adapt the tool to particular needs and broadens the range of MolDiA applications. The implementation of XML Web technologies has improved and eased the extraction, processing, and analysis of chemical information. |
---|---|
Bibliography: | istex:6FB5367A083F19BBC1D95E380AA026A7A71EEEB4 ark:/67375/TPS-40L9XKS1-1 |
ISSN: | 1549-9596 1549-960X |
DOI: | 10.1021/ci700120v |
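
The summary mentions customizable structural and property weights applied to a similarity/diversity measure. The sketch below illustrates that general idea only; it is not MolDiA's implementation. The Tanimoto coefficient, the range-normalized property comparison, and every name used (`tanimoto`, `property_similarity`, `weighted_similarity`, `ranges`, the fragment keys) are assumptions chosen for illustration.

```python
from typing import Dict, Set


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of fragment keys."""
    if not a and not b:
        return 1.0  # two empty fragment sets are treated as identical
    return len(a & b) / len(a | b)


def property_similarity(
    p: Dict[str, float], q: Dict[str, float], ranges: Dict[str, float]
) -> float:
    """Mean of (1 - range-normalized absolute difference) over shared properties."""
    shared = sorted(set(p) & set(q))
    if not shared:
        return 0.0
    total = 0.0
    for name in shared:
        diff = abs(p[name] - q[name]) / ranges[name]
        total += 1.0 - min(diff, 1.0)  # clamp so each term stays in [0, 1]
    return total / len(shared)


def weighted_similarity(
    frags_a: Set[str],
    props_a: Dict[str, float],
    frags_b: Set[str],
    props_b: Dict[str, float],
    ranges: Dict[str, float],
    w_struct: float = 0.5,
    w_prop: float = 0.5,
) -> float:
    """Combine structural and property similarity with user-chosen weights."""
    total = w_struct + w_prop
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    s = tanimoto(frags_a, frags_b)
    p = property_similarity(props_a, props_b, ranges)
    return (w_struct * s + w_prop * p) / total


if __name__ == "__main__":
    # Hypothetical fragment keys and property values, for illustration only.
    mol_a = ({"benzene", "carboxyl"}, {"logP": 1.0})
    mol_b = ({"carboxyl", "amine"}, {"logP": 2.0})
    ranges = {"logP": 4.0}
    sim = weighted_similarity(mol_a[0], mol_a[1], mol_b[0], mol_b[1], ranges)
    print(f"similarity = {sim:.3f}, diversity = {1.0 - sim:.3f}")
```

Setting `w_prop=0` reduces the score to the purely structural coefficient, which is one way to read the summary's point that the weighting lets the user adapt the comparison to particular needs.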