A new manner to use application of Shannon Entropy in similarity computation


Bibliographic Details
Published in Journal of mathematical chemistry Vol. 49; no. 10; pp. 2330-2344
Main Author Tarko, Laszlo
Format Journal Article
Language English
Published Dordrecht Springer Netherlands 01.11.2011
Springer
Summary: The paper proposes a new way to apply Shannon entropy to similarity computation for any objects and any groups of objects. For chemical similarity, the proposed formulas use the values of one or more molecular descriptors, divided into classes (categories) by a suitable criterion. The paper proposes original criteria to distinguish between ‘saturated’, ‘non-saturated non-aromatic’ and ‘aromatic’ molecular fragments, and between ‘hydrogen-acceptor’ and ‘hydrogen-donor-acceptor’ fragments, for the purpose of classifying fragments into classes. According to the proposed formula, two molecules A and B are sufficiently similar if the Shannon entropy of the A + B aggregate is close to both the Shannon entropy of molecule A and the Shannon entropy of molecule B. The proposed similarity formula can also serve as a statistical correlation index, which is useful when the two analyzed variables have unequal numbers of values. The formula is likewise useful for quantitatively evaluating how representative any ‘sample’ is. The paper presents the chemical similarity computation for the Zimelidine/Fluoxetine/Chloramphenicol/Crufomate/Phoxim group. For comparison, it also presents a Tanimoto coefficient calculation for the same group of molecules. In addition, the paper presents two non-chemical examples of the ‘Representative Sample Index’ calculation.
ISSN:0259-9791
1572-8897
DOI:10.1007/s10910-011-9889-1
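
The similarity criterion stated in the summary can be illustrated with a minimal Python sketch. This record does not reproduce the paper's actual formulas, so everything below is an assumption made for illustration: the function names, the `entropy_gap` closeness measure, the fragment class counts, and the fingerprint sets are all hypothetical. The sketch only shows the general idea that the Shannon entropy of the A + B aggregate stays close to the entropies of A and B when their class distributions are alike, alongside the standard set-based Tanimoto coefficient for comparison.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as class -> count."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (n / total) * math.log2(n / total) for n in counts.values() if n > 0
    )


def entropy_gap(a, b):
    """Hypothetical closeness measure (not the paper's formula).

    Returns the larger of |H(A+B) - H(A)| and |H(A+B) - H(B)|, where A+B is
    the aggregate obtained by summing the class counts of A and B.
    A small gap means the aggregate's entropy is close to both entropies.
    """
    aggregate = Counter(a) + Counter(b)
    h_ab = shannon_entropy(aggregate)
    return max(abs(h_ab - shannon_entropy(a)), abs(h_ab - shannon_entropy(b)))


def tanimoto(x, y):
    """Standard Tanimoto (Jaccard) coefficient on two feature sets."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 1.0


# Made-up fragment class counts for three hypothetical molecules.
mol_a = Counter(saturated=4, aromatic=4)
mol_b = Counter(saturated=2, aromatic=2)   # same class proportions as mol_a
mol_c = Counter(saturated=8)               # a single class only

gap_ab = entropy_gap(mol_a, mol_b)  # proportions match, so the gap is zero
gap_ac = entropy_gap(mol_a, mol_c)  # proportions differ, so the gap is larger

# Made-up fingerprint feature sets for the Tanimoto comparison.
t = tanimoto({"f1", "f2", "f3"}, {"f2", "f3", "f4"})  # 2 shared / 4 total
```

In this sketch a smaller `entropy_gap` indicates greater similarity, whereas a larger Tanimoto coefficient does; the paper's own index may be scaled or normalized differently.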