Improving the performance of SOMFA by use of standard multivariate methods

Bibliographic Details
Published in: SAR and QSAR in Environmental Research, Vol. 16, no. 6, pp. 567-579
Main Authors: Korhonen, S.-P.; Tuppurainen, K.; Laatikainen, R.; Peräkylä, M.
Format: Journal Article
Language: English
Published: England, Taylor & Francis Group, 01.12.2005
Summary: Self-Organizing Molecular Field Analysis (SOMFA) comes with a built-in regression methodology, the Self-Organizing Regression (SOR), instead of relying on external methods such as PLS. In this article we present a proof of the equivalence between SOR and SIMPLS with one principal component. Thus, the modest performance of SOMFA on complex datasets can be primarily attributed to the low performance of the SOMFA regression methodology. A multi-component extension of the original SOR methodology (MCSOR) is introduced, and the performances of SOR, MCSOR and SIMPLS are compared using several datasets. The results indicate that in general the performance of SOMFA models is greatly improved if SOR is replaced with a more sophisticated regression method. The results obtained for the Cramer (CBG) dataset further underline the fact that it is a very poor benchmark dataset and should not be used to evaluate the performance of QSAR techniques.
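
Illustrative note: the summary compares SOR with SIMPLS restricted to one component. As a rough sketch of what that baseline amounts to for a single response, the Python below fits a one-component PLS model, where the weight vector is the covariance direction between the mean-centred descriptors and the activities. It does not reproduce SOR or MCSOR, which this record does not describe; the function name `simpls_one_component` and the toy data are hypothetical and chosen only for illustration.

```python
def simpls_one_component(X, y):
    """One-component PLS fit for a single response (illustrative sketch).

    X: list of n rows, each a list of p descriptor values.
    y: list of n activity values.
    Returns (coefficients, intercept) for predictions on raw descriptors.
    """
    n, p = len(X), len(X[0])

    # Mean-centre descriptors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector w = Xc^T yc. Its scale is irrelevant here because
    # the regression of yc on the scores below compensates for it.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]

    # Scores t = Xc w, one value per sample.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t)
    if tt == 0.0:
        raise ValueError("descriptors carry no covariance with the response")

    # Regress yc on the single score vector, then map back to descriptors.
    q = sum(t[i] * yc[i] for i in range(n)) / tt
    coef = [q * wj for wj in w]
    intercept = y_mean - sum(coef[j] * x_mean[j] for j in range(p))
    return coef, intercept


# Toy data (hypothetical): two identical descriptor columns.
coef, intercept = simpls_one_component([[1, 1], [2, 2], [3, 3]], [2, 4, 6])
```

With a single latent component the model can only follow one direction in descriptor space, which is consistent with the summary's point that a multi-component method can do better on complex datasets.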
ISSN: 1062-936X, 1029-046X
DOI: 10.1080/10659360500468419