Patch seriation to visualize data and model parameters
Published in | Journal of cheminformatics Vol. 15; no. 1; p. 78 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Cham: Springer International Publishing, 09.09.2023; BioMed Central Ltd; Springer Nature B.V; BMC |
Subjects | |
Summary: | We developed a new seriation merit function for enhancing the visual information of data matrices. A local similarity matrix is computed from the average similarity of neighbouring objects in a limited variable space, and a global function is constructed to maximize these local similarities and cluster them into patches by simple row and column ordering. The method identifies data clusters effectively when the similarity of objects is caused by a subset of the variables and these variables differ between clusters. It can be used in the presence of missing data and on data arrays with more than two dimensions. We show the feasibility of the method on QSAR, chemical, material science, food science, cheminformatics and environmental data sets in two- and three-dimensional cases. The method can also be used during the development and interpretation of artificial neural network models by seriating different features of the models. It helps to identify interpretable models by elucidating clusters of objects, variables and hidden layer neurons.
ISSN: | 1758-2946 |
DOI: | 10.1186/s13321-023-00757-1 |
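
The Summary describes ordering rows and columns so that similar neighbouring cells gather into patches. The sketch below illustrates that general idea only and is not the authors' merit function: it assumes similarity is the negative absolute difference between adjacent cells, skips pairs that involve missing values (NaN), and uses a plain random-swap hill climb as the optimizer. The names `patch_merit` and `seriate` are hypothetical.

```python
import numpy as np


def patch_merit(X):
    """Mean similarity of vertically and horizontally adjacent cells.

    Assumption: similarity = -|difference|. Pairs with a NaN are ignored,
    which is one simple way to tolerate missing data.
    """
    d_rows = np.abs(X[1:, :] - X[:-1, :])   # neighbours along rows
    d_cols = np.abs(X[:, 1:] - X[:, :-1])   # neighbours along columns
    diffs = np.concatenate([d_rows.ravel(), d_cols.ravel()])
    diffs = diffs[~np.isnan(diffs)]
    return -diffs.mean() if diffs.size else 0.0


def seriate(X, n_iter=2000, seed=0):
    """Hill climb over row and column orders; returns (rows, cols, merit)."""
    rng = np.random.default_rng(seed)
    rows = np.arange(X.shape[0])
    cols = np.arange(X.shape[1])
    best = patch_merit(X)
    for _ in range(n_iter):
        # Swap two rows or two columns at random.
        order = rows if rng.integers(2) == 0 else cols
        i, j = rng.choice(order.size, size=2, replace=False)
        order[[i, j]] = order[[j, i]]
        score = patch_merit(X[np.ix_(rows, cols)])
        if score >= best:
            best = score                    # keep the swap
        else:
            order[[i, j]] = order[[j, i]]   # undo the swap
    return rows, cols, best


# Usage: shuffle a two-block matrix, then try to recover the patches.
rng = np.random.default_rng(1)
blocks = np.kron(np.eye(2), np.ones((3, 3)))
shuffled = blocks[rng.permutation(6)][:, rng.permutation(6)]
rows, cols, merit = seriate(shuffled, n_iter=500)
reordered = shuffled[np.ix_(rows, cols)]
```

Because swaps are accepted only when the merit does not decrease, the final merit is never worse than that of the input ordering, but a hill climb of this kind can stop in a local optimum. Extending the idea to three-dimensional arrays would mean adding a third index order and a third neighbour-difference term.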