Improving Regression Models by Dissimilarity Representation of Bio-chemical Data

Bibliographic Details
Published in	Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications Vol. 11401; pp. 64 - 71
Main Authors	Silva-Mata, Francisco Jose; Jiménez, Catherine; Barcas, Gabriela; Estevez-Bresó, David; Acosta-Mendoza, Niusvel; Gago-Alonso, Andres; Talavera-Bustamante, Isneri
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2019
Series	Lecture Notes in Computer Science
Subjects	Bio-chemical data; Dissimilarity representation; Regression
Summary:	The determination of characteristics by regression models using bio-chemical data from analytical techniques such as Near Infrared Spectrometry and Nuclear Magnetic Resonance is a common activity within the recognition of substances and their chemical-physical properties. The data obtained from these techniques are commonly represented as vectors, which ignores the continuous nature of the data and the correlation between variables. This affects the regression modeling and calibration processes. To solve these problems, alternative representations of the data have previously been used with good results, such as those based on functions and those based on dissimilarity representation. In classification, the dissimilarity-based alternative has improved the efficiency of the processes, but experience in regression with this representation is scarce. For this reason, in this paper, in order to improve the quality of the regression models, we combine the dissimilarity representation with adequate data pre-processing; in our case, we use classical Partial Least Squares regression as the modeling method. The results were evaluated using the coefficient of determination $R^2$ for each case, and a statistical analysis of them is performed.
ISBN:	9783030134686; 3030134687
ISSN:	0302-9743; 1611-3349
DOI:	10.1007/978-3-030-13469-3_8
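
The two ideas named in the summary, a dissimilarity representation of spectra and evaluation by the coefficient of determination, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the toy spectra, the use of Euclidean distance as the dissimilarity measure, the choice of the first two samples as prototypes, and the helper names `dissimilarity_representation` and `r_squared`. The regression step itself is left out; a Partial Least Squares model would be fitted on the rows of `D` in place of the raw vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length spectra (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_representation(samples, prototypes, dist=euclidean):
    """Map each sample to its vector of dissimilarities to the prototypes.

    Row i, column j holds dist(samples[i], prototypes[j]).
    """
    return [[dist(s, p) for p in prototypes] for s in samples]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy "spectra" (made-up numbers, three variables each).
spectra = [
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [1.0, 2.0, 2.0],
]

# Hypothetical prototype choice: the first two samples.
prototypes = spectra[:2]

# Each spectrum is now described by its distances to the prototypes.
D = dissimilarity_representation(spectra, prototypes)

# R^2 is 1.0 for perfect predictions and 0.0 when every prediction
# equals the mean of the observed values.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

In this sketch the number of columns of `D` equals the number of prototypes, so the dimensionality of the regression input is set by the prototype set and not by the number of measured variables.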