Classification of Similarly Colored Medicinal Berries using Hyperspectral Images and Machine Learning Models
Published in | Weon'ye gwahag gi'sulji Vol. 42; no. 3; pp. 249 - 263 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published | 한국원예학회 (Korean Society for Horticultural Science), HST, 01.01.2024 |
Subjects | |
ISSN | 1226-8763 2465-8588 |
DOI | 10.7235/HORT.20240022 |
Summary: | As the misuse of medicinal plants increases because plants and fruits with similar external characteristics (color, size, shape) are misclassified, accurate identification techniques must be developed. Spectral information can be used to identify various characteristics of medicinal plants in wavelength ranges that cannot be seen by the naked eye. This study develops a non-destructive identification and classification technology for medicinal plants that combines hyperspectral imaging with machine learning models to eliminate the misidentification of medicinal berries that are very similar in size, shape, and color. Four models were used to classify the plant species: logistic regression (LR), K-nearest neighbor (KNN), decision tree (DT), and random forest (RF). The optimal classification model was selected based on classification performance indicators. Dried fruits of four medicinal plant species were used: Cornus officinalis, Lycium chinense, Lycium barbarum, and Schisandra chinensis. Hyperspectral images of the samples were acquired in 150 wavelength bands in the 400–1000 nm range. For the training dataset, the average reflectance spectrum of each berry was extracted. Accuracy, F1 score, the confusion matrix, and the receiver operating characteristic (ROC) curve were used to evaluate the performance of each classification model. The LR model performed best, with an accuracy of 0.99 and an area under the curve (AUC) value of 1 for all samples. The classification system based on the LR model is accurate, fast, and non-destructive. The machine-learning-based hyperspectral imaging classification system can be applied and scaled up to the industrial level, effectively eliminating the misuse of medicinal plants through accurate identification. KCI Citation Count: 0 |
---|---|
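
The summary describes three steps that can be illustrated in code: averaging the reflectance spectrum of each berry, classifying the averaged spectra, and scoring the result with accuracy, F1 score, and a confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' implementation. It uses only the Python standard library and shows just one of the four named models (KNN). All function names are hypothetical, `k=3` is an arbitrary choice, and the summary does not state the study's preprocessing or hyperparameters.

```python
import math
from collections import Counter


def mean_spectrum(pixel_spectra):
    """Average the reflectance of all pixels of one berry, band by band.

    pixel_spectra: list of per-pixel spectra, each a list of reflectance
    values (one per wavelength band; the study reports 150 bands).
    """
    n = len(pixel_spectra)
    return [sum(band) / n for band in zip(*pixel_spectra)]


def knn_predict(train_X, train_y, x, k=3):
    """Label spectrum x by majority vote among its k nearest training
    spectra (Euclidean distance). k=3 is an assumption for illustration."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(len(matrix))) - tp
        fn = sum(matrix[i]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In use, each berry's pixels would first be reduced with `mean_spectrum`, the resulting vectors passed to `knn_predict`, and the predictions summarized with `confusion_matrix`, `accuracy`, and `macro_f1`. Comparing several models, as the study does, would repeat the scoring step once per model on the same held-out samples.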