Similarity Statistics for Clusterability Analysis with the Application of Cell Formation Problem

This paper proposes the use of the statistics of similarity values to evaluate the clusterability or structuredness associated with a cell formation (CF) problem. Typically, the structuredness of a CF solution cannot be known until the CF problem is solved. In this context, this paper investigates t...

Full description

Saved in:
Bibliographic Details
Published inJournal of Probability and Statistics Vol. 2018; no. 2018; pp. 1 - 17
Main Authors Zhu, Yingyu, Li, Simon
Format Journal Article
LanguageEnglish
Published Cairo, Egypt Hindawi Publishing Corporation 01.01.2018
Hindawi
John Wiley & Sons, Inc
Hindawi Limited
Wiley
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:This paper proposes the use of the statistics of similarity values to evaluate the clusterability or structuredness associated with a cell formation (CF) problem. Typically, the structuredness of a CF solution cannot be known until the CF problem is solved. In this context, this paper investigates the similarity statistics of machine pairs to estimate the potential structuredness of a given CF problem without solving it. One key observation is that a well-structured CF solution matrix has a relatively high percentage of high-similarity machine pairs. Then, histograms are used as a statistical tool to study the statistical distributions of similarity values. This study leads to the development of the U-shape criteria and the criterion based on the Kolmogorov-Smirnov test. Accordingly, a procedure is developed to classify whether an input CF problem can potentially lead to a well-structured or ill-structured CF matrix. In the numerical study, 20 matrices were initially used to determine the threshold values of the criteria, and 40 additional matrices were used to verify the results. Further, these matrix examples show that genetic algorithm cannot effectively improve the well-structured CF solutions (of high grouping efficacy values) that are obtained by hierarchical clustering (as one type of heuristics). This result supports the relevance of similarity statistics to preexamine an input CF problem instance and suggest a proper solution approach for problem solving.
ISSN:1687-952X
1687-9538
DOI:10.1155/2018/1348147