Fuzzy clustering and fuzzy c-means partition cluster analysis and validation studies on a subset of citescore dataset


Bibliographic Details
Published in International journal of electrical and computer engineering (Malacca, Malacca) Vol. 9; no. 4; p. 2760
Main Authors Rajkumar, K. Varada, Yesubabu, Adimulam, Subrahmanyam, K.
Format Journal Article
Language English
Published Yogyakarta IAES Institute of Advanced Engineering and Science 01.08.2019
ISSN 2088-8708
DOI 10.11591/ijece.v9i4.pp2760-2770

Summary: A hard-partition clustering algorithm assigns a point that is equally distant from several clusters to exactly one of them, even though such a datum could plausibly belong to more than one cluster at once. Fuzzy cluster analysis instead assigns membership coefficients to data points that are equidistant between two clusters, so that a data point can belong to more than one cluster at the same time. For a subset of the CiteScore dataset, the fuzzy clustering (fanny) and fuzzy c-means (fcm) algorithms were implemented to study the data points that lie equally distant from each other. Before the analysis, the clusterability of the dataset was evaluated with the Hopkins statistic, which gave 0.4371, a value below 0.5, taken here to indicate that the data are highly clusterable. The optimal number of clusters was determined using the NbClust package, in which 9 indices proposed a 3-cluster solution as the best. Further, an appropriate value of the fuzziness parameter m was sought by examining how the distribution of membership values changes as m varies from 1 to 2. The coefficient of variation (CV), also known as relative variability, was evaluated to study the spread of the data. The time complexity of the fuzzy clustering (fanny) and fuzzy c-means algorithms was evaluated by keeping the number of data points constant and varying the number of clusters.
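
As a rough illustration of the membership idea described in the summary, the sketch below implements the standard fuzzy c-means update in plain Python. It is not the authors' code: the study used the R routines fanny and fcm, and the function names, the toy coordinates, and the initialisation (centers drawn from the data points) are assumptions made for this example only. It shows that a point equidistant from two centers receives equal membership in both, and that memberships become crisper as m approaches 1.

```python
import math
import random


def memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (requires m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to it
    # (shared equally if several centers coincide).
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Alternate membership and center updates until the centers stop moving."""
    rng = random.Random(seed)
    centers = rng.sample(points, c)  # assumed initialisation: c data points
    dim = len(points[0])
    for _ in range(iters):
        u = [memberships(p, centers, m) for p in points]
        new_centers = []
        for i in range(c):
            # Each point pulls center i with weight u_ik^m.
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, [memberships(p, centers, m) for p in points]


# Toy data (hypothetical, not the CiteScore subset).
toy_centers = [(0.0, 0.0), (4.0, 0.0)]
print(memberships((2.0, 0.0), toy_centers))  # equidistant point: [0.5, 0.5]
for m in (1.2, 1.5, 2.0):
    # The same off-center point gets a softer split as m grows.
    print(m, memberships((1.0, 0.0), toy_centers, m))
```

A hard partition would have to place the point (2.0, 0.0) in one cluster arbitrarily; the membership vector keeps the tie visible, which is the behaviour the summary attributes to fuzzy clustering.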