A graph-based clustering algorithm for software systems modularization
Clustering algorithms, as a modularization technique, are used to modularize a program aiming to understand large software systems as well as software refactoring. These algorithms partition the source code of the software system into smaller and easy-to-manage modules (clusters). The resulting deco...
Saved in:
Published in | Information and software technology Vol. 133; p. 106469 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published |
Elsevier B.V
01.05.2021
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Clustering algorithms, as a modularization technique, are used to modularize a program aiming to understand large software systems as well as software refactoring. These algorithms partition the source code of the software system into smaller and easy-to-manage modules (clusters). The resulting decomposition is called the software system structure (or software architecture). Due to the NP-hardness of the modularization problem, evolutionary clustering approaches such as the genetic algorithm have been used to solve this problem. These methods do not make much use of the information and knowledge available in the artifact dependency graph which is extracted from the source code.
To overcome the limitations of the existing modularization techniques, this paper presents a new modularization technique named GMA (Graph-based Modularization Algorithm).
In this paper, a new graph-based clustering algorithm is presented for software modularization. To this end, the depth of relationships is used to compute the similarity between artifacts, as well as seven new criteria are proposed to evaluate the quality of a modularization. The similarity presented in this paper enables the algorithm to use graph-theoretic information.
To demonstrate the applicability of the proposed algorithm, ten folders of Mozilla Firefox with different domains and functions, along with four other applications, are selected. The experimental results demonstrate that the proposed algorithm produces modularization closer to the human expert’s decomposition (i.e., directory structure) than the other existing algorithms.
The proposed algorithm is expected to help a software designer in the software reverse engineering process to extract easy-to-manage and understandable modules from source code. |
---|---|
ISSN: | 0950-5849 1873-6025 |
DOI: | 10.1016/j.infsof.2020.106469 |