Architecture Recovery using Partitioning Clustering Technique
Thesis posted on 2016-07-01, 00:00, authored by Alessandro Chetta
Software development has evolved quickly over the last decades to satisfy market needs. The software market imposes constraints such as strict deadlines, high reliability and performance, and limited budgets. Developer teams therefore face more challenging problems and implement larger projects in order to deliver competitive software that satisfies these constraints. As a result, team sizes are growing. A large development team contains subteams such as a design and implementation team, a debugging team, and a testing team, so information sharing among team members is crucial to meeting market demand. A software architecture describes a software system at a high level of abstraction, and it is an effective means of sharing information alongside natural-language documentation. In addition, software architectures are descriptive, language independent, and often use graphical representations. Nonetheless, the software architecture is not always available or up to date. In this case, architecture recovery seeks to recover the software architecture using reverse engineering techniques: the recovery process takes the raw source code of the software as input and extracts an architecture from it. The state of the art in software architecture recovery does not yet provide a reliable solution, which motivated additional research on the application of clustering algorithms. My research contribution is a novel technique able to recover the architecture of a large software project from its source code. I named this tool K-recovery after the K-means clustering algorithm. I have also empirically assessed the effectiveness of my method. A challenge in these empirical evaluations is the lack of a ground truth, that is, knowledge of the actual software architecture that my method is recovering. To evaluate the recovered architectures, I manually recovered the architectures of the analyzed projects and then compared the two.
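As an illustration of the clustering idea behind the name K-recovery, the following minimal sketch partitions classes into components with a plain K-means loop. It is not the K-recovery implementation: the abstract does not say which features are extracted from the source code, so the feature vectors, class names, and the helper `recover_components` below are hypothetical, and the centroids are seeded with the first k points only to keep the example deterministic.

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Hypothetical per-class features: how often each class references
# (UI toolkit types, game-state types, input-event types).
# Names and numbers are invented for illustration only.
FEATURES = {
    "GameBoard":    (0, 5, 0),
    "BoardView":    (6, 1, 0),
    "InputHandler": (0, 1, 5),
    "Player":       (0, 4, 0),
    "ScoreView":    (5, 1, 0),
    "GameLoop":     (1, 1, 4),
}


def kmeans(points, k, max_iter=100):
    """Lloyd's K-means; assumes k <= len(points).

    Centroids are seeded with the first k points, a simplification
    chosen for reproducibility (random or k-means++ seeding is more usual).
    Returns one cluster label per input point.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # no change, so the partition is stable
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def recover_components(features, k):
    """Group class names by the cluster label K-means assigns to them."""
    names = list(features)
    labels = kmeans([features[n] for n in names], k)
    return [[n for n, lab in zip(names, labels) if lab == c] for c in range(k)]


components = recover_components(FEATURES, k=3)
for component in components:
    print(component)
```

With these invented features, the three resulting groups correspond to model-like, view-like, and controller-like classes; on real code the outcome depends heavily on the chosen features and on the value of k.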
K-recovery successfully recovered the MVC components of a video game. Moreover, it recovered the architectures of two software projects with 75% average accuracy. The analyzed software projects have more than 10K SLOC, and the largest contains more than 80 classes. Furthermore, I implemented cluster cohesion and coupling metrics that provide an evaluation of cluster quality.
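To make the notion of cluster cohesion and coupling concrete, here is a minimal sketch of one common, density-based way to define them over a class dependency graph. The abstract does not state which metrics the thesis implements, so the definitions, the function names `cohesion` and `coupling`, and the example classes and dependencies are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical dependencies (user, used) between classes of a small game.
DEPS = {
    ("GameBoard", "Player"),
    ("GameBoard", "Score"),
    ("BoardView", "GameBoard"),
    ("ScoreView", "BoardView"),
    ("GameLoop", "InputHandler"),
    ("GameLoop", "GameBoard"),
}

# Hypothetical clusters, e.g. as produced by a clustering step.
MODEL = {"GameBoard", "Player", "Score"}
VIEW = {"BoardView", "ScoreView"}
CONTROLLER = {"GameLoop", "InputHandler"}


def linked(a, b, deps):
    """True if a depends on b or b depends on a (direction is ignored)."""
    return (a, b) in deps or (b, a) in deps


def cohesion(cluster, deps):
    """Share of class pairs inside the cluster that are linked (0.0 to 1.0)."""
    pairs = list(combinations(sorted(cluster), 2))
    if not pairs:  # a single-class cluster is treated as fully cohesive
        return 1.0
    return sum(linked(a, b, deps) for a, b in pairs) / len(pairs)


def coupling(cluster_a, cluster_b, deps):
    """Share of cross-cluster class pairs that are linked (0.0 to 1.0)."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    if not pairs:
        return 0.0
    return sum(linked(a, b, deps) for a, b in pairs) / len(pairs)


print("cohesion(MODEL)        =", round(cohesion(MODEL, DEPS), 3))
print("coupling(MODEL, VIEW)  =", round(coupling(MODEL, VIEW, DEPS), 3))
print("coupling(VIEW, CONTR.) =", round(coupling(VIEW, CONTROLLER, DEPS), 3))
```

Under these definitions, a good partition shows high cohesion within each cluster and low coupling between clusters; other formulations (for example, weighting by the number of calls) are equally plausible.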