Software development is a knowledge-intensive activity, which requires mastering several languages, frameworks, technology trends (among other aspects) under the pressure of ever-increasing arrays of external libraries and resources. Recommender systems are gaining high relevance in software engineering since they aim at providing developers with real-time recommendations, which can reduce the time spent on discovering and understanding reusable artifacts from software repositories, and thus inducing productivity and quality gains. In this paper, we focus on the problem of mining open source software repositories to identify similar projects, which can be evaluated and eventually reused by developers. To this end, CrossSim is proposed as a novel approach to model open source software projects and related artifacts and to compute similarities among them. An evaluation on a dataset containing 580 GitHub projects shows that CrossSim outperforms an existing technique, which has been proven to have a good performance in detecting similar GitHub repositories.

CrossSim: Exploiting mutual relationships to detect similar OSS projects

Nguyen Phuong
Writing – Original Draft Preparation
;
Di Rocco J.;Rubei R.;Di Ruscio D.
2018

Abstract

Software development is a knowledge-intensive activity, which requires mastering several languages, frameworks, technology trends (among other aspects) under the pressure of ever-increasing arrays of external libraries and resources. Recommender systems are gaining high relevance in software engineering since they aim at providing developers with real-time recommendations, which can reduce the time spent on discovering and understanding reusable artifacts from software repositories, and thus inducing productivity and quality gains. In this paper, we focus on the problem of mining open source software repositories to identify similar projects, which can be evaluated and eventually reused by developers. To this end, CrossSim is proposed as a novel approach to model open source software projects and related artifacts and to compute similarities among them. An evaluation on a dataset containing 580 GitHub projects shows that CrossSim outperforms an existing technique, which has been proven to have a good performance in detecting similar GitHub repositories.
978-1-5386-7383-6
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11697/183471
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 15
  • ???jsp.display-item.citation.isi??? ND
social impact