CrossSim: Exploiting Mutual Relationships to Detect Similar OSS Projects

Di Rocco, Juri; Rubei, Riccardo; Di Ruscio, Davide
2018-01-01

Abstract

Software development is a knowledge-intensive activity that requires mastering several languages, frameworks, and technology trends (among other aspects) under the pressure of an ever-increasing array of external libraries and resources. Recommender systems are gaining relevance in software engineering because they aim to provide developers with real-time recommendations, which can reduce the time spent discovering and understanding reusable artifacts from software repositories, thereby yielding productivity and quality gains. In this paper, we focus on the problem of mining open source software repositories to identify similar projects, which developers can evaluate and eventually reuse. To this end, we propose CrossSim, a novel approach to modeling open source software projects and their related artifacts and to computing similarities among them. An evaluation on a dataset of 580 GitHub projects shows that CrossSim outperforms an existing technique that has been shown to perform well in detecting similar GitHub repositories.
2018
978-1-5386-7383-6

Use this identifier to cite or link to this document: https://hdl.handle.net/11697/128312