An automated approach to assess the similarity of GitHub repositories