Automated languages phylogeny from Levenshtein distance - Filogenia automatizada de linguas a partir da distancia de Levenshtein

SERVA, Maurizio
2009-01-01

Abstract

Languages evolve over time in a process in which reproduction, mutation and extinction are all possible, much as happens with living organisms. Using this similarity it is possible, in principle, to build family trees that show the degree of relatedness between languages. The method used by modern glottochronology, developed by Swadesh in the 1950s, measures distances from the percentage of words with a common historical origin. The weak point of this method is that subjective judgment plays a significant role. Recently we proposed an automated method that avoids this subjectivity, whose results can be replicated by any study that uses the same database, and that does not require specialized linguistic knowledge. Moreover, the method allows a quick comparison of a large number of languages. We applied our method to the Indo-European and Austronesian families, considering fifty different languages in each case. The resulting trees are similar to those of previous studies, but with some important differences in the positions of a few languages and subgroups. We believe that these differences carry new information on the structure of the tree and on the phylogenetic relationships within families.
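The abstract does not spell out the computation, so the following is only a minimal sketch of how a Levenshtein-based distance between two languages might be obtained. It assumes that each language is represented by a list of words for the same meanings in the same order, that each word pair is compared by edit distance divided by the length of the longer word, and that the language distance is the average over all pairs. The function names and the three-word toy lists are hypothetical illustrations, not data from the study.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer word's length (assumed
    normalization), giving a value between 0 and 1."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def language_distance(words_a: list[str], words_b: list[str]) -> float:
    """Average normalized distance over word pairs with the same meaning.
    Both lists must be aligned by meaning and have the same length."""
    if len(words_a) != len(words_b) or not words_a:
        raise ValueError("word lists must be non-empty and of equal length")
    pair_distances = [normalized_distance(x, y) for x, y in zip(words_a, words_b)]
    return sum(pair_distances) / len(pair_distances)


if __name__ == "__main__":
    # Hypothetical toy lists (meanings: night, water, mother).
    toy_english = ["night", "water", "mother"]
    toy_german = ["nacht", "wasser", "mutter"]
    print(language_distance(toy_english, toy_german))
```

Computing this distance for every pair of languages yields a distance matrix, which a standard clustering procedure could then turn into a tree. Because no cognate judgments are involved, anyone with the same word lists would obtain the same numbers, which is the replicability the abstract emphasizes.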
Use this identifier to cite or link to this document: https://hdl.handle.net/11697/31321