Phylogenetic trees can be reconstructed from the matrix which contains the distances between all pairs of languages in a family. Recently, we proposed a new method which uses normalized Levenshtein distances among words with the same meaning and averages over all the items of a given list. Decisions about the number of items in the input lists for language comparison have been debated since the beginning of glottochronology. The point is that words associated with some of the meanings have a rapid lexical evolution. Therefore, a large vocabulary comparison is only apparently more accurate than a smaller one, since many of the words do not carry any useful information. In principle, one should find the optimal length of the input lists, studying the stability of the different items. In this paper we tackle the problem with an automated methodology based only on our normalized Levenshtein distance. With this approach, the program of an automated reconstruction of language relationships is completed.

Lexical evolution rates derived from automated stability measures

SERVA, Maurizio
2010-01-01

Abstract

Phylogenetic trees can be reconstructed from the matrix which contains the distances between all pairs of languages in a family. Recently, we proposed a new method which uses normalized Levenshtein distances among words with the same meaning and averages over all the items of a given list. Decisions about the number of items in the input lists for language comparison have been debated since the beginning of glottochronology. The point is that words associated with some of the meanings have a rapid lexical evolution. Therefore, a large vocabulary comparison is only apparently more accurate than a smaller one, since many of the words do not carry any useful information. In principle, one should find the optimal length of the input lists, studying the stability of the different items. In this paper we tackle the problem with an automated methodology based only on our normalized Levenshtein distance. With this approach, the program of an automated reconstruction of language relationships is completed.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11697/18172
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 5
  • ???jsp.display-item.citation.isi??? 5
social impact