Using genetic evolution to reconstruct language family trees

By Alice Dolin

In 1859 Charles Darwin published On the Origin of Species, famously laying out his ideas on evolution and natural selection. Almost immediately afterwards, other scholars noticed that a similar process of evolution by mutation and inheritance can describe how languages change. Languages slowly acquire new features and pass them on to their descendant languages. These descendant languages then gradually split off from each other to form new languages. A few years later, in 1863, August Schleicher formalised this idea in his family-tree or Stammbaum theory, which models language evolution as a tree that begins at a single ancestor language and then splits into more and more separate branches over time. The other main theory proposed at the time was Johannes Schmidt’s wave theory or Wellentheorie, which instead models language change as a series of waves: each new feature is innovated in one place and then spreads out geographically to nearby speakers. Ultimately, though, Schleicher’s family-tree model became the dominant theory in historical linguistics.

Since these theories were established in the 19th century, much of the work in historical linguistics has involved reconstructing the family trees of different language groups. For most of the field’s history, this has meant manual reconstruction through the Comparative Method. The Comparative Method involves comparing two languages and finding systematic differences between them that recur across many different words. The same process is then repeated across multiple languages to build a picture of how they all fit together in the tree.
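To make the idea of a recurring difference concrete, here is a minimal sketch in Python. It uses a handful of well-known English and German cognates and counts how often each pairing of initial sounds turns up. The word list, the hand-segmented initial sounds, the function names and the cut-off of three occurrences are all assumptions made for illustration; real applications of the Comparative Method work with phonetic transcriptions and whole words, not just spelling and first sounds.

```python
from collections import Counter

# Illustrative data: (English word, German word, English initial, German initial).
# Initial sounds are segmented by hand and written in ordinary spelling,
# which is a simplification of real phonetic comparison.
COGNATE_PAIRS = [
    ("ten", "zehn", "t", "z"),
    ("two", "zwei", "t", "z"),
    ("tongue", "Zunge", "t", "z"),
    ("tooth", "Zahn", "t", "z"),
    ("day", "Tag", "d", "t"),
    ("deep", "tief", "d", "t"),
    ("daughter", "Tochter", "d", "t"),
    ("drink", "trinken", "d", "t"),
    ("pound", "Pfund", "p", "pf"),
    ("path", "Pfad", "p", "pf"),
]


def count_correspondences(pairs):
    """Count how often each (English initial, German initial) pairing occurs."""
    return Counter((eng, ger) for _, _, eng, ger in pairs)


def recurrent(counts, min_count=3):
    """Keep only pairings seen at least min_count times (an arbitrary cut-off)."""
    return {pair: n for pair, n in counts.items() if n >= min_count}


counts = count_correspondences(COGNATE_PAIRS)
regular = recurrent(counts)

for (eng, ger), n in sorted(regular.items()):
    print(f"English {eng} : German {ger} in {n} word pairs")
```

A pairing that recurs across many unrelated words is unlikely to be a coincidence, which is why the method treats it as evidence of shared ancestry. The p : pf pairing falls below the cut-off here only because this tiny sample contains two examples of it.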

However, in the past couple of decades a new method for historical reconstruction has emerged that has reignited the link between biology and linguistics. This method takes computer programs and mathematical models originally developed for genetic evolution and applies them to language evolution, treating the language data as though it were genetic data. This field is called computational phylogenetics, and it has led to many new classifications of different language families. It is also a controversial methodology, and some scholars reject it, mostly because the analogy between genetic and linguistic evolution is not perfect. The biggest criticism, for example, is that languages don’t just inherit features from ancestral languages; they also borrow from neighbouring languages. Phylogenetic models don’t have a good way to account for borrowing, so two unrelated languages that borrow from each other can be incorrectly grouped together as ancestrally related. The many debates and new findings make this a rapidly developing field within linguistics.
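As a rough illustration of what treating language data like genetic data can look like, the sketch below turns a small table of cognate judgements into rows of 0s and 1s, much like presence or absence of a genetic trait, and then counts the differences between each pair of languages. Everything in it is an assumption made for illustration: the four languages A to D, the cognate classes and the function names are invented, and a simple difference count stands in for the far more sophisticated statistical models that published analyses generally use.

```python
from itertools import combinations

# Hypothetical data, invented for illustration only. For each meaning, the
# number is the cognate class that each language's word belongs to; words in
# the same class are judged to descend from the same ancestral word.
COGNATE_CLASSES = {
    "water": {"A": 1, "B": 1, "C": 2, "D": 2},
    "fire":  {"A": 1, "B": 1, "C": 1, "D": 2},
    "hand":  {"A": 1, "B": 2, "C": 3, "D": 3},
    "stone": {"A": 1, "B": 1, "C": 2, "D": 2},
    "sun":   {"A": 1, "B": 1, "C": 2, "D": 3},
}


def binary_characters(data):
    """Build one 0/1 column per (meaning, cognate class) for every language."""
    languages = sorted(next(iter(data.values())))
    rows = {lang: [] for lang in languages}
    for meaning in sorted(data):
        classes = data[meaning]
        for cls in sorted(set(classes.values())):
            for lang in languages:
                rows[lang].append(1 if classes[lang] == cls else 0)
    return rows


def hamming(a, b):
    """Number of positions at which two equal-length rows differ."""
    return sum(x != y for x, y in zip(a, b))


matrix = binary_characters(COGNATE_CLASSES)
distances = {
    (p, q): hamming(matrix[p], matrix[q])
    for p, q in combinations(sorted(matrix), 2)
}
closest = min(distances, key=distances.get)

for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, d)
```

In this invented data, A and B share the most cognates and so end up closest. The sketch also shows why borrowing is a problem: if one language had borrowed several of these words from another, the count would bring the two closer together in exactly the same way as shared inheritance would.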

For my Honours thesis I am performing a computational phylogenetic analysis of several Tani languages, a group of Trans-Himalayan languages spoken predominantly in northeast India. Scholars have generally struggled to classify this group into a neat tree, so my analysis will hopefully shed some light on the history of these languages. Some previous studies have included a few Tani languages in a bigger phylogenetic analysis of the Trans-Himalayan family, but no study has yet focused this closely on the Tani group.

Alice Dolin is an Honours student in the Department of Linguistics at the University of Sydney.

Cover image: copyright Minna Sundberg. Licensed under Creative Commons.
