Researchers have developed new software that reconstructs extinct ancient languages using probabilistic models of sound change.

Linguists typically reconstruct a language by noting patterns of variation that have built up over thousands of years. A sound change in one word typically affects other words containing a similar sound, so these patterns are regular enough to be detected both by linguists and by a computer system. The manual process linguists currently use is slow and labor-intensive, whereas the new system can work through the data very quickly.
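To illustrate why regular sound changes leave a detectable pattern, here is a minimal sketch that tallies which sounds line up across pairs of related words. It is only a counting illustration, not the probabilistic model the researchers built; the function name `correspondence_counts` and the word pairs are made up for this example and are not real language data.

```python
from collections import Counter

def correspondence_counts(word_pairs):
    """Count position-by-position sound mismatches between paired words.

    Assumption: both words in a pair have the same length, so letters can be
    compared directly. Real cognates need a proper alignment step.
    """
    counts = Counter()
    for older, newer in word_pairs:
        if len(older) != len(newer):
            continue  # skip pairs this simple sketch cannot align
        for a, b in zip(older, newer):
            if a != b:
                counts[(a, b)] += 1
    return counts

# Hypothetical toy words, invented for illustration only.
toy_pairs = [("pati", "fati"), ("puna", "funa"), ("lapu", "lafu"), ("mina", "mina")]
counts = correspondence_counts(toy_pairs)
print(counts.most_common(1))
```

In the toy data, the same mismatch recurs in every word that contains the affected sound, which is the kind of regularity that makes a change stand out from chance variation.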

The software was tested on 637 Austronesian languages currently spoken in Asia and the Pacific, with the aim of reconstructing the ancient languages they descend from. The system produced a relatively accurate, large-scale automatic reconstruction of the protolanguages. A language believed to be roughly seven thousand years old was reconstructed using a database of 142,000 words.

When compared with the work of linguists specializing in Austronesian languages, more than 85% of the system's reconstructions were found to be within one character of the manual reconstruction. These are very promising results. However, a linguist can still achieve higher accuracy, so the system is likely to become a tool used by linguists rather than a replacement for them.
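One plausible reading of "within one character" is an edit distance of at most one between the automatic and the manual reconstruction. The article does not spell out the exact metric, so the sketch below assumes plain Levenshtein distance; the helper names `edit_distance` and `share_within` and the word pairs are hypothetical placeholders, not real reconstructions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def share_within(pairs, max_distance=1):
    """Fraction of (automatic, manual) pairs within max_distance edits."""
    hits = sum(1 for auto, manual in pairs
               if edit_distance(auto, manual) <= max_distance)
    return hits / len(pairs)

# Hypothetical (automatic, manual) pairs, invented for illustration only.
toy_results = [("mata", "mata"), ("lima", "rima"), ("taliga", "telinga")]
print(share_within(toy_results))
```

Here the first two toy pairs differ by zero and one edit, while the third needs two, so two of the three pairs count as being within one character.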

Dr. Dan Klein, an associate professor at the University of California, Berkeley, commented, "Our system still has shortcomings. For example, it can't handle morphological changes or re-duplications - how a word like 'cat' becomes 'kitty-cat'. At a much deeper level, our system doesn't explain why or how certain changes happened, only that they probably did happen."

Abstract at PNAS
BBC News