PIE-alects: Using Genetics to Trace the Origins of a Language

For many, the idea of using genetic techniques to trace the spread of a language seems abstract and unlikely. However, a recent publication by Haak et al. (2015) outlines how advances in genetic analysis have allowed scientists to reconstruct the potential rise and spread of a language through Europe by tracing the migrations of the populations that may have spoken it.

It has long been known that the majority of European languages share a common ancestor. In the 1700s, Sir William Jones noticed similarities between words in Latin, Sanskrit and German, and hypothesised that they all had the same origin. This mysterious language, known as ‘Proto-Indo-European’ (PIE), is the ancestor of over 400 languages and dialects found in Europe and Asia today.

For a language to become fixed and universally used, it has to be introduced quickly and consistently into existing populations, for example through a mass migration of people who speak the same language and then assimilate successfully into the pre-existing population. As there is no written evidence of PIE, scientists have used genetics to trace the movements of ancient peoples and see whether any migration had a large enough impact to have introduced a language.

Homo sapiens in Europe: when and where?

Homo sapiens (modern-day humans) evolved in Africa around 200,000 years ago, and reached Europe around 42,000 years ago (another very interesting topic which I may write about one day!). These early modern humans were hunter-gatherers, and lived in small-ish populations for over 30,000 years.

Agriculture as a settled lifestyle and practice arose in the Levant, in the Middle East, around 11,000 years ago. The first farmers to migrate into Europe came from Anatolia (modern-day Turkey) around 8,000 years ago, and it is likely they migrated through the Mediterranean region into central Europe. This migration forms the basis of the Anatolian hypothesis: that PIE was spoken by these early farmers, and that they introduced it into Europe. They spread widely across Europe throughout the Neolithic, and there is evidence that they interacted, and even interbred, with contemporary European hunter-gatherers.

How does genetics fit in?

Genetic analysis of individuals who lived thousands of years ago can be difficult. Few remains are usually found, and the DNA needed for analysis is often damaged or present only in small amounts. Analysing mitochondrial DNA and Y-chromosome DNA is a popular alternative to whole genome analysis:

Mitochondrial DNA (mtDNA) and Y chromosome DNA

Mitochondria (as mentioned in my previous post) have their own genome, although it is drastically reduced because many of its genes have been transferred to the host nucleus. Mitochondrial DNA is often preferred for analysis because each cell contains many mitochondria, and therefore far more copies of mtDNA than of nuclear DNA. mtDNA also mutates at a higher rate than nuclear DNA, is passed from mother to child, and is non-recombining (meaning bits don’t switch around between parent and offspring, so it forms an easily traceable lineage). Y chromosome DNA is passed from father to son, and is also non-recombining. Both mtDNA and Y chromosome DNA can be used to determine the haplotype of an individual.

What is a haplotype?

Haplotypes are defined by small, single nucleotide changes (polymorphisms) in the sequence of the mtDNA or Y chromosome. Particular haplotypes are characteristic of a certain geographical area or group of people, and arise when a population is isolated for a while. Closely related haplotypes are grouped together into haplogroups, such as R1a or G2a. Tracing the frequency of haplotypes in a region over time can show how populations change through migrations and invasions, and can be used to trace the ancient ancestry of modern-day people.
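To make the idea of "tracing the frequency of haplotypes over time" a bit more concrete, here is a minimal Python sketch. The sample list, the period labels and the function name haplogroup_frequencies are all invented for illustration; they are not data or methods from any study.

```python
from collections import Counter, defaultdict


def haplogroup_frequencies(samples):
    """Return {period: {haplogroup: fraction}} from (period, haplogroup) pairs."""
    counts = defaultdict(Counter)
    for period, haplogroup in samples:
        counts[period][haplogroup] += 1
    return {
        period: {hg: n / sum(tally.values()) for hg, n in tally.items()}
        for period, tally in counts.items()
    }


# Invented toy samples: one (period, haplogroup) pair per ancient individual.
samples = [
    ("before", "G2a"), ("before", "G2a"), ("before", "G2a"), ("before", "I2"),
    ("after", "R1a"), ("after", "R1b"), ("after", "R1b"),
    ("after", "G2a"), ("after", "I2"),
]

freqs = haplogroup_frequencies(samples)
for period, table in freqs.items():
    print(period, table)
```

A haplogroup that is absent in the earlier period but common in the later one is the kind of shift that hints at an incoming population.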

Whole genome analysis

Whole genome analysis (WGA) has recently become a popular technique for analysing ancient genomes, thanks to massive advances in sequencing technology. WGA focuses on single nucleotide polymorphisms (SNPs), which are single point mutations in the DNA sequence that are found in more than 1% of the population. WGA can analyse hundreds of thousands of these SNPs at once, and can build a picture of how different populations have moved around by comparing SNP frequencies between ancient samples.
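As a rough illustration of the 1% cut-off, the sketch below computes the frequency of a variant from genotype counts and checks whether it would count as a SNP. The genotype encoding (0, 1 or 2 copies of the variant per person), the function names and the example populations are my own assumptions for the example.

```python
def alternate_allele_frequency(genotypes):
    """Frequency of the variant allele, given 0/1/2 copies per diploid individual."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    # Each diploid individual carries two copies of the site.
    return sum(genotypes) / (2 * len(genotypes))


def is_common_snp(genotypes, threshold=0.01):
    """True if the rarer of the two alleles is found above the threshold."""
    freq = alternate_allele_frequency(genotypes)
    minor = min(freq, 1 - freq)
    return minor > threshold


# Invented populations of 200 individuals each.
rare_variant = [1] * 3 + [0] * 197            # 3 of 400 allele copies
common_variant = [2] + [1] * 10 + [0] * 189   # 12 of 400 allele copies

print(alternate_allele_frequency(rare_variant), is_common_snp(rare_variant))
print(alternate_allele_frequency(common_variant), is_common_snp(common_variant))
```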

The PIE mystery: who were the Yamnaya?

In 2015, Haak et al. published a paper that shook the foundations of the Anatolian hypothesis: could PIE have originated elsewhere? The paper reported a massive genome-wide analysis of nearly 400,000 SNPs from ancient individuals who lived between 8,000 and 3,000 years ago. These genomes were compared with those of 2,345 modern-day humans, making the analysis an order of magnitude larger than any previous study. It concluded that there was a large migration around 4,500 years ago of the Yamnaya people, early pastoralists who inhabited the Russian Steppe. From archaeological evidence, we know that the Yamnaya had domesticated horses, and probably invented the wheel. It is these forms of transport, plus a nomadic lifestyle, that make the Yamnaya a possible candidate for the introduction of PIE to Europe.

Haak et al. also conducted mtDNA and Y chromosome analyses on the ancient individuals. These supported a large-scale migration from Russia to central Europe by the Yamnaya around 4,500 years ago, as shown by the massive change in Y-chromosome haplotype distribution in Europe after the migration:


These figures show that, prior to the Yamnaya migration (8,000 years ago, or 8KYa), the Y chromosome haplogroups sampled in Russia were predominantly R1a/R1b, while those sampled in Europe were G2a…


…about 3,000 years after the Yamnaya migration, Russia still had predominantly R1a/R1b haplogroups. Europe, however, was no longer dominated by haplogroup G2a: 60% of the population had R1a/R1b haplogroups, and 40% had some combination of other haplogroups. This suggests that the Yamnaya underwent a very swift and successful migration into Europe between 8KYa and 3KYa, and that they had a profound impact on the genetic composition of Europe.

The whole genome analysis also indicated that there was a mass migration at this time, the effect of which can still be observed in modern-day Europeans. Principal component analysis (PCA) showed that modern European ancestry can be modelled as a mixture of three ancient groups: the Yamnaya pastoralists, the Anatolian farmers, and the Mesolithic hunter-gatherers. This was also demonstrated by Lazaridis et al. in 2014: genetic analysis of ancient hunter-gatherers and farmers showed that modern European populations can trace their ancestry back to varying proportions of these three groups of people.
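To give a feel for what "modelled as a mixture of three ancient groups" means, here is a toy Python sketch. It is not the method used by Haak et al. or Lazaridis et al.; it is a simple grid search that looks for the three proportions (non-negative, adding up to 1) whose weighted mix of source allele frequencies best matches a target population. All allele frequencies, and the "true" proportions used to build the toy target, are invented.

```python
from itertools import product


def mixture_error(weights, sources, target):
    """Sum of squared differences between the target and the weighted source mix."""
    error = 0.0
    for i, observed in enumerate(target):
        predicted = sum(w * src[i] for w, src in zip(weights, sources))
        error += (predicted - observed) ** 2
    return error


def estimate_proportions(sources, target, steps=100):
    """Grid-search three non-negative weights that sum to 1 (in 1/steps increments)."""
    best_weights, best_error = None, float("inf")
    for a, b in product(range(steps + 1), repeat=2):
        if a + b > steps:
            continue  # the third weight would be negative
        weights = (a / steps, b / steps, (steps - a - b) / steps)
        error = mixture_error(weights, sources, target)
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights


# Invented allele frequencies at five SNPs for three hypothetical source groups.
hunter_gatherer = [0.10, 0.80, 0.30, 0.60, 0.20]
farmer = [0.70, 0.20, 0.50, 0.10, 0.90]
steppe = [0.40, 0.50, 0.90, 0.30, 0.10]
sources = [hunter_gatherer, farmer, steppe]

# Toy "modern" population built as 20% hunter-gatherer, 50% farmer, 30% steppe.
true_weights = (0.2, 0.5, 0.3)
modern = [sum(w * s[i] for w, s in zip(true_weights, sources)) for i in range(5)]

estimated = estimate_proportions(sources, modern)
print(estimated)
```

Because the toy target is built directly from the three sources with no noise, the search should recover the proportions it was built from; real data are far noisier and need proper statistical methods.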


Did the Yamnaya introduce PIE to Europe?

The Yamnaya undoubtedly had a strong impact on the genetic composition of Europe after their migration from the Russian Steppe. Their domestication of horses and probable invention of the wheel gave them the means to carry out a rapid and successful migration over a relatively short period of time. The descendants of the Yamnaya formed the Corded Ware culture, which dominated much of northern and central Europe during the Neolithic. All modern-day Europeans carry some proportion of Yamnaya ancestry in their genomes. However, there is no definite evidence that the Yamnaya were the ancestral Proto-Indo-European speakers who introduced the language to the rest of Europe; they could simply have been a population with a highly successful migration from the Russian Steppe.

Why is this important?

Again, the importance of this discovery may not be immediately apparent. However, the way the researchers went about their investigation is exciting: using genetics to trace the potential migrations of a language is a new use for the ever-increasing power of genetic sequencing techniques. The study shows that the previously little-known Yamnaya have traceable ancestry in all living Europeans, and managed to spread across almost the whole of Europe within a few thousand years. It is very likely that the Yamnaya brought new cultures and traditions with them that influenced the development of societies in Europe, including the Corded Ware culture that dominated northern Europe during the Neolithic. Haak et al. (2015) have proposed the Yamnaya migration as an alternative hypothesis for the introduction of PIE into Europe, and have thus prompted further debate, discussion and research into the ancient ancestors of Europeans.



