“In contrast, the oldest individuals from the northern mountain flank itself, which are three first-degree-related individuals from the Unakozovskaya cave associated with the Darkveti-Meshoko Eneolithic culture (analysis label ‘Eneolithic Caucasus’), show mixed ancestry mostly derived from sources related to the Anatolian Neolithic (orange) and CHG/Iran Neolithic (green) in the ADMIXTURE plot. While similar ancestry profiles have been reported for Anatolian and Armenian Chalcolithic and BA individuals, this result suggests the presence of this mixed ancestry north of the Caucasus as early as ~6500 years ago.” ref

Ancient North Eurasian ancestry in Steppe Maykop individuals

“Four individuals from mounds in the grass steppe zone, archaeologically associated with the ‘Steppe Maykop’ cultural complex, lack the Anatolian farmer-related (AF) component when compared to contemporaneous Maykop individuals from the foothills. Instead, they carry a third and fourth ancestry component that is linked deeply to Upper Paleolithic Siberians (maximized in the individual Afontova Gora 3 (AG3)) and Native Americans, respectively, and in modern-day North Asians, such as North Siberian Nganasan. To illustrate this affinity with ‘ancient North Eurasians’ (ANE), we also ran PCA with 147 Eurasian and 29 Native American populations. The latter represents a cline from ANE-rich steppe populations such as EHG, Eneolithic individuals, AG3, and Mal’ta 1 (MA1) to modern-day Native Americans at the opposite end. To formally test the excess of alleles shared with ANE/Native Americans, we performed f4-statistics of the form f4(Mbuti, X; Steppe Maykop, Eneolithic steppe), which resulted in significantly positive Z-scores (Z >3) for AG3, MA1, EHG, Clovis, and Kennewick for the ancient populations and many present-day Native American populations.” ref
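To make the f4 test quoted above more concrete, here is a minimal Python sketch. It is not the ADMIXTOOLS implementation: it assumes per-SNP allele frequencies are already available for the four populations, it uses equal-sized contiguous SNP blocks for the jackknife (ADMIXTOOLS uses a weighted block jackknife over genomic blocks), and the frequencies are invented toy values. The function names `f4` and `f4_z` are hypothetical.

```python
from math import sqrt


def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    terms = [(pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d)]
    return sum(terms) / len(terms)


def f4_z(a, b, c, d, n_blocks=5):
    """Z-score from a delete-one-block jackknife with equal-sized blocks.

    Simplification: SNPs beyond n_blocks * block_size are dropped and
    every block gets the same weight.
    """
    size = len(a) // n_blocks
    used = n_blocks * size
    estimate = f4(a[:used], b[:used], c[:used], d[:used])
    leave_one_out = []
    for k in range(n_blocks):
        keep = [i for i in range(used) if not (k * size <= i < (k + 1) * size)]
        leave_one_out.append(
            f4(*([pop[i] for i in keep] for pop in (a, b, c, d)))
        )
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    return estimate / sqrt(variance)


# Invented toy allele frequencies for four populations (10 SNPs).
pop_a = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.15, 0.25]
pop_b = [0.30, 0.45, 0.50, 0.65, 0.70, 0.85, 0.90, 0.95, 0.35, 0.50]
pop_c = [0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.85, 0.25, 0.35]
pop_d = [0.35, 0.50, 0.55, 0.70, 0.75, 0.90, 0.95, 0.95, 0.40, 0.55]

print(f4(pop_a, pop_b, pop_c, pop_d), f4_z(pop_a, pop_b, pop_c, pop_d))
```

With this definition, a positive value means that the frequency differences A − B and C − D are positively correlated across SNPs. Swapping the last two populations flips the sign, so the direction of a reported Z-score has to be read against the exact population order used.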

“Based on these observations, we used qpWave and qpAdm methods to model the number of ancestral sources contributing to the Steppe Maykop individuals and their relative ancestry coefficients. Simple two-way models of Steppe Maykop as an admixture of Eneolithic steppe, AG3, or Kennewick do not fit. However, we could successfully model Steppe Maykop ancestry as being derived from populations related to all three sources (p-value 0.371 for rank 2): Eneolithic steppe (63.5 ± 2.9%), AG3 (29.6 ± 3.4%) and Kennewick (6.9 ± 1.0%). We note that the Kennewick-related signal is most likely driven by the East Eurasian part of Native American ancestry as the f4-statistics (Steppe_Maykop, Fitted Steppe_Maykop; Outgroup1, Outgroup2) show that the Steppe Maykop individuals share more alleles not only with Karitiana but also with Han Chinese.” ref

Characterizing the Caucasus ancestry profile

“The Maykop period, represented by 12 individuals from eight Maykop sites (Maykop, n = 2; a cultural variant ‘Novosvobodnaya’ from the site Klady, n = 4; and Late Maykop, n = 6) in the northern foothills appears homogeneous. These individuals closely resemble the preceding Eneolithic Caucasus individuals and present a continuation of the local genetic profile. This ancestry persisted in the following centuries at least until ~3100 years ago (1100 cal BCE), as revealed by individuals from Kura-Araxes from both the northeast (Velikent, Dagestan) and the South Caucasus (Kaps, Armenia), as well as MBA/LBA individuals (e.g. Kudachurt, Marchenkova Gora) from the north. Overall, this Caucasus ancestry profile falls among the ‘Armenian and Iranian Chalcolithic’ individuals and is indistinguishable from other Kura-Araxes individuals (Armenian EBA) on the PCA plot (Fig. 2), suggesting a dual origin involving Anatolian/Levantine and Iran Neolithic/CHG ancestry, with only minimal EHG/WHG contribution possibly as part of the AF ancestry.” ref

“Admixture f3-statistics of the form f3(X, Y; target) with the Caucasus cluster as target resulted in significantly negative Z scores (Z < −3) when CHG (or AG3 in Late Maykop) were used as one and Anatolian farmers as the second potential source. We also used qpWave to determine the number of streams of ancestry and found that a minimum of two is sufficient. We then tested whether each temporal/cultural group of the Caucasus cluster could be modelled as a simple two-way admixture by exploring all possible pairs of sources in qpWave. We found support for CHG as one source and AF ancestry or a derived form such as is found in southeastern Europe as the other. We focused on mixture models of proximal sources (Fig. 4b) such as CHG and Anatolian Chalcolithic for all six groups of the Caucasus cluster (Eneolithic Caucasus, Maykop and Late Maykop, Maykop-Novosvobodnaya, Kura-Araxes, and Dolmen LBA), with admixture proportions on a genetic cline of 40–72% Anatolian Chalcolithic related and 28–60% CHG related.” ref
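The logic behind a negative admixture f3-statistic can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the ADMIXTOOLS estimator: it works on known population allele frequencies and omits the finite-sample correction that real implementations apply to the target. The function name `f3` and all frequencies are hypothetical.

```python
def f3(x, y, target):
    """f3(X, Y; target): mean over SNPs of (t - x) * (t - y)."""
    terms = [(t - px) * (t - py) for px, py, t in zip(x, y, target)]
    return sum(terms) / len(terms)


# Invented allele frequencies for two source populations.
source_x = [0.1, 0.8, 0.3, 0.6]
source_y = [0.7, 0.2, 0.9, 0.4]

# A target formed as an even mixture of the two sources, with no drift.
alpha = 0.5
mixed = [alpha * px + (1 - alpha) * py for px, py in zip(source_x, source_y)]

print(f3(source_x, source_y, mixed))
```

For a target that is an exact mixture with weight alpha, each term equals −alpha × (1 − alpha) × (x − y)², so the statistic cannot be positive. Drift in the target after admixture adds a positive term, which is why a significantly negative value is taken as evidence of admixture while a positive value does not rule it out.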

“When we explored Romania_EN and Bulgaria_Neolithic individuals as alternative southeast European sources (30–46% and 32–49%), the CHG proportions increased to 54–70% and 51–68%, respectively. We hypothesize that alternative models, replacing the Anatolian Chalcolithic individual with yet unsampled populations from eastern Anatolia, South Caucasus or northern Mesopotamia, will likely also provide a fit to some of the tested Caucasus groups. Models with Iran Neolithic as substitute for CHG could also explain the data in a two-way admixture with the combination of Armenia Chalcolithic or Anatolia Chalcolithic as the other source. However, models replacing CHG with EHG received no support, indicating no strong influence for admixture from the adjacent steppe to the north. We also found no direct evidence of EHG or WHG ancestry in Caucasus groups, but observed that Kura-Araxes and Maykop-Novosvobodnaya individuals had likely received additional Iran Chalcolithic-related ancestry (24.9% and 37.4%, respectively).” ref

Characterising the Steppe ancestry profile

“Individuals from the North Caucasian steppe associated with the Yamnaya cultural formation (5300–4400 years ago, 3300–2400 calBCE) appear genetically almost identical to previously reported Yamnaya individuals from Kalmykia immediately to the north, the middle Volga region, Ukraine, and to other BA individuals from the Eurasian steppes who share the characteristic ‘steppe ancestry’ profile as a mixture of EHG and CHG-related ancestry. These individuals form a tight cluster in PCA space (Fig. 2) and can be shown formally to be a mixture by significantly negative admixture f3-statistics of the form f3(EHG, CHG; target). This cluster also involves individuals of the North Caucasus culture (4800–4500 years ago, 2800–2500 calBCE) in the piedmont steppe, who share the steppe ancestry profile, as do individuals from the Catacomb culture in the Kuban, Caspian and piedmont steppes (4600–4200 years ago, 2600–2200 calBCE), which succeeded the Yamnaya horizon.” ref

“The individuals of the MBA post-Catacomb horizon (4200–3700 years ago, 2200–1700 calBCE) such as Late North Caucasus and Lola cultures represent both ancestry profiles common in the North Caucasus: individuals from the mountain site Kabardinka show a typical steppe ancestry profile, whereas individuals from the site Kudachurt 90 km to the west or our most recent individual from the western LBA Dolmen culture (3400–3200 BP, 1400–1200 calBCE) retain the ‘southern’ Caucasus profile. In contrast, one Lola culture individual resembles the ancestry profile of the Steppe Maykop individuals.” ref

Admixture into the steppe zone from the south

“Evidence for interaction between the Caucasus and the Steppe clusters is visible in our genetic data from individuals associated with the later Steppe Maykop phase around 5300–5100 years ago. These ‘outlier’ individuals were buried in the same mounds as those with steppe and in particular Steppe Maykop ancestry profiles but share a higher proportion of AF ancestry visible in the ADMIXTURE plot and are also shifted towards the Caucasus cluster in PC space (Fig. 2d). This observation is confirmed by formal D-statistics. By modelling Steppe Maykop outliers successfully as a two-way mixture of Steppe Maykop and representatives of the Caucasus cluster, we can show that these individuals received additional ‘Anatolian and Iranian Neolithic ancestry’, most likely from contemporaneous sources in the south. We used ALDER to estimate an average admixture time for the observed farmer-related ancestry in Steppe Maykop outliers of 20 generations or 560 years ago.” ref
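The conversion from generations to years in the ALDER estimate is simple arithmetic, sketched below in Python. The 28 years per generation is an assumption inferred from the quoted figures (560 / 20). Reading the 560 years as counted back from the burial dates of 5300–5100 years ago is also an interpretation, based on ALDER dating admixture in generations before the sampled individuals lived. The function name is hypothetical.

```python
def generations_to_years(generations, years_per_generation=28):
    """Convert an admixture time in generations to years."""
    return generations * years_per_generation


admixture_age = generations_to_years(20)

# Assumed reading: offset counted back from when the outliers lived.
burial_window = (5300, 5100)  # years ago, from the quoted passage
admixture_window = tuple(date + admixture_age for date in burial_window)

print(admixture_age, admixture_window)
```

Under these assumptions the admixture would fall roughly 5860–5660 years ago, a few centuries before the outlier individuals themselves.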

Anatolian farmer-related ancestry in steppe groups

“Eneolithic Samara individuals form a cline in PC space running from EHG to CHG (Fig. 2d), which is continued by the newly reported Eneolithic steppe individuals. However, the trajectory of this cline changes in the subsequent centuries. Here we observe a cline from Eneolithic_steppe towards the Caucasus cluster. We can qualitatively explain this ‘tilting cline’ by developments south of the Caucasus, where Iranian and AF ancestries continue to mix, resulting in a blend that is also observed in the Caucasus cluster, from where it could have spread onto the steppe. The first appearance of ‘combined farmer-related ancestry’ in the steppe zone is evident in Steppe Maykop outliers. However, PCA results suggest that Yamnaya and later groups of the West Eurasian steppe carry also some farmer-related ancestry as they are slightly shifted towards ‘European Neolithic groups’ in PC2 (Fig. 2d) compared to the preceding Eneolithic steppe individuals. The ‘tilting cline’ is also confirmed by admixture f3-statistics, which provide statistically significant negative values for AG3 and any AF group as the two sources. Using f- and D-statistics we also observe an increase in farmer-related ancestry (both Anatolian and Iranian) in our Steppe cluster, distinguishing the Eneolithic steppe from later groups. In addition, we find the Caucasus cluster or Levant/AF groups to share more alleles with Steppe groups than with EHG or Samara_Eneolithic. MLBA groups such as Poltavka, Andronovo, Srubnaya, and Sintashta show a further increase of AF ancestry consistent with previous studies, reflecting different processes not directly related to events in the Caucasus.” ref

“Researchers then used qpWave and qpAdm to explore the number of ancestry sources for the AF component to evaluate whether geographically proximate groups contributed plausibly to the subtle shift of Eneolithic ancestry in the steppe towards Neolithic groups. Specifically, we tested whether any of the Eurasian steppe ancestry groups can be successfully modelled as a two-way admixture between Eneolithic steppe and a population X derived from Anatolian- or Iranian farmer-related ancestry, respectively. Surprisingly, we found that a minimum of four streams of ancestry is needed to explain all eight steppe ancestry groups tested (Fig. 2). Importantly, our results show a subtle contribution of both AF ancestry and WHG-related ancestry (Fig. 4), likely brought in through MN/LN farming groups from adjacent regions in the West. A direct source of AF ancestry can be ruled out. At present, due to the limits of our resolution, we cannot identify a single best source population. However, geographically proximal and contemporaneous groups such as Globular Amphora and Eneolithic groups from the Black Sea area (Ukraine and Bulgaria), representing all four distal sources (CHG, EHG, WHG, and Anatolian_Neolithic), are among the best-supported candidates. Applying the same method to the subsequent North Caucasian Steppe groups such as Catacomb, (Late) North Caucasus confirms this pattern.” ref

“Using qpAdm with Globular Amphora as a proximate surrogate population, we estimated the contribution of AF ancestry into Yamnaya and other steppe groups. We find that Yamnaya Samara individuals have 13.2 ± 2.7% and Ukraine or Caucasus Yamnaya individuals 16.6 ± 2.9% AF ancestry—statistically indistinguishable proportions. Substituting Globular Amphora with Iberia Chalcolithic does not alter the results profoundly. This suggests that the source population was a mixture of AF ancestry and a minimum of 20% WHG ancestry, a genetic profile shared by many European MN/LN and Chalcolithic individuals of the 3rd millennium BCE analyzed thus far. To account for potentially un-modelled ancestry from the Caucasus groups, we added ‘Eneolithic Caucasus’ as an additional source to build a three-way model. We found that Yamnaya Caucasus, Yamnaya Ukraine Ozera, North Caucasus and Late North Caucasus had likely received additional ancestry (6–40%) from nearby Caucasus groups. This suggests a more complex and dynamic picture of steppe ancestry groups through time, including the formation of a local variant of steppe ancestry in the North Caucasian steppe from the local Eneolithic, a contribution of Steppe Maykop groups, and population continuity between the early Yamnaya period and the MBA (5300–3200 years ago, 3300–2200 calBCE).” ref
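The statement that 13.2 ± 2.7% and 16.6 ± 2.9% are statistically indistinguishable can be checked with a back-of-the-envelope calculation, sketched here in Python. It assumes the ± values are one standard error and that the two estimates are independent; neither assumption is stated in the quoted passage. The function name is hypothetical.

```python
from math import sqrt


def difference_z(est_a, se_a, est_b, se_b):
    """Z-score for the difference of two independent estimates."""
    return (est_b - est_a) / sqrt(se_a ** 2 + se_b ** 2)


# AF ancestry estimates from the quoted passage (percent).
z = difference_z(13.2, 2.7, 16.6, 2.9)
print(round(z, 2))
```

The difference of 3.4 percentage points is less than one combined standard error, well below the |Z| > 3 threshold used for the f-statistics elsewhere in the text.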

Insights from micro-transects through time

“The availability of multiple individuals from one burial mound allowed us to test genetic continuity on a micro-transect level. By focusing on two kurgans (Marinskaya 5 and Sharakhalsun 6) with four and five individuals, respectively, we observe that the genetic ancestry varied through time, alternating between the Steppe and Caucasus ancestries, suggesting a shifting genetic border between the two genetic clusters. We also detected various degrees of kinship between individuals buried in the same mound, which supports the view that particular mounds reflected genealogical lineages. Overall, we observe a balanced sex ratio within our sites across the individuals tested (Supplementary Note 4).” ref

A joint model of ancient populations of the Caucasus region

“The fitted qpGraph model recapitulates the genetic separation between the Caucasus and Steppe groups with the Eneolithic steppe individuals deriving more than 60% of ancestry from EHG and the remainder from a CHG-related basal lineage, whereas the Maykop group received about 86.4% from CHG, 9.6% Anatolian farming related ancestry, and 4% from EHG. The Yamnaya individuals from the Caucasus derived the majority of their ancestry from Eneolithic steppe individuals, but also received about 16% from Globular Amphora-related farmers (Fig. 5).” ref

Fig. 5

“Admixture Graph modelling of the population history of the Caucasus region. We started with a skeleton tree without admixture, including Mbuti, Loschbour and MA1. We grafted onto this EHG, CHG, Globular_Amphora, Eneolithic_steppe, Maykop, and Yamnaya_Caucasus, adding them consecutively to all possible edges in the tree and retaining only graph solutions that provided no differences of |Z| > 3 between fitted and estimated statistics. The lowest Z-score for this graph is |Z| = 2.824. We note that the maximum discrepancy is f4(MA1, Maykop; EHG, Eneolithic_steppe) = −3.369 if we do not add the 4% EHG ancestry to Maykop. Drifts along edges are multiplied by 1000 and dashed lines represent admixture.” ref
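The acceptance rule described in the caption, keeping only graphs with no |Z| > 3 between fitted and estimated statistics, can be sketched as a simple filter. This Python snippet is an illustration, not qpGraph itself. The function name is hypothetical, and the Z-scores other than the two values reported in the caption (2.824 and −3.369) are invented.

```python
def graph_is_acceptable(z_scores, threshold=3.0):
    """Accept a graph only if no fitted-vs-estimated Z-score exceeds the threshold."""
    return all(abs(z) <= threshold for z in z_scores)


# Reported graph: 2.824 is the |Z| quoted in the caption; other values invented.
with_ehg_in_maykop = [0.4, -1.7, 2.824]

# Same graph without the 4% EHG edge into Maykop: the caption reports -3.369.
without_ehg_in_maykop = [0.4, -1.7, -3.369]

print(graph_is_acceptable(with_ehg_in_maykop))
print(graph_is_acceptable(without_ehg_in_maykop))
```

The comparison shows why the small EHG contribution to Maykop is retained in the model: dropping it pushes one statistic past the threshold.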

“Our data from the Caucasus region cover a 3000-year interval of prehistory, during which we observe a genetic separation between the groups in the northern foothills and those groups of the bordering steppe regions in the north (i.e. the ‘real’ steppe). We have summarised these broadly as Caucasus and Steppe groups in correspondence with eco-geographic vegetation zones that characterise the socio-economic basis of the associated archaeological cultures. When compared to present-day human populations from the Caucasus, which show a clear separation into North and South Caucasus groups along the Great Caucasus mountain range (Fig. 2d), our new data highlight a different situation during the BA. The fact that individuals buried in kurgans in the North Caucasian piedmont zone are more closely related to ancient individuals from regions further south in today’s Armenia, Georgia and Iran results in two main observations.” ref

“First, sometime after the BA present-day North Caucasian populations must have received additional gene-flow from steppe populations that now separates them from southern Caucasians, who largely retained the BA ancestry profile. The archaeological and historic records suggest numerous incursions during the subsequent Iron Age and Medieval times, but ancient DNA from these time periods will be needed to test this directly. Second, our results reveal that the Caucasus was no barrier to human movement in prehistory. Instead the interface of the steppe and northern mountain ecozones could be seen as a transfer zone of cultural innovations from the south and the adjacent Eurasian steppes to the north (Supplementary Note 1). The latter is best exemplified by the two Steppe Maykop outlier individuals, which carry additional AF ancestry, for which the contemporaneous piedmont Maykop individuals present likely candidates for the source of this ancestry. This might also explain the regular presence of ‘Maykop-style artefacts’ in burials that share Steppe Eneolithic traditions and are genetically assigned to the Steppe group. Hence the diverse ‘Steppe Maykop’ group indeed represents the mutual entanglement of Steppe and Caucasus groups and their cultural affiliations in this interaction sphere.” ref

“Concerning the influences from the south, our oldest dates from the immediate Maykop predecessors Darkveti-Meshoko (Eneolithic Caucasus) indicate that the Caucasus genetic profile was present north of the range ~6500 years ago, 4500 calBCE. This is in accordance with the Neolithization of the Caucasus, which had started in the flood plains of South Caucasian rivers in the 6th millennium BCE, from where it spread across to the West/Northwest during the following millennium. It remains unclear whether the local CHG ancestry profile (Kotias Klde and Satsurblia in today’s Georgia) was also present in the North Caucasus region before the Neolithic. However, if we take the CHG ancestry as a local baseline and the oldest Eneolithic Caucasus individuals from our transect as a proxy for the local Late Neolithic ancestry, we notice a substantial increase in AF ancestry. This in all likelihood reflects the process of Neolithization, which also brought this type of ancestry to Europe. As a consequence, it is possible that Neolithic groups could have reached the northern foothills earlier. Hence, additional sampling from older individuals would be desirable to fill this temporal and spatial gap.” ref

“We show that the North Caucasus piedmont region was genetically connected to the south at the time of the eponymous grave mound of Maykop. Even without direct ancient DNA data from northern Mesopotamia, our results suggest an increased assimilation of Chalcolithic individuals from Iran, Anatolia, and Armenia and those of the Eneolithic Caucasus during 6000–4000 calBCE, and thus likely also intensified cultural connections. It is possible that the cultural and genetic basis of Maykop were formed within this sphere of interaction (Fig. 4). In fact, the Maykop phenomenon was long understood as the terminus of expanding Mesopotamian civilisations. It has been further suggested that along with these influences the key technological innovations in western Asia that had revolutionised the late 4th millennium BCE had ultimately also spread to Europe. An earlier connection in the late 5th millennium BCE, however, allows speculations about an alternative archaeological scenario: was the cultural exchange mutual and did e.g. metal rich areas such as the Caucasus contribute substantially to the development and transfer of these innovations?” ref

“Within the 3000-year interval covered in this study, we observe a degree of genetic continuity within each cluster, albeit occasionally interspersed by subtle gene-flow between the two clusters as well as from outside sources. Moreover, our data show that the northern flanks were consistently linked to the Near East and had received multiple streams of gene flow from the south during the Maykop, Kura-Araxes, and late phase of the North Caucasus culture. Interestingly, this renewed appearance of the southern genetic make-up in the foothills corresponds to a period of climatic deterioration (known as the 4.2 ka event, about 4,200 years ago) in the steppe zone, that put a halt to the exploitation of the steppe zone for several hundred years. Further insight arises from individuals that were buried in the same kurgan but in different time periods, as highlighted in the two kurgans Marinskaya 5 and Sharakhalsun 6. Here, we recognize that the distinction between Steppe and Caucasus (Fig. 1) is not strict but rather reflects a shifting border of genetic ancestry through time, possibly due to climatic/vegetation shifts and/or cultural factors linked to subsistence strategies or social exchange. Thus, the occurrence of Steppe ancestry in the northern foothills likely coincides with the range expansion of Yamnaya pastoralists. However, more time-stamped data from this region will be needed to provide details on the dynamics of this contact zone.” ref

“An important observation is that Eneolithic Samara and Eneolithic steppe individuals directly north of the Caucasus had initially not received AF gene flow. Instead, the Eneolithic steppe ancestry profile shows an even mixture of EHG- and CHG ancestry, suggesting an effective cultural and genetic border between the contemporaneous Eneolithic populations, notably Steppe and Caucasus. Due to the temporal limitations of our dataset, we currently cannot determine whether this ancestry is stemming from an existing natural genetic gradient running from EHG far to the north to CHG/Iran in the south or whether this is the result of Iranian/CHG-related ancestry reaching the steppe zone independently and prior to a stream of AF ancestry, where they mixed with local hunter-gatherers that carried only EHG ancestry.” ref

“All later steppe groups, starting with Yamnaya, deviate from the EHG-CHG admixture cline towards European populations in the West. We show that these individuals had received AF ancestry, in line with published evidence from Yamnaya individuals from Ukraine (Ozera) and Bulgaria. In the North Caucasus, this genetic contribution could have occurred through immediate contact with Caucasus groups or further south. An alternative source, explaining the increase in WHG-related ancestry, would be contact with contemporaneous Chalcolithic/EBA farming groups at the western periphery of the Yamnaya distribution area, such as Globular Amphora and Cucuteni–Trypillia from Ukraine, which have been shown to carry AF ancestry.” ref

“Archaeological arguments are consonant with both scenarios. Contact between early Yamnaya and late Maykop groups is suggested by Maykop impulses seen in early Yamnaya complexes. A western sphere of interaction is evident from striking resemblances of imagery inside burial chambers of Central Europe and the Caucasus, and similarities in geometric decoration patterns in stone cist graves in the Northern Pontic steppe, on stone stelae in the Caucasus, and on pottery of the Eastern Globular Amphora Culture, which links the eastern fringe of the Carpathians and the Baltic Sea. This overlap of symbols implies a late 4th millennium BCE communication and interaction network that operated across the Black Sea area involving the Caucasus, and later also early Globular Amphora groups in the Carpathians and east/central Europe. The role of early Yamnaya groups within this network is still unclear. However, this interaction zone predates any direct influence of Yamnaya groups in Europe, or the succeeding formation of the Corded Ware, and its persistence opens the possibility of subtle gene-flow from farmers at the eastern border of arable lands into the steppe, several centuries before the massive range expansions of pastoralist groups that reached Central Europe in the mid-3rd millennium BCE.” ref

“A surprising discovery was that Steppe Maykop individuals from the eastern desert steppes harboured a distinctive ancestry component that relates them to Upper Palaeolithic Siberians (AG3, MA1) and Native Americans. This is exemplified by the more commonly East Asian features such as the derived EDAR allele, which has also been observed in HG from Karelia and Scandinavia. The additional affinity to East Asians suggests that this ancestry is not derived directly from ANE but from a yet-to-be-identified ancestral population in north-central Eurasia with a wide distribution between the Caucasus, the Ural Mountains, and the Pacific coast, of which we have discovered the so far southwestern-most and also youngest genetic representatives.” ref

“The insight that the Caucasus mountains served as a corridor for the spread of CHG ancestry north but also for subtle later gene-flow from the south allows speculations on the postulated homelands of Proto-Indo-European (PIE) languages and documented gene-flows that could have carried a consecutive spread of both across West Eurasia. This also opens up the possibility of a homeland of PIE south of the Caucasus, and could offer a parsimonious explanation for an early branching off of Anatolian languages, as shown on many PIE tree topologies. Geographically conceivable are also Armenian and Greek, for which genetic data support an eastern influence from Anatolia or the southern Caucasus, and an Indo-Iranian offshoot to the east. However, latest ancient DNA results from South Asia suggest an LMBA spread via the steppe belt. Irrespective of the early branching pattern, the spread of some or all of the PIE branches would have been possible via the North Pontic/Caucasus region and from there, along with pastoralist expansions, to the heart of Europe. This scenario finds support from the well attested and widely documented ‘steppe ancestry’ in European populations and the postulate of increasingly patrilinear societies in the wake of these expansions.” ref

Streams of ancestry and inference of mixture proportions

“The researchers used qpWave and qpAdm as implemented in ADMIXTOOLS with the option ‘allsnps: YES’ to test whether a set of test populations is consistent with being related via N streams of ancestry from a set of outgroup populations and estimate mixture proportions for a Test population as a combination of N ‘reference’ populations by exploiting (but not explicitly modeling) shared genetic drift with a set of outgroup populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic. The “DG” samples are extracted from high coverage genomes sequenced as part of the Simons Genome Diversity Project. For some analyses, we used an extended set of outgroup populations, including some of the following additional ancient populations to constrain standard errors: WHG, EHG, and Levant Neolithic.” ref
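As an illustration of how such a run might be set up, the Python sketch below assembles the three text inputs that qpAdm reads: a parameter file, a "left" population list (target first, then sources), and a "right" list of outgroups. The outgroup labels and the ‘allsnps: YES’ option come from the passage above. The file prefix `data`, the file names `left.txt` and `right.txt`, the helper name `qpadm_inputs`, and the example population labels are assumptions, and the labels would have to match those in the actual dataset.

```python
# Outgroup ("right") populations as listed in the passage above.
OUTGROUPS = [
    "Mbuti.DG", "Ust_Ishim.DG", "Kostenki14", "MA1", "Han.DG",
    "Papuan.DG", "Onge.DG", "Villabruna", "Vestonice16", "ElMiron",
    "Ethiopia_4500BP.SG", "Karitiana.DG", "Natufian",
    "Iran_Ganj_Dareh_Neolithic",
]


def qpadm_inputs(target, sources, prefix="data"):
    """Return (parameter file, left list, right list) as text.

    The prefix and the left/right file names are hypothetical placeholders.
    """
    left = "\n".join([target, *sources]) + "\n"
    right = "\n".join(OUTGROUPS) + "\n"
    par = "\n".join([
        f"genotypename: {prefix}.geno",
        f"snpname: {prefix}.snp",
        f"indivname: {prefix}.ind",
        "popleft: left.txt",
        "popright: right.txt",
        "details: YES",
        "allsnps: YES",
    ]) + "\n"
    return par, left, right


# Example: the three-way Steppe Maykop model described earlier in the text
# (population labels here are illustrative).
par_text, left_text, right_text = qpadm_inputs(
    "Steppe_Maykop", ["Eneolithic_steppe", "AG3", "Kennewick"]
)
print(par_text)
```

The function returns strings rather than writing files, so the inputs can be inspected before being saved and passed to qpAdm.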