The question of Italy’s population genetic structure comes up rather often for various reasons. I haven’t visited this topic in much detail since reading Consanguinity, Inbreeding, and Genetic Drift in Italy, a very old book using classical genetic techniques. L. L. Cavalli-Sforza did not find much structure in Italy at the time, but it turns out that there wasn’t enough power in the methods. I have some access to Italian data sets and I can tell you that there is a lot of variation. Sicilians in particular are mixed in ways unique outside of the Iberian peninsula. A few years ago, using the PopRes data set, Peter Ralph and Graham Coop found in The Geography of Recent Genetic Ancestry across Europe some interesting facts about Italy:
In addition to the very few genetic common ancestors that Italians share both with each other and with other Europeans, we have seen significant modern substructure within Italy (i.e., Figure 2) that predates most of this common ancestry, and estimate that most of the common ancestry shared between Italy and other populations is older than about 2,300 years (Figure S16). Also recall that most populations show no substructure with regards to the number of blocks shared with Italians, implying that the common ancestors other populations share with Italy predate divisions within these other populations. This suggests significant old substructure and large population sizes within Italy, strong enough that different groups within Italy share as little recent common ancestry as other distinct, modern-day countries, substructure that was not homogenized during the migration period. These patterns could also reflect in part geographic isolation within Italy as well as a long history of settlement of Italy from diverse sources.
There were limitations in terms of how much geographic specificity the PopRes data set provided them, so there was only so much you could say. One hypothesis could be that, unlike in much of Europe, deep local structure within the Italian peninsula predating the Roman Empire persists to this day. The Latinization of Italy during the late Republican and early Imperial period could then be thought of primarily as a matter of cultural diffusion and elite emulation. This stands to reason in part because much of the Italian peninsula was inhabited by peoples who were already speaking languages very close to Latin. But another possibility is that this deep structure exists because of more recent migrations. For example, the existence of Magna Graecia in southern Italy and Sicily was due to the migration of males from Greece in the centuries before the rise of Rome. The genetic distance of this population would be inflated due to this gene flow, and if Italian demographic history is such that gene flow across regions is low, then it would persist.
But things have changed since 2013. We know a fair amount more about European genetic history, thanks to ancient DNA. Just read Ancient human genomes suggest three ancestral populations to get a flavor. In short, it turns out that most European populations can be modeled as a three-way admixture. The first group has ancient Middle Eastern affinities, but is different from modern Middle Easterners. Modern Sardinians are very close to this group. The second group is the indigenous European hunter-gatherers, who presumably expanded after the retreat of the tundra and had deeper roots in the continent, possibly at least back to the Gravettian period. Finally, the third group is a compound of a different Middle Eastern group, the European hunter-gatherer ancestry, and an ancient North Eurasian population more distant to other West Eurasians.
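To make "modeled as a three-way admixture" a bit more concrete, here is a toy sketch in Python. It treats a target population's allele frequencies as a weighted average of three source populations and recovers the weights by least squares, with the weights constrained to sum to one. Everything here is an assumption for illustration: the function name, the source labels, and the frequencies are invented, and the actual paper relies on f-statistics rather than this kind of direct regression.

```python
def estimate_three_way_mix(target, src_a, src_b, src_c):
    """Least-squares mixture weights for three sources (weights sum to 1).

    Each argument is a list of allele frequencies at the same SNPs.
    Hypothetical helper for illustration only.
    """
    # Substitute w_c = 1 - w_a - w_b, leaving two unknowns:
    # target - src_c = w_a * (src_a - src_c) + w_b * (src_b - src_c)
    a = [x - z for x, z in zip(src_a, src_c)]
    b = [x - z for x, z in zip(src_b, src_c)]
    y = [t - z for t, z in zip(target, src_c)]

    # Entries of the 2x2 normal equations.
    aa = sum(x * x for x in a)
    bb = sum(x * x for x in b)
    ab = sum(x * v for x, v in zip(a, b))
    ay = sum(x * v for x, v in zip(a, y))
    by = sum(x * v for x, v in zip(b, y))

    det = aa * bb - ab * ab
    if abs(det) < 1e-12:
        raise ValueError("sources are too similar to separate")

    # Cramer's rule for the two free weights.
    w_a = (ay * bb - ab * by) / det
    w_b = (aa * by - ab * ay) / det
    return w_a, w_b, 1.0 - w_a - w_b


# Made-up allele frequencies at five SNPs for three invented sources.
farmer = [0.10, 0.80, 0.30, 0.60, 0.50]
hunter = [0.70, 0.20, 0.40, 0.10, 0.90]
steppe = [0.40, 0.50, 0.90, 0.30, 0.20]

# Build a toy target as an exact 50/20/30 mix, then try to recover it.
true_weights = (0.5, 0.2, 0.3)
toy_target = [
    true_weights[0] * f + true_weights[1] * h + true_weights[2] * s
    for f, h, s in zip(farmer, hunter, steppe)
]

weights = estimate_three_way_mix(toy_target, farmer, hunter, steppe)
print("farmer, hunter, steppe weights:", [round(w, 3) for w in weights])
```

With real data the fit would be noisy, the weights could fall outside the 0 to 1 range, and drift since the admixture would muddy things, which is part of why the published methods are more elaborate.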
Most readers of this weblog are familiar with this song and dance. Now I want to submit new results from a paper in EJHG, The Italian genome reflects the history of Europe and the Mediterranean basin. A minor nit: I would assume that the Italian genome reflects the history of Europe and the Mediterranean basin! It would be really surprising if the Italian genome reflects the history of East Asia and the South China Sea!
What immediately jumped out for me about the results from this paper is that it seems clear that all non-Sardinian populations exhibit equal distance to Sardinians. That is, there is no “Sardinian-cline” in these data. Perhaps there are populations on the mainland that do exhibit a Sardinian-cline, but they haven’t been sampled in this study. What does this mean? The circumstantial evidence is strong that there was an intrusive population which arrived from the steppes and spread across Northern Europe ~4,500 years ago. The linguistic evidence tends to bind the Celtic and Italic branches of the Indo-European language family, so it seems likely that there was an intrusive population from Northern Europe that arrived sometime between 2500 BC, when the Indo-Europeans swept Northern Europe, and 500 BC, when the Italian populations start to edge into history. These people would presumably have amalgamated with the original Sardinian-like group. The best work suggests that though Sardinians have the most of this ancestry, it is still predominant in Southern Europe overall. It is curious then that the Sardinian fraction is so low, and that it is relatively even. In fact, it is lowest in the southernmost Italian groups, and highest in Lombardy! Part of this is probably because Sardinian is not the same as Sardinian-like farmer. But I still would have expected some cline (I presume the Sardinians who are shifted toward the mainland reflect migration from the mainland). On the other hand, there is a large north-south gradient that you can see on the admixture plot.
The plot to the left is too small to make out well, but as people allude to Italian population structure in a world-wide context, this PCA does just that. The bright green are the Southern Italians, the bright light blue the Central Italians, and the red the Northern Italians. You see that the Southern Italians are shifted toward the Middle Eastern groups, while the Northern Italians are closer to groups like the Spanish and French. To the top right are Northern European groups, in purple, and the bottom right are Mozabites, with Turks in dark green in the middle, shifted toward Italians. Sardinians occupy the far left. As you can see, contrary to a commenter earlier this week, Italians of all stripes are not that distinct from other Europeans.
But Southern Italians, and from what I have seen in private data Sicilians in particular, are distinct because of a possible admixture signal with exotic groups you don’t normally see in Europeans. If you look in the supplements the possibility becomes clearer. There is a lot of evidence that this admixture is North African. You see this in the ADMIXTURE plots in the supplements, as well as the IBD sharing patterns. The South Italian groups are enriched for sharing with the Mozabites and Moroccans, not groups from the eastern Mediterranean. The likely period when this admixture occurred is when Sicily was an Arab emirate, from 831 to 1091. More or less Sicily was then part of the greater Maghreb. Calabria also had a Muslim presence, though more tenuous.
Finally, the authors used LD patterns and reference populations to attempt to estimate admixture times:
We found evidence of the presence of a mix of Central-Northern European and Middle Eastern-North African ancestries in the Italian individuals (Supplementary Table S5). The estimated times of admixture ranged between ~2050 and 1300 years ago (y.a.), with an average of about 1650 y.a. – assuming 29 years per generation– for Northern Italians, and between ~3000 and 1450 y.a. (~2100 y.a. on average) for Central Italians. Finally, for the Southern Italian individuals, admixture between European and Northern African-Middle Eastern ancestry was estimated to have occurred about 1000 y.a. (see Supplementary Table S5 and Supplementary Results for a complete report of significant results).
The admixture in Southern Italy is estimated to have occurred ~1000 years ago. That’s pretty much what you’d expect. These methods tend to pick up the last signal of admixture, so there may have been earlier ones (e.g., Magna Graecia?). That might explain the relatively low fraction of “Sardinian” ancestry, as this area of Italy has had significant gene flow from outside Italy over the past 2,500 years, whether it be Greeks, people from other parts of the Mediterranean, or lastly Maghrebis.
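Keep in mind that LD-based methods estimate a number of generations, and the years come from multiplying by the assumed 29 years per generation. Here is a rough Python sketch of that conversion for the average dates quoted above, along with the standard expectation that admixture LD between two markers decays as exp(-n·d) after n generations, where d is their genetic distance in Morgans. The function names and the choice of 1 cM as the example distance are mine, not the authors', and this is not their actual pipeline.

```python
import math

YEARS_PER_GENERATION = 29  # the generation time the quoted passage assumes


def years_to_generations(years_ago, years_per_generation=YEARS_PER_GENERATION):
    """Convert an admixture date in years back into generations."""
    return years_ago / years_per_generation


def expected_ld_fraction(distance_morgans, generations):
    """Fraction of the initial admixture LD expected to remain.

    Uses the simple single-pulse model exp(-generations * distance).
    """
    return math.exp(-generations * distance_morgans)


# Average estimates quoted from the paper, in years ago.
estimates_years_ago = {"Northern": 1650, "Central": 2100, "Southern": 1000}

for region, years in estimates_years_ago.items():
    gens = years_to_generations(years)
    remaining = expected_ld_fraction(0.01, gens)  # 0.01 Morgans = 1 cM
    print(f"{region} Italians: ~{gens:.0f} generations, "
          f"LD remaining at 1 cM: {remaining:.2f}")
```

That works out to roughly 57, 72, and 34 generations for Northern, Central, and Southern Italians respectively. The older the event, the less long-range LD is left to measure, which is one reason a single fitted date tends to reflect the most recent pulse.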
The difference between Northern and Central Italians is intriguing. The reference populations are not optimal, and the dates have a wide interval. We actually know what was happening 2,100 years ago in Central Italy, and there was no admixture between Middle Eastern and Northern European groups. The Roman world empire was still in a nascent state. The Northern Italian admixture date might align with a Germanic migration into Italy, or perhaps the Gauls in the centuries earlier. I really don’t know. I am inclined to suggest that the Central Italian signal might somehow be lowballing the Indo-European admixture.
The authors say that their data will be released. But I looked up the accession number, and it’s not up there yet.