The inimitable Joe Pickrell has dropped his Khoisan-are-part-Italian preprint onto arXiv, Ancient west Eurasian ancestry in southern and eastern Africa. I’m being glib in my characterization of the paper’s core conclusion, but there’s a reason for such a flip response: the inferences that he seems to draw from the genetic data strike me as verging on crazy. But that’s OK, what genetics is telling us is that history was a whole lot crazier than we had imagined.

Let’s back up for a moment here. For several decades now geneticists have assumed that the Bushmen of the Kalahari, the Khoisan-qua-Khoisan, Africa’s last hunter-gatherers who retain their ancestral language along with the Hadza, are the ur-humans. The basal lineage that first diverged from the rest of mankind at the cusp of the Out of Africa event. This is evident in Y chromosomal and mtDNA phylogenies, where the Bushmen and their kin harbor variants which coalesce deeply in time with those of others. And, a few years ago another group revealed the likelihood that Bushmen also are products of an admixture event in the last ~50,000 years with a distinct hominin lineage which diverged ~1 million years before the present from the main line which led up to anatomically modern humanity. Now Pickrell et al. present us with a twist which is perhaps even more astringent than a lime: in their genomes the Bushmen and their Khoisan kin, the Khoe herders, reflect an ancient admixture event with East Africans, who themselves were the outcomes of hybridizations between West Eurasians and indigenous African populations. More relevantly for my concise summation of the conclusion, the West Eurasian component does not necessarily reflect modern Middle Eastern populations, so much as Southern Europeans!

How did they infer such bizarre results? Magic? No. Basically the authors looked at patterns of linkage disequilibrium. Got it? Probably not. If you are curious, confused, and intent upon understanding the thrust of their methods in your bones, you probably need to read Loh et al. Barring that, trust in the great hive-mind that is the Reich lab, or attempt to swallow my trite condensation.

If you consider a short to medium length sequence of the genome, there are genetic variants, alleles, segregating across that sequence. The frequencies of these alleles vary across populations. And, there are on occasion correlations of allelic combinations, seen together across a single sequence more often than would be likely if the alleles across the loci assorted at random. A concrete example would be a population which is the product of a recent admixture event between Africans and Europeans. Recombination would take many generations to break apart all the associations between alleles which are diagnostic and distinctive of African and European ancestry, so long blocks of ancestry tracts could be inferred simply by phasing the genome on the individual level (i.e., you know the sequence of each homolog inherited from each parent, instead of just genotype values). There would be linkage disequilibrium within the population because particular variants would be associated with others across loci due to recent distinct ancestry at the genomic level. If you noticed that SNP 1 had an African allele, then SNP 2 located nearby in the locus is also more likely to have an African allele than expectation, until the point that linkage equilibrium is attained.
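To make the admixture example concrete, here is a toy calculation, a sketch of my own rather than anything from the paper. It assumes two source populations that are each internally at linkage equilibrium, and asks how much linkage disequilibrium appears the moment you pool them. The function name and the allele frequencies are hypothetical, chosen only for illustration.

```python
def admixture_ld(m, source1, source2):
    """LD between two SNPs right after mixing two source populations.

    m        -- fraction of the mixed population drawn from source 1
    source1  -- (freq of allele A at SNP 1, freq of allele B at SNP 2) in source 1
    source2  -- the same pair of frequencies in source 2
    Assumption: within each source the two SNPs are independent.
    """
    pA1, pB1 = source1
    pA2, pB2 = source2
    # Frequency of the A-B haplotype in the pooled population
    pAB = m * pA1 * pB1 + (1 - m) * pA2 * pB2
    # Marginal allele frequencies in the pooled population
    pA = m * pA1 + (1 - m) * pA2
    pB = m * pB1 + (1 - m) * pB2
    # D: excess of the A-B haplotype over the random-assortment expectation
    D = pAB - pA * pB
    # r^2: D squared, scaled by the allele-frequency variances
    r2 = D ** 2 / (pA * (1 - pA) * pB * (1 - pB))
    return D, r2


# Hypothetical frequencies: both alleles common in source 1, rare in source 2
D, r2 = admixture_ld(0.5, (0.9, 0.8), (0.1, 0.2))
print(D, r2)
```

Under these assumptions D works out to m(1 - m) times the product of the allele frequency differences between the sources, which is why only SNPs that differ in frequency between the parental groups carry any admixture signal.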

As I noted above, these associations are broken apart over time in a regular fashion by genetic recombination. Therefore, the decay in linkage disequilibrium across the genome can allow you to infer time since a putative admixture event. This works at various time depths. African Americans have long range LD because the admixture was relatively recent. To date older admixture events one must be more cunning, as the LD decays and becomes exceedingly faint as recombination hacks apart previous distinctive associations as two genetic backgrounds merge. But what about multiple admixture events and the consequent linkage disequilibrium patterns? What the authors did in the above paper was to test the fit of the data to a composite of LD curves in scenarios where it seems likely that there were two possible admixture events. And, they found multiple populations which did fit this model.
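The dating logic can be sketched in a few lines. This is a toy model under stated assumptions, not the authors' pipeline: it takes admixture LD at genetic distance d (in Morgans) to decay as exp(-n d) after n generations, treats a two-event history as a sum of two such curves, and recovers n from a noise-free curve by a log-linear fit. The real methods described in Loh et al. work with weighted LD statistics and reference populations. All function names, amplitudes, and the 30-year generation time are hypothetical.

```python
import math


def ld_curve(d, amplitude, generations):
    """Expected admixture LD at genetic distance d (Morgans), single pulse."""
    return amplitude * math.exp(-generations * d)


def two_pulse_curve(d, amp1, gen1, amp2, gen2):
    """Composite curve for two admixture events: a sum of two exponentials."""
    return ld_curve(d, amp1, gen1) + ld_curve(d, amp2, gen2)


def estimate_generations(distances, ld_values):
    """Least-squares slope of log(LD) against distance; the slope is -n."""
    logs = [math.log(v) for v in ld_values]
    k = len(distances)
    mean_d = sum(distances) / k
    mean_log = sum(logs) / k
    sxy = sum((d - mean_d) * (y - mean_log) for d, y in zip(distances, logs))
    sxx = sum((d - mean_d) ** 2 for d in distances)
    return -sxy / sxx


# Distances from 0.5 cM to 20 cM, expressed in Morgans
distances = [0.005 * i for i in range(1, 41)]

# Single pulse 50 generations ago (hypothetical value)
single = [ld_curve(d, 0.01, 50) for d in distances]
n_single = estimate_generations(distances, single)

# Two pulses, 50 and 100 generations ago (hypothetical values)
mixed = [two_pulse_curve(d, 0.01, 50, 0.01, 100) for d in distances]
n_mixed = estimate_generations(distances, mixed)

YEARS_PER_GENERATION = 30  # assumed generation time, not taken from the paper
print(n_single * YEARS_PER_GENERATION, n_mixed)
```

Forcing a single exponential onto the two-pulse curve yields a date somewhere between the two true events, which is the reason to test a composite of curves rather than one when two admixture events are plausible.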

Dispensing with the technicalities, here are the results of admixture events as inferred from the LD decay curves:

The most parsimonious model that Pickrell et al. propose is as simple as it is crazy.

1) An ancient initial admixture event in the environs of the Horn of Africa between a proto-West Eurasian population and a proto-Sudanic population

2) A second admixture event which occurs when a population derived downstream from event 1 encounters the ancestors of the Khoisan

Pickrell et al. infer a ~3,000 year old admixture event between West Eurasians and Africans for the Semitic populations of the Ethiopian plateau in keeping with Pagani et al.’s only marginally less crazy results. Then you have step 2, with an admixture between proto-Bushmen/proto-Khoe and the hybrid East Africans ~1,500 years ago. Let us accept these genetic results on the face of it. What they bring home to me is the power of culture. Though vastly diminished today, groups such as the Khoe Nama managed to preserve their integrity and independence down to the period of European colonialism (only being truly decimated in Namibia in the early 20th century by the Germans). A wave of Bantu farmers overwhelmed most of southern Africa, but select groups of Khoisan managed to maintain zones of habitation where they persisted with their unique cultural traditions and perpetuated their language. Some of this surely was ecology, as the vast Karoo region is not particularly amenable to the Bantu cultural toolkit. But, I also suspect that institutional and economic (e.g. cattle culture) influences that the East Africans had upon the Khoe, and perhaps even indirectly the Bushmen, also made these populations more robust to the Bantu expansion than otherwise would have been the case.

Being a preprint on arXiv, the paper of which I speak here is free to you, and copiously explained in loving detail in the supplements in terms of method and madness. I am not particularly enthusiastic about having long discussions about how these results are crazy and can not be right. They are crazy. But I know enough about the methodology here to understand the logic, and accept that the authors are grasping at something very strange and true, even if their particular interpretation and specific results may be disputable. Let me quote the paper at this point:

The hypothesis that west Eurasian ancestry entered eastern Africa through Arabia must be reconciled with the observation that the best modern proxies for this ancestry are often found in southern Europe rather than the Middle East (Supplementary Table 4). This observation can be interpreted in the context of ancient DNA work in Europe, which has shown that, approximately 5,000 years ago, people genetically closely related to modern southern Europeans were present as far north as Scandinavia [Keller et al., 2012; Skoglund et al., 2012]. We thus find it plausible that the people living in the Middle East today are not representative of the people who were living the Middle East 3,000 years ago. Indeed, even in historical times, there have been extensive population movements from and to the Middle East [Davies, 1997; Kennedy, 2008].

Think on that. If Pickrell et al. are right do you think that the Middle East is particularly special in this regard? I will say that it comes to mind that the high consanguinity may result in strange outcomes if one is not careful with the sampling strategy (I’m thinking of the Samaritans I see in their data), though I doubt that this is an incautious group. But I do think it is plausible that some European populations are better proxies for the ancient Levantines than the modern Levantines because the latter have been washed over by multiple demographic waves (though I want to see more comparisons with Christian Arab* samples).

A second bombshell dropped by Pickrell et al.:

We note that we have interpreted admixture signals in terms of large-scale movements of people. An alternative frame for interpreting these results might instead propose an isolation-by-distance model in which populations primarily remain in a single location but individuals choose mates from within some relatively small radius. In principle, this sort of model could introduce west Eurasian ancestry into southern Africa via a “diffusion-like” process. Two observations argue against this possibility. First, the gene flow we observe is asymmetric: while some eastern African populations have up to 50% west Eurasian ancestry, levels of sub-Saharan African ancestry in the Middle East and Europe are considerably lower than this (maximum of 15% [Moorjani et al., 2011]) and do not appear to consist of ancestry related to the Khoisan. Second, the signal of west Eurasian ancestry is present in southern Africa but absent from central Africa, despite the fact that central Africa is geographically closer to the putative source of the ancestry. These geographically-specific and asymmetric dispersal patterns are most parsimoniously explained by migration from west Eurasia into eastern Africa, and then from eastern to southern Africa.

Isolation-by-distance is alluded to implicitly when we speak of human genetic variation as clinal. And it’s not totally lacking in utility as a null model. But I think we need to add another layer of complexity upon this parsimonious elegance of human clans eternally exchanging mates in monotonous step-wise fashion. Multiple populations over the past 10,000 years (and likely earlier!) were rocked by massive demographic turmoil, as foreigners from afar amalgamated themselves upon the local substrate, and abolished the old to bring forth something new. The author of this post is himself a product of such an event. The genetic story of mankind is not just one of continuous and diffuse gene flow gradually over a landscape of small-scale societies. No, this placid background condition was periodically perturbed by an explosion of translocating peoples, likely triggered by a technological or cultural revolution of some sort. The genetic impact in many cases is too great to be anything but a folk wandering.

Unlike isolation-by-distance these patterns do not flow linearly across space, but exhibit discordant lashing patterns through ecologically fertile terrain. Rather than a mist gliding across the plains, imagine a flash flood scouring a ravine. A more gentle analogy would be that these are demographic ripples, which expand outward, temporarily distorting the calm surface of isolation-by-distance dynamics, and eventually fading back into the background and becoming the new normal. But once the ripple has faded how do we know that it once was? That is a difficult thing indeed, and these results indicate the problems inherent. It may be that the echoes of the ripple that Pickrell et al. detect issue from a source which no longer exists. Are the scions of the first farmers of the ancient Levant hidden away in the valleys of Tuscany and the plains of Tanzania? A crazy proposition also, but not necessarily a false one.

* I know some Christian Arabs do not want to be called Arabs.

Sahul 10,000 years ago

John Hawks has a very long rumination on the story of blonde Melanesians which came out last week. If I can read between the lines I think some of the implications dovetail with John’s thesis in his 2007 paper on adaptive acceleration. But I’ll leave the deep reading of tea leaves to those better versed in such affairs. Rather, I will comment on two issues. The first is specific. I believe that the TYRP1 R93C allele responsible for blonde hair among the Solomon Islanders is going to be found to be the same one responsible for blonde hair around New Guinea and among some Australian Aboriginals.

There are several reasons why I suspect this. First, these Oceanian populations do seem to be a distinctive clade. There are some disagreements among geneticists as to whether they are the descendants of the first settlers of Oceania in totality, or whether they’re a compound lineage. The natural historical details need to be teased apart, but there’s no doubt that they’re a distinct and separate phylogenetic lineage in relation to other human populations (with some admixture from Austronesians in the case of Melanesians). Second, the phenotype of rather light hair, and dark skin, is striking and parallel among all these populations. It is not entirely impossible that random genetic drift could stumble upon an architecture which results in such a suite of characteristics, but I’m generally skeptical of this possibility of having occurred several times in the same human lineage, when it seems relatively rare in other populations (the loci associated with light eyes and light hair in Eurasians seem to have some effect on skin color as well, so it is difficult, though not impossible, to have very dark skin and light hair). Third, as noted in John’s post, this seems an “old” variant. A new mutation which rose rapidly in frequency should have a lot of associated markers flanking it (ergo, high linkage disequilibrium). Again, we need to take into account the joint information here; there seem several Oceanian populations where the allele is at high, but still minor, frequency (assuming it is the same allele). Oceanian populations, especially those verging toward Melanesia, have low effective population sizes. If this is an old variant I am curious as to why it has not fixed in any of the daughter populations (recall that if it did fix, and admixture recently drove it below ~100 percent, there would be a lot of linkage disequilibrium). 
Therefore, I think we have to seriously consider the possibility that balancing selection is maintaining this polymorphism in the Oceanian populations. This does not entail that the selection is targeting the hair color phenotype, though it may.

In his post John Hawks seems to suggest that ancient variation in pigmentation, phenotypic and genotypic, may be somewhat different from what was the norm in Eurasia over the past 10,000 years. Rather than focusing on this issue, I will put forward a proposition of only moderate boldness, and lay out my own suspicion of what the next few years will tell us about the human past. First, let us set the stage. Between 100,000 and 15,000 years ago modern humans spread out across the world, to every continent where they reside today. There were several “touch & go” moments. The Toba event may have been one, and the Last Glacial Maximum perhaps another. Human populations waxed & waned. Because of low densities, and resultant barriers to gene flow, many of the deep phylogenies evident in the mtDNA and Y chromosomal record date to this period. It does not seem that modern humans expanded rapidly from one or two populations ~15,000 years ago. Mitochondrial Eve tells us that is not so. As I have stated before the “state of the art” in 2005 would have ended with this story. By about ~15,000 years ago, as Amerindians reached the precipice of the New World, pre-Columbian patterns of genetic variation were settling in.

I do not believe that this story holds up. It does not seem to accurately describe how the 20 percent of the world’s population which is South Asian came to be. More precisely, there was always a peculiar discordance in the uniparental lineages for South Asians; the maternal line was closer to East Eurasians, the paternal close to West Eurasians. The autosomal data, which looked at the whole genome, indicated a moderately closer affinity to West Eurasians, though as a function of geography (northwest to southeast) and caste (high to low). These conundrums are solved by a recent admixture of >10,000 years before present. But it is not simply the existence of particular sets of humans. The patterns of genetic variation in Africa today are very new, and owe little to the Last Glacial Maximum. Rather, it seems that the Bantu expansion wiped clean much of Sub-Saharan Africa of pre-existent variation. The Khoisan, Pygmies, and Hadza being the remnants of vast numbers of pre-Bantu populations, which seem to have had marginal effect on the genetics of East African Bantus.

The two cases above illustrate dual dynamics which I believe shaped the nature of modern human genetic variation in the early Holocene: the great pruning and the great synthesis. The great pruning consists of the marginalization and assimilation of vast amounts of cultural diversity, and perhaps genetic diversity as well. It seems likely that the Pygmies and Khoisan have archaic admixture not found in other Africans, while Oceanians have archaic admixture not found in other humans. If it were not for geography, and the practice of horticulture, the populations of Near Oceania may have suffered the fate of the Negritos of Malaysia and the Philippines. But speaking of Negritos, the Andaman Islanders, and South Asians, illustrate the second aspect: the great synthesis. South Asians are a compound in part descended from the cousins of the Andaman Islanders. In The First Farmers Peter Bellwood lays out an elegant model of demic diffusion from core agriculture hearths, which explains the modern linguistic and genetic patterns we see around us. As an extreme null hypothesis it is useful, but I think we need to be cautious about taking a stylized fact too literally. Bellwood’s thesis came close to positing ancient agricultural apartheid. But mtDNA remains imply strongly that women of local lineages were absorbed into expanding populations (a result which is eminently plausible given historic patterns). Racial purity has never been high on the list of human male “to-do’s.” There is a moderate probability that within the next few years other Eurasian populations will be seen to be similar to South Asians in their origin: products of a dialectic between the conqueror and the conquered, between the intruder and the indigene. 
Finally, in the shadows, and poorly understood from what I can glean, is the third dynamic which comes to the fore in the receding tide of the action of the great pruning and synthesis: the sons of the soil who manage to dodge the fate of being pruned from the human tree by finding a different strategy of survival.

Obviously this is not the whole story. The Columbian Exchange is one chapter, which post-dates the prehistoric pruning and synthesis, though it echoes them in many ways. The expansion of the Han Chinese is another story which post-dates the period to which I allude. But these stories are known and counted. What I am suggesting here is that in the near future dynamics only faintly recollected will begin to move out of the shadows, because they have left an imprint in our genes.

After linking to Marnie Dunsmore’s blog on the Neolithic expansion, and reading Peter Bellwood’s First Farmers, I’ve been thinking a bit on how we might integrate some models of the rise and spread of agriculture with the new genomic findings. Bellwood’s thesis basically seems to be that the contemporary world pattern of expansive macro-language families (e.g., Indo-European, Sino-Tibetan, Afro-Asiatic, etc.) is a shadow of the rapid demographic expansions in prehistory of farmers. In particular, hoe-farmers rapidly pushing into virgin lands. First Farmers was published in 2005, and so it had access mostly to mtDNA and Y chromosomal studies. Today we have a richer data set, from hundreds of thousands of markers per person, to mtDNA and Y chromosomal results from ancient DNA. I would argue that the new findings tend to reinforce the plausibility of Bellwood’s thesis somewhat.

The primary datum I want to enter into the record in this post, which was news to me, is this: the island of Cyprus seems to have been first settled (at least in anything but trivial numbers) by Neolithic populations from mainland Southwest Asia.* In fact, the first farmers in Cyprus perfectly replicated the physical culture of the nearby mainland in toto. This implies that the genetic heritage of modern Cypriots is probably attributable in the whole to expansions of farmers from Southwest Asia. With this in mind let’s look at Dienekes’ Dodecad results at K = 10 for Eurasian populations (I’ve reedited a bit):


Modern Cypriots exhibit genetic signatures which shake out into three putative ancestral groups. West Asian, which is modal in the Caucasus region. South European, modal in Sardinia. And Southwest Asian, which is modal in the Arabian peninsula. Cypriots basically look like Syrians, but with less Southwest Asian, more balance between West Asian and South European, and far less of the minor components of ancestry.

Just because an island was settled by one group of farmers, it does not mean that subsequent invasions or migrations could not have an impact. The indigenous tribes of Taiwan seem to be the original agriculturalists of that island, and after their settlement there were thousands of years of gradual and continuous cultural change in situ. But within the last 300 years settlers from Fujian on the Chinese mainland have demographically overwhelmed the native Taiwanese peoples.

During the Bronze Age it seems Cyprus was part of the Near East political and cultural system. The notional kings of Cyprus had close diplomatic relations with the pharaohs of Egypt. But between the end of the Bronze Age and the Classical Age Cyprus became part of the Greek cultural zone. Despite centuries of Latin and Ottoman rule, it has remained so, albeit with a prominent Turkish minority.

One thing notable about Cyprus, and which distinguishes it from mainland Greece, is the near total absence of a Northern European ancestral component. Therefore we can make the banal inference that Northern Europeans were not initially associated with the demographic expansions of farmers from the Middle East. Rather, I want to focus on the West Asian and Southern European ancestral components. One model for the re-population of Europe after the last Ice Age is that hunter-gatherers expanded from the peninsular “refugia” of Iberia and Italy, later being overlain by expansions of farmers from the Middle East, and perhaps Indo-Europeans from the Pontic steppe. I have a sneaking suspicion though that what we’re seeing among Mediterranean populations are several waves of expansion out of the Near East. I now would offer the tentative hypothesis that the South European ancestral element at K = 10 is a signature of the first wave of farmers which issued out of the Near East. The West Asians were a subsequent wave. I assume that the two groups must correlate to some sort of cultural or technological shift, though I have no hypothesis as to that.

From the above assertions, it is clear that I believe modern Sardinians are descendants of that first wave of farmers, unaffected by later demographic perturbations. I believe that Basques then are a people who emerge from an amalgamation of the same wave of seafaring agriculturalists with the indigenous populations preceding them (the indigenes were likely the descendants of a broad group of northern Eurasians who expanded after the end of the last Ice Age from the aforementioned refugia). They leap-frogged across fertile regions of the Mediterranean and pushed up valleys of southern France, and out of the Straits of Gibraltar. Interestingly, the Basques lack the West Asian minority element evident in Dienekes’ Spaniards, Portuguese, as well as the HGDP French (even up to K = 15 they don’t shake out as anything but a two way admixture, while the Sardinians show a minor West Asian component). Also, the West Asian and Southern European elements are several times more well represented proportionally among Scandinavians than Finns. The Southern European element is not found among the Uyghur, though the Northern European and West Asian ones are. I infer from all these patterns that the Southern European element derived from pre-Indo-European farmers who pushed west from the Near East. It is the second largest component across much of Northwestern Europe, the largest across much of Southern Europe, including Greece.

A second issue which First Farmers clarified is the difference between the spread of agriculture from the Near East to Europe and to South Asia. It seems that the spread of agriculture across South Asia was more gradual, or at least had a longer pause, than in Europe. A clear West Asian transplanted culture arrived in what is today Pakistan ~9,000 years ago. But it does not seem that the Neolithic arrived to the far south of India until ~4,000 years ago. I think that a period of “incubation” in the northwest part of the subcontinent explains the putative hybridization between “Ancestral North Indians” and “Ancestral South Indians” described in Reconstructing Indian population history. The high proportion of “Ancestral North Indian,” on the order of ~40%, as well as Y chromosomal markers such as R1a1a, among South Indian tribal populations, is a function of the fact that these groups are themselves secondary amalgamations between shifting cultivators expanding from the Northwest along with local resident hunter-gatherer groups which were related to the ASI which the original West Asian agriculturalists encountered and assimilated in ancient Pakistan (Pathans are ~25% ASI). I believe that the Dravidian languages arrived from the Northwest to the south of India only within the last 4-5,000 years with the farmers (some of whom may have reverted to facultative hunter-gathering, as is common among tribals). This relatively late arrival of Dravidian speaking groups explains why Sri Lanka has an Indo-European presence to my mind; the island was probably only lightly settled by farming Dravidian speakers, if at all, allowing Indo-European speakers from Gujarat and Sindh to leap-frog and quickly replace the native Veddas, who were hunter-gatherers.

Note: Here is K = 15.

* Wikipedia says there were hunter-gatherers, but even here the numbers were likely very small.

