If you read Nell Irvin Painter’s The History of White People you will learn that the white race is a social construction of relatively recent vintage. When I read her work in 2011 I was a touch annoyed by it, because a lot of interesting empirical data was shoehorned into her thesis and preferences. In relation to her putative topic, she wasn’t a big fan (I don’t doubt that Painter likes white people as humans, but she obviously thinks that the invention of the white race was not a good thing). I have serious reservations about, and objections to, these sorts of Manichaean frameworks. And yet over the last few years I have come to a new and very different perspective: I believe white people emerged biologically only in the past 5,000 years, on the edge of history and prehistory. I think a plain reading of the race concept in biology is entirely defensible so long as you integrate population thinking. But human races are not primordial. They aren’t even Pleistocene.
This brings us to the Kalash of Pakistan. They are pagans who live in the fastness of the Chitral. Their cousins on the other side of the border, in Afghanistan, are the Nuristanis, who were forcibly converted to Islam in the last decade of the 19th century. The Man Who Would Be King takes place among the Nuristanis, who were then termed Kafirs. It was written in 1888, before the conversion to Islam. The Kalash were in British India, so spared from conversion. It seems unlikely that they will persist beyond this generation due to the social-political milieu of modern Pakistan, where religious toleration only exists for economic elites who can withdraw into their own private world. It was this context which drove Gerard Russell to include the Kalash in his book Heirs to Forgotten Kingdoms: Journeys Into the Disappearing Religions of the Middle East. The Kalash are not Middle Eastern, and are very different from various heterodox groups of the Middle East (who often have connections to the astral religion of Late Antiquity), but there is an urgency in recording their culture before it disappears.
Another salient aspect of the Kalash is that they are mostly white. That is, if you took a Kalash man and dressed him in jeans and a baseball cap, you wouldn’t think twice if you saw him in a country music video. Let me quote from The Man Who Would Be King:
“‘In another six months,’ says Dravot, ‘we’ll hold another Communication and see how you are working.’ Then he asks them about their villages, and learns that they was fighting one against the other and were fair sick and tired of it. And when they wasn’t doing that they was fighting with the Mohammedans. ‘You can fight those when they come into our country,’ says Dravot. ‘Tell off every tenth man of your tribes for a Frontier guard, and send two hundred at a time to this valley to be drilled. Nobody is going to be shot or speared any more so long as he does well, and I know that you won’t cheat me because you’re white people — sons of Alexander — and not like common, black Mohammedans. You are my people and by God,’ says he, running off into English at the end — ‘I’ll make a damned fine Nation of you, or I’ll die in the making!’
… Dravot gives out that him and me were gods and sons of Alexander, and Past Grand-Masters in the Craft, and was come to make Kafiristan a country where every man should eat in peace and drink in quiet, and specially obey us. Then the Chiefs come round to shake hands, and they was so hairy and white and fair it was just shaking hands with old friends. We gave them names according as they was like men we had known in India — Billy Fish, Holly Dilworth, Pikky Kergan that was Bazar-master when I was at Mhow, and so on, and so on.
The genetics on the pigmentation loci make it clear why the Kalash are so fair. They are fixed at SLC24A5 for the derived variant. In fact their pigmentation genes are rather similar in allele frequency distribution to Sardinians (check SLC45A2 and OCA2/HERC2). In a European context the Kalash are not notably fair-skinned, but a substantial number can clearly pass as white without difficulty because for all practical purposes they are white physically. The observations of Kipling’s narrator in The Man Who Would Be King hold true today: white Western journalists who need to pretend to be native in Afghanistan take on a Nuristani identity. Even if most Nuristanis and Kalash are not blue-eyed and blond-haired, enough are that it is not totally implausible that a fair Northern European could pass as one of them.
Though the Nuristanis and Kalash are at one end of the distribution in South Asia, they’re not total aberrations. Many Pathans, for example, basically look white. Above I posted the photo of Ayub Khan, military dictator of Pakistan in the 1960s. He was an ethnic Pathan. Khan loomed large in my father’s recollection of this period. When my father arrived in Pakistan to complete his master’s degree he was surprised that most people were not white like Ayub Khan!
Which brings me to the question: if a subset of people on the northwest fringes of the Indian subcontinent are physically white, are they then related to the peoples of Europe to an inordinate degree? In the 19th century the presumption was they were, insofar as these were “Lost White Races,” with some theorists positing connections between high caste Indians and Europeans as Aryans. These sorts of mental frameworks are not particularly unique to Europeans. I’m mostly finished with The Making of Modern Japan, and the Japanese immediately made an analogy in appearance between the Europeans entering their waters and the Ainu people to their north. And then there is the legend of Alexander. In particular, that the Kalash are descended from the Macedonians and Greeks who marched with Alexander. That in truth they are a lost European tribe. I get questions about this pretty much every three to four months. I always answer in the negative. There is no strong evidence of a specific connection. I’ve even made it into the Wikipedia entry for the Kalash:
Discover Magazine genetics blogger Razib Khan has repeatedly cited information indicating that the Kalash are an Indo-Iranian people with no Macedonian ethnic admixture. A study by Hellenthal et al. (2014) on the DNA of the Kalash people showed evidence of input from Europe or the Middle East (the researchers could not pin down a precise geographic location) between 990 and 210 BC, a period that overlaps with that of Alexander the Great.
The paper cited there, which offers an opening to the possibility of Kalash connections to the Macedonians, comes up frequently. It’s known to me, and though the group associated with it is top-notch, and the results are certainly impressive, their interpretations are not bulletproof (and the authors are reasonably tentative). I went back and re-read the Hellenthal et al. paper, and checked out their awesome website where you can repeat their analyses. The screenshot to the left shows the Kalash admixture event. They have Greeks and Bulgarians in their data, but the gene flow is from Northern Europe.
Enough talk though. I have data, and will do some more analyses myself. The preliminaries. I took the Reich lab Haak et al. data set (it’s a subset of this), and yanked out a bunch of populations. Additionally, I took the four Yamnaya samples with the best quality genotypes, and created a data set where all their genotypes are included and those that they are missing are excluded (the --mind option in Plink). What I’m saying here is that the variation in the data set is skewed toward the good SNP calls in the ancient Yamnaya samples. After some more quality control I got down to 85,000 SNPs.
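The filtering idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline actually used: genotypes are assumed to sit in a NumPy array of shape (samples, SNPs) with -1 marking a missing call, and the function name `keep_fully_called_snps` and the toy matrix are hypothetical. For what it's worth, in PLINK 1.9 per-SNP missingness is normally filtered with --geno, while --mind applies the same kind of threshold to samples.

```python
import numpy as np

MISSING = -1  # assumed missing-call code for this sketch


def keep_fully_called_snps(genotypes, reference_rows):
    """Return a boolean mask over SNPs with no missing call in any reference sample."""
    genotypes = np.asarray(genotypes)
    reference = genotypes[reference_rows, :]
    # A SNP survives only if every reference (e.g. ancient) sample has a call there.
    return (reference != MISSING).all(axis=0)


# Toy matrix (made up): rows 0-1 stand in for ancient samples, rows 2-3 for modern ones.
toy = np.array([
    [0, 1, MISSING, 2, 1],
    [1, MISSING, 0, 2, 0],
    [2, 1, 1, 0, MISSING],
    [0, 0, 2, 1, 1],
])
mask = keep_fully_called_snps(toy, [0, 1])
filtered = toy[:, mask]  # all samples, but only SNPs called in both reference rows
```

Note that a modern sample may still have a missing call at a retained SNP (the last column of row 2 here); the mask is driven entirely by the reference samples, which is the sense in which the variation ends up skewed toward their good calls.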
First, here is some PCA….
You can’t see it on the thumbnail, and the colors are confusing if you click it, but the Kalash sit square on the northwest edge of South Asian populations. Exactly where you’d expect them to be if they were indigenous to South Asia, and not European transplants. The earlier genetic markers I talked about were a narrow set related to pigmentation. This is genome-wide, sampled out of the 30 million polymorphisms. If you took the pigmentation related loci, Kalash would probably cluster on the edge of Europe. What this shows is that not all genes are representative of genome-wide patterns. SLC24A5 seems to have been subject to selection within South Asia, in situ.
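For readers who want to see what a genome-wide PCA of this kind does mechanically, here is a minimal sketch. It is not the software or settings behind the plots in this post; it assumes a complete (samples x SNPs) matrix of 0/1/2 allele counts, uses a common per-SNP centering and scaling, and the function name `genotype_pca` and the two simulated populations are hypothetical.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """PCA on a (samples x SNPs) matrix of 0/1/2 allele counts with no missing calls."""
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0             # per-SNP allele frequency
    polymorphic = (freq > 0) & (freq < 1)   # monomorphic sites carry no signal
    g, freq = g[:, polymorphic], freq[polymorphic]
    # Center each SNP and scale by its expected binomial standard deviation.
    z = (g - 2.0 * freq) / np.sqrt(2.0 * freq * (1.0 - freq))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]  # sample coordinates on each PC


# Simulated toy data: two populations with very different allele frequencies.
rng = np.random.default_rng(0)
n_snps = 200
pop_a = rng.binomial(2, 0.2, size=(20, n_snps))
pop_b = rng.binomial(2, 0.8, size=(20, n_snps))
scores = genotype_pca(np.vstack([pop_a, pop_b]))
```

Because every SNP gets equal weight after scaling, a handful of strongly selected loci (such as the pigmentation genes) cannot dominate the axes, which is why the genome-wide picture can differ from the one those loci alone would give.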
Next I want to zoom in a bit to make a point. You probably want to click to enlarge, but from the top right to bottom left: Greeks, Lithuanians, Yamnaya, Pathan/Kalash.
You notice in this plot that the Kalash are closer to Lithuanians than Greeks. I think a fair minded person would say that the Kalash look more like Greeks than Lithuanians, that is, they’re brunette whites. But genome-wide data show that they are closer to Lithuanians! This is in line with the results you saw above from the Globetrotter genetic admixture methodology. Kalash affinities in Europe are not with Southern Europeans, but Northern Europeans.
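One way to make "closer" on a PCA plot concrete is to compare distances between population centroids in PC space. The sketch below assumes you already have per-sample PC coordinates and population labels; the function name `centroid_distances`, the labels X, Y and Z, and the coordinates are all made up for illustration and are not values from the plots here.

```python
import numpy as np


def centroid_distances(scores, labels, target):
    """Euclidean distance from the target group's centroid to each other group's centroid."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    groups = sorted(set(labels.tolist()))
    centroids = {name: scores[labels == name].mean(axis=0) for name in groups}
    return {
        name: float(np.linalg.norm(centroids[name] - centroids[target]))
        for name in groups
        if name != target
    }


# Made-up PC coordinates for three hypothetical populations.
toy_scores = [[0.0, 0.0], [0.2, 0.0], [1.0, 0.0], [1.2, 0.0], [3.0, 4.0], [3.2, 4.0]]
toy_labels = ["X", "X", "Y", "Y", "Z", "Z"]
dist = centroid_distances(toy_scores, toy_labels, "X")
```

Distances computed this way only reflect the components you feed in, so a comparison on two PCs is a summary of those two axes and nothing more.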
Next, we’ll look at PC 3.
Click the image to see it bigger. What PC 3 in these data maps onto is a Papuan (up top) to South Asian (bottom) axis. The Kalash are one of the most South Asian populations on this axis! Don’t make too much of this, as there aren’t any South Indian groups. But it shows that there is something distinctive about the Kalash which is like many other South Asians, and not like Europeans. Not surprisingly the Iranian samples are somewhat shifted toward the South Asians.
The above plots are a bit cluttered. So let’s look at a subsample. Below is a zoom in. PC 1 separates East Asians (off to the left of the plot, not visible) from Europeans. PC 2 separates Yamnaya from everyone else (they are below the bottom edge). I’ve highlighted a few populations.
You can see that the Lithuanians are the most Yamnaya-shifted population. But the Kalash and other Northwest South Asian groups are Yamnaya-shifted as well. Not surprisingly, the Druze and Sardinians are the least Yamnaya-shifted. The Greeks are not notably Yamnaya-shifted, though they are in comparison to the Sardinians.
Now we’ll run Treemix. The parameters were -m = 5 and -k = 500. I ran a bunch of iterations. The plots are below.
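For anyone wanting to reproduce runs like these, the sketch below shows how the two parameters map onto a TreeMix command line: -m is the number of migration edges to fit and -k is the number of SNPs grouped into each block (to account for linkage disequilibrium). The helper `treemix_command`, the input file name, and the output prefixes are hypothetical; the function only builds the argument list and does not run anything.

```python
def treemix_command(input_path, output_prefix, migration_edges=5, block_size=500):
    """Build the argument list for one TreeMix run (nothing is executed here)."""
    return [
        "treemix",
        "-i", input_path,             # gzipped allele-count file (name is hypothetical)
        "-m", str(migration_edges),   # number of migration edges to fit
        "-k", str(block_size),        # SNPs per block
        "-o", output_prefix,          # prefix for the output files
    ]


# One command per iteration, each writing to its own output prefix.
commands = [treemix_command("kalash_subset.treemix.gz", f"run_{i}") for i in range(3)]
```

Each list can be handed to `subprocess.run` if TreeMix is installed and on the path.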
The Kalash are drifted a lot, so they often sit on a long branch. But you see that most often they are near the other Northwest South Asians. There is no gene flow parameter from Europe, though that’s probably a function of the other gene flow events being much more significant. So let’s cut down the data set to the same extent as with the PCA.
The Kalash are much closer to Europeans in these plots than some South Asians. But why? Clearly it is partly a function of the positioning and affinities of the Yamnaya. Additionally, the Kalash cluster with the Pathans, and to some extent other Northwest South Asians. Geography is stamped genome-wide.
Let me quote from the supplements of Hellenthal et al.:
The Kalash are a geographically and genetically (39) isolated population that have lived in a remote valley within present-day Pakistan for many centuries (40; 66). In the original (Full) analysis, the Kalash possess our oldest estimated date of most recent admixture, of 600BCE (990-210BCE), between sources best represented today by Germany-Austria (though within a range of potential European-related sources, e.g. represented by Turkey in the CentralAsia analysis; 35%) and the nearby Pathan (65%). Intriguingly, this period overlaps that of Alexander the Great (356-323BCE) whose army, local tradition holds, the Kalash are descended from (40). The history of this group is not known: our analysis suggests a major admixture event from a source related to present-day Western Eurasians, but we cannot identify the geographic origin of this ancient source precisely.
In the “Central Asia” analysis of Note S7.4 (but not in the “full” analysis), a very similar ancient admixture signal (always dated older than 90BCE) is seen in five nearby Pakistan populations: the Makrani, Balochi, and Brahui, and more weakly in the Pathan and Sindhi, but not identified in the most northerly groups. Ancient admixture involving sources related to East Asia is inferred in the easterly Burusho and tentatively (within a second signal) the Kalash. These older events are similar in date to that seen in the Kalash but involve less strongly European-like, and more West Asian like, sources (Figure 4; Figure S18), and pre-date recorded history for the region.
The power to detect the events seems a bit weak. Probably better phasing (they had 425,000 markers and used population-based methods) and sample coverage would help. But I think what they’re seeing here are two migration events. First, one with affinities to northern West Asia, which is the majority of the “Ancestral North Indian” (ANI) signal. It is overwhelming in the south and northeast of the subcontinent. A secondary wave probably relates to the Indo-Aryans. It is substantial in the northwestern regions, and less so as you proceed into the Gangetic plain, and present only among Brahmins and other migrants in southern India. It probably correlates well with lactase persistence. I suspect that the Jatts may actually have substantial ancestry from post-Aryan waves based on genetic results I’ve seen.
Where does this leave us in relation to the Kalash? Why is it that they look so much like European whites when phylogenetically they aren’t much more like European whites than many people around them? A few years ago I discussed Indian genetics with John Hawks, and one objection I had to the idea of a European-affiliated Indo-Aryan migration of any substantial demographic heft is that European pigmentation alleles are so rare in South Asians. I’m particularly thinking of European variations of SLC45A2 and OCA2/HERC2. I now understand that my assumptions were wrong. 4,000 years ago Europeans did not look like Europeans!
First, as outlined in Population genomics of Bronze Age Eurasia, Massive migration from the steppe was a source for Indo-European languages in Europe, and Eight thousand years of natural selection in Europe, the genetic character of Europeans as we understand it is a recent phenomenon. Second, as outlined in Genetic Evidence for Recent Population Mixture in India, the genetic character of South Asians is also a recent phenomenon. In fact, both are of the same period, with the finishing touches probably around ~3,000 to 4,000 years ago.
What does this mean? Well, it could be that the “white” phenotype emerged several times in variously related people. In other words, the similarities between the Kalash and Southern Europeans are due to convergence, not common descent. This is reasonable, since all the best evidence now suggests that in many ways the most ancient Southern European populations, such as Sardinians, are among the most distant from South Asians of the European groups. Of course some of the alleles for pigmentation are common. For example, SLC24A5 has a very explosive haplotype structure with little variation. It’s new across its whole range. There are some suggestions though that it is most diverse in the Middle East. It may have swept across all of Western Eurasia recently. Part of the expansion was demographic no doubt, but part of it was also selection. So being part of the common network of demes, Southern Europeans and Northwest South Asians drew upon some of the same variation as part of their adaptive response to selection pressures.
For skin color the standard explanations are out there. The sun, sexual selection, and changes wrought by agriculture. But can they really explain all these concurrent shifts across Eurasia? In Darwin’s Radio the science fiction author Greg Bear posits a genetic time-bomb within us all introduced by a virus that produces species-wide saltation. So Neanderthals turned into modern humans almost immediately. It’s a science fiction story. But what about the idea of a disease which selects strongly for the derived variant of SLC24A5? The change in skin color is just a side effect. In fact in South Asia it’s not optimal, though with clothing and avoiding direct sun during the midday, people can deal with it. Instead of a great white race sweeping across Eurasia, I’m positing a great white plague. And not just for white people. What about the sweep around EDAR, which results in many of the characteristics so distinctive about East Asians? It’s a major development gene, but perhaps it too is a reaction to a disease?
All these things lead me to a strange place. I think human population structure is a big deal. It’s real, it matters. Genetics and genetic variation matter. But I also think that a lot of it isn’t very deep in terms of time. That is, a lot of the genetic variation is mixed and matched, and of recent vintage. Rather than phylogenetic trees, there are reticulated graphs. But not only is the history of our species’ phylogeny radically conditional on the last 10,000 years, many of the salient physical characteristics are also recent, and seem to be popping up everywhere at the same time. And yet, remember Luke Jostins’ plot which showed parallel increase in encephalization across hominin lineages for millions of years? This may not be the first time that inevitable processes were driving many lineages toward the same end points.