Unlike in some Asian societies, dairy products are relatively well known in South Asia. Apparently at some point my paternal grandmother’s family operated a milk production business. This is notable because Bengal is not quite the land of pastoralists. In much of North India milk and milk products loom larger, in particular ghee. People don’t tend to consume what makes them ill, and even accounting for some processing in the form of butter, most researchers have assumed a substantial number of South Asians must be lactase persistent. That is, they can extract nutritive value out of the lactose sugar present in milk (in addition to fat and protein). Additionally, many South Asians carry the well-known -13910 C>T allele common in Western Eurasia. How do I know this? Because I share my genetic information with lots of South Asians, and some of them, especially Punjabis, come up as “lactose tolerant” on that allele.
-13910 C>T is modal in Northwest India, where cattle culture is most widespread across society. It drops off as one moves south, east, and north, into zones where milk production and products are less integral, or lacking, in the cultural toolkit. The ability to digest lactose as an adult is interesting and nice because it’s a perfect illustration of the power of natural selection to reshape traits. Relatively genetically close populations can be very different depending on whether the trait is favored, or not.
What’s the case here? The authors used many statistical genetic tricks, but I’ll spare you that. First, remember that lactase persistence has emerged multiple times. There’s a mutation which is very common in Northern Europe, which extends into Central Eurasia. This is the same -13910 C>T variant discussed in this paper. Other mutations are localized to the Arabian peninsula and East Africa. The convergent evolution suggests some combination of:
1) Strong selection pressure for this trait in dairying cultures
2) A large mutational target, in that a wide range of changes seem to effect the appropriate shift
3) Low levels of gene flow which allow for different variants to flourish. If gene flow were too ubiquitous then the earliest variant would sweep all the others before it
It turns out that the West Eurasian variant accounts for the overwhelming majority of detected variants known to allow for lactase persistence in India. This is interesting, because there are various genetic and cultural reasons to connect South Asia to West Eurasia (even Europe). There is some genetic evidence to imply that the West Eurasian mutation derives from the Volga region. Though the word does not appear in the text of the paper, it does not take a rocket scientist to infer that this allele may have been introduced by Indo-Aryans. The main counter-argument is that the authors’ statistical corrections seem to imply that, by and large, geography predicts the variation of the trait better than linguistic affinity does (a sharp difference between Indo-Europeans and Dravidians who were neighbors would have been of great interest). The Austro-Asiatics and Tibeto-Burmans are exceptions: for them, language is a good predictor of the lack of lactase persistence. These results make me less skeptical of the possibility that most of the recent admixture from West Eurasia in South Asia was due to the Indo-Europeans. Perhaps they did push south gradually in a continuous manner, yet were culturally discontinuous? This is theoretically not implausible. On the other hand, I do wonder if perhaps the West Eurasian mutation pre-dates the Indo-Europeans. The authors of the paper observe that pastoralism has a 7,000 year history in South Asia. Not as long as Europe, but a long time indeed.
But these results don’t tell us just about ancestry. The region around LCT in Europeans shows a lot of evidence of natural selection. What about in South Asians? It seems that some of the signatures do persist in India. Additionally, there is a strong correlation between pastoralism and lactase persistence. This stands to reason, but it is nice to have that confirmed. This suggests that we need to be careful about inferring too much in regards to ancestry from this locus: it is not a neutral proxy, as it is subject to positive or negative natural selection. The aggregate frequency in their pooled sample is ~0.20, with upper bounds in the range of ~0.75. Based on earlier Y and mtDNA work the authors suggest that it is more likely that these frequency variations and the overall level are a function of natural selection more than ancestry. In other words, a small group of pastoralists brought the favored allele, which spread rapidly to ecologically favored niches.
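To get a feel for how quickly a favored allele can rise from a small introduction, here is a minimal sketch of the textbook deterministic one-locus selection model. It is not the method used in the paper. The starting frequency, the selection coefficient, and the function names are all illustrative assumptions of mine; only the ~0.20 target is taken from the pooled frequency mentioned above. Lactase persistence is treated as fully dominant (h = 1), and drift, migration, and population structure are ignored.

```python
def next_freq(p, s, h=1.0):
    """One generation of selection at a biallelic locus.

    p: current frequency of the favored allele
    s: selection coefficient (hypothetical value, not estimated from data)
    h: dominance; h = 1.0 means heterozygotes are as fit as favored homozygotes
    """
    q = 1.0 - p
    w_AA, w_Aa, w_aa = 1.0 + s, 1.0 + h * s, 1.0
    # Mean fitness under Hardy-Weinberg genotype proportions
    w_bar = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    # Frequency of the favored allele after selection
    return (p * p * w_AA + p * q * w_Aa) / w_bar


def generations_to_reach(p0, target, s, h=1.0, max_gen=100_000):
    """Count generations until the allele frequency first reaches `target`.

    Returns None if the target is not reached within max_gen generations.
    """
    p = p0
    for gen in range(max_gen + 1):
        if p >= target:
            return gen
        p = next_freq(p, s, h)
    return None


if __name__ == "__main__":
    # Illustrative inputs: allele introduced at 1%, assumed advantages of 2% and 5%
    for s in (0.02, 0.05):
        gens = generations_to_reach(p0=0.01, target=0.20, s=s)
        print(f"s = {s}: {gens} generations to reach a frequency of 0.20")
```

The point of the toy model is qualitative: with even a modest advantage, an allele carried in by a small group can reach appreciable frequency without much ancestry coming along with it, and a stronger advantage in pastoralist niches shortens the wait.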