The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog
Austro-Asiatic


Reconstructing Austronesian population history in Island Southeast Asia, Mark Lipson, Po-Ru Loh, Nick Patterson, Priya Moorjani, Ying-Chin Ko, Mark Stoneking, Bonnie Berger, David Reich doi: 10.1101/005603

One of the strangest aspects of human history is the fact that periodically groups on the margins seem to rise to the fore and enter into a phase of rapid expansion into virgin territory. By “virgin” I don’t necessarily mean uninhabited, but rather virgin in relation to the mode of production which defines the expansionary group. A classic illustration of this is the rise of the Anglo-Saxon Diaspora between 1600 and 1900, as it settled territories inhabited by other populations at much lower population densities. The Bantu Expansion is another case in point. What you see in both cases is the migration of a population which has found a way to produce more calories per unit of land, and the weight of numbers resulted in the marginalization and/or absorption of the native populations, to varying degrees. In Anglo North America and Oceania the admixture of indigenous ancestry is relatively low, at least into European populations. In East and Southern Africa the admixture from non-Bantu populations is definitely somewhat higher.

Austronesian expansion

This dynamic has old roots in our lineage. It goes back at least to the rise of modern humanity on the fringe of Africa 50 to 100 thousand years ago, and its subsequent expansion across the world (with some assimilation of older hominin lineages). A more recent case is the Austronesian expansion out of Taiwan, which encompasses a longitudinal gradient from East Africa all the way to South America, and a latitudinal one from Hawaii to New Zealand. Even today I suspect people would be impressed by this, but it is all the more amazing when you observe that modern humans seem to have stabilized their range in Near Oceania for ~30,000 years. Unlike the “first farmers” of the Middle East the expansion of the Austronesians had less to do with a mode of production, than pioneering navigational skills and a lack of all sanity and rationality when it came to venturing across great expanses of water.

The question of why a small group of Southeast Asian people in Taiwan began to move in a manner which would trigger a world-wide cultural and demographic revolution is still an open one. But a second issue which can be explored is the nature of who these seafarers came into contact with. Of course most of the discussion has been around the uptake of Melanesian admixture in Near Oceania. A second question for me has always been the nature of the dominance of Austronesians in maritime Southeast Asia. Basically, Indonesia and Malaysia. The mainland of Southeast Asia was dominated by Austro-Asiatic peoples until the arrival of Tai, Miao, and Tibeto-Burman groups over the past few thousand years. Did Austro-Asiatics populate maritime Southeast Asia at one point? A preprint on bioRxiv aims to explore this question, Reconstructing Austronesian population history in Island Southeast Asia:

Austronesian languages are spread across half the globe, from Easter Island to Madagascar. Evidence from linguistics and archaeology indicates that the “Austronesian expansion,” which began 4-5 thousand years ago, likely had roots in Taiwan, but the ancestry of present-day Austronesian-speaking populations remains controversial. Here, focusing primarily on Island Southeast Asia, we analyze genome-wide data from 56 populations using new methods for tracing ancestral gene flow. We show that all sampled Austronesian groups harbor ancestry that is more closely related to aboriginal Taiwanese than to any present-day mainland population. Surprisingly, western Island Southeast Asian populations have also inherited ancestry from a source nested within the variation of present-day populations speaking Austro-Asiatic languages, which have historically been nearly exclusive to the mainland. Thus, either there was once a substantial Austro-Asiatic presence in Island Southeast Asia, or Austronesian speakers migrated to and through the mainland, admixing there before continuing to western Indonesia.

In the discussion the authors clearly come down on the side that Austronesian and Austro-Asiatic admixture occurred prior to the settlement of maritime Southeast Asia. Though their marker set couldn’t be used to infer the timing of the admixture event, the relative evenness of admixture in western Southeast Asia and the archaeological evidence seem to point to the idea that Austro-Asiatic speakers did not push past peninsular Malaysia (where there are Austro-Asiatic speakers in the interior among the Negrito populations). This has always struck me as strange, because obviously the island of Java has been amenable to widespread rice farming, and Indonesia today is as populous as all of mainland Southeast Asia. But it seems that the spread of populations over water can be highly contingent, and not inevitable. For example, barbarian incursions into mainland Italy often stopped at the straits which separated the continent from Sicily, and the Vandal adoption of seafaring has always been somewhat mysterious. Though there has been gene flow across Gibraltar, it does seem that the existence of a water barrier has resulted in a major genetic discontinuity. And yet tens of thousands of years ago in prehistory the ancestors of the Australian and Melanesian peoples crossed from Sundaland to Sahul.

 
• Category: Science • Tags: Austro-Asiatic, Austronesians, Southeast Asia 

Most people in South Asia speak one of two varieties of language, Indo-Aryan and Dravidian. These two are not particularly closely related. Indo-Aryan is a branch of Indo-European, as is evident in the plethora of obvious cognates with other Indo-European languages. I have a minimal fluency in Bengali, the easternmost of the Indo-European languages, and quite a bit more fluency with English, one of the westernmost, and the relationship was evident to me rather early on (e.g., grass vs. gash, man vs. manush, nose vs. nak). In contrast, to me Dravidian languages are peculiar because the accent and cadence are clearly South Asian, but they are utterly impenetrable (though there are many loan words into Indo-Aryan from Dravidian).


But in this post I’m going to explore the genetic relationships of the people who speak a subgroup of Austro-Asiatic languages indigenous to India, that of the Munda. The traditional question has always been whether the Austro-Asiatic languages are from India, or, whether they are from Southeast Asia. More precisely, did the Munda culture come to India, or is the Munda culture a relic of the original Austro-Asiatic domain in eastern India?

As background I believe it is important that readers understand that the territory between Vietnam and that of the Munda was likely dominated by Austro-Asiatic dialects ~2,000 years ago. Both the Burmese and Thai arrived in the historic period from southern China, and overthrew Mon or Khmer cultures which flourished in lowland Southeast Asia. In the case of both the Burmese and Thai it was a situation where the newcomers imposed their language upon the indigenous population, but by and large adopted most elements of high culture from the natives (e.g., Theravada Buddhism). The monarchies of Thailand and Burma drew directly from the Indic-inflected polities of the Khmer and Mon.

The recent extensive distribution and variety of Austro-Asiatic languages in Southeast Asia is suggestive of the likelihood that they derive from this area, but it is not a definitive point in that model’s favor. But there are now other genetic lines of inquiry. A few years ago a paper came out which reported that the Y chromosomal lineages of the Munda people which connect them to Southeast Asia are much more diverse in Southeast Asia. This matters because population expansions and migrations tend to homogenize lineages through greater genetic drift, with the “source” population more likely to maintain diversity. Additionally, there was also evidence of a genetic variant in EDAR which has the hallmark of recent increase in frequency across eastern Asia. This seems to peg the Munda arrival to the Holocene, not the Pleistocene. Finally, there is the pattern of male lineages exhibiting some concordance with Southeast Asia, but female lineages being entirely indigenous. This is a classic expectation from a model of migration where there was a strong bias toward males because of the mobility of these groups, which lacked women and children.
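To make the diversity argument concrete, here is a minimal sketch of Nei's haplotype diversity, H = n/(n-1) * (1 - sum of squared haplotype frequencies). The haplotype labels and counts are invented purely for illustration and are not taken from the paper mentioned above; the function name is my own.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical samples of 10 Y chromosomes each (labels are made up).
# A "source" region carrying many lineages at modest frequency:
source_sample = ["h1", "h2", "h3", "h4", "h5", "h1", "h2", "h3", "h6", "h7"]
# A "migrant-derived" group in which drift has left one lineage dominant:
migrant_sample = ["h1"] * 8 + ["h2"] * 2

source_h = haplotype_diversity(source_sample)
migrant_h = haplotype_diversity(migrant_sample)
print(f"source: {source_h:.3f}, migrant-derived: {migrant_h:.3f}")
```

Under these made-up counts the source sample scores much higher than the migrant-derived one, which is the direction of the pattern the Y chromosomal argument relies on.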

I decided to further explore the question using the Estonian Biocenter data sets, as well as the HGDP and HapMap. For those of you who are curious about the technical details, I LD pruned the Estonian Biocenter marker set from ~600,000 down to ~130,000. I also put the samples through --geno 0.01 and --mind 0.80 in PLINK to get high quality individuals and good coverage on markers. To be explicitly clear, I renamed and combined some of the populations in the original data set (e.g., Chamars = UP_Dalits). I ran a preliminary MDS to make sure that the data wasn’t strange, and it checked out.
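For readers who haven't used the two PLINK missingness filters mentioned above, here is a toy sketch of what they do conceptually: one drops individuals with too many missing genotype calls, the other drops markers with too many missing calls. This is not PLINK's code; the function names and the tiny data set are hypothetical, and applying the sample filter first is an assumption of the sketch.

```python
def missing_rate(calls):
    """Fraction of genotype calls that are missing (None)."""
    return sum(c is None for c in calls) / len(calls)

def filter_genotypes(genotypes, mind=0.80, geno=0.01):
    """genotypes: dict of sample id -> list of calls (0/1/2 or None),
    all in the same marker order.

    Returns (kept_sample_ids, kept_marker_indices).
    """
    # Sample filter (analogous to --mind): drop samples whose
    # missing-call rate exceeds `mind`.
    kept_samples = [s for s, calls in genotypes.items()
                    if missing_rate(calls) <= mind]
    if not kept_samples:
        return [], []

    # Marker filter (analogous to --geno): among the remaining samples,
    # drop markers whose missing-call rate exceeds `geno`.
    n_markers = len(genotypes[kept_samples[0]])
    kept_markers = []
    for j in range(n_markers):
        column = [genotypes[s][j] for s in kept_samples]
        if missing_rate(column) <= geno:
            kept_markers.append(j)
    return kept_samples, kept_markers

# Hypothetical toy data: three individuals, four markers.
toy = {
    "ind1": [0, 1, 2, 0],
    "ind2": [1, 1, None, 2],
    "ind3": [None, None, None, None],  # entirely missing
}
samples, markers = filter_genotypes(toy)
print(samples, markers)
```

In the toy data the entirely missing individual is dropped by the sample filter, and the one marker with a missing call among the survivors is dropped by the strict 1% marker threshold.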

So to do the analysis I ran TreeMix. I used Chinese Americans as the root outgroup population, and wanted 5 migrations, and also tried to correct for any remaining LD by looking across a window of 1,000 SNPs. You can view my first plot below.
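The TreeMix settings described above (a root population, 5 migration edges, blocks of 1,000 SNPs) could be expressed as a command line roughly like the one this sketch builds. The flags are the ones I understand TreeMix to document (`-i`, `-root`, `-m`, `-k`, `-o`); the file names and the population label are hypothetical, and the snippet only assembles the command rather than running it.

```python
import shlex

def treemix_command(input_path, out_stem, root, migrations=5, block_size=1000):
    """Build a TreeMix argument list (nothing is executed here)."""
    return [
        "treemix",
        "-i", input_path,        # gzipped allele-count input file
        "-root", root,           # population used to root the tree
        "-m", str(migrations),   # number of migration edges to fit
        "-k", str(block_size),   # SNPs per block, to account for LD
        "-o", out_stem,          # prefix for the output files
    ]

# Hypothetical file names and population label.
cmd = treemix_command("munda_counts.treemix.gz", "munda_m5",
                      root="ChineseAmerican")
command_line = shlex.join(cmd)
print(command_line)
```

The list form can be handed to `subprocess.run(cmd, check=True)` on a machine where TreeMix is installed.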

The primary thing I would focus on is the gene flow from Cambodians to Munda. This is exactly what one might expect if the Munda were intrusive to South Asia. More interestingly, observe that there is no gene flow into Burmese from the South Asian groups, even though they are in much closer proximity to South Asia! This is probably picking up something deep in history then. The fact that the Munda diverge early from other South Asian groups is also in keeping with Admixture or Structure bar plot results: the South Asian ancestry of the Munda is relatively unadmixed.

Next I wanted to focus more on the eastern population flows. So I removed a lot of the western groups which overwhelmed my gene flow edges.

In this scenario again there is a gene flow parameter from the rough region of the Cambodian node. Perhaps more curious, there is now a powerful gene flow parameter into the Burmese from the same locus. This is totally intelligible in light of the fact that the modern Burmese are genetically a hybrid population between Tibeto-Burman and Mon (Austro-Asiatic).

I’m certainly not ready to assert that the “case is closed.” But it seems that we need to shift our probabilities again toward the intrusive hypothesis.

Image credit: Wikipedia

(Republished from Discover/GNXP by permission of author or representative)
 

If you have not read my post “To the antipode of Asia”, this might be a good time to do so if you are unfamiliar with the history, prehistory, and ethnography of mainland Southeast Asia. In this post I will focus on mainland Southeast Asia, and how it relates implicitly to India and China genetically, and what inferences we can make about demography and history. Though I will touch upon the Malay peninsula in the preliminary results, I have removed the Indonesian and Philippine samples from the data set in totality. This means that in this post I will not touch upon the spread of the Austronesians.

I present before you two tentative questions:

- What was the relationship of the spread of Indic culture to Indic genes in mainland Southeast Asia before 1000 A.D.?

- What was the relationship of the spread of Tai culture to Tai genes in mainland Southeast Asia after 1000 A.D.?

The two maps above show the distribution of Austro-Asiatic and Tai languages in mainland Southeast Asia. Observe that when you join the two together in a union they cover much of the eastern 2/3 of mainland Southeast Asia. The fragmented nature of Austro-Asiatic languages in the northern region, edging into the People’s Republic of China, implies to us immediately that it is likely that in the past there was a continuous zone of Austro-Asiatic speech in this region. From the histories and mythologies of the Tai people we know that this group migrated from the southern fringes of China around 1000 A.D. This is obvious when we note that there are still Tai people in southern China, and the expansion of the Tai across what is today Thailand is to some extent historically attested. Between 1000 and 1500 there was a wholesale ethnic reorganization of the Chao Phraya river basin. Was that a matter of demographic replacement, or cultural assimilation, or some of both?

Second, what was the impact of Indians upon mainland Southeast Asia? One of the easiest ways to ascertain Indian influence is script. Burmese, Thai and Cambodian scripts all derive from Grantha, an archaic Tamil script (non-Islamic scripts in island Southeast Asia, such as Javanese and Balinese, are also derived from South Indian precursors). The Indian religious influences also are more southern than northern, manifesting in the southern forms of Shaivite Hinduism and Sri Lankan Theravada Buddhism.


There are three data sets which I looked at. I ran most of them from K = 2 to K = 12. This means that I threw all the individuals into a common pool and told the ADMIXTURE program to estimate their individual proportions of K number of populations. In this way we can get a general sense of the relationships of the populations. Remember that these aren’t necessarily real populations, and, the nature of the variation thrown into the pool impacts the nature of the inferred components greatly. I’m not reporting clear, distinctive, and objective entities extracted out of the data set. We’re looking at human intelligible interpretations of the patterns dependent upon the inputs and parameters. They’re telling us something real, but this isn’t like measuring the acceleration of a falling ball. It’s like describing the position of the ball in relation to a different set of reference objects. There’s a real ball with a specific position, but the descriptions are going to vary depending on what references you use (e.g., to the left of object A and below B, to the right of object C and above object D, etc.).

Here are the sets:

1) A “large” set which includes the mainland Pan-Asian populations, the white Americans from the HapMap, and some Malay peninsular groups.

2) A “medium” set which prunes most of the North Asian groups, Malaysian groups, and the white Americans. So it’s mostly mainland Southeast Asia, southern China, and India.

3) A “small” set, which removes many of the Southeast Asian populations, but keeps the Indian ones. I purposely overloaded this set with Indians to examine possibilities of Indian admixture in a few Southeast Asian groups.

Some notes. The Pan-Asian data set has ~56,000 markers. This is tolerable, but not optimal. It’s definitely good enough for European vs. Indian vs. East Asian vs. Negrito. But it is less than optimal for intra-regional variation. So take it with a grain of salt. But since I’m looking at Indian vs. East Asian, I’m mildly confident of that finding in relation to this data set. Second, the intersection of white Americans with the Pan-Asian set was ~30,000 markers. For Cambodians it was only ~22,000. There were ~100 white Americans, but only ~11 Cambodians. Be very cautious of the Cambodian results for this reason. Finally, remember that the ancestral components are abstractions, and can imply that stable and long admixed hybrid populations are their own distinct component, as well as isolates which are highly inbred.

There are three analyses and visualizations I will display below.

1) ADMIXTURE bar plots, which show the ancestral proportions of groups or individuals of a particular ancestral element.

2) Fst estimates across ancestral elements. This is a rough summary of genetic distance. I’ll also show you a two dimensional visualization on occasion, but remember that this removes some relationship information. The table is more accurate, though the visualization is easier to read.

3) Finally, I used EIGENSOFT to run some PCAs. This means that I took the pool of data and allowed the program to extract out the independent dimensions of variation. I ran it so that it pulled out the top 6 dimensions. The west-east dimension is always the largest by many multiples. Remember that the plots are not scaled.

I should also say that the K’s I’m showing are the highest I could run before inbred subgroups within the reported populations started breaking out into their own components (this happened especially within the Indians).

Starting at the beginning, I have noticed in the Pan-Asian data set that some groups, particularly Mons and Malays, seem to show Indian admixture. My question: is this really Indian admixture, or perhaps recent European admixture? That’s why I had the large data set, with white Americans. Here are the results:

So it seems unlikely that the Mon and Malay admixture with a West Eurasian element is from Europeans. Rather, it is consistent with Indians. In fact, I’m pretty confident it isn’t West Asian either, as is a possibility in the case of the Malays, because that component tends to align with Europeans at this scale. Finally, I will tell you that the admixture in both Mon and Malays is relatively even. In other words, the group estimates aren’t being shifted by one or two highly admixed individuals, which would be a good tell as to recent intermarriage. That is not unheard of: Mahathir Mohamad’s paternal grandfather was a Kerala Muslim.
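The "relatively even" point can be checked mechanically: if a group's mean ancestry fraction is driven by one or two recently admixed individuals, the mean sits far above the median. The sketch below is a rough heuristic of my own, run on invented per-individual fractions; the threshold and the numbers are hypothetical and are not the ADMIXTURE output from this post.

```python
from statistics import mean, median

def looks_outlier_driven(fractions, ratio=2.0):
    """Heuristic: flag a group whose mean ancestry fraction is more than
    `ratio` times its median, i.e. a few individuals carry most of it."""
    m, med = mean(fractions), median(fractions)
    if m == 0:
        return False      # no ancestry of this kind at all
    if med == 0:
        return True       # most individuals have none, a few have some
    return m / med > ratio

# Hypothetical per-individual fractions of a "South Asian" component.
# Both groups average 10%, but for very different reasons.
even_group = [0.10, 0.12, 0.09, 0.11, 0.08, 0.10]
uneven_group = [0.0, 0.01, 0.0, 0.02, 0.0, 0.57]

for name, group in [("even", even_group), ("uneven", uneven_group)]:
    print(name, round(mean(group), 3), round(median(group), 3),
          looks_outlier_driven(group))
```

An even distribution suggests older admixture that has had time to spread through the population, while the uneven pattern is what recent intermarriage would produce.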

Now let’s look at the PCA. I’ll focus on dimensions 1, 2, and 3. Remember that these are the three largest dimensions of genetic variance rank ordered. Dimension one is by far the largest, by a factor of at least five usually in these plots. It’s the west vs. east Eurasian dimension.

I’ve highlighted the important bits. Two notes. First, I think you do see the suggestion that the Mon & Malay are shifted toward the Indians, not the Europeans. This is in perfect alignment with the ADMIXTURE result. Second, please note that the “Indian Singapore” population is heterogeneous. It is mostly Tamil, but there are clearly other Indians in the sample, and, some individuals who have Malay or Chinese ancestry.

Additionally, please note in the ADMIXTURE result above the similarity between the Tai and the Zhuang. The Zhuang are China’s second largest ethnic group, and reputedly the source population for the Tai migrations into mainland Southeast Asia. Before I move on, you should have some sense of the locations and ethno-linguistic affinities of some of the more obscure groups:

Location                             | Group  | Language group
Northern Thailand                    | Htin   | Austro-Asiatic
Northern Thailand                    | Lawa   | Austro-Asiatic
Northern Thailand                    | Mon    | Austro-Asiatic
Northern Thailand                    | Palong | Austro-Asiatic
Northern Thailand                    | Plang  | Austro-Asiatic
Southern China                       | Wa     | Austro-Asiatic
Northern Thailand                    | Yao    | Hmong-Mien (Mien)
Southern China and Northern Thailand | Hmong  | Hmong-Mien
Southern China                       | Zhuang | Tai
Northern Thailand                    | Karen  | Tibeto-Burman
Southern China                       | Jinuo  | Tibeto-Burman

One aspect which isn’t listed here is the classification of some of these populations as “hill tribes” or not. The Mon and the Htin are both Austro-Asiatic, but the former are in some ways analogous to the Greeks on mainland Southeast Asia, while the latter are a tribal isolate which has preserved its identity in the hills of northern Thailand. By Greeks, I mean that the Mon have been assimilated or dominated by the Bamar in Burma and the Tai in Thailand, but in both cases have imparted to these groups the essence of Southeast Asian Indic high culture. The Mon were at one point ascendant from the lower Irrawaddy in southern Burma to the lower Chao Phraya basin in Thailand, the terminus of which today is Bangkok. In contrast, groups like the Htin and Lawa were presumably relatively insulated from Indic influence. The Hmong are relative newcomers to Southeast Asia, which explains their status as animists for example. Finally, you have groups like the Wa which are technically not even Southeast Asian, but are Austro-Asiatic. They should give us a sense of Austro-Asiatics without an Indic imprint.

Let’s move on to step two, the medium data set. I’m removing the white Americans, Malaysians, and North Asian groups. And now I’m including the Cambodians.

Again, the Mon have the Indian component. And so do the Cambodians. Remember that while everyone else has 56,000 SNPs, the Cambodians only have 22,000, so we need to be careful. Though you see this element in the HGDP runs as well. That is, an Indian affiliated component. It’s relatively evenly distributed among the Cambodians, so you can’t chalk it up to a few admixed individuals. Again, you see the similarity between the Zhuang and the Tai. The main difference is that the Tai seem to have admixed with various Southeast Asian groups. That’s to be expected. What surprised me though is that from these results it seems that the Tai expansion was demographically, not just linguistically, dominant. This is clear even in the Bangkok sample. More on this later.

Below are the genetic distances (Fst) between the inferred ancestral groups. The labels give the modal population, and then the language family:

              | Jinuo_Burman | Htin_Austro | Tai   | SouthAsian | Palong_Austro | Hmong
Jinuo_Burman  | 0            | 0.073       | 0.057 | 0.115      | 0.092         | 0.085
Htin_Austro   | 0.073        | 0           | 0.03  | 0.088      | 0.065         | 0.06
Tai           | 0.057        | 0.03        | 0     | 0.09       | 0.064         | 0.047
SouthAsian    | 0.115        | 0.088       | 0.09  | 0          | 0.117         | 0.117
Palong_Austro | 0.092        | 0.065       | 0.064 | 0.117      | 0             | 0.09
Hmong         | 0.085        | 0.06        | 0.047 | 0.117      | 0.09          | 0
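One quick way to read a distance table like this is to ask, for each inferred component, which other component is closest. The sketch below does that with the Fst values copied from the table above; the function names are my own, and the only thing it adds is bookkeeping.

```python
labels = ["Jinuo_Burman", "Htin_Austro", "Tai",
          "SouthAsian", "Palong_Austro", "Hmong"]

# Pairwise Fst values, row and column order as in `labels`.
fst = [
    [0,     0.073, 0.057, 0.115, 0.092, 0.085],
    [0.073, 0,     0.03,  0.088, 0.065, 0.06],
    [0.057, 0.03,  0,     0.09,  0.064, 0.047],
    [0.115, 0.088, 0.09,  0,     0.117, 0.117],
    [0.092, 0.065, 0.064, 0.117, 0,     0.09],
    [0.085, 0.06,  0.047, 0.117, 0.09,  0],
]

def is_symmetric(matrix):
    """A distance matrix should read the same across the diagonal."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(n))

def nearest_neighbours(names, matrix):
    """Map each component to (closest other component, Fst)."""
    result = {}
    for i, name in enumerate(names):
        others = [(matrix[i][j], names[j])
                  for j in range(len(names)) if j != i]
        dist, nearest = min(others)
        result[name] = (nearest, dist)
    return result

neighbours = nearest_neighbours(labels, fst)
for name, (nearest, dist) in neighbours.items():
    print(f"{name:14s} -> {nearest:12s} Fst = {dist}")
```

Every East Asian component except Tai itself has Tai as its nearest neighbour, and the South Asian component is far from all of them, with Htin_Austro the least distant.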

Here are some visualizations:

And here’s the PCA:

In this plot you see both the Mon and Cambodians shifted toward the Indians, again. Also, note the Zhuang and the Tai mostly overlap rather well. The y-axis is defined it seems by Austro-Asiatic hill tribes, then the Tibeto-Burman groups, and a gap until you hit the Tai cluster, which eventually merges with the Hmong. There’s a reasonable language family affinity here, insofar as the Yao are between the Tai and the Hmong.

Finally, we move to the Indo-centric run. I’ve removed a lot of the Southeast Asian groups now. Some of the hill tribes are obviously relatively isolated, and so throw up their own clusters or diverge on PCA rather easily. That’s a function of genetic differences which build up if you are relatively insulated from gene flow. Because I removed so many populations I’m only left with three K’s before you get quasi-family clusters showing up as K’s. Also, I’m going to show you individual bar plots for Cambodians and Mon to illustrate that the Indian component isn’t just isolated instances of admixture:

The Fsts are straightforward in this case:

               | Austro-Asiatic | Tai   | South Asian
Austro-Asiatic | 0              | 0.028 | 0.084
Tai            | 0.028          | 0     | 0.085
South Asian    | 0.084          | 0.085 | 0

It’s the PCA which is really interesting in this run. The first isn’t too exceptional:

OK, first, since this is an Indian focused set, you see that there’s more than the standard west-east dimension. You have several lower order dimensions which separate Indians! I had previously assumed that the Indian component which always shows up in the Cambodians in the HGDP was a function of deep ancient ancestry with the “Ancestral South Indians” of Reich et al. This ancient population may have had affinities with many groups out toward Southeast Asia, and so the residual cluster in Cambodians may have been part of the deep Ice Age ancestry of this group. These results convince me that this is not so straightforward an explanation. In this sample the group that has the highest ASI are the Bhils, a tribal population. In one of the plots you see that the Bhils form one end of the distribution, and Gujarat Vaishyas the other. It is clear that this is an Ancestral North Indian-Ancestral South Indian cline. The Mon and Cambodians don’t deviate much from the center, suggesting to me that they aren’t too skewed toward the ASI! Additionally, the “center” of the distribution is weighted toward caste South Indians. This then is a nice resolution, because it dovetails perfectly with the historical evidence for a South Indian specific influence on Southeast Asia in the early historic period.

This isn’t a slam dunk. There need to be estimates of the time since admixture. It should post-date the ANI-ASI admixture event, and be in the same range as the Uyghurs. Unfortunately with only 56,000 SNPs I’m not sure this estimate is possible, but I’ll look into it. Additionally, a deeper survey of Y and mtDNA lineages needs to be done in Southeast Asia. They may show sex-biased migration. I did look for the West Eurasian specific SLC24A5 variant, which goes no lower than ~50% in South India, but that’s not in the Pan-Asian SNP data set. It is in the HGDP, and none of the 11 Cambodians have it. This would lean toward the ASI hypothesis, but seeing as how the West Eurasian variant may only be at ~50%, and the Cambodians are less than 10% South Asian, it isn’t totally implausible that it wouldn’t show up in 22 gene copies (using realistic assumptions I get a ~50% probability that a West Eurasian copy of SLC24A5 wouldn’t be found in the Cambodians with N = 11).
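The back-of-the-envelope probability above can be reproduced with a simple model. This is a sketch under my own simplifying assumptions, which may not match the ones originally used: each of the 22 gene copies is treated as independent, a copy is of South Asian origin with probability equal to the admixture fraction, and a South Asian copy carries the West Eurasian variant with the source frequency.

```python
def prob_variant_absent(admixture_fraction, source_frequency, n_copies):
    """Probability that none of `n_copies` independent gene copies
    carries the variant, given the admixture fraction and the variant's
    frequency in the admixing source population."""
    per_copy = admixture_fraction * source_frequency
    return (1.0 - per_copy) ** n_copies

n_copies = 2 * 11   # 11 diploid Cambodians

# Upper bound from the text: 10% South Asian ancestry, variant at 50%.
p_at_10_percent = prob_variant_absent(0.10, 0.50, n_copies)

# A lower ancestry fraction (6% is an illustrative guess, not an estimate).
p_at_6_percent = prob_variant_absent(0.06, 0.50, n_copies)

print(f"10% ancestry: {p_at_10_percent:.2f}")
print(f" 6% ancestry: {p_at_6_percent:.2f}")
```

With these inputs the chance of seeing no West Eurasian copy is roughly a third at 10% ancestry and roughly a half at 6%, so failing to find the variant in 11 individuals is weak evidence either way.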

I’ve not devoted too much space to the Tai-Zhuang connection in this post, because it’s obvious in the plots. The Tai are obviously somewhat shifted toward Austro-Asiatic groups, but far less than I would have expected. In fact, taking the ADMIXTURE components too literally you might infer that there’s been more Tai admixture into the Mon and Khmer than the other way around! This might not be totally implausible when you consider that Thailand’s population is nearly five times that of Cambodia. But the standard model I’ve read suggests that Tai warrior bands conquered the Mon-Khmer indigenes, and absorbed much of their high culture. These results don’t cohere easily with that in terms of demographics.

I have a possible explanation for what occurred. Much of Thailand may not have been too populous until the past ~1,000 years, with lowland agriculture being driven by elite direction. The Tai may have brought superior agricultural techniques, and so entered into a phase of rapid population expansion into the lowland frontier, which had no parallel during the Mon and Khmer period of dominance. In other words, the Tai bands were small and initially outnumbered by the Mon and Khmer. But through favorable resource direction and priority allocation of newly arable land to co-ethnics the small Tai population might quickly have come to dominate the previous inhabitants. This is the model which is outlined in the Rise of Islam and the Bengal Frontier. In it the author basically argues that eastern Bengal was lightly populated until large scale Muslim elite driven projects to open up the agricultural frontier. The recruited peasants were either Muslim or converted to Islam, because the cultural landscape was relatively fluid and unsettled, in contrast to the more static peasant economy of western Bengal, which remained Hindu. The Islamicization of eastern Bengal in this model had less to do with the conversion of native tribes, and more to do with the rapid demographic expansion of Bengali peasant colonies which were enabled by agricultural projects, colonies which were Islamicized or were drawn from the minority Muslim peasantry of the western zone by Mughal elites intent on creating a region where the Hindu upper castes were marginalized. Similarly, the Tai expansion in Southeast Asia may have been into a de facto “empty” landscape. During the period when Mon and Khmer high culture was absorbed the Tai may have been the smaller element in terms of numbers. The current ratios are a function of later social and demographic processes.

(Republished from Discover/GNXP by permission of author or representative)
 

The Pith: the genetic relationships between bacteria in our stomach can tell us a lot about the relationships between various groups of people. Additionally, the distribution of different strains of bacteria may have significant public health implications.

The above image is from a paper which was pushed online yesterday in PLoS ONE: Evolutionary History of Helicobacter pylori Sequences Reflect Past Human Migrations in Southeast Asia. It’s a paper which caught my attention for several reasons. First, I’ve exhibited some curiosity about the history and prehistory of Southeast Asia of late. Elucidating this region’s historical dynamics may bear upon more general questions of human evolutionary and cultural process. Second, H. pylori is a fascinating organism whose connection to specific human populations is tight enough that it can shed light on past interactions of different groups. In short, just like humans H. pylori exhibits regional specificity and local history. But additionally, H. pylori is also subject to natural selection after introduction into a new population, and so can serve as a window upon cultural contacts which might otherwise leave a light demographic footprint. In other words, the spread of H. pylori across human populations may be compared to the spread of Buddhism. This religion came to China and Japan with some Buddhists of South and Central Asian origin, but by and large its spread was memetic rather than through natural increase of a Buddhist population.

First, let’s hit the abstract:

The human population history in Southeast Asia was shaped by numerous migrations and population expansions. Their reconstruction based on archaeological, linguistic or human genetic data is often hampered by the limited number of informative polymorphisms in classical human genetic markers, such as the hypervariable regions of the mitochondrial DNA. Here, we analyse housekeeping gene sequences of the human stomach bacterium Helicobacter pylori from various countries in Southeast Asia and we provide evidence that H. pylori accompanied at least three ancient human migrations into this area: i) a migration from India introducing hpEurope bacteria into Thailand, Cambodia and Malaysia; ii) a migration of the ancestors of Austro-Asiatic speaking people into Vietnam and Cambodia carrying hspEAsia bacteria; and iii) a migration of the ancestors of the Thai people from Southern China into Thailand carrying H. pylori of population hpAsia2. Moreover, the H. pylori sequences reflect iv) the migrations of Chinese to Thailand and Malaysia within the last 200 years spreading hspEasia strains, and v) migrations of Indians to Malaysia within the last 200 years distributing both hpAsia2 and hpEurope bacteria. The distribution of the bacterial populations seems to strongly influence the incidence of gastric cancer as countries with predominantly hspEAsia isolates exhibit a high incidence of gastric cancer while the incidence is low in countries with a high proportion of hpAsia2 or hpEurope strains. In the future, the host range expansion of hpEurope strains among Asian populations, combined with human motility, may have a significant impact on gastric cancer incidence in Asia.

H. pylori can be separated into very distinctive lineages of geographically limited scope, despite some horizontal gene flow. One clade seems generally restricted to western Eurasia, another to eastern Eurasia, and there are some Africa specific lineages as well. But within these particular clades one can drill-down to a finer-grain. For example, there are Indian lineages within the broader west Eurasian family of strains. As mutation over time results in the build up of distinctive variants in localized populations, a simple assessment of mutational steps between lineages can allow one to infer a tree of descent from a common ancestor.

Let’s tack for a moment to some history without microbial goodness. To some extent Southeast Asia can be considered part of “Greater India,” more or less. This is most evident in Thailand and Cambodia, two nations which are cultural heirs to the Khmer civilization which produced Angkor Wat. The religious and artistic sensibilities of both these modern societies are deeply imprinted by South Asian norms through that precursor polity. The Theravada Buddhism of these societies still has a vital connection to South Asia (especially Sri Lanka) and is more obviously Indian in its sensibility than for example the Zen sect of Japan (which derives from Chinese Chan). In Vietnam there remains a small group of Malay Cham Saivite Hindus, the remnants of the Champa Empire.

The affinities in maritime Southeast Asia are a bit clouded because of the interposition of Islam between moderns and the Dharmic past. Only the Balinese remain as a vital living heir to the Indian-influenced polities of early Indonesia, Srivijaya and Majapahit. Despite this notional reality the Indian influence remains discernible even among Muslim Indonesians, in particular in East Java, where shadow puppet shows of the Ramayana remain popular. Like Angkor Wat, Borobudur in Java is a testament to the monumental Indian past. But even the avowed Islamic flavor of modern maritime Southeast Asia may have some Indian connection, insofar as there is the possibility that South Asian Muslims were critical players in the eastern Indian Ocean trade network which slowly Islamicized over the course of the second millennium.

We are then presented with the question: if the Indian influence in Southeast Asia was so strong in the past, where are the genes of Indians? The authors note that mitochondrial DNA analyses, the maternal lineage, show no South Asian specific lineages in appreciable frequencies among native populations. A fixation on mtDNA seemed rather strange to me for two reasons. First, with the PanAsian SNP data set there’s some autosomal data. Second, there are strong reasons to suppose that Indian migrants would be male. The myths and sketchy historical references of this period don’t seem to envisage mass folk migrations, where Indian men bring their women and children and recreate their homelands. Rather, often these men are portrayed as religious specialists or military leaders of genius. The authors note that there is evidence of Indian artisans in Thailand ~2,000 years ago. This is eminently plausible; there are references to towns of Indian merchants in Sumeria ~4,000 years ago! But again, there is no reason that these artisans necessarily brought their wives. Rather, if they were purchased for their skills they may simply have been the human property which was the object of capitalist transactions between two autocrats.

The nature of cultural transfer, and the relatively high fidelity of that transfer, implies to me that some Indians did migrate to Southeast Asia. But they were few, and their genetic impact was minimal. Rather, what we see is the power of memes to operate very differently from genes. The Indian memes rapidly swallowed up the cultural commanding heights, and became normative from Java to northern Thailand (northern Vietnam is the exception to this rule, as it was influenced by China).

H. pylori shares many of the same tendencies as memes, despite its more concrete biological character. As bacteria it can spread rapidly within a population, and decouple itself from the endogenous natural increase of its original hosts. That spread can be driven by natural selection which means that it isn’t a good representation of the ancestry of its hosts. But even natural selection can’t erase the inferences one can make about original contacts between distinct groups.

In this paper the authors present evidence from the nature of H. pylori in Southeast Asia that there was tangible physical contact between Indians and Southeast Asians in the antique past. More precisely, below is a figure which shows the nature of relationships of west Eurasian H. pylori lineages in India and Southeast Asia, with European and other west Eurasian samples as a control.

What you see here is that Indian H. pylori is basal to the Southeast Asian branches, though within the same clade against the European lineages. This tells you that there’s an affinity between Indian and Southeast Asian lineages under consideration here, but that that affinity is diminished by a period of separation. This matters because some regions of Southeast Asia, such as Malaysia, have a large Indian population which arrived in the past few centuries. The fact that there is a distinct Southeast Asia specific lineage suggests that there has been a long period of separation between the two populations, and one can’t attribute the frequency of the west Eurasian Indian H. pylori simply to recent contacts. At least in most of Southeast Asia. It turns out that in the Philippines the west Eurasian H. pylori cluster with Spanish populations. This has to be the outcome of hundreds of years of colonialism.

There’s also this fascinating historical and geographical tidbit:

A study on the distribution of H. pylori virulence factor cagA among Vietnamese identified 84% of the strains harbouring the type II of the cag-right motif…which is characteristic for East Asian strains (hpEastAsia), ranging from 76% in Ho Chi Minh city in South Vietnam to 93% in Hanoi in North Vietnam. However, there was a remarkable difference in the frequency of cag-right motif of type I which is predominant in European (hpEurope) strains. While the type I motif was absent from North Vietnam, it was found in 8/49 (16%) of the samples from Ho Chi Minh city near the Mekong delta. Interestingly, prior to annexation by the Vietnamese in the 17th century, this city was an important Khmer sea port known as Prey Nokor…Thus, hpEurope strains also seem to be frequent among Vietnamese in the Mekong delta, and thus the Annamite mountain range that originates in the Tibetan and Yunnan regions of southwest China and forms Vietnam’s border with Laos and Cambodia seem to have shaped an effective natural barrier for the containment of Indian influence in the Mekong basin, explaining the low prevalence of hpEurope strains elsewhere in Vietnam.

The geographic contours of the nation-state of Vietnam as we understand it today are a relatively new phenomenon. The Vietnamese people, the Kinh, are an ancient nation. But for most of the past ~2,000 years what we know as Vietnam was divided between the Kinh in the north, and the Khmers and later Austronesian Chams in the center and south. Unlike the other peoples of Southeast Asia the Kinh looked to the north, to China, as their cultural model. While India’s influence in Southeast Asia (excepting the Chola adventures) has been through “soft power,” the Chinese have periodically ruled Vietnam directly, and otherwise placed it into the category of tributary state.

There needn’t be any geographical determinism here. Projection of cultural or military power declines in proportion to distance. In relation to culture that projection does not decline linearly, but often exhibits a sharp break. The Vietnamese did not move the Annamite range south when they defeated Champa and began to swallow the eastern flank of the Khmer kingdom. Rather, they shifted populations and cultural identities of populations, and therefore the civilizational boundaries. The line which separated Indic and Sinic moved south with the spread of the Kinh and the retreat of the Khmer. This did not eliminate in totality the Indic influence. Hindu Cham remain in Vietnam, while forms of Theravada Buddhism have some purchase in the Mekong delta, unlike in the rest of the country where Chinese derived Mahayana reigns supreme. And so it is that Indic H. pylori also remains as a residual in the southern regions of Vietnam, evidence of the trade and cultural networks which bound it to Greater India at some point in the past.

Next let’s look at the distribution of East Asia specific H. pylori:

The figure is hard to read, but here’s the short of it: there are Amerindian, Taiwan-Oceanian, Chinese, and Southeast Asia specific lineages. More specifically the authors attempt to infer the origin of one particular Southeast Asia specific lineage which exhibits some overlap in southern China. This is because they believe that it can trace the migration of the Austro-Asiatics, likely the first agriculturalists in Southeast Asia. The H. pylori strain in question spans southern China to Malaysia. The geographic zone encompasses regions now inhabited by Thai or Malay speakers, but it seems likely that at one point the whole zone was dominated by Austro-Asiatics. The clincher would be to see if Munda from northeast India carry this same H. pylori strain. In fact an analysis of the phylogenetic tree of strains of H. pylori found in Austro-Asiatic populations or their descendants might be able to move the needle on whether they’re exogenous to India or not (the “older” lineages should be basal).

So far I’ve been focused on issues of phylogeny. How populations of humans and bacteria relate to each other. But there are functional and adaptive implications and dynamics at work. In terms of adaptation it seems that some strains of H. pylori are simply more fit than others in some environments. The Spanish presence in the Philippines was very light demographically over the centuries of their colonial rule. There was considerable residential segregation of the Spanish away from the natives, and the Chinese, who outnumbered the Spaniards often by two orders of magnitude. And yet you have a situation where H. pylori of Spanish provenance seems to be dominant. Why? The authors report that there’s a fair amount of evidence that European H. pylori strains are generalists who outcompete the specialist East Asian and Amerindian lineages. I think one can’t ignore the reality that the “European” strains are endemic to a huge swath of western Eurasia, from Europe to India. Because of their large population sizes these lineages probably have more diversity than the other populations, and so can adapt to a wide range of conditions.

A functional and public health concern is that East Asian H. pylori may be the cause of the much higher stomach cancer rates in that region of the world. You probably know that H. pylori is a critical player in ulcers, so its impact in this region shouldn’t be a surprise. Prior to reading this paper I’d heard that East Asian stomach cancer rates were due to the condiments used. This goes to show the difficulty of much of medical science which relies on correlations and rough guesses about causality.

Obviously I’m interested in what markers such as the distribution of pathogens which are reliant on humans can tell us about history. But over the long term the complex interplay between these pathogens, disease risk, and other phenotypic characteristics, is where the real action is going to be.

Citation: Breurec S, Guillard B, Hem S, Brisse S, Dieye FB, et al. (2011). Evolutionary History of Helicobacter pylori Sequences Reflect Past Human Migrations in Southeast Asia. PLoS ONE. DOI: 10.1371/journal.pone.0022058

Image credit: Mark Alexander

(Republished from Discover/GNXP by permission of author or representative)
 

Markers show populations sampled by HUGO Pan-Asian SNP Consortium


The Pith: Southeast Asia was settled by a series of distinct peoples. The pattern of settlement can be discerned in part by examination of patterns of genetic variation. It seems likely that Austro-Asiatic populations were dominant across the western half of Indonesia before the arrival of Austronesians.

About a year and a half ago I reviewed a paper in Science which did a first pass through some of the findings suggested by the HUGO Pan-Asian SNP Consortium data set, which pooled a wide range of Asian populations. You can see the locations on the map above (alas, the labels are too small to read the codes). The important issue in relation to this data set is that it has a thick coverage of Southeast Asia, which is not well represented in the HGDP. Unfortunately there are only ~50,000 markers, which is not optimal for really fine-grained intra-regional analysis in my opinion. But better than nothing, and definitely sufficient for coarser scale analysis.

A few things have changed since I first reviewed this paper. First, I pulled down a copy of the Pan-Asian SNP data set. I’m going to play with it myself soon. Second, after reading Strange Parallels, volumes 1 and 2, I know a lot more about Southeast Asian history. Finally, the possibility of archaic admixture amongst Near Oceanians makes the genetics of the regions which were once Sundaland and Sahul of particular interest.


Before we hit the genetics, let’s review a little of the ethnography of Southeast Asia, as this may allow us to tease apart the meaning of some of the results. The largest ethno-linguistic group in Southeast Asia is that of Austronesians. An interesting point in relation to Austronesians is that they aren’t limited to Southeast Asia. As you can see the Austronesians range from off the coast of South America (Easter Island) to southeast Africa (Madagascar). Though there’s debate about this issue it seems to me that the most likely current point of departure of the Austronesian migration is Taiwan. Though today Taiwan is predominantly Han Chinese, that is an artifact of relatively recent migration. The indigenous population is clearly Austronesian.

A second language family which is somewhat expansive, though Southeast Asia focused, is Austro-Asiatic. There is a great deal of internal structure to this ethno-linguistic group, in that there is a well known coherent Mon-Khmer cluster, which includes some ethnic minorities in Burma and Thailand, as well as Cambodians. Additionally you have Vietnamese in the east and some tribal groups in northeast India. There has long been debate about whether these Indian tribes, the Munda, are the original Indians, to be supplanted later by Dravidian and Indo-Aryan speakers, or intrusive to the subcontinent. I believe that the most recent genetic data points to intrusion from the east into South Asia. Austro-Asiatic was likely less fragmented in mainland Southeast Asia before the historical period. Both the dominant ethnic groups in Burma and Thailand are intrusive and absorbed Mon-Khmer populations, the latter dynamic being historically attested.

Finally there are the ethno-linguistic clusters of Burma and Thailand (and Laos). The former nation is dominated by the Bamar, a Sino-Tibetan population with origins in South China ~1,500 years ago. In Burma the Mon substrate persists, while the Shan people of Thai affinity reign supreme across the northeastern fringe of the nation. In Thailand and Laos the Mon-Khmer substrate has been marginalized to isolated residual groups. But it is notable that in both these polities the Mon-Khmer populations set the tone for the civilizational orientation of the conquering ethnicities. The Thai abandoned Chinese influenced Mahayana Buddhism for the Indian influenced Theravada Buddhism of the conquered populace. Despite the notional ethnic chasm between the Thai and the Khmer of Cambodia, the broad cultural similarities due to the common roots in the society of the Khmer Empire are clear.

With the ethnographic context in place, let’s look at the two primary figures which we get from the paper. The first figure shows a phylogenetic tree of the relationships of the populations in their database, color-coded by ethnolinguistic group. Next to that tree there’s a STRUCTURE plot at K = 14, which means 14 ancestral populations. They’ve colored the bar components to match the ethno-linguistic classes (e.g., red = Austro-Asiatic, an Austro-Asiatic modal component). The second figure shows two PCA panels. PC 1 is the largest component of genetic variance in the data set, and PC 2 the second largest. I’ve added a label for the Papuan populations.
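For readers who want to see what "PC 1 is the largest component of genetic variance" means mechanically, here is a minimal sketch on a toy genotype matrix. It centers each SNP and finds the leading axis by power iteration. The genotypes are invented, the function name is my own, and real analyses use dedicated software on tens of thousands of markers.

```python
# Toy genotype matrix: rows are individuals, columns are SNPs coded 0/1/2.
# All values are invented for illustration; they are not from the Pan-Asian data.
GENOTYPES = [
    [2, 2, 0, 0, 1],  # hypothetical group A
    [2, 1, 0, 0, 1],
    [2, 2, 0, 1, 1],
    [0, 0, 2, 2, 1],  # hypothetical group B
    [0, 1, 2, 2, 1],
    [0, 0, 2, 1, 1],
]

def pc1_scores(rows, iterations=100):
    """Project each individual onto the first principal component."""
    n, m = len(rows), len(rows[0])
    # Center every SNP column on its mean.
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in rows]
    # Power iteration: repeatedly apply X^T X to a start vector and renormalize.
    # The start vector must not be orthogonal to the leading axis; this one
    # works for the toy data above but is not a general-purpose choice.
    axis = [1.0] + [0.0] * (m - 1)
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, axis)) for row in centered]
        axis = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(m)]
        norm = sum(w * w for w in axis) ** 0.5
        axis = [w / norm for w in axis]
    return [sum(x * w for x, w in zip(row, axis)) for row in centered]

for label, score in zip("AAABBB", pc1_scores(GENOTYPES)):
    print(label, round(score, 2))
```

In this toy case the two groups land on opposite sides of zero along PC 1, which is the kind of separation the PCA panels display for real populations.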

Going back to the chronology above, we know that the Thai came last. The Sino-Tibetans came before them. The issue I wonder about is the relationship of the Austronesians and Austro-Asiatic groups. Interestingly the Austronesian proportions are high not only in island Southeast Asia, but also among many South Chinese groups. In contrast, among the Mon-Khmer hill tribes of Thailand, who are presumably representative of groups which were present before the Thai migrations, it is absent. And it is notable to me that not only does Austro-Asiatic exhibit fragmentation in relation to Thai and Sino-Tibetan, but it does so to some extent with relation to Austronesian! The indigenous folk of central Malaysia seem to speak an Austro-Asiatic language. Finally, the Austro-Asiatic component rises in frequency on the southern fringes of island Southeast Asia, in densely populated Java.

Because of the thicker textual record for mainland Southeast Asia we know that the Austro-Asiatic groups predate the Thai and Sino-Tibetan ones. I believe that the Austro-Asiatic element also predates Austronesian in Southeast Asia. That is, I believe that an Austro-Asiatic substrate existed before the arrival of Austronesians from the zone between the Philippines and Taiwan. The Negritos of inner Malaysia, who are genetically and physically distinctive, speak Austro-Asiatic languages. This should not be surprising; it seems that hunter-gatherer groups often switch to the language of resident agriculturalists. Because of their isolation some of these groups have persisted in speaking the languages of the “first farmers” of Malaysia, even after those pioneers were absorbed by newcomers.

The PCA shows clearly that the Austronesians are the genetically most varied of these Southeast Asian groups. Why? I believe it is because they are late arrivals who have admixed in sequence with whoever was resident in their target zones. In the east of island Southeast Asia the admixture occurred with a Melanesian population. Both the STRUCTURE plot and the PCA show evidence of this sort of two-way admixture. The STRUCTURE is straightforward, but note the linear distribution of the Austronesians in relation to outgroups in the first panel, and implicitly on the second.
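The reason two-way admixture shows up as a line on a PCA plot can be sketched in a few lines. If an admixed group's expected allele frequencies are a weighted average of two source populations, then it sits on the straight segment between them, at a position equal to the mixing fraction. The frequencies and function names below are invented for illustration, and real individuals scatter around the line because of drift and sampling noise.

```python
# Invented allele frequencies at five SNPs for two hypothetical source populations.
SOURCE_A = [0.9, 0.8, 0.1, 0.3, 0.6]
SOURCE_B = [0.2, 0.1, 0.7, 0.9, 0.5]

def mix(p_a, p_b, alpha):
    """Expected allele frequencies with a fraction alpha of ancestry from A."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(p_a, p_b)]

def position_on_axis(p, p_a, p_b):
    """Fractional position of p along the line from source B (0.0) to source A (1.0)."""
    axis = [a - b for a, b in zip(p_a, p_b)]
    offset = [x - b for x, b in zip(p, p_b)]
    return sum(o * d for o, d in zip(offset, axis)) / sum(d * d for d in axis)

for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    admixed = mix(SOURCE_A, SOURCE_B, alpha)
    print(alpha, round(position_on_axis(admixed, SOURCE_A, SOURCE_B), 3))
```

Each admixed population's position along the A to B axis recovers its mixing fraction, so a series of populations with varying admixture strings out along a line between the two sources.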

Why is the Austro-Asiatic fraction higher in Java than in the zones to the north? Java is today the most densely populated region of Indonesia because of its fertility. I hypothesize that the spread of the Austronesians was facilitated by a more effective form of agriculture which could squeeze more productivity out of marginal land. Relative to Java the Malay peninsula, Borneo, and Sumatra, are agriculturally marginal. The density of the Austro-Asiatics was greatest in Java, while they were very thin in the regions to the north. It seems likely that the Austronesians engaged in a series of “leap-frogs” to islands and maritime fringes which were not cultivated by the Austro-Asiatic populations. Some Indonesian groups, such as the Mentawai who live on the island of the same name off the western coast of Sumatra, cluster with the Taiwanese, as if they transplanted their society in totality.

One thing that needs to be mentioned when talking about the genetics and prehistory of Southeast Asia is the “Negritos.” As indicated by their name these are a small people with African-like features. As is clear from the charts above these people are not particularly genetically close to Africans. The Philippine Negritos seem to have some relationship to the Melanesians. Interestingly they speak an Austronesian language, again following the trend where marginalized indigenes seem to pick up the language of their farming neighbors. The Negritos of Malaysia are somewhat different, but note that one of the populations exhibits Austro-Asiatic, but not Austronesian, admixture. This comports with my supposition that the Austro-Asiatic populations were the first to marginalize these tribes before themselves being assimilated by the Austronesians.

Someone with a better ethnographic understanding of Southeast Asia than I could probably decode the results above with greater power. But at this point I think we’ve got a chronology like so:

1) First you have hunter-gatherer populations of broad Melanesian affinities in Southeast Asia.

2) Then Austro-Asiatic populations move south from the fringes of southern China. Some push west to India, while others leap-frog south to zones suitable for agriculture such as Java.

3) Then Austronesian populations sweep south along water routes, and marginalize the Austro-Asiatics in island Southeast Asia, though not on the mainland.

4) The Bamar arrive from southern China over 1,000 years ago, and marginalize the Austro-Asiatics in Burma.

5) The Thai arrive from southern China less than 1,000 years ago, take over the central zone of mainland Southeast Asia, and make inroads to the west in Burma.

I will hazard to guess that the Malagasy of Madagascar are Austronesians who have very little of the Austro-Asiatic element in their ancestry. I believe this is so because they were part of the leap-frog dynamic where societies were transplanted from suitable point to point by water (the Malagasy language seems to be a branch of dialects of southern Borneo!).

So far I’ve been talking about the north to south movement. And yet the paper observes a south to north gradient in genetic diversity, which implies to the authors migration from south to north (the northern East Asian groups being a subset of the southern). But the past may have been more complex than we give it credit for. It is entirely possible that modern humans arrived in northeast Asia via a southern route, retreated south during the glaciation, and expanded north, with some groups pushing back south again. As it is, looking at how distantly the Melanesians relate to East Eurasians I think the most plausible model is that there wasn’t a relatively recent expansion from Southeast Asia. Rather, the ancestors of most East Eurasians survived in refugia in China, and a sequence of agriculturally driven expansions have reshaped Southeast Asia more recently. These populations admixed with the indigenous substrate, more or less. This would have resulted in an uptake of genetic diversity. Finally, the massive expansion of Han from the Yellow river basin may have caused the extinction of many lineages across China within the past ~3,000 years.
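The claim that admixture produces an "uptake" of diversity is easy to check with a toy calculation. Using expected heterozygosity, 2p(1 - p) averaged over biallelic loci, a 50/50 mixture of two differentiated populations comes out more diverse than either parent. The allele frequencies below are invented purely for illustration, as are the function names.

```python
def expected_heterozygosity(freqs):
    """Mean of 2p(1 - p) across biallelic loci with allele frequencies p."""
    return sum(2 * p * (1 - p) for p in freqs) / len(freqs)

def admix(p_a, p_b, alpha):
    """Allele frequencies after mixing a fraction alpha of A with 1 - alpha of B."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(p_a, p_b)]

# Invented frequencies: two populations that have drifted in opposite directions.
NEWCOMERS = [0.9, 0.1, 0.8, 0.2]
SUBSTRATE = [0.1, 0.9, 0.3, 0.7]
MIXED = admix(NEWCOMERS, SUBSTRATE, 0.5)

for name, freqs in [("newcomers", NEWCOMERS), ("substrate", SUBSTRATE), ("mixed", MIXED)]:
    print(name, round(expected_heterozygosity(freqs), 3))
```

So a diversity gradient on its own does not settle the direction of migration. High diversity in the south could reflect an older source population, or it could reflect later arrivals mixing with a differentiated substrate.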

Citation: The HUGO Pan-Asian SNP Consortium, Abdulla, M., Ahmed, I., Assawamakin, A., Bhak, J., Brahmachari, S., Calacal, G., Chaurasia, A., Chen, C., Chen, J., Chen, Y., Chu, J., Cutiongco-de la Paz, E., De Ungria, M., Delfin, F., Edo, J., Fuchareon, S., Ghang, H., Gojobori, T., Han, J., Ho, S., Hoh, B., Huang, W., Inoko, H., Jha, P., Jinam, T., Jin, L., Jung, J., Kangwanpong, D., Kampuansai, J., Kennedy, G., Khurana, P., Kim, H., Kim, K., Kim, S., Kim, W., Kimm, K., Kimura, R., Koike, T., Kulawonganunchai, S., Kumar, V., Lai, P., Lee, J., Lee, S., Liu, E., Majumder, P., Mandapati, K., Marzuki, S., Mitchell, W., Mukerji, M., Naritomi, K., Ngamphiw, C., Niikawa, N., Nishida, N., Oh, B., Oh, S., Ohashi, J., Oka, A., Ong, R., Padilla, C., Palittapongarnpim, P., Perdigon, H., Phipps, M., Png, E., Sakaki, Y., Salvador, J., Sandraling, Y., Scaria, V., Seielstad, M., Sidek, M., Sinha, A., Srikummool, M., Sudoyo, H., Sugano, S., Suryadi, H., Suzuki, Y., Tabbada, K., Tan, A., Tokunaga, K., Tongsima, S., Villamor, L., Wang, E., Wang, Y., Wang, H., Wu, J., Xiao, H., Xu, S., Yang, J., Shugart, Y., Yoo, H., Yuan, W., Zhao, G., & Zilfalil, B. (2009). Mapping Human Genetic Diversity in Asia. Science, 326 (5959), 1541-1545. DOI: 10.1126/science.1177074

(Republished from Discover/GNXP by permission of author or representative)
 

I am currently reading Victor Lieberman’s magisterial Strange Parallels: Volume 2, so I was very interested in a new paper from BMC Genetics, Genetic structure of the Mon-Khmer speaking groups and their affinity to the neighbouring Tai populations in Northern Thailand, pointed to by Dienekes today. Here are the results and conclusions:

A large fraction of genetic variation is observed within populations (about 80% and 90 % for mtDNA and the Y-chromosome, respectively). The genetic divergence between populations is much higher in Mon-Khmer than in Tai speaking groups, especially at the paternally inherited markers. The two major linguistic groups are genetically distinct, but only for a marginal fraction (1 to 2 %) of the total genetic variation. Genetic distances between populations correlate with their linguistic differences, whereas the geographic distance does not explain the genetic divergence pattern.

The Mon-Khmer speaking populations in northern Thailand exhibited the genetic divergence among each other and also when compared to Tai speaking peoples. The different drift effects and the post-marital residence patterns between the two linguistic groups are the explanation for a small but significant fraction of the genetic variation pattern within and between them.

There are many occasions when it has taken a synthetic scholar to point out to me the overall structure of a constellation of facts which I was conscious of prior. So it is with Lieberman’s work. I had known that the eruption of the Thai peoples into Southeast Asia occurred within the last 1,000 years, before which the peninsula was divided between Tibeto-Burman populations to the west and Austro-Asiatic languages to the east (the latter divided between the Khmer and Vietnamese). Additionally, it is presumed that the Tibeto-Burman languages themselves displaced Austro-Asiatic in the western zone (as evident by the persistence of Mon in modern Burma). What was noted in volume 1 of Strange Parallels though is that the three geographical regions engaged with and assimilated the Thai invasions differently. In the center the Thai succeeded in dominating the previous groups and imposing their identity upon the region. It is often asserted that modern Cambodia’s existence as an independent state is a function of the protection conferred upon it by the French from the expansive ambitions of the Empire of Siam. But in the east the Vietnamese state was barely impacted by the Thai folk wandering. As in China the Thai in Vietnam are marginalized “mountain tribes.” Finally, in the west, in the zone which became Burma, the Thai did not take over the cultural commanding heights. But neither were they absolutely marginalized as in the east. Rather, the Shan people became part of the Burmese landscape, integrated into the Theravada Buddhist culture, but also a significant secondary ethnos to the Burman majority (along with Karens, Mons, etc.).

What does this have to do with genetics? Possibly everything and nothing, and all answers in between.

The massive shift in ethno-linguistic identity in the center of mainland Southeast Asia, its lack in the east, and position at the equipoise in the west, should be excellent tests of propositions as to the nature of the spread of such ethno-linguistic identities. Is it pure construction, demographic replacement, or some quantitative combination of the two parameters? Unfortunately the BMC Genetics paper focuses only on Y chromosomes and mtDNA, the paternal and maternal lineage. These markers are informative, but I’d rather look at total genome content. The ethnic coverage in a small area of northern Thailand though is impressive. The open circles represent Mon-Khmer ethnic groups, the dark ones Thai. The Mon-Khmer are the presumed indigenes, while the Thai are intrusive. At least over the past 1,000 years.

Below I’ve reedited the Y and mtDNA multidimensional scaling plots. The Y is on the left, and mtDNA on the right. The clustering pattern shows relationships across the lineages. Again, the open markers represent Mon-Khmer groups, and the closed ones Thai.

Since the paper is open access I invite you to read their interpretations. All I’d say is that the clustering of male Thai lineages is very interesting, and is well explained by the model of groups of related men being intrusive to a region, and taking wives from the indigenes. In contrast the Mon-Khmer Y chromosomal lineages scatter about more, and that may be due to the fact they coalesce back to common ancestors far further back in history. The intrusion of the Thai into Southeast Asia may then be demographically characterized by a migration of male warbands. In regions where these warbands managed to topple the previous order, as in central mainland Southeast Asia, they may have then monopolized access to women and entered into a period of demographic expansion.

Luckily we do have some thick-marker autosomal data. To the left I’ve reedited a figure generated with the HUGO Pan-Asian data. The bar plot is at K = 14. I’ve excised many of the extraneous populations. The colors within the bar plot correspond to associations with broader language families. So red seems to be Austro-Asiatic, while blue is Thai. You can see in the figure that the Chinese Thai lack the red Mon-Khmer component. Interestingly the Hmong of upland Southeast Asia, who are culturally marginal to the dominant Theravada Buddhist culture of the lowlands, exhibit evidence of very sharp differentiation from the Thai and the Austro-Asiatic groups. They lack the affinity with island Southeast Asians, Malays, and Taiwanese Aborigines, which seems common amongst the South Chinese more broadly. The Karen of Thailand are probably the best proxy we have for the Tibeto-Burman people of Burma, who post-date the Austro-Asiatic, and predate the Thai. Going by these data it looks as if the Karen are very hard to differentiate from the Austro-Asiatic populations, though very distinctive from the Thai.

The Pan-Asian data set leaves a lot to be desired. There’s not much coverage of the east or west. I suspect that Southeast Asia is going to be somewhat complex, and extrapolating from the correlations between languages and genes in Thailand is going to get us only so far. But it’s a start. In Strange Parallels the author makes the case that mainland Southeast Asia can tell us a lot about generic Eurasian historical process. I hope, and suspect, that it can tell us something more general about the interplay between language and genes over time in other regions as well.

(Republished from Discover/GNXP by permission of author or representative)
 


The past ten years have obviously been very active in the area of human genomics, but in the domain of South Asian genetic relationships in a worldwide context they have seen veritable revolutions and counter-revolutions. The final outlines are still to be determined. In the mid-1990s the conventional wisdom was that South Asians were a branch of a broader West Eurasian cluster of peoples, albeit more distant from the core Middle Eastern-North-African-European-Caucasian clade. The older physical anthropological literature would have asserted that South Asians were predominantly Caucasoid, but with an Australoid element admixed in at varying proportions as a function of geography and caste. To put it more concretely, and I think accurately, a large degree of South Asian physical variety can be defined along the spectrum between A. R. Rahman and Nawaz Sharif. The regional and caste truisms are only correlations. Subrahmanyan Chandrasekhar was a Tamil Brahmin, but experienced anti-black racism in the United States. I think that is reasonable in light of his appearance.

This rough & ready mainstream understanding, supported by classical genetic markers, was overturned in the early years of the 21st century. One line of thought argued that South Asians were much more distinctive from the broader Western Eurasian cluster of peoples. Representative of this body of work is a paper like The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. These researchers tended to start with the female lineages, mtDNA, and then supplement that with Y lineages, the paternal descent. A separate line of evidence, generally drawn from Y chromosomal results, indicated that there were deep connections between the people of India and those of Central Eurasia, in particular via the R1a haplogroup. Additionally, one aspect of the first set of results which was very surprising was that it actually placed South Asians closer to East, not West, Eurasians. But by the end of the aughts the uniparental studies had been supplemented by a range of results produced from SNP-chips, which looked at hundreds of thousands of genetic variants. These studies seemed to support the older view of South Asians being closer to West Eurasians than East Eurasians. Finally last year a paper came out which posited that almost all South Asian populations were actually an ancient stabilized hybrid between two groups, a European-like population, “Ancient North Indians” (ANI), and another group which is no longer present in unadmixed form, “Ancient South Indians” (ASI), of whom the Andaman Islanders are distant relatives. Though there was a slight bias toward ANI as a whole, the fraction of ASI increased as one went southeast, and down the caste ladder. The distinctive “South Asian” ancestral group in other words then may actually be conceived of as a compound of these two elements; an admixture of the native substrate against a European-like genetic background.

Strangely it sounds an awful lot like the older idea of a Caucasoid population with Australoid admixture. We know now that the connection between the tribal peoples of India, and the indigenous groups of South and Southeast Asia as a whole, to those of Australia and Melanesia, is tenuous at best. So the term “Australoid” is not really informative, and may even mislead. And in terms of historical linguistics I don’t think we’ve solved the problem by appealing to an “Aryan invasion.” The high fraction of ANI among South Indian tribal groups who are isolated from even Dravidian caste groups is a clue to the likelihood that the admixture event is very ancient, and probably precedes the arrival of the Aryans to the Indian subcontinent.

But there are more than two actors in this game. In Reconstructing Indian population history the authors acknowledge that their model is stylized, that reality is more complex. Additionally, they perceive in their data that some tribal groups from northeast India have an element which is outside of the purview of a two-way admixture event. They discarded this set from their broader analysis because this seemed to be a phenomenon restricted to these groups. A new paper in Molecular Biology and Evolution re-injects this third element into the picture. Population Genetic Structure in Indian Austroasiatic speakers: The Role of Landscape Barriers and Sex-specific Admixture:

The geographic origin and time of dispersal of Austroasiatic (AA) speakers, presently settled in South and Southeast Asia, remains disputed. Two rival hypotheses, both assuming a demic component to the language dispersal, have been proposed. The first of these places the origin of Austroasiatic speakers in Southeast Asia with a later dispersal to South Asia during the Neolithic, whereas the second hypothesis advocates pre-Neolithic origins and dispersal of this language family from South Asia. To test the two alternative models this study combines the analysis of uniparentally inherited markers with 610,000 common SNP loci from the nuclear genome. Indian AA speakers have high frequencies of Y chromosome haplogroup O2a; our results show that this haplogroup has significantly higher diversity and coalescent time (17-28 KYA) in Southeast Asia, strongly supporting the first of the two hypotheses. Nevertheless, the results of principal component and “structure-like” analyses on autosomal loci also show that the population history of AA speakers in India is more complex, being characterised by two ancestral components – one represented in the pattern of Y chromosomal and EDAR results, the other by mtDNA diversity and genomic structure. We propose that AA speakers in India today are derived from dispersal from Southeast Asia, followed by extensive sex-specific admixture with local Indian populations.

Some background is necessary here. South Asia is notoriously linguistically diverse, but that diversity can be bracketed into several broad families. First, the Indo-European languages are represented by Indo-Aryan and Iranian dialects (and Germanic, if you include English). Second, the Dravidian languages are found across the subcontinent, from Brahui in Pakistan to Malto in Bangladesh. But they’re really the dominant languages in the southern cone of South Asia. That being said, it seems likely that historically their distribution extended far into the north, with Brahui in western Pakistan being a relic of that period, as well as the fragmented tribal groups in Central India. There is also evidence down to historic periods of a Dravidian-speaking substrate in Maharashtra. And purely from a philological perspective it seems clear that many Indo-Aryan languages evolved within a Dravidian linguistic substrate.

Next, in the far north there are languages of Tibetan provenance and affinity. These are explicable in their origins and relationship. But in the northeast third of the Indian subcontinent there are two groups of Austro-Asiatic languages. The prefix “Austro” is indicative of the symbiotic relationship between historical linguistics and physical anthropology in the early 20th century (most famously illustrated in the transplantation of the social-linguistic term Aryan from a South Asian and Iranian context, to a racialized Northern European term). The map at the top of this post shows the distribution of the Austro-Asiatic languages, as well as their subdivisions. There is clearly an eastern and western wing to the group, but most scholars assume that this is an artifact of the historical eruption of the Burman and Thai peoples out of the southern fringes of the Chinese Empire and into mainland Southeast Asia.

800px-Ramakrishna_Mission_Cherrapunjee_106

Within India the Austro-Asiatic languages fall into two broad categories: the Munda and the Khasi. The Khasi inhabit the massif which separates Bengal and Assam. Their culture and society are at some variance from the norm in India (they are matrilocal, and animist or Christian). A close relationship to the people to the east is clear in both their language and their physical appearance. The Khasi, and other groups such as the Garo, are of the family of peoples and ethnicities which have arrived from the east and north relatively recently, making the transition from the world of Tibet and Burma to India. This is evident in the face of the Khasi child in the image to the left. Once passing out of their lands of origin these populations have assimilated to different degrees to the Indic domain. The Tripuri people for example retain a Tibeto-Burman language, but are adherents of Vaishnav Hinduism (my own family were once subjects of the Manikya dynasty). The Ahom of Assam were totally assimilated by the Indo-Aryan substrate. Like the Bulgars of Bulgaria their only influence was in the ethnonym that they contributed to their subjects. A quick survey of my own genetics, and those of other South Asians of eastern origin on 23andMe, clearly shows the influence of assimilated Tibeto-Burmans. One Bangladeshi Muslim individual clearly carries an East Asian Y chromosomal haplogroup.

The Munda are a somewhat different case. In older historical literature on South Asia there is some consideration that the Munda may be the earliest inhabitants of India, predating the Dravidians. Some readers of South Asian origin also point out that in the early Indo-Aryan language there may be more evidence of Munda, than Dravidian, influence. But the eastern connections of the Munda languages seem clear, albeit less explicable than those of the Khasi or the Tibeto-Burman peoples of the far northeast. If the Munda are the indigenous people then it stands to reason that the Mon-Khmer languages derive from South Asia. On the other hand the vast majority of the Austro-Asiatic languages exist in Southeast Asia, and the Munda themselves have been hypothesized as being the bearers of rice-culture from the east.

This is where genetics comes into play. There has already been evidence of an eastern influence in the genes of the Munda from other researchers, so what this paper does is look at that in detail, instead of discarding it as a minor effect which muddles the broader picture. I’ve reformatted figure 3 to show how the groups relate to each other. On the left is a PCA. Most of the variance is west-east, ~6%, while some of it is north-south, ~1%. On the right is a bar plot generated from ADMIXTURE. I’ve edited out many of the populations. Focus on the Austro-Asiatic groups from India.

munda1

In the PCA you see the SE-NW axis of ANI-ASI admixture which is the primary aspect of genetic variation within South Asia. Numerically Dravidian and Indo-Aryan groups along this axis are the vast majority of South Asians. But the Munda and other Austro-Asiatic groups are not trivial; there are strong suggestions that the eastern Indo-Aryan groups, Oriya, Bengali, and Assamese, are to some extent shaped by influence from the Austro-Asiatic elements. The closer connection of the Khasi to East Asian populations is clear on the PCA. But the fact that the South Indian samples are further along the Y axis than the Munda is indicative of admixture in the Munda population. Looking at the bar plot that’s clear. The dominant dark-green signature of South Indian ancestry is also predominant among the Munda, and found at non-trivial amounts among Iranian, Khasi, and Southeast Asian populations, but the Munda clearly have an eastern component which is not found in South Indians. This is probably the element which perturbs them on the PCA.
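For readers who want to see why admixed groups line up along an axis like this, here is a minimal sketch in Python with NumPy. Everything in it is a toy assumption of mine, not the paper's data or pipeline: two hypothetical source populations with independently drawn allele frequencies (far more diverged than any real human groups), 60 simulated individuals with ancestry fractions spread evenly from 0 to 1, and a plain PCA by singular value decomposition. The function names are made up for illustration.

```python
import numpy as np

def simulate_genotypes(ancestry, freq_a, freq_b, rng):
    """Draw 0/1/2 genotype counts for individuals of mixed ancestry.

    ancestry[i] is the fraction of individual i's genome from source A;
    the rest comes from source B (a toy two-way admixture model).
    """
    expected_freq = np.outer(ancestry, freq_a) + np.outer(1.0 - ancestry, freq_b)
    return rng.binomial(2, expected_freq)

def pca(genotypes, n_components=2):
    """PCA via SVD on the column-centered genotype matrix."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # share of total variance per component
    return u[:, :n_components] * s[:n_components], explained[:n_components]

rng = np.random.default_rng(0)
n_snps = 2000
# Hypothetical source allele frequencies; independent draws exaggerate divergence.
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = rng.uniform(0.1, 0.9, n_snps)
ancestry = np.linspace(0.0, 1.0, 60)

genotypes = simulate_genotypes(ancestry, freq_a, freq_b, rng)
coords, explained = pca(genotypes)

# If admixture is the main structure, PC1 should track the ancestry fraction.
pc1_vs_ancestry = abs(np.corrcoef(coords[:, 0], ancestry)[0, 1])
print("share of variance on PC1 and PC2:", explained)
print("|correlation| of PC1 with ancestry fraction:", pc1_vs_ancestry)
```

In this toy, individuals fall along PC1 in order of their ancestry fraction. A third source population, like the eastern element in the Munda, would pull samples off that line, which is the kind of perturbation described above.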

But this just tells us the relationships in terms of total genome content. It doesn’t necessarily tell us the historical sequence of admixture events or the direction of migration. In fact the evidence of Indian ancestry in Southeast Asia could be suggesting migration from South Asia to Southeast Asia (there is plenty of cultural evidence of transmission, though the presumption is that the demographic movements were marginal). They note in the paper that one phenomenon which could be obscuring and confusing our understanding is that much of gene flow occurs through isolation-by-distance (IBD). Village-to-village dynamics. In contrast to this you have folk wanderings, which result in a “leapfrog” aspect. The Hazara and Uyghur are both cases of leapfrogging, as their genetic makeup can’t be explained easily by IBD. So here the connections between the Munda and Southeast Asians, and the broader relationship between Southeast Asians and South Asians, could be IBD, or perhaps reflect deep ancient common ancestry. Perhaps the ASI group spanned the region from the Arabian Sea to the South China Sea, and was only later overlain by ANI and East Asian populations.

To explore these questions the authors tunneled down to a more fine-grained scale, and looked at uniparental lineages as well as a gene at which recent selection seems to have operated upon East Asians in distinction to other groups, EDAR. Though uniparental lineages are only partially informative in terms of ancestry, they are very amenable to dating because of their haploid inheritance patterns. And the relationships between the branches of the termini can give us historical information.

The following figure shows the relationship and distribution of a particular Y chromosomal haplogroup which the Munda carry, and other South Asians tend not to, which connects them to the east:

munda3

The haplogroup is O2a (M95). The results from the Y chromosomal data are not clear, though they do seem to reject the model whereby Southeast Asian O2a lineages derive from Indian ones. But it does not seem as if you have a scenario where one founder lineage entered into South Asia from Southeast Asia; there are too many disparate branches of O2a found among Indians. Additionally, the coalescence time (back to last common ancestor) is deeper in Southeast Asia, but still deep in South Asia among the Munda. From this it seems that the origin of Austro-Asiatic languages in South Asia can be rejected, but the details of the emergence of Austro-Asiatic in South Asia cannot be clearly perceived as of yet. From what I can gather the authors themselves do not necessarily believe that their results in this domain are robust (insensitive to varying the model’s assumptions even marginally).

An interesting point though is that the mtDNA, the female lineage, does not seem to diverge from other South Asians much at all. I find it intriguing that this is the same pattern we see along the major NW-SE axis of variation. It seems that mtDNA lineages unite South Asians, while the Y lineages separate them (by caste and region). The generality has many exceptions, but it points to a peculiar sex-mediated admixture process from both the northwest and northeast. Men on the move have reshaped the genetics and culture of South Asia, but the mtDNA lineages still point to an ancient Eurasian group with distant but stronger affinities to the east than the west. The mtDNA are likely the purest distillation of ASI.

Finally, they look at frequencies of variants of EDAR among the South Asian groups. EDAR is in some ways diagnostic of East Asian ancestry; it seems that a variant which produces thick straight hair emerged relatively recently among East Asians. Here’s the result from the HGDP browser:

edar1

edar2

The G allele exhibits co-dominance, so the GA phenotype has intermediate hair-thickness between AA and GG. Haplotype structure based tests of natural selection have indicated that the derived G allele is recent. The map to the right shows the frequency of the derived G variant by population group. The bubble size is proportional to frequency, while the colors represent language groups. Again the Khasi and Tibeto-Burman groups are as you’d expect: they exhibit a relatively high frequency of the derived variant. The Hazara are a group which only came into being within the last 1,000 years through an admixture event. The Tharu seem to have their origins in Nepal’s transitional zone, and all the Nepali populations have significant admixture with Tibetan groups even if they themselves are not Tibetan in language and culture. The interesting result is the Munda. The Dravidian groups lack the derived EDAR variant, as do Indo-European groups without a plausible East Asian source of admixture. But within the Munda the derived variant is found in proportions ~5%. This is far lower than the 60% among the Tibeto-Burmans of the northeast, or the 40% among the Khasi, but it is significant. And this result allows the authors to reject the IBD model of connection for Austro-Asiatic groups, because the Munda harbor the variant which other South Asian groups in their environs do not. Gene flow predicated on linguistic affiliation at such a remove seems implausible, so the most parsimonious explanation is that the Munda languages arrived in India from Southeast Asia as part of a leapfrog folk wandering.
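To give a feel for what those allele frequencies mean for the co-dominant genotypes, here is a small Python sketch. It assumes Hardy-Weinberg proportions (random mating within each group), which is my simplification and not something the paper claims; endogamous and recently admixed groups will deviate from it. The frequencies are the rounded ones quoted above, and the function name is hypothetical.

```python
def hardy_weinberg(freq_g):
    """Expected genotype proportions for derived-allele frequency freq_g.

    Assumes random mating (Hardy-Weinberg); an illustrative simplification.
    """
    freq_a = 1.0 - freq_g
    return {
        "GG": freq_g ** 2,          # two derived copies: thickest hair
        "GA": 2 * freq_g * freq_a,  # one copy: intermediate (co-dominance)
        "AA": freq_a ** 2,          # no derived copies
    }

# Rounded derived-allele frequencies quoted in the post.
groups = {"Munda": 0.05, "Khasi": 0.40, "Tibeto-Burman": 0.60}

for name, freq in groups.items():
    proportions = hardy_weinberg(freq)
    summary = ", ".join(f"{g}={p:.4f}" for g, p in proportions.items())
    print(f"{name}: {summary}")
```

Under that assumption, at a 5% allele frequency nearly all carriers are GA heterozygotes, so most of whatever phenotypic effect exists among the Munda would be the intermediate one.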

But why the low frequency of the derived variant? Obviously the Munda have admixed with the local substrate, so dilution would be one explanation. Another could be that when the Munda left East Asia the frequency was lower. Additionally, whatever selective forces were driving the frequency up may have abated in South Asia, and it could be that there was selection against the derived variant! Whatever the truth of it the existence of the derived EDAR variant among the Munda would be like finding the European LCT variant among an East Asian population: clear evidence of long distance gene flow and population movement.
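The dilution explanation is simple arithmetic, so here is a back-of-the-envelope sketch in Python. It assumes a single pulse of admixture with no selection or drift afterwards, and it uses the present-day Khasi (~40%) and Tibeto-Burman (~60%) frequencies as stand-ins for the unknown frequency in the incoming population. Those stand-ins are my assumption, not an estimate from the paper, and the function names are hypothetical.

```python
def diluted_frequency(source_freq, local_freq, source_fraction):
    """Allele frequency after one pulse of admixture (no selection, no drift)."""
    return source_fraction * source_freq + (1.0 - source_fraction) * local_freq

def implied_source_fraction(observed_freq, source_freq, local_freq=0.0):
    """Ancestry fraction from the source needed to explain observed_freq."""
    if source_freq == local_freq:
        raise ValueError("source and local frequencies must differ")
    return (observed_freq - local_freq) / (source_freq - local_freq)

observed_munda = 0.05  # ~5% derived EDAR allele, as quoted in the post

# Stand-in source frequencies (present-day values, used only for illustration).
for label, source_freq in [("Khasi-like source", 0.40),
                           ("Tibeto-Burman-like source", 0.60)]:
    fraction = implied_source_fraction(observed_munda, source_freq)
    print(f"{label}: about {fraction:.1%} eastern ancestry would give 5%")
```

If the incoming population carried the variant at a lower frequency than these stand-ins, the implied eastern fraction goes up, and selection in either direction would shift it again, so these numbers are best read as rough bounds under the stated assumptions.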

So where does this lead us? First, let me observe that some of the authors on this paper are the same ones who argued for a predominantly indigenous origin for South Asians in the early 2000s based on mtDNA variation. In this paper they seem to be leaning against an indigenous origin for the Munda, or at least refuting the conjecture that the Munda are ur-Indians par excellence. I didn’t go into the details of the coalescence times because they’re rather a mess, but EDAR is probably a “tipping point” in arguing for a relatively recent exogenous origin for the Munda. The strong sex asymmetry in genetic variation is also suggestive; we have plenty of historical examples of genetic leapfrogs occurring through men-on-the-move. The asymmetry also seems to exist among the Khasi and other Tibeto-Burmans in India’s northeast (figure 2 of the paper).

The arguments about the history, culture, and genetics of South Asia have traditionally been disputed along the Aryan-Dravidian axis. I’m not interested in rehashing that aspect, but these data point us to another reality: on India’s northeast frontier there’s another component. As an ethnic Bengali myself I’ve always been somewhat aware of this. Some of my relatives and family acquaintances look much more like Garos than other South Asians. This component is even more evident on the faces of Assamese and Nepalis, whose languages are Indo-Aryan and whose religion is Hinduism, but whose appearance bespeaks a more variegated background. On some level South Asians from these regions are aware of their peculiarity, even if it isn’t spoken of much. I have read that in the wake of the victory of Japan over Russia in the early 20th century Bengali intellectuals expressed in public their pride at their Asiatic ancestry. With the rise of China in the 21st century I suspect more South Asians from Nepal, Bengal, and Assam, will rediscover that aspect of their background which links them to the east, and not the west. The genetics is just telling us what we already knew.

Citation: Gyaneshwer Chaubey, Mait Metspalu, Ying Choi, Reedik Mägi, Irene Gallego Romero, Pedro Soares, Mannis van Oven, Doron M. Behar, Siiri Rootsi, Georgi Hudjashov, Chandana Basu Mallick, Monika Karmin, Mari Nelis, Jüri Parik, Alla Goverdhana Reddy, Ene Metspalu, George van Driem, Yali Xue, Chris Tyler-Smith, Kumarasamy Thangaraj, Lalji Singh, Maido Remm, Martin B. Richards, Marta Mirazon Lahr, Manfred Kayser, Richard Villems, & Toomas Kivisild (2010). Population Genetic Structure in Indian Austroasiatic speakers: The Role of Landscape Barriers and Sex-specific Admixture Mol Biol Evol : 10.1093/molbev/msq288

Link acknowledgement: Dienekes Pontikos.

Addendum: This is more a speculative comment, so I will tack this on to the body of the main post. Here’s my current very tentative model for how South Asians came to be. At some point after the last Ice Age 10,000 years ago the ANI arrived, and hybridized with the ASI, who are descendants of the older original Out of Africa wave to South Asia. After this, but before the Aryans, the Munda arrived from the northeast, and pushed into lands inhabited by ANI-ASI groups. 4,000-3,000 years ago the Indo-Aryans arrived, and imposed themselves as an elite on the ANI-ASI hybrid population, before being assimilated biologically and imparting their language to the Indian majority. I don’t know where Dravidian came from, but perhaps it was the language of the ANI (its existence in fragments all across the swath of the northern Indian subcontinent is suggestive, as well as possible connections to ancient Elamite, the language of Bronze Age southwest Iran). Eventually the Aryanized ANI-ASI marginalized the Munda in northeast India and drove them to the highlands. Finally, the Tibeto-Burmans arrived in the historical period.

Image Credit: Wikimedia Commons

(Republished from Discover/GNXP by permission of author or representative)
 
Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at http://www.razib.com"