The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog


Sardinian actress


Several years ago a paper was published, The History of African Gene Flow into Southern Europeans, Levantines, and Jews:

Previous genetic studies have suggested a history of sub-Saharan African gene flow into some West Eurasian populations after the initial dispersal out of Africa that occurred at least 45,000 years ago. However, there has been no accurate characterization of the proportion of mixture, or of its date. We analyze genome-wide polymorphism data from about 40 West Eurasian groups to show that almost all Southern Europeans have inherited 1%–3% African ancestry with an average mixture date of around 55 generations ago, consistent with North African gene flow at the end of the Roman Empire and subsequent Arab migrations. Levantine groups harbor 4%–15% African ancestry with an average mixture date of about 32 generations ago, consistent with close political, economic, and cultural links with Egypt in the late middle ages. We also detect 3%–5% sub-Saharan African ancestry in all eight of the diverse Jewish populations that we analyzed. For the Jewish admixture, we obtain an average estimated date of about 72 generations. This may reflect descent of these groups from a common ancestral population that already had some African ancestry prior to the Jewish Diasporas.

At the time Dienekes had a pretty strong critique of the paper: the authors assumed that Northern Europeans were an unadmixed reference population, when in fact these populations may have had mixture from East Asians. He presented PCA plots which illustrated the fact that the CEU samples (whites from Utah with British and German ancestry) were shifted toward Chinese in comparison to Sardinians. All these years later I think that there is another explanation besides East Asian admixture: the Chinese themselves have some admixture from West Eurasians. The Sardinians are notably lacking in some admixture components common among mainland Europeans, so the West Eurasian admixture into Chinese is probably closer to mainland Europeans than it would be to Sardinians.

So what about the Sardinian admixture from Africans? The same dynamic might be at play here: old admixture from “Early European Farmers” into Africans might explain why Sardinians are closer to Africans.

But let’s assume that the Sardinian admixture from Africans is legitimate. What does that mean for the estimates of ancestral derivation from continental populations that you see from the DTC personal genomics firms that report ancestry results? It can only mean that among people of Southern European ancestry a few percent of African ancestry is being “masked” because it is part of the reference population set. This came to my mind because a half Cuban friend of mine had ~2% African ancestry. Reasonable. When I checked by running unsupervised ADMIXTURE he had ~4%. Then I noticed that the Sardinian reference set was often in the 1-2% range. One explanation for the discrepancy then would be that a few percent of African ancestry in his genome was simply swallowed up by the reference population in the supervised learning framework.
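To make the "masking" arithmetic concrete, here is a minimal sketch. It assumes a deliberately simple two-source linear mixture (an African reference plus a "European" reference that itself carries a small African fraction) and ignores every other ancestry component, so it is an illustration of the logic and not a description of how ADMIXTURE or any DTC firm actually computes results. The function name is hypothetical.

```python
def supervised_african_estimate(true_african: float, reference_african: float) -> float:
    """African fraction reported when the 'European' reference is itself slightly African.

    Assumed model: genome = x * reference + y * African, with x + y = 1,
    so true_african = x * reference_african + y. Solving for y gives the
    fraction a supervised analysis would attribute to the African reference.
    """
    if not 0.0 <= reference_african < 1.0:
        raise ValueError("reference_african must be in [0, 1)")
    reported = (true_african - reference_african) / (1.0 - reference_african)
    return max(reported, 0.0)  # a negative ancestry fraction cannot be reported


if __name__ == "__main__":
    # Hypothetical inputs echoing the numbers in the post: ~4% unsupervised,
    # reference panel carrying 1-2% African ancestry.
    for ref in (0.01, 0.02):
        est = supervised_african_estimate(0.04, ref)
        print(f"reference carries {ref:.0%} African: reported {est:.1%}")
```

Under these assumptions a true 4% with a reference at 2% comes out near 2%, which is the size of the discrepancy described above.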

• Category: Race/Ethnicity, Science • Tags: Admixture 
The Austronesian languages, credit


The Austronesians were crazy and extraordinary. Starting ~5,000 years ago they set off from the environs of Taiwan, and began to push outward. For ~30,000 years the people of Melanesia had defined the eastern edge of human habitation, but the Polynesian branch of the Austronesians blasted past that, going all the way to Hawaii and Easter Island. At the other extreme the ancestors of the Malagasy settled Madagascar, an island which the peoples of Africa had not yet reached despite ~200,000 years of human habitation. We don’t know what was happening here, and it is hard to pinpoint particular cultural, environmental, or genetic forces which might result in these sorts of radical change in mores. Humans are conservative and cautious by nature. But our particular lineage of modern humans far less so than our forebears or cousins. After all we did make it to Oceania and the Americas, while the others did not.

But a great unresolved question is contact with the Americas. There’s a lot of suggestive evidence, but no clincher. But two recent papers increase the probability considerably. Both are in Current Biology: Two ancient human genomes reveal Polynesian ancestry among the indigenous Botocudos of Brazil and Genome-wide Ancestry Patterns in Rapanui Suggest Pre-European Admixture with Native Americans. Alexander Kim has already reviewed the nuts & bolts of the first paper. Here’s the major finding: heretofore the reasonable assumption was that these Polynesian remains in interior Brazil came from escaped slaves, but there is an 80-90% probability that they died before any such enslavement of Polynesians could have occurred. In fact both remains may be pre-Columbian!

Cite: Genome-wide Ancestry Patterns in Rapanui Suggest Pre-European Admixture with Native Americans


The second paper has a somewhat more subtle result. The inhabitants of modern day Easter Island are descended in the main from the Polynesians who arrived from the west. This has long been known from classical genetics and non-genetic fields. There have also been suggestions of European and Amerindian admixture. Entirely reasonable in light of Easter Island being a possession of Chile, and 19th century migratory events. What these authors did was look at the distribution of ancestry outcomes in the genomes of Easter Islanders, from which they inferred that the admixture with Amerindians far predated that with Europeans. The rationale here is simple: recent ancestry from divergent groups tends to exhibit patterns of long alternating blocks, due to a relatively small number of recombination events. In contrast older ancestry tends to be broken up by many recombination events over the generations, until deconvolution can’t separate the two elements and they fuse as one. As an example of the latter case modern day Europeans and South Asians are compound populations whose admixture dates of ~4,000 years or more make it difficult to trivially deconvolute their ancestral components on a genome-wide scale (though ancient DNA from Mal’ta likely can help in the case of Europeans).
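The block-length logic can be sketched with a rough rule of thumb. The snippet below assumes a single pulse of admixture and treats tract lengths as exponentially distributed with a mean of about 100/g centimorgans after g generations. That ignores the admixture proportion, drift, and finite chromosome lengths, so it is a back-of-envelope illustration and not the model fitted in the paper. The function names and the example generation counts are hypothetical.

```python
import math


def mean_tract_length_cm(generations: float) -> float:
    """Approximate mean ancestry tract length (cM) g generations after a single pulse."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / generations


def fraction_longer_than(length_cm: float, generations: float) -> float:
    """Share of tracts longer than length_cm, assuming exponential tract lengths."""
    return math.exp(-length_cm * generations / 100.0)


if __name__ == "__main__":
    # Hypothetical comparison: a recent pulse (5 generations) versus an old one (25).
    for g in (5, 25):
        print(
            f"{g} generations: mean tract ~{mean_tract_length_cm(g):.1f} cM, "
            f"{fraction_longer_than(10.0, g):.1%} of tracts exceed 10 cM"
        )
```

The older pulse leaves many short blocks and very few long ones, which is the qualitative signature used to order the two admixture events.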

Figure 4 above shows the match of two demographic models with the empirical results. M2 is one where Mestizos from Chile bring European and Amerindian ancestry into the genomes of Easter Islanders. M1 is where there is an ancient Amerindian admixture, followed by a later European one. The solid lines show the predictions, while the points show the empirical results from the samples. It is clear visually that M1 fits the data. There are many short Amerindian blocks, evidence of an old admixture, as opposed to more varied and longer European blocks. The rough dates for Amerindian ancestry admixture are in the range of 1300 to 1400 A.D., which match reasonably well with when Easter Island was settled.

These results are strong. Not definitive and probably not the last word, though more Easter Islander samples can end the debate over admixture at least. But they make us wonder how incredible human migrations have been over the past ~50,000 years! Ancient people were far more daring than we had imagined, and I think we need to reconsider what “crazy” exactly is in many ways.

• Category: Science • Tags: Admixture 

Simon van der Stel, first governor of the Dutch Cape Colony. His maternal grandmother was an Indian slave


In the comments below a question was asked about the non-European admixture in white Canadians, New Zealanders, and Australians. It was prompted by the fact that low levels of non-European admixture do seem to be found in most whites in the Family Tree DNA database where both parents were born in South Africa (granted, a small sample). My hunch is that these individuals must be Afrikaner, because I have a hard time understanding how else one would detect Khoisan and Southeast Asian ancestry (West African and South Asian ancestry would be easier to explain). So what about the other “white dominions,” those realms of the British Empire united by being either dominated or ruled by white people? I actually just looked at the data for Australia, Canada and New Zealand. The sample size for New Zealand was small. But in these cases those individuals of preponderant European ancestry have, by and large, no non-European ancestry. A few Canadians do have some fractions of Native American ancestry. This seems in line with the data on American whites from 23andMe. Only a small minority have non-European ancestry. Afrikaners are somewhat like many Latin American whites, in being visibly white European, but usually carrying some recent non-European ancestry because of the history of their people.

Addendum: Please recall that lack of genetic signal from ancestors 200-300 years back is not uncommon. Most Americans with colonial stock for example can probably trace a line of genealogical descent back to a Native American. But, because of the small fraction most of these genealogical descendants will not exhibit any genomic segments identical by descent with these individuals.
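A back-of-envelope calculation shows why genealogical ancestors so often leave no genomic trace. The sketch below uses a commonly quoted approximation: the number of autosomal blocks inherited from one ancestor k generations back is roughly Poisson with mean (22 + 33(k - 1)) / 2^(k - 1). The constants (22 autosomes, about 33 crossovers per generation) and the Poisson assumption are simplifications I am assuming here, not something taken from the data discussed above.

```python
import math


def expected_blocks(generations_back: int,
                    autosomes: int = 22,
                    crossovers_per_generation: float = 33.0) -> float:
    """Approximate expected number of autosomal blocks inherited from one ancestor."""
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    k = generations_back
    return (autosomes + crossovers_per_generation * (k - 1)) / 2 ** (k - 1)


def prob_no_dna(generations_back: int) -> float:
    """Poisson probability of inheriting zero blocks from that ancestor."""
    return math.exp(-expected_blocks(generations_back))


if __name__ == "__main__":
    # Roughly 8 to 12 generations spans the 200-300 year window mentioned above,
    # assuming about 25 years per generation (an assumption, not a measured value).
    for k in (8, 10, 12):
        print(f"{k} generations back: P(no shared DNA) ~ {prob_no_dna(k):.2f}")
```

Under these assumptions the chance of carrying nothing from a given ancestor climbs quickly past ten generations, in line with the point that a genealogical line need not leave a detectable genetic signal.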

• Category: Race/Ethnicity, Science • Tags: Admixture 

Citation: Decker, Jared E., et al. "Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle." arXiv preprint arXiv:1309.5118 (2013).

Citation: Decker JE, McKay SD, Rolf MM, Kim J, Molina Alcalá A, et al. (2014) Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle. PLoS Genet 10(3): e1004254. doi:10.1371/journal.pgen.1004254


I am a man of a particular age, old enough to remember when the idea of thousands of what were then quaintly termed ‘molecular markers’ would have left one aghast at the surfeit of data. Today the term “post-genomic” almost strikes me as anachronistic as the “information superhighway.” This is not the post-genomic era, it just is, the wildest dreams that were, are. But the glorious present of data abundance is not without its limitations and pitfalls. As a friend explained once, bioinformaticians just “do stuff,” sometimes without understanding why they do stuff. Somewhere along the way the bio part seems to have been forgotten in the hurry to assemble the next organism as the machine demands more and more for its hungry maw. But the mechanical monster slurping through the fire hose of data with a hacked together chimera of a regular expression isn’t without some purpose. Many biologists with an interest in evolution have a dream of dense markers painting vast swaths of the tree of life, an empire of phylogenetic information to be conquered.

But these vistas need some context, a horizon of information about the organism. This came to mind when I read Jared Decker’s new paper on the phylogenetics of domestic cattle, Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle. In many ways it is a straightforward paper. You can see discussions on the earlier iterations over at Haldane’s Sieve (the preprint process seems to have worked to make it a more robust and clear publication from what I can tell!). Decker utilizes some straightforward methods (at least straightforward in 2014) on a very large SNP marker data set with expansive geographic coverage: in particular, TreeMix, Admixture, and PCA. With ~40,000 SNPs these packages should blast through the data rather quickly (I’ve used all of them with this marker density, and with sample sizes approximately that of Decker’s).

You can read the whole paper yourself since it is open access. To me it seems to reiterate that cattle truly are cattle, to be pulled and prodded and traded at the whim of human beings. The fact that many East African cattle have predominantly Indian heritage (one of the two major clades) illustrates the fact that domestic animals exhibit the protean tendencies of human culture, rather than biological organisms which are governed by standard geographical and morphological diversification through conventional population genetic pressures. But I have to still admit that much of the narrative force of this paper escapes me because I lack understanding of the cattle at a level beyond the plainly statistical genetic. In other words, the organism matters. Cattle geneticists who may “hum through” the plots may still be able to grasp the force of argument with a greater clarity because their understanding of the topic is fundamentally thicker than that of outsiders. Many of the paper’s inferences from genetic data clearly draw their plausibility from elements of natural history which bovine biologists would take for granted.

And this is just the beginning. Over the next decade it seems inevitable that the clusters at the heart of “genomics cores” across the world will be gorging on whole sequences of thousands of individuals for many organisms. It will be a “flood the zone” era for attempting to understand the tree of life. An army of bioinformaticians will be thrown at the data in human waves, absorbing shock after shock, slowly transforming the ad hoc kludge pipelines of the pre-Model T era of genomics into simpler turnkey solutions. And then the biology will come back to the fore, and the deep wellspring of knowledge held by those who focus on specific organisms is going to be the essence of the enterprise once more.

• Category: Science • Tags: Admixture, Genomics, PCA, TreeMix 

There have been many popular press treatments of Hellenthal et al.’s A Genetic Atlas of Human Admixture History already. If you have not seen their interactive map, which imparts many of their results, I highly recommend it. To understand the scientific results it does help to read some of this group’s earlier papers, such as Inference of Population Structure using Dense Haplotype Data and Population Identification Using Genetic Data. As I suggested earlier the real paper is in the supplements, which has the virtue of being free, but generally the downside of not enforcing concision or accessibility. Obviously the general public is going to focus on the primary results: which populations mixed when. But perhaps more important is that the ingenious methods described in the supplements illustrate the power of looking at linked variants across segments of the genome, rather than just the variants themselves.
These segments are haplotypes, sequences of variation across genetic regions which exhibit some association. This association can be used to illustrate relatedness across populations and individuals, because the greater the distance of generations (meiotic events) the more recombination events break apart the haplotypes. To make this clearer, I’ve included several chromosomes “painted” by 23andMe as a function of varied ancestral assignments for one individual. You notice that in this painting different colors keep alternating. That is because the individual in this case is a friend whose background is from the mestizo population of Central America. In other words, he has well over 10 generations of recombination events breaking apart associations of ancestry along his genome.

The paper above reports the end product of a similar process of analysis, but quite a bit more elaborated in the inferences being made. At this point I will elide the technical details, not because they are unimportant (I’m particularly fascinated by their decomposition of decay curves which hide multiple admixture events), but because they are difficult, and with one read-through of the supplements I don’t particularly grasp all the subtleties. What is relevant for the reader is that the authors used haplotypic information by phasing their data, and so presumably can squeeze more juice out of it. This is illustrated by the comparison with the ROLLOFF application in ADMIXTOOLS, which uses just genotype data to make similar inferences. The future is probably toward phased analysis of haplotypes, because this sort of structuring of genomic data is more informationally rich. But it is computationally intensive to perform population based phasing, and the marker sets have to be dense enough that you can infer haplotypes. That will happen, but we’re not there yet with all data sets. This is a preview of the future, but we’re not in the future yet, that’s for sure.

Before moving onto a cursory survey of the results, I’d like to quote one of the authors from Nick Wade’s piece in The New York Times:

Dr. Myers and Dr. Hellenthal said that they hoped historians would find their work useful, but that they had not collaborated with historians.

“In some sense we don’t want to talk to historians,” Dr. Falush said. “There’s a great virtue in being objective: You put the data in and get the history out. We do think this is a way of reconstructing history by just using DNA.”

You might wonder how they could test if their methods were on the right track as they went along. They performed many simulations, constructing admixture scenarios to which they were blind and which their method managed to ferret out. Looking at their reported results in the supplements I’m impressed and rather confident that they’re onto something (additionally, the signals were cross-validated with ROLLOFF). But this sort of attitude of ostensible objectivity through ignorance does concern me somewhat, because the reality is the authors are quite aware at least in the most superficial sense of historical events, and did not hesitate to include their judgments within the text. From the paper itself:

Distinct, ancient, and partially shared admixture signals (always dated older than 90 BCE) are seen in six groups (Fig. 4B), including the Kalash (Fig. 2C), whose strongest signal suggests a major admixture event (990 to 210 BCE) from a source related to present-day Western Eurasians, although we cannot identify the geographic origin precisely. This period overlaps that of Alexander the Great (356 to 323 BCE), whose army, local tradition holds, the Kalash are descended from (40), but these ancient events predate recorded history in the region, precluding confident interpretation.

Credit: Flickr


The Kalash are famous because they are pagans who reside in the fastness of the Chitral in northwestern Pakistan. It is highly likely that they will be subject to cultural genocide by their Muslim neighbors within the next 10-20 years, as the Taliban has been threatening them with forced conversion for the past few years, and it seems unlikely that the broader Pakistani society would be willing to expend much more in blood & treasure to protect them. Additionally, many of the Kalash also evince a very European physiognomy, making the legend of a Macedonian origin plausible on the face of it. The 95% confidence interval for the date of admixture actually does include the period of Alexander’s invasion of the Persian east. But what are truly the chances of this? My own hunch is that the admixture into the Kalash is a real phenomenon, but probably because of another migration event which we are less aware of. This conjecture is based on some prior information. Zack Ajmal has been running the Harappa Ancestry Project for many years now, and has assembled a very large database of Indian ethnicities and castes. And there are already some suggestive patterns which I think may shed light on what is going on with the Kalash. I’ve taken Zack’s data and sorted it by particular ancestral quanta for a subset of populations:

Ethnicity              Dataset   N   S Indian  Baloch  Caucasian  NE Euro
haryana-jatt           harappa   5   27%       37%     9%         18%
rajasthani-jatt        harappa   2   25%       35%     11%        15%
punjabi-jatt-sikh      harappa   13  28%       40%     10%        12%
brahmin-uttar-pradesh  metspalu  8   42%       36%     5%         12%
pathan                 hgdp      23  23%       42%     16%        11%
kalash                 hgdp      23  22%       43%     18%        11%
burusho                hgdp      25  23%       41%     12%        10%
pushtikar-brahmin      harappa   1   31%       36%     12%        10%
bengali-brahmin        harappa   7   42%       33%     5%         10%
kashmiri-pandit        reich     5   32%       39%     12%        9%
punjabi                harappa   12  33%       39%     11%        8%
up-kshatriya           metspalu  7   45%       37%     4%         8%
punjabi-arain          xing      25  31%       44%     10%        7%
sindhi                 hgdp      24  29%       46%     10%        6%
iyer-brahmin           harappa   11  47%       37%     5%         5%

The “NE Euro” cluster is strongly correlated with Northern Europe. It peaks in Finns at 80%, with Lithuanians next at 72%. It’s striking to me that the peasant cultivators of eastern Punjab, the Jatts, have a high fraction of this component, which drops off as you go east and south, as you’d expect, but also into Pakistan. The Jatts have legends of “Scythian” ancestry. This might not be true, but I think something is going on in their history to explain their elevated “NE Euro,” which is above the fraction of North Indian upper castes. Interestingly the ratio between “Baloch” and “S Indian” is ~1.4, almost exactly that of the Punjabi Arain of Pakistan, who are also peasant farmers. What this suggests to me is that the “NE Euro” among the Jatt may be an overlay upon the peasant substrate of the Punjab. I also believe that it post-dated any primary Indo-Aryan contribution, as the upper castes of the North Indian plain do not exhibit the same patterns as the Jatt. Such specific stories are likely common, and illustrate the process of demographic “leapfrogging,” where populations translocate themselves rapidly over great distances, and admix with the local substrate in a single pulse.
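The ratio comparison can be checked directly from the table. The short sketch below recomputes “Baloch” divided by “S Indian” for the Jatt samples, the Punjabi Arain, and two North Indian upper caste samples. The percentages are the rounded values from the table, so the ratios are approximate; the variable names are just for illustration.

```python
# (S Indian %, Baloch %) copied from the table above; values are rounded percentages.
components = {
    "haryana-jatt": (27, 37),
    "rajasthani-jatt": (25, 35),
    "punjabi-jatt-sikh": (28, 40),
    "punjabi-arain": (31, 44),
    "brahmin-uttar-pradesh": (42, 36),
    "up-kshatriya": (45, 37),
}

# Ratio of the "Baloch" component to the "S Indian" component for each group.
ratios = {
    name: round(baloch / s_indian, 2)
    for name, (s_indian, baloch) in components.items()
}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name:<22} Baloch / S Indian = {ratio}")
```

The three Jatt samples and the Arain all land close to 1.4, while the two upper caste samples from the North Indian plain fall below 1.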

As Dienekes has already observed this method has the greatest power to detect admixture events between two strongly distinct populations less than 4,000 years ago. So, for example, he observes that the Yayoi-Jomon admixture in Japan which occurred ~1,500-2,000 years ago does not show up, likely because the two populations are too genetically similar. In contrast, low fractions of East Asian admixture do show up among European populations because they jump out against the European genetic background. Many of the more ancient admixture events detected by the Reich lab are outside the purview of this method, which relies on haplotype associations which decay as an exponential function of the admixture time. Additionally, as the authors note they have greater ability to narrow in on singular pulse admixture events, as opposed to continuous gene flow. This is why I say that this is really a map of recent leapfrogging as rapid demographic translocations produce the sort of genetic revolutions which leave the marks that these methods can easily detect.
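The exponential decay mentioned here is what makes dating possible. In the standard single-pulse picture, the admixture signal between two sites separated by genetic distance d (in Morgans) falls off roughly as exp(-n d), where n is the number of generations since admixture. The sketch below generates a noise-free synthetic curve and recovers n with a log-linear least-squares fit. It is a toy illustration under that assumed model, not the procedure used by GLOBETROTTER or ROLLOFF, which must handle noise, multiple pulses, and unknown source populations. All names and numbers are hypothetical.

```python
import math


def decay_curve(distances_morgans, generations, amplitude=1.0):
    """Synthetic admixture-LD curve: amplitude * exp(-generations * distance)."""
    return [amplitude * math.exp(-generations * d) for d in distances_morgans]


def fit_generations(distances_morgans, ld_values):
    """Estimate generations since admixture as minus the slope of log(LD) on distance."""
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ld_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


if __name__ == "__main__":
    # Distances from 0.5 cM to 20 cM, expressed in Morgans.
    distances = [0.005 * i for i in range(1, 41)]
    curve = decay_curve(distances, generations=25, amplitude=0.3)
    print(f"recovered generations: {fit_generations(distances, curve):.2f}")
    # An old event has almost no signal left even at short distances.
    print(f"signal left at 2 cM after 140 generations: {math.exp(-140 * 0.02):.3f}")
```

The second line of output illustrates the limitation described above: for events several thousand years old the curve has decayed to nearly nothing within a couple of centimorgans, leaving little to fit.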

And that is exactly the case in Central Eurasia. It’s clear on the map at the top of this page that admixture events are dominant in Central Eurasia, but less so on the periphery. H. L. Mackinder’s Heartland is always roiling. And the biggest commotion seems to have been caused by the Mongol Empire, as the authors repeatedly allude to this event as having caused dislocation and admixture. Those who were skeptical of the idea of a Genghis Khan Y chromosomal lineage should present a forthright rebuttal, because though these results don’t imply the expansion of the Genghiside lineage as such, they definitely point to the Mongolians as being the “source” population for numerous admixture events across Central Asia. Both the peoples and the timing fit. The primary question I have about these results though is the relatively weaker signal of the Turkic migrations, which preceded the rise of the Mongol Empire by over 500 years. One possibility is that the Turkic signal is weaker because it is closer to a continuous admixture scenario, as the nomadic Turks infiltrating the Islamic civilization were slowly emulsified by their neighbors. In contrast, the rise of the Mongols was a hammer blow to the Islamic world, and peoples rose and fell, and amalgamated, in a few generations.

These results also resolve some old debates among historians and archaeologists. During the late 6th and 7th centuries the Byzantines focused their energies on Anatolia, and the Balkan hinterland and much of Greece proper fell to barbarians, Sclaveni, Slavs. After several centuries the hinterlands of Greece proper were reconquered, and under Basil II the Balkans were brought back into the Byzantine fold. This process of ethnic back and forth, and likely admixture, was already hinted at in Peter Ralph and Graham Coop’s The Geography of Recent Genetic Ancestry across Europe. But the results here are even clearer. Modern Greeks do seem to have significant Slavic ancestry. Similarly, the Slavs, north to south, are the products of an admixture event which produces a cline in ancestry.

Most of the results in this paper are of this form. Stepping into tendentious historical debates, and pointing the finger in one specific direction. They came. They inferred. They resolved. But not in all instances. Here I quote from the paper:

A different method, which aims to detect but not date admixture, concluded that Cambodians trace ~16% of their DNA to a group equally related to modern-day Europeans and East Asians (29). GLOBETROTTER infers a ~19% contribution from a similar source related to modern-day Central, South, and East Asians and an ~81% contribution from a source related specifically to modern-day Han and Dai, the latter a branch of the Tai people who entered the region in historical times (30) (Fig. 2D, orange box 5). Further, this event dates to 1362 CE (1194 to 1502 CE), a period spanning the end of the Indianized Khmer empire (802 to 1431 CE) (30), one of the most powerful empires in Southeast Asia, whose fall

It is true Southeast Asia witnessed a massive cultural and demographic upheaval ~1000 A.D. Anyone who wishes to read about this period’s impact on modern Southeast Asia should get a copy of Victor Lieberman’s Strange Parallels. A thumbnail sketch of what occurred is that Thai invasions from southern China transformed what are today the nations of Thailand and Laos from being zones of Mon-Khmer civilization to a synthetic one, where the Thai absorbed many elements of the Theravada Buddhist culture of their predecessors while maintaining an ethno-linguistic distinctiveness. To the west the Burmese were already in the process of absorbing their Mon predecessors when the Thai arrived and established the Shan polities in the hinterland. Ultimately the incipient Burmese polity survived the Thai assaults, though it retained a Thai minority, albeit one integrated through Theravada Buddhism. Finally, in Vietnam you had a situation where the Kinh, the Vietnamese, were shielded to a great extent from the Thai invasions by geography, and engaged in their own expansion south toward the Mekong delta at the expense of the Khmer.

Why does any of this matter? Because the straightforward interpretation to me of the text above from the paper is that ~80% of the ancestry of modern Cambodians is actually from the Thai. This is difficult to credit, as the standard model is that in fact there was greater assimilation of Mon and Khmer to Thai identification. It could be that continuous gene flow resulted in the demographic turnover, but from what I can tell this should not show up so strongly with the tests being utilized here. The standard model is that Austro-Asiatic rice farmers arrived from southern China nearly ~4,000 years ago, and assimilated a Melanesian-like population. The Thai migrations provided an overlay across the highlands of Burma and in Thailand and Laos, but it assimilated a large and substantial non-Thai peasant substrate. In eastern Thailand this assimilation of Khmer rice farmers has continued to occur down to the present. But perhaps the history is wrong somewhere. If confirmed by future analyses, then the historians and archaeologists may need to look at their inferences with fresh eyes.

One of the ways that the press is spinning these results is that inter-ethnic admixture was extremely common in the past. This is too simple a model, and in fact I suspect that isolation-by-distance gene flow between neighboring groups was the default for most of human history. These pulse admixture events show up against the background of conventional and boring variation because they’re atypical, albeit not rare. Associations between geographically disparate groups are fascinating, because they illustrate the power of human technology (the horse) and organization (reproductive advantage). The future is going to be synthesizing this sort of natural science with history and economics, to construct a fully textured model of the past where the normal is perturbed by bursts of atypicality.

Citation: DOI: 10.1126/science.1243518.

Addendum: Finally, I should add that many of the low frequency admixtures that they see and do not explain with any clarity have reasonable explanations. Perhaps the authors did not elaborate due to constraints of space, or simply because they did not wish to engage in excessive speculation. But to me it seems obvious that West Eurasian admixture in places like Mongolia makes a lot more sense when you remember that the Mongols had to employ Christian priests to serve their Alan mercenaries. As for West Asian admixture on the North China plain, Sogdians were common as a “middleman minority” during the Tang dynasty, while the Mongol Yuan famously brought in many Muslims from Central Asia.

• Category: History, Science • Tags: Admixture 
Father of Lies


More often than not the discipline of history seems to swing between the true and trivial (or perhaps more precisely, picayune), and grand narratives which emphasize a nearly fictionalized story. In some ways this is not entirely a problem. When teaching young children the history of the United States a punctilious adherence to fact is essential, but, one can not deny that the selection of topicality can sway and shade the direction of the lessons learned. But far too often this ideological element of the historical narrative determines the central focus, rather than floating along the margins. With erudite command of detail historical scholars can, if they so choose, engage in a game of ideological sophistry, cultural flattery, and underhanded polemic. Both Howard Zinn and David Barton were and are players at this game. But there are still those who engage in the Sisyphean task of perceiving the world as it is, not as we would wish it to be, through the dark glass. Such a colossal enterprise, to ascertain the objective character of an exceedingly complex phenomenon, requires every tool at hand. Historians have traditionally been hunters of musty texts in neglected libraries, but they have on many an occasion received auxiliary data from scholars working in more material domains, such as archaeologists and engineers. Today you must add geneticists to the growing brigade of scholars attempting to excavate the past.

It is known


In truth the power of genetics is most evident and necessary in areas where history is silent, before written records can build a narrative skeleton in which we can play. Using both modern and ancient DNA samples the geneticists, working with archaeologists, can still make vague inferences where before there was only darkness. But illumination can be had even in time periods when historical records are quite good. Though the public understands evolution to transpire over eons, basic population genetic processes occur over a matter of generations, and so can give us fresh insight into dynamics which played out quite recently in time. A new paper in PLoS GENETICS, Reconstructing the Population Genetic History of the Caribbean, does just this. Obviously we already have a history of the Caribbean. As every schoolboy knows it began in 1492, and proceeded across the centuries as a palimpsest, as European colonial powers, and later independent nations, rose and fell. But history is more than just wars, international congresses, and once-in-a-generation discoveries. It is the ebbing and flowing of peoples themselves in their aggregate masses. Conventional textual narratives and coarse archaeological inferences can get us rather far. See Charles C. Mann’s magisterial 1493 for an example. But historical population genetics goes a step further, as it attempts to infer demography through patterns of variation in genes, the most elemental instrumental variable for tracing demographic patterns one might imagine.

What the above paper does is reiterate, emphasize, and clarify, particular population genetic demographic events which have been suspected. First, the Amerindian populations were not static creatures in equilibrium with nature, but dynamic. There is clear evidence in these results that some groups migrated from South America to Central America, and especially the Caribbean. This is not unreasonable a priori, but far too often our stylized models presume the Amerindian population as a homogeneous, uniform, almost ahistorical substrate upon which European agency and African tragedy can unfold. But on the contrary, the peoples of the New World had their own history, oral as it may be. As you can see the Maya, one of the most iconic of Amerindian peoples, seem to exhibit some southern affinities, perhaps the result of an ancient “back migration.” If the Old World is any guide there may have been many forward and back migrations.

This ancient legacy is evident in the admixed populations, the Mestizos, Zambos, and Mulattos of the Greater Caribbean region. Looking in particular at the Puerto Rican and Dominican populations you see low, but significant, levels of admixture from specific native groups. On the one hand you may not be surprised, but it must be stated that before the genetic evidence there was much skepticism as to whether any Amerindian genetic heritage persisted in the populations of the Caribbean. A particular style of cultural/humanistic scholar intuited that perhaps an emphasis on indigenous ancestry was a mechanism for people of some African ancestry to deflect attention away from this aspect of their heritage because of the fraught history of slavery. Though the internal logic here seems reasonable, the empirical evidence makes it clear that the legacy of the Amerindians does persist in these islands, among these peoples. Their motive may have been unpalatable, but their argument was right.

Who were these Amerindian people? And how did they become integrated with the synthetic populations which came to dominate these islands? This is where textual history and genetics operate in a complementary fashion. Both history and ethnography document mass population collapse in concert with an androcide of the Amerindians. By this, I mean that European males took Amerindian women as concubines, and engaged in de facto polygyny in the New World. Hernan Cortes, conqueror of the Aztecs, illustrates this phenomenon, as he had an illegitimate son, Martin Cortes, with his native translator, and later on a legitimate son, another Martin Cortes, with a Spanish noblewoman. This pattern of sexual liberty and license was common in the early years, and has been extensively documented by historians. It is a reason many anthropologists give for the relatively low rates of legitimacy in much of Latin America. And of course what applied to Amerindian females also applied to African females. What the genetics makes clear is that this asymmetric pattern of cultural power relations was demographically very significant. Populations with a near total lack of Amerindian and African Y chromosomal lineages, passed from father to son, may still have high levels of non-European mtDNA, passed from mother to daughter. In this study they also looked at the X chromosome, which spends 2/3 of its time in females, and did find an enrichment of Amerindian ancestry there as well.

But they didn’t just focus on the nature of admixture today, they inferred its history. The technique is rooted in basic concepts in genetics. When you have chromosomes come together from parents in a child, those are distinct and identical in nature to segments of ancestry one might find in parents. But genetic recombination in the next generation shuffles the segments, so that parental elements become mixed together on the same segment. When parents are from different geographic populations you see alternating segments of “ancestry tracts.” For example a chromosomal segment with alternating regions of European, Amerindian, and African ancestry. Because there are only 20-30 recombination events per generation per individual the distribution of the length of these tracts is a function of the length of time since admixture. The early years after admixture will be characterized by long blocks of ancestry from one population, alternated with another. As time passes the segments will get smaller, and alternate much more rapidly. What the authors found was that indeed Amerindian segments exhibited the latter pattern, while European and African segments were more diverse in their distribution. The distinction was strongest in the Caribbean populations, but was evident elsewhere. The explanation is the one above. The early years of Iberian settlement were characterized by de facto polygyny and decimation of male Amerindians through enslavement (though there was population collapse more generally due to disease). Amerindian ancestry came in one singular pulse, and slowly dissipated and distributed itself through the population.
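To make the link between tract length and time concrete, here is a minimal sketch in Python. It rests on a deliberately crude assumption that is mine, not the paper's: ancestry breakpoints accumulate at roughly one per Morgan per generation, so tracts introduced g generations ago average about 100/g centimorgans. The function names are hypothetical.

```python
# Crude single-pulse approximation (an assumption for illustration, not the
# paper's model): breakpoints accumulate at ~1 per Morgan per generation,
# so tracts introduced g generations ago average about 100 / g centimorgans.


def mean_tract_length_cm(generations):
    """Approximate mean ancestry tract length in centimorgans."""
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 100.0 / generations


def generations_from_mean_length(mean_length_cm):
    """Invert the approximation: estimate generations since admixture."""
    if mean_length_cm <= 0:
        raise ValueError("mean tract length must be positive")
    return 100.0 / mean_length_cm


if __name__ == "__main__":
    for g in (2, 5, 10, 20):
        print(f"{g:>2} generations: ~{mean_tract_length_cm(g):.1f} cM mean tract")
```

The sketch ignores the fact that a breakpoint between two segments of the same ancestry is invisible, so real tracts run somewhat longer than this suggests. The qualitative point survives: an old single pulse leaves uniformly short tracts, while ongoing or recent gene flow leaves a mix of short and long ones.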

Finally, the results here also yield the finding that Latin American European ancestry seems to have diverged from its parent source. A detailed exploration of the technical issues can be found at the Haldane’s Sieve weblog, but I will say I am convinced that the authors have made a good, if not definitive, case for the proposition that the Latin American ancestral component is one which has diverged significantly. Again, the reason was listed above: de facto polygyny. This drives down the effective population size, increases the drift, and skews the allele frequency distribution rapidly away from the source population. If this is a true result it shows us the possibilities for how new populations can arise through fission and rapid expansion. In particular, they may be male mediated. For this period, from 1500-1900, we have extensive documentation to corroborate the broad inferences made. But not so for many regions deep into the past. What these sorts of papers illustrate is the fine-grained power of genetics in shedding light on topics and issues which might otherwise have remained off limits. In particular genetics taps into some of the most primal activities of humankind, those that lead to procreation.
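The effect of a skewed breeding sex ratio on drift can be sketched with the textbook formula for effective population size with unequal numbers of breeding males and females, Ne = 4·Nm·Nf / (Nm + Nf). The head counts below are hypothetical numbers chosen only for illustration; nothing here is taken from the paper.

```python
# Textbook sex-ratio formula for effective population size, plus the
# per-generation variance in allele frequency under pure drift.
# All head counts used below are hypothetical.


def effective_size(breeding_males, breeding_females):
    """Ne = 4 * Nm * Nf / (Nm + Nf)."""
    if breeding_males <= 0 or breeding_females <= 0:
        raise ValueError("both counts must be positive")
    return 4.0 * breeding_males * breeding_females / (breeding_males + breeding_females)


def drift_variance(allele_freq, ne):
    """One-generation variance of an allele frequency: p(1 - p) / (2 Ne)."""
    return allele_freq * (1.0 - allele_freq) / (2.0 * ne)


if __name__ == "__main__":
    balanced = effective_size(500, 500)   # equal numbers of fathers and mothers
    skewed = effective_size(100, 900)     # few males fathering most children
    print("balanced Ne:", balanced)
    print("skewed Ne:  ", skewed)
    print("drift variance at p = 0.5:",
          drift_variance(0.5, balanced), "vs", drift_variance(0.5, skewed))
```

With the same 1,000 breeding adults, concentrating paternity in 100 males cuts the effective size to roughly a third of the balanced case, which is the sense in which polygyny speeds up the divergence of allele frequencies.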

Citation: Moreno-Estrada A, Gravel S, Zakharia F, McCauley JL, Byrnes JK, et al. (2013) Reconstructing the Population Genetic History of the Caribbean. PLoS Genet 9(11): e1003925. doi:10.1371/journal.pgen.1003925

Citation: Genetic Evidence for Recent Population Mixture in India
Moorjani et al.

The Pith: In India 5,000 years ago there were the hunter-gatherers. Then came the Dravidian farmers. Finally came the Indo-Aryan cattle herders.

There is a new paper out of the Reich lab, Genetic Evidence for Recent Population Mixture in India, which follows up on their seminal 2009 work, Reconstructing Indian Population History. I don’t have time right now to do justice to it, but as noted this morning in the press, it is “carefully and cautiously crafted.” Since I am not associated with the study, I do not have to be cautious and careful, so I will be frank in terms of what I think these results imply (note that confidence in many assertions below is modest). Though less crazy in a bald-faced sense than another recent result which came out of the Reich lab, this paper is arguably more explosive because of its historical and social valence in the Indian subcontinent. There has been a trend over the past few years of scholars in the humanities engaging in deconstruction and intellectual archaeology which overturns old historical orthodoxies, understandings, and leaves the historiography of a particular topic of study in a chaotic mess. From where I stand the Reich lab and its confederates are doing the same, but instead of attacking the past with cunning verbal sophistry (I’m looking at you postcolonial “theorists”), they are taking a sledge-hammer of statistical genetics and ripping apart paradigms woven together by innumerable threads. I am not sure that they even understand the depths of the havoc they’re going to unleash, but all the argumentation in the world will not stand up to science in the end, we know that.

Since the paper is not open access, let me give you the abstract first:

Most Indian groups descend from a mixture of two genetically divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners, Caucasians, and Europeans; and Ancestral South Indians (ASI) not closely related to groups outside the subcontinent. The date of mixture is unknown but has implications for understanding Indian history. We report genome-wide data from 73 groups from the Indian subcontinent and analyze linkage disequilibrium to estimate ANI-ASI mixture dates ranging from about 1,900 to 4,200 years ago. In a subset of groups, 100% of the mixture is consistent with having occurred during this period. These results show that India experienced a demographic transformation several thousand years ago, from a region in which major population mixture was common to one in which mixture even between closely related groups became rare because of a shift to endogamy.

Young Stalin

I want to highlight one aspect which is not in the abstract: the closest population to the “Ancestral North Indians”, those who contributed the West Eurasian component to modern Indian ancestry, seem to be Georgians and other Caucasians. Since Reconstructing Indian Population History many have suspected this. I want to highlight in particular two genome bloggers, Dienekes and Zack Ajmal, who’ve prefigured that particular result. But wait, there’s more! The figure which I posted at the top illustrates that it looks like Indo-European speakers were subject to two waves of admixture, while Dravidian speakers were subject to one!

The authors were cautious indeed in not engaging in excessive speculation. The term “Indo-Aryan” only shows up in the notes, not in the body of the main paper. But the historical and philological literature is referenced:

The dates we report have significant implications for Indian history in the sense that they document a period of demographic and cultural change in which mixture between highly differentiated populations became pervasive before it eventually became uncommon. The period of around 1,900–4,200 years BP was a time of profound change in India, characterized by the deurbanization of the Indus civilization, increasing population density in the central and downstream portions of the Gangetic system, shifts in burial practices, and the likely first appearance of Indo-European languages and Vedic religion in the subcontinent. The shift from widespread mixture to strict endogamy that we document is mirrored in ancient Indian texts. [notes removed -Razib]

How does this “deconstruct” the contemporary scholarship? Here’s an Amazon summary of a book which I read years ago, Castes of Mind: Colonialism and the Making of Modern India:

When thinking of India, it is hard not to think of caste. In academic and common parlance alike, caste has become a central symbol for India, marking it as fundamentally different from other places while expressing its essence. Nicholas Dirks argues that caste is, in fact, neither an unchanged survival of ancient India nor a single system that reflects a core cultural value. Rather than a basic expression of Indian tradition, caste is a modern phenomenon–the product of a concrete historical encounter between India and British colonial rule. Dirks does not contend that caste was invented by the British. But under British domination caste did become a single term capable of naming and above all subsuming India’s diverse forms of social identity and organization.

The argument is not totally fallacious, as some castes are almost certainly recent constructions and interpretations, with fictive origin narratives. But the deep genetic structure of Indian castes, which goes back ~4,000 years in some cases, falsifies a strong form of the constructivist narrative. The case of the Vysya is highlighted in the paper as a population with deep origins in Indian history. Interestingly they seem to be a caste which has changed its own status within the hierarchy over the past few hundred years. Where the postcolonial theorists were right is that caste identity as a group in relation to other castes was somewhat flexible (e.g., Jats and Marathas in the past, Nadars today). Where they seem to have been wrong is the implicit idea that many castes were an ad hoc crystallization of individuals only bound together by common interests relatively recently in time, and in reaction to colonial pressures. Rather, it seems that the colonial experience simply rearranged pieces of the puzzle which had deep indigenous roots.

Indra, slayer of Dasas? Credit: Gnanapiti

Stepping back in time from the early modern to the ancient, the implications of this research seem straightforward, if explosive. One common theme in contemporary Western treatments of the Vedic period is to interpret narratives of ethnic conflict coded in racialized terms as metaphor. So references to markers of ethnic differences may be tropes in Vedic culture, rather than concrete pointers to ancient socio-political dynamics. The description of the enemies of the Aryans as dark skinned and snub-nosed is not a racial observation in this reading, but analogous to the stylized conflicts between the Norse gods and their less aesthetically pleasing enemies, the Frost Giants. The mien of the Frost Giants was reflective of their symbolic role in the Norse cosmogony.


What these results imply is that there was admixture between very distinct populations in the period between 0 and 2000 B.C. By distinct, I mean to imply that the last common ancestors of the “Ancestral North Indians” and “Ancestral South Indians” probably date to ~50,000 years ago. The population in the Reich data set with the lowest fraction of ANI are the Paniya (~20%). One of those with higher fractions of ANI (70%) are Kashmiri Pandits. It does not take an Orientalist with colonial motives to infer that the ancient Vedic passages which are straightforwardly interpreted in physical anthropological terms may actually refer to ethnic conflicts in concrete terms, and not symbolic ones.

Finally, the authors note that uniparental lineages (mtDNA and Y) seem to imply that the last common ancestors of the ANI with other sampled West Eurasian groups date to ~10,000 years before the present. This leads them to suggest that the ANI may not have come from afar necessarily. That is, the “Georgian” element is a signal of a population which perhaps diverged ~10,000 years ago, during the early period of agriculture in West Asia, and occupied the marginal fringes of South Asia, as in sites such as Mehrgarh in Balochistan. A plausible framework then is that expansion of institutional complexity resulted in an expansion of the agriculture complex ~3,000 B.C., and subsequent admixture with the indigenous hunter-gatherer substrate to the east and south during this period. One of the components that Zack Ajmal finds through ADMIXTURE analysis in South Asia, with higher fractions in higher castes even in non-Brahmins in South India, he terms “Baloch,” because it is modal in that population. This fraction is also high in the Dravidian speaking Brahui people, who coexist with the Baloch. It seems plausible to me that this widespread Baloch fraction is reflective of the initial ANI-ASI admixture event. In contrast, the Baloch and Brahui have very little of the “NE Euro” fraction, which is found at low frequencies in Indo-European speakers, and especially higher castes east and south of Punjab, as well as South Indian Brahmins. I believe that this component is correlated with the second, smaller wave of admixture, which brought the Indo-European speaking Indo-Aryans to much of the subcontinent. The Dasas described in the Vedas are not ASI, but hybrid populations. The collapse of the Indus Valley civilization was an explosive event for the rest of the subcontinent, as Moorjani et al. report that all indigenous Indian populations have ANI-ASI admixture (with the exception of Tibeto-Burman groups).

Overall I’d say that the authors of this paper covered their bases. Though I wish them well in avoiding getting caught up in ideologically tinged debates. Their papers routinely result in at least one email to me per week, ranging from confusion to frothing-at-the-mouth.

Related: The Gift of the Gopi.

Citation: Moorjani et al., Genetic Evidence for Recent Population Mixture in India, The American Journal of Human Genetics (2013),

(Republished from Discover/GNXP by permission of author or representative)

Christopher Columbus

A few years ago there was a minor controversy when some evolutionary genomicists reported that they had reconstructed the genome of the extinct Taino people of Puerto Rico by reassembling fragments preserved in contemporary populations long since admixed. The controversy had to do with the fact that some individuals today claim to be Taino, and therefore, they were not an extinct population. Though that controversy eventually blew over, the methods lived on, and continue to be used. Now some of the same people who brought you that have come out with work which reconstructs the recent demographic history of the Caribbean, both maritime and mainland, using genomics. Even better, it’s totally open access because it’s up on arXiv, Reconstructing the Population Genetic History of the Caribbean (please see the comments at Haldane’s Sieve as well, kicked off by little old me). Though the authors pooled a variety of data sets (e.g., HapMap, POPRES, HGDP) the focus is on the populations highlighted in the map above.

Much of the novel insight in the results begins with their observation of a distinct “Latino” population genetic cluster with strong affinities with Europe within the Caribbean populations. This is clearly visible in their ADMIXTURE analysis. What they did was pool various populations, and run a method which decomposes the ancestry of each individual as a combination of K ancestral populations. In cases where the pooled populations are clear and distinct the results will be clear and distinct. For example, if you had 50 Finns and 50 Nigerians and pooled them, and ran ADMIXTURE at K = 2, then with a non-trivial number of SNPs (10,000 is more than sufficient) all the Finns and Nigerians will partition into two distinct ancestral populations according to these sorts of model based clustering. But it always has to be remembered that though these methods map onto reality, and give us some sense of the variation within the data sets, the K’s themselves are artificial constructs. So, for example, the HGDP Maya population is known to have non-trivial European gene flow. If you use this sort of Maya population as your “Native American” reference, then you will underestimate Native ancestry in admixed groups because your reference Native population is already skewed toward Europeans (this is obviously a major problem when you don’t have the appropriate reference because it is extinct, such as with the Taino).

With those cautionary preliminaries out of the way what’s going on in these results? As you can see many of the Caribbean populations are straightforward combinations of various continental ‘parent’ populations. This is clearly evident in K = 3, where green = Africa, red = European, and blue = Native (note that the Maya have a range of European ancestry just as I said). By looking at individual variation within populations you can already gain some insights as to the nature of the admixture. In Mexico there is a wide range of the European vs. Native fraction, though in this data set there are no “pure” individuals. Additionally, there are low, but relatively even, amounts of African ancestry across the population. Though African consciousness is not a major element of modern Mexican national identity, people of African ancestry were a major part of the Spanish colonial enterprise (see Empire: How Spain Became a World Power, 1492-1763). In some areas, such as Veracruz, people of visibly African ancestry remain, but in much of Mexico these individuals intermarried and their physical characteristics were diluted to the point of not being visible.

The situation in the maritime Caribbean is somewhat more complex. In these contexts it was the Native, not African, ancestry which was subsumed and submerged. It is genomics which has ‘rediscovered’ this ancestry, to the extent that many scholars had previously been skeptical of the possibility that modern Puerto Ricans and Dominicans inherited a substantial share of Taino ancestry. In both Puerto Rico and the Dominican Republic the relevant issue is that there is a wide range of proportion of African and European ancestry, with Cuba being the notable extreme case of this phenomenon. What’s going on with Cuba in particular is that there were late waves of migration from Spain, so some modern white Cubans are much less affected by admixture than other Caribbeans (remember that Cuba was part of Spain until 1898). In Haiti the situation is reversed, where the revolutions of the late 18th and early 19th centuries had a racial tinge, and whites were expelled (leaving a small mulatto class).

But it is K = 8 where things really get interesting. The black component is a European Iberian-like element which is distinct to Latino populations (including Maya). As you can see on this PCA the Latino element is related to the Iberian populations, as they took the European segments from the Caribbean populations and used them to flesh out the distribution in ancestry. There are several ways to interpret this. Dienekes suggested this might simply be a function of the source Iberian populations hundreds of years ago being somewhat different from the contemporary ones. For example, obviously contemporary Spaniards would be more subject to gene flow with other Europeans after 1600 than their New World cousins. Another possibility is that there was extreme sampling from a particular region of Spain, and that has now broken out as its own cluster. For example, I know that a disproportionate number of migrants were from Andalucia and Extremadura. But the pattern here doesn’t suggest to me that possibility (the black dots should be more south-shifted I would think if they were from those two provinces).

Rather, the interpretation they seem to favor is that this element has drifted away from the ancestral populations due to a bottleneck. This is not ethnographically implausible; the early years of the Spanish colonial experiment were characterized by de facto polygyny. Many adventurers lived lives not unlike those of the white grandees of the East India Company in the late 18th century. Some have argued that this period of ubiquitous common law polygyny has influenced the fact that illegitimate births have traditionally been very common in Latin America. One reason the authors favor the bottleneck model is that the genetic distance between the Latino element and the Iberian one is rather high. This is common in situations where a drift/bottleneck has shifted allele frequencies particularly rapidly. Not only that, but the tendency is strongest in maritime Latin America, many of whose islands received relatively fewer subsequent migrants than the large and expansive mainland viceroyalties.

23andMe ancestry decomposition for friend who is 1/4 Asian

Another way the authors explored the demographic history was to look at the length distribution of the tracts of ancestries. How this works is simple. A first generation hybrid will have unbroken lengths of ancestry from each parent, but subsequent generations will start to have fragmentation occur as recombination breaks apart long blocks identical by descent. You can see this in the figure to the left, where my friend who has one Asian grandparent has blocks of alternating European and Asian ancestry because of meiotic recombination events. The longer the time since admixture, the smaller the blocks will become, as recombination slices apart long blocks and recombines ancestral components. By looking at the distribution and mix of lengths the authors can construct demographic histories of the populations. In short it looks like much of the European ancestry came in one short quick pulse, rather early on in settlement. This is in keeping with the high reproductive output attested for European males thanks to polygyny during this period.

The same method was performed for the African ancestry, and the authors discovered an intriguing result. It seems that in the early years most of the Caribbean black slaves were derived from the western tip of Sub-Saharan Africa, from the Senegal river down to modern Ghana. Later on the longer tracts show affinities with populations further east, from the Bight of Benin toward the Equator. I don’t know the history of slavery well enough to confirm or deny the reality of this finding, but it illustrates the power of genomics combined with wide sampling strategies. More relevantly I suspect genomics’ role will be to assign magnitudes to known dynamics.

Finally, the authors also inferred diverse relationships for the Native admixture in the Caribbean populations. They confirmed some evidence of south-to-north migration into Central and Caribbean America, and also specific ethno-linguistic associations between now de facto extinct Caribbean populations and those of mainland South America. Some of these results have long been suggested, but lack of historical documentation makes inferences shadowy. Genomics cannot resolve these debates, but it sheds light upon them.

Overall this is an interesting study because I think it is a test run at the sort of historical-demographic questions that genomics will be used for. There has long been a ‘genetics as a tool’ school of thought among many ecologists and phylogeneticists, and now you shall have a ‘genomics as a tool’ to sit right alongside that in many more diverse fields. Caribbean and Latin American populations are the low hanging fruit, because the Spanish and Portuguese colonial experiments are reasonably well attested, and the source populations are very distinct (so easy to pick signal out of the noise). But there are other historical questions of the same period which are also of interest. In Albion’s Seed David Hackett Fischer describes four Anglo-American folkways which contributed to the culture of this nation. Of these, ~20,000 Puritans arrived between 1620-1640 and became the ancestors of ~700,000 by 1970. Though 20,000 is not quite a bottleneck (in fact, they arrived from different sectors of England), I am curious if these individuals, a segment of “Old Americans,” can still be discerned in the genomic data. This is just one of many possible questions which will be within reach of an answer in the near future….

Citation: arXiv:1306.0558
(Republished from Discover/GNXP by permission of author or representative)

My initial inclination in this post was to discuss a recent ordering snafu which resulted in many of my friends being quite peeved at 23andMe. But browsing through their new ‘ancestry composition’ feature I thought I had to discuss it first, because of some nerd-level intrigue. Though I agree with many of Dienekes’ concerns about this new feature, I have to admit that at least this method doesn’t give out positively misleading results. For example, I had complained earlier that ‘ancestry painting’ gave literally crazy results when they weren’t trivial. It said I was ~60 percent European, which makes some coherent sense in their non-optimal reference population set, but then stated that my daughter was >90 percent European. Since 23andMe did confirm she was 50% identical by descent with me these results didn’t make sense; some readers suggested that there was a strong bias in their algorithms to assign ambiguous genomic segments to ‘European’ heritage (this was a problem for East Africans too).

Here’s my daughter’s new chromosome painting:

One aspect of 23andMe’s new ancestry composition feature is that it is very Eurocentric. But, most of the customers are white, and presumably the reference populations they used (which are from customers) are also white. Though there are plenty of public domain non-white data sets they could have used, I assume they’d prefer to eat their own data dog-food in this case. But that’s really a minor gripe in the grand scheme of things. This is a huge upgrade from what came before. Now, it’s not telling me, as a South Asian, very much. But, it’s not telling me ludicrous things anymore either!

But in regards to omission I am curious to know why this new feature rates my family as only ~3% East Asian, when other analyses put us in the 10-15% range. The problem with very high values is that South Asians often have some residual ‘eastern’ signal, which I suspect is not real admixture, but is an artifact. Nevertheless, northeast Indians, including Bengalis, often have genuine East Asian admixture. On PCA plots my family is shifted considerably toward East Asians. The signal they are picking up probably isn’t noise. Almost every apportionment of East Asian ancestry I’ve seen for my family yields a greater value for my mother, and that holds here. It’s just that the values are implausibly low.

In any case, that’s not the strangest thing I saw. I was clicking around people who I had “shared” genomes with, and I stumbled upon this:

As you can guess from the screenshot this is Daniel MacArthur’s profile. And according to this ~25% of chromosome 10 is South Asian! On first blush this seemed totally nonsensical to me, so I clicked around other profiles of people of similar Northern European background…and I didn’t see anything equivalent.

What to do? It’s going to take more evidence than this to shake my prior assumptions, so I downloaded Dr. MacArthur’s genotype. Then I merged it with three HapMap populations, the Utah whites (CEU), the Gujaratis (GIH), and the Chinese from Denver (CHD). The last was basically a control. I pulled out chromosome 10. I also added Dan’s wife Ilana to the data set, since I believe she got typed with the same Illumina chip, and is of similar ethnic background (i.e., very white). It is important to note that only 28,000 SNPs remained in the data set. But usually 10,000 is more than sufficient on SNP data for model-based clustering with inter-continental scale variation.

I did two things:

1) I ran ADMIXTURE at K = 3, unsupervised

2) I ran an MDS, which visualized the genetic variation in multiple dimensions
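For readers who want to see what the MDS step does, here is a minimal pure-Python sketch of classical MDS restricted to its leading axis. Everything in it is an illustrative assumption: the two toy populations are simulated rather than drawn from HapMap, the function names are hypothetical, and power iteration stands in for whatever software was actually used.

```python
import math
import random


def squared_distances(genotypes):
    """Pairwise squared Euclidean distances between genotype vectors."""
    n = len(genotypes)
    return [[sum((a - b) ** 2 for a, b in zip(genotypes[i], genotypes[j]))
             for j in range(n)] for i in range(n)]


def first_mds_axis(d2, iterations=200, seed=0):
    """Leading classical-MDS coordinate, found by power iteration."""
    n = len(d2)
    row_mean = [sum(row) / n for row in d2]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D2 * J.
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return [math.sqrt(max(eigenvalue, 0.0)) * x for x in v]


def simulate_genotypes(n_individuals, allele_freqs, rng):
    """Draw 0/1/2 genotypes from two independent alleles per SNP."""
    return [[(rng.random() < p) + (rng.random() < p) for p in allele_freqs]
            for _ in range(n_individuals)]


if __name__ == "__main__":
    rng = random.Random(42)
    n_snps = 200
    pop_a = simulate_genotypes(5, [0.1] * n_snps, rng)  # toy population A
    pop_b = simulate_genotypes(5, [0.9] * n_snps, rng)  # toy population B
    axis = first_mds_axis(squared_distances(pop_a + pop_b))
    for label, coord in zip("AAAAABBBBB", axis):
        print(label, round(coord, 2))
```

On strongly differentiated toy populations like these, the two groups land on opposite sides of zero along the first axis. An individual with mixed ancestry on the region analyzed would fall somewhere in between, which is the kind of shift the plot is meant to reveal.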

Before I go on, I will state what I found: these methods supported the inference from 23andMe, on chromosome 10 Dr. MacArthur seems to have an affinity with South Asians (i.e., this is his ‘curry chromosome’). Here are the average (median) values in tabular format, with MacArthur and his wife presented for comparison.

ADMIXTURE results for chromosome 10
Sample             K 1    K 2    K 3
CEU                0.04   0.02   0.93
GIH                0.87   0.05   0.08
CHD                0.01   0.97   0.01
Daniel MacArthur   0.29   0.07   0.64
Ilana Fisher       0.01   0.06   0.94

You probably want a distribution. Out of the non-founder CEU sample none went above 20% South Asian. Though it did surprise me that a few were that high, making it more plausible to me that MacArthur’s results on chromosome 10 were a fluke:

And here’s the MDS with the two largest dimensions:

Again, it’s evident that this chromosome 10 is shifted toward South Asians. If I had more time right now what I’d do is probably get that specific chromosomal segment, phase it, and then compare it to various South Asian populations. But I don’t have time now, so I went and checked out the results from the Interpretome. I cranked up the settings to reduce the noise, and so that it would only spit out the most robust and significant results. As you can see, again chromosome 10 comes up as the one which isn’t quite like the others.

Is there a plausible explanation for this? Perhaps Dr. MacArthur can call up a helpful relative? From what I recall his parents are immigrants from the United Kingdom, and it isn’t unheard of that white Britons do have South Asian ancestry which dates back to the 19th century. Though to be totally honest I’m rather agnostic about all this right now. This genotype has been “out” for years now, so how is it that no one has noticed this peculiarity??? Perhaps the issue is that everyone was looking at the genome wide average, and it just doesn’t rise to the level of notice? What I really want to do is look at the distribution of all chromosomes and see how Daniel MacArthur’s chromosome 10 then stacks up. It might be a random act of nature yet.

Also, I guess I should add that at ~1.5% South Asian that would be consistent with one of MacArthur’s great-great-great-great grandparents being Indian. Assuming 25 year generation times that puts them in the mid-19th century. Of course, at such a low proportion the variance is going to be high, so it is quite possible that you need to push the real date of admixture one generation back, or one generation forward.
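The arithmetic behind that last paragraph is easy to check. What follows is only a minimal sketch: the function names are my own illustrative choices, and it assumes a single admixed ancestor with exact halving each generation (real inheritance is noisier, which is the variance point above).

```python
import math

def expected_fraction(generations_back):
    """Expected genome share from one ancestor this many generations back."""
    return 0.5 ** generations_back

def generations_for_fraction(fraction):
    """Generations back at which one ancestor contributes this share on average."""
    return math.log2(1.0 / fraction)

# A great-great-great-great grandparent is 6 generations back
# (parent = 1, grandparent = 2, then four "greats").
share = expected_fraction(6)                # 1/64
implied = generations_for_fraction(0.015)   # starting from ~1.5%
years_back = 6 * 25                         # the 25 year generation time assumed above

print(f"{share:.4f}")      # 0.0156
print(round(implied, 2))   # 6.06
print(years_back)          # 150
```

One generation either way doubles or halves the expected share (about 3.1% or 0.8%), which is why a figure of ~1.5% cannot pin the date down very tightly.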

(Republished from Discover/GNXP by permission of author or representative)

A new press release is circulating on the paper which I blogged a few months ago, Ancient Admixture in Human History. Unlike the paper, the title of the press release is misleading, and unfortunately I notice that people are circulating it, and probably misunderstanding what is going on. Here’s the title and first paragraph:

Native Americans and Northern Europeans More Closely Related Than Previously Thought

Released: 11/30/2012 2:00 PM EST
Source: Genetics Society of America

Newswise — BETHESDA, MD – November 30, 2012 — Using genetic analyses, scientists have discovered that Northern European populations—including British, Scandinavians, French, and some Eastern Europeans—descend from a mixture of two very different ancestral populations, and one of these populations is related to Native Americans. This discovery helps fill gaps in scientific understanding of both Native American and Northern European ancestry, while providing an explanation for some genetic similarities among what would otherwise seem to be very divergent groups. This research was published in the November 2012 issue of the Genetics Society of America’s journal GENETICS


The reality is that Native Americans and Northern Europeans are not more “closely related” genetically than they were before this paper. There has been no great change to standard genetic distance measures or phylogeographic understanding of human genetic variation. A measure of relatedness is to a great extent a summary of historical and genealogical processes, and as such it collapses a great deal of disparate elements together into one description. What the paper in Genetics outlined was the excavation of specific historically contingent processes which result in the summaries of relatedness which we are presented with, whether they be principal component analysis, Fst, or model-based clustering.

What I’m getting at can be easily illustrated by a concrete example. To the left is a 23andMe chromosome 1 “ancestry painting” of two individuals. On the left is me, and the right is a friend. The orange represents “Asian ancestry,” and the blue represents “European” ancestry. We are both ~50% of both ancestral components. This is a correct summary of our ancestry, as far as it goes. But you need some more information. My friend has a Chinese father and a European mother. In contrast, I am South Asian, and the end product of an ancient admixture event. You can’t tell that from a simple recitation of ancestral quanta. But it is clear when you look at the distribution of ancestry on the chromosomes. My components have been mixed and matched by recombination, because there have been many generations between the original admixture and myself. In contrast, my friend has not had any recombination events between his ancestral components, because he is the first generation of that combination.

So what the paper publicized in the press release does is present methods to reconstruct exactly how patterns of relatedness came to be, rather than reiterating well understood patterns of relatedness. With the rise of whole-genome sequencing and more powerful computational resources to reconstruct genealogies we’ll be seeing much more of this to come in the future, so it is important that people are not misled as to the details of the implications.

(Republished from Discover/GNXP by permission of author or representative)

The Pith: You’re Asian. Yes, you!

A conclusion to an important paper, Nick Patterson, Priya Moorjani, Yontao Luo, Swapan Mallick, Nadin Rohland, Yiping Zhan, Teri Genschoreck, Teresa Webster, and David Reich:

In particular, we have presented evidence suggesting that the genetic history of Europe from around 5000 B.C. includes:

1. The arrival of Neolithic farmers probably from the Middle East.

2. Nearly complete replacement of the indigenous Mesolithic southern European populations by Neolithic migrants, and admixture between the Neolithic farmers and the indigenous Europeans in the north.

3. Substantial population movement into Spain occurring around the same time as the archaeologically attested Bell-Beaker phenomenon (HARRISON, 1980).

4. Subsequent mating between peoples of neighboring regions, resulting in isolation-by-distance (LAO et al., 2008; NOVEMBRE et al., 2008). This tended to smooth out population structure that existed 4,000 years ago.

Further, the populations of Sardinia and the Basque country today have been substantially less influenced by these events.


It’s in Genetics, Ancient Admixture in Human History. Reading through it I can see why it wasn’t published in Nature or Science: methods are of the essence. The authors review five population genetic statistics of phylogenetic and evolutionary genetic import, before moving onto the novel results. These statistics, which measure the possibility of admixture, the extent of admixture, and the date of admixture, are often presented, but nested into supplements, in previous papers by the same group. On the one hand this removes from view the engines which are driving the science. On the other hand I have always appreciated that a benefit of this injustice to the methods which make insight possible is that those without academic access can actually bite into the meat of the researcher’s mode of thought.

I did read through the methods. Twice. I’ve encountered all the statistics before, and I’ve read how they were generated, but I’ll be honest and admit that I haven’t internalized them. That has to end now, because the authors have finally released a software package which implements the statistics, ADMIXTOOLS. I plan to use it in the near future, and it is generally best if you understand the underlying mechanisms of a software package if you are at the bleeding edge of analytics. I will review the technical points in more detail in future posts, more for my own edification than yours. But for the moment I’ll be a bit more cursory. Four of the tests use comparisons of allele frequencies along explicit phylogenetic trees. That’s so general as to be uninformative as a description, but I think it’s accurate to the best of my knowledge. In the basics the tests are seeing if a model fits the data (as opposed to TreeMix, which finds the best model out of a range to fit the data). The last method, rolloff, infers the timing of an admixture event based upon the decay of linkage disequilibrium. In short, admixture between two very distinct populations has the concrete result of producing striking genomic correlations. Over time these correlations dissipate due to recombination. The magnitude of dissipation can allow one to gauge the time in the past when the original admixture occurred.
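To make the rolloff logic concrete, here is a toy sketch of the idea, not the ADMIXTOOLS implementation. It assumes the admixture correlation between markers separated by genetic distance d (in Morgans) decays as A·exp(−n·d) after n generations, and recovers n with a log-linear least-squares fit. The curve is synthetic, the function name and parameters are my own, and the 25 year generation time is just an assumption.

```python
import math

def fit_generations(distances_morgans, correlations):
    """Fit log(corr) = log(A) - n * d by least squares and return n."""
    xs = list(distances_morgans)
    ys = [math.log(c) for c in correlations]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # the slope is -n

# Synthetic, noise-free decay curve for an admixture 40 generations ago.
true_generations = 40
distances = [0.005 * i for i in range(1, 41)]  # 0.5 cM to 20 cM, in Morgans
curve = [0.1 * math.exp(-true_generations * d) for d in distances]

estimated = fit_generations(distances, curve)
years = estimated * 25  # assumed years per generation

print(round(estimated, 3), round(years))
```

Real data are noisy, so the actual method works with a weighted statistic and reports standard errors. The sketch is only meant to show why a faster decay of the correlation with distance implies an older admixture.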

Let’s look at some results. To the left is a section of a table which illustrates the most significant 3-population test scores in the HGDP. The authors checked all the various combinations, and these came out at the top as likely admixtures (i.e., the two sources produce particular patterns in the target). Please remember that these triads should not be taken literally. The Uygur are not descended from Japanese and Italians. Rather, they are descended from populations with genetic affinities to these two sources. Precisely, the Uygurs are descended from Northeast Asian Turks, who assimilated an Indo-European speaking substratum. Most of the results are rather obvious and explicable. Several Middle Eastern populations are known to have Sub-Saharan African admixture, and this shows up in the results. Others may be more confusing because of the obscurity of the populations, but the Burusho clearly have ancient East Asian ancestry on clustering algorithms, so their presence is not surprising to me. Similarly, the Russians in the HGDP data set have an ‘eastern’ affinity (or at least some do), either due to Finno-Ugric or Turkic ancestry (Tatars regularly assimilated into a Russian ethnic identity as the Tsars expanded their domains).

Some of the other results are more confusing, but one can still find a historical explanation. I have seen evidence that some of the Cambodian samples may have old Indian admixture, though it is not entirely clear to me. But that could explain why there is a signature of West Eurasian admixture into this population (though one wonders why the donor was not Baloch or Pathan). The Xibo and Tu are Northeast Asian groups, on the border between China proper and the great Eurasian interior. West Eurasian admixture into these groups is not unexpected. West Eurasians are historically attested among the mercenaries and soldiers who arrived on the North China plain after the collapse of the Han dynasty, down to the Alans who served under Kublai Khan. Some of the Mongolian and Turkic peoples have individuals who are attested as having characteristics more typical of Europeans (e.g., red hair), so it is likely that this admixture was relatively old and widespread, well before the era of the Pax Mongolica.

There is a minor dissonant note in these results above. The authors used rolloff and inferred an admixture of ~800 years before the present. This is far lower than earlier estimates, which were >2,000 years before the present. First, I have to say that I was mildly skeptical of the higher value reported earlier. From what little I know the roiling of Turco-Mongol peoples which reordered the Inner Asian landscape did not really establish itself beyond the Chinese fringe at this time. Recall that Central Asia was the domain of the Iranians from prehistory down to the Islamic age (the full transition of Central Asia from Persianate to Turkic has not completed itself to this date, though it has progressed over the centuries since 1000 A.D.). Is it creditable that the Turkic hordes were shut on the other side of the Pamirs for ~1,000 years? Perhaps. But it should warrant skepticism, and openness to the lower values proffered here. The technical reason that the authors consider is that STRUCTURE based inferences may overestimate admixture when reference populations are not appropriate. And yet the authors still concede that 800 years is simply difficult to credit when one consults the historical literature. Strangely though it does align with the date of the Mongol ascendancy, during which time the Uygurs served as civil servants in the barbarian empire (Mongol script derives from the old Uygur script). I managed to dig up a cave painting of Uygurs from this period. There is surely artistic license, but they look rather East Asian to me, as opposed to the hybrid Eurasian appearance modal among modern Uygurs. I won’t touch upon the rather fraught and complex ethnology and ethnogenesis of modern Uygurs, and their relationship to Russian and Chinese ethnographers, but suffice it to say that one needs to be careful about excessive reliance on the literality of historical documents in this area, because of semantic confusions.

So let’s move to the main course: what’s going on in Europe? Before putting the spotlight on the macro picture, let’s highlight one secondary aspect: the authors detect evidence of massive gene flow into Spain from Northern Europe ~4,000 years before the present. I’ll let them speak here:

We hypothesize that we are seeing here a genetic signal of the ‘Bell-Beaker culture’ (HARRISON, 1980). Initial cultural flow of the Bell-Beakers appears to have been from South to North, but the full story may be complex. Indeed one hypothesis is that after an initial expansion from Iberia there was a reverse flow back to Iberia (CZEBRESZUK, 2003); this ‘reflux’ model is broadly concordant with our genetic results, and if this is the correct explanation it suggests that this reverse flow may have been accompanied by substantial population movement.

Two things to hammer home here. First, pots move with people. That’s the inference being drawn from the results. It’s not pots-not-people, it’s people-and-pots. Second, the idea of reversals in the direction of gene flow is intriguing and, I think, needs to be taken more seriously. It seems the most plausible candidates here are the people who later became the Celtiberians. Celts have been associated with the Bell Beakers before.

But the bigger shock is that Europeans, and especially Northern Europeans, seem to have a substantial Northeast Asian component. From the nature of the prose I feel that the authors were definitely taken aback. They basically say so in so many words. In the process of resolving their confusion they skinned the cat every which way. And it does look to me that Northern Europeans are truly descended in part from a population which has affinities to the “First Americans.” I say this specifically because the Siberian samples they tested actually gave a weaker result than the South American Amerindians on the 3-population test.

So what’s the proportion of ancestry? Using the Siberian population they came up with an interval of 5-18 percent in Northern Europeans. The authors used the Sardinians as their “pure” European reference, and admit that it is likely that their admixture estimate is lower than the real value due to this fact. Inference is inference, do you trust this result? As it happens the authors also checked Ötzi the Iceman, and found that like the modern Sardinians he had very little Northeast Asian ancestry. Ötzi is dated to ~5,000 years before the present. Using rolloff the authors estimate an admixture date of ~4,000 years before the present, with an error of nearly 1,000. Additionally, using a different data set they came up with an admixture date of ~2,000 years before the present. The latter is obviously wrong (they explain why this could happen in the text). But Ötzi seems to put a boundary on how early it could have been, at least in Southern Europe.

As of publication the authors did not have time to include a reference to this interesting nugget from the abstracts of ASHG 2012:

The complete genome of the 5,300 year old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has recently been published and yielded new insights into his origin and relationship to modern European populations. A key finding of this study has been an apparent recent common ancestry with individuals from Southern Europe, in particular Sardinians…We used unpublished data from whole genome sequencing of 452 Sardinian individuals, together with publicly available data from Complete Genomics and the 1000 Genomes project, to confirm that the Iceman is most closely related to contemporary Sardinians. An analysis of these data together with ancient DNA data from a recently published study on Neolithic farmers and hunter-gatherers from Sweden shows the Iceman most closely related to the farmer individual, but not the hunter-gatherers, with the Sardinians again being the contemporary Europeans with the highest affinity. Strikingly, an analysis including novel ancient DNA data from an early Iron Age individual from Bulgaria also shows the strongest affinity of this individual with modern-day Sardinians. Our results show that the Tyrolean Iceman was not a recent migrant from Sardinia, but rather that among contemporary Europeans, Sardinians represent the population most closely related to populations present in the Southern Alpine region around 5000 years ago. The genetic affinity of ancient DNA samples from distant parts of Europe with Sardinians also suggests that this genetic signature was much more widespread across Europe during the Bronze Age.

I’m betting that this Bulgarian sample won’t exhibit Northeast Asian ancestry, though who knows?

There is a definite geographic pattern within Europe to the strength of the signature of admixture. Northern European populations have the greatest, Southern European populations less, and islanders like Cypriots hardly any. Recall that Sardinians seem to be the best reference, so the ~0 floor may just be a statistical artifact of the measuring stick we have. All that being said, what went on <5,000 years before the present to reorder the European landscape?

The answer may sound crazy, but I think the most probable explanation (even if it is unlikely) is something to do with the Indo-Europeans. We know that Indo-European languages were spoken in Greece by ~1500 BC at the latest. One thing that is clear from less advanced clustering algorithms is that Basques and Finns are somewhat distinctive in relation to their neighbors. Though they are not genetically that different, they still lack some “interesting” elements. The results to the left are from Dienekes, though I’ve replicated it. You can see a similar difference between French, and French Basques. The Basques seem to lack something which has affinities with West Asia. These results, and hints elsewhere, imply that the Basque may not be descended from hunter-gatherers, but the first European farmers. So who came after them?

Though it strikes me as a bizarre conjecture, I can’t help but imagine the rapid expansion of Indo-European populations into Europe, pushing into the peninsulas of the south. These people may have been a newly formed cosmopolitan mix of West Asians, Northern European Mesolithics, and Northeast Asians. I am at a loss to hazard a guess as to who the First American-like Northeast Asians were, though perhaps they were a western offshoot of the Kets? These people were then absorbed into a melange of tribes who themselves emerged from a synthesis between immigrant West Asian farmers and Northern Europeans. In shorthand: perhaps the Indo-Europeans were mongrels! This is not an entirely crazy proposition if you look at the historical record. Conquest populations often synthesized and absorbed those who they conquered. Sometimes they even became the conquered in deep cultural ways (e.g., the Bulgars).

To ward off accusations of glib and facile speculations, I well understand that much of what I suggest above is likely wrong. But bizarre results are going to elicit unhinged hypotheses. And I shouldn’t overplay how strange these results are, I think they are going to stand the test of time. The authors are top notch, and Dr. Joseph Pickrell found the same pattern (a connection between Europeans and Native Americans) with TreeMix! If we sit back and reflect on phenotype it shouldn’t be entirely surprising. Some Scandinavians have always struck me as having a generalized Eurasian cast to their features. Obviously this tendency is stronger among the Sami and Finns, but you can see it in Swedes and others. This is far less evident to me among Southern European peoples. I doubt one would ever confuse a Sardinian for a Eurasian, and I never had that feeling when I spent some time in Italy a few years back (in contrast, some Finns did look Asiatic to me).

Finally, this paper highlights the reality that population genetics has little to do with Plato. A population within a species is simply not clear and distinct in a sense which would satisfy an Idealist. The authors of the above paper nod to this, illustrating how their tests for admixture are confounded and confused by constant gene flow via isolation-by-distance dynamics. These results indicate that Northern Europeans are on the order of 10% Northeast Asian. Does this mean that Northern Europeans are 10% non-white? Well, it turns out that white people were always 10% non-white! We just didn’t know. Is my daughter (who is 50% Northern European) now majority non-white? Oh wait, I’m South Asian. That means I’m ~50% white! Is my friend who is 25% Japanese now more than 25% Northeast Asian? Words and concepts fail us on the boundary of unfamiliarity, in time and space. Populations and genealogies don’t brook our categorizations. On a deep level we are all admixtures, and partitionings of ancestry along phylogenetic trees are useful and comprehensible fictions. These techniques put flesh upon the bones of archaeology and smoke out the outlines of history. But we always need to be aware that that history is not made by humans, rather, we are excavating it, and then giving it appropriate glosses in our museums. And yet it is.

Related: Dienekes has much to say (obviously).

Image credit: Wikipedia, Wikipedia, and Wikipedia.

Cite: 10.1534/genetics.112.145037

(Republished from Discover/GNXP by permission of author or representative)

A comment below:

Does the higher genetic diversity in sub-Saharan Africans explain why mixed children of blacks + other couples usually look more black than anything?

As in, the higher number of genetic characteristics overwhelms those of the other parent and allows them to be present in the child.

But this makes you ask: is the assumption that people with some African heritage tend to exhibit that heritage disproportionately even true? From an American perspective the answer is obviously yes. But from a non-American perspective not always. Why? Do the laws of genetics operate differently for Americans and non-Americans? I doubt it. Rather, hypodescent, and its undergirding principle of the “reversion to the primitive type” are still background assumptions of American culture. In fact today black Americans are perhaps most aggressive and explicit in outlining the logic and implications of the “one drop rule,” though non-blacks tend to accept it as an operative principle as well.

Assessing someone’s racial identity has a subjective aspect. We see through the mirror darkly, and that’s a function of the cultural preconditions of gestalt cognition. But there are some objective metrics we can look at. Foremost among them is skin color.

The locus SLC24A5 is probably the largest effect gene which impacts normal human variation in skin color. It is responsible for 25-40% of the difference in pigmentation between Europeans and Africans. If you have the genotype AA you are lighter than if you are AG, and if you are AG you are lighter than GG. Almost all Africans are GG, and almost all Europeans are AA. What’s the impact on African Americans then? Below are two panels which show the distribution of skin pigmentation using a quantitative metric as a function of the genotypes.

If you don’t get the significance, let me highlight the text for this figure:

Effect of SLC24A5 genotype on pigmentation in admixed populations. (A) Variation of measured pigmentation with estimated ancestry and SLC24A5 genotype. Each point represents a single individual; SLC24A5 genotypes are indicated by color. Lines show regressions, constrained to have equal slopes, for each of the three genotypes. (B) Histograms showing the distribution of pigmentation after adjustment for ancestry for each genotype. Values shown are the difference between the measured melanin index and the calculated GG regression line (y = 0.2113x + 30.91). The corresponding uncorrected histograms are shown in fig. S7. Mean and SD (in parentheses) are given as follows: for GG, 0 (8.5), n = 202 individuals; for AG, –7.0 (7.4), n = 85; for AA, –9.6 (6.4), n = 21.

The difference between GG and AG is 7. AG to AA is 2.6. In short you see that the allele which results in lighter skin exhibits dominance to the allele which is correlated with darker skin! Strangely, this is not an isolated fluke.
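For the numerically inclined, here is a quick way to put a number on that dominance, using only the three means from the figure legend above. This is a minimal sketch; the function name and the "share of the full effect" framing are my own illustrative choices, not something from the paper.

```python
# Mean ancestry-adjusted melanin index by SLC24A5 genotype (from the figure legend).
means = {"GG": 0.0, "AG": -7.0, "AA": -9.6}

def one_copy_share(means):
    """Fraction of the full GG-to-AA lightening achieved by a single A allele.

    0.5 would mean purely additive effects; values above 0.5 mean the
    lightening allele is partially dominant.
    """
    full_effect = means["GG"] - means["AA"]   # 9.6
    one_copy = means["GG"] - means["AG"]      # 7.0
    return one_copy / full_effect

share = one_copy_share(means)
print(round(share, 2))  # 0.73
```

So a single copy of the lightening allele delivers roughly three quarters of the total difference between the two homozygotes, rather than the half you would expect if the effects were additive.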

The results to the left are again the quantitative pigmentation values for an admixed population of mostly African descent, with some European ancestry. In this case it is for the KITLG locus, which can explain 20-25% of the difference in value between European and African populations in terms of pigmentation. The G allele here is the lightening variant. You see that GG are lighter than AG, who are lighter than AA. But what are the median values? -4.7 for GG, -2.9 for AG, and 1.9 for AA. The difference between AG and AA is 4.2. Between AG and GG is 1.8. Again you note that the lightening variant exhibits dominance on this phenotype in relation to the darker variant!

The reason I’m rehashing these results (which I’ve presented before) is that in this case the cultural norm, at least in the U.S.A., is at variance with what is the reality on the phenotypic level. I was surprised when I first saw these results, and everyone else I’ve mentioned them to is moderately surprised. Why? Naively I assume that’s because in the U.S.A. a white phenotype is “default,” and deviations from that default prototype are particularly salient. Less noticeable to us is the possibility that those deviations may still result in a physical type closer to the default prototype than other types. More colloquially someone like Colin Powell is black because of the color of his skin, not white. But objectively his skin color is closer to white than it is to black (and judging from his features his ancestry is probably more European than it is African).

Of course I’m not arguing here that all non-African features are dominant to African features. In terms of hair form it seems that this is not true. But again, this is partly an artifact of how we as humans classify the phenotype. People of mixed ancestry often have a different hair form from both their parents. It simply is the fact though that we “bin” moderately kinky and very kinky hair together into one class.

The moral of the story is that when we talk about human population differences we need to be very careful about separating the subjective from the objective. Obviously this has been a fraught domain, and the best way to move it back into respectable mainstream discourse is to bleed it of its less scientific aspects.


(Republished from Discover/GNXP by permission of author or representative)
• Category: Science • Tags: Admixture, Genetics, Race 

In The New York Times, DNA Turning Human Story Into a Tell-All:

The tip of a girl’s 40,000-year-old pinky finger found in a cold Siberian cave, paired with faster and cheaper genetic sequencing technology, is helping scientists draw a surprisingly complex new picture of human origins.

The new view is fast supplanting the traditional idea that modern humans triumphantly marched out of Africa about 50,000 years ago, replacing all other types that had gone before.

Instead, the genetic analysis shows, modern humans encountered and bred with at least two groups of ancient humans in relatively recent times: the Neanderthals, who lived in Europe and Asia, dying out roughly 30,000 years ago, and a mysterious group known as the Denisovans, who lived in Asia and most likely vanished around the same time.

Their DNA lives on in us even though they are extinct. “In a sense, we are a hybrid species,” Chris Stringer, a paleoanthropologist who is the research leader in human origins at the Natural History Museum in London, said in an interview.

First, for reasons of novelty we are emphasizing the exotic tendrils of the human family tree. Even Chris Stringer, the modern paleontological father of “Out of Africa,” is claiming we’re hybrids! But let’s not forget that non-Africans are the product of a very rapid radiation out of the margins of the Afrotropic ecozone within the last ~50-100,000 years. I am not entirely sure that this is as true of Africans (recall how extremely basal Bushmen are to the rest of humanity; they seem to have diverged well before the “Out of Africa” pulse).

Second, the old model was way easier to write about, even if there were confusions like the idea that mtDNA Eve was our only female ancestor from 200,000 years ago. The new paradigm leaves one with awkward and unhelpful turns of phrase. For example:

But Dr. Reich and his team have determined through the patterns of archaic DNA replications that a small number of half-Neanderthal, half-modern human hybrids walked the earth between 46,000 and 67,000 years ago, he said in an interview. The half-Denisovan, half-modern humans that contributed to our DNA were more recent.

How to make sense of this gibberish? I suspect that the author didn’t have a good idea how to translate a particular population genetic statistic, and its importance to assessing time since admixture, into plainer prose. I have no idea either!

In other news, io9 has an interesting interview up with Rebecca Cann and Mark Stoneking. These two were heavily involved in the mtDNA Eve controversies of the 1980s. Nice capstone to an era. Like Stringer, even they admit the likelihood of a necessity to modify the simple “Out of Africa” with replacement model.

(Republished from Discover/GNXP by permission of author or representative)

With all the talk about Basques I decided to do my own analysis with Admixture. Dienekes gave me a copy of his IBS file, which has all the 1000 Genomes Spanish samples, including Basques. I merged it with the HGDP sample, which has French Basques (just “Basques” in the plots below) and French non-Basques. I pruned most of the populations, but kept the Mozabites, which are a Berber group from Algeria. The number of markers was ~350,000, and I ran it up to K = 8, or 8 component populations. I stopped there because the components were starting to break up in a very choppy manner.

In general I do think that the idea that non-Basque Spaniards have Moorish genetic input seems supported. It isn’t definitive though. And you have to be careful, there are lower parameter values where Sardinians seem to have an affinity with Mozabites to a great extent, even more than Spaniards. But that disappears as you move up the number of K’s. But who is to say which K is the correct K? The consistent Sub-Saharan African component among non-Basque Spaniards (also evident in the Behar et al. data set) probably convinces me that there was a Moorish impact, since it is likely to have come with the Islamic conquest, and not the Phoenicians.

All the files from the Admixture run (and csv files with tabular results) are here.


(Republished from Discover/GNXP by permission of author or representative)
• Category: Science • Tags: Admixture, Basques, Personal Genomics 

Hominin increase in cranial capacity, courtesy of Luke Jostins

A few years ago a statistical geneticist at Cambridge’s Sanger Institute, Luke Jostins, posted the chart above using data from fossils on cranial capacity of hominins (the human lineage). As you can see there was a gradual increase in cranial capacity until ~250,000 years before the present, and then a more rapid increase. I should also note that from what I know about the empirical data, mean human cranial capacity peaked around the Last Glacial Maximum. Our brains have been shrinking, even relative to our body sizes (we’re not as large as we were during the Ice Age). But that’s neither here nor there. In the comments Jostins observes:

The data above includes all known Homo skulls, but none of the results change if you exclude the 24 Neandertals. In fact, you see the same results if you exclude Sapiens but keep Neandertals; the trends are pan-Homo, and aren’t confined to a specific lineage….

In other words: the secular increase in cranial capacity for our lineage extends millions of years back into the past, and also shifts laterally to “side-branches” (with our specific terminal node, H. sapiens sapiens, as a reference). This is why I often contend as an aside that humanity was to some extent inevitable. By humanity I do not mean H. sapiens sapiens, the descendants of a subset of African hominins who flourished ~100,000 years before the present, but intelligent and cultural hominins who would inevitably construct a technological civilization. The parallel trends across the different distinct branches of the hominin family tree which Luke Jostins observed indicated to me that our lineage was not special, but simply first. That is, if African hominins were exterminated by aliens ~100,000 years before the present, at some point something akin to H. sapiens sapiens in creativity and rapidity of cultural production would eventually arise (in all likelihood later, but possibly earlier!).

This does not mean that I think humanity was inevitable upon earth. For most of the history of this planet life was unicellular. I do not find it implausible that life on earth may have reached its “sell by” date due to astronomical events before the emergence of complex organisms (in fact, from what I have heard the end of life is going to occur ~1 billion years into the future due to the persistent increase in the energy output of Sol, not ~4 billion years in the future when Sol turns into a red giant). But, once complex organisms arose it does seem that further complexity was inevitable. This was Richard Dawkins’ case in The Ancestor’s Tale based simply on the descriptive record. But did the emergence of complex organisms necessarily entail the evolution of a technological species? I don’t think so. It took 500 million years for that to occur (it does not seem that coal resources formed hundreds of millions of years ago were tapped before humans). Given enough time obviously a technological species would evolve (e.g., extend the time of evaluation to 1 trillion years), but note that the earth has only ~5 billion years. Homo arrived on the scene in the last 20% of that interval.

Here I am positing at a minimum two not excessively likely or inevitable events over a 5 billion year time span which would lead to a hyper-technological and cultural species:

- The emergence of multicellular life

- The emergence of a lineage with the propensities of Homo

Once Homo evolved and expanded outside of Africa I suspect that something of the form of a technological civilization became inevitable on this planet. We see parallelism in our own short post-Pleistocene epoch. Multiple human societies shifted from hunter-gatherers to agriculturalists over the past 10,000 years. The experience of the New World civilizations in particular illustrates that human universal tendencies are real. Not only were “game changing” cultural forms such as agriculture and literacy invented independently during the Holocene, but they were not invented during earlier interglacials (at least in all likelihood).

Khufu, Necho, Augustus and Napoleon

Why not? Well, consider the cultural torpidity of Paleolithic toolkits, which might persist for hundreds of thousands of years! I suspect some of this is due to biology. But even over the Holocene we do perceive that cultural change has proceeded at a more rapid clip as time has progressed (i.e., at a minimum cultural change has been accelerating, and it may be that the rate of acceleration itself is increasing!). Consider that the civilization of ancient Egypt spanned at least 2,000 years. Though there are clear differences, the continuity between Old Kingdom Egypt and the last dynasties before the Assyrian and Persian conquests is very obvious to us, and would be obvious to ancient Egyptians. In contrast, 2,000 years separate us from Augustan Rome. The continuities here are clear as well (e.g., the Roman alphabet), but the cultural change is also clear (if you wish to argue that the early modern and modern period are sui generis, the 1,500 year interval from Augustan Rome to the Neo-Classical Renaissance would still be a stark contrast when compared against an ancient Egyptian reference*, despite the latter’s aping of the forms of the former).

So far I have focused on the vertical dimension of time. But there is also the lateral dimension, of cross-fertilization across the branches of the hominin family tree. The admixture of a Neanderthal element into non-Africans has started to become widely accepted recently, thanks to the confluence of archaeology and genomics in the field of ancient DNA. Even if one rejects the viability of Neanderthal admixture, the solution to the conundrum of these results must still entail stepping away from a simple model of recent exclusive origin of humans from a small African population. There are also hints of admixture with other archaic lineages on the Pacific fringe, and within Africa.

Until recently it was common to posit that modern humans, our own lineage, had some special genius which allowed it to sweep the field and extinguish our cousins. The qualitative result of Luke Jostins’ plot was known: other hominin lineages also exhibited encephalization. In fact, it was a curious fact that Neanderthals on average had larger cranial capacities than anatomically modern humans. But the reality remained that we replaced them, ergo, we must have a special genius. Until it was found that Neanderthals and modern humans do not differ on loci implicated in the necessary (if not sufficient) competency of language, that trait was a prime candidate for what made “us” special. But now I put “us” in quotation marks. The data do point to an overwhelming descent from an African or near-African population for non-Africans over the past 100,000 years. But the “archaic admixture” is not trivial. What was they are us, and we have become what they might have been.

For over two centuries there has been a debate in the West between monogenesis and polygenesis. The former is the position that humankind derives from one single pair or population (the former a straightforward recapitulation of the standard Abrahamic model). The latter is the position that different races of humans derive from different proto-humans, or, for the Christian polygenists, that only Europeans descend from Adam and Eve (the other races being “non-Adamic”). Echoes of this conflict persist down to the present era. Many of the earlier partisans of “Out of Africa” have claimed that the proponents of multiregionalism were latter-day polygenists (not entirely without justification in some cases).

But the conflict between monogenism and polygenism is not the appropriate frame for what is being unveiled by reality before our eyes. What we see in the creation of modern humanity is a monogenic base inflected with the flavors of polygenism. Modern humans descend, by and large, from an expansion of an African population over the past 200,000 years. But on the margins there are other strands and filaments of ancestry which tie disparate populations back to lineages which branched off far earlier from the main trunk. At a minimum hundreds of thousands, and perhaps on the order of 1 million years, before our own age. Today genomics avails us of the statistical power to extract out these discordant signals from the fluid “Out of Africa” narrative, but I would not be surprised if in the near future we stumble upon more and more “long branches” of less noteworthy quantity. Admixture is likely to be an old and persistent story in the hominin lineage, with only the most recent substantial bouts of separation and hybridization being of notice and curiosity at this moment in time.

What does all this mean? And why have I juxtaposed deep time natural history across the tree of life with inferences of relatively recent paleoanthropology? Let’s start with two propositions:

- Technological civilization, an outward manifestation of radically complex sentience, is not inevitable, though it is probable given certain preconditions (I believe that the existence of Homo increased its probability to ~1.0 over a reasonable time period)

- Radically complex sentience is not the monopoly of a particular exclusive lineage which accrues its genius from a particular specific forebear

John Farrell has pointed out the possible issues that the Roman Catholic church may have with the new model of human origins. But the Catholic church is but a reflection of a more general human strain of thought. Descent-groups, whether real or fictive, loom large in the human imagination. The evolutionary rationale for this is not too hard to explain, but we co-opt the importance of kinship in many different domains. Like evolution, human cultural forms simply take what is already present, and retrofit and modify elements to taste.

So why are humans special? And why do humans have inalienable rights? Many of us may not agree with the proposition that we are the descendants of Adam and Eve, and therefore we were granted the divine grace of eternal souls. But a hint of this logic can be found in the assumptions of many thinkers who do not agree with the propositions of the Roman Catholic church. Recently I listened to Sherry Turkle arguing against a reliance on “robot companions” which are able to exhibit the verisimilitude of human emotions for those who may be lacking in companionship (e.g., the aged and infirm). Though Turkle’s arguments were not without foundation, some of her arguments were of the form that “they are not us, they are not real, we are real. And that matters.” This is certainly true now, but will it always be? Who is this “they” and this “we”? And what does “real” mean? Are emotions a mysterious human quality, which will remain outside of the grasp of those who do not descend from Adam, literal or metaphorical?

If there arises a point where non-human sentience is a reality, do they have the same rights as we? Though the difference is radical in terms of quantity, to some extent I think we know the answer: they are human by the way they are, not by the way their ancestors were. The “taint” of admixture with diverse lineages across the present human tree of life has not resulted in an updating of our understanding of human rights. That is because the idea that we are all the children of Adam, or the descendants of mitochondrial Eve, is a post facto justification for our understanding of what the rights of humanity are, and what humanity is. And what it is is a particular ecological niche, a way of being, not beings who descend in a line of biological relationship from a particular person or persons.

* The cultural fundamentals of Old Kingdom Egypt arguably persisted in a living fossil form in the temple at Philae down to the 6th century A.D.! Therefore, a 3,500 year lineage of literary continuity.

Image credits: all public domain images from Wikipedia

(Republished from Discover/GNXP by permission of author or representative)

I badger readers here to actually use all the analytic tools which researchers put out into public circulation, rather than just offering cheap opinions. First, it’s obviously way more fun and informative to have discussions with someone who can check their own hunches by doing a few “runs” overnight. Second, if you have minimal technical skills all it requires is an investment of time. If you have a modicum of nerd-quotient and still can’t be bothered to invest the time, then it says something about how passionate you are about these issues in my opinion (granted, life gets in the way, but as someone who routinely felt lucky to sleep 3 hours on many nights over the past 3 months, please spare me).

Maju, the author of For what they were… we are, has now taken the plunge (with the help of my tutorial, which I need to clean up and fix in some details). Please check out his results (which are preliminary). Also, don’t be bashful about contacting researchers for files if you want something that’s not easily accessible in online repositories; that’s how Zack and Dienekes have gotten a hold of some data sets. There’s no need for hundreds and hundreds of people running ADMIXTURE and posting PCA plots. Rather, it is useful as a supplement to the academic community if there are at least some dozens of individuals who engage in exploratory analyses as well as replicating the results of researchers.

(Republished from Discover/GNXP by permission of author or representative)
• Category: Science • Tags: Admixture, Anthropology 

Meeting the Taino

In the comments below a few days ago someone expressed concern at the diminishing of genetic diversity due to the disappearance of indigenous populations. My response was basically that it depends. The issue here is whether that disappearance is due to assimilation, or extinction. If a given population is genetically absorbed into another, obviously their genetic diversity is by and large maintained. What disappears are the specific genotypes, the combinations of gene pairs, which are distinctive to that given group. This is the same dynamic at the heart of the ‘disappearing blonde gene’ meme. Unless there is selection at the loci which encode or predispose one to blonde hair the ‘gene’ isn’t going anywhere. Rather, the implicit issue here is that blonde people are intermarrying with non-blonde people, and if the genetic variant has a recessive expression then the frequency of the trait will decrease. Populations with a high degree of homozygosity at the ‘blonde loci’ are distinctive in a very particular manner, but they’re no more or less ‘diverse’ than other populations which don’t manifest the same tendency.

A toy example will suffice. Take two populations, A and B, and one locus, 1, with two variants, X and x. Assume that the two populations are the same size. At locus 1 population A is 100% X, and population B is 100% x. In a diploid scenario then all the individuals in population A will be XX, and in B will be xx. When you add A + B you get a frequency of X of 0.5, and of x of 0.5 (since the two populations are balanced in size).


Now imagine a scenario where all individuals in population A pair up with someone in population B (assume sex balance in both populations). In the first generation, F1, all the offspring will be heterozygous Xx (hybrids). The frequency of X and x will be 0.5 still, as in the previous generation. But no individual now reflects the genotype of the parental populations, as all individuals are heterozygotes. At the level of alleles, specific genetic variants, you’ve got the same diversity (X and x at locus 1). But at the level of genotype there’s a huge shift. Two genotypes (XX and xx) no longer exist, but a novel one is now fixed in the population (Xx).

A novel combination

Finally, in the F2 generation, the offspring of F1, Hardy-Weinberg will reassert itself. 25% of the genotypes will be XX, 25% xx, and 50% Xx, due to p^2 + 2pq + q^2 = 1 (with p = q = 0.5). In this scenario some of the distinctiveness of the parental and F1 generations in terms of genotype is evident, but the diversity in the allelic sense of the parental and F1 states remains the same, X = 0.5 and x = 0.5. Observe that if you’re looking at genotypic diversity the F2 generation is actually more diverse than the parental (because Xx is a different genotype). In other words, in some ways the aggregation of various distinct populations may increase diversity by generating novel combinations.
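The toy example can be checked in a few lines of Python. This is only a sketch under the assumptions already stated in the text (one locus, two alleles, equal population sizes, random mating among the F1); the function names `allele_freqs` and `random_mating` are mine, for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def allele_freqs(genotype_freqs):
    """Allele frequencies implied by a dict of diploid genotype -> frequency."""
    freqs = Counter()
    for genotype, freq in genotype_freqs.items():
        for allele in genotype:
            # Each genotype carries two gene copies, so each copy counts half.
            freqs[allele] += freq / 2
    return dict(freqs)


def random_mating(genotype_freqs):
    """Genotype frequencies in the next generation under random mating."""
    alleles = allele_freqs(genotype_freqs)
    offspring = Counter()
    for (a, pa), (b, pb) in product(alleles.items(), repeat=2):
        # Sort so that "Xx" and "xX" are counted as the same genotype.
        offspring["".join(sorted((a, b)))] += pa * pb
    return dict(offspring)


# Parental generation: populations A (all XX) and B (all xx), equal in size.
parental = {"XX": Fraction(1, 2), "xx": Fraction(1, 2)}
# F1: every A x B pairing produces a heterozygote.
f1 = {"Xx": Fraction(1)}
# F2: offspring of F1 mating at random.
f2 = random_mating(f1)

for label, generation in (("parental", parental), ("F1", f1), ("F2", f2)):
    print(label, generation, allele_freqs(generation))
```

The allele frequencies stay at 0.5 in every generation, while the set of genotypes goes from two (parental) to one (F1) to three (F2).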

This is not to deny that a very specific historically contingent form of diversity in terms of distinctness of particular groups is threatened today. That’s why it was important that the HGDP was overloaded with threatened groups like the Bushmen, Kalash, and Pygmies. These populations may be assimilated soon, and with that assimilation it will be more difficult to extract out historically very important information which will inform us about the human past.

But another issue is extinction instead of assimilation. Wouldn’t this eliminate a lot of genetic variation? Perhaps. I actually considered this issue a few years back with the Star Trek reboot. If you haven’t watched the film, there’s a major spoiler next. So basically on the order of ~10,000 Vulcans survived the destruction of their planet. Culturally the preservation was rather good, because the Vulcan elders, who are the repositories of the culture, were saved. In this way a fully fleshed Vulcan culture could easily reemerge out of the genocide. On the other hand, the vast majority of Vulcans died. Isn’t this population bottleneck a genetic catastrophe? It depends. If the Vulcans who survived are a relatively random assortment of the population genetically, then the disaster isn’t that bad in terms of genetic diversity.

To get some idea of why, consider the statistic of heterozygosity. This measures the extent of heterozygote states, where the two gene copies differ at a locus, across the population. It’s a proxy for genetic diversity, as more allelic diversity produces more heterozygosity.

The decay of heterozygosity over time due to random genetic drift (without mutation) can be modeled like so:

Ht = H0 × (1 - 1/(2N))^t

The variable “t” is simply the number of generations elapsed since the initial time. H0 refers to the initial heterozygosity, and Ht is the heterozygosity remaining t generations later. N is the effective population size. This formula can be used to model population bottlenecks. The Vulcan population reduction from one on the order of billions to 10,000 was basically a massive population bottleneck. The decrease in heterozygosity that you’d expect after one generation would be:

Ht = H0 × (1 - 1/(2 × 10,000))^1

Ht = 0.99995 of the initial value. Basically almost nothing. Why? Because 10,000 turns out to be a relatively large population. This makes some intuitive sense. If you have a sample size of 10,000, and it’s representative, sample variance isn’t going to be that high. If you have an infinite number of coin flips so that the ratio of heads and tails is 50:50, reducing that to 10,000 flips isn’t going to result in much of a deviation from 50:50.

Let’s look at the effect of population bottlenecks of 20 generations at various values of N. The x axis shows generation time, while the y axis illustrates the proportion of the initial heterozygosity which remains.
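For readers who want to reproduce numbers like these, here is a minimal sketch of the drift formula in Python. The function name `heterozygosity_remaining` and the particular values of N in the loop are my own assumptions for illustration; they are not necessarily the values used in the original plot.

```python
def heterozygosity_remaining(n_effective, generations):
    """Proportion of initial heterozygosity left after drift without mutation.

    Implements Ht / H0 = (1 - 1/(2N))^t, where N is the effective
    population size and t is the number of generations.
    """
    return (1 - 1 / (2 * n_effective)) ** generations


# The Vulcan case from the text: N = 10,000 for a single generation.
print("N=10000, t=1:", heterozygosity_remaining(10_000, 1))

# A 20-generation bottleneck at several illustrative (assumed) values of N.
for n in (10, 100, 1_000, 10_000):
    trajectory = [heterozygosity_remaining(n, t) for t in range(21)]
    print(f"N={n}: after 20 generations {trajectory[-1]:.4f} remains")
```

Only the smallest populations lose a large share of their heterozygosity over 20 generations; at N in the thousands the loss is marginal.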

This is not to downplay the impact of bottlenecks and demographic stochasticity. Rather, it’s to suggest that population genetic diversity is relatively resistant to a crash in numbers. The extinction of small tribal groups is a tragedy, but genetically it may not be as much of a problem as we think. Even in groups such as the Bushmen with a great deal of genetic diversity it is likely that most of that diversity is already found within non-Bushmen populations.

Image credits: Ian Beatty and Lesley-Ann Brandt.

(Republished from Discover/GNXP by permission of author or representative)

Ed Yong has a good review of a new Neandertal introgression/admixture paper in PNAS. It’s not live on the web yet, so let me quote Ed:

Even if the odds of successful interbreeding were just 5 percent, Neanderthal genes would make up the majority of the human genome today. As it is, a lack of viable sex explains why none of the Neanderthals’ mitochondrial DNA made its way into modern humans, and why so little of their main genome did.

Currat and Excoffier suggest that either modern humans and Neanderthals didn’t have sex very often, or their hybrids weren’t very fit. They favour the first idea. According to their model, it would only have taken between 197 and 430 liaisons between ancient humans and Neanderthals to fill 1-3 percent of modern Eurasian genomes with Neanderthal DNA. Considering that the two groups probably interacted for 10,000 years or so, it would have been enough for one human to sleep with one Neanderthal every 23 to 50 years.

From what I gather in the comments this is due to the fact that if there was a wave of advance very small levels of admixture per unit of advance can build up rather rapidly. I think this is easy to express in temporal rather than spatial terms.

For example, let’s imagine a population of modern humans expanding into a population of Neandertals. The original source population doesn’t receive any more contributions after the initial push, so you have a series of admixture events over time. Assuming 5% admixture per generation, this is the dilution of the “original ancestry” which would occur over 30 generations, or 750 years:
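A rough sketch of that dilution in Python, assuming the simplest possible model: in each generation 5% of the expanding population’s ancestry is replaced by local (Neandertal) ancestry, so the original share shrinks by a factor of 0.95. The 25 years per generation is implied by 30 generations spanning 750 years; the function name `original_ancestry` is hypothetical.

```python
def original_ancestry(admixture_rate, generations):
    """Share of ancestry tracing to the source population, per generation.

    Assumes a constant admixture rate each generation and no further
    input from the source population after the initial push.
    """
    remaining = 1.0
    trajectory = [remaining]
    for _ in range(generations):
        remaining *= 1 - admixture_rate
        trajectory.append(remaining)
    return trajectory


YEARS_PER_GENERATION = 25  # assumption: 750 years / 30 generations

trajectory = original_ancestry(0.05, 30)
for generation in range(0, 31, 5):
    years = generation * YEARS_PER_GENERATION
    print(f"generation {generation:2d} ({years:3d} years): "
          f"{trajectory[generation]:.3f} original ancestry")
```

Under these assumptions a small per-generation rate compounds quickly, which is the point of the wave-of-advance argument: the original ancestry falls below one half in under 14 generations.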

The model outlined in Ed Yong’s post needs to be examined with care though. No doubt there are all sorts of assumptions which can be disputed. Though I think I accept the final result as entirely plausible.

(Republished from Discover/GNXP by permission of author or representative)

Last year when discussing the possible admixture of Neandertals with the ancestors of modern non-Africans I joked that Sub-Saharan Africans were “pure humans.” This was tongue-in-cheek in part because the results from the Neandertal genome shifted my assessment of the probability of archaic admixture within Africa as well. In other words, there may never have been a pure “human” type which expanded and assimilated archaic ancestry on the margins of its range. Species Platonism may be very misleading for our particular lineage. Rather, what it means to be human has always been in flux, a compromise between extremely different ancestral components.

For years some groups of researchers have been arguing that there is population structure within Africa itself which hints at admixture events before (or after?) the “Out of Africa” event. Genome blogger Dienekes Pontikos has been discussing this possibility for several years as well. With the possibility of archaic admixture outside of Africa it was inevitable that people would revisit their earlier exploration of ancient African admixture and the modern patterns of variation which that might explain. Finally one of the groups working on this has come out with something in PNAS, Genetic Evidence for Archaic Admixture in Africa. Unfortunately it’s not on the website, and I’m not privy to the embargoed copy, so I can’t say much. ScienceDaily and Nature have lengthy write-ups. The details are pretty straightforward. The authors infer using computational methods that there is a 1-2% admixture in Africans of a population which diverged from the mainline of the human ancestral tree ~700,000 years ago. The hybridization occurred on the order of ~40,000 years before the present. The proportions are highest in Central Africans. I assume that this means Pygmies. And I would further bet that the admixture is highest in the Eastern Pygmy populations, such as the Mbuti. The lead author also cautions that this may not be the last word on admixture. No doubt. There are other groups breathing down his neck.

If this is true then an assimilation model of the expansion of H. sapiens sapiens looks more and more plausible. The time period of admixture is pretty much what other scholars are estimating for Neandertals, and presumably Denisovans. I’m not smart enough to figure out how this could be a statistical artifact, but perhaps that explains the congruence? Otherwise, if this is true then you had several repeated events of expansion of one particular lineage (what I term “Neo-Africans”) which demographically swamped the indigenous populations, but still retained a faint, but discernible stamp of their distinctive genetic content. But this may not be exceptional. It may have happened before the emergence of Neo-Africans, and I believe it happened after them (e.g., the rise of agriculturalists). It’s possibly one instance of a rather banal dynamic in the evolution of Homo.

(Republished from Discover/GNXP by permission of author or representative)

The class human or H. sapiens refers to a set of individuals. On the grand scale it’s really not all that clear and distinct. When do “archaic” humans become “modern” humans? Taking into account human variation, what is a “human universal”? A set of organisms are given a name which denotes the reality that they may share common ancestry, and interact behaviorally, and are potential mates. But many of these phenomena are fuzzy on the margins. Many of the same issues which emerge in the “species concept” debates are rather general up and down the scales of natural complexity. A similar problem crops up when we conflate the history of genes with the history of populations. Such a conflation has value and utility to a first approximation. The story of mitochondrial Eve was actually the history of one particular locus, the mitochondrial genome. But it did tell us quite a bit about the history of the human species, even if in hindsight it looks as if some scientists overinterpreted those findings. One of the major issues I’ve noticed over the past year, with the heightened likelihood of archaic admixture in the modern human genome, is that people regularly get confused by the difference between total genome ancestry, and the evolutionary history of one particular gene.

Consider the possibility that a substantial proportion of the genetic variants at the dystrophin locus amongst Eurasians (non-Africans) derive from Neanderthals. As I have observed one of my siblings carries only the Neanderthal variant (males have only one copy as this gene is on the X chromosome). Does this mean he is 100% Neanderthal? Obviously not. The patterns at one gene tell you the history of that one gene. Since the patterns across genes are correlated because of shared evolutionary history (ergo, the existence of geographical racial clusters) one gene can tell you more than just its own history because you are aware of the correlations. But you can’t take this too far. My sibling is less than 5% Neanderthal across his whole genome. He just happens to be “100% Neanderthal” at that gene. There isn’t a great contradiction here. His genome is not a Platonic ideal or a pure category of human vs. non-human.

I bring this up because a few months ago I relayed the findings at a conference as to the evidence of lots of introgression into the human genome from archaic hominins on immune related loci. The paper reporting those findings is now out in Science, The Shaping of Modern Human Immune Systems by Multiregional Admixture with Archaic Humans:

Whole-genome comparisons identified introgression from archaic to modern humans. Our analysis of highly polymorphic HLA class I, vital immune system components subject to strong balancing selection, shows how modern humans acquired the HLA-B*73 allele in west Asia through admixture with archaic humans called Denisovans, a likely sister group to the Neandertals. Virtual genotyping of Denisovan and Neandertal genomes identified archaic HLA haplotypes carrying functionally distinctive alleles that have introgressed into modern Eurasian and Oceanian populations. These alleles, of which several encode unique or strong ligands for natural killer cell receptors, now represent more than half the HLA alleles of modern Eurasians and also appear to have been later introduced into Africans. Thus, adaptive introgression of archaic alleles has significantly shaped modern human immune systems.

Introgression implies more than just ancestry. These results indicate that Denisovan ancestry at particular immunologically relevant loci is rather high amongst East Asian groups which have no discernible Denisovan ancestry across the total genome. Presumably that’s an artifact of the limits of statistical power in detecting very low levels of admixture. But out of tens of thousands of genes it is not unimaginable that there are some few gene copies from exotic sources which turn out to be adaptive, and so favored over “native” alleles (cultural analogs come to mind; the Roman language remains, but the Roman religion has been replaced by a Jewish derived sect). The paper has little new beyond the conference talk. Note this result:

From the combined frequencies of these six alleles, we estimate the putative archaic HLA-A ancestry to be >50% in Europe, >70% in Asia, and >95% in parts of PNG (Fig. 4, C and D). These estimates for HLA class I are much higher than the genome-wide estimates of introgression….

More precisely, the introgression estimates are around an order of magnitude greater than admixture. Intriguingly the authors note that though most Africans exhibit some evidence of introgression from Eurasian populations, Khoisan and Pygmies do not. This seems to point to the possibility that the generic class “African” may hide a lot of interesting population structure and history. It is clear that peoples from the Horn of Africa seem to have been recently influenced by Eurasian groups, but it may be that West and East Africans more generally have been touched by deep-time back-migrations. Though I’ve been skeptical of attempts to portray Khoisan and Pygmies as “ur-humans,” these results suggest that that characterization may be closer to the mark than I had argued earlier.

(Republished from Discover/GNXP by permission of author or representative)
Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"