The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog
Human Genetics


Indigenous Arabs are descendants of the earliest split from ancient Eurasian populations:

An open question in the history of human migration is the identity of the earliest Eurasian populations that have left contemporary descendants. The Arabian Peninsula was the initial site of the out-of-Africa migrations that occurred between 125,000 and 60,000 yr ago, leading to the hypothesis that the first Eurasian populations were established on the Peninsula and that contemporary indigenous Arabs are direct descendants of these ancient peoples. To assess this hypothesis, we sequenced the entire genomes of 104 unrelated natives of the Arabian Peninsula at high coverage, including 56 of indigenous Arab ancestry. The indigenous Arab genomes defined a cluster distinct from other ancestral groups, and these genomes showed clear hallmarks of an ancient out-of-Africa bottleneck. Similar to other Middle Eastern populations, the indigenous Arabs had higher levels of Neanderthal admixture compared to Africans but had lower levels than Europeans and Asians. These levels of Neanderthal admixture are consistent with an early divergence of Arab ancestors after the out-of-Africa bottleneck but before the major Neanderthal admixture events in Europe and other regions of Eurasia. When compared to worldwide populations sampled in the 1000 Genomes Project, although the indigenous Arabs had a signal of admixture with Europeans, they clustered in a basal, outgroup position to all 1000 Genomes non-Africans when considering pairwise similarity across the entire genome. These results place indigenous Arabs as the most distant relatives of all other contemporary non-Africans and identify these people as direct descendants of the first Eurasian populations established by the out-of-Africa migrations.

This is a good paper. They’ve taken a stab at it, and are very circumspect. But in the end they state that “these two conclusions therefore point to the Bedouins being direct descendants of the earliest split after the out-of-Africa migration events that established a basal Eurasian population.”

To catch everyone up, Lazaridis et al. suggested based on results from ancient DNA that many West Eurasian populations have an ancestry which derives from a lineage basal to all other non-Africans unmixed with this population. That means that the genetic distance of this group to Pleistocene European hunter-gatherers and Pleistocene Australians is the same, while the genetic distance between these two groups is smaller than between them and this population. Therefore they are termed “basal Eurasians,” or bEu.

But it is also important to note that they are a construct. Ancient DNA has not turned up any unmixed basal Eurasians. This is in contrast to other groups which are used as donor populations: European hunter-gatherers and Siberian hunter-gatherers. About 50% of the ancestry of the Anatolian farmers who were the precursors of the first agriculturalists in Europe derives from bEu, with the balance consisting of a heritage similar to European hunter-gatherers. The hunter-gatherers recently discovered in the Caucasus also have this bEu ancestry. Ergo, almost all West Eurasian and South Asian populations have bEu ancestry.

In the paper above, which is open access, the authors found a group of Qatari Bedouin, who seem to have low admixture from Africans or other Middle Eastern groups. Though some preliminary analysis was done with SNP-chips, they went whole genome for most of the work (allowing them to look for rare variants, etc.). I would have been convinced to a great extent if they put a TreeMix graph out which showed that their indigenous Arab population was a good donor to ancient Anatolians along with European hunter-gatherers. But I did not see that. Or they could have done an F4 ratio test showing that the Bedouin were more basal Eurasian than any other modern population. I did not see that.
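For readers who have not met these statistics, here is a minimal sketch of the f4 statistic that F4 ratio tests are built from. It is not the authors' analysis: the allele frequencies are made-up toy values, the function name is hypothetical, and real analyses use many thousands of SNPs plus a block jackknife for standard errors. An F4 ratio estimate is the ratio of two such statistics.

```python
def f4(pop_a, pop_b, pop_c, pop_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies, one entry per SNP,
    with all four lists in the same SNP order.
    """
    if not (len(pop_a) == len(pop_b) == len(pop_c) == len(pop_d)):
        raise ValueError("all populations need the same number of SNPs")
    if not pop_a:
        raise ValueError("need at least one SNP")
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(pop_a, pop_b, pop_c, pop_d)]
    return sum(products) / len(products)


# Hypothetical toy frequencies at three SNPs (not real data).
a = [0.1, 0.5, 0.9]
b = [0.2, 0.4, 0.7]
c = [0.3, 0.6, 0.8]
d = [0.3, 0.5, 0.9]
print(f"f4(A, B; C, D) = {f4(a, b, c, d):+.5f}")
```

If (A, B) and (C, D) form two clean clades with no gene flow between them, the expectation of this statistic is zero, which is why departures from zero are read as evidence of shared drift or admixture.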

I did see an F4 ratio test for Neanderthal admixture. I am not confident that their assertions hold. Take a look at the pattern of Neanderthal admixture in the supplements; it’s all over the place. It isn’t in line with the broad patterns found in the latest work out of David Reich’s lab.

There are also some assumptions within the paper which I think are untenable. They seem to be positing a continuity of these Qatari Bedouin within the Arabian peninsula for tens of thousands of years. The divergence of the bEu population, putatively ancestral to these Bedouin, occurred from other non-Africans even before the settlement of Australia, over 50,000 years ago! I don’t think it is likely that the Bedouin were resident in or around the Arabian peninsula for that long.

Finally, there’s some reference to effective population sizes inferred from the X chromosome vs. the autosomes. This isn’t a major part of the paper, but I would be skeptical of these sorts of claims. There is a lot of work in this area, and it turns out everything is way more clouded than you might think at first blush.

Overall, good paper. But there’s still a mystery here. The only solution is clearly more ancient DNA from this region.

• Category: Science • Tags: Genetics, Human Genetics 


A common model of the range expansion of modern humans out of Africa ~50,000 years ago forces us to conceptualize it as a tree with successive bifurcations. Each of these bifurcations is often accompanied by a bottleneck in one of the daughter populations, with the sum total of the demographic events producing a “serial bottleneck” model of the origin of modern human lineages around the world. Though this paradigm has been around in various forms for decades, most influential among geneticists has been the 2005 paper, Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. In the paper the authors show persuasively that heterozygosity declines as a function of distance from Addis Ababa, near the likely point-of-departure out of Africa.
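A minimal sketch of why a serial founder model predicts this decline: under drift alone, expected heterozygosity shrinks by a factor of (1 - 1/(2N)) for every generation spent at a bottleneck of N diploid individuals. The starting heterozygosity, founder size, and number of founder events below are illustrative assumptions, not values from the 2005 paper.

```python
def serial_founder_heterozygosity(h0, founders, steps, generations_per_step=1):
    """Expected heterozygosity after each successive founder event.

    Applies the standard drift expectation H' = H * (1 - 1/(2N)) for each
    generation spent at bottleneck size N. All arguments are hypothetical.
    """
    decay = (1.0 - 1.0 / (2.0 * founders)) ** generations_per_step
    trajectory = [h0]
    for _ in range(steps):
        trajectory.append(trajectory[-1] * decay)
    return trajectory


# Five founder events, each lasting 10 generations at N = 50 (assumed values).
for step, h in enumerate(serial_founder_heterozygosity(0.30, 50, 5, 10)):
    print(f"founder event {step}: expected H = {h:.4f}")
```

Each additional founder event along the migration route compounds the loss, so populations further from the origin end up with lower expected heterozygosity.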

But there’s a minor problem with this model. Most extant populations may in fact be compounds of highly diverged late Pleistocene lineages. That is, after an initial serial founder expansion ~50,000 years ago there may have been many local extinctions and admixture events overlaying it. That is the model that Joe Pickrell and David Reich argue for in Towards a new history and geography of human genes informed by ancient DNA. While in a serial founder conceptualization of an out of Africa migration modern populations are the tips of the phylogenetic tree, in the Pickrell and Reich framework they’re syntheses of divergent evolutionary histories. As Pickrell and Reich show, older genetic techniques and data, not informed by ancient DNA, may not have had the power to differentiate between the serial bottleneck model and one of reticulation and fusion.

It seems likely that the serial founder model has some utility. One can still defensibly hold that everyone outside of Africa, excluding recent admixture from within Sub-Saharan Africa, sits within their own phylogenetic clade. That is, all non-Africans are equally related to Sub-Saharan Africans, because all of them descend from an ancient population of ~1,000 breeding individuals. But you have populations such as South Asians who are both numerous and fusions of two very distinct branches of out of Africa humanity. The stylized model of a tree subject to bifurcations fundamentally misleads in this case.

All this came to mind reading a new preprint that’s on biorxiv, Distance from Sub-Saharan Africa Predicts Mutational Load in Diverse Human Genomes. Theory and intuition should suggest to us that out of Africa populations will have a higher genetic load of deleterious mutations than within Africa populations. The reasoning is straightforward: the power of selection to remove deleterious mutations is hampered as effective population size shrinks, because random genetic drift becomes more determinative of generation to generation changes in allele frequency. More formally, if 4Nₑs << 1, where Nₑ is effective population size and s is the selection coefficient, then even deleterious alleles behave as if they are neutral. From neutral theory we know that the rate of substitution in this model is simply the rate of mutation. That is, molecular evolution is determined by new mutational input, rather than being constrained or diversified by selection. In small populations which are drift dominated Nₑ can get very small. I’ve seen assertions that the original group of humans who settled North America and South America may have had an effective population in the first generation on the order of ~100. In their history they also had the out of Africa effective population of ~1,000. In addition, there were likely bottlenecks between Beringia and the out of Africa event.
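To make the "behaves as if neutral" claim concrete, here is a minimal sketch using Kimura's diffusion approximation for the fixation probability of a new mutation. It assumes additive selection and a population whose census and effective sizes are equal; the selection coefficient and population sizes are illustrative choices, not estimates from the preprint.

```python
import math


def fixation_probability(n_e, s):
    """Kimura's approximation for a single new additive mutation.

    Assumes a diploid population with census size equal to effective size
    n_e, so the mutation starts at frequency 1 / (2 * n_e).
    """
    if s == 0:
        return 1.0 / (2.0 * n_e)
    return (1.0 - math.exp(-2.0 * s)) / (1.0 - math.exp(-4.0 * n_e * s))


def relative_to_neutral(n_e, s):
    """Fixation probability as a multiple of the neutral value 1/(2N)."""
    return fixation_probability(n_e, s) / fixation_probability(n_e, 0.0)


# A mildly deleterious allele (s = -0.001, assumed) at different sizes.
for n_e in (10, 100, 1000, 10000):
    ratio = relative_to_neutral(n_e, -0.001)
    print(f"Ne = {n_e:>5}: fixation probability is {ratio:.3g} x neutral")
```

When 4Nₑ|s| is well below 1 the ratio stays close to 1, so drift fixes the deleterious allele almost as readily as a neutral one. In the larger populations the ratio collapses toward zero and selection does its job.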

You can see in the figure above from the preprint that the number of deleterious mutations does seem to increase with further distance from Africa. Their genomic coverage was good, ~80x on the exome. That is, when they found variants that differed from the reference sequence they could be confident that they were not errors. On the other hand their population coverage struck me as less than ideal, though I am willing to accept that their result is probably true (and obviously, with finite resources, they selected their individuals and populations to be informative). They admit for example in the text that the Mozabite population has higher heterozygosity due to a back-to-Africa migration, which has had successive admixtures of Sub-Saharan ancestry. In addition, it is also rather inbred. Its demographic history bears no correspondence to the serial founder bottleneck model which spans 50 to 15 thousand years (i.e., from the out of Africa to the settlement of the New World). The Pathan and Cambodian populations are also actually the product of Holocene fusions between distinct groups with very different histories. The PSMC results in the bottom left panel can be thought of as collapsing distinct population histories, which, from what I understand, may then inflate the effective population trajectories of individuals who descend from admixture events.

As one might expect from this sort of title the authors refer back to the 2005 paper that I mention above. I don’t want to belabor this point, as I think the authors’ results are probably robust, and, important, population history aside. But, much of the audience will not know that the serial founder bottleneck model is now being challenged. The 2005 paper has 588 citations as of this writing. The Pickrell and Reich paper, which was published in 2014, has 5 citations. I just want to mention this since it’s a preprint and presumably the authors are taking in any critiques.

When it comes to burden of deleterious mutations in human populations there have been some conflicting results, reviewed in the preprint, about mutational load. The authors argue that results which suggested that non-Africans did not exhibit higher load were subject to a bias which did not have power to detect non-common variants, and also modeled mutations as additive, as opposed to across the full range of dominance (h). It turns out that non-Africans, and those populations which are more drifted, exhibit higher recessive deleterious loads. This is what you would expect intuitively, as you need large populations to purify this class of deleterious alleles (since they are only exposed to selection in homozygotes). This matters when it comes to expectations of the number of recessive diseases one might expect in populations which practice consanguinity. I would, though, have liked to see more typical populations in the mix. For example, instead of just African hunter-gatherers it would be nice to see the Yoruba, as well as Han Chinese, and a northern European population. I doubt it would be very surprising, but it would give one a better baseline.
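The consanguinity point can be made concrete with the textbook expectation for recessive homozygotes under inbreeding, q² + Fq(1 - q). This is a sketch under stated assumptions: the allele frequency is a made-up value, and F = 1/16 is the standard inbreeding coefficient for the offspring of first cousins.

```python
def recessive_homozygote_frequency(q, f=0.0):
    """Expected frequency of homozygotes for an allele at frequency q.

    f is the inbreeding coefficient; f = 0 gives the Hardy-Weinberg
    expectation q**2, and f > 0 adds the excess f * q * (1 - q).
    """
    return q * q + f * q * (1.0 - q)


q = 0.01  # hypothetical frequency of a recessive disease allele
outbred = recessive_homozygote_frequency(q)
cousins = recessive_homozygote_frequency(q, f=1.0 / 16.0)
print(f"random mating:       {outbred:.6f}")
print(f"first-cousin unions: {cousins:.6f}")
print(f"relative increase:   {cousins / outbred:.1f}x")
```

The rarer the allele, the larger the relative increase, which is why a higher recessive load in drifted populations matters most where consanguineous marriage is common.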

Finally, I want to note that many ancient DNA results, from “archaics” to modern human Mesolithic hunter-gatherer groups, have very low effective population sizes due to inbreeding. Genetic load may be more important in the history of the human species, especially on the edge of the range in the far north, than we may now understand, because of its tendency to reduce fitness of groups due to the drag of recessive disease.

Citation: Distance from Sub-Saharan Africa Predicts Mutational Load in Diverse Human Genomes, Brenna M. Henn, Laura R Botigue, Stephan Peischl, Isabelle Dupanloup, Mikhail Lipatov, Brian K Maples, Alicia R Martin, Shaila Musharoff, Howard Cann, Michael Snyder, Laurent Excoffier, Jeffrey Kidd, Carlos D Bustamante.

• Category: Science • Tags: Genomics, Human Genetics 

A few months ago the anthropologist Pat Shipman published a book, The Invaders: How Humans and Their Dogs Drove Neanderthals to Extinction. I’ve read Shipman before, and because of my interest in domestication it’s been on my radar, but I haven’t gotten around to purchasing it. The major reason is that as I understand it the title is somewhat misleading, in that there’s a lot less in the text on human-dog cooperation than one might think. Which is reasonable; it’s a speculative hypothesis at best.

Perhaps the biggest problem is that there’s no strong evidence that dogs were domesticated or distinct as early as ~35 thousand years ago, when modern humans replaced Neandertals in Europe. This comes up in a very highly rated comment on Amazon in fact. The best genetic work, Genome Sequencing Highlights the Dynamic Early History of Dogs, implies a date of ~15,000 years before the present, at the earliest.

But now it looks like it’s time to update our priors on this. Shipman’s speculative theory, still unlikely in my opinion, is no longer extremely unlikely. The reason is ancient DNA. Ancient Wolf Genome Reveals an Early Divergence of Domestic Dog Ancestors and Admixture into High-Latitude Breeds:

The origin of domestic dogs is poorly understood…with suggested evidence of dog-like features in fossils that predate the Last Glacial Maximum…conflicting with genetic estimates of a more recent divergence between dogs and worldwide wolf populations…Here, we present a draft genome sequence from a 35,000-year-old wolf from the Taimyr Peninsula in northern Siberia. We find that this individual belonged to a population that diverged from the common ancestor of present-day wolves and dogs very close in time to the appearance of the domestic dog lineage. We use the directly dated ancient wolf genome to recalibrate the molecular timescale of wolves and dogs and find that the mutation rate is substantially slower than assumed by most previous studies, suggesting that the ancestors of dogs were separated from present-day wolves before the Last Glacial Maximum. We also find evidence of introgression from the archaic Taimyr wolf lineage into present-day dog breeds from northeast Siberia and Greenland, contributing between 1.4% and 27.3% of their ancestry. This demonstrates that the ancestry of present-day dogs is derived from multiple regional wolf populations.

As you can see from the figure to the left the Taimyr sample diverges at about the same time as the common ancestor of wolves and modern dogs. In other words, you have a polytomy. Not only that, but there has been introgression from the Taimyr lineage into particular northern dog populations.

Genetics and genomics are big deals. But at this point I have to point out that archaeologists have really been here the whole time. Archaeologists reported that the Amerindians brought dogs with them over through Beringia. Historians know that the indigenous people had dogs. Yet in 2010 geneticists published, in Nature, Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication, which put the focus on the Middle East and the Neolithic revolution. There was basically no way this really made sense. Then you had a 2011 paper in PLOS ONE, A 33,000-Year-Old Incipient Dog from the Altai Mountains of Siberia: Evidence of the Earliest Domestication Disrupted by the Last Glacial Maximum. Even the authors themselves assumed that this was a “false dawn,” that this dog-like canid probably did not give rise to later dog lineages. But if the results above are correct, then in fact this 33,000 year old individual may actually be part of the extant proto-dog population.

Let’s let this sink in: if the results above hold, then the arrival of modern humans to northern Eurasia may have been coincident with the emergence of a distinct dog lineage. The term “man’s best friend” takes on a whole new meaning. The relationship between man and dog may be nearly as ancient as modern humans as we understand them, that is, populations capable of copious and protean symbolic cultural production which explode out in the archaeological record over the past ~40,000 years. In addition, I also believe we now need to totally reconceptualize how we view the relationship of wolves and dogs. Rather than an ancestral and derived set of populations, whose “species” status is only semantic convenience, they are actually sister clades. The results in this paper confirm other findings that the wolves of North America and Eurasia seem to share a post Last Glacial Maximum origin. Wolves as we understand them today may have emerged simultaneously with dogs, both descending from the melange of canid lineages which flourished during the Pleistocene. There’s a reason that feral dogs, such as dingos, do not “revert” to wolves. The ancestor may not have even been a wolf!

Additionally, the authors also note that the features of the dog which are hallmarks of domestication may themselves be derived within the dog lineage. That is, the separation of the ancestors of dogs and wolves predates the Last Glacial Maximum, ~20,000 years ago. But the evolution of dogs so that they exhibit particular derived traits may have occurred far later in time. In fact, I would hold that perhaps the true story is one of co-evolution between dogs and humans.

The ultimate moral of this true story to me is that many Pleistocene mega-fauna with wide ranges in Eurasia were subject to similar evolutionary dynamics. Extinction of distinct local lineages was the rule, not the exception. Recolonization from populations which dodged extinction was also inevitable. The phylogenetic tree was pruned repeatedly, but tempered somewhat in the ferocity of clipping by admixture and introgression, as branches fused together.

• Category: Science • Tags: Evolution, Human Genetics, Science 
Distribution of rs17822931 from HGDP

Yoshiura, Koh-ichiro, et al. "A SNP in the ABCC11 gene is the determinant of human earwax type." Nature genetics 38.3 (2006): 324-330.

I’ve talked about rs17822931 in ABCC11 several times. The reasons are manifold. First, on many traits of interest it exhibits variation across populations in a simple Mendelian (recessive expression) manner. Second, there are suggestive variations in distribution. Third, the traits are kind of interesting without being biomedical. In other words, it’s a cool illustration of pleiotropy and human genetic variation that isn’t going to depress you. If you check out the SNPedia page you note that it is associated with variation in earwax type (wet vs. dry), body odor, and colostrum secretion. This is not the full list, and I’m moderately confident that biologists haven’t hit on all the major phenotypes that this affects variation in.

Until recently I’ve really only been interested in the population genetics of the trait. But talking with a few friends who were molecular biologists I realized I should follow up and dig deeper, and what I found was very interesting. Specifically, as it relates to body odor, which, like it or not is a phenotype of significance in the modern world. The trait happens to segregate within my family. My son is a TT genotype, because his parents are heterozygotes. That means he will exhibit less body odor as an adult. How much less?
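The cross described here is a textbook Mendelian one, and it can be sketched by enumerating the four equally likely allele combinations from two heterozygous parents. The function name is just an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Genotype probabilities for the children of two diploid parents.

    Each parent is a two-letter genotype such as "CT"; every pairing of
    one allele from each parent is treated as equally likely.
    """
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


for genotype, prob in sorted(offspring_genotypes("CT", "CT").items()):
    print(f"{genotype}: {prob}")
```

Two CT parents are expected to have TT children a quarter of the time, so a TT child is unremarkable even when neither parent shows the recessive trait.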

In the Journal of Dermatological Science I found Functional characterisation of a SNP in the ABCC11 allele—Effects on axillary skin metabolism, odour generation and associated behaviours. Obviously this is not a journal I read often, but some of the tables are fascinating. The subjects were a few hundred Filipinos. This is a population where the allele of interest segregates at intermediate frequencies. So there are many individuals with dry earwax as well as wet earwax, and all the associated traits.

Here are some tables I extracted*:

Mean malodour scores
Genotype    5 hours    24 hours
TT          2.59       2.6
CT          3.26       3.4
CC          3.21       3.5

Deodorant usage by genotype (proportion)
                  TT      CT      CC
Uses deodorant    0.5     0.86    0.97
Does not use      0.5     0.14    0.03

I have no idea how subjective malodour scales work, but the moral is pretty straightforward. Those with the TT genotype saturate at a much lower point. This manifests in daily behavior. There is a fair amount of Japanese data that people who go to the doctor for body odor issues are much more likely to have wet earwax. This data from the Philippines illustrates that individuals with the derived genotype, TT, must be conscious enough of their lack of body odor to forgo deodorant purchases, even though I assume it is normative in the American influenced culture of the Philippines.

But most interesting to me are the chemical differences of the sweat of the different genotypes. They note that there were differences in Nα-3-methyl-3-hydroxy-hexanoylglutamine (HMHA-Gln), Nα-3-methyl-2-hexenoyl-glutamine (3M2H-Gln), and 3-methyl-3-sulfanyl-hexanol-cysteine-glycine between the genotypes. I don’t know much about these chemicals, except that they are “malodour conjugate precursors”. Not surprisingly there’s some difference in the microbial flora of the individuals as a function of genotype.

There have been attempts to understand the selection processes which may have shaped the regional distribution of this trait, but I’m not entirely convinced by what I’ve seen. Especially when the authors presume that earwax phenotype is in some way causal (or at least that it can give insight into causality, if that makes sense), when it may just be a developmental side effect. A consideration is that some models assume a recessive expression of the trait, which is true for body odor and earwax. But we don’t know that, if selection occurred, it was on these traits. Because of pleiotropy, traits due to variation at a given gene may exhibit different levels of dominance, from full dominance, to additivity, to recessive expression. The target of selection may exhibit a different dominance coefficient than many of the side effect phenotypes (to give you a concrete example, the locus responsible for blue vs. non-blue eye color in Europeans exhibits some recessivity, but it is also responsible for variation in skin color where it is additive).

A 2009 paper using the HGDP data set found evidence of selection on ABCC11 using XP-EHH but not iHS. In other words, extended haplotype differences across populations, but not within them, which often imply sweeps near fixation between populations, rather than ongoing ones within them. To get a better sense of the distribution of the allele I decided to query the SNP in the 1000 Genomes Browser. I invite you to look at the data yourself. The sample sizes start to get pretty large in some of these populations. It is interesting that in West African populations the ancestral variant is nearly fixed, or totally so. The cases where it is not so can pretty easily be hypothesized as due to recent (last 10,000 years) Eurasian admixture. In Europe the frequency of the derived variant is low, on the order of ~10%, but in the Finnish sample it peaks at ~25%. This aligns with patterns in the HGDP data set. African populations tend to be fixed for the ancestral variant, C, while European populations have a low frequency of the derived variant, T, with a cline toward the northeast from the southwest (i.e., peaks in the Russians, lowest fraction in Sardinians). But, Middle Eastern samples in the HGDP data set have European proportions of T as well, though the Mozabites in North Africa do not. The South Asian samples in the HGDP have higher levels of the derived variant than Europeans, intermediate between that group and East Asians. But the 1000 Genomes data results in a thickening of the plot (and, with large sample sizes!). The Bangladeshis are at an even higher fraction than the Pakistani populations. The genotype counts are like so: 12 CC, 54 CT, TT. When I saw this I assumed it was the East Asian admixture, on the order of 10-20%, which might account for the enrichment of T in relation to Pakistani groups. But that is not correct. Here are the counts for Indian Telugus: 20 CC, 49 CT, and 33 TT. And Sri Lankan Tamils: 23 CC, 49 CT and 30 TT.
Many hypotheses about the derived variant involve adaptations to cold climates in Northeast Asia. This may still be the case in Northeast Asia, but what you see here is a NW to SE cline of ancestral to derived variant of ABCC11 in South Asia. The Punjabis and Gujaratis have higher fractions of the ancestral variant, as you’d expect from the HGDP data** (the fraction in the Bangladeshi sample might be elevated by East Asian admixture).
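For anyone who wants to turn the genotype counts quoted above into allele frequencies, here is a minimal sketch. It uses only the two South Indian samples whose counts are given in full; the Bangladeshi sample is left out because its TT count is not stated. The Hardy-Weinberg comparison is an added sanity check, not something taken from the 1000 Genomes Browser.

```python
def derived_allele_frequency(cc, ct, tt):
    """Frequency of the derived T allele from genotype counts."""
    return (ct + 2 * tt) / (2 * (cc + ct + tt))


# Genotype counts (CC, CT, TT) as quoted in the text.
samples = {
    "Indian Telugus": (20, 49, 33),
    "Sri Lankan Tamils": (23, 49, 30),
}

for name, (cc, ct, tt) in samples.items():
    q = derived_allele_frequency(cc, ct, tt)
    observed_tt = tt / (cc + ct + tt)
    print(f"{name}: T = {q:.3f}, observed TT = {observed_tt:.3f}, "
          f"Hardy-Weinberg TT = {q * q:.3f}")
```

Both samples come out with T as the majority allele, far above the ~10% typical of Europeans, which is the point of the NW to SE cline.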

The results from East Asian samples in the 1000 Genomes are also illuminating. With sample sizes of around 200 each the Dai minority (related to the Tai people culturally as their antecedents) has a frequency of 56% for T, the Han from Beijing have 97%, the Han from South China are at 86%, the Japanese 88%, and the Vietnamese from the southern region of the country 64%. First, my intuition is that this seems a strange pattern for an allele which was selected on a recessive trait. Rather, it looks more like selection on a dominant trait, where the equilibrium frequency remains below 100% because of recessive expression of the unfavored state. Second, the fraction for the Dai seems rather high for the ancestral state. This particular population is sampled from the Mekong region of southern China, as far south as you can go in the nation. This sort of cline correlated with latitude goes a long way to explaining why the thesis often emerged that this variation is somehow related to climate (there is something of a north-south cline in Japan as well).
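The intuition about dominance can be illustrated with the standard one-locus selection recursion. This is a sketch only: the selection coefficient, starting frequency, and number of generations are arbitrary assumptions, and nothing here is fitted to the ABCC11 data.

```python
def next_frequency(p, s, h):
    """One generation of selection on a favored allele at frequency p.

    Hypothetical fitnesses: favored homozygote 1 + s, heterozygote
    1 + h * s, other homozygote 1. h = 1 means the favored phenotype is
    dominant; h = 0 means it is recessive.
    """
    q = 1.0 - p
    w_hom, w_het = 1.0 + s, 1.0 + h * s
    mean_fitness = p * p * w_hom + 2.0 * p * q * w_het + q * q
    return (p * p * w_hom + p * q * w_het) / mean_fitness


def frequency_after(p0, s, h, generations):
    """Iterate the recursion for a fixed number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s, h)
    return p


for label, h in (("favored phenotype dominant", 1.0),
                 ("favored phenotype recessive", 0.0)):
    p_end = frequency_after(0.5, 0.05, h, 200)
    print(f"{label}: frequency after 200 generations = {p_end:.4f}")
```

When the favored phenotype is dominant, the unfavored allele hides in heterozygotes and is purged very slowly, so the favored allele stalls short of fixation for a long time. When the favored phenotype is recessive, the last stretch to fixation is fast.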

Where does this leave us? I honestly don’t think we can make a general conclusion about the nature of selection around this variation. To me it looks as if it was functionally constrained in Africa. Some African populations have the derived variant, but those that do can be explained via recent Eurasian admixture pretty easily (e.g., the LWK sample are Kenyan Bantus who have mixed with Nilotic peoples, who do have Eurasian ancestry. The same for the samples from Gambia or Senegal in relation to Eurasian mixed Fula). But once you leave Africa it looks as if the constraint was removed, and lots of populations have low frequencies of the derived nonsynonymous mutation. The 2006 paper which focused in on the SNP of interest had Oceanian samples, and the derived variant fraction is too high to simply be a matter of Austronesian admixture. Could it be some form of balancing selection outside of Africa? Who knows. It might be neutral in some areas, under positive selection in others, balanced in a few locations, and under constraint in Africa.

But despite the evolutionary enigma of this locus, the phenotypic correlations keep building up. It’s a classical genetics illustration because of its Mendelian character. In terms of morphology I should emphasize that the body odor related information probably relates to the apocrine glands, which are localized in the armpits and genitals, and also are precursors to mammary secretion glands. Someone who understands these sorts of pathways and how they influence development could probably say much more. I’m sure at some point we’ll be able to answer the big evolutionary questions about this locus, and how it relates to human biological variation, but that will probably necessitate a better catalog of its phenotypic consequences.

Addendum: If you have a 23andMe account, here is the link that will show you your genotype (and anyone else on your account): (be logged in ahead of time).

* I flipped the strand, so converted T to A and G to C.

** To be fair, there was some evidence from Tamils in earlier studies, but two South Indian populations in the 1000 Genomes with high sample sizes nails it.

Racially admixed

In the comments below in regards to mixed-race children:

“Designer baby”? How about “genetic experiment”? Like sitting all day long, eating lots of sugar, etc., mixed race offspring are biological experiments we are doing on ourselves that very seldom happened among our ancestors in the environments of evolution. Ideological certainty notwithstanding, we don’t know what most of the biological consequences, good or bad, for mixed-race descendants are going to be.

On the one hand there is some truth in this, insofar as contact between populations which are very genetically divergent was likely to be a rare event in the past. That is because even minimal gene flow between populations can homogenize between-population genetic differences (the usual rule of thumb from population genetic theory is one migrant per generation). Sustained contact between populations genetically and physically different to a magnitude which we would term racial probably required very specific events, such as trans-continental migrations, or the opening of virgin land due to deglaciation and demographic expansion from differing directions.
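The "one migrant per generation" rule of thumb comes from Wright's island model, where equilibrium differentiation is approximately F_ST = 1 / (4Nm + 1) and Nm is the number of migrants per generation. A minimal sketch, assuming that idealized model (real populations violate its assumptions in many ways):

```python
def island_model_fst(migrants_per_generation):
    """Approximate equilibrium F_ST under Wright's island model.

    migrants_per_generation is the product N * m, i.e. the absolute number
    of migrants a population receives each generation.
    """
    return 1.0 / (4.0 * migrants_per_generation + 1.0)


for nm in (0.1, 1.0, 10.0):
    print(f"Nm = {nm:>4}: F_ST ~ {island_model_fst(nm):.3f}")
```

The approximation depends on the number of migrants rather than the migration rate, so a handful of migrants per generation is enough to keep differentiation modest regardless of population size.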

But, there are two caveats, a general and a specific. A general caveat is that those periods of contact were very important. This is trivially obvious when you consider the expansion of a recently African derived population over Eurasia and Oceania ~50,000 years ago, replacing earlier Homo lineages in less than 10,000 years. But there is a second critical point which has started to become evident over the past few years, and that is that it seems likely that the majority of populations extant today are the outcome of admixture events between rather isolated lineages which were genetically and physically distinct. I’ve posted on this before in a broad sense. But for Dr. Calvin Greene, and I suspect the reader below, this is probably the paper of relevance in terms of what they care about: Ancient human genomes suggest three ancestral populations for present-day Europeans. In short most of the world’s “races” are the products of admixture events in the range of 5,000 to 10,000 years ago.

Note: I use the image of Vendela Krisebom because she is the product of ancient and recent admixture.

• Category: Science • Tags: Genetics, Human Genetics 

Yoshiura, Koh-ichiro, et al. “A SNP in the ABCC11 gene is the determinant of human earwax type.” Nature genetics 38.3 (2006): 324-330.

When I was in college a Korean American friend confided to me that his roommate had an issue. He had seen a q-tip in the waste-bin, and what was at the end of it was shocking to him. What my friend was describing was wet earwax (Google it yourself if you want to see it). As this was the first time he was living with a non-Korean he had assumed that everyone’s earwax was dry, like his own. The maps above and to the left show you the frequencies of the allele which has an extremely strong correlation with this trait. In Korea the frequency of dry earwax is close to 100%. Since the expression pattern for dry earwax is recessive, you need two copies of the derived allele, so in any population where the ancestral variant exists in appreciable frequencies you’ll have the wet variant of the trait.
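Because the dry phenotype is recessive, its expected frequency follows from the derived allele frequency. The sketch below assumes Hardy-Weinberg proportions (random mating, no selection), which is an assumption added here for illustration; the function name and the example frequencies are hypothetical and are not taken from the Yoshiura et al. maps.

```python
def earwax_phenotype_freqs(derived_allele_freq: float) -> dict:
    """Expected phenotype frequencies for a recessive dry-earwax allele.

    Assumes Hardy-Weinberg proportions: only homozygotes for the derived
    allele (frequency q*q) show the dry phenotype.
    """
    q = derived_allele_freq
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    dry = q * q
    return {"dry": dry, "wet": 1.0 - dry}


if __name__ == "__main__":
    # Illustrative allele frequencies, not measured values.
    for q in (0.1, 0.5, 0.9, 0.99):
        freqs = earwax_phenotype_freqs(q)
        print(f"q = {q:.2f}: dry = {freqs['dry']:.3f}, wet = {freqs['wet']:.3f}")
```

The point of the arithmetic is that the wet phenotype stays common until the derived allele is close to fixation, since every carrier of at least one ancestral copy is wet.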

This is why in 2006 a Japanese group published research in this area, A SNP in the ABCC11 gene is the determinant of human earwax type. A substantial minority of Japanese happen to have wet earwax. And it turns out that wet earwax has some other associations of interest.


Nakano, Motoi, et al. “A strong association of axillary osmidrosis with the wet earwax type determined by genotyping of the ABCC11 gene.” BMC genetics 10.1 (2009): 42.

Axillary osmidrosis is the scientific term for body odor. These results show a very strong association between the ancestral allele, which results in wet earwax, and strong body odor. Obviously this is a “news you can use” sort of result, so no surprise at seeing this paper: Dependence of Deodorant Usage on ABCC11 Genotype: Scope for Personalized Genetics in Personal Hygiene:

Earwax type and axillary odor are genetically determined by rs17822931, a single-nucleotide polymorphism (SNP) located in the ABCC11 gene. The literature has been concerned with the Mendelian trait of earwax, although axillary odor is also Mendelian. Ethnic diversity in rs17822931 exists, with higher frequency of allele A in east Asians. Influence on deodorant usage has not been investigated. In this work, we present a detailed analysis of the rs17822931 effect on deodorant usage in a large (N~17,000 individuals) population cohort (the Avon Longitudinal Study of Parents and Children (ALSPAC)). We found strong evidence (P=3.7 × 10⁻²⁰) indicating differential deodorant usage according to the rs17822931 genotype. AA homozygotes were almost 5-fold overrepresented in categories of never using deodorant or using it infrequently. However, 77.8% of white European genotypically nonodorous individuals still used deodorant, and 4.7% genotypically odorous individuals did not. We provide evidence of a behavioral effect associated with rs17822931. This effect has a biological basis that can result in a change in the family’s environment if an aerosol deodorant is used. It also indicates potential cost saving to the nonodorous and scope for personalized genetics usage in personal hygiene choices, with consequent reduction of inappropriate chemical exposures for some.

I don’t want to get into the biology of this in too much detail. Suffice it to say that the SNP in ABCC11 has a lot of effects. It looks like there might have been a selection event to drive up its frequency at some point. I’m intrigued by the fact that among European populations it is among Sardinians that the derived allele is least common. If it was selection I’m pretty sure we don’t know the target phenotype. Less body odor is probably simply a nice side effect. Ultimately though this is personal. If you read NPR’s stupid Code Switch blog you will have seen that Study Says Your Race Determines Your Earwax Scent. Actually, obviously no. This gene has a high between-population difference as far as genes go, but the wet and dry phenotypes segregate in many families, and are found in appreciable frequencies in many populations. Both alleles are found in my own immediate family. My wife and I carry both alleles. We’re heterozygotes. My daughter is a homozygote for the ancestral variant. My soon-to-be-born son is a homozygote for the derived variant. Perhaps we’ll be saving on deodorant purchases?
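The way both phenotypes segregate within one family is ordinary Mendelian inheritance. As a minimal sketch, the cross below enumerates the offspring genotypes of two heterozygous parents; the labels "anc" and "der" (ancestral and derived allele) and the function name are placeholders chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return offspring genotype probabilities for a single-locus cross.

    Each parent is a pair of alleles; each parent passes on one allele
    with equal probability.
    """
    counts = Counter(tuple(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    heterozygote = ("anc", "der")
    for genotype, prob in sorted(cross(heterozygote, heterozygote).items()):
        print(genotype, prob)
```

Two heterozygous parents are expected to have homozygous-ancestral, heterozygous, and homozygous-derived children in a 1:2:1 ratio, so a wet-earwax daughter and a dry-earwax son in the same family is unremarkable.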

• Category: Race/Ethnicity, Science • Tags: ABCC11, Body Odor, Earwax, Human Genetics 

A quick post to clarify things. When we talk about human variation and history we’re talking about phenomena which we need to decompose into different levels of analysis, because there are major differences in terms of methodology and questions we’re asking. Too often public presentation tends to melt them together and confuse separate strands.

First, there is the level of phylogeny. The question here relates to the history of human genes and populations. With the rise of genomic technologies it is trivially easy to identify human clusters, and not that much harder to infer historical-demographic events. This is important, because due to Richard C. Lewontin’s popularization of the correct point that most human genetic variation is partitioned within, not across, populations, the public is under the impression that population structure is trivial. It’s not. Some extant modern human populations may have diverged from the rest of our species as early as ~100,000 years ago (the Khoisan peoples of southern Africa). Gene flow between even Eurasian populations can be rather attenuated over the course of 50,000 years (e.g., Northeast Asians seem to have had minimal gene flow from other Eurasian populations over the past ~30,000 years).

Second, there is the question of variation in phenotype. Do biological traits vary by population? Yes. Is this variation heritable? Some of it is. This is where an understanding of phylogeny is important, because it turns out that some phenotype distributions are strongly divergent from phylogeny. For example, as early as 2007 it was clear that human pigmentation had been subject to regional selection events which decoupled ancestry from variation on those traits. Melanesians who have been separate from Africans as long (or longer in the case of West Eurasians due to admixture) as any other human population exhibit phenotypic and genotypic similarities to Africans, likely because of functional constraint at low latitudes. But this is not always the case. The peoples of Southeast Asia tend to be rather light skinned for their environmental conditions. But now we know that they are likely the product of recent migration due to human phylogenomic and archaeological results (i.e., in Southeast Asia for <4,000 years).

Third, there is the question of between group differences in complex traits. This is where the first point, the triviality of between group differences as asserted in Lewontin’s Fallacy, comes into play. If between group differences are not viable phylogenetically, i.e., human genetic history is not structured by divergences and fusions which are evident in the genome, then the whole argument collapses. But I think points one and two illustrate that the move into question three is not logically unsound, but must be grappled with one empirical trait at a time. Complex traits are difficult to tackle, and we have to be cautious and not let prior conceptions cloud our judgment, but here’s a paper from a few years ago which illustrates where we might be going, Evidence of widespread selection on standing variation in Europe at height-associated SNPs:

By studying height, a classic polygenic trait, we demonstrate the first human signature of widespread selection on standing variation. We show that frequencies of alleles associated with increased height, both at known loci and genome wide, are systematically elevated in Northern Europeans compared with Southern Europeans…This pattern mirrors intra-European height differences and is not confounded by ancestry or other ascertainment biases. The systematic frequency differences are consistent with the presence of widespread weak selection (selection coefficients ~10⁻³–10⁻⁵ per allele) rather than genetic drift alone….

• Category: Science • Tags: Human Genetics, Human Variation, Variation 

Of late people have been leaving off-topic comments early on in threads. I don’t understand why this is happening, as I always post (or try to) an “Open Thread” every Sunday. I don’t post enough at this point where this isn’t usually on the front page, or near it. Please make use of it! From now on I’m going to just not publish off-topic comments because it seems a little rude that people don’t post them in “Open Thread”. I see the beginning of all comments as I have to approve them manually right now, so there’s no reason to hijack another thread. It just annoys me, and probably makes me less likely to actually respond.

A lot of these off-topic comments lately have been about Nick Wade’s new book, A Troublesome Inheritance: Genes, Race and Human History. The reason I haven’t reviewed it is that I haven’t read it, and the reason I haven’t read it is that I don’t have the time. It is easy for me to read an article or paper, and then put up a quick response. Or perhaps one of my own analyses of data sets I have lying around. To read a book and then review it takes a lot more time. Second, there have already been a lot of reactions on the book, so I don’t see what I would have to add. Nick actually told me around four years ago that he was thinking about writing this book, so its appearance did not surprise me at all, though the mainstream reaction seems more muted than I would have guessed.

Some general points though. First, the modern American consensus that race is a social construct is true but trivial. It’s true because a de facto race such as “Latinos/Hispanics” were created in the 1960s by the American government and elite for purposes of implementing public policies such as affirmative action. Obviously this is a classic case of a social construct, as the quasi-racial category is based upon social, not biological, factors (Latinos/Hispanic can explicitly be of any race, though implicitly it’s transformed into a non-white class in the United States). A group like “black Americans” ranges from people with considerably less than 50% African ancestry to more than 90% African ancestry (though almost always black Americans who are not immigrants from Africa or first generation offspring of those immigrants have some segments of European ancestry). The problem is that people move from this non-controversial point, that some racial categories are social constructs, to the assertion that all racial categories are social constructs, and that phylogenetic clustering of human populations is irrelevant or impossible. It is not irrelevant, or impossible. Human populations vary, and that variation matters. Human populations have specific historical backgrounds, and phylogenetics can capture that history through methods of inference.

Moving from phylogenetics to population genetics, there is the question about whether population-genetic dynamics such as migration, drift, mutation, and selection have resulted in significant variation across human populations. Yes, they have. Human populations have significant functional differences which track regional adaptation, and also correlate to an extent with racial clusters, and phylogenetic history. The details here are empirical, and you need to take into account what we’re learning about human demographic history to make sense of how and when adaptation occurred. This is where the controversial aspects of Wade’s book come in, because he argues that there are behavioral differences across populations due to distinctive evolutionary histories. Complex traits like behavior are often subject to numerous upstream causal variations, so untying the knot is not easy.

But I don’t think it’s impossible, and I suspect there are indeed behavioral differences between populations due to genetic differences between populations. The problem is that we haven’t really done enough research in this area to talk about the genetics of it in anything more than a speculative fashion, and complex traits which are less controversial and more tractable than behavior or cognition, such as height, have already presented difficulties for researchers despite extensive devotion of resources. But the truth of the matter in this area will come out at some point. As it is right now, it does indeed seem that the small differences in height between Northern and Southern Europeans are due at least in part to differences in frequencies of alleles which are known to influence height.

Citation: Towards a new history and geography of human genes informed by ancient DNA, Joseph Pickrell, David Reich, doi: 10.1101/003517

And when the Lord thy God shall deliver them before thee; thou shalt smite them, and utterly destroy them; thou shalt make no covenant with them, nor show mercy unto them.

Neither shalt thou make marriages with them; thy daughter thou shalt not give unto his son, nor his daughter shalt thou take unto thy son.

– Deuteronomy 7:2, 7:3

In my post below I did not elaborate in detail my personal model for how the genetic variation we see around us in modern humans came about. Part of the reason is that what I have in mind is not necessarily parsimonious. Rather, it’s a conception that has developed organically over the years reading papers, looking at the data on humans myself, and finally feedback from people in the comments of this weblog. One thing that Joe Pickrell and David Reich have shown with simulations though is that elegant parsimonious models can fit data with low enough granularity. To be concrete the “serial founder bottleneck” framework wasn’t constructed out of thin air. Rather, it actually was rooted in empirical patterns which were evident in the genotypic data. But all models are approximations and miss details. For example, Sohini Ramachandran’s 2005 paper was anchored around Addis Ababa. But this was even at that time obviously a simplification, as the authors surely knew that there was plenty of circumstantial evidence that Ethiopia has been subject to recent admixture, so that particular population could not be the source of the Out of Africa expansion (see Pagani et al. for confirmation with dense marker data sets).

As another example, consider India. Between Western and Eastern Eurasia, this region is inhabited by populations which are at approximately symmetrical genetic distance in either direction, with perhaps a mild bias toward Western Eurasians (this may be an artifact of sampling too many high caste populations though). Taken at face value it is a perfect illustration of the maxim that geography predicts genetic variation. But what Moorjani et al. most recently have confirmed is that raw genetic distance measures which summarize allelic differences on the population wide level missed patterns of recent admixture evident on the genomic scale. That is, when you look across segments of the genome of South Eurasians it is obvious that they are the outcome of an admixture event between a West Eurasian population and another group which has closer affinities to Eastern Eurasians (and closest to Andaman Islanders). When you average these ancestral components out it does place South Asians between the two antipodes of Eurasia, but that elides an essential component of historical information about how that average came about.

Cann, Rebecca L. “DNA and human origins.” Annual Review of Anthropology (1988): 127-143.

To outline more fully what I have in mind it is essential to reiterate my understanding of the default model which coalesced about 10 years ago, though its origins are deeper. The modern orthodoxy was solidified by mtDNA Eve. What these results from the 1980s showed is that the phylogenetic tree of the mitochondrial lineage of humans is rooted in a diversification from an African population, with a series of successive branchings-off of anatomically modern groups which swept into Western Eurasia, Eastern Eurasia, Australasia and the New World. In the 1990s microsatellites and other autosomal markers confirmed the broad outlines of this narrative. So somewhere on the order of ~50,000-100,000 years ago something happened, and the anatomically modern human beings which had been resident within Africa for ~100,000 years left. After they left they swept aside all the other archaic hominins which were long resident across Eurasia, and charted new territory beyond in Australasia and the New World. As the branches of this human tree took root in particular localities they established the broad patterns of genetic variation we see around us today, with the obvious exceptions due to European colonialism. That is, you could take the phylogenetic tree and transpose it after a fashion on a geographic map, because the general framework of human variation was established ~15,000 years ago, after the Last Glacial Maximum. Here is a 2003 paper from The American Journal of Human Genetics, The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. In the abstract the authors state: “Taken together, these results show that Indian tribal and caste populations derive largely from the same genetic heritage of Pleistocene southern and western Asians and have received limited gene flow from external regions since the Holocene.” This last section seems likely to be wrong.
Around half of the genetic heritage of South Asians is probably derived from a component whose ancestors were not within the confines of the Indian subcontinent before the Holocene.* Similar issues crop up in Europe. Again, from The American Journal of Human Genetics, 2000, Tracing European Founder Lineages in the Near Eastern mtDNA Pool. The last sentence of the abstract states that “the immigrant Neolithic component is likely to comprise less than one-quarter of the mtDNA pool of modern Europeans.” Though the authors were focusing on mtDNA, it led to a spate of articles which presumed that Europeans are predominantly descended from the Pleistocene populations of the continent. The current consensus is more muddled, though the most recent work suggests a greater role for farmers, and an exogenous non-European element.

Credit: Kwamikagami, Indigenous Australian language families

As Pickrell and Reich emphasize there are simply too many exceptions to the old model of diversification Out of Africa and subsequent stasis and gene flow via isolation by distance dynamics. It is fashionable to say that human genetic variation is clinal, and it is in a rough fashion. But this description does I think mislead us as to how these clines came about. The old model had a simple elegant attraction, as it consisted of a rapidly bifurcating phylogenetic tree, which was modified on the edges by subsequent gene flow in the new equilibrium environment. Where did it go wrong? First, I think the revision begins before the Holocene. The Holocene is not particularly special, there have been Interglacials before. What is different is that modern humans developed agriculture, and then what we call civilization. But not everywhere. In Australia the native populations remained hunter-gatherers up to the European discovery. But it is not feasible to assume that they were subject to cultural or genetic stasis. The expansion of the Pama–Nyungan languages likely occurred within the last 4,000 years. This shows that hunter-gatherer populations can enter into phases of rapid cultural change and expansion. Does this reflect a genetic change across Australia? We don’t have extensive samples to test this hypothesis to my knowledge, though surely it will be checked within the next 10 years. I would predict that this change did perturb the genetic landscape across Australia. And it seems implausible that Australia was exceptional. When the first dense marker analyses or whole genomes come back from Cro-Magnons I predict they will be radically different from the hunter-gatherers which were resident on the continent during the early Holocene. At least different enough to warrant the hypothesis of Pleistocene population turnover. I could be wrong, but the conjecture is not crazy.

Obviously agriculture changed things, by turbo-charging possible winner-take-all dynamics. We have historical cases of this. French Canadians and Slavs are both cases of populations which were once relatively modest and began in a narrowly delimited region, but now are quite expansive and numerous. In the case of the East Slavs the demographic expansion also entailed the absorption of numerous Uralic tribes, as well as later Turks. And this illustrates one of the major details which I think has characterized the genetic turnover of human populations: phase shifts from a relatively static state defined by isolation-by-distance gene flow across clinal gradients, to a rapid expansion of a small subset, and the overlay of this component as a palimpsest over the underlying variation. In some cases the replacement is nearly total, as in the modern United States. In other cases, as among Great Russians, the Slavic affinities of this population, and its association with Poles and other groups are clear, but there was a non-trivial uptake of exogenous segments which might allow for a reconstruction of the prior genetic landscape. These changes occur over short periods, and are bright fireworks against the comparatively static firmament. Explosions in the dark, which rapidly re-equilibrates to a stationary state.

In a way what I am describing is Out of Africa writ large. For decades scholars have been perplexed by the rapid expansion of anatomically modern humans out of Africa over a short period, and their subsequent sweep across the Old World and into Australia. But perhaps this is not sui generis, but the typical pattern of humans. For example the replacement of the Melanesian-like population of Southeast Asia over the last 4,000 years by people from the valleys and coastal zones of southern China could be said to resemble an Out of Africa event, perturbing a genetic landscape that had been relatively static for on the order of 10,000 years. And if humans, why not other medium sized mammals? The most recent results from dog genomics are intriguing, and the perplexing aspects are awfully familiar to me.

* I do not believe that the “Ancestral North Indians” were resident within the confines of the Indian subcontinent before the Holocene.

• Category: Science • Tags: Human Genetics 

Citation: Wilde et al.

Credit: Igor Kruglenko

A new paper in PNAS, Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 years, uses ancient DNA to examine the possibility of very recent natural selection in Europeans. In particular, it focuses on eastern Europeans, and roughly a region coterminous with Ukraine ~6000 to ~4000 years ago. The sample seems somewhat biased toward the low end of the age range if you look at the supplemental tables. In the paper itself (which is open access) I don’t see a map to get a sense of the distribution of the sites from which the DNA was extracted. So I took the supplemental table and used the latitude and longitude information, as well as the samples from each site, and produced a density map with a bubble plot overlain upon it with specific locations (size of bubble proportional to number of samples at site). Like the earlier ancient DNA from a few European hunter-gatherers one must keep in mind the limitations of the scope of sampling so few to infer about so many. Though the number here is far larger (N >20 or >40 depending on the SNP), the set of markers examined was much smaller, a few pigmentation loci and mtDNA. Nevertheless this is not a trivial geographic example, nor is the time frame, from the Early Eneolithic down to the Bronze Age.

Figure S1

The clearest illustration of the topline result is found in the supplements (I prefer figures to tables obviously). What you see here is that there is a large difference in allele frequencies between ancient samples and modern ones from the equivalent geographic region at specific markers diagnostic for variation in pigmentation in modern Europeans. HERC2 is well known for being one of the two loci which span a long haplotype strongly correlated with blue eyes in Europeans. SLC45A2 and TYR are part of the standard suite of pigmentation genes which show up as variable across Eurasia. I am confused as to why they did not focus on SLC24A5, a locus which is nearly totally fixed in modern Europeans for the A allele, but may not have been so in hunter-gatherers. But in any case the result is rather clear: the ancient populations sampled here are statistically differentiated from modern populations in the same region, and seem to have been darkly complected in comparison. The natural inference then is that powerful sweeps of natural selection increased the allele frequencies of lightening alleles in Europeans within the last ~4,000-6,000 years. This is not a crazy proposition; tests for recent natural selection in Europeans are often enriched around pigmentation loci, which are genomically atypical (long homogeneous blocks are common). What this study does is intersect inferences from modern variation with the distribution of variants in an ancient population presumed to be ancestral.

The problem of course is whether these are truly ancestral. But recall I stated earlier that they had mtDNA. This is copious, and so rather easy (comparatively!) to get from ancient DNA. Comparing their samples with modern ones from the region they find there isn’t great discontinuity. Using a model of genetic drift they support the scenario of continuity, and that the Fst of ~0.005 is what you would expect for a set of populations ~4,000-6,000 years in the past. To put this in perspective this is about the Fst using autosomal SNPs between Russians and French, or Palestinians and Greeks. Considering the time depth separating these putative populations I think even without their coalescent simulation models I can accept continuity of mtDNA intuitively. Of course the key is to not forget this is mtDNA, only the maternal lineage. If you looked at the mtDNA of modern South Asians you’d see they’re mostly not West Eurasian. But if you looked at their Y chromosomes they’d be mostly West Eurasian. The autosomal DNA gives a half & half picture. The issue of sex mediated gene flow is made even more stark in the case of Latin America.
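For a sense of scale on an Fst of ~0.005, here is a minimal sketch of the textbook heterozygosity-based definition, Fst = (H_T − H_S) / H_T, for a single biallelic locus in two equally weighted populations. This is not the estimator or the drift model used in the paper, which worked with mtDNA and coalescent simulations; the function name and the allele frequencies are hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Fst for one biallelic locus and two equally weighted populations.

    H_T is the expected heterozygosity of the pooled population and
    H_S is the mean expected heterozygosity within each population.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # locus is monomorphic in both populations
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Illustrative frequencies only: small Fst values correspond to
    # modest allele frequency differences between populations.
    for p1, p2 in ((0.50, 0.50), (0.465, 0.535), (0.45, 0.55), (0.0, 1.0)):
        print(f"p1 = {p1}, p2 = {p2}: Fst = {fst_two_populations(p1, p2):.4f}")
```

With frequencies centered on 0.5, a difference of roughly seven percentage points between the two populations is enough to give an Fst near 0.005 under this definition.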

A model like this is made more plausible by the fact that many of these individuals were of the Yamna culture, Kurgans. The thesis forwarded by some scholars is that it is these Kurgans, a patriarchal nomadic society, who brought Indo-European languages to central and western Europe ~5,000 years ago (their eastern cousins becoming Tocharians and Indo-Iranians, their southern ones Hittites and perhaps Armenians). Probably the best recent outline of this thesis is by David Anthony in The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World. I found it so engrossing that I finished it in one sitting in 2008. If these data are correct the Kurgans did not look like blonde Aryan Übermensch, rather, they became that (though to be fair, in this case we are talking about them becoming Slavs, who the Nazis labelled Untermensch). But one of the general assumptions about Kurgans is that they were groups of mobile males. In that case one wouldn’t be surprised if their mtDNA tended to reflect subject peoples, while the whole genome was more mixed and cosmopolitan, reflecting their migrations.

So the crux then is whether to trust this mtDNA evidence as representative of the whole genome. If I simply had the mtDNA, along with the information about provenance in terms of time and place, I’d probably accept the argument for continuity. But the phenotypic markers are so different, either there’s been population replacement, or, we’ve had a lot of in situ selection. Replacement seems like the more boring hypothesis, especially in light of the fact that many of the sites sampled were not in classically Slav zones of habitation, but were occupied by Iranian or Uralic peoples, or more recently Turks. Though the researchers are using contemporary East Slavs to compare to the ancient samples, across many of these sites Slavs only become dominant in the area with the rollback of the Ottomans in the 18th century.

Ultimately I’m very unsure that the assumption of genetic continuity in this case will hold, so let’s simply take that as a given for now. Then what? You have lots of selection. The question naturally moves to why. What drove the selection? In the discussion the authors go over many of the hypotheses rather thoroughly. Roughly they fall into two classes, the ecological/environmental and the social/sexual. The former generally has to do with a combination of a switch to agriculture and the need to synthesize vitamin D due to the shift away from fish in the far north. The latter focuses on sexual selection, and favoring particular markers due to heightened paternity certainty. In particular the sexual selection hypothesis would seem to be able to explain the rise of HERC2, which is associated with light eyes, as that may be a favored trait. The immediate rejoinder is provided in the text: many of the pigmentation loci have pleiotropic effects. In other words, they tune overall pigmentation, skin, hair, and eyes, though perhaps to different extents. So if the selection was environmental due to skin it would not be totally surprising if hair and eyes changed as a side effect. Of course, as suggested in the comments here one need not posit that there was one singular selection event, as opposed to a sequential composite. Perhaps it was both environmental and sexual selection?

This again is another area where I’ll throw my hands up in the air. If selection is the answer, and not population replacement, then it’s very strong. It seems that these loci were subject to sweeps in the same range of power as that around LCT, for lactase persistence, the Tibetan high altitude adaptations, as well as the various malaria resistance alleles (which have different selective dynamics, some of them balancing). One can actually still detect differential fitness at high altitudes based on phenotype, and the same with malaria, at least before modern medicine. The problem I have is that I’m just not aware of studies on the extent of differential fitness in human populations due to sexual selection. In theory sexual selection is very powerful, especially in contexts of hyper-polygyny, but to have it be realized in humans would require very particular social structures. The environmental selection arguments by their nature tend to be simpler, and therefore more attractive. But we’ve reached a point where there’s a lot of confusing stuff coming out of ancient DNA, and we need to go back to first principles, and reexamine everything. This includes sexual selection, as more than simply a deus ex machina to throw out there when we don’t have a better model on hand. That necessitates a serious examination of patterns of variance in reproductive output by phenotype, and plugging these back into models of selective sweeps.

Citation: Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 years

Note: Yulia Tymoshenko has very dark eyes. So I assume she’s not a natural blonde.


A phylogenetic tree is an essential tool in understanding the broad scope of natural history, placing particular lineages in specific evolutionary contexts of relatedness. These sorts of trees range from Ernst Haeckel’s classical attempt, depicting relationships which biologists derived from intuition within the framework of a grand evolutionary scheme, all the way down to modern methods implemented in software packages such as MrBayes, which many frankly utilize in a “turnkey” manner. These trees are abstractions, in that they reduce down a wide range of phenomena into schematic representations which impart aspects of particular interest in a stylized form. This is important, because the actual nature of the phenomena being represented may be more complex than is being represented. A simple illustration of what I’m getting at is clear when you look at the long history of phylogenetics and phylogeography utilizing mitochondrial DNA lineages (mtDNA). Because mtDNA is copious in comparison to nuclear DNA, it is easy to obtain. And, as there is no recombination and it is inherited in a haploid fashion (mother to daughter) it makes the inference of gene trees much easier. The key problem is that the genealogy of this particular sequence is used to infer aspects about population history, when it may not accurately represent the history of other regions of the genome very well. Different genes may have different histories.

These issues of conflating the history of genes with the history of populations move further into the foreground the less genetic distance separates the populations you are comparing. Phylogenetic analysis involving distinct species has its own problems, but they are dwarfed by what must confront those who attempt to parse out relatedness of populations within species. Because of the ubiquity of gene flow across populations within species, any attempt to generate a tree of relationships of populations is bound to be a gross simplification. Instead of a sequence of bifurcations the true relationship of putative populations is more accurately represented by a networked graph.

Jumping from the theoretical to the concrete one of the major issues in regards to constructing a sequence of events of the human past which can be used to inform the human present is that a graph relationship is very complex and difficult to tease apart when the tips of your tree are extant populations which are highly admixed. When you try and reconstruct the past from the present, a necessity in phylogenetic analysis which utilizes genetic data (obviously the issues are different if you are focusing on paleontological information), you necessarily gain a blinkered perspective.*

All this came to a head for me when I read the post The First of the Mohicans, which cited a preprint I’d skimmed over earlier in the year, Efficient moment-based inference of admixture parameters and sources of gene flow. It is by its nature a technical paper, but within it is lodged some genuine dynamite. Let me quote:

Our interpretation is that most if not all modern Europeans are descended from at least one large-scale ancient admixture event involving, in some combination, at least one population of Mesolithic European hunter-gatherers; Neolithic farmers, originally from the Near East; and/or other migrants from northern or Central Asia. Either the first or second of these could be related to the “ancient western Eurasian” branch in Figure 5, and either the first or third could be related to the “ancient northern Eurasian” branch. Present-day Europeans differ in the amount of drift they have experienced since the admixture and in the proportions of the ancestry components they have inherited, but their overall profiles are similar.

The result here is outlined graphically in the preprint:


What you see above are two varieties of abstractions which attempt to reconstruct phylogenetic relatedness, and implicitly historical change over evolutionary lineages. To the left is a classical tree, where all the terminal nodes (contemporary populations) are the outcomes of bifurcation events. To the right you have an attempt to produce a more informative representation of the relatedness by drawing out likely admixture events. Here’s the major result: modern Europeans seem to be the products of a major admixture event between a population with roots in northern Eurasia, and another with roots in western Eurasia. At the current rate it seems likely that most major world populations are the result of mingling between very distinct populations (to varying extents). In fact, I’m rather certain these sorts of inferences underestimate the extent of admixture, rather than overestimate it. By their nature the methods elide complexity.

The ubiquity of this admixture leaves me a bit chagrined, because with the rise of genome-wide data in the mid aughts I’ve been reading papers which produce neat trees and elegant admixture bar plots, all the while unable to confront the reality that the abstractions before me were not reflecting what truly transpired over the past ~10,000 years. A world where modern human expansion resulted in isolation of several major lineages from each other by the end of the last Ice Age down to the present never existed. A world where these major lineages were connected by continuous isolation-by-distance dynamics is very misleading. Here is what I think is more accurate: a world where the “tips” of the phylogenetic tree are pruned repeatedly, and populations which are the outcomes of admixture events expand rapidly to fill the emptying space. Both “ancient North Eurasians” and the “ancient South Eurasians” do not seem to exist in unadmixed form, perhaps with the exception of Andaman Islanders, and some populations in the far north of Siberia. This begs the question, do any populations exist in an “unadmixed form”? What does that even mean? The paper I mention above actually does answer the question in a somewhat precise manner. Populations such as the Japanese are useful in forming an unadmixed scaffold after populations identified as admixed are removed using f3 statistics (see Ancient Admixture in Human History). But this is not the last word on whether the Japanese are admixed or not, though it suffices for the purposes of the questions being asked in the paper.
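For readers who have not run into f3 statistics, the logic of the admixture test can be sketched in a few lines. This is a naive version that works directly on population allele frequencies; the published statistic adds a finite-sample correction and gets its significance from a block jackknife, both of which are omitted here. The function name and the toy frequencies are made up for illustration.

```python
def f3(target, source_a, source_b):
    """Naive f3(target; source_a, source_b): the mean over SNPs of
    (c - a) * (c - b), where a, b, c are allele frequencies in the two
    candidate sources and the target. A clearly negative value is the
    signature of the target being a mixture related to both sources."""
    if not target or not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("need three non-empty frequency lists of equal length")
    products = [(c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)


if __name__ == "__main__":
    # Toy allele frequencies at four SNPs (hypothetical values).
    pop_a = [0.10, 0.80, 0.30, 0.60]
    pop_b = [0.70, 0.20, 0.90, 0.10]
    # A target built as a 50/50 mixture of the two sources.
    mixed = [0.5 * a + 0.5 * b for a, b in zip(pop_a, pop_b)]
    print("f3(mixed; A, B) =", f3(mixed, pop_a, pop_b))
```

For an exact 50/50 mixture each term is -0.25 * (a - b)^2, so the statistic cannot be positive. In real data drift in the target since the admixture pushes the value back up, which is why a non-negative f3 does not prove a population is unadmixed. That is the sense in which the Japanese can serve as scaffold without that being the last word.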

Where does this leave us? Let’s go back to Europeans. The authors of Efficient moment-based inference of admixture parameters and sources of gene flow assert that pretty much all Europeans exhibit evidence of massive admixture between very distinct lineages. To me this is highly suggestive of events which have roots prior to the Neolithic Revolution. In other words admixture between west and north Eurasian lineages may have occurred in Europe at the end of the last Ice Age, as the continent was being resettled by hunters from the east and south. Later, Neolithic farmers from the Middle East related to the west Eurasian population in Europe during the Pleistocene added a subsequent layer of west Eurasian ancestry, and to a great extent replaced or absorbed the admixed hunter-gatherers. Finally, it seems now entirely possible that a further wave of migrants from Central Asia, who were also an admixed population, erupted into Europe and replaced or absorbed many of the descendants of the Neolithic farmers.

What we’re confronted by is intellectual rubble, with bombs dropped all over the landscape. The world is turned upside down. We’ll rebuild, but it’s going to take time. The past was a strange land, far stranger than we’d thought. In science you go for the boring answers as a null, but in this case the boring answers are turning out to be wrong.

* Ancient DNA analysis is changing this somewhat.

• Category: Science • Tags: Human Genetics 

In my write up on variation in inheritance patterns for Slate last week I did not explore the likely quantitative distribution in any detail (frankly, I think that part is confused or muddled at best). My primary focus though was on the empirical reality of variation, which people utilizing personal genomic services will receive, perhaps to their surprise. But in part triggered by that Slate piece and follow-up discussions at Twitter with Michael Eisen, Graham Coop decided to crunch the numbers. More concretely he took the known patterns of recombination in the human genome (from a paper he co-authored, Broad-Scale Recombination Patterns Underlying Proper Disjunction in Humans), and input these values into a simulation which generated distributions of contribution from maternal and paternal grandparents, How much of your genome do you inherit from a particular grandparent?

As Coop observes, one of the most surprising things is the very long tail of distributions for paternal grandparents. This is due to the lower recombination rates in males. Remember that recombination tends to reduce the variance in transmission from the grandparental generation, so reduced recombination increases the variance. Therefore, you see that 1 in 200 sperm are skewed such that 20% or less of genetic material in the sperm is from one grandparent. Because you have to divide this by half in the fertilized zygote, what this means is that in 1 in 200 individuals you have a case where 10% or less of their genome is from one paternal grandparent, and 40% or more from the other! The histogram for females, as you can see, is much less dispersed, though the variation there is not trivial either. I suspect that the scientists at 23andMe almost certainly know the empirical distribution, as they likely have many pedigrees to compare. It would be nice if they shared that with us.
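The link between recombination rate and variance is easy to see in a toy simulation. The sketch below is not Coop's method, which used empirical recombination maps. It assumes crossovers fall uniformly along each autosome with a Poisson count proportional to chromosome length, ignores interference and the obligate crossover, and uses round illustrative figures (about 26 crossovers per male meiosis and 42 per female meiosis, plus rounded chromosome lengths). All names are hypothetical.

```python
import math
import random
import statistics

# Rough autosome lengths in megabases (rounded, illustrative values).
CHROM_MB = [248, 242, 198, 190, 182, 171, 159, 145, 138, 134, 135,
            133, 114, 107, 102, 90, 83, 80, 59, 64, 47, 51]
TOTAL_MB = sum(CHROM_MB)


def poisson(mean, rng):
    """Draw a Poisson variate with Knuth's method (fine for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def gamete_fraction(crossovers_per_meiosis, rng):
    """Fraction of one gamete's autosomal DNA that traces to one grandparent."""
    inherited = 0.0
    for length in CHROM_MB:
        n = poisson(crossovers_per_meiosis * length / TOTAL_MB, rng)
        breakpoints = sorted(rng.uniform(0, length) for _ in range(n))
        # Start on a random grandparental copy and switch at every crossover.
        from_grandparent_a = rng.random() < 0.5
        previous = 0.0
        for point in breakpoints + [length]:
            if from_grandparent_a:
                inherited += point - previous
            from_grandparent_a = not from_grandparent_a
            previous = point
    return inherited / TOTAL_MB


def simulate(crossovers_per_meiosis, n_gametes, seed=0):
    rng = random.Random(seed)
    return [gamete_fraction(crossovers_per_meiosis, rng) for _ in range(n_gametes)]


if __name__ == "__main__":
    for label, crossovers in (("paternal", 26), ("maternal", 42)):
        # Halve the gamete fraction to get the share of the child's genome.
        shares = [f / 2 for f in simulate(crossovers, 10000)]
        low = sum(1 for x in shares if x <= 0.10) / len(shares)
        print(f"{label}: mean={statistics.mean(shares):.3f} "
              f"sd={statistics.stdev(shares):.3f} share<=10%: {low:.4f}")
```

Even this crude model gives a wider spread on the paternal side, simply because fewer crossovers mean bigger inherited blocks. It should not be expected to reproduce Coop's tail probabilities, since real crossovers are not uniformly placed.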

• Category: Science • Tags: Genetics, Human Genetics 

Likely an individual with derived allele on KITL locus (Credit: David Shankbone)

An individual polymorphic on the KITL locus? (Credit: David Shankbone)

Pigmentation is one of the few complex traits in the post-genomic era which has been amenable to nearly total characterization. The reason for this is clear in hindsight. As far back as the 1950s (see The Genetics of Human Populations) there were inferences made using human pedigrees which suggested that normal human variation on this trait was controlled by fewer than ten genes of large effect. In other words, it was a polygenic character, but not highly so. This means that the alleles which control the variation are going to have reasonably large response, and be well within the power of statistical genetic techniques to capture their effect.

I should be careful about being flip on this issue. As recently as the mid aughts (see Mutants) the details of this trait were not entirely understood. Today the nature of inheritance in various populations is well understood, and a substantial proportion of the evolutionary history is also known with reasonable clarity as far as these things go. The 50,000 foot perspective is this: we lost our fur millions of years ago, and developed dark skin, and many of us lost our pigmentation after we left Africa ~50,000 years ago (in fact, it seems likely that hominins in the northern latitudes were always diverse in their pigmentation).

A new paper in Cell sheds some further light on the fine-grained details which might be the outcome of this process. Being a Cell paper there is a lot of neat molecular technique to elucidate the mechanistic pathways. But I will gloss over that, because it is neither my forte nor my focus. A summary of the paper is that it shows that p53, a relatively well known tumor suppressor gene, seems to have an interaction with a response element (the gene product binds in many regions, it is a transcription factor) around the KITLG locus. This locus is well known in part because it has been implicated in pigment variation in humans and fish. So KITLG is one of the generalized pigmentation pathways which spans metazoans. There are derived variants in both Europeans and East Asians which are correlated with lighter skin, though there is polymorphism in both cases (it has not swept to fixation).

The wages of adaptation? (Credit: Hoggarazzi Photography)

But this is a Cell paper, so there has to be a more concrete and practical angle than just evolution. And there is. It turns out that a single nucleotide polymorphism in the p53 response element results in a tendency toward upregulation of KITLG and male germ line proliferation. The latter matters when it comes to tumorigenesis, and in particular testicular cancer. This form of cancer is one where there doesn’t seem to be a somatic cell mutation of p53 itself. Additionally, the authors observe that testicular cancer manifests at a 4-5 fold greater rate in people of European descent than in African Americans. And, presumably the upregulation of KITLG is somehow related to increased melanin production. The authors posit that because of lighter skin in Europeans due to selection at other loci there has been a balancing effect at KITLG (increased tanning response). There is evidence of selection at this locus (a long haplotype and increased homozygosity), so this is not an unreasonable conjecture, though the high frequency of loss of function alleles suggests that the model is likely complex.

I don’t know if this particular story is correct in its details (though I am intrigued that variation in KITLG is associated with cancer in other organisms). But it illustrates one of the possible consequences of rapid evolutionary change due to human migration out of Africa: deleterious side effects because of pleiotropy. In other words, as you tinker with the genomic architecture of a population you are going to have to accept tradeoffs as you are optimizing one aspect of function. Genes don’t have just one consequence, but are embedded in myriad pathways. Over time evolutionary theory predicts a slow re-balancing, as modifier genes arise to mask the deleterious side effects. But until then, we will bear the burdens of adaptation as best as we can.

Citation: Zeron-Medina, Jorge, et al. “A Polymorphic p53 Response Element in KIT Ligand Influences Cancer Risk and Has Undergone Natural Selection.” Cell 155.2 (2013): 410-422.



Credit: Aviok

“Think not that I am come to send peace on earth: I came not to send peace, but a sword.” -Matthew 10:34

“There were giants in the earth in those days…when the sons of God came in unto the daughters of men, and they bare children to them, the same became mighty men which were of old, men of renown.” -Genesis 6:4

Seven years ago I wrote a short post, Why patriarchy?, which attempted to present a concise explanation for the ubiquity of what we might term patriarchy in complex societies (i.e., not “small-scale societies”). Broadly speaking my conjecture is that social and political dominance of small groups of males (proportionally) over the past several thousand years is an example of “evoked culture”. The higher population densities in agricultural societies produced a relative surfeit of accessible marginal surplus, which could be given over to supporting non-peasant classes who specialized in trade, religion, and war, all of which were connected. This new economic and cultural context served to trigger a reorganization of the typical distribution of power relations of human societies because of the responses of the basic cognitive architecture of our species inherited from Paleolithic humans. Agon, or intra-specific competition, has always been part of the game of human socialization. The scaling up and channeling of this instinct in bands of males totally transformed human societies (another dynamic is elaboration of cooperative structures, though this often manifests as agonistic competition between coalitions of humans).

To get a sense of what I mean when I say transforming, consider this section of an article in The Wall Street Journal which profiles the wife of one of the men accused in the 2012 New Delhi gang rape:

Some people blame the December gang rape and similar attacks in part on a collision of traditional social expectations—commonplace in rural areas—and the modernity of India’s cities, where rural migrant workers encounter the values of urbanites living by a different set of rules. During the brutal Delhi assault, for instance, the attackers accosted the woman and the young man she was with, asking why they were out together in the evening, the young man told the court.

Speaking about the events of that night, Ms. Devi says she doesn’t understand how a woman could be out for the evening with a man who wasn’t her husband.

The normalcy of this sort of ‘mate guarding’ is taken for granted in many ‘traditional’ societies. You see it reflected in the 1995 film First Knight, where King Arthur tries Lancelot and Guinevere for treason based on a kiss (dishonor to the realm). I won’t go into excessive psychoanalysis, but end by saying that the emergence of radical inequality and stratification with complex societies transformed instincts shaped in small-scale bands where petty conflicts were no doubt the norm. To my knowledge the literature from small-scale societies tends to imply a relatively more relaxed, even modern, attitude toward sexuality than one can see in the world of the Eurasian Ecumene.

At this point you might be curious as to the point of reviewing this conjecture. Perhaps I will bring to the fore historical and archaeological evidence which might support this model? No. Rather, I contend that the evidence of this radical reshaping of human power structures, which led to the emergence of patriarchy as we understand it, is reflected in the phylogenetic history of our species. Two papers illustrate the differing patterns which one sees in the maternal lineage, mtDNA, and the paternal lineage, Y chromosomes.

First, Y Chromosomes of 40% Chinese Are Descendants of Three Neolithic Super-grandfathers:

Demographic change of human populations is one of the central questions for delving into the past of human beings. To identify major population expansions related to male lineages, we sequenced 78 East Asian Y chromosomes at 3.9 Mbp of the non-recombining region (NRY), discovered >4,000 new SNPs, and identified many new clades. The relative divergence dates can be estimated much more precisely using molecular clock. We found that all the Paleolithic divergences were binary; however, three strong star-like Neolithic expansions at ~6 kya (thousand years ago) (assuming a constant substitution rate of 1e-9/bp/year) indicates that ~40% of modern Chinese are patrilineal descendants of only three super-grandfathers at that time. This observation suggests that the main patrilineal expansion in China occurred in the Neolithic Era and might be related to the development of agriculture.

Second, Analysis of mitochondrial genome diversity identifies new and ancient maternal lineages in Cambodian aborigines:

Cambodia harbours a variety of aboriginal (and presumably ancient) populations that have largely been ignored in studies of genetic diversity. Here we investigate the matrilineal gene pool of 1,054 Cambodians from 14 geographic populations. Using mitochondrial whole-genome sequencing, we identify eight new mitochondrial DNA haplogroups, all of which are either newly defined basal haplogroups or basal sub-branches. Most of the new basal haplogroups have very old coalescence ages, ranging from ~55,000 to ~68,000 years, suggesting that present-day Cambodian aborigines still carry ancient genetic polymorphisms in their maternal lineages, and most of the common Cambodian haplogroups probably originated locally before expanding to the surrounding areas during prehistory. Moreover, we observe a relatively close relationship between Cambodians and populations from the Indian subcontinent, supporting the earliest costal route of migration of modern humans from Africa into mainland Southeast Asia by way of the Indian subcontinent some 60,000 years ago.

The scientific methods here are straightforward, or at least tried and tested. The main gains here are in terms of raw numbers and sequencing. Basically this is the extension of phylogeographic work which goes back 20 years, but on steroids. As such one should be cautious. The old phylogeography literature has turned out to be wrong on many of the details. But that’s OK, there’s still gold there, you just have to look.

The broad scale implication of the paper on Chinese Y chromosomal diversity is obvious. Like the Genghis Khan modal haplotype these are lineages which exhibit a ‘star-like phylogeny.’ They explode out of a common ancestor in short order, with few mutational steps. This explosion is simply a reflection of very rapid population growth. The skewed distribution of Y lineages here (i.e., three lineages representing nearly half the population) indicates to me a pattern where elite males tend to be much more fit in reproductive terms than the average male. Rapid population growth may have been correlated with a high rate of extinction of Y lineages due to “elite turnover.”
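The "few mutational steps" point can be made concrete with the figures quoted in the abstract above (3.9 Mbp of sequence, an assumed constant rate of 1e-9 substitutions per bp per year, expansions at ~6 kya). A minimal sketch, assuming a strict molecular clock; the 60,000-year comparison is an illustrative choice for a Paleolithic-scale split, not a figure from the paper.

```python
def expected_mutations(years, sites, rate_per_site_per_year):
    """Expected number of new mutations accumulated along a single lineage
    under a strict molecular clock."""
    return years * sites * rate_per_site_per_year


if __name__ == "__main__":
    SITES = 3.9e6   # sequenced NRY region, from the abstract
    RATE = 1e-9     # substitutions per bp per year, as assumed in the abstract
    for label, years in (("Neolithic expansion (~6 kya)", 6_000),
                         ("illustrative Paleolithic split", 60_000)):
        count = expected_mutations(years, SITES, RATE)
        print(f"{label}: about {count:.1f} mutations per lineage")
```

At these rates a lineage picks up only a couple of dozen mutations across the sequenced region since the expansion, so men descending from the same Neolithic founder sit on short branches radiating from a single node. That is what the star shape looks like in the data.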


Citation: Zhang, Xiaoming, et al. “Analysis of mitochondrial genome diversity identifies new and ancient maternal lineages in Cambodian aborigines.” Nature Communications 4 (2013).

The second paper looks at mtDNA, the maternal line. There are some specific results which are interesting. In line with Joe Pickrell’s TreeMix results it does look like Cambodians and Indians share deep ancestry dating to the Paleolithic. The PCA to the left shows the relationships of populations in relation to their haplogroups, and one clear finding is that Cambodians tend to cluster with Indians, and not Northeast Asians. This result is not surprising. As I’ve noted before, on mtDNA lineages South Asians are closer to East Eurasians than they are to West Eurasians. The result for the Y chromosomes is inverted, while autosomes are somewhere in the middle. In addition the results above show that South Chinese Han mtDNA tend to occupy the same part of the plot as the Dai, who are related to the Thai people of Southeast Asia. In contrast the few North Chinese Han tend to cluster with Tibetans and Altaics. Could Sinicization have been male mediated? There’s been circumstantial ethnographic evidence which points to this (e.g., some Cantonese marriage practices may reflect assimilation of Dai women).

The big picture result to me is that it illustrates the discordance between migration patterns of males and females over the past 10,000 years due to the rise of agriculture and its offspring, patriarchy. I hold that there was no hunter-gatherer Genghis Khan. Such a reproductively prolific male, worthy of an elephant seal, is only feasible with the cultural and technological accoutrements of civilization. ~20,000 years ago Temujin may have had to be satisfied with being the big man in a small clan. Thanks to various ideological and military advancements by the year 1200 AD you saw the rise to power of a man who could realistically assert that he was a ‘world conqueror.’


Credit: Brocken Inaglory

Of course I do not believe that the world before agriculture was static. On the contrary the Chinese Y chromosomal paper reports an inferred pattern of lineage extinction which is regular and consistent. But civilization escalated the magnitude of genocide, and in particular androcide of the losers in the games of power. The relative continuity of mtDNA across vast swaths of southern Eurasia is a testament to the fact that the lineages of the ‘first women’ still persist down among the settled agricultural peoples, whose genomes have been reshaped by untold sequences of conquests and assimilations. While female mediated gene flow can be imagined to be constant, continuous, and localized, I believe that male mediated gene flow has a more punctuated pattern. It explodes due to cultural and social innovations, such as the horse or Islam, and long standing Y chromosomal variation which has emerged since the last wave of conquerors is wiped away in a single fell swoop. Obviously this has an effect on the total genome, and I suspect that in some cases repeated male mediated expansions have resulted in striking discordances between the autosomal and mtDNA lineages. You see this in Argentina, where Native American mtDNA seems to persist to a higher degree than autosomal ancestry because of male skew of European migration. And it looks to be the case in Cambodia, where non-North East Asian autosomal ancestry seems to be present at a lower fraction than the equivalent mtDNA.

With the rise of ubiquitous genomic typing and sequencing the geographical coverage will be fine grained enough that the broad patterns, and specific details, will become clear. Then we will finally be able to understand if the societies fueled by grain truly ushered in the age of the domination of the many by the few. How easily does a scythe become a sword?


Have no fear

There has been a lot of attention to Erika Check Hayden’s piece Ethics: Taboo genetics, at least judging by people commenting on my Facebook feed. In some ways this is not an incredibly empirically grounded argument, because the biological basis of complex traits is going to be rather difficult to untangle on a gene-by-gene basis. In other words, this isn’t a clear and present “concern.” The heritability of many behavioral traits has long been known. This is not revolutionary, though for cultural reasons many well educated people are totally surprised when confronted with data that many traits, such as intelligence and personality, have robust heritabilities* (the proportion of trait variation explained by variation in genes across the population). The literature reviewed in The Nurture Assumption makes clear that a surprising proportion of the contribution any parents make to their offspring is through their genetic composition, and not their modeled example. You wouldn’t know this if you read someone like Brian Palmer of Slate, who seems to be getting paid to reaffirm the biases of the current age among the smart set (pretty much every single one of his pieces that touch upon genetics is larded with phrases which could have been written by a software program designed to soothe the concerns of the cultural Zeitgeist). But the new genomics is confirming the broad outlines of the findings from behavior genetics. There’s nothing really to see there. The bigger issue of any interest is normative; the values we hold dear as a culture.

For example:

Chabris says that the work can actually contribute to greater social mobility — for instance, by helping to identify preschoolers who could be helped by more intensive early childhood education. “The fact that people in the past interpreted the results in a certain way doesn’t mean that it shouldn’t be studied,” he says. But not everyone buys that potential misuses of the information can be divorced from gathering it. Anthropologist Anne Buchanan at Pennsylvania State University in University Park wrote on the blog The Mermaid’s Tale that rather than being purely academic and detached, such studies are “dangerously immoral”.

Of course John Horgan reiterates his call for race and IQ research to be banned. To some extent this reminds me of Patricia Churchland’s account of being verbally attacked by an anthropologist in an elevator as a “reductionist.” These are matters of morality, and reflect quasi-religious sensibilities. The science is secondary.

But there’s a major problem when you have norms and facts operating at cross-purposes: the facts are ultimately always there, invariant, and true. Banning research is totally a short-term step, because it isn’t as if the United States, with its particular set of values, has a monopoly on research. Patricia Churchland’s work which reduces human consciousness to a totally natural process would not get funded in Saudi Arabia, or by the Vatican, but that’s irrelevant because it will get funded in the Western world. Similarly, the cultural Left taboos which are very strong in Western academia are far weaker in Asia. Assuming that economic development proceeds apace, someone will do the research, and it will be published. If the facts of the world are as you’d always assumed, you have nothing really to fear.

* I think human psychology is complicated enough though that on some level people do understand the importance of genes. Look at who they choose to reproduce with.


Dienekes has a post up highlighting a preprint out of Pontus Skoglund’s group. It is titled Ancient genomes mirror mode of subsistence rather than geography in prehistoric Europe. It doesn’t seem to be online (fingers crossed that it shows up linked at Haldane’s Sieve soon). In any case I am not surprised by the broad outlines of the thesis. And it is not as if Skoglund’s group is the only one working in this area; I have suspicions that others are finding something very similar. These results out of Europe are probably reflective of the fact that much of the model in Peter Bellwood’s First Farmers is generally correct: the emergence of an agricultural revolution in a few select world societies produced a cultural and demographic revolution.

But one can take things too far, and I think Bellwood did. The results above, and elsewhere, also confirm that there was not total demographic replacement. In many regions the agriculturalists absorbed the hunter-gatherers. Using light coverage of whole genomes Skoglund’s group inferred that the farmers who arrived in prehistoric Europe from the Middle East seem to have contributed most of the ancestry of Southern Europeans, and also match nearly perfectly the genetic profiles of farming populations in Scandinavia. In contrast, ancient hunter-gatherers in Scandinavia and prehistoric Spain resemble contemporary Baltic groups (e.g., Lithuanians, Finns, and Scandinavians). Finally, they conclude that modern Scandinavians are slightly closer to the ancient hunter-gatherers than to the farmers. This implies substantial farmer admixture in demographic terms. The group notes that previous inferences of Middle Eastern admixture were compromised by the fact that modern groups often exhibit African admixture. When you look only at the non-African portion, they are a much better proxy. One reason I am curious about the preprint in full is that I believe that even removing African admixture, one might be biased by selection of ‘reference’ populations. It seems likely that there were major eruptions from Arabia in antiquity and the medieval period (inferred by comparing religious isolates such as Arab Christians with their Muslim neighbors).

Setting aside that a two-way admixture seems unlikely to be the whole story*, it does offer up a way to explore an interesting question: on what sort of genes do hunter-gatherers and farmers differ? Specifically I’m wondering about utilizing admixture mapping. In Scandinavian populations, regions of the genome where adaptations for farming are localized should be enriched for farmer ancestry, and vice versa for hunter-gathering. In the case of Northern Europe, farmers moved up latitudinal clines, and adapted to local conditions only haltingly. This explains in part why Iberia and parts of Western Europe seem to have more farmer ancestry than regions of Northeast Europe which are closer to the original zone of expansion as the crow flies. Not only did the hunter-gatherers have some cultural traits which conferred benefits in the deep north, but likely there were particular adaptations in these climes which aided their survival.

I understand that admixture mapping for populations which have mixed (and frankly were not that genetically different in the first place) may be difficult. But linkage disequilibrium based methods have been pushed much further back than I could have imagined, and it has been done with South Asians in regards to “Ancestral Northern Indian” vs. “Ancestral South Indian” ancestry and genetic functionality (specifically disease risk alleles).

* In addition, I am more skeptical of a simple demic diffusion model than I was in the past. I think there may have been multiple demographic pulses, followed by equilibration phases.


Layers and layers….

There is the fact of evolution. And then there is the long-standing debate of how it proceeds. The former is a settled question with little intellectual juice left. The latter is the focus of evolutionary genetics, and evolutionary biology more broadly. The debate is an old one, and goes as far back as the 19th century, when you had arch-selectionists such as Alfred Russel Wallace (see A Reason For Everything) square off against pretty much the whole of the scholarly world (e.g., Thomas Henry Huxley, “Darwin’s Bulldog,” was less than convinced of the power of natural selection as the driving force of evolutionary change). This old disagreement planted the seeds for much more vociferous disputations in the wake of the fusion of evolutionary biology and genetics in the early 20th century. They range from the Wright-Fisher controversies of the early years of evolutionary genetics, to the neutralist vs. selectionist debate of the 1970s (which left bad feelings in some cases). A cartoon-view of the implication of the debates in regards to the power of selection as opposed to stochastic contingency can be found in the works of Stephen Jay Gould (see The Structure of Evolutionary Theory) and Richard Dawkins (see The Ancestor’s Tale): does evolution result in an infinitely creative assortment due to chance events, or does it drive toward a finite set of idealized forms which populate the possible parameter space?*

But ultimately these 10,000-foot debates are more a matter of philosophy than science. At least until the scientific questions are stripped of their controversy and an equilibrium consensus emerges. That will only occur through an accumulation of publications whose results are robust to time, and subtle enough to convince dissenters. This is why Enard et al.’s preprint, Genome wide signals of pervasive positive selection in human evolution, attracted my notice. With the emergence of genomics it has been humans first in line to be analyzed, as the best data is often found from this species, so no surprise there. Rather, what is so notable about this paper in light of the past 10 years of back and forth exploration of this topic?**

By taking a deeper and more subtle look at patterns of the variation in the human genome this group has inferred that adaptation through classic positive selection has been a pervasive feature of the human genome over the past ~100,000 years. This is not a trivial inference, because there has been a great deal of controversy as to the population genetic statistics which have been used to infer selection over the past 10 years with the arrival of genome-wide data sets (in particular, a tendency toward false positives). In fact, one group has posited that a more prominent selective force within the genome has been “background selection,” which refers to constraint upon genetic variation due to purification of numerous deleterious mutations and neighboring linked sites.

The sum totality of Enard et al. may seem abstruse, and even opaque, in terms of the method. But each element is actually rather simple and clear. The major gist is that many tests for selection within the genome focus on the differences between nonsynonymous and synonymous mutational variants. The former refer to base positions in the genome which result in a change in the amino acid state, while the latter are those (see the third positions) where different bases may still produce the same amino acid. The ratio between substitutions, replacements across lineages for particular base states, at these positions is a rough measure of adaptation driven by selection on the molecular level. Changes at synonymous positions are far less constrained by negative selection, while positive selection due to an increased fitness via new phenotypes is presumed to have occurred only via nonsynonymous changes. What Enard et al. point out is that the human genome is heterogeneous in the distribution of characteristics, and focusing on these sorts of pairwise differences in classes without accounting for other confounding variables may obscure the dynamics one is attempting to measure. In particular, they argue that evidence of positive selective sweeps is masked by the fact that background selection tends to be stronger in regions where synonymous mutational substitutions are more likely (i.e., they are more functionally constrained, so nonsynonymous variants will be disfavored). This results in elevated neutral diversity around regions of nonsynonymous substitutions vis-a-vis strongly constrained regions with synonymous substitutions. Once correcting for the power of background selection the authors find evidence for sweeps of novel adaptive variants across the human genome, which had previously been hidden.
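For readers who want to see the synonymous vs. nonsynonymous distinction in action, here is a small Python sketch. It builds the standard genetic code and classifies a codon change by whether the amino acid changes. The function names are my own, and this only tallies raw counts; it is not the normalized ratio used in real tests, and it has nothing to do with the specific corrections Enard et al. apply.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Label a codon change by whether the encoded amino acid changes."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "nonsynonymous"


def count_classes(codon_pairs):
    """Tally synonymous and nonsynonymous changes, skipping identical codons."""
    counts = {"synonymous": 0, "nonsynonymous": 0}
    for before, after in codon_pairs:
        if before.upper() != after.upper():
            counts[classify_substitution(before, after)] += 1
    return counts


# CTT -> CTC keeps leucine (a third-position change); GAG -> GTG swaps
# glutamate for valine; AAA -> AAA is not a change and is ignored.
print(count_classes([("CTT", "CTC"), ("GAG", "GTG"), ("AAA", "AAA")]))
```

The third-position example shows why synonymous changes are treated as the less constrained baseline: the protein is untouched.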

There are two interesting empirical findings from the 1000 Genomes data set. First, the authors find that positive selection tends to operate upon regulatory elements rather than coding sequence changes. You are probably aware that this is a major area of debate currently within the field of molecular evolutionary biology. Second, there seems to be less evidence for positive selection in Sub-Saharan Africans, or, less background selection in this population. My own hunch is that it is the former, that the demographic pulse across Eurasia, and to the New World and Australasia, naturally resulted in local adaptations as environmental conditions shifted. Though it may be that the African pathogenic environment is particularly well adapted to hominin immune systems, and so imposes a stronger cost upon novel mutations than is the case for non-Africans. So I do not dismiss the second idea out of hand.

Where this debate about the power of selection will end is anyone’s guess. Nor do I care. Rather, what’s important is getting a finer-grained map of the dynamics at work so that we may perceive reality with greater clarity. One must be cautious about extrapolating from humans (e.g., the authors point out that Drosophila genomes are richer in coding sequence proportionally). But the human results which emerge because of the coming swell of genomic data will be a useful outline for the possibilities in other organisms.

Citation: Genome wide signals of pervasive positive selection in human evolution

* The cartoon qualification is due to the fact that I am aware that selection is stochastic as well.

** Voight, Benjamin F., et al. “A map of recent positive selection in the human genome.” PLoS Biology 4.3 (2006): e72; Sabeti, Pardis C., et al. “Detecting recent positive selection in the human genome from haplotype structure.” Nature 419.6909 (2002): 832-837; Wang, Eric T., et al. “Global landscape of recent inferred Darwinian selection for Homo sapiens.” Proceedings of the National Academy of Sciences 103.1 (2006): 135-140; Williamson, Scott H., et al. “Localizing recent adaptive evolution in the human genome.” PLoS Genetics 3.6 (2007): e90; Hawks, John, et al. “Recent acceleration of human adaptive evolution.” Proceedings of the National Academy of Sciences 104.52 (2007): 20753-20758; Pickrell, Joseph K., et al. “Signals of recent positive selection in a worldwide sample of human populations.” Genome Research 19.5 (2009): 826-837; Hernandez, Ryan D., et al. “Classic selective sweeps were rare in recent human evolution.” Science 331.6019 (2011): 920-924.


Citation: Genetic Evidence for Recent Population Mixture in India
Moorjani et al.

The Pith: In India 5,000 years ago there were the hunter-gatherers. Then came the Dravidian farmers. Finally came the Indo-Aryan cattle herders.

There is a new paper out of the Reich lab, Genetic Evidence for Recent Population Mixture in India, which follows up on their seminal 2009 work, Reconstructing Indian Population History. I don’t have time right now to do justice to it, but as noted this morning in the press, it is “carefully and cautiously crafted.” Since I am not associated with the study, I do not have to be cautious and careful, so I will be frank in terms of what I think these results imply (note that confidence in many assertions below is modest). Though less crazy in a bald-faced sense than another recent result which came out of the Reich lab, this paper is arguably more explosive because of its historical and social valence in the Indian subcontinent. There has been a trend over the past few years of scholars in the humanities engaging in deconstruction and intellectual archaeology which overturns old historical orthodoxies, understandings, and leaves the historiography of a particular topic of study in a chaotic mess. From where I stand the Reich lab and its confederates are doing the same, but instead of attacking the past with cunning verbal sophistry (I’m looking at you postcolonial “theorists”), they are taking a sledge-hammer of statistical genetics and ripping apart paradigms woven together by innumerable threads. I am not sure that they even understand the depths of the havoc they’re going to unleash, but all the argumentation in the world will not stand up to science in the end, we know that.

Since the paper is not open access, let me give you the abstract first:

Most Indian groups descend from a mixture of two genetically divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners, Caucasians, and Europeans; and Ancestral South Indians (ASI) not closely related to groups outside the subcontinent. The date of mixture is unknown but has implications for understanding Indian history. We report genome-wide data from 73 groups from the Indian subcontinent and analyze linkage disequilibrium to estimate ANI-ASI mixture dates ranging from about 1,900 to 4,200 years ago. In a subset of groups, 100% of the mixture is consistent with having occurred during this period. These results show that India experienced a demographic transformation several thousand years ago, from a region in which major population mixture was common to one in which mixture even between closely related groups became rare because of a shift to endogamy.
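To give a feel for how linkage disequilibrium can yield a mixture date, here is a rough Python sketch. It assumes the standard simplification that admixture LD decays exponentially with genetic distance, at a rate equal to the number of generations since mixture, and it fits that rate with a log-linear regression on synthetic data. The function name, the synthetic numbers, and the 29-year generation time are all my assumptions for illustration; this is not the authors’ pipeline.

```python
import math


def fit_generations(distances_morgans, weighted_ld):
    """Estimate generations since admixture from LD decay.

    Assumes LD(d) = A * exp(-n * d), with d in Morgans, so that log(LD) is
    linear in d with slope -n. Returns the least-squares estimate of n.
    """
    xs = list(distances_morgans)
    ys = [math.log(v) for v in weighted_ld]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


# Synthetic, noise-free curve generated with 100 generations since mixture.
TRUE_GENERATIONS = 100
YEARS_PER_GENERATION = 29  # assumed value, for illustration only
distances = [0.005 * i for i in range(1, 21)]  # 0.5 cM to 10 cM, in Morgans
ld_curve = [0.02 * math.exp(-TRUE_GENERATIONS * d) for d in distances]

estimated_generations = fit_generations(distances, ld_curve)
estimated_years = estimated_generations * YEARS_PER_GENERATION
print(round(estimated_generations, 3), round(estimated_years, 1))
```

The faster the LD decays with distance, the older the mixture. Real data are noisy and can reflect more than one pulse of admixture, which is where the careful statistics come in.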

Young Stalin

I want to highlight one aspect which is not in the abstract: the closest population to the “Ancestral North Indians”, those who contributed the West Eurasian component to modern Indian ancestry, seem to be Georgians and other Caucasians. Since Reconstructing Indian Population History many have suspected this. I want to highlight in particular two genome bloggers, Dienekes and Zack Ajmal, who’ve prefigured that particular result. But wait, there’s more! The figure which I posted at the top illustrates that it looks like Indo-European speakers were subject to two waves of admixture, while Dravidian speakers were subject to one!

The authors were cautious indeed in not engaging in excessive speculation. The term “Indo-Aryan” only shows up in the notes, not in the body of the main paper. But the historical and philological literature is referenced:

The dates we report have significant implications for Indian history in the sense that they document a period of demographic and cultural change in which mixture between highly differentiated populations became pervasive before it eventually became uncommon. The period of around 1,900–4,200 years BP was a time of profound change in India, characterized by the deurbanization of the Indus civilization, increasing population density in the central and downstream portions of the Gangetic system, shifts in burial practices, and the likely first appearance of Indo-European languages and Vedic religion in the subcontinent. The shift from widespread mixture to strict endogamy that we document is mirrored in ancient Indian texts. [notes removed -Razib]

How does this “deconstruct” the contemporary scholarship? Here’s an Amazon summary of a book which I read years ago, Castes of Mind: Colonialism and the Making of Modern India:

When thinking of India, it is hard not to think of caste. In academic and common parlance alike, caste has become a central symbol for India, marking it as fundamentally different from other places while expressing its essence. Nicholas Dirks argues that caste is, in fact, neither an unchanged survival of ancient India nor a single system that reflects a core cultural value. Rather than a basic expression of Indian tradition, caste is a modern phenomenon–the product of a concrete historical encounter between India and British colonial rule. Dirks does not contend that caste was invented by the British. But under British domination caste did become a single term capable of naming and above all subsuming India’s diverse forms of social identity and organization.

The argument is not totally fallacious, as some castes are almost certainly recent constructions and interpretations, with fictive origin narratives. But the deep genetic structure of Indian castes, which goes back ~4,000 years in some cases, falsifies a strong form of the constructivist narrative. The case of the Vysya is highlighted in the paper as a population with deep origins in Indian history. Interestingly they seem to be a caste which has changed its own status within the hierarchy over the past few hundred years. Where the postcolonial theorists were right is that caste identity as a group in relation to other castes was somewhat flexible (e.g., Jats and Marathas in the past, Nadars today). Where they seem to have been wrong is the implicit idea that many castes were an ad hoc crystallization of individuals only bound together by common interests relatively recently in time, and in reaction to colonial pressures. Rather, it seems that the colonial experience simply rearranged pieces of the puzzle which had deep indigenous roots.

Indra, slayer of Dasas? Credit: Gnanapiti

Stepping back in time from the early modern to the ancient, the implications of this research seem straightforward, if explosive. One common theme in contemporary Western treatments of the Vedic period is to interpret narratives of ethnic conflict coded in racialized terms as metaphor. So references to markers of ethnic differences may be tropes in Vedic culture, rather than concrete pointers to ancient socio-political dynamics. The description of the enemies of the Aryans as dark skinned and snub-nosed is not a racial observation in this reading, but analogous to the stylized conflicts between the Norse gods and their less aesthetically pleasing enemies, the Frost Giants. The mien of the Frost Giants was reflective of their symbolic role in the Norse cosmogony.


What these results imply is that there was admixture between very distinct populations in the period between 0 and 2000 B.C. By distinct, I mean to imply that the last common ancestors of the “Ancestral North Indians” and “Ancestral South Indians” probably date to ~50,000 years ago. The population in the Reich data set with the lowest fraction of ANI is the Paniya (~20%). Among those with higher fractions of ANI (70%) are the Kashmiri Pandits. It does not take an Orientalist with colonial motives to infer that the ancient Vedic passages which are straightforwardly interpreted in physical anthropological terms may actually refer to ethnic conflicts in concrete terms, and not symbolic ones.

Finally, the authors note that uniparental lineages (mtDNA and Y) seem to imply that the last common ancestors of the ANI with other sampled West Eurasian groups date to ~10,000 years before the present. This leads them to suggest that the ANI may not have come from afar necessarily. That is, the “Georgian” element is a signal of a population which perhaps diverged ~10,000 years ago, during the early period of agriculture in West Asia, and occupied the marginal fringes of South Asia, as in sites such as Mehrgarh in Balochistan. A plausible framework then is that expansion of institutional complexity resulted in an expansion of the agriculture complex ~3,000 B.C., and subsequent admixture with the indigenous hunter-gatherer substrate to the east and south during this period. One of the components that Zack Ajmal finds through ADMIXTURE analysis in South Asia, with higher fractions in higher castes even in non-Brahmins in South India, he terms “Baloch,” because it is modal in that population. This fraction is also high in the Dravidian speaking Brahui people, who coexist with the Baloch. It seems plausible to me that this widespread Baloch fraction is reflective of the initial ANI-ASI admixture event. In contrast, the Baloch and Brahui have very little of the “NE Euro” fraction, which is found at low frequencies in Indo-European speakers, and especially higher castes east and south of Punjab, as well as South Indian Brahmins. I believe that this component is correlated with the second, smaller wave of admixture, which brought the Indo-European speaking Indo-Aryans to much of the subcontinent. The Dasas described in the Vedas are not ASI, but hybrid populations. The collapse of the Indus Valley civilization was an explosive event for the rest of the subcontinent, as Moorjani et al. report that all indigenous Indian populations have ANI-ASI admixture (with the exception of Tibeto-Burman groups).

Overall I’d say that the authors of this paper covered their bases. Though I wish them well in avoiding getting caught up in ideologically tinged debates. Their papers routinely result in at least one email to me per week, ranging from confusion to frothing-at-the-mouth.

Related: The Gift of the Gopi.

Citation: Moorjani et al., Genetic Evidence for Recent Population Mixture in India, The American Journal of Human Genetics (2013)


For various reasons the ideas of mitochondrial Eve and Y chromosomal Adam capture the public imagination. This frustrates many people, including me. I’ve gotten into the fatigue stage on this topic, but some sort of counter-attack is necessary against malignant memes. Even geneticists who don’t usually work with populations can get confused by the implications of mtDNA and Y chromosomal phylogenies. Melissa Wilson Sayres, who works on Y chromosomes, has a useful post (promised first of two) at Panda’s Thumb, Y and mtDNA are not Adam and Eve: Part 1. If you have friends/acquaintances who are confused by this issue, it might be a good place to start.

Much of the discussion around this topic was triggered by the recent paper in Science, Sequencing Y Chromosomes Resolves Discrepancy in Time to Common Ancestor of Males Versus Females. As Graham Coop observed on Twitter, the idea of a “discrepancy” is not clear, insofar as it would not be that surprising if the last common ancestor of the extant Y chromosomal lineages existed at a different time than the last common ancestor of the mtDNA lineages. Expected coalescence is contingent upon various population genetic parameters such as effective population size, but expectations are also subject to variation in realized outcomes. And, as Sayres observes, the references to the Adam & Eve analogy were present within the paper, fueling the fire. Finally, the reference to “dogma” tagged onto the end struck me as a touch too cute.

• Category: Science • Tags: Adam, Eve, Human Genetics, Human Genomics, MTDNA, Y Chromosome 

SLC45A2 rs16891982 frequency, Norton, Heather L., et al. “Genetic evidence for the convergent evolution of light skin in Europeans and East Asians.” Molecular biology and evolution 24.3 (2007): 710-722.


The above figure is from Norton et al.’s Genetic Evidence for the Convergent Evolution of Light Skin in Europeans and East Asians. It shows that rs16891982 on the SLC45A2 locus exhibits strong differentiation between Europe and the rest of the world. This is in contrast to SLC24A5, where the well known allele which differentiates Africans/East Asians from Europeans is found at very high frequencies across Western Eurasia (both my parents are homozygotes for the “European” variant; in fact SLC24A5’s derived variant is found at fractions on the order of ~50% in eastern and southern India). The ancestral allele on SLC24A5 is very difficult to find in Europeans, it is so close to fixation for the derived variant. In contrast SLC45A2’s minor allele is segregating at appreciable frequencies in places like southern Spain, and the derived allele is not fixed even in Northern Europe.

I won’t review the literature on the genomics and evolution of human pigmentation at this point. Rather, I’ll just note that it seems most of the inter-population variation is controlled by a handful of genes. It’s a polygenic trait, but just. Second, a fair amount of evidence has emerged that some of the lightening derived variants have increased in frequency only very recently (e.g., on the order of ~10,000 years). Pigmentation is then a peculiar trait where the genetic underpinnings can give historical phylogenetic information because of the varied dates of differentiation and selective sweeps.

Below I’ve collated results from several studies on frequencies of SLC45A2. I invite readers to peruse them. I will say two things. First, the frequency of the “European” variant in ~140 northern Ethiopians is 0%. This is peculiar for a population which may be on the order of ~50% West Eurasian. Second, the fraction of SLC45A2 derived variant in South Asians coincidentally tracks the “NE Euro” percentage in Zack Ajmal’s results.


Country/Region Group/Place N Frequency light allele
A Decreasing Gradient of 374F Allele Frequencies in the Skin Pigmentation Gene SLC45A2, from the North of West Europe to North Africa
Denmark Copenhagen 51 0.98
England London 56 0.955
Belgium Brussels 53 0.934
France Lille 64 0.945
Rheims 98 0.893
Rennes 52 0.971
Marseilles 312 0.888
Perpignan 101 0.827
Corsica 328 0.878
Germany Mulheim 59 0.975
Switzerland Basel 51 0.96
Italy Genoa 97 0.85
Roma 64 0.898
Napoli 128 0.859
Sicily 39 0.833
Sardinia 100 0.805
Spain Barcelona 59 0.856
Sevilla 71 0.725
Portugal North 79 0.829
South 59 0.78
Near Fixation of 374l Allele Frequencies of the Skin Pigmentation Gene SLC45A2 in Africa
Algeria Algiers 141 0.7
Morocco Tangier 123 0.69
Rabat 102 0.68
Berbers from Morocco 75 0.57
Libya Tripoli 38 0.58
Egypt Alexandria 162 0.65
Assouan 66 0.14
South 46 0.2
Mauritania Moors 65 0.41
Senegal Wolof 209 0
Serrere 92 0
Mandingue 51 0
Diola 42 0
Balant 21 0
Peuls 71 0.1
Toucouleur 70 0.03
Soninké 69 0.03
Ethiopia Addis Ababa 104 0
Falashas 38 0
Democratic Republic of Congo 188 0
Distribution of the F374 Allele of the SLC45A2 (MATP) Gene and Founder-Haplotype Analysis
Munich German 93 0.962
West Germany Turk 200 0.615
New Delhi Indian 51 0.147
Dhaka Bangladeshi 118 0.059
Ulaan Baator Khalha 173 0.113
Dashbalbar Buryat 143 0.115
Shenyang Han 89 0.028
Wuxi Han 119 0
Huizhou Han 111 0.005
Tottori Japanese 103 0
Okinawa Japanese 87 0
Surabaya Indonesian 105 0.005
White S African 54 0.89
Ghanaian 50 0
New Guinean 52 0
Japanese 49 0
Polymorphisms of four pigmentation genes (SLC45A2, SLC24A5, MC1R and TYRP1) among eleven endogamous populations of India
Jharkhand Munda 68 0.03
Madhya Pradesh Kanyabuja Brahmin 78 0.11
Madhya Pradesh Gond 75 0.02
Maharashtra Konkanastha Brahmin 71 0.06
Maharashtra Mahadev Koli 65 0.06
Tamil Nadu Iyengar Brahmin 66 0.07
Tamil Nadu Kurumans 67 0.07
Tripura Tripuri 65 0
Tripura Riang 67 0.01
• Category: Science • Tags: Genomics, Human Genetics, Pigmentation 
Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"