The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog / Human Evolutionary Genetics


Likely an individual with derived allele on KITL locus (Credit: David Shankbone)

An individual polymorphic on the KITL locus? (Credit: David Shankbone)

Pigmentation is one of the few complex traits in the post-genomic era which has been amenable to nearly total characterization. The reason for this is clear in hindsight. As far back as the 1950s (see The Genetics of Human Populations) there were inferences made using human pedigrees which suggested that normal human variation on this trait was controlled by fewer than ten genes of large effect. In other words, it was a polygenic character, but not highly so. This means that the alleles which control the variation are going to have reasonably large response, and be well within the power of statistical genetic techniques to capture their effect.

I should be careful about being flip on this issue. As recently as the mid-aughts (see Mutants) the details of this trait were not entirely understood. Today the nature of inheritance in various populations is well understood, and a substantial proportion of the evolutionary history is also known with reasonable clarity, as far as these things go. The 50,000 foot perspective is this: we lost our fur millions of years ago and developed dark skin, and many of us lost our pigmentation after we left Africa ~50,000 years ago (in fact, it seems likely that hominins in the northern latitudes were always diverse in their pigmentation).


A new paper in Cell sheds some further light on the fine-grained details which might be the outcome of this process. Being a Cell paper there is a lot of neat molecular technique to elucidate the mechanistic pathways. But I will gloss over that, because it is neither my forte nor my focus. A summary of the paper is that it shows that p53, a relatively well known tumor suppressor gene, seems to have an interaction with a response element (the gene product binds in many regions, as it is a transcription factor) around the KITLG locus. This locus is well known in part because it has been implicated in pigment variation in humans and fish. So KITLG is one of the generalized pigmentation pathways which spans metazoans. There are derived variants in both Europeans and East Asians which are correlated with lighter skin, though there is polymorphism in both cases (it has not swept to fixation).

The wages of adaptation? (Credit: Hoggarazzi Photography)

But this is a Cell paper, so there has to be a more concrete and practical angle than just evolution. And there is. It turns out that a single nucleotide polymorphism in the p53 response element results in a tendency toward upregulation of KITLG and male germ line proliferation. The latter matters when it comes to tumorigenesis, and in particular testicular cancer. This form of cancer is one where there doesn’t seem to be a somatic cell mutation of p53 itself. Additionally, the authors observe that testicular cancer manifests at a 4-5 fold greater rate in people of European descent than in African Americans. And presumably the upregulation of KITLG is somehow related to increased melanin production. The authors posit that because of lighter skin in Europeans due to selection at other loci there has been a balancing effect at KITLG (increased tanning response). There is evidence of selection at this locus (a long haplotype and increased homozygosity), so this is not an unreasonable conjecture, though the high frequency of loss of function alleles suggests that the model is likely complex.

I don’t know if this particular story is correct in its details (though I am intrigued that variation in KITLG is associated with cancer in other organisms). But it illustrates one of the possible consequences of rapid evolutionary change due to human migration out of Africa: deleterious side effects because of pleiotropy. In other words, as you tinker with the genomic architecture of a population you are going to have to accept tradeoffs as you are optimizing one aspect of function. Genes don’t have just one consequence, but are embedded in myriad pathways. Over time evolutionary theory predicts a slow re-balancing, as modifier genes arise to mask the deleterious side effects. But until then, we will bear the burdens of adaptation as best as we can.

Citation: Zeron-Medina, Jorge, et al. “A Polymorphic p53 Response Element in KIT Ligand Influences Cancer Risk and Has Undergone Natural Selection.” Cell 155.2 (2013): 410-422.

(Republished from Discover/GNXP by permission of author or representative)
 

Soft serve

The trait of lactase persistence (lactose tolerance) is probably one of the better schoolbook examples of natural selection in human populations. The reasons for this are probably two-fold. There is a very strong signature of selection within a specific gene known to associate with the trait in question in many populations. And there is a very compelling historical narrative which explains rather neatly how this particular functional change could have undergone such strong selection within the past ~5,000 years across these populations. But the elucidation of the origin and spread of this genetic adaptation is also interesting because it looks as if it was not a singular event. Populations as disparate as Arabians, Danes, and Masai seem to carry different alleles around the locus of interest which confer the ability to digest milk. This illustrates the fact that when selection pressures have a viable target, there is a rapid response on the genomic level. At some point during the maturation of a mammal the regulatory pathway which produces lactase enzyme shuts down. Yet within numerous human populations this gradual shutdown process has been short-circuited.

The variety of response in relation to this adaptation was brought home to me as I read Diversity of Lactase Persistence Alleles in Ethiopia – Signature of a Soft Selective Sweep, in the latest issue of The American Journal of Human Genetics:

The persistent expression of lactase into adulthood in humans is a recent genetic adaptation that allows the consumption of milk from other mammals after weaning. In Europe, a single allele (−13910∗T, rs4988235) in an upstream region that acts as an enhancer to the expression of the lactase gene LCT is responsible for lactase persistence and appears to have been under strong directional selection in the last 5,000 years, evidenced by the widespread occurrence of this allele on an extended haplotype. In Africa and the Middle East, the situation is more complicated and at least three other alleles (−13907∗G, rs41525747; −13915∗G, rs41380347; −14010∗C, rs145946881) in the same LCT enhancer region can cause continued lactase expression. Here we examine the LCT enhancer sequence in a large lactose-tolerance-tested Ethiopian cohort of more than 350 individuals. We show that a further SNP, −14009T>G (ss 820486563), is significantly associated with lactose-digester status, and in vitro functional tests confirm that the −14009∗G allele also increases expression of an LCT promoter construct. The derived alleles in the LCT enhancer region are spread through several ethnic groups, and we report a greater genetic diversity in lactose digesters than in nondigesters. By examining flanking markers to control for the effects of mutation and demography, we further describe, from empirical evidence, the signature of a soft selective sweep.

To some extent the paper was written rather confusingly for my taste. Importantly, they did not even consider the results of Pagani et al. (in the same journal!) from last year in their analysis. The big picture result is that whereas in Eurasia and East Africa it looks as if lactase persistence spread through populations via “hard” selective sweeps, in Ethiopia it may have been propagation through “soft” sweeps. The former are cases where a single new mutant confers a beneficial phenotype. In the absence of allelic competitors this variant sweeps up in frequency extremely rapidly, and flanking regions of the genome generate a long haplotype block. In Europeans this has resulted in a strongly homogenized region of the genome around LCT.

The situation in Ethiopia is a touch paradoxical in light of the above model. Instead of one allele, it looks as if several are segregating. And, the lactase persistence haplotypes exhibit more, not less, genetic diversity than the non-persistent variants. As noted in the article it may be that there are strong selective constraints against lactase persistence. Apparently there is a long non-persistent haplotype in Horn of Africa populations, explaining the reduced diversity of this subset of the sample. Whereas in a hard sweep a single mutation can rise in frequency against disfavored ancestral variants, in this situation you have a soft sweep where alternative variants with similar fitness values are presumably increasing in frequency.
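The contrast between the two kinds of sweep can be made concrete with a toy calculation. The sketch below is not the paper's analysis: the haplotype labels and counts are invented purely for illustration, and `haplotype_diversity` is a hypothetical helper that computes the standard unbiased estimate of haplotype heterozygosity. It only shows why a sample dominated by one swept haplotype scores low on diversity while a sample with several favored haplotypes scores high.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical samples of 20 chromosomes each (labels are made up,
# they are not real LCT haplotypes).
# Hard sweep: one haplotype "A" carrying the favored allele dominates.
hard_sweep = ["A"] * 18 + ["B"] * 2
# Soft sweep: several haplotypes carry favored alleles and rise together.
soft_sweep = ["A"] * 7 + ["C"] * 6 + ["D"] * 5 + ["B"] * 2

if __name__ == "__main__":
    print("hard sweep diversity:", round(haplotype_diversity(hard_sweep), 3))
    print("soft sweep diversity:", round(haplotype_diversity(soft_sweep), 3))
```

The real study works with flanking markers and controls for mutation and demography, none of which this toy version attempts.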

But all this needs to be considered in light of Pagani et al., which indicates a very recent admixture in Ethiopia. The discussion above seems to suggest in situ selective events within the Horn of Africa, but it is possible that the sweeps may have initiated among the Eurasian ancestors of the Ethiopians (perhaps some admixture mapping would be useful?). Ultimately this is going to be a complicated story. It doesn’t take away from the bigger picture that lactase persistence is an excellent model for natural selection, but the sketch has more details to be filled in, though I’m not quite sure about the specific character of this from this paper.

(Republished from Discover/GNXP by permission of author or representative)
 

Layers and layers….

There is the fact of evolution. And then there is the long-standing debate of how it proceeds. The former is a settled question with little intellectual juice left. The latter is the focus of evolutionary genetics, and evolutionary biology more broadly. The debate is an old one, and goes as far back as the 19th century, where you had arch-selectionists such as Alfred Russel Wallace (see A Reason For Everything) square off against pretty much the whole of the scholarly world (e.g., Thomas Henry Huxley, “Darwin’s Bulldog,” was less than convinced of the power of natural selection as the driving force of evolutionary change). This old disagreement planted the seeds for much more vociferous disputations in the wake of the fusion of evolutionary biology and genetics in the early 20th century. They range from the Wright-Fisher controversies of the early years of evolutionary genetics, to the neutralist vs. selectionist debate of the 1970s (which left bad feelings in some cases). A cartoon-view of the implication of the debates in regards to the power of selection as opposed to stochastic contingency can be found in the works of Stephen Jay Gould (see The Structure of Evolutionary Theory) and Richard Dawkins (see The Ancestor’s Tale): does evolution result in an infinitely creative assortment due to chance events, or does it drive toward a finite set of idealized forms which populate the possible parameter space?*


But ultimately these 10,000 foot debates are more a matter of philosophy than science. At least until the scientific questions are stripped of their controversy and an equilibrium consensus emerges. That will only occur through an accumulation of publications whose results are robust to time, and subtle enough to convince dissenters. This is why Enard et al.’s preprint, Genome wide signals of pervasive positive selection in human evolution, attracted my notice. With the emergence of genomics it has been humans first in line to be analyzed, as the best data is often found from this species, so no surprise there. Rather, what is so notable about this paper in light of the past 10 years of back and forth exploration of this topic?**

By taking a deeper and more subtle look at patterns of the variation in the human genome this group has inferred that adaptation through classic positive selection has been a pervasive feature of the human genome over the past ~100,000 years. This is not a trivial inference, because there has been a great deal of controversy as to the population genetic statistics which have been used to infer selection over the past 10 years with the arrival of genome-wide data sets (in particular, a tendency toward false positives). In fact, one group has posited that a more prominent selective force within the genome has been “background selection,” which refers to constraint upon genetic variation due to purification of numerous deleterious mutations and neighboring linked sites.

The sum totality of Enard et al. may seem abstruse, and even opaque, in terms of the method. But each element is actually rather simple and clear. The major gist is that many tests for selection within the genome focus on the differences between nonsynonymous and synonymous mutational variants. The former refer to base positions in the genome which result in a change in the amino acid state, while the latter are those (see the third positions) where different bases may still produce the same amino acid. The ratio between substitutions (replacements across lineages for particular base states) at these positions is a rough measure of adaptation driven by selection on the molecular level. Changes at synonymous positions are far less constrained by negative selection, while positive selection due to an increased fitness via new phenotypes is presumed to have occurred only via nonsynonymous changes. What Enard et al. point out is that the human genome is heterogeneous in the distribution of characteristics, and focusing on these sorts of pairwise differences in classes without accounting for other confounding variables may obscure the dynamics one is attempting to measure. In particular, they argue that evidence of positive selective sweeps is masked by the fact that background selection tends to be stronger in regions where synonymous mutational substitutions are more likely (i.e., they are more functionally constrained, so nonsynonymous variants will be disfavored). This results in elevated neutral diversity around regions of nonsynonymous substitutions vis-a-vis strongly constrained regions with synonymous substitutions. Once they correct for the power of background selection, the authors find evidence for sweeps of novel adaptive variants across the human genome, which had previously been hidden.
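For readers who want the synonymous versus nonsynonymous distinction spelled out, here is a minimal sketch in Python. It is not the method of Enard et al.; it simply uses the standard genetic code to label a codon difference between two aligned coding sequences as one or the other. The function names and the example sequences are made up for illustration, and the raw counts it returns are not a properly normalized ratio (a real dN/dS estimate also accounts for the number of synonymous and nonsynonymous sites).

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon_a, codon_b):
    """Label a pair of codons as identical, synonymous, or nonsynonymous."""
    if codon_a == codon_b:
        return "identical"
    if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
        return "synonymous"
    return "nonsynonymous"


def count_changes(seq_a, seq_b):
    """Count codon differences between two aligned, in-frame coding sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    counts = {"synonymous": 0, "nonsynonymous": 0}
    for i in range(0, len(seq_a), 3):
        kind = classify_change(seq_a[i:i + 3], seq_b[i:i + 3])
        if kind != "identical":
            counts[kind] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sequences: GGT -> GGC keeps glycine, GAA -> GTA swaps
    # glutamate for valine.
    print(count_changes("ATGGGTGAA", "ATGGGCGTA"))
```

The third-position change (GGT to GGC) leaves the amino acid untouched, which is the sense in which such sites are less constrained.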

There are two interesting empirical findings from the 1000 Genomes data set. First, the authors find that positive selection tends to operate upon regulatory elements rather than coding sequence changes. You are probably aware that this is a major area of debate currently within the field of molecular evolutionary biology. Second, there seems to be less evidence for positive selection in Sub-Saharan Africans, or, less background selection in this population. My own hunch is that it is the former, that the demographic pulse across Eurasia, and to the New World and Australasia, naturally resulted in local adaptations as environmental conditions shifted. Though it may be that the African pathogenic environment is particularly well adapted to hominin immune systems, and so imposes a stronger cost upon novel mutations than is the case for non-Africans. So I do not dismiss the second idea out of hand.

Where this debate about the power of selection will end is anyone’s guess. Nor do I care. Rather, what’s important is getting a finer-grained map of the dynamics at work so that we may perceive reality with greater clarity. One must be cautious about extrapolating from humans (e.g., the authors point out that Drosophila genomes are richer in coding sequence proportionally). But the human results which emerge because of the coming swell of genomic data will be a useful outline for the possibilities in other organisms.

Citation: Genome wide signals of pervasive positive selection in human evolution

* The cartoon qualification is due to the fact that I am aware that selection is stochastic as well.

** Voight, Benjamin F., et al. “A map of recent positive selection in the human genome.” PLoS biology 4.3 (2006): e72., Sabeti, Pardis C., et al. “Detecting recent positive selection in the human genome from haplotype structure.” Nature 419.6909 (2002): 832-837., Wang, Eric T., et al. “Global landscape of recent inferred Darwinian selection for Homo sapiens.” Proceedings of the National Academy of Sciences of the United States of America 103.1 (2006): 135-140., Williamson, Scott H., et al. “Localizing recent adaptive evolution in the human genome.” PLoS genetics 3.6 (2007): e90., Hawks, John, et al. “Recent acceleration of human adaptive evolution.” Proceedings of the National Academy of Sciences 104.52 (2007): 20753-20758., Pickrell, Joseph K., et al. “Signals of recent positive selection in a worldwide sample of human populations.” Genome research 19.5 (2009): 826-837., Hernandez, Ryan D., et al. “Classic selective sweeps were rare in recent human evolution.” Science 331.6019 (2011): 920-924.

(Republished from Discover/GNXP by permission of author or representative)
 

There were two papers in Science which came out on the Y chromosome, Sequencing Y Chromosomes Resolves Discrepancy in Time to Common Ancestor of Males Versus Females and Low-Pass DNA Sequencing of 1200 Sardinians Reconstructs European Y-Chromosome Phylogeny. I can recommend what Dienekes had to say, and I wasn’t going to comment until I saw this egregious piece in The New Scientist: Arabian flights: Early humans diverged in 150 years. Because of the title I did not initially think that this had anything to do with the Y chromosome, but it turns out that the piece uses the finding that three primary non-African haplogroups diverged in rapid succession from each other as the hook for the headline. In fact not only does the Y not offer definitive accounts of human history, it doesn’t even necessarily tell us about the history of men. It’s a marker, not a time machine. To repeat: the history of a specific genetic locus is not the history of a population. It has to be said.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Human Evolutionary Genetics, Y Chromosome 

Citation: Corona, Erik, et al. “Analysis of the Genetic Basis of Disease in the Context of Worldwide Human Relationships and Migration.” PLoS Genetics 9.5 (2013): e1003447.

The above figure is from a paper in PLoS GENETICS, Analysis of the Genetic Basis of Disease in the Context of Worldwide Human Relationships and Migration. The authors synthesize two diverse domains of human genomics. First, there are biomedically focused genome-wide association studies and their like which attempt to identify risk alleles for particular diseases. In some cases these risk alleles are very penetrant, in that a particular state predicts with high likelihood a disease phenotype. But in most cases the yield is elevated or decreased risks for highly complex traits such as type 2 diabetes. Second, there is the domain of evolutionary genomics which attempts to reconstruct a phylogenetic and population genetic history so as to frame contemporary patterns of variation in their proper context. How this might be important or of interest is obvious in the case of malaria resistance genes. Alleles conferring resistance have arisen in multiple populations due to parallel environmental pressures. Phylogenetic relationships between these populations should inform your predictions as to the likely similarities of the mutations between the populations. Meanwhile, population genetic theory can give you clues as to the likelihood of multiple adaptations.

The goal here is to increase understanding of the nature of the emergence of disease, and perhaps target individual risk more effectively. Above in the figure you see two interesting patterns: risk for type 2 diabetes alleles as a function of descent, and risk as a function of de novo mutation or independent selective event. The phylogenetic tree represents real relationships as inferred from the >600,000 SNPs in the HGDP data set. The risk alleles were culled from the literature, and were computed for individuals and populations. The real population risks were then compared to a model of risks which might occur in a scenario with this particular phylogenetic tree and the normal process of random genetic drift (see methods for the gory details!). What you see are phylogenetic relationships (African populations shifted toward higher risk) and independent events (Pima Indians shifted toward higher risk) where there is a higher risk of diabetes (red shifted).

There are all sorts of shortcomings to this analysis. The authors are limited by the risk alleles in their study, which is certainly far less than thorough or exhaustive. Additionally, their population coverage was thin in some regions, resulting in reduced ability to even squeeze power from their model in particular cases. But one thing that jumps out at you is that the patterns here inferred from risk alleles in a highly polygenic disease like type 2 diabetes don’t even track what you see in the real world. Many South Asian groups have very high risks of type 2 diabetes. It just so happens that these groups are not in the HGDP sample. There is actually a rather informative critique from two epidemiologists in the comments of this paper. They make many points that came to mind in the specifics. But they ended in a fashion which raised my eyebrows:

Finally, the need to avoid stigmatizing populations based on genetic risk has been much discussed. It is not difficult to imagine a media announcement based on this publication – “Genetic risk of diabetes found in African populations”. Similar claims were made for intelligence not very long ago. Not all speculation is neutral.

As it happens I come from a population with very high risk for metabolic disease. I have no idea if I’m stigmatized by this fact, but I am very glad that medical professionals are becoming aware of differential risks, and moving beyond coarse one-size-fits-all understandings of human health. The BMI values developed for European Americans are probably rather inappropriate to South Asians because of the way we distribute fat (in short, we need to be thinner to exhibit the same risk profile, all things equal). Again, I have no idea if this is stigmatizing, but it is real.

So despite all the real concerns I have with the methodology in the paper above, I believe that these sorts of analyses are essential parts of the broader answer. We now live in the age of the antibiotic revolution and an understanding of germ theory. Those were the big returns on investment for public health. In the short term, gains in human well-being and life expectancy are going to be on the margin, through increments. Despite all the skepticism I have with initial attempts to work out the relationship between population history and disease, one must begin somewhere.


(Republished from Discover/GNXP by permission of author or representative)
 

There’s an excellent paper up at Cell right now, Modeling Recent Human Evolution in Mice by Expression of a Selected EDAR Variant. It synthesizes genomics, computational modeling, as well as the effective execution of mouse models to explore non-pathological phenotypic variation in humans. It was likely due to the last element that this paper, which pushes the boundary on human evolutionary genomics, found its way to Cell (and the “impact factor” of course).

The focus here is on EDAR, a locus you may have heard of before. By fiddling with the EDAR locus researchers had earlier created “Asian mice.” More specifically, mice which exhibit a set of phenotypes which are known to distinguish East Asians from other populations, specifically around hair form and skin gland development. More generally EDAR is implicated in development of ectodermal tissues. That’s a very broad purview, so it isn’t surprising that modifying this locus results in a host of phenotypic changes. The figure above illustrates the modern distribution of the mutation which is found in East Asians in HGDP populations.

One thing to note is that the derived East Asian form of EDAR is found in Amerindian populations which certainly diverged from East Asians > 10,000 years before the present (more likely 15-20,000 years before the present). The two populations in West Eurasia where you find the derived East Asian EDAR variant are Hazaras and Uyghurs, both likely the products of recent admixture between East and West Eurasian populations. In Melanesia the EDAR frequency is correlated with Austronesian admixture. Not on the map, but also known, is that the Munda (Austro-Asiatic) tribal populations of South Asia also have low, but non-trivial, frequencies of East Asian EDAR. In this they are exceptional among South Asian groups without recent East Asian admixture. This lends credence to the idea that the Munda are descendants in part of Austro-Asiatic peoples intrusive from Southeast Asia, where most Austro-Asiatic languages are present.

And yet one thing that jumps out at me is that there is no East Asian EDAR in European populations, even in Russians. I am a bit confused by this result, because of the possibility of Siberian-affiliated population admixture with Europeans within the last 10,000 years, as adduced by several researchers (this is not an obscure result, it manifests in TreeMix repeatedly). The second figure shows the inferred region from which the East Asian EDAR haplotype expanded over the past 30,000 years. The authors utilized millions of forward simulations with a host of parameters to model the expansion of EDAR, so that it fit the distribution pattern that is realized (see the supplements here for the parameters). To make a long story short they infer that there was one mutation on the order of ~30,000 years before the present, and that it swept up in frequency driven by selection coefficients on the order of ~0.10 (a 10% increase in relative fitness, which is incredibly powerful!). This is on the extreme end of selective sweeps, and likely of the same class as the haplotype blocks which characterize SLC24A5 and LCT (the block is shorter, though that makes sense because of the deeper time depth). Again, I am perplexed why such an ancient allele, which is found in Amerindians, or Munda populations, is absent in Europeans who have putative East Eurasian admixture. The whole does not cohere for me. There is a weak point in one or more of my assumptions.
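To get a feel for how powerful a selection coefficient of ~0.10 is, here is a back-of-the-envelope sketch. It is emphatically not the authors' forward simulation framework: it is a deterministic, single-locus model of genic selection in which carriers of the favored allele have relative fitness 1 + s, with no drift, dominance, migration, or population structure. The starting frequency of 1 in 20,000 chromosomes is an assumption chosen purely for illustration, as are the function names.

```python
def next_frequency(p, s):
    """One generation of genic selection: carriers have relative fitness 1 + s."""
    return p * (1.0 + s) / (1.0 + p * s)


def generations_to_reach(p0, target, s, max_generations=100_000):
    """Generations for an allele to climb from frequency p0 to at least target."""
    if not (0.0 < p0 < 1.0 and 0.0 < target < 1.0 and s > 0.0):
        raise ValueError("need 0 < p0 < 1, 0 < target < 1, and s > 0")
    p, generations = p0, 0
    while p < target:
        if generations >= max_generations:
            raise RuntimeError("target not reached within max_generations")
        p = next_frequency(p, s)
        generations += 1
    return generations


if __name__ == "__main__":
    start = 1.0 / 20_000  # hypothetical starting frequency of a new mutation
    for s in (0.01, 0.05, 0.10):
        n = generations_to_reach(start, 0.5, s)
        print(f"s = {s:.2f}: {n} generations to reach 50%")
```

Under these assumptions a variant with s = 0.10 becomes common in on the order of a hundred generations, which is very fast against a 30,000 year backdrop, while s = 0.01 takes roughly ten times longer. A real new mutation would also have to survive loss by drift while rare, which this sketch ignores.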

Then there’s the section on the mouse model. To me this aspect was ingenious, though I’m not particularly able to assess it on its technicalities. The earlier usage of mouse models to test the effects of mutations on EDAR was in the context of coarse copy number changes which resulted in massive dosage changes of protein. The phenotypic outcomes were rather extreme in that case. Here they used a “knockin” model where they recreated the specific EDAR point mutation. Instead of extreme phenotypes they found that the mice were much more normal in their range of traits, though the hair form shifts were well aligned with what occurred in humans. Additionally there were some changes in the number of eccrine glands, with a larger number in the derived East Asian EDAR carriers (with additive effect). Finally they noticed that there were differences in mammary gland pad area and branching. None of this is that surprising, EDAR is a significant regulatory gene which shapes the peripheries and exterior of an organism.

To double check the human relevance of what they found in the mouse model they performed a genome-wide association in a large cohort of Han Chinese. The correlations of particular traits were in the directions that they expected; those individuals with East Asian EDAR variants had thicker hair, shovel-shaped incisors, and a greater density of eccrine glands. It is perhaps important to note that the frequency of the derived variant is so high in Han populations that they didn’t have enough homozygote ancestral genotypes to perform statistics, so their comparisons involved heterozygotes with the derived mutant and also a copy of the ancestral state. This is like SLC24A5 in Europeans, where it is difficult to find individuals of European heritage who have double copies of the non-European modal variant.
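The shortage of ancestral homozygotes follows directly from Hardy-Weinberg proportions. The sketch below assumes random mating and uses a derived allele frequency of 0.9 and a cohort of 1,000 people; both numbers are hypothetical placeholders, not figures from the paper, and the function names are made up for illustration.

```python
def hardy_weinberg(derived_freq):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium."""
    if not 0.0 <= derived_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = derived_freq        # frequency of the derived allele
    p = 1.0 - q             # frequency of the ancestral allele
    return {
        "ancestral/ancestral": p * p,
        "ancestral/derived": 2.0 * p * q,
        "derived/derived": q * q,
    }


def expected_counts(derived_freq, cohort_size):
    """Expected number of individuals with each genotype in a cohort."""
    freqs = hardy_weinberg(derived_freq)
    return {genotype: f * cohort_size for genotype, f in freqs.items()}


if __name__ == "__main__":
    # Hypothetical inputs: derived allele at 90%, cohort of 1,000 individuals.
    for genotype, count in expected_counts(0.9, 1000).items():
        print(f"{genotype}: about {count:.0f} individuals")
```

When the derived allele is very common, the ancestral homozygote class shrinks with the square of the ancestral frequency, so heterozygotes become the only practical comparison group.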

Let’s review all the awesome things they did in this study. They dug deeply into the evolutionary genomics of the region around EDAR, concluding that this haplotype was driven up in frequency from one ancestral variant ~30,000 years ago in a hard selective sweep. And a sweep of notable strength in terms of selection coefficient. This may be one of the largest effect targets of natural selection in the genome of non-Africans over the past 50,000 years. Second, they used a humanized mouse model to explore the range of phenotypes correlated with this mutational change in East Asians. So you have a strong selection coefficient on a locus, and a range of traits associated with changes on that locus. Third, they confirmed the correlation between the traits and the mutation in humans, despite there being prior research in this area (i.e., they reproduced). This is all great science, and shows the power of collaboration between the groups.

Much of the elegance and power of the paper applies to the discussion section as well, but to be frank this is where things start falling apart for me. You can get a sense of it in The New York Times piece, East Asian Physical Traits Linked to 35,000-Year-Old Mutation. The headline here points to a legitimately important inference from this line of research: many salient physical characteristics of the human races seem to be due to strong selection events at a few loci. In addition to EDAR I’m thinking of the pigmentation loci, such as SLC24A5. I wouldn’t be surprised if there was something similar for the epicanthic fold. If it is visible, and defines between-population differences, it is generally not genomically trivial. There’s usually a story underneath that difference.

In the broad scale of human natural history the problem that arises for me is that we have traits, we have genes under selection, but we have very weak stories to explain the mechanism and context of natural selection. Here there is a strong contrast with the loci around lactase persistence and malaria resistance. In those situations the causal mechanism for the selection seems relatively clear. Critics of evolutionary psychology are wont to accuse the field of ‘Just So’ storytelling, but the same problem crops up in the more intellectually insulated domain of evolutionary genomics (in part because the field is very new, and also mathematically and computationally abstruse). To illustrate what I’m talking about I’m going to quote from the discussion of the above paper:

A high density of eccrine glands is a key hominin adaptation that enables efficient evapo-transpiration during vigorous activities such as long-distance walking and running (Carrier et al., 1984; Bramble and Lieberman, 2004). An increased density of eccrine glands in 370A carriers might have been advantageous for East Asian hunter-gatherers during warm and humid seasons, which hinder evapo-transpiration.

Geological records indicate that China was relatively warm and humid between 40,000 and 32,000 years ago, but between 32,000 and 15,000 years ago the climate became cooler and drier before warming again at the onset of the Holocene (Wang et al., 2001; Yuan et al., 2004). Throughout this time period, however, China may have remained relatively humid due to varying contribution from summer and winter monsoons.

High humidity, especially in the summers, may have provided a seasonally selective advantage for individuals better able to functionally activate more eccrine glands and thus sweat more effectively (Kuno, 1956). To explore this hypothesis, greater precision on when and where the allele was under selection—perhaps using ancient DNA sources—in conjunction with more detailed archaeological and climatic data are needed.

A climate adaptation is always a good bet. The problem I have with this hypothesis is that modern day gradients in the distribution of this allele are exactly the reverse of what one might expect in terms of adaptation to heat and humidity. Additionally, is there no cost to this adaptation? After the initial sweep upward, the populations where the derived EDAR mutant is found in high frequencies went through the incredible cold of the Last Glacial Maximum, and groups like the Yakuts are known to have cold adaptations today. Not only that, but the Amerindians from the arctic to the tropics all exhibit a cold adapted body morphology, the historical consequence of the long sojourn in Beringia.

Granted, the authors are not so simplistic, and the somewhat disjointed discussion alludes to the fact that EDAR has numerous phenotypic effects, and it may be subject to diverse positive selection pressures. This seems plausible on the surface, but this complexity of mechanism seems ill-fitted to the fact that the signal of selection around this locus is so clean and crisp. It seems that this is not going to be an easy story to unpack, and there’s a good deal of implicit acknowledgement of that fact in this paper. But tacked right at the end of the main text is this whopper:

It is worth noting that largely invisible structural changes resulting from the 370A allele that might confer functional advantage, such as increased eccrine gland number, are directly linked to visually obvious traits such as hair phenotypes and breast size. This creates conditions in which biases in mate preference could rapidly evolve and reinforce more direct competitive advantages. Consequently, the cumulative selective force acting over time on diverse traits caused by a single pleiotropic mutation could have driven the rise and spread of 370A.

A simple takeaway is that the initial climatic adaptation may have given way to a cultural/sexual selective adaptation, whereby there was a preference for “good hair” as exemplified by pre-Western East Asian canons (black and lustrous), as well as a bias toward small breasts. This aspect gets picked up in The New York Times piece of course. I’ll quote again:

But Joshua Akey, a geneticist at the University of Washington in Seattle, said he thought the more likely cause of the gene’s spread among East Asians was sexual selection. Thick hair and small breasts are visible sexual signals which, if preferred by men, could quickly become more common as the carriers had more children. The genes underlying conspicuous traits, like blue eyes and blond hair in Europeans, have very strong signals of selection, Dr. Akey said, and the sexually visible effects of EDAR are likely to have been stronger drivers of natural selection than sweat glands.

The passage here is ambiguous because the author of the article, Nick Wade, doesn’t use quotes, and I don’t know what is Akey and what is Wade’s gloss on Akey. For example, for theoretical reasons of reproductive skew (a few men can have many children) in general sexual selection is considered to be driven most often by female preference for male phenotypes. I assume Akey knows this, so I suspect that that section is Wade’s gloss (albeit, a reasonable one given the proposition of preference for smaller breasts). The main question on my mind is how seriously prominent population geneticists such as Joshua Akey actually take sexual selection to be as a force driving variation and selection in human populations. It seems that quite often sexual selection is presented as a deus ex machina. A phenomenon which can rescue our confusion as to the origins of a particular suite of traits. But our assessment of the likelihood of sexual selection presumably has to be premised on prior expectations informed by a balance of different forces one can gauge from the literature, and here my knowledge of the current sexual selection literature is weak. Perhaps my skepticism is premised on my ignorance, and the population geneticists who proffer up this explanation are more informed as to the state of the literature.

All this brings me back to the farcical title. When this paper first made news last week I was having dinner with a friend of Japanese heritage (who spent his elementary school years in Japan). I asked him point blank, “Do you like small breasts?” His initial response was “WTF!?! Razib,” but as a mouse geneticist he understood the thrust of my question after I outlined the above results to him. From personal communication with many East Asian American males I am not convinced that there is an overwhelmingly strong preference for small breasts within this subset of the population. But the key here is American. These are individuals immersed in American culture. The norms no doubt differ in East Asia. The typical visual representations of celebrity East Asian females that we see in the American media depict individuals who are slimmer and more understated in their secondary sexual characteristics than is the norm among Western female celebrities (e.g., Gong Li, the new crop of Korean pop stars, even taking into account the plastic surgery of the latter). Part of this is no doubt the reality that the normal range of variation across the population differs, and part of it may be the nature of aesthetic preferences.

But the possibility of deep rooted psychological reasons driving sexual selection (to my knowledge there was no culture which spanned South China and Siberia) brings us back to old ideas about the Pleistocene mind. And, it brings us back to evolutionary psychology, a field which is the whipping boy of both skeptics of the utility of evolutionary science in understanding human nature, and rigorous practitioners of evolutionary biology. And yet here it is not the evolutionary psychologists, but rock-ribbed statistical geneticists who I often see being quoted in the media invoking sexual selection. But do we know it is sexual selection, or is it just our best guess? Because more often than not best guesses are wrong (though best guesses are much more likely to be right than worst guesses!).

Evolutionary genomics has come a long way in the past 10 years. We know, for example, the genetic architecture and some aspects of the natural history of many traits. But, there are still shortcomings. Lactase persistence is the exception to the rule. Even a phenotype as straightforward as human pigmentation has no undisputed answer as to why it has been the repeated target of selection across Eurasia over the past 40,000 years. Oftentimes the right answer is simply that we just don’t know.

Citation: http://dx.doi.org/10.1016/j.cell.2013.01.016

(Republished from Discover/GNXP by permission of author or representative)
 

A few days ago I was browsing Haldane’s Sieve, when I stumbled upon an amusing discussion which arose on its “About” page. This “inside baseball” banter got me to thinking about my own intellectual evolution. Over the past few years I’ve been delving more deeply into phylogenetics and phylogeography, enabled by the rise of genomics, the proliferation of ‘big data,’ and accessible software packages. This entailed an opportunity cost. I did not spend much time focusing on classical population and evolutionary genetic questions. Strewn about my room are various textbooks and monographs I’ve collected over the years, and which have fed my intellectual growth. But I must admit that it is a rare day now that I browse Hartl and Clark or The Genetical Theory of Natural Selection without specific aim or mercenary intent.

R. A. Fisher

Like a river inexorably coursing over a floodplain, with the turning of the new year it is now time to take a great bend, and double-back to my roots, such as they are. This is one reason that I am now reading The Founders of Evolutionary Genetics. Fisher, Wright, and Haldane are like old friends, faded, but not forgotten, while Muller was always but a passing acquaintance. But ideas 100 years old still have power to drive us to explore deep questions which remain unresolved, but where new methods and techniques may shed greater light. A study of the past does not allow us to make wise choices which can determine the future with any certitude, but it may at least increase the luminosity of the tools which we have to illuminate the depths of the darkness. The shape of nature may become just a bit less opaque through our various endeavors.

Figure from “Directional Positive Selection on an Allele of Arbitrary Dominance”, Teshima KM, Przeworski M

So what of this sieve of Haldane? As noted at Haldane’s Sieve the concept is simple. Imagine two mutations, one which expresses a trait in a recessive fashion, and another in a dominant one. The sieve operates by favoring dominantly expressing variants as they emerge out of the low frequency zone where stochastic forces predominate (i.e., even if an allele confers a large fitness benefit, at low frequencies the power of random chance may still imply that it is highly likely to go extinct). An example of this would be lactase persistence, which in the modal Eurasian variant seems to exhibit dominance. In the converse case, beneficial mutations which are recessive in expression suffer from a structural problem: their benefit is more theoretical than realized.

The mathematics of this is exceedingly simple, a consequence of the Hardy-Weinberg dynamics of diploid random mating organisms. Let’s use the gene which is implicated in variation in lactase persistence as an example, LCT. Consider two alleles, LP and LNP, where the former confers persistence (one can digest lactose sugar as an adult), and the latter manifests the conventional mammalian ‘wild type’ (the production of lactase ceases as one leaves the life stage when nursing is feasible). LP is clearly the novel mutant. In a small population it is not unimaginable that by random chance the frequency of LP rises to ~10%. What now? At HWE you have:

$p^2 + 2pq + q^2 = 1$, where $q$ is the frequency of the LP allele. At ~10% the numbers substituted would be:

$(0.90)^2 + 2(0.90)(0.10) + (0.10)^2$

This is where dominance or recessive expression is highly relevant. The reality is that LP is a dominant trait. So in this population the frequency of LP as a trait would be:

$(0.10)^2 + 2(0.90)(0.10) = 19\%$

Now imagine a model where LP is favored, but it expresses in a recessive fashion. Then the frequency of the trait would equal $q^2$, the homozygote LP-allele proportion. That is, 1%. Though population genetics is often constructed on an algebraic foundation, the results lend themselves to intuition. A structural parameter endogenous to the genetic system, dominant or recessive expression, can have longstanding consequences in terms of the likely trajectory of the alleles. Selection only “sees” the trait, so a recessive trait with sterling qualities may as well be a trait with no qualities. In contrast, a dominantly expressed allele can cut like a scythe through a population, because every copy “counts.”
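
The arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for the example, the selection coefficient s = 0.10 is an arbitrary value rather than an estimate for LCT, and the model is the textbook one-locus, two-allele, random-mating case in which anyone showing the trait has relative fitness 1 + s.

```python
def trait_frequency(q, dominant):
    """Frequency of the LP *trait* at Hardy-Weinberg equilibrium.

    q is the frequency of the LP allele; p = 1 - q is the LNP allele.
    """
    p = 1.0 - q
    if dominant:
        return q * q + 2.0 * p * q   # homozygotes plus heterozygotes
    return q * q                     # homozygotes only


def next_allele_frequency(q, s, dominant):
    """LP allele frequency after one generation of selection.

    Assumption for illustration: individuals showing the trait have
    relative fitness 1 + s, everyone else has fitness 1.
    """
    p = 1.0 - q
    w_hom = 1.0 + s
    w_het = 1.0 + s if dominant else 1.0
    mean_w = q * q * w_hom + 2.0 * p * q * w_het + p * p
    return (q * q * w_hom + p * q * w_het) / mean_w


if __name__ == "__main__":
    q, s = 0.10, 0.10  # s is a made-up value, not an estimate for LCT
    for dominant in (True, False):
        label = "dominant" if dominant else "recessive"
        trait = trait_frequency(q, dominant)
        delta = next_allele_frequency(q, s, dominant) - q
        print(f"{label:9s} trait = {trait:.2%}  change in q = {delta:+.5f}")
```

With these illustrative numbers the allele gains roughly 0.8 percentage points in one generation when it is dominant, and roughly 0.09 when it is recessive, about a ninefold difference from the same starting frequency and the same benefit to the trait.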

In preparation for this post I revisited the section on Haldane’s Sieve in the encyclopedic Elements of Evolutionary Genetics. The authors note that this phenomenon, though of vintage character as these things can be reckoned in a field as young as evolutionary genetics, is still a live one. The dominance of favored mutations in wild populations, or the recessive character of deleterious ones in laboratory stock, may reflect the different regimes which these two gene pools are subject to. The nature of things is such that it is easier to generate recessive mutations than dominant ones (i.e., loss is easier than gain), so the preponderance of dominant variants in wild stocks subject to positive selective pressure lends credence to the idea that evolutionary rather than developmental forces and constraints shape the genetic character of many species.

And yet things are not quite so tidy. Haldane’s Sieve, and the framework of dominant versus recessive alleles, operates differently in the area of sex chromosomes. In many lineages there is a ‘heterogametic sex’ which carries only one functional copy of the major sex chromosome. In mammals this is the male (XY), while in birds this is the female (ZW). As males have only one functional copy of most genes on the sex chromosome, the masking effect of recessive expression does not apply to them in mammals. This may imply that because of the exposure of many deleterious recessive variants to natural selection within the heterogametic sex one would see different allelic distributions and genetic landscapes on these chromosomes (e.g., more rapid adaptation because of the exposure of nominally recessive alleles in the heterogametic sex, as well as more purifying selection on deleterious variants). But the reality is more complex, and the literature in this area is somewhat muddled. More precisely, it seems phylogenetically sensitive. Validation of the theory in mammals founders once one moves to Drosophila.
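
To make the exposure argument concrete, here is a small sketch comparing how often a copy of a recessive allele is actually expressed on an autosome versus on the X chromosome. The assumptions are mine, not the literature’s: random mating, a 1:1 sex ratio, the same allele frequency q in both sexes, and full expression in hemizygous males. The function names are hypothetical.

```python
def exposed_fraction_autosome(q):
    """Share of copies of a recessive allele that sit in homozygotes.

    Under random mating, a given copy is paired with a second copy
    with probability q, so only that share is visible to selection.
    """
    return q


def exposed_fraction_x(q):
    """Same quantity for an X-linked recessive allele in an XY system.

    With a 1:1 sex ratio, one third of all X chromosomes are in males,
    where every copy is expressed; the other two thirds are in females,
    where a copy is expressed only when paired with another (prob. q).
    """
    return (1.0 / 3.0) + (2.0 / 3.0) * q


if __name__ == "__main__":
    for q in (0.01, 0.10):
        a = exposed_fraction_autosome(q)
        x = exposed_fraction_x(q)
        print(f"q = {q:.2f}: autosome {a:.1%}, X chromosome {x:.1%}")
```

For a rare allele (q = 0.01) about 1% of copies are exposed on an autosome versus about 34% on the X under these assumptions, which is the intuition behind expecting both faster adaptation and stronger purifying selection on sex chromosomes.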

And that is why research in evolutionary genetics continues. The theory stimulates empirical exploration, and is tested against it. Much of the formal theory of classical evolutionary genetics, which crystallized in the years before World War II, is now gaining renewed relevance because of empirical testability in the era of big data and big computation. This is a domain where the past is not simply of interest to historians. Scientists themselves, chasing the next grant, and producing the expected stream of publications, may benefit from a little historical perspective by standing upon the shoulders of giants.

(Republished from Discover/GNXP by permission of author or representative)
 

The above map shows the population coverage for the Geno 2.0 SNP-chip, put out by the Genographic Project. Their paper outlining the utility and rationale of the chip is now out on arXiv. I saw this map last summer, when Spencer Wells hosted a webinar on the launch of Geno 2.0, and it was the aspect which really jumped out at me. The number of markers that they have on this chip is modest, only a bit more than 100,000 on the autosomes, with a few tens of thousands more on the X, Y, and mtDNA. In contrast, the Axiom® Genome-Wide Human Origins 1 Array Plate being used by Patterson et al. has ~600,000 SNPs. But as is clear from the map above Geno 2.0 is ascertained in many more populations than the other comparable chips (Human Origins 1 Array uses 12 populations). It’s obvious that if you are only catching variation in a few populations, all the extra million markers may not give you much bang for the buck (not to mention the biases that that may introduce in your population genetic and phylogenetic inferences).


To the left is the list of populations against which the Human Origins 1 Array was ascertained, and they look rather comprehensive to me. In contrast, for Geno 2.0 ‘ancestrally informative markers’ were ascertained on 450 populations. The ultimate question for me is this: is all the extra ascertainment on diverse and obscure groups worth it? On first inspection Geno 2.0’s number of SNPs looks modest as I stated, but in my experience when you quality control and merge different panels together you are often left with only a few hundred thousand SNPs in any case. 100-200,000 SNPs is also sufficient to elucidate relationships even in genetically homogeneous regions such as Europe in my experience (it’s more than enough for model-based clustering, and seems to be overkill for MDS or PCA). One issue that jumps out at me about the Affymetrix chip is that it is ascertained toward the antipodes. In contrast, Geno 2.0 takes into account the Eurasian heartland. I suspect, for example, that Geno 2.0 would be better for population or ancestry assignment for South Asians because it would have more informative markers for those populations.

Ultimately I can’t really say much more until I use both marker sets in different and similar contexts. Since Geno 2.0 consciously excludes many functional and medically relevant SNPs its utility is primarily in the domain of demographics and history. If the populations in question are well covered by the Human Origins 1 Array, I see no reason why one shouldn’t go with it. Not only does it have more information about biological function, but the number of markers is many fold greater. On the other hand, Geno 2.0 may be more useful on the “blank zones” of the Affy chip. Hopefully the Genographic Project results paper for Geno 2.0 will come out soon and I can pull down their data set and play with it.

Cite: arXiv:1212.4116

(Republished from Discover/GNXP by permission of author or representative)
 

To understand nature in all its complexity we have to cut the riotous variety down to size. For ease of comprehension we formalize with math, verbalize with analogies, and visualize with representations. These approximations of reality are not reality, but when we look through the glass darkly they give us filaments of essential insight. Dalton’s model of the atom is false in important details (e.g., atoms turn out to be divisible, with protons and neutrons in turn composed of quarks), but it still has conceptual utility.

Likewise, the phylogenetic trees popularized by L. L. Cavalli-Sforza in The History and Geography of Human Genes are still useful in understanding the shape of the human demographic past. But it seems that the bifurcating model of the tree must now be strongly tinted by the shades of reticulation. In a stylized sense inter-specific phylogenies, which assume the approximate truth of the biological species concept (i.e., little gene flow across lineages), mislead us when we think of the phylogeny of species on the microevolutionary scale of population genetics. On an intra-specific scale gene flow is not just a nuisance parameter in the model, it is an essential phenomenon which must be accommodated into the framework.


This is on my mind because of the emergence of packages such as TreeMix and AdmixTools. Using software such as these on the numerous public data sets allows one to perceive the reality of admixture, and overlay lateral gene flow upon the tree as a natural expectation. But perhaps a deeper result is that the character of the tree itself is torn asunder. The figure above is from a new paper, Efficient moment-based inference of admixture parameters and sources of gene flow, which debuts MixMapper. The authors bring a lot of mathematical heft to their exposition, and I can’t say I follow all of it (though some of the details are very similar to Pickrell et al.’s). But in short it seems that in comparison to TreeMix, MixMapper allows for more powerful inference on a narrower set of populations, selected for exploring very specific questions. In contrast, TreeMix explores the whole landscape with minimal supervision. Having used the latter I can testify that that is true.

The big result from MixMapper is that it extends the result of Patterson et al., and confirms that modern Europeans seem to be an admixture between a “north Eurasian” population, and a vague “west Eurasian” population. Importantly, they find evidence of admixture in Sardinians, which implies that Patterson et al.’s original analyses were not sensitive to admixture in putative reference populations (note that Patterson is a coauthor on this paper as well). The rub, as noted in the paper, is that it is difficult to estimate admixture when you don’t have “pure” ancestral reference populations. And yet here the takeaway for me is that we may need to rethink our whole conception of pure ancestral populations, and imagine a human phylogenetic tree as a series of lattices in eternal flux, with admixed nodes periodically expanding so as to generate the artifice of a diversifying tree. The closer we look, the more likely it seems that most of the populations which have undergone demographic expansion in the past 10,000 years are also the products of admixture. Any story of the past 10,000 years, and likely the past 100,000 years, must give space at the center of the narrative arc to lateral gene flow across populations.

Cite: arXiv:1212.2555 [q-bio.PE]
(Republished from Discover/GNXP by permission of author or representative)
 

While I was at Spencer Wells’ poster at ASHG I was primarily curious about bar plots. He’s got really good spatial coverage, so I’m moderately excited about the paper (though I didn’t see much explicit testing of phylogenetic hypotheses, which I think this sort of paper has to do now; we’re beyond PCA and bar plots only papers). That being said, Spencer was more interested in me promoting the Scientific Grants Program. Here’s some more information:

The Genographic Project’s Scientific Grants Program awards grants on a rolling basis for projects that focus on studying the history of the human species utilizing innovative anthropological genetic tools. The variety of projects supported by the scientific grants will aim to construct our ancient migratory and demographic history while developing a better understanding of the phylogeographic structure of world populations. Sample research topics could include subjects like the origin and spread of the Indo-European languages, genetic insights into Papua New Guinea’s high linguistic diversity, the number and routes of migrations out of Africa, the origin of the Inca, or the genetic impact of the spread of maize agriculture in the Americas.

Recipients will typically be population geneticists, students, linguists, and other researchers or scientists interested in pursuing questions relevant to the Genographic Project’s broad goal of exploring our migratory history. Recipients of Genographic scientific grant funds will become members of the Genographic Consortium, and will be expected to act as agents of the greater Genographic mission, participating in and reporting on multiple aspects of Genographic fieldwork, in addition to their own proposed and mission‐aligned pilot projects. Openness and transparency within the Consortium are the key values of the project’s research team, and grantees will be expected to abide by this code of conduct.


If you poke through their material they say that the grant will be $25,000 to $50,000. That’s 125 to 250 Geno 2.0 chips. Speaking of which, I sent in a chip about a month ago now. The results should be back soon.

So why was Spencer so keen on me pushing this again (I’ve mentioned it before)? After being at ASHG 2012 I’m shocked at the small sample space of people interested in these sorts of historical genetic questions. I say this because I’ve reviewed/read most of the papers which were present as posters. I wonder on occasion if I’m missing out on something, but these results indicate no, there are only so many labs doing this sort of work. The last is the key question. This is where “bottom up” non-academic science can do wonders. An Indian group presented a poster at ASHG, and when they told me of the similarities between Iyers and Bengali Brahmins I couldn’t help but admit that “Yes, I know that already, my friend Zack Ajmal came to that conclusion.” If you are an academic you need to go beyond tools and methods and analytic insights which someone with a spare computer and some marginal free time can generate. Academic monopolies on these data are going to be short-lived at best. And all for the good. I’m sick & tired of intellectual rents.

(Republished from Discover/GNXP by permission of author or representative)
 

As most readers know I was at ASHG 2012. I’m going to divide this post in half. First, the generalities of the meeting. And second, specific posters, etc.

Generalities:

- Life Technologies/Ion Torrent apparently hires d-bag bros to represent them at conferences. The poster people were fine, but the guys manning the Ion Torrent Bus were total jackasses if they thought it would be funny/amusing/etc. Human resources acumen is not always a reflection of technological chops, but I sure don’t expect organizational competence if they (HR) thought it was smart to hire guys who thought (the d-bags) it would be amusing to alienate a selection of conference goers at ASHG. Go Affy & Illumina!

- Speaking of sequencing, there were some young companies trying to pitch technologies which will solve the problem of lack of long reads. I’m hopeful, but after the Pacific Biosciences fiasco of the late 2000s, I don’t think there’s a point in putting hopes on any given firm.

- I walked the poster hall, read the titles, and at least skimmed all 3,000+ posters’ abstracts. No surprise that genomics was all over the place. But perhaps a moderate surprise was how big exomes are getting for medically oriented people.

- Speaking of medical/clinical people, I noticed that in their presentations they used the word ‘Caucasian’ a lot. This was not evident in the pop-gen folks. It shows the influence of bureaucratic nomenclature in modern medicine, as they have taken to using somewhat nonsensical US Census Bureau categories.

- Twitter was a pretty big deal. There were so many interesting sessions that I found myself checking my feed constantly for the #ASHG2012 hashtag. It was also an easy way to figure out who else was at the same session (e.g., in my case, very often Luke Jostins).

- If you could track the patterns of movements of smartphones at the conference it would be interesting to see a network of clustering of individuals. For example, the evolutionary and population genomics posters were bounded by more straight-up informatics (e.g., software to clean your raw sequence data), from which there was bleed over. But right next to the evolution and population genomics sections (and I say genomics rather than genetics, because the latter has been totally subsumed by the former) you had some type of pediatric disease genetics aisles. I wasn’t the only one to have a freak out when I mistakenly kept on moving (i.e., you go from abstruse discussions of the population structure of Ethiopia, to concrete ones about the likely probability of death of a newborn with an autosomal dominant disorder, with photos of said newborn!).

- It was obvious which sessions were more multidisciplinary: just note the “churn” between speakers. People were switching sessions speaker-by-speaker, so if there was a stretch not to their liking, they would opt out.

- Number of questions per talk seemed to follow a power law. Many, many, talks had to have the moderator ask a token question. But there were a few panels where people rushed to the mics, and the moderators had to turn them away (this happened to me a couple of times, though I had the habit of sitting in the middle of aisles so that people wouldn’t have to edge past me, which disadvantaged me).

- 23andMe will supposedly have the new ancestry painting, with many more populations, up by the end of the year at the latest. I’ll believe it when I see it, but the person who was telling me this seemed totally sincere, and I’m hopeful.

- I drank a fair amount some of the nights, and have a lot of business cards from people I don’t remember. But one thing that seems to be emerging is a proliferation of intermediation and b2b services. With the diversity of choices it stands to reason that some firms are stepping into the clutter and attempting to make a profit by matching the two parties at the ends of the transaction. One person I do recall is Michael Heltzen of BlueSEQ, which he pitches as the “Hotels.com of sequencing.”

- Overall this was a well run conference in terms of logistics. I’ll definitely be at Boston next year!

- Lots of stuff on archaic hominin admixture and selection. Lots.

- Friends don’t let friends use structure when they could use admixture. It seems that most people have switched to the latter, which is fast. But a few groups are still using the former. And they shouldn’t be, because they set their burn-in and replication parameters way too low just so that a run won’t take a thousand years to finish (I use structure for microsatellites). If you are doing this, why go for the power of Bayesian phylogenetics in the first place?

- Luke Jostins suggested that I looked different in real life from the head shot. I suspect that perhaps Luke has lower powers of perception in this domain. A very drunk member of a well respected lab decided to start yelling my name at a dim after party,* so I can’t look that different (and it isn’t as if there aren’t a lot of brown dudes walking around at these things).

- Konrad Karczewski gave me a “free the data” button which I wore, but there were mixed results when I asked if people were going to release their data sets. Some presenters offered to email me the data, but since I wasn’t flashing my badge I’m curious why they’d offer to even do this, as opposed to releasing it to millions of other strangers!

Specifics:

This section will mostly cover the talks and posters in the evolutionary and population genomics category. I can comment on the talks because I went to them, and on the posters because I looked at them multiple times. One thing to note is that many of the posters and some of the talks were on papers which are already in circulation (preprint or already published). I’m not going to touch on that much. I’ve reviewed/linked to most of those.

- One of the guys involved in fetal whole genome sequencing from last spring stated that the primary cost here is going to be in the sequencing. He’s also confident that they can move the sequencing much further back from 18 weeks (i.e., in terms of sample collection and analysis turnaround).

- There is a lot of talk about structural variants, etc., but for high-throughput sequencing methods we’re still not ‘there.’ I actually went to a CNV talk where the presenter presented some RFLP results! He stated that the reality was that for clinical purposes high-throughput isn’t feasible or accurate enough to distinguish 3 vs. 4 vs. 5 copies.

- I don’t get a lot of the CNV stuff which repeats SNP-results with CNVs. For example, the posters which recapitulated geographical fine-structure with CNV. This was OK for the first pub, but doing it over and over again seems gratuitous.

- Simon Gravel has some very awesome software.

- Luca Pagani is confident about rolloff’s admixture estimate for Ethiopia. He’s moving to Ethiopian whole genomes now, and plans on doing follow ups on this question (his own methods are in line with rolloff).

- The rumored paper (i.e., I’ve heard about this paper for a few years) which connects Northeast African populations with the Khoisan of southern Africa will finally be published soon. At least that was what I was told…as I noted, this result has been around for a long time, but somehow hasn’t been published. Basically the group has some Cushitic speaking samples from Kenya, and it looks like these are the Ethiopian analog of Andaman Islanders (or as close as you can get).

- We’ll see something on Afrikaner genomics soon enough. I wasn’t told explicitly, but it was pretty obvious.

- The Nielsen Group is still working on high altitude adaptations. They don’t see hard sweeps. Of course I didn’t get confirmation of whether these were old variants, but it looks as if a lot of preliminary stuff did not have the power to detect anything in the first group. As usual they are up to something.

- Speaking of the Nielsen Group, Melissa Wilson Sayres’s work on purifying selection on Y chromosomal lineages was persuasive to me. Basically, effective population differences (e.g., polygyny) just cannot explain the lower diversity of the Y lineages (they ran simulations). Luckily for the phylogeographers this won’t impact the utility of Y trees (positive selection would, but that’s not what she’s talking about). I’m a little confused whether it was Sayres’ talk or not, but these results may explain the discordance in coalescence between mtDNA and Y lineages (the former has a deeper coalescence).

- Also, Amy Goldberg from Noah Rosenberg’s lab presented some theoretical work that showed that complex demographic history has an impact on the variance, as opposed to the mean, effective population size you might infer for a given sex. Someone from Michael Hammer’s lab started asking me if I liked their research while I was looking at Amy’s poster, and I said sure (I’d blogged it), but her theoretical results also explain some of the weird stuff I’d seen out of their lab.

- Sriram Sankararaman had a poster on Neandertal admixture in modern human lineages. In the broad outlines the Reich lab and the Wall lab seem to agree (along with others, such as Melinda Yang in the Slatkin lab). We’re seeing the convergence of a new orthodoxy/paradigm. And they seem to agree broadly with Graham Coop’s conjecture.

- There was a lot of stuff on East Asian genetics, but nothing too cutting edge. I was kind of disappointed. A massive Y and mtDNA study did suggest two waves of admixture in the Tibetan highland, which a priori seems plausible to me. But the rule-of-thumb I have is not to bet against the Nielsen Group, which remains skeptical. Another paper suggested deep lineages of haplogroup M among the Burmans. This is interesting because the Burmans are presumably culturally somewhat intrusive, supplanting the Mon populations.

- The guy from the Peopling of the British Isles presented. Two points. First, ~40 percent of the ancestry in England proper seems Anglo-Saxon. Second, their clustering method seemed to find many more ‘micro-populations’ along the “Celtic Fringe” and in Scotland. Why? My hunch is that the Anglo-Saxon expansion wasn’t a diffusion process. Rather, the hordes of Hengist and Horsa probably admixed with the local Brythonic Celtic population on the East Anglian shore, and then rapidly expanded. There is a high probability of some later assimilation (there is some suggestion that Alfred the Great’s line were Brythonic nobles who were absorbed into the Anglo-Saxon power structure), but the emergence of a huge Anglo-Saxon/England proper cluster was very evident in the figure displayed. The main opposition to this thesis I can think of is that isolation-by-distance gene flow is very efficacious in the topography of England, but less so in the more rugged borderlands.

- Speaking of isolation-by-distance, an Estonian geneticist claimed to me that the distinction between Estonians and Finns probably has to do with the arrival of the original Finnic populations from the east, and their subsequent separation. While the Estonians engaged in gene flow with the Latvians, they diverged from the Finns across the water, who were more isolated until the Swedes arrived.

- There was a poster (didn’t talk to the presenters) which did whole genome analysis of a South Indian man or two, and indicated that there is evidence that these individuals are basal to all other non-Africans. This is another attempt to reaffirm the possibility of an ancient “southern route” out of Africa. I wasn’t convinced because there wasn’t much detailing of their methods (they pointed to a diversity estimate, but that’s not enough these days).

- Another Indian group confirmed a lot of stuff that Zack has found already, but supplemented it with lots of low caste/tribal samples, which most people lacked. They assert (rightly) that within South Asia there are genetic distances across populations/castes which are analogous to inter-continental differences.

- I am excited by the synthesis of spatial and genetic variation data…but am beginning to realize that this has limitations, because we can’t transpose genetic variation representation onto tesseracts (because we can’t visualize tesseracts). In short, two or three dimensional representations remove important information at the finer-grain. And it’s at the finer-grain that we’re focusing now.

- Apparently Mexicans and Chileans overestimate their European ancestry. The presenters found that 40-45% of the ancestry of their Chilean sample was Amerindian. I asked about sampling, and they admitted this might be an issue. The same applied to their other results. We need thicker data sets here. Basically if it’s a heterogeneous country, you can’t have a pie-graph labelled with that country.

- There was a poster on associating OCA2-HERC2 in Brazilians with hair, eye, and skin color. The association of OCA2-HERC2 with skin color in unadmixed Europeans is mixed, but seems to show up in this population. Assuming stratification is not a problem (I believe they looked at that genomically), it seems that the effect on skin only shows up when you have a particular pigmentation genetic architecture. It’s a matter of statistics, not biology.

- Speaking of pigment, Mark Shriver had a poster which correlated perceived, apparent, and genomic racial ancestry. Perceived means how you’re perceived by others. Apparent is taking physical traits and averaging them quantitatively (facial features, skin color, etc.). And genomic ancestry is what you know about. Estimating ancestry quanta. The surprising thing is that people seemed to underestimate African ancestry from apparent physical features (looking at the scatter of apparent to genomic ancestry). This goes against folk wisdom, which asserts that “African features are dominant.”

- Lots of corrections of naive usages of Fst in the literature. A poster out of the Price lab suggested using likelihood ratios, and if not possible, Hudson’s Fst. This showed up multiple times in various forms. Fst will not die, but will be reborn!

- Saw a poster which claimed first cousin marriage decreases the expected height of offspring by 3 cm! (this was not in the evolution and pop genomics sections, and I probably should have spent more time looking over complex traits, etc., but there’s only so much you can do)

- More evidence of multiple migrations into New World. Lots of New World genomics. I didn’t talk to these presenters because they were always busy.

- Spencer Wells told me that they’d finally be publishing their paper using their Geno 2.0 results soon. They had really good population coverage, though I wish they’d had the bar plot rotated 90 degrees. I couldn’t read labels too well.
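On the Fst point above, here is a minimal sketch of a Hudson-style estimator, in the ratio-of-averages form that I understand is usually meant when people say "Hudson's Fst." The function name, the toy frequencies, and the choice to treat n1 and n2 as numbers of sampled chromosomes held constant across SNPs are my own assumptions for illustration, not anything taken from the poster.

```python
def hudson_fst(freqs_pop1, freqs_pop2, n1, n2):
    """Hudson-style Fst across SNPs, computed as a ratio of averages.

    freqs_pop1, freqs_pop2: sample allele frequencies at the same SNPs.
    n1, n2: sampled chromosomes per population (assumed constant across SNPs).
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        # Squared frequency difference, corrected for finite-sample noise.
        numerator += ((p1 - p2) ** 2
                      - p1 * (1 - p1) / (n1 - 1)
                      - p2 * (1 - p2) / (n2 - 1))
        # Probability that one allele drawn from each population differs.
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    # Sum numerators and denominators separately, then divide once.
    return numerator / denominator


# Toy data (hypothetical): two populations fixed for opposite alleles.
fst_fixed = hudson_fst([1.0, 0.0], [0.0, 1.0], n1=20, n2=20)
# Toy data (hypothetical): identical sample frequencies in both populations.
fst_same = hudson_fst([0.5], [0.5], n1=20, n2=20)
```

Summing numerator and denominator over SNPs before dividing, rather than averaging per-SNP ratios, keeps rare variants from dominating the estimate. Note that the sample-size correction can push the value slightly below zero when the two populations look identical.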

Finally, there was A LOT of software, and A LOT of methods. This is one of the things where I assume over the next decade it will shake out into a few big players. Right now labs are pumping out software to infer ancestry, phase data, etc., and playing up their advantages. This is all good, but at some point the focus will go back to biology, and the software will be the wind beneath its wings. I’m trying to free up time to play with some of the software, though much of it isn’t online yet (the presenters always assured it would be up soon, but I know how that goes).

* This was not a pleasant experience.

(Republished from Discover/GNXP by permission of author or representative)
 

OK, perhaps I can help with that. Dr. Coop speaks of the collaboration between himself & Dr. Joseph Pickrell, Haldane’s Sieve, which I added to my RSS days ago (and you can see me pushing it to my Pinboard). From the “About”:

As described above, most posts to Haldane’s Sieve will be basic descriptions of relevant preprints, with little to no commentary. All posts will have comment sections where discussion of the papers will be welcome. A second type of post will be detailed comments on a preprint of particular interest to a contributor. These posts could take the style of a journal review, or may simply be some brief comments. We hope they will provide useful feedback to the authors of the preprint. Finally, there will be posts by authors of preprints in which they describe their work and place it in broader context.

We ask the commenters to remember that by submitting articles to preprint servers the authors (often biologists) are taking a somewhat unusual step. Therefore, comments should be phrased in a constructive manner to aid the authors.

It might be helpful if other evolution/genetics bloggers reblog this so we can push it up the Google search results. If you google “Haldane’s Sieve” some of the results are interesting…and not necessarily in a good way. I do feel guilt blogging on stuff my readers can’t get, so the more preprints become acceptable the more we (as in, the general public) can understand about evolution.

(Republished from Discover/GNXP by permission of author or representative)
 

The Pith: the evolution of lighter skin is complex, and seems to have occurred in stages. The current European phenotype may date to the end of the last Ice Age.

A new paper in Molecular Biology and Evolution, The timing of pigmentation lightening in Europeans, is rather interesting. It’s important because skin pigmentation has been one of the major successes of the first age of human genomics. In 2002 we really didn’t know the nature of normal human variation in skin color in terms of specific genes (basically, we knew about MC1R). This is what Armand Leroi observed in Mutants in 2005, wondering about our ignorance of such a salient trait. Within a few years though Leroi’s contention was out of date (in fact, while Mutants was going to press it became out of date). Today we do know the genetic architecture of pigmentation. This is why GEDmatch can predict that my daughter’s eyes will be light brown from just her SNPs (they are currently hazel). This genomic yield was facilitated by the fact that pigmentation seems to be a trait where most human variation is controlled by half a dozen genes. In contrast, height or I.Q. are controlled by innumerable genes.

But first, a major gripe. In the discussion they write: “Our estimates additionally show that the onset of selective sweeps at SLC24A5, SLC45A2, and TYRP1, the three genes in which the geographic distribution of the polymorphisms is primarily restricted to European populations.” This is just not literally true. SLC24A5 in its derived skin lightening state is found outside of Europe. As the map from the HGDP browser to the left indicates, the derived “European” variant is nearly fixed in Middle Easterners. If you subtract Sub-Saharan admixture it almost is fixed in Middle Easterners. It is also found in high frequencies in South Asians. The HGDP samples are Pakistani, but the derived variant is present at a frequency of 95% in the HapMap Gujaratis! My parents are also homozygotes for the derived “European” variant. I’m rather sure there are more copies of the derived “European” allele among non-Europeans: South Asians, Middle Easterners, and North Americans. The problem here is semantic I think. The authors were really talking about West Eurasians in a generic sense, but because their data utilized Europeans, East Asians, and Africans, they felt like they had to speak about Europeans specifically. Additionally, during the Last Glacial Maximum much of Europe was not inhabited, or very sparsely so. That suggests to me that much of the evolution of “European pigmentation” may have taken place outside of geographical Europe proper.

As for the paper, the results are pretty simple and striking. And speaking of striking, I’ll just paste this figure illustrating a neighbor-joining network of haplotypes at four skin pigmentation loci first to orient you. The yellow bubbles are derived lineages (in this case, they are often associated with SNPs correlated with lighter skin), while the black are ancestral ones.

What you see in the first two panels is that derived lineages are tightly clustered. SLC24A5 looks in particular to have almost a “star phylogeny,” so that you are seeing signatures of rapid expansion of this haplotype. SLC45A2 in contrast is dispersed across the networks. The authors posit that there may have been a recombination event which resulted in the jumping of the derived lineage onto the background of the ancestral one. Finally, with KITLG you see a pattern where numerous derived lineages are widely dispersed, albeit differentiated from the ancestral branch.

How did they do this? For the purposes of this blog post what I will say is that they first focused on a SNP, a single nucleotide polymorphism, associated with the lightening of the skin. This need not be the causal mutation, but generally they are strongly associated with the trait, and so can serve as useful markers. Second, around these focal SNPs they assembled a set of microsatellites with which they could perform phylogenetic tests. Microsatellites mutate fast, and accumulate variation. The main issue is that they mutate so fast you lose resolution at deeper time depths.

With the combination of SNP and microsatellite data the authors tested their empirical patterns against explicit models from which they generated simulations. Basically the goal here was to test for neutrality. In other words, you have a set of outcomes you’d expect based on neutral dynamics (i.e., just drift changing the frequencies), and you see how the “real world” results fit in. If the empirical data are not well explained by the neutral model, perhaps it was selection? Looking at patterns of variation around these loci you can also get a sense of the strength of the selection and time since the last common ancestor. Here’s a table with the outcomes:

Just so you know, a selection coefficient of 0.01 is respectable, and 0.10 is massive. In particular in the case of SLC24A5 it looks like there was a lot of selection, and recently. A few years ago a conference presentation implied that the selective sweep around SLC24A5 began ~6,000 years ago. To my knowledge a paper never came out of this, and from what I’ve heard in part that’s because that very low number is probably not right, and you may have to push it back some. These results look to be around the right range from what I’ve heard. Others have found similar ages for SLC24A5 and SLC45A2 sweeps. But take a look at the confidence intervals. This is a case where I would really like to play around with their data and the model assumptions, and see how robust they are.
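To make the logic of testing against neutrality, and the scale of those selection coefficients, a bit more concrete, here is a toy sketch. It is emphatically not the authors' model (they worked with SNP plus microsatellite haplotypes and explicit demographic simulations). It is a bare Wright-Fisher simulation with simple genic selection, where every function name, population size, starting frequency, and generation count is an assumption I made up for illustration. Setting s = 0 gives the neutral expectation, and s > 0 shows how much more often an allele climbs to high frequency.

```python
import math
import random


def final_frequency(start_freq, pop_size, generations, s, rng):
    """Allele frequency after drift plus genic selection of strength s."""
    n_chrom = 2 * pop_size  # diploid population
    freq = start_freq
    for _gen in range(generations):
        if freq in (0.0, 1.0):
            break  # allele lost or fixed
        # Selection shifts the expected frequency; s = 0 leaves it unchanged.
        expected = freq * (1 + s) / (1 + s * freq)
        # Drift: each chromosome in the next generation is a random draw.
        count = sum(1 for _c in range(n_chrom) if rng.random() < expected)
        freq = count / n_chrom
    return freq


def fraction_reaching(target_freq, s, replicates=60, start_freq=0.05,
                      pop_size=100, generations=100, seed=1):
    """Share of simulated populations whose allele ends at or above target_freq."""
    rng = random.Random(seed)
    hits = sum(
        1 for _r in range(replicates)
        if final_frequency(start_freq, pop_size, generations, s, rng) >= target_freq
    )
    return hits / replicates


def sweep_generations(s, p0=0.01, p1=0.99):
    """Deterministic generations to go from p0 to p1 (the odds grow by 1 + s each generation)."""
    def log_odds(p):
        return math.log(p / (1 - p))
    return (log_odds(p1) - log_odds(p0)) / math.log(1 + s)


neutral_share = fraction_reaching(0.9, s=0.0)    # null expectation under drift alone
selected_share = fraction_reaching(0.9, s=0.10)  # "massive" selection
slow_sweep = sweep_generations(0.01)             # generations for 1% -> 99% at s = 0.01
fast_sweep = sweep_generations(0.10)             # generations for 1% -> 99% at s = 0.10
```

In the deterministic version, s = 0.01 takes on the order of 900 generations to carry an allele from 1% to 99%, while s = 0.10 does it in under 100. If the observed frequency is something that drift alone almost never produces in the simulations, that is the sense in which the neutral model is "not well explained" by the data.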

More intuitively obvious though are the patterns of KITLG in terms of geography, as well as the haplotype phylogenetic tree. The authors basically conclude that KITLG is a variant which precedes the differentiation between Europeans and East Asians, while the other genes have sweeps which postdated the divergence. The latter makes sense in light of the differentiation in skin pigmentation architecture in western and eastern Eurasians. Repeatedly the authors basically admit that this is a complicated issue, so I wouldn’t take these results home. It does concern me that they assume a demographic model which is a tree without reticulation. My own question in regards to the ~25,000 year values for divergence of west and east Eurasians is the extent to which admixture and gene flow are pulling forward in time the node. Second, the authors focused on a few representative populations in Europe, East Asia, and Africa. But there’s a whole world out there. It isn’t as if evolution occurred in isolation at these antipodes, and everyone else is a linear combination of subsequent admixture. In fact, I have to wonder if the estimates here are for populations which are intrusive to Europe, rather than indigenous. One point is that one might speculate that newcomers assimilated old lightening variants from the European Ice Age hunter-gatherers. But the haplotype structure militates against this. You should see more diverse derived variants if they’re drawn from the reservoir of ancient variants extant in Ice Age Europe.

So what’s the explanation from the authors? One proposal they make is that human evolution is accelerating due to more genetic variation because of larger effective population sizes. I assume they make this argument because it doesn’t look like the more recently selected variants emerged from standing variation, the diversity already present at the time of the sweep. Rather, the sweeps are triggered by new mutations which emerged recently (ergo, fewer “steps” away mutationally in the network for all the derived variants).

Ultimately there’s a lot to think about here. But I do wonder how ancient DNA is going to update and revise things. As I’ve said over and over again I’m a lot more skeptical of inferences and simulations after the dozens of phylogenetic model papers I read in the 2000s which “proved” no admixture between archaics and modern humans.

Image credit: Rita Molnar

(Republished from Discover/GNXP by permission of author or representative)
 

Dienekes has summaries up of human-related abstracts of Society for Molecular Biology & Evolution 2012.

1) Remember these are not papers, and some of the abstracts may never become papers, at least in recognizable form

2) Speaking of which, Estimating a date of mixture of ancestral South Asian populations:


Linguistic and genetic studies have demonstrated that almost all groups in South Asia today descend from a mixture of two highly divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners and Europeans, and Ancestral South Indians (ASI) not related to any populations outside the Indian subcontinent. ANI and ASI have been estimated to have diverged from a common ancestor as much as 60,000 years ago, but the date of the ANI-ASI mixture is unknown. Here we analyze data from about 60 South Asian groups to estimate that major ANI-ASI mixture occurred 1,200-4,000 years ago. Some mixture may also be older—beyond the time we can query using admixture linkage disequilibrium—since it is universal throughout the subcontinent: present in every group speaking Indo-European or Dravidian languages, in all caste levels, and in primitive tribes. After the ANI-ASI mixture that occurred within the last four thousand years, a cultural shift led to widespread endogamy, decreasing the rate of additional mixture.

Two comments on this. The ~1,200 estimate for large-scale admixture is just nearly impossible to credit. Historically the only group which is a likely candidate for this would be the Jatts of Punjab, who have myths of descent from the last pre-Islamic Central Asian populations which intruded upon the Indian subcontinent. In fact, if 1,200-4,000 represents an interval, the expected value is ~2,600 years ago. Approximately the time of the Buddha. This seems rather too recent to be plausible. But…the authors do note that there may be older admixture events. If the signal they’re picking up is the Indo-Aryan expansion, then that is somewhat plausible, in that it seems that as late as that period large swaths of the eastern Indo-Gangetic plain and much of Central India were in the process of becoming part of greater Aryavarta.
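For readers wondering how a date comes out of "admixture linkage disequilibrium" at all: the basic idea is that LD created by mixing two populations decays roughly exponentially with genetic distance, at a rate set by the number of generations since the mixture. Below is a minimal sketch of that idea only. The real methods use weighted statistics, deal with noise, and attach standard errors. The function name, the synthetic data, and the 29 years per generation figure are all assumptions on my part.

```python
import math


def fit_admixture_generations(distances_morgans, ld_values):
    """Least-squares fit of log(LD) = log(A) - t * d; returns t in generations."""
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ld_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope  # the decay rate is the number of generations


YEARS_PER_GENERATION = 29  # assumed generation time, not taken from the abstract

# Synthetic, noise-free decay curve built with t = 90 generations (hypothetical data).
distances = [0.005 * i for i in range(1, 21)]  # genetic distance in Morgans
synthetic_ld = [0.2 * math.exp(-90 * d) for d in distances]

generations_estimate = fit_admixture_generations(distances, synthetic_ld)
years_estimate = generations_estimate * YEARS_PER_GENERATION
```

With this made-up curve the fit recovers the 90 generations used to build it, which at the assumed generation time lands near the middle of the 1,200-4,000 year interval. With real data the fitted rate reflects a blend of whatever pulses of mixture occurred, which is part of why a single date needs to be read with care.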

(Republished from Discover/GNXP by permission of author or representative)
 

Over the years one issue that crops up repeatedly in human evolutionary genetics and paleoanthropology (or more precisely, the popular exposition of the topics in the media) is the idea that “population X are the most ancient Y.” X will always refer to a population within a larger set, Y, which is defined by relative marginalization or retention of older cultural folkways. So, for example, I have seen it said that the Andaman Islanders are the “most ancient Asian population.” Why? The standard model for a while now has been that non-Africans derive from a line of Africans which left the ancestral continent 50 to 100 thousand years ago, and began to diversify. Presumably Andaman Islanders have ancestry which goes back to this original dispersion, just as Europeans and Chinese do (revisions which suggest that Aboriginals may have been part of an earlier wave, still put the Andamanese in the second wave). The reason that the Andaman populations are termed ancient is pretty straightforward: they’re Asia’s last hunter-gatherers, literally chucking spears at outsiders. An ancient lifestyle gets conflated with ancient genetics.

This is a much bigger problem with the hunter-gatherers of Africa, the Pygmies, Hadza, and Bushmen. The reason is that these populations are of particular interest because they seem to have diverged from the rest of humanity rather early on. Both Y chromosomes and mtDNA confirmed this, and now autosomal analyses looking across the whole genome are confirming it. In other words, they’re basal to the rest of humanity. I believe this is moderately misleading. With the Bantu Expansion much of African genetic diversity disappeared. The hunter-gatherers seem to be exceptionally long and bare branches on the phylogenetic tree because all their relatives are gone!

But the hunter-gatherers remain, and their genetic material has been collected for scientists to study. A new paper in PLoS Genetics puts the spotlight on Western Pygmies, and their relationship to their Bantu neighbors. Patterns of Ancestry, Signatures of Natural Selection, and Genetic Association with Stature in Western African Pygmies:

Africa is thought to be the location of origin of modern humans within the past 200,000 years and the source of our dispersion across the globe within the past 100,000 years. Africa is also a region of extreme environmental, cultural, linguistic, and phenotypic diversity, and human populations living there show the highest levels of genetic diversity in the world. Yet little is known about the genetic basis of the observed phenotypic variation in Africa or how local adaptation and demography have influenced these patterns in the recent past. Here, we analyze a set of admixing Bantu-speaking agricultural and Western Pygmy hunter-gatherer populations that show extreme differences in stature; Pygmies are ~17 cm shorter on average than their Bantu neighbors and among the shortest populations globally. Our multifaceted approach identified several genomic regions that may have been targets of natural selection and so may harbor variants underlying the unique anatomy and physiology of Western African Pygmies. One region of chromosome three, in particular, harbors strong signals of natural selection, population differentiation, and association with height. This region also contains a significant association with height in Europeans as well as a candidate gene known to regulate growth hormone signaling.

The method here is simple. Previous work already confirmed that the height of a given Pygmy was strongly predicted by the amount of non-Pygmy ancestry they carried within their genome. Now the authors here are focusing on regions of the genome which not only show association with the phenotype in question, but signatures of natural selection. At this point I’m cautious enough about associations and positive results from tests for natural selection to be wary of accepting this at face value, but we have some priors here which should make this plausible. That is, there are strong functional rationales, and it isn’t as if the Pygmies are not distinctive in their height phenotype.

Let’s take the likelihood of natural selection for height as a given. What fascinates me is that the authors suggest that selection post-dates the divergence of the Western and Eastern Pygmy populations. Why does this matter? Because it may give us a better clue as to the nature of the “pygmy” phenotype, which is common among relic hunter-gatherers the world over. The Bushmen, Pygmy, and various “Negritos” of Asia are small. Some have suggested this is an ancestral human type, or a natural adaptation, or an adaptation to the rainforest. On the other hand, the populations of Oceania are not small. To my knowledge the Indians of the Amazon are not the size of Pygmies. To put my own cards on the table I lean toward the proposition that the “pygmoid” body plan emerges when populations are driven to the margins, or, are being buffeted by disease and stress. It seems likely now that the closest relatives of the Philippine Negritos are the people of Oceania, most of whom are not small of stature. There are non-Bushmen Khoisan populations who are not small of stature. And, reportedly the isolated Andamanese of Sentinel Island are not of small stature!

The point here is that studying marginalized hunter-gatherers has limits in telling us about the nature of the human ancestors. It may be that Pygmies are in many ways derived in their phenotypes, relatively recent adaptations to contemporary exigencies. The results above even imply that the small stature of these populations may be a byproduct of the genetic correlation between various traits, and selection in one direction resulted in a correlated response in height. I would like to make a modest proposal: simply take these people on their own terms, and stop trying to slot them into a convenient paradigm. I doubt that Pygmies are going to be the great physicists of the 21st century because of their genetic variation (this was floated by Deirdre McCloskey on Dan MacArthur’s blog), nor do I think they are a special window onto the very earliest H. sapiens sapiens. They are who they are.

Addendum: Though I do know that some people would be curious about the evolutionary origins of other traits besides height in African hunter-gatherers.

(Republished from Discover/GNXP by permission of author or representative)
 

The new article in The American Journal of Human Genetics, A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root, is open access, so you should check it out. The discussion gets to the heart of the matter:

Supported by a consensus of many colleagues and after a few years of hesitation, we have reached the conclusion that on the verge of the deep-sequencing revolution…when perhaps tens of thousands of additional complete mtDNA sequences are expected to be generated over the next few years, the principal change we suggest cannot be postponed any longer: an ancestral rather than a “phylogenetically peripheral” and modern mitogenome from Europe should serve as the epicenter of the human mtDNA reference system. Inevitably, the proposed change could raise some temporary inconveniences. For this reason, we provide tables and software to aid data transition.

What we propose is much more than a mere clerical change. We use the Ptolemaian geocentric versus Copernican heliocentric systems as a metaphor. And the metaphor extends further: as the acceptance of the heliocentric system circumvented epicycles in the orbits of planets, switching the mtDNA reference to an ancestral RSRS will end an academically inadmissible conjuncture where virtually all mitochondrial genome sequences are scored in part from derived-to-ancestral states and in part from ancestral-to-derived states. We aim to trigger the radical but necessary change in the way mtDNA mutations are reported relative to their ancestral versus derived status, thus establishing an intellectual cohesiveness with the current consensus of shared common ancestry of all contemporary human mitochondrial genomes.

Note that the problem is not restricted to mtDNA. Indeed, in the much larger perspective of complete nuclear genomes in which comparisons are often currently made relative to modern human reference sequences, often of European origin, it seems worthwhile to begin considering, as valuable alternatives, public reference sequences of ancestral alleles (common in all primates) whereby derived alleles (common to some human populations) would be distinguished.

Perhaps the first generation or so of human molecular evolutionary genetics might be thought of as a “first draft.” A serviceable first draft which rendered in broad strokes the gist of the truth as we understand it, but lacking in some essential details.

On a minor note, there are some theoretical reasons why mtDNA did not yield much evidence for archaic admixture, which is clear in the nuclear genomics (e.g., higher rate of change due to lower effective population size, so more rapid extinction of ancient lineages). But perhaps now that the number of complete mtDNA genomes is increasing in size we might start to see “long branches,” which reflect the inferences generated from the ancient nuclear genomes.

(Republished from Discover/GNXP by permission of author or representative)
 

The face is an important aspect of our phenotype. So important that facial recognition is one of many innate reflexive cognitive competencies. By this, I mean that you can recognize a face in a gestalt manner, just like you can recognize a set of three marbles. You don’t have to think about it in a step-by-step fashion. Particular types of brain injuries can actually result in disablement of this faculty, and a minority of humans seem to lack it altogether at birth (prosopagnosia). That’s why I’ve long been interested in the genetic architecture and evolution of craniofacial traits. I long ago knew the potential range of pigmentation phenotypes for my daughter because both her parents have been genotyped, but when it comes to facial features we’re stuck with the old ‘blending inheritance’ heuristic. The most obvious importance of teasing apart the genetic architecture of craniofacial traits is forensics. It might not put the sketch artist out of a job, but it would be an excellent supplement to problematic eye witness reports.

But it isn’t just forensics. The issue has evolutionary relevance. It looks like in terms of morphology our own lineage has had a lot of diversity up until recently. I’m thinking in particular of the ‘archaic’ looking humans recently discovered in China and Nigeria, who seem to have persisted down into the Holocene. More generally, humans as a whole have become more gracile over the last 10,000 years. Why? There are two extreme answers we can look to. First, gracile humans have replaced robust humans. Second, natural selection for gracility has resulted in the in situ evolution of many populations over the last ~10,000 years. An interesting aspect of this is that it looks as if many salient traits have been targets of selection, and therefore evolution and population differentiation.

Here are the top 10 SNPs which deviate from the overall phylogenetic tree of population relationships in the HGDP data set:

 

SNP          Chr   Nearest gene      Phenotype
rs1834640    15    SLC24A5           skin pigmentation
rs260690     2     EDAR              hair morphology
rs10882168   10    CYP26A1/FER1L3    ?
rs4918664    10    CYP26A1/FER1L3    ?
rs2250072    15    SLC24A5           skin pigmentation
rs6583859    10    CYP26A1/FER1L3    ?
rs2384319    2     KIF3C             ?
rs6500380    16    LONP2             ?
rs4497887    2     CNTNAP5           ?
rs9809818    3     FOXP1             ?

There are two things I want to say off the bat. First, a given SNP likely has many phenotypic effects. So the trait that we “see” in terms of its effect may not be the same trait that natural selection “sees.” Second, it is not a surprise that out of the traits that a given variant may affect the physically salient ones stand out; sometimes you do go looking where the light is shining on a dark street. We know that the lighter complexion of East and West Eurasians seems to be due to independent evolutionary events. In other words, they aren’t derived from common ancestry. When it comes to hair form the EDAR locus seems to be responsible for the distinctive characteristics of East Asians, and has been under recent selection.

What does all this have to do with craniofacial traits? Simple: the coarse and “skin deep” traits that physical anthropologists used decades ago to classify human beings have been rather informative to a first approximation of both details of phylogeny and natural selection. I see no reason why craniofacial traits should be any different. Humans have become more gracile, and some human populations seem to have been changing rather rapidly. I am highly skeptical that this is a neutral process. We care a great deal about facial features, and deviation from the norm can be arresting. If there has been change it is either due to population replacement, or selection (it could be a correlated response, or direct selection).

It is with that preamble that I offer up Mark Shriver’s abstract at the Modern Human Genetic Variation symposium:

The genes determining normal-range variation in human faces are arguably some of the most intrinsically interesting and fastest evolving. However, so far, little work has been focused on discovering these genes. Working under the hypothesis that genes causing Mendelian craniofacial dysmorphologies also may be important in determining normal-range facial-feature variation, and that those genes associated with population differences in facial features should have experienced greater levels of evolution (change in allele frequency), we have taken an admixture mapping/selection scan approach to identifying and studying the genes directly affecting facial features. We have applied the methods of automated quasi-landmark analyses, partial least squares regression, and individual genomic ancestry estimates to explore the distribution of facial features across two groups of human populations — West Africans and Europeans. Using three samples of admixed subjects (American; N=159, Brazilian; N=197, and Cape Verdean; N=248) we have modeled facial variation in the parental populations and compared the extent to which estimates of ancestry from the face compare to genomic-ancestry estimates. We also have tested six selection-nominated craniofacial candidate genes for functional effects on facial features using admixture mapping. In objective tests, two of these six genes (FGFR1 and TRPS1) show significant effects on facial features. In addition, human-observer ratings of the similarity between subjects and allele-specific facial morphs show the same effects for these two genes. Additionally, exaggerated allele-specific morphs based on normal-range variation in these genes recapitulates the syndromic facies of the craniofacial dysmorphologies with which they are associated.

I asked Mark about the nature of these genes and the traits. The paper is coming soon, but he told me that he does not think that the genetic architecture of craniofacial traits is going to be as simple or easy to characterize as that of pigmentation. On the other hand, he’s reportedly capturing 35% of the African vs. European difference with his marker set, so that’s not trivial, and some of the individual loci have a strong enough effect that it’s visible by eye! Also, given the preserved extant diversity within populations (pigmentation genes are often disjoint across Africans and Europeans) he believes that the selection events are recent.

(Republished from Discover/GNXP by permission of author or representative)
 

The latest edition of The American Journal of Human Genetics has two papers using “old fashioned” uniparental markers to trace human migration out of Africa and Siberia respectively. I say old fashioned because the peak novelty of these techniques was around 10 years ago, before dense autosomal SNP marker analyses, let alone whole genome sequencing. But mtDNA, passed down the maternal line, and Y chromosomes, passed from father to son, are still useful. Prosaically, they’re useful because the data sets for these markers are now so large after nearly 20 years of surveying populations. More technically, because these two regions of the genome do not recombine, they lend themselves to excellent representation as a tree phylogeny. Finally, mtDNA is particularly amenable to estimates via molecular clock methodologies (it has a region with a higher mutational rate, so you can sample a larger range of variation over a given number of base pairs; you can use STRs, which mutate rapidly, for Y chromosomes, but there seems to be a lot of controversy in dating).
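To make the molecular clock idea concrete, here is a minimal sketch in Python. It assumes a strict clock (a constant mutation rate) and uses the textbook relation that two non-recombining lineages separated for T years differ by roughly d = 2μT substitutions per site. The function name and the numbers are made up for illustration and are not taken from either paper.

```python
def divergence_time_years(pairwise_diff_per_site, mu_per_site_per_year):
    """Estimate time to the common ancestor of two lineages under a strict clock.

    Both lineages accumulate mutations independently after the split,
    so the expected divergence is d = 2 * mu * T, giving T = d / (2 * mu).
    """
    if mu_per_site_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return pairwise_diff_per_site / (2.0 * mu_per_site_per_year)


# Hypothetical inputs: 0.4% sequence divergence and a rate of
# 1e-8 substitutions per site per year (illustrative values only).
t_years = divergence_time_years(0.004, 1e-8)
print(f"estimated coalescence time: about {t_years:,.0f} years")
```

The estimate scales inversely with the assumed rate, which is why disagreement over the rate (as with Y chromosomal STRs) translates directly into disagreement over dates.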

The papers are The Arabian Cradle: Mitochondrial Relicts of the First Steps along the Southern Route out of Africa and Mitochondrial DNA and Y Chromosome Variation Provides Evidence for a Recent Common Ancestry between Native Americans and Indigenous Altaians. Dienekes has already commented on the first paper. I am not going to take a detailed position on either, but I have to add that we need to be very careful about extrapolating from maternal or paternal lineages, and about assuming that population turnover is low enough that we can make phylogeographic inferences about the past from the present. For example, if you look at mtDNA, South Asians as a whole strongly cluster with East Asians and not Europeans, while if you look at Y chromosomes you see the reverse. The whole genome gives a more mixed picture. Additionally, ancient DNA analyses in Northern Eurasia are showing strong discontinuities between past and present populations. So coalescence back to a last common ancestor between two different lineages in two different regions may actually be due to diversity in a more recent common source population, which entered into demographic expansion and replaced other groups.

If you need the papers, email me. Some of you know the alphabet soup of haplogroups better than I do. Below are two figures which I think give the top line results.


The excellent site io9 has a piece up today which is a fascinating indicator of the nature of popular science publications as a lagging indicator. It is a re-post of a piece published last April, How Mitochondrial Eve connected all humanity and rewrote human evolution. In it you have an encapsulation of a particular period in our understanding of human natural history through evolutionary genetics. Notice for example the focus on uniparentally transmitted lineages, mtDNA and Y chromosomes. And the citations on genealogy date to the middle aughts. The science is mostly correct as far as it goes in the details (or at least it is defensible; last I checked there was still debate as to the validity of the molecular clocks used for Y chromosomal lineages), but it misses the big picture of how we’ve reframed our understanding of the human past over the last few years. The distance between 2011 and 2009 is far greater in this sense than between 2009 and 1999 (or even 2009 and 1989!). The io9 piece is a reflection of the era before the paradigmatic rupture.

We are no longer talking just about African mtDNA Eve and her husband Y chromosomal Adam. I’m going to consciously avoid the term “revolutionize,” because the broad outlines of the old story certainly hold. Rather, as we are wont to do it seems that we became a bit too bold with some of our brush strokes, and elided fascinating and subtle elements of the landscape on the margins. There were Crebs, and other assorted Oogas and Boogas. And the painting is not completed yet. As such we can’t really draw any conclusions as to “what it all means,” aside from the fact that it’s fascinating.

Addendum: Someone in the comments observes in relation to a depiction of Eve in the story that “She’s awfully pale for an East African.” This is true on the merits, but the logic is kind of dumb. Why exactly do we think that people ~150,000 years ago looked anything like modern East Africans? It is very likely that Europeans ~35,000 years ago did not look like Daryl Hannah.


I have blogged about the genetics of altitude adaptation before. There seem to be three populations in the world which have been subject to very strong natural selection, resulting in physiological differences, in response to the hypoxia humans tend to suffer at high altitude. Two of them are relatively well known, the Tibetans and the indigenous people of the Andes. The highlanders of Ethiopia have been less well studied and have received less attention, even though the capital of Ethiopia, Addis Ababa, is nearly 8,000 feet above sea level!

Another interesting aspect to this phenomenon is that it looks like the three populations respond to adaptive pressures differently. Their physiological response varies. And the more recent work in genomics implies that though there are similarities between the Asian and American populations, there are also differences. This illustrates the evolutionary principle of convergence, where different populations approach the same phenotypic optimum, though by somewhat different means. To my knowledge there has not been as much investigation of the African example. Until now. A new provisional paper in Genome Biology is out, Genetic adaptation to high altitude in the Ethiopian highlands:

We highlight several candidate genes for involvement in high-altitude adaptation in Ethiopia, including CBARA1, VAV3, ARNT2 and THRB. Although most of these genes have not been identified in previous studies of high-altitude Tibetan or Andean population samples, two of these genes (THRB and ARNT2) play a role in the HIF-1 pathway, a pathway implicated in previous work reported in Tibetan and Andean studies. These combined results suggest that adaptation to high altitude arose independently due to convergent evolution in high-altitude Amhara populations in Ethiopia.

The main shortcoming of this paper for me is that it does not highlight the evolutionary history of this adaptation. In the paper the authors compared the Amhara (a highland population) to nearby lowland populations, but they did not explore the nature of the population structure and how it might have influenced the arc of adaptation. Are these very ancient adaptations? Or new ones? It seems that hominins have been resident in Ethiopia for millions of years. If this is so presumably there have been adaptations to higher elevations from time immemorial. But what if these adaptations are new?

More pointedly the Ethiopians can be modeled as a compound of an Arabian population with an indigenous East African one. If this is a genuine recent admixture event, then one might be able to ascertain via haplotype structure whether the adaptive variants derive from ancient African genetic variation, or whether they’re novel mutations. It seems that this paper is a good first step, but there’s a lot more to see here….
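As a rough illustration of what "modeled as a compound" of two populations means, here is a minimal Python sketch. It assumes the simplest possible linear mixture model, in which the admixed population's allele frequency at each locus is alpha times one source's frequency plus (1 - alpha) times the other's, and it recovers alpha by least squares across loci. The function name and all frequencies are hypothetical, and this is not the method used in the paper.

```python
def estimate_admixture_fraction(p_mix, p_src1, p_src2):
    """Least-squares estimate of alpha in p_mix = alpha*p_src1 + (1-alpha)*p_src2.

    Each argument is a list of allele frequencies, one entry per locus.
    """
    num = 0.0
    den = 0.0
    for m, a, b in zip(p_mix, p_src1, p_src2):
        diff = a - b               # how much the two sources differ at this locus
        num += (m - b) * diff
        den += diff * diff
    if den == 0.0:
        raise ValueError("source populations are identical at every locus")
    alpha = num / den
    return min(1.0, max(0.0, alpha))  # clamp to a valid proportion


# Made-up frequencies at four loci for two hypothetical source populations.
src1 = [0.10, 0.80, 0.35, 0.60]
src2 = [0.70, 0.20, 0.55, 0.05]

# Simulate an admixed population with 40% ancestry from the first source.
true_alpha = 0.4
mix = [true_alpha * a + (1 - true_alpha) * b for a, b in zip(src1, src2)]

alpha_hat = estimate_admixture_fraction(mix, src1, src2)
print(f"estimated ancestry fraction from source 1: {alpha_hat:.3f}")
```

Loci where the two sources differ most carry the most weight in the estimate. Asking where the adaptive variants came from requires haplotype-level data, which frequencies alone cannot provide.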

Citation: Genome Biology, doi:10.1186/gb-2012-13-1-r1

Image credit: Wikipedia

Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at http://www.razib.com"