The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Gene Expression Blog


There’s a new open access paper in AJHG, Tracing the Route of Modern Humans out of Africa by Using 225 Human Genome Sequences from Ethiopians and Egyptians, which is nice in that it uses state-of-the-art methods to analyze the genetics of a part of the world that warrants greater investigation. As the title of the paper implies the authors are focusing on a region which is likely the site of the exit of an ancient African population ~50-100,000 years ago which is responsible for over 90 percent of the ancestry of non-Africans. In short, they’re looking at the variation of modern populations across the region, and relating it to populations outside of the region, to infer historical relationships. This method has a long pedigree, at least by the standards of historical population genetics. About 15 years ago the Oxford geneticist Bryan Sykes wrote Seven Daughters of Eve, where he traced ancient European migrations to the most common mtDNA haplogroups in the continent. Using these results Sykes asserted that most of the ancestry of modern Europeans derives from Paleolithic hunter-gatherers; not Middle Eastern farmers. More precisely, modern Europeans exhibit overwhelming continuity with the Pleistocene populations. It turns out that this is wrong.

We know this because of ancient DNA, which is coming to various novel conclusions and overturning older understandings. One of them is that the genetic variation you see in a locale today has limited time depth into the past. That is why I state that it is likely that Cro-Magnons contribute less than 1 percent of the ancestry of modern Europeans. There are regions, such as the New World, where over the past 10,000 years genetic turnover on the whole has been modest, to negligible (most of the Holocene turnover in the New World before the arrival of Europeans is in northern North America). But this seems the exception rather than the rule. In South Asia, Africa, Europe, Siberia, East Asia and Southeast Asia, there is no dispute that the Holocene witnessed enormous changes in the genetic and demographic makeup of the dominant population. The flip side is that very ancient “archaic” lineages in some regions of Eurasia have modern descendants. That is why I say we need to update our priors; the ancient branches of our family were mostly, but not entirely pruned, while many of the recent branches were mostly or even entirely pruned.

This brings me to the main question: how plausible is it that the genetic patterns in evidence in the paper in AJHG tell us about human evolutionary history with time depths of ~50,000 years? Color me skeptical. There are some specific issues that I’m confused by, in addition to the bigger framework. Greg Cochran has already put them into focus rather trenchantly. First, this section of the paper:

Using ADMIXTURE and principal-component analysis (PCA)18 (Figure 1A), we estimated the average proportion of non-African ancestry in the Egyptians to be 80% and dated the midpoint of the admixture event by using ALDER to around 750 years ago (Table S2), consistent with the Islamic expansion and dates reported previously.

A plain reading implies that 750 years ago non-African ancestry admixed into the population of Egypt so that it’s now 80% of the ancestry. Obviously this is insane. Egypt has a long history, and all the evidence that is not genetic indicates that ancient Egyptians were predominantly a population with Near Eastern and North African, not Sub-Saharan, affinities. The Roman era Fayum portraits suggest a people who resemble by and large modern Egyptians. Some do seem to have aspects of appearance which strike one as Sub-Saharan, but the presence of Nubians, as well as likely an ancient admixture event that occurred when Middle Eastern farmers arrived in the Nile Valley, can explain that. But when ascertaining the “Out of Africa” event you need to focus on the oldest element of ancestry. So you would have to look at the people who contributed indigenous African ancestry well before the emergence of Egypt as a distinct civilization.

Here is the confusing part which inverts expectations. This last component is most likely to be within the “Non-African” segments of the Egyptian genome. I say this because the latest period of a mass population movement into Egypt from the Near East is ~8,000 years ago. 8,000 years is a long time, so recombination every generation would break apart the association between tracts of ancestry traceable to the newcomers, and that traceable to indigenous hunter-gatherers. Over time a new synthetic population with its own distinctive population profile emerges. This is the case with South Asians, who are a genetic compound of two very distinctive groups with extremely diverged histories. The latest evidence suggests that the admixture occurred on the order of ~4,000 years ago. That’s half the time depth of what likely occurred in ancient Egypt.
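To put rough numbers on that intuition, here is a minimal Python sketch. It assumes a single pulse of admixture, a generation time of 25 years, and the common rule of thumb that the mean length of an ancestry tract is roughly 100/g centimorgans after g generations. The function names and all of these parameters are illustrative assumptions, not values from the paper.

```python
# Rough sketch: how recombination shortens ancestry tracts over time.
# Assumptions (not from the paper): single-pulse admixture, 25-year
# generations, mean tract length ~ 100 / g centimorgans after g generations.

def generations_since(years_ago, generation_years=25.0):
    """Convert a date in years before present to a number of generations."""
    return years_ago / generation_years

def expected_tract_length_cm(generations):
    """Approximate mean ancestry tract length in centimorgans."""
    return 100.0 / generations

for label, years in (("~750 years", 750), ("~4,000 years", 4000), ("~8,000 years", 8000)):
    g = generations_since(years)
    print(f"{label}: about {g:.0f} generations, "
          f"mean tract about {expected_tract_length_cm(g):.2f} cM")
```

Under these assumptions tracts from an 8,000-year-old event are roughly ten times shorter than tracts from a 750-year-old one, which is why old admixture blends into the background while recent admixture still stands out.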

And about the African ancestry they did focus on, the 750 year time depth gives you a clue about where it came from: the rise of the Islamic empires and trans-Saharan trade enabled by the camel triggered a massive influx of slaves from Africa into North Africa and the Near East (there was also an influx of slaves from the Caucasus and Central Asia, and for a time Europe, in large part because Islamic law banned the enslavement of believers). In the Maghreb these slaves were from West Africa. In the Persian Gulf the sources were diverse, but many were from East Africa. The natural source of Egyptian slaves is likely to be from the Sudan, what was ancient Nubia. Also, the Gumuz, who are used as a relatively unadmixed Ethiopian population (i.e., low Eurasian admixture fraction), are themselves of possible Sudanic origin and background!

I can agree that the Nubian/Sudanic ancestry exhibits a closer relationship to the population basal to non-Africans than West Africans. But, to me this paper does not make a strong case for a “northern” route through Egypt compared to the “southern” route, via the Bab-el-Mandeb. First, 50,000 years is a long time. My null assumption is that there has been enough population movement in Northeast Africa even before the Holocene to obscure the signal. Second, even without this consideration in mind, it strikes me that the African ancestry in Egyptians that they are focusing on is not a good geographic proxy in the first place, since it derives from Sudanic groups from further south. Finally, I do observe that this region of the world is relatively dry, making ancient DNA a possibility. So I have optimism that greater clarity will be achieved in the near future.

• Category: Science • Tags: Egypt, Out-of-Africa 

Over the past week or so the perpetual argument about whether we were “superior” to Neandertals or not has cropped up again, thanks to a new paper in PLoS ONE, Neandertal Demise: An Archaeological Analysis of the Modern Human Superiority Complex. In it the authors utilize material remains to infer that no, in fact Neandertals were not “inferior”, and their demise was more a matter of demographic assimilation than competitive exclusion and extinction. To get the “other perspective” in a measured fashion I recommend this interview of Chris Stringer in National Geographic.

As I suggest above this debate has gone back and forth for a long time, and seems subject to fashion as much as empirical results. Ten years ago the Stanford paleoanthropologist Richard Klein published The Dawn of Human Culture, which laid out very cogently the dominant perspective of the time that Neandertals were not humans as we understand humans, and were superseded by a neo-African lineage which was gifted with the bio-behavioral capacity for exceedingly flexible and protean cultural adaptability. Klein’s central conjecture is that ~50,000 years ago a subset of ancient humans in Africa developed the biological capacity for cultural creation on a scale unprecedented before in the hominin lineage. He appealed in a rather hand-wavy fashion to punctuated equilibrium to serve as the basis for what was basically a model of modern human origins out of a single saltation event. Though not perhaps in every exact detail, the broad outlines of Klein’s thesis were accepted as part of the “Out of Africa” canon which had crystallized over the previous 20 years. Modern humans came, they saw, they conquered.

All this changed in 2010 when A Draft Sequence of the Neanderthal Genome was published. Over the past four years the general inference established in this paper, that a few percent of non-African human ancestry derives from Neandertals, seems to have been supported by further publications. At the time I observed, and have continued to see, a trend to humanize Neandertals. After all, they were in part our ancestors, so the previous dehumanization now seems rather uncharitable. Klein and his fellow travelers even hypothesized that Neandertals did not have language, that complex speech may have been the bio-behavioral shift which allowed for the rapid expansion of neo-Africans. Though those who study science and claim that it is strongly shaped by cultural priors tend to become overly enthusiastic about this dynamic, there is some truth in it, and how we view Neandertals does seem subject in part to what we want them to be.

But it’s hard to shake the intuition that there was something special about neo-Africans 50,000 years ago. Stringer lays out the archaeological perspective, what I might term the “Great Leap Forward-lite” model. Complex, elaborate, and protean cultural expression with symbolic connotations seems to have exploded exponentially after the expansion of neo-Africans. That’s hard to deny. It’s easy then to make the leap to assuming that this cultural change was due to a biological change.

My own primary reason to be swayed by the view that neo-Africans were different in some fundamental and biological way is not much informed by archaeology, since I have only a superficial grasp of this field. Rather, it is the fact that it is neo-Africans who crossed the Wallace Line, and ventured into the New World. Hominins, humans of some kind, were present across Eurasia for over one million years, but it took neo-Africans to expand the frontiers of the Homo range almost immediately after they appear on the scene outside of the original continent. That’s not in dispute at all, and requires no deep and specialized knowledge.

And yet this point does make me think somewhat about what to make of the Polynesians. The ancestors of the Melanesians arrived in “Near Oceania” ~30,000 years ago, and there they stopped. It took 25,000 years for humans to push past the eastern fringes of Melanesia into the vast Pacific, with the arrival of the Austronesian Lapita culture. As you can see on the map above the Austronesian maritime range was incredible, from Madagascar all the way to Easter Island off the coast of Chile. In fact it seems likely that Austronesians were the first settlers of Madagascar, as opposed to Africans from the relatively nearby coast. Using the same logic that informs my perception of Neandertals vs. neo-Africans one might ask if the Austronesians of Taiwan were somehow biologically different from the Melanesians. Perhaps there was a mutation which compelled the Austronesians on their long journey out of Taiwan?

The point here is not to underplay biological differences between Neandertals and neo-Africans. The fact that there seems to be a paucity of Neandertal ancestry on the X chromosome is suggestive of hybrid incompatibilities. But, the conclusions one draws from the facts seem to be strongly colored by the model which one already presupposes. As a matter of fact there may be bio-behavioral differences between modern human lineages which explain differences in cultural expression. But obviously these are not so clear in a genetic sense. When we think of differences between Neandertals and neo-Africans it is often as if the traits were disjoint between the two groups. But perhaps it is a much more subtle and nuanced biological difference, which was ratcheted upward and amplified by the flexible nature of human culture. The rise and fall of neo-African cultures themselves exhibits an almost inexplicable waxing and waning. There is no reason it had to be any different in the past.

• Category: Science • Tags: Neandertals, Out-of-Africa 

The latest edition of The American Journal of Human Genetics has two papers using “old fashioned” uniparental markers to trace human migration out of Africa and Siberia respectively. I say old fashioned because the peak novelty of these techniques was around 10 years ago, before dense autosomal SNP marker analyses, let alone whole genome sequencing. But mtDNA, passed down the maternal line, and Y chromosomes, passed from father to son, are still useful. Prosaically they’re useful because the data sets are now so large for these sets of markers after nearly 20 years of surveying populations. More technically because these two regions of the genome do not recombine they lend themselves to excellent representation as a tree phylogeny. Finally, mtDNA is particularly amenable to estimates via molecular clock methodologies (it has a region with a higher mutational rate, so you can sample a larger range of variation over a given number of base pairs; you can use STRs, which mutate rapidly, for Y chromosomes, but there seems to be a lot of controversy in dating).

The papers are The Arabian Cradle: Mitochondrial Relicts of the First Steps along the Southern Route out of Africa and Mitochondrial DNA and Y Chromosome Variation Provides Evidence for a Recent Common Ancestry between Native Americans and Indigenous Altaians. Dienekes has already commented on the first paper. I am not going to take a detailed position on either, but I have to add that we need to be very careful about extrapolating from maternal or paternal lineages, and about assuming that population turnover is low enough that we can make phylogeographic inferences about the past from the present. For example, if you look at mtDNA South Asians as a whole strongly cluster with East Asians and not Europeans, while if you look at Y chromosomes you see the reverse. The whole genome gives a more mixed picture. Additionally, ancient DNA analyses in Northern Eurasia are showing strong discontinuities between past and present populations. So coalescence back to last common ancestor between two different lineages in two different regions may actually be due to diversity in a common source population more recently, which entered into demographic expansion and replaced other groups.

If you need the papers, email me. Some of you know the alphabet soup of haplogroups better than I do. Below are two figures which I think give the top line results.


The BBC has a news report up gathering reactions to a new PLoS ONE paper, The Later Stone Age Calvaria from Iwo Eleru, Nigeria: Morphology and Chronology. This paper reports on remains found in Nigeria which date to ~13,000 years B.P. that exhibit a very archaic morphology. In other words, they may not be anatomically modern humans. A few years ago this would have been laughed out of the room, but science moves. Here is Chris Stringer in the BBC piece:

“[The skull] has got a much more primitive appearance, even though it is only 13,000 years old,” said Chris Stringer, from London’s Natural History Museum, who was part of the team of researchers.

“This suggests that human evolution in Africa was more complex… the transition to modern humans was not a straight transition and then a cut off.”

Prof Stringer thinks that ancient humans did not die away once they had given rise to modern humans.

They may have continued to live alongside their descendants in Africa, perhaps exchanging genes with them, until more recently than had been thought.

In the broad outlines most people still seem to hold that within the last ~100,000 years there was a major demographic pulse which swept out of Africa and populated the rest of the world. Something special did happen. Oceania and the New World were settled by the descendants of anatomically modern humans, whom we can trace back to Africa. The key modifications to the old model seem to be two-fold:

1) The possibility of admixture with other lineages on the way out

2) The sublocalization of the “Out of Africa” scenario, and further admixture with lineages within Africa

There have long been debates about an East or South Africa ur-heimat for the first anatomically modern humans. Others are now even positing a North African origin! To a great extent I wonder if a West or Central African origin is forgone in part due to the paucity of fossil remains entailed by the unfavorable conditions for preservation.

However the details shake out the story seems to be getting more, not less, complicated. This makes for less pithy one liners for the media, but also more work for scientists. Figuring out stuff can be fun!


The Pith: The human X chromosome is subject to more pressure from natural selection, resulting in less genetic diversity. But, the differences in diversity of X chromosomes across human populations seem to be more a function of population history than differences in the power of natural selection across those populations.

In the past few years there has been a finding that the human X chromosome exhibits less genetic diversity than the non-sex regions of the genome, the autosome. Why? On the face of it this might seem inexplicable, but a few basic structural factors derived from the architecture of the human genome present themselves.

First, in males the X chromosome is hemizygous, rendering it more exposed to selection. This is rather straightforward once you move beyond the jargon. Human males have only one copy of genes which express on the X chromosome, because they have only one X chromosome. In contrast, females have two X chromosomes. This is the reason why sex linked traits in humans are disproportionately male. For genes on the X chromosome women can be carriers of many diseases because they have two copies of a gene, and one copy may be functional. In contrast, a male has only a functional or nonfunctional version of the gene, because he has one copy on the X chromosome. This is different from the case on the autosome, where both males and females have two copies of every gene.

This structural divergence matters for the selective dynamics operative upon the X chromosome vs. the autosome. On the autosome recessive traits pay far less of a cost in terms of fitness than they do on the X chromosome, because in the case of the latter they’re much more often exposed to natural selection via males. In the rest of the genome recessive traits only pay the cost of their shortcomings when they’re present as two copies in an individual, homozygotes. A simple quasi-formal example illustrates the process.

Imagine a population which has an allele which expresses recessively and has sharply reduced fitness when it is expressed. Assume that the allele in question, q, is present at a proportion of 0.50. All the other functional alleles are classed together as p, and are also 0.50. In the next generation the Hardy-Weinberg Equilibrium would entail that 75% of the individuals would not express the recessive trait, but 25% of the individuals would.* But for every copy of the deleterious allele which is expressed and so exposed to natural selection, there’s another copy of the deleterious allele which is “masked” in a heterozygous individual with one good copy, and so evades natural selection. As natural selection decreases the frequency of the deleterious allele fewer and fewer copies will be found in recessively expressed individuals, and so the power of selection to remove the allele will decrease as its own frequency declines. When the frequency of the deleterious allele is ~0.01, only about 1 out of 100 copies will be found in a homozygote exposed to natural selection. In this way genetic diversity of even deleterious alleles can be preserved as many low frequency recessively expressed variants.
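The arithmetic in that example is easy to check with a short Python sketch. It assumes random mating and Hardy-Weinberg proportions at an autosomal locus; the function names are illustrative.

```python
# Sketch of the autosomal case under Hardy-Weinberg proportions.
# q is the frequency of the recessive deleterious allele, p = 1 - q.

def hardy_weinberg(q):
    """Return genotype frequencies (pp, pq heterozygote, qq) for allele frequency q."""
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q

def fraction_of_copies_exposed_autosome(q):
    """Fraction of all copies of the recessive allele that sit in homozygotes."""
    _, het, hom = hardy_weinberg(q)
    copies_in_homozygotes = 2.0 * hom  # each qq individual carries two copies
    copies_in_heterozygotes = het      # each pq individual carries one copy
    return copies_in_homozygotes / (copies_in_homozygotes + copies_in_heterozygotes)

for q in (0.50, 0.10, 0.01):
    print(f"q = {q:.2f}: homozygotes = {hardy_weinberg(q)[2]:.4f}, "
          f"copies exposed = {fraction_of_copies_exposed_autosome(q):.4f}")
```

The exposed fraction simplifies to q itself, which is why selection against a recessive allele loses its grip as the allele becomes rare.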

The situation differs on the X chromosome. If the population consisted only of females then the model above would hold. The trait only expresses if a female has two copies of the faulty gene. But one out of every three X chromosomes in the typical human population is present in a male. That means that every deleterious allele on that X will bear its full cost if it happens to be in a male, a 1 out of 3 probability. So when you have a situation where the deleterious allele is present as a fraction ~0.01 on the X chromosome, about 1 out of 3 copies will be expressed, overwhelmingly in males. This is a roughly 34-fold difference between the X and autosome in terms of copies of a deleterious allele exposed to natural selection, all due to the hemizygosity of males.
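The X-linked version of the calculation can be sketched the same way. This assumes an equal sex ratio (so one third of X chromosomes are in males), the same allele frequency in both sexes, and Hardy-Weinberg proportions in females; the function names are illustrative.

```python
# Sketch: fraction of copies of a recessive X-linked allele exposed to selection.
# Assumptions: 1/3 of X copies are in males (always exposed), 2/3 are in
# females (exposed only in homozygotes, a fraction q of female-carried copies).

def fraction_exposed_x(q, male_share=1.0 / 3.0):
    """Exposed fraction of X-linked copies at allele frequency q."""
    female_share = 1.0 - male_share
    return male_share + female_share * q

def fraction_exposed_autosome(q):
    """Exposed fraction of autosomal copies (only those in homozygotes)."""
    return q

q = 0.01
x_exposed = fraction_exposed_x(q)
a_exposed = fraction_exposed_autosome(q)
print(f"X: {x_exposed:.3f}, autosome: {a_exposed:.3f}, "
      f"fold difference: {x_exposed / a_exposed:.0f}")
```

Under those assumptions the exposed fraction is 1/3 + (2/3)q, which is about 0.34 at q = 0.01.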

But the effect of selection isn’t uniformly negative, the purification of bad gene copies from the population. Positive forces can also reduce diversity via a selective sweep. How and why this happens is rather straightforward. Imagine that you have a single base pair which fortuitously has a mutation which is very beneficial in a single individual. To make the expression simple imagine that it is dominant, and the individual is a heterozygote. The single individual who carries the favored mutation has a very large family because ~50 percent of their offspring also carry the favored mutation and are much more fit than the population average. And so on. This favored variant can spread very fast. Lactose tolerance is a good concrete case of this. When I say the favored variant spreads, I’m actually talking about one gene copy from one person which starts to increase in frequency because of its adaptive value. But recall that a single base pair is embedded within the genome, and that chromosomal regions are generally passed on together from parent to offspring. It’s quite often a package deal. When a favored allele emerges it enables the “hitchhiking” of nearby variants which have no selective advantage, except that they luckily exist next to a very adaptively beneficial allele (think of them as the gene’s “posse” or entourage). Of course genetic recombination breaks apart these associations over time, but this process takes generations. Until then what you see is the proliferation of a particular genomic segment along with the increase in frequency of the favored gene which is embedded in that particular region. By straightforward logic when a whole segment with associated alleles starts to increase in frequency aggregate genetic diversity decreases, as variation is swept aside.
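How fast "very fast" is depends on the strength of selection. The sketch below iterates the standard deterministic recursion for a dominant beneficial allele in a large random-mating population; the selection coefficients and starting frequency are arbitrary illustrative values, and the model tracks only the favored allele, not the linked variants hitchhiking with it.

```python
# Sketch: deterministic spread of a dominant beneficial allele.
# Fitnesses assumed: carriers of at least one copy = 1 + s, non-carriers = 1.

def next_freq_dominant(p, s):
    """One generation of selection on a dominant allele at frequency p."""
    q = 1.0 - p
    mean_fitness = 1.0 + s * (1.0 - q * q)  # everyone except qq gets the benefit
    return p * (1.0 + s) / mean_fitness

def generations_to_reach(p0, target, s, max_generations=100000):
    """Count generations until the allele frequency first reaches target."""
    p, t = p0, 0
    while p < target and t < max_generations:
        p = next_freq_dominant(p, s)
        t += 1
    return t

for s in (0.01, 0.05, 0.10):
    t = generations_to_reach(0.001, 0.5, s)
    print(f"s = {s:.2f}: about {t} generations to go from 0.1% to 50%")
```

The shorter that climb, the less time recombination has to break up the surrounding segment, and the wider the region of reduced diversity left behind.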

And yet evolution is not simply natural selection. There are two processes which have nothing to do with selection as such which might reduce genetic variation. The motor which both these phenomena turn on is random genetic drift. As you increase the power of drift to fluctuate gene frequencies generation to generation you also increase its power to render alleles extinct as they are extinguished once they hit the zero frequency boundary condition. This is why populations which have gone through population bottlenecks are so homogeneous; drift has squeezed most of the variation out of the gene pool by capriciously favoring some alleles and eliminating most of the rest.
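A small simulation makes the point about population size concrete. This is a minimal Wright-Fisher sketch with no selection or mutation; the population sizes, number of generations, and replicate counts are arbitrary illustrative choices.

```python
# Sketch: random genetic drift in a Wright-Fisher population.
# Each generation, 2N gene copies are drawn at random from the previous one.
import random

def wright_fisher(n_diploid, p0, generations, rng):
    """Return the allele frequency trajectory for one replicate population."""
    copies = 2 * n_diploid
    p = p0
    trajectory = [p]
    for _ in range(generations):
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory

def fraction_lost_or_fixed(n_diploid, p0, generations, replicates, seed=0):
    """Fraction of replicates in which one allele has been eliminated."""
    rng = random.Random(seed)
    done = 0
    for _ in range(replicates):
        final = wright_fisher(n_diploid, p0, generations, rng)[-1]
        if final in (0.0, 1.0):
            done += 1
    return done / replicates

small = fraction_lost_or_fixed(10, 0.5, 100, 50, seed=1)
large = fraction_lost_or_fixed(500, 0.5, 100, 50, seed=1)
print(f"N = 10:  variation gone in {small:.0%} of replicates")
print(f"N = 500: variation gone in {large:.0%} of replicates")
```

Starting from the same 50/50 frequencies, the small population should lose its variation in almost every replicate over 100 generations, while the large one should almost always keep both alleles.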

The dynamics relevant to this specific case are differences in male and female effective population size, and large fluctuations in long term effective population size. For purposes of reduced X chromosomal diversity one would have to posit lower female effective population size than male effective population size. The reason why this would impact the diversity of the X relative to the autosome is that the X spends 2/3 of its time in females, while the autosome spends 1/2 of its time in each sex. So if females have lower effective population sizes than males the X chromosome is being buffeted by greater stochastic forces than the autosome. More generally, the X chromosome has a lower effective population even assuming sex balance because for every 4 copies of an autosomal chromosome there are 3 X chromosomes. Because of this reduced effective population size the X would be more sensitive to bottlenecks and the like, one of the consequences of which is reduced genetic diversity.
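The 3-to-4 relationship, and how it shifts with unequal numbers of breeding males and females, can be sketched with the standard textbook formulas for effective population size: 4·Nm·Nf/(Nm + Nf) for autosomes and 9·Nm·Nf/(4·Nm + 2·Nf) for the X. The breeding numbers below are arbitrary illustrative values.

```python
# Sketch: effective population size of the X relative to the autosomes.
# Formulas assume discrete generations and random mating within each sex.

def ne_autosome(n_males, n_females):
    """Autosomal effective size given breeding males and females."""
    return 4.0 * n_males * n_females / (n_males + n_females)

def ne_x(n_males, n_females):
    """X-linked effective size given breeding males and females."""
    return 9.0 * n_males * n_females / (4.0 * n_males + 2.0 * n_females)

def x_to_autosome_ratio(n_males, n_females):
    """Ratio of X-linked to autosomal effective population size."""
    return ne_x(n_males, n_females) / ne_autosome(n_males, n_females)

for n_males, n_females in ((100, 100), (100, 50), (50, 100)):
    print(f"males = {n_males}, females = {n_females}: "
          f"X/A ratio = {x_to_autosome_ratio(n_males, n_females):.3f}")
```

With equal numbers the ratio is exactly 0.75; fewer breeding females pushes it below 0.75, and fewer breeding males pushes it above.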

All the above is important to keep in mind when reading a new report in Nature Genetics on the balance between selection and drift in reducing variation on the X chromosome and across populations. The second refers to the fact that Africans seem to exhibit less relative reduction of variation on the X chromosome than non-Africans. First, the paper’s abstract, Analyses of X-linked and autosomal genetic variation in population-scale whole genome sequencing:

The ratio of genetic diversity on chromosome X to that on the autosomes is sensitive to both natural selection and demography. On the basis of whole-genome sequences of 69 females, we report that whereas this ratio increases with genetic distance from genes across populations, it is lower in Europeans than in West Africans independent of proximity to genes. This relative reduction is most parsimoniously explained by differences in demographic history without the need to invoke natural selection.

This research is part of the trend I’ve alluded to toward looking at whole-genome sequences. Remember, a lot of the 1 million SNP papers are focusing only upon genetic variants, polymorphisms, across the 3 billion base pairs. These variants are especially informative, but they miss a lot of the genome. Additionally there are some statistical problems with bias in the selection of the variants because they’re usually tuned toward one population, Europeans (different populations have somewhat different variants across the genome). The takeaway is that the time is now nearly here when we can look at the genome at its most precise and fine-grained scale, rather than using approximations, whether it be one locus, or 1 million SNPs.

With this broad canvas in mind, if there’s one thing you’ve read about the genome it’s that much of it is not functional. It doesn’t code. There are zones of the genome which are intergenic, between genes. Natural selection generally targets functional regions, not intergenic ones. If natural selection is the primary dynamic effecting the pattern we see here then differences should manifest between genic and intergenic regions since selection plays a much larger role in the former than the latter, both in constraining variation and increasing the frequency of favored alleles.

The figure below has four panels. Every panel has an x-axis defined by distance from a gene, left to right with increasing distance. So the leftmost point can be thought of as genic, and the rightmost point as intergenic. The left panels define Europeans, and the right panels Africans. More precisely they’re displaying results from whole-genome sequences of 36 West African Yoruba and 33 European American females. The top row shows the change in raw nucleotide diversities for autosomes and X, and the bottom row illustrates the change in ratio of diversity of the two genomic classes (X vs. autosome) as a function of distance.

In molecular evolutionary genetics it is often useful to assume that the null hypothesis is neutrality. Basically that means that selection is not a main effect in driving the variation. Instead it’s a function of random forces such as mutation and drift. When one sees deviation from neutrality then one considers the effect of natural selection and the possibility of adaptation. You see here clear evidence for natural selection. The genetic diversity on the X chromosome has a much stronger relationship to distance from genes than the autosome. This matters because as you recall the X chromosome is much more brutally sculpted by natural selection on a priori grounds because disfavored alleles would be pruned more efficiently, while recessively expressing favored alleles would be less handicapped by the fact that their favored traits often did not express when they were present (because they were suppressed when in heterozygotes). The pattern above is entirely in keeping with that model.

So now we’ve seen that a closer whole-genome examination of these samples implies that the X vs. autosomal difference in diversity is not just a function of neutral forces, but may have been driven by natural selection. But there’s a second part of the phenomenon: the disjunction is usually more stark in non-Africans. If so, does this imply that non-Africans have been subject to more natural selection? The manner in which they explored this question was clean and elegant: they compared the ratios of ratios as a function of distance from genes. By this, I mean that they looked at the ratio of diversity of the genome between the X and the autosome, and then generated a ratio from this value by comparing across Europeans and Africans. Unlike those above, the figure to the left shows no differences as a function of genetic distance. What does this tell us? If natural selection was more efficacious in Europeans than Africans then the differences in diversity across these two populations should be stronger near genic regions, because that is where the power of selection is most felt. Instead, what you see is that though the difference across X and autosomal genomes is real, it is consistent between the genomes of Africans and Europeans across the X and the autosome.
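The "ratio of ratios" is simple to write down. The sketch below uses made-up toy numbers chosen only to show the shape of the calculation; they are not values from the paper, and the function and variable names are illustrative.

```python
# Sketch: ratio of X/autosome diversity ratios across distance-from-gene bins.
# All diversity values are TOY NUMBERS in arbitrary units, not data.

def x_to_autosome(x_diversity, autosome_diversity):
    """X/A diversity ratio in each distance bin."""
    return [x / a for x, a in zip(x_diversity, autosome_diversity)]

def ratio_of_ratios(pop_one, pop_two):
    """(X/A in pop_one) divided by (X/A in pop_two), bin by bin."""
    r1 = x_to_autosome(pop_one["x"], pop_one["autosome"])
    r2 = x_to_autosome(pop_two["x"], pop_two["autosome"])
    return [a / b for a, b in zip(r1, r2)]

# Bins run from near genes (first) to far from genes (last).
toy_african = {"x": [0.60, 0.70, 0.80], "autosome": [1.0, 1.0, 1.0]}
toy_european = {"x": [0.54, 0.63, 0.72], "autosome": [1.0, 1.0, 1.0]}

print("X/A, toy African: ", x_to_autosome(toy_african["x"], toy_african["autosome"]))
print("X/A, toy European:", x_to_autosome(toy_european["x"], toy_european["autosome"]))
print("ratio of ratios:  ", ratio_of_ratios(toy_european, toy_african))
```

In this toy case X/A rises with distance from genes within each population, yet the ratio of ratios stays flat across bins. A flat line of that kind is what points toward a genome-wide, demographic explanation rather than selection concentrated near genes.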

This suggests that the difference between Africans and Europeans is driven by demographics and not adaptation (positive selection) or functional constraint (negative or purifying selection). Random evolutionary forces don’t see genic or intergenic regions. They’re random, and blind or neutral to functional import. Unlike selection their impact is going to be genome-wide, just as the inter-regional differences we see here are.

In this case what happened? Going back to the beginning there were two specific possibilities: sex-biased migration and greater fluctuation in effective population size among non-Africans. The latter model is entirely consistent with an “Out of Africa” scenario where non-Africans derive from a small ancestral population which left Africa. This is the great “Out of Africa” bottleneck which seems to be a consistent finding by human molecular evolutionists. Because the X chromosome has a somewhat smaller effective population it would presumably have been more impacted by the homogenizing force of this bottleneck.

The first option though is intriguing, if peculiar. What if there were multiple “Out of Africa” pulses which consisted disproportionately of groups of young males? This would have enriched the genetic diversity of non-Africans on the autosome far more than the X chromosome, because the males would bring only one X chromosome for every two autosomes. I think the “Out of Africa” model is more plausible, but I’m not going to dismiss this scenario out of hand. We live in interesting and strange times when it comes to the origin of modern humans.

* $p^2 + 2pq + q^2 = 1 = 0.50^2 + 2(0.50)(0.50) + 0.50^2$

Citation: Gottipati, Srikanth, Arbiza, Leonardo, Siepel, Adam, Clark, Andrew G, & Keinan, Alon (2011). Relative autosomal, X-linked and X/A diversity are not correlated with genetic distance from the nearest gene. Nature Genetics : 10.1038/ng.877


The Pith: We are now moving from the human genome project to the human genomes project. As more and more full genomes of various populations come online new methods will arise to take advantage of the surfeit of data. In this paper the authors crunch through the genomes of half a dozen individuals to make sweeping inferences about the history of the human species over the past few hundred thousand years.

Since the integration of evolution and genetics in the early years of the 20th century there have been several revolutions in our ability to perceive the underlying variation which is the raw material and result of evolutionary genetics. The understanding that DNA was the concrete substrate of Mendelian genetics, and the rise to prominence of molecular genetic techniques in understanding evolution in the 1970s and 1980s, was one key transition. No longer were geneticists simply tracking the coat colors of mice or the visible mutations of fruit flies. In the 1990s the uniparental loci, the maternal and paternal lineages as inferred from the mtDNA and Y chromosomes, came into their own. Finally, the 2000s saw the post-genomic era, and researchers routinely began analyzing data sets of hundreds of thousands of single nucleotide polymorphisms (SNPs), genetic variants, in hundreds of individuals.

In this decade some of the promise of the Human Genome Project will finally ripen, in that whole genomes are going to be used more and more in analyses. This is exciting, but there are some obvious issues. The human genome has ~3 billion base pairs, vs. the 1 million or less you might manipulate per individual in data sets focused on SNPs. There are some things for which a human genome is overkill. You don’t need a full genomic sequence to ascertain your identity as a member of a particular geographic race. Not only can visual inspection usually suffice to reassure you as to your background, but depending on the scale of granularity you want a random SNP set on the order of ~10,000 should suffice, or as few as 25 ancestrally informative markers! But, if you want to ascertain mutation rates within families with precision and confidence, you do need the full genome.

A new paper in Nature illustrates the possibilities of looking at the whole genome, instead of simply a variant subset. In it, the authors show the power of using only a few individuals’ whole genomes to derive insights about broader population histories. That’s because with a whole genome you obviously are maximizing the amount of data you’re getting in terms of raw sequence, and there’s no need for approximations.

Inference of human population history from individual whole-genome sequences:

The history of human population size is important for understanding human evolution. Various studies…have found evidence for a founder event (bottleneck) in East Asian and European populations, associated with the human dispersal out-of-Africa event around 60 thousand years (kyr) ago. However, these studies have had to assume simplified demographic models with few parameters, and they do not provide a precise date for the start and stop times of the bottleneck. Here, with fewer assumptions on population size changes, we present a more detailed history of human population sizes between approximately ten thousand and a million years ago, using the pairwise sequentially Markovian coalescent model applied to the complete diploid genome sequences of a Chinese male (YH)…a Korean male (SJK)…three European individuals…and two Yoruba male...We infer that European and Chinese populations had very similar population-size histories before 10–20 kyr ago. Both populations experienced a severe bottleneck 10–60 kyr ago, whereas African populations experienced a milder bottleneck from which they recovered earlier. All three populations have an elevated effective population size between 60 and 250 kyr ago, possibly due to population substructure…We also infer that the differentiation of genetically modern humans may have started as early as 100–120 kyr ago…but considerable genetic exchanges may still have occurred until 20–40 kyr ago.

The results of the paper itself are not earth-shattering. It’s really just a test-run of a series of methods which will probably become widespread if they turn out to be more useful than the alternatives. We’ve long seen a pattern in the genetic data of a relatively larger long term African population, and a bottleneck with non-Africans.

In terms of method, first they seem to have focused on patterns of genetic variation on the intra-locus dimension. By this, I mean that they had diploid whole genomes, as every gene necessarily comes in two copies except on the sex chromosomes, and they analyzed the patterns of variation of heterozygosity (two different variants of the gene) or homozygosity (same variant of the gene). These patterns would be distributed across the genome, on the inter-locus dimension, differentiated by recombination events which chop apart the patterns across the genome by mixing and matching chromosomal segments. Recombination events occur steadily across time, so the nature of the patterns can allow one to infer recombination events, the magnitude of which can then lead one to the time of the last common ancestor of two segments.
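The intuition that the density of heterozygous sites in a stretch of genome tells you how far back the two copies share an ancestor can be shown with a very crude calculation. To be clear, this is not the pairwise sequentially Markovian coalescent, which fits a hidden Markov model along the genome. It is only the expectation that two copies separated by T generations differ at about 2 × mutation rate × length × T sites. The mutation rate and generation time are the ones quoted from the paper further down this post. The window size and the count of heterozygous sites are made up.

```python
def tmrca_generations(het_sites, window_bp, mu_per_gen=2.5e-8):
    """Crude time to most recent common ancestor for one genomic window.

    Two copies that coalesce T generations ago are expected to differ at
    about 2 * mu * L * T sites, so T is roughly hets / (2 * mu * L).
    """
    return het_sites / (2 * mu_per_gen * window_bp)

# Hypothetical window: 50 heterozygous sites in 100,000 bp.
generations = tmrca_generations(het_sites=50, window_bp=100_000)
years = generations * 25  # assuming 25 years per generation
print(f"about {generations:,.0f} generations, or {years:,.0f} years")
```

Windows with few heterozygous sites share a recent ancestor, and windows with many share an old one. The real method turns the distribution of these times along the genome into a population size history.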

As I note, qualitatively they replicated what has long been known, but the authors claim that their model allows for more precise quantitative inferences with fewer parameters. The parameters free to vary in their model were the mutation rate, the recombination rate, and ancestral population sizes. With their assumptions in hand they generated the following figure panel which shows the effective population size inferred from genomes as a function of time:

Moving to the left you come closer to the present, while to the right you move further into the past. Because of the reliance on recombination rates the authors admit that their method lacks power <20,000 years before the present, and > 3 million years before the present. In the former case there are too few recombination events, and in the latter case I assume that the events saturate the genome (they also note that deep balancing selection could generate artifacts). The authors validated their method by simulating genomes, but the results are obviously correct to a first approximation from what we know in other disciplines. You have a major Eurasian bottleneck, and a less severe bottleneck for Africans. Then the bounce back after the Last Glacial Maximum.

The second chart is more complicated, but the take away is that it is from this that the authors inferred that there must have been admixture between the ancestors of West Africans and Eurasians ~20-40 thousand years ago. More intelligibly the authors noted that the X chromosomes of the Korean and African individuals did not diverge nearly as much as they should, in that regions with last common ancestry in the ~20-40,000 year interval were far more numerous in their data than simulated results would imply using a model of total separation 60,000 years ago.

Let’s jump straight to the discussion:

The time frame proposed above for continued genetic exchange between Africans and non-Africans is more recent than the archaeologically documented time of the out-of-Africa dispersal, because there are modern human fossils in both Europe and Australasia that date to >40 kyr ago…Further analysis of additional non-African genomes indicates that this genetic exchange occurred primarily before the separation of Europeans and East Asians…An important caveat to this conclusion is the uncertainty of the per-year mutation rate of $1.0 \times 10^{-9}$ ($2.5 \times 10^{-8}/25$). Although this mutation rate agrees well with the rates estimated between primates averaged over millions of years…generation intervals as high as 29 years per generation over the last few thousand years, and present mutation rates lower than $2.5 \times 10^{-8}$ per generation…are possible in principle. These factors could make our recent date estimates too recent, although it seems unlikely that such inaccuracies would be consistent with a date of final genetic exchange as far back as 60 kyr ago….

If I was a journalist I would probably put this into the “developing….” bin, as there may be revisions to the human mutation rate, as they acknowledge above. In fact I have to wonder if a reviewer prodded them to add that caveat to the paper, though I am also rather sure that many of the authors are quite aware of some of the discussions as to the exact value of this parameter.
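How much the dates move when the mutation rate moves is simple arithmetic, and worth seeing once. The sketch below assumes that inferred times scale inversely with the per-year mutation rate, which is the usual first-order relationship. The 2.5 × 10⁻⁸ rate and the 25- and 29-year generation times come from the quoted passage. The halved rate is a hypothetical for illustration.

```python
def per_year_rate(mu_per_generation, years_per_generation):
    """Convert a per-generation mutation rate to a per-year rate."""
    return mu_per_generation / years_per_generation

def rescale_date(date_years, old_rate, new_rate):
    """Dates inferred from mutations scale inversely with the assumed rate."""
    return date_years * old_rate / new_rate

baseline = per_year_rate(2.5e-8, 25)      # the paper's 1.0e-9 per year
longer_gen = per_year_rate(2.5e-8, 29)    # same mutations, 29-year generations
halved = baseline / 2                     # hypothetical slower rate

for label, rate in [("29-year generations", longer_gen), ("halved rate", halved)]:
    print(label, round(rescale_date(40_000, baseline, rate)))
```

A longer generation time alone nudges a 40,000 year estimate up by a few thousand years. It takes a substantially lower mutation rate to double it.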

My own position as to the details of mutation rates and their implications for modern human origins is inchoate. But, let’s assume that we push back the last common ancestry estimates by a factor of 1.5-2. This may explain Eurasian-African admixture easily, if we presume that the ancestral proto-Eurasians were a liminal African population, which was in position to interbreed with both Africans proper and Neandertals because of their geographically equidistant position. Of course one thing that jumps out at me is that many of these arguments would be resolved if we sequenced a full-blooded Australian Aborigine. If this population is descended from the humans who arrived 40-50,000 years ago, then we can test whether the African admixture occurred 20-40,000 years before the present. If the Aborigine shows signs of admixture, then we have to be open to moving the time back to a period when the Aborigines were resident on the Eurasian mainland. Another possibility of course is that the Aborigines we deem indigenous today are actually late arrivals, on the order of ~20-40,000 years, replacing the original humans who arrived ~40-50,000 years ago.

I suspect many of these questions will be answered with larger data sets in the near future. The utility of methods such as the one above will increase once we fine tune some of the parameters. Interesting times.

Citation: Heng Li, & Richard Durbin (2011). Inference of human population history from individual whole-genome sequences Nature : 10.1038/nature10231


A few months ago I exchanged some emails with Milford H. Wolpoff and Chris Stringer. These are the two figures who have loomed large in paleoanthropology and the origins of modern humans for a generation, and they were keen on making sure that their perspectives were represented accurately in the media. To further that they sent me some documents which would lay out their perspective, in their own words, and away from the public glare (as in, they’re academic publications).

Here is Wolpoff’s 1984 manifesto of sorts of ‘Multi-regionalism.’ Much of the morphological material is totally opaque to me, but the basic evolutionary logic is rather clear. Stringer sent me two documents, a scientific paper and a more personal chapter of a book. These works predate recent developments, so they are of interest from a history of thought perspective.

I’m not one of the personalities at the heart of this debate obviously. There are hard feelings here. Wolpoff indicated to me that he still has issues with Stringer, despite reports that there was some sort of reconciliation. But one of the things that is really evident to me on reading through this material is that there are real scientific issues, and a great deal of methodological overlap. As a kid I read a lot of popularizations, and though the science was outlined it seems to me that the models were framed more starkly to align with the inter-personal conflict and division. The “primary documents” are less readable, but ultimately so much more fruitful.

Last summer I made a thoughtless and silly error in relation to a model of human population history when asked by a reader the question: “which population is most distantly related to Africans?” I contended that all non-African populations are equally distant. This is obviously wrong on the face of it if you look at any genetic distance measures. West Eurasians, even those without recent Sub-Saharan African admixture (e.g., North Europeans) are closer than East Eurasians, who are often closer than Oceanians and Amerindians. One explanation I offered is that these latter groups were subject to greater genetic drift through a series of population bottlenecks. In this framework the number of generations until the last common ancestor with Sub-Saharan Africans for all groups outside of Africa should be about the same, but due to evolutionary factors such as more extreme genetic drift or different selective pressures some non-African groups had diverged more from Africans than others in terms of their genetic state. In other words, the most genetically divergent groups in relation to Africans did not diverge any earlier, but simply diverged more rapidly.

Dienekes Pontikos disagreed with such a simple explanation. He argued that admixture or gene flow between Africans and non-African groups since the last common ancestor could explain the differences. I am now of the opinion that Dienekes may have been right. My own confidence in the “serial bottleneck” hypothesis as the primary explanation for the nature of relationships of the phylogenetic tree of human populations is shaky at best. Why my errors of inference?

There were two major issues at work in my misjudgments of the arc of the past and the topology of the present. In the latter instance I saw plenty of phylogenetic trees which illustrated clearly the variation in genetic distance from Africans for various non-African groups. Why didn’t I internalize those visual representations? It was I think the power of the “Out of Africa” (OoA) with replacement paradigm. Even by the summer of 2010 I had come to reject it in its strong form, due to the evidence of admixture with Neanderthals, and rumors of other events which were borne out to be true with the publishing of the Denisovan results. But to a first approximation the clean and simple OoA was still looming so large in my mind that I made the incorrect inference, whereby all non-Africans are viewed simply as a branch of Africans without any particular differentiation in relation to their ancestral population. Secondarily, I also was still impacted by the idea that most of the genetic variation you see in the world around us has its roots tens of thousands of years ago. By this, I mean that the phylogeographic patterns of 25,000 years in the past would map on well to the phylogeographic patterns of the present. This assumption is what drove a lot of phylogeography in the early aughts, because the chain of causation could be reversed, and inferences about the past were made from patterns of the present. My own confidence in this model had already been perturbed when I made my errors, but it still held some sort of sway in my head implicitly I believe. It is one thing to move on from old models explicitly, but another thing to remove the furniture from your cognitive basement and attic.

I have moved further from my preconceptions between then and now. It took a while to sink in, but I’m getting there. A cognitive “paradigm shift” if you will. In particular I am more open to the idea of substantive back migration to Africa, as well as secondary migrations out of Africa. A new paper in Genome Research is out which adds some interesting details to this bigger discussion, and seems to weigh in further against my tentative hypothesis that serial bottlenecks and genetic drift can explain variation in distance to Africans of various non-African groups. Human population dispersal “Out of Africa” estimated from linkage disequilibrium and allele frequencies of SNPs:

Genetic and fossil evidence supports a single, recent (<200,000 yr) origin of modern Homo sapiens in Africa, followed by later population divergence and dispersal across the globe (the “Out of Africa” model). However, there is less agreement on the exact nature of this migration event and dispersal of populations relative to one another. We use the empirically observed genetic correlation structure (or linkage disequilibrium) between 242,000 genome-wide single nucleotide polymorphisms (SNPs) in 17 global populations to reconstruct two key parameters of human evolution: effective population size (Ne) and population divergence times (T). A linkage disequilibrium (LD)–based approach allows changes in human population size to be traced over time and reveals a substantial reduction in Ne accompanying the “Out of Africa” exodus as well as the dramatic re-expansion of non-Africans as they spread across the globe. Secondly, two parallel estimates of population divergence times provide clear evidence of population dispersal patterns “Out of Africa” and subsequent dispersal of proto-European and proto-East Asian populations. Estimates of divergence times between European–African and East Asian–African populations are inconsistent with its simplest manifestation: a single dispersal from the continent followed by a split into Western and Eastern Eurasian branches. Rather, population divergence times are consistent with substantial ancient gene flow to the proto-European population after its divergence with proto-East Asians, suggesting distinct, early dispersals of modern H. sapiens from Africa. We use simulated genetic polymorphism data to demonstrate the validity of our conclusions against alternative population demographic scenarios.

Here are the details. The authors use patterns of linkage disequilibrium (LD) to gauge divergence, time since divergence, and the effective population sizes of various groups. LD measures the correlations of genetic variations across loci. Because of the shuffling properties of recombination the correlation of markers across the genome should be relatively low. That is, they should be independent. But not in all cases. You could, for example, have two markers at two genes which are positioned physically close together. Now imagine a selective sweep event which increases the frequency of one of the variants through positive selection. Then the other marker on the second gene will also rise up in frequency by “hitchhiking” along on the other’s good fortune. Over time recombination will break apart these associations, but that decay of LD takes time. Importantly, it is not just natural selection which can generate these patterns within the genome. Population bottlenecks can drive up (and down) fragments of the genome wildly because of the jacking up of “noise” into the generation-to-generation transmission of allele frequency values within a population. So LD can reflect both demographic events as well as bouts of adaptation.
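For those who like to see the statistic itself, here is a minimal sketch of the two standard pairwise LD measures, D and r², computed from phased haplotypes at two biallelic markers. The haplotype lists are toy inputs I made up and the function name is my own. The paper works with genome-wide SNP data and its own estimation procedure.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2), coded 0/1.
    D is the excess of the 1-1 haplotype over what independence predicts.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined if either locus is fixed for one allele.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2

# Toy cases: complete association versus complete independence.
linked = [(1, 1)] * 5 + [(0, 0)] * 5
shuffled = [(1, 1), (1, 0), (0, 1), (0, 0)] * 3

print(ld_stats(linked))
print(ld_stats(shuffled))
```

Note the undefined case when a marker is fixed. That is the same fixation problem that comes up below when the LD-based divergence times are compared with the Fst-based ones.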

Another measure of genetic variation that the authors rely on is the fixation index (Fst). This ignores patterns of correlation across genes, and is a comparison of the variation of a given specific marker from population to population. High Fst values signal a lot of between-population differentiation. An Fst value of ~0 indicates almost no between-population differentiation. An extreme example would be a marker, 1, which is at frequency 0.5 in population A and population B, and a marker, 2, which is at frequency 0.0 and 1.0 in population A and B. Fst = 0.0 for marker 1, and 1.0 for marker 2. The Fst values in this paper are averaging across the genome, so obviously you’ll get values on the interval between 0 and 1, though it will usually be closer to 0 for any given marker (average intercontinental human Fst values at a given marker is famously ~15%; ergo, the chestnut of wisdom that 85% of variation is within races, and 15% between).
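The two extreme markers in that example are easy to check by hand or in a few lines of code. This sketch uses the simplest heterozygosity-based definition, Fst = (Ht − Hs) / Ht, for two populations of equal size at one biallelic marker. Real analyses use estimators that correct for sample size, so treat this as an illustration of the definition and nothing more.

```python
def fst_two_pops(p1, p2):
    """Fst at one biallelic marker for two equally weighted populations.

    p1, p2: frequency of the same allele in population A and population B.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                     # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_total == 0:
        return 0.0  # marker is fixed everywhere, so there is nothing to partition
    return (h_total - h_within) / h_total

print(fst_two_pops(0.5, 0.5))  # marker 1 from the text
print(fst_two_pops(0.0, 1.0))  # marker 2 from the text
print(fst_two_pops(0.4, 0.6))  # a modest frequency difference
```

The third call shows why typical values sit close to 0. Even a 20-point difference in allele frequency gives an Fst of only a few percent.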

The chart at the top of the post shows the divergence times inferred from an Fst based statistic and an LD based statistic, above and below respectively. Two notable things to observe. First, the basic structure of both statistics is similar. Second, LD tends to give smaller values. The authors contend that LD is clearly an underestimate because it doesn’t take into account migration and fixation of allele frequencies, where one variant reaches 100% and so LD can not be calculated.

An aspect of LD which is useful for the authors is that they could calculate effective population sizes over time for their disparate samples. Below is a plot which shows the variation over time. I’ve added some clarifying labels (you should recognize many of the abbreviations from the HapMap populations):
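How does LD give you a population size at a point in time? A minimal sketch, using the classical approximation that the expected r² between markers separated by recombination distance c (in Morgans) is about 1 / (1 + 4·Ne·c), and the rule of thumb that markers at distance c reflect the population roughly 1 / (2c) generations ago. I am assuming the simplest form here. It leaves out the sample-size and mutation corrections a real analysis would apply, and the paper's exact estimator may differ. The input values are hypothetical.

```python
def ne_from_ld(mean_r2, c_morgans):
    """Invert E[r^2] ~ 1 / (1 + 4 * Ne * c) to get effective population size."""
    return (1.0 / mean_r2 - 1.0) / (4.0 * c_morgans)

def generations_ago(c_morgans):
    """Approximate age of the population that LD at distance c reflects."""
    return 1.0 / (2.0 * c_morgans)

# Hypothetical: mean r^2 of 0.02 between markers 0.1 cM (0.001 Morgans) apart.
c = 0.001
print(f"Ne of about {ne_from_ld(0.02, c):,.0f}, "
      f"roughly {generations_ago(c):,.0f} generations ago")
```

Tightly linked markers (small c) speak to the deep past, and loosely linked ones to the recent past. That is how a single sample can be turned into a curve of population size over time.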

Some observations:

1) Africans have a relatively large breeding population from before to after the putative OoA event.

2) Non-Africans show the small ancestral population during the Pleistocene that you’d expect, rising very slowly if at all from the exit event from Africa across the Ice Ages.

3) Then ~10,000 years ago you start to see divergences. The Chinese crest to very large effective populations. The Tuscans are next in order. Then there’s a cluster of Northwest European groups. The Japanese are between the Tuscans and Northwest Europeans. Finally at the bottom you see Finns and Mexicans. This is not too surprising in terms of rank order. But here’s the interpretation from the paper of the European patterns:

…likely the consequence of bottlenecks associated with the depopulation and recolonization of Northern Europe before and after the last glacial maximum…growth accelerates moving forward in time, with the average rate about threefold higher in the period 8–5 KYA than 20–8 KYA, presumably representing the impact of agricultural innovations on population density.

Remember my point that it is problematic to back project contemporary variation to the past? I think this needs to be emphasized here. My own hunch is that the difference between the Finns and other Northwest Europeans has to do with the relatively late adoption of agriculture of the former, and the possibility that much of the genome of the latter is due to relatively late intrusions from southern and eastern Europe of explosively expanding agricultural groups. In other words, I’m not sure that aside from the Finns the recolonization after the LGM matters much at all.

Also, there’s one point I want to make sure to get to: the authors contend that the time until last divergence can’t be explained by a model of serial bottlenecks, as I had posited last summer. In other words, there have to be more complex dynamics at work here. They ran a bunch of simulations with constructed genomic sequences. Varying effective population size so that you have a bunch of serial bottlenecks was not enough to explain the difference between East Asians and West Eurasians when it came to time until last divergence to Sub-Saharan Africans. There has to be something more complex going on.

Speaking of complexity, I would also like to add that this paper reinforces the likelihood of a “pause” of the ur-non-African population after they left Africa. There’s a ~20,000 year gap between the time until the last common ancestor, and then the separation of West and East Eurasians. Several genomic analyses have pointed in this direction. I think the exact span of this interval is going to be debated, but I suspect that it is real. Additionally, the authors contend that the genetic closeness of West Eurasians to Sub-Saharan Africans may point to an ancient second migration out of Africa.

First, let’s walk back to where we started. Here was the rough “cartoon” model of the origin of modern H. sapiens sapiens circa 2009:

1) 50-100 thousand years ago you have a huge number of hominin groups across Africa and Eurasia.

2) At some point within this interval of time a small population of East Africans began to rapidly expand in population. They replaced in totality all other hominins, within, and outside of, Africa.

3) Therefore the inference can be made that all human beings alive today are descended from one tribe of East Africans.

At this point we can probably reject this model as being the full story. There is now suggestive evidence that the population fluctuations of Africans have been far more modest than those of non-Africans over the past 100,000 years. We also have to confront the likelihood of multiple admixture events with those “Other” hominins outside of, and possibly within, Africa. Finally, we can’t reject back migration events as well as multiple Out of Africa pulses.

I believe that the pattern of genetic variation across the whole world, including within Africa, has re-ordered itself radically over the past 10,000 years. We need to stop, and take a breath. If we know so little about the past 10,000 years, how much can we confidently infer about the past 100,000 years? Only a few points I suspect. For now.

Related: See Dienekes’ comment as well.

Citation: McEvoy BP, Powell JE, Goddard ME, & Visscher PM (2011). Human population dispersal “Out of Africa” estimated from linkage disequilibrium and allele frequencies of SNPs. Genome research PMID: 21518737


The Pith: I review a recent paper which argues for a southern African origin of modern humanity. I argue that the statistical inference shouldn’t be trusted as the final word. This paper reinforces previously known facts, but does not add much that is both novel and robust.

I have now read the paper which I expressed a touch of skepticism toward yesterday. Do note, I did not dispute the validity of their results. They seem eminently plausible. I was simply skeptical that we could, with any level of robustness, claim that anatomically modern humans arose in southern vs. eastern, or western, Africa. If I had to bet, my rank order would be southern ~ eastern > western. But my confidence in my assessment is very low.

First things first. You should read the whole paper, since someone paid for it to be open access. Second, much props to whoever decided to put their original SNP data online. I’ve already pulled it down, and sent off emails to Zack, David, and Dienekes. There are some northern African populations which allow us to expand beyond the Mozabites, though unfortunately there are only 55,000 SNPs in that case (I haven’t merged the data, so I don’t know how much will remain after combining with HapMap or HGDP data set).

The abstract:

Africa is inferred to be the continent of origin for all modern human populations, but the details of human prehistory and evolution in Africa remain largely obscure owing to the complex histories of hundreds of distinct populations. We present data for more than 580,000 SNPs for several hunter-gatherer populations: the Hadza and Sandawe of Tanzania, and the ≠Khomani Bushmen of South Africa, including speakers of the nearly extinct N|u language. We find that African hunter-gatherer populations today remain highly differentiated, encompassing major components of variation that are not found in other African populations. Hunter-gatherer populations also tend to have the lowest levels of genome-wide linkage disequilibrium among 27 African populations. We analyzed geographic patterns of linkage disequilibrium and population differentiation, as measured by FST, in Africa. The observed patterns are consistent with an origin of modern humans in southern Africa rather than eastern Africa, as is generally assumed. Additionally, genetic variation in African hunter-gatherer populations has been significantly affected by interaction with farmers and herders over the past 5,000 y, through both severe population bottlenecks and sex-biased migration. However, African hunter-gatherer populations continue to maintain the highest levels of genetic diversity in the world.

Why would hunter-gatherers have so much diversity? The historical and ethnographic data here are clear: it is not that hunter-gatherers are particularly diverse, but that descendants of farming populations tend to be less diverse, and most of the world’s population are descendants of farmers. To give a classic example, ~30,000 Puritans and fellow travelers who arrived in the 1630s to New England gave rise to ~700,000 New Englanders in 1790. This is growth by a factor of roughly 23 over about six generations, or about 1.7 per generation if you assume 25-year generations. And, this does not include the substantial back migration to England during the 1650s, as well as the fact that there was already spillover of New Englanders to other regions of the American colonies in the 17th and 18th centuries (e.g., eastern Long Island was dominated by New Englanders). 30,000 is not small enough to constitute a bottleneck genetically, but one can imagine much smaller founding populations rapidly compounding as agriculturalists push their way through ecologically constraining bottlenecks.

For Africa we have a good candidate for this phenomenon: the Bantu expansion. This rise of African farmers began around the region of eastern Nigeria and Cameroon ~ 3,000 years ago. It swept east, toward the lakes of eastern Africa, and down along the Atlantic coast toward modern day Angola. Between 1,000 and 2,000 years ago in its broad outlines the expansion had crested, reaching its limit in southern Africa, where the climatic regime was not favorable for their tropical agricultural toolkit (e.g., the Cape region has a Mediterranean climate). Here you still have the hunter-gatherer Bushmen, and other Khoisan groups such as the Nama, who practiced animal husbandry. By and large this expansion seems to have resulted in a great deal of biological replacement of previous peoples. South African Bantu speakers, such as Desmond Tutu, share more with Nigerians genetically than they do with the nearby Bushmen, though there has been some admixture on the frontier among Xhosa.

As I have stated, most of this paper elicits little objection from me. The major issue that I do take objection to is the inference that these results indicate the likelihood of southern, not eastern, Africa, being the origin of anatomically modern humanity. The authors do point out that many of the hallmarks of modern humanity have their earliest dates in southern, not eastern, Africa. That does add to the plausibility of their overall case, and I would be curious as to the opinion of someone more versed in the material culture and fossil remains to weigh in. But that’s where we started, not where we are, assuming that their specific contribution to the model does push it forward. So I’ll focus on the genetic data. Here’s the point which seems tendentious to me:

…Regressions of LD on distance from southwestern Africa were highly statistically significant (at 5-Kb windows, $P \approx 4.9 \times 10^{-6}$) (Fig. 2C). Best-fit (Materials and Methods) locations based on LD are consistent with a common origin in southern Africa. A point of origin in southwestern Africa was approximately 300–1,000 times more likely than in eastern Africa….

If you’ve calculated regressions, you know that this can be quite the art. They are sensitive to various assumptions, as well as the data you throw into them. They’re dumb algorithms, so they’ll give you a result, even if it doesn’t always make sense. To really understand why I remain moderately skeptical of the inference in this paper, you need to look at figure 2B. I’ve reedited a bit for style. Also, some of the groups were so obscure that even I didn’t know them, so I just put in their nation.

On the y axis is linkage disequilibrium. Basically, population bottlenecks, and admixture events, along with localized selective sweeps, can elevate this statistic. The LD statistic for non-African populations is invariably higher than for African ones, and the further away, the higher the value. On the x axis is the distance from their inferred point of origin of the human expansion in southwestern Africa. The Hadza seem to have gone through a recent bottleneck (or, are going through it now) according to other measures in the paper, so no surprise that they’re deviated above the trend line. The other hunter-gatherer groups, the Bushmen and Pygmies (Namibian and South African Bushmen, the Biaka from western Congo and the Mbuti from the east of that nation) have low LD values, consistent with relatively stable and deep time histories for the populations, when viewed as a coherent whole (all humans have equally ancient lineages, but coherent populations can be older, or younger, depending on how you view them). My main issue is this: once you remove the non-Sub-Saharan African populations the trend line is far less stark. The Fang, who are a Bantu group near the point of origin of that language family, have nearly the same LD as some of the hunter-gatherer groups. The Mandenka, in far western Africa, have elevated LD vis-a-vis hunter-gatherers, but not nearly so much as the groups with more “northern” admixture (e.g., the Fulani).
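The general point about regressions being driven by a handful of distant points is easy to demonstrate. The sketch below fits an ordinary least squares slope to a toy data set, first with everything included and then with only the "Sub-Saharan" points. All of the numbers are invented to mimic the shape of the argument. They are not the values from figure 2B.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Invented toy data: (distance from putative origin in km, LD, is_sub_saharan)
toy = [
    (500, 0.30, True), (1500, 0.32, True), (2500, 0.31, True),
    (3500, 0.33, True), (4500, 0.32, True),
    (8000, 0.45, False), (12000, 0.52, False), (20000, 0.60, False),
]

slope_all = ols_slope([d for d, _, _ in toy], [ld for _, ld, _ in toy])
africa = [(d, ld) for d, ld, sub_saharan in toy if sub_saharan]
slope_africa = ols_slope([d for d, _ in africa], [ld for _, ld in africa])

print(f"slope, all populations:   {slope_all:.2e} per km")
print(f"slope, Sub-Saharan only:  {slope_africa:.2e} per km")
```

In the toy data the full-sample slope is several times the within-Africa slope, because the far-away high-LD points do most of the work. Whether the real data behave this way is exactly what you have to look at the figure to judge.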

The moral of the story here is to not just rely on the final numbers generated by statistical methods, which can be of quite a large magnitude, but to look at the figures and try to make sense of them. Overall, I would say that this paper presents many interesting results, but the most robust look to be confirming what we already knew, rather than increasing the probability of a novel locus for the point of origin of modern humans (though the southern origin already gains some support from archaeology).

Citation: Brenna M. Henn, Christopher R. Gignoux, Matthew Jobin, Julie M. Granka, J. M. Macpherson, Jeffrey M. Kidd, Laura Rodríguez-Botigué, Sohini Ramachandran, Lawrence Hon, Abra Brisbin, Alice A. Lin, Peter A. Underhill, David Comas, Kenneth K. Kidd, Paul J. Norman, Peter Parham, Carlos D. Bustamante, Joanna L. Mountain, & Marcus W. Feldman (2011). Hunter-gatherer genomic diversity suggests a southern African origin for modern humans. PNAS. DOI: 10.1073/pnas.1017511108

Image credit: Mark Dingemanse.


John Farrell pointed me to this Ann Gibbons piece, A New View Of the Birth of Homo sapiens. Here are some interesting passages:

The new picture most resembles so-called assimilation models, which got relatively little attention over the years. “This means so much,” says Fred Smith of Illinois State University in Normal, who proposed such a model. “I just thought ‘Hallelujah! No matter what anybody else says, I was as close to correct as anybody.’ ”

But the genomic data don’t prove the classic multiregionalism model correct either. They suggest only a small amount of interbreeding, presumably at the margins where invading moderns met archaic groups that were the worldwide descendants of H. erectus, the human ancestor that left Africa 1.8 million years ago. “I have lately taken to talking about the best model as replacement with hybridization, … [or] ‘leaky replacement,’ ” says paleogeneticist Svante Pääbo of the Max Planck Institute for Evolutionary Anthropology in Leipzig, lead author of the two nuclear genome studies.

I suppose ‘assimilation’ sounds too generic, but ‘leaky replacement’ seems more fitting for a building ‘super’. But it isn’t as if paleoanthropology has a Don Draper, a rogue with a way with words.

Here’s the infographic that went along with it:

One issue I’m wondering: what’s the best mammalian analog for humans? By humans I mean our lineage since the emergence of H. erectus (or whatever they’re calling it now) and its expansion into Eurasia. Off the top of my head, I think the gray wolf is the best candidate:



My post The paradigm is dead, long live the paradigm! expressed to some extent my befuddlement at the current state of human evolutionary genetics and paleoanthropology. After the review of the paper on possible elevated admixture with Neandertals at the dystrophin locus a friend emailed, “Remember when we thought everything would be so simple once we could finally see this stuff?” Indeed I do remember. The fact that things aren’t simple is very exhilarating, but it also puts a major damper on theoretical clarity. Science is after all not a collection of facts, but it is in part facts which one can sieve through an analytic framework.

In hindsight, with the relative robustness of ancient DNA results, we can make some assessments about the role of human bias within particular heuristic frameworks over the past generation. From the mid-1980s up until 2000 it was victory after victory for the Out-of-Africa with total replacement model. The rise of mtDNA and Y chromosomal lineage studies seemed to buttress the idea of common descent from neo-Africans within the last 100-200,000 years for all human populations. There wasn’t much of a perturbation from this march toward paradigm ascendancy in the aughts, except that there was now also a trickle of papers which claimed to find phylogenetic “long branches” in the human genome. The 2006 Evans et al. paper, Evidence that the adaptive allele of the brain size gene microcephalin introgressed into Homo sapiens from an archaic Homo lineage, was probably the one that made the biggest media splash. But these were inferences. Subsequent analysis of the draft Neandertal genome seems to suggest that in fact the microcephalin allele in question did not introgress.

Case closed? Obviously not. Now we’re in a different era. The Evans et al. paper may have been wrong in the specifics, but its general framework seems likely to have been validated: there are genetic lineages in the modern human genome which are not derived from the neo-Africans. But, let us remember that the overwhelming majority of the human genome is neo-African. A reasonable interval for non-Africans is 90-99% neo-African. But, a non-trivial minority has introgressed or admixed from other lineages. Out-of-Africa is mostly correct, but in some ways so is Multiregionalism. But how do we describe this? “Weighted multiregionalism”? “Mostly Out-of-Africa?” The old terms were nice because they were punchy and precise. If you look at Multiregionalism or Out-of-Africa in Wikipedia the newest results are noted, but it doesn’t seem that they’ve been integrated into the analytic narrative. Yet.


Guelta d’Archei, Chad. Credit: Dario Menasce.

Everyone who is literate knows that the Sahara desert is the largest of its kind in the world. The chasm in cultural, biological, and physical geography is very noticeable. Northern Africa is part of the Palearctic zone, while the peoples north of the Sahara have long been part of the circum-Mediterranean population continuum. The primary continuous habitable corridor is that of the Nile valley. And yet scholars have long known that there has been variation in the climatic regime of the Sahara. The pharaohs of ancient Egypt seem to have hunted a wider range of fauna than is to be found in the deserts surrounding the current Nile valley, perhaps relics from a more humid period. Rock art in some regions of the desert indicates aquatic life, and species more characteristic of the savanna. And yet we should not think of the Sahara as a recent phenomenon; it does seem to be geologically ancient, despite periodic humid interregnums. A new paper in PNAS attempts to map the hydrography of the Sahara over the Holocene, as well as back to the Pleistocene. The ultimate aim seems to be to better frame the geographic constraints on the expansion of humanity from its African homeland, and refute a simple projection from the present to the past. In this case, it is the existence of the Nile as a verdant and habitable watercourse which connects the north and south, and bisects the continuous desert. Ancient watercourses and biogeography of the Sahara explain the peopling of the desert:

Evidence increasingly suggests that sub-Saharan Africa is at the center of human evolution and understanding routes of dispersal “out of Africa” is thus becoming increasingly important. The Sahara Desert is considered by many to be an obstacle to these dispersals and a Nile corridor route has been proposed to cross it. Here we provide evidence that the Sahara was not an effective barrier and indicate how both animals and humans populated it during past humid phases. Analysis of the zoogeography of the Sahara shows that more animals crossed via this route than used the Nile corridor. Furthermore, many of these species are aquatic. This dispersal was possible because during the Holocene humid period the region contained a series of linked lakes, rivers, and inland deltas comprising a large interlinked waterway, channeling water and animals into and across the Sahara, thus facilitating these dispersals. This system was last active in the early Holocene when many species appear to have occupied the entire Sahara. However, species that require deep water did not reach northern regions because of weak hydrological connections. Human dispersals were influenced by this distribution; Nilo-Saharan speakers hunting aquatic fauna with barbed bone points occupied the southern Sahara, while people hunting Savannah fauna with the bow and arrow spread southward. The dating of lacustrine sediments show that the “green Sahara” also existed during the last interglacial (∼125 ka) and provided green corridors that could have formed dispersal routes at a likely time for the migration of modern humans out of Africa.

This paper was written before the Denisovan admixture results shifted the necessity to genuflect so explicitly to Out of Africa. But its results are interesting nonetheless, since they don’t depend too deeply on a paleoanthropological model. Rather, by surveying biogeography and geologic data they produce a sense of how the Sahara exhibited climatic flux over the past 100,000 years as a function of time and space. The latter is important because the Sahara is not an amorphous sandy waste. Rather, it exhibits a great deal of topographical variation:

Credit: T L Miles

In the Tibesti mountains the highest peaks are ~11,000 feet above sea level (3,400 meters). Because of the aridity of the Sahara in general even these elevations do not induce sufficient precipitation to produce a “green mountain” effect, common in other arid parts of northern Africa and Arabia. But in a regime of only slightly higher precipitation and milder temperatures (remove 3 degrees Fahrenheit per 1,000 feet against latitude controlled sea level temperature) one can imagine the Tibesti having been much more biologically productive in the past. Consider this from the Tassili n’Ajjer region of southern Algeria:

Because of the altitude and the water-holding properties of the sandstone, the vegetation is somewhat richer than the surrounding desert; it includes a very scattered woodland of the endangered endemic species Saharan Cypress and Saharan Myrtle in the higher eastern half of the range.

The range is also noted for its prehistoric rock paintings and other ancient archaeological sites, dating from neolithic times when the local climate was much moister, with savannah rather than desert. The art depicts herds of cattle, large wild animals including crocodiles, and human activities such as hunting and dancing….
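To put a number on the elevation rule of thumb mentioned above for the Tibesti (3 degrees Fahrenheit per 1,000 feet), here is a quick back-of-the-envelope sketch. The function name and the 100 °F sea-level temperature are my own placeholders for illustration, not figures from the paper.

```python
# Rule of thumb quoted in the text: ~3 degrees Fahrenheit cooler per 1,000 feet.
LAPSE_F_PER_1000_FT = 3.0


def temp_at_elevation(sea_level_temp_f, elevation_ft):
    """Approximate temperature at a given elevation under a constant lapse rate."""
    return sea_level_temp_f - LAPSE_F_PER_1000_FT * elevation_ft / 1000.0


# Highest Tibesti peaks are ~11,000 feet (from the text).
peak_ft = 11_000
# HYPOTHETICAL sea-level temperature, chosen only to make the arithmetic visible.
assumed_sea_level_f = 100.0

peak_temp_f = temp_at_elevation(assumed_sea_level_f, peak_ft)
cooling_f = assumed_sea_level_f - peak_temp_f
print(f"Cooling at {peak_ft} ft: {cooling_f} F; peak temperature: {peak_temp_f} F")
```

So the summits sit roughly 33 °F below the equivalent sea-level temperature, which is why a modest bump in precipitation could plausibly go a long way at altitude.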

The main thrust of the paper seems to be to refute the common assumption that an eternal Nile served as the north-south corridor for African fauna, including humans. Here is the reason:

Reanalysis of the Saharan zoogeography…suggests that many animals, including water-dependent creatures such as fish and amphibians, dispersed across the Sahara recently. For example, 25 North African animal species have a spatial distribution with population centers both north and south of the Sahara and small relict populations in central regions. This distribution suggests a trans-Saharan dispersal in the past, with subsequent local isolation of central Saharan populations during the more recent arid phase. If a diverse range of species (including fish) can cross the Sahara, it is impossible to envisage the Sahara functioning as barrier to hominin dispersal. The zoogeography of the Nile suggests that it was a much less effective corridor…Only nine animal species that occupy the Nile corridor today are also found both north and south of the Sahara….

There are also isolated pieces of evidence which refute a Nile-only model: Saharan oases which have endemic species of crocodiles. The existence of “desert crocodile” populations is a signal of a more well-watered past, with a subsequent retreat into isolated oases (some of these populations did go extinct in the 20th century though). In some ways this is a problem. Simple models make simple predictions, and are easier to test. But if simple models are false, that is an even greater problem.

Here are the figures which outline the primary results from geology and biogeography:

[Figure gallery: geological and biogeographic results from Drake et al. (2010)]

There are two primary inferences made in regards to humans:

1) The Holocene inference seems to be that Nilo-Saharan populations have their origins in the societies which expanded north and south along the liminal zone of the Sahara. The authors argue that Nilo-Saharan populations on isolated oases in the northern Sahara are relics from the past expansion in the early Holocene. This sounds plausible, but it would be nice to explore this in more depth via linguistic and genetic analysis. With the rise of the camel and Islam a trans-Saharan trade in humans may have resulted in a great deal of trans-location of whole populations from one area to another. Concurrent with the Nilo-Saharans who pushed north the authors also suggest that savanna hunters moved south. I am not clear who these people are from the paper, and the mapping between archaeology and linguistics here seems more tentative.

2) A deep history inference also seems to be that trans-Sahara population movements were feasible in a period around 120,000-100,000 years BP, but not 60,000-50,000 years BP. The distinction here matters because the latter is a relatively young age for the Out of Africa migration, while the former is an older one. If the latter view is correct then the only plausible route of migration is probably the coastal fringe of the Horn of Africa. If the former view is correct then a whole host of possibilities confront us, because the hydrography of the Sahara may have been constrained, but there were several avenues of migration.

In regards to #2, a clement phase, and then resealing of the genetic barrier, may align well with recent models which posit a non-trivial period of separation between Africans and non-Africans after the Out of Africa migration. In other words early modern humans may have followed the pattern of many species, with an expansion into, and beyond, the Sahara, and then a subsequent separation of two populations by a resurgent desert. The difference is that the daughter population isolated on the far side of the desert eventually “broke out” from the margins of the African homeland to the rest of the world.

Citation: Drake NA, Blench RM, Armitage SJ, Bristow CS, & White KH (2010). Ancient watercourses and biogeography of the Sahara explain the peopling of the desert. Proceedings of the National Academy of Sciences of the United States of America PMID: 21187416


Dilettante human genetics blogger Dienekes Pontikos has a post up with a somewhat oblique title, Is multi-regional evolution dead? I say oblique because a straightforward title would be “Multi-regionalism lives!” He posted a chart from a 2008 paper which outlines various models of human origins, and their relationship to molecular data at the time. I have also posted the chart, but with a little creative editing on the “assimilation” scenario to reflect the possible Neandertal and Denisovan admixture events. Of these models the “candelabra” can be rejected as highly implausible. It posits very deep roots in a given region for distinct human populations. Unless you accept some sort of hominin population structure in Africa which was maintained by distinctive migrations out of Africa, the “replacement” model can be discarded (since the classic replacement model did not posit ancient African population structure being of any relevance outside of Africa you’d have to salvage it with a modification in light of new results).

So the two primary disputants are a resurrected multi-regional model, and the assimilation model. But these two are really endpoints on a spectrum of models. What you need to do is vary the number of discrete populations and the rate of migration between the populations over time. The beauty of the replacement model was its parsimony: as far as recent human origins were concerned past gene flow via migration was a relatively academic concern. It was an exceedingly simple narrative framework. Consider this first episode of a 2009 British documentary, The Incredible Human Journey:

In the first episode, Roberts introduces the notion that genetic analysis suggests that all modern humans are descended from Africans. She visits the site of the Omo remains in Ethiopia, which are the earliest known anatomically modern humans, and visits the San people of Namibia to demonstrate the hunter-gatherer lifestyle. In South Africa, she visits Pinnacle Point, to see the cave in which very early humans lived. She then explains that genetics suggests that all non-Africans may descend from a single, small group of Africans who left the continent tens of thousands of years ago. She explores various theories as to the route they took. She describes the Jebel Qafzeh remains in Israel as a likely dead end of a traverse across Suez, and sees a route across the Red Sea and the around the Arabian coast as the likelier route for modern human ancestors, especially given the lower sea levels in the past.

A neat and tidy story. But reality is getting a lot less neat and tidy. Personally, the assimilation model as we understand it now seems to be the most plausible model. It remains more parsimonious than the alternatives: ancient population structure and complex patterns of gene flow and hybridization. But parsimony has misled us toward undue confidence in the recent past, so we should not weight this as strongly at this point. Where would we be without ancient DNA extraction? Some researchers have long claimed a more complex model than Out of Africa, but as long as we relied on inferences from extant populations these results were ignored or dismissed (notably, ancient DNA extraction is also unsettling our understanding of the very recent human past).

There is though the pattern of greater African genetic diversity. Dienekes observes that a recent paper reports that some Indian populations may be more diverse genetically than HapMap Africans. I’m not too keen on overturning a generation of consensus yet in regards to this question based on one deeply sequenced region on one chromosome comparing some Indian tribal groups to two HapMap African populations (Yoruba, and a Kenyan Bantu group). So I accept the pattern of greater diversity until further research brings it more into doubt. Now the question is to explain the pattern. The most plausible explanation would naturally be the one outlined above in the 2009 documentary: non-Africans are the descendants by and large of a small number of Africans who left ~100,000 years B.P. They went through a population bottleneck which reduced genetic diversity sharply. Their genetic variance was a subset of that of Africans (with some admixture from other human lineages outside of Africa, as it now happens).

But, there are other possibilities. One option sounds rather bizarre to me on first blush:

With respect to the reduced genetic diversity, one idea is that it is the result of genetic drift following a bottleneck in a small African population. But, the data can just as well be explained by species-wide selection which culled genetic variation.

Presumably selection would operate outside of Africa and homogenize non-Africans through a series of sweeps. Remember that selection and stochastic population events can sometimes be hard to differentiate, because both expunge variation from long swaths of the genome, resulting in long linkage disequilibrium blocks. This seems rather incredible as a proposition to me. Could selection operate all across Eurasia in such a fashion? From what I can tell in relation to more recent signatures of natural selection that does not tend to occur. The pattern for skin color for example is convergent phenotypes through different genetic architectures. How could gene flow tie together ancient human lineages and not H. sapiens sapiens? On the other hand, this could be an explanation for the consistent and taxon wide pattern of encephalization (though I believe this occurred in Africa as well).

A second alternative would be that Africa’s greater genetic diversity is simply a function of a much larger long-term effective population. In this model the climatic fluctuations of the Pleistocene periodically reduced non-African populations to such an extent that these groups became a very minor proportion of the total census size of humans, and so were swamped out by gene flow from the more numerous African humans. It seems to me that an extreme case of this model really verges into the same territory as the assimilation model. So I see this as more of a difference of degree than kind.

Dienekes points to Y chromosomal markers which suggest “back-migration” to Africa. I don’t totally discount this, but looking at the enormous diversity in groups like the Bushmen, I don’t think we can attribute that to back-migration from Eurasia. It is notable that the Bushmen are basal to the rest of humanity, including the Yoruba + (Eurasicans + Australasians). Also, the genetic divergence between the Denisovan/Neandertal clade and modern humans is only ~33% greater than between Bushmen and Papuans. Speaking of differences of degree, that is becoming more and more the case when it comes to the so-called “dead ends” of human evolution and ourselves.

Finally, there’s the issue of non-neo-African admixture. Reich et al. give a figure of ~7.5% in Melanesians, and ~2.5% in Eurasicans. It is valid I think to point out that though others have offered figures in the literature before only with the reference sequences of ancient DNA are these widely accepted values. Perhaps they would be revised upward with other sequences. But two cautions:

– There are only so many hominins to go around. Australia and the New World were only settled by modern humans. So how many were there running around in Eurasia? I think perhaps there may have been something different in South Asia, but that’s just a very uninformed guess.

– On the margin it seems clear that the autosomal DNA left enough room for interpretation that the archaic admixture signal could be dismissed. But the upper bound can’t be that high, or the Fst values would be more extreme than they currently are. Modern humans do seem to share a great deal of “shallow” common ancestry.

At the end of the day I am going to put my money on the assimilationist model because I believe in diminishing marginal returns. The Out of Africa replacement model was maximalist. Some tweaking on the margin is not very surprising, at least in hindsight, but more baroque forms of multi-regionalism have far too many moving parts. Newtonian mechanics may have been superseded in some domains by Einstein’s theories and Quantum Mechanics, but for many purposes it does very well at predicting phenomena and modeling the world. I have full expectation of further refinements in the assimilation model, but I would bet that the age of revolutions is over for a long time. Then again, my confidence is modest at best. This is no time for certitudes.

Note: An illustration of the models:


Quick review. In the 19th century once the idea that humans were derived from non-human ancestral species was injected into the bloodstream of the intellectual classes there was an immediate debate as to the location of the proto-human homeland; the Urheimat of us all. Charles Darwin favored Africa, but in many ways this ran against the cultural grain. The theory of evolution was birthed before the highest tide of the age of white supremacy and European hegemony, and Darwin’s model had to swim against the conviction that Africans were the most primitive of the colored races. After the waning of the ideological edifice of white supremacy, and the shock it received during and after World War II, the debates as to the origin of humanity still remained contentious and followed the same outlines (though without the charged normative inferences). But as the decades wore on many more researchers began to believe that Darwin was correct, and that the origin of humanity lay in the African continent. First, the deep origin of the human lineage in Africa was accepted, but eventually a more recent expansion out of Africa was argued for by one school. The turning point in these academic disputes was the popularization of the “mitochondrial Eve” theory of the 1980s.

What some paleontologists had long argued, that anatomically modern humans have their locus of origin in Africa, was supported now by research from genetics which indicated that Africans were the most basal clade of humans on a continental scale, so that non-Africans could be conceived of as a subset of Africans. From this originates the chestnut of wisdom that Africans have more genetic diversity than all other human populations combined. By the year 2000 one could say that the “Out of Africa” triumphalism had proceeded to the point where an almost exterminationist model had taken hold when it came to the relationships of anatomically modern H. sapiens, and other groups which had evolved outside of Africa over the past million or so years, such as the Neandertals. But the theoretical dichotomies were too coarse and absolute as it turns out. A division between multiregionalist phyletic gradualism, where H. sapiens evolved out of its hominin ancestors concurrently on a world wide scale, and a model of rapid expansion of one tribe in Africa to replace all others in totality, may have been warranted in the age of classical genetics and a morphometric analysis, but now we can look at the raw genomic material in a more fine-grained fashion. In fact, we can now look at the genomic patterns of variation among extinct hominins! Though there have long been hints from the genetic and morphological data that the expansion-and-replacement paradigm was too extreme, with the publication last spring in Science of a paper which made the claim for admixture between Neandertals and non-Africans in the range of 1-4% in all non-African groups, based on a comparison of Neandertal and modern human genetic variation, one can dismiss the idea that absolutist expansion-and-replacement is self-evidently true orthodoxy. But one orthodoxy has not given way to another, and the shock to the old models presented by the data has not resulted in the coalescence of new robust paradigms. We live in a time of scientific troubles, so to speak.

One of the more notable results in the Science paper from last spring was that all non-Africans had about the same admixture in relation to the Neandertal reference genome, ~1-4%. This means from the Orkneys to New Guinea. Because Neandertals were distributed only in the western half of Eurasia this implies that the admixture was an early event. By the time of modern human expansion across Eurasia, Australasia, and the New World, it had become equally distributed across the individuals within the population. Recall the contrast between African Americans and Uyghurs. Among the Uyghurs the ancestral quanta are equitably distributed from individual to individual, but among African Americans there remains substantial intra-population variance. The reason is that African Americans are quite new, an order of magnitude younger than the Uyghurs in a genetic sense, and admixture is still occurring into the African American population from the ancestral groups. The Uyghurs as we know them today genetically are probably ~1,000-2,000 years old (though their cultural origins are both more and less ancient, as a matter of linguistics in the former, and ethnic self-conception as a Muslim East Turkic group in the latter). The implication here is clear: there was a pause in the Out of Africa movement, where the proto-non-Africans mixed with a Neandertal group, possibly in the Middle East, and only began a massive demographic expansion after an unspecified sojourn. A paper from last spring makes this all explicit:

A more likely explanation for the OoA bottleneck is that Eurasia was populated by a larger population that had been relatively isolated from other modern human populations for tens of thousands of years prior to the expansion. The first fossil evidence for modern humans outside of Africa is in the Middle East at Skhul and Qafzeh between 80,000-100,000 years ago, which is at least 20,000 years prior to the Eurasian diaspora. If a population of modern humans remained in the Middle East until the expansion into Eurasia, there would have been sufficient time for genetic drift to reduce heterozygosity dramatically before the Eurasia expansion. This “Middle East isolation” hypothesis provides a robust explanation for the relative homogeneity of European and Asian populations relative to African populations (see Figures 3A-B) and is supported by a recent maximum likelihood estimate of 140,000 years ago for the time of Eurasian-West African population separation . Interestingly, a recent study of the Neandertal genome suggests that the non-African individuals, but not the Africans, contain similar amount of admixture (1-4%) with the Neandertals . The authors suggest that the admixture must have happened between the Neandertals with an ancestral non-African population before the Eurasian expansion. Given the fossil, archaeological, and genetic evidence, the Middle East isolation hypothesis warrants rigorous evaluation as whole-genome sequence data become available.
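The drift argument in that passage can be given a rough scale using the standard textbook expectation that heterozygosity decays as H_t = H_0 (1 − 1/(2Nₑ))^t in an isolated population of constant effective size. The sketch below is only that: the effective sizes and the 25-year generation time are my own assumptions for illustration, and only the 20,000-year span is taken from the quoted text.

```python
def heterozygosity_retained(n_e, generations):
    """Expected fraction of heterozygosity left after drift alone:
    H_t / H_0 = (1 - 1/(2*N_e)) ** t, for a constant-size isolated population."""
    return (1.0 - 1.0 / (2.0 * n_e)) ** generations


YEARS_ISOLATED = 20_000      # "at least 20,000 years" from the quoted passage
YEARS_PER_GENERATION = 25    # ASSUMPTION: a conventional round figure
generations = YEARS_ISOLATED // YEARS_PER_GENERATION

# HYPOTHETICAL effective population sizes, spanning an order of magnitude each way.
for n_e in (500, 1_000, 10_000):
    frac = heterozygosity_retained(n_e, generations)
    print(f"N_e = {n_e:>6}: about {frac:.0%} of heterozygosity retained "
          f"after {generations} generations")
```

The takeaway is that how "dramatic" the reduction is depends heavily on the effective size during the sojourn: drift bites hard in a population of hundreds, and barely at all in one of ten thousand.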

Now the same group has published a follow up paper in Genome Biology which fleshes out the Deep Time aspect of human evolutionary history by looking closely at the genetic variation of an under-sampled population: South Asians. You may have noticed that the HGDP populations include Pakistani groups as South Asian exemplars. That’s apparently because during the Permit Raj era in India the government was wary of cooperating with the HGDP consortium. But more recently the barriers have come down in India, and one can viably supplement the data sets with Indian Americans. So the GIH sample in HapMap3 consists of Gujaratis from Houston. At ~1.25 billion, or nearly 20% of the world’s population, South Asians are a critical portion of the “big picture” when it comes to world wide genetic variation.

Genetic diversity in India and the inference of Eurasian population expansion:

To analyze an unbiased sample of genetic diversity in India and to investigate human migration history in Eurasia, we resequenced one 100 kb ENCODE region in 92 samples collected from three castes and one tribal group from the state of Andhra Pradesh in south India. Analyses of the four Indian populations, along with eight HapMap populations (692 samples), showed that 30% of all SNPs in the south Indian populations are not seen in HapMap populations. Several Indian populations, such as the Yadava, Mala/Madiga, and Irula, have nucleotide diversity levels as high as those of HapMap African populations. Using unbiased allele-frequency spectra, we investigated the expansion of human populations into Eurasia. The divergence time estimates among the major population groups suggest that Eurasian populations in this study diverged from Africans during the same time frame (approximately 90-110 thousand years ago). The divergence among different Eurasian populations occurred more than 40,000 years after their divergence with Africans.

First, I want to put into the record that I think there are high enough uncertainties (evident in the confidence intervals in the paper itself) that we need to be careful about taking the divergence times from their results as values we’d bet the house on. Someone with a better knowledge of the fossils (e.g., John Hawks) or controversies about the mutational rates (e.g., Dienekes) can comment on the plausibilities of the dating. But, I think we can infer that there was a time lag closer to a 10,000 years order of magnitude than 1,000 years when it comes to the Middle Eastern sojourn of non-African humans.

The basic method here is that the research group zoomed in on a ~100 kb region of the genome, on chromosome 12, and surveyed their Indian populations, as well as the HapMap3 ones. This is important because the SNPs in the HapMap probably exhibit an ascertainment bias toward variants in European and other more widely surveyed groups. The fact that 30% of the SNPs in the South Indian groups seem not to be found among the HapMap populations confirms this hunch. Before digging into the details of the paper, let’s note that the South Indian groups are from the state of Andhra Pradesh: Brahmins, a lower caste group (Yadava), Dalits (Mala/Madiga), and a tribe (Irula). This is a case where even more thorough coverage is necessary. There is some suggestion that South Asian groups have a long history of endogamy and genetic peculiarities, which would limit the usefulness of extrapolations from this sample. Even within the HapMap Gujarati sample there seem to be two clusters when the PCA is used with reference to the European samples.

There are basically three portions of the paper:

– A survey of conventional population genetic statistics,

θ = 4Nₑμ (Nₑ = effective population size, μ = mutation rate)
π = nucleotide diversity
H = heterozygosity
D = Tajima’s D

– Measures of genetic distance between contemporary populations, Fst and PCA

– Finally, taking the genetic variance from the ~100 kb and plugging it into explicit models of human evolutionary history
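For readers who want to see how the first batch of statistics falls out of raw sequence data, here is a minimal sketch in Python. It is not the paper's pipeline. The function name `diversity_stats` and the toy haplotypes are my own assumptions, haplotypes are coded as strings of 0s and 1s, and θ is estimated with Watterson's estimator (segregating sites divided by a harmonic number), which is one standard way of estimating 4Nₑμ.

```python
from itertools import combinations
from math import sqrt

def diversity_stats(haplotypes):
    """Return (S, theta_w, pi, tajimas_d) for equal-length 0/1 haplotype strings."""
    n = len(haplotypes)
    # Segregating sites: alignment columns where both alleles are present.
    columns = list(zip(*haplotypes))
    S = sum(1 for col in columns if len(set(col)) > 1)
    # pi: mean number of pairwise differences between haplotypes.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)
    # Watterson's estimator of theta = 4*Ne*mu (for the whole region, not per site).
    a1 = sum(1 / i for i in range(1, n))
    theta_w = S / a1
    # Tajima's D: the difference between pi and theta_w, scaled by its
    # standard deviation (constants from Tajima 1989).
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = (pi - theta_w) / sqrt(e1 * S + e2 * S * (S - 1)) if S > 0 else float("nan")
    return S, theta_w, pi, d

# Hypothetical toy sample: six haplotypes, every variant seen exactly once.
haps = ["00000", "10000", "01000", "00100", "00010", "00001"]
S, theta_w, pi, d = diversity_stats(haps)
print(S, round(theta_w, 3), round(pi, 3), round(d, 3))
```

In this toy sample every variant is a singleton, so π comes out below Watterson's estimate and D is negative, the textbook signature of an excess of rare variants.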

Table 1 (I reformatted) shows the genetic statistics by “continent.” Indian includes some Gujarati individuals. They sampled out of the HapMap populations to equalize the numbers.


Some of these results are striking. The general truism is that Africans are the most diverse population in the world, but some of the South Indian groups are very diverse indeed. Of particular interest though is that some Indian groups are not very diverse at all. What’s going on here? Here you have to look at the specifics of each group. It is likely that South Indian Brahmins are the result of a relatively recent population expansion, with some uptake of other genes through hypergamy. A paper from last year argued that all Indian populations can be modeled as a two-way admixture of different quantities from two ancestral groups, Ancestral North Indians and Ancestral South Indians. The heterozygosity values may be explained in such a fashion, though the relatively low values for Gujaratis and Andhra Pradesh Brahmins would still surprise. Frankly, I’m just mostly confused by the diversity statistics. Probably the substructure through endogamy and population bottlenecks are obscuring broader dynamics. We can, though, conclude that the idea that all non-Africans are uniformly homogeneous in comparison to Africans may not hold water. Figure 2 above illustrates this by plotting heterozygosity vs. distance from Africa.

Next, let’s move to genetic distance. There are two ways you can look at this: a summary statistic like Fst, which partitions between and within population variance, and PCA, which visualizes the largest dimensions of variation in the data set. So you have both below (reedited for reasons of space):


In general the results are expected, but there are weird details. For example, the Brahmins from Andhra Pradesh are on the margins, when you’d expect them to cluster with the Gujaratis. The Gujaratis are closer to the Chinese from Denver than Utah Whites? This is a provisional paper, so I’m almost wondering if there’s a typo or coding error here, as I don’t understand how the GIH can be so close to the Tuscans and Chinese from Denver, and much further from the Northern Europeans and Chinese from Beijing. The two European and Chinese samples are rather close in other analyses.

So let’s get to the real deal. The modified Out of Africa model where non-Africans take a “break” after they leave the mother continent:


I’ve mashed up the figures. The models were generated by looking at allele frequencies. They took the variants they found by sequencing the ~100 kb on chromosome 12, which was in a very gene-poor region so as to bias it toward neutrality, and plugged them into a few models in the ∂a∂i program. I’ll jump to the text here:

…the divergence time between African and the ancestral Eurasian population (88-112 kya, CIs: 63-150 kya) is much older than the divergence time among the Eurasian groups (27-39 kya, CI: 20-59 kya). The more recent divergence time and the low migration rate estimates among the current Eurasian populations support the “delayed expansion” hypothesis for the human colonization of Eurasia (Figure 5). Consistent with previous studies…these estimates indicate that a single Eurasian ancestral population remained separated from African populations for more than 40 thousand years prior to the population expansion throughout Eurasia and the divergence of individual Eurasian populations.

Manafi al-Hayawan, Adam and Eve

Take a good look at those confidence intervals. We know that some of those have to be false: the bones don’t lie. From what little I know a very young consensus date for the settlement of Australasia by modern humans is 40,000 years ago. That happens nicely to be their median, but the dispersion toward younger dates is probably not right, unless Aborigines are a separate population who are remnants of an earlier wave of migrants (or the current Aborigines replaced earlier waves). It is also hard to reconcile these dates for the diversification of non-African humanity with very old dates for Chinese fossils which exhibit some elements of modern morphology.

In the broad outlines I think we can accept that the model outlined in this paper may be correct. It would explain the uniform admixture of Neandertal in non-Africans, since they’d need time as a compact population before demographic expansion to integrate the Neandertal genes as part of their genetic background. But before the Neandertal genome came out there were plenty of papers which purported to show how there was no archaic admixture in modern humans, and plenty of papers which did claim there was evidence for such admixture. The point is that these computational models are sensitive to their inputs, and being models they simplify what really happened. In the discussion the authors repeatedly observe that migration between the various non-African demes doesn’t affect the outcome. That is fine, but there is modestly strong evidence that the Indian samples that they’re using are an admixed population of old. That would make me skeptical of claims about dating the separation of “Indians” when Indians are themselves possibly a compound between other groups.

Below is the model presented from Reconstructing Indian population history:


The teens of this century are going to be very exciting when it comes to reconstructing human evolutionary history. You’d be a fool to put bets on any horse at this time.

Addendum: I need a term for non-African humanity. So I’m making up one right now: Eurausicans. From Eurasians, Australasians, and Americans.

Citation: Jinchuan Xing, W Scott Watkins, Ya Hu, Chad D Huff, Aniko Sabo, Donna M Muzny, Michael J Bamshad, Richard A Gibbs, Lynn B Jorde, & Fuli Yu (2010). Genetic diversity in India and the inference of Eurasian population expansion Genome Biology : 10.1186/gb-2010-11-11-r113

Despite the reality that I’ve cautioned against taking PCA plots too literally as Truth, unvarnished and without any interpretive juice needed, papers which rely on them are almost magnetically attractive to me. They transform complex patterns of variation which you are not privy to via your gestalt psychology into a two or at most three dimensional representation which you can grok immediately. That is why History and Geography of Genes was so engrossing. You recognize patterns which were otherwise unrecognizable. But how you interpret those patterns, that’s a wholly different matter. And how those patterns arise is also not something one can ignore.

First, let’s start with an easy case. To the left is a PCA plot with four populations. Nigerians, East Asians (Chinese + Japanese), Europeans (whites from Utah), and finally, African Americans. The x-axis is the first principal component of variation, and the y-axis the second. That means that the x-axis is the independent dimension of variation within the patterns of genetic data which explains the largest fraction of the total amount of genetic variation. The sum totality of the variation can be decomposed into a large set of independent dimensions which can be rank ordered from the largest explanatory components to the smaller ones, successively by number. In a human genetic context the first principal component invariably separates Africans from non-Africans, and the second principal component often maps onto a west-east axis from Europe to the New World. Subsequent principal components can often be useful in smoking out fine scale distinctions, or relationships which are confused by the existence of similar but different signals in admixed populations.

The interpretation of this plot is rather easy. You see that African Americans lie along a continuum between Nigerians and Europeans, skewed toward Nigerians, with some outliers toward East Asians. We know from other genetic findings that ~20% of the African American ancestral quantum is European, but that quantum is not equally distributed across the population. ~10% of the African American population is more than 50% European in ancestry, while 90% is less than 50% European. And so you have a distribution which reflects this variation. As for the outliers, I will speculate and suggest that these are indications of Native American ancestry among some African Americans.
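To make the geometry of that continuum concrete, here is a toy simulation in Python. It is a sketch under loud assumptions, not the analysis behind the plot: the allele frequencies are random numbers, the two source populations are just called A and B, the admixed group's ancestry is drawn from a Beta(2, 8) distribution so that it averages about 20% B, and PC1 is found by plain power iteration so the example needs nothing beyond the standard library.

```python
import random
from statistics import mean

def simulate_genotypes(n_per_group=20, n_snps=500, seed=1):
    """Return (ancestry fractions, 0/1/2 genotypes) for groups A, B, admixed."""
    rng = random.Random(seed)
    # Hypothetical allele frequencies for the two source populations.
    freq_a = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
    freq_b = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
    # Fraction of ancestry from B: 0 for A, 1 for B, and a skewed spread
    # averaging about 20% for the admixed group.
    fractions = ([0.0] * n_per_group + [1.0] * n_per_group
                 + [rng.betavariate(2, 8) for _ in range(n_per_group)])
    genotypes = []
    for f in fractions:
        row = []
        for pa, pb in zip(freq_a, freq_b):
            p = (1 - f) * pa + f * pb  # expected allele frequency for this person
            row.append(int(rng.random() < p) + int(rng.random() < p))
        genotypes.append(row)
    return fractions, genotypes

def first_pc(genotypes, iterations=200, seed=0):
    """Coordinates of each individual on PC1, via power iteration."""
    n, m = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - col_means[j] for j in range(m)] for row in genotypes]
    # Gram matrix of individuals; its top eigenvector gives the PC1
    # coordinates up to scale and sign.
    gram = [[sum(a * b for a, b in zip(centered[i], centered[k])) for k in range(n)]
            for i in range(n)]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(g * c for g, c in zip(gram_row, v)) for gram_row in gram]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

fractions, genotypes = simulate_genotypes()
pc1 = first_pc(genotypes)
if mean(pc1[20:40]) < 0:  # the sign of a PC is arbitrary; make B positive
    pc1 = [-c for c in pc1]
pc1_a, pc1_b, pc1_mixed = mean(pc1[:20]), mean(pc1[20:40]), mean(pc1[40:])
print(f"A: {pc1_a:.3f}  admixed: {pc1_mixed:.3f}  B: {pc1_b:.3f}")
```

The admixed individuals land between the two source clusters and closer to A, simply because their expected allele frequencies are weighted averages of the two sources. That is all the "continuum" on a plot like this is.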

The story I presented above is probably plausible as an explanation of the visual because we have a wealth of historical data to corroborate the plausibility of that narrative. The fit between the results from the technique of analysis of genetic variation and what scholars have long inferred from textual sources is relatively easy. It is far more difficult to look at a PCA plot, and generate a plausible narrative that you yourself accept with a high degree of confidence with little external support. It is with that caveat in mind that I present Toward a more uniform sampling of human genetic diversity: A survey of worldwide populations by high-density genotyping:

High-throughput genotyping data are useful for making inferences about human evolutionary history. However, the populations sampled to date are unevenly distributed, and some areas (e.g., South and Central Asia) have rarely been sampled in large-scale studies. To assess human genetic variation more evenly, we sampled 296 individuals from 13 worldwide populations that are not covered by previous studies. By combining these samples with a data set from our laboratory and the HapMap II samples, we assembled a final dataset of ~ 250,000 SNPs in 850 individuals from 40 populations. With more uniform sampling, the estimate of global genetic differentiation (FST) substantially decreases from ~ 16% with the HapMap II samples to ~ 11%. A panel of copy number variations typed in the same populations shows patterns of diversity similar to the SNP data, with highest diversity in African populations. This unique sample collection also permits new inferences about human evolutionary history. The comparison of haplotype variation among populations supports a single out-of-Africa migration event and suggests that the founding population of Eurasia may have been relatively large but isolated from Africans for a period of time. We also found a substantial affinity between populations from central Asia (Kyrgyzstani and Mongolian Buryat) and America, suggesting a central Asian contribution to New World founder populations.

The studies which came out of the original HapMap had northern Europeans, Yoruba from Nigeria, and Chinese & Japanese. These three populations can tell us a lot, but there’s something lacking in the coverage. The HGDP sample is better. But specifically because of political considerations it was not feasible to collect Indian samples, so Pakistani ones are used in their stead. Additionally, the HGDP sample is a touch biased toward isolated and distinctive populations, such as the Kalash of Pakistan. This genetic distinctiveness is important to catalog because it is fast disappearing. But the Kalash are so unique because of their long history of isolation, so one can’t really use them as a proxy population for Pakistanis, as one could with Sindhis. The POPRES sample seems to complement the HGDP well, but I don’t see it being used so much. Since the next phase of the HapMap has more populations, some of the deficiencies which emerged with the utilization of just three terminal groups (in a World Island context) will soon no longer be an issue.

But until that time it’s nice when studies come out which close some of the gaps in our knowledge of worldwide genetic variation. This is one such study. I’m somewhat familiar with the samples already because I’ve seen them in an analysis of Indian populations. It seems that the sampling is somewhat skewed toward South and Southeast Asian populations, but hey, these are groups which need to draw the long straw sometimes as well.

Before I go any further I should mention that they use a SNP-chip with hundreds of thousands of markers. Additionally, they looked at copy number variation. Two rather different types of variation within the genome, probably to double check that the outcomes were the same. Population historical events which shape patterns of genomic variation would presumably have a similar large scale effect on both types of variation. In their results that checked out, or so they claimed, as the paper is a manuscript without the supplements attached.

Though there’s some interesting fine-grained analysis to be had, they draw some macro-scale and deep time inferences as well. First, you probably know the famous fact that 15% of variation in genes is between races, and 85% within races. That’s derived from the Fst statistic, which is basically partitioning between and within population variance across two populations. Obviously the value of Fst varies by the set of populations you’re comparing. That between Mbuti Pygmies and Japanese is far higher than between Chinese and Japanese. Using the HapMap the Fst was 16%. About what you’d expect. To equalize sample sizes with the HapMap they randomly selected individuals from a pooled set grouped by continent from their populations, and calculated Fst. They found values around 11%. Why the difference? Because their data set included populations which were between the three clusters within the HapMap.
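Here is a small Python sketch of why pooling in intermediate populations pulls Fst down. The allele frequencies are invented for illustration, the helper names `fst` and `pool` are my own, and I use the simple Nei-style definition Fst = (H_T - H_S) / H_T with equal weights per group, which is not necessarily the estimator the authors used.

```python
def fst(freqs_by_pop):
    """Nei-style Fst = (H_T - H_S) / H_T, summed over biallelic SNPs.

    freqs_by_pop: list of populations, each a list of allele frequencies
    (one per SNP). Populations are weighted equally.
    """
    n_pops = len(freqs_by_pop)
    h_s_total = h_t_total = 0.0
    for snp_freqs in zip(*freqs_by_pop):
        p_bar = sum(snp_freqs) / n_pops
        # H_T: expected heterozygosity if everyone were one population.
        h_t_total += 2 * p_bar * (1 - p_bar)
        # H_S: average expected heterozygosity within populations.
        h_s_total += sum(2 * p * (1 - p) for p in snp_freqs) / n_pops
    return (h_t_total - h_s_total) / h_t_total

def pool(*pops):
    """Average allele frequencies across populations, SNP by SNP."""
    return [sum(f) / len(f) for f in zip(*pops)]

# Hypothetical frequencies at two SNPs for three "terminal" populations...
t1, t2, t3 = [0.1, 0.9], [0.5, 0.5], [0.9, 0.1]
# ...and two populations that sit genetically between them.
m12, m23 = [0.3, 0.7], [0.7, 0.3]

fst_terminal = fst([t1, t2, t3])
fst_pooled = fst([pool(t1, m12), t2, pool(t3, m23)])
print(round(fst_terminal, 3), round(fst_pooled, 3))
```

With only the three terminal groups the toy Fst is about 0.43. Once each outer group is pooled with an intermediate population it falls to 0.24, even though nothing about the underlying populations has changed. Only the sampling has.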

This is naturally not a surprising result at all, but it does reiterate one issue which sometimes crops up: Platonism in relation to race. The northern European whites in the HapMaps are the whites par excellence. Turks, who are perhaps more centrally located in the genetic variation of West Eurasian and North African peoples, what used to be termed “Caucasoid,” are “less white.” Similarly, Nigerians are more African than Ethiopians. Chinese and Japanese are more Asian than Burmese. And so forth. When modeling between group differences there is I think a somewhat old-fashioned tendency to consider some populations racial archetypes. That modulates the input which modifies the results somewhat. The analytical techniques may be as cold as stone, but they are used by flesh and blood human beings.

There is also some funny business going on with haplotype and SNP heterozygosities which I think needs to be highlighted, and speaks to the fact that SNP-chips are not perfect. They’re tools, and human tools are impacted by arbitrary or instrumental choices humans make. Let me quote:

We also compared the SNP and haplotype heterozygosity values in each population (Figure 2B). These two quantities are generally highly correlated, although there are several exceptions: First, SNP heterozygosity is higher than haplotype heterozygosity in European and Central Asian populations. This may reflect a SNP ascertainment bias, since many of these polymorphisms were historically selected to maximize heterozygosity in European populations. Second, the Pygmy sample shows a low SNP heterozygosity despite relatively high haplotype heterozygosity. This unusual pattern could be caused by stronger effects of SNP ascertainment bias in this population than in others. Indeed, a recent study of Khoisan individuals (another hunter-gatherer group from Africa) showed a similar pattern: despite high SNP heterozygosity (~60%) in whole-genome sequence data, a Khoisan individual showed low heterozygosity on the SNP microarray genotypes (~22%) . Alternatively, this difference could also reflect unique attributes of population history.

In plain English the gene chips were designed with Europeans in mind, so they don’t necessarily pick up all the variation in non-European groups, who are believe it or not genetically different. This issue cropped up (as alluded to in the above text) with the recent paper which sequenced some Bushmen as well as Desmond Tutu. The Bushmen have a lot of variation, this is well known, but they have variation at markers where Europeans don’t, and if Europeans don’t the chips may not look for polymorphism at that locus. This sort of thing probably doesn’t affect broad population relationships, but if you want to zoom in and do analysis which is sensitive to fine distinctions and quantitative differences, then it might be problematic.
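A toy simulation in Python shows how strong this effect can be. Everything here is an assumption made for illustration: the two populations are labeled E and K, their allele frequencies are drawn independently from Beta distributions (real populations share far more history than that), and the "chip" simply keeps sites where the variant is common in E.

```python
import random
from statistics import mean

def het(p):
    """Expected heterozygosity of a biallelic site with allele frequency p."""
    return 2 * p * (1 - p)

def simulate_chip(n_sites=20000, seed=42):
    rng = random.Random(seed)
    # Hypothetical, independent allele frequencies per site. Population K is
    # given the less U-shaped spectrum, so it is truly more diverse than E.
    sites = [(rng.betavariate(0.4, 0.4), rng.betavariate(0.7, 0.7))
             for _ in range(n_sites)]
    # "Chip design": keep only sites that are common variants in population E.
    chip = [(pe, pk) for pe, pk in sites if 0.2 <= pe <= 0.8]
    return {
        "true_E": mean(het(pe) for pe, _ in sites),
        "true_K": mean(het(pk) for _, pk in sites),
        "chip_E": mean(het(pe) for pe, _ in chip),
        "chip_K": mean(het(pk) for _, pk in chip),
    }

result = simulate_chip()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Across all sites K is the more diverse population, but on the chip sites E looks more diverse, because the sites were chosen for being variable in E. The independence assumption exaggerates the reversal, but the direction of the bias is the point.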

Let’s jump to the pretty charts. First, a PCA plot with all of the individuals from all of the populations:


Note that PC 1 accounts for nearly eight times as much variation as PC 2. This speaks to the African vs. non-African gap. Because their data set is relatively thick in “intermediate” groups you see a spectrum. The vertical axis is obviously mostly east-west. And here’s the accompanying bar plot derived from the ADMIXTURE program. K = putative ancestral populations.


With this many populations at K = 12 I think you could write a fantasy novel worthy of Tolkien. K = 4 is more realistic. Among the African populations you see likely Eurasian admixture in some eastern, and it seems Bushmen, individuals. In Eurasia itself you see a clinal gradation of admixture between putative ancestral components that seems to follow longitude rather well.

Because so much of the variation in the total sample is due to Africans, removing them from the picture will allow us to focus more on the relationships of the Eurasian groups. And so that’s exactly what they did. Note that focusing on the Eurasian groups does not mean simply magnifying or zooming in on the Eurasian section of the PCA plot, rather, the plots are regenerated with a subset of the previous genetic variation. In other words, the dimensions will shake out a bit differently.

The first plot shows Eurasian populations as a whole. The second removes Europeans and Near Easterners.


Notice again the scale. The vast majority of the variance seems to be east-west. But, there is a noticeable north-south split. For the South Asian population it looks like they had Pakistanis who were farmers of modest means (Arain), high caste South Indians, and very low caste or tribal South Indians. For this Indian sample there’s a problem, and it’s the same problem which plagued the Up Series: they are looking at the very top and bottom of Indian society and ignoring the middle. Presumably the middle is going to be somewhere in the middle genetically as well, but nevertheless that’s something to consider in a paper which presumes to fill in the patchiness of others. In contrast, the Nepali sample was notably ethnically diverse, including both the dominant Indo-Aryan segment as well as the Tibeto-Burman Newar.

In the first panel there are some curious patterns with the Southeast Asian groups. Culturally, as in language and history, the Thai and Vietnamese have relatively recent roots in the southern regions of modern China. The Dai of Yunnan are the same people in origin as the Thai of Thailand and the Lao of Laos. Both derive from migrations from Yunnan. This is historically attested, even if somewhat fragmentarily. The heartland of the Vietnamese was in the Red River valley and north into southern China, and they spread down the coast and toward the Mekong only within the last 1,000 years. Southeast Asia was not uninhabited during this period. It was dominated by the Khmer Empire, which was slowly consumed by the expanding Thai and Vietnamese polities. Some scholars argue that French colonialism actually preserved an independent Khmer nation, which otherwise would have been divided between Thailand and Vietnam, as Poland was between Germany and Russia. So the Khmer are the indigenous people, while the Thai and Vietnamese are intrusive.

What do the PCA plots tell us? I do not know where the Vietnamese samples were collected. If they were from South Vietnam, then their close position to the Chinese suggests to me that there was substantial demographic replacement or expansion from the Red River valley. In contrast, the Thai are relatively distant from the Chinese. In fact, the Cambodians are somewhat closer to the Chinese! The samples here are small, and the sets overlap, so I wouldn’t put too much stock in that. But, Thailand is geographically closer to South Asia, so isolation by distance models would predict this pattern. It seems that the ethnogenesis of the Thai occurred through the expansion of the Thai identity, likely among Khmer peoples. And it is intriguing that the Iban, an indigenous people of Borneo, are closer to the Vietnamese than they are to the Cambodians. We know that there was substantial migration between coastal Vietnam and Maritime Southeast Asia; the Chams of central Vietnam, who were dominant in the southern half of the nation before the Vietnamese expansion, are a Malayan people who may have migrated from Borneo.

Shifting to the second panel there’s more here to say about the South Asians. First, geography. The two lower caste groups are actually Dalits from Andhra Pradesh, a South Indian state. Dalits used to be called outcastes, so they aren’t even lower caste, but without caste. The upper caste groups are Brahmins from Andhra Pradesh and Tamil Nadu. Finally, the Irula are tribal people from Tamil Nadu. To me the tribal samples often produce weird results, and I suspect that has to do with population bottlenecks and their demographic isolation. People leave the tribes (becoming part of the Hindu society, or converting to Islam or Christianity), but few join them. The Pakistani sample are Arain, a group of conventional Punjabi farmers who have a made up ancestry from Arabs (obviously made up because they don’t cluster with Near Easterners). Let’s compare to a chart from Reich et al.:


It seems to me that they’re in rough agreement (Reich et al. uses the same two low caste groups for Andhra Pradesh for low caste South Indians by the way). Though South Indian Brahmins speak South Indian languages, and reside amongst other South Indian groups, their genetic heritage is somewhat different. Similarly, tribal peoples are also distinct from caste Hindus. Reich et al. posit that South Asians can be modeled as a composite of two groups, Ancestral North Indians, ANI, and Ancestral South Indians, ASI. Presumably the former are intrusive to the subcontinent in relation to the latter. There seem to be two clear dimensions along which the ratio of ANI to ASI varies: geography and caste. The proportion of ASI seems to increase from the northwest to the southeast. And, the proportion of ANI seems to increase from tribal to low caste to upper caste. The Pakistani sample does not seem to be from an elite caste (or it does not seem they were converted from an elite caste), but they have more affinity with West Eurasian populations than South Indian Brahmins. It is likely that the latter are intrusive to the south, and have admixed with the local population.

Finally, a word on the Nepali sample. On top of the ANI-ASI mixture, the Nepali groups have varying levels of Tibeto-Burman, and so East Asian, affinity. This is not a surprise if you have met Nepalis. The Assamese, and to a lesser extent Bengalis, also exhibit this pattern of Tibeto-Burman admixture. The Brahmins of Nepal are intrusive like the Brahmins of South India, and like the South Indians they admixed with the local substrate.

Next let’s move to an ADMIXTURE plot.


The selection of a particular K obviously is conditioned by the patterns which “fit” with what you know, and what you expect. With that caution aired, the population represented by red can easily be thought of as a Middle Eastern group which expanded with agriculture. That seems to be what the authors favor. The brown population is the modal Indian ancestral population, which has little presence outside the subcontinent (nice color coding by the way! Brown people are brown). A green color represents a population which the tribal group, the Irula, are heavily weighted on. This reminds me too much of the Kalash. I suspect that the Irula went through some bottleneck or other distinctive event, and some have assimilated to various low status groups in South India.

I’m not a fantasist intent on world-building, so I’ll stop with that in reading the tea leaves of the charts. But there’s an important section which I skipped over, and will move back to now. And that’s the deep time aspect:

A more likely explanation for the OoA bottleneck is that Eurasia was populated by a larger population that had been relatively isolated from other modern human populations for tens of thousands of years prior to the expansion. The first fossil evidence for modern humans outside of Africa is in the Middle East at Skhul and Qafzeh between 80,000-100,000 years ago, which is at least 20,000 years prior to the Eurasian diaspora. If a population of modern humans remained in the Middle East until the expansion into Eurasia, there would have been sufficient time for genetic drift to reduce heterozygosity dramatically before the Eurasia expansion. This “Middle East isolation” hypothesis provides a robust explanation for the relative homogeneity of European and Asian populations relative to African populations (see Figures 3A-B) and is supported by a recent maximum likelihood estimate of 140,000 years ago for the time of Eurasian-West African population separation . Interestingly, a recent study of the Neandertal genome suggests that the non-African individuals, but not the Africans, contain similar amount of admixture (1-4%) with the Neandertals . The authors suggest that the admixture must have happened between the Neandertals with an ancestral non-African population before the Eurasian expansion. Given the fossil, archaeological, and genetic evidence, the Middle East isolation hypothesis warrants rigorous evaluation as whole-genome sequence data become available.

Like the vast majority of genetic studies this work supports the Out of Africa hypothesis. Non-Africans are all branches from a specific African branch. Or more accurately, an African branch which left Africa. The reduction in heterozygosity, a measure of genetic variation, from Africans to Eurasians was large. Additionally, within Africa south of the Sahara there’s little difference in heterozygosity as a function of geography, but outside of Africa it drops off as a function of distance from Africa. A plausible model then is a radiation from a small ancestral population to the four corners of the world, going through a series of bottlenecks along the way. Or at least that’s a model supported by genomic data. But the drop in heterozygosity is so great that a quick separation from the parental African population would require an implausibly small number of founders (fewer than 10 in one generation). So, to explain the data, they are suggesting here that the original population was not quite so small, but was isolated from the large African population for thousands of years. They assume genetic drift reduced heterozygosity, but if the model is correct I suspect that the way it worked was that bottlenecks due to climatic fluctuations swept clean a lot of the genetic variation. But in the interregnum the isolated population may have interbred with Neandertals. In fact, perhaps they picked up genes from Neandertals when their own effective population was extremely small.
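The trade-off between a tiny founding group and a long isolation follows from the standard drift result that heterozygosity decays as H_t = H_0 × (1 − 1/(2Nₑ))^t. Here is a back-of-the-envelope Python sketch. The 7% drop in heterozygosity and the 1,000 generations of isolation are numbers I picked for illustration, not values from the paper, and the function names are my own.

```python
def het_after_drift(h0, n_e, generations):
    """Expected heterozygosity after drift at constant effective size n_e."""
    return h0 * (1 - 1 / (2 * n_e)) ** generations

def size_needed(ratio, generations):
    """Constant effective size that leaves `ratio` of the starting heterozygosity."""
    per_generation = ratio ** (1 / generations)
    return 1 / (2 * (1 - per_generation))

# Hypothetical target: heterozygosity falls to 93% of its starting value.
one_generation = size_needed(0.93, 1)        # an instantaneous bottleneck
long_isolation = size_needed(0.93, 1000)     # e.g. ~25,000 years at 25 years/generation
print(round(one_generation, 1), round(long_isolation))
```

Under these assumed numbers a single-generation bottleneck needs only about seven individuals, while the same loss spread over a thousand generations is consistent with an effective population in the thousands. That is the shape of the argument for a larger but long-isolated founding population.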

In any case, a wide ranging paper. They manage to tie their results into two other blockbuster papers.

H/T Dienekes

Citation Xing J, Watkins WS, Shlien A, Walker E, Huff CD, Witherspoon DJ, Zhang Y, Simonson TS, Weiss RB, Schiffman JD, Malkin D, Woodward SR, & Jorde LB (2010). Toward a more uniform sampling of human genetic diversity: A survey of worldwide populations by high-density genotyping. Genomics PMID: 20643205


I assume by now that everyone has read A Draft Sequence of the Neandertal Genome. It’s free to all, so you should. At least look at the figures. Also, if you haven’t at least skimmed the supplement, you should do that as well. It’s nearly 200 pages, and basically feels more like a collection of minimally edited papers than anything else. There’s no point in me reviewing the paper, since you can read it, and plenty of others have hit the relevant ground already.

Since there seem to be three main segments of the paper, here are a few minimal thoughts on each.

First, the draft genome. What would you have said if someone came up to you ten years ago and told you that you’d live to see this? Svante Paabo himself admitted he didn’t think he’d see something like this in his lifetime. There was a lot of hard work that went into figuring out how to get at the genetic material, purify it, and confirm that it was actually from the samples in question and not handler contamination and such (remember that there was a problem with contamination a few years back). To a great extent the focus on the results, instead of the methods, is like critiquing a set of landscape photographs taken from a very high peak. We can’t forget the effort and energy that went into scaling the peak itself. A lot of labor input obviously went into this, but additionally we can thank the fact that we live in a technological society where progress is not only expected, but often can’t be accounted for in our projections of future possibilities. I think that’s a very hopeful thing which makes me a little less pessimistic about the possibility of the magic carpet economy.

Second, there are the comparisons between Neandertals, modern humans, and chimpanzees. As Carl Zimmer noted there is an alphabet soup of genes thrown at you in the results. It is hard to make sense of it all, though I did note that genes involved in skin function and phenotype seem to have been the subject of differential evolution between Neandertals and modern humans (i.e., SNP differences in regards to substitutions in the lineages). We already know that there are suggestive signs that Neandertals lost function on pigmentation independently from modern humans. That shouldn’t be too surprising, given that it seems that West and East Eurasians evolved light skin independently. There are some uncertainties about the timing of this, but the different genetic architecture implies that it was unlikely to have occurred immediately after the Out-of-Africa event, and in fact some of the loci imply that depigmentation may have occurred in the Holocene. Skin is famously our biggest organ, so it shouldn’t be that shocking that it is possibly a target of selection, but curious nonetheless (recall that it seems that humans evolved darker skin from a paler ancestor as we lost our fur in the tropics).

Additionally, I think the finding that Neandertals and modern humans seem to share most of the same HARs, regions of the genome where our human lineage seems to differ from other mammals in exhibiting a lot of evolutionary change, is of great interest, though not necessarily surprising. When pointing to Luke Jostins’ post on rates of encephalization, I observed that in some ways it seems like there was a very powerful and consistent lineage specific trend toward greater cranial capacity which had incredible time depth. In The Dawn of Human Culture Richard Klein puts the emphasis on the sharp break between those populations before ~50 thousand years ago, and after. This period is marked by the shift toward behavioral, as opposed to just anatomical, modernity (there were anatomically modern humans in Africa ~200 thousand years ago). Klein’s thesis is that some mutation triggered a radical biocultural change, and was responsible for the Great Leap Forward, the efflorescence of creative symbolic culture which we truly consider the sine qua non of culture. The sharing of HARs between Neandertals and pure humans, and the consistent trend toward encephalization (aside from the post-Ice Age reversal), makes me shift the priors a touch more toward inevitable continuity and away from contingency. I find much of the politics of Robert J. Sawyer’s Neanderthal Parallax series a bit heavy-handed, but his depiction of Neandertals as fundamentally intelligent creatures who differ only on the margins seems a lot more plausible to me now than it was when I first read it in the early to mid-aughts.

Third, and finally, there’s the story of admixture and sex. This is getting all the press, but of course this is the most uncertain, inferential, and speculative aspect of the paper. It’s impressive, but it should be open to skepticism, especially after the Out-of-Africa totalism which was ascendant until recently. John Hawks accepts the thrust of the findings, but obviously has his own ideas as to modifications, extensions and qualifications. Dienekes Pontikos favors an alternative interpretation of the data, which the authors point to in the text but dismiss as less parsimonious. My own inclination is to favor the authors in their interpretation of parsimony, but I will admit that this assertion is disputable. Dienekes and others would suggest that it is just as, or more, plausible that the shared variants between non-Africans and Neandertals arise from their common northeast African ancestral population (or some ancestral population of non-Africans and Neandertals). He rightly points out that there may be ancient population substructure within Africa, and using a particular African group as a “reference” for the whole continent may lead to false inferences. The main issue is that the probability of retrieving ancient DNA from northeast African samples in the near future seems low because the conditions for preservation are not optimal (tropical climates famously degrade and recycle biological material more efficiently than temperate or boreal climates). Additionally, using modern northeast African populations is somewhat problematic because there has clearly been some back-migration from the nearby Arabian populations into this area in the medium-term past (the languages of the Ethiopian highlands are Semitic). One supposes that one could differentiate between the African and Arabian components of the genome of Ethiopians and Somalis, but if the admixture event was two to three thousand years ago I presume it would be technically more challenging than for an African American, where too few generations have passed since admixture for recombination to fragment the long genomic regions attributable to one ancestral population. In other words, how do you distinguish Neandertal variants which arrived back from Eurasia from ancient African ones? (I suppose that the haplotypes would differ, so that the genuinely African ones would be more diverse.)
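
The recombination argument above can be made concrete with a back-of-the-envelope sketch. It assumes a single pulse of admixture and the common rough approximation that ancestry tracts average about 1/g Morgans in length after g generations, ignoring the admixture fraction and any later gene flow. The 25-year generation time, the example dates, and the function name are illustrative assumptions, not figures from the paper.

```python
def expected_tract_length_cm(years_since_admixture, generation_years=25.0):
    """Approximate mean ancestry tract length in centimorgans.

    Assumes a single admixture pulse and a mean tract length of roughly
    1/g Morgans after g generations (a rough approximation only).
    """
    generations = years_since_admixture / generation_years
    if generations <= 0:
        raise ValueError("years_since_admixture must be positive")
    return 100.0 / generations  # 1 Morgan = 100 centimorgans


# Illustrative scenarios; the dates are assumptions chosen for comparison.
scenarios = {
    "recent admixture (~250 years)": 250,
    "older admixture (~2,500 years)": 2500,
}

for label, years in scenarios.items():
    print(f"{label}: ~{expected_tract_length_cm(years):.1f} cM")
```

Under these assumptions the older event leaves tracts about ten times shorter, which is why assigning segments to one ancestral population gets harder as the admixture date recedes.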

But even if you reject the top-line finding, that most of us are not pure human, I think the paper is a game-changer in terms of shifting your priors in relation to evaluating the plausibility of a result which suggests admixture from an ancient non-African population. I found out about the high likelihood of this paper just before the UNM results were presented at the American Anthropological Society meeting, and it is clear in hindsight, given the large author list, that many people knew what was coming down the pipeline and had recalibrated their assessment of results which indicated admixture. It is perhaps time to go back and take a second look at papers which you skipped over before because it seemed that they may have been spurious or reporting a statistical quirk because they lay outside of the orthodox paradigm. This is clearly a case where it is good to live in interesting times.

Citation: Green, R., Krause, J., Briggs, A., Maricic, T., Stenzel, U., Kircher, M., Patterson, N., Li, H., Zhai, W., Fritz, M., Hansen, N., Durand, E., Malaspinas, A., Jensen, J., Marques-Bonet, T., Alkan, C., Prufer, K., Meyer, M., Burbano, H., Good, J., Schultz, R., Aximu-Petri, A., Butthof, A., Hober, B., Hoffner, B., Siegemund, M., Weihmann, A., Nusbaum, C., Lander, E., Russ, C., Novod, N., Affourtit, J., Egholm, M., Verna, C., Rudan, P., Brajkovic, D., Kucan, Z., Gusic, I., Doronichev, V., Golovanova, L., Lalueza-Fox, C., de la Rasilla, M., Fortea, J., Rosas, A., Schmitz, R., Johnson, P., Eichler, E., Falush, D., Birney, E., Mullikin, J., Slatkin, M., Nielsen, R., Kelso, J., Lachmann, M., Reich, D., & Paabo, S. (2010). A Draft Sequence of the Neandertal Genome. Science, 328 (5979), 710-722. DOI: 10.1126/science.1188021

Alan Templeton, whose text Population Genetics and Microevolutionary Theory is right below Hartl & Clark in my book, recently published a strongly worded paper, Coherent and incoherent inference in phylogeography and human evolution. The possibility of statistical errors in published work is not shocking; I have heard that when statisticians are asked to sort through papers in medical genetics journals they find elementary errors in ~3/4 of those which have made it through peer review. That being said, Templeton seems to be making a stronger case than simple refutation of basic errors. In particular, he is suggesting that the “ABC” method which lies at the heart of the paper I reviewed last week is incoherent at the root. Here’s Templeton’s abstract:

A hypothesis is nested within a more general hypothesis when it is a special case of the more general hypothesis. Composite hypotheses consist of more than one component, and in many cases different composite hypotheses can share some but not all of these components and hence are overlapping. In statistics, coherent measures of fit of nested and overlapping composite hypotheses are technically those measures that are consistent with the constraints of formal logic. For example, the probability of the nested special case must be less than or equal to the probability of the general model within which the special case is nested. Any statistic that assigns greater probability to the special case is said to be incoherent. An example of incoherence is shown in human evolution, for which the approximate Bayesian computation (ABC) method assigned a probability to a model of human evolution that was a thousand-fold larger than a more general model within which the first model was fully nested. Possible causes of this incoherence are identified, and corrections and restrictions are suggested to make ABC and similar methods coherent. Another coalescent-based method, nested clade phylogeographic analysis, is coherent and also allows the testing of individual components of composite hypotheses, another attribute lacking in ABC and other coalescent-simulation approaches. Incoherence is a highly undesirable property because it means that the inference is mathematically incorrect and formally illogical, and the published incoherent inferences on human evolution that favor the out-of-Africa replacement hypothesis have no statistical or logical validity.
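
The coherence constraint in the abstract is simple to state as a check: a nested special case may never be assigned a higher probability than the general model that contains it. Below is a minimal sketch of that check. The model names and probabilities are hypothetical placeholders chosen only to mimic a thousand-fold gap, and the function is illustrative; it is not Templeton's procedure or part of any ABC software.

```python
def incoherent_pairs(probabilities, nested_in):
    """Return (special, general) pairs where P(special) > P(general).

    probabilities: dict mapping model name -> assigned probability
    nested_in: dict mapping a special-case model -> the general model
               within which it is nested
    """
    violations = []
    for special, general in nested_in.items():
        if probabilities[special] > probabilities[general]:
            violations.append((special, general))
    return violations


# Hypothetical values for illustration only (not taken from any paper).
probabilities = {"replacement": 0.5, "general_admixture": 0.0005}
nested_in = {"replacement": "general_admixture"}

print(incoherent_pairs(probabilities, nested_in))
```

With these placeholder numbers the special case outranks the model that contains it, so the pair is flagged; reversing the two values would return an empty list.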

The method which Templeton favors is naturally one which he has pushed in the past. In any case, I don’t know the statistical details well enough to comment with much knowledge, but I see that a statistician has responded to Templeton already, so I would recommend checking that out. I immediately went looking for responses because the paper uses really strong and dismissive language, and I am somewhat wary of that sort of thing when attempting to tear down the fundamentals of a whole field of research (I want to emphasize that overall I enjoy Templeton’s work, but the paper reminded me a bit too much of Jerry Fodor). His citation of Popper in particular seems an appeal to authority that aims to convince the non-statisticians in the audience, and I don’t see the point of that besides rhetorical utility. I do tend to accept, to some degree, Templeton’s critique of models which assume very little gene flow between hominin populations before the Out-of-Africa migration, though from what I can tell it does seem that Africa has had relatively little back-migration south of the Sahara over the past 50,000 years, so perhaps this is an older dynamic as well. I am cautiously optimistic that DNA extraction from fossils themselves may put to bed some of these arguments over the dance of parameters, though naturally interpretation is always an issue outside of pure mathematics.

For what it’s worth, here’s the model which Templeton’s method favors:


The thin lines represent continuous gene flow between populations, and the thick lines extremely strong demographic & genetic pulses which overwhelm the genetic structure status quo periodically. I have implied something similar as operative on the smaller scale of H. sapiens sapiens.

Citation: Coherent and incoherent inference in phylogeography and human evolution, PNAS 2010 107 (14) 6376-6381; doi:10.1073/pnas.0910647107

Razib Khan