The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog
Population genomics


He under whose supreme control are horses, all chariots, and the villages, and cattle; he who gave being to the Sun and Morning, who leads the waters: He, O men, is Indra.

To whom two armies cry in close encounter, both enemies, the stronger and the weaker; whom two invoke upon one chariot mounted, each for himself: He, O ye men, is Indra.

Rig Veda

Sons of Indra

Five years ago I found out that my friend Daniel MacArthur and I are members of the same Y chromosomal haplogroup, R1a. Both of us thought it was rather cool that ~5,000 years ago there lived a man who was ancestral to us both on the direct paternal line. Five years on, and both Dan and I have sons who continue this lineage. True, Dan and I surely share more than one lineage of connection over the past ~5,000 years; the Y chromosomal one is simply the one that is genetically irrefutable, since recombination does not break apart the sequence of variants, the haplotype, allowing the inference to be as simple as taking candy from a baby. The common ancestral information is transmitted as a whole block, excepting the mutations which separate us from our common forefather. Additionally, since he has attested South Asian ancestry (< 200 years), we probably share many lines of descent over the past ~3,000 years (one of Dan’s ancestors was stationed in Bengal in the 19th century, so I think our genealogies intersect a decent amount for non-related individuals).

But there’s something special about R1a beyond the fact that it binds me paternally with a host of people who I know from all around the world. The figure to the right is from the supplements of a Genome Research paper, A recent bottleneck of Y chromosome diversity coincides with a global change in culture. You see that R1a1 diverges by very few mutational steps, and a rake-like pattern defines the phylogeny. That is in keeping with a history of relatively recent diversification, and rapid population expansion. The Genome Research paper found that R1a, along with a host of other Y chromosomal lineages, has undergone very rapid demographic expansion in the recent past when put through the sieve of phylogenomic inference. This is similar to what you see with the Genghis Khan haplotype. Remember, this is a very specific signature of direct male descent. It does not necessarily extrapolate well to the rest of the human genome. So, though Daniel MacArthur and I share a common Y chromosomal lineage, he is Northern European and I am South Asian, with all that implies for the set of genealogies which come together to contribute to the patterns of variation we see in our whole genomes.

But recently we’ve been gaining even more understanding of the phylogeography of R1a, and its likely history. To the left is a figure from the supplements of Reconstructing Genetic History of Siberian and Northeastern European Populations. You see in this chart a few important things. First, the sister to the haplogroup R, which includes R1b and R1a, and therefore huge numbers of European, West, and South Asian men, is Q, an Amerindian one. The Mal’ta boy, who lived ~24,000 years ago, seems likely to have carried a basal R1 lineage. This is reasonable because most people peg the divergence of R1a and R1b ~20,000 years ago (or somewhat more recently). A major takeaway here is that the dominant lineages across much of western Eurasia today on the male side seem to derive from a group with central Eurasian affinities. The two R1 lineages are very rare in Europe before ~4,000 years ago, according to ancient DNA. This is also concomitant with the arrival of “Ancient North Eurasian” (ANE) ancestry, which is closer to that of Mesolithic European hunter-gatherers than to that of East Eurasians, but still rather anciently diverged, on the order of ~30-40,000 years before the present. Amerindians also have substantial admixture from this group, as do many groups in the Caucasus, and South Asia.


The second major issue that is evident from this figure is that Western and most Eastern European R1a diverge from South Asian and Central Asian R1a. The Altay population in this paper are Turkic, but “trace approximately 37%…of their ancestry to another unknown population, which the model predicted to be related to modern Europeans.” And, its R1a looks basal to the South Indian sample, which because it is from Singapore, is likely to be Tamil. Nearly 15 years ago in The Eurasian Heartland: A continental perspective on Y-chromosome diversity, Spencer Wells reported R1a at reasonable frequencies even among non-Brahmin South Indians. More recent work using more markers suggests that R1a has two very common major lineages in Eurasia, with one very common in Eastern Europe, and decreasing in frequency west, and another common in South Asia, with appreciable fractions in regions of Central Asia such as the Altai mountains. Going back to the earlier work, and connecting the dots, it looks like these two “brotherhoods” of R1a diverged on the order of ~4,000 years ago, both undergoing rapid expansion in different regions of Eurasia.

Oh, but there’s more! Eight thousand years of natural selection in Europe has been updated with new ancient DNA results from Iosif Lazaridis’ work. As you might know by now it seems likely that the Indo-European languages were brought into Europe by peoples related to (descended from?) the Yamna culture of the trans-Caspian steppe. The Yamna were genetically a compound population, with about half their ancestry being derived from “eastern hunter-gatherers” (EHG), who themselves were an equal compound between “western hunter-gatherers” (WHG), the latter presumably descendants of the Pleistocene populations which had retreated to the habitable fringes of the continent, and the previously mentioned ANE group, with Siberian affinities. The other half of the Yamna peoples’ ancestry derives from something similar to that of the early European farmers (EEF), but somewhat different. In particular, rather than western Anatolian affinities, this ancestry seems more trans-Caucasian or eastern Anatolian, with Armenians and Kartvelian groups either being a source population, or related to the source populations.
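To make the nested mixture above concrete, here is a minimal sketch that simply multiplies out the rounded proportions given in the text (half EHG and half a Caucasus-like source, with EHG itself an even WHG/ANE mix). The labels and the `expand` helper are illustrative assumptions, not estimates or code from the preprint.

```python
# Rounded mixture proportions as described in the text ("about half").
# Labels such as "Caucasus_like" are placeholders chosen for this sketch.
ehg = {"WHG": 0.5, "ANE": 0.5}
yamna = {"EHG": 0.5, "Caucasus_like": 0.5}


def expand(mix, definitions):
    """Rewrite a mixture in terms of its deepest listed source populations."""
    out = {}
    for source, weight in mix.items():
        if source in definitions:
            # Recurse into a source that is itself defined as a mixture.
            for sub, sub_weight in expand(definitions[source], definitions).items():
                out[sub] = out.get(sub, 0.0) + weight * sub_weight
        else:
            out[source] = out.get(source, 0.0) + weight
    return out


yamna_deep = expand(yamna, {"EHG": ehg})
print(yamna_deep)
```

Under these rounded inputs the Yamna come out at roughly a quarter WHG, a quarter ANE, and half Caucasus-like ancestry.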

Intriguingly, the Yamna carry the R1b haplogroup, today rather rare in Eastern Europe, but common, and modal, in Western Europe, with extremely high frequencies along the Atlantic fringe. The new version of the preprint now reports some ancient DNA results from the successor culture to the Yamna, the Srubna. There are two intriguing aspects to the new results. First, the Srubna have nearly 20% ancestry from a population related to the EEF. There are two possible explanations here. One is that there was back-migration from Europe after the initial migration west. The other is that an EEF-like migration occurred directly from the Middle East to the steppe. But now, from the preprint:

Srubnaya possess exclusively (n=6) R1a Y-chromosomes (Extended Data Table 1), and four of them (and one Poltavka male) belonged to haplogroup R1a-Z93 which is common in central/south Asians…very rare in present-day Europeans…and absent in all ancient central Europeans studied to date.

First things first. There are some “Out of India” theorists who posit that R1a derives from South Asia. If you take a very deep time perspective this may be true; recall that much of Eurasia was not habitable during the Last Glacial Maximum (LGM), so the distribution of populations was very different from what we see today. But, on the scale of ~4,000 years ago it seems that one can say that the very common variant of R1a found in the eastern Iranian world and South Asia likely derives from the steppe. The reasoning here is that while peoples in South Asia have elements of ancestry across their genome with affinities to the steppe people (e.g., ANE), there is little evidence for South Asian distinctive ancestry (e.g., ASI) in the steppe people. Additionally, the majority of South Asian mtDNA does not have a West Eurasian profile, but is closer to the lineages of eastern Eurasia. This is strongly suggestive of mostly male migrants. What we can say definitively is that it looks as if male lineages overturned each other multiple times on the steppe. First, R1b was dominant. Then in the same region one lineage of R1a came to the fore, only to later be marginalized by another lineage of the same haplogroup. Finally, in Central Asia more generally the Turkic migrations reshaped the whole ethnographic landscape within historical memory.

Though I begin this post with Y chromosomes, I will not end with them. My belief though is that the Y chromosomal story gives us a deep insight into the nature of social relations over the past ~5,000 years. More on this later. But, the constant turnover of the Y chromosomal record should clue us in to the fact that human demographic history exhibits punctuated turnover events, which reshape the genetic landscape radically over a few centuries. This is a far cry from a model of a set of serial founder events from Africa, dispersing outward as a phylogenetic tree overlain upon a spatial map over a time-scale of tens of thousands of years in Fisher waves.

Specifically, I’m referring here to the 2005 paper, Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Currently, the best rejoinder to this model is probably Towards a new history and geography of human genes informed by ancient DNA, by Joe Pickrell and David Reich. In this review the authors show that though the serial founder bottleneck framework is consistent with the data at a certain level of granularity, it is not the only possibility. What ancient DNA in particular is telling us is that local geographic continuity of lineage is often very rare. This result then should make us skeptical of taking contemporary genetic variation, inferring phylogenies, and then overlaying those phylogenies upon the spatial distribution of particular ethno-linguistic groups. Of course, on a coarser scale of granularity the “Out of Africa” model inferred from older genetic work from the pre-ancient DNA era is probably correct. That is, African populations tend to harbor lots of genetic variation, and are basal in relation to non-African lineages. Or, put another way, non-Africans are a derived lineage of Africans. ~100,000 years ago almost all of the ancestors of non-Africans would have been in Africa (or perhaps the biogeographic extension of Africa in the Middle East).

But the story beyond that scale is more complex. At least some of the first settlers of Europe have no modern descendants in Europe. In fact, these populations are nearly as close to East Asians as they are to modern Europeans, suggesting that the modern east-west and north-south axes in Eurasia are products of events of the last few tens of thousands of years at most. In fact, the synthetic origins of Europeans and South Asians are strongly suggestive of the likelihood that inferences from modern genetic variation only have time depths back ~4-5,000 years or so in much of Eurasia. A recent paper in Science, Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent, suggests widespread back-migration to Africa itself from Eurasia! Though I disagree with the interpretation in some details (I don’t believe that this occurred ~3,000 years ago), the circumstantial evidence from this and other studies is strong that there have been several waves of migration of Eurasian groups back to Africa. Excepting the northern fringe of the continent, in no region is this preponderant, so the status of Africa as the home of the original population of modern humans, from which others derive, remains unshaken. For now.

Nevertheless, both ancient DNA and whole genome sequencing are fleshing out surprising and enigmatic details in relation to how human genetic variation came to distribute itself around the world today. Here we can come back to Europe, mostly because there has been a lot of genetic work on this continent, and the ancient DNA is probably thick enough that we won’t find any major new surprises. In short, the phylogenomic history of the continent over the past ~10,000 years has been “solved” more or less. What did we find out? What can it tell us about the more general human story?

We can start with the present. As noted in The History and Geography of Human Genes, Europe is a very genetically homogeneous continent. The distances as inferred from allele frequency differences between two given populations are very low, and Northern Europe between the Atlantic fringe and the great Eurasian plain in particular is very uniform in terms of the total genome. Today, we know why. As outlined in Massive migration from the steppe was a source for Indo-European languages in Europe, Northern Europe was demographically shaken ~4,000-5,000 years ago by population movements triggered by peoples which left the steppe. It was not a total replacement. But the world of the first farmers, who had issued out of the Middle East ~8,000 years ago, was rocked in the north. The male Y haplogroups associated with these old farming groups, such as G2a, are found at low, though relatively even, proportions all across Northern Europe today.

One interesting aspect of the story is the huge genetic distance between some of these ancient groups. For example, that between the first farmers from the Middle East and their nearby hunter-gatherer neighbors ~8,000 years ago was of the same order as between Europeans and East Asians! This is more than ten times the largest genetic distances you can find in Europe today, but this persisted for thousands of years, though it seems that hunter-gatherer ancestry increased over time among the farming populations, likely through admixture with the local substrate. The reason for this high genetic distance is that the early European farmers carried ancestry which has been termed “Basal Eurasian” (BEu). The name points to the fact that these people seem to have been the first to diverge from all other non-Africans among the Out-of-Africa populations. In other words, ~40% of the ancestry of early European farmers is from a population which is more genetically distant from European hunter-gatherers than Andaman Islanders are. It was the arrival of the steppe people which resulted in the leveling of the genetic distances across much of Europe, overwhelmingly so in the north, and to a non-trivial extent in the south.

So if Europe went through a great homogenization and leveling ~4,000 years ago, why does the “genetic map of Europe” exist? That is, why does geography predict variation in genes so well? There are three things one might say about this. First, PC 1, the largest dimension of variation, is north-south. This comports with the idea that the heritage of the early farmers persisted in the south to a far greater extent, and the Indo-European demographic impact was more modest, if not trivial. An earlier explanation I had seen floated around was that there was a north-south gradient due to expansion from the post-Pleistocene refugia, via the serial bottleneck effect. The real explanation for the north-south difference though seems more likely to be the differing proportions of Indo-European ancestry, overlain upon the early farmer and hunter-gatherer ancestry.

The second issue to consider is that the underlying genetic variation in Europe was absorbed into the expanding population. Even if the steppe invaders differed little from east to west, there were differing levels of absorption of the substrate, and after several thousand years there had likely been some divergence between the different early farmer groups, perhaps due to differing levels of admixture with hunter-gatherers. Basically, PC analysis could still pick up the signal of underlying variation even if that component was minor if the dominant element was not particularly structured (you can pick up indigenous structure in Mestizo populations in Mexico for this reason).
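The point about PC analysis can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not a reanalysis of any real data: two groups share an unstructured dominant source for 70% of their ancestry and differ only in a 30% substrate, and every number here (sample sizes, SNP count, allele frequency ranges, the 30% figure) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_group, n_snps = 50, 5000
substrate_fraction = 0.3  # hypothetical minority ancestry from the local substrate

# Allele frequencies: one shared dominant source, two distinct substrates.
p_dominant = rng.uniform(0.1, 0.9, n_snps)
p_sub_a = rng.uniform(0.1, 0.9, n_snps)
p_sub_b = rng.uniform(0.1, 0.9, n_snps)


def sample_group(p_substrate):
    """Draw diploid genotypes (0/1/2) for one admixed group."""
    freqs = (1 - substrate_fraction) * p_dominant + substrate_fraction * p_substrate
    return rng.binomial(2, freqs, size=(n_per_group, n_snps))


genotypes = np.vstack([sample_group(p_sub_a), sample_group(p_sub_b)]).astype(float)

# PCA via SVD of the column-centered genotype matrix.
centered = genotypes - genotypes.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
pc1 = u[:, 0] * s[0]

# Share of each group falling on the positive side of PC1.
frac_pos_a = (pc1[:n_per_group] > 0).mean()
frac_pos_b = (pc1[n_per_group:] > 0).mean()
print(frac_pos_a, frac_pos_b)
```

In this setup the first principal component separates the two groups even though most of each genome comes from the same source, because the dominant component contributes no structure of its own for PC 1 to pick up instead.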

Finally, after the initial punctuated change, there was an equilibration as isolation by distance dynamics resulted in divergence across the North European plain. We have enough historical records to know that aside from the Slavic migrations there seems to have been little change in the population structure of Europe since the Roman period (the Saxon migrations were not trivial, but they were neither preponderant nor continent-wide in impact).

What general inferences can we glean from this specific European case? As Graham Coop’s group has noted, one must account both for continuous gene flow via isolation by distance dynamics, and pulse admixture events between very distant populations. Consider the metaphor of a forest expanding over the landscape. There will be local structure, accrued over generations, hundreds and thousands of years. But perhaps periodically a fire will sweep through the landscape and clear huge swaths of territory. Into this virgin landscape may expand forests which derive from isolated reservoirs which escape the flames. Over time geographic structuring will be evident again, and depending on the number of refuges the jigsaw puzzle of genetic islands expanding into the gaps will fade somewhat as migration smooths the edges.

The reference to fire here is conscious, insofar as fire can immolate structure which has taken generations to develop. Before the steppe people arrived in Northern Europe the first farmers had established a long-standing cultural commonwealth of sorts. Their legacy had persisted for thousands of years. Then, in a period of centuries, it all changed. Why? Culture.

Outright genocide with weapons is a dangerous business. Societies which engage in endemic long-term warfare as a primary male vocation, such as highland New Guinea, have high mortality rates. But in the context of the Malthusian world, where villages persist on the knife’s edge of subsistence, marginalization and disturbance of long-held patterns is all that might be needed for cultures to descend into famine and starvation. In his book 1493, Charles C. Mann notes that the mass death triggered by the arrival of Europeans and Africans to the New World had as much to do with the destabilization of society by illness as with the illness itself. In a world where all hands were on deck to bring in the harvest, the loss of critical labor during those periods could result in starvation, and high death rates led to the rapid collapse of the institutions which served as scaffolds for the maintenance of everyday life.

The scenario then might be one where populations on the Eurasian steppe develop some of the basic elements which would lead to agro-pastoralism, and undergo population expansion. With numbers, and well fed on the agro-pastoralist diet, these tribes might have poured into the lands of the farmers as rapid mobile groups in their wagons. The pattern in antiquity down to the early modern period, from the Goths to the Mongols, was to extract rents and treat the farmers as cattle. There was no incentive for one to starve cattle, and so the demographic impact of conquests was relatively modest.

But what about a world with less institutional complexity? In a world where the basic levers of rent to extract from the conquered did not exist, the natural path would be to replace them. The story goes that Genghis Khan had hoped to turn North China into a vast pastureland by driving out the peasantry (and almost certainly killing most of them through starvation), but his sage Khitai adviser explained the wealth that could be gained by taxing humans rather than raising stock on land. But the Khitai themselves were a semi-civilized people with centuries of experience milking the Han peasantry, and were heirs to a tradition of pastoralist predation that went back to the Hsiung-Nu. And yet no doubt there was a time when the idea of collecting rents from a conquered people was an innovation in and of itself. The genocidal antics of the Israelites in the Hebrew Bible strike us as dark and atavistic, but they reflect a cultural mindset which is nearly contemporaneous* with the arrival of Indo-Europeans to Europe.

This plausible sketch puts into better perspective Steven Pinker’s thesis in Better Angels of Our Nature as well as Peter Turchin’s War and Peace and War. The emergence of state institutions and pacific ideologies in the past ~3,000 years may be a sort of response to the high-stakes inter-group competitions which would level societies and turn over populations on a regular basis in the human past.

And yet not all was sweetness and light. In terms of their total genome the differences between the Srubna and their predecessors were not very great. Conversely, the differences in the total genome between Slavic people and South Asians are legion. But interlaced more recently across the landscape of a more stable structuring of genetic variation, a great regrowth of the forest through isolation by distance equilibration if you will, has been the explosion of powerful patrilineages which trace out an intriguing skein across the landscape. The total genome signal of these men may quickly decay over the generations, as their female-line descendants lose the golden allure of their status, but their male-line descendants continue to accrue mating prowess by dint of their association with great kinship units which succeeded in a winner-take-all game with other such groups of men. On top of the story of migrations of whole peoples, and the extinction and absorption of others, is the story of bands of men operating as units, related either in truth or fictively, which extract rents across a thickly populated landscape of human cattle. Another way to state this is that the thuggish state which imposed a monopoly of violence on a chaotic world where small-scale conflict was becoming too expensive allowed for the emergence of patriarchy as we understand it in its customary form. Like so many hirelings, the men charged with protecting the people made the whole world their possession and left dreams of their people behind.

John Ross, Cherokee Chief

While the cultural and genetic affinities of folk wanderings were tightly coupled, I am not sure that the Y chromosomal lineages are so neat. The Hazara people of Afghanistan exhibit an Asiatic appearance in comparison to other Afghans, and their Y chromosomes suggest a close connection to the Khalkha Mongols, but they are Shia Muslims who speak Persian. It does seem that the R1 lineages ascendant in Europe and South Asia owe their success to the Indo-Europeans, but both R1b and R1a transcend a connection to Indo-European ethno-linguistic groups. In some cases, as in that of R1a in the Levant, one might see in that a submerged Indo-European element, from the Mitanni down to the later Persian and Kurdish peoples. But in other cases, such as R1b among the Basque and R1a among Dravidian-speaking tribal people in South India, what we are seeing is the long arm of the patriarchy reaching beyond bounds of cultural and genetic affinity. The great Cherokee chief John Ross was famously 7/8th Scottish in ancestry. But he was a voice for the Cherokee people nevertheless. In most places where the Mongol hordes washed over they assimilated to the cultural folkways of the people whom they conquered. Like modern corporations the patriarchies were only loosely associated with other units of human organization, even if they used them as their vehicles of choice.

And so the story ties back to the beginning. Many of us are the sons of Indra, Zeus, and Thor. The descendants of Herakles, and of Abraham who haggled with God himself. Of Ishmael, whose hand will be against everyone and everyone’s hand against him. Of Niall of the Nine Hostages, and Temujin. The interests of men like this know no nation; nations are but instruments of their will. The tension we see in our modern world, between egocentric plutocratic elites jostling nation-states like playthings, might be simply the repetition of an old pattern. In the Bible Saul was rebuked for not destroying all the capital of the Amalekites, perhaps reflecting the tension of interests facing the leader of a people, who must act for the collective good but has his own selfish needs and dreams of self-aggrandizement for his own very particular posterity.

Addendum: Ancient DNA will expand in its ability to discern various patterns in the past. But the general disturbances will fall in line with what I have outlined above, I believe. Rather, the move will be from phylogenomics, to population genomics. Phylogenomics leverages genomic methods to attempt to infer phylogenetic patterns. Population genomics explores the classical parameters which shape the change in allele frequencies in lineages, and ultimately, deep evolutionary questions. We now know from ancient DNA that in all likelihood the phenotype which we associate with modern Europeans is a novel configuration. To some extent this is to be expected, as the basic elements which combine to form the European genome, fusing together lineages which diverged at least ~50,000 years before the present (BEu vs. everyone else outside of Africa) and ~35,000 years before the present (ANE vs. WHG), only came together around ~4,000 years ago. But there is more, as natural selection seems to have changed allele frequencies after these elements came together. That is, selection may have been operating across the European landscape when Hannibal was skirting the Alps!

And again, this is likely a general story. Physical anthropologists have long wondered why classical East Asian skeletal morphology seems to be scarce in the prehistoric past. But what if the classical East Asian appearance is relatively new? The Ainu, who have long been considered a “Lost White Race,” turn out to be a basal Northeast Asian group. It may be that they retain more of the “ancestral” features of East Eurasians.

The first age of selection studies in the 2000s was fraught with confusion and false positives. To a great extent we still don’t know what to make of many of the signals, which are deposited in the middle of obscure open reading frames. But the real golden age of selection will probably begin when we have more temporal transects with whole genome sequencing of ancient DNA, and with the phylogenomic context relatively robust as an interpretative framework.

* I am aware that the Hebrew Bible coalesced thousands of years after the arrival of Indo-Europeans to Europe, but it no doubt distills very ancient folkways. This seems obvious for example in the recollection of the Sumerian flood story.


I’m someone who until a few years ago thought of recombination as a pretty boring and static evolutionary genetic parameter. Then I went to a talk by John Novembre which reported on variation between human populations in patterns of recombination (in particular, differences in “hotspots”). For a quick review, recombination is important for two primary reasons. One is molecular genetic, insofar as it seems to have structural value for the meiotic process and DNA repair. No recombination is generally not good. Second, recombination maintains the law of independent assortment of traits even on the same chromosome, because over time even nearby genes will be uncoupled in their inheritance due to crossing over. From an evolutionary perspective this is important because in this way “good” and “bad” alleles can be decoupled from each other. Recombination is basically a way to enhance the ability of sex to mix and match variation.

Graham Coop revealed patterns of variation among individuals years ago. For example, it is from Graham’s work that I came to understand that recombination is less common in sperm than in eggs; ergo, you’ll have more variance in genomic contributions from paternal than maternal grandparents. Recently at BAPG XI Laurie Stevison presented work revealing patterns of recombination variation, and the role of PRDM9, across great ape lineages. I tweeted some of the results out, but there were a lot of them. I found the talk interesting, but difficult to take in because there was so much. Now Stevison has put out a preprint, The Time-Scale of Recombination Rate Evolution in Great Apes, and I feel somewhat the same about it. There’s lots of good stuff, but unless you are steeped in this domain it is somewhat difficult to parse it and tease out distinct threads coherently. But, as you can tell from the figure at the top of this post, changes in patterns of recombination vary as a linear function of genetic divergence. Some of this stands to reason as the karyotypes of great apes differ. And yet even taking this into account it seems there are differences in patterns such as skew of recombination across the genome (e.g., ~75% of the recombination in human genomes occurs on ~20% of the sequence, with enrichment around telomeres, and very little around centromeres). Looking over Stevison’s preprint I have to wonder as to the role of quality of data in some of the results. Genetic maps are hard to get in some populations, and the ones floating around are not always good. The big takeaway of note for me is that though there is lots of variation in fine scale recombination patterns, there are some broad constraints. That makes sense when you note that there are structural/mechanistic reasons for recombination rooted in the nature of meiosis. It’s not a totally neutral parameter which can explore the full space of possibilities. But, in this context obviously the variation in hotspots shows that there are different ways to skin this cat.
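The link between fewer crossovers and more variance in grandparental contributions can be checked with a toy model. This is a minimal sketch under stated assumptions: a single chromosome of unit length, a Poisson number of uniformly placed crossovers, and no interference or hotspots. The two crossover rates are hypothetical stand-ins for a lower and a higher rate, not measured human values, and `grandparent_share` is a name invented for this illustration.

```python
import numpy as np


def grandparent_share(mean_crossovers, n_meioses, rng):
    """Fraction of one unit-length chromosome transmitted from one grandparent."""
    shares = np.empty(n_meioses)
    for i in range(n_meioses):
        k = rng.poisson(mean_crossovers)
        # k breakpoints split the chromosome into k + 1 segments.
        cuts = np.concatenate(([0.0], np.sort(rng.random(k)), [1.0]))
        seg_lengths = np.diff(cuts)
        # Segments alternate between the two grandparental haplotypes;
        # which haplotype comes first is a coin flip.
        start = rng.integers(2)
        shares[i] = seg_lengths[start::2].sum()
    return shares


rng = np.random.default_rng(1)
few = grandparent_share(1.0, 5000, rng)   # hypothetical lower crossover rate
many = grandparent_share(3.0, 5000, rng)  # hypothetical higher crossover rate

var_few, var_many = few.var(), many.var()
print(var_few, var_many)
```

Both cases average out to about one half, but the lower-rate case is more spread out around it, which is the pattern described above for paternal versus maternal grandparents.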

Finally, there’s one issue that jumped out at me, and that is they found that “European human population presents the strongest hotspot usage across the genome.” This aligns with earlier work. But I wonder how much of this tendency to find uniqueness in Europeans is due to the enormous amount of genomic resources available for this population. It’s also intriguing in light of the evidence that the European mutation spectrum is different.

In any case, I think everyone should read this preprint several times. I know I’m going to.

• Category: Science • Tags: Population genomics, Recombination 

A friend of mine is beginning grad school and has settled upon a lab. The core research within the laboratory is population genomics, and they now need to get up to speed in the area. Taking a class is certainly the start. You can read Haldane’s Sieve to keep up on the literature, which is a necessity if you are doing genomics work, as texts get out of date quickly. Additionally, Graham Coop, Joe Felsenstein and Kent Holsinger have excellent online notes. The upside to this is that they are free. The downside is sometimes you are away from a computer screen. Often a soft intro recommended by many is John Gillespie’s Population Genetics: A Concise Guide, which nicely has a Kindle edition. But if you are going to do graduate level work, I think it is best to just go whole hog. The Gillespie book is appropriate for a quick course or for the undergraduate level, but you really need something as a reference at some point. And for that nothing beats Daniel Hartl and Andrew Clark’s Principles of Population Genetics. There are other texts out there in this area. For example, I have Philip Hedrick’s Genetics of Populations, and Alan Templeton’s Population Genetics and Microevolutionary Theory. For various reasons I would still pick Hartl & Clark if I had to pick.

I also think it’s important to know quantitative genetics, and for that Douglas Falconer and Trudy Mackay’s Introduction to Quantitative Genetics is the best bet in the business that I know of. It’s an excellent complement to Principles of Population Genetics because it starts with pop gen foundations. Derek Roff’s Evolutionary Quantitative Genetics and Michael Lynch and Bruce Walsh’s Genetics and Analysis of Quantitative Traits are probably too specialized for the beginner, and frankly even many steeped in the field haven’t read those books.

There are plenty of other books out there which might suffice in some fashion. In my previous post I mentioned Elements of Evolutionary Genetics. The old John Maynard Smith classic Evolutionary Genetics is also excellent. But if you are working in genomics and want a book less focused on classical methods and geared toward contemporary best practices, then Rasmus Nielsen and Monty Slatkin’s An Introduction to Population Genetics: Theory and Applications is pretty good. It’s a short book, and because it’s in its first edition there are many errors in it. From what I recall it was developed out of notes from a course taught at Berkeley, and it outlines the sort of methods you see in the papers which are being published today, utilizing coalescent theory and site frequency spectra. It might be a reasonable quickstart, though I’m not sure it is developed well enough to be a reference (for what it’s worth, I have a copy of it too, and it is being used in graduate level courses here at UC Davis).

• Category: Science • Tags: Population Genetics, Population genomics 

In the early 1970s the eminent evolutionary geneticist Richard C. Lewontin wrote that population genetics “was like a complex and exquisite machine, designed to process a raw material that no one had succeeded in mining.” By this, Lewontin meant that in the 1930s when R. A. Fisher, Sewall Wright and J. B. S. Haldane established the theoretical foundations of the field, the techniques to discover the variation in populations to test their suppositions were rather thin (naturally, this resulted in many controversies, see The Origins of Theoretical Population Genetics). Geneticists were using classical methods, utilizing salient phenotypes which were proxies for underlying genetic markers, and tracing patterns of co-inheritance of traits with known locations in the genetic map with novel mutants. Researchers were not even clear at that point as to the underlying biochemical structure of the particle of Mendelian inheritance, what we term DNA. That arrived onto the scene in the 1950s. But in the early 1970s when the above was written we’re not talking about DNA sequencing. Rather, this is the allozyme era, which Lewontin helped usher in with a paper in 1966. He expresses the excitement of the times later in the passage:

Quite suddenly the situation has changed. The mother-lode has been tapped and facts in profusion have been poured into the hoppers of this theory machine. And from the other end has issued–nothing. It is not that the machine does not work, for a great clashing of gears is clearly audible, if not deafening, but it somehow cannot transform into a finished product the great volume of raw material that has been provided.

Despite the pessimism expressed above the emergence of molecular evolution stimulated the debates around neutral theory. Over a generation ago evolutionary geneticists were grappling with the swell of data which was confronting theoretical frameworks constructed in the early 20th century. Today we live in the “post-genomic” era, and now think in terms of whole genomes. The details may differ, but many of Lewontin’s observations in the 1970s still hold true, as novel results meet the paradigms of old. Last month in PNAS Brian Charlesworth published a paper which brought this to mind, Causes of natural variation in fitness: Evidence from studies of Drosophila populations. You may know Charlesworth as the coauthor of Elements of Evolutionary Genetics, an encyclopedia of a text which I highly recommend to all. In the paper, which is both a review for those of us not steeped in Drosophila genetics, and a distillation of derivations to be found in the supplements, Charlesworth notes that there is a contradiction in terms of the typical selection coefficients inferred for deleterious alleles from population genomics in relation to those from quantitative genetics. Population genomics is a new field, and involves sequencing many markers (often whole genomes) to good accuracy across a reasonable number of individuals. Quantitative genetics is a more classical framework utilizing statistical methods which interpret variation in traits within laboratory populations.

The fruit fly has a storied role in Mendelian genetics. To a great extent the study of the fruit fly is the early history of Mendelian genetics (see Lords of the Fly: Drosophila Genetics and the Experimental Life). Therefore it is natural that a large body of research exists in this area, and one can’t accept novel results obtained through new methods such as genomics at face value without some degree of skepticism. Charlesworth notes that the mutations discovered via genomic methods, which have extremely small fitness effects, are biased toward single nucleotide variants (SNVs), that is, point mutations. In contrast it seems likely that the larger effect mutations implied by quantitative genetic studies, which are rather rare, and so missed in population genomic sample sizes, are due to transposable elements (TEs) interspersing themselves across the genome, and presumably disrupting function. In line with older theoretical models, most of the variation in fitness is due to a small number of mutations. Presumably as genomic methods get better (e.g., longer reads to catch repeat elements and larger sample sizes) they will converge upon the older established quantitative genetic methods. One other interesting result in this paper is that much of the variation is due to balancing selection. For theoretical reasons balancing selection cannot be pervasive across the genome (too much fitness variation would result in huge death rates per generation), but, of the variation within the population much of it is maintained by balancing selection according to Charlesworth. Another interesting dynamic is that the population genomic methods seem to be better at capturing the distribution of fitness effects in humans, because of our smaller effective population size. You can read the paper for the technical reason why, but the key here is to remember that one has to be careful about extrapolating from model organisms. The models are imperfect, and we need to take care never to outrun our ability to generalize.

As genomics becomes pervasive in population genetics this sort of analysis will be more common. Rather than “genome-of-the-week” papers we’ll move to actually trying to grapple with what the sequence data is telling us specifically about the lineage in question, and, what we can generalize from the results about evolution writ large. Some organisms have a long history of scientific study, so population genomics will supplement and complement. In other cases though organisms do not have such a rich literature and scientific culture, and the pitfalls that are highlighted here might alert us to the deficiencies in genomic methods.

Citation: Charlesworth, Brian. “Causes of natural variation in fitness: Evidence from studies of Drosophila populations.” Proceedings of the National Academy of Sciences (2015): 201423275.


Adapted from Khoisan hunter-gatherers have been the largest population throughout most of modern-human demographic history

It is common for strong results from population genetics to be confused when they are translated for public consumption. The best example is that of “mtDNA Eve.” Despite the big warning label that mtDNA Eve is one of many female ancestors, the public has gained the impression that she is the female ancestor. A similar problem is cropping up with the Khoisan paper which reports that they went through a relatively mild bottleneck in comparison to other modern human populations. There’s a reason I titled the post The Least Bottlenecked Humans of All. It’s a defensible reduction of the results. In contrast many popular treatments are translating the results into the conclusion that “the Khoisan had the largest population of all human groups at some point in the past.” The reason I avoided this formulation is that plainly stated I doubt that at any time the Khoisan as we understand them, a genetically-culturally coherent group in southern Africa, had the largest population of all. Humans of various sorts have been common across Afro-Eurasia for over a million years. Is it plausible that ancestors of the Khoisan had the largest populations of all? Ann Gibbons’ somewhat cautiously stated piece in Science, Dwindling African tribe may have been most populous group on planet, relays the sentiment which I share:

Other researchers agree that it’s likely that the Khoisan descend from a large population. But because sampling of African genomes is still so spotty, not everyone is yet convinced that the Khoisan “was the largest population on Earth at some point,” says evolutionary geneticist Pontus Skoglund of Harvard University. “Many African populations are not included for comparison,” he says, so it is possible that some of the diversity seen in the Khoisan was inherited from recent interbreeding that cannot yet be detected.

Either way, the study makes it clear that even though the Khoisan are genetically diverse by today’s standards, even they carry just a fraction of our ancestors’ genetic legacy over the past 120,000 years. “It is quite staggering how much extraordinary genetic variation and ethnic diversity was present but is now lost,” Skoglund says. The Khoisan, retaining more than the rest of us, offer a rare window to look back in time at some of that diversity.

The biggest gap in the current study is that many extinct lineages were not included. Obviously they couldn’t be included, because they’re extinct, though at some point in the future ancient DNA or (more likely in the African context) reconstruction of ancient genomes from extant populations which have absorbed them, might allow for a better understanding of Pleistocene human population sizes. Population genomics is powerful, but it has limits. We need to be cautious about assuming that what we can illuminate with current methods is all that can be conceived in our natural philosophy.

• Category: Science • Tags: Genetics, Khoisan, Population genomics 


When, how, and why different lineages of the tree of life diverged has long been a preoccupation of evolutionary science. Now one must add to that a caveat that it seems a great deal of the story also has to do with the entanglement of branches which were long separated. Paleontology has looked at the macroevolutionary patterns, and attempted to move from description to formal models which scaffold the long progress of natural history. Phylogenetics has painted the branches of the tree in loving detail, and attempted to infer patterns from the shape and pulses of the diversification. Population genetics has focused upon the microevolutionary parameters which shape the flux of the genetic makeup of particular lineages; drift, mutation, and selection. Now you have new fields such as population genomics, which fuse 21st century technologies with the questions and theoretical machinery of 20th century disciplines (in this case, population genetics, just as phylogenomics is an extension of phylogenetics).

Liu, Shiping, et al. "Population Genomics Reveal Recent Speciation and Rapid Evolutionary Adaptation in Polar Bears." Cell 157.4 (2014): 785-794.


Because of the monetary investment by organizations such as the NIH (among other factors) the -omics revolution has hit Homo sapiens first. But it is moving on, and that is important, because evolutionary science really can’t constrain itself purely to the human domain. Ultimate questions such as why there are so many species require actually surveying the nature of variation in the world out there. Nevertheless, currently most of the post-human work seems to be occurring in the classical ‘model organisms’ (e.g., Drosophila), or charismatic creatures, especially big mammals. A new paper in Cell, of all journals, is in the second class, Population Genomics Reveal Recent Speciation and Rapid Evolutionary Adaptation in Polar Bears. As you can infer from the title the paper looks at both the phylogenetic history of polar and brown bears, as well as the evolutionary genetic functional differences between the two distinct lineages. As you can see their sampling coverage was limited to particular populations, which is reasonable in light of finite sequencing resources. They had 10 brown bears and 79 polar bears, with good coverage on a lot of them (~30x not atypical). The inferences necessarily derive from these populations, though they admit in the text you can only go so far with their limited geographic coverage.

Using a variety of methods (IBS tracts and ∂a∂i) they found that polar bears and brown bears (or at least the ones in their sample) diverged on the order of ~500,000 years ago into two populations. More precisely, 479,000-343,000 years ago. This overlaps with the fossil evidence. It translates to a separation between the ancestral populations about 20,000 generations ago. The authors state:
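The link between the dates and the generation count is simple division, and it can be checked in a few lines. This is a minimal sketch that uses only the round figures quoted in this post; the function name is made up for illustration, and the generation time it yields is back-calculated here, not a value taken from the paper (whose own assumption may differ).

```python
def implied_generation_time(years_ago, generations):
    """Return the years per generation implied by a date and a generation count."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return years_ago / generations


# Round figure from the post ("about 20,000 generations"), not from the paper.
GENERATIONS = 20_000

# Bounds of the divergence estimate quoted in the post, in years before present.
for years_ago in (479_000, 343_000):
    per_generation = implied_generation_time(years_ago, GENERATIONS)
    print(f"{years_ago:,} years over {GENERATIONS:,} generations: "
          f"{per_generation:.2f} years per generation")
```

On these round numbers the implied generation time falls somewhere between roughly 17 and 24 years, which is worth keeping in mind when comparing generation counts across species.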

… the distinct adaptations of polar bears may have evolved in less than 20,500 generations; this is truly exceptional for a large mammal. In this limited amount of time, polar bears became uniquely adapted to the extremities of life out on the Arctic sea ice, enabling them to inhabit some of the world’s harshest climates and most inhospitable conditions.

This seems a little hyperbolic to me. In fact the Neandertal-modern human divergence is only about half as far back in the past in generation time, and one could argue that our two lineages were pretty diverged as well. That being said, obviously there are huge visible and physiological differences between polar and brown bears. They include in their model estimates of effective population declines in the past, presumably due to the exigencies of the Pleistocene glaciations. Using paleontological results already known they suggest that the emergence, and derivation of the polar bear lineage occurred during a period of separation from the ancestors of brown bears. In other words, allopatric speciation. In line with earlier work they also report evidence of long term gene flow between the two lineages, in particular, gene flow from polar bears to brown bears. This seems to be an old and continuous event which has become attenuated of late (they didn’t detect the sort of long haplotypes indicative of recent admixture).

A note of caution again, as the samples here are geographically limited. But using measures such as D-statistics, which attempt to infer patterns of admixture between populations, it does seem that the initial conclusion about decreased effective population implies expansion from small initial founder groups for modern extant lineages. One wonders if this is a commonality with large mammals which have been shaped by repeated glaciation events. Obviously I’m including humans here, but for humans we have a lot of evidence from ancient DNA that there has in fact been a lot of replacement.
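For readers who have not met D-statistics, the usual "ABBA-BABA" form compares two site patterns across three populations and an outgroup. The sketch below is a toy version under stated assumptions: the population labels (P1, P2, P3, outgroup), the function name, and the allele frequencies are all invented for illustration, and this is not the pipeline used in the paper. Real analyses also need a significance test such as a block jackknife, which is omitted here.

```python
def d_statistic(sites):
    """Compute the ABBA-BABA D statistic.

    Each site is a tuple (p1, p2, p3, p_out) of derived-allele frequencies in
    populations P1, P2, P3 and an outgroup. A value near zero is what you expect
    with no gene flow; an excess of ABBA sites (D > 0) suggests extra allele
    sharing between P2 and P3.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, p_out in sites:
        abba += (1 - p1) * p2 * p3 * (1 - p_out)   # derived allele in P2 and P3
        baba += p1 * (1 - p2) * p3 * (1 - p_out)   # derived allele in P1 and P3
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Invented toy data: two ABBA sites and one BABA site.
toy_sites = [(0, 1, 1, 0), (1, 0, 1, 0), (0, 1, 1, 0)]
print(d_statistic(toy_sites))
```

With frequencies rather than single sampled alleles, each site contributes fractionally to both patterns, which is why the function accumulates products instead of counting.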

Perhaps more thoroughly persuasive is the evidence they report in the paper that the polar bear exhibits lots of evolutionary change from its ancestors in particular functional regions. Polar bears are highly carnivorous, and exhibit lots of morphological and metabolic differences from brown bears. In short, it is as if brown bears were put on a very high fat diet. The functional regions which indicate signatures of selection in polar bears don’t have corresponding hits in brown bears, which isn’t surprising. They’re adapted to different conditions. Additionally a lot of these changes in polar bears are inferred to be harmful in humans. Fast evolution often occurs by breaking things; loss of function. So not surprising. The question, then, is how polar bears function. Also, I wonder if brown bears themselves are derived in a manner which we don’t understand yet (the sample here is skewed toward polar bears). Though brown bears are generalists, so I presume that they’re probably closer to the ancestral morphology.

They conclude intriguingly:

…Such a drastic genetic response to chronically elevated levels of fat and cholesterol in the diet has not previously been reported. It certainly encourages a move beyond the standard model organisms in our search for the underlying genetic causes of human cardiovascular diseases.

As Sydney Brenner would say, we’ve learned enough (or not) about mouse diseases.

Citation: Liu, Shiping, et al. “Population Genomics Reveal Recent Speciation and Rapid Evolutionary Adaptation in Polar Bears.” Cell 157.4 (2014): 785-794.



An old argument going back to the origins of theoretical population genetics has to do with the nature of the genetic effects which control traits and are subject to change in allele frequency due to adaptation. Often these are bracketed as part of the controversies between R. A. Fisher and Sewall Wright (see Sewall Wright and Evolutionary Biology). In short, Fisher contended that most evolution through adaptation was driven by selection operating upon additive genetic variation. That is, variation due to alleles across the genome, each having independent and additive effects on the trait. One might think of these as linear effects. In contrast Wright’s views were more complex or confused, depending upon your perspective on the sum totality of his theories. In the domain of genetic architecture he presented a model where gene-gene interactions, epistasis, played an important role in the evolutionary trajectory of populations, which traversed ‘adaptive landscapes’ in a contingent fashion.
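The difference between the two pictures can be shown with a toy two-locus model. This is only a sketch: the function names, effect sizes, and the simple product form of the interaction are assumptions chosen for illustration, not anything Fisher or Wright wrote down. Under the additive model, adding one allele copy at a locus changes the trait by the same amount on every genetic background; with an interaction term, the change depends on the genotype at the other locus.

```python
def trait_value(g1, g2, a1=1.0, a2=1.0, interaction=0.0):
    """Trait value for allele counts g1, g2 (0, 1 or 2) at two loci.

    a1 and a2 are additive per-copy effects; 'interaction' scales a
    gene-gene (epistatic) term. All values are illustrative.
    """
    return a1 * g1 + a2 * g2 + interaction * g1 * g2


def effect_of_locus1(g2, **params):
    """Change in trait value from adding one copy at locus 1, given background g2."""
    return trait_value(1, g2, **params) - trait_value(0, g2, **params)


# Additive model: the effect of locus 1 is identical on every background.
print([effect_of_locus1(g2) for g2 in (0, 1, 2)])

# Epistatic model: the same substitution has a background-dependent effect.
print([effect_of_locus1(g2, interaction=0.5) for g2 in (0, 1, 2)])
```

That background dependence is what makes trajectories across an adaptive landscape contingent in Wright's picture: whether an allele is favored can depend on what else is segregating.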

The point is not to revisit old and somewhat stale controversies. It is to suggest that filaments of these nearly century-old debates persist down to the present in a vividly relevant manner. Evolutionary biology is a progressive science, but the arc of the initial narrative initiated by the fusion of biometrics, Darwinism, and Mendelism, has not concluded. Last year a group suggested that much of the “missing heritability” may actually be found in the interaction term across genes. In many quantitative genetic treatments this component may be collapsed into the “environmental” component, as opposed to the “narrow-sense” heritability, which is defined by additive genetic variance. If this epistatic component can be thought to be more complex and non-linear it stands to reason that biologists would focus on the additive component of variance first. But, there is the possibility that additive genetic variation is somewhat like looking under the lamp post, because that is where the light is.

A new paper in PLoS Genetics takes the argument even further, An Evolutionary Perspective on Epistasis and the Missing Heritability. Let me jump to the first paragraph of the discussion:

The architecture of genetic variation must be understood if we are to make progress in fields such as disease risk prediction, personalised medicine, and animal and crop breeding. This study sought to examine the potential for epistasis to maintain genetic variation under selection, and thus to inform GWA strategies based on these results.

Though the paper seemed to shift between a broad evolutionary perspective and issues salient within biomedically focused GWAS, I found the treatment of the former rather elliptical. Using a range of gene-gene interaction models they concluded that deleterious genetic variation could be maintained to a far greater extent than with an additive paradigm. This would account for the extant genetic variation across many traits, which may seem to violate the fundamentals of evolutionary process (fitness maximization and erasure of genetic variation). This makes some sense, in that additive genetic variation should be effective in purging deleterious variants, and fixing advantageous ones. In contrast, interaction effects are such that benefits and demerits may be more conditional, at least in regards to their magnitude (from what I recall this was one reason for Fisher’s dismissal of their importance to the overall evolutionary arc of a population; see R.A. Fisher: The Life of a Scientist). One technical aspect that concerns me about the arguments outlined in the above paper is the emphasis on overdominance and heterozygote advantage. Though they are aware of the problem of segregation load multiplied across loci, where homozygotes are generated in a context where heterozygotes are most fit, the authors still emphasize scenarios in their two-locus models where this is critical. Intuitively I have difficulty believing that this can be any major part of explaining “missing heritability.” But the broader issue in regards to epistasis is well taken, and it is a reality that epistatic and additive variance can emerge from the same loci conditional upon the overall genetic architecture. On the evolutionary canvas the two may be interchangeable as a function of time.
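The worry about segregation load multiplied across loci can be made concrete with the textbook single-locus overdominance model, in which the two homozygotes have fitness 1 - s and 1 - t relative to the heterozygote, the equilibrium load is st / (s + t), and fitness is assumed to multiply across independent loci. This is a sketch under those assumptions, with selection coefficients and locus counts picked purely for illustration; it is not the two-locus model from the paper.

```python
def segregation_load(s, t):
    """Equilibrium load at one overdominant locus.

    Genotype fitnesses are assumed to be 1 - s, 1, 1 - t (homozygote,
    heterozygote, homozygote). Mean fitness at equilibrium is 1 - s*t/(s + t).
    """
    if s <= 0 or t <= 0:
        raise ValueError("s and t must be positive for a stable polymorphism")
    return s * t / (s + t)


def mean_fitness_across_loci(s, t, n_loci):
    """Mean fitness with n_loci independent loci and multiplicative fitness (assumed)."""
    return (1 - segregation_load(s, t)) ** n_loci


# Illustrative values only: a 2% disadvantage for each homozygote.
for n_loci in (1, 100, 1000):
    print(n_loci, mean_fitness_across_loci(0.02, 0.02, n_loci))
```

Even a 1% load per locus compounds quickly: with a thousand such loci the average individual has a tiny fraction of the fitness of the best possible genotype, which is why heterozygote advantage is hard to invoke as a genome-wide explanation.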

Their treatment of problems with GWAS was more straightforward. This is the method whereby associations are found between traits/diseases and particular genetic variants by looking at differences between discrete populations with differences in traits/diseases of interest. It seems that their models illustrate that the power to detect epistatic variance, and its impact on trait value, is masked by the fact that linkage disequilibrium between markers and causal variants must be very high for it to be detected. In contrast additive effects are less sensitive to decay in linkage disequilibrium, which measures the association of alleles across genes, so that one marker may serve as a signal for the presence of another variant down or up the genome.
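Linkage disequilibrium between a marker and a causal variant is usually summarized as r², computed from the haplotype and allele frequencies. The sketch below shows that calculation with invented frequencies and a made-up function name. The final comparison rests on a loudly simplifying assumption that is mine and not the paper's: that an additive signal seen through a marker is attenuated by r², while a pairwise interaction seen through two imperfect markers is attenuated by the product of their r² values.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation between alleles at two loci.

    p_ab is the frequency of the haplotype carrying allele A at the first
    locus and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b                      # the classic LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Invented frequencies: a marker in fairly strong LD with a causal variant.
r2 = ld_r_squared(p_ab=0.4, p_a=0.5, p_b=0.5)

# ASSUMPTION for illustration only: additive signal scales with r2, while an
# interaction between two causal variants, each tagged by such a marker,
# scales with r2 * r2.
print("additive signal retained:", r2)
print("pairwise interaction signal retained:", r2 * r2)
```

Under that assumption, LD that keeps about a third of an additive signal keeps only about an eighth of an interaction signal, which is the flavor of the asymmetry described above.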

Part of the answer is probably here. But how much? Though this is no “letter” to Nature, I felt that this paper could be fleshed out somewhat, as some of the inferences seem to be rather underdeveloped. They have demonstrated complexity and lack of generality. But what now? I suppose we’ll see.

Citation: Hemani G, Knott S, Haley C (2013) An Evolutionary Perspective on Epistasis and the Missing Heritability. PLoS Genet 9(2): e1003295. doi:10.1371/journal.pgen.1003295

Addendum: For those curious about epistasis and evolution, see if your college library has Epistasis and the Evolutionary Process.

Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"