The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog
Genetic Diversity


Meeting the Taino

In the comments below a few days ago someone expressed concern at the diminishing of genetic diversity due to the disappearance of indigenous populations. My response was basically that it depends. The issue here is whether that disappearance is due to assimilation or extinction. If a given population is genetically absorbed into another, obviously its genetic diversity is by and large maintained. What disappears are the specific genotypes, the combinations of gene pairs, which are distinctive to that given group. This is the same dynamic at the heart of the ‘disappearing blonde gene’ meme. Unless there is selection at the loci which encode or predispose one to blonde hair the ‘gene’ isn’t going anywhere. Rather, the implicit issue here is that blonde people are intermarrying with non-blonde people, and if the genetic variant has a recessive expression then the frequency of the trait will decrease. Populations with a high degree of homozygosity at the ‘blonde loci’ are distinctive in a very particular manner, but they’re no more or less ‘diverse’ than other populations which don’t manifest the same tendency.

A toy example will suffice. Take two populations, A and B, and one locus, 1, with two variants, X and x. Assume that the two populations are the same size. At locus 1 population A is 100% X, and population B is 100% x. In a diploid scenario then all the individuals in population A will be XX, and in B will be xx. When you add A + B you get a frequency of X of 0.5, and of x of 0.5 (since the two populations are balanced in size).


Now imagine a scenario where all individuals in population A pair up with someone in population B (assume sex balance in both populations). In the first generation, F1, all the offspring will be heterozygote Xx (hybrids). The frequency of X and x will be 0.5 still, as in the previous generation. But no individual now reflects the genotype of the parental populations, as all individuals are heterozygotes. At the level of alleles, specific genetic variants, you’ve got the same diversity (X and x at locus 1). But at the level of genotype there’s a huge shift. Two genotypes (XX and xx) no longer exist, but a novel one is now fixed in the population (Xx).

A novel combination

Finally, in the F2 generation, the offspring of F1, Hardy-Weinberg will reassert itself. 25% of the genotypes will be XX, 25% xx, and 50% Xx, due to p^2 + 2pq + q^2 = 1. In this scenario some of the genotypic distinctiveness of the parental and F1 generations is evident, but the diversity in the allelic sense remains the same as in the parental and F1 states, X = 0.5 and x = 0.5. Observe that if you’re looking at genotypic diversity the F2 generation is actually more diverse than the parental (because Xx is a different genotype). In other words, in some ways the aggregation of various distinct populations may increase diversity by generating novel combinations.
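The toy example can be checked with a few lines of Python. This is only a sketch under the assumptions already stated (one locus, two alleles, equal population sizes, random union of gametes); the function names `allele_freqs` and `random_mating` are made up for illustration and do not come from any library.

```python
from collections import Counter
from itertools import product


def allele_freqs(genotype_freqs):
    """Allele frequencies from a dict of diploid genotype -> frequency."""
    counts = Counter()
    for genotype, freq in genotype_freqs.items():
        for allele in genotype:
            # Each of the two gene copies contributes half the genotype's weight.
            counts[allele] += freq / 2
    return dict(counts)


def random_mating(genotype_freqs):
    """Offspring genotype frequencies under random union of gametes."""
    gametes = allele_freqs(genotype_freqs)
    offspring = Counter()
    for (a, fa), (b, fb) in product(gametes.items(), repeat=2):
        # 'Xx' and 'xX' are the same genotype, so sort the two alleles.
        genotype = "".join(sorted((a, b)))
        offspring[genotype] += fa * fb
    return dict(offspring)


parental = {"XX": 0.5, "xx": 0.5}  # populations A and B pooled
f1 = {"Xx": 1.0}                   # every A individual pairs with a B individual
f2 = random_mating(f1)             # F1 individuals mate at random

for name, generation in [("parental", parental), ("F1", f1), ("F2", f2)]:
    print(name, "genotypes:", generation, "alleles:", allele_freqs(generation))
```

The allele frequencies stay at 0.5 in all three generations, while the set of genotypes present goes from two (parental) to one (F1) to three (F2).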

This is not to deny that a very specific historically contingent form of diversity in terms of distinctness of particular groups is threatened today. That’s why it was important that the HGDP was overloaded with threatened groups like the Bushmen, Kalash, and Pygmies. These populations may be assimilated soon, and with that assimilation it will be more difficult to extract out historically very important information which will inform us about the human past.

But another issue is extinction instead of assimilation. Wouldn’t this eliminate a lot of genetic variation? Perhaps. I actually considered this issue a few years back with the Star Trek reboot. If you haven’t watched the film, there’s a major spoiler next. So basically on the order of ~10,000 Vulcans survived the destruction of their planet. Culturally the preservation was rather good, because the Vulcan elders, who are the repositories of the culture, were saved. In this way a fully fleshed Vulcan culture could easily reemerge out of the genocide. On the other hand, the vast majority of Vulcans died. Isn’t this population bottleneck a genetic catastrophe? It depends. If the Vulcans who survived are a relatively random assortment of the population genetically, then the disaster isn’t that bad in terms of genetic diversity.

To get some idea of why, consider the statistic of heterozygosity. This measures the extent of heterozygote states, where the two gene copies differ at a locus, across the population. It’s a proxy for genetic diversity, as more allelic diversity produces more heterozygosity.

The decay of heterozygosity over time due to random genetic drift (without mutation) can be modeled like so:

H_t = H_0 (1 - 1/(2N))^t

The variable “t” is simply the number of generations elapsed since an initial time. H_0 refers to the initial heterozygosity, and H_t is simply the value t generations out from that initial value. N is the effective population size. This formula can be used to model population bottlenecks. The Vulcan population reduction from one on the order of billions to 10,000 was basically a massive population bottleneck. The decrease in heterozygosity that you’d expect after one generation would be:

H_1 = H_0 (1 - 1/(2 × 10,000))^1

H_1 = 0.99995 of the initial value. Basically almost nothing. Why? Because 10,000 turns out to be a relatively large population. This makes some intuitive sense. If you have a sample size of 10,000, and it’s representative, sample variance isn’t going to be that high. If you have an infinite number of coin flips so that the ratio of heads and tails is 50:50, reducing that to 10,000 flips isn’t going to result in much of a deviation from 50:50.

Let’s look at the effect of population bottlenecks of 20 generations at various values of N. The x axis shows generation time, while the y axis illustrates the proportion of the initial heterozygosity which remains.
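The same curves can be tabulated directly from the drift formula. The sketch below is a minimal illustration, not a reconstruction of the original plot: the particular values of N (10, 100, 1,000, 10,000) are my own choice, and the helper name `remaining_heterozygosity` is hypothetical.

```python
def remaining_heterozygosity(n_effective, generations):
    """Proportion of initial heterozygosity left: H_t / H_0 = (1 - 1/(2N))^t."""
    return (1 - 1 / (2 * n_effective)) ** generations


# Illustrative effective population sizes; these are assumptions, not the
# values used in the original figure.
population_sizes = [10, 100, 1_000, 10_000]
checkpoints = [1, 5, 10, 20]  # generations spent at the bottleneck size

print("N".rjust(8), *[f"t={t}".rjust(9) for t in checkpoints])
for n in population_sizes:
    row = [f"{remaining_heterozygosity(n, t):9.5f}" for t in checkpoints]
    print(str(n).rjust(8), *row)
```

Only the smallest populations lose a large share of their heterozygosity within 20 generations; at N = 10,000 the loss over the same span is about a tenth of a percent.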

This is not to downplay the impact of bottlenecks and demographic stochasticity. Rather, it’s to suggest that population genetic diversity is relatively resistant to a crash in numbers. The extinction of small tribal groups is a tragedy, but genetically it may not be as much of a problem as we think. Even in groups such as the Bushmen with a great deal of genetic diversity it is likely that most of that diversity is already found within non-Bushmen populations.

Image credits: Ian Beatty and Lesley-Ann Brandt.


The Pith: The human X chromosome is subject to more pressure from natural selection, resulting in less genetic diversity. But, the differences in diversity of X chromosomes across human populations seem to be more a function of population history than differences in the power of natural selection across those populations.

In the past few years there has been a finding that the human X chromosome exhibits less genetic diversity than the non-sex regions of the genome, the autosome. Why? On the face of it this might seem inexplicable, but a few basic structural factors derived from the architecture of the human genome present themselves.

First, in males the X chromosome is hemizygous, rendering it more exposed to selection. This is rather straightforward once you move beyond the jargon. Human males have only one copy of genes which express on the X chromosome, because they have only one X chromosome. In contrast, females have two X chromosomes. This is the reason why sex linked traits in humans are disproportionately male. For genes on the X chromosome women can be carriers of many diseases because they have two copies of a gene, and one copy may be functional. In contrast, a male has only a functional or nonfunctional version of the gene, because he has one copy on the X chromosome. This is different from the case on the autosome, where both males and females have two copies of every gene.

This structural divergence matters for the selective dynamics operative upon the X chromosome vs. the autosome. On the autosome recessive traits pay far less of a cost in terms of fitness than they do on the X chromosome, because in the case of the latter they’re much more often exposed to natural selection via males. In the rest of the genome recessive traits only pay the cost of their shortcomings when they’re present as two copies in an individual, homozygotes. A simple quasi-formal example illustrates the process.

Imagine a population which has an allele which expresses recessively and has sharply reduced fitness when it is expressed. Assume that the allele in question, q, is present at a proportion of 0.50. All the other functional alleles are classed together as p, and are also 0.50. In the next generation the Hardy-Weinberg Equilibrium would entail that 75% of the individuals would not express the recessive trait, but 25% of the individuals would.* But for every copy of the deleterious allele which is expressed and so exposed to natural selection, there’s another copy of the deleterious allele which is “masked” in a heterozygous individual with one good copy, and so evades natural selection. As natural selection decreases the frequency of the deleterious allele fewer and fewer copies will be found in recessively expressed individuals, and so the power of selection to remove the allele will decrease as its own frequency declines. When the frequency of the deleterious allele is ~0.01, only about 1 out of 100 copies will be found in a homozygote exposed to natural selection. In this way genetic diversity of even deleterious alleles can be preserved as many low frequency recessively expressed variants.

The situation differs on the X chromosome. If the population consisted only of females then the model above would hold. The trait only expresses if a female has two copies of the faulty gene. But one out of every three X chromosomes in the typical human population is present in a male. That means that every deleterious allele on that X will bear its full cost if it happens to be in a male, a 1 out of 3 probability. So I calculate that when you have a situation where the deleterious allele is present as a fraction ~0.01 on the X chromosome about 1 out of 3 copies will be expressed, overwhelmingly in males. This is a roughly 34-fold difference between the X and autosome in terms of copies of a deleterious allele exposed to natural selection, all due to the hemizygosity of males.
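The bookkeeping behind the autosome versus X comparison can be written out explicitly. This is a rough sketch under simplifying assumptions that are mine: Hardy-Weinberg proportions, an equal sex ratio (so males carry 1/3 of all X chromosomes), and the same allele frequency q in both sexes. The function names are hypothetical. Under these assumptions the exposed fraction on the X sits a little above 1/3 for rare alleles, because the male share alone is already 1/3.

```python
def exposed_fraction_autosome(q):
    """Share of deleterious recessive copies sitting in homozygotes (autosome)."""
    homozygote_copies = 2 * q * q  # a qq individual (frequency q^2) carries 2 copies
    total_copies = 2 * q           # expected copies per diploid individual
    return homozygote_copies / total_copies  # simplifies to q


def exposed_fraction_x(q, male_share=1 / 3):
    """Share of deleterious recessive copies exposed to selection on the X.

    A copy in a male is always expressed (hemizygous); a copy in a female
    is expressed only if her other X carries the allele too (probability q).
    """
    female_share = 1 - male_share
    return male_share + female_share * q


for q in (0.5, 0.1, 0.01):
    auto = exposed_fraction_autosome(q)
    x_linked = exposed_fraction_x(q)
    print(f"q={q:<5} autosome={auto:.4f}  X={x_linked:.4f}  fold={x_linked / auto:.1f}")
```

The fold difference grows as the allele gets rarer, which is the point of the argument: on the autosome rare recessives hide almost entirely in heterozygotes, while on the X a fixed share is always exposed in males.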

But the effect of selection isn’t uniformly negative, the purification of bad gene copies from the population. Positive forces can also reduce diversity via a selective sweep. How and why this happens is rather straightforward. Imagine that you have a single base pair which fortuitously has a mutation which is very beneficial in a single individual. To make the expression simple imagine that it is dominant, and the individual is a heterozygote. The single individual who carries the favored mutation has a very large family because ~50 percent of their offspring also carry the favored mutation and are much more fit than the population average. And so on. This favored variant can spread very fast. Lactose tolerance is a good concrete case of this. When I say the favored variant spreads, I’m actually talking about one gene copy from one person which starts to increase in frequency because of its adaptive value. But recall that a single base pair is embedded within the genome, and that chromosomal regions are generally passed on together from parent to offspring. It’s quite often a package deal. When a favored allele emerges it enables the “hitchhiking” of nearby variants which have no selective advantage, except that they luckily exist next to a very adaptively beneficial allele (think of them as the gene’s “posse” or entourage). Of course genetic recombination breaks apart these associations over time, but this process takes generations. Until then what you see is the proliferation of a particular genomic segment along with the increase in frequency of the favored gene which is embedded in that particular region. By straightforward logic when a whole segment with associated alleles starts to increase in frequency aggregate genetic diversity decreases, as variation is swept aside.

And yet evolution is not simply natural selection. There are two processes which have nothing to do with selection as such which might reduce genetic variation. The motor which both these phenomena turn on is random genetic drift. As you increase the power of drift to fluctuate gene frequencies generation to generation you also increase its power to render alleles extinct as they are extinguished once they hit the zero frequency boundary condition. This is why populations which have gone through population bottlenecks are so homogeneous; drift has squeezed most of the variation out of the gene pool by capriciously favoring some alleles and eliminating most of the rest.

The dynamics relevant to this specific case are differences in male and female effective population size, and large fluctuations in long term effective population size. For purposes of reduced X chromosomal diversity one would have to posit lower female effective population size than male effective population size. The reason why this would impact the diversity of the X relative to the autosome is that the X spends 2/3 of its time in females, while the autosome spends only 1/2 of its time in each sex. So if females have lower effective population sizes than males the X chromosome is being buffeted by greater stochastic forces than the autosome. More generally, the X chromosome has a lower effective population size even assuming sex balance because for every 4 copies of an autosomal chromosome there are 3 X chromosomes. Because of this reduced effective population size the X would be more sensitive to bottlenecks and the like, one of the consequences of which is reduced genetic diversity.
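To put rough numbers on this, the sketch below uses the standard textbook expressions for effective population size with unequal numbers of breeding males and females: N_e = 4 N_m N_f / (N_m + N_f) for autosomes and N_e = 9 N_m N_f / (4 N_m + 2 N_f) for the X. These formulas are not given in the post; I am bringing them in as an assumption for illustration, and the breeding numbers are invented.

```python
def ne_autosome(n_males, n_females):
    """Autosomal effective size with sex-specific breeding numbers."""
    return 4 * n_males * n_females / (n_males + n_females)


def ne_x(n_males, n_females):
    """X-linked effective size with sex-specific breeding numbers."""
    return 9 * n_males * n_females / (4 * n_males + 2 * n_females)


# Hypothetical breeding populations of 1,000 with different sex ratios.
scenarios = {
    "sex balance": (500, 500),
    "fewer breeding females": (800, 200),
    "fewer breeding males": (200, 800),
}

for label, (n_males, n_females) in scenarios.items():
    ratio = ne_x(n_males, n_females) / ne_autosome(n_males, n_females)
    print(f"{label:<24} X/autosome N_e ratio = {ratio:.4f}")
```

With sex balance the ratio is the familiar 3/4. It drops below 3/4 when fewer females than males breed and rises above it in the opposite case, which matches the direction of the argument above.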

All the above is important to keep in mind when reading a new report in Nature Genetics on the balance between selection and drift in reducing variation on the X chromosome, both within and across populations. The cross-population comparison refers to the fact that Africans seem to exhibit less relative reduction of variation on the X chromosome than non-Africans. First, the paper’s abstract, from Analyses of X-linked and autosomal genetic variation in population-scale whole genome sequencing:

The ratio of genetic diversity on chromosome X to that on the autosomes is sensitive to both natural selection and demography. On the basis of whole-genome sequences of 69 females, we report that whereas this ratio increases with genetic distance from genes across populations, it is lower in Europeans than in West Africans independent of proximity to genes. This relative reduction is most parsimoniously explained by differences in demographic history without the need to invoke natural selection.

This research is part of the trend I’ve alluded to toward looking at whole-genome sequences. Remember, a lot of the 1 million SNP papers are focusing only upon genetic variants, polymorphisms, across the 3 billion base pairs. These variants are especially informative, but they miss a lot of the genome. Additionally there are some statistical problems with bias in the selection of the variants because they’re usually tuned toward one population, Europeans (different populations have somewhat different variants across the genome). The takeaway is that the time is now nearly here when we can look at the genome at its most precise and fine-grained scale, rather than using approximations, whether it be one locus, or 1 million SNPs.

With this broad canvas in mind, if there’s one thing you’ve read about the genome it’s that much of it is not functional. It doesn’t code. There are zones of the genome which are intergenic, between genes. Natural selection generally targets functional regions, not intergenic ones. If natural selection is the primary dynamic affecting the pattern we see here then differences should manifest between genic and intergenic regions since selection plays a much larger role in the former than the latter, both in constraining variation and increasing the frequency of favored alleles.

The figure below has four panels. Every panel has an x-axis defined by distance from a gene, left to right with increasing distance. So the leftmost point can be thought of as genic, and the rightmost point as intergenic. The left panels define Europeans, and the right panels Africans. More precisely they’re displaying results from whole-genome sequences of 36 West African Yoruba and 33 European American females. The top row shows the change in raw nucleotide diversities for autosomes and X, and the bottom row illustrates the change in ratio of diversity of the two genomic classes (X vs. autosome) as a function of distance.

In molecular evolutionary genetics it is often useful to assume that the null hypothesis is neutrality. Basically that means that selection is not a main effect in driving the variation. Instead it’s a function of random forces such as mutation and drift. When one sees deviation from neutrality then one considers the effect of natural selection and the possibility of adaptation. You see here clear evidence for natural selection. The genetic diversity on the X chromosome has a much stronger relationship to distance from genes than the autosome. This matters because as you recall the X chromosome is much more brutally sculpted by natural selection on a priori grounds because disfavored alleles would be pruned more efficiently, while recessively expressing favored alleles would be less handicapped by the fact that their favored traits often did not express when they were present (because they were suppressed when in heterozygote). The pattern above is entirely in keeping with that model.

So now we’ve seen that a closer whole-genome examination of these samples implies that the X vs. autosomal difference in diversity is not just a function of neutral forces, but may have been driven by natural selection. But there’s a second part of the phenomenon: the disjunction is usually more stark in non-Africans. If so, does this imply that non-Africans have been subject to more natural selection? The manner in which they explored this question was clean and elegant: they compared the ratios of ratios as a function of distance from genes. By this, I mean that they looked at the ratio of diversity between the X and the autosome, and then generated a second ratio from this value by comparing across Europeans and Africans. Unlike those above the figure to the left shows no differences as a function of distance from genes. What does this tell us? If natural selection was more efficacious in Europeans than Africans then the differences in diversity across these two populations should be stronger near genic regions, because that is where the power of selection is most felt. Instead, what you see is that though the difference between Europeans and Africans in the X-to-autosome ratio is real, it is consistent regardless of distance from genes.

This suggests that the difference between Africans and Europeans is driven by demographics and not adaptation (positive selection) or functional constraint (negative or purifying selection). Random evolutionary forces don’t see genic or intergenic regions. They’re random, and blind or neutral to functional import. Unlike selection their impact is going to be genome-wide, just as the inter-regional differences we see here are.

In this case what happened? Going back to the beginning there were two specific possibilities: sex-biased migration and greater fluctuation in effective population size among non-Africans. The latter model is entirely consistent with an “Out of Africa” scenario where non-Africans derive from a small ancestral population which left Africa. This is the great “Out of Africa” bottleneck which seems to be a consistent finding by human molecular evolutionists. Because the X chromosome has a somewhat smaller effective population it would presumably have been more impacted by the homogenizing force of this bottleneck.

The first option though is intriguing, if peculiar. What if there were multiple “Out of Africa” pulses which consisted disproportionately of groups of young males? This would have enriched the genetic diversity of non-Africans on the autosome far more than the X chromosome, because the males would bring only one X chromosome for every two autosomes. I think the “Out of Africa” model is more plausible, but I’m not going to dismiss this scenario out of hand. We live in interesting and strange times when it comes to the origin of modern humans.

* p^2 + 2pq + q^2 = 1 = 0.50^2 + 2(0.50)(0.50) + 0.50^2

Citation: Gottipati, Srikanth, Arbiza, Leonardo, Siepel, Adam, Clark, Andrew G, & Keinan, Alon (2011). Analyses of X-linked and autosomal genetic variation in population-scale whole genome sequencing. Nature Genetics. DOI: 10.1038/ng.877

Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"