The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog


The Pith: You are expected to have 30 new mutations which differentiate you from your parents. But there is wiggle room around this number, and you may have more or fewer. This number may vary across siblings, and may explain differences across siblings. Additionally, previously used estimates of mutation rates may have been too high by a factor of 2. This may push the “last common ancestor” of many human and human-related lineages back by a factor of 2 in terms of time.

There’s a new letter in Nature Genetics on de novo mutations in humans which is sending the headline writers in the press into a natural frenzy trying to “hook” the results into the X-Men franchise. I implicitly assume most people understand that they all have new genetic mutations specific and identifiable to them. The important issue in relation to “mutants” as commonly understood is that they have salient identifiable phenotypes, not that they have subtle genetic variants which are invisible to us. Another implicit aspect is that phenotypes are an accurate signal or representation of high underlying mutational load. In other words, if you can see that someone is weird in their traits, presumably they are rather strange in their underlying genetics. This is the logic behind models which assume that mutational load has correlates with intelligence or beauty, and these naturally tie back into evolutionary rationales for human aesthetic preferences (e.g., “good genes” models of sexual selection).

Variation in genome-wide mutation rates within and between human families:

J.B.S. Haldane proposed in 1947 that the male germline may be more mutagenic than the female germline…Diverse studies have supported Haldane’s contention of a higher average mutation rate in the male germline in a variety of mammals, including humans…Here we present, to our knowledge, the first direct comparative analysis of male and female germline mutation rates from the complete genome sequences of two parent-offspring trios. Through extensive validation, we identified 49 and 35 germline de novo mutations (DNMs) in two trio offspring, as well as 1,586 non-germline DNMs arising either somatically or in the cell lines from which the DNA was derived. Most strikingly, in one family, we observed that 92% of germline DNMs were from the paternal germline, whereas, in contrast, in the other family, 64% of DNMs were from the maternal germline. These observations suggest considerable variation in mutation rates within and between families.

From what I gather there’s a straightforward reason why the male germline, the genetic information which is transmitted by sperm to a male’s offspring, is more mutagenic: sperm are produced throughout your whole life, and over time replication errors creep in. This is in contrast to a female’s eggs, where the full complement are present at birth. The fact that mutations creep in through sperm is just a boundary condition of how mutations creep into the germline in the first place: errors in the DNA repair process. This is good on rare occasions (in that mutations may actually be fitness enhancing), more often bad (in that mutations are fitness detracting), and oftentimes neutral. Remember that in terms of function and fitness a large class of mutations don’t have any effect. Consider the fact that 1 out of 25 people of European descent carry a mutation which can cause cystic fibrosis if it manifests in a homozygote genotype. But the vast majority of cystic fibrosis mutations in the general population are present in people who are heterozygotes, and have a conventional functional gene which “masks” the deleterious allele.* And there are many mutations which are silent even in homozygote form (e.g., if there is a change in a base at a synonymous position).

As noted in the letter above, until recently estimating mutation rates was a matter of inference. On the broadest canvas one simply looked at differences between two related lineages which had been long separated (e.g., chimpanzee vs. human), and so accumulated many differential mutations, and assayed the differences. It may have been a fine-grained inference in the case of individuals who manifested a disease which exhibited a dominant expression pattern, so that one de novo mutation in the offspring could change the phenotype. For most humans this is thankfully not a major issue, and mutations remain cryptic for most of our lives. But no longer. With cheaper sequencing, at some point in the near future most of us will have accurate and precise copies of our genomes available to us, and we will be able to see exactly where we have unique mutations which differentiate us from our parents and our siblings.

In this paper the authors took two “trios,” parent-child triplets, and compared their patterns of genetic variation at the scale of the full genome to a very high level of accuracy. Accuracy obviously matters a great deal when you might be looking for de novo mutations which are going to be counted on the scale of hundreds when base pairs are counted in billions. In the future when we have billions and billions of genomes on file and omnipotent computational tools I suspect there will be all sorts of ways to ascertain “typicality” of regions of your genome, but in this paper the authors naturally compared the parents to the children. If a mutation is de novo it should be underivable from the genetic patterns of the parent. But, sequencing technologies are not perfect, so there’s going to be a high risk for false positives when you are looking for the de novo mutations “in the haystack” (e.g., an error in the read of the offspring can be picked up as a mutation).
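To make the "underivable from the parents" test concrete, here is a minimal sketch of a Mendelian consistency check on a trio. It is not the authors' pipeline (theirs involved extensive experimental validation); the function names, the genotype encoding, and the toy positions are all assumptions made up for illustration.

```python
from itertools import product

def mendelian_consistent(child, mother, father):
    """True if the child's genotype can be formed from one maternal and one paternal allele."""
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((m_allele, f_allele))) == child_sorted
        for m_allele, f_allele in product(mother, father)
    )

def candidate_dnm_sites(trio_genotypes):
    """Return positions where the child's genotype is not derivable from the parents."""
    return [
        pos
        for pos, (child, mother, father) in sorted(trio_genotypes.items())
        if not mendelian_consistent(child, mother, father)
    ]

# Toy, made-up genotypes keyed by a hypothetical position: (child, mother, father).
toy_trio = {
    101: (("A", "G"), ("A", "A"), ("G", "G")),  # A from mother, G from father
    202: (("C", "T"), ("C", "C"), ("C", "C")),  # T is in neither parent: candidate
    303: (("G", "G"), ("A", "G"), ("A", "G")),  # G from each parent
}

candidates = candidate_dnm_sites(toy_trio)
print(candidates)
```

A site flagged this way is only a candidate: a single sequencing error in the child, or a missed allele in a parent, produces exactly the same signature, which is why the false positive problem looms so large.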

So they started with ~3,000 candidate de novo mutations (DNMs) for each family trio after comparing the genomes of the trios, but narrowed it down further experimentally as they filtered out the false positives. You can read the gory details in the supplements, but it seems that they focused on the identified candidates to see if they were: germline DNMs, non-germline DNMs, variants inherited from the parents, or false positive calls. So it turns out that half of the preliminary DNMs were somatic and about 1% turned out to be germline. Remember that the difference is that the germline mutations are going to be passed on to one’s offspring, while the somatic mutations only have impact on one’s physiological fitness over one’s life history. For the purposes of evolution germline mutations are much more important, though over your lifetime somatic mutations are going to be very important as you age.

After the methodological heavy-lifting the results themselves are interesting, albeit of somewhat limited generalizability because you are focusing on two trios only. Before we examine the results here’s a figure which illustrates the study design:

From what I can gather there are two primary findings in this paper:

1) Variance in the sex-mediated nature of DNMs across trios. One of the trios was much closer to expectation: the male germline contribution was responsible for the vast majority of its DNMs.

2) A more precise estimate of human mutational rates which might have implications for “molecular clock” estimates used in evolutionary phylogenetics.

Here are the findings in a figure which shows the 95% confidence intervals around estimated mutation rates:

CEU refers to the sample of white Utah Mormons commonly used in medical genetics, while YRI refers to Yoruba from Nigeria. Remember, these are two families only. That severely limits the power of the insights which you can draw, but already you see that while the CEU trio shows the expected imbalance between male and female contribution to DNMs, the YRI trio does not. But both of the trios do suggest a lower mutation rate than found in previous studies which inferred the value from species divergence. Here is the portion which is relevant for human evolution: “These apparently discordant estimates can be largely reconciled if the age of the human-chimpanzee divergence is pushed back to 7 million years, as suggested by some interpretations of recent fossil finds.” I wouldn’t put my money on this quite yet, going by just this one study, but I’ve been hearing that this paper doesn’t come to this number in a scientific vacuum. Other researchers are converging upon a similar recalibration of mutational rates which might push back the time until the last common ancestor of many divergent hominoid and hominin lineages (including modern humans).
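The reason a lower mutation rate pushes dates back is simple arithmetic. Under a strict molecular clock, two lineages that split T years ago differ at roughly d = 2μT sites per base pair (both branches accumulate mutations), so T = d / (2μ). The sketch below only illustrates that scaling; the divergence value, the two rates, and the generation time are placeholder inputs of mine, not figures from the paper, and real dating also has to deal with ancestral polymorphism and changing generation times.

```python
def divergence_time_years(divergence_per_site, mu_per_site_per_generation, generation_years):
    """Strict-clock estimate T = d / (2 * mu_per_year); both lineages accumulate changes."""
    mu_per_year = mu_per_site_per_generation / generation_years
    return divergence_per_site / (2.0 * mu_per_year)

# Placeholder inputs, chosen only to show the scaling (not taken from the paper).
PLACEHOLDER_DIVERGENCE = 0.01      # per-site sequence divergence between two lineages
PLACEHOLDER_GENERATION_YEARS = 25.0
HIGHER_RATE = 2e-8                 # stand-in for an older, higher per-generation rate
LOWER_RATE = 1e-8                  # stand-in for a rate half as large

t_higher_rate = divergence_time_years(PLACEHOLDER_DIVERGENCE, HIGHER_RATE, PLACEHOLDER_GENERATION_YEARS)
t_lower_rate = divergence_time_years(PLACEHOLDER_DIVERGENCE, LOWER_RATE, PLACEHOLDER_GENERATION_YEARS)

# Halving the rate doubles the inferred time, whatever the placeholder values are.
print(t_lower_rate / t_higher_rate)
```

The ratio is what matters here: the same observed divergence, read with half the mutation rate, implies twice as long since the split.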

Moving the lens back to the present and of more personal genomic relevance:

Mutation is a random process and, as a result, considerable variation in the numbers of mutations is to be expected between contemporaneous gametes within an individual. If modeled as a Poisson process, the 95% confidence intervals on a mean of ~30 DNMs per gamete (as expected from a mutation rate of ~1 × 10^-8) ranges from 20 to 41, which is a twofold difference. Truncating selection might act to remove the most mutated gametes and thus reduce this variation among gametes that successfully reproduce, however, any additional heterogeneity in stem-cell ancestry or environment (for example, variation in the number of cell divisions leading to contemporaneous gametes) would likely increase inter-gamete variation in the number of mutations.
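The 20-to-41 range quoted above can be checked with a few lines of code. This is a minimal sketch using only the standard library: it takes the central 95% of a Poisson distribution by quantiles, which is one reasonable reading of the interval in the quote. The haploid genome size of 3 billion base pairs is my assumption for turning a per-site rate of ~1 × 10^-8 into ~30 mutations per gamete.

```python
import math

def poisson_quantile(mean, prob):
    """Smallest k such that P(X <= k) >= prob for X ~ Poisson(mean)."""
    pmf = math.exp(-mean)  # P(X = 0)
    cdf = pmf
    k = 0
    while cdf < prob:
        k += 1
        pmf *= mean / k    # P(X = k) from P(X = k - 1)
        cdf += pmf
    return k

MUTATION_RATE = 1e-8        # per site per generation, as quoted
HAPLOID_GENOME_BP = 3e9     # assumption: roughly 3 billion base pairs per gamete

expected_dnms = MUTATION_RATE * HAPLOID_GENOME_BP
lower = poisson_quantile(expected_dnms, 0.025)
upper = poisson_quantile(expected_dnms, 0.975)
print(round(expected_dnms), lower, upper)  # 30 20 41
```

So even with an identical underlying rate, two gametes from the same person can plausibly differ twofold in their count of new mutations.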

Using the much smaller marker set obtained from 23andMe I found that two of my siblings are nearly 3 standard deviations apart in identity-by-descent when it comes to the distribution of full-siblings. In the near future we might be able to ascertain the realized, not just theoretical, extent of mutational load across a family. As noted by the authors much of this might be a function of paternal age. Rupert Murdoch has children who are younger than many of his grandchildren, so there are many, many, “natural experiments” out there, as males are having offspring over 40 years apart.

On a societal level we may be able to estimate the exact public health cost of the rising mean age of fathers. Personally we may also be able to note the correlations within families between high levels of DNMs and traits of interest such as intelligence and beauty. Compared to more fine-grained tools of ancestry inference I presume this is going to be dynamite. But it isn’t as if we didn’t know siblings varied before.

Citation: Donald F Conrad, Jonathan E M Keebler, Mark A DePristo, Sarah J Lindsay, Yujun Zhang, Ferran Casals, Youssef Idaghdour, Chris L Hartl, Carlos Torroja, Kiran V Garimella, Martine Zilversmit, Reed Cartwright, Guy A Rouleau, Mark Daly, Eric A Stone, Matthew E Hurles, & Philip Awadalla (2011). Variation in genome-wide mutation rates within and between human families. Nature Genetics. DOI: 10.1038/ng.862

* In a random mating population the proportions are defined by the Hardy-Weinberg Equilibrium, p^2 + 2pq + q^2 = 1, so where q = 0.04, q^2 = 0.0016 and 2pq = 0.0768. Heterozygote genotypes of CF outnumber homozygote ones roughly 50 to 1.
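The footnote's arithmetic is easy to reproduce. The sketch below first uses q = 0.04 exactly as the footnote does. It then shows a second reading, which is my assumption rather than anything stated in the post: if "1 out of 25" is taken as the carrier (heterozygote) frequency instead of the allele frequency, you solve 2q(1 − q) = 0.04 for q, and the heterozygote excess comes out even larger. Function names are illustrative.

```python
import math

def hardy_weinberg(q):
    """Genotype proportions under Hardy-Weinberg for a deleterious allele at frequency q."""
    p = 1.0 - q
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}

def q_from_carrier_frequency(carrier_freq):
    """Solve 2q(1 - q) = carrier_freq for the smaller root (the rare allele)."""
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_freq)) / 2.0

# Reading 1: the footnote's own input, q = 0.04.
footnote = hardy_weinberg(0.04)
ratio_footnote = footnote["Aa"] / footnote["aa"]   # 0.0768 / 0.0016 = 48

# Reading 2 (assumption): 1 in 25 is the carrier frequency, so 2pq = 0.04.
q_alt = q_from_carrier_frequency(1.0 / 25.0)
alt = hardy_weinberg(q_alt)
ratio_alt = alt["Aa"] / alt["aa"]

print(round(ratio_footnote), round(q_alt, 4), round(ratio_alt))
```

Either way the qualitative point stands: for a rare recessive allele, carriers vastly outnumber affected homozygotes.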

Bloggy addendum: The first author of this letter is Don Conrad who is a contributor to Genomes Unzipped.

(Republished from Discover/GNXP by permission of author or representative)
 

Last summer I made a thoughtless and silly error in relation to a model of human population history when asked by a reader the question: “which population is most distantly related to Africans?” I contended that all non-African populations are equally distant. This is obviously wrong on the face of it if you look at any genetic distance measures. West Eurasians, even those without recent Sub-Saharan African admixture (e.g., North Europeans) are closer than East Eurasians, who are often closer than Oceanians and Amerindians. One explanation I offered is that these latter groups were subject to greater genetic drift through a series of population bottlenecks. In this framework the number of generations until the last common ancestor with Sub-Saharan Africans for all groups outside of Africa should be about the same, but due to evolutionary factors such as more extreme genetic drift or different selective pressures some non-African groups had diverged more from Africans than others in terms of their genetic state. In other words, the most genetically divergent groups in relation to Africans did not diverge any earlier, but simply diverged more rapidly.

Dienekes Pontikos disagreed with such a simple explanation. He argued that admixture or gene flow between Africans and non-African groups since the last common ancestor could explain the differences. I am now of the opinion that Dienekes may have been right. My own confidence in the “serial bottleneck” hypothesis as the primary explanation for the nature of relationships of the phylogenetic tree of human populations is shaky at best. Why my errors of inference?

There were two major issues at work in my misjudgments of the arc of the past and the topology of the present. In the latter instance I saw plenty of phylogenetic trees which illustrated clearly the variation in genetic distance from Africans for various non-African groups. Why didn’t I internalize those visual representations? It was I think the power of the “Out of Africa” (OoA) with replacement paradigm. Even by the summer of 2010 I had come to reject it in its strong form, due to the evidence of admixture with Neanderthals, and rumors of other events which were borne out to be true with the publishing of the Denisovan results. But to a first approximation the clean and simple OoA was still looming so large in my mind that I made the incorrect inference, whereby all non-Africans are viewed simply as a branch of Africans without any particular differentiation in relation to their ancestral population. Secondarily, I also was still impacted by the idea that most of the genetic variation you see in the world around us has its roots tens of thousands of years ago. By this, I mean that the phylogeographic patterns of 25,000 years in the past would map on well to the phylogeographic patterns of the present. This assumption is what drove a lot of phylogeography in the early aughts, because the chain of causation could be reversed, and inferences about the past were made from patterns of the present. My own confidence in this model had already been perturbed when I made my errors, but it still held some sort of sway in my head implicitly, I believe. It is one thing to move on from old models explicitly, but another thing to remove the furniture from your cognitive basement and attic.

I have moved further from my preconceptions between then and now. It took a while to sink in, but I’m getting there. A cognitive “paradigm shift” if you will. In particular I am more open to the idea of substantive back migration to Africa, as well as secondary migrations out of Africa. A new paper in Genome Research is out which adds some interesting details to this bigger discussion, and seems to weigh in further against my tentative hypothesis that serial bottlenecks and genetic drift can explain variation in distance to Africans of various non-African groups. Human population dispersal “Out of Africa” estimated from linkage disequilibrium and allele frequencies of SNPs:

Genetic and fossil evidence supports a single, recent (<200,000 yr) origin of modern Homo sapiens in Africa, followed by later population divergence and dispersal across the globe (the “Out of Africa” model). However, there is less agreement on the exact nature of this migration event and dispersal of populations relative to one another. We use the empirically observed genetic correlation structure (or linkage disequilibrium) between 242,000 genome-wide single nucleotide polymorphisms (SNPs) in 17 global populations to reconstruct two key parameters of human evolution: effective population size (N_e) and population divergence times (T). A linkage disequilibrium (LD)–based approach allows changes in human population size to be traced over time and reveals a substantial reduction in N_e accompanying the “Out of Africa” exodus as well as the dramatic re-expansion of non-Africans as they spread across the globe. Secondly, two parallel estimates of population divergence times provide clear evidence of population dispersal patterns “Out of Africa” and subsequent dispersal of proto-European and proto-East Asian populations. Estimates of divergence times between European–African and East Asian–African populations are inconsistent with its simplest manifestation: a single dispersal from the continent followed by a split into Western and Eastern Eurasian branches. Rather, population divergence times are consistent with substantial ancient gene flow to the proto-European population after its divergence with proto-East Asians, suggesting distinct, early dispersals of modern H. sapiens from Africa. We use simulated genetic polymorphism data to demonstrate the validity of our conclusions against alternative population demographic scenarios.

Here are the details. The authors use patterns of linkage disequilibrium (LD) to gauge divergence, time since divergence, and the effective population sizes of various groups. LD measures the correlations of genetic variations across loci. Because of the shuffling properties of recombination the correlation of markers across the genome should be relatively low. That is, they should be independent. But not in all cases. You could, for example, have two markers at two genes which are positioned physically close together. Now imagine a selective sweep event which increases the frequency of one of the variants through positive selection. Then the other marker on the second gene will also rise up in frequency by “hitchhiking” along on the other’s good fortune. Over time recombination will break apart these associations, but that decay of LD takes time. Importantly, it is not just natural selection which can generate these patterns within the genome. Population bottlenecks can drive the frequency of fragments of the genome up (and down) wildly because of the jacking up of “noise” in the generation-to-generation transmission of allele frequency values within a population. So LD can reflect both demographic events as well as bouts of adaptation.
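For readers who want the textbook quantities behind this, here is a minimal sketch of pairwise LD and its decay. It uses the standard two-locus definitions (D = p_AB − p_A·p_B, r² = D² / (p_A(1 − p_A)p_B(1 − p_B)), and D_t = D_0(1 − c)^t under recombination rate c per generation). It is not the estimator used in the paper, and the example frequencies and the recombination rate are made up.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r^2) from the AB haplotype frequency and the two allele frequencies."""
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, r_squared

def decayed_d(d_initial, recombination_rate, generations):
    """Expected D after recombination erodes it: D_t = D_0 * (1 - c)^t."""
    return d_initial * (1.0 - recombination_rate) ** generations

# Made-up example: allele A always travels with allele B (complete association).
d0, r2 = linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5)

# Hypothetical recombination rate of 1% per generation between the two markers.
d_after_100 = decayed_d(d0, recombination_rate=0.01, generations=100)
print(d0, r2, round(d_after_100, 4))
```

Tightly linked markers (small c) hold on to their association for a long time, while loosely linked ones lose it quickly, which is what lets LD at different genetic distances speak to different depths of the past.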

Another measure of genetic variation that the authors rely on is the fixation index (Fst). This ignores patterns of correlation across genes, and is a comparison of the variation of a given specific marker from population to population. High Fst values signal a lot of between-population differentiation. An Fst value of ~0 indicates almost no between-population differentiation. An extreme example would be a marker, 1, which is at frequency 0.5 in population A and population B, and a marker, 2, which is at frequency 0.0 and 1.0 in population A and B. Fst = 0.0 for marker 1, and 1.0 for marker 2. The Fst values in this paper are averaged across the genome, so obviously you’ll get values on the interval between 0 and 1, though it will usually be closer to 0 for any given marker (the average intercontinental human Fst value at a given marker is famously ~15%; ergo, the chestnut of wisdom that 85% of variation is within races, and 15% between).
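The two extreme markers above can be worked through directly. This is a minimal sketch using the heterozygosity form Fst = (H_T − H_S) / H_T for one biallelic marker in two populations, with equal population weights as a simplifying assumption; it is not necessarily the estimator the authors used, and the function name is illustrative.

```python
def fst_two_populations(freq_a, freq_b):
    """Fst = (H_T - H_S) / H_T for one biallelic marker, two equally weighted populations."""
    mean_freq = (freq_a + freq_b) / 2.0
    h_total = 2.0 * mean_freq * (1.0 - mean_freq)  # expected heterozygosity, pooled
    if h_total == 0.0:
        # Marker is fixed for the same allele everywhere; Fst is undefined,
        # and this sketch simply reports 0.0.
        return 0.0
    h_within = (2.0 * freq_a * (1.0 - freq_a) + 2.0 * freq_b * (1.0 - freq_b)) / 2.0
    return (h_total - h_within) / h_total

fst_marker_1 = fst_two_populations(0.5, 0.5)  # same frequency in A and B
fst_marker_2 = fst_two_populations(0.0, 1.0)  # fixed for different alleles in A and B
print(fst_marker_1, fst_marker_2)  # 0.0 1.0
```

Intermediate cases fall in between: the more the two populations' allele frequencies differ, the larger the share of total variation that sits between rather than within them.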

The chart at the top of the post shows the divergence times inferred from an Fst-based statistic and an LD-based statistic, above and below respectively. Two notable things to observe. First, the basic structure of both statistics is similar. Second, LD tends to give smaller values. The authors contend that LD is clearly an underestimate because it doesn’t take into account migration and fixation of allele frequencies, where one variant reaches 100% and so LD cannot be calculated.

An aspect of LD which is useful for the authors is that they could calculate effective population sizes over time for their disparate samples. Below is a plot which shows the variation over time. I’ve added some clarifying labels (you should recognize many of the abbreviations from the HapMap populations):

Some observations:

1) Africans have a relatively large breeding population from before to after the putative OoA event.

2) Non-Africans show the small ancestral population during the Pleistocene that you’d expect, rising very slowly if at all from the exit event from Africa across the Ice Ages.

3) Then ~10,000 years ago you start to see divergences. The Chinese crest to very large effective populations. The Tuscans are next in order. Then there’s a cluster of Northwest European groups. The Japanese are between the Tuscans and Northwest Europeans. Finally at the bottom you see Finns and Mexicans. This is not too surprising in terms of rank order. But here’s the interpretation from the paper of the European patterns:

…likely the consequence of bottlenecks associated with the depopulation and recolonization of Northern Europe before and after the last glacial maximum…growth accelerates moving forward in time, with the average rate about threefold higher in the period 8–5 KYA than 20–8 KYA, presumably representing the impact of agricultural innovations on population density.

Remember my point that it is problematic to back project contemporary variation to the past? I think this needs to be emphasized here. My own hunch is that the difference between the Finns and other Northwest Europeans has to do with the relatively late adoption of agriculture by the former, and the possibility that much of the genome of the latter is due to relatively late intrusions from southern and eastern Europe of explosively expanding agricultural groups. In other words, I’m not sure that aside from the Finns the recolonization after the LGM matters much at all.

Also, there’s one point I want to make sure to get to: the authors contend that the time until last divergence can’t be explained by a model of serial bottlenecks, as I had posited last summer. In other words, there have to be more complex dynamics at work here. They ran a bunch of simulations with constructed genomic sequences. Varying effective population size so that you have a bunch of serial bottlenecks was not enough to explain the difference between East Asians and West Eurasians when it came to time until last divergence from Sub-Saharan Africans. There has to be something more complex going on.

Speaking of complexity, I would also like to add that this paper reinforces the likelihood of a “pause” of the ur-non-African population after they left Africa. There’s a ~20,000 year gap between the time until the last common ancestor, and then the separation of West and East Eurasians. Several genomic analyses have pointed in this direction. I think the exact span of this interval is going to be debated, but I suspect that it is real. Additionally, the authors contend that the genetic closeness of West Eurasians to Sub-Saharan Africans may point to an ancient second migration out of Africa.

First, let’s walk back to where we started. Here was the rough “cartoon” model of the origin of modern H. sapiens sapiens circa 2009:

1) 50-100 thousand years ago you have a huge number of hominin groups across Africa and Eurasia.

2) At some point within this interval of time a small population of East Africans began to rapidly expand in population. They replaced in totality all other hominins, within, and outside of, Africa.

3) Therefore the inference can be made that all human beings alive today are descended from one tribe of East Africans.

At this point we can probably reject this model as being the full story. There is now suggestive evidence that the population fluctuations of Africans have been far more modest than those of non-Africans over the past 100,000 years. We also have to confront the likelihood of multiple admixture events with those “Other” hominins outside of, and possibly within, Africa. Finally, we can’t reject back migration events as well as multiple Out of Africa pulses.

I believe that the pattern of genetic variation across the whole world, including within Africa, has re-ordered itself radically over the past 10,000 years. We need to stop, and take a breath. If we know so little about the past 10,000 years, how much can we confidently infer about the past 100,000 years? Only a few points I suspect. For now.

Related: See Dienekes’ comment as well.

Citation: McEvoy BP, Powell JE, Goddard ME, & Visscher PM (2011). Human population dispersal “Out of Africa” estimated from linkage disequilibrium and allele frequencies of SNPs. Genome Research. PMID: 21518737

(Republished from Discover/GNXP by permission of author or representative)
 
Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at http://www.razib.com"