Though the original mathematical theoreticians of evolution, in particular R. A. Fisher and Sewall Wright, were critical in the formation of the Modern Neo-Darwinian Synthesis, their formal frameworks were not without critics from within the mainstream. Ernst W. Mayr famously rejected “beanbag genetics,” the view propounded specifically by R. A. Fisher and J. B. S. Haldane in England that a model of evolution could be constructed from singular genetic elements operating independently upon traits. Mayr, as an ecologist and naturalist, believed that this framework lacked the essential integrative or holistic aspect of biology as it manifested in the real world. Selection, after all, operated proximately on the fitness of the whole organism.
We’ve come a long way since those debates. One of the problems with the earlier disputes is that they were not sufficiently informed by the empirical evidence because of the primitive nature of experimental and observational evolutionary biology. Molecular biology changed that, and now the rise of genomics has also become a game changer. Genomics gets at the concrete embodiment of evolutionary change at its root, the structure and variation of the genomes of organisms.
A new paper in PNAS is a nice “mash-up” of the old and the new, Genomic patterns of pleiotropy and the evolution of complexity:
Pleiotropy refers to the phenomenon of a single mutation or gene affecting multiple distinct phenotypic traits and has broad implications in many areas of biology. Due to its central importance, pleiotropy has also been extensively modeled, albeit with virtually no empirical basis. Analyzing phenotypes of large numbers of yeast, nematode, and mouse mutants, we here describe the genomic patterns of pleiotropy. We show that the fraction of traits altered appreciably by the deletion of a gene is minute for most genes and the gene–trait relationship is highly modular. The standardized size of the phenotypic effect of a gene on a trait is approximately normally distributed with variable SDs for different genes, which gives rise to the surprising observation of a larger per-trait effect for genes affecting more traits. This scaling property counteracts the pleiotropy-associated reduction in adaptation rate (i.e., the “cost of complexity”) in a nonlinear fashion, resulting in the highest adaptation rate for organisms of intermediate complexity rather than low complexity. Intriguingly, the observed scaling exponent falls in a narrow range that maximizes the optimal complexity. Together, the genome-wide observations of overall low pleiotropy, high modularity, and larger per-trait effects from genes of higher pleiotropy necessitate major revisions of theoretical models of pleiotropy and suggest that pleiotropy has not only allowed but also promoted the evolution of complexity.
The basic thrust of this paper is to test older theoretical models of evolutionary genetics, and their dependence on pleiotropy, against new genomic data sets. In The Genetical Theory of Natural Selection R. A. Fisher proposed a model whereby all mutations affect every trait, and the effect size of the mutations exhibited a uniform distribution. Following in Fisher’s wake the evolutionary geneticist H. Allen Orr published a paper ten years ago, Adaptation and the cost of complexity, which argued that “…the rate of adaptation declines at least as fast as n⁻¹, where n is the number of independent characters or dimensions comprising an organism.” This is the “cost of complexity,” which lies at the heart of this paper in PNAS.
To explore these questions empirically the authors looked at five data sets:
- Yeast morphological pleiotropy: based on the measures of 279 morphological traits in haploid wild-type cells and 4,718 haploid mutant strains that each lack a different nonessential gene (this also yielded quantitative measures)
- Yeast environmental pleiotropy: based on the growth rates of the same collection of yeast mutants relative to the wild type in 22 different environments
- Yeast physiological pleiotropy: based on 120 literature-curated physiological functions of genes recorded in the Comprehensive Yeast Genome Database (CYGD)
- Nematode pleiotropy: based on the phenotypes of 44 early embryogenesis traits in C. elegans treated with genome-wide RNA-mediated interference
- Mouse pleiotropy: based on the phenotypes of 308 morphological and physiological traits in gene-knockout mice recorded in Mouse Genome Informatics (MGI)
The first figure shows the results of the survey. For each data set you see the mean and median number of traits affected by mutations in a given gene, as well as the distribution of effects. Two conclusions are immediately evident: 1) most genes have a relationship to only a small number of traits, and 2) very few genes have a relationship to many traits. You also see that the percentage of traits affected by a typical gene is rather small. This seems to immediately take off the table the simplifying assumption that a mutant variant produces changes across the full range of traits in a complex organism. Additionally, the effects do not seem to exhibit a uniform distribution; rather, they’re skewed toward genes which are minimally or trivially pleiotropic. From the text:
Our genome-wide results echo recent small-scale observations from fish and mouse quantitative trait locus (QTL) studies…and an inference from protein sequence evolution…and reveal a general pattern of low pleiotropy in eukaryotes, which is in sharp contrast to some commonly used theoretical models…that assume universal pleiotropy (i.e., every gene affects every trait).
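The summary statistics in that first figure are easy to reproduce in spirit once you have a table of which traits each gene deletion alters. Here is a minimal sketch in Python. The gene and trait names are hypothetical toy data, not taken from the paper, and the paper's own counting rules (for example, how genes with no detected effect are handled) may differ:

```python
from statistics import mean, median

# Hypothetical toy data: gene -> set of traits appreciably altered by its deletion.
gene_traits = {
    "geneA": {"cell_size"},
    "geneB": {"cell_size", "bud_angle"},
    "geneC": set(),
    "geneD": {"growth_heat"},
    "geneE": {"cell_size", "bud_angle", "growth_heat", "growth_salt"},
}
# Every trait that shows up at least once in the toy table.
all_traits = set().union(*gene_traits.values())

def pleiotropy_summary(table, n_traits):
    """Mean and median number of traits per gene, plus the mean fraction of traits affected."""
    degrees = [len(traits) for traits in table.values()]
    return {
        "mean_degree": mean(degrees),
        "median_degree": median(degrees),
        "mean_fraction": mean(degrees) / n_traits,
    }

summary = pleiotropy_summary(gene_traits, len(all_traits))
print(summary)
```

Universal pleiotropy would put the mean fraction at 1.0 for every gene; the paper's point is that the real numbers sit far below that.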
So if the theoretical models are wrong, what’s right? In this paper the authors argue that pleiotropy seems to have a modular structure. That is, mutations tend to have impacts across sets of correlated traits, not across a random distribution of traits. This is important when we consider the fitness implications of mutations, for if the impacts were not modular but randomly distributed, the putative genetic correlations would more likely serve as dampeners on directional change in trait value.
Figure 2 shows the high degree of modularity in their data sets:
Now that we’ve established that mutations tend to have clustered effects, what about their distribution? Fisher’s original model postulated a uniform distribution. The first data set, the morphological characteristics of baker’s yeast, had quantitative metrics. Using the results from 279 morphological traits they rejected the assumption of a uniform distribution. In fact the distribution was closer to normal, with a central tendency and a variance about the mode. Second, they found that standard deviations of effect sizes varied quite a bit as well. Many statistical models assume invariant standard deviations, so it is not surprising that that was the initial assumption, but I doubt many will be that surprised that the assumption turns out not to be valid. The question is: does this matter?
Yes. Within the parameter space being explored one can calculate distances, which we can use to measure the total effect of a mutation. Panels C to F show the distances as a function of the degree of pleiotropy. The left panels are Euclidean distances while the right panels are Manhattan distances. The first two panels show the outcomes from the parameter values generated from their data sets. The second two panels use randomly generated effect sizes assuming a normal distribution. The last two panels use randomly generated effect sizes and assume a constant standard deviation (as opposed to the empirical distribution of standard deviations, which varied).
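To make the two distance measures concrete, here is a small Python sketch of the constant-standard-deviation case. It is an illustration under my own assumptions (effects drawn from a normal distribution with mean 0 and a fixed SD of 1, and 2,000 simulated genes per point), not the authors' simulation code:

```python
import math
import random

def euclidean_total_effect(effects):
    """Straight-line distance from the origin in trait space."""
    return math.sqrt(sum(e * e for e in effects))

def manhattan_total_effect(effects):
    """Sum of absolute per-trait effects."""
    return sum(abs(e) for e in effects)

def simulate_mean_total_effect(n_traits, sd=1.0, n_genes=2000, seed=0):
    """Average Euclidean and Manhattan total effect over simulated genes that
    each affect n_traits traits, with effects drawn from Normal(0, sd)."""
    rng = random.Random(seed)
    euc, man = 0.0, 0.0
    for _ in range(n_genes):
        effects = [rng.gauss(0.0, sd) for _ in range(n_traits)]
        euc += euclidean_total_effect(effects)
        man += manhattan_total_effect(effects)
    return euc / n_genes, man / n_genes

for n in (1, 4, 16, 64):
    e, m = simulate_mean_total_effect(n)
    print(n, round(e, 2), round(m, 2))
```

With a fixed SD the Euclidean total grows roughly as the square root of the number of traits and the Manhattan total grows roughly linearly, which is the baseline the empirical panels are compared against.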
To connect these empirical results back to the theoretical models: there are particular scaling parameters whose values the earlier models assumed, but which can now be calculated from the real data sets. It turns out that the empirical values differ rather significantly from the assumed ones, and this changes the inferences one draws from the theoretical models. The empirically calculated value is b = 0.612, where b is an exponent on the right-hand side of the equation which generates the distances within the parameter space. From the text: “the invariant total effect model…assumes a constant total effect size (b = 0), whereas the Euclidean superposition model…assumes a constant effect size per affected trait (b = 0.5).” Instead of looking at the numerical value, note what each value means verbally. What they found in the empirical data was that the effect size per affected trait is not constant. The authors found larger per-trait effects for genes affecting more traits, and this seems to be a function of the fact that b > 0.5, given a normal distribution of effect sizes and variance in the standard deviation of effect sizes.
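If you want to see what the exponent is doing, here is a hedged Python sketch. It assumes the total effect T follows a power law in the number of affected traits n, T = a * n**b, which is my reading of the quoted passage rather than a formula reproduced from the paper, and it estimates b with an ordinary least-squares line on log-log axes (not necessarily the authors' fitting procedure). The data points are synthetic:

```python
import math

def fit_scaling_exponent(n_values, total_effects):
    """Least-squares fit of log(T) on log(n); returns (a, b) for T = a * n**b."""
    xs = [math.log(n) for n in n_values]
    ys = [math.log(t) for t in total_effects]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b

def per_trait_effect(n, a, b):
    """Per-trait effect implied by T = a * n**b under Euclidean distance.
    n equal effects of size s give T = s * sqrt(n), so s = T / sqrt(n)."""
    return a * n ** b / math.sqrt(n)

ns = [1, 2, 4, 8, 16, 32]
# Synthetic points generated with the exponent reported in the post (b = 0.612).
synthetic = [1.0 * n ** 0.612 for n in ns]
a, b = fit_scaling_exponent(ns, synthetic)
print(round(a, 3), round(b, 3))

for exponent in (0.0, 0.5, 0.612):
    print(exponent, [round(per_trait_effect(n, 1.0, exponent), 3) for n in (1, 16)])
```

Under this reading, b = 0 makes the per-trait effect shrink as n grows, b = 0.5 holds it constant, and anything above 0.5 makes it grow, which is the verbal distinction in the quote.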
This all leads us back to the big picture question: is there a cost of complexity? Substituting the real parameters back into the theoretical framework originated by Fisher, and extended by H. Allen Orr and others, they find that the cost of complexity disappears. Mutations do not affect all traits, so more complex organisms are not disproportionately impacted by pleiotropic mutations. Not only that, the modularity of pleiotropy likely decreases the risk of opposing fitness implications due to a mutation, since similar traits are more likely to be similarly affected in fitness. These insights are summarized in the last figure:
The one to really focus on is panel A. As you can see there is a sweet spot in complexity when it comes to the rate of adaptation. Contra earlier models, there isn’t a monotonic decrease in the rate of adaptation as a function of complexity, but rather an increase up to an equipoise before a subsequent decrease, at least within the empirically validated range of the scaling exponent. This is important because we see complex organisms all around us. When theory is at variance with the observational reality we are left to wonder what the utility of theory is (here’s looking at you, economists!). By plugging empirical results back into the theory we now have a richer and more robust model. I will let the authors finish:
First, the generally low pleiotropy means that even mutations in organisms as complex as mammals do not normally affect many traits simultaneously. Second, high modularity reduces the probability that a random mutation is deleterious, because the mutation is likely to affect a set of related traits in the same direction rather than a set of unrelated traits in random directions…These two properties substantially lower the effective complexity of an organism. Third, the greater per-trait effect size for more pleiotropic mutations (i.e., b > 0.5) causes a greater probability of fixation and a larger amount of fitness gain when a beneficial mutation occurs in a more complex organism than in a less complex organism. These effects, counteracting lower frequencies of beneficial mutations in more complex organisms…result in intermediate levels of effective complexity having the highest rate of adaptation. Together, they explain why complex organisms could have evolved despite the cost of complexity. Because organisms of intermediate levels of effective complexity have greater adaptation rates than organisms of low levels of effective complexity due to the scaling property of pleiotropy, pleiotropy may have promoted the evolution of complexity. Whether the intriguing finding that the empirically observed scaling exponent b falls in a narrow range that offers the maximal optimal complexity is the result of natural selection for evolvability or a by-product of other evolutionary processes…requires further exploration.
Citation: Wang Z, Liao BY, & Zhang J (2010). Genomic patterns of pleiotropy and the evolution of complexity. Proceedings of the National Academy of Sciences of the United States of America. PMID: 20876104
Image credit: Moussa Direct Ltd., http://evolutionarysystemsbiology.org