I just came across another argument for why the regulatory changes vs. protein coding changes argument is inane: sometimes protein-coding changes are regulatory changes. Ok, maybe RPM made that point in the comments on that post I linked, but here’s a great example, from a recent paper:
The authors looked for local regulatory variation in a number of genes, and found one instance where the putative regulatory variant mapped to a protein-coding SNP inside the gene. On a little further study, they found the story goes like this– the gene itself (AMN1) is a regulator of two other genes (DSE1 and DSE2) in a network, and those genes, in turn, regulate AMN1. The coding change in the gene keeps it from playing its proper role in the network, so DSE1 and DSE2 are upregulated and, in turn, up-regulate AMN1. I’m sure there’s an easier way to explain this, but the take-home message is that a protein-coding change in AMN1 leads, indirectly, to its own regulation.
So the genetics underlying gene expression can be rather complex. And just think, it’s networks of interacting genes that lead to phenotypes–the complexity is rather daunting, and I feel like first understanding gene expression is certainly a great first step towards getting at phenotypic complexity itself (there’s another great first step, but no one seems to have taken it…yet). For those who simply must know more, here’s a great review of the current knowledge on the genetics of gene expression.
Adam Gopnik has an excellent essay in the most recent New Yorker arguing that Charles Darwin, while cultivating an image of an unassuming naturalist forced, by the facts, to reluctantly revolutionize modern thought, was really, well, a Darwinian fundamentalist. The article isn’t available online, but it’s worth a little effort to find. A couple excerpts below the fold.
Darwin’s strategy was one of the greatest successes in the history of rhetoric, so much that we are scarcely aware that it was a strategy. His pose of open-mindedness and ostentatiously asserted country virtue made him, in his way, as unassailable as George Washington. The notion persists to this day that Darwin was a circumspect observer of animals, not a confident theorist of life.
Darwin was humble and modest in exactly the way that Inspector Columbo is. He knows from the beginning who the guilty party is, and what the truth is, and would rather let the bad guys hang themselves out of arrogance and overconfidence, while he walks around in his raincoat, scratching his head and saying, “Oh, yeah–just one more thing about that six-thousand year old Earth, Reverend Snodgrass…” Darwin was a civil and courteous man, but he was also what is now polemically called a Darwinian fundamentalist. He knew that he was right, and that his being right meant that much else people wanted to believe was wrong. Design was just chance plus time, greed not a sin from the Devil but an inheritance from monkeys. “Our descent, then, is the origin of our evil passions!!” he wrote in his notebooks. “The Devil under form of Baboon is our grandfather!”
You don’t achieve a triumph of this kind without knowing what you’re doing, and Darwin was a cagey man when it came to carrying his day. He was pleased to let other men, particularly his great friend and champion T. H. Huxley, do the dirty work of polemics. Throughout thirty years of friendship, he and Huxley played, knowingly, a kind of good-cop, bad-cop game in public. Their correspondence shows that each knew his given role–when Darwin at last was put forward for an honorary degree at Oxford by the reactionary Lord Salisbury, it was with the severe corollary that Huxley could not get the same. Huxley and Darwin, sharing the same basic views, enjoyed the joke. When Huxley had his famous debate with Bishop Wilberforce, Darwin kept silent, safe in the country, but wrote to his defender, “How durst you attack a live Bishop in that fashion? I am quite ashamed of you! Have you no reverence for fine lawn sleeves?”
The twin concordance rate in autism is something like 90% for monozygotic twins and 5% for dizygotic twins. This suggests a significant role for genetics in the development of autism. A new study identifies one of the possible genetic contributors, a regulatory change in the MET gene. MET signaling is important in a number of tissues, which the authors suggest is support for this being a valid finding. I’m pretty convinced–there’s a replication study, and the authors don’t do any funny things with statistics to make associations appear where they shouldn’t.
Further, the variant they identify is in the promoter of the gene, and they show that, in vitro, it leads to reduced transcription factor binding and reduced transcription. This, of course, is the most persuasive evidence.
Given the hypotheses that posit epigenetic modifications in autism, it would be interesting to see what the methylation patterns are like in the region. In fact, the SNP is a C–>G change, which creates a CpG dinucleotide, a possible target for methylation.
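To make that last point concrete, here is a minimal sketch of how a C-to-G change can create a CpG site when the base just upstream is a C. The flanking sequence and the SNP position are invented for illustration; they are not the actual MET promoter sequence.

```python
def count_cpg(seq: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G) on one strand."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


# Hypothetical flanking sequence (an assumption, NOT the real MET promoter).
ancestral = "ATTCCAGT"
snp_index = 4  # position of the second C in "CC"

# Apply the C->G substitution at the SNP position.
derived = ancestral[:snp_index] + "G" + ancestral[snp_index + 1:]

print(ancestral, count_cpg(ancestral))
print(derived, count_cpg(derived))
```

In this toy sequence the ancestral allele has no CpG and the derived allele has one, which is the kind of site methylation could target.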
According to the latest estimates, the common ancestor of humans and chimpanzees lived sometime around 5 million years ago. Since then, a lot has happened. Presumably, there has been plenty of change along the lineage leading to chimpanzees, but let’s be honest– from our point of view, a lot more has happened along our lineage. We have Shakespeare, Klimt, and the International Space Station; they jab sticks into anthills. The biological basis for this disparity is certainly a worthy line of inquiry. As a first step down this path, I assume most readers here will agree with me that invoking the hand of God, while awfully tempting, would be laughable. So let’s turn our focus to a more profitable enterprise: genetics.
In the 1970s, the technology was in such a state that the proteins in the blood of both humans and chimps could be compared at some rough level. And as it turns out, we aren’t that different after all. Or rather, we’re less different than people expected to find, given that, as humans, we consider ourselves the Greatest. Species. Ever. So if our blood proteins aren’t that different, what makes us human?
One possible answer was provided in 1975, in an influential paper by Mary-Claire King and Allan Wilson. In their own words:
We suggest that evolutionary changes in anatomy and way of life are more often based on changes in the mechanisms controlling the expression of genes than on sequence changes in proteins. We therefore propose that regulatory mutations account for the major biological differences between humans and chimpanzees
Of course, this hypothesis was based on very little data (as Bruce Lahn pointed out in his interview) — the fact that human red blood cells and chimp red blood cells are almost the same doesn’t really tell us that much. But at the time, people were apparently expecting radical differences between human and chimp proteins across the board, so this paper shifted some paradigms.
Now, 30 years later, it’s possible to compare the genome sequences of the two species, and microarray technology makes it possible to compare gene expression levels as well. So are we close to settling this issue? My answer: hell no. And here’s why:
Here’s the question we’re supposed to answer: which are more important– protein-coding changes or regulatory changes? And here’s the problem with that question: how do you define important? Let’s make a list of the ways humans differ from chimpanzees– we walk on two feet, we have bigger brains, we have less hair, etc. etc. You can add your own if you like. If a protein-coding change gives us the bigger brain, but a regulatory change the lack of hair, who wins? Sure, you could argue about which trait contributes more to some notion of “human-ness”, but frankly, who gives a shit? Both are pretty important.
And in reality, any of those traits is likely to be influenced by a number of factors. In the developmental networks that have evolved separately in the last 5 million years, the components of the networks (protein-coding changes) as well as the relationships between those components (regulatory changes) are both likely to have changed. So no single mutation is going to be the mutation. I feel like many people are under the impression there’s going to be the equivalent of an SRY gene that easily discriminates between chimps and humans. Ain’t gonna happen.
So let’s say someone is a true partisan on one side of this debate and wants to settle it once and for all. What would be necessary? Here’s my list:
1. A catalogue of all the phenotypic differences between humans and chimps.
2. A list of all the genetic changes underlying these differences (classified as coding and regulatory, of course), and a weight assigned to each change according to its relative importance in the generation of the phenotype.
3. An objective measure of “human-ness” that assigns relative importance to each of the phenotypic differences.
And there you go. For the person that does this, I will buy a cold beer.
 King and Wilson also propose that point mutations may be less important than rearrangements like translocations and inversions in human evolution, but they present no data on this and no one really remembers this hypothesis anymore.
 In sex determination in humans, there’s a single gene, SRY, that pretty much determines who’s a male and who’s a female (with, of course, the necessary caveats). In this case, it’s clear– take a female, add this gene, and you pretty much get a male. This protein-coding difference is arguably the most important difference between males and females.
It’s easy to make fun of studies like these, but hey, sometimes once you measure a phenomenon you find something unexpected. Not in this case, however:
Among men, heavy alcohol use was associated with higher odds of all risky sex outcomes examined, including unprotected sex (AOR = 3.48; 95% confidence interval [CI], 1.65 to 7.32), multiple partners (AOR = 3.08; 95% CI, 1.95 to 4.87), and paying for sex (AOR = 3.65; 95% CI, 2.58 to 12.37). Similarly, among women, heavy alcohol consumption was associated with higher odds of unprotected sex (AOR = 3.28; 95% CI, 1.71 to 6.28), multiple partners (AOR = 3.05; 95% CI, 1.83 to 5.07), and selling sex (AOR = 8.50; 95% CI, 3.41 to 21.18). A dose-response relationship was seen between alcohol use and risky sexual behaviors, with moderate drinkers at lower risk than both problem and heavy drinkers.
Bruce Lahn is a Professor of Human Genetics at the University of Chicago as well as an Investigator at the Howard Hughes Medical Institute. In 2004, he was on the “Top 40 Under 40” list by Crain’s Chicago Business. Specifics of his research can be found on his faculty page. Our 10 questions are in bold below the fold.
1. One of the major trends in hominid evolution has been increasing brain size, with the somewhat confusing caveat that modern humans break that trend, with smaller brains than both Neanderthals and some earlier hominids. Many hypotheses have been proposed to explain this, from sexual selection for intelligence to selection pressures from culture. Do you have a favorite hypothesis? What evidence do you think could settle this issue?
Brain size is just a proxy for cognitive abilities. This proxy is very robust over long evolutionary periods (millions of years). But on a short time scale, fluctuation in brain size may not correlate well with cognitive abilities. Within humans, for example, brain size is only weakly correlated with cognitive test scores such as IQ (only about 15% of the variation in IQ can be explained by difference in brain size). Given this, perhaps we should not make too much out of the cognitive significance of brain size changes on a short time scale.
2. Your work on genes involved in human brain evolution (i.e. ASPM and microcephalin) has focused on amino acid changes. It has been hypothesized that most of the differences between humans and chimps are due to regulatory changes. Do you feel this is still a viable hypothesis? Do you consider your work a challenge to this hypothesis?
The hypothesis that most human-chimp differences are due to regulatory changes is proposed in the absence of any data. So, I don’t place too much weight on this hypothesis to begin with. Nevertheless, I acknowledge that this hypothesis has influenced the thinking of many people. Our work showed that coding region evolution is likely to be important for human brain evolution. In this regard, it can be considered to be a challenge to the hypothesis. However, our work by no means argues that regulatory changes are necessarily less important than coding changes. So, the jury is still out.
3. The aforementioned work on microcephalin and ASPM touched some nerves, due mostly to two issues: the difference in frequency of the derived haplotype in different populations, and co-incidence of major moments in human cultural evolution with the appearance of these derived haplotypes. Do you regret anything you wrote in either of those papers?
On the one hand, I don’t regret the things we wrote in the papers because they were scientifically justified and the speculative nature of some of our statements was clearly indicated as such. On the other hand, I can appreciate why some people might be concerned over the possibility that our results could be over-interpreted or even mis-interpreted to advance certain ideas about race and ethnicity, especially by people with certain political agenda. Our society, given its sordid history on race-related issues, is very confused about how to deal with racially and ethnically sensitive topics. As a result, science and politics get mixed up when they relate to these topics. I personally feel, like many other scientists, that science should be separate from politics. In particular, science should meet the same burden of proof regardless of what political implications it might have. But this may be too idealistic if not naive. I feel I am still learning how to handle such issues in a way that is honest to the science while at the same time sensitive and respectful to political and cultural needs.
4. You’ve speculated that humans will, at some time in the future, speciate. The evidence for clinal speciation in other taxa certainly supports this possibility. One possible counterargument is that germ-line genetic engineering or even pre-implantation genetic screening could lead to the human population becoming more homogenized, preventing the evolution of barriers to gene flow. What role do you see for technology in the future of human evolution?
I think we as a species now stand at a watershed moment in the history of life. For billions of years, evolution of life forms has been governed by the Darwinian process of random mutations followed by selection. Now, we are about to revise that principle dramatically by genetic engineering. Instead of starting with random mutations, of which only very few are advantageous, we can now prospectively change our genome (and the genomes of other species) in ways we intend. In a sense, genetic engineering will make Lamarckian evolution a reality. Given the revolutionary nature of this new technology, it is impossible to predict where the technology will take us into the future. But suffice it to say that genetic engineering, coupled with other technologies such as pre-implantation genetic screening, would likely speed up evolution enormously, and create life forms, including those derived from our own species, in ways that the Darwinian process can never hope to accomplish.
5. You’ve published a paper noting a correlation between mutation rate and the ratio of nonsynonymous to synonymous mutations in a gene. This ratio forms the basis for many tests for selection. What’s the best way to interpret such a test? You do much molecular work– how can one decide, using both statistical and molecular evidence, that the story for selection on a locus has been decided one way or another?
It is still debated among experts as to how to interpret the ratio of nonsynonymous to synonymous substitutions. The major difficulty arises from the fact that both positive selection and relaxed constraint produce a high ratio. When a gene has a low ratio, one can argue that it has evolved predominantly under purifying selection. But when a gene has a high ratio, it is not clear whether it is due to strong positive selection, or relaxed constraint, or a bit of both. So, unless the ratio is very much greater than 1, it is not possible to conclude what a high ratio means. This is where other statistical and molecular evidence is needed. There are no clear-cut rules on what evidence can be considered “enough” for establishing (or refuting) positive selection. But the best cases usually involve multiple lines of evidence coming from several independent perspectives that are consistent with each other.
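(An editorial aside: the logic of that answer can be sketched in a few lines of Python. The two cutoffs below are arbitrary placeholders chosen only for illustration; neither comes from the interview, and a real test for selection would rest on a proper statistical model rather than fixed thresholds.)

```python
def interpret_dn_ds(dn: float, ds: float,
                    low_cutoff: float = 0.2,
                    well_above_one: float = 2.0) -> str:
    """Rough reading of a nonsynonymous/synonymous substitution ratio.

    low_cutoff and well_above_one are illustrative assumptions, not
    established thresholds.
    """
    if ds <= 0:
        raise ValueError("ds must be positive to form a ratio")
    ratio = dn / ds
    if ratio < low_cutoff:
        return "mostly purifying selection"
    if ratio > well_above_one:
        return "positive selection is the likely explanation"
    # Anything in between cannot be resolved from the ratio alone.
    return "ambiguous: positive selection, relaxed constraint, or both"


print(interpret_dn_ds(0.01, 0.1))
print(interpret_dn_ds(0.08, 0.1))
print(interpret_dn_ds(0.30, 0.1))
```

The middle category is the point: most elevated ratios land there, which is why other lines of evidence are needed.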
6. A lot of researchers studying human population genetics and evolution are strictly data miners (i.e., they generate/publish no original data). There are limitations to such an approach, as it depends on the available data and prevents certain analyses from being performed. Do you expect to see more research groups turning into pure data mining labs in the future? Or will there still be a place for independent labs generating their own data (for example, resequencing a gene in multiple individuals to study the polymorphism)?
Given the explosion of genomic data in the last decade or so, which shows no sign of slowing down any time soon, there is likely to be a proliferation of pure data miners just because there is a niche for them. But I suspect that many interesting findings will still require the combination of data mining and wet experiments to provide key pieces of data not already available in public databases. In this regard, labs that can do both data mining and wet experiments can have an advantage over labs that can only do data mining.
7. The politics behind the funding of stem-cell research in the US have sometimes obscured the actual science. As someone who works in the field, where is it headed? What is truly feasible in terms of medical progress using an approach based in stem cell research?
I personally feel that the promises of stem cells as a direct reagent in the treatment of disease are grossly exaggerated. I think it will be a very long time before Parkinson’s disease or Alzheimer’s disease could be treated by introducing stem cells (or their derivative cells) into a patient. However, stem cells offer a model for studying developmental processes. As such, stem cell biology will ultimately make valuable contributions to our ability to better understand disease and develop treatments. So, I believe that the future of stem cell research lies in its potential as a research tool, and to a lesser extent, its ability to provide direct cure for disease.
8. Much of your work on stem cells is done in collaboration with a center in China. What is the attitude towards such research there, and how does it compare with the attitude here in the US?
The attitude is much more progressive relative to the US. Religion is not a dominant force in molding Chinese cultural traditions, and people are generally not married to a particular doctrine. This attitude provides greater flexibility for stem cell research.
9. Ian Buruma has noted that many Chinese dissidents have converted to Christianity, while David Aikman, in “Jesus in Beijing”, argues that the Christianization of much of China will alter geopolitics. How accurate do you think is the perception by many Westerners that Christianity is filling the ideological void left by the fall of Marxism-Leninism?
I tend to agree that Christianity is filling an ideological void left by the dying out of the old communist ideology. But whether China will be Christianized is a separate matter. There are plenty of Chinese who are strongly opposed to the idea of allowing religion to play a major role in the culture. I suspect it will be a major uphill battle for one religion, be it Christianity or otherwise, to spread beyond a few limited sectors of society. But this is just my guess.
10. Looking back, would you make any changes in your educational path? If so, what?
Looking back, I might have chosen economics instead of biology, as it might have allowed my work to have a broader impact. But it’s a tossup, and my feeling may well have stemmed from my constant impatience with lack of progress in my own work and therefore the perception that grass is greener on the other guy’s pasture.
In the comments of a previous post, rikurzhen asks the following question:
do we know enough about recombination hot spots to say if they are heritable? if so, could the location of a hot spot itself be under selection?
I responded that it would be tough to tell whether hotspots are heritable, but this isn’t entirely true– I limited myself to thinking about humans. On a bit of further reading, I can now definitively say yes, certain DNA sequences are more likely than others to initiate recombination, and these sequences are (obviously) heritable.
The interesting thing is that the very existence of hotspots implies a paradox. I highly recommend the introduction to this article for those looking to understand why this is so. Here’s a good summary of the problem:
Sexual recombination is one of the main forces shaping eukaryote evolution, but implicit in its mechanism is a serious paradox. The mechanism, called double-strand break repair, was first proposed for fungi in 1983. It has become increasingly well understood and well supported in a wide variety of organisms, and double-strand DNA breaks (DSBs) are now thought to be the primary initiators of meiotic recombination in eukaryotes. DSBs usually occur at chromosomal sites called recombination hotspots, whose evolutionary persistence is at the heart of the paradox. DSBs appear to frequently cause destruction of the DNA sequence specifying the hotspot and replacement of this sequence by the sequence of its homolog. Over many generations this self-destructive mechanism is expected to cause all active hotspot alleles to be replaced by alleles incapable of initiating DSBs. The paradox is that this has not happened.
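The expected loss of hotspot alleles can be sketched with a toy recursion. This is my own simplification, not the model from the article: it assumes random mating, an effectively infinite population, no mutation and no selection, and a made-up parameter `conversion_rate`, the per-meiosis chance that the hot allele in a heterozygote is overwritten by its cold homolog. Under those assumptions the hot allele frequency p changes each generation as p' = p - conversion_rate * p * (1 - p).

```python
from typing import List


def hotspot_allele_trajectory(p0: float, conversion_rate: float,
                              generations: int) -> List[float]:
    """Deterministic frequency of a 'hot' allele under biased conversion.

    Heterozygotes (frequency 2p(1-p)) transmit the hot allele with
    probability (1 - conversion_rate) / 2 instead of 1/2, which gives
    p' = p - conversion_rate * p * (1 - p).
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        p = p - conversion_rate * p * (1.0 - p)
        freqs.append(p)
    return freqs


# Hypothetical numbers: a hot allele at 90% frequency, 5% conversion per meiosis.
traj = hotspot_allele_trajectory(p0=0.9, conversion_rate=0.05, generations=1000)
print(traj[0], traj[100], traj[-1])
```

Even starting near fixation, the hot allele steadily declines toward zero in this sketch, which is the self-destructive behavior the quoted passage describes.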
However, the story may be more complicated, at least in humans. From here:
Haplotype analysis around both hotspots identified active and suppressed men sharing identical haplotypes, establishing that these major variations in the presence/absence of a hotspot and in quantitative activity are not caused by local DNA sequence variation
It seems likely, to me, that there must be some mechanism that maintains a certain number of recombination events on a chromosome (recall from your molecular biology class that at least one crossing-over must occur on each chromosome in meiosis I to guarantee proper segregation of the chromosomes). The actual location of the crossing-over may be determined by a number of factors, local sequence variation included. Perhaps this recombination-guaranteeing system, which itself is not part of the crossing-over, could resolve the paradox.
Studies that look for an association between a genetic variant and a trait are often inconsistent, finding an association in some studies or some populations, but not others. This could be for a number of reasons– small sample sizes, heterogeneity, or difficulty quantifying the trait, among other things. Or it could simply be that there’s no association to find.
However, it’s certainly strong evidence for an association if inducing the variant allele in a mouse also induces the trait you’re looking at. This paper does an excellent job of that, pretty much conclusively settling the issue of whether a variant in a certain brain-expressed gene is involved in anxiety. Mice without the variant are normal; mice with the variant display more anxious behavior (avoiding the middle of an open area, for example). Simple as that.
The causal allele is present at a frequency of 20-30% in Caucasian populations, so it is perhaps a large contributor to normal human variation in anxiety. And it’s also perhaps a prelude to finding common alleles that explain some of the variation of other cognitive phenotypes.
The X Prize Foundation, sponsor of a widely noted 2004 award for developing a reusable rocket suitable for private space travel, says it is now teaming with a wealthy Canadian geologist to offer $10 million to any team that can completely decode the genes of 100 people in 10 days.
And that’s not all. As an encore, the winning team will be paid $1 million more to decode another 100 people’s genes, including a bevy of wealthy donors and celebrities. Already accepted for future decoding: Google Inc. co-founder Larry Page, Microsoft Corp. co-founder Paul G. Allen and former junk-bond king Michael Milken.
John Hawks has a little commentary, including the ever important question: how does anyone actually know the winners get it right (as opposed to generating random variations on the current reference sequence, for example)?
One broader question overall is: why do we need a new method for sequencing? Sanger sequencing has gotten us this far, shouldn’t a little more miniaturization and automation get us where we need to be? The answer, in a word, is no.
As I see it, the problem with both 454 sequencing (one of the top candidates to be the next major sequencing technology) and Sanger sequencing is their ability (or lack thereof) to handle large repeats. Here’s what I mean:
First, note that sequencing is not the same as looking at a DNA molecule and reading on down. Sanger sequencing gets you about 600 bases at a time, so you first need to break the genome up into pieces of that size, sequence them all, then put them back in order. In 454 sequencing, you break the genome up into pieces of about 1000 bp, but you only can read about 200 of them. Note that the human genome is composed of somewhere around 3.3 billion bases.
Now imagine you have a stretch of 3000 bases on chromosome 5, and another identical stretch on chromosome 11. If you’re sequencing 600 bases, you have no idea that there are two copies of that sequence in the genome– all you see is a bunch of identical sequence. The only thing you can do is sequence the same genome over and over again and hope that one of your 600 bp reads overlaps both some unique sequence from one chromosome and some of your repeat, allowing you to anchor it down. Of course, there are computational methods for resolving repeats that have gotten much better in the past few years, but the larger the repeat, the harder it is, and you still have to generate a lot of sequence– off the top of my head, I’d guess you need something like 10X coverage (meaning for a 3.3 billion bp genome, you have to sequence 33 billion bases) to successfully resolve most repeats. That’s a lot.
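For the numerically inclined, here is a back-of-the-envelope sketch of that arithmetic and of what "anchoring" a read means. The repeat coordinates are invented for illustration, and the functions are toys; real assemblers are far more involved.

```python
import math


def bases_for_coverage(genome_size: int, coverage: int) -> int:
    """Total bases to sequence for a given fold coverage."""
    return genome_size * coverage


def reads_needed(genome_size: int, coverage: int, read_length: int) -> int:
    """Number of reads of a fixed length needed to reach that coverage."""
    return math.ceil(genome_size * coverage / read_length)


def read_is_anchored(read_start: int, read_length: int,
                     repeat_start: int, repeat_end: int) -> bool:
    """True if a read overlaps the repeat AND reaches unique flanking sequence.

    Coordinates are half-open: the repeat spans [repeat_start, repeat_end).
    """
    read_end = read_start + read_length
    overlaps_repeat = read_start < repeat_end and read_end > repeat_start
    reaches_unique = read_start < repeat_start or read_end > repeat_end
    return overlaps_repeat and reaches_unique


GENOME = 3_300_000_000  # roughly 3.3 billion bases
print(bases_for_coverage(GENOME, 10))   # 10X coverage
print(reads_needed(GENOME, 10, 600))    # 600 bp Sanger-style reads

# Hypothetical 3000 bp repeat occupying positions 10000-13000.
print(read_is_anchored(9_700, 600, 10_000, 13_000))   # straddles the left edge
print(read_is_anchored(11_000, 600, 10_000, 13_000))  # entirely inside the repeat
```

At 10X coverage of a 3.3 billion base genome that works out to 33 billion bases, or 55 million reads of 600 bases each. A read sitting entirely inside the repeat looks the same whichever copy it came from; only reads that straddle an edge can be placed.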
And this is not just a theoretical problem– the reference genome is still not complete due to the presence of large repeats! And these things are important– one, their architecture predisposes to pathogenic rearrangements and two, normal copy number differences in humans are a relatively unexplored source of natural variation.
In the short term (i.e. in the next 2-3 years), no sequencing technology will be able to get around this problem reliably enough to win this prize. What we will see, however, is the use of array-based methods to type almost all polymorphism in the genome, which will give an approximation of a genome sequence. These technologies are cheap and fairly computationally tractable, meaning they can be widely used, but they’re still subject to problems like the ones described above, so they’re not as good as a full sequence.
In the long run, however, sequencing will indeed be routine. How long? I’m guessing no more than 10 years. And what technology will do it? My money is on technologies that work with a single DNA molecule. If they get good enough (and they will), it should be possible to simply read down a DNA molecule as if it were an open book. No problem with repeats there. Here’s a paper that follows an enzyme down DNA and uses the movement of the enzyme to call the base. Nanopore sequencing has also gotten a lot of press in the last few years; I wouldn’t be surprised if a company using that technology throws their hat into the ring.
It’s worth noting that the vast majority of DNA variants in the world are singletons– that is, they only appear in one individual. Once sequencing is routine, there will be a revolution in how we understand the genetic basis of phenotypes– whether pathological or not.