The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
74 Loci for Cognitive Development; Yes, This Is Happening


[Figure from the paper: expression of the associated genes across tissue types and organs]

No time to comment. Yes, the SNP hits are cool. But look at all the functional associations and analysis in this paper! There is some serious biology in this. The figure from the paper, to the left, shows how the genes associated with these SNP hits are expressed in different tissue types and organs. These are the biggest-effect SNPs for years of education in the genome, so it makes sense that they’d be heavily over-expressed in the brain. This is definitely more convincing to those who might be skeptical a priori than some statistically robust associations alone (well, it should be more convincing at least).

Genome-wide association study identifies 74 loci associated with educational attainment:

Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals [1]. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample [1, 2] of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.

 
• Category: Science • Tags: Genomics, IQ 
  1. AG says:

    SNP findings would help the next steps in pinpointing specific genes for brain functions. In hindsight, it is no surprise that all human abilities have a genetic basis. Knowledge might be environmental, but the ability to absorb knowledge (educational attainment) is a genetic function.

    In the animal world, this is the trainability (learning new behavior) of smart animals. Stupid species are not trainable; stupid animals are really stubborn. Taming is a form of learning. The same rules apply to humans too (Rushton's reasoning).

    Certain humans are obviously less trainable than others. But humans often project their own perceptions onto others. Thus you have some bloggers who stubbornly believe everything is genetically determined. In their world, most people are not trainable. On the other hand, East Asian cultures strongly believe that hard work or studying can change outcomes, since East Asians are very trainable in the first place. They also project their own experience onto others, and believe everybody can be trained or educated.

    • Replies: @AndrewR
    All but the most profoundly retarded humans are trainable. But some are obviously a lot more trainable than others. And as a general rule a person with a 90 IQ with a strong work ethic will be more successful by most measures than a 170 IQ bum.
  2. One should consider non-genetic ways of intervention as well. Genes are of course important. Electrical stimulation, implants perhaps. Better methods. Company of smarter people, nutrition & behavioral intervention at a young age, etc.

    Otherwise one will not find the interventions from environment, and we might end up wrongly believing that genes alone are the defining factors in cognitive development.

  3. Are genetic factors really estimated to only account for 20% of variation between individuals on educational attainment? I was under the impression it was rather more than that.

    • Replies: @Razib Khan
    i think this is a trait where environment including family resources matters a lot. college costs $.
    , @PD Shaw
    Here is a study which reports that educational achievement (pdf) has been found to be 58% heritable, based upon performance on a test given in the UK upon reaching the end of compulsory education (16). I assume educational attainment refers to number of years of education completed.
    , @gwern
    Most of these seem to be European cohorts, FWIW; UK Biobank (all UK) is a third of the sample on its own. The exact breakdown by cohort is... somewhere buried in the spreadsheet I haven't found yet.

    The quoted 20% figure seems to be GCTA on years of education, going by the supplementary information:


    Indeed, Rietveld et al. (2013) [7] reported GCTA-GREML estimates of SNP heritability for each of two cohorts (STR and QIMR), and the mean estimate was 22.4%. Assuming that 22.4% is in fact the true SNP heritability, the calculations outlined in the SOM of Rietveld et al. (pp. 22-23) generate a prediction of R² = 11.0% for a score constructed from the GWAS estimates of this paper and of R² = 6.1% for a score constructed from the combined (discovery + replication cohorts, but excluding the validation cohorts) GWAS sample of N = ~117,000-119,000 in Rietveld et al.—substantially higher than the 3.85% that we achieve here (with the score based on all genotyped SNPs) and the 2.2% Rietveld et al. achieved, respectively.
     
    This is a lot less than the full heritability of education estimated in family designs, just like the GCTA IQ estimates are much less. The SNP heritability is lower because of small chips, measurement error, and genetic variants with complicated non-additive effects. Sometimes measurement error can be a big factor: if you do Spearman's correction on the previous GCTA for intelligence, it goes from 0.31 to 0.49, which is a lot closer to 0.8 than 0.3! I don't think that's much of an issue here unless the discretization of 'years of schooling' is a problem. Non-additivity is likely: aside from intelligence, personality traits like Conscientiousness have long been considered one of the major predictors of schooling, and the last GWASes to attack Big Five have cracked their teeth on it, with GCTAs near zero and hardly any hits despite pretty hefty sample sizes. Also possible is changing heritability: going to highschool no longer means much, especially in Scandinavia, leading to ceiling & range restriction issues. When everyone is special no one is, etc. So if education has a heritability of 0.7, then a GCTA of 0.21 after all these issues are taken into account strikes me as fairly reasonable.
    , @notanon
    deflection shield imo - saying it's 20% allows a partial compromise with the people who will be trying to get them fired

    good tactic

    , @Emil O. W. Kirkegaard
    Meta-analysis of educational attainment found H2 = 40% and increasing.

    http://www.ipr.northwestern.edu/publications/docs/workingpapers/2013/IPR-WP-13-09.pdf
  4. @Karl Zimmerman
    Are genetic factors really estimated to only account for 20% of variation between individuals on educational attainment? I was under the impression it was rather more than that.

    i think this is a trait where environment including family resources matters a lot. college costs $.

    • Replies: @Karl Zimmerman
    This is obviously true in an American context. That said, I would have to presume it varies considerably from nation to nation, given the norm in most of Europe has been very low to no tuition until very recently (and sometimes being barred from anything further than trade school if you didn't do well enough on standardized testing). If the existing studies are heavily U.S. biased, then the 20% makes a lot more sense however.
    , @marcel proust
    But if (since?) socio-economic factors (including $s) are largely genetically determined, it's turtles all the way down, no? And with Friday the 13th coming on a Friday this month, the turtles are coming fast and furious.
    , @Anonymous
    Also you have to consider that the skills required for an advanced degree in Physics and an advanced degree in English might not overlap all that much.
  5. @Razib Khan
    i think this is a trait where environment including family resources matters a lot. college costs $.

    This is obviously true in an American context. That said, I would have to presume it varies considerably from nation to nation, given the norm in most of Europe has been very low to no tuition until very recently (and sometimes being barred from anything further than trade school if you didn’t do well enough on standardized testing). If the existing studies are heavily U.S. biased, then the 20% makes a lot more sense however.

  6. Of course in large sample-size studies one can easily expect to find differences. Thus the study seems, correctly, to give consideration to the magnitude of the effect: the percentage of the total variation that these differences account for.
    from the phys.org website
    “The total influence of the 74 identified genetic variants is small, explaining about 0.43 of 1 percent of the variation in educational attainment across individuals, the scientists wrote.

    “For the variant with the largest effect, the difference between people with zero copies and those who have two copies of the variant predicts, on average, about nine more weeks of schooling,” Benjamin said.

    The results suggest that the genetic influences on educational attainment are spread across thousands, if not millions, of genetic variants, most of which have not yet been identified, Benjamin said.”

    “The very small effects of individual genetic variants is itself an important finding, which echoes what we’ve seen in our own earlier work,” Benjamin said. “It means that simplistic interpretations of our results, such as calling them ‘genes for education,’ are totally misleading. At the same time, despite the small effects of individual genetic variants, the results are useful because we can learn a lot from studying the combined effects of the genetic variants taken all together.”
    This is a European study. Since the implementation of mass education in Europe is only a few centuries old, we cannot expect there has been significant evolutionary pressure for success in this area.
    “…the researchers found that many of the genes associated with educational attainment are influential in brain development, even before birth. The scientists said these genes likely play a role in cognitive function and personality traits, such as grit, that matter for school performance.”
    What I find exciting, using Genetic Based Medicine as a model, is to be able to have Genetic Based Education. ADHD is a genetic variant which probably had some evolutionary benefit. But is ill suited to the modern classroom. Some success has been achieved in schools willing or able to alter environmental and instructional milieu.
    Hopefully we will be able to extend this to other genetic variants.
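
To put the "nine more weeks of schooling" figure quoted above on the same scale as the "0.43 of 1 percent" figure, here is a rough sketch of the standard additive-model arithmetic for a single variant. The allele frequency (0.5), the trait standard deviation (3 years of schooling), and the 52-weeks-per-year conversion are illustrative assumptions, not numbers from the paper or the press release.

```python
def snp_variance_explained(weeks_between_homozygotes, allele_freq, trait_sd_years):
    """Share of trait variance explained by one additive SNP.

    Under an additive model the 0-copy vs 2-copy difference is twice the
    per-allele effect, and the variance contributed by the SNP is
    2 * p * (1 - p) * beta**2.
    """
    # Per-allele effect in years (assumes 52 weeks per year for simplicity).
    beta_years = (weeks_between_homozygotes / 2.0) / 52.0
    genetic_variance = 2.0 * allele_freq * (1.0 - allele_freq) * beta_years ** 2
    return genetic_variance / trait_sd_years ** 2


# Hypothetical inputs: a common variant (p = 0.5) and an SD of 3 years.
r2_top_snp = snp_variance_explained(9.0, allele_freq=0.5, trait_sd_years=3.0)
print(f"share of variance explained by the top SNP: {r2_top_snp:.4%}")
```

Under these assumptions even the largest single effect explains only a few hundredths of a percent of the variance, which is in line with all 74 hits together explaining well under one percent.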

    • Replies: @Razib Khan
    The results suggest that the genetic influences on educational attainment are spread across thousands, if not millions, of genetic variants, most of which have not yet been identified, Benjamin said.

    i think millions is a bit of hyperbole. the # of variants is on the order of millions. i think for most of the variation isn't going to be tens of thousands. about 10x more than height.
    , @notanon

    ADHD is a genetic variant which probably had some evolutionary benefit. But is ill suited to the modern classroom. Some success has been achieved in schools willing or able to alter environmental and instructional milieu.
     
    I like the sound of that.
  7. @Karl Zimmerman
    Are genetic factors really estimated to only account for 20% of variation between individuals on educational attainment? I was under the impression it was rather more than that.

    Here is a study which reports that educational achievement (pdf) has been found to be 58% heritable, based upon performance on a test given in the UK upon reaching the end of compulsory education (16). I assume educational attainment refers to number of years of education completed.

    • Replies: @KMD
    Educational achievement in this context is not the number of years of education completed but is measured by the number of certificates of higher education passed and the grades achieved at the age of sixteen. Basically, the grades are measured by performance in exams in various subjects plus an element based on performance in course work in the previous school year. Clearly, those that go on to examinations at a higher level and university education will eventually complete more years of education but, while this may well be influenced by the grades achieved at age 16, that is not what is being measured here.
  8. gwern says: • Website
    @Karl Zimmerman
    Are genetic factors really estimated to only account for 20% of variation between individuals on educational attainment? I was under the impression it was rather more than that.

    Most of these seem to be European cohorts, FWIW; UK Biobank (all UK) is a third of the sample on its own. The exact breakdown by cohort is… somewhere buried in the spreadsheet I haven’t found yet.

    The quoted 20% figure seems to be GCTA on years of education, going by the supplementary information:

    Indeed, Rietveld et al. (2013) [7] reported GCTA-GREML estimates of SNP heritability for each of two cohorts (STR and QIMR), and the mean estimate was 22.4%. Assuming that 22.4% is in fact the true SNP heritability, the calculations outlined in the SOM of Rietveld et al. (pp. 22-23) generate a prediction of R² = 11.0% for a score constructed from the GWAS estimates of this paper and of R² = 6.1% for a score constructed from the combined (discovery + replication cohorts, but excluding the validation cohorts) GWAS sample of N = ~117,000-119,000 in Rietveld et al.—substantially higher than the 3.85% that we achieve here (with the score based on all genotyped SNPs) and the 2.2% Rietveld et al. achieved, respectively.
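
For readers wondering how a SNP heritability and a sample size turn into a predicted R² for a polygenic score, one commonly used approximation (Daetwyler et al. 2008) is R² ≈ h² / (1 + M / (N·h²)), where M is the effective number of independent markers. This may or may not be the exact calculation in the Rietveld et al. SOM, and the value of M below is a hypothetical one picked only because it lands near the quoted 6.1% and 11.0%.

```python
def expected_score_r2(snp_h2, n_samples, n_effective_markers):
    """Approximate out-of-sample R^2 of a polygenic score.

    Uses R^2 = h2 / (1 + M / (N * h2)): as the GWAS sample N grows, the
    score's R^2 approaches the SNP heritability h2 from below.
    """
    if snp_h2 <= 0 or n_samples <= 0 or n_effective_markers <= 0:
        raise ValueError("all inputs must be positive")
    return snp_h2 / (1.0 + n_effective_markers / (n_samples * snp_h2))


M_EFF = 70_000  # hypothetical effective number of independent markers
SNP_H2 = 0.224  # mean GCTA-GREML estimate quoted above

for n in (117_000, 293_723):
    print(n, round(expected_score_r2(SNP_H2, n, M_EFF), 4))
```

The point of the sketch is the shape of the curve: more than doubling the discovery sample raises the expected R² substantially, but it stays well short of the 22.4% ceiling.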

    This is a lot less than the full heritability of education estimated in family designs, just like the GCTA IQ estimates are much less. The SNP heritability is lower because of small chips, measurement error, and genetic variants with complicated non-additive effects. Sometimes measurement error can be a big factor: if you do Spearman’s correction on the previous GCTA for intelligence, it goes from 0.31 to 0.49, which is a lot closer to 0.8 than 0.3! I don’t think that’s much of an issue here unless the discretization of ‘years of schooling’ is a problem. Non-additivity is likely: aside from intelligence, personality traits like Conscientiousness have long been considered one of the major predictors of schooling, and the last GWASes to attack Big Five have cracked their teeth on it, with GCTAs near zero and hardly any hits despite pretty hefty sample sizes. Also possible is changing heritability: going to highschool no longer means much, especially in Scandinavia, leading to ceiling & range restriction issues. When everyone is special no one is, etc. So if education has a heritability of 0.7, then a GCTA of 0.21 after all these issues are taken into account strikes me as fairly reasonable.
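
The Spearman correction mentioned above can be sketched in a couple of lines. For a variance share such as heritability, with measurement error only in the phenotype, the corrected value is the observed value divided by the test's reliability. The comment gives only the before and after figures (0.31 and 0.49), so the reliability below is simply the one those two numbers imply; it is an inference, not a value from the original study.

```python
def disattenuate_h2(observed_h2, reliability):
    """Correct a heritability estimate for phenotype measurement error.

    Random measurement error inflates phenotypic variance, so the observed
    variance share is the true share multiplied by the reliability.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return observed_h2 / reliability


def implied_reliability(observed_h2, corrected_h2):
    """Reliability that would map the observed estimate onto the corrected one."""
    return observed_h2 / corrected_h2


reliability = implied_reliability(0.31, 0.49)
print(f"implied reliability: {reliability:.2f}")
print(f"corrected estimate:  {disattenuate_h2(0.31, reliability):.2f}")
```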

    • Replies: @Razib Khan
    this is all 100% correct. an author (one of many on this paper, so anonymous for all practical purposes) said some of the same in email just now.
  9. PZ Myers is not going to be happy about this and may deploy argumentum ad currum celer. Steve Hsu, on the other hand… well, he called this quite a while back.

    • Replies: @jaywalker
    I doubt pzed will get worked up about 74 genes that statistically explain .0043 of variation in education level. The authors explicitly stated in the paper "educational attainment is primarily determined by environmental factors". But if Hsu gets excited about these results that's his problem.
  10. @Razib Khan
    i think this is a trait where environment including family resources matters a lot. college costs $.

    But if (since?) socio-economic factors (including $s) are largely genetically determined, it’s turtles all the way down, no? And with Friday the 13th coming on a Friday this month, the turtles are coming fast and furious.

    • Replies: @Razib Khan
    SES isn't *largely*. the correlation is good, but closer to 0.5 i think.
  11. @gwern
    Most of these seem to be European cohorts, FWIW; UK Biobank (all UK) is a third of the sample on its own. The exact breakdown by cohort is... somewhere buried in the spreadsheet I haven't found yet.

    The quoted 20% figure seems to be GCTA on years of education, going by the supplementary information:


    Indeed, Rietveld et al. (2013) [7] reported GCTA-GREML estimates of SNP heritability for each of two cohorts (STR and QIMR), and the mean estimate was 22.4%. Assuming that 22.4% is in fact the true SNP heritability, the calculations outlined in the SOM of Rietveld et al. (pp. 22-23) generate a prediction of R² = 11.0% for a score constructed from the GWAS estimates of this paper and of R² = 6.1% for a score constructed from the combined (discovery + replication cohorts, but excluding the validation cohorts) GWAS sample of N = ~117,000-119,000 in Rietveld et al.—substantially higher than the 3.85% that we achieve here (with the score based on all genotyped SNPs) and the 2.2% Rietveld et al. achieved, respectively.
     
    This is a lot less than the full heritability of education estimated in family designs, just like the GCTA IQ estimates are much less. The SNP heritability is lower because of small chips, measurement error, and genetic variants with complicated non-additive effects. Sometimes measurement error can be a big factor: if you do Spearman's correction on the previous GCTA for intelligence, it goes from 0.31 to 0.49, which is a lot closer to 0.8 than 0.3! I don't think that's much of an issue here unless the discretization of 'years of schooling' is a problem. Non-additivity is likely: aside from intelligence, personality traits like Conscientiousness have long been considered one of the major predictors of schooling, and the last GWASes to attack Big Five have cracked their teeth on it, with GCTAs near zero and hardly any hits despite pretty hefty sample sizes. Also possible is changing heritability: going to highschool no longer means much, especially in Scandinavia, leading to ceiling & range restriction issues. When everyone is special no one is, etc. So if education has a heritability of 0.7, then a GCTA of 0.21 after all these issues are taken into account strikes me as fairly reasonable.

    this is all 100% correct. an author (one of many on this paper, so anonymous for all practical purposes) said some of the same in email just now.

  12. @marcel proust
    But if (since?) socio-economic factors (including $s) are largely genetically determined, it's turtles all the way down, no? And with Friday the 13th coming on a Friday this month, the turtles are coming fast and furious.

    SES isn’t *largely*. the correlation is good, but closer to 0.5 i think.

  13. @aeolius
    Of course in large sample-size studies one can easily expect to find differences. Thus the study seems, correctly, to give consideration to the magnitude of the effect: the percentage of the total variation that these differences account for.
    from the phys.org website
    "The total influence of the 74 identified genetic variants is small, explaining about 0.43 of 1 percent of the variation in educational attainment across individuals, the scientists wrote.

    "For the variant with the largest effect, the difference between people with zero copies and those who have two copies of the variant predicts, on average, about nine more weeks of schooling," Benjamin said.

    The results suggest that the genetic influences on educational attainment are spread across thousands, if not millions, of genetic variants, most of which have not yet been identified, Benjamin said."
    "The very small effects of individual genetic variants is itself an important finding, which echoes what we've seen in our own earlier work," Benjamin said. "It means that simplistic interpretations of our results, such as calling them 'genes for education,' are totally misleading. At the same time, despite the small effects of individual genetic variants, the results are useful because we can learn a lot from studying the combined effects of the genetic variants taken all together."
    This is a European study. Since the implementation of mass education in Europe is only a few centuries old, we cannot expect there has been significant evolutionary pressure for success in this area.
    "...the researchers found that many of the genes associated with educational attainment are influential in brain development, even before birth. The scientists said these genes likely play a role in cognitive function and personality traits, such as grit, that matter for school performance."
    What I find exciting, using Genetic Based Medicine as a model, is to be able to have Genetic Based Education. ADHD is a genetic variant which probably had some evolutionary benefit. But is ill suited to the modern classroom. Some success has been achieved in schools willing or able to alter environmental and instructional milieu.
    Hopefully we will be able to extend this to other genetic variants.

    The results suggest that the genetic influences on educational attainment are spread across thousands, if not millions, of genetic variants, most of which have not yet been identified, Benjamin said.

    i think millions is a bit of hyperbole. the # of variants is on the order of millions. i think for most of the variation isn’t going to be tens of thousands. about 10x more than height.

  14. also, ~20% causes less problems in review….

  15. are these SNPs genotyped on the 23andMe array?

    • Replies: @gwern
    Most of them probably are, inasmuch as 23andMe is one of the participating groups (but they make it difficult and allow only the top 5k SNPs to be released, so the paper has to do some odd things like report results separately: only the top 5k hits for the full meta-analysis including 23andMe, and all hits for the meta-analysis sans 23andMe data). You can get the data on the SSGAC website if you want to try out a polygenic score on your own 23andMe export.

    ---

    Anyway, to go back to the GCTA issue and unbreast myself further. What does 20% mean? It means that with current comprehensiveness of arrays, with measurements of current accuracy, modeled as simple linear models, a ton of data will let you reach the upper ceiling of 20% of phenotypic variance explained. To get more, you are going to have to change one of those 3: switch to bigger arrays or whole-genomes, improve your phenotype measures to reduce noise, or start using fancier analyses like Craig Venter is doing to pick up gene interactions, dominance, epistasis, etc. All of which is going to cost a lot of money. (Although some people claim that current SNP chips would be almost as good as whole-genomes if more aggressive imputation was used. Maybe! That would be convenient.)

    Is the current 20% limit a problem? Well, 20% is more than enough if you want to do a Mendelian randomization / instrumental variable study (I've seen some quotes recently from academics about the correlation between higher education and greater lifespan being causally due to the education; they're going to be disappointed), and it'll give you tons of places to look if you want a biological understanding of causes of intelligence - this study alone has 162 genome-wide hits. (As far as I can tell, the 74 hits in the title/abstract is referring to just the hits in the original SSGAC paper, not the 162 you get when the SSGAC results are combined with UK Biobank; I guess they got the Biobank data too late in the process and could only use it as an out-of-sample replication dataset?)

    What about embryo selection or editing? I would say that 20% is enough, the important thing is how much you can explain in your polygenic score. Their genome-wide significance translates to each hit having a posterior probability of >90%, and the non-significant hits will all have substantial posterior probabilities of non-zero effects too. So just the currently available data implies that you know hundreds of alleles worth selecting on or editing. This is the central limit theorem at work: because each person has a net difference in alleles of only a few dozen variants, a polygenic score has to work extremely hard to deliver any visible predictive power even after it's identified the signs of many or most of the variants - most of the variants simply wind up canceling out and the genetic contribution to differences in intelligence is the leftover residue, like the matter leftover from matter/anti-matter not quite canceling out perfectly.

    So right now with the polygenic score, if you could do hundreds of CRISPR edits, you would get several standard deviations of improvement, even though it 'looks' like we know hardly anything about where to edit. This surprised me and I didn't believe it until I worked through Hsu's binomial model by hand and proved it to myself. And this works for other highly polygenic traits too, as long as they have some decent level of genetic heritability, there's tremendous selection/editing potential.

  16. gwern says: • Website
    @Anonymous
    are these SNPs genotyped on the 23andMe array?

    Most of them probably are, inasmuch as 23andMe is one of the participating groups (but they make it difficult and allow only the top 5k SNPs to be released, so the paper has to do some odd things like report results separately: only the top 5k hits for the full meta-analysis including 23andMe, and all hits for the meta-analysis sans 23andMe data). You can get the data on the SSGAC website if you want to try out a polygenic score on your own 23andMe export.
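
As a rough illustration of what "trying out a polygenic score" involves, the sketch below sums effect-allele counts weighted by per-allele effect sizes. It assumes genotypes have already been parsed into two-letter strings such as "AG", with "--" marking a no-call; the rsIDs and weights are made up for the example and are not taken from the SSGAC files. Real use would also have to handle strand flips and match the summary statistics' allele coding, which this sketch ignores.

```python
def polygenic_score(genotypes, weights):
    """Weighted sum of effect-allele counts over the SNPs present in both inputs.

    genotypes: dict mapping rsID -> two-letter genotype string, e.g. "AG"
    weights:   dict mapping rsID -> (effect_allele, per_allele_beta)
    Returns (score, number_of_snps_used).
    """
    score = 0.0
    used = 0
    for rsid, (effect_allele, beta) in weights.items():
        genotype = genotypes.get(rsid)
        # Skip SNPs missing from the export and no-calls.
        if genotype is None or "-" in genotype:
            continue
        score += beta * genotype.count(effect_allele)
        used += 1
    return score, used


# Hypothetical toy data, for illustration only.
toy_genotypes = {"rs1": "AG", "rs2": "TT", "rs3": "--"}
toy_weights = {
    "rs1": ("A", 0.02),
    "rs2": ("T", 0.01),
    "rs3": ("C", 0.05),
    "rs4": ("G", 0.03),
}
toy_score, toy_used = polygenic_score(toy_genotypes, toy_weights)
print(toy_score, toy_used)
```

The raw score only means something relative to a reference distribution, so in practice it would be standardized against scores computed the same way for a comparison sample.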

    Anyway, to go back to the GCTA issue and unbreast myself further. What does 20% mean? It means that with current comprehensiveness of arrays, with measurements of current accuracy, modeled as simple linear models, a ton of data will let you reach the upper ceiling of 20% of phenotypic variance explained. To get more, you are going to have to change one of those 3: switch to bigger arrays or whole-genomes, improve your phenotype measures to reduce noise, or start using fancier analyses like Craig Venter is doing to pick up gene interactions, dominance, epistasis, etc. All of which is going to cost a lot of money. (Although some people claim that current SNP chips would be almost as good as whole-genomes if more aggressive imputation was used. Maybe! That would be convenient.)

    Is the current 20% limit a problem? Well, 20% is more than enough if you want to do a Mendelian randomization / instrumental variable study (I’ve seen some quotes recently from academics about the correlation between higher education and greater lifespan being causally due to the education; they’re going to be disappointed), and it’ll give you tons of places to look if you want a biological understanding of causes of intelligence – this study alone has 162 genome-wide hits. (As far as I can tell, the 74 hits in the title/abstract is referring to just the hits in the original SSGAC paper, not the 162 you get when the SSGAC results are combined with UK Biobank; I guess they got the Biobank data too late in the process and could only use it as an out-of-sample replication dataset?)

    What about embryo selection or editing? I would say that 20% is enough, the important thing is how much you can explain in your polygenic score. Their genome-wide significance translates to each hit having a posterior probability of >90%, and the non-significant hits will all have substantial posterior probabilities of non-zero effects too. So just the currently available data implies that you know hundreds of alleles worth selecting on or editing. This is the central limit theorem at work: because each person has a net difference in alleles of only a few dozen variants, a polygenic score has to work extremely hard to deliver any visible predictive power even after it’s identified the signs of many or most of the variants – most of the variants simply wind up canceling out and the genetic contribution to differences in intelligence is the leftover residue, like the matter leftover from matter/anti-matter not quite canceling out perfectly.

    So right now with the polygenic score, if you could do hundreds of CRISPR edits, you would get several standard deviations of improvement, even though it ‘looks’ like we know hardly anything about where to edit. This surprised me and I didn’t believe it until I worked through Hsu’s binomial model by hand and proved it to myself. And this works for other highly polygenic traits too, as long as they have some decent level of genetic heritability, there’s tremendous selection/editing potential.
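
The binomial argument can be sketched as a toy calculation. Assume, purely for illustration, N equal-effect causal variants, each with a "plus" allele at frequency p in a diploid genome. A person's count of plus alleles is then roughly Binomial(2N, p), whose standard deviation grows only with the square root of N, so a modest number of edits moves someone a long way in population SD units. N = 10,000, p = 0.5, and equal effects are assumptions of this sketch, not estimates from the paper.

```python
import math


def allele_count_sd(n_variants, plus_freq):
    """SD of the number of 'plus' alleles across people (diploid, binomial)."""
    return math.sqrt(2 * n_variants * plus_freq * (1 - plus_freq))


def gain_in_sd(n_edits, n_variants, plus_freq=0.5, h2=1.0):
    """Phenotypic gain, in phenotypic SDs, from flipping n_edits minus alleles.

    Each edit adds one plus allele. Dividing by the allele-count SD gives the
    gain in genetic SDs; multiplying by sqrt(h2) converts to phenotypic SDs.
    """
    genetic_gain = n_edits / allele_count_sd(n_variants, plus_freq)
    return genetic_gain * math.sqrt(h2)


# Hypothetical scenario: 10,000 equal-effect variants at frequency 0.5.
for edits in (100, 200, 300):
    print(edits, round(gain_in_sd(edits, 10_000, h2=0.2), 2))
```

Real effect sizes are unequal and estimated with error, so the achievable gain per edit would be lower than in this idealized model; the sketch only shows why the square-root scaling makes the claim less surprising than it first sounds.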

  17. @Karl Zimmerman
    Are genetic factors really estimated to only account for 20% of variation between individuals on educational attainment? I was under the impression it was rather more than that.

    deflection shield imo – saying it’s 20% allows a partial compromise with the people who will be trying to get them fired

    good tactic

  18. @aeolius
    Of course in large sample-size studies one can easily expect to find differences. Thus the study seems, correctly, to give consideration to the magnitude of the effect: the percentage of the total variation that these differences account for.
    from the phys.org website
    "The total influence of the 74 identified genetic variants is small, explaining about 0.43 of 1 percent of the variation in educational attainment across individuals, the scientists wrote.

    "For the variant with the largest effect, the difference between people with zero copies and those who have two copies of the variant predicts, on average, about nine more weeks of schooling," Benjamin said.

    The results suggest that the genetic influences on educational attainment are spread across thousands, if not millions, of genetic variants, most of which have not yet been identified, Benjamin said."
    "The very small effects of individual genetic variants is itself an important finding, which echoes what we've seen in our own earlier work," Benjamin said. "It means that simplistic interpretations of our results, such as calling them 'genes for education,' are totally misleading. At the same time, despite the small effects of individual genetic variants, the results are useful because we can learn a lot from studying the combined effects of the genetic variants taken all together."
    This is a European study. Since the implementation of mass education in Europe is only a few centuries old, we cannot expect there has been significant evolutionary pressure for success in this area.
    "...the researchers found that many of the genes associated with educational attainment are influential in brain development, even before birth. The scientists said these genes likely play a role in cognitive function and personality traits, such as grit, that matter for school performance."
    What I find exciting, using Genetic Based Medicine as a model, is the prospect of Genetic Based Education. ADHD is a genetic variant which probably had some evolutionary benefit. But is ill suited to the modern classroom. Some success has been achieved in schools willing or able to alter environmental and instructional milieu.
    Hopefully we will be able to extend this to other genetic variants.

    ADHD is a genetic variant which probably had some evolutionary benefit. But is ill suited to the modern classroom. Some success has been achieved in schools willing or able to alter environmental and instructional milieu.

    I like the sound of that.

    • Replies: @RaceRealist88
  19. KMD says:
    @PD Shaw
    Here is a study which reports that educational achievement (pdf) has been found to be 58% heritable, based upon performance on a test given in the UK upon reaching the end of compulsory education (16). I assume educational attainment refers to number of years of education completed.

    Educational achievement in this context is not the number of years of education completed but is measured by the number of certificates of higher education passed and the grades achieved at the age of sixteen. Basically, the grades are measured by performance in exams in various subjects plus an element based on performance in course work in the previous school year. Clearly, those that go on to examinations at a higher level and university education will eventually complete more years of education but, while this may well be influenced by the grades achieved at age 16, that is not what is being measured here.

    • Replies: @PD Shaw
  20. So naturally the authors included tables of geographic and racial allele frequencies derived from the 1,000 genomes project and ancient DNA. Oh, wait, that would be actual science.

    In any case, they deserve credit for trying to answer the right questions.

    Now it’s time for someone else to finish what they started.

    • Replies: @ohwilleke
  21. My gut instinct tells me that capturing 0.43% of variation with the 74 most significant loci is a sign that there is something wrong with the model that the GWAS is trying to fit.

    This implies that the number of relevant SNPs is much greater than 1500 because the effect size in the long tail is going to get smaller and smaller. Maybe you’re really looking at more like 5,000-10,000 SNPs that have not reached fixation in contemporary populations.

    Why does that feel wrong?

    Because if there are really 5,000-10,000 loci, the law of averages is going to kick in with a vengeance and similarly regression to the mean should be huge, but while IQ breeds fairly true, IQ variation between fairly closely related individuals is often quite significant. If children inherit randomly from both parents and there are really 5,000-10,000 loci that matter, all of which have very small effects, IQ differences between siblings ought to be really, really slight and rare, because the 5,000-10,000 random trials for the large number of low effect SNPs should average out between full siblings almost completely. But, while full siblings are definitely correlated, there are routinely meaningful magnitude sibling IQ differences. When no one inherited factor has an impact of more than say 0.02%, that shouldn’t happen.

    Indeed, it ought to be possible to infer the number of independent heritable factors that account for the lion’s share of genotype driven differences in phenotype from mathematical models and non-genetic heredity data, assuming that the instrument used to measure IQ is reasonably valid in terms of numerical differences in IQ scores being comparable to each other in magnitude of difference well into the tails of the normal distribution, and the absolute value starting at zero being meaningfully scaled. My numerical intuition is that in that kind of analysis you should be seeing something on the order of low hundreds of really significant loci instead of mid- to high thousands (the phenotypic results are quite insensitive to the exact number of very, very low effect loci; the amount of variability between siblings has more to do with the number of independent non-negligible effect factors).

    Admittedly, another big factor is probably that the crudity of using educational attainment as a proxy for IQ and other educational fitness enhancing cognitive traits is going to artificially depress the significance of each variant. It would be nice, for example, to pull high school transcripts and compare a finer grained measure like GPA or class rank or standardized test percentiles or military entrance exam scores to the panel of 74 culled loci for say 5%-10% of the sample to see how much that enhanced the power of each of these significant loci. I suspect that with a more finely calibrated measure of IQ you might go from ca. 0.43% to perhaps as much as 2%-5%, which would also be a nice way to establish that the effects of the genes identified are robust rather than being highly sensitive to methodology.

    There are lots of ways the model we are trying to fit could be wrong, leading to an underestimate of effect size as well. Simple additive variance may be useful for lots of SNPs, but there has to be either an underestimated non-additive component in a significant subset of the SNPs, or a lot of IQ inheritance driven by other aspects of the genetic code (copy number variants, epigenetic effects, polygenic impacts in “control” centers, and lots of other more subtle architectural features of our heredity that I’m not competent enough to suggest) that have big effects but aren’t showing up using GWAS methodology.

    Another possibility that seems pretty plausible, given that a lot of the identified factors influence similar tissues in similar parts of the body at similar points in time in the person’s development, is that a lot of low individual effect genes are physically clustered close to each other on particular parts of particular chromosomes. If so, the assumption that each of the 74 hits is inherited randomly with respect to the others is profoundly wrong; instead, inheriting one of the 74 dramatically increases the likelihood that you will inherit others. This effect ought to be pretty amenable to modeling with some kind of Monte Carlo methods in a computer model, since we know exactly where each of the 74 hits is located in the genome and our understanding of linkage disequilibrium should allow us to translate the locations into very mathematically precise measurements of the degree of mathematical independence of the genes from each other. Existing computing resources and commonplace programming expertise ought to be able to produce interesting and accurate results in a matter of a few weeks or less in the hands of a capable graduate student or three. If you have 74 target genes but, due to their actual specific locations in the genome, they have variability comparable to 40 truly independent degrees of freedom, then the law of averages paradox I raised above can be resolved pretty parsimoniously.

    • Replies: @Razib Khan
    segregation variance. this paper explains it

    http://www.gnxp.com/new/wp-content/uploads/2016/05/Rogers_1983_Assortative_mating_and_the_segregation_variance1.pdf
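
The segregation-variance point can be checked with a toy simulation. The sketch below is not from the linked paper. It assumes a purely additive trait, unlinked loci, allele frequencies of 0.5, equal effect sizes, and random mating, and the function name and parameters are hypothetical. Under those assumptions the variance of the difference between two full siblings stays near the total additive variance (scaled here to 1) whether the trait is spread over 10 loci or 1,000: each locus contributes less, but there are more of them.

```python
import random
import statistics


def sibling_difference_variance(n_loci, n_families=400, seed=0):
    """Variance of (sib1 - sib2) for an additive trait spread over n_loci loci.

    Toy assumptions (not from the paper): allele frequency 0.5 at every
    locus, equal effects, unlinked loci, random mating, no environment.
    """
    rng = random.Random(seed)
    # With p = 0.5 each locus has genotypic variance 0.5 * effect**2,
    # so this choice scales the total additive variance to 1.
    effect = (2.0 / n_loci) ** 0.5

    def random_haplotype():
        return [rng.getrandbits(1) for _ in range(n_loci)]

    def child_score(mother, father):
        total = 0
        for hap_a, hap_b in (mother, father):
            for locus in range(n_loci):
                # Each parent passes on one of its two alleles at random.
                total += hap_a[locus] if rng.getrandbits(1) else hap_b[locus]
        return effect * total

    diffs = []
    for _ in range(n_families):
        mother = (random_haplotype(), random_haplotype())
        father = (random_haplotype(), random_haplotype())
        diffs.append(child_score(mother, father) - child_score(mother, father))
    return statistics.pvariance(diffs)
```

Calling it with `n_loci` set to 10, 100, and 1000 should give values that all sit near 1, so in this toy model sibling differences do not shrink as the number of loci grows.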
  22. @Afterthought
    So naturally the authors included tables of geographic and racial allele frequencies derived from the 1,000 genomes project and ancient DNA. Oh, wait, that would be actual science.

    In any case, they deserve credit for trying to answer the right questions.

    Now it's time for someone else to finish what they started.

    Academia rewards you for how many papers, each of which contains something worthy of publication, you can get published. There is a strong career health/compensation/tenure eligibility/full professorship eligibility/grant application credibility incentive to break up your scientific insights into the maximum number of published papers that the total number of insights you actually have from your research can support.

    In music composition and fiction writing, your legacy is mostly a function of your peak accomplishment in a single work. In academia, your legacy is mostly a function of how many minimum threshold accomplishments you can pull off.

    • Replies: @John Massey, @Tim, @marcel proust, @Douglas Knight
  23. @Razib Khan
    i think this is a trait where environment including family resources matters a lot. college costs $.

    Also you have to consider that the skills required for an advanced degree in Physics and an advanced degree in English might not overlap all that much.

  24. @ohwilleke
    Academia rewards you for how many papers, each of which contains something worthy of publication, you can get published. There is a strong career health/compensation/tenure eligibility/full professorship eligibility/grant application credibility incentive to break up your scientific insights into the maximum number of published papers that the total number of insights you actually have from your research can support.

    In music composition and fiction writing, your legacy is mostly a function of your peak accomplishment in a single work. In academia, your legacy is mostly a function of how many minimum threshold accomplishments you can pull off.

    One academic explained this to me as the concept of LPUs – Least Publishable Units.

    If you’re sitting on a load of hot stuff, you surely don’t blow it all in one big landmark paper, particularly in a field where you need to let it leak out in digestible bits so that you don’t shock society into outright rejection of what you are showing them.

  25. Tim says:
    @ohwilleke
    Academia rewards you for how many papers, each of which contains something worthy of publication, you can get published. There is a strong career health/compensation/tenure eligibility/full professorship eligibility/grant application credibility incentive to break up your scientific insights into the maximum number of published papers that the total number of insights you actually have from your research can support.

    In music composition and fiction writing, your legacy is mostly a function of your peak accomplishment in a single work. In academia, your legacy is mostly a function of how many minimum threshold accomplishments you can pull off.

    This is also why most of the really huge papers are from labs headed by people who already have a strong career, compensation, tenure, full professorship, and plenty of grant money.

    In this case, many older Principal Investigators make their Post-docs and Grad-students work for many years to get only one Cell/Science/Nature paper (assuming they don’t get totally scooped), when they could have had 10 lesser papers with the same data.

  26. @ohwilleke
    Academia rewards you for how many papers, each of which contains something worthy of publication, you can get published. There is a strong career health/compensation/tenure eligibility/full professorship eligibility/grant application credibility incentive to break up your scientific insights into the maximum number of published papers that the total number of insights you actually have from your research can support.

    In music composition and fiction writing, your legacy is mostly a function of your peak accomplishment in a single work. In academia, your legacy is mostly a function of how many minimum threshold accomplishments you can pull off.

    Years ago, an academic I knew said that rather than maximize the number of publications per insight, he planned to take a route that, conditionally, was mathematically equivalent: minimize the number of insights per publication. I believe that he is no longer in academia.

  27. @notanon

    ADHD is a genetic variant which probably had some evolutionary benefit. But is ill suited to the modern classroom. Some success has been achieved in schools willing or able to alter environmental and instructional milieu.
     
    I like the sound of that.

    The adhd gene is correlated with ddr7 I believe it is. Responsible for “wanderlust”.

    • Replies: @notanon
  28. @bbtp
    PZ Myers is not going to be happy about this and may deploy argumentum ad currum celer. Steve Hsu, on the other hand... well, he called this quite a while back.

    I doubt pzed will get worked up about 74 genes that statistically explain .0043 of variation in education level. The authors explicitly stated in the paper "educational attainment is primarily determined by environmental factors". But if Hsu gets excited about these results that's his problem.

    • Replies: @dearieme
  29. @AG
    SNP findings would help the next steps to pinpoint specific genes for brain functions. In hindsight, it is no surprise that all human abilities have a genetic basis. Knowledge might be environmental. But the ability to absorb knowledge (educational attainment) is a human genetic function.

    In the animal world, it is the trainability (learning new behavior) of smart animals. Stupid species are not trainable. Stupid animals are really stubborn. Taming is a form of learning. The same rules apply to humans too (Rushton's reasoning).

    Certain humans are obviously less trainable than others. But humans often project their own perceptions onto others. Thus you have some bloggers who stubbornly believe everything is genetically determined. In their world, most people are not trainable. On the other hand, East Asian culture strongly believes hard work or study can change outcomes, since East Asians are very trainable in the first place. They also project their own experience onto others, and believe everybody can be trained or educated.

    All but the most profoundly retarded humans are trainable. But some are obviously a lot more trainable than others. And as a general rule a person with a 90 IQ with a strong work ethic will be more successful by most measures than a 170 IQ bum.

    • Replies: @AG
  30. People need to be aware that SNPs are not genes with functions. They are only the smoke that helps you find the real smoking gun.

    A SNP's correlation with a trait might not reflect a true functional genetic relationship with that trait. Do not overinterpret this research finding.

  31. @KMD
    Educational achievement in this context is not the number of years of education completed but is measured by the number of certificates of higher education passed and the grades achieved at the age of sixteen. Basically, the grades are measured by performance in exams in various subjects plus an element based on performance in course work in the previous school year. Clearly, those that go on to examinations at a higher level and university education will eventually complete more years of education but, while this may well be influenced by the grades achieved at age 16, that is not what is being measured here.

    It still strikes me that educational attainment is going to be less precise and less heritable, given that four years of Oxford will be valued as much as four years at any other college. And the decision to seek an additional certificate will be subject to available alternatives, such as the family business, as well as tuition fees, or foreign-born preferences (which appears to be a complaint in the UK these days). OTOH, I would assume high heritability on the tails (someone who doesn’t complete the required 16 years of study probably has a mental disorder, and someone who has 24 years probably has a high IQ). Using years of education is obviously easy to quantify, I just want to understand the inferences that can be drawn.

  32. @ohwilleke
    My gut instinct tells me that capturing 0.43% of variation with the 74 most significant loci is a sign that there is something wrong with the model that the GWAS is trying to fit.

    This implies that the number of relevant SNPs is much greater than 1500 because the effect size in the long tail is going to get smaller and smaller. Maybe you're really looking at more like 5,000-10,000 SNPs that have not reached fixation in contemporary populations.

    Why does that feel wrong?

    Because if there are really 5,000-10,000 loci, the law of averages is going to kick in with a vengeance and similarly regression to the mean should be huge, but while IQ breeds fairly true, IQ variation between fairly closely related individuals is often quite significant. If children inherit randomly from both parents and there are really 5,000-10,000 loci that matter, all of which have very small effects, IQ differences between siblings ought to be really, really slight and rare, because the 5,000-10,000 random trials for the large number of low effect SNPs should average out between full siblings almost completely. But, while full siblings are definitely correlated, there are routinely meaningful magnitude sibling IQ differences. When no one inherited factor has an impact of more than say 0.02%, that shouldn't happen.

    Indeed, it ought to be possible to infer the number of independent heritable factors that account for the lion's share of genotype driven differences in phenotype from mathematical models and non-genetic heredity data, assuming that the instrument used to measure IQ is reasonably valid in terms of numerical differences in IQ scores being comparable to each other in magnitude of difference well into the tails of the normal distribution, and the absolute value starting at zero being meaningfully scaled. My numerical intuition is that in that kind of analysis you should be seeing something on the order of low hundreds of really significant loci instead of mid- to high thousands (the phenotypic results are quite insensitive to the exact number of very, very low effect loci; the amount of variability between siblings has more to do with the number of independent non-negligible effect factors).

    Admittedly, another big factor is probably that the crudity of using educational attainment as a proxy for IQ and other educational fitness enhancing cognitive traits is going to artificially depress the significance of each variant. It would be nice, for example, to pull high school transcripts and compare a finer grained measure like GPA or class rank or standardized test percentiles or military entrance exam scores to the panel of 74 culled loci for say 5%-10% of the sample to see how much that enhanced the power of each of these significant loci. I suspect that with a more finely calibrated measure of IQ you might go from ca. 0.43% to perhaps as much as 2%-5%, which would also be a nice way to establish that the effects of the genes identified are robust rather than being highly sensitive to methodology.

    There are lots of ways the model we are trying to fit could be wrong, leading to an underestimate of effect size as well. Simple additive variance may be useful for lots of SNPs, but there has to be either an underestimated non-additive component in a significant subset of the SNPs, or a lot of IQ inheritance driven by other aspects of the genetic code (copy number variants, epigenetic effects, polygenic impacts in "control" centers, and lots of other more subtle architectural features of our heredity that I'm not competent enough to suggest) that have big effects but aren't showing up using GWAS methodology.

    Another possibility that seems pretty plausible, given that a lot of the identified factors influence similar tissues in similar parts of the body at similar points in time in the person's development, is that a lot of low individual effect genes are physically clustered close to each other on particular parts of particular chromosomes. If so, the assumption that each of the 74 hits is inherited randomly with respect to the others is profoundly wrong; instead, inheriting one of the 74 dramatically increases the likelihood that you will inherit others. This effect ought to be pretty amenable to modeling with some kind of Monte Carlo methods in a computer model, since we know exactly where each of the 74 hits is located in the genome and our understanding of linkage disequilibrium should allow us to translate the locations into very mathematically precise measurements of the degree of mathematical independence of the genes from each other. Existing computing resources and commonplace programming expertise ought to be able to produce interesting and accurate results in a matter of a few weeks or less in the hands of a capable graduate student or three. If you have 74 target genes but, due to their actual specific locations in the genome, they have variability comparable to 40 truly independent degrees of freedom, then the law of averages paradox I raised above can be resolved pretty parsimoniously.

  33. AG says:
    @AndrewR
    All but the most profoundly retarded humans are trainable. But some are obviously a lot more trainable than others. And as a general rule a person with a 90 IQ with a strong work ethic will be more successful by most measures than a 170 IQ bum.

    It is about the level of trainability (from grade school, high school, college, to PhD, MD), depending on brain power.

    Yes, in an absolute sense, all animals can be trained. But the level of educational attainment depends on mental ability. With IQ 90, just forget about the possibility of an MD or PhD. It is a waste of time and effort. Work ethic would not help the situation here. The rule applies to the animal world too. It is extremely difficult to train insects or earthworms to perform a new task, for obvious reasons. Only with the basic mental power available for a specific task can work ethic or other complicated social factors have some impact on the final outcome.

    Confusion about the many influencing factors is the problem for most people. Seeing things clearly can help one succeed. But what makes you see things clearly? Independent analytic ability and learning ability are the answer (mental power again).

  34. @ohwilleke
    Academia rewards you for how many papers, each of which contains something worthy of publication, you can get published. There is a strong career health/compensation/tenure eligibility/full professorship eligibility/grant application credibility incentive to break up your scientific insights into the maximum number of published papers that the total number of insights you actually have from your research can support.

    In music composition and fiction writing, your legacy is mostly a function of your peak accomplishment in a single work. In academia, your legacy is mostly a function of how many minimum threshold accomplishments you can pull off.

    “Legacy” is an odd choice of word. It suggests to me memory over a span of decades. Most academics have zero legacy. Getting tenure requires both quality and quantity, though not quantity of quality works. Legacy is mainly measured by the quality of one’s peak idea, much as in the arts, though it is not so important if that idea is divided across multiple works or shares one work with other ideas. The other source of legacy is the number and quality of grad students who praise their ancestor.

  35. @RaceRealist88
    The adhd gene is correlated with ddr7 I believe it is. Responsible for "wanderlust".

    makes perfect sense to me

  36. @jaywalker
    I doubt pzed will get worked up about 74 genes that statistically explain .0043 of variation in education level. The authors explicitly stated in the paper "educational attainment is primarily determined by environmental factors". But if Hsu gets excited about these results that's his problem.

    “educational attainment is primarily determined by environmental factors”: every tank must have its armour.

  37. Can someone explain the interpretation of the “effect size” metric used in Supplementary Table 1.16? I’ve tried reading the supp info in this paper and Rietveld et al., but it’s not obvious to me what they are computing exactly.

    With the huge caveat that I don’t quite know what that metric means, I did a quick and dirty comparison of these effect size scores using the 1000 Genomes allele frequency data. That data provides allele frequency estimates (EAS_AF, AMR_AF, AFR_AF, EUR_AF, and SAS_AF) for five global super-populations. Only 158 of the 162 SNPs from Supplementary Table 1.16 were in the data set (98%). I did not attempt to determine why the 4 SNPs were missing from my analysis nor did I attempt to impute them. I constructed a score = sum(product(allele frequency, effect size)) for each population, but I need to emphasize that I don’t know if that’s the right way to treat “effect size”. The resulting scores have an unsurprising rank order:

    EAS = 0.380
    EUR = 0.349
    SAS = 0.332
    AMR = 0.299
    AFR = 0.280

    Before drawing any conclusions, it’s worth noting what could be wrong with that analysis, aside from the possibility that the score is not constructed from a proper understanding of effect size. For example, the effect sizes were estimated from an EUR population cohort and thus the effect sizes are likely biased in other populations. This is because the identified SNPs could be tagging other variants that are actually causal, and the linkage between the causal and tag SNPs could vary by population. It is also possible that there was a bias in the recruitment of these population cohorts such that they are unrepresentative of their populations on variables like educational attainment.

    All that said, one comparison that is hard to ignore is EAS vs AMR. Those two populations are quite similar genetically and neither was used in the discovery of the SNPs.
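
The score described above is simple enough to write down. This is only a sketch of that one formula: the function name, the dictionary layout, and the SNP identifiers and numbers in the usage note are made up for illustration and are not taken from Supplementary Table 1.16 or from 1000 Genomes. It also assumes the frequency is reported for the same allele the effect size refers to, which has to be checked SNP by SNP in real data.

```python
def population_score(effect_sizes, allele_freqs):
    """Return (score, n_used) where score = sum(frequency * effect size).

    effect_sizes: dict mapping SNP id -> effect size of the scored allele
    allele_freqs: dict mapping SNP id -> frequency of that same allele
                  in one population
    SNPs missing from either input are skipped, not imputed.
    """
    shared = sorted(effect_sizes.keys() & allele_freqs.keys())
    score = sum(effect_sizes[snp] * allele_freqs[snp] for snp in shared)
    return score, len(shared)
```

With invented inputs such as `population_score({"rsA": 0.02, "rsB": -0.01, "rsC": 0.03}, {"rsA": 0.5, "rsB": 0.2})`, the function returns a score of about 0.008 built from 2 shared SNPs, with `rsC` dropped because it has no frequency. Since the expected number of copies of an allele in a diploid individual is twice its frequency, this sum is half the population mean of the usual additive score, so the rank order between populations is unaffected.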

  38. @Karl Zimmerman
    Are genetic factors really estimated to only account for 20% of variation between individuals on educational attainment? I was under the impression it was rather more than that.

    Meta-analysis of educational attainment found H2 = 40% and increasing.

    http://www.ipr.northwestern.edu/publications/docs/workingpapers/2013/IPR-WP-13-09.pdf
