It is a great pleasure to see that a massive new study on intelligence has just been published, after years of work and months of publication delays. Anything which can be done to speed up the publication of results is to be welcomed. Research has now moved to an international dimension, with disparate groups being managed and cajoled into cooperative ventures. This is a major undertaking that requires academia to develop new skills: diplomacy, coordination, careful assembly of very different data formats and research protocols, and a sensitive understanding of individual egos and conflicting cultural and political sensitivities.
In contrast, all that is required of a commentator is patience, in this case a year of waiting.
Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. James Lee et al. Nature Genetics, 2018
It is a very long paper, almost book size, because it has to cover so much ground. The table of contents lists 205 pages, of which 149 are explanatory text. Happily, supplementary figure 8 provides me with a picture I can use instead of having to explain everything. The picture (shown above) encapsulates the vast strides that have been taken in linking genetic research to the terra incognita of the synaptic gap which, if I thought about it at all as an undergraduate, seemed simply a mysterious chemical soup which passed messages on upstairs to the brain, my main focus of interest. Now we have not only a diagram of the transmitter exchanges but linkage to the snippets of genetic code which lead to the actual processes.
First, a little comment about “years of education”. I think of this as a weak proxy for intelligence, and that it is used simply because the data are more widely available than intelligence test results. It will still be picking up interesting things, but will under-record pure intelligence. To look at the extent of this difference, look at other studies which have used actual intelligence test results, and you will find that they seem to tap into the same areas. When that is done, the correlation between intelligence and years of education is 0.7 which is perfectly respectable, and as high as correlations between Wechsler subtests. Also, see below for a comparison done within the study using two intelligence tests.
Second, in terms of the history of this research, this paper is EA3 (educational attainment 3rd sample) and follows on from previous work EA1 and EA2. Internationally there is some overlap and repetition in papers from other labs, and we probably need to explain these better. You may remember that last September I hoped “someone somewhere is keeping track of the overall picture, perhaps in a control room with multiple screens, like the NASA control centre of old, tracking the orbit of each SNP as it hoves into sight.” Not yet.
Third, the bulk of the paper is about the methods used to put together the samples into one big database and all the corrections and assumptions which go into detecting the genetic signal within the noise. Some analyses, for example within-family studies, cannot be done reliably without 47,000 sibling pairs and they had only 22,135 identified, so they do only some restricted inferential work. A great deal has to be said about all these statistical matters, and the paper serves as a text for where the field has reached at the moment. Other leading labs will pile in with their detailed observations in due course. Please note the quaintly named “Winner’s Curse Adjustment”.
In this section the authors say:
Using a large sample of genotyped parent-child pairs from Iceland, the study documented that a polygenic score for EduYears constructed entirely from non-transmitted parental alleles predicts a respondent’s educational attainment. A plausible interpretation of this finding is that non-transmitted alleles are associated with EduYears through their effects on the child’s rearing environment. The effect of a polygenic score based on non-transmitted alleles was approximately 30% as large as the effect of a polygenic score based on transmitted alleles. An analogous analysis of height found that the effect of the non-transmitted-allele score was 6% as large as the effect of the transmitted allele score.
I find this difficult to take in, because the ideas of the “child that you could have been” and “the parents you could have had” are new to me. However, the paper argues that an effect going from parent genotype, to parent phenotype, to offspring phenotype, is a very plausible explanation of why the effects inferred from the within-family studies are smaller than those from the population GWAS. Assortative mating probably makes some contribution to the discrepancy as well.
As shown in Supplementary Table 38, in Add Health, a one-standard-deviation increase in the score is associated with a 4.7 percentage-point increase in the probability of completing high school (incremental pseudo-R2=6.2%), a 15.6 percentage-point increase in the probability of completing college (incremental pseudo-R2=9.5%), and a 7.1 percentage-point reduction in the probability of having retaken a grade (incremental pseudo-R2=4.0%).
Here are the findings in a histogram:
Now consider, as Steve Hsu has done, the opportunity faced by parents who are having IVF because their genes contain a risk factor for Huntington’s chorea, or cystic fibrosis, or some other awful disorder, such that their petri dish embryos have to be screened before the best one is implanted in the womb. A doctor might say, very privately, to the parents “We have knocked out the genes for disease X as requested. All these 10 embryos will be fine. Would you like us to select the one of those 10 most likely to complete high school? Entirely up to you”.
At the request of a referee, Lee et al. have a go at using their polygenic score to predict the educational attainment of 1519 African Americans. It does not prove powerful, accounting for no more than 1.6% of the variance. This is an 85% come-down from the power of the score to predict European attainments, which they describe as an attenuation. However, this degree of attenuation is similar to that of 3 papers using European risk data to predict African American scores: 63% attenuation for education years, 88% attenuation for psychosis, and 85% attenuation for BMI. Moreover, the predictive power of the polygenic score in other races is expected to decline purely from differing LD (linkage disequilibrium) patterns; a SNP that tags a causal SNP in Europeans may not do so in Africans. The mere fact of a decline is not enough to say that the effects of the true causal sites differ in the two races. It may simply be that the SNPs point in a slightly wrong direction, but this is not resolvable at the moment. It would be good to have far more genetic and intelligence data on Africans, and then see how predictions based on them, our probable ancestral rootstock, predict European abilities. All that for later, when better data become available.
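For readers who like to check the arithmetic, here is a minimal sketch of how an attenuation figure of this kind relates the two variance-explained numbers. It assumes attenuation is simply one minus the ratio of the two R2 values; the function names are my own, and the European figure it prints is only what the rounded 1.6% and 85% quoted above imply, not a number taken from the paper.

```python
def attenuation(r2_target, r2_reference):
    """Fractional loss of predictive power: 1 - (target R2 / reference R2)."""
    return 1.0 - r2_target / r2_reference


def implied_reference_r2(r2_target, attenuation_fraction):
    """Reference-sample R2 implied by a target R2 and an attenuation fraction."""
    return r2_target / (1.0 - attenuation_fraction)


# Rounded figures quoted in the text above.
r2_african_american = 0.016   # "no more than 1.6% of the variance"
reported_attenuation = 0.85   # "an 85% come-down"

# Back out the European R2 that these two rounded numbers imply (an inference).
r2_european_implied = implied_reference_r2(r2_african_american, reported_attenuation)
print(f"Implied European R2: {r2_european_implied:.3f}")

# Round trip: recomputing the attenuation should give back roughly 0.85.
print(f"Recomputed attenuation: {attenuation(r2_african_american, r2_european_implied):.2f}")
```

Because both inputs are rounded, the implied European figure should be read as approximate.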
The authors see whether the polygenic score can predict actual intelligence test results: the Peabody Picture Vocabulary Test and the Henmon-Nelson Test.
The MTAG-CP score is more predictive than the GWAS-CP score, with an incremental R2 of 6.9% and a gain over the GWAS-CP score of 1.8%.
In WLS, the MTAG-CP score is the most predictive of the four scores, with an incremental R2 of 9.7% and a gain over the GWAS-CP score of 2.7%.
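The term "incremental R2" may be unfamiliar: it is the extra variance explained when the polygenic score is added to a regression that already contains baseline covariates. The sketch below illustrates the idea on simulated data with a single made-up covariate, using the closed form for a two-predictor linear regression. The effect sizes, sample size and variable names are all hypothetical and chosen only for illustration; this is not the authors' pipeline or their data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def incremental_r2(outcome, covariate, score):
    """R2 of (covariate + score) minus R2 of the covariate alone.

    Uses the standard closed form for a linear regression with two predictors.
    """
    r_yc = pearson(outcome, covariate)
    r_ys = pearson(outcome, score)
    r_cs = pearson(covariate, score)
    r2_full = (r_yc ** 2 + r_ys ** 2 - 2 * r_yc * r_ys * r_cs) / (1 - r_cs ** 2)
    r2_base = r_yc ** 2
    return r2_full - r2_base


# Simulated data: every number here is invented for illustration.
rng = random.Random(0)
n = 5000
covariate = [rng.gauss(0, 1) for _ in range(n)]   # stand-in baseline covariate
score = [rng.gauss(0, 1) for _ in range(n)]       # stand-in polygenic score
noise_sd = math.sqrt(0.87)                        # keeps outcome variance near 1
outcome = [0.2 * c + 0.3 * s + rng.gauss(0, noise_sd)
           for c, s in zip(covariate, score)]

gain = incremental_r2(outcome, covariate, score)
print(f"Incremental R2 from adding the score: {gain:.3f}")
```

With these made-up settings the score's true contribution is about 9% of the outcome variance, so the printed value should land in that neighbourhood. A real analysis would use many covariates and a proper regression routine.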
In sum, the score can be used to predict the intelligence results to some degree, though once we get very large samples with intelligence test results then predictive power will be very likely to improve.
Here is a nice result, which validates work Heiner Rindermann did years ago, showing that it was better to have educated parents than rich parents.
It barely needs saying, but the tissues these SNPs most enrich are in the brain.
This paper covers considerable ground, and does so in a careful way. I think it will have a massive impact. The sample size and the quality controls on the data will overcome doubts about the applicability of the results. That is, I assume that they will (some readers will wish to avert their eyes) and the excitement of this and similar papers will influence how we think about intelligence, education and heritability. It is part of an international effort to identify the biological causes of cognitive ability. This brings the whole project many steps closer to its goal.