The figure shows standardized polygenic scores by population for Education GWAS, in descending order (1000 Genomes Populations, EA MTAG, N= 3,257 SNPs).
One function of a blog is to let people shoot down ideas. Conjectures have a short half-life. Refutations always snap at their heels. David Becker's latest version of country IQs received trenchant criticisms. He is now working on all of those (particularly dealing with countries with very low scores), and I will come back to him when he is ready with the revised edition. Next in the crosshairs is Davide Piffer, whom I put up for criticism by one and all in previous years.
Piffer has bounced back with a new paper in which he turns once again to an idea on which he has worked since 2013. He argues that a relatively small number of genetic markers, ranging from 127 to 3527 SNPs depending on the construction method, can be used as stand-ins for a general pattern of selection, which has led to some genetic groups being brighter than others.
Here is the abstract:
Abstract: Genetic variants identified by three large genome-wide association studies (GWAS) of educational attainment (EA) were used to test a polygenic selection model. Weighted and unweighted polygenic scores (PGS) were calculated and compared across populations using data from the 1000 Genomes (n = 26), HGDP-CEPH (n = 52) and gnomAD (n = 8) datasets. The PGS from the largest EA GWAS was highly correlated to two previously published PGSs (r = 0.96–0.97, N = 26). These factors are both highly predictive of average population IQ (r = 0.9, N = 23) and Learning index (r = 0.8, N = 22) and are robust to tests of spatial auto-correlation. Monte Carlo simulations yielded highly significant p values. In the gnomAD samples, the correlation between PGS and IQ was almost perfect (r = 0.98, N = 8), and ANOVA showed significant population differences in allele frequencies with positive effect. Socioeconomic variables slightly improved the prediction accuracy of the model (from 78–80% to 85–89%), but the PGS explained twice as much of the variance in IQ compared to socioeconomic variables. In both 1000 Genomes and gnomAD, there was a weak trend for lower GWAS significance SNPs to be less predictive of population IQ. Additionally, a subset of SNPs were found in the HGDP-CEPH sample (N = 127). The analysis of this sample yielded a positive correlation with latitude and a low negative correlation with distance from East Africa. This study provides robust results after accounting for spatial auto-correlation with Fst distances and random noise via an empirical Monte Carlo simulation using null SNPs.
Piffer explains the use of polygenic risk scores, and says:
The goal of this paper is to test the predictive power of polygenic scores, independently of spatial auto-correlation and noise due to drift and migrations. The prediction is that the polygenic selection model explains average population IQ better than a null model representing only drift and migrations. This implies that the frequencies of alleles with positive effect in the GWAS have different means across different populations.
Piffer accepts that an important limitation of polygenic risk scores based on European DNA is that they may miss other variants. This does not necessarily favour Europeans in comparison with other populations because polygenic scores contain both positive and negative variants. He explains:
For example, a recent GWAS carried out on Peruvians found a population-specific variant that reduces height by about 2.2 cm. Since this variant is polymorphic only in populations of Native American descent, it would have been missed by a European-based GWAS, potentially leading to an overestimation (relative to Europeans) of the PGS for the Peruvian population. A similar scenario might happen with EA polygenic scores, where population-specific variants with negative or positive effects are missed in other populations, leading respectively to over and under-estimations of the non-European population polygenic score. However, since population specific variants can also have a positive effect, the effects will tend to cancel each other out, thus limiting the potential bias.
Evidence suggesting that this is the case can be gathered from the polygenic score on height calculated using an European-based GWAS which produced very low scores for Peruvians, the second lowest in the 1000 genomes samples. Since most GWAS hits are not causal (so-called “tag SNPs”), but are genetically linked with “true” causal variants, and because patterns of Linkage Disequilibrium vary across populations (for example, Africans have on average much smaller LD blocks), this will reduce predictions for populations that are genetically distant from the GWAS sample.
Piffer also used the Lee (2018) data described below:
I said then:
At the request of a referee, Lee et al. had a go at using their polygenic score to predict the educational attainment of 1519 African Americans. It does not prove powerful, accounting for no more than 1.6% of the variance. This is an 85% come-down from the power of the score to predict European attainments, which they describe as an attenuation. However, this degree of attenuation is similar to that of 3 papers using European risk data to predict African American scores: 63% attenuation for education years, 88% attenuation for psychosis, and 85% attenuation for BMI.
However, the predictive power of the polygenic score in other races is expected to decline purely from differing LD (linkage disequilibrium) patterns; a SNP that tags a causal SNP in Europeans may not do so in Africans. The mere fact of a decline is not enough to say that the effects of the true causal sites differ in the two races. It may simply be that the SNPs point in a slightly wrong direction, but this is not resolvable at the moment. It would be good to have far more genetic and intelligence data on Africans, and then see how predictions based on them, our probable ancestral root stock, predict European abilities. All that for later, when better data become available.
In all his calculations Piffer uses height as a control variable, because it is polygenic, has heritable and environmental factors, and can be measured accurately in a non-contentious way. This makes it a good comparison with intelligence, which is also polygenic, also affected by environmental factors and can be measured accurately in a way that some people find absolutely, totally and utterly contentious. See his methods statement in his paper for the construction of the polygenic and socio-economic scales. Piffer uses Monte Carlo simulations to obtain a benchmark for spurious correlations.
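To make the Monte Carlo idea concrete, here is a minimal sketch of an empirical p value built from random SNP sets. It is not Piffer's code: the function names, the toy frequencies and the one-sided test are all my assumptions, and real null SNPs would be correlated across populations through drift, which the independent random numbers below are not.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def empirical_p(observed_r, null_freqs, trait, n_snps, n_draws=1000, seed=0):
    """Share of random SNP sets whose score correlates with the trait
    at least as strongly as the observed score does (one-sided).

    null_freqs: list of per-SNP frequency lists, one value per population.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(null_freqs, n_snps)
        # Score per population = mean frequency over the sampled SNPs.
        scores = [sum(col) / n_snps for col in zip(*sample)]
        if pearson(scores, trait) >= observed_r:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_draws + 1)


# Invented data: 200 "null" SNPs in 6 hypothetical populations.
toy_rng = random.Random(42)
toy_null = [[toy_rng.random() for _ in range(6)] for _ in range(200)]
toy_iq = [85, 90, 95, 100, 105, 110]  # made-up population means

p_value = empirical_p(observed_r=0.9, null_freqs=toy_null,
                      trait=toy_iq, n_snps=20)
```

The smaller `p_value` is, the less often random SNP sets of the same size match the observed correlation, which is the sense in which the simulation serves as a benchmark for spurious results.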
Piffer is not trying to predict the IQs of individuals in different genetic groups, but merely their group averages. He is using a simple and restricted set of genetic findings to see if he can predict these averages, arguing that the limited set he is studying is an indicator that a much larger set of genes has gone through an evolutionary process of selection, and is responsible for group differences.
A note on Fst measures: a score of 0 means that two populations are interbreeding freely, which is described as “complete panmixis”. A score of 1 means that the two populations do not share any genetic diversity, and are not interbreeding, as happens with geographic or cultural isolation from each other.
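For readers who like to see the arithmetic, here is a rough sketch of Fst for a single two-allele SNP, using the textbook heterozygosity form (H_T - H_S) / H_T with two equally weighted populations. The function name is mine, and genome-wide estimators used in practice are more elaborate than this.

```python
def fst_two_populations(p1, p2):
    """Fst for one biallelic SNP from its allele frequency in two
    equally weighted populations: (H_T - H_S) / H_T."""
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if the two populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        # Same allele fixed in both populations: nothing to differentiate.
        return 0.0
    return (h_total - h_within) / h_total


same = fst_two_populations(0.5, 0.5)      # identical frequencies
opposite = fst_two_populations(0.0, 1.0)  # different alleles fixed
partial = fst_two_populations(0.2, 0.8)   # somewhere in between
```

Identical frequencies give 0, populations fixed for different alleles give 1, and the in-between case lands at 0.36.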
The EDU3 score in Fig 2 is the most reliable. Note that US Blacks and African Caribbeans from Barbados are above the trend line, which Piffer discusses below.
Table 5 and Fig 9 show a fascinating finding: standard psychometrics suggests that the Ashkenazi are the brightest genetic group (estimated IQ 110), which is very close to what the polygenic score predicts (IQ 108). The sample size for the Ashkenazi is only 145. The Finnish sample has 1738 subjects. Both are European sub-groups, hence less subject to issues which come from genetic distance (linkage disequilibrium decay, population-specific variants), but it would be good to have a replication of the Ashkenazi result.
In his discussion, Piffer asserts:
The calculation of population-level polygenic scores (average allele frequencies with positive GWAS beta) is a promising and quick approach to test signals of polygenic adaptation. The results clearly showed population differences in PGS (Figure 3), which correlated with estimates of average population IQ (Figure2) and students performance on standardized tests of mathematics, reading and science (r= 0.9 and 0.8, respectively).
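The calculation described in that passage, the average frequency of alleles with positive GWAS beta, can be sketched in a few lines. This is only an illustration under my own assumptions: the function names, the three populations, the frequencies and the effect sizes are all invented, and the paper's methods section remains the authority on how the real scores were built.

```python
def population_pgs(freqs, betas=None):
    """Mean frequency of the trait-increasing alleles in one population.

    freqs: frequency of the allele with positive GWAS beta, one per SNP.
    betas: optional positive effect sizes; if given, the mean is weighted.
    """
    if betas is None:
        return sum(freqs) / len(freqs)
    return sum(f * b for f, b in zip(freqs, betas)) / sum(betas)


def standardize(scores):
    """Convert raw population scores to z-scores across populations."""
    values = list(scores.values())
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / (len(values) - 1)) ** 0.5
    return {pop: (v - mean) / sd for pop, v in scores.items()}


# Made-up frequencies for three hypothetical populations and four SNPs.
toy_freqs = {
    "pop_A": [0.60, 0.55, 0.70, 0.40],
    "pop_B": [0.50, 0.45, 0.60, 0.35],
    "pop_C": [0.40, 0.50, 0.45, 0.30],
}
toy_betas = [0.02, 0.01, 0.03, 0.01]  # invented effect sizes

unweighted = {pop: population_pgs(f) for pop, f in toy_freqs.items()}
weighted = {pop: population_pgs(f, toy_betas) for pop, f in toy_freqs.items()}
z_scores = standardize(unweighted)
```

The standardized values are what would then be set against average population IQ; the weighted version simply lets SNPs with larger GWAS effects count for more.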
The EDU3 polygenic score is the most robust, and is the best predictor. Monte Carlo simulations strongly suggest that the reported findings are not a fluke. Using this technique to construct a separate equation to predict height succeeds in predicting average population heights. As you might expect, that height equation does not predict average population intelligence, which shows that his equation on intelligence is not simply picking up a simple racial difference, and dressing it up as something which predicts racial differences in intelligence. The mere fact that the equation can predict at least the average intelligence of other racial groups shows that the European origin of the polygenic risk score does not prevent it from having wider relevance. It may well be the case that in all racial groups the genetic variants which boost or reduce intelligence are broadly similar (although they may be located at slightly different points in the genomic sequence).
Piffer identifies a statistical issue with polygenic risk scores: when do you stop? That is to say, how many predictive SNPs should you include? The strongest predictors and no more, or the long list of any that are predictive to any extent? One of his reviewers suggested the following approach.
“Start with the quantile that has the most significant SNPs, and then add quantiles in declining order of genome-wide significance. Initially, adding quantiles will improve prediction, but after a certain point, adding more quantiles will make prediction worse. At that inflection point you have the optimal PGS”.
Piffer found that in his data there was degradation of signal across significance quantiles, as shown by a weak trend for lower significance SNPs to have lower correlation with population IQ. There is flexibility about how many SNPs are needed for good predictive power, and this has been discussed in more detail by Steve Hsu, regarding the benefits of compressed sensing.
The correlation coefficients between each score and the population IQ were computed. In turn, the correlation between the correlation coefficient and the quantile was computed, yielding a weak but significant correlation (Spearman’s r = −0.38, p = 0.0152). The PGS generated from most SNP subsets had lower predictive power than that of the full set.
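The trend test described here is a correlation of correlations: one value of r per significance quantile, then a rank correlation between r and the quantile's position. A small sketch follows, with invented per-quantile values (the real analysis evidently used many more quantiles) and hand-rolled helper functions so that nothing beyond the standard library is assumed.

```python
def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Quantile 1 holds the most significant SNPs; the r values are invented.
quantile_index = [1, 2, 3, 4, 5, 6]
r_with_iq = [0.88, 0.90, 0.85, 0.86, 0.80, 0.82]

trend = spearman(quantile_index, r_with_iq)
```

A negative `trend` means that scores built from less significant SNPs track population IQ less well, which is the direction reported in the paper.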
As regards making general racial predictions on the basis of European DNA, the polygenic score is surprisingly good, so much so that one can discuss the findings and the implications. For example, he says:
Indeed, the IQ of African Americans appears to be higher than what is predicted by the PGS (Figure 2), which suggests this cannot be explained by European admixture alone, but it could be the result of enjoying better nutrition or education infrastructure compared to native Africans. Another explanation is heterosis (“hybrid vigor”), that is the increase in fitness observed in hybrid offspring thanks to the reduced expression of homozygous deleterious recessive alleles.
Future GWAS studies should be carried out on non-European populations. Indeed, trans-ethnic GWASs are a promising resource for the identification of alleles with homogeneous and heterogeneous effects and the computation of population-specific polygenic scores. Specifically, they would enable us to include SNPs that are polymorphic only in some populations, and to find the causal SNPs that have the same causal effect in all populations.
As Piffer is well aware, many people working in genetic research have not been convinced by his arguments. When he first presented his findings, they identified a number of criticisms, to which he replied. He expected a reply to his comments, but was told to publish a formal paper, which is a reasonable request. Piffer has now had his paper accepted, so it is time to criticize it. Open science, open discussion and may the best arguments win.