Here’s a vast gene study, the first on the topic of cognition with a sample size of more than a million, focusing on educational attainment (years of education). From Nature Genetics:
Full text for free here.
James J. Lee, Robbee Wedow, […] David Cesarini
Nature Genetics (2018). Published: 23 July 2018
Abstract
Here we conducted a large-scale genetic association analysis of educational attainment in a sample of approximately 1.1 million individuals and identify 1,271 independent genome-wide-significant SNPs. For the SNPs taken together, we found evidence of heterogeneous effects across environments. The SNPs implicate genes involved in brain-development processes and neuron-to-neuron communication. In a separate analysis of the X chromosome, we identify 10 independent genome-wide-significant SNPs and estimate a SNP heritability of around 0.3% in both men and women, consistent with partial dosage compensation. A joint (multi-phenotype) analysis of educational attainment and three related cognitive phenotypes generates polygenic scores that explain 11–13% of the variance in educational attainment and 7–10% of the variance in cognitive performance. This prediction accuracy substantially increases the utility of polygenic scores as tools in research.
Here’s the authors’ FAQ.
Ed Yong writes in The Atlantic:
Perhaps counterintuitively, Benjamin thinks that his team’s research “is really important for research on improving educational systems.” To understand how, forget genes for a moment, and think about wealth.
It’s uncontroversial to say that people who are born into rich families are more likely to fare better in school than those from poorer backgrounds. Of course, poor kids can still soar in school, and rich ones can flunk out, but few would deny that money is a powerful influence on people’s futures. Now, consider that household income explains just 7 percent of the variation in educational attainment, which is less than what genes can now account for. “Most social scientists wouldn’t do a study without accounting for socioeconomic status, even if that’s not what they’re interested in,” says Harden. The same ought to be true of our genes.
Imagine that authorities are planning to provide free preschool to kids from disadvantaged backgrounds. To see if such a policy actually helps children stay in school for longer, scientists would randomly assign the free classes to some kids but not others. Then, they would look at how the two groups fared. In doing so, they’d always try to account for factors like wealth that might also vary between the two groups. Similarly, “you can now wash away the genetic effects so you don’t have to worry about them,” says Benjamin. And in doing so, researchers could more precisely work out whether a policy change has any benefits—and they could do it through smaller, cheaper studies.
This, he argues, is the most powerful reason to study the genetics of education or cognitive ability—and ironically, it has very little to do with genes. Instead, it’s a way of making social science more powerful.
Here’s Carl Zimmer’s article in the NYT:
… The researchers scanned the DNA surrounding these influential variants and found an intriguing pattern.
“They’re not just randomly scattered around the genome,” said James J. Lee, a behavioral geneticist at the University of Minnesota and co-author of the new study.
The variants are linked to genes active in the brain, helping neurons to form connections.
In other words, of the genes where we know what they do, they tend to be involved in growing the brain. Of course, that’s what, in hindsight, you’d expect, so that demonstrates the prima facie validity of the findings.
A key to educational attainment may not be how quickly information is acquired, but how quickly it can be shared between various regions.
“Maybe it’s not about how fast a signal can zip along a cable,” Dr. Lee said. “It’s about the complexity of the connections between point A and B.”
But the genetic links suggest another, perhaps stranger possibility: Some variants linked to education work not in the brains of students, but in the people they inherited the variants from — their parents.
By somehow shaping the behavior of parents, these variants may alter the environments in which children grow up in a way that helps or impinges on time spent in school.
Based on their findings, Dr. Benjamin and his colleagues figured out how to calculate a genetic “score” for educational success. The more variants linked to staying in school longer, the higher an individual’s score.
The researchers calculated a score for a group of 4,775 Americans, ranking them into five groups. The researchers found that 12 percent of people in the lowest fifth finished college. Among people in the top fifth, 57 percent finished college.
So that’s pretty decent predictive power, at least as strong as many standard sociological factors such as family income. What we can do now is go back to family income and subtract the genetic influence and see what’s leftover.
A similar result emerged when the scientists looked at how many people in each group had to repeat a grade in school. In the lowest fifth, 29 percent did, while in the top fifth, only 8 percent did.
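The scoring and ranking steps described in the article can be sketched in a few lines. This is only a toy illustration: the weights, genotypes and outcome below are randomly generated stand-ins (all names are hypothetical), not the study's data or its actual scoring pipeline.

```python
import random

def polygenic_score(allele_counts, weights):
    """Weighted sum of allele counts (0, 1 or 2 copies per variant)."""
    return sum(count * weight for count, weight in zip(allele_counts, weights))

def rate_by_fifth(scores, outcomes):
    """Rank people by score, split them into five equal groups, and return
    the share with a positive outcome in each group, lowest fifth first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    rates = []
    for q in range(5):
        group = order[q * n // 5:(q + 1) * n // 5]
        rates.append(sum(outcomes[i] for i in group) / len(group))
    return rates

random.seed(1)
n_people, n_variants = 2000, 100          # toy sizes, far smaller than the study's
weights = [random.gauss(0, 1) for _ in range(n_variants)]   # made-up effect sizes
people = [[random.choice((0, 1, 2)) for _ in range(n_variants)]
          for _ in range(n_people)]
scores = [polygenic_score(person, weights) for person in people]

# Toy outcome: the standardized score plus much larger random noise.
mean = sum(scores) / n_people
sd = (sum((s - mean) ** 2 for s in scores) / n_people) ** 0.5
finished = [(s - mean) / sd + random.gauss(0, 2) > 0 for s in scores]

rates = rate_by_fifth(scores, finished)
print([round(r, 2) for r in rates])       # completion rate, lowest to highest fifth
```

Even with noise twice as large as the signal, the simulated completion rate rises from the lowest fifth to the highest, which is the kind of pattern the article reports.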
But when Dr. Benjamin and his colleagues calculated scores for African-Americans, it failed to predict how well different groups had done in school. One likely reason is that genetic markers aren’t reliable guides to how genes influence traits in different populations.
In other words, African Americans are genetically different enough from white Americans that a model developed on whites doesn’t work on blacks? But as Carl Zimmer explained in his vast recent book on genetics, race doesn’t really exist genetically for reasons. How to reconcile those two ideas? Perhaps Carl has some thoughts he’ll share with us …
Dr. Benjamin and his colleagues hope to grow their study to 2 million people or more, and expect to find thousands more genes linked to education.

Ten, nine, eight, seven, six, five, four… countdown before James J. Lee and Robbee Wedow are banished and shamed into silence. Sort of careless of them to notice things, say like James Watson.
Hmmm… I wonder if it will become optional to submit a genetic sequence when applying to college?
Two interesting points in the FAQ:
“As sample sizes for GWAS continue to grow, it will likely be possible to construct a polygenic score for educational attainment whose predictive power comes closer to 20% of the variance in educational attainment across individuals (Rietveld et al. 2013)”
This seems much lower than the numbers that are often bandied about in the Jensenite crowd.
“First, it means that polygenic scores of individuals from different ancestry groups cannot be meaningfully compared. A recent paper (Martin et al. 2017) illustrated this point in the context of polygenic scores for predicting height; in the sample analyzed in that paper, polygenic scores for height for individuals of European ancestries are on average larger than those of South Asian ancestries which in turn are larger than those of African ancestries. In actuality, however, populations of African ancestries represented by the sample have similar height to populations of European ancestries, and both African and European populations tend to be taller than South Asian populations.”
But if race is a “social construct”, why can’t you compare the polygenic scores of peoples from different ancestry groups? Lewontin wants to know.
If they filter out the effect of genes and socioeconomic status, is there going to be anything much left for biologists and sociologists to study?
This basically says that the races are so different genetically that it’s hard to compare them.
The basic question is like this: if your brother is taller than you, it’s probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.
Charles Murray said a long time ago that he figured racial differences were genetically like family differences, just more so. But this recent height study suggests that racial differences may be more profound.
It’s an empirical question that needs more research, and when more data come in we’ll still wind up arguing over whether glasses are part full or part empty.
Their glasses have been refilled and reëmptied several times.
Which reminds me, Atlantic vs New Yorker:
https://www.theatlantic.com/entertainment/archive/2012/04/we-resist-further-cooperation-cooperation/328832/
Worth emphasizing that this is how they measure predictive value:
As far as I can tell (having taken a quick look through the supplementary material PDF and spreadsheet) they scrupulously avoid mentioning the R^2 for their baseline model. I wonder what it is and how much variance is explained by each of the variables involved? Seems like a fairly fundamental question, and an odd omission given the supplementary PDF is 207 pages long.
The predictive power is much reduced for African-Americans, but still nonzero. From page 132 of the supplementary material:
Part of the explanation is that the SNPs in the PGS may just be linked to the actual causal SNPs (rather than causal themselves). Because patterns of linkage disequilibrium (LD) differ between races the same relationships may not hold in differing races.
I realize this assumes race is more than a social construct. The study correcting for the first 10 PCs of the “genetic relatedness matrix” (I wonder how that relates to race ; ) says all that I think needs to be said about the “race is solely a social construct” article of faith.
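The linkage-disequilibrium point can be illustrated with a small simulation. It is a sketch under stated assumptions: continuous standardized variables stand in for genotypes, the tag variant affects the trait only through the causal variant, and the two LD values (0.9 and 0.3) are invented for illustration rather than taken from any population. Under those assumptions the variance a tag explains is roughly the causal variant's share multiplied by the squared tag-causal correlation.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def tag_r2(ld_r, n=20000, effect=0.3):
    """Variance in a trait explained by a tag variant whose correlation
    with the true causal variant is ld_r (hypothetical values)."""
    causal = [random.gauss(0, 1) for _ in range(n)]
    tag = [ld_r * c + (1 - ld_r ** 2) ** 0.5 * random.gauss(0, 1) for c in causal]
    trait = [effect * c + (1 - effect ** 2) ** 0.5 * random.gauss(0, 1) for c in causal]
    return r_squared(tag, trait)

random.seed(4)
r2_strong_ld = tag_r2(0.9)   # tag tracks the causal variant closely
r2_weak_ld = tag_r2(0.3)     # same causal effect, weaker tagging
print(round(r2_strong_ld, 3), round(r2_weak_ld, 3))
```

The causal effect is identical in both runs; only the tagging differs, yet the tag's predictive value drops sharply when the LD is weaker.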
I think many consider that a feature, not a bug.
It sure beats compulsory nude photos.
Though stripping is still optional at the Ivies.
https://i0.wp.com/www.nationalreview.com/wp-content/uploads/2018/05/letitia-chai-cornell-senior-thesis.jpg?fit=789%2C460&ssl=1
It’s uncontroversial to say that people who are born into rich families are more likely to fare better in school than those from poorer backgrounds.
I never get tired of posting this from La Griffe:
Black children from the wealthiest families have mean SAT scores lower than white children from families below the poverty line.
Black children of parents with graduate degrees have lower SAT scores than white children of parents with a high-school diploma or less.
A lot of affirmative action today probably amounts to awarding places to the kids of black doctors at the expense of the children of white plumbers.
Interesting question. Do scientific disciplines ever expire because they’ve plumbed their depths?
Anthropology has probably passed its sell-by date. Outside a couple of tribes in the Amazon and the Sentinel Islanders, there are no human groups untouched by modernity (and the mere presence of the observer) who are left to study. So you’re either a paleontologist or you’re a sociologist.
There’s still plenty of uncharted territory in genetics.
I wonder what the old writers/editors/owners of The Atlantic about 50 years ago would think, if you took a few recent issues into a time machine with you and showed them Coates.
From his perspective, Yong is working hard to squeeze lemonade from lemons.
Incremental “adjusted R-Sqd?”
Otherwise you are just adding noise.
Yep. I hope they all got tenure, or another job lined up.
That might solve the Rachel Dolezal problem.
To explain 11% of the variance they used a polygenic score built from 1 million SNPs. The majority of these SNPs did not show up in the GWAS.
The question is how large the sample was: “independent data that was not included in the GWAS.” Using 1 million variables with 1.2 million subjects comes very close to a guaranteed over-fit.
I suspect that they used a brute-force fit similar to that used by Steven Hsu, who could explain 9% of educational attainment in a 500k sample using only 10,000 SNPs. See here:
https://www.biorxiv.org/content/biorxiv/early/2017/09/18/190124.full.pdf
But Steven Hsu used pre-filtering of SNPs via individual correlations, so not all SNPs were used in his Lasso method.
Do we really need 990,000 extra SNPs to explain an additional 2% of variance?
Is it possible that their method was indeed brute force: that they used all the SNPs they had, got the best fit on the 1.2 million sample, and then calculated the variance explained on a validation sample that was smaller than 1.2 million? Is it possible that most of those 1 million SNPs are just spurious?
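The over-fitting worry, and why scoring in independent data addresses it, can be shown with pure noise. The sketch below is hypothetical and much smaller than any real analysis: it builds a score from predictors that have no relation to the outcome, using weights estimated one predictor at a time, and then compares the variance "explained" in the training sample with that in a held-out sample. It says nothing about what the paper's authors actually did.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

random.seed(3)
n_train, n_test, n_snps = 200, 200, 300    # toy sizes

def noise_sample(n):
    """Predictors and outcome drawn independently: there is no true signal."""
    x = [[random.gauss(0, 1) for _ in range(n_snps)] for _ in range(n)]
    y = [random.gauss(0, 1) for _ in range(n)]
    return x, y

x_train, y_train = noise_sample(n_train)
x_test, y_test = noise_sample(n_test)

# One weight per predictor, each estimated separately in the training sample.
weights = [sum(row[j] * y for row, y in zip(x_train, y_train)) / n_train
           for j in range(n_snps)]

def score(rows):
    return [sum(w * v for w, v in zip(weights, row)) for row in rows]

r2_train = r_squared(score(x_train), y_train)   # inflated by over-fitting
r2_test = r_squared(score(x_test), y_test)      # close to zero
print(round(r2_train, 3), round(r2_test, 3))
```

Spurious weights look impressive in the sample used to estimate them and collapse in a sample they have never seen, so an out-of-sample figure cannot be produced by noise fitting alone.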
That’s not it; different races still have the same gene loci (these things tend to be very conserved), but they have different versions (alleles) that are absent in one of them, making it difficult to compare them unless both races are well represented in the GWAS. I mean: in a set of white people, you can catch the alleles present among whites, and associate them with a desired phenotype, and then use their presence in a given white individual to predict the phenotype. But if you try to use the GWAS results obtained from the white dataset to predict the phenotype in a black person, the prediction will not be very reliable, mostly because blacks may have different relevant alleles not present in the white dataset, whose effect we still don’t know.
Gregory Cochran wrote about this in several blogposts, where he explains it better than I can:
https://westhunt.wordpress.com/2017/02/09/everything-is-different-but-the-same/
https://westhunt.wordpress.com/2018/06/03/wysiwyg/
And who’ll pay the tab.
And hasn’t it also been shown that brain size does indeed correlate with intelligence, and that different races have different average brain sizes, contrary to whoever fudged the data a century ago?
(I don’t know enough about this, so I don’t know who or what to look up, but I remember reading something to that effect, probably here.)
In the paper they state:
But in FAQ:
The abstract is misleading and the conclusions in the paper are misleading. The 1,271 SNPs, or as they call them, “lead SNPs,” explain only 3.9%. They do not say they had to toss in 998,729 extra non-lead SNPs to push the 3.9% up to 11%. And apparently these 998,729 do not belong to the “treasure trove.”
Now its time for N. Korea to demonstrate to the world that it has turned over a new leaf and take in boatloads of African muslim refugees. I think that might work well for liberals as well as it might be the only place in the world where there are not countless reports of them misbehaving. And probably every year or two, they’ll need another boatload or two to replace all those that mysteriously disappeared, so it’ll help with Sailer’s graph too.
No need to quote La Griffe, most people on the left will dismiss that out of hand.
Instead show them this graph;

I had one guy accuse me of getting it from Stormfront. When you tell them it’s from The Journal of Blacks in Higher Education, they get…quiet.
If you look at Steve’s blog you will see that the current study used summary statistics which do not permit the L1-optimization technique he used.
Regarding the amount of variance explained, since they don’t tell us the R^2 of the baseline model we can’t conclude much about the total variance explained.
No comments on Zimmer’s article, even though every other recent article of his has comments. Except, that is, for the one just before it on ancient toolmakers in China. I wonder if the NY Times comments-or-not algorithm for IQ articles is something like this: No comments for any IQ article, and also no comments for the immediately preceding article (if an IQ article is in the queue, decomment the previous article also). This gives them a certain deniability. People wonder why there are no comments, but if they go to his previous piece, no comments there either, so it must just be the overall policy for those articles in that section of the paper.
The truth is that Hsu overhyped his L1 lasso method, because he preselected 20k or 50k SNPs based on correlation. That was kind of cheating: having picked 20k SNPs by correlation, he could have sorted them and run linear regressions, starting with the two most correlated, adding the next, running another regression, and so on. It would be computationally much faster than his non-linear L1 fit. The whole point of the L1 lasso is to find variables that are too weak to be picked out individually by correlation but that can show strength in a multivariate predictor function when acting together with others.
What are you talking about? In FAQ and also in the paper they seem to talk in terms of variance explained.
Last year I took data from the table and calculated how much variance a binary variable standing for race explains. If I remember correctly it was over 70%. Which means that environment explains about 30% of variance.
Read more carefully (e.g. look at the excerpts in my comment 6). Everything I see in the paper and the supplementary material is about incremental variance explained. Which means after controlling for the baseline model including sex, birth year, their interaction and 10 principal components of the genetic relatedness matrix.
The incremental variance explained is what you want. They first fit sex, birth years… multivariate function and obtain residuals and R^2. Then they add to this multivariate model the polygenic score as another variable and get new fit, residuals and new R^2. The second fit is better and R^2 is larger. The difference between the two R^2’s from two fits is what they report. This makes sense.
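The two-fit procedure just described can be written out as follows. This is a minimal sketch on simulated data: the covariates, effect sizes and the small least-squares helper `ols_r2` are all assumptions made for illustration, and the baseline here omits the principal components that the paper's baseline includes.

```python
import random

def ols_r2(columns, y):
    """R^2 of an ordinary least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
                 for i in range(n))
    return 1 - ss_res / ss_tot

random.seed(0)
n = 2000
sex = [random.choice((0.0, 1.0)) for _ in range(n)]
birth = [float(random.randint(-20, 20)) for _ in range(n)]   # birth year, centered
interaction = [s * b for s, b in zip(sex, birth)]
pgs = [random.gauss(0, 1) for _ in range(n)]                 # standardized score
# Made-up data-generating process for years of education.
years = [13 + 0.2 * s + 0.03 * b + 0.9 * g + random.gauss(0, 2.5)
         for s, b, g in zip(sex, birth, pgs)]

r2_base = ols_r2([sex, birth, interaction], years)           # first fit
r2_full = ols_r2([sex, birth, interaction, pgs], years)      # second fit, score added
incremental_r2 = r2_full - r2_base
print(round(r2_base, 3), round(r2_full, 3), round(incremental_r2, 3))
```

The printed baseline figure is the quantity the earlier comments say goes unreported; the difference between the two fits is the incremental variance explained.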
I am not sure I understand what “the first 10 principal components of the variance–covariance matrix of the genetic relatedness matrix” is really about. You can calculate the covariance between all SNP values for each pair of individuals, and this gives you a covariance matrix from which you get principal components that have a value for each individual. I am not sure what this corrects when you subtract the principal values from the phenotype. I do not see why having genetically related individuals is a problem that should somehow be corrected.
Anyway their results are:
So again you have not contributed much with your comment, except that you forced me to reread the paper so I could explain to you what they really did. Anyway, read and think while reading so you do not take too much of my time.
In my experience the New York Times does not allow comments hinting in any way at a link between IQ and race due to genetics. I believe that no matter how modest or indirect the reference, any such comment is censored. The only explanation allowed for IQ differences is white racism and related environmental causes – the absence of genetic causes is a fundamental principle that may not be challenged, even briefly and indirectly in the comments section.
They would think that the West had been infected with a non-lethal epidemic of encephalitis that had rendered the human population cognitively challenged and easily susceptible to hokum.
Good point. This is one reason the most predictive SNP for eye color in Europeans does not predict eye color for Blacks
OT: Another hate hoax exposed!
https://nypost.com/2018/07/23/waiter-made-up-story-about-racist-tipper-restaurant/
It would be interesting to do a study and see how many newspapers picked up the original story, vs how many printed the hoax reveal.
It is encouraging that in the original source publication, many many of the comments are “yeah no kidding these hoaxes happen all the time.”
https://www.oaoa.com/news/business/article_741af8b8-8eb5-11e8-b276-6fd15202251a.html
Sigh. We discussed controlling for population structure two weeks ago here and I linked to a UKBB document which gives nice plots of the population structure PCA along with explanation of the technique: https://www.unz.com/isteve/genetic-analysis-of-social-class-mobility-in-five-longitudinal-studies/#comment-2410937
Notice who replied saying the links were useful.
The “variance–covariance matrix of the genetic relatedness matrix” is not the same thing as the SNP matrix in the UKBB PCA, but I am pretty sure the overall effect is similar. I would be interested in hearing from anyone who has a better understanding of how these relate.
As I understand it, the “genetic relatedness” they controlled for is essentially race with the PCs providing progressively finer granularity. In other words, I believe they are looking at overall genetic relatedness rather than just family relationships. So if there is any consistent variation in the traits being looked at (here EA) between the genetic population groups in the study (and 10 PCs is fairly fine grained) then that variance will be corrected away even if it is due to genetics.
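A small simulation may help show what principal components of a relatedness matrix pick up. It is a sketch of the generic technique only: two simulated groups with invented allele-frequency differences, a relatedness matrix built from standardized genotypes, and the leading component found by power iteration. The paper's exact construction (the "variance–covariance matrix of the genetic relatedness matrix") is not reproduced, and every name and parameter here is an assumption.

```python
import random

def standardize_columns(genotypes):
    """Center and scale each SNP (column) across individuals."""
    n, m = len(genotypes), len(genotypes[0])
    z = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [row[j] for row in genotypes]
        mean = sum(col) / n
        sd = (sum((g - mean) ** 2 for g in col) / n) ** 0.5
        if sd > 0:                      # a SNP with no variation carries no information
            for i in range(n):
                z[i][j] = (col[i] - mean) / sd
    return z

def relatedness_matrix(z):
    """Individuals-by-individuals matrix: average product of standardized genotypes."""
    m = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zk)) / m for zk in z] for zi in z]

def first_pc(matrix, iterations=100):
    """Leading eigenvector by power iteration: one value per individual."""
    v = [random.gauss(0, 1) for _ in matrix]
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def draw_genotype(freq):
    """Number of copies (0, 1 or 2) of an allele with the given frequency."""
    return (random.random() < freq) + (random.random() < freq)

random.seed(2)
n_per_group, n_snps = 30, 300
# Each SNP has a base frequency p and a group difference d (both invented).
freqs = [(random.uniform(0.3, 0.7), random.uniform(-0.2, 0.2)) for _ in range(n_snps)]
genotypes = []
for sign in (+1, -1):                   # first group uses p + d, second uses p - d
    for _ in range(n_per_group):
        genotypes.append([draw_genotype(p + sign * d) for p, d in freqs])

pc1 = first_pc(relatedness_matrix(standardize_columns(genotypes)))
mean_a = sum(pc1[:n_per_group]) / n_per_group
mean_b = sum(pc1[n_per_group:]) / n_per_group
print(round(mean_a, 3), round(mean_b, 3))
```

No group labels are given to the algorithm, yet the first component lands on opposite sides of zero for the two simulated groups. Including such components as covariates therefore absorbs any average trait difference between the groups, whatever its cause.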
I think including the variance explained by the baseline model is critical for several reasons:
1. The baseline model may capture some of the genetic variance.
2. The predictive power of the model is the total variance explained.
3. I would argue the % of variance remaining after the baseline model is controlled for is an important metric.
I think a clear way to express the results would be to divide the model variables into three categories:
1. Non-genetic variables. Here sex, birth year, their interaction
2. Genetic variables used for correction. Here 10 principal components of the genetic relatedness matrix
3. Genetic variables used in incremental model.
Breaking the variables down like this and doing an ANOVA indicating how much variance is explained by each group of variables would be much more informative than only looking at the incremental variance explained by the third category.
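One way to write the proposed breakdown is as a sequential decomposition of the total variance explained, with the three variable groups entered in the order listed. The notation is introduced here for illustration only, and the split between groups depends on the order of entry whenever the groups are correlated.

```latex
R^2_{\text{total}}
  = \underbrace{R^2_{(1)}}_{\text{sex, birth year, interaction}}
  + \underbrace{\bigl(R^2_{(1,2)} - R^2_{(1)}\bigr)}_{\text{10 principal components}}
  + \underbrace{\bigl(R^2_{(1,2,3)} - R^2_{(1,2)}\bigr)}_{\text{polygenic score (incremental)}}
```

Here $R^2_{(1,2)}$ is the fit of the baseline model and the last term is the incremental figure; the first two terms are the ones the comment asks to see reported.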
Get over yourself, utu. I am sorry if I am being too terse. I keep making the mistake of thinking that you are smart enough to learn from previous discussions and make inferences beyond what I stated explicitly.
I really have a problem with “2. Genetic variables used for correction.” but after thinking about it I begin to think this problem in general is unsolvable. You can’t correct with X (in this case genes) what you want to detect with X (also genes). How do you know that the PC variable that you use is not also a good polygenic score that is responsible for intelligence? This problem makes me think of Baron Munchausen saving himself from drowning by pulling on his own hair.
The outstanding issue is still outstanding:
This article is more blah-blah-blah by the usual failures. Consider this statement:
We already have many studies indicating that “IQ”, “educational achievement”, ….. (complex, poorly defined traits) involve thousands of genes. This study confrims that. Hence we already know that even is there are only two variants per gene, the number of possible variants is two raised to a power > 1000. There are not enough people in the world to ever validate a prediction from the genetic makeup of an individual. No wonder they need over 200 pages of pdf to cover up.
"As sample sizes for GWAS continue to grow, it will likely be possible to construct a polygenic score for educational attainment whose predictive power comes closer to 20% of the variance in educational attainment across individuals (Rietveld et al. 2013)"
This seems much lower than the numbers that are often bandied about in the Jensenite crowd.
"First, it means that polygenic scores of individuals from different ancestry groups cannot be meaningfully compared. A recent paper (Martin et al. 2017) illustrated this point in the context of polygenic scores for predicting height; in the sample analyzed in that paper, polygenic scores for height for individuals of European ancestries are on average larger than those of South Asian ancestries which in turn are larger than those of African ancestries. In actuality, however, populations of African ancestries represented by the sample have similar height to populations of European ancestries, and both African and European populations tend to be taller than South Asian populations."
But if race is a "social construct", why can't you compare the polygenic scores of peoples from different ancestry groups? Lewontin wants to know.
Tulip, I’ve seen your question answered as follows: Both East Asian and European populations have many people with fair skin. But the gene(s) that cause fair skin are different in the different populations. Same with the genes for higher intelligence and educational attainment.
How, if at all, are the SNPs correlated with educational attainment different from those correlated with high IQ?
Identifying genes correlated with high IQ would obviously be an important finding in its own right. But if the genes are the same, then the study is merely confirming what was blindingly obvious to begin with — i.e., people with high IQ are more likely to finish high school, and complete college and post-graduate programs.
Tulip raised a good point:
As sample sizes for GWAS continue to grow, it will likely be possible to construct a polygenic score for educational attainment whose predictive power comes closer to 20% of the variance in educational attainment across individuals (Rietveld et al. 2013)
“This seems much lower than the numbers that are often bandied about in the Jensenite crowd.”
I wonder how much educational attainment could be due to effects from one’s peer group.
They need to norm for Wakandans. The formula is W(n). I know for a fact that has not yet been done.
Behind that big, beautiful Wall of Wakanda each Wakandan is blessed with so much IQ they need to carry it in a wheelbarrow.
Notice who replied saying the links were useful.
The "variance–covariance matrix of the genetic relatedness matrix" is not the same thing as the SNP matrix in the UKBB PCA, but I am pretty sure the overall effect is similar. I would be interested in hearing from anyone who has a better understanding of how these relate.
As I understand it, the "genetic relatedness" they controlled for is essentially race with the PCs providing progressively finer granularity. In other words, I believe they are looking at overall genetic relatedness rather than just family relationships. So if there is any consistent variation in the traits being looked at (here EA) between the genetic population groups in the study (and 10 PCs is fairly fine grained) then that variance will be corrected away even if it is due to genetics.
I think including the variance explained by the baseline model is critical for several reasons:
1. The baseline model may capture some of the genetic variance.
2. The predictive power of the model is the total variance explained.
3. I would argue the % of variance remaining after the baseline model is controlled for is an important metric.
I think a clear way to express the results would be to divide the model variables into three categories:
1. Non-genetic variables. Here sex, birth year, their interaction
2. Genetic variables used for correction. Here 10 principal components of the genetic relatedness matrix
3. Genetic variables used in incremental model.
Breaking the variables down like this and doing an ANOVA indicating how much variance is explained by each group of variables would be much more informative than only looking at the incremental variance explained by the third category.
Get over yourself, utu. I am sorry if I am being too terse. I keep making the mistake of thinking that you are smart enough to learn from previous discussions and make inferences beyond what I stated explicitly.
I agree on one thing. I also would like to see the R^2 of the baseline model as well as the incremental R^2 for different baseline models, especially for one without PC variables. I suspect that the incremental R^2 without PC variables in the baseline model would be larger. So if the sample was multiracial, racial differences in educational attainment would be removed by the PC variables, and thus the racial component to intelligence, if it indeed exists, would be strongly attenuated or even lost.
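The comparison being asked for here (baseline R^2, then incremental R^2 with and without PC covariates) can be sketched on simulated data. This is a toy illustration, not the paper's pipeline: the two clusters, the single `pc1` covariate, the `pgs` score, and every coefficient are hypothetical, and the between-cluster difference in the score is built in by construction. It shows only the mechanical point that a covariate correlated with the score absorbs part of the variance the score would otherwise be credited with.

```python
import numpy as np

def r_squared(y, *columns):
    """R^2 of an ordinary least squares fit of y on the columns plus an intercept."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical setup: two clusters, one PC that tracks cluster membership,
# and a score whose mean differs between clusters (an assumption of this toy).
group = rng.integers(0, 2, n).astype(float)
pc1 = group + 0.1 * rng.normal(size=n)
pgs = 1.0 * group + rng.normal(size=n)

# Non-genetic covariates: sex, birth year, and their interaction.
sex = rng.integers(0, 2, n).astype(float)
birth_year = rng.normal(size=n)
non_genetic = (sex, birth_year, sex * birth_year)

# Simulated outcome; all effect sizes are made up for illustration.
y = (0.3 * pgs + 0.2 * sex + 0.1 * birth_year
     + 0.05 * sex * birth_year + rng.normal(size=n))

r2_base_no_pc = r_squared(y, *non_genetic)
r2_base_pc = r_squared(y, *non_genetic, pc1)
r2_full_no_pc = r_squared(y, *non_genetic, pgs)
r2_full_pc = r_squared(y, *non_genetic, pc1, pgs)

incr_no_pc = r2_full_no_pc - r2_base_no_pc   # score added to non-genetic baseline
incr_pc = r2_full_pc - r2_base_pc            # score added to baseline that includes the PC

print(f"baseline R^2 without PC: {r2_base_no_pc:.4f}")
print(f"baseline R^2 with PC:    {r2_base_pc:.4f}")
print(f"incremental R^2 without PC in baseline: {incr_no_pc:.4f}")
print(f"incremental R^2 with PC in baseline:    {incr_pc:.4f}")
```

Reporting all four R^2 values, rather than only the last incremental figure, is the three-category breakdown proposed earlier in the thread. In real data the PC covariates also absorb non-genetic differences between groups, which this toy does not model.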
I really have a problem with “2. Genetic variables used for correction.” but after thinking about it I begin to think this problem in general is unsolvable. You can’t correct with X (in this case genes) what you want to detect with X (also genes). How do you know that the PC variable that you use is not also a good polygenic score that is responsible for intelligence? This problem makes me think of Baron Munchausen saving himself from drowning by pulling on his own hair.
The outstanding issue is still outstanding:
Identifying genes correlated with high IQ would obviously be an important finding in its own right. But if the genes are the same, then the study is merely confirming what was blindingly obvious to begin with -- i.e., people with high IQ are more likely to finish high school, and complete college and post-graduate programs.
Conscientiousness probably contributes to EA but not IQ. Perhaps conformity as well. Finding SNPs for those traits is left as an exercise for the reader ; )
Exactly.
Gregory Cochran wrote about this in several blogposts, where he explains it better than I can:
https://westhunt.wordpress.com/2017/02/09/everything-is-different-but-the-same/
https://westhunt.wordpress.com/2018/06/03/wysiwyg/
Your brother might be taller if mom bought hormone-grown chicken or similar foods during bro’s youth. That would make an interesting study if not already done. Growth hormones join endocrine disruptor impacts in the food chain.
Behind that big, beautiful Wall of Wakanda each Wakandan is blessed with so much IQ they need to carry it in a wheelbarrow.
Now here’s an idiot that doesn’t belong in a discussion of intelligence.
Another hotshot IQist who doesn’t understand the difference between wealth and income. It’s very possible that a low-earning white person is actually wealthy, while a high-earning black person has little tangible wealth and is living paycheck to paycheck. Whites had a long time to build up generational wealth when the playing field was markedly uneven. Blacks didn’t have the same opportunity. In fact their homes and businesses were burned down by vicious white racists if they showed any signs of success. Low earning whites are still raised in the culture of IQ and SAT tests. High earning blacks are not necessarily raised in the same culture, as they are a distinct sub-culture.
Sorry for the intrusion of cold, hard reality. You may get back to your ridiculous pseudoscience.
Man, debunking this stuff is too easy:
Immigrants from Africa Boast Higher Education Levels Than Overall U.S. Population
African immigrants boast higher levels of education than the overall U.S. population, with a particular focus on Science, Technology, Engineering, and Math. 40 percent of African immigrants have at least a bachelor’s degree—making them 30 percent more likely to achieve that level of education than the U.S. population overall. Of this group, about one in three, or 33.4 percent, have STEM degrees, training heavily in demand by today’s employers.
https://www.newamericaneconomy.org/press-release/immigrants-from-africa-boast-higher-education-levels-than-overall-u-s-population/
You guys need to understand that with us you’re not dealing with descendants of slaves who’ve been brutalized for 400 years. We’ll go toe-to-toe with anyone. Try your pseudoscience on us. Haha.