The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Genetics of Educational Attainment in Sample of Over a Million People

Here’s a vast gene study, the first with a million-plus sample size on the topic of cognition, focusing on educational attainment (years of education). From Nature Genetics:

Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals

Full text for free here.

James J. Lee, Robbee Wedow, […]David Cesarini
Nature Genetics (2018)

Published: 23 July 2018

Abstract

Here we conducted a large-scale genetic association analysis of educational attainment in a sample of approximately 1.1 million individuals and identify 1,271 independent genome-wide-significant SNPs. For the SNPs taken together, we found evidence of heterogeneous effects across environments. The SNPs implicate genes involved in brain-development processes and neuron-to-neuron communication. In a separate analysis of the X chromosome, we identify 10 independent genome-wide-significant SNPs and estimate a SNP heritability of around 0.3% in both men and women, consistent with partial dosage compensation. A joint (multi-phenotype) analysis of educational attainment and three related cognitive phenotypes generates polygenic scores that explain 11–13% of the variance in educational attainment and 7–10% of the variance in cognitive performance. This prediction accuracy substantially increases the utility of polygenic scores as tools in research.

Here’s the authors’ FAQ.

Ed Yong writes in The Atlantic:

Perhaps counterintuitively, Benjamin thinks that his team’s research “is really important for research on improving educational systems.” To understand how, forget genes for a moment, and think about wealth.

It’s uncontroversial to say that people who are born into rich families are more likely to fare better in school than those from poorer backgrounds. Of course, poor kids can still soar in school, and rich ones can flunk out, but few would deny that money is a powerful influence on people’s futures. Now, consider that household income explains just 7 percent of the variation in educational attainment, which is less than what genes can now account for. “Most social scientists wouldn’t do a study without accounting for socioeconomic status, even if that’s not what they’re interested in,” says Harden. The same ought to be true of our genes.

Imagine that authorities are planning to provide free preschool to kids from disadvantaged backgrounds. To see if such a policy actually helps children stay in school for longer, scientists would randomly assign the free classes to some kids but not others. Then, they would look at how the two groups fared. In doing so, they’d always try to account for factors like wealth that might also vary between the two groups. Similarly, “you can now wash away the genetic effects so you don’t have to worry about them,” says Benjamin. And in doing so, researchers could more precisely work out whether a policy change has any benefits—and they could do it through smaller, cheaper studies.

This, he argues, is the most powerful reason to study the genetics of education or cognitive ability—and ironically, it has very little to do with genes. Instead, it’s a way of making social science more powerful.

Here’s Carl Zimmer’s article in the NYT:

… The researchers scanned the DNA surrounding these influential variants and found an intriguing pattern.

“They’re not just randomly scattered around the genome,” said James J. Lee, a behavioral geneticist at the University of Minnesota and co-author of the new study.

The variants are linked to genes active in the brain, helping neurons to form connections.

In other words, of the genes where we know what they do, they tend to be involved in growing the brain. Of course, that’s what, in hindsight, you’d expect, so that demonstrates the prima facie validity of the findings.

A key to educational attainment may not be how quickly information is acquired, but how quickly it can be shared between various regions.

“Maybe it’s not about how fast a signal can zip along a cable,” Dr. Lee said. “It’s about the complexity of the connections between point A and B.”

But the genetic links suggest another, perhaps stranger possibility: Some variants linked to education work not in the brains of students, but in the people they inherited the variants from — their parents.

By somehow shaping the behavior of parents, these variants may alter the environments in which children grow up in a way that helps or impinges on time spent in school.

Based on their findings, Dr. Benjamin and his colleagues figured out how to calculate a genetic “score” for educational success. The more variants linked to staying in school longer, the higher an individual’s score.

The researchers calculated a score for a group of 4,775 Americans, ranking them into five groups. The researchers found that 12 percent of people in the lowest fifth finished college. Among people in the top fifth, 57 percent finished college.
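The score-and-rank procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the per-variant weights and the genotypes are randomly generated, the helper names `polygenic_score` and `assign_quintiles` are hypothetical, and the study's actual score is built from GWAS estimates with more elaborate weighting.

```python
import random

random.seed(0)
N_VARIANTS = 200   # illustrative only; the study's score uses far more variants
N_PEOPLE = 1000

# Hypothetical per-variant weights. A real score takes these from GWAS estimates.
weights = [random.gauss(0.0, 0.02) for _ in range(N_VARIANTS)]

def polygenic_score(dosages, weights):
    """Weighted sum of allele counts (0, 1 or 2 copies of each variant)."""
    return sum(w * d for w, d in zip(weights, dosages))

def assign_quintiles(scores):
    """Label each score 1..5 by rank, where 1 is the lowest fifth."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(scores) + 1
    return labels

# Simulated genotypes: one list of allele counts per person.
people = [[random.choice((0, 1, 2)) for _ in range(N_VARIANTS)]
          for _ in range(N_PEOPLE)]
scores = [polygenic_score(p, weights) for p in people]
quintiles = assign_quintiles(scores)
```

With real data, one would then compare an outcome such as the college completion rate across the five groups, which is the comparison the article reports.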

So that’s pretty decent predictive power, at least as strong as many standard sociological factors such as family income. What we can do now is go back to family income, subtract the genetic influence, and see what’s left over.

A similar result emerged when the scientists looked at how many people in each group had to repeat a grade in school. In the lowest fifth, 29 percent did, while in the top fifth, only 8 percent did.

But when Dr. Benjamin and his colleagues calculated scores for African-Americans, it failed to predict how well different groups had done in school. One likely reason is that genetic markers aren’t reliable guides to how genes influence traits in different populations.

In other words, African Americans are genetically different enough from white Americans that a model developed on whites doesn’t work on blacks? But as Carl Zimmer explained in his vast recent book on genetics, race doesn’t really exist genetically for reasons. How to reconcile those two ideas? Perhaps Carl has some thoughts he’ll share with us …

Dr. Benjamin and his colleagues hope to grow their study to 2 million people or more, and expect to find thousands more genes linked to education.

 
  1. Ten, nine, eight, seven, six, five, four… countdown before James J. Lee and Robbee Wedow are banished and shamed into silence. Sort of careless of them to notice things, say like James Watson.

    • Replies: @Jim Don Bob
    Yep. I hope they all got tenure, or another job lined up.
  2. Hmmm… I wonder if it will become optional to submit a genetic sequence when applying to college?

    • Replies: @Reg Cæsar

    Hmmm… I wonder if it will become optional to submit a genetic sequence when applying to college?

     

    It sure beats compulsory nude photos.

    https://yalealumnimagazine.com/uploads/images/6100061/1434486047/posture-photos_500x216_0_0_460.jpg

    Though stripping is still optional at the Ivies.

    https://i0.wp.com/www.nationalreview.com/wp-content/uploads/2018/05/letitia-chai-cornell-senior-thesis.jpg?fit=789%2C460&ssl=1

    , @Rosamond Vincy
    That might solve the Rachel Dolezal problem.
  3. Two interesting points in the FAQ:

    “As sample sizes for GWAS continue to grow, it will likely be possible to construct a polygenic score for educational attainment whose predictive power comes closer to 20% of the variance in educational attainment across individuals (Rietveld et al. 2013)”

    This seems much lower than the numbers that are often bandied about in the Jensenite crowd.

    “First, it means that polygenic scores of individuals from different ancestry groups cannot be meaningfully compared. A recent paper (Martin et al. 2017) illustrated this point in the context of polygenic scores for predicting height; in the sample analyzed in that paper, polygenic scores for height for individuals of European ancestries are on average larger than those of South Asian ancestries which in turn are larger than those of African ancestries. In actuality, however, populations of African ancestries represented by the sample have similar height to populations of European ancestries, and both African and European populations tend to be taller than South Asian populations.”

    But if race is a “social construct”, why can’t you compare the polygenic scores of peoples from different ancestry groups? Lewontin wants to know.

    • Replies: @Steve Sailer
    This basically says that the races are so different genetically that it's hard to compare them.

    The basic question is like this: if your brother is taller than you, it's probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.

    Charles Murray said a long time ago that he figured racial differences were genetically like family differences, just more so. But this recent height study suggests that racial differences may be more profound.

    It's an empirical question that needs more research, and when more data comes in we'll still wind up arguing over whether glasses are part full or part empty.

    , @res

    But if race is a “social construct”, why can’t you compare the polygenic scores of peoples from different ancestry groups? Lewontin wants to know.
     
    Part of the explanation is that the SNPs in the PGS may just be linked to the actual causal SNPs (rather than causal themselves). Because patterns of linkage disequilibrium (LD) differ between races the same relationships may not hold in differing races.

    I realize this assumes race is more than a social construct. The study correcting for the first 10 PCs of the "genetic relatedness matrix" (I wonder how that relates to race ; ) says all that I think needs to be said about the "race is solely a social construct" article of faith.
    , @RW
    Tulip, I've seen your question answered as follows: Both East Asian and European populations have many people with fair skin. But the gene(s) that cause fair skin are different in the different populations. Same with the genes for higher intelligence and educational attainment.
  4. If they filter out the effect of genes and socioeconomic status, is there going to be anything much left for biologists and sociologists to study?

    • Replies: @res
    I think many consider that a feature, not a bug.
    , @The Anti-Gnostic
    Interesting question. Do scientific disciplines ever expire because they've plumbed their depths?

    Anthropology has probably passed its sell-by date. Outside a couple of tribes in the Amazon and the Sentinel Islanders, there are no human groups untouched by modernity (and the mere presence of the observer) who are left to study. So you're either a paleontologist or you're a sociologist.

    There's still plenty of uncharted territory in genetics.
  5. @Tulip
    Two interesting points in the FAQ:


    "As sample sizes for GWAS continue to grow, it will likely be possible to construct a polygenic score for educational attainment whose predictive power comes closer to 20% of the variance in educational attainment across individuals (Rietveld et al. 2013)"

    This seems much lower than the numbers that are often bandied about in the Jensenite crowd.

    "First, it means that polygenic scores of individuals from different ancestry groups cannot be meaningfully compared. A recent paper (Martin et al. 2017) illustrated this point in the context of polygenic scores for predicting height; in the sample analyzed in that paper, polygenic scores for height for individuals of European ancestries are on average larger than those of South Asian ancestries which in turn are larger than those of African ancestries. In actuality, however, populations of African ancestries represented by the sample have similar height to populations of European ancestries, and both African and European populations tend to be taller than South Asian populations."

    But if race is a "social construct", why can't you compare the polygenic scores of peoples from different ancestry groups? Lewontin wants to know.

    This basically says that the races are so different genetically that it’s hard to compare them.

    The basic question is like this: if your brother is taller than you, it’s probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.

    Charles Murray said a long time ago that he figured racial differences were genetically like family differences, just more so. But this recent height study suggests that racial differences may be more profound.

    It’s an empirical question that needs more research, and when more data comes in we’ll still wind up arguing over whether glasses are part full or part empty.

    • Replies: @ThirdWorldSteveReader

    The basic question is like this: if your brother is taller than you, it’s probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.
     
    That's not it; different races still have the same gene loci (these things tend to be very conserved), but they have different versions (alleles) that are absent in one of them, making it difficult to compare them unless both races are well represented in the GWAS. I mean: in a set of white people, you can catch the alleles present among whites, and associate them with a desired phenotype, and then use their presence in a given white individual to predict the phenotype. But if you try to use the GWAS results obtained from the white dataset to predict the phenotype in a black person, the prediction will not be very reliable, mostly because blacks may have different relevant alleles not present in the white dataset, whose effect we still don't know.

    Gregory Cochran wrote about this in several blogposts, where he explains it better than I can:

    https://westhunt.wordpress.com/2017/02/09/everything-is-different-but-the-same/
    https://westhunt.wordpress.com/2018/06/03/wysiwyg/
    , @Reg Cæsar

    ...we’ll still wind up arguing over whether glasses are part full or part empty.
     
    And who'll pay the tab.

    Their glasses have been refilled and reëmptied several times.

    Which reminds me, Atlantic vs New Yorker:

    https://www.theatlantic.com/entertainment/archive/2012/04/we-resist-further-cooperation-cooperation/328832/
    , @Bernardo Pizzaro Cortez Del Castro
    Good point. This is one reason the most predictive SNP for eye color in Europeans does not predict eye color for Blacks
  6. res says:

    Worth emphasizing that this is how they measure predictive value:

    We measure prediction accuracy by the ‘incremental R2’ statistic: the gain in the coefficient of determination (R2) when the score is added as a covariate to a regression of the phenotype on a set of baseline controls (sex, birth year, their interaction and 10 principal components of the genetic relatedness matrix).

    As far as I can tell (having taken a quick look through the supplementary material PDF and spreadsheet) they scrupulously avoid mentioning the R^2 for their baseline model. I wonder what it is and how much variance is explained by each of the variables involved? Seems like a fairly fundamental question, and an odd omission given the supplementary PDF is 207 pages long.
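To make the quoted definition concrete, here is a minimal sketch of an incremental R2 calculation on simulated data. Everything in it is an assumption for illustration: the effect sizes are invented, the baseline leaves out the 10 principal components, and `solve` and `r_squared` are hypothetical helpers written in plain Python rather than anything the authors used.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of an OLS regression of y on the given columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    xty = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Simulated data with invented effect sizes.
random.seed(1)
n = 2000
sex = [random.choice((0.0, 1.0)) for _ in range(n)]
birth_year = [random.uniform(-1.0, 1.0) for _ in range(n)]  # centered and scaled
score = [random.gauss(0.0, 1.0) for _ in range(n)]          # stand-in polygenic score
edu = [0.3 * s + 0.5 * b + 0.35 * g + random.gauss(0.0, 1.0)
       for s, b, g in zip(sex, birth_year, score)]

# Baseline controls: sex, birth year and their interaction.
baseline = [sex, birth_year, [s * b for s, b in zip(sex, birth_year)]]
r2_base = r_squared(baseline, edu)
r2_full = r_squared(baseline + [score], edu)
incremental_r2 = r2_full - r2_base
```

The sketch computes `r2_base` explicitly on the way to `incremental_r2`, and that baseline figure is the one the comment above says goes unreported.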

    The predictive power is much reduced for African-Americans, but still nonzero. From page 132 of the supplementary material:

    In the prediction analysis, we included the same set of age and sex controls as the European ancestry analysis, and the first 10 principal components of the variance-covariance matrix of the genetic data of the African-ancestry sample. We found that the LDpred score predicts 1.6% (95% CI: 0.7% to 3.0%) of the variance in EduYears among African ancestry individuals. This represents 85% attenuation in the predictive power of the score compared to the incremental R2 of 10.6% in our European-ancestry sample from HRS.
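The 85% attenuation figure follows directly from the two incremental R2 values in the quote. A quick check, using only those two numbers (the variable names are just for illustration):

```python
# Both values are taken from the quoted supplementary material.
r2_european = 0.106  # incremental R^2, European-ancestry HRS sample
r2_african = 0.016   # incremental R^2, African-ancestry sample

# Attenuation is the share of predictive power lost relative to the European sample.
attenuation = 1.0 - r2_african / r2_european
print(f"attenuation = {attenuation:.1%}")  # prints "attenuation = 84.9%"
```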

    • Replies: @Yak-15
    Incremental “adjusted R-Sqd?”

    Otherwise you are just adding noise.
    , @hyperbola
    This article is more blah-blah-blah by the usual failures. Consider this statement:

    Dr. Benjamin and his colleagues hope to grow their study to 2 million people or more, and expect to find thousands more genes linked to education.
     
    We already have many studies indicating that "IQ", "educational achievement", ... (complex, poorly defined traits) involve thousands of genes. This study confirms that. Hence we already know that even if there are only two variants per gene, the number of possible variant combinations is two raised to a power > 1000. There are not enough people in the world to ever validate a prediction from the genetic makeup of an individual. No wonder they need over 200 pages of PDF to cover up.
  7. res says:
    @Tulip
    Two interesting points in the FAQ:


    "As sample sizes for GWAS continue to grow, it will likely be possible to construct a polygenic score for educational attainment whose predictive power comes closer to 20% of the variance in educational attainment across individuals (Rietveld et al. 2013)"

    This seems much lower than the numbers that are often bandied about in the Jensenite crowd.

    "First, it means that polygenic scores of individuals from different ancestry groups cannot be meaningfully compared. A recent paper (Martin et al. 2017) illustrated this point in the context of polygenic scores for predicting height; in the sample analyzed in that paper, polygenic scores for height for individuals of European ancestries are on average larger than those of South Asian ancestries which in turn are larger than those of African ancestries. In actuality, however, populations of African ancestries represented by the sample have similar height to populations of European ancestries, and both African and European populations tend to be taller than South Asian populations."

    But if race is a "social construct", why can't you compare the polygenic scores of peoples from different ancestry groups? Lewontin wants to know.

    But if race is a “social construct”, why can’t you compare the polygenic scores of peoples from different ancestry groups? Lewontin wants to know.

    Part of the explanation is that the SNPs in the PGS may just be linked to the actual causal SNPs (rather than causal themselves). Because patterns of linkage disequilibrium (LD) differ between races the same relationships may not hold in differing races.

    I realize this assumes race is more than a social construct. The study correcting for the first 10 PCs of the “genetic relatedness matrix” (I wonder how that relates to race ; ) says all that I think needs to be said about the “race is solely a social construct” article of faith.

  8. @Almost Missouri
    If they filter out the effect of genes and socioeconomic status, is there going to be anything much left for biologists and sociologists to study?

    I think many consider that a feature, not a bug.

  9. @bomag
    Hmmm... I wonder if it will become optional to submit a genetic sequence when applying to college?

    Hmmm… I wonder if it will become optional to submit a genetic sequence when applying to college?

    It sure beats compulsory nude photos.


    Though stripping is still optional at the Ivies.

    https://i0.wp.com/www.nationalreview.com/wp-content/uploads/2018/05/letitia-chai-cornell-senior-thesis.jpg?fit=789%2C460&ssl=1

  10. eah says:

    It’s uncontroversial to say that people who are born into rich families are more likely to fare better in school than those from poorer backgrounds.

    I never get tired of posting this from La Griffe:

    Black children from the wealthiest families have mean SAT scores lower than white children from families below the poverty line.

    Black children of parents with graduate degrees have lower SAT scores than white children of parents with a high-school diploma or less.

    A lot of affirmative action today probably amounts to awarding places to the kids of black doctors at the expense of the children of white plumbers.

    • Agree: Jim Don Bob, Buzz Mohawk
    • Replies: @ATate
    No need to quote La Griffe, most people on the left will dismiss that out of hand.

    Instead show them this graph;

    http://www.jbhe.com/latest/news/1-22-09/satracialgapfigure.gif

    I had one guy accuse me of getting it from Stormfront. When you tell them it’s from The Journal of Blacks in Higher Education, they get...quiet.

  11. @Almost Missouri
    If they filter out the effect of genes and socioeconomic status, is there going to be anything much left for biologists and sociologists to study?

    Interesting question. Do scientific disciplines ever expire because they’ve plumbed their depths?

    Anthropology has probably passed its sell-by date. Outside a couple of tribes in the Amazon and the Sentinel Islanders, there are no human groups untouched by modernity (and the mere presence of the observer) who are left to study. So you’re either a paleontologist or you’re a sociologist.

    There’s still plenty of uncharted territory in genetics.

  12. I wonder what the old writers/editors/owners of The Atlantic about 50 years ago would think, if you took a few recent issues into a time machine with you and showed them Coates.

    • Replies: @lavoisier

    I wonder what the old writers/editors/owners of The Atlantic about 50 years ago would think, if you took a few recent issues into a time machine with you and showed them Coates.
     
    They would think that the West had been infected with a non-lethal epidemic of encephalitis that had rendered the human population cognitively challenged and easily susceptible to hokum.
  13. Ed Yong writes in The Atlantic:

    From his perspective, Yong is working hard to squeeze lemonade from lemons.

  14. @res
    Worth emphasizing that this is how they measure predictive value:

    We measure prediction accuracy by the ‘incremental R2’ statistic: the gain in the coefficient of determination (R2) when the score is added as a covariate to a regression of the phenotype on a set of baseline controls (sex, birth year, their interaction and 10 principal components of the genetic relatedness matrix).
     
    As far as I can tell (having taken a quick look through the supplementary material PDF and spreadsheet) they scrupulously avoid mentioning the R^2 for their baseline model. I wonder what it is and how much variance is explained by each of the variables involved? Seems like a fairly fundamental question, and an odd omission given the supplementary PDF is 207 pages long.

    The predictive power is much reduced for African-Americans, but still nonzero. From page 132 of the supplementary material:

    In the prediction analysis, we included the same set of age and sex controls as the European ancestry analysis, and the first 10 principal components of the variance-covariance matrix of the genetic data of the African-ancestry sample. We found that the LDpred score predicts 1.6% (95% CI: 0.7% to 3.0%) of the variance in EduYears among African ancestry individuals. This represents 85% attenuation in the predictive power of the score compared to the incremental R2 of 10.6% in our European-ancestry sample from HRS.
     

    Incremental “adjusted R-Sqd?”

    Otherwise you are just adding noise.

  15. @Buffalo Joe
    Ten, nine, eight, seven, six, five, four… countdown before James J. Lee and Robbee Wedow are banished and shamed into silence. Sort of careless of them to notice things, say like James Watson.

    Yep. I hope they all got tenure, or another job lined up.

  16. @bomag
    Hmmm... I wonder if it will become optional to submit a genetic sequence when applying to college?

    That might solve the Rachel Dolezal problem.

    • Agree: Travis
  17. To explain 11% of the variance they used a polygenic score of about 1 million SNPs. The majority of these SNPs did not reach significance in the GWAS.

    The polygenic score we constructed “predicts” (see FAQ 1.4) around 11% of the variation in education across individuals (when tested in independent data that was not included in the GWAS). This ~1 million SNP polygenic score predicts much more of the variation than does the genetic predictor described in FAQ 2.2, which was based on only 1,271 SNPs. Including all ~1 million SNPs tends to add predictive power because the threshold for significance/inclusion that is used to identify the 1,271 SNPs is very conservative (i.e., many of the other ~1 million SNPs are also associated with educational attainment but are not identified by our study, and on net, it turns out empirically that more signal than noise is added by including them). This study’s polygenic score has much more predictive power than polygenic scores constructed from our earlier two GWAS of educational attainment, because both of those studies had much smaller sample sizes (~100,000 and ~300,000 individuals, respectively, compared with ~1.1 million individuals of the current study).

    The question is how large the sample of “independent data that was not included in the GWAS” was, because using 1 million variables with 1.1 million subjects is very close to a guaranteed over-fit.

    I suspect that they used a brute-force fit similar to the one used by Steven Hsu, who could explain 9% of educational attainment in a 500k sample using only 10,000 SNPs. See here:

    https://www.biorxiv.org/content/biorxiv/early/2017/09/18/190124.full.pdf

    But Steven Hsu used pre-filtering of SNPs via individual correlations, so not all SNPs were used in his Lasso method.

    Do we really need 990,000 extra SNPs to explain an additional 2% of the variance?

    Is it possible that their method was indeed brute force: that they used all the SNPs they had, got the best fit on the 1.1 million sample, and then calculated the variance explained on a validation sample that was smaller than 1.1 million? Is it possible that most of those 1 million SNPs are just spurious?

    • Replies: @res

    I suspect that they used a brute-force fit similar to the one used by Steven Hsu, who could explain 9% of educational attainment in a 500k sample using only 10,000 SNPs.
     
    If you look at Steve's blog you will see that the current study used summary statistics which do not permit the L1-optimization technique he used.

    Regarding the amount of variance explained, since they don't tell us the R^2 of the baseline model we can't conclude much about the total variance explained.
  18. @Steve Sailer
    This basically says that the races are so different genetically that it's hard to compare them.

    The basic question is like this: if your brother is taller than you, it's probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.

    Charles Murray said a long time ago that he figured racial differences were genetically like family differences, just more so. But this recent height study suggests that racial differences may be more profound.

    It's an empirical question that needs more research, and when more data comes in we'll still wind up arguing over whether glasses are part full or part empty.

    The basic question is like this: if your brother is taller than you, it’s probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.

    That’s not it; different races still have the same gene loci (these things tend to be very conserved), but they have different versions (alleles) that are absent in one of them, making it difficult to compare them unless both races are well represented in the GWAS. I mean: in a set of white people, you can catch the alleles present among whites, and associate them with a desired phenotype, and then use their presence in a given white individual to predict the phenotype. But if you try to use the GWAS results obtained from the white dataset to predict the phenotype in a black person, the prediction will not be very reliable, mostly because blacks may have different relevant alleles not present in the white dataset, whose effect we still don’t know.

    Gregory Cochran wrote about this in several blogposts, where he explains it better than I can:

    https://westhunt.wordpress.com/2017/02/09/everything-is-different-but-the-same/
    https://westhunt.wordpress.com/2018/06/03/wysiwyg/

    • Replies: @U-Bahn
    Your brother might be taller if mom bought hormone-grown chicken or similar foods during bro's youth. That would make an interesting study if not already done. Growth hormones join endocrine disruptor impacts in the food chain.
  19. @Steve Sailer
    This basically says that the races are so different genetically that it's hard to compare them.

    The basic question is like this: if your brother is taller than you, it's probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.

    Charles Murray said a long time ago that he figured racial differences were genetically like family differences, just more so. But this recent height study suggests that racial differences may be more profound.

    It's an empirical question that needs more research, and when more data comes in we'll still wind up arguing over whether glasses are part full or part empty.

    …we’ll still wind up arguing over whether glasses are part full or part empty.

    And who’ll pay the tab.

    Their glasses have been refilled and reëmptied several times.

    Which reminds me, Atlantic vs New Yorker:

    https://www.theatlantic.com/entertainment/archive/2012/04/we-resist-further-cooperation-cooperation/328832/

  20. In other words, of the genes where we know what they do, they tend to be involved in growing the brain.

    And hasn’t it also been shown that brain size does indeed correlate with intelligence, and that different races have different average brain sizes, contrary to whoever fudged the data a century ago?

    (I don’t know enough about this, so I don’t know who or what to look up, but I remember reading something to that effect, probably here.)

  21. In the paper they state:

    For research at the intersection of genetics and neuroscience, the set of 1,271 lead SNPs that we identify is a treasure trove for future analyses. For research in social science and epidemiology, the polygenic scores that we construct—which explain 11–13% and 7–10% of the variance in educational attainment and cognitive performance, respectively—will prove useful across at least three types of applications.

    But in FAQ:

    Taken together, these 1,271 SNPs accounted for just 3.9% of the variation across individuals in years of education completed.

    By contrast, a prediction based on a polygenic score that combines ~1 million SNPs that we studied (see FAQs 1.5 & 2.3) has more predictive power: r = 0.33, corresponding to 11% of the variation across individuals.
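    The conversion between the quoted correlation and the quoted share of variance is just squaring. A minimal sketch of that arithmetic follows; the function names are illustrative and do not come from the paper or the FAQ.

```python
import math

def variance_explained(r: float) -> float:
    """Share of variance explained by a predictor whose correlation with the outcome is r."""
    return r * r

def implied_correlation(r2: float) -> float:
    """Correlation implied by a given share of variance explained."""
    return math.sqrt(r2)

# r = 0.33 for the ~1 million SNP score, as quoted from the FAQ
print(f"{variance_explained(0.33):.3f}")    # 0.109, i.e. about 11%
# 3.9% for the 1,271 lead SNPs, as quoted from the FAQ
print(f"{implied_correlation(0.039):.3f}")  # 0.197
```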

    The abstract is misleading and the conclusions in the paper are misleading. The 1,271 SNPs, or “lead SNPs” as they call them, explain only 3.9%. They do not say they had to toss in 998,729 extra non-lead SNPs to push the 3.9% up to 11%. And apparently these 998,729 do not belong to the “treasure trove.”

  22. Anonymous[320] • Disclaimer says:

    Now its time for N. Korea to demonstrate to the world that it has turned over a new leaf and take in boatloads of African muslim refugees. I think that might work well for liberals as well as it might be the only place in the world where there are not countless reports of them misbehaving. And probably every year or two, they’ll need another boatload or two to replace all those that mysteriously disappeared, so it’ll help with Sailer’s graph too.

  23. @eah
    It’s uncontroversial to say that people who are born into rich families are more likely to fare better in school than those from poorer backgrounds.

    I never get tired of posting this from La Griffe:

    Black children from the wealthiest families have mean SAT scores lower than white children from families below the poverty line.

    Black children of parents with graduate degrees have lower SAT scores than white children of parents with a high-school diploma or less.

    A lot of affirmative action today probably amounts to awarding places to the kids of black doctors at the expense of the children of white plumbers.

    No need to quote La Griffe, most people on the left will dismiss that out of hand.

    Instead show them this graph;

    I had one guy accuse me of getting it from Stormfront. When you tell them it’s from The Journal of Blacks in Higher Education, they get…quiet.

    • Agree: eah
    • Replies: @utu
    Last year I took data from the table and calculated how much variance a binary variable standing for race explains. If I remember correctly it was over 70%. Which means that environment explains about 30% of variance.
    , @Okechukwu
    Another hotshot IQist who doesn't understand the difference between wealth and income. It's very possible that a low-earning white person is actually wealthy, while a high-earning black person has little tangible wealth and is living paycheck to paycheck. Whites had a long time to build up generational wealth when the playing field was markedly uneven. Blacks didn't have the same opportunity. In fact their homes and businesses were burned down by vicious white racists if they showed any signs of success. Low earning whites are still raised in the culture of IQ and SAT tests. High earning blacks are not necessarily raised in the same culture, as they are a distinct sub-culture.

    Sorry for the intrusion of cold, hard reality. You may get back to your ridiculous pseudoscience.

  24. res says:
    @utu
    To explain 11% of variance they used a polygenic score of 1 million SNPs. The majority of these SNPs did not show up in the GWAS.

    The polygenic score we constructed “predicts” (see FAQ 1.4) around 11% of the variation in education across individuals (when tested in independent data that was not included in the GWAS). This ~1 million SNP polygenic score predicts much more of the variation than does the genetic predictor described in FAQ 2.2, which was based on only 1,271 SNPs. Including all ~1 million SNPs tends to add predictive power because the threshold for significance/inclusion that is used to identify the 1,271 SNPs is very conservative (i.e., many of the other ~1 million SNPs are also associated with educational attainment but are not identified by our study, and on net, it turns out empirically that more signal than noise is added by including them). This study’s polygenic score has much more predictive power than polygenic scores constructed from our earlier two GWAS of educational attainment, because both of those studies had much smaller sample sizes (~100,000 and ~300,000 individuals, respectively, compared with ~1.1 million individuals of the current study).
     
    The question is how large the sample was: "independent data that was not included in the GWAS" because using 1 million variables and 1.2 million subjects is very close to a guaranteed over-fit.

    I suspect that they used a brute force fit similar to that used by Steven Hsu, who could explain 9% of educational attainment in a 500k sample using only 10,000 SNPs. See here:

    https://www.biorxiv.org/content/biorxiv/early/2017/09/18/190124.full.pdf

    But Steven Hsu used pre-filtering of SNPs via individual correlations, so not all SNPs were used in his Lasso method.

    Do we really need 990,000 extra SNPs to explain additional 2% of variance?

    Is it possible that theirs was indeed a brute force method: they used all the SNPs they had, got the best fit on the 1.2 million sample, and then calculated variance on a validation sample that was smaller than 1.2 million? Is it possible that most of those 1 million SNPs are just spurious?

    I suspect that they used a brute force fit similar to that used by Steven Hsu, who could explain 9% of educational attainment in a 500k sample using only 10,000 SNPs.

    If you look at Steve’s blog you will see that the current study used summary statistics which do not permit the L1-optimization technique he used.

    Regarding the amount of variance explained, since they don’t tell us the R^2 of the baseline model we can’t conclude much about the total variance explained.

    • Replies: @utu

  25. Anon[196] • Disclaimer says:

    No comments on Zimmer’s article, even though every other recent article of his has comments. Except, that is, for the one just before it on ancient toolmakers in China. I wonder if the NY Times comments-or-not algorithm for IQ articles is something like this: No comments for any IQ article, and also no comments for the immediately preceding article (if an IQ article is in the queue, decomment the previous article also). This gives them a certain deniability. People wonder why there are no comments, but if they go to his previous piece, no comments there either, so it must just be the overall policy for those articles in that section of the paper.

    • Replies: @Peter Johnson
  26. utu says:
    @res


    If you look at Steve’s blog you will see that the current study used summary statistics which do not permit the L1-optimization technique he used.

    The truth is that Hsu overhyped his L1 lasso method, because he preselected 20k or 50k SNPs based on correlation, which was kind of cheating: if he picked 20k SNPs based on correlations, he could have sorted them and started doing linear regressions with the first two most correlated, then added the next and done another linear regression, and so on. It would be computationally much faster than his non-linear L1 fit. The whole point of L1 lasso is to find variables which are so weak that they can’t be picked out separately by means of correlation but could show strength in a multivariate predictor function when acting together with others selected in groups.
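    The sort-and-add procedure described above can be sketched as follows. This is only an illustration on simulated data, assuming plain least squares at every step; it is not Hsu's pipeline, and the function names and the simulated effect sizes are made up for the example.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - residuals.var() / y.var()

def stepwise_by_correlation(X, y, max_vars):
    """Rank predictors by |marginal correlation| with y, then add them one at a time."""
    marginal = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    order = np.argsort(-marginal)[:max_vars]
    r2_path = [r_squared(X[:, order[:k]], y) for k in range(1, max_vars + 1)]
    return order, r2_path

# Simulated stand-in data (not real genotypes): 50 predictors, 5 with true effects.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 50))
true_effects = np.zeros(50)
true_effects[:5] = [0.5, 0.4, 0.3, 0.2, 0.1]
y = X @ true_effects + rng.standard_normal(2000)

order, r2_path = stepwise_by_correlation(X, y, max_vars=10)
print(order[:5], [round(v, 3) for v in r2_path])
```

    In-sample R^2 can only go up as variables are added, so a procedure like this says nothing about over-fitting unless the path is re-evaluated on held-out data.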

    they don’t tell us the R^2 of the baseline model

    What are you talking about? In FAQ and also in the paper they seem to talk in terms of variance explained.

    • Replies: @res

  27. @ATate
    No need to quote La Griffe, most people on the left will dismiss that out of hand.

    Instead show them this graph;

    http://www.jbhe.com/latest/news/1-22-09/satracialgapfigure.gif

    I had one guy accuse me of getting it from Stormfront. When you tell them it’s from The Journal of Blacks in Higher Education, they get...quiet.

    Last year I took data from the table and calculated how much variance a binary variable standing for race explains. If I remember correctly it was over 70%. Which means that environment explains about 30% of variance.

  28. res says:
    @utu


    What are you talking about? In FAQ and also in the paper they seem to talk in terms of variance explained.

    Read more carefully (e.g. look at the excerpts in my comment 6). Everything I see in the paper and the supplementary material is about incremental variance explained. Which means after controlling for the baseline model including sex, birth year, their interaction and 10 principal components of the genetic relatedness matrix.

    • Replies: @utu
  29. utu says:
    @res


    The incremental variance explained is what you want. They first fit a multivariate function of sex, birth year, etc., and obtain residuals and R^2. Then they add the polygenic score to this multivariate model as another variable and get a new fit, new residuals and a new R^2. The second fit is better and R^2 is larger. The difference between the R^2’s of the two fits is what they report. This makes sense.
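    The two-fit procedure described above can be sketched like this. It is a minimal illustration on simulated data, assuming ordinary least squares; the variable names, the effect sizes and the random stand-ins for the 10 principal components are all assumptions, not values from the paper.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - residuals.var() / y.var()

def incremental_r2(baseline, score, y):
    """R^2 of the baseline fit, R^2 after adding the score, and the gain."""
    r2_base = r_squared(baseline, y)
    r2_full = r_squared(np.column_stack([baseline, score]), y)
    return r2_base, r2_full, r2_full - r2_base

# Simulated stand-in data; none of these numbers come from the study.
rng = np.random.default_rng(1)
n = 5000
sex = rng.integers(0, 2, n).astype(float)
birth_year = rng.integers(1930, 1990, n) - 1960.0   # centered for numerical stability
pcs = rng.standard_normal((n, 10))                  # placeholder for 10 genetic PCs
score = rng.standard_normal(n)                      # placeholder polygenic score
baseline = np.column_stack([sex, birth_year, sex * birth_year, pcs])
edu_years = (13 + 0.03 * birth_year + 0.3 * sex + 0.9 * score
             + 2.5 * rng.standard_normal(n))

r2_base, r2_full, gain = incremental_r2(baseline, score, edu_years)
print(f"baseline R^2 = {r2_base:.3f}, full R^2 = {r2_full:.3f}, incremental = {gain:.3f}")
```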

    I am not sure if I understand what “the first 10 principal components of the variance–covariance matrix of the genetic relatedness matrix” is really about. You can calculate the covariance between all SNP values for each pair of individuals, and this gives you a covariance matrix from which you get principal components that have a value for each individual. I am not sure what this corrects when you subtract the principal components from the phenotype. I do not see why having genetically related individuals is a problem that should somehow be corrected.

    Anyway their results are:

    Taken together, these 1,271 SNPs accounted for just 3.9% of the variation across individuals in years of education completed.

    By contrast, a prediction based on a polygenic score that combines ~1 million SNPs that we studied (see FAQs 1.5 & 2.3) has more predictive power: r = 0.33, corresponding to 11% of the variation across individuals.

    So again you have not contributed much with your comment except that you forced me to reread the paper so I could explain to you what they really did. Anyway, read and think while reading so you do not take too much of my time.

    • Replies: @res

  30. @Anon

    In my experience the New York Times does not allow comments hinting in any way at a link between IQ and race due to genetics. I believe that no matter how modest or indirect the reference, any such comment is censored. The only explanation allowed for IQ differences is white racism and related environmental causes – the absence of genetic causes is a fundamental principle that may not be challenged, even briefly and indirectly in the comments section.

  31. @songbird

    I wonder what the old writers/editors/owners of The Atlantic about 50 years ago would think, if you took a few recent issues into a time machine with you and showed them Coates.

    They would think that the West had been infected with a non-lethal epidemic of encephalitis that had rendered the human population cognitively challenged and easily susceptible to hokum.

  32. @Steve Sailer
    This basically says that the races are so different genetically that it's hard to compare them.

    The basic question is like this: if your brother is taller than you, it's probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.

    Charles Murray said a long time ago that he figured racial differences were genetically like family differences, just more so. But this recent height study suggests that racial differences may be more profound.

    It's an empirical question that needs more research, and when more data comes in we'll still wind up arguing over whether glasses are part full or part empty.

    Good point. This is one reason the most predictive SNP for eye color in Europeans does not predict eye color for Blacks.

  33. OT: Another hate hoax exposed!

    https://nypost.com/2018/07/23/waiter-made-up-story-about-racist-tipper-restaurant/

    It would be interesting to do a study and see how many newspapers picked up the original story, vs how many printed the hoax reveal.

    It is encouraging that in the original source publication, many of the comments are “yeah no kidding these hoaxes happen all the time.”

    https://www.oaoa.com/news/business/article_741af8b8-8eb5-11e8-b276-6fd15202251a.html

  34. res says:
    @utu

    I am not sure if I understand what “the first 10 principal components of the variance–covariance matrix of the genetic relatedness matrix” is really about.

    Sigh. We discussed controlling for population structure two weeks ago here and I linked to a UKBB document which gives nice plots of the population structure PCA along with explanation of the technique: https://www.unz.com/isteve/genetic-analysis-of-social-class-mobility-in-five-longitudinal-studies/#comment-2410937
    Notice who replied saying the links were useful.

    The “variance–covariance matrix of the genetic relatedness matrix” is not the same thing as the SNP matrix in the UKBB PCA, but I am pretty sure the overall effect is similar. I would be interested in hearing from anyone who has a better understanding of how these relate.

    As I understand it, the “genetic relatedness” they controlled for is essentially race with the PCs providing progressively finer granularity. In other words, I believe they are looking at overall genetic relatedness rather than just family relationships. So if there is any consistent variation in the traits being looked at (here EA) between the genetic population groups in the study (and 10 PCs is fairly fine grained) then that variance will be corrected away even if it is due to genetics.
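    To make the population-structure idea concrete, here is a rough sketch of principal components taken from a genetic relatedness matrix. It assumes the simplest textbook construction (standardized genotypes, eigendecomposition of the resulting matrix) and uses simulated genotypes for two made-up groups; whether this matches the paper's exact "variance–covariance matrix of the genetic relatedness matrix" is not something the thread establishes.

```python
import numpy as np

def genetic_relatedness_matrix(genotypes):
    """n x n relatedness matrix from an n x m matrix of 0/1/2 allele counts."""
    variable = genotypes.std(axis=0) > 0           # drop monomorphic SNPs
    g = genotypes[:, variable].astype(float)
    z = (g - g.mean(axis=0)) / g.std(axis=0)       # standardize each SNP
    return z @ z.T / z.shape[1]

def top_components(grm, k):
    """Top-k principal components: one value per individual per component."""
    eigvals, eigvecs = np.linalg.eigh(grm)         # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

# Simulated data: two groups whose allele frequencies have drifted apart.
rng = np.random.default_rng(2)
n_per_group, n_snps = 100, 500
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, n_snps), 0.05, 0.95)
genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_group, n_snps)),
    rng.binomial(2, freq_b, size=(n_per_group, n_snps)),
])

pcs = top_components(genetic_relatedness_matrix(genotypes), k=10)
# The first component should sit on opposite sides of zero for the two groups.
print(pcs[:n_per_group, 0].mean(), pcs[n_per_group:, 0].mean())
```

    Entering such components as covariates removes any outcome variance that lines up with them, whatever its cause, which is the concern raised above.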

    I think including the variance explained by the baseline model is critical for several reasons:
    1. The baseline model may capture some of the genetic variance.
    2. The predictive power of the model is the total variance explained.
    3. I would argue the % of variance remaining after the baseline model is controlled for is an important metric.

    I think a clear way to express the results would be to divide the model variables into three categories:
    1. Non-genetic variables. Here sex, birth year, their interaction
    2. Genetic variables used for correction. Here 10 principal components of the genetic relatedness matrix
    3. Genetic variables used in incremental model.

    Breaking the variables down like this and doing an ANOVA indicating how much variance is explained by each group of variables would be much more informative than only looking at the incremental variance explained by the third category.
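    One way to sketch the breakdown proposed above is a sequential decomposition: add each group of variables in turn and record the R^2 gained. The example below uses simulated data in which the score is deliberately correlated with the first PC; every name and number is an assumption for illustration, and a sequential decomposition is only one of several ways to apportion variance.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - residuals.var() / y.var()

def sequential_r2(groups, y):
    """Add groups of columns in the given order; report the R^2 gained at each step."""
    gains, columns, previous = {}, [], 0.0
    for name, block in groups:
        columns.append(block)
        current = r_squared(np.column_stack(columns), y)
        gains[name] = current - previous
        previous = current
    return gains

# Simulated stand-in data; the score overlaps with the first PC by construction.
rng = np.random.default_rng(3)
n = 5000
sex = rng.integers(0, 2, n).astype(float)
birth_year = rng.integers(1930, 1990, n) - 1960.0
non_genetic = np.column_stack([sex, birth_year, sex * birth_year])
pcs = rng.standard_normal((n, 10))
score = 0.8 * pcs[:, 0] + rng.standard_normal(n)
y = 13 + 0.03 * birth_year + 0.3 * sex + 0.9 * score + 2.5 * rng.standard_normal(n)

score_last = sequential_r2([("non_genetic", non_genetic), ("pcs", pcs), ("score", score)], y)
score_first = sequential_r2([("score", score), ("non_genetic", non_genetic), ("pcs", pcs)], y)
print({k: round(v, 3) for k, v in score_last.items()})
print({k: round(v, 3) for k, v in score_first.items()})
```

    The gains always sum to the total R^2, but the share credited to each group depends on the order of entry, so the correction variables can absorb variance that the score would otherwise claim.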

    So again you have not contributed much with your comment except that you forced me to reread the paper so I could explain to you what they really did. Anyway, read and think while reading so you do not take too much of my time.

    Get over yourself, utu. I am sorry if I am being too terse. I keep making the mistake of thinking that you are smart enough to learn from previous discussions and make inferences beyond what I stated explicitly.

    • Replies: @utu
    I agree on one thing. I also would like to see the R^2 of the baseline model as well as the incremental R^2 for different baseline models, especially for one without the PC variables. I suspect that the incremental R^2 without the PC variables in the baseline model would be larger. So if the sample was multiracial, racial differences in educational attainment would be removed by the PC variables, and thus the racial component to intelligence, if it indeed exists, would be strongly attenuated or even lost.

    I really have a problem with "2. Genetic variables used for correction." but after thinking about it I begin to think this problem is unsolvable in general. You can't correct with X (in this case genes) what you want to detect with X (also genes). How do you know that the PC variable that you use is not also a good polygenic score that is responsible for intelligence? This problem makes me think of Baron Munchausen saving himself from drowning by pulling on his own hair.

    The outstanding issue is still outstanding:

    Taken together, these 1,271 SNPs accounted for just 3.9% of the variation across individuals in years of education completed.
     

    By contrast, a prediction based on a polygenic score that combines ~1 million SNPs that we studied (see FAQs 1.5 & 2.3) has more predictive power: r = 0.33, corresponding to 11% of the variation across individuals.
     
  35. @res
    Worth emphasizing that this is how they measure predictive value:

    We measure prediction accuracy by the ‘incremental R^2’ statistic: the gain in the coefficient of determination (R^2) when the score is added as a covariate to a regression of the phenotype on a set of baseline controls (sex, birth year, their interaction and 10 principal components of the genetic relatedness matrix).
     
    As far as I can tell (having taken a quick look through the supplementary material PDF and spreadsheet) they scrupulously avoid mentioning the R^2 for their baseline model. I wonder what it is and how much variance is explained by each of the variables involved? Seems like a fairly fundamental question, and an odd omission given the supplementary PDF is 207 pages long.

    The predictive power is much reduced for African-Americans, but still nonzero. From page 132 of the supplementary material:

    In the prediction analysis, we included the same set of age and sex controls as the European ancestry analysis, and the first 10 principal components of the variance-covariance matrix of the genetic data of the African-ancestry sample. We found that the LDpred score predicts 1.6% (95% CI: 0.7% to 3.0%) of the variance in EduYears among African ancestry individuals. This represents 85% attenuation in the predictive power of the score compared to the incremental R^2 of 10.6% in our European-ancestry sample from HRS.
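    The 85% figure follows directly from the two quoted numbers. A small check of that arithmetic is below; the function name is illustrative, and applying the same formula to the confidence-interval endpoints ignores the uncertainty in the 10.6% reference value.

```python
def attenuation(reference_r2: float, target_r2: float) -> float:
    """Fractional loss of predictive power relative to a reference sample."""
    return 1.0 - target_r2 / reference_r2

# 10.6% in the European-ancestry HRS sample vs 1.6% in the African-ancestry sample
print(f"{attenuation(0.106, 0.016):.1%}")   # 84.9%
# Same calculation at the quoted 95% CI endpoints (0.7% and 3.0%)
print(f"{attenuation(0.106, 0.007):.1%}")   # 93.4%
print(f"{attenuation(0.106, 0.030):.1%}")   # 71.7%
```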
     

    This article is more blah-blah-blah by the usual failures. Consider this statement:

    Dr. Benjamin and his colleagues hope to grow their study to 2 million people or more, and expect to find thousands more genes linked to education.

    We already have many studies indicating that “IQ”, “educational achievement”, … (complex, poorly defined traits) involve thousands of genes. This study confirms that. Hence we already know that even if there are only two variants per gene, the number of possible variants is two raised to a power > 1000. There are not enough people in the world to ever validate a prediction from the genetic makeup of an individual. No wonder they need over 200 pages of pdf to cover up.
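    For scale, the count in the comment above can be computed exactly with Python's arbitrary-precision integers. This sketch takes the comment's own premise of two variants at each of 1,000 genes; the world population figure is a rough 2018 approximation added for comparison.

```python
n_genes = 1000                      # lower bound used in the comment above
combinations = 2 ** n_genes         # two variants per gene
world_population = 7_600_000_000    # rough 2018 figure; an assumption

print(len(str(combinations)))              # 302 decimal digits
print(combinations > world_population)     # True
```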

  36. @Tulip
    Two interesting points in the FAQ:


    "As sample sizes for GWAS continue to grow, it will likely be possible to construct a polygenic score for educational attainment whose predictive power comes closer to 20% of the variance in educational attainment across individuals (Rietveld et al. 2013)"

    This seems much lower than the numbers that are often bandied about in the Jensenite crowd.

    "First, it means that polygenic scores of individuals from different ancestry groups cannot be meaningfully compared. A recent paper (Martin et al. 2017) illustrated this point in the context of polygenic scores for predicting height; in the sample analyzed in that paper, polygenic scores for height for individuals of European ancestries are on average larger than those of South Asian ancestries which in turn are larger than those of African ancestries. In actuality, however, populations of African ancestries represented by the sample have similar height to populations of European ancestries, and both African and European populations tend to be taller than South Asian populations."

    But if race is a "social construct", why can't you compare the polygenic scores of peoples from different ancestry groups? Lewontin wants to know.

    Tulip, I’ve seen your question answered as follows: Both East Asian and European populations have many people with fair skin. But the gene(s) that cause fair skin are different in the different populations. Same with the genes for higher intelligence and educational attainment.

  37. How, if at all, are the SNPs correlated with educational attainment different from those correlated with high IQ?

    Identifying genes correlated with high IQ would obviously be an important finding in its own right. But if the genes are the same, then the study is merely confirming what was blindingly obvious to begin with — i.e., people with high IQ are more likely to finish high school, and complete college and post-graduate programs.

    • Replies: @res

  38. RW says:

    Tulip raised a good point:

    As sample sizes for GWAS continue to grow, it will likely be possible to construct a polygenic score for educational attainment whose predictive power comes closer to 20% of the variance in educational attainment across individuals (Rietveld et al. 2013)

    “This seems much lower than the numbers that are often bandied about in the Jensenite crowd.”

    I wonder how much educational attainment could be due to effects from one’s peer group.

  39. In other words, African Americans are genetically different enough from white Americans that a model developed on whites doesn’t work on blacks?

    They need to norm for Wakandans. The formula is W(n). I know for a fact that has not yet been done.
    Behind that big, beautiful Wall of Wakanda each Wakandan is blessed with so much IQ they need to carry it in a wheelbarrow.

    • Replies: @Okechukwu

  40. utu says:
    @res

    I am not sure if I understand what “the first 10 principal components of the variance–covariance matrix of the genetic relatedness matrix” is really about.
     
    Sigh. We discussed controlling for population structure two weeks ago here and I linked to a UKBB document which gives nice plots of the population structure PCA along with explanation of the technique: https://www.unz.com/isteve/genetic-analysis-of-social-class-mobility-in-five-longitudinal-studies/#comment-2410937
    Notice who replied saying the links were useful.

    The "variance–covariance matrix of the genetic relatedness matrix" is not the same thing as the SNP matrix in the UKBB PCA, but I am pretty sure the overall effect is similar. I would be interested in hearing from anyone who has a better understanding of how these relate.

    As I understand it, the "genetic relatedness" they controlled for is essentially race with the PCs providing progressively finer granularity. In other words, I believe they are looking at overall genetic relatedness rather than just family relationships. So if there is any consistent variation in the traits being looked at (here EA) between the genetic population groups in the study (and 10 PCs is fairly fine grained) then that variance will be corrected away even if it is due to genetics.
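
    To make the mechanism above concrete, here is a minimal sketch on purely synthetic data (it does not use the paper's data or pipeline). Assumptions: two hypothetical populations whose allele frequencies have drifted apart, a trait whose group offset is deliberately non-genetic, and a single top principal component of the standardized genotype matrix standing in for the 10 PCs. All function names are illustrative.

```python
import numpy as np


def simulate_two_groups(n_per_group=200, n_snps=500, seed=0):
    """Simulate genotypes (0/1/2 allele counts) for two hypothetical populations."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.2, 0.8, n_snps)
    drift = rng.normal(0.0, 0.1, n_snps)
    # Allele frequencies drift in opposite directions in the two groups.
    freq_a = np.clip(base + drift, 0.05, 0.95)
    freq_b = np.clip(base - drift, 0.05, 0.95)
    geno_a = rng.binomial(2, freq_a, size=(n_per_group, n_snps))
    geno_b = rng.binomial(2, freq_b, size=(n_per_group, n_snps))
    genotypes = np.vstack([geno_a, geno_b]).astype(float)
    group = np.repeat([0.0, 1.0], n_per_group)
    # Assumption: the group offset in the trait is non-genetic by construction.
    trait = 1.0 * group + rng.normal(0.0, 1.0, 2 * n_per_group)
    return genotypes, group, trait


def top_pc(genotypes):
    """Return individual scores on the first PC of the standardized genotypes."""
    centered = genotypes - genotypes.mean(axis=0)
    sd = centered.std(axis=0)
    keep = sd > 0  # drop monomorphic SNPs, if any
    standardized = centered[:, keep] / sd[keep]
    u, _, _ = np.linalg.svd(standardized, full_matrices=False)
    return u[:, 0]


def residualize(y, covariate):
    """Remove the linear effect of one covariate (plus intercept) from y."""
    design = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta


def group_gap(values, group):
    """Difference in mean between group 1 and group 0."""
    return values[group == 1].mean() - values[group == 0].mean()


genotypes, group, trait = simulate_two_groups()
pc1 = top_pc(genotypes)
pc_group_corr = abs(np.corrcoef(pc1, group)[0, 1])
raw_gap = group_gap(trait, group)
adjusted_gap = group_gap(residualize(trait, pc1), group)

print(f"|corr(PC1, group)| = {pc_group_corr:.3f}")
print(f"gap before PC adjustment = {raw_gap:.3f}")
print(f"gap after PC adjustment  = {adjusted_gap:.3f}")
```

    In this toy setup the top PC tracks group membership closely, so adjusting for it removes most of the between-group gap in the trait, whatever produced that gap. The adjustment by itself cannot distinguish a genetic cause from an environmental one.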

    I think including the variance explained by the baseline model is critical for several reasons:
    1. The baseline model may capture some of the genetic variance.
    2. The predictive power of the model is the total variance explained.
    3. I would argue the % of variance remaining after the baseline model is controlled for is an important metric.

    I think a clear way to express the results would be to divide the model variables into three categories:
    1. Non-genetic variables. Here sex, birth year, their interaction
    2. Genetic variables used for correction. Here 10 principal components of the genetic relatedness matrix
    3. Genetic variables used in incremental model.

    Breaking the variables down like this and doing an ANOVA indicating how much variance is explained by each group of variables would be much more informative than only looking at the incremental variance explained by the third category.
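
    A rough sketch of that three-way breakdown, using sequential (nested-model) R^2 on simulated data. The block names, the effect sizes, and the helper functions `r_squared` and `sequential_r2` are assumptions for illustration; none of the numbers come from the paper.

```python
import numpy as np


def r_squared(predictors, y):
    """R^2 of an ordinary least squares fit of y on predictors plus an intercept."""
    design = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return 1.0 - residuals.var() / y.var()


def sequential_r2(blocks, y):
    """Add variable blocks in order; report cumulative and incremental R^2."""
    rows = []
    columns = np.empty((len(y), 0))
    previous = 0.0
    for name, block in blocks:
        columns = np.column_stack([columns, block])
        cumulative = r_squared(columns, y)
        rows.append((name, cumulative, cumulative - previous))
        previous = cumulative
    return rows


rng = np.random.default_rng(1)
n = 5000
non_genetic = rng.normal(size=(n, 3))   # stand-ins for sex, birth year, interaction
pcs = rng.normal(size=(n, 10))          # stand-ins for the 10 principal components
score = rng.normal(size=n)              # stand-in for the polygenic score

# Hypothetical effect sizes, chosen only so every block explains something.
y = (0.3 * non_genetic[:, 0] + 0.2 * pcs[:, 0] + 0.35 * score
     + rng.normal(size=n))

results = sequential_r2(
    [("non-genetic", non_genetic), ("PC correction", pcs), ("polygenic score", score)],
    y,
)
for name, cumulative, incremental in results:
    print(f"{name:16s} cumulative R^2 = {cumulative:.3f}  incremental = {incremental:.3f}")
```

    Sequential R^2 depends on the order in which the blocks are entered whenever the blocks are correlated with each other, so it would be worth reporting more than one ordering.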

    So again you have not contributed much with your comment, except that you forced me to reread the paper so I could explain to you what they really did. Anyway, read and think while reading so you do not take too much of my time.
     
    Get over yourself, utu. I am sorry if I am being too terse. I keep making the mistake of thinking that you are smart enough to learn from previous discussions and make inferences beyond what I stated explicitly.

    I agree on one thing. I would also like to see the R^2 of the baseline model, as well as the incremental R^2 for different baseline models, especially one without the PC variables. I suspect that the incremental R^2 without the PC variables in the baseline model would be larger. So if the sample was multiracial, racial differences in educational attainment would be removed by the PC variables, and thus the racial component to intelligence, if it indeed exists, would be strongly attenuated or even lost.

    I really have a problem with “2. Genetic variables used for correction.” but after thinking about it I begin to think this problem is, in general, unsolvable. You can’t correct with X (in this case, genes) what you want to detect with X (also genes). How do you know that the PC variable that you use is not also a good polygenic score that is responsible for intelligence? This problem makes me think of Baron Munchausen saving himself from drowning by pulling on his own hair.

    The outstanding issue is still outstanding:

    Taken together, these 1,271 SNPs accounted for just 3.9% of the variation across individuals in years of education completed.

    By contrast, a prediction based on a polygenic score that combines ~1 million SNPs that we studied (see FAQs 1.5 & 2.3) has more predictive power: r = 0.33, corresponding to 11% of the variation across individuals.
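
    The conversion between a correlation and a share of variance is just squaring, which a few lines make explicit. The function names are hypothetical; the inputs 0.33 and 3.9% are the figures quoted above.

```python
import math


def variance_explained(r):
    """Share of variance explained by a predictor that correlates r with the outcome."""
    return r ** 2


def correlation_from_variance(r2):
    """Correlation implied by a given share of variance explained."""
    return math.sqrt(r2)


polygenic_r2 = variance_explained(0.33)         # about 0.109, i.e. roughly 11%
lead_snp_r = correlation_from_variance(0.039)   # about 0.197 for the 1,271 SNPs

print(f"r = 0.33   -> R^2 = {polygenic_r2:.4f}")
print(f"R^2 = 3.9% -> r   = {lead_snp_r:.4f}")
```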

    • Replies: @res

  41. @Hypnotoad666
    How, if at all, are the SNPs correlated with educational attainment different from those correlated with high IQ?

    Conscientiousness probably contributes to EA but not IQ. Perhaps conformity as well. Finding SNPs for those traits is left as an exercise for the reader ; )

  42. @utu
    How do you know that the PC variable that you use is not also a good polygenic score that is responsible for intelligence?

    Exactly.

  43. @ThirdWorldSteveReader

    The basic question is like this: if your brother is taller than you, it’s probably because, while he has pretty much the same genes as you do, he likely has versions of those genes that make him taller. But if a person of a different race is taller, it might be because he has totally different genes, not just different variations of genes you all share.
     
    That's not it; different races still have the same gene loci (these things tend to be very conserved), but they have different versions (alleles) that are absent in one of them, making it difficult to compare them unless both races are well represented in the GWAS. I mean: in a set of white people, you can catch the alleles present among whites, and associate them with a desired phenotype, and then use their presence in a given white individual to predict the phenotype. But if you try to use the GWAS results obtained from the white dataset to predict the phenotype in a black person, the prediction will not be very reliable, mostly because blacks may have different relevant alleles not present in the white dataset, whose effect we still don't know.

    Gregory Cochran wrote about this in several blogposts, where he explains it better than I can:

    https://westhunt.wordpress.com/2017/02/09/everything-is-different-but-the-same/
    https://westhunt.wordpress.com/2018/06/03/wysiwyg/

    Your brother might be taller if mom bought hormone-grown chicken or similar foods during bro’s youth. That would make an interesting study if not already done. Growth hormones join endocrine disruptor impacts in the food chain.

  44. @Sarah Toga

    They need to norm for Wakandans. The formula is W(n). I know for a fact that has not yet been done.
    Behind that big, beautiful Wall of Wakanda each Wakandan is blessed with so much IQ they need to carry it in a wheelbarrow.

    Now here’s an idiot that doesn’t belong in a discussion of intelligence.

  45. @ATate
    No need to quote La Griffe, most people on the left will dismiss that out of hand.

    Instead, show them this graph:

    http://www.jbhe.com/latest/news/1-22-09/satracialgapfigure.gif

    I had one guy accuse me of getting it from Stormfront. When you tell them it’s from The Journal of Blacks in Higher Education, they get...quiet.

    Another hotshot IQist who doesn’t understand the difference between wealth and income. It’s very possible that a low-earning white person is actually wealthy, while a high-earning black person has little tangible wealth and is living paycheck to paycheck. Whites had a long time to build up generational wealth when the playing field was markedly uneven. Blacks didn’t have the same opportunity. In fact their homes and businesses were burned down by vicious white racists if they showed any signs of success. Low earning whites are still raised in the culture of IQ and SAT tests. High earning blacks are not necessarily raised in the same culture, as they are a distinct sub-culture.

    Sorry for the intrusion of cold, hard reality. You may get back to your ridiculous pseudoscience.

  46. Man, debunking this stuff is too easy:

    Immigrants from Africa Boast Higher Education Levels Than Overall U.S. Population

    African immigrants boast higher levels of education than the overall U.S. population, with a particular focus on Science, Technology, Engineering, and Math. 40 percent of African immigrants have at least a bachelor’s degree—making them 30 percent more likely to achieve that level of education than the U.S. population overall. Of this group, about one in three, or 33.4 percent, have STEM degrees, training heavily in demand by today’s employers.

    https://www.newamericaneconomy.org/press-release/immigrants-from-africa-boast-higher-education-levels-than-overall-u-s-population/

    You guys need to understand that with us you’re not dealing with descendants of slaves who’ve been brutalized for 400 years. We’ll go toe-to-toe with anyone. Try your pseudoscience on us. Haha.

Comments are closed.
