Verbal Intelligence by Demographic

A few years ago I put up a post, WORDSUM & IQ & the correlation, as a “reference” post. Basically if anyone objected to using WORDSUM, a variable in the General Social Survey, then I would point to that post and observe that the correlation between WORDSUM and general intelligence is 0.71. That makes sense, since WORDSUM is a vocabulary test, and verbal fluency is well correlated with intelligence.

But I realized that over the years I’ve written many posts using the GSS and WORDSUM without ever explicitly laying out the distribution of WORDSUM scores, which range from 0 (0 out of 10) to 10 (10 out of 10). I’ve used categories like “stupid, interval 0-4,” but often mentioned the percentiles only in the comments, after prompting from a reader. This post is meant to fix that problem for good, and will serve as a reference in the future.

First, please keep in mind that I limited the sample to the year 2000 and later. The N is ~7,000, but far lower for some of the variables crossed. Therefore, I invite you to replicate my results. After the charts I will list all the variables, so if you care you should be able to replicate the tables, with all the sample sizes displayed, in ~10 minutes. I am also going to attach a csv file with the raw table data. As for the charts, they are simple.

- The x-axis is a WORDSUM category, ranging from 0 to 10

- The y-axis is the percent of a given demographic class who received that score. I’ve labelled some of them where the chart doesn’t get too busy

All of the charts have a line which represents the total population in the sample (“All”).
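
To make the axes concrete, here is a minimal sketch in Python of how each line on a chart could be computed. The respondents below are made up for illustration (they are not GSS output), and the name `score_distribution` is my own, not anything from the GSS.

```python
from collections import Counter

def score_distribution(scores):
    """Return {score: percent of respondents} for WORDSUM scores 0..10."""
    counts = Counter(scores)
    total = sum(counts.values())
    return {s: 100.0 * counts[s] / total for s in range(11)}

# Hypothetical toy data, NOT GSS results: (group label, WORDSUM score) pairs.
respondents = [
    ("Male", 6), ("Male", 7), ("Male", 4), ("Male", 6),
    ("Female", 6), ("Female", 8), ("Female", 5), ("Female", 10),
]

# Collect the scores for each demographic class.
by_group = {}
for group, score in respondents:
    by_group.setdefault(group, []).append(score)

# One line per demographic class, plus the "All" reference line.
lines = {group: score_distribution(s) for group, s in by_group.items()}
lines["All"] = score_distribution([score for _, score in respondents])

for label, dist in lines.items():
    # x-axis: score 0..10; y-axis: percent of that class with the score.
    print(label, [round(dist[s], 1) for s in range(11)])
```

Each class is divided by its own total, so every line sums to 100% no matter how small the class is. That is also why lines for small classes look jumpy.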


The “Row” variable in all cases was WORDSUM. I put in YEAR(2000-*) in “Selection Filter(s).”

For the columns:

Sex = SEX

Race/ethnicity = For non-Hispanic blacks and whites, put HISPANIC(1) in the filter, then use RACE as the column. For Hispanics, just limiting the sample to Hispanics with HISPANIC(2-*) in the filter will do; nothing is needed in the column.

Education = DEGREE

Region = REGION

Political ideology = POLVIEWS(r:1-3"Liberal";4"Moderate";5-7"Conservative")

Political party = PARTYID(r:0-2"Democrat";3"Independent";4-6"Republican")

Belief in God = GOD(r:1-2"Atheist & agnostic";3-5"Theist";6"Convinced Theist")

Religion = RELIG

Opinion about Bible = BIBLE

Income standardized to 1986 = REALINC(r:0-20000"0-20";20000-40000"20-40";40000-60000"40-60";60000-80000"60-80";80000-100000"80-100";100000-120000"100-120";120000-140000"120-140";140000-*"140-")

Wealth = WEALTH(r:1-3"")

Evolution = EVOLVED
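
If the recode syntax above is unfamiliar, here is a rough sketch in Python of what the `r:` recodes do for POLVIEWS, PARTYID and GOD: each range of numeric codes is collapsed into one labelled category. The helper `recode` and the `*_RULES` names are mine, for illustration only. This is not how the GSS web interface is implemented, and returning `None` for codes outside every range is an assumption of the sketch.

```python
def recode(value, rules):
    """Map a numeric code to a label using inclusive (low, high, label) ranges."""
    for low, high, label in rules:
        if low <= value <= high:
            return label
    return None  # assumption: codes outside every range get no label

# Ranges copied from the POLVIEWS, PARTYID and GOD recodes listed above.
POLVIEWS_RULES = [(1, 3, "Liberal"), (4, 4, "Moderate"), (5, 7, "Conservative")]
PARTYID_RULES = [(0, 2, "Democrat"), (3, 3, "Independent"), (4, 6, "Republican")]
GOD_RULES = [(1, 2, "Atheist & agnostic"), (3, 5, "Theist"), (6, 6, "Convinced Theist")]

# Hypothetical respondent codes, just to show the mapping.
print(recode(2, POLVIEWS_RULES))
print(recode(3, PARTYID_RULES))
print(recode(6, GOD_RULES))
```

Note that anyone coded 1 or 2 on PARTYID counts as a “Democrat” here and anyone coded 4 or 5 as a “Republican,” so “Independent” is only the single middle code.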

You can find the raw table here.

  1. Thanks.

    Wow, the graph by religion really jumps out from the rest. Presumably, it wouldn’t be quite so extreme if restricted to white people by religion, but still …

  2. You accidentally listed the bible one twice. The one with “book of fables”.

  3. What are you using to draw the curves? cubic spline?

    The ones where there is little data (Jews, atheists) jump out by not being unimodal. In some ways, that’s a good feature of the graph, warning not to take too much from that particular line. But the Jewish data is actually unimodal on the high end and the non-monotone curve is the result of the smoothing method. I guess that’s a sign that the data is really weird, due to small numbers, even if it isn’t so weird as to not be unimodal.

  4. I am continually astounded that Wordsum is considered even mildly correlated with intelligence.

    WORDSUM scores can be divided into only two usefully descriptive categories:

    0-9 = Stupid and ignorant

    10 = Minimally literate

  5. @”Curious”: So 95.9 % of the sample surveyed is “stupid and ignorant” according to you (as only 4.1% received a score of 10)?

    Also, I don’t know what you mean by “continually astounded that Wordsum is considered even mildly correlated with intelligence”. The sample correlation between IQ and WORDSUM is 0.71. Based on my experience, that’s a pretty large correlation for social sciences — so is your astonishment stemming from a belief that IQ is a poor indicator for intelligence? That’s a defensible position, but otherwise I don’t understand what you mean by “considered”.

  6. Helpful tip for better graphics: Label your axis. Seriously, label the axis. Even if the reader can figure it out, still label the axis. That is all.

  7. Anonymous says:

    The party affiliation looks funky. It looks as though everyone who wasn’t a Democrat or Republican got lumped into “Independent.”

    Not being a registered voter doesn’t make you an Independent. Registered “no party affiliation” would also be an interesting datapoint.

  8. I don’t know if you’re aware of this, but the red and green you have chosen are virtually impossible for one who is red/green colorblind (e.g. me). It’s easier to distinguish on the graph than the legend, but it’s still very difficult.

  9. Why do you specify that it’s verbal intelligence? Why not general intelligence since that’s the correlation?

  10. Also, why do you have wordsum as 0-10 in the csv file when it’s 1-10 in GSS?

  11. Anonymous says:

    I am disappointed that a magazine like Discover magazine would use such poorly designed graphs. The data is interesting, however, I could not read the majority of it as I am colorblind and the colors you choose are very difficult to discern. Thank you.

  12. Hey Razib, I sat next to you on the bus. I was the guy going to Chico. I saw you making this. Blew my mind when I found this link on reddit. Hope you got home alright.

  13. @curious

    why do you say that? you have a low IQ, (<115) but you got a 10? the data speaks for itself.

  14. some minor points

    wordsum goes from 0 out of 10 to 10 out of 10. only a tiny proportion get 0 out of 10, but they’re there.

    re: graphic design. as alluded to above by #14, i wrote this post in about 20 minutes while i was on a bus, by myself. apologize for the aesthetic, but that wasn’t the point. that’s why i provided the .csv file. though in the future i’m going to keep in mind the color-blindness issue, i hadn’t thought of that. i basically just used open office’s web page export feature to create the JPG versions of the charts, and it does suck. and i also went with open office’s default chart colors. might have to write a reusable function in R in the future to make this 1) easy on time 2) on the eyes of readers. i didn’t label axes, etc., because space (600 pixel max), and time (might have taken 50 minutes to write the whole thing, adding labels gets kind of tedious when you are using a wizard).

  15. I hope these figures encourage others to study this with full-on IQ tests. It would be interesting to see also which groups cluster by verbal ability versus spatial.

  16. My bad, col=pres08 with row=wordsum shows me wordsum as 1-10 but in the help section it’s listed as 0-10. (It clipped because n=0 for wordsum=0 I guess)

  17. I have never taken the test without getting a perfect score, and it is really, really easy to do so. The words are simply not uncommon to someone with a halfway decent primary and secondary education, or even to those who just read a lot. I simply don’t believe one could have a halfway intelligent conversation with others who don’t also score a 10. But maybe that’s just me, or others in my cohort who had the advantage of going to school before the educational rot set in in the 1970s. Although I do admit to having intelligence test results that put me several deviations above the mean, I think that a decent education should allow the 120 IQ types to ace WORDSUM.

    Perhaps I’m wrong though. In that case we need a massive eugenic program to be instituted along with a return to the educational standards of past eras.

    I like reading GNXP because it introduces words, concepts, and meanings that I don’t know. Razib will no doubt castigate me for not doing my own WORDSUM correlation with age to prove my point but I’m too lazy for that (and always have been if I’m not getting paid to do so. I’m retired and prefer to limit my hard thinking to investments and the poker table.)

  18. Seems like an even more exaggerated example of the tests showing cognitive ability declining with age. Oh well, as we approach the supposed effects smart machines will have on society predicted by certain “singularity” enthusiasts, we will need fewer and fewer humans of any sort to do any work. This argues for even more stringent eugenic programs to reduce the absolute number of the useless masses, as there will be no productive work for any of them.

    Hardly seems to be any point in having hordes of slave laborers assembling i(Whatevers) so that hordes of clueless users can produce a myriad of meaningless Twitter blather.

  19. “Curious,” i think you may not assort with more normal cognitive types. that’s fine, but i don’t see what the point in denying WORDSUM’s usefulness is in that case. you’re comparing two different issues here.

  20. Anonymous says:

    I’m interested in the conflicting information your graphs show wrt religion vs intelligence. In the first graph, the suggestion seems to be that atheists score more highly than theists in general. The next graph shows Jews scoring much higher than atheists. How can this be so? The only conclusion I can draw is that there is a small sample size of Jews, which does little to affect the overall average of theists. If the data is really that badly skewed, why do you feel it is worth publishing at all? Clearly the graphs mean nothing.

  21. #23, only 30% of those with ‘no religion’ are atheists or agnostics. a substantial minority of jews are atheists and agnostics. and yes, the number of jews is small in any case. though they are a larger percentage of atheists and agnostics than theists.

  22. Anonymous says:

    No axis labeling? Really? The charts were frustrating and confusing to read without the axes being labelled. This comic should help you:

  23. Anonymous says:

    Razib, can you post data, or point to where I could find data, on the relative sample sizes with respect to the different variables? Just trying to figure out statistical significance; sorry if you have done this already and I missed it, but I didn’t see it in the CSV file you posted. (BTW, your post has inspired a heated discussion on my Facebook after I posted it. Very interesting data. Thanks for the post.)

  24. #26, that’s why i provided the variables. just go to the gss website, and replicate the queries. the default will show you the weighted N. feel free to follow up if that’s too confusing.

  25. It would have been useful to provide the correlations between the different variables, or to control for some when plotting others. I would guess that the graph for religion, for example, would look quite different when controlled for education, since Jews are over-represented in American academia.

  26. I understand WORDSUM’s utility because it is a GSS variable, but I have a hard time accepting it as a useful intelligence measure when breaking down groups. A couple of general problems… First, it’s obviously a disaster near the ends of the distribution because of the large measurement error. Second, the correlation you quote is from data collected in the 1960s, long enough ago to matter, and the sample was soldiers taking the Army’s intelligence/vocation test. Those people bear little resemblance to the GSS sample and inhabited a different world. Finally, I don’t see any particular reason, given variation in linguistic exposure and access to education, that the WORDSUM-g correlation would be stable across subgroups. For example, white kids and rich kids are read to more and have far more access to books compared to black kids and poor kids, on average, and this seems like a reasonable input for WORDSUM, which is explicitly a measure of culturally-acquired knowledge.

    And the axes weren’t confusing — they were obvious to anyone who read the text and you defined them at the top, and this is a blog post. People who have just recently heard of Tufte go around scolding everyone’s charts for a couple years. It’s kind of cute.

  27. #30, I don’t see any particular reason, given variation in linguistic exposure and access to education, that the WORDSUM-g correlation would be stable across subgroups.

    seems testable: do the correlations between math and verbal parts of SAT vary by group? the latter is less ‘culture-fair,’ so they should if your proposition is correct.

