Razib Khan
General Social Survey


About a month back a researcher at Yale published survey results which showed that Tea Party members exhibited more science knowledge than the general public, somewhat to his chagrin. I wasn’t particularly surprised, because the knowledge of science as it relates to political ideology is somewhat complex. Often the right-leaning get lower marks because of strong reactions to questions perceived to be ideological. It’s a rather robust finding that the more intelligent are more ideological, so it is no surprise that a group like the Tea Party would do better on tests which measure underlying cognitive orientation.

This was brought back to my mind by a new piece in The Atlantic which had a “Slate-pitch” sort of title: The Republican Party Isn’t Really the Anti-Science Party. There was some comment on Creationism in the piece, so I wanted to review the data on this most ideologically freighted of the standard science questions asked of the public. To do this I used the General Social Survey. To limit demographic confounds I constrained the samples to non-Hispanic whites who responded 2006-2012 (“Selection Filter(s): Race1(1) Hispanic(1)”). Additionally, I partitioned the data into two classes, non-college and college-educated (“Degree(r:0-2;3-4)”). Then I looked at political party identification and ideology (“Partyid” and “Polviews(r:1-2;3;4;5;6-7)”).


Agree: “Human beings, as we know them today, developed from earlier species of animals.” (% agreeing)

                        Non-college    College-educated
By party identification
  Strong Dem                 56               88
  Dem                        54               79
  Lean Dem                   60               86
  Independent                55               70
  Lean Repub                 44               56
  Repub                      37               56
  Strong Repub               27               41
By political ideology
  Liberal                    69               94
  Slightly Liberal           61               83
  Moderate                   52               71
  Slightly Conserv.          47               65
  Conservative               25               35

As someone with a professional fixation upon evolution and a lean toward conservative political viewpoints, obviously these results are disturbing to me. But they are what they are. The typical run-of-the-mill Ph.D. scientist disagrees with the Right here rather strongly. I think the attitude toward evolution specifically is a major symbolic marker which alienates scientists as a demographic from anything to do with Republicans or conservatism, and vice versa. Though there are presumably normative implications in evolutionary theory, the primary disagreement here is basically on very long established and orthodox science.

One of the things I’m interested in is the perception by some that self-identified conservatives can mobilize better as a collective unit on the American political scene. To test that proposition I often poke around the General Social Survey. For example, it seems to be a common assumption among many liberals that women tend to be more supportive of abortion rights than men. This isn’t borne out by the GSS data. There’s no sex difference. Until you correct for ideology. It turns out that among liberals women are more supportive of abortion rights…while among conservatives men are more supportive of abortion rights! If people socialize only with their own ideological subset then the perception of the relationship between sex and ideological opinions will differ a great deal.

Below I decided to attempt to ascertain differences between liberals and conservatives on “hot button” issues as a function of intelligence. I used WORDSUM to probe this. I classed those who scored 0-5 as dull, those who scored 6-8 as not dull, and those who scored 9 and 10 as smart. I also limited the sample to the year 2000 and later, and only non-Hispanic whites.
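For anyone who wants to reproduce this kind of binning outside the web interface, here is a minimal Python sketch of the three WORDSUM classes described above. The function names and the toy respondents are hypothetical (made up for illustration, not GSS data), and the sketch ignores the survey weights that the GSS interface applies.

```python
def wordsum_class(score):
    """Map a WORDSUM score (0-10 words correct) to the three classes used here."""
    if not 0 <= score <= 10:
        raise ValueError(f"WORDSUM score out of range: {score}")
    if score <= 5:
        return "Dull"
    if score <= 8:
        return "Not Dull"
    return "Smart"


def percent_yes(rows):
    """rows: iterable of (wordsum_score, answered_yes) pairs.

    Returns {class label: unweighted percent answering yes}.
    """
    counts = {}
    for score, yes in rows:
        label = wordsum_class(score)
        n_yes, n_total = counts.get(label, (0, 0))
        counts[label] = (n_yes + bool(yes), n_total + 1)
    return {label: round(100 * n_yes / n_total)
            for label, (n_yes, n_total) in counts.items()}


# Made-up respondents, purely to show the mechanics.
toy = [(3, False), (5, True), (7, True), (8, False), (9, True), (10, True)]
print(percent_yes(toy))
```

The cut points live in one small function, so trying a different split (say 0-4 versus 0-5 for the lowest class) only means changing one comparison.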

I’ll give you the results without comment.


                                        Liberals    Conservatives

Abortion for any reason acceptable (% Yes)
  Dull                                     46            29
  Not Dull                                 65            27
  Smart                                    79            26

Homosexual relations (% Always wrong)
  Dull                                     52            78
  Not Dull                                 15            70
  Smart                                     7            57

Ban Bible prayer in schools (% Approve)
  Dull                                     39            28
  Not Dull                                 65            36
  Smart                                    85            54

Allow anti-religionist to speak (% Yes)
  Dull                                     73            69
  Not Dull                                 93            82
  Smart                                    99            96

Humans developed from animals (% Yes)
  Dull                                     53            44
  Not Dull                                 79            33
  Smart                                    94            51

Belief in God (% Atheist & agnostic)
  Dull                                      6             5
  Not Dull                                 21             4
  Smart                                    31             9

Trade took away American jobs (% Yes)
  Dull                                     29            39
  Not Dull                                 36            41
  Smart                                    11            11

College educated (% Yes)
  Dull                                     11            16
  Not Dull                                 40            33
  Smart                                    72            59

A few years ago I complained that no one was using the General Social Survey web interface for blogging, a practice which probably can be traced back to the Inductivist (yes, social scientists use the GSS constantly, but they use it to publish papers, not blog posts). Kevin Drum noted my lament in late 2008 and promised that he’d revisit the GSS in the future. He hasn’t. That’s fine, there are 1 million things I mean to do which I don’t manage to get to. But still, it’s kind of depressing to me the amount of opinion people can express which they don’t bother to follow up on by using a web interface to a rich data source which requires no more than 1997 era browser skills.

There’s a lot you can do with the GSS interface, but I thought it might be useful to do something very simple so that people can see how easy it really is. Since most of the people I follow on Twitter lean Left I see a lot of political chatter which concerns that segment of the population. For example there is a lot of talk about conservative white males and their lack of concern for global warming.

Can we explore this with any greater precision with the GSS? Yes we can.

First you need to find the appropriate variables. So go to the search box and enter what you want to find. I typed “warming.” When you hit the “Go” button it will return a list of variables which you can then use in your further queries. My own suggestion is to keep the query simple, a single word if possible; this isn’t Google. You’ll usually get a lot of results, but at least that will give you options. Often there are many overlapping variables and you want to pick the one with the largest sample size or which was asked most recently.

Here is some of what I got for “warming”:

I want the last variable. Clicking it puts it into the “Selected” text box. I hit “Row” to copy it to the appropriate box. If you use the GSS enough with a few variables you get to know them off the top of your head and can skip this step. For evolution, for example, I know that “evolved” is a dichotomous response which was surveyed relatively recently.

But you want more than one variable. Going back to my initial curiosity I want to “cross” the variable under consideration with race and sex. I happen to know that there is a “Sex” variable where males are 1 and females are 2. I also know there are “Race” and “Hispanic” variables, where 1 is white and non-Hispanic respectively. I’ll put “Race” in the column box, so it crosses with the row. I’ll also limit the sample to non-Hispanics and males. So you see I entered something in the “Selection Filters” box. There’s a lot more fine-tuning you can do at this point, but let’s just go with this.

Below are the results for the query above. As you can see it’s vintage 1997 as well:

All sorts of details are clear here. You can see the weighted sample size, the exact form of the question, and of course the results in combination of row and column classes.

Finally let’s control for ideology. I happen to know that the POLVIEWS variable has seven response classes, from extremely liberal to extremely conservative. I’ll combine the three liberal classes and the three conservative classes using the recombine option. You can see it below in the “Control” box. This means that the query above will now be split into three categories, one for liberals, one for moderates, and one for conservatives. Here’s a response for liberals and conservatives:

The sample sizes for non-whites here are very small, but the big difference is across ideology among white males. In other words we’re talking ideology as the causal factor. White males are more conservative. And conservatives are less concerned about Arctic seals possibly being threatened by global warming.

I used very much a toy example above. I just wanted to show you how easy the interface really is.
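For readers working from a downloaded extract rather than the web interface, the three-way ideology control described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a dict with a "polviews" key holding the 1-7 code), the function name, and the toy respondents are all hypothetical, and the sketch does not apply survey weights.

```python
from collections import defaultdict

# POLVIEWS runs from 1 (extremely liberal) to 7 (extremely conservative).
# Collapse the three liberal and the three conservative codes, keep 4 as is.
POLVIEWS_GROUPS = {
    1: "Liberal", 2: "Liberal", 3: "Liberal",
    4: "Moderate",
    5: "Conservative", 6: "Conservative", 7: "Conservative",
}


def split_by_ideology(respondents):
    """Group respondent records by collapsed POLVIEWS category.

    Records with a missing or out-of-range code are skipped.
    """
    groups = defaultdict(list)
    for person in respondents:
        code = person.get("polviews")
        if code in POLVIEWS_GROUPS:
            groups[POLVIEWS_GROUPS[code]].append(person)
    return dict(groups)


# Made-up records, not GSS data.
toy = [
    {"id": 1, "polviews": 2},
    {"id": 2, "polviews": 4},
    {"id": 3, "polviews": 6},
    {"id": 4, "polviews": 7},
    {"id": 5, "polviews": None},  # no answer: dropped
]
for label, members in split_by_ideology(toy).items():
    print(label, [m["id"] for m in members])
```

Each of the three resulting groups can then be tabulated separately, which is what putting a variable in the “Control” box amounts to.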

Yesterday I made an admission of my lack of trust after the 2008 financial crisis. I should have been more precise and clarified that my collapse in trust has been particularly aimed at elites and “experts.” In any case, I realized that the General Social Survey has 2010 results available. This means that I could check any changes in public trust and confidence from 2008 to 2010! Below in the set of charts there is one that assesses trust in banks and financial institutions. The direction of change validates my specific implication. But it seems that my intuition was wrong in that American society had slouched toward more general distrust. This makes me less pessimistic about the direction of our culture and the future rationally (I can’t say that my visceral emotional cynicism has been abolished).

As you can see there wasn’t much change between 2008 and 2010. For the broad question of “can you trust people” I also decided to break it down by political ideology, education, and intelligence in two year ranges, 1972-1991 and 1992-2010. There are noticeable differences in intelligence and education (less intelligent and less educated people are more distrustful), but not in terms of ideology.

After the bar plots there is another set of line graphs by year showing confidence in a range of institutions (including finance) from 1972 to 2010. It is interesting how clearly you can see short term volatility due to world events, which quickly recedes back toward the trend line.

Long time reader Ian comments:

A comparison with “the American public” isn’t really appropriate – to even be in the pool where you’re thinking about an academic career, you need to have a college degree. And that population if memory serves, is far more liberal than the population at large. More realistic would be a comparison with the population of people who have graduate degree….

Roughly 20% of Americans self-identify as “liberal,” and roughly 40% as conservative. The General Social Survey has a variable POLVIEWS, which asks individuals to assign themselves to a position on a political spectrum, from “extremely liberal” to “extremely conservative,” like so:

1 = Extremely liberal
2 = Liberal
3 = Slightly liberal
4 = Moderate
5 = Slightly conservative
6 = Conservative
7 = Extremely conservative

So in other words, the higher the integer, the more conservative the individual. The GSS has a variable, EDUCATION, which records the highest level attained. It falls into three classes, high school, bachelor’s, and graduate degrees (I assume those who did not complete high school are omitted because they didn’t attain an education?). Additionally, it has a 10 word vocabulary test, WORDSUM, which has a 0.71 correlation with general intelligence. I combined those on the interval 0-4 (they got 0 to 4 answers correct on the test), and labeled them “dull.” Those on 5-8 I labeled “average.” And finally, those who scored 9 and 10 I labeled “smart” (about 20% are dull, 65% average, and 15% smart, in the total data set). Constraining the sample to the year 2000 and later, I produced the following charts:


~9% of the sample have graduate degrees, while 25% have bachelor’s degrees. So you aren’t really seeing middling educational attainments. There is obviously some tendency toward an increase in the proportion of self-identified liberals at graduate levels of education, but the biggest notable trend is the collapse of self-described “moderates” among the more intelligent or educated cohorts. I sometimes wonder if the vacuousness of political moderation is one reason why the term “centrist” is popular among those who are neither liberal nor conservative, but have a higher socioeconomic status.

Long time readers know this fact about the dullness of moderates, but I thought I’d reiterate. It’s interesting, if consistent. For what it’s worth, I think my own political views are becoming more moderate. But don’t confuse correlation with causation!

I’ve mentioned this before, but I thought it would be useful to repeat it. Many of my social science related posts use Berkeley’s web interface with the General Social Survey. People regularly ask me in the comments for details about the variables, or for a more explicit elaboration of the methods. First, this is a weblog, not a venue for me to publish scholarly papers. Most of the GSS related posts are meant to be “quick & dirty,” and stimulate further exploration by readers. Unfortunately follow ups rarely happen. One can speculate why, but that’s how it is. Nevertheless, I thought I would repeat really quickly how to use the GSS in a basic fashion.

First, here’s the URL:

This is the database from 1972 to 2008. You’ll meet a screen like this:


The page is cluttered, but basically the right side is where you enter in your row and column variables which you want to cross or compare together. The left side allows you to explore the variables. Search and selected are pretty straightforward, while you can browse the list of variables in the menu to the bottom left. The easiest thing to do is just look at frequencies of X, Y, and Z against particular categories A, B and C (e.g., educational attainments vs. sex). But you can do more, at the top left if you select “analysis” you have more options:


I’ve been looking at mean values a lot. Sometimes the mean is obvious because the variables are quantitative. But if you’re talking about a dichotomous response it is “recoded” numerically (e.g., 0 vs. 1), so you have to keep in mind that the mean is just a representation of the underlying data. There are correlations and regressions too. You can do a lot with the GSS, but the more complicated or detailed you get in your analysis, the less appropriate it is for a “quick & dirty” post. I’ve been shying away from presenting regressions because to do it right you have to be careful, and if you just throw out a bunch of betas people aren’t going to replicate your analysis and might put more stock in the model than they should (and it’s not hard to massage the betas you get just by manipulating the set of variables).
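To make the point about dichotomous responses concrete, here is a small Python sketch showing that once “yes” is recoded as 1 and “no” as 0, the mean is simply the proportion answering yes. The helper name and the toy answers are hypothetical, and no survey weights are applied.

```python
from statistics import mean


def recode_dichotomous(answers, yes_value="yes"):
    """Recode a yes/no response as 1/0 so that it can be averaged."""
    return [1 if answer == yes_value else 0 for answer in answers]


# Made-up answers, not GSS data.
answers = ["yes", "no", "yes", "yes"]
coded = recode_dichotomous(answers)

share_yes = answers.count("yes") / len(answers)
print(coded, mean(coded), share_yes)
```

With a different coding (say 1 vs. 2, as the GSS “Sex” variable uses) the mean would no longer read directly as a proportion, which is why it pays to check how a variable is coded before interpreting its mean.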

Here’s a quick example of a query:


WORDSUM will output the % in the sample who score 0, 1, 2, etc. out of 10 on the WORDSUM vocabulary test. I wanted to check it against highest education attained, DEGREE. I decided to combine those without high school diplomas, those with high school diplomas, and those with some college into one category, and label it “No College.” Next I combined those with bachelor’s and graduate degrees into one category. Then I controlled for males and females, so it will output the row and column variables twice, once for each control. Finally I constrained the data set to non-Hispanic whites who were surveyed after 1999 to the present (2008 in this survey).
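The same query can be approximated offline. Below is a minimal Python sketch, assuming respondent records are dicts with "sex", "year", "race", "hispanic", "degree" and "wordsum" keys (a hypothetical layout), that DEGREE codes 0-2 are the non-college classes and 3-4 the bachelor’s and graduate classes, and that 1 marks white and non-Hispanic. The toy records are made up and the percentages are unweighted.

```python
from collections import Counter


def degree_group(degree):
    """Collapse DEGREE: codes 0-2 -> "No College", codes 3-4 -> "College"."""
    return "No College" if degree <= 2 else "College"


def wordsum_distribution(respondents, sex):
    """Percent at each WORDSUM score within each degree group.

    Keeps only one sex, non-Hispanic whites, surveyed in 2000 or later.
    """
    counts = {}
    for r in respondents:
        if r["sex"] != sex or r["year"] < 2000:
            continue
        if r["race"] != 1 or r["hispanic"] != 1:
            continue
        counts.setdefault(degree_group(r["degree"]), Counter())[r["wordsum"]] += 1
    return {
        group: {score: 100 * n / sum(c.values()) for score, n in sorted(c.items())}
        for group, c in counts.items()
    }


# Made-up records, not GSS data.
base = {"race": 1, "hispanic": 1}
toy = [
    {**base, "sex": 1, "year": 2004, "degree": 1, "wordsum": 5},
    {**base, "sex": 1, "year": 2006, "degree": 2, "wordsum": 7},
    {**base, "sex": 1, "year": 2008, "degree": 3, "wordsum": 9},
    {**base, "sex": 1, "year": 2008, "degree": 4, "wordsum": 9},
    {**base, "sex": 2, "year": 2008, "degree": 4, "wordsum": 10},  # female: excluded
    {**base, "sex": 1, "year": 1996, "degree": 4, "wordsum": 10},  # before 2000: excluded
]
print(wordsum_distribution(toy, sex=1))
```

Calling the function once with sex=1 and once with sex=2 mirrors what the control variable does in the web interface.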

Here’s the outcome for males:

