The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Gene Expression Blog / Data


There’s a lot of genetic data out there. Many of the Reich lab data are downloadable. Additionally, Martin Sikora gave me a pedigree file with a lot of the ancient genotypes in their recent paper (much appreciated since pulling genotypes out of a lot of big sequence files of varied coverage was going to take some time and care). I merged the two together. But for whatever reason the Reich data set did not include anything from the South Indian samples from the 1000 Genomes. Since I have those, I decided to add a bunch. These are Telugu and Tamil speakers who are neither Brahmins nor scheduled castes and tribes (for those curious, the Velama map pretty well on PCA to the “South Indians” I culled from the 1000 Genomes).

You can download it here. It’s a 200 MB tarball. It’s in plink format. I applied a minor allele frequency filter of 0.05, and got it down to 385,000 SNPs. Please note: these data vary greatly in quality on the individual level. A lot of the ancient samples are missing a lot of positions, so keep that in mind when you analyze them (e.g., if you run PCA some of the dimensions are pretty obviously just ancient samples missing lots of markers in a systematic manner). Finally, there are non-human outgroups in the data. For example, if you run a PC analysis without subsetting, PC 1 will separate Marmoset from humans, with other primates and ancient samples spanning the gap. If you leave in ancient populations, a lot of them are going to be of much lower quality than the run-of-the-mill population.

Below are the samples by population and size. Most of the labels are from the Haak et al. data set. Obviously they’re a little idiosyncratic, but I figure you can figure it out. Please note that the .fam file has population labels in the family ID column. I added them manually where they didn’t exist (e.g., the Willerslev data and the 1000 Genomes did not have them, so I added them where appropriate).

Group N
S_India 136
Yoruba 70
Turkish 56
Spanish 53
Druze 39
Palestinia 38
Han 33
Basque 29
Japanese 29
Sardinian 27
BedouinA 25
French 25
Ulchi 25
Burusho 23
Chukchi 23
Eskimo 22
Russian 22
Tubalar 22
Brahui 21
Mozabite 21
Balochi 20
Biaka 20
Greek 20
Hungarian 20
Makrani 20
Yakut 20
BedouinB 19
Pathan 19
Yukagir 19
Egyptian 18
Kalash 18
Mayan 18
Sindhi 18
Adygei 17
Mandenka 17
Bell_Beaker 15
Unetice 15
Yamnaya 15
Hazara 14
Papuan 14
Pima 14
HungaryGam 13
Orcadian 13
Somali 13
AA 12
Bergamo 12
Icelandic 12
Karitiana 12
LBK_EN 12
Masai 12
Corded_Ware 11
Khomani 11
Nganasan 11
Norwegian 11
Sicilian 11
SwedenSkog 11
Ami 10
Armenian 10
Balkar 10
Belarusian 10
Bougainvil 10
Bulgarian 10
Chuvash 10
Croatian 10
Czech 10
Dai 10
English 10
Estonian 10
Even 10
Georgian 10
Han_NChina 10
Kalmyk 10
Kusunda 10
Lithuanian 10
Mbuti 10
Miao 10
Mixe 10
Mixtec 10
Mordovian 10
North_Osse 10
Selkup 10
She 10
Thai 10
Tu 10
Tujia 10
Tuvinian 10
Uygur 10
Uzbek 10
Yi 10
Zapotec 10
Abkhasian 9
Atayal 9
Chechen 9
Daur 9
Iranian_Je 9
Jordanian 9
Koryak 9
Kyrgyz 9
Lezgin 9
Libyan_Jew 9
Naxi 9
Nogai 9
Oroqen 9
Ukrainian 9
BantuSA 8
Cambodian 8
Cypriot 8
Esan 8
Hezhen 8
Iranian 8
Kinh 8
Kumyk 8
Lahu 8
Lebanese 8
Luhya 8
Luo 8
Maltese 8
Mansi 8
Mende 8
Punjabi 8
Saudi 8
Surui 8
Syrian 8
Tajik_Pomi 8
Tunisian 8
Tuscan 8
Yemenite_J 8
Aleut 7
Algerian 7
Altaian 7
Ashkenazi 7
Bengali 7
Bolivian 7
Ethiopian 7
Finnish 7
French_Sou 7
Georgian_J 7
Karasuk 7
Motala_HG 7
Tunisian_J 7
Turkmen 7
Xibo 7
Albanian 6
BantuKenya 6
Gambian 6
Hungary_Vatya 6
Iraqi_Jew 6
Itelmen 6
Korean 6
Mongola 6
Moroccan_J 6
Saharawi 6
Yemen 6
Afanasievo 5
Armenia_LBA 5
Cochin_Jew 5
GujaratiA 5
GujaratiB 5
GujaratiC 5
GujaratiD 5
Hadza 5
Ju_hoan_No 5
MTurkish_J 5
Quechua 5
Spanish_No 5
Andronovo 4
Kikuyu 4
Piapoco 4
Russia_Iron_Age 4
Scottish 4
Sintashta 4
Spain_EN 4
Spain_MN 4
Tlingit 4
Armenia_MBA 3
Australian 3
Baalberge 3
Benzigerod 3
Datog 3
Dolgan 3
FTurkish_J 3
Hungary_Maros 3
Italy_Remedello 3
Mezhovskaya 3
Sweden_Nordic_BA 3
Athabascan 2
Botocudo 2
Canary_Isl 2
Denmark_Nordic_BA 2
Denmark_Nordic_LN 2
Greenland 2
MiddleDors 2
Nivkh 2
Okunevo 2
Russia_LBA 2
Sweden_Nordic_LN 2
AG2 1
Alberstedt 1
Aleutian 1
Altai 1
Ancient_De 1
Ancient_Ne 1
Birnirk 1
Chimp 1
Clovis 1
Denisovan 1
Denmark_Nordic_LBA 1
Denmark_Nordic_MN_B 1
EBA 1
Esperstedt 1
Germany_BA 1
Gorilla 1
Halberstad 1
hg19ref 1
Hungary_MBA 1
Iceman 1
Italian_So 1
Karelia_HG 1
Karsdorf_L 1
Kazakhstan_Sintashta 1
Kostenki14 1
LaBrana1 1
LateDorset 1
LBKT_EN 1
Lithuania_LBA 1
Loschbour 1
MA1 1
Macaque 1
Marmoset 1
Mezmaiskay 1
Montenegro_Iron_Age 1
Montenegro_LBA 1
Orang 1
RR 1
Saami_WGA 1
Samara_HG 1
Saqqaq 1
Spain_EN_r 1
Starcevo_E 1
Stuttgart 1
Sweden_Battle_Axe 1
Sweden_Battle_AxeNordic_LN 1
Sweden_Iron_Age 1
Thule 1
Ust_Ishim 1
Vindija 1
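
Since the population labels sit in the family ID column of the .fam file, a tally like the one above can be rebuilt with a few lines of Python. This is only a minimal sketch: the file names `merged.fam` and `humans_only.txt` are hypothetical, and the set of primate outgroup labels is my reading of the table, not something the data set documents.

```python
from collections import Counter
from pathlib import Path

# Assumed outgroup labels, read off the population table above.
PRIMATE_OUTGROUPS = {"Chimp", "Gorilla", "Macaque", "Marmoset", "Orang"}


def parse_fam_lines(lines):
    """Return (family_id, individual_id) pairs from plink .fam lines."""
    samples = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            samples.append((fields[0], fields[1]))
    return samples


def population_counts(samples):
    """Count samples per population label (the family ID column)."""
    return Counter(fid for fid, _ in samples)


def drop_populations(samples, exclude=PRIMATE_OUTGROUPS):
    """Keep only samples whose population label is not in `exclude`."""
    return [(fid, iid) for fid, iid in samples if fid not in exclude]


def write_keep_file(samples, out_path):
    """Write 'FID IID' lines, one sample per line."""
    Path(out_path).write_text("".join(f"{fid} {iid}\n" for fid, iid in samples))


if __name__ == "__main__":
    fam_path = Path("merged.fam")  # hypothetical file name
    if fam_path.exists():
        samples = parse_fam_lines(fam_path.read_text().splitlines())
        for label, n in population_counts(samples).most_common():
            print(label, n)
        write_keep_file(drop_populations(samples), "humans_only.txt")
```

The two-column list written at the end is the layout that plink's `--keep` option reads, so it could be used to subset the data before running PCA.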
 
• Category: Science • Tags: Data, Genomics 

I was recently reading Sexual Behavior in the United States: Results from a National Probability Sample of Men and Women Ages 14–94. At N ~ 6,000 it’s a large sample of American sexual behavior around 2010. There was one descriptive result which I thought was interesting, though not surprising. Before the age of 25 it seems that women are more likely to have sex in a given year than men of the same age. After the age of 25 this starts to reverse, and men are more likely to be having sexual intercourse in a given year. The dynamics underlying this phenomenon invite all sorts of speculation, so I’ll leave that to readers. Rather, I offer the graph (data drawn from the paper linked above):

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis, Sex 

A commenter below notes:

Also, in modern society, doesn’t just about everyone reproduce, such that not only is any particular advantage competing against other countervailing pressures as you note, but also that the “less fit” genomes are not removed from the overall population, but rather are added back to the mix? In other words, the less-preferred short males don’t die and have zero kids, they also get married and their genes get thrown back into the pot.

First, let’s not get caught in the assumption that for genes to be disfavored one has to have zero fitness in individuals carrying those genes. If, for example, in a situation of demographic expansion you had individuals who had eight children vs. those who had one child, there would be selection for the traits which were passed by those with eight children in relation to those who had one child. But, it did make me realize I wasn’t intuitively aware of the distribution of number of offspring in the population. I assumed that the median was around two, but that’s about it.

So, I looked at the GSS CHILDS variable for individuals born in 1950 or earlier from the year 2000 on (COHORT and YEAR variables). I also separated out the results by sex. Do not take these results as definitive because the GSS data set is not entirely representative. But, it does give you a general sense.


In hindsight I can’t say I’m surprised that somewhat over 10 percent of middle-aged adults and older currently don’t have any children. That sounds about right. And the proportion of those who were only children also seems plausible.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis 

As someone with mild concerns about dysgenic trends (albeit through a normative lens in which high intelligence and good looks are positive heritable traits), I’m quite heartened that Marissa Mayer is pregnant. Of course she’s batting well below the average of some of her sisters, but you take what you can get in the game of social statistics. Quality over quantity thanks to assortative mating.

This brings me to a follow up of my post from yesterday, People wanted more children in 2000s, but had fewer. A reader was curious about limiting the data set to females. Therefore, I did. The same general pattern seems to apply (the limitations/constraints were the same). The only thing I’ll note is that there were only ~40 women in the data set with graduate degrees in the 1970s who were also asked these particular questions, so take this with a grain of salt.


Realized number of children
Group 1970s 1980s 1990s 2000s
< HS 2.73 3.19 3.02 2.79
HS 2.67 2.91 2.59 2.22
Junior College 3 2.75 2.38 2.06
Bachelor 2.31 2.47 2.11 1.71
Graduate 2.11 2.07 1.89 1.56
< $20 K 2.52 2.89 2.57 2.23
$20-40 K 2.57 2.9 2.46 2.02
$40-80 K 2.91 2.95 2.49 1.99
> $80 K 3.08 2.86 2.35 1.95

Ideal number of children
Group 1970s 1980s 1990s 2000s
< HS 3.08 2.96 2.73 2.85
HS 3.04 2.89 2.61 2.97
Junior College 2.58 2.8 2.95 3.31
Bachelor 3.01 2.95 2.86 3.15
Graduate 2.73 2.52 3.63 3.02
< $20 K 3 2.84 2.79 3.04
$20-40 K 3.04 3.01 2.69 2.96
$40-80 K 3.06 2.83 2.89 3.06
> $80 K 3.13 2.87 2.84 3.06
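
To make the "wanted more, had fewer" pattern concrete, here is a small sketch that subtracts the realized mean from the ideal mean for each education group, using only the 2000s columns of the two tables above. The function and variable names are mine, and the numbers are simply copied from the tables.

```python
# Means for women in the 2000s, copied from the tables above.
REALIZED_2000S = {
    "< HS": 2.79,
    "HS": 2.22,
    "Junior College": 2.06,
    "Bachelor": 1.71,
    "Graduate": 1.56,
}
IDEAL_2000S = {
    "< HS": 2.85,
    "HS": 2.97,
    "Junior College": 3.31,
    "Bachelor": 3.15,
    "Graduate": 3.02,
}


def ideal_minus_realized(ideal, realized):
    """Gap between ideal and realized mean number of children, per group."""
    return {group: round(ideal[group] - realized[group], 2) for group in ideal}


if __name__ == "__main__":
    for group, gap in ideal_minus_realized(IDEAL_2000S, REALIZED_2000S).items():
        print(f"{group}: {gap:+.2f}")
```

The gap widens with education in this decade, which is the same qualitative pattern described above; the small graduate-degree samples still apply as a caveat.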

 

Addendum: Small sample sizes in the “graduate” educated pool. That’s my explanation for the 1990s jump in ideal number of children.

Image credit: Wikimedia

 

 

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Demographics 

Prompted by a comment below I was curious as to the correlation between intelligence and income. To indicate intelligence I used the GSS’s WORDSUM variable, which has a ~0.70 correlation with IQ. For income, I used REALINC, which is indexed to 1986 values (so it is inflation adjusted) and aggregates the household income. Finally, I limited my sample to non-Hispanic whites over the age of 30 (for what it’s worth, this choice also limited the data set to respondents from the year 2000 and later).

The results don’t get at the commenter’s assertions, because 10 out of 10 on WORDSUM does not really imply that you’re that smart. But the trendline is suggestive. Note that I aggregated scores 0-4 because the sample size at the lower values is small indeed.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis, GSS, IQ 

In the further interests of putting quantitative data out there instead of vague impressions, I noticed two GSS variables which might be of interest. One queries the impression of the effect of genetically modified crops on the environment. The second asks about whether science does more harm than good. The latter question exhibited almost no year to year variation of note, so I just threw the years in a pot together. But for the environment and genetically modified crop question I show responses for the years 2000 and 2010. As you can see there is a modest difference in regards to the first, where liberals are more skeptical.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis 

In the comments Chad says:

The Right is not inherently anti-science. Yes there are some morons out there who glorify in their ignorance, but lets recognize them for who they are, extremist idiots. This does not describe the majority of those on the Right. It doesn’t even describe the majority of creationists who are for the most part more concerned with work and children to be bothered to think about the origins of life in an average week. One can also point to similar kooks on the Left. Not just the genetic denialism described here, but also rejection of animal research, genetic engineering, organic farming, anti-vaccinations, etc.

First, I’m going to reiterate something: the majority of the human race consists of individuals who are not very smart. This is not meant as an insult, but it’s basically the truth. We may not be talking about idiots, but the average person on the street cannot come close to reasoning like W. V. O. Quine. But the main issue I have with these equivalences is that, though there is a valid point here, the reality seems to be that the political Right in the USA has taken a bolder anti-science stance than the Left.

And that basically comes down to evolution. If you presuppose that the Left opposes animal research, by and large you will note that the arguments against this are normative. Yes, there are some arguments about the lack of utility and informativeness of this research, but really you are talking about values. In contrast, though some Creationists have made the argument that evolution is about values, you are really talking about a major analytic framework in biology. In fact, evolutionary processes pervade biological phenomena. Rejecting evolution is not in the same league as rejecting Newtonian mechanics, but it is rather close.

Not only that, my perusal of the General Social Survey suggests that the gap between liberals and conservatives on evolution is likely far greater than the gap between them on other scientific topics, with the exception of highly politicized ones such as anthropogenic climate change. Unfortunately I haven’t found information on vaccination, but there are some questions about nuclear power and genetically modified organisms. Compare & contrast.

Liberal Moderate Conservative
Humans developed from animals 69 52 39
Humans developed from animals (non-Hispanic white) 77 55 38
Humans developed from animals (college educated) 86 66 47
Strongly favor nuke power 16 13 12
Favor nuke power 49 50 64
Oppose nuke power 28 27 16
Strongly oppose nuke power 7 9 8
Don’t care whether or not food has been genetically modified 15 20 18
Willing to eat but would prefer unmodified foods 56 53 55
Will not eat genetically modified food 29 27 27

My own prediction is that on something like vaccination & autism you won’t see a major Left-Right difference. Rather, a small subculture on the Left has taken up this cause, and is rather vocal, but it is not a major group conformity marker like evolution vs. creation. The GMO question illustrates here that there isn’t a strong Left-Right difference either. This doesn’t mean that differences don’t exist. But we need more quantitative, and fewer impressionistic, examples.

Note: In the post below I suggested that sex differences are a major area where the Left is far less reality based than the Right. The proportional gap may be large on these topics, but this is not nearly as significant a scientific issue as evolution.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis 

A few weeks ago Robert Wright had a post up, Creationists vs. Evolutionists: An American Story. Here’s the crux:

A few decades ago, Darwinians and creationists had a de facto nonaggression pact: Creationists would let Darwinians reign in biology class, and otherwise Darwinians would leave creationists alone. The deal worked. I went to a public high school in a pretty religious part of the country–south-central Texas–and I don’t remember anyone complaining about sophomores being taught natural selection. It just wasn’t an issue.

A few years ago, such biologists as Richard Dawkins and PZ Myers started violating the nonaggression pact. [Which isn't to say the violation was wholly unprovoked; see my update below.] I don’t just mean they professed atheism–many Darwinians had long done that; I mean they started proselytizing, ridiculing the faithful, and talking as if religion was an inherently pernicious thing. They not only highlighted the previously subdued tension between Darwinism and creationism but depicted Darwinism as the enemy of religion more broadly.

If the only thing this Darwinian assault did was amp up resistance to teaching evolution in public schools, the damage, though regrettable, would be limited. My fear is that the damage is broader–that fundamentalist Christians, upon being maligned by know-it-all Darwinians, are starting to see secular scientists more broadly as the enemy; Darwinians, climate scientists, and stem cell researchers start to seem like a single, menacing blur.

To be generous to Wright this is a hypothesis. I think that it’s probably a wrong hypothesis on the face of it. Anyone who has a passing familiarity with the Creationist scene knows that its roots and origins are deep in the anti-modernist Protestant movements of the turn of the 20th century, though the modern derivations come in different garb (e.g., the Flood Geology of the mid-20th century which was originally promulgated by Seventh Day Adventists, or the Intelligent Design of the 1990s which was designed as a response to unfavorable court rulings). Rather, I think that the “New Atheism” had a broad cultural effect not on the mainstream society, but on the irreligious minority. In many ways I think that the New Atheism is a muscular secularism which is a reaction to the post-modernist relativist ennui which many non-believers, in the United States in particular, suffered from through the 1990s.

P. Z. Myers naturally pointed out that we have a long record of polling on Creationism, and it’s a rather stable trend line. I assume that short-term large fluctuations are spurious until we see more data points to shift our confidence in our prior expectations. But there’s another way we can explore this question.

The “New Atheism” came to the fore in the mid-2000s. The most influential of these books, The God Delusion, came out in 2006. Luckily for us the General Social Survey has a question, TRUSTSCI, which was asked in 1998 and 2008. We can then directly explore whether trust in science vs. religion was impacted by the broadsides against religion by Richard Dawkins et al.


The question is: “We trust too much in science and not enough in religious faith.” The responses are:

- Strongly agree
- Agree
- Not agree or disagree
- Disagree
- Strongly disagree

(note that in the GSS the responses are also coded 1 to 5 in the order above)

 

Trust in science by demographic, 1998 vs. 2008
Demographic Strong Agree Agree Neither Disagree Strong Disagree
1998 9 22 28 29 12
2008 7 25 25 31 12
1998 – Protestant 12 27 27 26 8
1998 – Catholic 4 21 30 35 10
1998 – None 2 5 29 30 33
2008 – Protestant 10 32 25 26 6
2008 – Catholic 3 24 28 34 11
2008 – None 2 9 18 43 28
1998 – Bible is Word of God 18 35 25 18 4
1998 – Bible is Inspired Word of God 5 22 31 33 10
1998 – Bible is Book of Fables 4 3 21 38 34
2008 – Bible is Word of God 15 44 23 15 2
2008 – Bible is Inspired Word of God 3 19 29 38 10
2008 – Bible is Book of Fables 2 9 17 39 33
1998 – Liberal 5 16 25 33 21
1998 – Moderate 9 20 32 27 11
1998 – Conservative 11 28 24 29 9
2008 – Liberal 4 16 19 41 20
2008 – Moderate 7 26 28 29 12
2008 – Conservative 9 29 25 29 8
1998 – Democrat 7 20 29 30 13
1998 – Independent 14 22 34 22 8
1998 – Republican 8 24 26 31 12
2008 – Democrat 6 24 25 30 14
2008 – Independent 5 21 28 36 11
2008 – Republican 8 29 25 29 8
1998 – No College 10 25 29 28 9
1998 – College 4 15 25 33 24
2008 – No College 8 28 26 30 8
2008 – College 3 17 22 34 24
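
One rough way to summarize a row of this table is to collapse it into a single mean on the 1 to 5 coding noted above. The sketch below does that for a few rows copied from the table. Treating the ordinal codes as evenly spaced numbers is my own simplifying assumption, and the function name is hypothetical; higher values mean more disagreement with the statement.

```python
def mean_coded_response(percentages):
    """Mean of responses coded 1..5, weighted by the reported percentages.

    Dividing by the sum of the percentages (rather than 100) absorbs
    rounding in the published figures.
    """
    if len(percentages) != 5:
        raise ValueError("expected five response categories")
    total = sum(percentages)
    weighted = sum(code * pct for code, pct in enumerate(percentages, start=1))
    return weighted / total


# Rows copied from the table above, ordered Strongly agree .. Strongly disagree.
ROWS = {
    "1998 (all)": [9, 22, 28, 29, 12],
    "2008 (all)": [7, 25, 25, 31, 12],
    "1998 - Protestant": [12, 27, 27, 26, 8],
    "2008 - Protestant": [10, 32, 25, 26, 6],
    "1998 - None": [2, 5, 29, 30, 33],
    "2008 - None": [2, 9, 18, 43, 28],
}

if __name__ == "__main__":
    for label, row in ROWS.items():
        print(f"{label}: {mean_coded_response(row):.2f}")
```

A summary like this hides shifts between neighboring categories, so it is best read alongside the full distribution rather than instead of it.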

Your mileage may vary, but I don’t see much difference. What’s the moral here? Before you make a conjecture why not check the relevant social science? Unfortunately, this is not a normal reflex. I recall in the early 2000s having to deal with the media and people talking about the “great American religious awakening” as if our nation was going through a particular time of religious fervor. I was skeptical, because the first results were already coming back from the Religious Identification Surveys. Rather than a religious awakening, the USA was seeing a secular surge which had no parallel after the 1960s. The moral of the story? Vague impressions can mislead.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis 

It is sometimes fashionable to assert that higher socioeconomic status whites are the sort who will impose integration on lower socioeconomic status whites, all the while sequestering themselves away. I assumed this was a rough reflection of reality. But after looking at the General Social Survey I am not sure that this chestnut of cynical wisdom has a basis in fact. Below are the proportions of non-Hispanic whites who have had a black friend or acquaintance over for dinner recently by educational attainment:

35% – Less than high school
36% – High school
47% – Junior College
45% – Bachelor
59% – Graduate

I thought this might have been a fluke, so I played around with the GSS’s multiple regression feature, using a logistic model. To my surprise socioeconomic status was positively associated with having a black person over for dinner, and age negatively associated. These two variables in fact tended to exhibit values of equal magnitude and opposite sign, and always remained statistically significant. Just to be clear, I created a variable Non-South vs. South below (being Southern increases likelihood of having had a black person over for dinner). All the individuals surveyed are non-Hispanic whites for the year 2000 and later. You can add and remove variables, but SEI and age tend to be rather stable, and statistically significant, throughout.


(of course, this could just be a case where some demographics lie more than others)

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Demographics, Race 

A few years ago a book came out, American Taliban: How War, Sex, Sin, and Power Bind Jihadists and the Radical Right. The title clearly was aimed to push copies, but the gist of the title has moderately wide circulation. The rough sketch is that conservative American Protestants are roughly equivalent to conservative Muslims. I have always held that this is a qualitatively misleading analogy. The reason is that, from all I can gather, the social views of mainstream American conservative Protestants are actually in the moderate range of opinion amongst Muslims. But apples-to-apples comparisons are rather difficult in this domain.

But then I realized that the World Values Survey could allow me to do exactly such comparisons. The method is simple. First, you can subsample the data sets, so I could look at Protestants in the United States who identified as political conservatives. I compared these to the views of Muslims in a selection of nations (the WVS doesn’t cover much of the world, and some questions are not asked in some countries).

The results below range from 1, never justifiable, to 10, always justifiable. There is some strangeness in the results below, but they show the general qualitative result: American conservative Protestants are in the main to the center or social liberal end of Muslim public opinion. They are not comparable at all to Muslim reactionaries.

Never justifiable Always justifiable
1 2 3 4 5 6 7 8 9 10 ? -
Justifiable: homosexuality
USA – Conserv Protestants 48% 7% 4% 0% 22% 5% 3% 4% 1% 4% 2% 0%
Iran (Shia) 81% 8% 3% 1% 1% 1% 1% 1% 0% 1% 0% 1%
Jordan (Muslim) 99% 0% 0% 0% 0% 0% 0% 0% 0% 0% 0% 0%
Malaysia (Muslim) 47% 9% 8% 8% 12% 7% 5% 2% 1% 1% 0% 0%
Turkey (Muslim) 72% 9% 5% 5% 5% 1% 1% 1% 0% 1% 2% 1%
Justifiable: abortion
USA – Conserv Protestants 27% 13% 9% 6% 20% 7% 3% 8% 1% 3% 1% 0%
Iran (Shia) 61% 11% 7% 5% 8% 3% 2% 2% 1% 2% 0% 1%
Iraq (Shia) 83% 6% 3% 3% 0% 0% 0% 0% 0% 0% 3% 2%
Iraq (Sunni) 62% 7% 4% 1% 6% 5% 4% 2% 1% 2% 6% 0%
Jordan (Muslim) 93% 1% 2% 1% 1% 1% 0% 0% 0% 0% 1% 0%
Malaysia (Muslim) 49% 11% 8% 6% 11% 9% 5% 2% 1% 1% 0% 0%
Morocco (Muslim) 74% 4% 6% 3% 7% 1% 1% 1% 0% 1% 0% 3%
Turkey 61% 9% 6% 4% 7% 4% 3% 1% 1% 2% 1% 1%
Justifiable: man to beat his wife
USA – Conserv Protestants 86% 6% 1% 0% 2% 0% 0% 0% 0% 0% 2% 0%
Iran (Shia) 74% 10% 5% 3% 2% 1% 1% 1% 1% 1% 2% 0%
Jordan (Muslim) 86% 2% 2% 3% 5% 1% 1% 0% 0% 0% 0% 0%
Malaysia (Muslim) 45% 10% 7% 8% 12% 8% 5% 3% 1% 2% 0% 0%
Morocco (Muslim) 66% 7% 6% 5% 7% 2% 2% 2% 1% 3% 0% 1%
Turkey (Muslim) 78% 11% 5% 2% 1% 0% 1% 0% 0% 1% 0% 0%
(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis 

Update: There was a major coding error. I’ve rerun the analysis. No qualitative change.

As is often the case a 10 minute post using the General Social Survey is getting a lot of attention. Apparently circa 1997 web interfaces are so intimidating to people that extracting a little data goes a long way. Instead of talking and commenting I thought as an exercise I would go further, and also be precise about my methodology so that people could replicate it (hint: this is a chance for readers to follow up and figure something out on their own, instead of tossing out an opinion I don’t care about).

 

Just like below I limited the sample to non-Hispanic whites after the year 2000. Here’s how I did it: YEAR(2000-*), RACE(1), HISPANIC(1)

Next I want to compare income, with 1986 values as a base, with party identification. To increase sample sizes I combined strong partisans, weaker partisans, and leaners on each side into a single Democratic class and a single Republican class; the social science points to the reality that the vast majority of independents who “lean” in one direction are actually usually reliable voters for that party. So I feel no guilt about this. I suppose Americans simply like the conceit of being independent? I know I do. In any case, here are the queries:

For row: REALINC(r:0-20000"LLM";20000-40000"M";40000-80000"UM";80000-*"BU")
For column: PARTY(r:0-2"Dem";3"Ind";4-6"Rep")
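
For anyone working from a raw extract rather than the web interface, the two recodes can be sketched in Python roughly as follows. This is not how the GSS interface implements them. The function names are mine, and two details are assumptions: the ranges in the query share their endpoints, so I treat each boundary value as belonging to the higher class, and I drop party codes outside 0-6.

```python
def income_class(realinc):
    """Map household income in 1986 dollars to the four labels in the row query."""
    if realinc < 0:
        raise ValueError("income cannot be negative")
    if realinc < 20000:
        return "LLM"
    if realinc < 40000:  # assumption: 20000 itself falls in "M"
        return "M"
    if realinc < 80000:  # assumption: 40000 itself falls in "UM"
        return "UM"
    return "BU"


def party_group(code):
    """Collapse a 0-6 party identification code into Dem / Ind / Rep."""
    if code in (0, 1, 2):
        return "Dem"
    if code == 3:
        return "Ind"
    if code in (4, 5, 6):
        return "Rep"
    return None  # assumption: any other code is left out of the table


if __name__ == "__main__":
    # Illustrative inputs only, not survey records.
    for income, code in [(15000, 0), (40000, 3), (95000, 6)]:
        print(income_class(income), party_group(code))
```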

What I’m doing above is combining classes, and also labeling. The GSS has documentation to make sense of it if you care. Some of you were a little confused as to what $80,000 household income in 1986 means. I went and converted 1986 dollars to dollars today.

Value of income conversion
1986 2012
$20,000 $42,000
$40,000 $83,000
$80,000 $166,000

 

As you can see $80,000 in 1986 would be $166,000 today. So what percentile in household income is $166,000? Here it is (I rounded generously, so it is really 43 or 93 and such, instead of 40 or 95):

Income range Quantitative class Descriptive class
Up to $20,000 < 40% Lower & Lower Middle (LLM)
$20,000 to $40,000 40% to 70% Middle (M)
$40,000 to $80,000 70% to 95% Upper Middle (UM)
$80,000 and up > 95% Broad Upper (BU)

To head off future confusion I have relabeled the income ranges with the descriptive classes above. You can argue all you want that being in the top ~5% of income is not upper class, but just pretend I used a different term (e.g., higher middle class?). I’m not too hung up on the terminology, I’m more focused on the people in the top 5% of the income distribution. The local doctor or successful business person, not the billionaire who owns an island in the Caribbean.

Now you have a sense of the classes which we’ll be looking at. In the results below I report the proportions of the row and column values. So the leftmost three columns will tell you the percentage of Democrats who are upper class, while the rightmost three columns will tell you the percentage of upper class people who are Democrats. The leftmost three columns add up to 100% vertically, the rightmost three columns 100% horizontally.

The second major aspect of reading the table below is that I “controlled” for various sets of characteristics. So, for example, you see the income and party identification patterns for those with no college education, and those with college educations. Here are the variables:

DEGREE(r:0-2"No College";3-4"College"), BIBLE, REGION(r:1-4,8-9"Not South";5-7"South"), SEI

Two notes here. First, I used the Census division categories. Second, the “socioeconomic status index” is more than just income, and I created three broad classes, giving you the percentile ranges.


Columns = 100% Rows = 100%
Dem Ind Rep Dem Ind Rep
LLM 42 51 33 40 24 33
M 28 27 28 37 18 45
UM 21 16 27 35 13 53
BU 8 6 11 34 11 55
No College
Dem Ind Rep Dem Ind Rep
LLM 51 55 39 39 26 35
M 29 27 31 36 20 44
UM 16 14 24 31 16 53
BU 4 3 6 28 15 57
College
Dem Ind Rep Dem Ind Rep
LLM 24 28 19 45 13 42
M 25 28 24 42 11 47
UM 32 25 35 40 8 35
BU 19 19 22 38 9 53
Bible is Word of God
Dem Ind Rep Dem Ind Rep
LLM 63 61 39 35 22 44
M 27 28 32 24 16 59
UM 10 10 23 16 10 75
BU 1 2 5 7 11 82
Bible is Inspired of God
Dem Ind Rep Dem Ind Rep
LLM 37 49 28 41 23 36
M 31 29 29 41 16 43
UM 24 17 29 37 11 52
BU 8 5 14 30 8 62
Bible is Book of Fables
Dem Ind Rep Dem Ind Rep
LLM 37 51 28 51 29 28
M 25 22 24 53 20 24
UM 23 20 30 50 17 30
BU 15 7 18 55 11 34
Not the South
Dem Ind Rep Dem Ind Rep
LLM 40 50 28 41 24 35
M 27 27 28 39 18 43
UM 23 17 28 38 13 49
BU 10 6 11 39 12 49
The South
Dem Ind Rep Dem Ind Rep
LLM 47 54 33 39 23 33
M 29 28 29 35 17 48
UM 18 14 27 28 11 60
BU 6 4 12 23 9 67
Bottom 50% of socioeconomic status
Dem Ind Rep Dem Ind Rep
LLM 55 59 44 40 27 33
M 28 27 30 36 23 41
UM 14 11 21 32 17 51
BU 4 3 5 34 17 49
40% to 10% of socioeconomic status
Dem Ind Rep Dem Ind Rep
LLM 34 38 26 40 19 41
M 31 30 29 38 14 47
UM 26 24 31 33 13 54
BU 9 7 15 28 10 62
Top 10% of socioeconomic status
Dem Ind Rep Dem Ind Rep
LLM 18 28 17 43 14 43
M 24 27 24 41 11 48
UM 30 20 34 40 6 53
BU 26 25 25 42 9 48
(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis, Demographics, GSS, Politics 

A few months ago I listened to Frank Newport of Gallup tell Kai Ryssdal of Marketplace that upper class Americans tend to be Democrats. Ryssdal was skeptical, but Newport reiterated himself, and explained that’s just how the numbers shook out. This is important because Newport shows up every now and then to offer up numbers from Gallup to get a pulse of the American nation.

Frankly, Newport was just full of crap. I understand that Thomas Frank wrote an impressionistic book which is highly influential, What’s the Matter with Kansas, while more recently Charles Murray has come out with the argument in Coming Apart that the elites tend toward social liberalism. I’m of the opinion that Frank is just wrong on the face of it, but that’s OK because he’s an impressionistic journalist, and I don’t expect much from that set beyond what I might expect from a sports columnist for ESPN. Murray presents a somewhat different case, as outlined by Andrew Gelman, in that his “upper class” is modulated in a particular manner so as to fall within the purview of his framework. Neither of these qualifications applies to Frank Newport, who is purportedly presenting straightforward unadorned data.

When the “average person on the street” thinks upper class they think first and foremost money. This is not all they think about, but in the rank order of criteria this is certainly first on the list. We can argue till the cows come home as to whether a wealthy small business owner in Iowa who is a college dropout is more or less elite than a college professor in New York City who is bringing home a modest upper middle class income (very modest adjusting for cost of living). But to a first approximation when we look at aggregates we had better look at the bottom line of money. After that we can talk details. And the first approximation is incredibly easy to ascertain. Below is a table and chart which illustrate the proportion of non-Hispanic whites after 2000 who align with a particular party as a function of family income, with family income being indexed to a 1986 value (so presumably $80,000 here means what $80,000 would buy in 1986, not the aughts).

 

Family Income Strong Dem Dem Lean Dem Ind Lean Rep Rep Strong Rep
Less than $20,000 12 15 12 24 9 15 12
$20-$40,000 12 15 10 18 11 19 15
$40-$80,000 11 14 10 13 11 24 18
More than $80,000 12 12 10 11 11 23 21

The results are straightforward: the more income a family has, the more likely they are to be Republican. There is a lot of nuance and geographical detail to be fleshed out in these results. But these facts are where we need to start.

Andrew Gelman has much more as usual. For example, this chart:

 

 

Why do I keep posting this stuff? Because facts matter. That’s my hope, my faith. Tell people facts, and they will open their eyes. Tell your friends, tell your family. Have whatever opinion you want to have, but start with the facts we know. Look up facts, calculate facts, analyze facts. They are there for us, we just need to go look. Google is your friend, Wikipedia is your friend. The General Social Survey is your friend.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis, Demographics 

A questioner below was curious if vocabulary test differences by ethnicity and region persist across income. There’s a problem with this. First, the INCOME variable isn’t very fine-grained (there is a catchall $30,000 or greater category). Second, it doesn’t seem to control for inflation. But, there is a variable, DEGREE, which asks the highest level of education attained. I used this to create a “college” and “non-college” category (i.e., do you have a bachelor’s degree or not). Because of sample size considerations I removed some of the ethnic groups, but replicated the earlier analysis.

Below are two tables. One shows the mean vocab score for region and ethnicity (for whites) for those without college educations, and another shows those with college educations. I decided to generate a correlation over the two rows, even though it sure isn’t useful as a quantitative statistical measure because of the small number of data points. Rather, I just wanted a summary of the qualitative result. The short answer is that the average vocabulary difference seems to persist across educational levels (the exception here is the “German” ethnicity).

Mean WORDSUM Score by Ethnicity and Region

No college education
Ethnicity | Northeast | Midwest | South | West
German | 6.05 | 5.81 | 5.79 | 6.11
Eastern Europe | 6.17 | 6.16 | 6.18 | 6.29
Scandinavian | 6.35 | 5.97 | 6.23 | 6.35
British | 6.6 | 6.21 | 6.02 | 6.57
Irish | 6.66 | 5.83 | 5.69 | 6.58
Italian | 6 | 5.85 | 5.8 | 6.18

College educated
Ethnicity | Northeast | Midwest | South | West
German | 8.03 | 7.48 | 7.63 | 7.33
Eastern Europe | 7.7 | 7.37 | 7.5 | 8.09
Scandinavian | 8.5 | 7.82 | 7.86 | 7.92
British | 8.44 | 8.06 | 7.76 | 7.95
Irish | 8.03 | 7.79 | 7.39 | 7.59
Italian | 7.45 | 7.75 | 7.6 | 7.87

Correlation of college and non-college
Ethnicity | Correlation
German | 0.08
Eastern Europe | 0.92
Scandinavian | 0.57
British | 0.70
Irish | 0.57
Italian | 0.40
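For anyone who wants to check the correlations, here is a minimal sketch. It assumes a plain Pearson correlation across the four regional means for each ethnicity, which is my reading of "a correlation over the two rows"; the function and variable names are invented for the example. With only four points per group the number is a qualitative summary, nothing more.

```python
from math import sqrt

# Mean WORDSUM by region (Northeast, Midwest, South, West), copied from the tables above.
NON_COLLEGE = {
    "German": [6.05, 5.81, 5.79, 6.11],
    "Eastern Europe": [6.17, 6.16, 6.18, 6.29],
    "Scandinavian": [6.35, 5.97, 6.23, 6.35],
    "British": [6.6, 6.21, 6.02, 6.57],
    "Irish": [6.66, 5.83, 5.69, 6.58],
    "Italian": [6.0, 5.85, 5.8, 6.18],
}
COLLEGE = {
    "German": [8.03, 7.48, 7.63, 7.33],
    "Eastern Europe": [7.7, 7.37, 7.5, 8.09],
    "Scandinavian": [8.5, 7.82, 7.86, 7.92],
    "British": [8.44, 8.06, 7.76, 7.95],
    "Irish": [8.03, 7.79, 7.39, 7.59],
    "Italian": [7.45, 7.75, 7.6, 7.87],
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


CORR = {group: pearson(NON_COLLEGE[group], COLLEGE[group]) for group in NON_COLLEGE}

for group, r in CORR.items():
    print(f"{group:>15}: {r:.2f}")
```

The output should land close to the figures in the correlation table, with the German row standing out as the one group where the regional pattern does not carry over.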
(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis, GSS, I.Q., Regionalism 

We’ll be talking about Iran a lot in the near future in the United States. I doubt we’ll invade the country (thank god). But one thing I think needs to be emphasized: on social issues Iran is more “progressive” than many of our close allies in the region, like Saudi Arabia, and one of the more progressive nations in the region. This is neither here nor there in the domain of geopolitics, but to convince a public about something it is often necessary to make a cartoon or caricature of the enemy. I think it is important to remember though that aside from Israel our closest allies in the region are techno-feudal monarchies like Saudi Arabia, not those nations, like Iran, which have made a more thorough accommodation with modernity out of necessity (because oil can’t support the whole economy). It also reminds us that labels like “Islamic Republic” may not be totally useful.

As a gauge of modern outlook, as understood in the West, I poked around the World Values Survey. The results are for wave 4, around 2000. The question asked was: A wife must always obey her husband. Possible answers:
- Agree strongly
- Agree
- Neither agree or disagree
- Disagree
- Strongly disagree

Below are two tables with nations which responded to this question. I stratified by sex and educational level of respondents. The sample sizes are in the “Total” column. The other numbers are percentages, summed along the rows to 100%. There are some surprises, but I’ll let the data speak for itself….

Total Agree strongly Agree Neither Disagree Strongly disagree
Algeria 1252 44 31 15 8 2
Sex Male 635 57 27 11 6 1
Female 617 31 35 20 11 2
Bangladesh 1489 34 49 10 5 2
Sex Male 825 38 49 8 4 1
Female 664 29 49 12 7 3
Indonesia 992 27 52 6 12 3
Sex Male 499 36 50 6 7 1
Female 493 18 55 6 17 5
Iran (Islamic Republic of) 2496 24 28 18 17 12
Sex Male 1343 31 31 17 14 7
Female 1153 16 25 20 21 19
Iraq 2305 64 25 9 0 2
Sex Male 1114 63 27 9 0 2
Female 1191 65 24 10 0 2
Jordan 1219 43 31 7 12 7
Sex Male 593 57 29 5 6 4
Female 626 29 34 10 18 9
Morocco 1012 56 24 13 6 1
Sex Male 496 66 22 10 3 1
Female 516 47 27 17 8 1
Nigeria 2020 83 13 2 1 1
Sex Male 1031 87 10 1 1 1
Female 989 79 17 3 1 1
Pakistan 1975 28 19 20 20 13
Sex Male 1021 34 18 19 17 12
Female 954 22 21 21 23 14
Saudi Arabia 1494 52 30 13 3 2
Sex Male 753 64 26 8 1 0
Female 741 39 33 19 5 4
Turkey 3368 32 42 15 11 0
Sex Male 1706 39 41 13 8 0
Female 1662 25 43 16 15 0
Egypt 3000 47 31 12 10 0
Sex Male 1540 53 29 11 7 0
Female 1460 40 34 14 12 0
Total 22622 44 31 12 9 4
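To make the by-sex table easier to compare across countries, here is a minimal sketch that collapses the two "agree" answers into one share and takes the male-female gap. The handful of countries is just a sample copied from the table, and the collapsing itself is a simplification I'm assuming for illustration.

```python
# Percent giving each answer, copied from the by-sex table above.
# Order: agree strongly, agree, neither, disagree, strongly disagree
BY_SEX = {
    "Iran": {"Male": [31, 31, 17, 14, 7], "Female": [16, 25, 20, 21, 19]},
    "Saudi Arabia": {"Male": [64, 26, 8, 1, 0], "Female": [39, 33, 19, 5, 4]},
    "Jordan": {"Male": [57, 29, 5, 6, 4], "Female": [29, 34, 10, 18, 9]},
    "Pakistan": {"Male": [34, 18, 19, 17, 12], "Female": [22, 21, 21, 23, 14]},
}


def agree_share(row):
    """Share answering 'agree strongly' or 'agree'."""
    return row[0] + row[1]


AGREE = {
    country: {sex: agree_share(row) for sex, row in rows.items()}
    for country, rows in BY_SEX.items()
}

for country, shares in AGREE.items():
    gap = shares["Male"] - shares["Female"]
    print(f"{country:>12}: men {shares['Male']}%, women {shares['Female']}%, gap {gap} points")
```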

Total Agree strongly Agree Neither Disagree Strongly disagree
Algeria 1248 44 31 15 8 2
Education level (recoded) Lower 301 49 31 12 8 0
Middle 544 46 31 16 7 1
Upper 403 39 30 17 10 4
Bangladesh 1476 34 49 9 5 2
Education level (recoded) Lower 789 34 52 8 5 2
Middle 401 37 46 10 5 3
Upper 286 31 49 12 7 2
Indonesia 985 27 53 6 12 2
Education level (recoded) Lower 241 25 58 5 8 3
Middle 411 28 53 5 13 1
Upper 333 29 49 6 13 3
Iran (Islamic Republic of) 2391 24 28 18 17 12
Education level (recoded) Lower 757 36 27 16 12 9
Middle 981 21 31 19 18 11
Upper 653 16 25 19 22 18
Iraq 2288 64 25 9 0 2
Education level (recoded) Lower 1298 67 25 7 0 1
Middle 577 63 24 11 0 3
Upper 413 55 29 13 0 3
Jordan 1217 43 31 7 12 7
Education level (recoded) Lower 587 54 27 6 7 6
Middle 332 36 34 8 15 8
Upper 297 27 38 9 19 8
Morocco 1012 56 24 13 6 1
Education level (recoded) Lower 788 63 24 10 3 0
Middle 160 38 26 22 13 1
Upper 64 22 30 25 14 9
Nigeria 2012 83 13 2 1 1
Education level (recoded) Lower 768 83 13 2 1 1
Middle 774 85 13 1 1 1
Upper 470 80 15 3 2 0
Pakistan 1973 28 19 20 20 13
Education level (recoded) Lower 1078 36 27 12 9 16
Middle 614 20 10 33 31 6
Upper 281 13 11 24 36 17
Saudi Arabia 1494 52 30 13 3 2
Education level (recoded) Lower 135 46 31 13 6 4
Middle 973 52 29 13 3 2
Upper 386 52 30 13 3 2
Turkey 3179 33 43 14 11 0
Education level (recoded) Lower 1975 37 48 9 6 0
Middle 918 29 39 18 14 0
Upper 287 15 23 33 29 0
Egypt 2998 47 31 12 10 0
Education level (recoded) Lower 1516 53 32 9 7 0
Middle 927 43 31 15 11 0
Upper 555 38 30 18 15 0
Total 22272 45 31 12 9 4
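Another way to read the education table is to turn each row into a single average on a 1 to 5 scale (1 = agree strongly, 5 = strongly disagree). Treating the answers as evenly spaced numbers is an assumption, and a crude one, but it makes the gradient within a country easy to see. The three countries below are a sample copied from the table; the names are invented for the sketch.

```python
# Percent giving each answer by education level, copied from the table above.
# Order: agree strongly, agree, neither, disagree, strongly disagree
BY_EDUCATION = {
    "Iran": {"Lower": [36, 27, 16, 12, 9], "Upper": [16, 25, 19, 22, 18]},
    "Saudi Arabia": {"Lower": [46, 31, 13, 6, 4], "Upper": [52, 30, 13, 3, 2]},
    "Turkey": {"Lower": [37, 48, 9, 6, 0], "Upper": [15, 23, 33, 29, 0]},
}


def mean_score(row):
    """Average answer on a 1-5 scale, weighting each answer by its percentage.

    Dividing by the row total (not a fixed 100) absorbs rounding in the table.
    """
    weighted = sum(score * pct for score, pct in enumerate(row, start=1))
    return weighted / sum(row)


SCORES = {
    country: {level: round(mean_score(row), 2) for level, row in rows.items()}
    for country, rows in BY_EDUCATION.items()
}

for country, levels in SCORES.items():
    print(f"{country:>12}: lower {levels['Lower']:.2f}, upper {levels['Upper']:.2f}")
```

A higher score means more disagreement with the statement, so a large jump from "Lower" to "Upper" means education is doing a lot of work in that country.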
(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis 

In the early 2000s I recall Joel Grus telling me how reality television would become a pretty powerful exploratory tool for social science. I’m not quite sure of that now (though here’s a game-theoretic analysis of Survivor!). For example, consider The Bachelor and The Bachelorette. If you watched these series you might think that we’re still living in the same country where an episode of Star Trek was not shown in the South because of an interracial kiss. In some ways “appointment television” has become a lagging indicator.

Rather, it looks like firms whose bread & butter is “the social web” are where the gold in social science is. Consider the OkTrends blog, which is affiliated with and has access to OkCupid. These companies have sample sizes not in the thousands, but in the millions! The Financial Times has a fascinating piece on the “secret sauce” of Match.com, Inside Match.com: It’s all about the algorithm:

With the number of paying subscribers using Match approaching 1.8 million, the company has had to develop ever more sophisticated programs to manage, sort and pair the world’s singles. Central to this effort has been the development, over the past two years, of an improved matchmaking algorithm….

“People are complex. You’re constantly making trade-offs about who’s too tall, too short, too smart and too dumb. People come in and tell us a bit about what they’re looking for. But what you say and what you do can be different.”

Academics call this “dissonance”. “It’s a theme that runs through social psychological literature,” says Andrew Fiore, a visiting assistant professor at Michigan State University, who works on computer-mediated communication. “We don’t know ourselves very well on a descriptive level.”

This is all great, but it falls into the category of generic platitudes about avowed and revealed preferences. Most people believe in fidelity, but a subset of these people cheat. The juicy stuff is in the specific patterns Match.com is finding:

As a result, Match began “weighting” variables differently, according to how users behaved. For example, if conservative users were actually looking at profiles of liberals, the algorithm would learn from that and recommend more liberal users to them. Indeed, says Thombre, “the politics one is quite interesting. Conservatives are far more open to reaching out to someone with a different point of view than a liberal is.” That is, when it comes to looking for love, conservatives are more open-minded than liberals.

This is intuitively surprising, but put in more scientific terms it is rather strange, because one of the psychological underpinnings for why someone is more likely to be liberal than conservative is “openness to experience.” But there’s openness, and then there’s openness. I suppose one could suggest that the aversion to political conservatives amongst liberals in Match.com’s data set might have to do with the fact that liberals feel like they know what they’re getting when they date a political conservative, and it doesn’t tingle their novelty seeking tendencies.

That being said, my personal experience growing up as an adolescent in an overwhelmingly politically conservative milieu (the Intermontane West) and spending most of my adulthood in very liberal cities (e.g., Portland, Oregon, and Berkeley, California) is that the stereotyping and intolerance I’ve experienced as a libertarian-conservative atheist had more to do with my irreligiosity in the former context and my politics in the latter. One might suggest then that the appropriate analog for the Christian nationalist religious identity of many conservatives amongst secular liberals is the set of political positions which they espouse. Both signal virtue and righteousness, even if the details differ.

Though one should be careful of taking one glimpse into Match.com’s data set too seriously. Context matters, and I don’t know if there’s selection bias here (one suspects that eHarmony has a more conservative clientele, so right-wingers using Match.com might be more adventuresome by nature). Unfortunately I doubt that those outside of these firms will have much access to all their delicious information, but people leave companies. I recall a friend telling me that he overheard some Facebook employees batting around how to predict when you were about to unfriend someone a few years back.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Culture, Data, Data Analysis, Dating 

Bhutan famously espouses “gross national happiness”:

The term “gross national happiness” was coined in 1972 by Bhutan’s former King Jigme Singye Wangchuck, who has opened Bhutan to the age of modernization, soon after the demise of his father, King Jigme Dorji Wangchuk. He used the phrase to signal his commitment to building an economy that would serve Bhutan’s unique culture based on Buddhist spiritual values….

Apparently the nation has recently switched from absolute to constitutional monarchy:

Bhutan’s political system has developed from an absolute monarchy into a constitutional monarchy. In 1999, the fourth king of Bhutan created a body called the Lhengye Zhungtshog (Council of Ministers). The Druk Gyalpo (King of Druk Yul) is head of state. Executive power is exercised by the Lhengye Zhungtshog, the council of ministers. Legislative power was vested in both the government and the former Grand National Assembly.

On the 17th of December 2005, the 4th King, Jigme Singye Wangchuck, announced to a stunned nation that the first general elections would be held in 2008, and that he would abdicate the throne in favor of his eldest son, the crown prince….

From what I can tell the royal house of Bhutan seems genuinely sincere. More plainly paternalist than deviously despotic.

Below are some Google Data trend lines comparing Bhutan to some of its smaller South Asian neighbors, as well as Sweden and Equatorial Guinea as comparisons at the high and low ends.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Economics, Science • Tags: Data, Data Analysis 

Yesterday I made an admission of my lack of trust after the 2008 financial crisis. I should have been more precise and clarified that my collapse in trust has been particularly aimed at elites and “experts.” In any case, I realized that the General Social Survey has 2010 results available. This means that I could check any changes in public trust and confidence from 2008 to 2010! Below in the set of charts there is one that assesses trust in banks and financial institutions. The direction of change validates my specific implication. But it seems my intuition that American society had slouched toward more general distrust was wrong. This makes me less pessimistic about the direction of our culture and the future rationally (I can’t say that my visceral emotional cynicism has been abolished).

As you can see there wasn’t much change between 2008 and 2010. For the broad question of “can you trust people” I also decided to break it down by political ideology, education, and intelligence in two ranges of years, 1972-1991 and 1992-2010. There are noticeable differences in intelligence and education (less intelligent and less educated people are more distrustful), but not in terms of ideology.

After the bar plots there are another range of line graphs by year showing confidence in a range of institutions (including finance) from 1972 to 2010. It is interesting how much you can see short term volatility due to world events, which quickly recedes back toward the trend line.





(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis, Finance, General Social Survey, GSS, Trust 

Poking around Google Data Explorer I reacquainted myself with an interesting fact: though the teen birth rate in Bangladesh is greater than that in Pakistan, the total fertility rate is far lower. The disjunction has emerged over the last generation, as Bangladesh’s TFR has dropped much faster than Pakistan’s. To the left you see a scatter plot, which shows teen fertility rates (age 15-19) as a function of total fertility rates. I’ve labeled a few nations, and also added the color coding by region. It is notable that the nations above the trend line seem to be Latin American, while those below are disproportionately Middle Eastern. That means that Latin American nations have higher teen fertility in relation to their total fertility, while Middle Eastern nations have lower teen fertility in relation to their total fertility. Sweden actually has a rather high fertility rate in relation to its teen birth rate. The expectation is generated by worldwide patterns, so I thought I’d look more closely at the original data sets from the World Bank. All the data is from 2008. The teen birth rates are per 1,000 teens in the age range, while TFRs are per woman.

My contention is this: those nations with high overall fertility despite low teen fertility rates indicate an ideological or operational pro-natalist cultural stance. That means that mature adult women in marriages are presumably having many children. The high teen fertility rate in Bangladesh vis-a-vis Pakistan is probably simply due to lower aggregate development (Pakistan is still higher up on the HDI ranking, though the gap is closing).

Below are some charts. First, a plot with lines of best fit (as generated by R’s loess function). Then, absolute deviations from the line of best fit as a function of fertility. Also, percentage deviations from the line of best fit as a function of fertility. I provide the weighted trend line, but rely on the unweighted fit for the rest of the charts.


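[Charts: teen birth rate vs. TFR with lines of best fit; absolute and percentage deviations from the fit as a function of fertility]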

Next, let’s compare percentage and absolute deviation from the trend line on the same plot.

[Chart: percentage vs. absolute deviation from the trend line]

Finally, a table with the “top 15.”


Country Teen births/1,000 TFR Population Deviation % Deviation
Top 15 absolute deviation above the trend line
Nicaragua 112.09 2.72 5667325 61.11 54.52
Dominican Republic 108.18 2.65 9952711 58.93 54.48
Brazil 75.07 1.88 191971506 45.83 61.05
Nepal 98.51 2.9 28809526 44 44.67
Venezuela 89.67 2.54 27935000 43.28 48.27
Cape Verde 93.36 2.73 498672 42.15 45.14
El Salvador 82.22 2.32 6133910 41.28 50.2
Ecuador 82.6 2.56 13481424 35.69 43.21
Costa Rica 66.9 1.96 4519126 35.47 53.02
Honduras 92.26 3.26 7318789 35.47 38.44
Panama 81.95 2.55 3398823 35.3 43.07
Jamaica 76.62 2.39 2687200 33.94 44.3
Gabon 88.6 3.31 1448159 31.68 35.76
Colombia 73.75 2.43 45012096 30.13 40.85
Mexico 64.33 2.1 106350433.7 29.21 45.4
Top 15 percentage deviation above the trend line
Brazil 75.07 1.88 191971506 45.83 61.05
Cuba 45.36 1.51 11204735 26.13 57.6
Bulgaria 41.6 1.48 7623395 23.18 55.71
Nicaragua 112.09 2.72 5667325 61.11 54.52
Dominican Republic 108.18 2.65 9952711 58.93 54.48
Barbados 42.75 1.53 255203 22.97 53.74
Costa Rica 66.9 1.96 4519126 35.47 53.02
Georgia 44.34 1.58 4307011 23.2 52.33
Romania 30.68 1.35 21513622 15.69 51.14
El Salvador 82.22 2.32 6133910 41.28 50.2
Puerto Rico 52.72 1.8 3954553 25.63 48.61
Chile 59.42 1.93 16803952 28.81 48.49
Venezuela 89.67 2.54 27935000 43.28 48.27
Mauritius 39.77 1.58 1268854 18.63 46.86
Uruguay 60.86 2.01 3334052 28.08 46.14
Top 15 absolute deviation below the trend line
Libya 3.11 2.7 6294181 -47.39 -1523.69
Oman 10.39 3.05 2785361 -45.54 -438.3
Israel 14.15 2.96 7308800 -41.07 -290.26
Djibouti 22.51 3.9 849245 -39.33 -174.7
Samoa 26.77 3.95 178869 -36.17 -135.12
Algeria 7.25 2.36 34373426 -34.7 -478.63
Malaysia 12.66 2.56 27014337 -34.25 -270.55
Uzbekistan 12.83 2.56 27313700 -34.08 -265.64
Micronesia 24.67 3.57 110414 -33.15 -134.39
Jordan 24.33 3.49 5812000 -33.11 -136.09
Saudi Arabia 25.81 3.12 24807000 -30.5 -118.18
Tajikistan 28.07 3.41 6836083 -29.1 -103.68
Qatar 15.81 2.41 1280862 -27.34 -172.93
Tunisia 6.88 2.06 10327800 -27.21 -395.56
France 6.76 2 62277432 -25.75 -380.91
Top 15 percentage deviation below the trend line
Country Teen births/1,000 TFR Population Deviation % Deviation
Libya 3.11 2.7 6294181 -47.39 -1523.69
Algeria 7.25 2.36 34373426 -34.7 -478.63
Oman 10.39 3.05 2785361 -45.54 -438.3
Denmark 5.92 1.89 5493621 -23.59 -398.55
Tunisia 6.88 2.06 10327800 -27.21 -395.56
France 6.76 2 62277432 -25.75 -380.91
Slovenia 4.84 1.53 2021316 -14.94 -308.59
Sweden 7.58 1.91 9219637 -22.48 -296.58
Israel 14.15 2.96 7308800 -41.07 -290.26
Norway 8.39 1.96 4768212 -23.04 -274.57
Malaysia 12.66 2.56 27014337 -34.25 -270.55
Uzbekistan 12.83 2.56 27313700 -34.08 -265.64
Belgium 7.6 1.82 10708433 -20.02 -263.45
Italy 4.8 1.41 59832179 -11.76 -245
Switzerland 5.44 1.48 7647675 -12.98 -238.69
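A note on how the two deviation columns relate. Judging from the numbers, the % Deviation column appears to be the absolute deviation divided by the observed teen birth rate (not by the fitted value), and the fitted value can be backed out as observed minus deviation. The sketch below checks that reading against a few rows copied from the table; the function names are invented for illustration, and small mismatches are expected because the table is rounded.

```python
# (teen births per 1,000, absolute deviation from the trend line, % deviation),
# copied from the table above.
ROWS = {
    "Nicaragua": (112.09, 61.11, 54.52),
    "Brazil": (75.07, 45.83, 61.05),
    "Libya": (3.11, -47.39, -1523.69),
    "France": (6.76, -25.75, -380.91),
}


def fitted_value(observed, deviation):
    """Teen birth rate the trend line predicts, assuming deviation = observed - fitted."""
    return observed - deviation


def pct_deviation(observed, deviation):
    """Deviation as a percentage of the observed rate (my reading of the column)."""
    return 100 * deviation / observed


for country, (observed, deviation, reported_pct) in ROWS.items():
    print(
        f"{country:>10}: fitted {fitted_value(observed, deviation):6.2f}, "
        f"computed {pct_deviation(observed, deviation):9.2f}%, "
        f"reported {reported_pct:9.2f}%"
    )
```

This is why the percentages below the trend line look so extreme: a country like Libya has a tiny observed rate in the denominator, so even a moderate absolute shortfall turns into a four-digit percentage. Nicaragua and Libya have nearly the same TFR and nearly the same fitted value, but land at opposite ends of the table.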

To restate: my assertion is that nations with a high TFR despite low birth rates in the 15-19 age range indicate a realized preference for large families. This seems to be the class that Israel, Rwanda, and many Middle Eastern nations fall into. Some European nations, such as France, have a higher TFR in relation to what their teen birth rates would predict. This is partly just a function of very low teen birth rates. But in the case of France it is probably a function of moderate pro-natalism.

In the other class you have many Latin American nations, whose fertility is modest, but teen birth rates are very high. I think this is probably a symptom of demographic structure within the population. There’s a lot of inequality and variation in economics and cultures within the societies. I think this is why a very low TFR country such as Romania shows up: the Roma minority has a high teen birth rate. They are not numerous enough to change the average TFR much, but have shifted the teen birth rates.

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Data, Data Analysis, Fertility 

    The maps above juxtapose the counties which shifted Republican in the 2008 presidential election vs. 2004 (reddish) and the age-adjusted estimated rates of obesity by county in 2007 (darker blue). One issue which I haven’t seen explored too much is the two faces of Appalachia; the Atlantic facing counties are generally healthier than the lowland counties to their east, even controlling for race. In contrast, the west facing counties have some of the lowest human development indices in the United States. West Virginia is the fattest state. And it seems purely from inspection that the east facing counties of Appalachia which shifted toward the Republicans in 2008 are also amongst the fattest in the nation.


    Rush Limbaugh, fat again

    Is this simply a coincidence? A reader queried me about the relationship between politics and weight, wondering about correlations. I don’t follow politics too closely, but apparently there has been some conflict recently involving conservatives who oppose the top-down campaign against obesity spearheaded by our cultural and political elites. My perception, which may be wrong, is that some are portraying this as another liberal culture war. To some extent this is dumb, as it seems that the biggest salient predictor of weight is class. The majority of American adults are overweight according to BMI thresholds, and a significant minority are obese. And yet none of the presidential and vice presidential candidates in 2008, or their spouses, were overweight. Take a look at the candidates during the Democratic and Republican debates in 2008, and you can see that they don’t “look like America.” Despite the efforts of NAAFA this is one way that Americans are not too keen on the candidates reflecting themselves. Rather, it seems that Americans were more accepting of fat heads of state when they were a slimmer folk.

    Looking in the GSS there’s one variable which might shed light on the question of politics and weight, INTRWGHT. This is basically an interviewer assessment of the weight of the respondent. It was collected in 2004. I limited the sample to non-Hispanic whites to eliminate population stratification.


    Weight distribution within each ideology (columns sum to 100%)
    Interviewer-rated weight | Liberal | Moderate | Conservative
    Below Average | 7.2 | 6 | 6.4
    Average | 71.8 | 73.2 | 70.9
    Somewhat Above Average | 18.3 | 17.1 | 18.7
    Considerably Above Average | 2.7 | 3.7 | 3

    Ideological makeup of each weight category (rows sum to 100%)
    Interviewer-rated weight | Liberal | Moderate | Conservative
    Below Average | 27.9 | 32.5 | 39.5
    Average | 24.9 | 35.7 | 39.4
    Somewhat Above Average | 24.7 | 32.5 | 42.7
    Considerably Above Average | 21.3 | 41.1 | 37.6

    The first set of numbers sums to 100% down the columns, and the second set sums to 100% across the rows. I don’t see a notable difference between liberals and conservatives. The only exception might be that liberals are more well represented among those who are below average in weight than those who are considerably above average, but the samples are small enough that I don’t trust that to be anything more than measurement error.
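A quick way to confirm which direction each set of numbers is normalized in is simply to add up the rows and the columns. Here is a minimal sketch using the percentages copied from the two tables; the variable and function names are invented for the example.

```python
# Interviewer-rated weight by ideology, copied from the two tables above.
# Column order in each row: Liberal, Moderate, Conservative
FIRST_TABLE = {
    "Below Average": [7.2, 6, 6.4],
    "Average": [71.8, 73.2, 70.9],
    "Somewhat Above Average": [18.3, 17.1, 18.7],
    "Considerably Above Average": [2.7, 3.7, 3],
}
SECOND_TABLE = {
    "Below Average": [27.9, 32.5, 39.5],
    "Average": [24.9, 35.7, 39.4],
    "Somewhat Above Average": [24.7, 32.5, 42.7],
    "Considerably Above Average": [21.3, 41.1, 37.6],
}


def row_sums(table):
    """Total across ideologies for each weight category."""
    return [round(sum(row), 1) for row in table.values()]


def column_sums(table):
    """Total across weight categories for each ideology."""
    return [round(sum(col), 1) for col in zip(*table.values())]


print("first table, column sums: ", column_sums(FIRST_TABLE))
print("first table, row sums:    ", row_sums(FIRST_TABLE))
print("second table, column sums:", column_sums(SECOND_TABLE))
print("second table, row sums:   ", row_sums(SECOND_TABLE))
```

Whichever direction comes out at roughly 100 (allowing for rounding) is the direction the percentages were computed in: the first table describes the weight distribution within each ideology, and the second describes the ideological makeup of each weight category.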

    There is another variable in regards to weight which I think is interesting: GENENVO1. The respondents were given this scenario: “Carol is a substantially overweight White woman. She has lost weight in the past but always gains it back again.” Then they were asked to rate the proportion of the outcome which could be attributed to genes. The means were as follows:

    Liberals: 54% environmental
    Moderates: 56% environmental
    Conservatives: 61% environmental

    I was a little dubious about this result, since it goes against stereotype. So I checked the other similar questions.

    “George is a Black man who’s a good all-around athlete. He was on the high school varsity swim team and still works out five times a week.”

    Liberals: 54% environmental
    Moderates: 54% environmental
    Conservatives: 59% environmental

    “Felicia is a very kind Hispanic woman. She never has anything bad to say about anybody, and can be counted on to help others.”

    Liberals: 54% environmental
    Moderates: 58% environmental
    Conservatives: 60% environmental

    “David is an Asian man who drinks enough alcohol to become drunk several times a week. Often he can’t remember what happened during these drinking episodes.”

    Liberals: 55% environmental
    Moderates: 56% environmental
    Conservatives: 58% environmental

    The differences are small, but consistent. It could be incorrect coding, and I don’t know how it relates to the current perceived polarization on the issue of weight. My own suspicion is that this is more a creation of the media than anything else, but I am going to look at correlations on the county level data next. But at this point I doubt there’s a culture war around fat. Being fat may not be immoral, but most people would rather be slim. Though how we get there is a matter of some contention naturally.

    (Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Culture, Data, Data Analysis 

On occasion I get queries about what distinguishes people with science backgrounds from those who don’t have science backgrounds. I think an anecdote might illustrate the type of difference one is expecting. Back in undergrad I was having lunch with my lab partner, when a friend saw us and decided to chat with us as we ate. This friend is now an academic, and has a doctorate in a humanistic field (something like Comparative Literature, I forget). In any case, she had read something about transgenic organisms, and obviously felt as if it was the time and place to go on a rant about this. She knew that I was totally comfortable with the idea of transgenic organisms, but she recounted the fish-genes-in-tomato patent story to my lab partner to illustrate how gross the outcome could be. My lab partner was a pre-med math major, and she just shrugged and explained that she’d done biomedical research last summer, so she understood the practical necessity of such methods, and admitted that it would take more than a story about “fish genes” in a tomato to freak her out.

Kevin Drum’s post about the lack of Republican scientists makes me want to revisit the issue of science vs. non-science. I think the lack of Republican scientists is pretty straightforward. There’s the clear cultural gap, as the Republican party emphasizes its conservative Christian component, which turns off libertarian-leaning but secular scientists. And, there’s the reality that agencies like the NSF and NIH are often attacked by fiscal conservatives, and many scientists in academia and government depend on this funding. Sarah Palin’s attack on “fruit fly” research combined the two threads neatly and unfortunately.

In any case, there is a major related variable in the GSS, MAJORCOL. The sample sizes are not the best, but at least it was a recently asked demographic variable, 2006 and 2008. I decided to look at three sets, those with “natural science” degrees, those with “cs & engineering” degrees, and the total pot (inclusive of the first two classes). The last is a snapshot of all those with at least a college degree (the sample is restricted to those who completed their degree).

In the tables below each cell gives a percentage of the row in the column class. So in the first table 79% of CS & engineering degree holders are male. 22% of CS & engineering degree holders are Roman Catholic.

Basic Demographics
Degree | Male | Race: White | Race: Black | Race: Other | Religion: Protestant | Religion: Catholic | Religion: No Religion
Natural Science | 57 | 80 | 5 | 15 | 39 | 24 | 29
CS & Engineering | 79 | 79 | 3 | 18 | 50 | 22 | 18
All Degree Holders | 43 | 86 | 6 | 8 | 44 | 27 | 17

Degree | Ideology: Liberal | Ideology: Moderate | Ideology: Conserv | Party: Dem | Party: Ind | Party: Rep | Yes – Abortion on Demand | 2004 Vote: Bush
Natural Science | 43 | 27 | 30 | 47 | 16 | 37 | 70 | 43
CS & Engineering | 30 | 27 | 43 | 37 | 13 | 50 | 54 | 58
All Degree Holders | 33 | 29 | 38 | 48 | 10 | 42 | 52 | 52

Degree | Bible is: Word of God | Bible is: Inspired | Bible is: Book of Fables | Humans evolved: Yes | GMO food: Not concerned | GMO food: Won’t eat | Atheist & Agnostic | Know God Exists
Natural Science | 18 | 36 | 44 | 81 | 30 | 4 | 23 | 35
CS & Engineering | 11 | 64 | 24 | 75 | 30 | 8 | 16 | 48
All Degree Holders | 16 | 59 | 23 | 64 | 17 | 27 | 10 | 51

Verbal intelligence (WORDSUM vocab test score)
Degree | Dull (0-5) | Not dull (6-8) | Smart (9-10)
Natural Science | 8 | 70 | 22
CS & Engineering | 20 | 66 | 14
All Degree Holders | 20 | 57 | 24

I assume no one is too surprised by these results. Here’s the code for the Majors:

MAJORCOL(r: 8,11,24,33,41,51 "Natural Science"; 14,18 "CS and Engineering"; 1-98 "Full Sample")

I counted biology, chemistry, geology, physics and mathematics as natural sciences. Math is probably a stretch. Computer science and engineering were obviously in the second category. Obviously there’s more you could do. For example, 49% of males with natural science degrees voted for George W. Bush in 2004, while 60% of those with cs & engineering degrees did. The total sample for males was 57% for Bush.
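If you have the raw GSS extract and want to reproduce the grouping outside the web interface, here is a minimal sketch of the same recode in Python. The numeric codes are the ones in the recode line above; the function name is made up, and letting a code belong to both its own class and the full sample is my assumption, based on the description of the total pot as inclusive of the first two classes.

```python
# MAJORCOL codes taken from the recode line above.
NATURAL_SCIENCE = {8, 11, 24, 33, 41, 51}
CS_ENGINEERING = {14, 18}
FULL_SAMPLE = range(1, 99)  # codes 1 through 98 inclusive


def major_groups(code):
    """Return every group a MAJORCOL code falls into (possibly none)."""
    groups = set()
    if code in NATURAL_SCIENCE:
        groups.add("Natural Science")
    if code in CS_ENGINEERING:
        groups.add("CS and Engineering")
    if code in FULL_SAMPLE:
        groups.add("Full Sample")  # assumption: overlaps with the two classes above
    return groups


for example_code in (8, 14, 50, 99):
    print(example_code, sorted(major_groups(example_code)))
```

Codes outside 1-98 come back with no group at all, so they would simply drop out of the tables.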

Many of the sample sizes are small, but they align with our intuition. Which perhaps makes them less than interesting….

(Republished from Discover/GNXP by permission of author or representative)
 
• Category: Science • Tags: Analysis, Data, Data Analysis, GSS 
Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at http://www.razib.com"