 Gene Expression Blog

At a reader’s suggestion I got Explaining Postmodernism: Skepticism and Socialism from Rousseau to Foucault. Unlike The Dialectical Imagination this is not necessarily a detached academic book. Rather, the author has a definite perspective. About 20 years ago I read George H. Smith’s Atheism: The Case Against God, and there are a lot of similarities between the two books. From that I suspected, before doing some research, that the author had some influence from Objectivism, and that seems correct. I don’t care too much, and I probably broadly share the author’s libertarian/classical liberal politics, but the middle sections of Explaining Postmodernism got a little preachy for my taste.

Nevertheless the first half especially is excellent, and it outlines a genealogy of the Postmodern movement very concisely and in an illuminating manner. There are some details where one might quibble (e.g., the relationship of Immanuel Kant to religion is much more in dispute than presented in Explaining Postmodernism, though that’s a minor objection). But the progression of Transcendental Realism down to the morass we see around us today is a familiar story told more crisply here than elsewhere.

The author does outline a sociological origin for Postmodernism which is very intriguing, as it explains why it is an overwhelmingly far Left movement, despite the implications toward extreme relativism and subjectivism one might infer from the worldview. But there is something that is omitted here, perhaps because the author was not aware of this fact: there is a relationship of Postmodernism to elite intellectual religious revival in the past few decades.

Alister McGrath is a proponent of this school, and outlines his thesis in The Twilight of Atheism: The Rise and Fall of Disbelief in the Modern World. To condense McGrath’s argument, if everything is a superstition, then people will fall back on the tried and true superstitions. McGrath asserts that the collapse of the authority of rationality and philosophical realism signals the end of the Enlightenment project, and undermines the secular world. As an empirical matter there is a lot one can quibble with here; in particular, though Postmodernism may be vigorous in the academy, science and technology are concrete witness to the power of naive realism and rationalist presuppositions. But McGrath’s position is philosophically cogent. And, it is not well known, but the doyen of modern Intelligent Design, Phillip E. Johnson, has admitted that he was strongly shaped by Critical Theory:

I used to refer jokingly to myself as the entire right wing of the Critical Legal Studies movement, which in their view was a contradiction in terms. Their critique was purely the instrument of a left-wing political program, which was chosen arbitrarily and presumed to be good. It was a faith commitment.

So I’m reading a lot of things on Postmodernism. I’m going back to Hegel. On the one hand this is a major opportunity cost. There’s a lot of science and programming stuff I want to read that I don’t have time to read. On the other hand, much that I was suspecting becomes very clear. Ultimately I’m respecting Postmodernism more insofar as it is, from what I can see, a reasonable tool for destroying certainty and realism for those who are innumerate. If you have any understanding of statistics you’re pretty insulated from Postmodernism. But very few people think statistically. I also now believe it is ultimately far more dangerous than I did before, because it is a serious intellectual tool that can deconstruct much that is precious and rare in the world. In particular, the Enlightenment project, which in many ways goes against the world historical grain, or at least the grain of human intuitions and preferences. Postmodernism as an intellectual exercise is easy to dismiss. But as a tool of political battle it has to be taken seriously.

A Single Migration From Africa Populated the World, Studies Find. I will blog these papers. It’s a question of time. Please don’t pester me on non-open threads about this stuff.

Widespread allelic heterogeneity in complex traits.

Gunman kills Jordanian Christian writer charged over anti-Islam cartoon. The title is misleading, as the cartoon was making fun of jihadists, not Islam. He posted the cartoon on Facebook.

Chinese Jews of Ancient Lineage Huddle Under Pressure. Most of the Kaifeng Jewish community was absorbed into the Han, though some became Muslim.

Academic Research in the 21st Century: Maintaining Scientific Integrity in a Climate of Perverse Incentives and Hypercompetition.

Finally, Sean, don't post anything on sexual selection on this thread. I won't post the comments.

Airports are an interesting window into architecture and perceptions of the future. When I landed at Vienna International in 2010 it was as if I landed back in the 1970s. In contrast, Frankfurt Airport was the closest I’ve felt to really being pushed into the “gleaming future” you sometimes see in science-fiction films.

With that in mind, lately I’ve been thinking that for some reason the airport at Detroit reminds me of what the future was going to be like in my childhood of the 1980s.

There was a time, five years ago or so, when we knew all the humans who had been sequenced. Or at least most of them. But now we’re coming into the period when the first sequenced animals of any given species are starting to die. Above is Cinnamon, the first sequenced cat, who is no longer with us. And some day the hour will come when Craig Venter, who was a major contributor to the first human genome, will no longer be with us.

Something to consider.

The Estonian Biocentre has been one of the best resources in human population genomics, because their policy under Mait Metspalu seems to be to release the data once it’s published. Today I went and checked the site, and noticed a vlog accompanying their Nature paper, Genomic analyses inform on migration events during the peopling of Eurasia.

Well done.

I don’t mean to be an Ewen Callaway clipping service (though there are worse things to be), but today he has a piece up on ancient feline DNA and what it might imply for the distribution and spread of cats, How cats conquered the world (and a few Viking ships). My dissertation project is no longer on felines, but I spent several years doing analysis and thinking deeply on the issue of how cats emerged, and what might account for their contemporary distribution and phylogeographic relationships.

There are a few things I can divulge without scooping any future researchers who might work the data I’ve seen. First, ships and cats seem to be very closely connected. That is, maritime trade routes turn out to be highly suggestive of many of the patterns you see. This goes to the distinction between cats and dogs: the former are definitely creatures whose coexistence with humanity is conditional on complex civilization. The “finer things” in life, as it were.

The “domestication” of the cat is probably hard to disentangle from the emergence of urban centers, and the vermin which they attracted. What humans term vermin, the cats would naturally consider prey. The selective pressures are easy to imagine. Cats and humans are now companions, but initially their interests were simply concurrent.

And just as cities emerged independently in several locales (as well as agriculture), it is not implausible that domestic felines emerged from different wild populations, though at this point I’m modestly skeptical of most claims. Though it is not unlikely that there is introgression or admixture from diverged wild lineages into many domestic cat populations, the evidence of independent domestications is weak in my judgment. In contrast, cattle seem to be derived from two very distinct groups.

Rather, this research points to deep ancient structure among Middle Eastern feline groups, and parallel possibilities of human-cat coexistence as farming communities emerged rapidly during the early Holocene, with exigencies of historical events leading to the later phylogeographic patterns we see around us. I think the above research is on the right path. There is definitely a connection between most European domestic cat lineages and the indigenous cat populations of Egypt (for example).

The map to the right shows GDP per capita in the European Union in 2014 broken down by regions. I’ve long observed that the wealthiest regions of Europe are disproportionately those which were long under Habsburg rule. This fact transcends ethnicity and religion. Catholic northern Italy, Catholic southern Germany, as well as Protestant Netherlands, are all notably economically productive, and were long under Habsburg rule or hegemony.

The observation is just that, an observation. I have no grand theory to explain what is going on. And some have suggested that the outlines of this productive zone of Europe might even go back as far as Lotharingia. But, these sorts of patterns rooted in geopolitical history might hint at the possibility that cultural norms and institutions can be deeply rooted in region and locale.

This is at variance with our intuition that culture is protean and can change rapidly. This is most easily illustrated by the shift from militarism to pacifism evident in both Japan and Germany in the past few generations. A shift that most believe could reverse course in short order.

In a similar vein, Peter Turchin has a post up at his blog, Ghost of Empires Past, which shows how pre-modern political structures continue to live in patterns in the World Values Survey!

The above results are from Ancestry. You can see here 4% Melanesian. This is common in South Asians. And it’s not an error in the method. Rather, it is a natural outcome of the methods used to generate admixture profiles.

Basically what’s going on is this:

1) You have data. In this case, the data are your own genotypes, as well as that of a set of individuals which represent world genetic variation, and are categorized into discrete populations.

2) You have a model or set of models. These models have different parameters.

3) You look at the data you have, and pick the parameters which best explain the data given the model.

If you have 100,000 or more markers that’s more than enough genotype data for individuals. The models themselves are quite stylized (e.g., HWE random mating sets of populations), but close enough to reality to give good results in many cases. For example, Ashkenazi Jews are often assigned to be ~100% Ashkenazi Jewish through these methods.
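To make the three steps above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the three reference populations are simulated with made-up allele frequencies, the individual is simulated as a 70/30/0 mix, and a coarse grid search stands in for the optimizers real tools use. It is not how Ancestry or any other company does it, only the general shape of “pick the parameters which best explain the data given the model.”

```python
import math
import random

def simplex_grid(k, steps):
    """Yield every k-tuple of proportions in multiples of 1/steps that sums to 1."""
    def units(parts, remaining):
        if parts == 1:
            yield (remaining,)
            return
        for i in range(remaining + 1):
            for rest in units(parts - 1, remaining - i):
                yield (i,) + rest
    for combo in units(k, steps):
        yield tuple(u / steps for u in combo)

def log_likelihood(genotypes, ref_freqs, q):
    """Log-likelihood (up to a constant) of 0/1/2 genotypes under random mating."""
    total = 0.0
    for j, g in enumerate(genotypes):
        # Expected allele frequency: q-weighted blend of the reference frequencies.
        f = sum(qk * freqs[j] for qk, freqs in zip(q, ref_freqs))
        total += g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total

def fit_proportions(genotypes, ref_freqs, steps=20):
    """Step 3: choose the proportions that best explain the data given the model."""
    return max(simplex_grid(len(ref_freqs), steps),
               key=lambda q: log_likelihood(genotypes, ref_freqs, q))

def simulate(true_q, n_snps, rng):
    """Hypothetical reference frequencies plus one admixed individual (steps 1 and 2)."""
    ref_freqs = [[rng.uniform(0.05, 0.95) for _ in range(n_snps)] for _ in true_q]
    genotypes = []
    for j in range(n_snps):
        f = sum(qk * freqs[j] for qk, freqs in zip(true_q, ref_freqs))
        # Two independent allele draws give a genotype of 0, 1 or 2.
        genotypes.append((rng.random() < f) + (rng.random() < f))
    return genotypes, ref_freqs

rng = random.Random(0)
genotypes, ref_freqs = simulate((0.7, 0.3, 0.0), 2000, rng)
estimate = fit_proportions(genotypes, ref_freqs)
print(estimate)
```

The simulated reference populations here are far more differentiated than real human groups, which is why a couple of thousand markers suffice in the toy. With realistic frequencies you need many more markers for the same precision.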

Then again, Ashkenazi Jews are a good test case. This is a population which went through a bottleneck about 500 to 1,000 years ago, and has been reasonably endogamous most of this time. Additionally, it’s not extremely structured due to inbreeding in different clan lineages. Though cousin marriage and uncle-niece marriage have been practiced by Ashkenazi Jews, the runs of homozygosity you see in Jewish genomes are not of the sort that indicates a highly inbred population, as is common in the Middle East or South Asia. Rather, there are lots of medium length segments identical by descent across individuals.

The Ashkenazi Jewish population is rather simple in its structure, and it is actually a rather clear and distinct population cluster. It stands to reason that when you create an Ashkenazi Jewish reference panel in your training data set it’s a pretty good match to the individuals you are testing.

The problems occur when you try to generate clusters and ancestry assignments for populations which are not so clear and distinct. Why do South Asians routinely come out as part Melanesian or Polynesian? This post was prompted by a Facebook thread where a South Asian customer of Ancestry was interested to see she had Polynesian ancestry. The reality is she almost certainly does not have Polynesian ancestry.

What’s going on is that the reference panel for South Asians used by many of the DTC genomics companies is not diverse enough to capture South Asian genetic diversity. There is an element of South Asian ancestry, “Ancestral South Indian” or ASI, which has deep shared ancestry with populations across Southern Eurasia and out toward Oceania. The admixture analysis method is searching through the reference panels for combinations of genotypes which can explain individual genetic variation. Since the South Asian training set is insufficient to explain all the South Asian variation the algorithms are filling in the balance of the variation with the closest available proxies to the “ghost clusters.”

The method is constrained and conditioned on two things:

1) The data being put in, which is often insufficient.

2) The set of populations that it is forced to work with to generate the combinations in individuals (the parameter values in the model to explain the data) are often insufficient or artificial.

What I mean by the last is that many of the genetic clusters are not taxonomically equivalent. “South Asian” ancestry is much more diverse and diffuse than “Melanesian” ancestry. This is why Melanesian ancestry can explain South Asian ancestry, but generally not the reverse.
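The “ghost cluster” effect can be reproduced in a toy simulation. The sketch below is hypothetical throughout: the population names (west, ghost, proxy, unrelated), the drift magnitudes, and the 70/30 versus 20/80 mixes are all invented, and a simple grid search stands in for real admixture software. The point is only that when a customer carries more of a source than the reference panel captures, the fit tends to hand the balance to the nearest relative of that source that happens to be in the panel.

```python
import math
import random

def clamp(x, lo=0.02, hi=0.98):
    return max(lo, min(hi, x))

def drift(freqs, sd, rng):
    """Perturb allele frequencies to mimic drift away from a parent population."""
    return [clamp(p + rng.gauss(0.0, sd)) for p in freqs]

def mix(a, b, share_a):
    return [share_a * x + (1.0 - share_a) * y for x, y in zip(a, b)]

def log_likelihood(genotypes, refs, q):
    total = 0.0
    for j, g in enumerate(genotypes):
        f = sum(qk * ref[j] for qk, ref in zip(q, refs))
        total += g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total

def fit_three(genotypes, refs, steps=20):
    """Grid-search ancestry proportions over exactly three reference clusters."""
    grid = [(i / steps, j / steps, (steps - i - j) / steps)
            for i in range(steps + 1) for j in range(steps + 1 - i)]
    return max(grid, key=lambda q: log_likelihood(genotypes, refs, q))

def run(n_snps=4000, seed=0):
    rng = random.Random(seed)
    root = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
    west = drift(root, 0.10, rng)
    deep = drift(root, 0.10, rng)       # shared ancestor of ghost and proxy
    ghost = drift(deep, 0.05, rng)      # real source, absent from the panel
    proxy = drift(deep, 0.08, rng)      # distant relative that is in the panel
    unrelated = drift(root, 0.15, rng)  # control cluster
    panel = mix(west, ghost, 0.7)       # reference cluster: 70% west, 30% ghost
    truth = mix(west, ghost, 0.2)       # customer: 20% west, 80% ghost
    genotypes = [(rng.random() < f) + (rng.random() < f) for f in truth]
    return fit_three(genotypes, [panel, proxy, unrelated])

panel_share, proxy_share, unrelated_share = run()
print(panel_share, proxy_share, unrelated_share)
```

In this setup the customer has no ancestry from the proxy population at all, yet the best fit should still give it a sizable share, because that is the only way the model can account for the extra ghost ancestry.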

A new paper in PNAS, Palaeoproteomic evidence identifies archaic hominins associated with the Châtelperronian at the Grotte du Renne, weighs in on the question of whether the makers of the Châtelperronian culture were Neandertals, with an answer in the affirmative in this case:

The displacement of Neandertals by anatomically modern humans (AMHs) 50,000–40,000 y ago in Europe has considerable biological and behavioral implications. The Châtelperronian at the Grotte du Renne (France) takes a central role in models explaining the transition, but the association of hominin fossils at this site with the Châtelperronian is debated. Here we identify additional hominin specimens at the site through proteomic zooarchaeology by mass spectrometry screening and obtain molecular (ancient DNA, ancient proteins) and chronometric data to demonstrate that these represent Neandertals that date to the Châtelperronian. The identification of an amino acid sequence specific to a clade within the genus Homo demonstrates the potential of palaeoproteomic analysis in the study of hominin taxonomy in the Late Pleistocene and warrants further exploration.

The details about stratigraphy are beyond me. But the protein and mtDNA evidence is pretty conclusive in my opinion that there are Neandertal individuals in this assemblage. Therefore, assuming their stratigraphy is correct, what you see in the Châtelperronian may be a cultural influence upon Neandertals by anatomically modern humans who were pushing into Europe at this time.

But cultural influence may not be the only dynamic at work. In The 10,000 Year Explosion: How Civilization Accelerated Human Evolution Greg Cochran hypothesized that Châtelperronian culture may have been a vector for Neandertal genes coming into modern human populations. And now we know that this isn’t always one directional. That is, just as modern humans absorbed genes from “archaic” populations, so archaic groups absorbed ancestry from modern populations (or at least humans closer to the main stem of modern humanity).

In The Third Chimpanzee Jared Diamond posited that the Châtelperronian Neandertals were analogous to native peoples in the New World such as the Cherokee, who adopted many aspects of European settler culture in their attempt to resist cultural absorption and marginalization. But one dynamic we need to remember about these tribes is that they also had a lot of European ancestry, in part because of the rapidly unbalanced population sizes. It seems entirely likely, as some have posited, that the last “Neandertal” populations were also substantially admixed. Therefore, it is not entirely surprising that they would also tend to exhibit cultural features more commonly found among modern humans.

My prediction is that when whole genomes of Châtelperronian Neandertals are available it is highly likely that they will often show evidence of modern human ancestry.

Diamond’s The Third Chimpanzee is in my opinion a very underrated work. It is a bit dated today, but I still think it is quite worth reading.

What’s going on?

Very busy, so haven’t gotten much further in The Dialectical Imagination, but I do have to say that the distinction between “positive freedom” and “negative freedom” is a useful one to highlight at this point. The comments below make me unsure about the influence of the Frankfurt School on modern socio-political movements, but the ideal of a utopian end state for society which enshrines a vision of the good definitely seems to be one of the things that moderns have lost. Rather, a lot of identity politics talk seems to be about positional games and status competition. The perpetual revolution.

At the suggestion of a reader I purchased Explaining Postmodernism. The Kindle version is $4.99.

There was an ancient DNA convention in England. It turns out that the original Polynesians may not have had much Melanesian ancestry, implying multiple migrations into Far Oceania.

The impact of recent population history on the deleterious mutation load in humans and close evolutionary relatives.

We live in an age when we have a lot of SNP data on a lot of populations. This allows for a very fine level of granularity in terms of analysis. To illustrate, Genetics recently published Nationwide Genomic Study in Denmark Reveals Remarkable Population Homogeneity, which analyzes hundreds of Danes with hundreds of thousands of SNPs. In The History and Geography of Human Genes, published twenty years ago, most of the analysis was grounded in pairwise comparisons between populations using hundreds of markers. Not only do we have much greater resources in terms of data, but we have various analytic frameworks which in concert allow for richer, more precise, inferences. Today we can actually assign ancestry to regions of an individual’s genome!

The first author of the Genetics paper, Yorgos Athanasiadis, has put up a post where he does a walk-through of the various methods step-by-step. It’s useful for anyone who has an inclination to do something similar for another data set.

This portion jumped out at me:

The six Danish regions showed highest affinity with a cluster that we call BRI(tish), because it’s mostly made up by British samples, followed by the NOR(wegian) and SWE(dish) clusters. This is not to say that Danes are about 40% made up by British DNA, as some enthusiastic twitters have mentioned. The BRI cluster also includes German, Belgian and Dutch samples, meaning that it might as well be reflecting some other ethnic component; in lack of a better name, we called it BRI. Another interesting fact is that because of the presence of this cluster, haplotype sharing with other Scandinavians was about 40%….

I think the implications of this are something I’m going to have to chew on for a while. Some of these genetics results aren’t straightforward in terms of what they mean in a vacuum, though the historical inference is obvious.

Anyway, read the whole thing, On the genetic structure of Denmark.

Ewen Callaway reports from a conference in England, Elephant history rewritten by ancient genomes:

Modern elephants are classified into three species: the Asian elephant (Elephas maximus) and two African elephants — the forest-dwellers (Loxodonta cyclotis) and those that live in the savannah (Loxodonta africana). The division of the African elephants, originally considered a single species, was confirmed only in 2010.

Scientists had assumed from fossil evidence that an ancient predecessor called the straight-tusked elephant (Paleoloxodon antiquus), which lived in European forests until around 100,000 years ago, was a close relative of Asian elephants.

In fact, this ancient species is most closely related to African forest elephants, a genetic analysis now reveals. Even more surprising, living forest elephants in the Congo Basin are closer kin to the extinct species than they are to today’s African savannah-dwellers. And, together with newly announced genomes from ancient mammoths, the analysis also reveals that many different elephant and mammoth species interbred in the past.

Palkopoulou and her colleagues also revealed the genomes of other animals, including four woolly mammoths (Mammuthus primigenius) and, for the first time, the whole-genome sequences of a Columbian mammoth (Mammuthus columbi) from North America and two North American mastodons (Mammut americanum).

The researchers found evidence that many of the different elephant and mammoth species had interbred. Straight-tusked elephants mated with both Asian elephants and woolly mammoths. And African savannah and forest elephants, who are known to interbreed today — hybrids of the two species live in some parts of the Democratic Republic of Congo and elsewhere — also seem to have interbred in the distant past. Palkopoulou hopes to work out when these interbreeding episodes happened.

15x coverage. This is awesome. And incredible.

A new paper in Quaternary International, Western Eurasian genetic influences in the Indonesian archipelago, confirms what has long been suspected by smaller batch data:

…To locate the primary areas of Western Eurasian genetic influence in Indonesia, we have assembled published uniparental genetic data from ∼2900 Indonesian individuals. Frequency distributions show that Western Eurasian paternal lineages are found more commonly than Western Eurasian maternal lineages. Furthermore, the origins of these paternal lineages are more diverse than the corresponding maternal lineages, predominantly tracing back to South West and South Asia, and the Indian sub-continent, respectively. Indianized kingdoms in the Indonesian archipelago likely played a major role in dispersing Western Eurasian lineages, as these kingdoms overlap geographically with the current distribution of individuals carrying Western Eurasian genetic markers. Our data highlight the important role of these Western Eurasian migrants in contributing to the complexity of genetic diversity across the Indonesian archipelago today.

The table above highlights the distribution of paternal Indian lineages in several parts of Indonesia. These Y chromosomal haplotypes are found in the core of what was Majapahit. Some of these haplotypes might be due to shared ancient ancestry, but the presence of R1a means that it dates to within the past 4,000 years, as I believe R1a is relatively intrusive into South Asia. Many of the other haplogroups are a diverse cross-section of those typical for South Asia.

The further question then is whether these date to the period of European colonization, or to the first millennium A.D., when the first “Indianized kingdoms” arose in Southeast Asia. The fact that there is compelling evidence of old and even admixture in Cambodia, where colonialism was not as pervasive or longstanding as in maritime Southeast Asia, suggests that it can’t be chalked up to the Dutch presence, and their role as mediators for migration (more plainly, they enslaved many South Asians and moved them around the Indian Ocean basin).

But the text of the paper makes some things rather clear:

…constant since the first contacts and exchanges between the Indian sub-continent and Indonesia in the late 1st millennium B.C.E., it is likely that this gene flow was particularly intense during the period of the Hindu kingdoms in Indonesia (7th to the 16th century AD). These assumptions, based on archaeological and historical data, are also in broad agreement with dating on unpublished genome-wide SNP markers from Island Southeast Asia (unpublished data).

Reading The Dialectical Imagination: A History of the Frankfurt School and the Institute of Social Research, 1923-1950.

A good book. Dense. But it is clear (the author so admits) that it’s only a superficial exploration of the ideas of the Frankfurt School.

That being said, a lot of the abstruse and in my opinion wrong-headed tendencies of Critical Theory types do seem to go back to the roots. In relation to impenetrability, the influence of Heidegger on Marcuse makes a lot of sense.

For whatever reason I missed this paper which came out in July in AJHG, Human Y Chromosome Haplogroup N: A Non-trivial Time-Resolved Phylogeography that Cuts across Language Families. Basically it blows up sample size and utilizes NGS techniques (whole-genome) to resolve some questions around haplogroup N, and in particular the M46/TAT subclade which exhibits a peculiar geographic distribution, from the shores of the Baltic to easternmost Siberia.

I actually blogged about this as far back as 2003, so it’s a long term mystery. There’s no autosomal rhyme or reason to the frequency of this lineage. Yes, there is a vague Uralic affinity, but this Y chromosomal variant is higher in the Lithuanians than the Finns, and found in peoples as distant as the Koryaks. One of the major early questions was whether it was a marker that indicated east-west movement, or west-east movement. In other words, was it associated with Siberian ancestry in Finns and affiliated people, or did it indicate European ancestry in Siberian people?

Rurik, carrier of N1c

If the results in this paper are correct the likely answer is: none of the above. The core TAT lineage looks like it underwent an explosion ~5,000 years ago. This is around the same time as Northern Europeans and Siberians as we understand them were coming into being. So the TAT lineage didn’t come with a specific people, it was part of the process which made the people. I’ll quote from the discussion:

Overall, a considerable proportion of men inhabiting much of the Arctic and temperate zones of western and eastern Eurasia share N3a3’6 lineages that date back to the mid-Holocene (4.5–5.0 kya). This common patrilineal ancestry unites widely different linguistic phyla, including Indo-European, particularly Balto-Slavic, branches of the Altaic, such as the Mongolic, Turkic, Tungusic, and Chukotko-Kamchatkan branches, as well as the Balto-Finnic branch of the Finno-Ugric.

The autosomal genome-wide data is clear: pretty much all the Finnic peoples in Europe seem to have a small (to various degrees, with Finns proper the least), but clear, signal of admixture that is Siberian. It is tempting to associate this with the men who carried TAT into these populations, but observe that the Lithuanians seem to be lacking in this signature. Y chromosomes and autosomes are not always in alignment, but recall that many Siberians have some West Eurasian ancestry, some of it likely quite ancient, and carry R1a1a Y chromosomes. The past was more complex than we had assumed, and the relationship between movements of men and languages is likely not so straightforward in the inferences we can make. It may be that the Siberian admixture into Finnic peoples, and their linguistic identity, post-dates the arrival of TAT into the far north of Europe.

One of the aspects of the explosion of many Y chromosomal lineages 4-5,000 years ago is how much they don’t associate well with ethno-linguistic boundaries. The “Indo-Aryan” R1a1a in South Asia is very common in some low caste South Indian tribal populations. The R1b brought by the Corded-Ware culture, which presumably transmitted Indo-European languages, is at very high frequency among the non-Indo-European Basques, as well as groups such as Sardinians, who were Indo-Europeanized only in Classical Antiquity. The Y lineages seem to have expanded far beyond the totality of the cultural unit.

Genetics is giving us lots of data. But there are no theoretical bones to scaffold this flesh.

Jonathan Novembre and Benjamin Peter have posted a preprint of a review, Recent advances in the study of fine-scale population structure in humans, which readers will find useful. In particular, the citations are a gold-mine for anyone attempting to navigate this literature.

The figure above from their preprint illustrates the number of markers needed to differentiate populations in Europe. Recall that genetic variation within Europe, especially Northern Europe, is rather low. It’s pretty clear that if you sample 100 SNPs from the human genome you can’t differentiate much. At 1,000 SNPs structure begins to appear, and this is starting to be well resolved by 10,000 SNPs. By 100,000 SNPs you are pretty much going to hit diminishing returns for regional diversity on Europe level scales. The pattern differs by method. PCA for example does much better with 10,000 SNPs in Europe than the model-based clustering (e.g., ADMIXTURE) in my experience, but the two are comparable as you near 100,000 SNPs. Beyond 100,000 SNPs there is not that much increase in resolution for genome-wide methods that rely on genotypes at this level of genetic diversity.
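A rough way to see why resolution climbs with marker count is to simulate it. The sketch below is a deliberately simplified stand-in, not PCA or ADMIXTURE: it assumes two hypothetical sister populations separated by a small amount of drift (the 0.015 value is an invented setting, not an estimate for Europe), assumes the true allele frequencies are known, and asks how often an individual is assigned to the right population using the first 100, 1,000 or 10,000 SNPs.

```python
import math
import random

def clamp(x, lo=0.02, hi=0.98):
    return max(lo, min(hi, x))

def simulate(n_snps, drift_sd, n_per_pop, rng):
    """Two sister populations that drifted slightly from a shared ancestor."""
    root = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
    pops = [[clamp(p + rng.gauss(0.0, drift_sd)) for p in root] for _ in range(2)]
    people = []  # (true population index, genotype list)
    for label, freqs in enumerate(pops):
        for _ in range(n_per_pop):
            geno = [(rng.random() < f) + (rng.random() < f) for f in freqs]
            people.append((label, geno))
    return pops, people

def log_lik(geno, freqs, n_used):
    """Log-likelihood of a genotype vector using only the first n_used SNPs."""
    return sum(g * math.log(f) + (2 - g) * math.log(1.0 - f)
               for g, f in zip(geno[:n_used], freqs[:n_used]))

def accuracy(pops, people, n_used):
    """Share of individuals assigned to their true population with n_used SNPs."""
    correct = 0
    for label, geno in people:
        in_first = log_lik(geno, pops[0], n_used) >= log_lik(geno, pops[1], n_used)
        guess = 0 if in_first else 1
        correct += (guess == label)
    return correct / len(people)

rng = random.Random(0)
pops, people = simulate(n_snps=10000, drift_sd=0.015, n_per_pop=20, rng=rng)
results = {n: accuracy(pops, people, n) for n in (100, 1000, 10000)}
for n, acc in results.items():
    print(n, acc)
```

Because the toy hands the classifier the true frequencies, it is more generous than any unsupervised method would be. The qualitative pattern is the thing to look at: little better than a coin flip with few markers, near-perfect assignment once you are in the thousands to tens of thousands.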

Of course, if you want really fine-scale differences, between villages for example, more markers, and perhaps whole-genome sequencing that can pick up rare variants, are useful. In other words, there are cases one can imagine where more data than is normally available on SNP-chips is useful. But these are definite boundary conditions. Once you get to the point of distinguishing branches of extended families you really can’t collapse the genealogies any further.

Another instance where more marker density, or the power of high coverage whole-genome sequencing, might be useful is for local ancestry deconvolution. If you’re assigning ancestry to windows of the genome then your marker density is going to be a limiting factor, as you might be slicing the 100,000 SNPs into 1,000 subunits.

Finally, there’s the issue of the models being tested. Novembre and Peter allude to the fact that many of these models posit stylized discrete pulse admixtures. As it turns out in some cases ancient DNA seems to have confirmed that something like this went on. That is, long periods of local stability and panmixia, followed by genetic turnover and admixture. But they note that there isn’t a good simulation framework where demographic scenarios are allowed to generate in silico data for testing new models. In other words, biologists are currently having to rely on “natural experiments.”

The correlation between medical school GPA and career outcomes is low. The correlation between height and number of all-star appearances in the NBA is low. The correlation between SAT score and performance as a Google engineer is low.

Actually, I don’t know if all of these are strictly true. But I think you’ve seen the general form of the fact. So, for example, a professor I knew once recounted that at his graduate institution they once looked at correlation between GRE scores and future positions at tenure track institutions. They didn’t find any association.

But there was a problem here: the institution was one of the top 10 graduate programs in the country in the biological sciences. The GRE scores were likely to be very high already. The result reported is certainly correct. But the inference presented to a general audience is often misleading. The correct inference is that within a particular range of the independent variable the correlation between the variable (here, GRE score) and an outcome (here, a tenure track position) is low. But what is often inferred is that there is no relationship between a variable and an outcome. Period. This is usually an incorrect inference.
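Range restriction is easy to demonstrate with simulated numbers. The sketch below assumes a made-up population where a standardized score and an outcome are correlated at 0.5, then recomputes the correlation using only the top 10% of scorers, as if those were the only people admitted. The 0.5 and the 10% cutoff are illustrative assumptions, not estimates for the GRE or any real program.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def simulate(n, true_r, rng):
    """Standardized scores and an outcome correlated with them at true_r."""
    scores = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise_sd = math.sqrt(1.0 - true_r ** 2)
    outcomes = [true_r * s + rng.gauss(0.0, noise_sd) for s in scores]
    return scores, outcomes

def restricted_correlation(scores, outcomes, top_fraction):
    """Correlation among only the highest-scoring fraction (the admitted group)."""
    cutoff = sorted(scores)[int(len(scores) * (1.0 - top_fraction))]
    kept = [(s, o) for s, o in zip(scores, outcomes) if s >= cutoff]
    return pearson([s for s, _ in kept], [o for _, o in kept])

rng = random.Random(0)
scores, outcomes = simulate(20000, 0.5, rng)
full_r = pearson(scores, outcomes)
top_r = restricted_correlation(scores, outcomes, 0.10)
print(round(full_r, 2), round(top_r, 2))
```

The relationship between score and outcome is identical for every simulated person. The only thing that changes is which slice of the score distribution you look at, and that alone is enough to shrink the observed correlation substantially.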

One reason I’m putting this post up is a blogpost I noticed, Google Finds That Successful Teams Are About Norms Not Just Smarts. It links to The New York Times Magazine article which outlined how Google had attempted to find the “perfect” makeup of a team. The title is key here: not just. Most people who get to interview at Google are very bright. They aren’t arbitrary people pulled off the street. That’s one reason that the old Google system might have been counter-productive, since you already knew that the people you were testing were good at taking tests, as opposed to gauging them on other personal characteristics (e.g., do they have social skills which might allow them to work well on a team?).

Norms matter a lot. Isaac Newton’s father was an unlettered (if prosperous) farmer. If Newton had been born a few hundred years earlier he would not have flourished as he did. Norms matter. Culture matters. But not all men are born Isaac Newtons. Aptitude matters too. When we observe that norms and culture matter in the context of genius we are often engaging in range restriction. The individuals who illustrate the power of culture are not arbitrarily selected from the whole population.

Several people have asked me about this article in Foreign Policy, Does Chinese Civilization Come From Ancient Egypt? It’s interesting in terms of cultural commentary, and what it says about open-mindedness among the Chinese public and academy. In many ways the Chinese are much less open-minded than Westerners after decades of Marxism…but in other ways, they are surprisingly liberal in the classical sense. Willing to entertain crazy ideas out of left field.

The hypothesis that the roots of Chinese civilization diffused from ancient Egypt via the vector of a Hyksos migration is definitely a bridge too far. There’s no strong evidence for it from what I’ve seen. But there are different forms of diffusion. As highlighted in the article itself there is good evidence of cultural diffusion of specific elements of Shang technology, such as chariots. The chariot was the weapon of mass destruction of the Bronze Age. Once invented, it spread rapidly from one end of Eurasia to the other (and, into Egypt as well).

But there are still elements of uncertainty as to how it spread. One model is that it was transmitted from one society to another, in a process akin to how guns or influenza might have spread among Native Americans. Then, there is the leapfrog model, whereby long distance migration and travel serve to facilitate diffusion of culture (and genes).

In the late 2000s I read Empires of the Silk Road: A History of Central Eurasia from the Bronze Age to the Present Reprint Edition, a sprawling and idiosyncratic book which makes the case for the centrality of the Eurasian “Heartland” to world history. The author suggests as an aside that the progenitors of the Shang themselves may have been from the steppe, perhaps Indo-Europeans. At the time I dismissed that as lacking evidence.

But the past half a decade or more has shown us that populations moved a lot more in the past 10,000 years than we’d have been led to believe. I am probably more open to an Indo-European influence on early Chinese civilization than I was in the late 2000s. This is where Y chromosomes are helpful. Below are Y chromosomal distributions of some ethnic minorities from northern and western China today:

[Figure: Y chromosomal haplogroup distributions of ethnic minorities from northern and western China]

And here’s a table of a more diverse set of East Asian groups:

[Table: Y chromosomal haplogroup distributions across a more diverse set of East Asian groups]

The question of timescale is important. The Chinese Tatars arrived only in the last few hundred years from the Volga region of Russia. They have a lot of haplogroup I, which seems to be carried over from Pleistocene era West Eurasian populations. In contrast, we know from ancient DNA that haplogroup R1a1a, and in particular the Z93 subclade common among South and Central Asians, was present in the Altai region during the Bronze Age. And we see R1a1a across many populations in these data. Unfortunately there isn’t a breakdown between European and Asian subclades, which would be useful because there has been a long history of movement back and forth on the steppe in the last 4,000 years (the Uyghurs also carry H, which is more typical of South Asian groups, indicating movement across the Pamirs, as has been historically attested). But the high frequency of R1a1a among the Uyghurs and the low frequency of other West Eurasian Y haplogroups, such as R1b and I (as well as the presence of J), are suggestive (along with autosomal work) of pre-Mongol West Eurasian heritage.

This is obvious to anyone who knows the history of the Silk Road and the European features of the mummies of Xinjiang (not to mention the cave paintings). The ancient DNA and history indicate that very early on a mixed population of western and eastern origins emerged in the heart of Eurasia. The question then is what role did they play in Chinese history? Almost certainly at minimum they were the vector by which the knowledge of the construction of chariots and other aspects of the West Eurasian military-industrial system were transmitted (just as later they were instrumental in the transmission of Buddhism). At maximum, they may have been the seeds around which chariot elites emerged in the Shang period.

The genetic data suggest that if there was a demographic impact, it was very small. The Han Chinese in the data who carry West Eurasian haplogroups are invariably sampled from the far north and west, regions where assimilation of non-Han minorities to a Han identity has been common. Unlike Europe and South Asia, and like the Middle East, the Y chromosomes in East Asia do not as a whole seem star-shaped. This suggests that the demographic basis of the elites probably dates to the Neolithic, and was indigenous, as opposed to migrants from elsewhere. The role of Indo-Europeans was probably stimulative, rather than directive.

• Category: History, Science • Tags: China

Giant panda no longer ‘endangered’ but iconic species still at risk:

The International Union for Conservation of Nature (IUCN) announced the positive change to the giant panda’s official status in the Red List of Threatened Species, pointing to the 17 per cent rise in the population in the decade up to 2014, when a nationwide census found 1,864 giant pandas in the wild in China.

“For over fifty years, the giant panda has been the globe’s most beloved conservation icon as well as the symbol of WWF. Knowing that the panda is now a step further from extinction is an exciting moment for everyone committed to conserving the world’s wildlife and their habitats,” said Marco Lambertini, WWF Director General.

But we need to keep perspective. Something similar is happening with tigers, whose census sizes are finally increasing after a century of continuous decline. But we’re talking about a bounce back to 4,000! This is still a small and vulnerable population. Genetically, depending on the details of the structure, 100 to 1,000 are probably enough for viability. But genetics isn’t everything. A stochastic event (e.g., look at what happened to the Tasmanian devil) could wipe out a few thousand quickly, and we’d be back on our heels.

Parents Didn’t Just Dislike Super Nintendo 25 Years Ago—They Thought It Was a Scam. Fun fact: I stopped playing video games when I was 16. Mostly because it was taking up too much of my time. This means that I’m excluded from a lot of conversation and pop culture. So be it.

Excited to be going to ASHG 2016. Is Vancouver really part of Canada? Isn’t it like Seattle with a queen? In any case, the abstracts aren’t online yet. They should drop this week….

The Ezra Klein podcast has a recurring question: name three books. Taking about five seconds, mine are the following: The Fall of Rome: And the End of Civilization, Principles of Population Genetics, and In Gods We Trust: The Evolutionary Landscape of Religion. The next three? Albion’s Seed: Four British Folkways in America, The Genetical Theory Of Natural Selection, and From Dawn to Decadence, 1500 to the Present: 500 Years of Western Cultural Life.

As a change of pace I’ll be checking out Christie Wilcox’s new book, Venomous: How Earth’s Deadliest Creatures Mastered Biochemistry. It’s often joked that geneticists don’t know real biology, and despite an undergrad background in biochemistry that’s probably somewhat true for me.

Got a copy of The Dialectical Imagination: A History of the Frankfurt School and the Institute of Social Research, 1923-1950. If history is written by the winners, I figured I should get to know the winners a little better!

I was writing most of my Python in an object-oriented style, but everyone kept telling me (online and offline) that I should go functional. As I don’t have too much time, I got Treading on Python Series: Intermediate Python Programming: Learn Decorators, Generators, Functional Programming and More.

People have been talking about this for a while. Gwern did some real analysis, Embryo selection for intelligence.

How does India’s caste system work in the 21st century? Quora user hits the bull’s eye.

Using Genetic Distance to Infer the Accuracy of Genomic Prediction.

Tabs Or Spaces – One Billion Files Later An Answer. Sorry Richard, it’s all spaces all the way….

5 racist stereotypes that historically were the opposite of what they are today. Vox is doing well because they’re very cautious about challenging the preconceptions of their readers. Often they reinforce them. Too easy to deconstruct, but please note that stereotypes which might be held about Asians may not be common for certain types of Asians. This sort of banal observation is difficult to make when you are experiencing an orgasm because of the smug-saturation.

Walmart automation will eliminate 7,000 jobs.

Dissecting the genetics of complex traits using summary association statistics.

More Ancient Jomon DNA.

Steak That Sizzles on the Stovetop.

I had forgotten this was once #1 in the early 90s:

• Category: Miscellaneous • Tags: Open Thread

It is too much to assert that the Indian ocean is “our sea,” writ large as a species. But it does certainly seem to be the case that this body of water punches above its weight. It is likely that anatomically modern humans emerged not too far from its shores, while the first, second, and third civilizations arose along its fringes (civilization being defined as having cities and some basic level of literacy). As humanity developed complex societies at the antipodes of Eurasia, in Europe and China, the focus on the Indian ocean basin became somewhat attenuated, but its centrality as a nexus between these two dynamic loci of economic and cultural activity persisted. In addition, both the world of Islam and Southeast Asia were deeply connected to the ocean, while for India it was the ocean.

Sanjeev Sanyal’s The Ocean of Churn: How the Indian Ocean Shaped Human History is a panoramic narrative which surveys the lands around this ocean, and how wrapped up they’ve been in human history. There are two broad themes which undergird The Ocean of Churn. First, Sanyal seems to (mostly) reject the Great Man theory of history, as well as deterministic Marxist models. Rather, he posits that historical processes are a complex adaptive system. This is probably true, but honestly I don’t see that it looms very large in The Ocean of Churn, which is mostly a descriptive narrative on the macroscale. If you deleted this nod to the theoretical framework the book would read perfectly fine. The second major theme, that the flow of ideas and peoples is bidirectional, rather than a sequential branching process, permeates the book. In fact, it’s hard to ignore, because Sanyal begins by recounting how the Pallava dynasty of South India was refounded by a collateral branch from Cambodia!

The map to the left is a stylized representation of humanity’s expansion out of Africa. To a first approximation it gets a lot right. But because it only depicts unidirectional migration it misses a lot in the detail. The same could be said for culture. For example, the narrative of Islam is that it spread from Arabia, to the west and the east. But works such as Lost Enlightenment and Warriors of the Cloisters both argue that there was a massive reflux from the east after the transition between the Umayyads and the Abbasids. The Abbasid power base, and many of their courtiers, were from Khorasan in the northeast of modern Iran, and Transoxiana. While 7th century Islam crystallized in the matrix of a post-Roman Late Antique world, with the Umayyad center of power being in Syria, the Abbasids emerged from a milieu where Muslims, Christians, Zoroastrians, Buddhists, and even Hindus, mixed freely and exchanged ideas.

Like Empires of the Silk Road and Facing the Ocean: The Atlantic and Its Peoples 8000 BC-AD 1500, The Ocean of Churn is a historical geography with a broad view and lacking a tight focus. It’s not a bug, it’s a feature. Additionally, Sanyal’s style is quite conversational and informal…dare I say, almost bloggish? He admits very early that he’s not writing an academic work. Rather, The Ocean of Churn is part travelogue, part historical commentary, and part review of the academic literature. Additionally, there is arguably somewhat of an Indo-centric bias, insofar as in a book which runs less than 300 pages India and its role at the center of events take up disproportionate space. This is somewhat ironic in light of the author’s conscious observation that previous histories of the Indian ocean were quite Eurocentric, but somewhat justified by the fact that India’s long history, relative influence around the basin of the Indian ocean, and demographic heft, probably warrant extra attention.

Sometimes this focus gets the better of Sanyal. The voyages of Zheng He somehow get drafted into the shift within maritime Southeast Asia from affiliation with the Dharmic set of cultures rooted in Hinduism and Buddhism toward that of Islam, where the machinations within the Chinese court were geared toward breaking the Indic affinities of this region to increase Chinese cultural hegemony.

There are two major problems I see with this. First, the swing from Hinduism and Buddhism to Islam in maritime Southeast Asia was a centuries-long process, and occurred first in the Arabian sea, before shifting to the eastern regions of the Indian ocean. One can make the case that it was the rational thing to do for maritime-facing Southeast Asian polities to realign their cultural focus from Dharmic religions to Islam.

Second, there were longstanding dynamics at the court of the Ming dynasty which could explain much of the rationale for the voyages of Zheng He’s fleet, debates over the proper role of the state in society and the world which can be traced as far back as the Song dynasty. This is a case where Sanyal’s narrative is too geographically and historically delimited to flesh out the more complex and messy dynamics at the heart of which were a civilizational pivot in Southeast Asia and sui generis maritime voyages out of the heart of the Chinese world.

But in general The Ocean of Churn does not suffer from narrowness. Rather, the footnotes and citations are a testament to Sanjeev Sanyal’s catholic tastes; they are wide-ranging, and warrant closer attention and follow-up. I did not, for example, know that the native Malagasy had retained a custom of boat burial even after they had retreated to the highlands and become farmers with little experience of the sea. The Ocean of Churn is packed with many interesting details of this sort. It’s a gold-mine for those looking for more to read.

And yet in the course of inter-disciplinary work sometimes you’ll miss the trees for the forest. This occurs in Sanyal’s survey of the historical genetics literature. He gets most right, but gets some wrong. The first case is that Sanyal refers a few times to the 2013 paper, Genome-wide data substantiate Holocene gene flow from India to Australia. It came out in PNAS, a reputable journal, and includes an author, Mark Stoneking, with some prominence in the field. Additionally, the timing was such that it aligned well with a contemporaneous cultural change in humans, and the arrival of dingos. Unfortunately, the paper was certainly wrong. First, the data was not open, so I could not replicate the analysis to see how robust the statistics were. I complained about this at the time. Second, several prominent statistical geneticists told me privately that they were very skeptical of the statistics. Third, recent research suggests that Aboriginal paternal lineages are very deeply rooted, Deep Roots for Aboriginal Australian Y Chromosomes. Indian Y chromosomes are very distinctive. There is evidence for them in Southeast Asia in regions without colonial era Indians, in particular Cambodia. Of course, it could be that only the female lineages persisted, but there is the reality that there’s been no evidence for recent Indian mtDNA in Australia to my knowledge (the divergences are very deep), and no other paper which has access to Australian genome data has replicated this finding. Finally, from what I am hearing a new autosomal paper will come out soon and definitively render judgment against this paper’s result (this is why the Y chromosome paper came out).

But the above is pretty small-ball. The major issue in the citation of historical genetic papers in The Ocean of Churn is that there are references to older works, as in five years old, that are totally out of date, and interpretations based on consensus understandings of the late 2000s that have been overturned.

Let’s start with R1a1a. This is a male Y chromosomal lineage which is very common in South Asia, Central Asia, and West-Central Eurasia (Eastern Europe). When it comes to Y chromosomal lineages, whole genome analyses have changed our understanding of the phylogenetics a lot. The earliest work used highly mutable microsatellites, while later work focused on single nucleotide polymorphisms. Using these patterns of variation researchers created phylogenies, most of which have stood the test of time, and calibrated divergence times, many of which have not. Basically the Y chromosome doesn’t have enough SNP diversity to allow good calibration of divergence, while because of their high mutation rate microsatellites are not good for temporal inference.

The plot above shows the R1a1a males in the 1000 Genomes data (it’s from Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences). The green are South Asians, the blue are Europeans. Because the 1000 Genomes data is biased toward West Europeans, there are far fewer R1a than R1b in the data from that continent. What the data confirm is that South Asian and European R1a are two different clades. The South Asian clade is often termed Z93 because of a particular mutation. Z93 is overwhelmingly dominant in South Asia, very rare in Europe, and relatively common among the R1a individuals in Central Asia (e.g., the Altai sample from the 1000 Genomes). The pattern of genetic diversity shows that there isn’t much, and that the diversity is relatively shallow. That is indicative of two things. R1a went through a very recent massive population expansion. It’s a “star-shaped phylogeny.” Contrast that with J2, which has also undergone expansion since the rise of agriculture, but exhibits far more internal structure. J2 probably started expanding earlier, and its expansion was never at any moment as explosive as that of R1a and R1b. Earlier work suggesting R1a diversification ~10,000 years ago does not hold up.

The second issue in relation to R1a is that we now have ancient DNA. Both R1a and R1b are very rare before 4,000 years ago. Here’s a section from a paper published in 2015 out of David Reich’s lab:

Further evidence for a connection between the Srubnaya and populations of central/south Asia—which is absent in ancient central Europeans including people of the Corded Ware culture and is nearly absent in present-day Europeans…is provided by the occurrence in four Srubnaya and one Poltavka males of haplogroup R1a-Z93 which is common in present-day central/south Asians and Bronze Age people from the Altai…(Supplementary Data Table 1). This represents a direct link between the European steppe and central/south Asia, an intriguing observation that may be related to the spread of Indo-European languages in that direction.

The Srubna people seem to have flourished between the Dnieper and Volga, and south toward the Caucasus, about 3,500 years ago. But the Srubna are not a simple solution to the problem of Indo-Aryan origins. From the supplements of Genomic insights into the origin of farming in the ancient Near East:

The analysis in this section reconciles the evidence presented in the first paragraph regarding the origin of the ANI by showing that is may be related both to “southern” populations related to Iran and the Caucasus and to “northern” steppe populations. Our results do not resolve the relationship between ANI and the origin of Indo-European speakers in South Asia, in the sense that they reveal that South Asian populations have ancestry both from regions related to the Eurasian steppe and ancient Iran, which is compatible with alternative homeland solutions…

While the Early/Middle Bronze Age ‘Yamnaya’-related group (Steppe_EMBA) is a good genetic match (together with Neolithic Iran) for ANI, the later Middle/Late Bronze Age steppe population (Steppe_MLBA) is not. Steppe_MLBA includes Sintashta and Andronovo populations who have been proposed as identical to or related to ancestral Indo-Iranians…as well as the Srubnaya from eastern Europe which are related to South Asians by their possession of Y-chromosome haplogroup R1a1a1b2-Z93. A useful direction of future research is a more comprehensive sampling of ancient DNA from steppe populations, as well as populations of central Asia (east of Iran and south of the steppe), which may reveal more proximate sources of the ANI than the ones considered here, and of South Asia to determine the trajectory of population change in the area directly.

Basically, the Reich lab has used ancient DNA to confirm what genome bloggers started noticing around 2010: the “ANI” component of South Asian ancestry is itself a composite of different streams. In the initial analyses the division between ANI and ASI (“Ancestral South Indian”) dropped out so easily because the two groups were very genetically distinct. In contrast, the West Asian and Bronze Age Steppe streams of the ANI ancestry are rather genetically similar in comparison, making them harder to differentiate. Ancient DNA has been particularly useful because the differences were starker in the past.

All good so far. Much of this aligns with The Ocean of Churn and its thesis of bidirectionality. The problem I have is that Sanyal seems to be implicitly assuming that an “Out of India” theory for the emergence of Indo-Aryans is the correct position when there’s a lot of legitimate debate about this, and good reason to hold that this is not plausible. He refers to the Indo-Aryan Mitanni as “Indian,” when in fact this is likely to be an anachronism. Similarly, if it is found that a Dravidian language was spoken in southern Iran during the Bronze Age, I suspect that terming them “Indian” would also be an anachronism.

When I told Sanyal I was going to bring this up on Twitter he told me that I needed to cite peer-reviewed literature. This is reasonable, in light of the fact that when you’re navigating different disciplines you can’t familiarize yourself totally with the landscape…but, it highlights a problem with his citation pattern: he’s not a human population geneticist, and so hasn’t kept up with the field, nor does he know what papers are of high quality in retrospect and what papers are not. I suggested to him that I could actually run many of the analyses myself since the data is open, but he responded that this would be “he said/she said.” This is fair because most people do not have much familiarity with population genomics. But, it is unfair because I actually have familiarity with the field and can actually do the work myself, so perhaps my opinion should be weighted a bit higher?

Ultimately all this is going to be forgotten commentary when sites like Rakhigarhi start yielding ancient DNA. I have already made a bunch of predictions relating to that research. There have already been leaks in the Indian press, such as ‘Descendants of Harappans still living in Rakhigarhi’. I’m pretty sure that what they’ll find is that the people who inhabited the Northwest quadrant of South Asia at that point were already admixed. They simply lacked the Bronze Age Steppe component of ancestry, which probably arrived with the Indo-Aryans.

Of more interest to me is Sanyal’s assertion of Southeast Asian influences on India. The 1000 Genomes data makes clear that there is substantial admixture from Southeast Asian populations in Bengal. There is no historical record of this, but its impact has been significant. The Ocean of Churn makes much of matrilineal customs, and their diffusion from Southeast Asia to India through the vector of migration. I’m not so sure that migration (cultural diffusion) is the only explanation of this phenomenon, but it is certainly plausible. One quibble I would make here is the same as above in regards to India: a lot of the population dynamics of Southeast Asia date to the later Holocene. Not the Holocene-Pleistocene boundary, when Sundaland would have been inundated.

There is clearly recent gene flow from South Asia to Southeast Asia. The genetic data from Cambodia suggest the admixture is evenly distributed, which means it was a demographic movement which affected the whole people. The cultural connections between Indic Southeast Asia and India have long been known, but it has long been assumed that this was mostly a matter of ideas, not people. But clearly enough people went so that ~5% of Cambodian ancestry seems to be Indian! The question then turns to reverse migrations. The movement into Bengal was relatively recent, and my own analyses have shown that it can’t be explained as purely an Austro-Asiatic event. Rather, the Tai migrations which reshaped the cultural and to some extent demographic landscapes of mainland Southeast Asia seem to have had a spillover effect into Bengal, which was at that time rising to prominence under the Pala dynasty.

Ultimately The Ocean of Churn lives up to its name. The author explores connections between Madagascar and the East African coast, the Indus Valley Civilization and Mesopotamia, and Southeast Asia and India. It is fertile ground, and despite my quibbles and concerns with portions of the book, it is an excellent place to start. If you want to continue in a more narrow and academic vein, I’d recommend The Shape of Ancient Thought: Comparative Studies in Greek and Indian Philosophies, a book which traces connections between the two civilizations, and also looks further into deeper influences from Mesopotamia on both of them.

• Category: History, Science • Tags: India, Ocean of Churn
