The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Gene Expression Blog
Geno 2.0


The above map shows the population coverage for the Geno 2.0 SNP-chip, put out by the Genographic Project. Their paper outlining the utility of and rationale for the chip is now out on arXiv. I saw this map last summer, when Spencer Wells hosted a webinar on the launch of Geno 2.0, and it was the aspect which really jumped out at me. The number of markers on this chip is modest: only a bit more than 100,000 on the autosomes, with a few tens of thousands more on the X, Y, and mtDNA. In contrast, the Axiom® Genome-Wide Human Origins 1 Array Plate being used by Patterson et al. has ~600,000 SNPs. But as is clear from the map above, Geno 2.0 is ascertained in many more populations than other comparable chips (the Human Origins 1 Array uses 12 populations). It’s obvious that if you are only catching variation in a few populations, all the extra markers may not give you much bang for the buck (not to mention the biases that this may introduce into your population genetic and phylogenetic inferences).

To the left is the list of populations against which the Human Origins 1 Array was ascertained, and they look rather comprehensive to me. In contrast, for Geno 2.0 ‘ancestrally informative markers’ were ascertained on 450 populations. The ultimate question for me is this: is all the extra ascertainment on diverse and obscure groups worth it? On first inspection Geno 2.0's number of SNPs looks modest, as I stated, but in my experience when you quality control and merge different panels together you are often left with only a few hundred thousand SNPs in any case. 100,000-200,000 SNPs is also sufficient to elucidate relationships even in genetically homogeneous regions such as Europe (it’s more than enough for model-based clustering, and seems to be overkill for MDS or PCA). One issue that jumps out at me about the Affymetrix chip is that it is ascertained toward the antipodes. In contrast, Geno 2.0 takes into account the Eurasian heartland. I suspect, for example, that Geno 2.0 would be better for population or ancestry assignment for South Asians because it would have more informative markers for those populations.
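The shrinkage that comes from quality control and merging can be illustrated with a toy sketch. It assumes each panel is just a mapping from SNP ID to call rate, and that a SNP survives only if it passes a call-rate threshold on both chips. The SNP names, call rates, and the 0.95 cutoff are all hypothetical, and a real merge would typically also have to reconcile strand and allele coding, which this ignores.

```python
from typing import Dict, Set


def passing_snps(call_rates: Dict[str, float], min_call_rate: float = 0.95) -> Set[str]:
    """Return the SNP IDs whose call rate meets the (hypothetical) QC threshold."""
    return {snp for snp, rate in call_rates.items() if rate >= min_call_rate}


def merge_panels(
    panel_a: Dict[str, float],
    panel_b: Dict[str, float],
    min_call_rate: float = 0.95,
) -> Set[str]:
    """Keep only SNPs that pass QC on both panels (set intersection)."""
    return passing_snps(panel_a, min_call_rate) & passing_snps(panel_b, min_call_rate)


# Made-up panels: SNP ID -> call rate.
chip_a = {"snp1": 0.99, "snp2": 0.90, "snp3": 0.98, "snp4": 0.97}
chip_b = {"snp1": 0.99, "snp3": 0.96, "snp4": 0.80, "snp5": 1.00}

shared = merge_panels(chip_a, chip_b)
print(f"{len(chip_a)} and {len(chip_b)} SNPs in, {len(shared)} shared SNPs out")
```

Because the result is an intersection, the merged set can never be larger than the smaller panel after QC, which is why a chip's headline marker count tends to overstate what is usable in a combined analysis.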

Ultimately I can’t really say much more until I use both marker sets in different and similar contexts. Since Geno 2.0 consciously excludes many functional and medically relevant SNPs, its utility is primarily in the domain of demographics and history. If the populations in question are well covered by the Human Origins 1 Array, I see no reason why one shouldn’t go with it. Not only does it have more information about biological function, but the number of markers is many-fold greater. On the other hand, Geno 2.0 may be more useful in the “blank zones” of the Affy chip. Hopefully the Genographic Project results paper for Geno 2.0 will come out soon and I can pull down their data set and play with it.

Cite: arXiv:1212.4116


The Genographic Project is now moving beyond uniparental lineages with Geno 2.0. Spencer Wells kindly invited me to a conference call last month where he outlined a lot of the details, so I’ll hit the salient points for readers of this weblog:

* They’re unveiling a new SNP-chip and a new project which moves beyond the Y and mtDNA to the autosome. But they’re also expanding their coverage of uniparental markers.

* Though there are “only” 130,000 autosomal markers, Wells and his collaborators have selected a subset of markers which are highly informative of population structure (e.g., high Fst). Their SNPs are biased toward those with moderate levels of polymorphism across many populations, to maximize the power to diagnose differentiation.

* They tried really hard to get rid of ascertainment bias. Many previous chips tend to work off the polymorphism found in Europeans, and then examine worldwide variation using this ruler. The problems with this method are obvious. One of the scientists on this project outlined how they worked to look for SNPs which are very informative for populations where ascertainment bias is a particular problem, Oceanians and Amerindians. I was impressed by their punctilious attitude on this question.

* The major downside is that they don’t have many trait-informative SNPs on the chip. This means that they’re only interested in phylogenetics and phylogeography, rather than the evolution of specific suites of traits.
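To make the “high Fst” criterion in the list above concrete, here is a minimal sketch of scoring SNPs by how strongly their allele frequencies differ between populations. It uses the simple heterozygosity-based definition, Fst = (H_T - H_S) / H_T, with every population weighted equally. This is an illustrative assumption and not the Genographic team's actual selection procedure; the SNP names and frequencies are made up.

```python
from typing import Dict, List, Sequence, Tuple


def fst_biallelic(freqs: Sequence[float]) -> float:
    """Fst = (H_T - H_S) / H_T for one biallelic SNP.

    freqs holds the frequency of one allele in each population.
    Populations are weighted equally (a simplifying assumption).
    """
    if len(freqs) < 2:
        raise ValueError("need allele frequencies for at least two populations")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity if all populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # The SNP is fixed everywhere, so it carries no information.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


def rank_by_fst(snps: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Sort SNPs from most to least differentiated."""
    scored = [(name, fst_biallelic(freqs)) for name, freqs in snps.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical allele frequencies in three populations.
example = {
    "snp_flat": [0.50, 0.50, 0.50],
    "snp_mild": [0.40, 0.50, 0.60],
    "snp_sharp": [0.05, 0.50, 0.95],
}

ranking = rank_by_fst(example)
for name, value in ranking:
    print(f"{name}: Fst = {value:.3f}")
```

A SNP at the same frequency everywhere scores zero, while one that is rare in one population and common in another scores high, so a chip built from the top of such a ranking can separate populations with far fewer markers than one built from arbitrary SNPs.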

I’m sure that Wells will say a lot more. But there are a few extra aspects of the current trajectory which are exciting to me. First, they’re going to make their genotype results public at some point. Second, they’ll be encouraging use of the Geno 2.0 chip by giving it to specific researchers and groups. Third, their population coverage is very thorough. They have some publications in the pipeline, and it’s the last point that has me excited. I saw some slides of the coverage in India, and I’m 99% sure that this data set is the source of the claim from this group that India’s caste system predates the Indo-Aryans.

Addendum: Also, in some ways they are now moving into 23andMe’s space in scientific genealogy. If you are curious, please see Your Genetic Genealogist, as she has a much more thorough post.

• Category: Science • Tags: Geno 2.0, Genomics 
Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"