The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog / African Genomics


I mentioned this in passing on my post on ASHG 2012, but it seems useful to make explicit. For the past few years there has been word of research pointing to connections between the Khoisan and the Cushitic people of Ethiopia. The forthcoming paper likely goes a long way toward answering the question of who lived in East Africa before the Bantu, and before the most recent back-migration of West Eurasians. On one level I’m confused as to why this has to be something of a mystery, because the most recent genetic evidence suggests an admixture on the order of 2-3,000 years before the present.* If the admixture was so recent we should find many of the “first people,” no? As it is, we don’t. I think these groups, and perhaps the Sandawe, are the closest we’ll get.

Publication is imminent at this point (of this, I was assured), so I’m going to just state the likely candidate population (or at least one of them): the Sanye, who speak a Cushitic language with possible Khoisan influences. There really isn’t that much information on these people, which is why when I first heard about the preliminary results a few years back and looked around for Khoisan-like populations in Kenya I wasn’t sure I’d hit upon the right group. But at ASHG I saw some STRUCTURE plots with the correct populations, and the Sanye were one of them. I would have liked to see something like TreeMix, but the STRUCTURE results were of a quality that I could accept that these populations were not being well modeled by the variation which dominated their data set. Though Cushitic in language the Sanye had far less of the West Eurasian element present among other Cushitic speaking populations of the Horn of Africa. Neither were their African ancestral components quite like that of the Nilotic or Bantu populations. The clustering algorithm was having a “hard time” making sense of them (it seemed to want to model them as linear combinations of more familiar groups, but was doing a bad job of it).

Here is an interesting article on these groups: Little known tribe that census forgot. Like the Sandawe this is a population which seems to have been hunter-gatherers very recently, and to some extent still engage in this lifestyle. In this way I think they are fundamentally different from Indian tribal populations, who are often held up to be the “first people” of the subcontinent. More and more it seems that the tribes of India are less the descendants of the original inhabitants of the subcontinent, at least when compared to the typical Indian peasant, and more simply those segments of the Indian population which were marginalized and pushed into less productive territory. Over time they naturally diverged culturally because of their isolation, but the difference was not primal. In contrast, groups like the Sanye and Sandawe may have mixed to a great extent with their neighbors (and lost their language like the Pygmies), but evidence of full featured hunting & gathering lifestyles implies a sort of direct cultural continuity with the landscape of eastern Africa before the arrival of farmers and pastoralists from the west and north.

* I understand some readers refuse to accept the likelihood of these results because of other lines of information. I am just relaying the results of the geneticists. I am not interested in re-litigating prior discussions on this. We’ll probably have a resolution soon enough.

(Republished from Discover/GNXP by permission of author or representative)
 


After the second Henn et al. paper I did download the data. Unfortunately there are only 62,000 SNPs intersecting with the HGDP. This is somewhat marginal for fine-grained ADMIXTURE analyses, though sufficient for PCA from what I recall. That being said, the intersection with the HapMap data sets runs from ~190,000 SNPs, to the full 250,000 SNPs (this makes sense since the Henn et al. #2 data set has some HapMap populations in it). So I’ve been experimenting a fair amount in the past few days, and I thought I would post on one issue which was clear in the original paper, but which I have replicated.
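The marker counts above come from intersecting the SNP lists of two data sets. As a minimal sketch of that step (not the exact procedure used here), the snippet below assumes PLINK-style `.bim` files, in which the second whitespace-separated column is the variant ID. The function names and the toy IDs are hypothetical.

```python
def read_bim_ids(path):
    """Read variant IDs (second column) from a PLINK-style .bim file."""
    ids = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                ids.append(fields[1])
    return ids


def shared_snps(ids_a, ids_b):
    """Return the SNP IDs present in both panels, in panel A's order."""
    in_b = set(ids_b)
    return [snp for snp in ids_a if snp in in_b]


# Hypothetical marker lists standing in for two genotype panels.
panel_a = ["rs1", "rs2", "rs3", "rs4"]
panel_b = ["rs4", "rs2", "rs9"]
common = shared_snps(panel_a, panel_b)
print(len(common), common)
```

With real data, `read_bim_ids` would be called on each data set's `.bim` file and the shared list passed to whatever tool does the merging.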


The Fulani (Fula) people of the western Sahel seem to have a relatively old West Eurasian component which has distinct affinities with the “Maghrebi” element discerned by Henn et al. In fact, the non-Sub-Saharan African ancestry of the Fulani is almost exclusively of this origin. To me this serves as a peculiar mirror of what you see in the Cushitic and Ethiopian Semitic peoples of the far east of the Sahel-Sudan latitudinal region. These populations also seem to be compounds of a Sub-Saharan Africa element with a West Eurasian one, but in their case the admixture is almost exclusively from a Southwest Eurasian (Arabian) component. Geographically these two symmetric admixture events make sense, but the exclusivity is still a bit surprising. Additionally, in both the case of the Fulani and the Ethiopian and Cushitic groups the admixture is widely distributed and even enough to imply that they are old events. I also assumed this because in some admixture runs a “pure” Fulani cluster partitions out, which is not unexpected for stabilized hybrid populations (all human populations are stabilized hybrids if you go back far enough).

To give you a flavor of what I’m talking about here are some screen shots of a run which is currently going. It has 180,000 markers. I removed Tunisians and many African populations from the Henn et al. data set, and included the Utah whites from the HapMap. The individual plots show the ancestral proportions for each Fulani in the data set:

So what can we see here? First, let’s reiterate something: as in the case of the populations of the Horn of Africa the West Eurasian element in the Fulani is difficult to find in “pure” form in the populations from which it putatively derived. What does that imply? I think this means that the Fulani have an origin in relatively recent historic time, on the order of 2,000, not 10,000, years. That is because I am skeptical that the Fulani would be able to maintain genetic distinctiveness for ~10,000 years from other populations around them. In contrast, the last 2,000 years have seen the rise of various cultural institutions, from trans-Saharan nomadism to Islam, which might slow down admixture sufficiently to maintain the differences between the Fulani and their neighbors. It also implies to me that the non-Maghrebi “Near Eastern” element which Henn et al. discerned is a relatively recent phenomenon in northwest Africa, else the Fulani should also carry it. How recent? Probably from Classical Antiquity down to the Muslim period. Observe that many North African groups have a red “European” element. This may be from Near Eastern populations, but I suspect that the fraction here is just too high to be explained by that. Also, you can see above that some groups in Morocco have nearly as much of this as Egyptians, but far less of the more genuine Near Eastern components.

In all likelihood the West Eurasian component came to the Fulani via the Tuareg or a related or antecedent population. So if you typed the Tuareg you would probably get a better sense of the “pure” “Maghrebi” genetic profile. These genetic results also can serve as fodder to understanding the ethnogenesis of the landscape of the Sahel. In the map above it is interesting to observe that the Hausa speak an Afro-Asiatic language, even though their West Eurasian component is far lower than the Fulani, who speak Niger-Congo dialects. What gives? I suspect that the difference here is that the Hausa are a case of elite emulation of a cultural complex which was much more integrated and elaborated by the time it arrived on the West African scene. This explains how there could be language shift, while in the case of the Fulani there was none. Another hypothesis is that Afro-Asiatic derives from Sub-Saharan Africa itself, and the Chadic (Hausa) group are basal to the phylogeny. I’ll let readers explore the implications of that. A final aspect, I put the quotations in the title because perhaps the Berber dialects spread via elite emulation, and the original Maghrebi ancestors of the Fulani spoke a different language, which has been lost? As they say, for every answer there bloom a thousand questions….

Image credit: Wikipedia, Wikipedia.

(Republished from Discover/GNXP by permission of author or representative)
 

In my post below, Tutsi probably differ genetically from the Hutu, there were many comments. Some I did not post because they were rude, though they did ask valid questions. I will address those issues, but let me quote one comment:

That’s an interesting possibility, but this admixture run didn’t split the non-hunter-gatherer Africans that well. In one of your previous analyses on East Africa you managed to get a pretty accurate ‘Afro-Asiatic/Cushitic’ and ‘Nilotic’ cluster. Is it possible that you could run this Tutsi sample using the same admixture settings as in the ‘Flavors of Afro-Asiatic’ blog post to see if he carries a significant Nilotic component or is mainly Bantu & Cushitic derived?

So I replicated ADMIXTURE runs for many of the same populations as I did in my post, Flavors of Afro-Asiatic. I also pared down the population set and generated a PCA with EIGENSOFT. Before I get to those results, let me tackle the questions.

1) “Are the Luhya suitable proxies for the Hutus?”

Probably. The reason is that Bantu-speaking populations, from the Congo to South Africa, are surprisingly similar. Not only that, but these populations are very distinct from groups which are close to them geographically, but linguistically different (e.g., Khoe, Sandawe, Masai). The Luhya are not exceptional. I’ve run the Henn et al. data sets enough to be convinced that they’re exactly as they should be. They are pretty much what you’d expect from Kenyan Bantu: a predominant element which ties them back to an East-Central African point of origin, with some admixture with other East African elements (similarly, South African Bantu exhibit Khoisan admixture). The Hutu may be peculiar, but we don’t know, and my null is that they’re mostly Bantu with some admixture, as is the case with most Bantu speaking populations (this one Tutsi seems to be an exception in that context, as they are presumably Bantu speaking). If you think that the Luhya are not suitable, I invite you to download the HapMap Luhya, and merge them with some of the Henn et al. data sets (or HGDP or Behar data sets). I think that should convince you.


2) “The admixture percentages you give are weird for population X.”

Someone who is more technically fluent than I can correct me, but I suggest that you be very careful about taking absolute percentages too literally. If you tell a statistical algorithm to push the genetic variation you’ve input into it into a certain number of boxes, it will do that, even if it has to squeeze them in all sorts of ways. In other words, modulating the parameters is an easy way to generate plenty of weird absolute proportions. Often it’s pretty obvious that deeply admixed populations are showing up as their own distinctive cluster…but that begs the question, when is admixture so distant that it shouldn’t count? Instead of focusing on absolute percentages, look at the relationships between individuals and populations. These too can be tweaked and massaged, but my personal experience is that they’re somewhat less volatile.

3) “The Nilotic cluster doesn’t map well onto Nilotic populations.”

The labels one gives, formally or informally, to a population cluster are for ease of recollection. They are not there to transmit to you real concrete information about the deep history of a population and its relationship. Additionally, there is always going to be a lot of confusion when you leverage geographical or linguistic terminologies which have only approximate relationships to genetic clusters. Don’t get so caught up in semantics that you forget that ADMIXTURE components are abstractions, useful for smoking out genetic variation, not for perfectly mapping onto some idealized set of ur-populations.
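To make the caution under question 2 concrete, here is a toy sketch of how an "absolute percentage" depends on which boxes are offered. This is not ADMIXTURE's algorithm. It is a simple two-source least-squares fit, and every allele frequency and name below is hypothetical. The same target population comes out as half "box 1" against one alternative source and a quarter "box 1" against another.

```python
import numpy as np


def mixing_fraction(target, source_a, source_b):
    """Least-squares share of source_a, assuming target = f*a + (1 - f)*b."""
    t, a, b = (np.asarray(v, dtype=float) for v in (target, source_a, source_b))
    diff = a - b
    f = np.dot(t - b, diff) / np.dot(diff, diff)
    return float(np.clip(f, 0.0, 1.0))


# Hypothetical allele frequencies at five SNPs.
pop_x = np.array([0.50, 0.65, 0.80, 0.25, 0.40])
box_1 = np.array([0.80, 0.80, 0.80, 0.40, 0.40])
box_2 = np.array([0.20, 0.50, 0.80, 0.10, 0.40])
box_2_alt = np.array([0.40, 0.60, 0.80, 0.20, 0.40])

# Same target, same first box, different second box.
share_vs_box_2 = mixing_fraction(pop_x, box_1, box_2)
share_vs_alt = mixing_fraction(pop_x, box_1, box_2_alt)
print(round(share_vs_box_2, 3), round(share_vs_alt, 3))
```

Neither number is "the" answer. Each is only meaningful relative to the reference boxes, which is why the relationships between individuals and populations tend to be the more stable thing to look at.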

Now to my results. I used 200,000 markers. I combined Lithuanians and Belorussians into one pot as “Baltic,” and Syrians and Jordanians into another as “Levantines.” For the PCA I focused on African populations, and used the Yemeni Jews as the outgroup. Additionally, there is clearly structure due to some family relationships amongst the Masai. This is a problem in many runs with them. Even when you remove the “problem” individuals other clusters tend to crop up at higher K’s where the Masai are very numerous. In any case, for the purpose of these runs ignore the family clusters, and focus on the more typical individuals amongst the Masai.

Remember that the Tutsi is 3/4 Tutsi, 1/4 Hutu. It is N = 1. So is the Nubian. You see in many of the Horn of Africa populations that the Eurasian component has an affinity with Yemenis, not Europeans. In contrast, the Nubian does have some European-like component. That’s probably simply due to the fact that in this run Levantines themselves have that, and Egyptians who also carry that component are part of the heritage of the Nubian. The Tutsi does have the Southwest Asian component, which the Masai seem to lack.

To get a better sense, let’s look at a slice of individuals. The Tutsi is last. The family relationships of some of the Masai are also clear. Focus on the more typical Masai and the Tutsi:

Looking at the individual results it seems that the Tutsi can be placed within the range of combinations of ancestral components of the Masai, though not the Luhya. To get a different vantage point let’s look at some PCAs, which visualize the largest components of genetic variance in the data set.

The results are not cut & dry. I am less skeptical of some Afro-Asiatic element in the Tutsi heritage, though it still seems that the dominant affinity is with the Masai.

Note: I ran K = 7 to K = 10. There wasn’t anything different in the general pattern of the runs I did not show.

(Republished from Discover/GNXP by permission of author or representative)
 

Over the past few days I’ve been trying to read a bit on the Sandawe. Most of the stuff I’ve been able to find is in the domain of linguistics, and is basically unintelligible to me in any substantive manner. The crux of the curiosity here is that the Sandawe, like their Hadza neighbors, have clicks in their language, and so have been classified with the Khoisan. Here’s some background:

The most promising candidate as a relative of Sandawe are the Khoe languages of Botswana and Namibia. Most of the putative cognates Greenberg (1976) gives as evidence for Sandawe being a Khoesan language in fact tie Sandawe to Khoe. Recently Gueldemann and Elderkin have strengthened that connection, with several dozen likely cognates, while casting doubts on other Khoisan connections. Although there are not enough similarities to reconstruct a Proto-Khoe-Sandawe language, there are enough to suggest that the connection is real.

I can’t speak to the validity of this at all, obviously. Some scholars do argue that the clicks in the Sandawe language were only acquired through interaction with peoples such as the Hadza, making an analogy to Xhosa, a Bantu language which has been strongly influenced by Khoi dialects. In any case, after having run ADMIXTURE a bunch of times on African population sets, and checked the genetic distances of the inferred ancestral ones, one thing that is clear is that the Sandawe don’t show a particularly close genetic relationship to the Bushmen, nor do they show a close relationship to the Hadza. In fact, the Hadza, Pygmies, and Bushmen show a closer relationship to each other, distant as it is, than to the Sandawe. The Sandawe themselves are distinctive from their Bantu neighbors, but, their connections seem more clear to the Masai and other peoples to the north.

Some of the anthropological stuff that I did find on the Sandawe not having to do with linguistics considered the issue of their status as hunter-gatherers, and their shift toward a form of agriculture within the past few centuries. Not surprisingly much of this literature consisted of ideologically shrill posturing, denouncing past scholarship for insensitivity and bigotry, while taking their own maximalist position. For example there has been the hypothesis that hunter-gatherer populations tend to be genetically and culturally isolated from agriculturalists, with several African groups used as exemplars. A group of anthropologists argue strenuously that this model may just be a construction of the biases of previous generations of scholars. But they offer little in the way of counterargument, more keen on uncovering the faults in the motives and methods of their predecessors than in building anything anew.

Genetics can help us a little here. Below are the results of ADMIXTURE and PCA I ran for a selection of populations. I pulled in some Behar et al. samples and merged them with the Henn et al. data set. The marker list was pruned down to ~160,000 SNPs. The limited selection of populations was conscious, insofar as I was exploring specific questions about the relationship of East African populations to Eurasian ones. At K = 8 the populations in my data set separated rather well. Do not take this separation as evidence that this K is a reflection of absolute concrete ancestral populations. Here’s the bar plot:

Since I’ve been running this data set, with some modifications, for a week now I can pick out some trends which I feel are robust at K = 8. For example, the Eurasian-like admixture you see across eastern Africa seems to be distinctively of a southern nature, centered on Arabia (probably Yemen). This makes total geographical sense. The Ethiopians and Somalis (I have some Somali samples which I threw in with the Ethiopians since the Cushitic Ethiopians seem more similar to the Somalis than to Semitic Ethiopians) lack the genetic influence of Bantus in totality. Rather, they have an affinity with the Nilo-Saharan peoples. Finally, the Sandawe tend to “break out” as a separate population only at higher K’s, generally clustering with the Nilo-Saharan element as long as possible.

Let’s also look at a PCA of the populations above on the first two principal components:

The PCA looks a little different from the ones you’re used to seeing because there are only West Eurasian and African groups in the sample. So the second component is not the familiar west-east axis in Eurasia, but the separation between the Mbuti and other Africans. On the far right of the plot you have Orcadians, then Druze, Saudis, and Yemenis. Then you have Horn of Africa populations, Ethiopians and Somalis along the vertical axis. Then Masai and Sandawe, and Luhya, a Kenyan Bantu group. The Masai are a confusing group. Even after removing problem individuals who might be related there tends to be a choppiness in the Masai results. The Sandawe on the other hand are more consistent by and large.

The genetic distances of the inferred ancestral groups aren’t too surprising. Here are MDS visualizations:

One of the consistent trends you see is that the Masai are closer to Eurasians than the Sandawe, but the “Masai” modal ancestral component is no closer to, and may even be further from, Eurasians than the “Sandawe” ancestral component. At higher K’s once the “Sandawe” element partitions out it is extremely dominant among the Sandawe, and found in lower fractions among other East African groups, especially non-Bantu such as the Masai. I wouldn’t put too much stock in the high proportion in the Ethiopians above, as the outcomes are rather scattered across the K’s and population combinations. The Masai are a population who always seem to have a low fraction of Eurasian-like “Arabian”, and this is what drags the population toward the Eurasians as in the PCA above. The Sandawe seem to lack this admixture; rather, their affinity with Eurasians is deeper and may not be due to admixture at all (ADMIXTURE itself is not perfect, and may transform an admixed group into a “pure” component, as we can sometimes see among the Fulani or among South Asians, and, I suspect, the Mozabites).
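For readers unfamiliar with MDS: it takes a matrix of pairwise distances (here, genetic distances between inferred ancestral components) and finds low-dimensional coordinates that reproduce those distances as well as possible. Below is a minimal sketch of classical (metric) MDS using NumPy. I am assuming this variant for illustration only, and the distance matrix is a made-up toy, not values from these runs.

```python
import numpy as np


def classical_mds(distances, n_dims=2):
    """Embed points in n_dims dimensions from a symmetric distance matrix."""
    D = np.asarray(distances, dtype=float)
    n = D.shape[0]
    # Double-center the squared distances to get a Gram (inner product) matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    # Keep the largest eigenvalues; clip tiny negatives caused by rounding.
    order = np.argsort(eigvals)[::-1][:n_dims]
    vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(vals)


# Hypothetical distances among three components labeled A, B, C.
toy = np.array([[0.0, 1.0, 3.0],
                [1.0, 0.0, 2.0],
                [3.0, 2.0, 0.0]])
coords = classical_mds(toy, n_dims=2)
print(coords)
```

The orientation and sign of the axes are arbitrary; only the relative positions of the points carry information.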

Back to the Sandawe and their position in the history of East Africa. Unlike the Pygmies and Khoisan they are not basal in relation to other human lineages from what I can see here. That is, they don’t “split off” as early from the main cluster of branches in a phylogenetic tree of human populations. In fact, unlike the Pygmies and Khoisan, and like the Masai, they are closer to Eurasians than the West African or Bantu peoples. In other words, they’re less basal. In fact, the Sandawe may be closer to Eurasians than most of the Nilotic groups when recent admixture with Eurasians is removed from the picture.

I do not know if the Sandawe are indigenous to their region of Tanzania. If I had to bet money I’d say not, and that some scholarly suppositions for a northerly origin may be plausible based on the affinities with the Masai and even Cushitic and Semitic peoples of Ethiopia and Somalia. The distinctiveness of the Sandawe from their Bantu neighbors seems clear, and there is no special closeness to the Khoisan of Southern Africa. Many anthropologists and historians have pointed out that some groups can “revert” to hunting and gathering facultatively. But the total Bantu domination of much of East Africa suggests to me that this was not the case with the Sandawe. I think a plausible model is that the Sandawe were part of the substrate of East African hunter-gatherers who have mostly been eliminated and absorbed by the Bantu. In the north related peoples contributed to the emergent Nilo-Saharan and Ethiopian and Cushitic societies, which were able to avoid being swamped by the Bantu because of ecology and their own agricultural traditions. In this model the Sandawe affinities to Khoisan groups were more a matter of horizontal cultural borrowing and influence due to proximity than a close genetic relationship.

(Republished from Discover/GNXP by permission of author or representative)
 


Khoikhoi on the move….

Dienekes mentioned today a new paper, Signatures of the pre-agricultural peopling processes in sub-Saharan Africa as revealed by the phylogeography of early Y chromosome lineages. Because of the recent comments in this space on the genetic history of Africa I was curious, but after reading it I have to say I can’t make much sense of the alphabet soup of haplogroups. Remember, there are different ways to capture and analyze the variation in one’s genes. A common activity is to sweep over the whole genome and focus on single nucleotide polymorphisms, variation at the base pair level. So my own analyses using ADMIXTURE focus on tens or hundreds of thousands of such markers. But there are other types of genomic variation, such as copy number, microsatellites, and minisatellites.

Additionally, much of the older human phylogeographic literature focused on mtDNA and Y chromosomal variance. For mtDNA it was partly a function of how easy it was to extract the genetic material (it’s copious on the cellular level). But perhaps more importantly these two types of variance aren’t subject to recombination. This means they are defined by clean phylogenetic trees which do not exhibit reticulation (recombination chops apart correlated markers and mixes & matches them) and presumably are not subject to natural selection, and so perfect for coalescent theory. So you can posit lineages related to each other by steps of sets of mutations, and also easily calculate the time until the last common ancestor for two different branches of the tree using a “molecular clock” model.
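As a rough illustration of the "molecular clock" calculation mentioned above: if two non-recombining sequences differ at d of L sites, and mutations accumulate at a constant rate μ per site per year along both branches, the expected number of differences is 2μLT, so T ≈ d / (2μL). The sketch below uses that simplest strict-clock form, and all of the numbers are hypothetical placeholders rather than real Y chromosome or mtDNA estimates.

```python
def tmrca_years(n_differences, n_sites, mu_per_site_per_year):
    """Time to the most recent common ancestor under a strict molecular clock.

    Assumes mutations accumulate independently on both branches, so the
    expected number of differences is 2 * mu * n_sites * T.
    """
    return n_differences / (2.0 * mu_per_site_per_year * n_sites)


# Hypothetical inputs: 20 differences over 10,000 sites, 1e-8 per site per year.
estimate = tmrca_years(20, 10_000, 1e-8)
print(round(estimate))
```

Real analyses add calibration uncertainty and more elaborate mutation models, so this should be read only as the arithmetic behind the idea.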

Here’s the abstract:

The study of Y chromosome variation has helped reconstruct demographic events associated with the spread of languages, agriculture and pastoralism in sub-Saharan Africa, but little attention has been given to the early history of the continent. In order to overcome this lack of knowledge, we carried out a phylogeographic analysis of haplogroups A and B in a broad dataset of sub-Saharan populations. These two lineages are particularly suitable for this objective because they are the two most deeply rooted branches of the Y chromosome genealogy. Their distribution is almost exclusively restricted to sub-Saharan Africa where their frequency peaks at 65% in groups of foragers. The combined high resolution SNP analysis with STR variation of their sub-clades reveals strong geographic and population structure for both haplogroups. This has allowed us to identify specific lineages related to regional pre-agricultural dynamics in different areas of sub-Saharan Africa. In addition, we observed signatures of relatively recent contact, both among Pygmies, and between them and Khoisan speaker groups from southern Africa, thus contributing to the understanding of the complex evolutionary relationships among African hunter-gatherers. Finally, by revising the phylogeography of the very early human Y chromosome lineages, we have obtained support for the role of southern Africa as a sink, rather than a source, of the first migrations of modern humans from eastern and central parts of the continent. These results open new perspectives on the early history of Homo sapiens in Africa, with particular attention to areas of the continent where human fossil remains and archaeological data are scant.

The authors posit that the connections between southern African Bushmen and the Pygmies of central Africa which they find in the Y chromosomal lineages might have been mediated by the peregrinations of Khoikhoi pastoralists, who possibly diffused from a central-southern African ur-heimat in advance of the Bantu expansion. This seems plausible to me.

The main issue I’m curious about in regard to all these results is the connection between Pygmies and Bushmen, set against the Bantus. I certainly had not expected it, and it has been repeated several times. There is now a lot of weird evidence that demands a hypothesis.

Image credit: Wikipedia

(Republished from Discover/GNXP by permission of author or representative)
 

Last year a paper came out in Science which made a rather large splash, The Genetic Structure and History of Africans and African Americans by Tishkoff et al. Since it’s more than a year old I recommend that those of you who are curious about the details of the paper and don’t have academic access go through the free registration, as you can then read it in full. Unlike Reich et al. the Science paper didn’t unveil a new method of analysis. It was the standard bread & butter, with PCAs & STRUCTURE plots & phylogenetic trees. But the coverage of populations within Africa was massive. They had a lot of results and relationships to cover, and ended up with a 100 page supplement.

I commend the whole paper to you. But there are two elements I want to highlight. First, a three dimensional PCA plot. It has the first, second and third principal components of variation. In other words, the three largest independent dimensions in terms of explanatory power of genetic variation. Panel A includes all world populations, and panel B just Africans.


[Figure: three-dimensional PCA plot; panel A, all world populations; panel B, Africans only]

For panel A, PC1 = 20% of the variance, PC2 = 5%, and PC3 = 3.5%. For panel B the PCs didn’t drop off quite so much, PC1 = 11%, PC2 = 6%, PC3 = 5% and PC4 = 4%. In case you don’t know, the Hadza are Africa’s last obligate hunter-gatherers, and speak a language with clicks in it, just as the Bushmen do. The big division highlighted in this paper is that between the “indigenous” relict populations, the Hadza, Sandawe, Bushmen and Pygmies, and those who belong to the more widespread agriculturalist and pastoralist societies of Africa. Implicit within the paper is the model of a Bantu Expansion of farmers, as well as a possible later Nilotic expansion (which brought the Tutsi and Masai) of herders, in a north-south direction. In the process they assimilated and/or displaced the indigenous populations, of whom the aforementioned peoples are relict islands persisting in ecologically isolated or unfavorable domains.
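The percentages quoted for each PC are shares of the total variance in the genotype matrix. A minimal sketch of how such shares can be computed is below, using a plain SVD in NumPy on simulated toy genotypes. Tools such as EIGENSOFT apply additional per-SNP normalization that is omitted here, and the function name and simulated data are hypothetical.

```python
import numpy as np


def pca_variance_explained(genotypes, n_components=3):
    """Return PC scores and each PC's share of the total variance."""
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each SNP column
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    share = s ** 2 / np.sum(s ** 2)  # squared singular values ~ variance
    scores = U[:, :n_components] * s[:n_components]
    return scores, share[:n_components]


# Simulated toy data: two groups of 10 individuals, 200 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, size=200), 0.05, 0.95)
geno = np.vstack([rng.binomial(2, freq_a, size=(10, 200)),
                  rng.binomial(2, freq_b, size=(10, 200))])

scores, share = pca_variance_explained(geno)
print(np.round(share, 3))
```

A steep drop from the first share to the second, as in panel A, means one axis dominates; a flatter sequence, as in panel B, means the structure is spread across several comparably sized axes.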

The map to the left shows the population coverage within this paper of African groups. The pie graphs simply show ancestral quanta as inferred by STRUCTURE. You can read the paper for the blow-by-blow. But ultimately it seems there will be need for a finer-grained coverage to the south of the equator. If the Bantu expansion is as recent as archaeologists and linguists assume, on the order of ~2,000 years ago, then the gradients of genetic signals should persist. From what I can tell it is assumed on both genetic and phenotypic grounds that the Xhosa have a higher load of Khoisan ancestry than the Zulu or Tswana. The Bantu Expansion is recent enough that the semi-legendary Phoenician circumnavigation of Africa would have encountered many Khoisan peoples along the eastern coast.

Below is a selection of figures from the above paper. After selecting an image it is probably best to hit F11 for “Full Screen” if you aren’t on a very big monitor (you can copy image location and view it in a separate window as well).

[Image gallery: selected figures from Tishkoff et al.]

(Republished from Discover/GNXP by permission of author or representative)
 
Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at http://www.razib.com"