The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog


Over the past few days I’ve been trying to read a bit on the Sandawe. Most of the stuff I’ve been able to find is in the domain of linguistics, and is basically unintelligible to me in any substantive manner. The crux of the curiosity here is that the Sandawe, like their Hadza neighbors, have clicks in their language, and so have been classified with the Khoisan. Here’s some background:

The most promising candidate as a relative of Sandawe are the Khoe languages of Botswana and Namibia. Most of the putative cognates Greenberg (1976) gives as evidence for Sandawe being a Khoesan language in fact tie Sandawe to Khoe. Recently Gueldemann and Elderkin have strengthened that connection, with several dozen likely cognates, while casting doubts on other Khoisan connections. Although there are not enough similarities to reconstruct a Proto-Khoe-Sandawe language, there are enough to suggest that the connection is real.

I can’t speak to the validity of this at all, obviously. Some scholars do argue that the clicks in the Sandawe language were only acquired through interaction with peoples such as the Hadza, making an analogy to Xhosa, a Bantu language which has been strongly influenced by Khoi dialects. In any case, after having run ADMIXTURE a bunch of times on African population sets, and checked the genetic distances of the inferred ancestral populations, one thing that is clear is that the Sandawe don’t show a particularly close genetic relationship to the Bushmen, nor do they show a close relationship to the Hadza. In fact, the Hadza, Pygmies, and Bushmen show a closer relationship to each other, distant as it is, than to the Sandawe. The Sandawe themselves are distinct from their Bantu neighbors, but their clearest connections seem to be to the Masai and other peoples to the north.

Some of the anthropological stuff that I did find on the Sandawe not having to do with linguistics considered the issue of their status as hunter-gatherers, and their shift toward a form of agriculture within the past few centuries. Not surprisingly, much of this literature consisted of ideologically shrill posturing, denouncing past scholarship for insensitivity and bigotry while taking a maximalist position of its own. For example, there has been the hypothesis that hunter-gatherer populations tend to be genetically and culturally isolated from agriculturalists, with several African groups used as exemplars. A group of anthropologists argue strenuously that this model may just be a construction of the biases of previous generations of scholars. But they offer little in the way of counterargument, and are more keen on uncovering the faults in the motives and methods of their predecessors than on building anything anew.

Genetics can help us a little here. Below are the results of ADMIXTURE and PCA I ran for a selection of populations. I pulled in some Behar et al. samples and merged them with the Henn et al. data set. The marker list was pruned down to ~160,000 SNPs. The limited selection of populations was conscious, insofar as I was exploring specific questions about the relationship of East African populations to Eurasian ones. At K = 8 the populations in my data set separated rather well. Do not take this separation as evidence that this K is a reflection of absolute concrete ancestral populations. Here’s the bar plot:

Since I’ve been running this data set, with some modifications, for a week now I can pick out some trends which I feel are robust at K = 8. For example, the Eurasian-like admixture you see across eastern Africa seems to be distinctively of a southern nature, centered on Arabia (probably Yemen). This makes total geographical sense. The Ethiopians and Somalis (I have some Somali samples which I threw in with the Ethiopians since the Cushitic Ethiopians seem more similar to the Somalis than to Semitic Ethiopians) lack Bantu genetic influence entirely. Rather, they have an affinity with the Nilo-Saharan peoples. Finally, the Sandawe tend to “break out” as a separate population only at higher K’s, generally clustering with the Nilo-Saharan element as long as possible.

Let’s also look at a PCA of the populations above on the first two principal components:

The PCA looks a little different from the ones you’re used to seeing because there are only West Eurasian and African groups in the sample. So the second component is not the familiar west-east axis in Eurasia, but the separation between the Mbuti and other Africans. On the far right of the plot you have Orcadians, then Druze, Saudis, and Yemenis. Then you have Horn of Africa populations, Ethiopians and Somalis along the vertical axis. Then Masai and Sandawe, and Luhya, a Kenyan Bantu group. The Masai are a confusing group. Even after removing problem individuals who might be related there tends to be a choppiness in the Masai results. The Sandawe on the other hand are more consistent by and large.

The genetic distances of the inferred ancestral groups aren’t too surprising. Here are MDS visualizations:

One of the consistent trends you see is that the Masai are closer to Eurasians than the Sandawe are, but the “Masai” modal ancestral component is no closer to Eurasians than the “Sandawe” ancestral component, and may even be further from them. At higher K’s once the “Sandawe” element partitions out it is extremely dominant among the Sandawe, and found in lower fractions among other East African groups, especially non-Bantu ones such as the Masai. I wouldn’t put too much stock in the high proportion in the Ethiopians above, as the outcomes are rather scattered across the K’s and population combinations. The Masai are a population who always seem to have a low fraction of the Eurasian-like “Arabian” component, and this is what drags the population toward the Eurasians in the PCA above. The Sandawe seem to lack this admixture; rather, their affinity with Eurasians is deeper and may not be due to admixture at all (ADMIXTURE itself is not perfect, and may transform an admixed group into a “pure” component, as we sometimes see with the Fulani or South Asians and, I suspect, the Mozabites).

Back to the Sandawe and their position in the history of East Africa. Unlike the Pygmies and Khoisan they are not basal in relation to other human lineages from what I can see here. That is, they don’t “split off” as early from the main cluster of branches in a phylogenetic tree of human populations. In fact, unlike the Pygmies and Khoisan, and like the Masai, they are closer to Eurasians than the West African or Bantu peoples. In other words, they’re less basal. In fact, the Sandawe may be closer to Eurasians than most of the Nilotic groups when recent admixture with Eurasians is removed from the picture.

I do not know if the Sandawe are indigenous to their region of Tanzania. If I had to bet money I’d say not, and that some scholarly suppositions for a northerly origin may be plausible based on the affinities with the Masai and even Cushitic and Semitic peoples of Ethiopia and Somalia. The distinctiveness of the Sandawe from their Bantu neighbors seems clear, and there is no special closeness to the Khoisan of Southern Africa. Many anthropologists and historians have pointed out that some groups can “revert” to hunting and gathering facultatively. But the total Bantu domination of much of East Africa suggests to me that this was not the case with the Sandawe. I think a plausible model is that the Sandawe were part of the substrate of East African hunter-gatherers who have mostly been eliminated and absorbed by the Bantu. In the north related peoples contributed to the emergent Nilo-Saharan and Ethiopian and Cushitic societies, which were able to avoid being swamped by the Bantu because of ecology and their own agricultural traditions. In this model the Sandawe affinities to Khoisan groups were more a matter of horizontal cultural borrowing and influence due to proximity than of a close genetic relationship.


Some have asked what the point is in poking around African population structure when Tishkoff et al. and Henn et al. have done such a good job in terms of coverage. First, it is nice to run your own analyses so you can slice & dice to your preference, and not rely on the constrained menu provided by others. There’s value in home cooking; you can flavor to your taste. Second, you never know what data people might leave on your doorstep. I’ve received the genotypes of three Somalis. Nothing too surprising, a touch more Cushitic than the Ethiopians in Behar et al., but interesting nonetheless.

Also, you can see how ADMIXTURE tends to come to weird conclusions in certain circumstances. Below is a K = 12 run on ~50,000 SNPs. I’ve added a few Behar et al. and HGDP populations to the Henn et al. set, and pruned a lot of the African groups which seem redundant in terms of information. I’ve added a few geographically informative labels as well.

Observe below that there is a Fulani cluster. I think this is pretty much an artifact. At K = 7 the Fulani have a majority component which is modal in West Africa & Bantu speakers, and a minority component which is identical to the one modal in Mozabite Berbers from Algeria. The Mozabites reside in the far northern Sahara, and their modal component drops off as one goes east toward western Asia and the eastern Mediterranean. I suspect that what is showing up in ADMIXTURE is the ancient hybridization of the Fulani, and perhaps their demographic expansion from this core group. We have some glimmers of the prehistory of the Fulani, and no expectation for them to be such a distinctive cluster, so I naturally jump to these inferences. But it does make me reconsider the nature of the “Sandawe,” “Mbuti” or “San” clusters in ADMIXTURE. These populations are culturally distinctive in deep ways from their neighbors, so a reflexive inference one might make is that they’re “pure” ancient substrate groups which have been overlain and marginalized by their Bantu neighbors. But their prehistory is far murkier than the Fulani because of their geographical isolation, so there is far less to go on. These “ancient” isolated groups themselves may have gone through the same sort of distinctive recent ethnogenesis processes which we presume occurred with the Fulani (also, in the plot below the Biaka are pure; but in most of the bar plots they have a minor element which they share with their neighbors, probably due to greater admixture and interaction between western Pygmies and their Bantu neighbors than among the eastern ones).

OK, now let’s prune some of the “pure” and extraneous populations. Additionally, I’ll remove some of the K’s. So the proportions are going to be recalculated with a new base. So, keep in mind that the South African Bantus show elevated West African in part because the Khoisan proportion was removed, inflating the percentages for all the other elements.
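The recalculation described above amounts to a renormalization: drop the removed components and rescale what is left so it sums to one. Here is a minimal sketch in Python. The function name, component labels and proportions are illustrative assumptions, not values from the actual runs.

```python
def renormalize(proportions, dropped):
    """Remove the named components and rescale the rest to sum to 1."""
    kept = {k: v for k, v in proportions.items() if k not in dropped}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("nothing left after dropping components")
    return {k: v / total for k, v in kept.items()}


# Hypothetical South African Bantu individual (made-up numbers).
individual = {"W African": 0.50, "Bantu": 0.30, "Khoisan": 0.20}
rescaled = renormalize(individual, dropped={"Khoisan"})
print(rescaled)
```

With the Khoisan fifth removed, the made-up West African share rises from 0.50 to 0.625 even though nothing about the individual has changed, which is the inflation to keep in mind when reading the pruned plot.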

Now let’s look at the pairwise Fst values between inferred populations. Remember, this measures the proportion of genetic variance which can be attributed to between-population differences. The bigger the value, the larger the genetic distance. I’ll give the inferred populations labels, but don’t take that too seriously.
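To make that definition concrete, here is a minimal sketch of Wright's Fst at a single biallelic SNP for two equally weighted populations. It is computed as (H_T - H_S) / H_T, where H_S is the mean expected heterozygosity within the populations and H_T is the expected heterozygosity of the pooled population. The allele frequencies are made up, and this textbook formula is an assumption for illustration; it is not necessarily the estimator ADMIXTURE uses for the values it reports.

```python
def fst_two_pops(p1, p2):
    """Wright's Fst at one biallelic locus, two equally weighted populations.

    p1, p2: frequency of the same allele in each population.
    """
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity if the two populations were pooled.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # same allele fixed in both populations: no variance to split
    return (h_t - h_s) / h_t


print(fst_two_pops(0.5, 0.5))  # identical frequencies: 0.0
print(fst_two_pops(0.2, 0.8))  # diverged frequencies: about 0.36
print(fst_two_pops(0.0, 1.0))  # fixed for different alleles: 1.0
```

Genome-wide values like the ones in this post combine information across many SNPs rather than coming from a single locus.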

Fst divergences between estimated populations:
              Fulani       San  European      Maya   Nilotic     Biaka W African  SW Asian   Sandawe     Mbuti  Mozabite     Bantu
Fulani          0.00      0.19      0.15      0.26      0.11      0.13      0.09      0.14      0.10      0.18      0.12      0.10
San             0.19      0.00      0.27      0.37      0.16      0.11      0.13      0.25      0.13      0.13      0.23      0.13
European        0.15      0.27      0.00      0.18      0.17      0.22      0.19      0.05      0.15      0.26      0.06      0.19
Maya            0.26      0.37      0.18      0.00      0.27      0.31      0.28      0.19      0.25      0.36      0.20      0.28
Nilotic         0.11      0.16      0.17      0.27      0.00      0.10      0.07      0.17      0.08      0.14      0.13      0.07
Biaka           0.13      0.11      0.22      0.31      0.10      0.00      0.07      0.21      0.09      0.09      0.18      0.07
W African       0.09      0.13      0.19      0.28      0.07      0.07      0.00      0.17      0.07      0.12      0.14      0.05
SW Asian        0.14      0.25      0.05      0.19      0.17      0.21      0.17      0.00      0.14      0.25      0.06      0.18
Sandawe         0.10      0.13      0.15      0.25      0.08      0.09      0.07      0.14      0.00      0.13      0.12      0.07
Mbuti           0.18      0.13      0.26      0.36      0.14      0.09      0.12      0.25      0.13      0.00      0.22      0.12
Mozabite        0.12      0.23      0.06      0.20      0.13      0.18      0.14      0.06      0.12      0.22      0.00      0.14
Bantu           0.10      0.13      0.19      0.28      0.07      0.07      0.05      0.18      0.07      0.12      0.14      0.00

Here’s the genetic distance between non-African groups and African ones on a bar plot.

Some consistent trends:

– Mbuti and Khoisan show the largest distance from non-Africans.

– Biaka are next. Again, this may be due to admixture between Biaka and neighboring groups, or, a closer relationship between the Biaka Pygmies and the non-Khoisan/Mbuti African groups with reference to the last common ancestors.

– Roughly equal distance of Bantus and West Africans.

– Marginally smaller distances between the Nilotic cluster and non-Africans.

– Finally, a consistently smaller difference between non-Africans and the Sandawe cluster.
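These trends can be checked directly against the Fst table above. The sketch below copies each African cluster's Fst to three non-African clusters and ranks the clusters by the average. Treating European, SW Asian and Maya as the "non-African" set is an assumption (the Mozabite cluster is left out because it sits between the two sets), and a simple mean is only one way to summarize the numbers.

```python
# Fst values copied from the table above, in the order
# (European, SW Asian, Maya). The choice of these three as the
# non-African reference set is an assumption for illustration.
fst_to_non_african = {
    "Fulani":    (0.15, 0.14, 0.26),
    "San":       (0.27, 0.25, 0.37),
    "Nilotic":   (0.17, 0.17, 0.27),
    "Biaka":     (0.22, 0.21, 0.31),
    "W African": (0.19, 0.17, 0.28),
    "Sandawe":   (0.15, 0.14, 0.25),
    "Mbuti":     (0.26, 0.25, 0.36),
    "Bantu":     (0.19, 0.18, 0.28),
}


def mean(values):
    return sum(values) / len(values)


# Most distant from non-Africans first.
ranking = sorted(
    fst_to_non_african,
    key=lambda pop: mean(fst_to_non_african[pop]),
    reverse=True,
)

for pop in ranking:
    print(f"{pop:10s} {mean(fst_to_non_african[pop]):.3f}")
```

The ordering comes out as San, Mbuti, Biaka, Bantu, W African, Nilotic, Fulani, Sandawe. The Fulani cluster lands next to the Sandawe one, but as discussed earlier it is likely an artifact of admixture, so it is not a fair comparison.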

As always we need to remember that these probably aren’t pure concrete real ancestral groups. I have no hesitation in presuming some low level consistent gene flow over time between the western Mediterranean groups of which Mozabites are part and some of the Nilotic populations in north-central Africa. This equilibration of gene frequencies would reduce the Fst value naturally. Second, the relative closeness of the Sandawe cluster jumped out at me initially when I looked at the African data. It just strikes me as weird.

Here’s Wikipedia on the Sandawe:

The Sandawe are an agricultural ethnic group based in the Kondoa district of Dodoma Region in central Tanzania. In 2000 the Sandawe population was estimated to number 40,000.

The Sandawe language is a tonal language with clicks, apparently related to the Khoe languages of southern Africa. Recent research suggests that the ancestors of the Khoe were pastoralists, and migrated into southern Africa from the northeast, perhaps from the region of the modern Sandawe.

But the Sandawe don’t seem to be that close to the South African Bushmen samples. Here’s a multidimensional scaling of the Fst relationships of selected inferred ancestral African groups (weight the x-axis more):

An aspect of PCA plots which always jumps out at you is the gap between African groups and non-African ones, often spanned by populations which have likely recent admixture. One hypothesis to explain this is that there’s been little gene flow between Africa and the rest of the world since the Out of Africa event, probably due to ecology (the Sahara). But here’s another explanation: the Bantu expansion has wiped clean much of the genetic variation of central and eastern Africa, the very variation which might span in part the African vs. non-African gap. The archaeology and anthropology indicate that both the groups currently dominant in much of eastern Africa and down to the south, the Bantu and Nilotic peoples, are intrusive on the scale of the past 3,000 years. So groups like the Hadza and the Sandawe are presumed to be relics of the older cultural and genetic variation. This may be why the Sandawe are closer to Eurasians than other African groups once you control for clear likely admixture (e.g., the Fulani). Or, it may be that the Sandawe themselves have an older admixture event due to back-migration from Eurasia….

Finally, let me leave you with a bunch of MDS plots which visualize the Fst differences.

Razib Khan