Razib Khan
Ethiopia Genetics

Liya Kebede, Credit

There is a new paper, Ethiopian Genetic Diversity Reveals Linguistic Stratification and Complex Influences on the Ethiopian Gene Pool, which is being sensationalized in the media. For example, the BBC headline: ‘DNA clues to Queen of Sheba tale’. I assumed that this was just the media, but to my surprise the authors themselves bring up the ‘Sheba tale’ in their discussion. This is unfortunate. It is true that Ethiopians have a legend of descent from the queen of Sheba (and, through her relationship with king Solomon, from the ancient Hebrews). But if there is a scholarly consensus about the location of Sheba, it is that it was probably in southwest Arabia (i.e., modern Yemen). And the reality is that the story in the Hebrew Bible is probably an interleaved synthesis of legend and reality, so disentangling the nuggets of truth well enough to establish the location of the real Sheba is going to be impossible (it is just as likely that the real queen of Sheba, if she existed, was a Levantine notable who was given a more exotic provenance by the redactors of the Hebrew Bible).

As for the paper itself, it is of some interest. I’ve blogged and analyzed Ethiopian data myself, but the sample coverage here is awesome. Additionally, the authors attempted to estimate the time since admixture between the West Eurasian and African ancestral components of the Ethiopian population, and also sniffed around for signatures of selection in the genome. The highlights:

  • As first observed by Dienekes (to my knowledge), the Ancestral Sub-Saharan (ASS) component of Ethiopian ancestry is not in any way, shape, or form related to the component modal in the Bantu or in West Africa. And, upon further exploration, it seems that it is separable from the Nilotic element as well, though this is less assured (one has to be careful when overloading a data set with a particular group of populations)
  • In Ethiopia it seems that Omotic ethnic groups are the modal reservoir for this component. This is of interest since the Omotic languages are liminal members of the Afro-Asiatic language family
  • The major find here is that the non-African component of the ancestry of Ethiopians seems to have an affinity to Egyptians and Levantines, more than Yemenis
  • Additionally, there is some possible suggestive evidence for selection. Unsurprisingly Ethiopians carry a high proportion of the “European” variant of SLC24A5
  • Finally, the time since admixture is ~3,000 years BP (they used ROLLOFF)

In terms of selection, I am curious about what they found in the regions around the highland adaptation loci. One might predict that these regions should be enriched for indigenous African ancestry if the alleles are old. In contrast, if the alleles are newly arisen in the genetic background then there is no expectation that they should exhibit bias in their local genomic ancestry. The high frequency of SLC24A5 in a tropical population with West Eurasian ancestry is not surprising. South Indians have the derived variant on the order of ~50% frequency as well. The authors’ speculation about sexual selection seems like a deus ex machina. If sexual selection had been strong for the derived variant and light skin then the allele should have become decoupled from the rest of the genome in terms of phylogeny (spreading to populations with lower levels of West Eurasian ancestry).

Two major criticisms. First, it is not clear to me that the non-African portions of the Ethiopian genomes were compared with only the non-African portions of the genomes of the non-Sub-Saharan African populations. To get at what I’m saying, if you compare the West Eurasian ancestry of Ethiopians with various West Eurasian groups, then the proportion of West Eurasian ancestry in those groups is going to affect your Fst. Non-Jewish Yemenis have a high load of Sub-Saharan African ancestry. The relative closeness of the non-African component of the Ethiopians to Egyptians and Bedouins may simply be a function of the lower African ancestral load in these populations in comparison to the Yemenis. If the authors found greater genetic distance from Yemeni Jews I would be much more convinced, because the Jewish population in Yemen has a far lower proportion of African admixture than the non-Jews.
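To make the Fst worry concrete, here is a minimal sketch in Python. Everything in it is a toy assumption on my part: the allele frequencies are random numbers, the 5% and 20% African fractions are purely illustrative, and the Fst estimator is a simple ratio-of-averages form without sample-size corrections, not necessarily what the authors used. The reference populations share exactly the same West Eurasian source as the "Ethiopian" non-African component and differ only in how much African ancestry they carry.

```python
import random

def simple_fst(p1, p2):
    # Ratio-of-averages Fst from population allele frequencies
    # (no sample-size correction; a deliberately simplified estimator).
    num = sum((a - b) ** 2 for a, b in zip(p1, p2))
    den = sum(a * (1 - b) + b * (1 - a) for a, b in zip(p1, p2))
    return num / den

def admixed(p_west, p_afr, african_fraction):
    # Allele frequencies of a population with the given African fraction.
    return [(1 - african_fraction) * w + african_fraction * a
            for w, a in zip(p_west, p_afr)]

rng = random.Random(0)
n_snps = 5000
# Hypothetical source frequencies, not real data.
p_west = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
p_afr = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]

# Two reference groups with the same West Eurasian source,
# differing only in their (illustrative) African load.
low_load = admixed(p_west, p_afr, 0.05)
high_load = admixed(p_west, p_afr, 0.20)

fst_low = simple_fst(p_west, low_load)
fst_high = simple_fst(p_west, high_load)
print(fst_low, fst_high)
```

In this toy setup the group with the higher African load comes out "more distant" from the Ethiopian non-African component even though the underlying West Eurasian source is identical, which is the confound described above.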

Second, like Dienekes I am not quite sure of ROLLOFF’s power to generate a good peg for the time of admixture in this chronological window. Its estimates for recent admixture events (e.g., North Africa, African Americans) are obviously right. But is it plausible that large numbers of West Eurasians were pushing their way into the highlands of Ethiopia as late as ~3,000 years ago? Perhaps. The depictions by Egyptians of the people of Punt seem to suggest they were of mostly West Eurasian ancestry. It could be that ~4,000 years ago the admixture had not been so thoroughgoing. There are two reasons I’m skeptical though. First, if there is one part of the world where we have some documentation of population movements ~3,000 years ago, it is the Near East, and yet all we have to go on for this event at this point is ROLLOFF. Second, like Dienekes I think we should be careful about relying on ROLLOFF alone. I have a hard time accepting ROLLOFF’s estimate for the admixture between West Eurasians and indigenous ancestral Indians ~3-4,000 years ago as well. Rather, I think that ROLLOFF is either biased toward underestimating the admixture time, or picks up the last major pulses and misses the “peaks” of admixture. I would push both Ethiopian and Indian admixture events back several thousand years at least from what ROLLOFF is implying (or, perhaps more precisely, from the inferences that some researchers make from ROLLOFF).
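The "last pulse" worry can be illustrated with a toy calculation. The general idea behind this kind of dating is that admixture-induced linkage disequilibrium decays roughly exponentially with genetic distance, at a rate set by the number of generations since admixture. The sketch below is not ROLLOFF itself: it is a plain log-linear fit to noise-free curves, and the pulse dates (60, 100, and 200 generations), the equal weights, and the distance range are all assumptions I picked for illustration. At an assumed ~30 years per generation, 100 generations is about 3,000 years.

```python
import math

def fit_generations(distances_morgans, ld_values):
    # Least-squares slope of log(LD) against genetic distance.
    # For LD proportional to exp(-n * d), the slope is -n.
    ys = [math.log(v) for v in ld_values]
    count = len(ys)
    mean_x = sum(distances_morgans) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(distances_morgans, ys))
    sxx = sum((x - mean_x) ** 2 for x in distances_morgans)
    return -sxy / sxx

# Genetic distances from 0.5 cM to 20 cM, expressed in Morgans.
distances = [0.005 * i for i in range(1, 41)]

# One clean pulse 100 generations ago: the fit recovers the date.
single = [math.exp(-100 * d) for d in distances]
est_single = fit_generations(distances, single)

# Two equal pulses, 60 and 200 generations ago, fitted as if they were one.
two_pulse = [0.5 * math.exp(-60 * d) + 0.5 * math.exp(-200 * d)
             for d in distances]
est_two = fit_generations(distances, two_pulse)

print(est_single, est_two)
```

Under these assumptions the single-pulse date is recovered, while the two-pulse curve yields one date that sits far closer to the recent pulse than to the old one, because the older pulse's signal has mostly decayed away at all but the shortest distances.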

Frieda Pinto, Credit

Which brings me to an interesting point: there are strange correspondences between the demographic history of Ethiopia and that of South Asia. In both situations you have a population which seems to have arisen out of a balanced admixture between a distinctive indigenous population and an intrusive West Eurasian group. Ancient and medieval Western thinkers sometimes confused Ethiopia and India because of their marginal geographical position in relation to the Mediterranean world and the existence of dark-skinned people in both locales. The Greeks did differentiate though between the lighter skinned Indians of the north and the darker skinned ones of the south, with the latter resembling Ethiopians the most, except that their hair form was not curly (in reality, “north” would be the Punjab and Sindh, while the “south” would be Kerala and Tamil Nadu, because of the nature of Greek commerce and trade). Today some South Indians are apparently mistaken for Ethiopians, and no doubt the reverse occurs, especially for women who straighten their hair somewhat.

That’s all I’ll say for now. The data is online, in convenient pedigree format. So I’ll be weighing in more in the near future….

Citation: Ethiopian Genetic Diversity Reveals Linguistic Stratification and Complex Influences on the Ethiopian Gene Pool, Pagani et al.

(Republished from Discover/GNXP by permission of author or representative)
In the open thread someone asked: “Any recent stuff on the genetics of Ethiopians.” That prompted me to look around, because I’m curious too. Poking around Wikipedia I couldn’t find anything recent. A lot of the studies are older uniparental lineage based works (NRY and mtDNA). Ethiopia is interesting because unlike almost all other Sub-Saharan African nations it has a long written history. Culturally and linguistically it has both Sub-Saharan African, and non-Sub-Saharan African, affinities. The languages of highland Ethiopia are clearly Semitic. Those of lowland Ethiopia are Cushitic, a branch of the broader Afro-Asiatic language family concentrated around the Horn of Africa (Somali is a Cushitic language, though most Ethiopian nationals who speak a Cushitic dialect are of the Oromo group).

From a human evolutionary genetic perspective, Ethiopia is also of specific interest. It is likely that the main recent pulse of humans Out of Africa traversed this region. Additionally, there is some evidence of deep time connections between the groups ancestral to Ethiopians and the Khoisan of southern Africa. It may be that Ethiopians and Khoisan are reservoirs of ancient genetic variation in Sub-Saharan Africa which has been overlain by Bantu in most other regions outside of West Africa. Finally, Ethiopians are known to have high altitude adaptations. This could be due to long term residence in the region, or to assimilation of favorable alleles from the long term residents by later populations.

Fortunately we can get a sense of the genetic affinities of Ethiopians thanks to a paper published last spring, The genome-wide structure of the Jewish people. The focus was clearly on Jews, but they surveyed Amhara & Tigray (Semitic speaking highlanders), Ethiopian Jews (similar ethnically to the Amhara & Tigray, but religiously non-Christian), and Oromo. In the PCA the Oromo and Semitic speaking populations are pretty obviously distinct clusters.

This just means that when you take worldwide genetic variation, and pull out the biggest independent dimensions, and then visualize individuals on the two largest dimensions in terms of how they explain variance, the Oromo and other Ethiopians don’t really intersect. Interestingly the Amhara and Tigray are almost indistinguishable, but the Ethiopian Jews are in their own cluster. There are, for the record, 7 Oromo, 7 Amhara, 5 Tigray, and 13 Ethiopian Jews in the sample.
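For readers who want to see what "pulling out the biggest dimension" means mechanically, here is a minimal sketch in plain Python. It is not the pipeline used in the paper: the genotypes are simulated, the two groups of 7 merely echo the Oromo and Amhara sample sizes, the allele frequencies of 0.2 and 0.8 are exaggerated toy values, and only the first component is computed (by power iteration on the sample-by-sample similarity matrix).

```python
import random

rng = random.Random(1)

def genotype(freq):
    # Diploid genotype: copies (0, 1, or 2) of one allele at a SNP.
    return (rng.random() < freq) + (rng.random() < freq)

n_snps = 300
# Two hypothetical groups with very different allele frequencies.
group_a = [[genotype(0.2) for _ in range(n_snps)] for _ in range(7)]
group_b = [[genotype(0.8) for _ in range(n_snps)] for _ in range(7)]
samples = group_a + group_b
n = len(samples)

# Center each SNP so that only differences between individuals remain.
means = [sum(row[k] for row in samples) / n for k in range(n_snps)]
centered = [[row[k] - means[k] for k in range(n_snps)] for row in samples]

# Sample-by-sample similarity (Gram) matrix.
gram = [[sum(x * y for x, y in zip(ri, rj)) for rj in centered]
        for ri in centered]

# Power iteration: converges to the leading eigenvector, whose entries
# are proportional to each individual's coordinate on the first component.
vec = [1.0] + [0.0] * (n - 1)
for _ in range(100):
    new = [sum(g * v for g, v in zip(row, vec)) for row in gram]
    norm = sum(x * x for x in new) ** 0.5
    vec = [x / norm for x in new]

pc1_a, pc1_b = vec[:7], vec[7:]
print(pc1_a)
print(pc1_b)
```

With groups this differentiated, the two sets of individuals land on opposite sides of zero on the first component, which is the "distinct clusters" pattern in miniature. Real populations are far less cleanly separated than this toy.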

Now let’s look at the genetic variation in ADMIXTURE. Remember that this assigns proportions of each individual’s genome to K ancestral units. As an example, if you had African Americans, Yoruba, and White Americans in a total pool, and did K = 2, you might find that Yoruba and White Americans fall into two totally different ancestral populations, while African Americans are 80% in one ancestry and 20% in the other. The interpretation of this is straightforward, but when it comes to populations whose backgrounds we don’t know as well, one should be careful. The selection of a particular value for K is going to be really important, and we shouldn’t confuse the method with the reality which the method is trying to plumb.
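The K = 2 example can be sketched in a few lines of Python. This is a simplified, "supervised" version of the idea and not the ADMIXTURE program's algorithm: here the allele frequencies of the two ancestral populations are assumed known (and are just random numbers), whereas ADMIXTURE estimates the frequencies and the ancestry proportions jointly. The simulated individual, the 80% figure, and the grid search are all illustrative choices.

```python
import math
import random

rng = random.Random(2)
n_snps = 2000
# Hypothetical allele frequencies in the two ancestral populations.
freq_a = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
freq_b = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]

def mixed_freq(q, k):
    # Expected allele frequency at SNP k for an individual with
    # fraction q of ancestry from population A and 1 - q from B.
    return q * freq_a[k] + (1 - q) * freq_b[k]

# Simulate one individual with 80% ancestry from A.
true_q = 0.8
genotypes = [(rng.random() < mixed_freq(true_q, k)) +
             (rng.random() < mixed_freq(true_q, k))
             for k in range(n_snps)]

def log_likelihood(q):
    # Binomial log-likelihood of the genotypes (constant terms dropped).
    total = 0.0
    for k, g in enumerate(genotypes):
        p = mixed_freq(q, k)
        total += g * math.log(p) + (2 - g) * math.log(1 - p)
    return total

# Grid search for the ancestry fraction that best explains the genotypes.
grid = [i / 100 for i in range(101)]
q_hat = max(grid, key=log_likelihood)
print(q_hat)
```

The estimate should land close to the simulated 80%. The caveat from the text still applies: the answer is only as meaningful as the ancestral populations assumed, and with real data neither the populations nor the right K is handed to you.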

First, K = 8 from Behar et al. I’ve reedited to highlight populations which might inform the variation of Ethiopians.

Now let’s look at a series of K’s. Note the changes.

Luckily for us, we don’t need to stop here. Dienekes included Behar’s Ethiopians (non-Jews) for Dodecad. Additionally, he included the Masai population from the HapMap. This turns out to be important because he found that Ethiopian Sub-Saharan ancestry is similar to that of the Masai, not the other African groups.

Dienekes also provided individual outputs. I’ve stitched together Ethiopians with Egyptians and Saudis. The color coding is the same as above.

You should be able to tell where the three groups start and stop pretty easily. I’m 99% sure that the six individuals with more East African and less Southwest Asian ancestry are all Oromo. Ethiopians, in particular highland Ethiopians, seem to me likely to be an ancient stabilized hybrid population between a population from Arabia and a local Sub-Saharan population. This local population seems unlikely to have been related to the peoples of West-Central Africa, who are associated with the Bantus across eastern and southern Africa. The Bantu agricultural toolkit runs into ecological constraints in various regions, and it is in those regions that non-Bantu populations have persisted. Ethiopia, with its unique climate and topography, naturally remains non-Bantu (as does the Horn of Africa as a whole). The possible connections between Khoisan and Ethiopia may be a function of the fact that these areas harbor genetic variants which have disappeared in the intervening regions because of the Bantu expansion. I have a hard time accepting that the Bantu expansion was particularly eliminationist, but I am starting to suspect that outside of Ethiopia population densities were very, very low.

The antiquity of this ancient hybridization event to me is attested by the fact that Ethiopians lack any of the other Middle Eastern components besides the one modal in Saudi Arabia. There is a great deal of intra-population variance in the Saudi data set. Why? Part of this must be the slave trade, as well as pilgrims who remained in places like Mecca. But, I think part of the untold story here is that there may have been a larger genetic impact on Arabia after the rise of Islam from the Levant than vice versa! Probably the gene flow precedes Islam, as Arabia was hooked into worldwide trade and population movements, which Ethiopia was relatively insulated from. The Saudi data set has several people who are “pure” Southwest Asian, but also several who have a great deal of West Asian + South European. These seem likely to be people who have some background in the Fertile Crescent.

Razib Khan
"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"