The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog


Soft serve

The trait of lactase persistence (lactose tolerance) is probably one of the better textbook examples of natural selection in human populations. The reasons for this are probably two-fold. There is a very strong signature of selection within a specific gene known to associate with the trait in question in many populations. And there is a very compelling historical narrative which explains rather neatly how this particular functional change could have undergone such strong selection within the past ~5,000 years across these populations. But the elucidation of the origin and spread of this genetic adaptation is also interesting because it looks as if it was not a singular event. Populations as disparate as Arabians, Danes, and Masai seem to carry different alleles around the locus of interest which confer the ability to digest milk. This illustrates the fact that when selection pressures have a viable target, there is a rapid response on the genomic level. At some point during the maturation of a mammal the regulatory pathway which produces lactase enzyme shuts down. Yet within numerous human populations this gradual shutdown process has been short-circuited.

The variety of response in relation to this adaptation was brought home to me as I read Diversity of Lactase Persistence Alleles in Ethiopia – Signature of a Soft Selective Sweep, in the latest issue of The American Journal of Human Genetics:

The persistent expression of lactase into adulthood in humans is a recent genetic adaptation that allows the consumption of milk from other mammals after weaning. In Europe, a single allele (−13910∗T, rs4988235) in an upstream region that acts as an enhancer to the expression of the lactase gene LCT is responsible for lactase persistence and appears to have been under strong directional selection in the last 5,000 years, evidenced by the widespread occurrence of this allele on an extended haplotype. In Africa and the Middle East, the situation is more complicated and at least three other alleles (−13907∗G, rs41525747; −13915∗G, rs41380347; −14010∗C, rs145946881) in the same LCT enhancer region can cause continued lactase expression. Here we examine the LCT enhancer sequence in a large lactose-tolerance-tested Ethiopian cohort of more than 350 individuals. We show that a further SNP, −14009T>G (ss 820486563), is significantly associated with lactose-digester status, and in vitro functional tests confirm that the −14009∗G allele also increases expression of an LCT promoter construct. The derived alleles in the LCT enhancer region are spread through several ethnic groups, and we report a greater genetic diversity in lactose digesters than in nondigesters. By examining flanking markers to control for the effects of mutation and demography, we further describe, from empirical evidence, the signature of a soft selective sweep.

To some extent the paper was written rather confusingly for my taste. Importantly, they did not even consider the results of Pagani et al. (in the same journal!) from last year in their analysis. The big picture result is that whereas in Eurasia and East Africa it looks as if lactase persistence spread through populations via “hard” selective sweeps, in Ethiopia it may have been propagation through “soft” sweeps. The former are cases where a single new mutant confers a beneficial phenotype. In the absence of allelic competitors this variant sweeps up in frequency extremely rapidly, and flanking regions of the genome generate a long haplotype block. In Europeans this has resulted in a strongly homogenized region of the genome around LCT.

The situation in Ethiopia is a touch paradoxical in light of the above model. Instead of one allele, it looks as if several are segregating. And, the lactase persistence haplotypes exhibit more, not less, genetic diversity than the non-persistent variants. As noted in the article it may be that there are strong selective constraints against lactase persistence. Apparently there is a long non-persistent haplotype in Horn of Africa populations, explaining the reduced diversity of this subset of the sample. Whereas in a hard sweep a single mutation can rise in frequency against disfavored ancestral variants, in this situation you have a soft sweep where alternative variants with similar fitness values are presumably increasing in frequency.

But all this needs to be considered in light of Pagani et al., which indicates a very recent admixture in Ethiopia. The discussion above seems to suggest in situ selective events within the Horn of Africa, but it is possible that the sweeps were initiated among the Eurasian ancestors of the Ethiopians (perhaps some admixture mapping would be useful?). Ultimately this is going to be a complicated story. It doesn’t take away from the bigger picture that lactase persistence is an excellent model for natural selection, but the sketch has more details to be filled in, though I’m not quite sure what specific details this paper adds.

(Republished from Discover/GNXP by permission of author or representative)

One of the primary concerns/questions I had about Luca Pagani’s paper on the genetic origin of Ethiopians is that he found that their West Eurasian ancestor was closer to Levantine than Arabian. I was confused by this because on model-based clustering (e.g., Admixture) when you push down to a fine level of granularity you always see that the Ethiopians cluster with the Yemenis for their non-African ancestry. More precisely, Yemeni Jews are often ~100% component X, which makes up ~50% of the ancestry of Ethiopians.

From what I recall Pagani et al. used haplotype windows which they assigned to Eurasian or African ancestral components, and they compared these to the populations related to the putative ancestral groups. Because Pagani et al. used blocks of the genome, rather than just specific genotypes, I weight their finding more strongly. But I wanted to double check with TreeMix whether the finding in Admixture was peculiar.

So again, I took a ~150,000 SNP set and ran it through TreeMix with migration = 5.
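For readers who want to try a comparable run, here is a minimal sketch of how such a TreeMix call could be assembled. The input file name and output stem are hypothetical placeholders, not the files I used; `-i`, `-m`, `-o`, `-k`, and `-root` are standard TreeMix options, but the block size and root shown are assumptions to be adjusted to your own data.

```python
import shlex


def treemix_command(input_path, out_stem, migrations=5, block_size=None, root=None):
    """Build (but do not run) a TreeMix command line as a list of arguments."""
    cmd = ["treemix", "-i", input_path, "-m", str(migrations), "-o", out_stem]
    if block_size is not None:
        # -k groups SNPs into blocks to account for linkage disequilibrium
        cmd += ["-k", str(block_size)]
    if root is not None:
        # -root names the population used to root the tree
        cmd += ["-root", root]
    return cmd


# Hypothetical file names; five migration edges as in the run described above.
cmd = treemix_command("horn_snps.frq.gz", "horn_m5", migrations=5)
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once TreeMix is installed and the allele-count file exists.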

Again, you see that the gene flow to the Ethiopians is coming from a position on the tree rather close to Yemenite Jews. One model which may explain this, and still align with Pagani’s findings, is that Arabians themselves are a synthetic population. A “pure” Yemenite Jew may have ancient admixture of African affinity beneath an intrusive element from the north. The parallelism between Ethiopia and Arabia in this model is clear, with the major difference being magnitude of the source population admixture (greater in Arabia), as well as some differences of the target population.

This again reminds us to be careful about trusting first-blush summaries.

(Republished from Discover/GNXP by permission of author or representative)
• Category: Science • Tags: Anthropology, Ethiopia, Genetics, Genomics 

Liya Kebede, Credit

There is a new paper, Ethiopian Genetic Diversity Reveals Linguistic Stratification and Complex Influences on the Ethiopian Gene Pool, which is being sensationalized in the media. For example, the BBC headline: ‘DNA clues to Queen of Sheba tale’. I assumed that this was just the media, but to my surprise the authors themselves mention the ‘Sheba tale’ in their discussion for various reasons. This is unfortunate. Though it is true Ethiopians have a legend of descent from the queen of Sheba (and through her relationship to king Solomon the ancient Hebrews), if there is a scholarly consensus about the location of Sheba, it is probably in southwest Arabia (i.e., modern Yemen). But the reality is that it is probably just as likely that the story in the Hebrew Bible is an interleaved synthesis of legend and reality, and that disentangling the nuggets of truth so as to establish the location of the real Sheba is going to be impossible (it is just as likely that the real queen of Sheba, if she existed, was a Levantine notable who was given a more exotic provenance by the redactors of the Hebrew Bible).

As for the paper itself, it is of some interest. I’ve blogged and analyzed Ethiopian data myself, but the sample coverage here is awesome. Additionally, the authors attempted to ascertain time since admixture in relation to the Ethiopian population for their West Eurasian and African ancestral components, as well as sniffing around for signatures of selection in the genome. The highlights:

  • As first observed by Dienekes (to my knowledge) the Ancestral Sub-Saharan (ASS) component of Ethiopian ancestry is not in any way, shape, or form related to that modal in the Bantu or in West Africa. And, upon further exploration, it seems that it is separable from the Nilotic element as well, though this is less assured (one has to be careful when overloading a data set of a particular group of populations).
  • In Ethiopia it seems that Omotic ethnic groups are the modal reservoir for this component. This is of interest since the Omotic languages are liminal members of the Afro-Asiatic language family.
  • The major find here is that the non-African component of the ancestry of Ethiopians seems to have an affinity to Egyptians and Levantines, more than Yemenis.
  • Additionally, there is some possible suggestive evidence for selection. Unsurprisingly Ethiopians carry a high proportion of the “European” variant of SLC24A5.
  • Finally, the time since admixture is ~3,000 years BP (they used ROLLOFF).
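To make the dating idea in the last bullet concrete, here is a minimal sketch of the principle behind ROLLOFF-style estimates: admixture linkage disequilibrium decays roughly as A·exp(−n·d), where d is genetic distance in Morgans and n is the number of generations since admixture. This is not the ROLLOFF program itself (which weights SNPs by reference-population frequency differences and uses a jackknife for errors). The curve below is synthetic and noise-free, and the 29 years per generation is an assumption.

```python
import math


def fit_decay_rate(distances_morgans, ld_values):
    """Least-squares slope of log(LD) on distance; returns n in A*exp(-n*d)."""
    logs = [math.log(v) for v in ld_values]
    count = len(distances_morgans)
    mean_d = sum(distances_morgans) / count
    mean_l = sum(logs) / count
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances_morgans, logs))
    sxx = sum((d - mean_d) ** 2 for d in distances_morgans)
    return -sxy / sxx


# Synthetic illustration only: a curve generated with 100 generations of decay.
TRUE_GENERATIONS = 100
AMPLITUDE = 0.05
distances = [0.005 * i for i in range(1, 41)]  # 0.5 cM to 20 cM, in Morgans
curve = [AMPLITUDE * math.exp(-TRUE_GENERATIONS * d) for d in distances]

generations = fit_decay_rate(distances, curve)
YEARS_PER_GENERATION = 29  # assumed generation time
years = generations * YEARS_PER_GENERATION
print(round(generations), "generations, about", round(years), "years")
```

A single exponential assumes one pulse of admixture. If gene flow was repeated or continuous, the fitted rate reflects a blend of dates and will tend to be pulled toward the more recent events.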

In terms of selection, I am curious about what they found in the regions around the highland adaptation loci. One might predict that these regions should be enriched for indigenous African ancestry if the alleles are old. In contrast, if the alleles are newly arisen in the genetic background then there is no expectation that they should exhibit bias in their local genomic ancestry. The high frequency of SLC24A5 in a tropical population with West Eurasian ancestry is not surprising. South Indians have the derived variant on the order of ~50% frequency as well. The authors’ speculation about sexual selection seems like a deus ex machina. If sexual selection was strong for the derived variant and light skin then the allele should have become decoupled from the rest of the genome in terms of phylogeny (spreading to populations with lower levels of West Eurasian ancestry).

Two major criticisms. First, I am not clear that the non-African portion of Ethiopian genomes was compared with the non-African portion of the genomes of the non-Sub-Saharan African populations. To get at what I’m saying, if you compare the West Eurasian ancestry of Ethiopians with various West Eurasian groups, then the proportion of West Eurasian ancestry in those groups is going to affect your Fst. Non-Jewish Yemenis have a high load of Sub-Saharan African ancestry. The relative closeness of the non-African component of the Ethiopians to Egyptians and Bedouins may simply be a function of the lower African ancestral load in these populations in comparison to the Yemenis. If the authors found greater genetic distance from Yemeni Jews I would be much more convinced, because the Jewish population in Yemen has a far lower proportion of African admixture than the non-Jews.
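A toy calculation shows how this confound can arise. The sketch below uses a simplified Hudson-style Fst (ratio of sums over SNPs, treating population allele frequencies as known) on made-up frequencies. The five SNPs, the admixture fractions, and the population labels are all hypothetical; the point is only the direction of the effect.

```python
def hudson_fst(p1, p2):
    """Simplified Hudson-style Fst from population allele frequencies."""
    num = sum((a - b) ** 2 for a, b in zip(p1, p2))
    den = sum(a * (1 - b) + b * (1 - a) for a, b in zip(p1, p2))
    return num / den


def admix(source, african, african_fraction):
    """Allele frequencies of a population with the given African fraction."""
    return [(1 - african_fraction) * s + african_fraction * a
            for s, a in zip(source, african)]


# Hypothetical allele frequencies at five SNPs.
arabian_source = [0.10, 0.80, 0.30, 0.60, 0.90]    # also the Ethiopian non-African part
levantine_source = [0.14, 0.76, 0.34, 0.56, 0.86]  # a slightly different source
african = [0.70, 0.20, 0.85, 0.10, 0.40]

ethiopian_non_african = arabian_source
yemeni_like = admix(arabian_source, african, 0.20)       # same source, 20% African
levantine_like = admix(levantine_source, african, 0.05)  # other source, 5% African
yemeni_jew_like = admix(arabian_source, african, 0.0)    # same source, unadmixed

fst_yemeni = hudson_fst(ethiopian_non_african, yemeni_like)
fst_levantine = hudson_fst(ethiopian_non_african, levantine_like)
fst_yemeni_jew = hudson_fst(ethiopian_non_african, yemeni_jew_like)
print(fst_yemeni > fst_levantine, fst_yemeni_jew)
```

In this toy setup the admixed Yemeni-like reference ends up farther from the Ethiopian non-African component than the Levantine-like one, even though it shares the identical source, while the unadmixed Yemeni-Jew-like reference shows zero distance.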

Second, like Dienekes I am not quite sure of ROLLOFF’s power in terms of generating a good peg for the time of admixture in this chronological window. The recent admixture events (e.g., North Africa, African Americans) are obviously right. But is it plausible that large numbers of West Eurasians were pushing their way into the highlands of Ethiopia as late as ~3,000 years ago? Perhaps. The depictions by Egyptians of the people of Punt seem to suggest they were of mostly West Eurasian ancestry. It could be that ~4,000 years ago the admixture had not been so thoroughgoing. There are two reasons I’m skeptical though. First, if there is one part of the world where we have some documentation of population movements ~3,000 years ago, it is the Near East. Yet all we have to go on at this point is ROLLOFF. Second, like Dienekes I think we should be careful about relying on ROLLOFF alone. I have a hard time accepting ROLLOFF’s estimate for the admixture between West Eurasians and indigenous ancestral Indians ~3-4,000 years ago as well. Rather, I think that ROLLOFF is either biased toward underestimating the admixture time, or picks up the last major pulses and misses the “peaks” of admixture. I would push both Ethiopian and Indian admixture events back several thousand years at least from what ROLLOFF is implying (or, perhaps more precisely, the inferences that some researchers make from ROLLOFF).

Frieda Pinto, Credit

Which brings me to an interesting point: there are strange correspondences between the demographic history of Ethiopia and South Asia. In both situations you have a population which seems to have arisen out of a balanced admixture between a distinctive indigenous population and a West Eurasian group which was intrusive. The ancient and medieval Western thinkers sometimes confused Ethiopia and India because of their marginal geographical position in relation to the Mediterranean world and the existence of dark-skinned people in both locales. The Greeks did differentiate though between the lighter skinned Indians of the north and the darker skinned ones of the south, with the latter resembling Ethiopians the most, except that their hair form was not curly (in reality, “north” would be the Punjab and Sindh, while the “south” would be Kerala and Tamil Nadu, because of the nature of Greek commerce and trade). Today some South Indians apparently get confused for being Ethiopian, and no doubt the reverse occurs, especially for women who straighten their hair somewhat.

That’s all I’ll say for now. The data is online, in convenient pedigree format. So I’ll be weighing in more in the near future….

Citation: Ethiopian Genetic Diversity Reveals Linguistic Stratification and Complex Influences on the Ethiopian Gene Pool, Pagani et al.

(Republished from Discover/GNXP by permission of author or representative)

In the post yesterday I reported what was generally known about the Horn of Africa, that its populations seem to lie between those of Sub-Saharan Africa and Eurasia genetically. This is totally reasonable as a function of geography, but there are also suggestions that this is not simply a function of isolation by distance (i.e., populations at position 0.5 on the interval 0.0 to 1.0 would presumably exhibit equal affinities in both directions due to gene flow). For example, you observe the almost total lack of “Bantu” genetic influence on the Semitic and Cushitic populations of the Horn of Africa, and the lack of Eurasian influence in groups to the south and west of the Horn except to some extent the Masai.

Tacking horizontally in terms of discipline, over the past few generations there has been a veritable cottage industry making the case for the recent origin of many ethno-linguistic populations through a process of cultural self-creation. Clearly there are many cases of this, some of them studied in depth by anthropologists (e.g., the shift from Dinka to Nuer identity). But there has been an unfortunate tendency to over-generalize in this direction. In some ways this is peculiar insofar as these models presuppose the infinite plasticity of culture without observing the sharp and strong norms which those very same phenomena can enforce. The genetic isolation of non-Muslims in the Middle East after the rise of Islam seems rather well validated by the evidence from genomics. The norms of both Muslims and non-Muslims strongly biased them toward endogamy, and the nature of Islamic hegemony and domination was such that Muslims were the ones who were likely to have cosmopolitan affinities with the “Islamic international.” In contrast, non-Muslim minorities began a long process of involution after the Islamic Arab conquests, only disrupted in the past century by emigration and to a lesser extent emancipation.

So back to the Horn of Africa. The vast majority of the people of the Horn of Africa speak an Afro-Asiatic language. Arabic and Hebrew are the most famous members of this group, but it is a very broad classification, ranging from the dialects of the Berbers in the Maghreb all the way to ancient Akkadian. There are two large subfamilies of particular note and interest here: Semitic and Cushitic. The map above shows the distribution within the Horn of Africa. One can “quick & dirty” summarize the pattern here by observing that Semitic languages in Ethiopia tend to be concentrated in the north-central Christian highlands, while Cushitic is found everywhere else. Additionally, there is the confluence between religion and ethnicity, as there are Cushitic Muslims (Somalis, Afar, etc.) and Cushitic Christians (many Oromo, etc.). From what I can gather many Cushitic social and political elites have had a tendency toward assimilating into an Amhara Semitic identity (Haile Selassie’s mother was a Muslim Oromo). We could therefore generate a possible model where Semitic languages arrived late to Ethiopia and spread through elite emulation, so the difference between Semitic and Cushitic peoples should be marginal in the genomic dimension (such as the marginal differences between Hausa and Yoruba in Nigeria). Or, we could posit that the Semitic element is distinctive from a pre-existent Cushitic substratum.

To make a long story short, by running more ADMIXTURE with a Horn of Africa centered data set I have discerned that one can actually differentiate Cushitic and Semitic elements in the Horn and tentatively identify them with different ancestral components. First, the technical details….

I began with the data set I started with in the runs I posted yesterday. Strange outliers in the Masai were removed. These are a few sets of individuals who “fix” for minority ancestral components. This is a tell that there’s structure within the Masai being picked up, but more like distantly related individuals, not ethnic level differences. After running this I noticed that a lot of the same then popped up in the non-Jewish Yemeni and Saudi samples. To some extent this is like “whack-a-mole.” If you remove one problem others simply pop out of the woodwork. So I removed all the non-Jewish Yemenis and Saudis. The number of markers remained the same, 210,000 SNPs.

There were still a few issues with outliers, especially with the Bantu Kenya, and to a lesser extent the Levantine samples. But at this point I decided to go with it, since these are marginal to the story of the Horn of Africa in any case. I stated yesterday that in general Horn of Africa populations don’t present their own clusters, but are a composite of others, mostly East African and Arabian. After I removed some of the spurious Masai components and ran ADMIXTURE up to K = 10 I did finally get a Horn of Africa cluster, “HoAc”. Additionally, I also found that you can see systematic differences between Cushitic Oromo and Somalis, and the Semitic Amhara, Ethiopian Jews, and Tigray.

Below are bar plots of K = 7 and K = 9. The lower K’s aren’t too different from what I posted yesterday, while K = 8 and K = 10 have too many minor components. I’ve posted only fine-grained and Horn of Africa focused plots, instead of the more general summary plots which show average ancestral quanta. Also, below these I’ve posted two dimensional representations of genetic distances between inferred ancestral groups for K = 7 and K = 9. I’ve removed several components though, in the case of one because it was clearly a spurious “extended family” cluster, and in some cases to better visualize relationships.

To cut to the chase, it looks like all Horn of Africa populations share a HoAc base, which one might term “Cushitic,” though that is not totally accurate. On top of that base you see differences based on language family. The Semitic speaking groups have an ancestral component which is identical to the one fixed in Yemeni Jews, while the Cushitic speaking ones tend to lack this. But observe that the Semitic speaking populations generally have the component found in the Cushitic speaking groups, and especially the Somalis in which it often fixes. This is why I put the sequence of language-population expansions so that the Semitic is overlain upon a Cushitic base. Additionally, there does seem to be admixture from Nilotic groups into Ethiopian, but not Somali, populations. This is most consistent and evident in the Oromo, where an isolation by distance model seems plausible, as the Oromo are geographically the most likely to have interacted with Nilo-Saharan populations and the Somali the least.

Finally, please keep in mind that if the Somalis are 100% cluster X, that does not mean that the Somalis are derived from some real homogeneous ancestral cluster X. These ADMIXTURE components are very interesting in helping to flesh out relationships horizontally across populations today, but we should be cautious about what they can tell us about relationships vertically in terms of how populations emerged over time. A thoroughly admixed group can break out into its own distinctive cluster if it exhibits a level of internal homogeneity and the ancestral “reference” populations themselves no longer exist. This seems to be what has occurred in South Asia, where certain groups shake out as “100% South Asian,” but themselves on the deeper genomic level seem to be stabilized admixtures of ancient fusions between two ancestral groups which were very diverged. A South Asian analogy to the Horn of Africa might lead us to infer that Somalis are the equivalent of these populations, where they lack admixture with more recent arrivals to the region after the initial admixture event between “Ancestral East Africans” (AEA) and the Arabians of yore. This may simply be a function of geography and historical contingency, as the position of Somalis is more “sheltered” because of the quasi-peninsular nature of their region of the Horn. Additionally, Somalia is relatively dry and unsuitable for agriculture, making it perhaps less ecologically friendly than the highlands of Ethiopia to Semitic populations bringing a new agricultural toolkit.

There’s plenty more you can say, but I’ll hold off, and add a word of caution: it is very possible that I was looking for these specific clusters and arrived at them via confirmation bias. As I’ve noted before, if you tune ADMIXTURE’s parameters in the proper fashion you can “arrive” at the answers you want. How to protect against this? If I keep performing ad hoc runs and going by intuition, lots of repetition often helps. You naturally arrive at a sense of the underlying distribution of possibilities, and can guard against anchoring upon an outlier result, because you know that it is atypical (this is, though, one reason that ground-breaking results are ignored, as they don’t fit the paradigm, so there’s a flip-side to this bias). I also run cross-validation now and then to find the optimal number of K’s, but that really slows down the program, so this is a matter of trade-offs for me. I’m rather sure that the differences between Ethiopian and Somali groups are robust, because the same pattern of relationships (e.g., the Amhara tendency to resemble the Tigray more than the Somali) reoccurs over and over. But I’m not so confident about the inference I’ve drawn here about the Afro-Asiatic language families and the partitioning of the Cushitic and Semitic groups.
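For anyone who wants to do the cross-validation step, here is a minimal sketch of picking K from ADMIXTURE logs. It assumes each run used the `--cv` flag, which makes ADMIXTURE print a line of the form `CV error (K=9): 0.51098`. The error values below are invented for illustration and are not from my runs.

```python
import re

# Pattern for the cross-validation line ADMIXTURE prints when run with --cv.
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_text):
    """Return (K with the lowest CV error, dict of all K -> error)."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    if not errors:
        raise ValueError("no CV error lines found; was --cv used?")
    return min(errors, key=errors.get), errors


# Invented example: the concatenated CV lines from four separate runs.
example_log = """\
CV error (K=7): 0.51230
CV error (K=8): 0.51105
CV error (K=9): 0.51098
CV error (K=10): 0.51240
"""

k, errors = best_k(example_log)
print("lowest CV error at K =", k)
```

When the errors are nearly flat across several K, as in this invented example, the minimum is weak evidence on its own, which is another reason to look at how stable the clusters are across repeated runs.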

You can find some more files here.

Image credit: Wikipedia

(Republished from Discover/GNXP by permission of author or representative)
In light of my last post I had to take note when Dienekes today pointed to this new paper in the American Journal of Physical Anthropology, Population history of the Red Sea—genetic exchanges between the Arabian Peninsula and East Africa signaled in the mitochondrial DNA HV1 haplogroup. The authors looked at the relationship of mitochondrial genomes, with a particular emphasis upon Yemen and the Horn of Africa. This sort of genetic data is useful because these mtDNA lineages are passed from mother to daughter to daughter to daughter, and so forth, and are not subject to the confounding effects of recombination. They present the opportunity to generate nice clear trees based on distinct mutational “steps” which define ancestral to descendant relationships. Additionally, using neutral assumptions mtDNA allows one to utilize molecular clock methods to infer the time until the last common ancestor of any two given lineages relatively easily. This is useful when you want to know when a mtDNA haplogroup underwent an expansion at some point in the past (and therefore presumably can serve as a marker for the people who carried those lineages and their past demographic dynamics).
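As a rough illustration of the molecular clock logic, here is a minimal sketch of a ρ (rho) statistic age estimate: ρ is the average number of mutations separating sampled sequences from the root haplotype, and the age is ρ multiplied by an assumed number of years per mutation. The mutation counts and the 3,600 years per mutation are placeholders for illustration, not the paper’s data or calibration, and the standard error formula assumes a star-like phylogeny.

```python
import math


def rho_tmrca(mutation_counts, years_per_mutation):
    """Age estimate and standard error from the rho statistic.

    mutation_counts: mutations from the root haplotype to each sampled sequence.
    Assumes a star-like phylogeny, where the variance of rho is rho / n.
    """
    n = len(mutation_counts)
    rho = sum(mutation_counts) / n
    standard_error = math.sqrt(rho / n)
    return rho * years_per_mutation, standard_error * years_per_mutation


# Placeholder inputs: eight sequences and an assumed clock rate.
counts = [2, 3, 1, 4, 2, 3, 3, 2]
YEARS_PER_MUTATION = 3600  # assumption for illustration only

age, age_se = rho_tmrca(counts, YEARS_PER_MUTATION)
print(f"TMRCA about {age:.0f} years (SE about {age_se:.0f})")
```

With only a handful of sequences the standard error is a large fraction of the estimate, which is why age estimates for small sub-haplogroups come with such wide intervals.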

What did they find? Here’s the abstract:

Archaeological studies have revealed cultural connections between the two sides of the Red Sea dating to prehistory. The issue has still not been properly addressed, however, by archaeogenetics. We focus our attention here on the mitochondrial haplogroup HV1 that is present in both the Arabian Peninsula and East Africa. The internal variation of 38 complete mitochondrial DNA sequences (20 of them presented here for the first time) affiliated into this haplogroup testify to its emergence during the late glacial maximum, most probably in the Near East, with subsequent dispersion via population expansions when climatic conditions improved. Detailed phylogeography of HV1 sequences shows that more recent demographic upheavals likely contributed to their spread from West Arabia to East Africa, a finding concordant with archaeological records suggesting intensive maritime trade in the Red Sea from the sixth millennium BC onwards. Closer genetic exchanges are apparent between the Horn of Africa and Yemen, while Egyptian HV1 haplotypes seem to be more similar to the Near Eastern ones.

Much of this is totally concordant with the results we’ve generated from the autosomal genome. Though the autosomal genome is much more difficult when it comes to implementing many of the tricks & techniques of phylogeography outlined above, it does offer up a much more robust and thorough picture of genetic relationships between contemporary populations. Instead of a distinct and unique line of paternal or maternal ancestry, thousands of autosomal SNPs can allow one to get a better picture of the nature of the total genome, and the full distribution of ancestors.

The map to the left shows the spatial gradients of the broader haplogroup under consideration, HV1. But what about the branches? Below is an illustration of the phylogenetic network of branches of HV1, with pie-charts denoting the regional weights of a given lineage:

Since the shading is so difficult, let me jump to the text:

…Curiously, the HV1 root haplotype with substitution at position 16,067 was not observed in the Arabian Peninsula except in four Yemeni Jews, but was observed in 11 Caucasus, four Egyptian, one European, two Maghreb, and six Near Eastern samples, thus supporting a possible origin in the Near East. Haplotype 16,067–16,362, possibly defining a pre-HV1 haplogroup, has so far been observed in Dubai (one), Ethiopia (four), Maghreb (one), and Yemen (three)….

I think you have to be very, very careful to not read too much into mtDNA lineage distributions and what they may tell you about the past, at least in and of themselves. With the rise of ancient DNA and deeper analyses of mtDNA sequences as well as better geographical coverage many of the inferences of the last 10 years are being radically revised. But, combined with the autosomal results the origin of these mtDNA haplogroups in the Middle East within the last ~10 thousand years seems eminently possible.

Finally, here are their time until the most recent common ancestor estimates:

…The TMRCA estimate for HV1 was 22,350 (14,737–30,227) years when taking into consideration the sequences without the polymorphism at 15,218—a figure which closely matches the estimate of 18,695 (13,094–24,449) years when not considering those two sequences. The control region age estimate of HV1 also presents a similar age, dating to 19,430 (6,840–32,023) years. Age estimates of HV1 daughter sub-haplogroups are only slightly lower—15,178 (8,893–21,671) years for HV1a and 17,682 (10,320–25,316) years for HV1b. The common Arabian Peninsula and East African sub-haplogroups HV1a3 and HV1b1 share a close age of 6,549 (2,456–10,746) years and 10,268 (4,792–15,918) years, respectively. Sub-haplogroups HV1a1 and HV1a2, which despite being rare seem to have a wider geographical distribution, have TMRCA of 10,268 (3,602–17,194) years and 9,518 (3,963–15,255) years, respectively. The ratio of the dates based on the ρ statistic for the synonymous clock relative to the complete sequence was 1.24, closely overlapping in most branches except for HV1a1 which has a very broad age estimate based only on synonymous diversity [23,616 (4,917–42,315) years]….

The confidence intervals on these estimates are really large. All you can say with a high degree of certainty is that the expansion of the family of HV1 haplogroups does not predate the Last Glacial Maximum, 15 to 20 thousand years ago. Many of the daughter branches seem to have emerged in the Holocene, possibly after the rise of agriculture. But with the huge possible set of ranges these temporal estimates come close to offering up pretty much zero additional clarity on the chronology of population dynamics in this region.

Readers might also be interested in this from last January, Internal Diversification of Mitochondrial Haplogroup R0a Reveals Post-Last Glacial Maximum Demographic Expansions in South Arabia (with some of the same authors). One aspect of these sorts of papers working with mtDNA is that they remain generally oriented toward the proposition that Pleistocene population structure is extremely important in predicting contemporary patterns of genetic variation. I’m not sure this is such a robust model. The autosomal and uniparental data from Ethiopia and Somalia strongly lean us toward the proposition of admixture of two very distinct populations, one in East Africa (“Ancestral East Africans”), and a Eurasian group which is likely to have been intrusive. The genetic distance between the Eurasian inferred ancestral component, which is nearly identical to that of southern Arabia, and other Eurasian components is not so large that it seems plausible that there could have been a separation during the Pleistocene. In other words, there was a lot of Holocene migration. If I had to guess I would say it had something to do with the agricultural and pastoral lifestyles brought by Arabians to the Horn of Africa within the last 10,000 years. Simple ecology imposed a limit upon the expansion of these peoples into more classical lush tropical Africa. Eventually a population did emerge to exploit these territories, Bantus from west-central Africa. Just like the Arabian-AEA hybrid population they encountered ecological, and also demographic, limits on the margins of the Semitic and Cushitic dominated territories in the Horn of Africa. And then of course there are the Nilotes….

Citation: Musilová, Eliška, Fernandes, Verónica, Silva, Nuno M., Soares, Pedro, Alshamali, Farida, Harich, Nourdin, Cherni, Lotfi, Gaaied, Amel Ben Ammar El, Al-Meeri, Ali, Pereira, Luísa, & Černý, Viktor (2011). Population history of the Red Sea—genetic exchanges between the Arabian Peninsula and East Africa signaled in the mitochondrial DNA HV1 haplogroup. American Journal of Physical Anthropology. DOI: 10.1002/ajpa.21522

(Republished from Discover/GNXP by permission of author or representative)

In the open thread someone asked: “Any recent stuff on the genetics of Ethiopians.” That prompted me to look around, because I’m curious too. Poking around Wikipedia I couldn’t find anything recent. A lot of the studies are older uniparental lineage based works (NRY and mtDNA). Ethiopia is interesting because unlike almost all other Sub-Saharan African nations it has a long written history. Culturally and linguistically it has both Sub-Saharan African, and non-Sub-Saharan African, affinities. The languages of highland Ethiopia are clearly Semitic. Those of lowland Ethiopia are Cushitic, a branch of the broader Afro-Asiatic language family concentrated around the Horn of Africa (Somali is a Cushitic language, though most Ethiopian nationals who speak a Cushitic dialect are of the Oromo group).

From a human evolutionary genetic perspective, Ethiopia is also of specific interest. It is likely that the main recent pulse of humans Out of Africa traversed this region. Additionally, there is some evidence of deep time connections between the groups ancestral to Ethiopians and the Khoisan of southern Africa. It may be that Ethiopians and Khoisan are reservoirs of ancient genetic variation in Sub-Saharan Africa which has been overlain by Bantu in most other regions outside of West Africa. Finally, Ethiopians are known to have high altitude adaptations. This could be due to long term residence in the region, or to assimilation of favorable alleles from the long term residents by later populations.

Fortunately we can get a sense of the genetic affinities of Ethiopians thanks to a paper published last spring, The genome-wide structure of the Jewish people. The focus was clearly on Jews, but they surveyed Amhara & Tigray (Semitic speaking highlanders), Ethiopian Jews (similar ethnically to the Amhara & Tigray, but religiously non-Christian), and Oromo. In the PCA the Oromo and Semitic speaking populations are pretty obviously distinct clusters.

This just means that when you take worldwide genetic variation, and pull out the biggest independent dimensions, and then visualize individuals on the two largest dimensions in terms of how they explain variance, the Oromo and other Ethiopians don’t really intersect. Interestingly the Amhara and Tigray are almost indistinguishable, but the Ethiopian Jews are in their own cluster. There are, for the record, 7 Oromo, 7 Amhara, 5 Tigray, and 13 Ethiopian Jews in the sample.
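To make the idea of "pulling out the biggest independent dimensions" concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the two populations ("A" and "B"), their allele frequencies, and the sample sizes are simulated, and have nothing to do with the Behar et al. data. Real analyses use hundreds of thousands of markers and dedicated software; this toy version finds the two leading axes by power iteration on the individual-by-individual similarity matrix.

```python
import random

def simulate_genotypes(freqs, n_individuals, rng):
    """Draw diploid genotypes (0, 1 or 2 copies of an allele) at each SNP."""
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n_individuals)]

def top_component(gram, rng, iterations=200):
    """Power iteration: leading eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(gram)
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * gram[i][j] * v[j] for i in range(n) for j in range(n))
    return eigenvalue, v

def pca_two_axes(genotypes, rng):
    """Return (PC1, PC2) coordinates for every individual."""
    n, m = len(genotypes), len(genotypes[0])
    # Center each SNP so the axes describe variation around the pooled mean.
    means = [sum(row[s] for row in genotypes) / n for s in range(m)]
    centered = [[row[s] - means[s] for s in range(m)] for row in genotypes]
    # Individual-by-individual similarity (Gram) matrix.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered] for ri in centered]
    axes = []
    for _ in range(2):
        lam, v = top_component(gram, rng)
        axes.append([lam ** 0.5 * x for x in v])
        # Deflate: remove the axis just found so the next pass finds the next largest.
        gram = [[gram[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return list(zip(axes[0], axes[1]))

rng = random.Random(0)
n_snps = 300
# Hypothetical allele frequencies for two well-differentiated populations.
freqs_a = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
freqs_b = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
genotypes = simulate_genotypes(freqs_a, 8, rng) + simulate_genotypes(freqs_b, 8, rng)

coords = pca_two_axes(genotypes, rng)
for label, (pc1, pc2) in zip(["A"] * 8 + ["B"] * 8, coords):
    print(label, round(pc1, 2), round(pc2, 2))
```

In this simulated setup the two groups fall on opposite sides of the first axis, which is the sense in which clusters "don't really intersect" on a PCA plot.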

Now let’s look at the genetic variation in ADMIXTURE. Remember this assigns the genomes of individuals in proportions to K ancestral units. As an example, if you had African Americans, Yoruba, and White Americans in a total pool, and did K = 2, you would tend to find Yoruba and White Americans assigned to two totally different ancestral populations, while African Americans are 80% in one ancestry and 20% in the other. The interpretation of this is straightforward, but when it comes to populations whose backgrounds we don’t know as well, one should be careful. The selection of a particular value for K is going to be really important, and we shouldn’t confuse the method with the reality which the method is trying to plumb.
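The K = 2 example can be sketched with a toy calculation. This is not how the ADMIXTURE program itself works: it estimates the ancestral allele frequencies and the individual proportions jointly, without being told the source populations. The simplified version below assumes the allele frequencies of two hypothetical source populations ("A" and "B") are already known, simulates individuals, and then recovers each individual's fraction of "A" ancestry by a grid search over the likelihood. All frequencies and sample values are made up for illustration.

```python
import math
import random

def draw_genotype(p, rng):
    """Diploid genotype: number of copies (0, 1 or 2) of an allele with frequency p."""
    return (rng.random() < p) + (rng.random() < p)

def simulate_individual(q, freqs_a, freqs_b, rng):
    """Each allele copy comes from source A with probability q, otherwise from B."""
    return [draw_genotype(q * pa + (1 - q) * pb, rng)
            for pa, pb in zip(freqs_a, freqs_b)]

def log_likelihood(q, genotypes, freqs_a, freqs_b):
    """Binomial log-likelihood of the genotypes for ancestry fraction q (constants dropped)."""
    total = 0.0
    for g, pa, pb in zip(genotypes, freqs_a, freqs_b):
        p = q * pa + (1 - q) * pb
        total += g * math.log(p) + (2 - g) * math.log(1 - p)
    return total

def estimate_fraction(genotypes, freqs_a, freqs_b, steps=100):
    """Grid search for the fraction of source-A ancestry that best explains the genotypes."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: log_likelihood(q, genotypes, freqs_a, freqs_b))

rng = random.Random(1)
n_snps = 2000
# Hypothetical allele frequencies in the two source populations.
freqs_a = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
freqs_b = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]

# One individual from each source, plus one with 80% A and 20% B ancestry.
true_fractions = {"source A": 1.0, "source B": 0.0, "admixed": 0.8}
estimates = {}
for name, q in true_fractions.items():
    genotypes = simulate_individual(q, freqs_a, freqs_b, rng)
    estimates[name] = estimate_fraction(genotypes, freqs_a, freqs_b)
    print(name, "true:", q, "estimated:", estimates[name])
```

The caveat in the text still applies: the estimate is only as meaningful as the assumption that there really are two source populations with these frequencies. Choose a different K, or different reference groups, and the same genome gets carved up differently.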

First, K = 8 from Behar et al. I’ve reedited to highlight populations which might inform the variation of Ethiopians.

Now let’s look at a series of K’s. Note the changes.

Luckily for us, we don’t need to stop here. Dienekes included Behar’s Ethiopians (non-Jews) for Dodecad. Additionally, he included the Masai population from the HapMap. This turns out to be important because he found that Ethiopian Sub-Saharan ancestry is similar to that of the Masai, not the other African groups.

Dienekes also provided individual outputs. I’ve stitched together Ethiopians with Egyptians and Saudis. The color coding is the same as above.

You should be able to tell where the three groups start and stop pretty easily. I’m 99% sure that the six individuals with more East African and less Southwest Asian ancestry are all Oromo. Ethiopians, in particular highland Ethiopians, seem to me likely to be an ancient stabilized hybrid between a population from Arabia and a local Sub-Saharan population. This local population seems unlikely to have been related to the peoples of West-Central Africa, who are associated with the Bantus across eastern and southern Africa. The Bantu agricultural toolkit runs into ecological constraints in various regions, and it is in those regions that non-Bantu populations have persisted. Ethiopia, with its unique climate and topography, naturally remains non-Bantu (as does the Horn of Africa as a whole). The possible connections between Khoisan and Ethiopia may be a function of the fact that these areas harbor genetic variants which have disappeared in the intervening regions because of the Bantu expansion. I have a hard time accepting that the Bantu expansion was particularly eliminationist, but I am starting to suspect that outside of Ethiopia population densities were very, very low.

The antiquity of this ancient hybridization event to me is attested by the fact that Ethiopians lack any of the other Middle Eastern components besides the one modal in Saudi Arabia. There is a great deal of intra-population variance in the Saudi data set. Why? Part of this must be the slave trade, as well as pilgrims who remained in places like Mecca. But, I think part of the untold story here is that there may have been a larger genetic impact on Arabia after the rise of Islam from the Levant than vice versa! Probably the gene flow precedes Islam, as Arabia was hooked into worldwide trade and population movements, which Ethiopia was relatively insulated from. The Saudi data set has several people who are “pure” Southwest Asian, but also several who have a great deal of West Asian + South European. These seem likely to be people who have some background in the Fertile Crescent.

Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"