The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media

One of the great things about ADMIXTURE is that the population elements shake out of the data through the logic of the program. The worst thing is that it is then left up to you to make sense of the elements. A useful way to use ADMIXTURE and avoid excessive interpretive fogginess is to figure out individual proportions of contribution from X ancestral groups when you have a pretty good idea that an admixture event did occur between very distinct and distantly related population groups. To some extent the whole New World is a good laboratory for this process. Consider, for example, someone from the Dominican Republic or Puerto Rico. There is a good chance that their ancestry will fractionate into three elements:

– An African one

– An Amerindian one

– A European one

These three elements are sampled from very different geographic locations. The ancestral populations have been separated for tens of thousands of years, with little to no gene flow among them. This means that the allele frequencies of the “source” populations should be relatively different (maximizing Fst). Mapping the inferred allele frequencies of the abstract ancestral populations generated by ADMIXTURE onto the concrete allele frequencies of known source populations is therefore rather straightforward.
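To make the point about Fst concrete, here is a minimal sketch of Wright's Fst at a single biallelic SNP for two equally weighted populations. The function name and the allele frequencies are illustrative assumptions, not values taken from the HapMap data.

```python
def fst_two_pops(p1, p2):
    """Wright's Fst at one biallelic SNP, two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if the two populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    # Average expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        return 0.0  # SNP is monomorphic everywhere, so no differentiation
    return (h_total - h_within) / h_total


# Hypothetical frequencies: closely related vs. long-separated populations.
similar = fst_two_pops(0.50, 0.55)
divergent = fst_two_pops(0.10, 0.90)
print(f"similar: {similar:.4f}, divergent: {divergent:.4f}")
```

The more the source frequencies differ, the closer Fst gets to 1, and the easier it should be to tell which source a given stretch of ancestry came from.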

So here’s an experiment. I have 40 individuals with non-trivial African admixture. Most of them are African Americans, though some are of Latino heritage, and several are of Ethiopian or Somali origin. A minority are people who have only a small quantum of African ancestry, but well above the “noise” threshold. Let’s take four populations from the HapMap: Yoruba, Utah whites, Maasai, and Chinese from Beijing. I merged the data (removing problem individuals), and added the aforementioned 40 individuals. I pruned the data set so that no SNP had missing calls in more than 0.5% of the individuals. I was left with ~120,000 markers.
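The missingness pruning described above can be sketched as follows. This is a toy illustration under stated assumptions: the data layout (a dict of SNP id to per-individual calls), the SNP ids, and the function names are all hypothetical. The post does not say which tool was used; a filter like this is commonly done with PLINK's `--geno` option.

```python
MISSING = None  # marker for a missing genotype call in this toy layout


def snp_missing_rate(calls):
    """Fraction of individuals with no genotype call at one SNP."""
    return sum(1 for call in calls if call is MISSING) / len(calls)


def prune_snps(genotypes, max_missing=0.005):
    """Keep SNPs whose missing-call rate is at most max_missing (0.5%)."""
    return {
        snp: calls
        for snp, calls in genotypes.items()
        if snp_missing_rate(calls) <= max_missing
    }


# Toy data: 1,000 individuals, genotypes coded 0/1/2, None = missing.
n = 1000
genotypes = {
    "snp_a": [1] * n,                             # no missing calls
    "snp_b": [MISSING] * 4 + [2] * (n - 4),       # 0.4% missing, kept
    "snp_c": [MISSING] * 6 + [0] * (n - 6),       # 0.6% missing, dropped
}
kept = prune_snps(genotypes)
print(sorted(kept))
```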

Then I did two runs of ADMIXTURE: supervised and unsupervised. In the supervised run the HapMap populations were “pure,” while in the unsupervised runs the HapMap populations also had their ancestries inferred. Here are the population breakdowns for the HapMap populations in the unsupervised run:

The Maasai are the only group with much intrapopulation variance:

OK, so how did the admixed set that I have vary across the two runs? There were four ancestral components, which I labeled:

– West African

– European

– Chinese

– East African

Here are the correlations between the two runs for the 40 individuals:

– West African, 0.9995

– European, 0.9997

– Chinese, 0.9957

– East African, 0.9988

Not too shabby. Here are the barplots side by side:

Here are the runs so you can see them:

This seems like a best-case scenario for ADMIXTURE smoking out population structure. For all that ADMIXTURE is just a “dumb program,” it can be very illuminating when used judiciously.
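The per-component correlations between the supervised and unsupervised runs reported above are ordinary Pearson correlations across individuals. A minimal sketch, assuming each run yields one ancestry fraction per individual for a given component; the numbers below are made up for illustration and are not the actual results for the 40 individuals.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical "West African" fractions for five individuals in each run.
supervised = [0.80, 0.75, 0.10, 0.55, 0.90]
unsupervised = [0.79, 0.76, 0.11, 0.54, 0.91]
r = pearson(supervised, unsupervised)
print(f"r = {r:.4f}")
```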

• Category: Science • Tags: Admixture, African American, Genetics, Genomics 

There’s a new paper in Genome Biology (tip: Dienekes), Characterizing the admixed African ancestry of African Americans, which doesn’t present too much in terms of new results but has really, really good visualization of the data:

From cluster analysis, we found that all the African Americans are admixed in their African components of ancestry, with the majority contributions being from West and West-Central Africa, and only modest variation in these African-ancestry proportions among individuals. Furthermore, by principal components analysis, we found little evidence of genetic structure within the African component of ancestry in African Americans.

These results are consistent with historic mating patterns among African Americans that are largely uncorrelated to African ancestral origins, and they cast doubt on the general utility of mtDNA or Y-chromosome markers alone to delineate the full African ancestry of African Americans. Our results also indicate that the genetic architecture of African Americans is distinct from that of Africans, and that the greatest source of potential genetic stratification bias in case-control studies of African Americans derives from the proportion of European ancestry.

I want to emphasize the part about lack of utility of uniparental markers. These were the first markers which became widely used in scientific genealogy, and African Americans made a great deal of recourse to these so as to identify the tribe from which their ancestors came. There are obvious historical reasons why this would have more valence for this group than for others, as their ancestral identity was consciously erased during the period of slavery.

But even though generating trees of mtDNA or Y markers is more tractable using a coalescent model, and it gives you a clean answer, it’s only a tiny slice of your ancestry. And not necessarily a representative one. Perhaps better than nothing 10 years ago, but in the days of 450 K SNP chips probably outdated. As I said above I think the paper is interesting mostly because the graphical representation is really good. Most of the time I add labels, but this figure needs no additional explanatory editing!


The blue represents European ancestry in individual African Americans, and in the text they note that the frappe bar plot nearly perfectly aligns with the distribution on the PCA plot. Remember that the two axes on the PCA plot represent the two largest axes of variation, with the first component (largest) on the x, and the second component (second largest) on the y. The largest component naturally separates Europeans from the African groups, while the second largest component separates the various African groups. The difference between the two Pygmy groups is not surprising; the Biaka have been found to be much more admixed with their Bantu neighbors than the Mbuti. I wouldn’t put too much weight on the closeness of the San and Mbuti on the plot, because you’re seeing only a two-dimensional view of the total genomic variation: the two largest dimensions, as evaluated by looking at the total range of genetic variation among the set of individuals (European, African American and African) within the data set. The relationships may differ if you constrain the sample space of genetic variation to African genotypes only, and other dimensions may also indicate different relationships.
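To illustrate what "the largest axis of variation" means, here is a toy sketch that finds the first principal component of a small genotype matrix by power iteration. This is not the method or software used in the paper; the genotypes, group labels, and function names are invented for the example, and real analyses use hundreds of thousands of SNPs and a dedicated PCA tool.

```python
import math


def center_columns(matrix):
    """Subtract each SNP's mean genotype so every column averages zero."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def gram_matrix(matrix):
    """Individual-by-individual matrix of dot products between rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix]
            for r1 in matrix]


def top_eigenvector(sym, iters=200):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    n = len(sym)
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-zero start
    for _ in range(iters):
        new = [sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec


# Hypothetical genotypes (0/1/2 copies of an allele) at five SNPs.
group_a = [[0, 0, 2, 2, 1], [0, 1, 2, 2, 0], [0, 0, 2, 1, 1]]
group_b = [[2, 2, 0, 0, 1], [2, 1, 0, 0, 0], [2, 2, 0, 1, 1]]

centered = center_columns(group_a + group_b)
# Each entry is one individual's position along PC1 (up to scale and sign).
pc1 = top_eigenvector(gram_matrix(centered))
print([round(score, 2) for score in pc1])
```

The sign of a principal component is arbitrary; what matters is that the two groups fall on opposite sides of PC1, just as the first component in the paper's plot separates Europeans from the African groups.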

Here are the estimates of ancestral quanta for African Americans by region against different potential ancestral groups. They had 136 African Americans, so I wouldn’t put too much weight on the interregional differences.


22% of the ancestry of African Americans in the sample is European, with a standard deviation of 12%. It seems that around 10% of the African American population is more than half European in ancestry. Interestingly, in Henry Louis Gates Jr.’s Faces of America, all three of the people with black ancestry, two of whom clearly identified as African American, were more than 50% European in ancestry.* When it comes to African ancestry, the affinity with the region to the west of the Bight of Benin seems clear if you view the data through a more granular lens.


The Mandenka are from the western fringe of West Africa, while the Bantu are a linguistic group which seems to have emerged just to the east of Nigeria, and swept east and south with the spread of a particular agricultural lifestyle until pushed up against the Nilotic and Khoisan groups of East and South Africa respectively. But this is on the population level. Could it be that individuals exhibit variance by African region, as they do on European ancestry? Not too much (at least beyond a level of noise, and perhaps a few outliers).

The two figures below are based on African genotypes within the African American population.

Note the contrast with the linear topology evident when European ancestry is added into the mix. Verbally what is clear is that while some African Americans have more European ancestry than others, on an individual level very few are reasonably identified as Yoruba people, or Mandenka people. Rather, individual African Americans exhibit a mix of African lineages in proportion to the various weights of sources in the slave trade.

Why might this be? I have observed before that the vast majority of the ancestry of African Americans is likely colonial. Though a few African American communities, such as the Gullah of coastal South Carolina, preserve distinctive regional African folkways, by and large black Americans as a culture are American, and derive many of their distinctive aspects from elaborations on Anglo norms or a novel synthesis of African ones (in particular, it seems clear that black Americans have been strongly influenced by the two Southern British settler folkways in their speech and religion). The deep history of African Americans within this country means that a great deal of time has elapsed during which people of Yoruba, Mandenka or Kongo ancestry could have intermarried. Without positive assortative mating by tribe, the various ancestral quanta would have become intermixed in subsequent generations. The Gullah exception supports this model, because they lived in relative isolation from whites. The rice agriculture which they practiced required less direct supervision than cotton or tobacco to extract economic productivity, and the South Carolina coastal country was notoriously unhealthful for whites. The relatively humane nature of rice agriculture as opposed to cotton (and especially sugar) also manifested in the more stable family life of the ancestors of the Gullah. So the relationship between white planters and Africans in this region was closer to that between lord and serf than owner and property, and the ancestors of the Gullah could develop their culture in America more organically than African Americans elsewhere.
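The claim that random mating homogenizes ancestral quanta over generations can be illustrated with a small simulation. This is a deliberately crude sketch under assumed conditions: each person is reduced to a single number (the fraction of ancestry from one hypothetical source), a child gets the average of two randomly drawn parents, and population size, founder mix, and generation count are arbitrary. It ignores recombination, population growth, and ongoing migration.

```python
import random
import statistics


def next_generation(ancestry, rng):
    """Each child averages the ancestry fractions of two random parents.

    Parents are drawn with replacement, so no assortative mating occurs.
    """
    return [
        (rng.choice(ancestry) + rng.choice(ancestry)) / 2
        for _ in range(len(ancestry))
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Founders: half entirely from source A (1.0), half entirely from others (0.0).
population = [1.0] * 500 + [0.0] * 500
variances = [statistics.pvariance(population)]

for _ in range(6):
    population = next_generation(population, rng)
    variances.append(statistics.pvariance(population))

for generation, variance in enumerate(variances):
    print(f"generation {generation}: variance = {variance:.4f}")
```

In this model the between-individual variance in ancestry roughly halves each generation while the population mean stays put, so after a handful of generations almost no one is identifiable as coming from a single source.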

In contrast, white ancestry does exhibit a great deal of individual variation. Why? Two obvious reasons jump out. First, much of the ancestry may be much more recent. Recent ancestry has less time to be “dispersed” across the population through intermarriage. Though certainly whites and blacks mixed genetically in the colonial era, the process continued uninterrupted down to emancipation, while the addition of new African ancestry ceased in near totality by 1810 (there was some slave trade which reached the United States of America after this period, but not much), and had greatly diminished in the decades before 1810. The endogenous population growth of the black American community was sufficient to provide slaves for the new cotton lands of the early 19th century. After 1865 white-black relations were more surreptitious but continued nonetheless (e.g., Malcolm X’s mother’s father was white). Second, there is naturally the reality that there was assortative mating for European features (e.g., “good hair”, skin lighter than “a brown paper bag”) among the African American elite. Though ancestry and phenotype can become decoupled, this takes time, and as I suggest above much of the European ancestry is recent. The image above is of a black American Congressman, Adam Clayton Powell Jr. I assume most readers are aware that W. E. B. Du Bois’ “Talented Tenth” were disproportionately what in other societies would be recognized as people of mixed race, but who in the United States were classed within the general black population because of the white Southern paradigm of hypodescent.

Overall, nothing too new in the paper, but really great charts!

Citation: Zakharia F, Basu A, Absher D, Assimes TL, Go AS, Hlatky MA, Iribarren C, Knowles JW, Li J, Narasimhan B, Sidney S, Southwick A, Myers RM, Quertermous T, Risch N, & Tang H (2009). Characterizing the admixed African ancestry of African Americans. Genome biology, 10 (12) PMID: 20025784

* Gates is more than 50% European, while Elizabeth Alexander is 65% European in ancestry. This aligns with intuition based on physical appearance. Malcolm Gladwell, who may not identify as African American (his father was a white Englishman, his mother a mixed-race Jamaican, and he is a Canadian immigrant), is likely to be ~75% European, though the number was not noted in the special.

Image Credit: Library of Congress

• Category: Science • Tags: African American, Genetics, Genomics 
Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"