The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Razib Khan
Gene Expression Blog


Credit: Dbachmann


Update: Turns out “Maria” is also an ethnic Roma.

There was a recent case in Ireland of a young Roma girl who was blonde haired and blue eyed being removed from her home, on the suspicion that she was not in fact the biological child of the presumed parents (who, like most Roma, are reportedly of dark complexion, hair, and eyes). I even saw a report that a hospital was consulted on the probability of such an outcome, and they said it would be “extremely unusual”. It turns out that DNA tests confirmed that this girl was the biological child of the putative parents. And of course all this has to be understood in light of the case of “Maria” in Greece, a little blonde girl who turned out not to be the biological child of the two Roma who claimed her as their daughter (it looks like there was welfare fraud in that case).

My initial response to the Irish case was that the consultant should be fired, because in an admixed population like the Roma it shouldn’t be that unusual to have offspring who deviate a great deal from the parental phenotype. This prompted some interesting reactions. First, there were those who seem blissfully ignorant of the fact that the Roma are an admixed population. That’s easy enough to resolve, as there have been scientific papers published on this issue using genome-wide data. Second, there are claims that a very small fraction of Roma have blonde hair and blue eyes (on the order of less than 1%). The latter may be a defensible claim, though not indisputably so.

Before we move on I have to clarify that there is a distinction between “Roma” and “Romani.” The latter refers broadly to the populations across Europe which were referred to as “Gypsy,” while the former denotes a set of populations with a center of distribution in Southeast Europe, in particular in the Balkans. In much of Northern and Western Europe there are now two populations of Romani with very distinct histories (and genetics): the Roma who have recently arrived from Southeast Europe, and the various non-Roma groups who have a very long history in their nations of residence (e.g., Finnish Kale).

In terms of various traits we know a fair amount about the genetics of pigmentation in humans. Though the fine grained individual predictive models are coarse, most of the genes which have large effects on population-scale differences are now well characterized. This allows me to produce a model which is reasonably plausible to give you an intuition for why brown-skinned populations can generate a wide range of outcomes in realized phenotype.

Imagine five loci rank-ordered in effect size, gene 1, gene 2, gene 3, gene 4, and gene 5. Each gene comes in two flavors, two alleles. One is a “dark” allele (produces dark pigmentation) and another is a “light” allele. From these you can have a distribution of complexion which is referred to as a “melanin index” (it’s dependent on reflectance). Imagine that you assume each allele at each gene exhibits a melanin index value like so in relation to the aggregate:

Gene 1 = 30, 2
Gene 2 = 15, 1
Gene 3 = 10, 1
Gene 4 = 5, 2
Gene 5 = 5, 0


What you see above are potential genotypes (all heterozygote implicitly), with their phenotypic values being the sum of the two. One allele at gene 1 contributes 30 melanin units, and the other 2. And so on. Taking the “dark” alleles and assuming they’re all homozygote (so doubling them), you get a maximal potential value of 130, and a minimal one of 12 if you make the “light” ones homozygote (the light alleles sum to 6, which doubles to 12). But of course in most cases you’ll get a combination. But what would be the outcome for a given set of frequencies? Since I’m lazy I ran a simulation. I set the frequencies of the dark allele for each gene like so:

Gene 1 = 60%
Gene 2 = 45%
Gene 3 = 35%
Gene 4 = 46%
Gene 5 = 50%

Then I generated 10,000 multilocus genotypes, and added a “noise” parameter so that the trait wasn’t totally determined by the genes. This is why the phenotypic value can be higher (and lower, though that bound can go no further than ~0) than what genotype would predict. Here’s the distribution:


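The mean value is 73. The 25th percentile is 55. About 1 out of 26 haplotypes (one set of alleles across the five genes) should be exclusively “light”; assuming random mating and independent loci, an individual who is homozygous “light” at all five genes would be rarer, roughly 1 in 670. The point is that in a polygenic character if you have polymorphism on the genotypic level you’re likely to have it on the phenotypic level.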
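The simulation described above can be sketched roughly as follows. This is a minimal reconstruction, not the original script: it assumes additive alleles, random mating, independent loci, and Gaussian noise, and the names `simulate`, `noise_sd`, and `seed` (and the noise value of 10) are hypothetical since the post does not give them. Under these assumptions the expected genotypic mean works out to about 72.3, close to the reported 73, and the product of the light-allele frequencies comes to about 1 in 26 per haplotype.

```python
import random

# Per-allele melanin-index contributions (dark, light), taken from the text.
EFFECTS = [(30, 2), (15, 1), (10, 1), (5, 2), (5, 0)]
# Dark-allele frequency at each gene, taken from the text.
DARK_FREQ = [0.60, 0.45, 0.35, 0.46, 0.50]


def expected_mean():
    """Mean genotypic value, assuming random mating and additive alleles."""
    return sum(2 * (p * dark + (1 - p) * light)
               for (dark, light), p in zip(EFFECTS, DARK_FREQ))


def all_light_haplotype_prob():
    """Chance that one haplotype carries the light allele at all five genes."""
    prob = 1.0
    for p in DARK_FREQ:
        prob *= 1 - p
    return prob


def simulate(n=10_000, noise_sd=10.0, seed=1):
    """Draw n diploid genotypes and return noisy phenotypic values.

    noise_sd and seed are assumptions; the post does not state them.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        total = 0.0
        for (dark, light), p in zip(EFFECTS, DARK_FREQ):
            for _copy in range(2):  # two allele copies per gene
                total += dark if rng.random() < p else light
        total += rng.gauss(0.0, noise_sd)   # non-genetic noise
        values.append(max(total, 0.0))      # the index cannot go below 0
    return values


if __name__ == "__main__":
    vals = sorted(simulate())
    print("expected mean:", round(expected_mean(), 2))
    print("simulated mean:", round(sum(vals) / len(vals), 1))
    print("25th percentile:", round(vals[len(vals) // 4], 1))
    print("all-light haplotype: 1 in", round(1 / all_light_haplotype_prob()))
```

The exact percentiles depend on the assumed noise level, so they will not necessarily match the figures quoted in the post.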

The second major question is: is this even plausible for Roma? Yes. They’re very admixed. Two recent papers make the case definitively, Reconstructing Roma History from Genome-Wide Data and Reconstructing the Population History of European Romani from Genome-wide Data. These papers used tens of thousands to hundreds of thousands of markers. You can see in the bar plot to the left that the Roma have much higher European-like ancestry proportions than other Indians. It is likely their parental population is Punjabi-like, so it seems that they’re ~50% non-Indian in admixture. The second paper offers up a wider population set for comparison, and it suggests that the Roma did not experience much gene flow with Middle Eastern groups (there are still Roma-related populations in the Middle East, the Dom). Rather, their primary phase of admixture occurred ~1,000 years ago in the Balkans.

Reconstructing the Population History of European Romani from Genome-wide Data has a wide range of Romani populations, and it seems evident that the Western and Northern Romani have more European admixture than the Balkan Roma. It turns out that the Welsh Romani seem totally Europeanized in their genome. That is, they’re basically now a Northern European population, perhaps with some residual South Asian ancestry. Because these Romani originally spoke an Indo-Aryan language it seems that they are genuine Romani in a cultural sense. The Welsh Romani have simply undergone enough gene flow with the surrounding population over the past hundreds of years to lose their genetic distinctiveness.

You can see a broader population wide comparison in this bar plot. European populations are at the top, and below them are the Romani groups. The South Asian admixture is again evident, but observe the paucity of both of the Middle Eastern components (you can label them “Northern/Caucasian” or “Southern/Arabian” for convenience; they show up repeatedly in Admixture analyses). The authors of the second paper linked above make much of this, but I would be cautious. I would have preferred that they run Admixture in supervised mode, or perhaps used a formal test of admixture (e.g., D-statistic). But, it is strongly suggestive of the possibility that the Roma sojourn in the Middle East was rather short, and that the true ethnogenesis of the group occurred in the Balkans primarily. And, as I said earlier, the European genetic character of Welsh Romani is pretty obvious in this plot (they cluster with Europeans in the PCA as well).

But, despite the Romani history of admixture in Europe, some of them are genetically very isolated now, and have been for hundreds of years. This seems to be the case for the Roma, who have had surprisingly little admixture since the initial settlement. There’s widespread evidence of inbreeding and founder effect across the Romani populations as well, making them both admixed and very distinct. You see long runs of homozygosity, and the clustering bar plots tend to “break out” the Roma rather early on in the steps up the number of populations, similar to what you see in groups such as the Kalash. I believe one of the problems with adducing phylogenetic relationships of the Romani with Y and mtDNA markers was simply that bottleneck effects are more powerful for uniparental lines, and they were buffeted more by the small population size. In sum, when it comes to Roma genetic variation there are a few things to keep in mind:

1) South Asian source

2) Admixture with Southeastern Europeans

3) Long period of relative genetic continuity and isolation after the initial phase

4) Genetic homogeneity within the groups. That is, they’re well admixed across most individuals

5) Lots of novel genetic uniqueness because of a high drift rate, itself due to small effective population size

(Republished from Discover/GNXP by permission of author or representative)
• Category: Science • Tags: Human Genomics, Roma, Romani 

If you live in the States one of the things you hear a lot about Europe in regards to its relationship to its ethno-religious minorities are the problems with Muslims. This is probably an Americo-centric perspective shaped by 9/11, when many of the hijackers turned out to have spent time in Germany. Additionally, terrorist actions in both London and Madrid highlight the persistence of these problems over the years. These sorts of shocking events put a sharp focus on the geopolitical cross-hairs which Europe finds itself in during the second age of mass migration. Though this time it is a destination, and not a source.

But having been to Europe recently it was notable that in several regions the day-to-day tension when it came to ethnicity often focused on Gypsies (I use the older term because the ethnonym “Roma” which has become politically correct in the USA includes only a subset of Europe’s Gypsy population, even if the greater number). Many regions of Europe now have two distinct populations of Gypsies, a long resident local group, as well as Roma from the eastern nations of the EU. Though the relationships between these traditionally nomadic peoples and indigenous populations have never been without tension, it is clear that something close to a modus vivendi has been achieved in many European nations between the majority and their small native Gypsy populations. The influx of the Balkan Roma adds a new variable. But the political fuss for me simply rekindled a curiosity as to the genetic origins of the Gypsies. Culturally their South Asian provenance couldn’t be clearer; they speak an Indo-Aryan language. Their term for themselves in many parts of Europe comes from the Indo-Aryan word for “black,” as they are darker than the natives of the lands in which they have settled, and in fact often look visibly South Asian. This seemed especially true of Balkan Roma. On the other hand the Kale of Finland looked to be brunette Europeans.

The problem with the genetics of the Gypsy people of Europe is that until recently studies have focused on uniparental lineages. Though this has confirmed their South Asian origins, looking at maternal or paternal direct descent alone leaves something to be desired in terms of assessing ancestry, and these two markers (mtDNA and Y) are subject to more drift as they are haploid (half as many copies). But a new paper in The American Journal of Physical Anthropology has some results using 15 autosomal STRs (a group of highly variant markers). A Genetic Historical Sketch of European Gypsies: The Perspective From Autosomal Markers:

In this study, 123 unrelated Portuguese Gypsies were analyzed for 15 highly polymorphic autosomal short tandem repeats (STRs). Average gene diversity across the 15 markers was 76.7%, which is lower than that observed in the non-Gypsy Portuguese population. Subsets of STRs were used to perform comparisons with other Gypsy and corresponding host populations. Interestingly, diversity reduction in Gypsy groups compared to their non-Gypsy surrounding populations apparently varied according to an East-West gradient, which parallels their dispersion in Europe as well as a decrease in complexity of their internal structure. Analysis of genetic distances revealed that the average level of genetic differentiation between Gypsy groups was much larger than that observed between the corresponding non-Gypsy populations. The high rate of heterogeneity among Gypsies can be explained by strong genetic drift and limited intergroup gene flow. However, when genetic relationships were addressed through principal component analysis, all Gypsy populations clustered together and was clearly distinguished from other populations, a pattern that suggests their common origin. Concerning the putative ancestral genetic component, admixture analysis did not reveal strong Indian ancestry in the current Gypsy gene pools, in contrast to the high admixture estimates for either Europeans or Western Asians.

This isn’t a 500,000 SNP-chip analysis, so everything needs to be taken with a grain of salt. But, 15 markers is a lot more than the two you usually have to deal with when assessing the genetics of the Gypsy populations of Europe, so it’s certainly an improvement when making inferences. One figure and one table are really worth looking at in this paper.


The first plot shows the variance partitioned into two dimensions as a function of the 13 STRs. The table shows bootstrapped admixture estimates and standard deviations. They had a 3-population model with West Asians, but it didn’t look to me like they were getting sensible results with that, so I excised that portion (with only 600 pixels the table would have been very hard to read with the nonsensical estimates in). I think the last model where they aggregate West Asians with Europeans makes the most sense. I assume the major issue here is that with 15 STRs which aren’t necessarily filtered for ancestral informativeness within these populations you’re going to get weird results on the margins.

These results confirm the finding from previous Y and mtDNA results that Europe’s Gypsy populations are genetically fragmented, and seem to have gone through bottlenecks. In this paper they also seem to have found a pattern of decreased genetic variance from east to west for the Gypsy groups, which makes sense in light of a historical model of serial bottlenecks as they traversed Europe. Any reasonable model of the genetic heritage of the Gypsy people of the world posits that they’re a compound to various extents of populations distributed along a continuum between South Asia and Western Europe, and yet here you see in a 2-dimensional plot that they don’t look like a linear combination of South Asians and Europeans. Why? Because their unique genetic history has resulted in their “random walk” into patterns of allelic variance distinct from the ancestral groups.
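To give an intuition for that “random walk,” here is a minimal sketch of neutral drift under a Wright-Fisher model. This is an illustrative assumption on my part, not the method used in the paper; the function names `drift` and `spread`, the population sizes, the starting frequency, and the seed are all hypothetical. It shows that the same allele, starting at the same frequency, ends up scattered much more widely across replicate populations when the population is small.

```python
import random


def drift(p0, pop_size, generations, rng):
    """Neutral Wright-Fisher drift: resample 2N allele copies each generation."""
    p = p0
    copies = 2 * pop_size
    for _ in range(generations):
        count = sum(1 for _copy in range(copies) if rng.random() < p)
        p = count / copies
    return p


def spread(pop_size, generations=24, replicates=100, p0=0.5, seed=7):
    """Standard deviation of final allele frequency across replicate populations."""
    rng = random.Random(seed)
    finals = [drift(p0, pop_size, generations, rng) for _ in range(replicates)]
    mean = sum(finals) / len(finals)
    return (sum((x - mean) ** 2 for x in finals) / len(finals)) ** 0.5


if __name__ == "__main__":
    for n in (50, 500):  # hypothetical effective population sizes
        print(f"N = {n}: spread of allele frequency = {spread(n):.3f}")
```

The 24 generations used as a default simply mirror the rough time depth in the Balkans discussed in this post; any real population would also have migration and changing size, which this sketch ignores.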

But a second genetic dynamic with these populations seems to be admixture. With 15 STRs, and obvious sensitivity depending on the populations you survey, one should be careful about overweighting the findings from this paper. And yet plausibly it does show a pattern of decreased South Asian admixture the further you go from the Balkans. Not only does this stand to reason a priori, but empirically it’s generally agreed that the Gypsy groups of the north and west of Europe look less South Asian in appearance than those of the Balkans.

A final consideration here is that the Indian populations which they used as a reference for South Asians are not representative of the ancestral Indian groups from which Gypsies derived. The Indo-Aryan language of the Gypsies seems to share the most features with the languages of northwest India, Punjabi and Hindi. But the samples which had the appropriate STRs for comparison were Central and South Indian. Overall I don’t think that’s much of a consideration, but it is something to remember.

A bigger take home point is the disjunction between cultural and biological modes of inheritance and persistence. The language of the Gypsies retains in its broad outlines the character of an Indo-Aryan tongue. That is why the South Asian origins of Gypsies were able to be ascertained by Indian sailors in Britain who overheard, and broadly understood, what Gypsies were saying. Romanipen, the spirit of Gypsy culture which transcends difference of religion and nationality, seems to be clearly traceable to some South Asian antecedents (e.g., the emphasis on avoidance of contamination of food by outsiders).

And yet despite the cultural distinctiveness the various Gypsy populations have become genetically less South Asian. That makes sense: it seems likely that they left India ~1000 years ago, or 40 generations. They’ve been in the Balkans for about 600 years, or 24 generations. Let’s assume unrealistically that the Roma were 100% South Asian when they arrived in the Byzantine lands (there are related groups in the Middle East, so it seems certain they picked up Middle Eastern ancestry along the way, but no matter). 99% endogamy per generation would imply that they’d be 79% South Asian today. 95% endogamy would result in them being 29% South Asian. 90% endogamy would mean that they’d be 8% South Asian. Reality is more complex. It is likely that in the early periods when social norms had not hardened and Roma were less numerous the endogamy rates were probably far lower, especially as the Gypsy bands mixed with other destitute groups in the Balkans. The evidence of lots of structure across the Gypsy groups points to endogamy drilling down to a lower level of organization than just the ethnic group, which would be consistent with tendencies within South Asian culture more broadly.
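The endogamy arithmetic above is just compounding: if a fraction e of each generation’s gene pool comes from within the group, the original ancestry shrinks by a factor of e per generation, leaving e^24 after 24 generations. A minimal sketch, assuming a constant rate and one-way gene flow into the group (the function name `remaining_fraction` is hypothetical):

```python
def remaining_fraction(endogamy_rate, generations, start=1.0):
    """Fraction of original ancestry left after constant per-generation endogamy."""
    return start * endogamy_rate ** generations


if __name__ == "__main__":
    for rate in (0.99, 0.95, 0.90):
        share = remaining_fraction(rate, 24)  # ~600 years at 25 years per generation
        print(f"{rate:.0%} endogamy: {share:.0%} South Asian")
    # 99% endogamy: 79% South Asian
    # 95% endogamy: 29% South Asian
    # 90% endogamy: 8% South Asian
```

Passing a `start` below 1.0 covers the more realistic case where some admixture had already occurred before arrival in the Balkans.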

More generally it seems that the Roma and their relatives can’t just be understood as a simple linear combination of Europeans, Middle Easterners, and South Asians, genetically or culturally. Their unique history has reshaped them, and their persistence and demographic expansion in the face of ostracism and persecution are clear evidence as to the functional success of their social-cultural traditions.

Citation: Gusmão A, Valente C, Gomes V, Alves C, Amorim A, Prata MJ, & Gusmão L (2010). A genetic historical sketch of European Gypsies: The perspective from autosomal markers. American journal of physical anthropology, 141 (4), 507-14 PMID: 19918999

Razib Khan
About Razib Khan

"I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. If you want to know more, see the links at"