The research findings demonstrate that on the basis of DNA information it is possible to determine with an accuracy of more than 90 percent whether a person has red hair, with a similarly high accuracy whether a person has black hair, and with an accuracy of more than 80 percent whether a person’s hair color is blond or brown. This new DNA approach even allows differentiating hair colors that are similar, for example, between red and reddish blond, or between blond and dark blond hair. The necessary DNA can be taken from blood, sperm, saliva or other biological materials relevant in forensic case work.
A few years ago Wired published a mildly alarmist piece titled ‘The Inconvenient Science of Racial DNA Profiling’, which focused on eye color identification. It turns out that most of the eye color variation in European populations can be predicted by variants across two genes, HERC2 and OCA2. Because markers in this region can explain ~75% of the variation in the trait it is ‘quasi-Mendelian.’ As it happens, markers in this region also seem to affect skin color and hair color. Naturally these loci loom large in the paper on which the above press release is based. It’s in Human Genetics. Model-based prediction of human hair color using DNA variants:
Predicting complex human phenotypes from genotypes is the central concept of widely advocated personalized medicine, but so far has rarely led to high accuracies limiting practical applications. One notable exception, although less relevant for medical but important for forensic purposes, is human eye color, for which it has been recently demonstrated that highly accurate prediction is feasible from a small number of DNA variants. Here, we demonstrate that human hair color is predictable from DNA variants with similarly high accuracies. We analyzed in Polish Europeans with single-observer hair color grading 45 single nucleotide polymorphisms (SNPs) from 12 genes previously associated with human hair color variation. We found that a model based on a subset of 13 single or compound genetic markers from 11 genes predicted red hair color with over 0.9, black hair color with almost 0.9, as well as blond, and brown hair color with over 0.8 prevalence-adjusted accuracy expressed by the area under the receiver characteristic operating curves (AUC). The identified genetic predictors also differentiate reasonably well between similar hair colors, such as between red and blond-red, as well as between blond and dark-blond, highlighting the value of the identified DNA variants for accurate hair color prediction.
The authors used a logistic regression model in which the SNPs are the input variables used to predict the odds of each hair color category in a sample of Polish individuals. In this paper they used categorical classes as the dependent variables. They don’t make a persuasive case that this is more accurate than treating hair color as a quantitative dependent variable, though they do indicate that a quantitative model of hair color yields no gain in accuracy. Since this is for the purposes of forensic analysis perhaps there is a strong cost vs. benefit angle which I’m not foreseeing. To the left you have a figure which shows the rapidly diminishing marginal returns on SNPs when it comes to predicting hair color in Northern Europeans. The reason that red hair is so easy to detect is that it’s a rare trait which has a very distinctive genetic signature. In fact forensic identification of individuals with red hair via DNA has been practiced in the past, though due to the rarity of this trait its utility hasn’t been too great.
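To make the modeling idea concrete, here is a minimal sketch of the general approach, not the authors' actual pipeline. It assumes a single binary outcome (say, red vs. non-red) rather than the paper's full set of hair color categories, uses made-up effect sizes and allele frequencies on synthetic genotypes coded as 0/1/2 allele counts, and fits the model with plain gradient descent. The names `fit_logistic`, `predict`, `auc`, `TRUE_B` and `TRUE_W` are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit a binary logistic regression by batch gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p  # one beta per SNP
    b = 0.0        # intercept
    for _ in range(epochs):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return b, w


def predict(b, w, x):
    """Predicted probability of the trait for one genotype vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def auc(scores, labels):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data: three SNPs with invented effects (strong, weak, none).
random.seed(0)
TRUE_B, TRUE_W = -2.0, [2.0, 0.5, 0.0]  # assumed values, for illustration only
X, y = [], []
for _ in range(400):
    # Each genotype is the count of "effect" alleles (0, 1 or 2) at frequency 0.3.
    g = [sum(random.random() < 0.3 for _ in range(2)) for _ in TRUE_W]
    prob = sigmoid(TRUE_B + sum(wt * gt for wt, gt in zip(TRUE_W, g)))
    X.append(g)
    y.append(1 if random.random() < prob else 0)

b, w = fit_logistic(X, y)
train_auc = auc([predict(b, w, x) for x in X], y)
print("fitted betas:", [round(v, 2) for v in w])
print("training AUC:", round(train_auc, 3))
```

An AUC of 0.5 means the model ranks people no better than chance, and 1.0 means it ranks every carrier of the trait above every non-carrier. Note that this sketch scores the model on the same data it was fitted to, which flatters the number; a real evaluation would hold out a test set.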
The alleles here are the usual suspects which show up in pigmentation genetics. As noted in the paper there have been some difficulties in the intersection between genomics and medicine in terms of substantive results, but in forensics the localization of salient phenotypic variation to only a few markers has been a relative success.
Click on the image to the left and you’ll see the genes and associated markers. There are two sets of phenotypes predicted (depending on how they categorized them). All of them are ratios like so: (non-black hair color)/(black hair color). The numerical values are betas, which show the relationship between each predictor (the independent variable) and the outcome. The sign gives the direction of the effect and the magnitude its size, and the genes are sorted by utility in prediction. So the first row has MC1R, and a set of highly penetrant markers termed “R.” The presence of R is highly correlated with red hair, as can be seen in the high values in the columns which denote red hair. Aside from MC1R you have conventional SNPs (some of which you can look up online pretty easily).
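For readers unfamiliar with betas, a rough illustration may help: in a logistic model a beta is a change in log-odds, so exponentiating it gives an odds ratio per copy of the allele. The sketch below assumes a simple binary model, and the intercept and beta values are invented for illustration; they are not taken from the paper's table. The function names `odds_ratio` and `probability` are hypothetical.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def probability(intercept, betas, genotype):
    """Predicted probability given allele counts (0, 1 or 2) per marker."""
    log_odds = intercept + sum(b * g for b, g in zip(betas, genotype))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Invented numbers: one strong marker and one weak, protective marker.
example_intercept = -3.0
example_betas = [2.5, -0.4]

for beta in example_betas:
    print("beta", beta, "-> odds ratio", round(odds_ratio(beta), 2))

# Probability for someone carrying one copy of the strong marker only.
print(round(probability(example_intercept, example_betas, [1, 0]), 3))
```

A positive beta pushes the odds toward the category in the numerator of the ratio, a negative beta pushes them away, and a beta of zero (odds ratio of 1) means the marker carries no information.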
What does this mean, and why is it important? The law enforcement application is rather straightforward, though the existence of hair dyes and bleaching agents means that it isn’t quite that useful. Rather, I think what we’re seeing here is a step-wise improvement in forensic genetics to the point where in the near future the perpetrator sketch artist may be as antiquated as the VCR. We’ll be getting somewhere when markers for nose size and shape, as well as other facial characteristics, are smoked out. I don’t believe that a fine-grained reconstruction of someone’s countenance will be possible, but that’s not usually what’s needed in any case for forensics. A coarse reconstruction will probably be superior to the sketches derived from the memories of witnesses; they will be less precise, but more accurate.
Citation: Branicki W, Liu F, van Duijn K, Draus-Barini J, Pośpiech E, Walsh S, Kupiec T, Wojas-Pelc A, & Kayser M (2011). Model-based prediction of human hair color using DNA variants. Human Genetics. PMID: 21197618