The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media

The following was written before I saw Razib’s post below, but I will post it as drafted, as I think it complements Razib’s data:

As there has been much discussion lately of Jewish intelligence and achievement, I was interested to see the following passage in Charles Murray’s Human Accomplishment:

Jews make their first appearance in the annals of the arts and sciences during the centuries when the Middle East and Moorish Spain were at their cultural peak. When science historian George Sarton set out to enumerate the top scientists across the world, including East Asia, South Asia, the Arab World, and Christian Europe, from 1150 to 1300, he came up with 626 names, of whom 95 were Jews – 15 per cent of the total, produced by a group that at the time represented about half of one per cent of the world’s population that was in a position to produce scientists.

To support this Murray cites Sarton’s Introduction to the History of Science, 1927-48, volume 2, pp. 323-3, 533-41, and 808-18. I have checked this out, and Sarton gives some more detailed information. He aims to identify leading figures in ‘science and intellectual progress’, not just science in the narrow sense. He gives data for three 50-year periods: (1) 1150-1200, (2) 1200-1250, and (3) 1250-1300, which can be summarised as follows:




By my count there are 94, not 95, Jews, but this does not significantly affect their percentage of the total. Of course, compilations of this kind must be taken with a hefty pinch of salt. Almost certainly there are biases of available data and selection. Sarton was a great historian of science, but his book is old – for one thing it predates Needham’s Science and Civilisation in China – and a modern survey would probably include more East and South Asian figures. Nevertheless, the high proportion of Jewish figures is impressive. The fairest comparison is probably with Muslims, as in this period Jews were mainly living in Islamic countries. There are nearly three-quarters as many Jews as Muslims in the list, yet Jews can hardly have been a tenth of the population in these areas. Also note that the Jews were relatively more prominent in the west than in the east. This may reflect differences in numbers, in cultural circumstances, or both.
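
The percentages quoted above are easy to check. The following is only a small arithmetic sketch: the figures 626, 95, 94 and "half of one per cent" come from the text, while the function and constant names are invented here for illustration.

```python
from fractions import Fraction

TOTAL_NAMES = 626                    # Sarton's total, as quoted by Murray
POPULATION_SHARE = Fraction(1, 200)  # "about half of one per cent" (Murray's estimate)


def share_of_total(jewish_names: int, total: int = TOTAL_NAMES) -> Fraction:
    """Fraction of the listed names that are Jewish."""
    return Fraction(jewish_names, total)


def over_representation(jewish_names: int) -> Fraction:
    """How many times larger the share of names is than the population share."""
    return share_of_total(jewish_names) / POPULATION_SHARE


# Compare Murray's count (95) with the recount given in the text (94).
for count in (95, 94):
    share = share_of_total(count)
    ratio = over_representation(count)
    print(f"{count} of {TOTAL_NAMES}: {float(share):.1%} of names, "
          f"roughly {float(ratio):.0f} times the population share")
```

Either count rounds to 15 per cent, so the difference between 94 and 95 does not matter for the argument.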

As the emphasis of recent discussion has been specifically on Ashkenazi Jews, it is worth noting that few of the Jews in Sarton’s list were Ashkenazim. Sarton does not give a breakdown into different ‘sects’ of Jews, but he does specify their countries of activity. Only 13 of the 94 came from countries north of the Alps (Germany, Northern France, Bohemia, and England). The largest number of ‘western’ Jews of course lived in Spain.

Following the completion of the reconquista, and ultimately the expulsion of the Jews from Spain, the Sephardic (Spanish and Portuguese) Jews were scattered through Europe south and west of the Alps. It is my impression that the Sephardim remained more prominent in intellectual and cultural life than the Ashkenazim at least until the 17th century. The Ashkenazim produced relatively few intellectual notabilities until the second half of the 18th century [see Note], then exploded into extraordinary prominence in the 19th.

Note: There are of course some difficulties in deciding who is Jewish. For example, the astronomer Sir William Herschel is sometimes listed as Jewish, but it seems he was at most half-Jewish by ancestry. Some of those mentioned as Jewish by Charles Murray seem to be simply erroneous. I can find nothing to suggest that Johannes Herder was Jewish, not to mention the Nuremberg cobbler-poet Hans Sachs. Wagner must be spinning in his grave!

Added June 19

As I mentioned in comments on another thread, when Jews were allowed back into England in the 17th century, Sephardim (Jews of Spanish and Portuguese descent) were the first to arrive, and formed a cultural and economic elite. When Ashkenazim began to arrive in the 18th century, they were mostly poor and lower-class, and the Sephardi elite would have little to do with them. Only in the 19th century, when Sephardi numbers were depleted by apostasy and intermarriage with gentiles, did the barriers between Sephardim and Ashkenazim begin to break down. The wealth of successful Ashkenazim also made it difficult for the Sephardim to feel so superior. (I base these remarks on Cecil Roth’s books on the history of Jews in England.)

I find that there was a similar position in 17th century Holland: according to Simon Schama ‘by 1690, however, this delicate balancing act [between wealthy Sephardim and the Christian population] was threatened by the arrival of Ashkenazi Jews in much greater numbers. Of the 7,500 Jews in Amsterdam at that time, 5,000 were immigrants from Germany, Poland, Bohemia and Lithuania… They settled thickly in streets like Leprozenburgwal, the Nieuwe Kerkstraat and the Nieuwe Houtmarkt, which became known as the milieu of poor Jews… And they turned to the menial ‘ghetto’ trades disdained by the Sephardim like hawking, peddling, and old clothes dealing…’ (Schama, The Embarrassment of Riches, p.594.)

I mention these points because some modern commentators tend to assume that the Ashkenazim are an elite and the Sephardim a kind of underclass! This is the reverse of the historical position.

In modern times the achievements of Sephardic Jews have been overshadowed by the Ashkenazim, but are not negligible. I find that at least 5 Nobel Prizes have gone to Sephardic Jews: Baruj Benacerraf, Salvador Luria and Rita Levi-Montalcini (all in Medicine), Claude Cohen-Tannoudji (Physics), and Elias Canetti (Literature). This may not seem many compared with the Ashkenazi ‘score’ (over a hundred), but in relation to the small size of the Sephardic population – probably less than a million worldwide, if we interpret the term strictly as meaning Jews of Spanish and Portuguese descent – it is very respectable.

Posted by David B at 02:53 AM

• Category: Science 

I posted recently on altruistic punishment. My attention has been drawn to another study showing that altruism can be favoured by individual selection:

A. Sanchez and J. Cuesta: ‘Altruism may arise from individual selection’, Journal of Theoretical Biology, 21 July 2005 (forthcoming), 235, 233-40.

Abstract: The fact that humans cooperate with non-kin in large groups, or with people they will never meet again, is a long-standing evolutionary puzzle. Altruism, the capacity to perform costly acts that confer benefits on others, is at the core of cooperative behavior. Behavioral experiments show that humans have a predisposition to cooperate with others and to punish non-cooperators at personal cost (so-called strong reciprocity) which, according to standard evolutionary game theory arguments, cannot arise from selection acting on individuals. This has led to the suggestion of group and cultural selection as the only mechanisms that can explain the evolutionary origin of human altruism. We introduce an agent-based model inspired on the Ultimatum Game, that allows us to go beyond the limitations of standard evolutionary game theory and show that individual selection can indeed give rise to strong reciprocity. Our results are consistent with the existence of neural correlates of fairness and in good agreement with observations on humans and monkeys.

The full text requires subscription, but a substantially similar text is available as a free pdf here. I won’t comment in detail, but the model presented in this paper is rather simpler than the one in J. Fowler’s paper.

As I said in my previous post, I am sceptical about much of this game-theoretical work. It starts from the false assumption that (in Fowler’s words) ‘Human beings frequently cooperate with genetically unrelated strangers whom they will never meet again, even when such cooperation is individually costly’.

In support of such statements the theorists cite results of experimental economics which show that in games like Ultimatum people seldom follow a rationally self-interested strategy. But the setting for these experiments is quite artificial, and tells us little or nothing about the circumstances in which altruistic behaviour evolved. As Robert Trivers has put it, ‘It’s absurd – and I use the word advisedly – to imagine that we’ve evolved to respond to the specific situations these economists put us in, with complete anonymity and no chance to interact with partners a second time’ (Science, 303, 2/2/2004, 1131).
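
For readers unfamiliar with the game, here is a minimal sketch of a single anonymous Ultimatum round in its standard textbook form. It is not the agent-based model of Sanchez and Cuesta, and the names `ultimatum_round`, `pot`, `offer` and `min_acceptable`, as well as the numbers used, are hypothetical.

```python
def ultimatum_round(pot: int, offer: int, min_acceptable: int) -> tuple[int, int]:
    """Return (proposer_payoff, responder_payoff) for one round.

    The proposer offers `offer` out of `pot`. The responder accepts any offer
    of at least `min_acceptable`; if the offer is rejected, both get nothing.
    """
    if not 0 <= offer <= pot:
        raise ValueError("offer must lie between 0 and the pot")
    if offer >= min_acceptable:
        return pot - offer, offer
    return 0, 0


# A purely self-interested responder accepts any positive offer (threshold 1).
print(ultimatum_round(pot=10, offer=1, min_acceptable=1))

# A responder who rejects low offers (hypothetical threshold 3) loses the
# offered amount in order to deprive the proposer of a larger one.
print(ultimatum_round(pot=10, offer=1, min_acceptable=3))
```

The second case is the behaviour the experimenters report: the responder pays a cost to punish a low offer, which is what a narrowly self-interested player would never do in a one-off anonymous round.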

I don’t think that statements like Fowler’s are true even in modern societies, but that doesn’t really concern me. Many human traits are maladaptive in modern circumstances. To understand the evolution of any species-wide human trait, what matters is what happened in 100,000 years or so of paleolithic times. During this period all humans were hunter-gatherers.

The behaviour of hunter-gatherers in modern times is not a safe guide to their behaviour in the paleolithic – for one thing, modern hunter-gatherers are confined to marginal habitats like deserts and deep jungle – but it is the best guide we have. Based on studies of modern hunter-gatherers (see references below for a selection), I would make the following generalisations:

1. Hunter-gatherers usually live in small bands of between 20 and 50 individuals. There are seldom more than about 12 adult males in a band. Bands have an established territory with rights recognised by other groups.

2. Most if not all of the individuals in a band are related to each other by blood or marriage. There is usually a core group of siblings (see e.g. Lee p.51).

3. A number of bands make up a tribe of between a few hundred and a few thousand individuals. Bands within the same tribe have much the same language, customs and kinship system.

4. Marriage is usually between members of different bands within the same tribe. It is regulated by rules of exogamy and endogamy, though these seem to have been stricter and more elaborate among the Australian aborigines than elsewhere. Marriage between people of different tribes is less common but does occur, especially when the tribes are small (see e.g. Elkin p.79). Most hunter-gatherers are moderately polygamous, with the older or more successful men having more than one wife. Among the aborigines women are monopolised by the older men (Maddock p.57-60).

5. Individuals often stay in the same band for life, except that members of one sex will move to another band when married. However, it is possible for individuals or families to move from one band to another, or even to another tribe, provided this is approved by the receiving group (e.g. Spencer and Gillen, p.68). Bands as a whole may split up or merge in response to fluctuations of population or resources.

6. Day-to-day contacts are mainly within the same band. Members of different bands will sometimes visit, e.g. to meet relatives or to discuss marriage arrangements. This seems especially common among the Bushmen (Lee p.259). There may also be collective meetings between bands, such as the famous aboriginal corroborees (Elkin p.61).

7. Within a band, there is usually a good deal of cooperation, such as food-sharing. Band society has been described as ‘egalitarian’ (Boehm) but this should not be exaggerated. Individuals may differ greatly in prestige, influence, and number of relations. There are also differences in number of wives and reproductive success.

8. Relations between different bands range from friendly to hostile. Disputes can arise over territory, access to water or seasonal surpluses of food, and over women. Belief in witchcraft is also a major source of trouble. Disputes can lead to prolonged feuds and a series of tit-for-tat killings.

9. The cardinal principle of relations between bands is reciprocity (see e.g. Lee p.335-7). Reciprocity is a means of sharing fluctuating resources in a harsh environment. If a band is e.g. allowed access to another band’s territory to hunt, they will be expected to return the favour in due course.

10. Contrary to some assumptions, different hunter-gatherer tribes are not always in a state of hostility with each other. Tribes do not act as political or military units, so most bands will not have much contact with other tribes unless their territories border each other. Neighbouring bands from different tribes have relations ranging from friendly to hostile just like bands within the same tribe. People in border areas are often bilingual or have a common ‘pidgin’ dialect, and there is often trade and intermarriage between the tribes.

From these points I think it will be clear that the claim that ‘human beings frequently cooperate with genetically unrelated strangers whom they will never meet again’ is simply not true for hunter-gatherers. Within the band, cooperation is between individuals who are usually related, and never strangers. Between bands or tribes, cooperation is based on customary expectations of reciprocity. Failure to meet these expectations will result in withdrawal of cooperation or more active reprisals.

This is the background against which human behaviour has evolved. To the limited extent that behaviour is really altruistic, I see no reason to appeal to evolutionary mechanisms other than inclusive fitness, reciprocal altruism (Trivers), or simple tit-for-tat cooperation (Axelrod and Hamilton). As to ‘altruistic punishment’, I doubt that there is much of that in hunter-gatherer societies, but I may come back to that.


C. Boehm: Hierarchy in the Forest, 2001
A. P. Elkin: The Australian Aborigines, 1964
L. Hobhouse, G. Wheeler, M. Ginsberg: The Material Culture and Social Institutions of the Simpler Peoples, 1915/1965
E. Leacock and R. Lee (eds): Politics and History in Band Societies, 1982
R. B. Lee: The !Kung San, 1979
K. Maddock: The Australian Aborigines, 1973
B. Spencer and F. Gillen: The Native Tribes of Central Australia, 1899
E. Westermarck: The Origin and Development of Moral Ideas, 1924

Posted by David B at 02:49 AM

• Category: Science 

Nice piece here on Carl Zimmer’s blog about the evolution of beetle horns.

Posted by David B at 06:58 AM

• Category: Science 

A nice report here for anyone interested in the Darwin family.

Posted by David B at 11:54 AM

• Category: Science 

Having once fallen for the notorious Monty Hall puzzle, I was interested to find a similar-looking problem in John Maynard Smith’s Mathematical Ideas in Biology (1968):

Of three prisoners, Matthew, Mark and Luke, two are to be executed, but Matthew does not know which. He therefore asks the jailer ‘Since either Mark or Luke are certainly going to be executed, you will give me no information about my own chances if you give me the name of one man, either Mark or Luke, who is going to be executed’. Accepting this argument, the jailer truthfully replied ‘Mark will be executed’. Thereupon, Matthew felt happier, because before the jailer replied his own chances of execution were 2/3, but afterwards there were only two people, himself and Luke, who could be the one not to be executed, and so his chance of execution is only 1/2. Is Matthew right to feel happier?

JMS says ‘This should be called the Serbelloni problem since it nearly wrecked a conference on theoretical biology at the Villa Serbelloni in the summer of 1966’.

So: is Matthew right to feel happier?

At the end of the book JMS simply gives the answer to the problem as ‘No.’ But in the text (page 70) he says that the problem ‘yields at once to common sense or to Bayes’ theorem’.

Common sense is sadly unreliable in such cases, but let us take the hint about Bayes’ theorem. This provides a means of calculating the probability that a hypothesis is true in the light of a piece of evidence. In this case the hypothesis is ‘Matthew will be executed’, and the evidence is the jailer’s statement that ‘Mark will be executed’.

Bayes’ theorem can be expressed as h|e = (h x e|h)/e, where h|e is the probability that the hypothesis is true in the light of the evidence, h is the antecedent probability that the hypothesis is true, e|h is the conditional probability that the evidence will be observed if the hypothesis is true, and e is the probability that the evidence will be observed whether or not the hypothesis is true.

The name of Bayes is often associated with subjectivist interpretations of probability, but the terms in Bayes’ theorem can be given frequentist interpretations. The expression (h x e|h)/e can then be interpreted as the long-term frequency with which both the hypothesis is true and the evidence is observed as a proportion of all cases in which the evidence is observed, in a large number of similar cases.

To calculate h|e in the Serbelloni case we therefore need to know the antecedent probability that Matthew will be executed, the conditional probability that the jailer will say Mark is to be executed if Matthew is to be executed, and the unconditional probability that the jailer will say Mark is to be executed, whether or not Matthew is to be executed.

Unfortunately in the problem as stated these probabilities are not explicit, but we may reasonably assume from the wording that:

a. Initially each combination of possible outcomes is equally probable. There are three combinations: Matthew and Mark to be executed; Matthew and Luke to be executed; and Mark and Luke to be executed. Each prisoner therefore has a 2/3 chance of being executed and a 1/3 chance of surviving. Therefore h = 2/3.

b. If Mark and Luke are both to be executed, the jailer will give Matthew the name of one of them at random, with probability 1/2.

c. The jailer will not lie.

With these assumptions, e|h is 1/2. The condition h is fulfilled if Matthew and Luke are to be executed, or if Matthew and Mark are to be executed. By assumption these events are equally probable. If Matthew and Luke are to be executed, the jailer cannot tell Matthew that Mark is to be executed (since the jailer does not lie). If Matthew and Mark are to be executed, the jailer is bound to say that Mark is to be executed. The conditional probability e|h is therefore (1/2 x 0) + (1/2 x 1) = 1/2.

The unconditional probability of e is also (coincidentally) 1/2. There are two circumstances in which the jailer will say that Mark is to be executed: either, with probability 1/3, Matthew and Mark are to be executed, in which case e has a probability of 1, or, also with probability 1/3, Mark and Luke are to be executed, in which case there is a probability of 1/2 that the jailer will choose to give Mark’s name. The total probability of e is therefore (1/3 x 1) + (1/3 x 1/2) = 1/2.

Putting the various components together, we get h|e = (2/3 x 1/2)/(1/2) = 2/3. The antecedent probability of 2/3 that Matthew will be executed therefore is not changed by the information he is given, and he is wrong to feel happier.
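
The calculation can be checked with exact fractions. The sketch below assumes only the three assumptions a to c listed earlier; the names `P_SAYS_MARK` and `posterior_executed` are invented for this illustration.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)  # assumption a: each doomed pair is equally probable

# For each doomed pair, the probability that the jailer names Mark.
# Assumption c (no lying) gives the 1 and the 0; assumption b gives the 1/2.
P_SAYS_MARK = {
    ("Matthew", "Mark"): Fraction(1),
    ("Matthew", "Luke"): Fraction(0),
    ("Mark", "Luke"): Fraction(1, 2),
}


def posterior_executed(prisoner: str) -> Fraction:
    """h|e = (h x e|h)/e, where h is 'prisoner will be executed'
    and e is 'the jailer says Mark will be executed'."""
    e = sum(THIRD * p for p in P_SAYS_MARK.values())
    h = sum(THIRD for pair in P_SAYS_MARK if prisoner in pair)
    h_and_e = sum(THIRD * p for pair, p in P_SAYS_MARK.items() if prisoner in pair)
    e_given_h = h_and_e / h
    return h * e_given_h / e


print(posterior_executed("Matthew"))
```

Because the three terms h, e|h and e are computed separately, each can be compared with the corresponding step in the argument above.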

This may seem at first sight inconsistent with the Monty Hall puzzle, where the probabilities do change when Monty opens one of the doors. But the two cases can be reconciled if we consider the probability that Luke will be executed. Suppose for example that Matthew wants to place a bet on who will survive, with the proceeds to go to his widow if he does not survive himself. Initially there is an equal 1/3 probability that each of the prisoners will survive. But if Matthew persuades the jailer into giving him Mark’s name as described in the Serbelloni problem, then the probability that Luke will survive increases to 2/3, and Matthew (or rather his widow) will stand to gain by betting on Luke’s survival rather than his own (given the same betting odds), since Matthew himself still only has a 1/3 chance of survival. This can also be shown using Bayes’ theorem, but I leave that as an exercise for the reader!
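
For readers who want to check their answer, one way of setting out the exercise is as follows. Here s stands for the hypothesis ‘Luke will survive’ (the symbol is introduced only for this illustration), e is the jailer's statement as before, and assumptions a to c are taken as given. Luke survives only if Matthew and Mark are the pair to be executed, in which case the jailer is bound to name Mark, so e|s = 1.

```latex
\[
  s \mid e \;=\; \frac{s \times (e \mid s)}{e}
           \;=\; \frac{\tfrac{1}{3} \times 1}{\tfrac{1}{2}}
           \;=\; \frac{2}{3}
\]
```

This agrees with the 1/3 chance of survival that remains for Matthew, since Mark's chance of survival is now zero and the three chances must sum to 1.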

Added on June 7:

I have a suggestion for visualising the problem:

– draw three lines of equal length (say, 2 inches) in pencil. These can represent the number of times each of the three outcomes (Matt/Mark, Matt/Luke, Mark/Luke) will occur in a large number of similar cases

– now rub out parts of the lines representing those cases where the jailer does not say that Mark is to be executed. This means you have to rub out the whole of the Matt/Luke line and half of the Mark/Luke line.

– you are left with the whole of the Matt/Mark line (2 inches) and half of the Mark/Luke line (1 inch), totalling 3 inches. These represent the possibilities remaining after the jailer’s announcement. Mark is executed in all of these cases, Matt in 2/3 of them, and Luke in 1/3, which is consistent with the Bayesian conclusion.
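
The same ‘rubbing out’ can be imitated by simulating a large number of similar cases. This is a rough sketch under assumptions a to c; the function name, the number of trials and the random seed are arbitrary choices made here.

```python
import random


def simulate(trials: int = 100_000, seed: int = 0) -> dict[str, float]:
    """Among cases where the jailer names Mark, return the proportion
    in which each prisoner is executed."""
    rng = random.Random(seed)
    pairs = [("Matthew", "Mark"), ("Matthew", "Luke"), ("Mark", "Luke")]
    kept = 0
    executed = {"Matthew": 0, "Mark": 0, "Luke": 0}
    for _ in range(trials):
        pair = rng.choice(pairs)  # the three 'lines' are equally long
        # The jailer truthfully names a doomed prisoner other than Matthew,
        # choosing at random when both Mark and Luke are doomed.
        named = rng.choice([name for name in pair if name != "Matthew"])
        if named != "Mark":
            continue  # this case is 'rubbed out'
        kept += 1
        for name in pair:
            executed[name] += 1
    return {name: count / kept for name, count in executed.items()}


for name, freq in simulate().items():
    print(f"{name}: executed in {freq:.3f} of the remaining cases")
```

With a large number of trials the proportions should settle close to 2/3 for Matthew, 1/3 for Luke and 1 for Mark, matching the lengths of the pencil lines that are left.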

As a matter of historical interest, the Serbelloni problem seems to pre-date the Monty Hall puzzle. The latter became notorious around 1990, though I have found a reference to it in the 1980s. But the Serbelloni problem goes back at least to 1966, the date of the conference. However, in a slightly different form it is also to be found in a puzzle by Martin Gardner in 1959. In Gardner’s version the prisoner and the jailer have an argument about whether the jailer should give the prisoner the information.

Posted by David B at 03:05 AM

• Category: Science 

In my post yesterday I mentioned J B S Haldane’s classic 1957 paper on The cost of natural selection.

I find that this is available as a free pdf download here. Several other classic texts are available from the same source.

As I mentioned in my post, Haldane’s conclusions have been much modified by later geneticists. It is generally agreed that natural selection can be faster than Haldane supposed, though different models produce very different results. I mention this now because I see from my Google search that Haldane’s paper is often cited by anti-evolutionists as showing that the evolution of man by natural selection is impossible.

Posted by David B at 04:51 AM

• Category: Science 

In a recent post on Altruistic Punishment I remarked that ‘group selection should be regarded as an explanatory last resort’.

I didn’t intend this to be a controversial proposition, but as some comments challenged it, I will give some reasons for regarding group selection as a last resort.

I will assume that the traits favoured by group selection (whatever that means) are not independently favoured by selection on individuals. The whole point of the group selection debate is about how to explain traits that appear contrary to individual selection.

[Added on June 2: Of course, I am not denying that in some circumstances a trait could be favoured by selection at both individual and group level, but this is not very interesting, and it is not what the group selection controversy is about. As George C. Williams put it: ‘Many biologists have implied, and a moderate number have explicitly maintained, that groups of interacting individuals may be adaptively organized in such a way that individual interests are compromised by a functional subordination to group interests. It is universally conceded by those who have seriously concerned themselves with this problem that such group-related adaptations must be attributed to the natural selection of alternative groups of individuals and that the natural selection of alternative alleles within populations will be opposed to this development… A group in this discussion should be understood to mean something other than a family and to be composed of individuals that need not be closely related’ (Adaptation and Natural Selection, 1966, p.92). If people insist on using the term ‘group selection’ for cases where individual selection within the group is favourable, or at least neutral, towards the trait in question, then I think they should at least use some additional qualifying term. I suggest that it should be called Trivial Group Selection, by analogy with mathematical usage, where problems often have ‘trivial’ solutions for some values of the variables. For example, the equation in Fermat’s Last Theorem has solutions for x = y = z = 0, but Andrew Wiles would have won no prizes for pointing this out.]

Beginning with some points of methodology:

1. By their fruits ye shall know them (Matthew 7: 20). In assessing any scientific theory we should ask whether it has proved fruitful. In Imre Lakatos’s terms, is it a progressive research programme? Is it generating testable empirical hypotheses and predictions, and if so, are they turning out successfully? Modern group selection theories have been under discussion for over 30 years now, but apart from M J Wade’s experiments on flour beetles (which present the most favourable circumstances for group selection) they seem to have generated little in the way of empirical data. If we look back further in time, group selectionism of the old kind was not only unfruitful but an obstacle to the progress of evolutionary theory. As John Maynard Smith said some years ago in a critique of Elliott Sober, until the 1960s ‘biology was riddled with “good-of-the species” thinking. Again and again, one met in the literature explanations of some trait… in terms of the benefit that the trait conferred on the species, or even on the ecosystem as a whole. It was clear to me, as it must have been clear to George Williams, that no progress would be made toward understanding the evolution of such traits until this kind of thinking was ended… If Sober’s way of describing the world is taken seriously, it will again cease to be obvious, and someone (not me, next time) will have the job to do over again’ (ref.1).

2. Theories are not to be multiplied without necessity. Since we clearly do need selection at individual level, it would be more parsimonious if we can also use it to explain the phenomena (such as ‘altruistic punishment’) for which group-selectionist hypotheses are being offered. We should look carefully for individual-selection solutions, such as inclusive fitness or game theory, before resorting to group selection.

3. Models of group selection are more complex than those of ordinary selection, because they have to incorporate at least two levels of selection and the interactions between them. They also tend to involve unfamiliar concepts and terminology. The objection to this is not just the effort involved in learning and applying any complex new theory, but the difficulty of interpreting the results. For an example see here.

4. Much of the argument about levels of selection has been about words rather than facts. The term ‘group selection’ by itself can cover many quite different phenomena. In considering any proposed application of group selection to some trait, we need to ask:

– is the trait cultural or genetic?
– is it a trait of a group or of individuals?
– is differential reproduction occurring at the level of the group (e.g. by group extinction or multiplication) or at the level of individuals?
– if a trait of individuals varies in frequency between different groups, how does the variance in frequency arise, and how is it maintained in the face of migration and interbreeding?
– is the trait selected by virtue of its own effects on the fitness of groups or individuals, or as a by-product of other properties?

To illustrate the last point, it is obvious that some human cultural traits have spread because they happen to be linked to successful political entities, the success of which has little or nothing to do with the traits in question. For example, the English language is widely spoken throughout the world, while Italian is not, but no-one will suppose that this has much to do with the intrinsic merits of English compared to Italian. Some authors distinguish between ‘selection’ and ‘sorting’ of traits, where ‘sorting’ does not imply any intrinsic fitness advantage of the trait concerned. In these terms English has spread through sorting but not selection. There is no doubt that ‘sorting’ in this sense does occur, but it is not very interesting, and we don’t need any elaborate theory to explain it. It is not what the group selection controversy is about.

Depending on the answers given to the questions above, many different versions of group selection can be proposed, some of which are more plausible than others. Watch out for the old ‘bait-and-switch tactic’. Also watch out for the use of the label ‘group selection’ for phenomena that would not be generally recognised as group selection at all. Notably, if a trait is selected by virtue of its fitness effects within local concentrations of genetically related individuals, many would call this kin selection rather than group selection.

Moving on to objections of substance, it is difficult to generalise because different objections apply to different versions of group selection:

5. In the older theories of group selection, differential reproduction was usually defined in terms of the extinction or multiplication of groups as a whole. The main difficulty with this is that a group would normally have a ‘generation time’ much longer than that of individual organisms. If the groups contain genetically diverse individuals, the traits favoured by individual selection would therefore (other things being equal) increase in frequency more quickly than those favoured by group selection. In order for group selection to prevail, the groups have to be very pure to begin with, mutation and migration rates have to be low, and/or the group selection effect per ‘generation’ has to be strong compared with individual selection. If we are dealing with human evolution, there is also the problem that there just may not have been enough time for extinction or multiplication of groups to have had much effect. Cultural evolution in particular often occurs within a group ‘lifetime’, so it is very difficult to see how group selection of this kind could explain it.

6. In the newer theories of group selection (from about 1975 onwards), the selective process usually depends on the differential reproduction of individuals rather than groups. In these theories groups as such do not ‘die’ or ‘give birth’, but individuals vary in fitness according to the frequency of certain traits within groups. The ‘group selection’ effect therefore depends on covariance between fitness and frequency of the trait within groups. Price’s Equation, which I discussed here, provides a framework within which this can be analysed. The difficulty for group selectionists is to explain how sufficient covariance is maintained despite migration and mixing of groups. Generally speaking, models of this kind only work if groups are small and fairly isolated. It then becomes difficult to distinguish group selection from kin selection, since the members of the group will often be related to each other.
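
To make the role of covariance concrete, here is a toy sketch of the usual between-group and within-group partition of the covariance term in Price's Equation. Everything specific in it is an assumption made for illustration and is not taken from any of the models discussed: the linear fitness function, the benefit and cost values, equal group sizes, and offspring that copy their parent's trait exactly. The function names are likewise invented.

```python
from fractions import Fraction


def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def covariance(xs, ys):
    """Population covariance: mean of products minus product of means."""
    xs, ys = list(xs), list(ys)
    return mean(x * y for x, y in zip(xs, ys)) - mean(xs) * mean(ys)


def fitness(is_altruist, altruist_share, benefit, cost):
    # Hypothetical fitness: everyone gains from altruists in their own group,
    # but only altruists pay the cost.
    return 1 + benefit * altruist_share - cost * is_altruist


def price_partition(groups, benefit=Fraction(2), cost=Fraction(1, 4)):
    """Return (between_groups, within_groups, total) covariance of fitness
    and trait. `groups` is a list of equal-sized lists of 0/1 trait values."""
    assert len({len(g) for g in groups}) == 1, "sketch assumes equal group sizes"
    traits, fits, group_z, group_w, within = [], [], [], [], []
    for members in groups:
        share = mean(members)
        w = [fitness(z, share, benefit, cost) for z in members]
        traits.extend(members)
        fits.extend(w)
        group_z.append(share)
        group_w.append(mean(w))
        within.append(covariance(w, members))
    return covariance(group_w, group_z), mean(within), covariance(fits, traits)


# Altruists concentrated in one group, versus evenly mixed groups.
for label, groups in [("concentrated", [[1, 1, 1, 0], [1, 0, 0, 0]]),
                      ("mixed", [[1, 1, 0, 0], [1, 1, 0, 0]])]:
    between, within_avg, total = price_partition(groups)
    print(label, "between:", between, "within:", within_avg, "total:", total)
```

In this toy case the within-group term is negative whenever a group contains both types, and the between-group term is positive only to the extent that the groups differ in their frequency of altruists. When the groups are fully mixed the between-group term vanishes, which is the difficulty described above.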

7. Group selection is complicated. Cultural evolution is also complicated. When people try to combine the two in ‘cultural evolution by group selection’ I despair. I commented at length on the difficulties here, over 2 years ago. I wouldn’t still agree with all the details, but I don’t see any reason to change the general thrust. There are just too many differences between genetic and cultural evolution for the analogy to be useful. And one of the few actual examples of cultural group selection that is ever used by the Groupies – the Nuer Conquest – turns out on investigation to be much weaker than they claim.

8. As far as I recall, I have never seen a model of group selection that deals with the selection of more than one trait at a time. This is a problem even for selection on individuals. In a classic 1957 paper on the ‘cost of natural selection’ J B S Haldane estimated the number of ‘selective deaths’ (or failures to reproduce) that would be necessary in order for a rare advantageous gene to spread to fixation, and concluded that with plausible levels of advantage it would take about 300 generations per locus. If many loci were under selection simultaneously the process would take much longer. Later geneticists argued that Haldane’s assumptions were oversimplified, and that with more complex (but biologically reasonable) assumptions about the mode of selection it would be possible for a larger number of genes to be fixed more quickly (see e.g. refs. 2 and 3). But the problem is more severe for group selection because most models require the effect of a trait on group fitness to be large, in order to overcome countervailing individual selection. If a group has more than a few favourable traits under selection simultaneously, the combined effect on fitness could be unfeasibly large. There is also the difference that in individual selection a species will typically contain millions of individuals, much more than the number of loci under selection (which cannot be more than the number in the species genome). In contrast, with group selection the number of traits (cultural or genetic) under selection may be of the same order of magnitude as the number of competing groups (probably no more than a few hundred in the same geographical area). This poses additional problems for the ‘cost of selection’. I don’t know if the problems are insuperable, but my guess is that group selection can only effectively promote a few traits at a time.
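The 300-generation figure in point 8 can be reproduced as simple arithmetic. The sketch below assumes the representative values usually quoted from Haldane's 1957 paper (a total cost of about 30 times the population size per substitution, and a selection intensity of 0.1 per generation). Neither number is given in the text above, and the function name is invented for illustration.

```python
def generations_per_substitution(cost, intensity):
    """Generations needed to 'pay' the cost of one gene substitution.

    cost: total selective deaths, as a multiple of population size (assumed ~30).
    intensity: fraction of each generation lost to selection (assumed 0.1).
    """
    return cost / intensity


per_locus = generations_per_substitution(cost=30, intensity=0.1)
print(round(per_locus))  # 300

# If the same total intensity has to be shared among several loci, substitutions
# average one per 300 generations, so n loci take roughly n times as long.
for n_loci in (1, 10, 100):
    print(n_loci, round(n_loci * per_locus))
```

On these assumptions the constraint is the total intensity of selection a population can bear, which is why adding more traits under simultaneous selection lengthens the process rather than running it in parallel for free.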

It is still conceivable that group selection might account for one or a few particularly important human traits, such as language, social conformism, or collective punishment, which would then have a multitude of secondary effects by creating new conditions for evolution by individual and kin selection. But I hope we can do without it, and the onus of proof is on those who say we cannot.

Ref. 1: John Maynard Smith, ‘Reply to Sober’, in The Latest on the Best: Essays on optimality and evolution, ed. John Dupré, 1987.
Ref. 2: John Maynard Smith, ‘ “Haldane’s Dilemma” and the rate of evolution’, Nature, 1968, 219, 1114-1116.
Ref. 3: P. O’Donald, ‘ “Haldane’s Dilemma” and the rate of natural selection’, Nature, 1969, 221, 815-7.

Posted by David B at 06:41 AM

• Category: Science 

Dienekes recently drew attention to an important forthcoming article on altruistic punishment. The article has now appeared: James H. Fowler: Altruistic punishment and the origin of cooperation, Proc. National Academy of Sciences, May 10 2005, vol. 102, 7047-49. It is available as a free pdf download here.

So what is altruistic punishment and why is it important?

One of the central problems of human evolution is to explain the widespread existence of cooperation. Such cooperation often produces a benefit for social groups as a whole, but at a cost to the cooperating individuals. So there is an advantage for ‘free-riders’ who take the benefits but avoid the costs. Why then does cooperation not break down?

The problem can obviously be solved if free-riders are punished. But those who do the punishing incur a cost in doing so (e.g. the risk of retaliation), greater than their individual gain, hence the term ‘altruistic punishment’. So why should anyone punish?

Again there is an obvious solution if failure to punish is itself punished. We can avoid an infinite regress of special rules about punishment by adopting a general social rule to the effect ‘punish all breaches of social rules’. Since this is itself a social rule, failure to punish is also a breach of a social rule, which must therefore be punished, and so on.

Such a rule is an evolutionarily stable strategy provided a large enough proportion of the population are already following it. But it is difficult to see how the rule could become established in the first place. Several theorists have resorted to an explanation by group selection: in small groups a large proportion of ‘punishers’ may be established by chance, and these groups then spread at the expense of other groups.

Group selection should be regarded as an explanatory last resort. The importance of Fowler’s paper is that it provides an alternative explanation of altruistic punishment based on individual selection. If the only strategies allowed are ‘cooperate’, ‘free ride’, and ‘cooperate and punish’, then ‘cooperate and punish’ will be more costly than the alternatives when it is rare, and it cannot spread by individual selection. The crucial feature of Fowler’s model is that it allows another strategy: individuals can opt out of group activities when the benefit of cooperation is lower than that of individual activity. If there are too many free-riders, more individuals will opt out and ‘do their own thing’. But when most of the population have opted out, the strategy of ‘cooperate and punish’ may have an advantage over ‘opt out’, which allows it to spread even when it is rare. The strategy ‘cooperate’ (but not punish) will spread more quickly at first, but once ‘cooperate and punish’ has passed a certain critical frequency (which depends on the parameters) it gains over ‘cooperate’ until it becomes the prevalent strategy. So ‘cooperate and punish’ cannot spread when it is rare in a population consisting only of ‘cooperators’ and ‘free-riders’, but the existence of the ‘opt-out’ strategy gives it an entry point.

Fowler considers a number of possible objections to his model. I am not sure that the model is very plausible, but it is no worse in this respect than the group-selectionist alternatives. It does at least mean that the groupies can no longer claim there is no alternative.

Personally I think that both approaches are misconceived. The basic flaw is encapsulated in the first sentence of Fowler’s paper: ‘Human beings frequently cooperate with genetically unrelated strangers whom they will never meet again, even when such cooperation is individually costly’.

Well, no, they don’t. Even in modern, well-regulated societies such cooperation is unusual. (When did you last do it?) In hunter-gatherer societies, which prevailed for most of human evolutionary history, it is practically unknown. But I will expand on my objections in another post.

Posted by David B at 11:56 AM

• Category: Science 

I noticed a news report this week that the population of France was predicted to rise from 60 million to 75 million by 2050. As I found this surprising, I tried to find out more.

For those who can read French, the fullest report seems to be here.

For the benefit of others, the key points are as follows.

The population of France is currently just over 60 million. The French authorities have previously forecast that it would rise to 64 million by 2050. But using new population data from local registration they now think this is an underestimate. A French Government Minister has talked of 75 million, but this is not an official estimate. Some demographers think it is too high, and 70 million is more plausible.

The rising trend is attributed to three main factors:

– the birth rate is higher than expected

– life expectancy is continuing to increase

– net immigration is continuing and is also higher than expected.

Some readers will suspect that immigration is the main factor, but this doesn’t seem to be the case. Net annual immigration is running at about 100,000, which if it continues for 45 years (and assuming the immigrants reproduce themselves) would only account for 4.5 million. Moreover, net immigration of 50,000 a year had already been allowed for in population forecasts, so the higher observed rate only accounts for about 2.25 million of the increase in the estimate. Incidentally, it is said that immigration from Britain is an important element in the increase. There has certainly been a stream of people moving from Britain to rural France, escaping high house prices, yobs, and the general deterioration of the physical and social environment. There is hardly a middle-class Englishman who doesn’t dream of boules, brie and baguettes in the sunshine. But most of the British migrants are retired or ‘downshifting’.
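The arithmetic behind these immigration figures is easy to check. The sketch below uses only the numbers quoted in the paragraph above (100,000 observed, 50,000 already allowed for, 45 years); the variable names are my own, and it ignores any natural increase among the migrants.

```python
YEARS = 45                  # horizon used in the text (to 2050)
OBSERVED_PER_YEAR = 100_000  # current net annual immigration
FORECAST_PER_YEAR = 50_000   # net immigration already built into earlier forecasts

# Total contribution of immigration if the observed rate continues.
total_from_immigration = OBSERVED_PER_YEAR * YEARS

# Only the excess over the earlier assumption can explain a higher forecast.
extra_over_forecast = (OBSERVED_PER_YEAR - FORECAST_PER_YEAR) * YEARS

print(total_from_immigration)  # 4500000
print(extra_over_forecast)     # 2250000
```

Set against a revision from 64 million to 70 million or more, the 2.25 million excess accounts for well under half of the change.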

The most interesting factor is the increase in the birth rate. The ‘indice conjoncturel de fécondité’, which seems to be equivalent to the Total Fertility Rate, has increased from 1.78 in 1998 to 1.92 in 2004. Higher birth rates among immigrants only account for a small part of this. The main factor seems to be that women who have postponed childbearing into their 30s are now ‘bearing fruit’. I pointed out some time ago (here) that changing patterns of conception made TFRs unreliable.

There is some uncertainty how far the birth rate will increase. The ‘optimists’ think it will reach a TFR of 2.1, which is about sufficient for replacement, while others are more cautious. This is the reason for the differing forecasts.

One must remember that in France the birth rate has traditionally been a matter of national pride and concern. Governments have always been ‘pro-natalist’, against the background that since 1800 French population grew much more slowly than that of Britain and Germany. French commentators are now gleeful that if the rising French trend continues, while in Germany the birth rate stays low, the population of France will be bigger than that of Germany by 2050. This is premature, as trends cannot be reliably forecast that far ahead, and an improvement in the German economy could draw in large immigration from eastern Europe.

For comparison, the population of the UK is also just over 60 million and is officially forecast to rise to 65 million by 2050. But at current rates of net immigration to the UK this level would be reached by 2030. Much of the recent migration is from Eastern Europe, especially Poland, and is proving generally popular, as the migrants are polite, hard-working, and well-behaved. Let’s hope they don’t send their children to British schools and spoil all that!

As in France, the birth rate is also increasing in Britain. According to the Office for National Statistics, ‘if the provisional 2004 patterns of fertility were to remain unchanged, as represented by the total fertility rate (TFR), then an average of 1.79 children would be born per woman. This is the highest rate since 1992 (1.80) and continues the gradual increase from a low point in 2001 when the TFR was 1.63’.

I don’t personally welcome an increase in the population of England, which is already over-populated. France, on the other hand, is relatively lightly populated, and could easily accommodate an increase to 70 million or so.

Posted by David B at 06:40 AM

• Category: Science 

The NYT has a series of articles this week on Class in America.

I haven’t had time to read much yet but thought I’d mention it. Here is the first overview article. (Free registration may be required.)

Posted by David B at 10:52 AM

• Category: Science 

The most alarming programme on British TV this week was an ITV report on the happy slapping craze. Moronic teenage thugs apparently find it amusing to slap complete strangers around the face while recording the incident on their video phones. In the worst cases it goes beyond slapping to punches and kicks, or even – in one case – setting a sleeping drunk alight.

I found a couple of goodish press reports on the subject, one here in a Scottish newspaper, and one in the Observer. As always, it is difficult to tell how far this is a real phenomenon and how far a media panic, of the kind that has recurred periodically since the ‘Mohock’ craze of 18th century London.

Another issue that the two newspaper articles are conspicuously silent about is the ethnic aspect. In the ITV report most of the assaults seemed to be by blacks on whites. On the other hand, some of the culprits were certainly white, and in the setting-on-fire case the two culprits (now serving an inadequate 6-year sentence) were both white. I suspect that this is another case (like crack, gangsta rap, and ‘hoodies’) where a fashion has started among young blacks and spread throughout the underclass of feral teenage scum (if you’ll excuse a non-value-neutral expression).

Posted by David B at 06:52 AM

• Category: Science 

Another fine essay by Theodore Dalrymple here.

(BTW the name is a pseudonym, so no relation to William.)

Comment from Razib: Read it!

Posted by David B at 02:59 PM

• Category: Science 

In a recent post I mentioned that there was an error in Jobling, Hurles and Tyler-Smith’s (generally very good) book on Human Evolutionary Genetics. Discussing an important measure of genetic distance (Nei’s D), the authors state that D varies between 0 and 1. I was fairly sure that this was wrong, as D is minus the log of a fraction, and the fraction can vary between 1 and 0, so it therefore seems that D can vary from 0 to infinity (that is, it increases without limit as the fraction approaches 0). I was a bit nervous about pointing this out, as I don’t like to disagree with those more expert than myself, so I was pleased to find my belief confirmed in another book, Speciation, (2004) by Jerry Coyne and Allen Orr, who say clearly (page 73) that D can range from 0 to ∞.

This isn’t by any means the first time I’ve noticed an error in a textbook. It is particularly annoying to find mathematical errors in a book aimed at students and other non-specialist readers. An expert in the field will probably be able to see immediately that something puzzling is just a careless slip or printing error, whereas a student may spend a long time trying to make sense of it, and worrying that they are missing something. I suggest that textbook authors and publishers should be fined $1,000 for every error of this kind!

Addendum from Razib: AJHG has a review of the Jobling, Hurles and Tyler-Smith book, as does Henry Harpending. I enjoyed it a great deal, though advanced readers might find it more interesting as a lead/source for a wide range of papers and texts on topics that pique their curiosity as opposed to a nuts & bolts primer. If you want something more technical, I recommend Genetics of Human Populations by Bodmer and Cavalli-Sforza; the data is out of date, but the equations are all good (and the price is really phat if you get it used).

Posted by David B at 06:30 AM

• Category: Science 

A biologist claims to have discovered a new family of rodents in Laos (or strictly, a new species of rodent which has been classified in its own family).

See the NYT report here. (Free registration may be required.)

Posted by David B at 02:38 AM

• Category: Science 

I have deleted my own post ‘A Kick In The Ballots’, as it seems to have attracted ire for reasons that are not clear to me – and by now I am sure everyone knows the results of the British elections anyway.

Posted by David B at 04:10 AM

• Category: Science 

I watched Dr Tatiana’s Sex Advice To All Creation this week on British TV (Channel 4). It was a co-production with Discovery Channel Canada, and I think it is scheduled to appear in Canada soon. It isn’t clear if it will be shown in the US – the sexual content and language may be too strong for Middle America.

I quite enjoyed the series – even the musical intervals – but I doubt that I would have learned anything from it if I knew nothing about the subject to begin with. I guess it will increase sales for Olivia Judson’s book, which may have been the main object of the exercise from her point of view.

Apart from the wildly gimmicky presentation, I found Olivia Judson’s voice distracting. She has the poshest voice I’ve heard on British TV since Hugh Laurie played Bertie Wooster. She’s so posh she makes the Duchess of Devonshire sound common. Her voice sounded a bit strained, and I wondered if she is really hiding the dark secret that she is – American! Well, her father is American, her brother is at MIT, and she took her first degree at Stanford, so I think she must have spent a lot of time in America – was she even born there? Maybe she is overcompensating for it with an exaggerated English accent. Just a suggestion…

Posted by David B at 05:07 AM

• Category: Science 

I can’t resist repeating the headline from this morning’s Sun!

Tony B. Liar has been re-elected, but with a sharply reduced majority. With most results now announced, the Labour majority in Parliament has been cut from around 180 to only about 60 – and many of the Labour MPs hate Tony even more than the Opposition.

Both Conservatives and Liberal Democrats have gained seats. The result is better for the Conservatives than the polls had predicted.

In a revolt against political correctness, a Labour candidate selected from a women-only shortlist in a Welsh constituency has been defeated by a popular local Labour man standing as an Independent. The national Labour Party machine thought their majority in the constituency was so large that they could impose an unwanted female candidate on the local party and get away with it. Wrong!

The bad news is that Ghastly George Galloway, Saddam’s buddy, has been elected by a heavily-Muslim electorate in Bethnal Green, defeating Oona King, the Black-Jewish former Labour MP.

Posted by David B at 03:04 AM

• Category: Science 

My post yesterday stopped just as I was getting to the most important bit: how diversity ‘between populations’ is measured. Here is the rest…

The concept of heterozygosity as a measure of diversity can be extended from a single population to two populations (or sub-divisions of a single population). [For what is meant by ‘heterozygosity’ in this context see the previous post, including Note 1.] Suppose we have two populations, A and B, with the same set of alleles at a given locus, but different frequencies. (I will assume in what follows that populations are of equal size, and that we are considering diversity at a single locus.) If we know the frequencies, it is easy to calculate the probability that two genes selected randomly at that locus, one from each population, are homozygous for a given allele. We can then total the probabilities for each allele and subtract the sum from 1 to get a figure for heterozygosity, just as in the case of a single population. We can call this HD, to indicate that it is heterozygosity between genes selected from the two different populations. (Warning!! some authors use the term HD for a different concept.)

Unless the frequencies for all alleles are the same in populations A and B, HD is bound to be higher than the average H within A and B separately. If we take the average frequency of an allele in the two populations, which we can call M, the frequency in one population will be M+d and in the other M-d. (The M’s and d’s may of course be different for different alleles, subject to the constraint that the M‘s must sum to 1 and the d‘s in each population must sum to 0.) The homozygosity between A and B for that allele will be (M+d)(M-d) = M^2 – d^2, so HD over all alleles will be 1 – ΣM^2 +Σd^2. The average homozygosity for an allele within the two populations will be M^2 + d^2 [see Note 4] so the average heterozygosity within the populations, over all alleles, must be 1 – ΣM^2 – Σd^2. This is 2Σd^2 less than HD; in other words, the heterozygosity between the two populations is greater by 2Σd^2 than the average heterozygosity within them. It might therefore seem natural to take 2Σd^2 as an indicator of the ’difference’, ’divergence’, ’diversity’, or ’distance’ between the two populations.
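The identity HD – HW = 2Σd^2 can be checked numerically. The following is a minimal sketch for one locus and two equal-sized populations; the function name and the example frequencies are invented for illustration.

```python
def between_and_within(freqs_a, freqs_b):
    """Return (HD, HW) for two equal-sized populations at one locus."""
    # Chance that one gene from A and one from B are the same allele.
    hom_between = sum(a * b for a, b in zip(freqs_a, freqs_b))
    # Chance that two genes from the same population are the same allele.
    hom_a = sum(a * a for a in freqs_a)
    hom_b = sum(b * b for b in freqs_b)
    hd = 1 - hom_between
    hw = 1 - (hom_a + hom_b) / 2
    return hd, hw


freqs_a = [0.5, 0.3, 0.2]
freqs_b = [0.3, 0.3, 0.4]
hd, hw = between_and_within(freqs_a, freqs_b)

# d is half the difference between the two frequencies of each allele.
sum_d2 = sum(((a - b) / 2) ** 2 for a, b in zip(freqs_a, freqs_b))

print(round(hd, 3), round(hw, 3))              # 0.68 0.64
print(round(hd - hw, 3), round(2 * sum_d2, 3))  # 0.04 0.04
```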

However, this is not how diversity between two populations is usually measured. Suppose we consider the two populations as subdivisions of a single larger population. The average frequency for an allele in the combined population is M, so the heterozygosity for two genes selected at random (at the same locus) within the combined population is 1 – ΣM^2. We can call this HT (with T for ‘total’) to indicate heterozygosity in the total combined population. [See Note 5] But the average heterozygosity within the two subpopulations is 1 – ΣM^2 – Σd^2 (see previous paragraph). If we call this HW (W for ‘within’), it will be seen that HT = HW + Σd^2. It can also be seen that HD – HT = Σd^2, which may be interpreted as the excess of heterozygosity when two genes are selected from different sub-populations, over and above its level if they are selected at random from the total population. It is natural for a geneticist to see this as analogous to the partitioning of variance for some trait into ‘within-group’ and ‘between-group’ components. We can therefore define between-group heterozygosity not as HD, or as HD – HW, but as HD – HT. If we call this HB (B for ‘between’), then HT = HW + HB. The heterozygosities within and between groups can of course also be expressed as proportions of total heterozygosity, in the form HW/HT and HB/HT. Since HB equals HT – HW, we can also express HB/HT as (HT – HW)/HT, or as 1 – HW/HT. These are all common expressions for Masatoshi Nei’s GST, introduced in 1973, which is probably the most widely used measure of diversity ‘between groups’ in population genetics. (Warning!! Different authors use different abbreviations for the various components.) GST can be calculated as Σd^2/(1 – ΣM^2). For the case of two alleles, it is equivalent to Sewall Wright’s FST, [note 6] which for two populations is d^2/pq [note 7]. Lewontin, in his original study of human genetic diversity, used a slightly different measure which produces similar results to GST or FST. GST can also be applied to cases with more than two populations, with unequal population sizes, or with repeated hierarchical subdivisions. (Nei introduced it in this general form, with a more complex derivation than I have given here for the special case of two equal populations.)
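As a check on the algebra, here is a small sketch that computes GST as (HT – HW)/HT for two equal populations and compares it with d^2/pq in a two-allele case. The function name and the frequencies (.8 and .4 for the first allele) are assumptions chosen only for the illustration, and p and q are taken to be the average frequencies of the two alleles.

```python
def gst(freqs_a, freqs_b):
    """GST = (HT - HW) / HT for two equal-sized populations at one locus."""
    means = [(a + b) / 2 for a, b in zip(freqs_a, freqs_b)]
    ht = 1 - sum(m * m for m in means)
    h_a = 1 - sum(a * a for a in freqs_a)
    h_b = 1 - sum(b * b for b in freqs_b)
    hw = (h_a + h_b) / 2
    return (ht - hw) / ht


# Two-allele case: first allele at .8 in population A and .4 in population B.
p_a, p_b = 0.8, 0.4
g = gst([p_a, 1 - p_a], [p_b, 1 - p_b])

p = (p_a + p_b) / 2   # average frequency of the first allele
q = 1 - p             # average frequency of the second allele
d = (p_a - p_b) / 2   # deviation of each population from the average
fst = d ** 2 / (p * q)

print(round(g, 4), round(fst, 4))  # 0.1667 0.1667
```

With two alleles, Σd^2 is 2d^2 and 1 – ΣM^2 is 2pq, which is why the two expressions agree.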

It is not wrong to use GST as an indicator of diversity between populations. However, it seems more natural to do so if we start with a focus of interest on a total population divided into many sub-populations, than if we are starting with just two populations, and want to quantify the extent of difference between them. For the latter purpose it seems more natural to compare heterozygosity between the populations with the heterozygosity within them, perhaps by calculating (HD – HW)/HW. If our primary interest is in (say) population A – as might be the case if we are members of it! – it would also be relevant to calculate (HD – HA)/HA, where HA is heterozygosity within population A. (Note that this will be different from the equivalent calculation for population B if one population is more internally uniform than the other.) This will give an indicator of the extent to which an individual in the other population is likely to be genetically different from oneself as compared with another member of one’s own population. Typically, (HD – HA)/HA will be about twice the size of GST calculated for the two populations. If on the other hand we are considering many populations, GST, calculated for all of them, will give an approximation to the average difference if we make pairwise comparisons between them. [Note 8]
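The ‘about twice’ relationship can be illustrated with a small example. This sketch assumes two equal populations that are equally diverse internally, with invented frequencies of 6:4 and 4:6; when the two populations differ a lot in internal diversity the ratio can depart substantially from 2.

```python
def heterozygosity(freqs_x, freqs_y):
    """1 minus the chance that a gene from x and a gene from y are the same allele."""
    return 1 - sum(x * y for x, y in zip(freqs_x, freqs_y))


freqs_a = [0.6, 0.4]
freqs_b = [0.4, 0.6]
means = [(a + b) / 2 for a, b in zip(freqs_a, freqs_b)]

hd = heterozygosity(freqs_a, freqs_b)   # between A and B
ha = heterozygosity(freqs_a, freqs_a)   # within A
hb = heterozygosity(freqs_b, freqs_b)   # within B
ht = heterozygosity(means, means)       # combined population
hw = (ha + hb) / 2

gst_value = (ht - hw) / ht
relative_to_a = (hd - ha) / ha

print(round(gst_value, 3))                  # 0.04
print(round(relative_to_a, 3))              # 0.083
print(round(relative_to_a / gst_value, 2))  # 2.08
```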

It seems then that GST may be suitable for comparison of many populations, while some other measure may be more suitable if we are just comparing two. But a more fundamental problem, which applies to any measures based on a partition of heterozygosity, is that the outcome is affected by the level of diversity or uniformity within populations. To dramatise this point, suppose population A has 5 alleles at a locus, each with frequency .2, while population B has 5 completely different alleles at that locus, also with frequency .2. In the combined population there are therefore 10 alleles, each with frequency .1. GST will therefore be (10 x .01)/(1 – [10 x .01]) = .11. Thus only 11% of total heterozygosity at this locus is accounted for by the differences between the two populations. Yet in fact they are genetically completely different! In principle, they might even be separate species. To take a more realistic example, suppose there are 3 alleles with frequency 4:3:3 in one population and 8:1:1 in the other. These frequencies are markedly different, yet GST comes out at only .11 (Σd^2 = .06 and 1 – ΣM^2 = .56). The point can even be illustrated with a 2-allele system. Suppose one population has the alleles in frequencies 7:3 and the other in frequencies 3:7. The two populations are therefore markedly different in allele frequencies, yet GST will be only .16. The underlying problem is that GST (like other measures based on comparisons of heterozygosity) is measuring not just the difference between populations but also the uniformity or diversity within them. Even in a 2-allele system, it is impossible to get a high value of GST unless each population is quite uniform, with a different single allele predominant in each. When GST (or FST) in humans is compared with that of other large mammals, where GST is often higher than for humans, this does not necessarily mean that human populations are very similar to each other, so much as that other animal populations are more internally uniform. This could just reflect the small population size of many large mammals, and the consequent strength of genetic drift. A low level of GST can be due to a low level of difference between populations, a high level of diversity within them, or any combination of the two. Knowing the level of GST by itself therefore strictly tells us nothing about the extent of genetic difference between populations.
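The dependence on within-population diversity can be made explicit by generalising the ‘no alleles in common’ example. The sketch below assumes two equal populations, each with k private alleles at equal frequency 1/k; the function name is invented. Under that assumption GST works out as 1/(2k – 1), so it falls steadily as k rises even though the populations remain completely distinct.

```python
def gst_disjoint(k):
    """GST for two equal populations, each with k private alleles at frequency 1/k."""
    freqs_a = [1 / k] * k + [0.0] * k
    freqs_b = [0.0] * k + [1 / k] * k
    means = [(a + b) / 2 for a, b in zip(freqs_a, freqs_b)]
    sum_d2 = sum(((a - b) / 2) ** 2 for a, b in zip(freqs_a, freqs_b))
    return sum_d2 / (1 - sum(m * m for m in means))


for k in (1, 2, 5, 20):
    # k = 5 is the 10-allele case discussed in the text.
    print(k, round(gst_disjoint(k), 3))
# 1 1.0
# 2 0.333
# 5 0.111
# 20 0.026
```

Only the k = 1 case, where each population is fixed for its own allele, gives the maximum value of 1.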

These reservations about the use of GST are not entirely new. Nei himself, in his 1973 paper, noted that ’the estimate obtained in one population cannot be compared with that of another, unless the breeding system is similar for the two populations. If HS [= my HW] is small, GST may be very large even if the absolute gene differentiation is small’. However, Nei doesn’t seem to have mentioned the converse problem that if HW is large, GST may be small even if absolute differentiation is large. Anyway, Nei’s words of caution seem to have been generally ignored, and textbook writers and others have cheerfully compiled comparative tables of GST in different species or sub-species without considering whether the breeding systems of the populations are similar. This can lead to misunderstanding even by professional biologists. Brian Charlesworth, an eminent geneticist, drew attention to the dangers in a 1999 paper, saying: ’relative measures of between-population divergence, such as FST [or GST] are inherently dependent on the extent of within-population diversity. Indeed, for loci with very high levels of diversity such as microsatellites, FST is a poor measure of between-population divergence even in the absence of forces that affect diversity, since FST is necessarily low even if absolute divergence is high…’

Although GST and FST are the most widely used measures of between-population diversity, other formulae have sometimes been proposed. Charlesworth mentions two alternatives, which in my notation are equivalent to (HD-HW)/HD and (HD-HW)/2HT. Interestingly, Wright himself (Wright, p.413) originally proposed using the square root of FST as the measure of divergence between populations, which would tend to raise the level of divergence for low-FST populations as compared with using raw FST figures. But these measures are still sensitive to within-population diversity, since HW enters into the denominator in one way or another.

In his 1973 paper Nei suggested using the average gene diversity between populations (my HD), after subtracting the average within-population diversity (my HW), as an ’absolute measure of gene differentiation’, and claimed that it is ’independent of the gene diversity within subpopulations’. I am not sure that this is correct. It may be true if we are considering cases where internal diversity within subpopulations is low (which Nei seems to have been mainly concerned with), but not when it is high. The measure can be expressed as HD – HW = 2Σd^2, or 2HB in my notation. However, 2HB is still subject to the constraint that total heterozygosity between populations (HD) cannot be greater than 1, and 2HB is HD – HW. If HW is high, 2HB must therefore be low. For example, in the 10-allele case mentioned above, it would give the result 2HB = .2, which seems unsatisfactory as a measure of ’absolute divergence’ in a case where the two populations have no alleles in common!

Nei himself had already suggested in 1972 a measure of ‘genetic distance’ between two populations which seems to avoid most of the problems discussed above. This measure, Nei’s D, is based on homozygosity – the probability that two randomly selected genes at a locus are identical – rather than heterozygosity. If we express average homozygosity within population A as H’A, and within population B as H’B, while homozygosity for genes selected one from each population is H’AB, then Nei’s D can be expressed as minus log(H’AB/√[H’A.H’B]), where log stands for the logarithm (to base e) of the expression in brackets. H’AB/√[H’A.H’B] is the homozygosity between populations A and B divided by the geometric mean of the homozygosity within the two populations. If the two populations have identical gene frequencies for all alleles, then homozygosity between the populations will be the same as within them, and H’AB/√[H’A.H’B] will be 1. Its logarithm will therefore be 0. Where the gene frequencies are not identical between the populations, H’AB will be smaller than √[H’A.H’B], so H’AB/√[H’A.H’B] will be a fraction between 0 and 1; it will be 0 if the two populations have no alleles in common, since H’AB will then be 0. The logarithm of a fraction is a negative number. For values of the fraction greater than 1/e but less than 1 the log will be a negative fraction; for the value 1/e it will be minus 1; and for values from 1/e to 0 it will be a negative number increasing (in absolute value) by 1 for each power of 1/e in the value of the fraction. As the fraction approaches close to 0, its log therefore goes to ‘minus infinity’. [Note 9] Since D is minus log(H’AB/√[H’A.H’B]), the minus sign converts the negative values of the log into positive ones.
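A minimal sketch of the calculation for a single locus is given below. The function name is invented, and the single-locus form is a simplification: in practice the homozygosities are averaged over many loci before the ratio is taken.

```python
import math


def nei_d(freqs_a, freqs_b):
    """Single-locus Nei's D: minus the log of H'AB / sqrt(H'A * H'B)."""
    hom_a = sum(a * a for a in freqs_a)
    hom_b = sum(b * b for b in freqs_b)
    hom_ab = sum(a * b for a, b in zip(freqs_a, freqs_b))
    ratio = hom_ab / math.sqrt(hom_a * hom_b)
    if ratio == 0:
        return math.inf  # no alleles in common
    # The ratio cannot exceed 1; the cap only guards against rounding error.
    return -math.log(min(ratio, 1.0))


print(nei_d([0.7, 0.3], [0.7, 0.3]))            # identical frequencies: D is 0
print(round(nei_d([0.7, 0.3], [0.3, 0.7]), 3))  # 0.323
print(nei_d([0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]))  # inf
```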

While at first sight rather daunting, Nei’s D has some attractive properties. Nei himself stressed its value for studies of population structure and evolution, which I am not competent to assess. But simply as a descriptive measure of genetic difference between populations, it seems preferable to GST, as it does not seem to be seriously distorted by the extent of heterozygosity within populations. However, it does have the drawback that for values of H’AB/√[H’A.H’B] approaching 0, the value of D increases disproportionately, as it ‘goes to infinity’.

I do wonder whether for the basic purpose of summarising genetic difference between two populations, it would not be better simply to take the sum of the absolute differences in the frequencies of alleles between them (including zero frequencies for any alleles that are absent from one population). For example, if one population has alleles a, b, c, d, and e with frequencies .3, .2, .3, .1, and .1, and the other has alleles a, b, c, and d, with frequencies .5, .3, .1, and .1 the absolute differences would be .2, .1, .2, 0, and .1. The sum of the differences would therefore be .6. Such sums could of course be averaged over several loci. The maximum range of this indicator would be from 0 (no differences at all) to 2 (no alleles in common). If it is preferred to have only values ranging from 0 to 1, the indicator could be divided by 2. I am aware that this is a very crude measure, with no sophisticated rationale in population genetics, but it is easy to calculate, and does not seem to give intuitively absurd results for any scenario I can think of.
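The crude measure suggested above is simple to compute. This sketch uses the frequencies from the example in the text; the function name and the use of dictionaries keyed by allele are my own choices.

```python
def sum_abs_differences(freqs_a, freqs_b):
    """Sum of absolute allele-frequency differences; absent alleles count as 0."""
    alleles = set(freqs_a) | set(freqs_b)
    return sum(abs(freqs_a.get(x, 0.0) - freqs_b.get(x, 0.0)) for x in alleles)


pop_1 = {"a": 0.3, "b": 0.2, "c": 0.3, "d": 0.1, "e": 0.1}
pop_2 = {"a": 0.5, "b": 0.3, "c": 0.1, "d": 0.1}

total = sum_abs_differences(pop_1, pop_2)
print(round(total, 3))      # 0.6
print(round(total / 2, 3))  # 0.3 on the 0-to-1 scale

# Populations with no alleles in common reach the maximum of 2.
print(sum_abs_differences({"a": 1.0}, {"b": 1.0}))  # 2.0
```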

I think the main lesson to draw is that heterozygosity does not capture everything we are interested in if we want to measure genetic diversity at population level. If its limitations are not understood, there is a danger of drawing unfounded or absurd inferences. To illustrate this, consider a remark made by Lewontin in his book on Human Diversity. Having summarised the evidence that on average 85% of human diversity is within populations, as measured by GST or similar measures, he remarks that ‘To put the matter crudely, if, after a great cataclysm, only Africans were left alive, the human species would have retained 93% of its total genetic variation, although the species as a whole would be darker skinned. If the cataclysm were even more extreme and only the Xhosa people of the southern tip of Africa survived, the human species would still retain 80% of its genetic variation. Considered in the context of the evolution of our species, this would be a trivial reduction’ (Lewontin, p. 123).

To see the fallacy in this, suppose we consider a ‘population’ made up of many different animal species, ranging from ants to zebras. If each of these species has an internal average heterozygosity of .85 (which is quite possible, for large widely-ranging species), then GST calculated for the whole ‘population’ will be no higher than .15. By Lewontin’s logic, all but one of these species could be exterminated without greatly reducing ‘genetic diversity’. Not a conclusion to please the tree-huggers! This is not to say Lewontin is necessarily wrong about the human case, but his inference cannot properly be drawn solely from measurements of GST. Before saying anything definite about the importance of genetic diversity between human populations, it would be necessary to consider other measures such as Nei’s D, which more directly measure the difference in gene frequencies between them. Nei and colleagues have done this for a number of genetic markers in the major human ‘races’ (African, Asian, and European) and found fairly small values for D (less than 0.1 on average), which suggests that Lewontin’s conclusion is not in fact unreasonable (see Nei, Livshits and Ota). But it still seems undesirable that the 85/15 figure should be so widely used with so little consideration of what it actually means.

Note 4: Homozygosity in the population with p = M+d will be (M+d)^2 = M^2 + d^2 + 2Md, and in the population with p = M-d will be (M-d)^2 = M^2 + d^2 – 2Md. The average of these is M^2 + d^2.

Note 5: The sources of this heterozygosity can be analysed into three components. There is a ½ chance of selecting one gene from each subpopulation, a ¼ chance of selecting them both from the population with allele frequency M – d, and a ¼ chance of selecting them both from the population with allele frequency M + d. The probabilities of homozygosity add up to ½(M+d)(M-d) + ¼(M-d)^2 + ¼(M+d)^2 = M^2, so HT over all alleles is 1 – ΣM^2 as expected.

Note 6: The terms FST and GST are used almost interchangeably in the literature. I won’t explore the exact relationship between the two measures. It is sometimes said that they are conceptually different but quantitatively the same. Nei’s derivation is certainly clearer than Wright’s, which is closely connected to his theories of inbreeding and genetic drift. Wright’s ’F’ is his measure of inbreeding, which he interpreted as the coefficient of correlation between uniting gametes. His explanations were notoriously obscure, and according to W. G. Hill the interpretation in terms of correlation is now unfamiliar to most geneticists. Incidentally, in the 1943 paper which is usually cited as the source of FST, Wright doesn’t actually use this term.

Note 7: More generally, for two or more populations Wright’s FST can be expressed as Vp/M’pM’q, where Vp is the variance of the frequency of one of the alleles among the populations, M’p is its mean frequency, and M’q is the mean frequency of the other allele. For two populations this reduces to d^2/pq. This assumes that we are taking the variance directly from the two populations themselves. If on the other hand we are estimating the variance among a wider ensemble of populations, using the observed populations as a sample basis for the estimate, the formula would need to be adjusted to allow for the fact that the variance of a sample is usually lower than the true population variance (technically it is a ‘biased statistic’). For a ‘sample’ of only two populations, the adjustment would double the variance, and therefore also double the value of FST. This adjustment seems inappropriate if we are only interested in measuring diversity between two populations. I mention this because Cavalli-Sforza et al, pp.26-7, use the adjusted formula without explaining it, and it took me some time to work out why it was different from the formula I had seen elsewhere.
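
The doubling described in Note 7 can be checked numerically. The sketch below uses Python's standard `statistics` module, where `pvariance` takes the variance of the two frequencies as they stand and `variance` applies the usual n – 1 sample correction. The values of M and d are arbitrary illustrative choices.

```python
import statistics

M, d = 0.5, 0.1                 # arbitrary illustrative values
freqs = [M + d, M - d]          # frequency of one allele in the two populations

mean_p = statistics.mean(freqs)   # M'p
mean_q = 1 - mean_p               # M'q

var_direct = statistics.pvariance(freqs)   # variance of the two values themselves: d^2
var_sample = statistics.variance(freqs)    # sample-corrected variance: 2 d^2

fst_direct = var_direct / (mean_p * mean_q)
fst_adjusted = var_sample / (mean_p * mean_q)
```

With only two populations the corrected variance is exactly twice the direct one, so the adjusted FST comes out at twice the unadjusted value.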

Note 8: It will be exactly equal to this average if we include the zero ‘differences’ between each population and itself.

Note 9: see e.g. Fine, p.377. Jobling et al, p.168, state incorrectly that D varies between 0 and 1. They also give the formula for D incorrectly, by omitting a necessary summation sign and putting a bracket in the wrong place. Cavalli-Sforza et al, p.27, give a correct version of the formula under the heading ’Nei’s Unbiased Genetic Distance’, and call it DN. This is different from the measure called D further up on the same page.


L. Cavalli-Sforza et al.: The History and Geography of Human Genes, 1994.
*B. Charlesworth: ‘Measures of divergence between populations and the effect of forces that reduce variability’, Molecular Biology and Evolution, 15, 1998, 538-43.
H. B. Fine: College Algebra (Dover edn., 1961).
*W. G. Hill: ‘Sewall Wright’s “Systems of mating”’, Genetics, 143, 1996, 1499-1506.
M. Jobling et al.: Human Evolutionary Genetics, 2004.
R. Lewontin: Human Diversity, 1982.
*M. Nei: ‘Genetic distance between populations’, American Naturalist, 106, 1972, 283-92.
*M. Nei: ‘Analysis of gene diversity in subdivided populations’, Proc. Nat. Acad. Sci., 70, Dec 1973, 3321-3.
*M. Nei, G. Livshits and T. Ota: ‘Genetic variation and evolution of human populations’, 1993, in Genetics of Cellular, Individual, Family and Population Variability, ed. C. Hanis.
Sewall Wright: ‘Isolation by distance’ in Sewall Wright: Evolution: Selected Papers, ed. William B. Provine, 1986.

Items marked * are available as free pdf downloads if you Google hard enough.

Related: Part I.

Posted by David B at 01:49 AM

• Category: Science 

For a while now I’ve been trying to understand how genetic diversity is measured.

For example, there is the familiar finding by Richard Lewontin, replicated by many others, that in humans about 85% of total genetic diversity is found within any single population, and only 15% between different populations.

But what do such statements actually mean? How can diversity be measured and apportioned between populations?

I guess that most laymen with an interest in genetics, like myself, content themselves with a very vague understanding of such claims. A full explanation would probably be too complicated and technical for the layman to follow.

But I wanted to dig a bit deeper. While I did this mainly for my own benefit, the results may be interesting to others.


– the good news is that the basic concepts and methods are not very technical. Most of the key points can be understood using only elementary algebra.

– the bad news is that there are several different ways of measuring diversity, and especially the diversity or ’distance’ between two or more populations. In general the various methods will give the same rank order of diversity, but they may give very different numerical values.

– the worse news is that some of the most widely used measures are of doubtful value. The problem is not that they are actually wrong, but that the results are ambiguous and can give a misleading impression. In particular, Wright’s FST, Nei’s GST, and similar measures (including that used by Lewontin) can seriously understate the relative importance of genetic differences between populations as compared to differences within them.

Reasons for these conclusions are given more fully in the continuation. I hesitate to go into these technical issues, because I expect to get one (or both) of two responses:

(1) That’s all nonsense


(2) Oh, everyone knows that!

But after a good deal of reading on the subject, I don’t think it’s all nonsense. Most of the points I make can be found somewhere in the academic literature. In particular, I was pleased to find that an eminent population geneticist has made the same key point that independently occurred to me. On the other hand, these issues do not seem to be widely discussed in the literature, and someone who had read (say) the popular works of Cavalli-Sforza, Spencer Wells, and the like, or an introductory genetics textbook, would probably not be aware that there was any serious problem about measuring diversity.

Incidentally, the main problem I discuss is quite different from that raised by Anthony Edwards in his well-known paper on ’Lewontin’s Fallacy’. I should also emphasise that I am not saying that Lewontin’s 85% figure is actually misleading, just that it may be. But to understand why, read the rest…

In measuring genetic diversity we are attempting to quantify the extent of differences within or between populations. If we were measuring diversity in continuous quantitative traits such as height, there would be a clear starting point for identifying the ’differences’ we are interested in. Given any two measurements of the same trait, we can subtract one from the other to get a raw difference or interval. These intervals are themselves quantities in the same dimension as the trait itself, and can be added, multiplied, averaged, etc, to obtain such measures of diversity as the standard deviation, the Gini coefficient, the inter-quartile interval, or the mean absolute-value deviation.

With a non-quantitative trait such as genetic material, the starting point is not so obvious. Given any two stretches of DNA, we can try to identify and list the differences between them. But are all differences equal? Are we interested in differences of single nucleotides, codons, functioning genes, or what? Is every difference to be given the same weight, or do we, for example, ignore non-coding regions or synonymous codons? This depends in part on the underlying motive for measuring diversity. The existing measures of diversity, such as Wright’s FST, were devised mainly to assist in reconstructing phylogenies and population history. For this purpose all genetic differences are potentially informative, and the tendency is to treat all differences as equal. A different approach might be appropriate for other purposes. The choice of units of analysis may also affect the size of measured diversity even if it does not affect the rank order of diversity in different populations: for example, diversity at the level of haplotypes will be larger than at the level of haplogroups, since each haplogroup is divided into many haplotypes.

However, I don’t want to linger on these issues, and will assume that a decision has been taken about the level and kind of genetic differences we are interested in. That still leaves some problems of measurement to be settled.

Once genetic material has been classified into a number of variant forms or ’alleles’ (at whatever level), it would be natural to suggest that the level of diversity within a population can be measured by the number of different alleles within that population, while the diversity between two populations can be measured by the number (and/or proportion) of alleles found in one population and not the other. This could be a workable approach in comparing different species, but within the same species the problem is that most alleles are common to most populations (except perhaps for mitochondrial or Y chromosome haplotypes). Differences are more in the frequency (proportion) of different alleles than in their simple presence or absence.

With sufficient data on the frequency of different alleles, it is possible to calculate the probability that two genes at a given locus, selected at random from the relevant population or populations, will be either identical (homozygous) or different (heterozygous). [See note 1.] This will give us the average expected number of genetic differences between individuals, and it is plausible that this pins down in more precise terms the vague concept of ‘diversity’. Most measures of genetic diversity are therefore based on some index of heterozygosity: the probability that two genes at a given locus, selected at random from the relevant population(s), will be different. As Lewontin puts it, ‘there are various measures of the diversity of objects in a collection, all of which are equivalent to asking the probability that two objects taken at random from the collection will be of different kinds’ (Lewontin, p.120).

If the frequency of a given allele in a population is p, then the probability of randomly selecting that allele twice in succession is p^2 (i.e. p-squared). [Note 2] If we square the relevant frequency p for each allele at the same locus, the overall probability of selecting some allele twice in succession will be the sum of all the p-squareds: Σp^2. This is the expected homozygosity of the population at that locus. Since heterozygosity is just the complement of homozygosity, the expected heterozygosity at that locus is 1 – Σp^2, which I will call H. (In the special case of a two-allele system, with frequencies p and q (= 1-p), H = 2pq [see note 3].) We can then average H over a number of different loci to get an estimate of average H within the population.
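
The calculation of H is simple enough to sketch in a few lines of Python. The function name and the allele frequencies below are hypothetical, chosen only to illustrate the formula H = 1 – Σp^2 and the averaging over loci.

```python
def expected_heterozygosity(freqs):
    """H = 1 - sum of squared allele frequencies at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies should sum to 1")
    return 1.0 - sum(p * p for p in freqs)

# Two-allele case: H should equal 2pq
p = 0.7
h_two = expected_heterozygosity([p, 1 - p])

# Average H over several loci (hypothetical frequencies)
loci = [[0.7, 0.3], [0.5, 0.3, 0.2], [0.9, 0.1]]
mean_h = sum(expected_heterozygosity(f) for f in loci) / len(loci)
```

For p = .7 the function returns .42, which agrees with 2pq = 2 × .7 × .3.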

Intuitively, we would expect diversity to be higher, other things being equal, when there are more alleles in the system rather than fewer, and when their frequencies are evenly spread rather than concentrated in one or a few alleles. Conversely, we would expect diversity to be low when most of the frequency is concentrated in one or a few alleles. H meets these criteria of diversity rather well, and in general the level of H seems to be a reasonable way of ranking different populations with respect to their internal genetic diversity. However, this does not guarantee that differences of diversity can be numerically measured by the difference in H. Suppose for example that a population has n alleles, with equal frequencies for each allele. H will therefore be 1-n[(1/n)^2] = 1-1/n. Consider the following values of n and the corresponding values of H (to two places of decimals):


Evidently increasing the number of alleles from, say, 2 to 4 does not double the ‘diversity’ as measured by H, and increasing n beyond about 5 makes relatively little difference to H. Even increasing it tenfold from 10 to 100 only increases ‘diversity’ by 1/10. Since H cannot exceed 1, it is bound to be squeezed up against the ceiling when values of n are high. This seems intuitively unsatisfactory if we are looking for a numerical measure of diversity.
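
The pattern is easy to reproduce. A minimal Python sketch, using the formula H = 1 – 1/n for n equally frequent alleles and the values of n mentioned in this paragraph (the function name is only illustrative):

```python
def h_equal(n):
    """H for n alleles of equal frequency: 1 - n * (1/n)^2 = 1 - 1/n."""
    return 1 - 1 / n

for n in (2, 4, 5, 10, 100):
    print(n, round(h_equal(n), 2))
# 2 0.5
# 4 0.75
# 5 0.8
# 10 0.9
# 100 0.99
```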

As well as differences in the number of alleles, differences in the relative frequency of alleles can also have intuitively unsatisfactory effects on H. Consider the following values of H (to two places of decimals) for different values of p, where p is the more common of two alleles in a two-allele system:


Over quite a wide range of changes in gene frequency (from p = .5 up to about p = .75) ‘diversity’ changes rather slowly, but beyond this point it falls more rapidly. Suppose we are comparing diversity in 4 populations, A, B, C, and D, where p and q are in the ratios 5:5, 7:3, 8:2 and 9:1 respectively. For these ratios, H is .5, .42, .32 and .18 respectively. By the criterion of H, we will conclude that the rank order of diversity (from greater to less) is A>B>C>D, which is intuitively reasonable, but we would also conclude that the difference in diversity between C and D (.32 – .18 = .14) is greater than the difference between A and B (.5 – .42 = .08). This does not seem intuitively right: in quantifying diversity at a population level the difference between a 5:5 split and a 7:3 split is surely at least as important as the difference between 8:2 and 9:1.
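
The figures for the four populations can be checked with a short Python sketch, using H = 2pq for a two-allele system. The dictionary layout and variable names are illustrative only.

```python
# Frequency p of the more common allele in each population
splits = {"A": 0.5, "B": 0.7, "C": 0.8, "D": 0.9}

# H = 2pq with q = 1 - p, to two places of decimals
h = {name: round(2 * p * (1 - p), 2) for name, p in splits.items()}

gap_ab = round(h["A"] - h["B"], 2)   # difference between 5:5 and 7:3
gap_cd = round(h["C"] - h["D"], 2)   # difference between 8:2 and 9:1
print(h)                # {'A': 0.5, 'B': 0.42, 'C': 0.32, 'D': 0.18}
print(gap_ab, gap_cd)   # 0.08 0.14
```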

A further weakness of H as a quantitative measure of diversity is that it is almost bound to take high values (between, say, .7 and .99) if there are more than 2 alleles at a locus, and no single allele is predominant. H cannot be less than .5 unless the most common allele has a frequency of at least .5. If there are more than a few alleles with significant shares of the population, H can hardly be less than .7. For example, if there are 4 alleles with frequencies in the ratio 30:25:25:20, H will be about .75. With 5 alleles in the ratio 25:20:20:20:15, H will be about .80. Such a compressed scale of measurement is likely to be inconvenient and potentially misleading. By analogy, suppose that we tried to measure climatic temperature on a new scale under which all temperatures between 0 and 60 degrees Fahrenheit had values between 0 and 90 and all temperatures above 60 degrees F were squeezed into the values between 90 and 100 on the new scale. The new scale would be unlikely to catch on!

Of course, the level of heterozygosity is interesting in itself, and if we want to define diversity by heterozygosity we are free to do so, but this doesn’t necessarily capture everything in our informal concept of diversity.

If this seems a minor technical point, reflect that if H within populations is as high as .86, then diversity between populations, measured in the most common way (Nei’s GST), cannot be more than .14, no matter how different the populations are from each other. (They could even be different species.)
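
A small numerical sketch may make the ceiling concrete. It assumes the usual definition GST = (HT – HS)/HT, where HS is the average heterozygosity within populations and HT is the heterozygosity of the pooled populations. The two populations below are hypothetical, of equal size, with 7 equally frequent alleles each and no alleles in common.

```python
def heterozygosity(freqs):
    """H = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

n = 7  # hypothetical number of alleles per population
pop_a = [1 / n] * n + [0.0] * n   # alleles 1..7 only
pop_b = [0.0] * n + [1 / n] * n   # alleles 8..14 only

h_s = (heterozygosity(pop_a) + heterozygosity(pop_b)) / 2   # within populations
pooled = [(a + b) / 2 for a, b in zip(pop_a, pop_b)]        # equal-sized populations assumed
h_t = heterozygosity(pooled)                                # total
g_st = (h_t - h_s) / h_t
```

Here HS is about .86 and GST comes out at about .08, even though the two populations share no alleles at all. Because HT cannot exceed 1, GST can never be greater than 1 – HS.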

But the measurement of diversity between populations is a bit more complicated than within a single population, so I will continue the analysis in a second post…

Note 1: These terms strictly apply only to genes within the same organism, but now seem also to be widely used with reference to genes in different individuals or populations. In this sense heterozygosity or homozygosity are hypothetical, referring to the probability that individuals would be hetero- or homozygous if their parental gametes were selected at random in the way specified.

Note 2: Strictly, in a finite population this is only true if we allow the same gene to be selected twice, but this does not matter unless the population is very small.

Note 3: Homozygosity = p^2 + q^2, so H = 1 – p^2 – q^2. But q = 1 – p, therefore H = 1 – p^2 – (1 + p^2 – 2p) = 2p(1-p) = 2pq.

R. Lewontin: Human Diversity, 1982

Posted by David B at 06:13 AM

• Category: Science 


Posted by David B at 04:15 AM

• Category: Science 