The Unz Review: An Alternative Media Selection
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
James Thompson Blogview

The full conference began yesterday. While listening to all the papers I can’t post much, but I will keep live-tweeting some of the presentations.

As ever, the best thing is meeting participants and finding out first hand about their work, stuff which will be published a year from now. Great fun watching people exchange data analyses by showing principal components results on their phones, and making arrangements to swap data sets. Also very entertaining to see the mix of ages, and to see Tom Bouchard talking to James Lee about the progress of our understanding of the genetics of intelligence, from what I will call the “twin age” to the genomic age.

Fabulous to see the youngest delegates attending and meeting the very researchers whose publications several decades ago inspired those students to enter the field. I saw first hand a few “your book changed my life” testimonial statements.

More to come.


When I started work in September 1968 one of the first things I was taught was that intelligence testing had a long history, and that many of the subtests in the Wechsler assessments I would be using had been taken from previous research. Kohs’ blocks (1920), I used to mutter, when people talked about Block Design. I was also taught something about the Stanford-Binet tests that I would not be using, because some clinicians still used them, and there was historical data I would need to know about. In hearing about skilled Binet testers I learned about dynamic testing: going from one domain to another as quickly as possible, just to establish general levels efficiently. I also learned that such procedures were only possible once one had achieved a very good knowledge of the test.

I was required to know my material almost by heart so that I could concentrate on every aspect of the patient’s behaviour. After 200 test administrations I began to feel confident I had seen it all, and knew all error types intimately. On my 201st test administration I encountered an entirely new error on Block Design, a scope and size error which was highly unusual. Even psychologists can learn something.

Does history matter? I think so. The early history of intelligence testing allows us to test the idea that IQ items are “arbitrary” and have no relevance to real life problems.

Aisa Gibbons & Russell T. Warne. First publication of subtests in the Stanford-Binet 5, WAIS-IV, WISC-V, and WPPSI-IV. Intelligence, Volume 75, July–August 2019, Pages 9–18.

The authors discuss the pre-history of intelligence testing from 1905 onwards. The period up to 1920 was extremely productive, and testing was popular, perhaps because of its widespread use in the military. Binet was interested in the lower levels of ability, Terman in the highest levels. Tests have to cater for the entire range, all 7 tribes of intellect. Not only that, they have to maintain discriminatory power throughout the whole range, though that is hard to do at the extremes.

Wechsler favored test formats and items that (a) showed high discrimination in intelligence across much of the continuum of ability, (b) produced scores with high reliability, (c) correlated strongly with other widely accepted measures of intelligence, and (d) correlated with “pragmatic” subjective ratings of intelligence from people who knew the examinee—such as a work supervisor (Wechsler, 1944).

That is a good summary of what an intelligence test item must achieve: discrimination, reliability, validity with other tests and, most importantly, with intelligence in everyday life.

Gibbons and Warne show that many tests go back a long way, and are earlier than generally realized. Their list of tests is an excellent way to understand all the tasks which have constituted the core elements of intelligence testing.

I learned a great deal reading through this section of the paper. For example, I did not know that Binet said of his early reasoning test that it was the best of the lot:

“the 1908 scale (of reasoning) has three images, each containing at least one human figure. The child then was asked to describe the picture, and more complex responses based on interpretation (rather than simply naming objects in the image) were viewed as indicative of greater intellectual ability. Binet found this subtest so useful when diagnosing intellectual disabilities that he wrote, “Very few tests yield so much information as this one… We place it above all the others, and if we were obliged to retain only one, we should not hesitate to select this one” (Binet & Simon, 1908/1916, p. 189).

Intelligence goes beyond the obvious.

Here are some historical points which were news to me:

Jean Marc Gaspard Itard was the first to use a form board-like task when he studied and educated a young boy found in the wild (named the “wild boy of Aveyron”) in 1798.

The very similar visual puzzles and object assembly subtests have an origin in the puzzles used for entertainment and geography education, which were first created in the 1750s in England and were in widespread use in the early 20th century when the first intelligence tests were being created (Norgate, 2007).

One discovery that we found striking was the diverse sources of inspiration for subtests. While the majority did have roots in the creation of cognitive tests, others have their origin in games (the delayed response subtest, the object assembly subtest), classroom lessons (the block design subtest), the study of a feral child (form boards and related subtests), school assessments (vocabulary subtest) and more. To us, this means that items on intelligence tests often have a connection with the real world—even when they are presented in a standardized, acontextual testing setting. Additionally, this undercuts the suggestion that critics of intelligence testing often make that intelligence test items are meaningless tasks that are divorced from any relationship to an examinee’s environment (e.g., Gould, 1981).

On the other hand, one criticism of intelligence tests seems justified from our study: subtests that appear on popular intelligence tests have changed little in the past century (Linn, 1986). While one could argue that the enduring appeal of these subtests is due to their high performance in measuring intelligence, the fact remains that many of these subtests were often created with little guiding theory or understanding of how the brain and mind work to solve problems (Naglieri, 2007). While sophisticated theories regarding test construction and the inter-relationships of cognitive abilities have developed in recent decades (e.g., Carroll, 1993), it is often not clear exactly how the tasks on modern intelligence tasks elicit examinees to use their mental abilities to respond to test items.

One way to test this criticism is to think of new tests more suited to the present age. Of course, others have already had that thought, and have created computer games which measure intelligence. Fun, but is this a big advance? It is only a gain if the results are more accurate, better predictive of real-life achievements and more speedily obtained. That is a hard bar to clear, since reasonable overall measures can be obtained in a few minutes. More likely, corporations are measuring our intelligence very quickly and surreptitiously by noting our Google searches, Facebook likes, and perhaps even commenting histories.

A more pressing problem, to which the authors allude in passing, is that some new-fangled tests are launched each year, and most fall out of use. The reason is that Wechsler testers have now become highly pragmatic, and do not take kindly to complicated administration procedures, nor to test materials which are difficult to assemble and present quickly.

The reality appears to be that any puzzling task taps ability, and there are diminishing returns when using new psychometric tasks. This is the familiar “indifference of the indicator” which Spearman proposed in 1904. This does not exclude finding that individuals have strengths and weaknesses in specific domains, but simply that all tasks lead to g, either quickly or slowly, to slightly varying degrees.

• Category: Science • Tags: General Intelligence, Intelligence, IQ 

I have good memories of San Antonio, host city of the ISIR 2012 conference. We visited the Alamo, where throngs of tourists looked respectfully at an ancient wall of the building which was being restored with lime mortar. It was regarded as a restoration of national importance, and the wall was cordoned off, with detailed explanations. At that same time in England the local stone mason was restoring our similarly aged west-facing cottage walls, putting in lime mortar, so it seemed a natural process, though one accorded a profound reverence in this case. If only stones could speak, and so on. History is when things turn, not just when they happen. Impact matters, so a distant siege can become an icon of national affirmation. All respect to those citizens who saved the monument from encroaching development.

It was that visit in December 2012 which really got me blogging. It seemed a waste of an airfare to attend a conference and then not tell anyone about it. I listed the papers and made some comments, aware that I was casting my words into empty space. My actual conference report got 20 views. As I blogged more about intelligence research the numbers built up slowly. In the subsequent year I got 71,701 pageviews.

Next week I am off to the ISIR 2019 conference in Minneapolis. I will do my best to post up the papers presented, and to tweet comments and links. On the other hand, I tend to just listen to papers, so might well blog less for a while.

This morning I find that I have reached a million page views. Thanks for reading.

• Category: Science 

Teachers loom large in most children’s lives, and are long remembered. Class reunions often talk of the most charismatic teacher, the one whose words and helpfulness made a difference. Who could doubt that they can have an influence on children’s learning and future achievements?

Doug Detterman is one such doubter:

Education and Intelligence: Pity the Poor Teacher because Student Characteristics are more Significant than Teachers or Schools.

Douglas K. Detterman, Case Western Reserve University (USA)
The Spanish Journal of Psychology (2016), 19, e93, 1–11.


Education has not changed from the beginning of recorded history. The problem is that focus has been on schools and teachers and not students. Here is a simple thought experiment with two conditions: 1) 50 teachers are assigned by their teaching quality to randomly composed classes of 20 students, 2) 50 classes of 20 each are composed by selecting the most able students to fill each class in order and teachers are assigned randomly to classes. In condition 1, teaching ability of each teacher and in condition 2, mean ability level of students in each class is correlated with average gain over the course of instruction. Educational gain will be best predicted by student abilities (up to r = 0.95) and much less by teachers’ skill (up to r = 0.32). I argue that seemingly immutable education will not change until we fully understand students and particularly human intelligence. Over the last 50 years in developed countries, evidence has accumulated that only about 10% of school achievement can be attributed to schools and teachers while the remaining 90% is due to characteristics associated with students. Teachers account for from 1% to 7% of total variance at every level of education. For students, intelligence accounts for much of the 90% of variance associated with learning gains. This evidence is reviewed.
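Detterman’s two-condition thought experiment can be sketched as a toy simulation. This is my own illustration, not his analysis: the effect weights (0.9 for ability, 0.1 for teacher quality, 0.3 for noise) are assumptions chosen so that student ability dominates individual gains, roughly in the spirit of his 90%/1–7% figures.

```python
import random
import statistics

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    n = len(xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)

random.seed(0)
n_classes, class_size = 50, 20
students = [random.gauss(0, 1) for _ in range(n_classes * class_size)]  # abilities
teachers = [random.gauss(0, 1) for _ in range(n_classes)]               # teaching quality

def mean_gains(classes, teachers):
    """Mean learning gain per class: ability dominates, teacher adds a small push."""
    out = []
    for cls, t in zip(classes, teachers):
        gains = [0.9 * a + 0.1 * t + 0.3 * random.gauss(0, 1) for a in cls]
        out.append(statistics.mean(gains))
    return out

# Condition 1: randomly composed classes; correlate teacher quality with class gain.
random.shuffle(students)
classes1 = [students[i * class_size:(i + 1) * class_size] for i in range(n_classes)]
r_teacher = corr(teachers, mean_gains(classes1, teachers))

# Condition 2: classes streamed by ability; teachers (random draws) assigned in order.
ranked = sorted(students)
classes2 = [ranked[i * class_size:(i + 1) * class_size] for i in range(n_classes)]
r_ability = corr([statistics.mean(c) for c in classes2], mean_gains(classes2, teachers))

print(f"teacher quality vs class gain:    r = {r_teacher:.2f}")
print(f"class mean ability vs class gain: r = {r_ability:.2f}")
```

On this toy model the ability correlation comes out near 1 while the teacher correlation is far lower, and the contrast survives most parameter choices: when classes are streamed by ability, between-class ability differences dominate the class-mean gain, swamping teacher quality.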

Have we over-rated the impact of teachers, and ignored the importance of innate ability? How can we have been so mistaken? Read on.

At least in the United States and probably much of the rest of the world, teachers are blamed or praised for the academic achievement of the students they teach. Reading some educational research it is easy to get the idea that teachers are entirely responsible for the success of educational outcomes. I argue that this idea is badly mistaken. Teachers are responsible for a relatively small portion of the total variance in students’ educational outcomes. This has been known for at least 50 years. There is substantial research showing this but it has been largely ignored by educators. I further argue that the majority of the variance in educational outcomes is associated with students, probably as much as 90% in developed economies. A substantial portion of this 90%, somewhere between 50% and 80% is due to differences in general cognitive ability or intelligence. Most importantly, as long as educational research fails to focus on students’ characteristics we will never understand education or be able to improve it.

Doug Detterman is a noble toiler in the field of intelligence, and has very probably read more papers on intelligence than anyone else in the world. He notes that the importance of student ability was known by Chinese administrators in 200 BC, and by Europeans in 1698.

The main reason people seem to ignore the research is that they concentrate on the things they think they can change easily and ignore the things they think are unchangeable.

Despite some experiments, the basics of teaching have not changed very much: the teacher presents stuff on a blackboard/projector screen which the students have to learn by looking at the pages of a book/screen, and then writing answers on a page/screen. By now you might have expected all lessons to be taught by computer-driven correspondence tutorials, cheaply delivered remotely. There is some of that, but not as much as dreamed of decades ago.

Detterman reviews Coleman et al. (1966) and Jencks et al. (1972), which first brought to attention that 10% to 20% of variance in student achievement was due to schools and 80% to 90% due to students. He then looks at more recent reviews of the same issue.

Gamoran and Long (2006) reviewed the 40 years of research following the Coleman report but also included data from developing countries. They found that for countries with an average per capita income above $16,000 the general findings of the Coleman report held up well. Schools accounted for a small portion of the variance. But for countries with lower per capita incomes the proportion of variance accounted for by schools is larger. Heyneman and Loxley (1983) had earlier found that the proportion of variance accounted for by schools in poorer countries was related to the countries’ per capita income. This became known as the Heyneman-Loxley effect. A recent study by Baker, Goesling, and LeTendre (2002) suggests that the increased availability of schooling in poorer countries has decreased the Heyneman-Loxley effect so that these countries are showing school effects consistent with or smaller than those in the Coleman report.

The largest effect of schooling in the developing world is 40% of variance, and that includes “schooling” where children attend school inconsistently, and staff likewise.

After being destroyed during the Second World War, Warsaw came under control of a Communist government which allocated residents randomly to the reconstructed city, to eliminate cognitive differences by avoiding social segregation. The redistribution was close to random, so they expected that the Raven’s Matrices scores would not correlate with parental class and education, since the old class neighbourhoods had been broken up, and everyone attended the schools to which they had randomly been assigned. The authorities assumed that the correlation between student intelligence and the social class index of the home would be 0.0, but in fact it was R² = 0.97, almost perfect. The difference due to different schools was 2.1%. In summary, in this Communist heaven student variance accounted for 98% of the outcome.

Angoff and Johnson (1990) showed that the type of college or university attended by undergraduates accounted for 7% of the variance in GRE Math scores. Fascinatingly, a full 35% of students did not take up the offer from the most selective college they were admitted to, instead choosing to go to a less selective college. Their subsequent achievements were better predicted by the average SAT score of the college they turned down than the average SAT scores of the college they actually attended, the place where they received their teaching. Remember the Alma Mater you could have attended.

Twins attending the same classroom are about 8% more concordant than those with different teachers, which is roughly in line with the usual school effect of 10%.

Detterman’s paper continues with a review of other more recent studies. A good summary is shown below.

Here is a summary of the characteristics of students which predict good scholastic outcomes.

• Category: Science • Tags: IQ, Public Schools 

For some years I have been organizing the London Conference on Intelligence, which brings together about 25 invited researchers to present papers and debate issues in a critical but friendly setting. (“The London School” was the name given to those who argued that intelligence had a general component, and was heritable.) Speakers are chosen for innovative work, independent thought and for being more interested in whether things are true than whether they are comfortable. We are in favour of the under-dog and the rebellious, but if there is a theme at all, it is that all views must have empirical support.

As you would expect from any group of academics, there were many differences of opinion, and less emphasis on organization. We made sure there was plenty of time for informal discussion, and that resulted in many of the researchers working together on scientific papers. In fact, about half of the presented papers eventually ended up as published work, slightly better than the norm for conferences. The only common project we ever agreed upon was that the Lynn database of country IQs should be thoroughly revised and every aspect documented on a public database.

By way of background, I had originally intended that these meetings would be public, with university students attending, and journalists invited. Speakers told me I was naïve to even consider that option, because many of them were already under political pressure, and feared loss of grant money, promotion, or even their academic survival. So, we moved to invitation only, and reduced publicity.

One young speaker was a bit different because he was a sociologist by background, and attended the group primarily to seek a sounding board for his work on the link between political attitudes and intelligence. He did no work on race and intelligence, though he later wrote a paper explaining why in his view such research should continue, and that suppressing it would be wrong.

Last year he won a Fellowship at the University of Cambridge, the best of over 900 candidates. This was a great achievement. Before he could take up this prestigious post, which deservedly would have launched him on a brilliant career, a political campaign was launched against him, and one of his supposed crimes was to have attended the London Conference on Intelligence. Additionally, he had written an empirical paper arguing that people’s views of immigrant groups were affected by that immigrant group’s criminality.

Short story: Cambridge threw him out. He lost his job, and effectively has lost his career. We had no way to defend him from this outrageous injustice. We wondered how he would ever find a job, in today’s very censorious climate.

It is a pleasure to report that he has launched a crowdfunded lawsuit against the Cambridge college which hounded him out. He is doing this simply to show that he was unfairly judged. Any surplus funds, should there be any, will be held over for the next person to be treated in this awful way.

Could you please support him? It turns out that the investigation into his appointment process confirms he was the best person to get the job.

Even $10 from each person reading this blog would help him mount a case, and I think he will win. This could be a turning point.

• Category: Science • Tags: Academia, IQ, Political Correctness 

As an undergraduate, my psychology tutor dryly commented to me that the best way to get a paper widely read was to give it a memorable title, like “the magic number 7, plus or minus 2”.

Miller, G. A. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review, 63(2), 81–97.

Here is the abstract, in full:

A variety of researches are examined from the standpoint of information theory. It is shown that the unaided observer is severely limited in terms of the amount of information he can receive, process, and remember. However, it is shown that by the use of various techniques, e.g., use of several stimulus dimensions, recoding, and various mnemonic devices, this informational bottleneck can be broken. 20 references.

Out of respect for George Miller, this post will be equally brief. His paper became a classic because it showed that we perceive the world through an attentional bottleneck, one which we would like to expand, but which has not proved possible despite every training effort over the last 63 years, other than for a few heroic individuals who practiced digit span hard for months, and then found their abilities did not generalize to other memory tasks. As in a funnel, the many possible inputs of experience must slowly swirl down a narrow spout into the waiting brain. A grievous restriction.

All is not lost, because we can cope with our restricted scope by learning how to “chunk” data into other more meaningful units. So, although we are cabined, cribbed, confined in actuality, we have found heuristics to cope with our limitations. Despite that, people still yearn to achieve even more if they could increase their span just a little.

The much-vaunted Flynn effect has done nothing for digit span, although it may have increased the easy “digits forwards” performance just a fraction, at the cost of reducing the harder and more predictive “digits backwards” performance by a similar fraction, resulting in no change overall.

How do other species fare in this regard? In a very brief review, Manoochehri (2019) lays out the basic picture and wonders how memory span evolved.

The evolution of memory span: a review of the existing evidence. Majid Manoochehri

The existing evidence shows that chimpanzees have a memory span of about 5 items (Inoue & Matsuzawa, 2007; Kawai & Matsuzawa, 2000).

Lately, Toyoshima et al. (2018) have stipulated that rats are able to remember 5 objects at once.

Baboons reveal a memory span of about 4 to 5 items (Fagot & DeLillo, 2011).

Herman et al. (2013) have suggested a memory span of about 4 to 5 items for bottlenose dolphins.

The results of Swartz et al.’s (1991) study of two rhesus monkeys suggest a memory span of about 4 objects.

Similar work by Sugita et al. (2015) has argued that rats’ memory span is approximately 4 items.

Terrace (1993) has found it takes a pigeon 3-4 months to learn a 4-item list, which suggests that 4 is a pigeon’s memory span.

More studies of more animals would be needed to show if the jump from 4-5 items up to the human 7 items is the massive discontinuity it appears to be. Did we pick up a mutation 60 to 130 thousand years ago which gave us the bandwidth to use grammatical computations, greater articulatory rehearsal, leading to automatic long-term storage, and the beginnings of introspection, self-reflection, consciousness and symbolic thought? It might even have given us the ability to create and enjoy music, a language-like spin off from newly acquired processing skills.

• Category: Science • Tags: Animal IQ 

Early in any psychology course, students are taught to be very cautious about accepting people’s reports. A simple trick is to stage some sort of interruption to the lecture by confederates, and later ask the students to write down what they witnessed. Typically, they will misremember the events, sequences and even the number of people who staged the tableaux. Don’t trust witnesses, is the message.

Another approach is to show visual illusions, such as getting estimates of line lengths in the Müller-Lyer illusion, or studying simple line lengths under social pressure, as in the Asch experiment, or trying to solve the Peter Wason logic problems, or the puzzles set by Kahneman and Tversky. All these appear to show severe limitations of human judgment. Psychology is full of cautionary tales about the foibles of common folk.

As a consequence of this softening up, psychology students come to regard themselves and most people as fallible, malleable, unreliable, biased and generally irrational. No wonder psychologists feel superior to the average citizen, since they understand human limitations and, with their superior training, hope to rise above such lowly superstitions.

However, society still functions, people overcome errors and many things work well most of the time. Have psychologists, for one reason or another, misunderstood people, and been too quick to assume that they are incapable of rational thought?

Gerd Gigerenzer thinks so.

He is particularly interested in the economic consequences of apparent irrationality, and whether our presumed biases really result in us making bad economic decisions. If so, some argue we need a benign force, say a government, to protect us from our lack of capacity. Perhaps we need a tattoo on our forehead: Diminished Responsibility.

The argument leading from cognitive biases to governmental paternalism—in short, the irrationality argument—consists of three assumptions and one conclusion:

1. Lack of rationality. Experiments have shown that people’s intuitions are systematically biased.

2. Stubbornness. Like visual illusions, biases are persistent and hardly corrigible by education.

3. Substantial costs. Biases may incur substantial welfare-relevant costs such as lower wealth, health, or happiness.

4. Biases justify governmental paternalism. To protect people from their biases, governments should “nudge” the public toward better behavior.

The three assumptions—lack of rationality, stubbornness, and costs—imply that there is slim chance that people can ever learn or be educated out of their biases; instead governments need to step in with a policy called libertarian paternalism (Thaler and Sunstein, 2003).

So, are we as hopeless as some psychologists claim we are? In fact, probably not. Not all the initial claims have been substantiated. For example, it seems we are not as loss averse as previously claimed. Does our susceptibility to printed visual illusions show that we lack judgement in real life?

In Shepard’s (1990) words, “to fool a visual system that has a full binocular and freely mobile view of a well-illuminated scene is next to impossible” (p. 122). Thus, in psychology, the visual system is seen more as a genius than a fool in making intelligent inferences, and inferences, after all, are necessary for making sense of the images on the retina.

Most crucially, can people make probability judgements? Let us see. Try solving this one:

A disease has a base rate of .1, and a test is performed that has a hit rate of .9 (the conditional probability of a positive test given disease) and a false positive rate of .1 (the conditional probability of a positive test given no disease). What is the probability that a random person with a positive test result actually has the disease?

Most people fail this test, including 79% of gynaecologists giving breast screening tests. Some researchers have drawn the conclusion that people are fundamentally unable to deal with conditional probabilities. On the contrary, there is a way of laying out the problem such that most people have no difficulty with it. Watch what it looks like when presented as natural frequencies:

Among every 100 people, 10 are expected to have a disease. Among those 10, nine are expected to correctly test positive. Among the 90 people without the disease, nine are expected to falsely test positive. What proportion of those who test positive actually have the disease?

In this format the positive test result gives us 9 people with the disease and 9 people without the disease, so the chance that a positive test result shows a real disease is 50/50. Only 13% of gynaecologists fail this presentation.
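The natural-frequency bookkeeping can be checked in a few lines of Python (a sketch of the arithmetic above; the variable names are mine):

```python
from fractions import Fraction

base_rate = Fraction(1, 10)   # 10 in 100 have the disease
hit_rate = Fraction(9, 10)    # P(positive test | disease)
false_pos = Fraction(1, 10)   # P(positive test | no disease)

population = 100
diseased = population * base_rate                       # 10 people
true_positives = diseased * hit_rate                    # 9 people
false_positives = (population - diseased) * false_pos   # 9 people

# Of everyone who tests positive, what share actually has the disease?
p_disease_given_positive = true_positives / (true_positives + false_positives)
print(p_disease_given_positive)  # 1/2
```

Counting people instead of manipulating conditional probabilities makes the 50/50 answer fall out of simple division, which is exactly Gigerenzer’s point.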

Summing up the virtues of natural frequencies, Gigerenzer says:

When college students were given a 2-hour course in natural frequencies, the number of correct Bayesian inferences increased from 10% to 90%; most important, this 90% rate was maintained 3 months after training (Sedlmeier and Gigerenzer, 2001). Meta-analyses have also documented the “de-biasing” effect, and natural frequencies are now a technical term in evidence-based medicine (Aki et al., 2011; McDowell and Jacobs, 2017). These results are consistent with a long literature on techniques for successfully teaching statistical reasoning (e.g., Fong et al., 1986). In sum, humans can learn Bayesian inference quickly if the information is presented in natural frequencies.

If the problem is set out in a simple format, almost all of us can do conditional probabilities.

I taught my medical students about the base rate screening problem in the late 1970s, based on Robyn Dawes (1962), “A note on base rates and psychometric efficiency”. Decades later, alarmed by the positive scan detection of an unexplained mass, I confided my fears to a psychiatrist friend. He did a quick differential diagnosis on bowel cancer, showing I had no relevant symptoms, and reminded me that I had lectured him as a student on base rates decades before, so I ought to relax. Indeed, it was a false positive.

Here are the relevant figures, set out in terms of natural frequencies:

Every test has a false positive rate (every step is being taken to reduce these), and when screening is used for entire populations many patients have to undergo further investigations, sometimes including surgery.

Setting out frequencies in a logical sequence can often prevent misunderstandings. Say a man on trial for having murdered his spouse has previously physically abused her. Should his previous history of abuse not be raised in Court because only 1 woman in 2500 cases of abuse is murdered by her abuser? Of course, whatever a defence lawyer may argue and a Court may accept, this is back to front. OJ Simpson was not on trial for spousal abuse, but for the murder of his former partner. The relevant question is: what is the probability that a man murdered his partner, given that she has been murdered and that he previously battered her?

Accepting the figures used by the defence lawyer, if 1 in 2500 women are murdered every year by their abusive male partners, how many women are murdered by men who did not previously abuse them? Using government figures that 5 women in 100,000 are murdered every year then putting everything onto the same 100,000 population, the frequencies look like this:
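Here is a sketch of that arithmetic in code. This is my own reconstruction of the frequency reasoning, and it makes one labelled assumption: that battered women are murdered by non-partners at roughly the general rate of 5 per 100,000.

```python
from fractions import Fraction

population = 100_000  # battered women followed for one year

# The defence's figure: 1 in 2500 battered women are murdered by their abuser.
murdered_by_partner = population // 2500   # 40 women

# Assumption: battered women are murdered by someone other than the abuser
# at the general murder rate of 5 per 100,000 women per year.
murdered_by_others = 5

# Given that a battered woman has been murdered, how likely is the abuser guilty?
p_partner_given_murdered = Fraction(murdered_by_partner,
                                    murdered_by_partner + murdered_by_others)
print(p_partner_given_murdered, float(p_partner_given_murdered))
```

On these figures, 40 of the 45 murdered battered women were killed by their abuser, so the probability is about 8 in 9, roughly 90% — the opposite of the impression the defence’s 1-in-2500 figure creates.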


Superior: the return of race science. Angela Saini. 4th Estate. London. 2019.


Excitedly promoted in national newspapers, glowingly reviewed in Sunday magazines, the author interviewed on national radio, this book is part of a mainstream narrative promoting the ascendant public stance: that race does not exist as a useful category, and that those who perversely study it have reprehensible motives.

Saini dedicates the book to her parents “the only ancestors I need to know”. This is touching, though a bit hard on her grandparents. The Prologue (page 3) explains her stance: “The key to understanding the meaning of race is understanding power. When you see how power has shaped the idea of race, and continues to shape it, how it affects even the scientific facts, everything finally begins to make sense”.

As Lenin said in 1921: “The whole question is—who will overtake whom?”

The author is an avowed anti-racist, and race, racism, anti-racism, and political groupings on the Right are prominent themes. The style of the book is engagingly discursive, a quick tour through selective history: Hitler gets 8 mentions; Stalin, Mao and Pol Pot none. The text is reference-free, and flows easily. Papers are listed at the end of the book, but not linked to the text, and claims cannot always be traced to references. Other popular genetics books have given references by page number. This is not a book to read if you want to learn about the genetics of intelligence, or about group differences in intelligence, about which there is surprisingly little.

Much of the book involves interviews with researchers, many of whom argue that there is no biological basis to race “except as social categories”. There is warm praise (page 89) for Montagu’s 1942 view that race “is based upon an arbitrary and superficial collection of external characters” and that “Individual variation within population groups, overlapping with other population groups, turned out to be so enormous that the boundaries of race made less and less sense”.

Amusingly for a book which understandably attacks the notion of racial purity, it tacitly champions ideological purity. As in the Da Vinci Code, it warns the innocent public about shadowy organizations promulgating foul ideas in tainted media, probably planning dreadful things in secret conclaves. Saini’s polemic assumes that if she can show that a person or their associates are right wing, then that invalidates their opinions. Her tone is unabashedly political. It is one thing to try to avoid being partisan, and fail; quite another to be resolutely partisan throughout, and to assume righteousness. Here is a selection of what Saini regards as strong arguments.

Saini (page 90) quotes Lewontin 1972 and concludes:

In total, around 90% of the variation lies roughly within the old racial categories, not between them. There has been at least one critique of Lewontin’s statistical method since then, but geneticists today overwhelmingly agree that although they may be able to use genomic data to roughly categorize people by the continent their ancestors came from (something we can often do equally well by sight), by far the biggest chunk of human genetic difference indeed lies within populations.

First, there is no detailed criticism of Lewontin’s argument. Second, it is admitted that DNA alone can confirm genetic groups, albeit “roughly”. In fact, it can be done with very high precision, as Tang et al. (2005) have shown. Third, Lewontin’s misleading conclusion is repeated, without saying that some genes acting together can have big effects, and that many other genes of minimal effect do not wash away actual differences. It would be like denying that Pygmies are short or that the fastest sprinters are usually West Africans.

Richard Dawkins gave an elegant summary:

However small the racial partition of the total variation may be, if such racial characteristics as there are highly correlate with other racial characteristics, they are by definition informative, and therefore of taxonomic significance.

A balanced account would have mentioned findings which changed the picture. Lewontin based his claims on blood type markers: about as advanced as it was possible to be in 1972, but hopeless for identifying genetic clustering, and therefore doomed to render a false negative. By 1975 the number of markers had increased sufficiently to discriminate easily between groups.

The issue was explained here:

A.W.F. Edwards (2003). Human genetic diversity: Lewontin’s fallacy. BioEssays 25: 798–801.

In popular articles that play down the genetical differences among human populations, it is often stated that about 85% of the total genetical variation is due to individual differences within populations and only 15% to differences between populations or ethnic groups. It has therefore been proposed that the division of Homo sapiens into these groups is not justified by the genetic data. This conclusion, due to R.C. Lewontin in 1972, is unwarranted because the argument ignores the fact that most of the information that distinguishes populations is hidden in the correlation structure of the data and not simply in the variation of the individual factors.
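Edwards’ point, that the information distinguishing populations sits in the aggregate of many individually weak markers, can be illustrated with a toy simulation. The locus count, allele frequencies and sample sizes below are invented for illustration:

```python
import random

random.seed(1)
LOCI = 100
N = 200                      # simulated individuals per population
FREQ_A, FREQ_B = 0.6, 0.4    # illustrative allele frequencies: at any single
                             # locus the two populations overlap heavily

def genome(freq):
    """Two allele draws per locus; each is 1 with probability `freq`."""
    return [int(random.random() < freq) for _ in range(2 * LOCI)]

def classify(g):
    """Aggregate over all loci: a total allele count above the 50/50
    midpoint (LOCI) points to population A. No single locus decides."""
    return "A" if sum(g) > LOCI else "B"

pop_a = [genome(FREQ_A) for _ in range(N)]
pop_b = [genome(FREQ_B) for _ in range(N)]
correct = (sum(classify(g) == "A" for g in pop_a)
           + sum(classify(g) == "B" for g in pop_b))
print(f"classification accuracy: {correct / (2 * N):.1%}")
```

Each locus on its own is a poor discriminator, yet summing across a hundred of them classifies nearly every individual correctly, which is the correlation-structure point Lewontin’s 85/15 partition ignores.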

Here is an account of more modern research, showing that even US census categories can be utilized for the purposes of genetic classification:

H. Tang et al. (2005). Genetic Structure, Self-Identified Race/Ethnicity, and Confounding in Case-Control Association Studies. Am J Hum Genet 76(2): 268–275.

Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity.

There is a near-perfect correspondence between genetic measures and the common US census racial labels, with a misclassification rate of only 14 per 10,000. Some of this is due to the admixed “other” category, but 9,986 in 10,000 subjects can master the art of looking in a mirror and noting which race they most resemble, a task beyond the wit of some academics.

R.A. Fisher made all this plain in 1925:

‘‘When a large number of individuals [of any kind of organism] are measured in respect of physical dimensions, weight, colour, density, etc., it is possible to describe with some accuracy the population of which our experience may be regarded as a sample. By this means it may be possible to distinguish it from other populations differing in their genetic origin, or in environmental circumstances. Thus local races may be very different as populations, although individuals may overlap in all characters;’’ R.A. Fisher (1925).

In summary, relying on Lewontin 1972 misrepresents current knowledge on genetics.

• Category: Science • Tags: Political Correctness, Race, Racism 

You can detect a lot about a person using simple tasks which take less than 2 minutes. Here is a test which did the job in 90 seconds, but then got lengthened to 120 seconds to make it even more reliable. Of this test, one of those Edinburgh researchers said to me in a conference coffee break: “it is better than a brain scan”. What did he mean?

In order to get an overall estimate of mental power, psychologists have chosen a series of tasks to represent some of the basic elements of problem solving. The selection is based on looking at the sorts of problems people have to solve in everyday life, with particular attention to learning at school and then taking up occupations with varying intellectual demands. Those tasks vary somewhat, though they have a core in common.

Most tests include Vocabulary, examples: either asking for the definition of words of increasing rarity; or the names of pictured objects or activities; or the synonyms or antonyms of words.

Most tests include Reasoning, examples: either determining which pattern best completes the missing cell in a matrix (like Raven’s Matrices); or putting in the word which completes a sequence; or finding the odd word out in a series.

Most tests include visualization of shapes, examples: determining the correspondence between a 3-D figure and alternative 2-D figures; determining the pattern of holes that would result from a sequence of folds and a punch through folded paper; determining which combinations of shapes are needed to fill a larger shape.

Most tests include episodic memory, examples: number of idea units recalled across two or three stories; number of words recalled from across 1 to 4 trials of a repeated word list; number of words recalled when presented with a stimulus term in a paired-associate learning task.

Most tests include a rather simple set of basic tasks called Processing Skills. They are rather humdrum activities, like checking for errors, applying simple codes, and checking for similarities or differences in word strings or line patterns. They may seem low grade, but they are necessary when we try to organise ourselves to carry out planned activities. They tend to decline with age, leading to patchy, unreliable performance, and a tendency to muddled and even harmful errors. When we lose our ability to do even simple tasks, then the care home beckons.

One of these simple tasks is called Coding. It is also known as Digit-Symbol. You are shown a code box in which every number from 0 to 9 has a symbol underneath. Your task is to go through a whole set of numbers, each with an empty box underneath, and fill in the appropriate code. So, you look at the first number in the sequence, look up at the code box to find the appropriate code for that number, and then draw it into the box. It is a dull task, like being a university teacher.
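As a toy sketch of how such a trial is scored: the real test uses printed nonsense symbols, so plain characters stand in for them here, and the item count and the assumption that the examinee completes 17 items before time is called are both invented:

```python
import random

# Hypothetical digit-to-symbol key (stand-in characters, not the
# genuine Wechsler symbols, which are small drawings).
CODE_KEY = {1: "-", 2: "|", 3: "v", 4: "L", 5: "U", 6: "O", 7: "^", 8: "X", 9: "="}

def score_coding(items, responses):
    # One raw-score point per correctly coded item; zip stops at the
    # last item the examinee reached before time ran out.
    return sum(CODE_KEY[digit] == r for digit, r in zip(items, responses))

random.seed(0)
items = [random.randint(1, 9) for _ in range(20)]   # the printed row of digits
responses = [CODE_KEY[d] for d in items[:17]]       # 17 items coded in time
print(score_coding(items, responses))               # -> 17
```

The raw score is simply the count of correctly coded items, which is why, as noted below, Digit Symbol behaves like a ratio scale: each extra symbol adds directly to the score.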

First, here is the overall picture which comes from Salthouse’s review of the data on ageing.

Localizing age-related individual differences in a hierarchical structure. Timothy A. Salthouse. Intelligence 32 (2004) 541–561

Data from 33 separate studies were combined to create an aggregate data set consisting of 16 cognitive variables and 6832 different individuals who ranged between 18 and 95 years of age. Analyses were conducted to determine where in a hierarchical structure of cognitive abilities individual differences associated with age, gender, education, and self-reported health could be localized. The results indicated that each type of individual difference characteristic exhibited a different pattern of influences within the hierarchical structure, and that aging was associated with four statistically distinct influences: negative influences on a second-order common factor and on first-order speed and memory factors, and a positive influence on a first-order vocabulary factor.

Personally, I am glad to see that Vocabulary holds up, but the rest fall sharply with age. To my eye the fall in perceptual speed is sharp, linear and with a pronounced slope. It plunges in a straight line. Here are the Wechsler standardisation data, and the overall findings from the meta-analysis.

This is what the task looks like.

Digit Symbol is such a simple task, almost free of any intellectual content, that it almost seems a measure of the brain’s clock speed, a clear indicator that old brains work more slowly than young ones. Given this finding, it is understandable that the coding task is a good predictor of how the person is ageing, and how well they are able to cope with the problems of everyday life. A brain scan, for all its apparent precision, is not a direct measure of actual performance. Currently, scans are not as accurate in predicting behaviour as is a simple test of behaviour. This is a simple but crucial point: so long as you are willing to conduct actual tests, you can get a good understanding of a person’s capacities even on a very brief examination of their performance.

The Digit Symbol test indicates current functioning. It does not assess previous levels of ability, as the Adult Reading Test does. (By the way, that test has absolutely nothing to do with regional accents. It is a test of whether you have learned how a specific set of irregularly written words is pronounced. For example, the word “Ache” has one pronunciation according to the usual rules of English, and another according to the quirky way in which English is actually written.)

Exactly where Digit Symbol comes in the hierarchy of abilities is not my concern here. I just want to explain that much maligned “paper and pencil” tests (and their computer and keyboard equivalents) can be very powerful indicators of important personal abilities, with very profound consequences.

There are several tests which have the benefit of being quick to administer and powerful in their predictions. Digit Symbol is useful because it is a ratio scale. Each additional symbol the person puts in adds directly to their raw score. Another good, and even quicker test is Trail Making B which has a strong loading on g and also on an orthogonal processing speed factor.

All these tests are good at picking up illness-related cognitive changes, as in diabetes. (Intelligence testing is rarely criticized when used in medical settings.) Delayed memory and working memory are both affected during diabetic crises. Digit Symbol is reduced during hypoglycaemia, as is Digits Backwards. Digit Symbol is very good at showing general cognitive changes from age 70 to 76. Again, although this is a limited time period in the elderly, the decline in speed is a notable feature.

Predictors of ageing-related decline across multiple cognitive functions.
Stuart J. Ritchie, Elliot M. Tucker-Drob, Simon R. Cox, Janie Corley, Dominika Dykiert,Paul Redmond, Alison Pattie, Adele M. Taylor, Ruth Sibbett, John M. Starr, Ian J. Deary.
Intelligence 59 (2016) 115–126.

• Category: Science • Tags: IQ, Psychometrics 
The Great Retrodiction: English speakers only

Science marches on. A researcher writes in to chide me that I have forgotten the fastest intelligence test of all, which masquerades as a simple reading test, but which can reach back 50 years, and in 90 seconds deliver a precise verdict on the best level of ability you had in your prime. Indeed, I had forgotten this test, despite recently using it in clinical practice. All this comes from Edinburgh, where Jean Brodie was in her prime, and where psychometry is now in its prime.

Picture the scene: the person being tested is handed a page with 50 words printed on it, and asked to read them aloud, one by one. All the examiner has to do is to note whether they have been pronounced correctly. And that’s it. It is called the National Adult Reading Test.

In his email to me the researcher gives estimates of the time taken:

“Three of the testers (between them they have given the NART several thousands of times), asked how long it takes to give the NART, replied: ‘an average of a minute and a half. Sometimes a minute, and sometimes three’.”

This is a quick test, and extremely powerful.

Now a word about William Caxton. Problem is, which word and how to spell it? Having brought his Flemish printing team over to Westminster, Caxton had to decide how to spell the uncouth English language, which was unkempt, various, regional, protean and quite the rising thing. He saw that the money was to be made by printing in English, and had to decide what English was likely to be understood. Even with Chancery Standard to guide him (craftily, he placed his printing press next door to the national centre for official document production) he had to make decisions about English. It is said of Caxton that he fixed written English before it had actually “reached a consensus”. I digress, but it is a feature of English that she is not wrote as she is spoke. In this peculiarity lies an informative isotope: children have to learn how to spell, and in doing so learn how they should pronounce what they read.

What dreadful traps lie in wait for those multitudes who have not won the lottery of life by being born British? I and my brothers, despite English schooling, on coming to England had difficulty with idiosyncratic spellings and with the pronunciation of place names. One of us spoke of “Leicester Square” as “Lay-ses-ter”, not the absurdly correct “Les-ter”. Equally, we pronounced “mortgage” as “Mort-gage”, not as the approved “Mor-gage”. Why was the t silent? Yes, I know it is a death pledge, as in Morte d’Arthur, and yes, one third of English is French. I blame someone, and Caxton will do.

Perhaps the ability to learn these absurd peculiarities of English is an intelligence test. It is certainly a burden on memory and learning, probably not as onerous as kanji, but a demanding task anyway. If this unremarked school-age skill measures speed and power of learning, then will it fade with age? Why not find out? Take some children who were tested for intelligence aged 11, and test them again in old age. Then, test them on the “reading/pronunciation test”. Then, compare their current and youthful intelligence scores with the estimate derived from the reading test.

If you want just a very quick summary: A short test of pronunciation—the NART—and brief educational information can capture well over half of the variation in IQ scores obtained 66 years earlier. The NART correlates 0.66 with the Moray House intelligence test given at age 11. A 66-year follow-up is a Foxtrot Oscar follow-up.

These are big claims, so if you want a little more detail, here are three relevant papers in support of them:

J. R. Crawford, I. J. Deary, J. Starr, L. J. Whalley (2001). The NART as an index of prior intellectual functioning: a retrospective validity study covering a 66-year interval. Psychological Medicine, 31, 451–458.

Background. The National Adult Reading Test (NART) is widely used in research and clinical practice as an estimate of pre-morbid or prior ability. However, most of the evidence on the NART’s validity as a measure of prior intellectual ability is based on concurrent administration of the NART and an IQ measure.

Method. We followed up 179 individuals who had taken an IQ test (the Moray House Test) at age 11 and administered the NART and the Mini-Mental State Examination (MMSE) at age 77. A subset (N = 97) were also re-administered the original IQ test.

Results. The correlation between NART performance at age 77 and IQ at age 11 was high and statistically significant (r = 0.73; p < .001). This correlation was comparable to the correlation between NART and current IQ, and childhood IQ and current IQ, despite the shared influences on the latter variable pairings. The NART had a significant correlation with the MMSE, but this correlation fell to near zero (r = .02) after partialling out the influence of childhood IQ.

Discussion. The pattern of results provides strong support for the claim that the NART primarily indexes prior (rather than current) intellectual ability.
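The partialling step reported in the Results is the standard first-order partial correlation. The sketch below uses the reported NART-childhood IQ figure of 0.73 plus two invented correlations of plausible size, to show how a raw association can fall to near zero once childhood IQ is controlled:

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Illustrative values: only the NART-IQ11 figure is from the paper;
# the other two correlations are hypothetical.
r_nart_mmse = 0.25   # raw NART-MMSE association (invented)
r_nart_iq11 = 0.73   # NART with age-11 IQ (the reported figure)
r_mmse_iq11 = 0.34   # MMSE with age-11 IQ (invented)

print(round(partial_r(r_nart_mmse, r_nart_iq11, r_mmse_iq11), 3))  # -> 0.003
```

With correlations of this general size, the whole of the raw NART-MMSE association is carried by childhood IQ, which is the pattern the authors report.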

The correlation of 0.73 is depressed by the fact that only the brighter subjects survived the 66 years after taking the test, so there is a restriction of range; when one allows for that, the underlying correlation is 0.78.
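The adjustment for restriction of range is the standard Thorndike Case 2 correction. In the sketch below the standard deviation ratio of 1.17 is back-solved so that the reported figures are reproduced; it is not a value taken from the paper:

```python
from math import sqrt

def correct_restriction(r, u):
    """Thorndike Case 2 correction: r is observed in the restricted sample,
    u is the ratio of the unrestricted to the restricted standard deviation."""
    return r * u / sqrt(1 - r**2 + (r * u)**2)

r_observed = 0.73
sd_ratio = 1.17   # assumed; chosen to recover the corrected value in the text
print(round(correct_restriction(r_observed, sd_ratio), 2))   # -> 0.78
```

A modest amount of range restriction (survivors about 17% less variable than the original cohort) is enough to move 0.73 up to 0.78.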

The pronunciation test survives even as dementia sets in:

Pronunciation of irregular words is preserved in dementia, validating premorbid IQ estimation

B. McGurn, MB, ChB; J.M. Starr, FRCPEd; J.A. Topfer, BA, MSc; A. Pattie, BSc; M.C. Whiteman, PhD; H.A. Lemmon, MA; L.J. Whalley, MD; and I.J. Deary, PhD

NEUROLOGY 2004; 62:1184–1186

The National Adult Reading Test (NART), used to estimate premorbid mental ability, involves pronunciation of irregular words. The authors demonstrate that, after controlling for age 11 IQ test scores, mean NART scores do not differ in people with and without dementia. The correlation between age 11 IQ and NART scores at about age 80 was similar in the groups with (r=0.63, p < 0.001) and without (r=0.60, p< 0.001) dementia. These findings validate the NART as an estimator of premorbid ability in mild to moderate dementia.

Clearly, intelligence runs through behaviour like carbon through chemistry. Those liable to dementia are those who were lower in intelligence at age 11. Although all the more recent scores are lower in the dementing group, the drop in the NART is explicable by the initial differences in intelligence. Once that is controlled for, it retains its predictive power.

If you did not have any access to a person’s intelligence score at age 11 (which in clinical practice is the case for almost all your patients), then you would be guided mostly by the Mini-Mental State Examination. It is not affected by previous levels, because it is a broad measure of current functioning.

The NART might make you under-estimate the person’s original ability, and the extent to which they had fallen from previous levels. However, the authors say:

• Category: Science • Tags: IQ, Psychometrics 
James Thompson
About James Thompson

James Thompson has lectured in Psychology at the University of London all his working life. His first publication and conference presentation was a critique of Jensen’s 1969 paper, with Arthur Jensen in the audience. He also taught Arthur how to use an English public telephone. Many topics have taken up his attention since then, but mostly he comments on intelligence research.