The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media

James Thompson Blogview


I may be too trusting, but I generally accept upgrades. Several months ago, I willingly accepted an iPhone operating system upgrade, and lost all the Notes I had stored on my phone. These notes contained bank and credit card details, passport details, and other useful things which I have to consult from time to time, mostly when travelling. The real eye-opener is that I had stored these notes on my phone rather than the cloud, assuming they were more secure and more private because they were restricted to the hardware in my pocket, mine and mine alone. Not so. I was taught a lesson: Apple has the keys to what is in effect my portable office, and can destroy my arrangements at will, or by mere insouciance. They can decide what is best for me.

We are now in the public discovery phase of examining why two new planes have fallen out of the sky, with pilots struggling to stop them diving into the ground. US pilots reported the problem anonymously (as shown above), and the inadequacy of the manual and training was already known. The crashes have happened to foreign airlines, but an unknown risk has been revealed for all passengers to see.

Thank you for the comments on my previous post, particularly those which have found additional material from other aviation sources, and gone into the history of the development of the 737 series. Thanks also for the videos on the general principles of flight. General principles are the foundations of understanding.

I think I was probably looking at aviation websites in November, just after the Lion Air crash on 29 October, and formed the opinion that there was something wrong with the anti-stall system, and told people about it. I might have told anyone willing to listen in November, but I know I discussed this with a test pilot on 22 December 2018. We both recall the discussion, and family members who were present remember the basic points being made. Philip Tetlock will tell you, absolutely correctly, that predictions have to be as specific as possible before they can even be assessed. So, further disclosure: I think I argued the case solely on air-speed indicators, not angle of attack indicators, and did not know or did not include anything about the design change history of the 737 Max series, simply that the Lion Air crash suggested an anti-stall system problem.

This story has it all: the complexities of operator/machine interfaces (mostly a cognitive issue), the intricacies of modern aircraft (mostly a scientific issue with some cognitive aspects) and the compromises involved in the aircraft industry, concerning safety, operating and training costs, and competition between manufacturers (economic and political issues).

My focus is on the cognitive task of flying a plane, and forming an understanding of how systems work and how they must be managed in emergencies. I am also interested in the cognitive aspects of maintaining a plane, fault reporting and correcting. Psychology has a part to play in the discussion of cognitive tasks. For example, what is the natural thing to do when, shortly after take-off, a plane starts diving into the ground? Read a manual? Recall from memory, as the plane lurches ever downwards, what needs to be done? Call to mind the checklist of tasks required to disengage a system which unknown to you has been fooled by an unreliable angle-of-attack indicator? My view is that a cockpit is no place for badly designed IQ test items. Systems have to be adapted to human information processing limitations, and must fit in with startle responses and standard pilot reactions and conventions.

Using James Reason’s explanatory framework (Human Error, 1989), pilots flying the Boeing 737 Max 8 and encountering the opaque workings of MCAS (Manoeuvring Characteristics Augmentation System) are carrying out intentional but mistaken actions: they are trying to pull a plane out of a dive. The plane is in fact climbing away from an airport after takeoff, but a failure in an angle of attack indicator has convinced MCAS that it is in a stall condition. (For extra money, you can buy a second angle of attack indicator, and apparently these two airlines did not do so. For safety, two should be standard at no extra cost.) Accordingly, MCAS puts the nose of the plane down to avoid the stall. The pilot reacts by pulling back the yoke so as to resume upward flight, cognizant of the plain fact that unless he can gain height he is going to die, together with his passengers. His action satisfies MCAS for a short while, and then it comes in again, helpfully trying to prevent a stall (because pulling on the yoke is not enough: the whole tail plane has to be “trimmed” into the proper angle). Pilots are doing what comes naturally to them.

MCAS is diligently doing as instructed, but is badly designed, relying as it does in this case on a single indicator, rather than two which could identify and resolve discrepancies, and has no common sense about the overall circumstances of the plane. The pilots know that they have just taken off. MCAS, as far as I know, does not “know” that. Again, as far as I know, MCAS does not know even what height the plane is at. (I know that this is not real Artificial Intelligence, but I used it as an illustration of some of the problems which may arise from AI in transport uses). The pilots respond with “strong-but-wrong” actions (which would be perfectly correct in most circumstances) and MCAS persists with “right-but-wrong” actions because of a severely restricted range of inputs and contextual understanding. Chillingly, it augments a sensor error into a fatal failure. A second sensor and much more training could reduce the impact of this problem, but the inherent instability of the engine/wing configuration remains.
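The difference between acting on a single indicator and cross-checking two can be sketched in a few lines of Python. This is purely an illustration of the general principle, not a description of how MCAS is actually programmed: the function names and both threshold values are hypothetical, chosen only to make the example run.

```python
# Hypothetical illustration values, NOT real aircraft figures.
STALL_AOA_DEG = 14.0          # assumed angle of attack treated as "stalling"
MAX_DISAGREEMENT_DEG = 5.0    # assumed largest tolerable gap between two sensors


def single_sensor_command(aoa_deg: float) -> bool:
    """Return True if nose-down trim would be commanded from one reading alone."""
    return aoa_deg > STALL_AOA_DEG


def cross_checked_command(aoa_left_deg: float, aoa_right_deg: float) -> bool:
    """Command nose-down trim only if two sensors agree and both indicate a stall."""
    if abs(aoa_left_deg - aoa_right_deg) > MAX_DISAGREEMENT_DEG:
        # The sensors disagree, so at least one is unreliable:
        # inhibit the automation and leave the decision to the pilots.
        return False
    return min(aoa_left_deg, aoa_right_deg) > STALL_AOA_DEG


# A failed left sensor reads 40 degrees while the right one reads a normal 5.
faulty_left, healthy_right = 40.0, 5.0
print(single_sensor_command(faulty_left))                   # acts on the bad reading
print(cross_checked_command(faulty_left, healthy_right))    # declines to act
```

With only one input, the system has no way of telling a real stall from a broken sensor. With two, a large disagreement is itself information, and the sketch uses it as a reason to stand down rather than push the nose over.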

Using Reason’s GEMS system, the pilots made no level 1 slips or lapses in piloting. They had followed the correct procedures and got the plane off the ground properly (once or twice a pilot forgets to put the flaps down at take-off or the wheels down at landing). I think they made no level 2 rule-based errors, because their rule-based reactions were reasonable: they considered the local state information and tried to follow a reasonable rule: avoid crashing into the ground by trying to gain height. They could be accused of a level 3 error: a knowledge-based mistake, but the relevant knowledge was not made available to them. They may have tried to problem-solve by finding a higher level analogy (hard to guess at this, but something like “we have unreliable indicators” or “we have triggered something bad in the autopilot function”) but then they must revert to a mental model of the problem, and think about abstract relations between structure and function, inferring a diagnosis, formulating corrective actions and testing them out. What would that knowledge-based approach entail? Either remembering exactly what should be done in this rare circumstance, or finding the correct page in the manuals to deal with it. Very hard to do when the plane keeps wanting to crash down for unknown reasons shortly after take-off. Somewhat easier when it happens at high altitudes in level flight.

• Category: Economics, Science • Tags: AI, Airlines, Boeing 

Conventional wisdom is that it is too early to speculate why in the past six months two Boeing 737 Max 8 planes have gone down shortly after take off, so if all that follows is wrong you will know it very quickly. Last night I predicted that the first withdrawals of the plane would happen within two days, and this morning China withdrew it. So far, so good. (Indonesia followed a few hours ago).

Why should I stick my neck out with further predictions? First, because we must speculate the moment something goes wrong. It is natural, right and proper to note errors and try to correct them. (The authorities are always against “wild” speculation, and I would be in agreement with that if they had an a priori definition of wildness.) Second, because putting forward hypotheses may help others test them (if they are not already doing so). Third, because if the hypotheses turn out to be wrong, it will indicate an error in reasoning, and will be an example worth studying in psychology, so often dourly drawn to human fallibility. Charmingly, an error in my reasoning might even illuminate an error that a pilot might make, if poorly trained, sleep-deprived and inattentive.

I think the problem is that the Boeing anti-stall patch MCAS is poorly configured for pilot use: it is not intuitive, and opaque in its consequences.

By way of full disclosure, I have held my opinion since the first Lion Air crash in October, and ran it past a test pilot who, while not responsible for a single word here, did not argue against it. He suggested that MCAS characteristics should have been in a special directive and drawn to the attention of pilots.

I am normally a fan of Boeing. I have flown Boeing more than any other plane, and that might make me loyal to the brand. Even more powerfully, I thought they were correct to carry on with the joystick yoke, and that Airbus was wrong to drop it, simply because the position of the joystick is something visible to pilot and co-pilot, whereas the Airbus side stick does not show you at a glance how high the nose of the plane is pointing.

Pilots are bright people, but they must never be set a badly configured test item with tight time limits and potentially fatal outcomes.

The Air France 447 crash had several ingredients, but one was that the pilots of the Airbus A330-203 took too long to work out they were in a stall. In fact, that realization only hit them very shortly before they hit the ocean. Whatever the limitations of the crew (sleep deprived captain, uncertain co-pilot) they were blinded by a frozen Pitot air speed indicator, and an inability to set the right angle of attack for their airspeed.

For the industry, the first step was to fit better air speed indicators which were less likely to ice up. However, it was clear that better stall warning and protection was required.

Boeing had a problem with fitting larger and heavier engines to their tried and trusted 737 configuration, meaning that the engines had to be higher on the wing and a little forwards, and that made the 737 Max have different performance characteristics, which in turn led to the need for an anti-stall patch to be put into the control systems.

It is said that generals always fight the last war. Safety officials correct the last problem, as they must. However, sometimes a safety system has unintended consequences.

The crux of the matter is that pilots fly normal 737s every day, and have internalized a mental model of how that plane operates. Pilots probably actually read manuals, and safety directives, and practice for rare events. However, I bet that what they know best is how a plane actually operates most of the time. (I am adjusting to a new car, same manufacturer and model as the last one, but the 9 years of habit are still often stronger than the manual-led actions required by the new configuration.) When they fly a 737 Max there is a bit of software in the system which detects stall conditions and corrects them automatically. The pilots should know that, they should adjust to that, they should know that they must switch off that system if it seems to be getting in the way, but all that may be steps too far, when something so important is so opaque.

What is interesting is that in emergencies people rely on their most validated mental models: residents fleeing a burning building tend to go out their usual exits, not even the nearest or safest exit. Pilots are used to pulling the nose up and pushing it down, to adding power and to easing back on it, and when a system takes over some of those decisions, they need to know about it.

After Lion Air I believed that pilots had been warned about the system, but had not paid sufficient attention to its admittedly complicated characteristics, but now it is claimed that the system was not in the training manual anyway. It was deemed a safety system that pilots did not need to know about.

This farrago has an unintended consequence, in that it may be a warning about artificial intelligence. Boeing may have rated the correction factor as too simple to merit human attention, something required mainly to correct a small difference in pitch characteristics unlikely to be encountered in most commercial flying, which is kept as smooth as possible for passenger comfort.

It would be terrible if an apparently small change in automated safety systems designed to avoid a stall turned out to have given us a rogue plane, killing us to make us safe.

• Category: Economics, Science • Tags: AI, Airlines, Boeing 
Would you sincerely like to be famous?

Donald Trump was the real star, and everyone wanted selfies with him.

Last night, in a break with usual stay-at-home custom, I went from my monastic cell out into the glittering evening parade of London’s West End. All the world is there, plus food and entertainment.

Leicester Square Theatre is not, as the name proudly suggests, on Leicester Square (therefore fake) but on a side alley, in the best tradition of the off-beat lanes and pathways of theatricality, as befits the scruffy comradeship of the precarious acting profession. The bar was full of supposedly famous people, the talk spontaneous and embracing, immediate friends together at first sight. Several people in the small bar looked like someone else. Apparently, with sufficient work, I would resemble someone, but since I did not know the named actor the point was lost on me. The crowd outside was judged too big to be allowed into the crowded room, and was let in as space became available. First night drama.

The show (first night of a three-day run) started late, perhaps because it was still being put together. It began with Alison Jackson impersonators, each of whom came out with striking blond hair, glamorous black-clad and booted long legs and lithe bodies to announce that they were Alison Jackson. Point taken. We can have personal identities yet be one of a type. Few of us are outstandingly unique. The glamorous have imitators. The beautiful lead fashions. Nobodies want to rise to Somebodies. Who wouldn’t want to be famous? Who is real any more?

The real Alison Jackson, if that is who she eventually was, gave an illustrated and animated lecture about celebrity and fame, the main thesis being that Art had outfaked Life, and it was impossible to know what was fake or real any more. Fake news, fake facts, and fake people.

Her artistic history was based on a simple, fundamental cultural event: the mass mourning of Princess Diana by millions who never knew her, but who recognized her image. That image was the message, and the tears fell. Mine too. Of course, her image was carefully tended, and her clothes even more so. She was the People’s Princess, in a phrase provided by Alastair Campbell for Tony Blair.

Days after her death, interviewed for television about this oceanic public grief for Diana, I explained how those who thought they knew her were not entirely wrong. See someone’s image often enough you get to know them. Face recognition is powerful. TV shows you the living person, their expressions, momentary reactions, movements, mannerisms, tones of voice, and their major public life events. They become friends in a parallel life, a set of milestones against which other women’s lives can be compared: engagement, marriage, children, problems, separations, divorce, and what next?

While the camera crew wrapped up, the woman interviewer said that she still could not understand the grief, and did not feel it. I enquired of her, did she think that all the headlines about their royal romance should have been: Nanny makes good?

Jackson dares to create the pictures which support that sort of headline, merely on the basis that having had that thought, she realizes that others might have thought it, and believes that such an image should be created to challenge the prevailing images. In the jargon, Jackson de-constructed the images of Diana, and got hated for it. Is there a reality? I believe there is, despite manipulations, and even in the anything-goes milieu of celebrity.

In face-to-face conversation (reported by a person I esteem) Diana was lovely, smart, kind, and at home with other mothers and children. No fool. Calm and friendly. Of course, she also had her private life with lovers and admirers. The book on her private life came out the day after her public engagement, but despite her part in it she gave no hint that it was coming.

Diana’s death launched Alison Jackson on a 20-year examination of our fascination with celebrity and the images that fan the flames of adulation. Jackson shows celebrity’s imagined private lives, always as shabby as one would suspect them to be, often grossly so. She is understandably drawn to fame, and the Royal Family (“the firm”) in particular. Royalty is a brand which goes back a long time. Her pictures of the Royals’ home life are designed to take them down a peg or three.

Do we really want to know that they are like us? Apparently so. Royalty’s crime is to beguile us with their poses, and Jackson’s pics reveal the Royals know exactly what tricks they are up to when they claim to be King and Queen, and many believe them.

Jackson searches for look-alikes, often taking years to find the right one. The three-night theatre show was partly a set of auditions. We helped with our applause to choose the contestants worthy of getting a make-over, and they were filmed backstage in the process of celebrity transformation. Then they came out, looking vaguely right, and Jackson shot multiple photos as they enacted tableaux. On the screen many of the shots were great, just right, and very passable imitations.

On stage they were more clearly look-alikes. Photography, she has said, is ‘a slimy deceitful medium’, which ‘tells only a partial truth’. She publishes one in a thousand of the shots she takes. Interesting to find out whether the original image-makers also pick so few of the many pictures they shoot. Probably so.

Her approach is in the tradition of Hogarth, with added bile. Vulgarity rules, and also degrades the status of the pretentious. Satire is a political weapon and Alison Jackson is the Queen of sedition.

As proof of the refined nature of the audience, impressionist and comedian Rory Bremner was asked up from his seat to comment on celebrity, and he gave us an auditory play within Jackson’s larger visual play. His apparently effortless mimicry made his unseen characters parade before us, the voices mocking one fallible politician after another. He explained that in his youth celebrities like Elizabeth Taylor and Richard Burton seemed at a great distance from ordinary mortals. They truly were the gods, never to be met in real life. Now celebrity has become commonplace, and images are cheap and ubiquitous, yet there is still a pecking order, and power still commands a following. His account of being in character in public places (after hours of makeup) as politicians he disliked was that he feared assault even as he concentrated on imitating them correctly. We certainly got our money’s worth last night.

• Category: Culture/Society • Tags: Celebrity, Donald Trump 

It is very unlikely that even if I continue my blog for decades, it will ever have the impact of Stephen Jay Gould’s (1981) “The Mismeasure of Man”. It was a best seller, cited in the academic literature over 10,000 times, and even 445 times in 2017 alone. It continues to meet an audience need.

Why was it so popular? I read it and found that it was written in a very engaging way. In my view Gould had an excellent prose style. I enjoyed his essays. His book attacked intelligence tests, which had fallen in popularity, and had come to be seen as a Bad Thing. Intelligence testing had originally been seen as a very good thing, providing opportunities to bright young children who could not afford fancy schools, but who deserved the opportunity of good quality education and employment. Intelligence tests were meritocratic, not aristocratic. You could not fool them with specific knowledge derived from private tuition. They were the great levellers. Although it is hardly relevant to their actual veracity, they were warmly received by the political Left, who saw in these assessments a vindication of the working-class talents which had been suppressed by private education.

Why was it that SJ Gould had such an impact when he argued that the tests were biased against working class and minority racial groups? Moreover, how did his views ever take hold when the issue of bias in intelligence testing had just been comprehensively evaluated in Arthur Jensen’s (1980) “Bias in Mental Testing”? Jensen showed that, far from under-predicting African-American achievements, the tests perhaps slightly over-predicted them. I presume that Jensen’s volume was less often read, though it was written by an expert, not a polemicist. Perhaps precisely because it was written by an expert, in a restrained and far from folksy style, it had less impact on popular culture, which is what tends to determine public debates.

I leave the full explanation to others, but I think that a good prose style, no equations, few numbers and little in the way of statistical and logical arguments generally increases readership. That would be predicted by the bell curve, which makes it plain that technical books about difficult subjects are a minority interest.

Gould’s book made a number of assertions. Two that stuck in people’s minds were: that measures of brain size derived from the study of skulls of different races had been biased, and that many items on the Army tests of intelligence were culturally biased.

The debate about the ancient skulls has raged to and fro for a long time, but it seems highly probable that the measures were taken correctly.

Now the redoubtable Russell Warne has taken a detailed look at what Gould said about the Army Beta test, and finds that on that topic he has been unreliable and incorrect.

A number of points:

Face validity. Sure, it helps if a test item looks relevant to the job you are applying for. However, a test item may have high predictive value without seeming to. This is the famous “indifference of the indicator” dictum. If it predicts, use it. Furthermore, you cannot dismiss an item simply because you yourself can think of a way in which it might be misinterpreted, as Gould did. You need to show that such misinterpretations actually exist (and compare them with the misinterpretations which arise on items which seem fine to you).

Testees were not baffled by the use of numbers, in the sense of digits, as Gould implied. All language speakers had knowledge of digits because they had had some years of education.

Gould twists things. His reading of the instructions was that the men would be “scared shitless” whereas an officer who had actually done the testing wrote later: “It was touching to see the intense effort put into answering the questions, often by men who never before had held a pencil in their hands”. A shade different, don’t you think?

Gould claimed that “vast numbers of men” earned zero scores, and therefore, must not have been able to understand the Army Beta test instructions and/or stimuli. However, only 4% scored less than 10 in total, and only 2.6% of testees scored less than 5 points. Gould neglected to point out that the standard procedure in the Army was that the low scorers were then individually tested on the Stanford Binet to give them yet another chance to do well.

Gould reports an officer’s unfavourable view of the testing, but does not show that 13 other officers were favourable.

Gould criticised the short time limits on some subtests, saying they also were too short for his biology students, on whom he used the test (see later). Warne politely explains that short time limits on process tasks are required because otherwise they are too easy, and discriminate poorly. Short time limits are a good feature, not a bug. (This is a common misunderstanding. See Hyde on sex differences in the speed of completion of tasks).

Gould criticised the Beta test, saying that poor testing circumstances meant that it could not be considered a test of innate intelligence. He failed to tell his readers test-constructor Boring’s opinion that the tests had predictive value. Also, the test creators rarely mentioned “innate intelligence”. They simply found that test results helped them predict who would do well on the tasks the army required, which was the whole purpose of testing.

The test creators believed that different levels of education were likely to have influenced performance on the test, as was immigrant status. Gould cast Yerkes as dismissing that factor, when in fact Yerkes discussed it and correctly said that a correlation between years spent in the US and higher test scores showed an acculturation effect, but did not identify a cause.

Gould also downplayed the work done on establishing the validity of the Army tests. Scores on the Army Beta correlated positively with scores on other intelligence tests, including the Army Alpha (r = 0.811) and the Stanford-Binet (r = 0.727), both the “gold standard” of intelligence measurement at the time ([15], p. 634). Army Beta scores also correlated positively with external criteria, such as the number of years of schooling a recruit had (both as children and adults), commanding officers’ ratings of soldiers’ job performance, and army rank.
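One conventional way to get a feel for correlations of this size is to square them, which gives the proportion of variance two measures share. The short Python sketch below does that arithmetic for the two figures quoted above. The squaring step is a standard statistical reading added here for illustration; it is not a calculation reported in the passage, and the names used are hypothetical.

```python
# Correlations between the Army Beta and two other tests, as quoted in the text.
correlations = {"Army Alpha": 0.811, "Stanford-Binet": 0.727}


def shared_variance(r: float) -> float:
    """Proportion of variance shared by two measures that correlate at r."""
    return r * r


shared = {name: round(shared_variance(r), 3) for name, r in correlations.items()}

for name, value in shared.items():
    print(f"{name}: r^2 = {value}")
```

On this reading, roughly two thirds of the variance in Army Beta scores is shared with the Army Alpha, and a little over half with the Stanford-Binet.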

After all this, I would have regarded Gould as having given an unfair account of the test, and left it at that, job done. Warne, perhaps prisoner of the American work ethic, has gone further. He gave the Beta test to his students, and also pre-registered his expectations. This is excellent. Instead of getting the results and saying “I told you so” he puts his prior assumptions up for examination. If only Gould had done that.

For me the most interesting result is that the test picks up what looks like a confirmation of a secular trend. As more people get to go to college, scores go down and more resemble the average in the general population from which students are selected.

Warne says:

Given these results from our replication, it seems that Gould’s criticism of time limits and his argument that the Army Beta did not measure intelligence are without basis. Despite the short time limits for each Army Beta subtest, the results of this replication support the World War I psychologists’ belief that the Army Beta measured intelligence. We demonstrated this in the following four relevant results of our replication:

Here is my summary of those four results.

50 years on

Philanthropy is a fine thing. A good sum of money put in the right place can benefit many people. Commerce is also a fine thing. A small sum of money put in the right place can create goods and services which people want, which can lead to profit which leads to more money being available to create goods and services. Virtuous circle. A man who dies rich is not disgraced. He dies in grace if his companies outlive him, and continue to provide things that people want.

This leads to an interesting question: did Bill Gates do more good for the world by founding Microsoft or by founding the Gates Foundation? Probably the former, I would estimate. I say that without being a fan of Microsoft’s products, which have often exasperated me, but just as a cool calculation about the long-term impact of readily accessible business and household programming power, which made computation accessible to billions of people. Tim Berners-Lee and Vinton Cerf could claim greater impact, and with consummate flair Steve Jobs packaged components into the right combination for the ultimate portable communication device (he knew our limitations), but much earlier than that Microsoft had turbo-charged the computer revolution, and pushed Apple aside in the business world, by a country mile.

Now Bill Gates is doing good works, and why not? His 2019 letter is just out.

It deals with 9 topics: Africa being the youngest continent (fastest growing population); DNA testing might prevent premature births (but they may be due to racism); the world’s building stock may double by 2060 (global warming); data may be sexist (not enough suitable data collected on women); helping teenage delinquents cope with their anger; a nationalist case for globalism; flush toilets (sanitation world-wide); textbooks go digital; mobile phones help poor women.

Frankly, apart from Bill’s day with teenage miscreants, there is little about education in this letter.

In fact, the education stuff is in his 2018 letter.

We made education the focus of our work in the United States because it is the key to a prosperous future, for individuals and the country. Unfortunately, although there’s been some progress over the past decade, America’s public schools are still falling short on important metrics, especially college completion. And the statistics are even worse for disadvantaged students.

To help raise those graduation rates, we supported hundreds of new secondary schools. Many of them have better achievement and graduation rates than the ones they replaced or complemented. Early on, we also supported efforts to transform low-performing schools into better ones. This is one of the toughest challenges in education. One thing we learned is that it’s extremely hard to transform low-performing schools; overall they didn’t perform as well as newly created schools. We also helped the education sector learn more about what makes a school highly effective. Strong leadership, proven instructional practices, a healthy school culture, and high expectations are all key.

We have also worked with districts across the country to help them improve the quality of teaching. This effort helped educators understand how to observe teachers, rate their performance fairly, and give them feedback they can act on. But we haven’t seen the large impact we had hoped for. For any new approach to take off, you need three things. First you have to run a pilot project showing that the approach works. Then the work has to sustain itself. Finally, the approach has to spread to other places.

How did our teacher effectiveness work do on these three tests? Its effect on students’ learning was mixed, in part because the pilot feedback systems were implemented differently in each place. The new systems were maintained in some places, such as Memphis, but not in others. And although most educators agree that teachers deserve more-useful feedback, not enough districts are making the necessary investments and systemic changes to deliver it.

To get widely adopted, an idea has to work for schools in a huge variety of settings: urban and rural, high-income and low-income, and so on. It also has to overcome the status quo. America’s schools are, by design, not a top-down system. To make significant change, you have to build consensus among a wide range of decision makers, including state governments, local school boards, administrators, teachers, and parents.

Melinda Gates said:

When economists describe the conditions under which countries prosper, one of the factors they stress is “human capital,” which is another way of saying that the future depends on young people’s access to high-quality health and education services. Health and education are the twin engines of economic growth.

Human capital can also refer to how bright people are, given only reasonable health and education. The phrase is often used as a coy way of commenting on the quality of the people. Boosting health and education gives early gains which plateau pretty fast. The first $5000 has a big effect, but at about $15000 not many more gains are found.

The Economist says:

Some [problems] require the exercise of ingenuity and discretion by small teams (eg, inventing a new vaccine); some demand the programmatic mobilisation of legions of people (immunisation drives). Others require both.

Improving education falls into this third, difficult category. It is not a problem that a small team of brilliant people can crack. Nor can a good education be delivered, like a vaccine, by following a strict protocol to the letter. Instead it requires legions of teachers to respond thoughtfully and conscientiously to pupils’ needs. Mr Gates left his BAM (Becoming a Man) circle wishing every classroom could emulate its intimacy and respectfulness. But that is hard to bottle.

Well, The Economist is championing a very traditional view. Some people have proposed brilliant short cuts to learning, and some of them might work, although most of them haven’t.

Doug Detterman tracked these intelligence-boosting notions for over 50 years, and found them a perpetual disappointment.

Others propose more pedestrian and strict protocols followed to the letter, because those have traditionally worked throughout the ages, mixed with rewards and punishments. A very well thought out sequence of instruction should be instructive to the average pupil. Doing standard teaching well has much to commend it. However, it does not annul individual differences.

I do not rate The Economist as a good source on the question of intelligence and the effects of early education:

Consider the unwieldiness and impracticality of “legions of teachers to respond thoughtfully and conscientiously to pupils’ needs”. This is a prescription for schools being a cottage industry providing Savile Row suits for every shape and size of intellect. Really? Are reading, writing and arithmetic so idiosyncratic that instruction must be tailored to each individual? It is like saying that every computing problem is different, and must have its own operating system. Surely some instructions can be grasped by most students?

Bill Gates is a practical man, and is working with the old system, though new schools seem to be giving better results. Do these new schools use different techniques, different teachers or different students?

• Category: Science • Tags: Arthur Jensen, Bill Gates, Education, IQ 

Newspapers have very warmly received an international project which, in the authors’ view, strongly suggests that healthy babies are all alike in their developmental milestones, at least as determined by a study of particular centres in different parts of the world.

The study has the following general features: Find healthy pregnant women in several different comfortable parts of the world and then check whether the development of their children is the same or different between these centres. If the same, argue that race cannot be an explanation for differences between continental groups, since once they are equalized for health, child developmental differences disappear. This could well be true, so the excitement generated by the updated findings is understandable.

Newspapers are hardly to blame for reporting this study in glowing terms. The authors are bold enough to say:

It is evident that across developmental and growth parameters, only a very small percentage (around 10%) of the total variance in these fundamental human functions can be explained by differences among these populations (Fig. 3). The present results and previous publications, presented together in Fig. 3, support the position that most of the observed differences in growth and neurodevelopment across general populations or countries are primarily due to socioeconomic, educational and class disparities, i.e. postal codes define the health profiles of humans better than their genetic code.

For completeness, here is Fig 3

As regards differences between the peoples of different continents, the authors argue there is nothing much to see here, particularly on cognitive abilities, though there may be something happening with children’s behaviour. However, the authors suggest the behaviour difference is because of cultural differences in how people rate behaviour, not because children actually behave differently. Odd, because the authors were trying to ensure standard procedures were used across different sites, so as to be able to make valid statements about differences and similarities. You would have thought they would have ironed these things out in this large and long-term program of work. Anyway, for whatever reason, negative behaviours and emotional reactions vary between sites. Some kids seem to be more of a nuisance in some places.

You may see that Fig 3 shows very little difference in HC (head circumference), which has often been a bone of contention. Here are the actual figures for head circumference in centimeters at 37 weeks, with standard deviations in brackets, taken from the 2014 paper:

UK 34.5 (1.3)
USA 34.5 (1.4)
Brazil 34.2 (1.2)
Kenya 34.2 (1.2)
Italy 34.0 (1.2)
China 33.6 (1.2)
Oman 33.6 (1.1)
India 33.1 (1.1)

As you can see, UK and USA head circumferences are largest and have the largest standard deviations, while India has the smallest mean and, jointly with Oman, the smallest standard deviation. Indeed, the mean for Indian head circumference is about one UK standard deviation below the UK mean (a gap of 1.4 cm against a UK standard deviation of 1.3 cm). Put like that, the centres differ somewhat in the brain size of the children.
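
The comparison above can be checked with a few lines of arithmetic. The following is a minimal Python sketch that expresses each centre's mean head circumference as a distance from the UK mean in UK standard deviation units. It uses only the figures listed above; the dictionary and function names are illustrative and are not taken from the paper.

```python
# Mean head circumference (cm) and standard deviation, as listed above.
SITES = {
    "UK": (34.5, 1.3),
    "USA": (34.5, 1.4),
    "Brazil": (34.2, 1.2),
    "Kenya": (34.2, 1.2),
    "Italy": (34.0, 1.2),
    "China": (33.6, 1.2),
    "Oman": (33.6, 1.1),
    "India": (33.1, 1.1),
}


def gap_in_reference_sds(site, reference="UK"):
    """Return (site mean - reference mean) / reference SD."""
    site_mean, _ = SITES[site]
    ref_mean, ref_sd = SITES[reference]
    return (site_mean - ref_mean) / ref_sd


for name in SITES:
    print(f"{name:7s} {gap_in_reference_sds(name):+.2f} UK SDs")
```

On these figures India comes out a little over one UK standard deviation below the UK mean.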

What can we say about the apparent lack of any study centre differences in cognitive abilities? Few psychometricians would suggest that cognitive abilities could be reliably assessed at age 2. The Wechsler Preschool and Primary Scale of Intelligence makes a brave start at 2 years and 6 months. Others find it better to wait till 4 years of age, or better still 7 years of age or, for the sweet spot of early testing with reasonable predictive power for adulthood, 11 years of age.

Let us see what these researchers have included in their cognitive assessment of two year olds.

The following is taken from their Inter-NDA instruction manual

1) Make a tower of 5 blocks. There are no higher scores for children who can do the task immediately. Any child doing it in 3 trials gets same score as child who does it in the first trial. A child who builds a 4 block tower gets same score as child who only achieves 3 blocks. This may lead to a lack of discrimination among brighter children.
2) Naming 4 colours. Better task, but naming of 1 or 2 colours gets lumped together. Some loss of discrimination.
3) Matching cubes of same colour. Good scoring system, giving a valid 3 point scale.
4) Handing cube to examiner. Simple scoring, the first to use a time cut-off.
5) Puts spoon in cup when asked. This is a very easy test, because many children will have seen spoons in cups. Some kids might put the spoon in the cup without being asked. It isn’t a pure test of language comprehension. The scoring system loses discrimination at the higher end. A child who does it immediately gets the same score as a child who takes 3 trials to get the hang of it. The child who takes a full 5 trials to do it gets the same score as those who do it in 4 trials. Once again, there is a ceiling effect in the scoring system.
6) Match 3 shapes on board. Again, a very easy test, with 3 shapes to be put in their respective holes. Using 4 or even 5 might have given a more discriminative test. Again, the scoring system loses discrimination in the higher range, exactly as described above.
7) Point to the door/entrance in the room. Simple task, same loss of discrimination at higher end.
8) Place raisin into a small opening. A coordination motor task, but a weak test of cognition.
9) Drinks water from cup. A weak test of cognition.
10) Looks at something pointed at. A weak test of cognition.
11) Pretends to drink from a cup. Interesting idea, and a better scoring system.
12) Pretends to make a cup of tea. Some cultural loading here? Test of whether the child can do a pouring motion with a toy teapot.
13) Give the dolly some tea. Imitation.
14) Horizontal scribble. Again, interesting, but scoring not sensitive to brighter children.
15) Finding a bracelet placed in full view under a cloth. Scoring system again could do with more range.
16) Child’s use of plurals when shown objects. Good language test, but again the scoring could be more precise.

There are then several tasks to be rated on the basis of parental report: can ask for toilet, runs back to mother, goes up steps, throws ball near something, kicks ball.
Then a language item about syllabic babbling, good topic, but again very crudely measured. Next items, all reasonable and interesting: uses two words together; indicates “no” by gesture; uses a pronoun; count of how many words the child uses during the assessment (this is a good item, but with restricted range at the top); how many 3 word sentences used (another good item, but with restricted range at top); whether child can follow the topic of conversation (good); combines word and gesture (good).

• Category: Science • Tags: Heredity, Intelligence 

If the brightness of European Jews is primarily due to their culture, then we should all seek to be adopted by a Jewish mother. If, on the other hand, it is necessary to be actually born from Jewish parents, then any cultural tips we may get from them may be a bonus, but it is their genes that are crucial.

Some researchers have just had a look at this, and seem to have found a genetic explanation for part of the reason why European Jews are particularly bright.

Dunkel, C. S., Woodley of Menie, M. A., Pallesen, J., & Kirkegaard, E. O. W. (2019, January 24). Polygenic Scores Mediate the Jewish Phenotypic Advantage in Educational Attainment and Cognitive Ability Compared With Catholics and Lutherans. Evolutionary Behavioral Sciences. Advance online publication.

They say:

A newly released multivariate polygenic score for educational attainment, cognitive ability, and self-rated mathematical ability in the Wisconsin Longitudinal Study was examined as a mediator of the group difference between Jews (n = 53) and 2 Christian denominations, Catholics (n = 2,603) and Lutherans (n = 2,027), with respect to educational attainment, IQ, and performance on a similarities measure. It was found that the Jewish performance advantage over both Catholics and Lutherans with respect to all 3 measures was partially and significantly mediated by group differences in the polygenic score. This result is consistent with the prediction that the high average cognitive ability of Jews may have been shaped, in part, by polygenic selection acting on this population over the course of several millennia.

Public Significance Statement

Ashkenazi Jews exhibit high levels of general intelligence. The hypothesis that differences in general intelligence between Jews and Catholics and Lutherans is partially mediated by polygenic scores for educational attainment was tested. The results support the hypothesized partial mediation.

Data were sourced from the Wisconsin Longitudinal Study (WLS). The WLS is a longitudinal study of randomly sampled Wisconsin high school students beginning in 1957; the last wave of data collection was in 2011. The 1957 sample included 10,317 Wisconsin high school seniors. The sample is overwhelmingly of European descent.
9000 of the study participants were genotyped as part of the recent GWAS for intelligence (Lee 2018) and in this study the educational attainment polygenic score was used. It is the stronger predictor. The students were tested on the Henmon-Nelson Test of Mental Ability, a 30-min test consisting of 90 items of increasing difficulty in spatial, verbal, and mathematical ability. The reliability of the test is .95, and it correlates .80 to .85 with Wechsler full scale IQ. When subjects were in their 50s they were given 8 items of the Wechsler Similarities test by phone. Intelligence was well tested.

The Jews in this sample are much brighter than the Christians. They had much, much higher educational levels, perhaps a gene-culture synergistic effect, or simply that educational levels measure ability and motivation. Perhaps the 8 point IQ advantage is enough to explain it.

To illustrate the differences between the Jewish and two Christian groups, we combined the two Christian groups and computed Cohen’s d for PGS and IQ. For PGS, Cohen’s d = 1.33, which is a very large effect size. For IQ, Cohen’s d = .57, which is a medium effect size. These group differences are portrayed in Figure 1.
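
As a rough check on how the quoted effect size relates to the IQ gap mentioned earlier, a standardized difference can be converted into IQ points by multiplying it by the standard deviation of the IQ scale. The sketch below assumes the conventional IQ standard deviation of 15, which is an assumption and not a figure taken from the paper. The function names are illustrative, and the pooled-variance form of Cohen's d is shown only as the textbook definition, since the group means and standard deviations are not given here.

```python
import math


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def d_to_iq_points(d, iq_sd=15.0):
    """Convert a standardized difference into IQ points (iq_sd is assumed)."""
    return d * iq_sd


print(round(d_to_iq_points(0.57), 2))  # 8.55
```

Under that assumption, d = .57 corresponds to roughly eight and a half IQ points, in line with the 8 point advantage mentioned above.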

The correlations between the polygenic scores and the intellectual measures are .2 to .3, which is low. Aware that the Jewish sample is small, the authors drew 1000 same-sized samples at random from the Christian population to estimate how likely it is that differences of this size between the Jewish and Christian students could arise purely by chance. That likelihood turns out to be very low, so it is very likely to be a real difference.
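
The resampling check described above can be sketched in Python. This is not the authors' code and it does not use the WLS data: the scores are simulated standardized values, and the "observed" gap of 0.57 standard deviations is a hypothetical stand-in borrowed from the Cohen's d quoted above. Only the group sizes (53, and 2,603 + 2,027) come from the abstract. The function name and the seeds are arbitrary.

```python
import random
import statistics


def chance_of_gap(population, sample_size, observed_mean, draws=1000, seed=0):
    """Share of random same-sized samples whose mean reaches observed_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        sample = rng.sample(population, sample_size)
        if statistics.fmean(sample) >= observed_mean:
            hits += 1
    return hits / draws


# Simulated stand-in for the combined Christian groups (2,603 + 2,027 people):
# standardized scores with mean 0 and SD 1. These are NOT the WLS data.
rng = random.Random(42)
christians = [rng.gauss(0.0, 1.0) for _ in range(2603 + 2027)]

# Hypothetical observed mean for a group of 53, set 0.57 SD above the
# comparison mean purely for illustration.
observed = statistics.fmean(christians) + 0.57

share = chance_of_gap(christians, sample_size=53, observed_mean=observed)
print(f"Share of random samples reaching the observed mean: {share:.3f}")
```

The logic is that if random groups of 53 drawn from the comparison population almost never reach the observed mean, the observed gap is unlikely to be a sampling accident.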

The sample of Jewish students is small. However, researchers looking at small samples of ancient genomes argue that DNA is informative even in small samples, because it is a cumulative record of genetic pairings, and is highly informative thereby, but there is still a chance of quirky findings. The authors are well aware of this, and regard their findings as tentative. They certainly do not argue that Jewish cultural transmission is irrelevant, and say that it might support and amplify the genetic factors found by the polygenic scores.

Rare variants associated with lipid storage disorders may indeed confer a heterozygote advantage, which may have augmented the Jewish Group GCA above that which would be predicted by differences in the level of PGS alone, perhaps accounting for the relatively higher frequencies of these disorders in this population. Direct tests of this model still need to be carried out, however.

Dunkel and colleagues have established a tentative link between polygenic scores and Jewish intellectual advantage. This is an important step forwards, and worth testing on larger samples and with better polygenic score data.

• Category: Science 

Four years ago I claimed that it was more important to have educated parents than rich ones. Parents who are educated were very likely bright to begin with, and judged worth educating as much as possible. They may even have gained in ability by virtue of further education. Brighter parents usually earn more than less bright ones, so many educated parents will also be wealthy. Nonetheless, if you have to choose which is best for children, choose education over wealth. Why? Because intelligence is the greatest wealth.

I’ve known for years that Rindermann had all these results you will see below, and it is great to see them all gathered together, and the analyses extended to complete the overall picture.

In 19 (sub)samples from seven countries (United States, Austria, Germany, Costa Rica, Ecuador, Vietnam, Brazil), we analyzed the impact of parental education compared with wealth on the cognitive ability of children (aged 4–22 years, total N = 15,297). The background of their families ranged from poor indigenous remote villagers to academic families in developed countries, including parents of the gifted. Children’s cognitive ability was measured with mental speed tests, Culture Fair Intelligence Test (CFT), the Raven’s, Wiener Entwicklungstest (WET), Cognitive Abilities Test (CogAT), Piagetian tasks, Armed Forces Qualification Test (AFQT), Progress in International Reading Literacy Study (PIRLS), Trends in International Mathematics and Science Study (TIMSS), and Programme for International Student Assessment (PISA). Parental wealth was estimated by asking for income, indirectly by self-assessment of relative wealth, and by evaluating assets. The mean direct effect of parental education was greater than wealth. In path analyses, parental education also showed stronger impact on children’s intelligence than familial economic status. The effects on mental speed were smaller than for crystallized intelligence, but still larger for parental education than familial economic status. Additional factors affecting children’s cognitive ability are number of books, marital status, educational behavior of parents, and behaviour of children. If added, a general background (ethnicity, migration) factor shows strong effects. These findings are discussed in terms of environmental versus hidden genetic effects.

Socio-economic status is associated with educational attainment, and as we know, a frequently observed correlation suggests an underlying cause. (In this instance, the correlational nature of the association is not seen as a grave disadvantage).

The popular interpretation of these findings in the media as well as in science is that they are caused by differences in the wealth of parents (for examples, see Rindermann & Baumeister, 2015): The rich can support their children through costly interventions that are beyond the ability of less wealthy parents, such as better housing, private schools, educational toys and computers, entrance to expensive museums, and hiring tutors. By the same token, the economically and socially disadvantaged poor cannot offer their children such supports. A straightforward intervention derived from this position was publicly formulated by Richard Nisbett in his keynote “Bring the Family Address” at the 2009 Association for Psychological Science convention in San Francisco: “If we want the poor to be smarter we should make them richer” (Wargo, 2009, p. 17).

However, a closer look at different empirical phenomena makes it doubtful that economic differences are really at the root of differences in intellectual outcomes as opposed to underlying causes that they proxy. Consider six types of suggestive evidence for the position that educational mechanisms are stronger drivers of offspring intelligence than economic ones.

1. In many countries, there is only a low or even no positive relationship between indicators of economic wealth of families (e.g., owning TV, mobile phone, computer) and cognitive student assessment results; and sometimes, the relationships are negative.

2. Similarly, in international comparisons with individual-level data (PISA 2006), parental educational level is more strongly associated with children’s abilities than are parental wealth indicators.

3. Cognitive elites such as Nobel Laureates come less often from wealthy social strata than from well-educated ones.

4. A further type of evidence for educational mechanisms is indirect; rather than showing that parental education drives offspring intelligence, it shows that offspring’s education drives their own intelligence, thus implicating underlying cognitive processes that are inculcated through education as an important contributor to IQ differences. In a narrative review of the historical literature, Ceci (1991) found that each year of missed or delayed schooling led to a decrement in cognitive ability. For example, missed schooling due to family travel, summer vacations, illness, dropping out, or absence of teachers in remote regions all led to reduced IQ performance compared with children who had not missed school: Two adolescents with the same IQ score at age 14 differed by nearly 8 IQ points by the age of 18 if one of them remained in school until that age and the other dropped out at age 14 (Ceci, 1991). In a series of analyses, Winship and Korenman (1999) modelled IQ changes under different assumptions about the degree of measurement error. They estimated that the impact of 1 year of schooling results in an average IQ increase of about 2.7 IQ points for each year of school attendance.

5. Parental ability and attitudes create an important developmental environment for children as illustrated by a qualitative Austrian study (Großschedl, 2006): Some parents whose children were cared for and supported by a public social program (the state pays all the rent including water, electricity, and central heating) burned the books and learning materials supplied for their children “for heating” during vacations. They stated that these materials are not important and education is not important for girls, because they will marry later. Großschedl (2006) found that during home visits, it was difficult to create a learning atmosphere for applying the training program, for homework, and for consulting parents, because parents and their children wanted to watch TV all day.

6. Consistent with the above five sources of empirical research, there are also anecdotal examples that contradict the popular assumption that a more expensive environment favors intellectual development: In Atlanta (based on observations in 2008), there are two famous zoological institutions, one charging US$37 and offering fishes, whales, and other animals swimming in basins with few or no explanatory texts describing the animals’ habitat, evolutionary or ontogenetic development, and behavior. The second institution (the Natural History Museum) had a US$19 entrance charge but offers age-appropriate voluminous written and verbal explanations of the habits and geographic regions of animals, including the presentation of complex topics such as evolution and the Doppler effect. The more expensive but superficial place attracted far larger crowds of which the largest fraction appeared to come from seemingly lower SES strata. The cheaper but cognitively more stimulating museum was nearly empty and the few people attending it appeared, from their dress and manner, to be from the middle or upper classes; many of them were whole families including fathers.

• Category: Science • Tags: Genetics, Heredity, IQ 

I do not have a dog in the fight about dogs. My dad said that there was a dog in every boy’s life, and so we had some dogs when I was young, and then in my own life, no dogs. I was living a town life, and working, and had neither need nor wish for them. I have nothing against dogs, other than that they should live in the country, not the town, and preferably do something useful. In towns they are captive and, when badly trained, frequently a nuisance. In the country, so long as they are not worrying sheep, they are more agreeable company.

I can see that dogs have very probably evolved with us, in a symbiotic relationship. They know how to flatter us, in return for food and companionship. Parasitism it may be, but it works for many people, and virtually all dogs. Dogs and their owners are reciprocally besotted.

Frankly, I doubted owners’ stories about the intelligence of their pooches. We are creatures of habit, and dogs learn from observation how we are to be handled. So, it was with some initial hesitation that I looked at the research on canine intelligence, and then came to see that, after due allowance for restrictions on which tests could be used, there was a case for comparing the intelligence of dogs and of dog breeds. The fact that the clever breeds were sheep dogs pleased me. We all have to earn our keep.

Here is Rosalind Arden on the intelligence of dogs:

The other thing about dogs is that they live shorter lives, so their generations pass more quickly, and can be observed as they evolve. Even more important, they can be bred through a selective process into different sorts, for different purposes. Assisted evolution in action. Hence, we can look at these close companions and make judgments about how characteristics and behaviours alter through evolution. We can even tamper so as to breed up dogs for our uses. Guide dogs, for example. Practically, dogs that can detect when we are about to have a fit. Perhaps even dogs that can detect our diseases before any other detection device can do so.

What can we find out from genetic analyses of dog behaviour and dog breeds?

Highly Heritable and Functionally Relevant Breed Differences in Dog Behavior
Authors: Evan L MacLean, Noah Snyder-Mackler, Bridgett M. von Holdt & James A.


Below I show the abstract verbatim, and have selected and abridged the main points of the paper.

Abstract: Variation across dog breeds presents a unique opportunity for investigating the evolution and biological basis of complex behavioral traits. We integrated behavioral data from more than 17,000 dogs from 101 breeds with breed-averaged genotypic data (N = 5,697 dogs) from over 100,000 loci in the dog genome. Across 14 traits, we found that breed differences in behavior are highly heritable, and that clustering of breeds based on behavior accurately recapitulates genetic relationships. We identify 131 single nucleotide polymorphisms associated with breed differences in behavior, which are found in genes that are highly expressed in the brain and enriched for neurobiological functions and developmental processes. Our results provide insight into the heritability and genetic architecture of complex behavioral traits, and suggest that dogs provide a powerful model for these questions.

Studying aggression, fear, trainability, attachment, and predatory chasing behaviors on 14,020 individual dogs with breed-level genetic identity-by-state estimates from two independent studies we found that a large proportion of variance in dog behavior is attributable to genetic factors. The mean heritability was 0.51 ± 0.12 (SD) across all 14 traits (range: h² = 0.27-0.77), and significantly higher than the null expectation in all cases (permutation tests, p < 0.001).

Interestingly, the traits with the highest heritability were trainability (h² = 0.73), stranger-directed aggression (h² = 0.68), chasing (h² = 0.62) and attachment and attention seeking (h² = 0.56), which is consistent with the hypothesis that these behaviors have been important targets of selection during the cultivation of modern breeds.

Overall, we identified 131 unique SNPs that were significantly associated with at least one of the 14 behavioral traits (Bonferroni p ≤ 0.05, Fig 2). Forty percent of these SNPs (n= 52) were located within a gene – none of which encoded for changes in the amino acid sequence of the protein. On average, the top SNP explained 15% of variance in the behavioral trait. Thus, while we identify multiple variants with moderately large effects, the variance explained by individual SNPs is far less than that explained by additive variation across the genome (heritability), suggesting that as in humans, behavioral traits in dogs are highly polygenic. However, the variance explained by the top SNPs in our analysis across breeds was, on average, more than 5 times higher than that from within-breed association studies.

Many of the gene-level associations with dog behavioral traits include (i) candidate domestication genes, (ii) genes mapped to phenotypes implicated in domestication, (iii) genes implicated in behavioral differences between foxes bred for tameness or aggression, and (iv) genes that underwent positive selection in both human evolution and dog domestication. For example, PDE7B, which is differentially expressed in the brains of tame and aggressive foxes has been identified as a target of selection during domestication, and is highly expressed in the brain where it functions in dopaminergic pathways. In our analyses, SNPs in this gene were associated with breed differences in aggression, which is consistent with data from experimentally bred foxes, as well as hypotheses that selection against aggression was the primary evolutionary pressure during initial domestication events.

The gene-trait associations identified in our study also align closely with similar associations in human populations. For example, breed differences in aggression are associated with multiple genes that have been linked to aggressive behavior in humans. Molecular associations with breed differences in energy include genes previously linked to resting heart rate, daytime rest, and sleep duration in humans. Lastly, breed differences in fear were associated with genes linked with temperament and startle response in humans, and several of the genes implicated in breed differences in trainability have been previously associated with intelligence and information processing speed in humans.

If the variants in genes identified in our analyses make major contributions to behaviour and cognition, then the associated genes should be (i) involved in biological processes related to nervous system development and function, and (ii) primarily expressed in the brain. Indeed, we found that behavior-associated genes (as identified through meta-analysis) were enriched for numerous nervous system processes. These processes include neurogenesis, neuron migration and differentiation, axon and dendrite development, and regulation of neurotransmitter transport and release.

• Category: Science • Tags: Dogs, Intelligence 

Some things are associated with others. Some things you eat make you ill. Some animals attack you. Some places are dangerous, some people likewise. On a brighter note, some foods are tasty and healthy. Some animals can be domesticated, or at least are easy to hunt or trap. Some places are safe, and some people likewise.

Correlation is not causation, but it’s the way to bet. Your life may depend upon it. Under-predict dangers and you could end up dead. Better to be safe than sorry. Better to be sorry that you have missed some opportunities than to be dead. It is sensible to worry about what may happen. Stereotypes are your friend. They are preliminary observations about life. Improve them as you learn more. Some must be discarded, but many more can be sharpened up and refined.

Life is a dilemma. When searching for a meal you must avoid ending up as a meal. Be careful, but don’t worry so much that you cannot forage for food. Hunger will make you adventurous, and then you are at risk again.

Ideally, we would never calculate correlation coefficients, but would just look at the data properly plotted out, ideally over a long period, and judge things by eye. The shape of the distribution matters. Intellectual and scholastic tests need not be a perfect bell curve, though they can be pretty close to one.

Sometimes an unknown force distorts the distribution, as when illness and infections sap the wits of poor citizens living in bad circumstances. More mysteriously, sometimes distributions are almost normal, but pinched into a narrower range, as if bound by a tighter central limit. Why are some groups narrower than others? Women, for example? African Americans, for another? Easy to see how systematic disadvantage could shift a mean downwards, less easy to see how those forces could both encourage low scorers and discourage high scorers.

A correlation coefficient is a straight-line simplification. Useful, though. It captures a lot in a little number. Standard deviations are also very informative.

It is no disproof of a correlation that it is not unity. Most real-life correlations are far less than perfect, but will be much better than guessing, even though there will always be outliers. Adding up those outliers in terms of residuals (errors of prediction) is a useful way of understanding the power of predictions based on correlations. For example, if you have to predict the height of an unknown person, your best bet (least error prone) is to predict that they are of average height. If you are asked to predict the height of 100 people, betting that every one of them is of average height results in your typical error of prediction being the same as the standard deviation of the height of the general population.

If you have extra knowledge, such as being told the height of the individual’s parents, then you can improve your prediction by taking that into account. You will have reduced your error of prediction, and you can see how much your bets have improved by comparing the reduced residual error with the standard deviation of the population.
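How much the extra knowledge helps follows directly from the correlation: for a straight-line prediction, the residual error shrinks to the standard deviation multiplied by the square root of (1 − r²). The sketch below illustrates this on synthetic parent and child heights; the 0.6 slope and the noise level are assumptions for the example, not estimates from real families.

```python
import math
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation computed from first principles."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
n = 1000
# Synthetic mid-parent heights in cm (illustrative numbers only)
parent = [random.gauss(170, 7) for _ in range(n)]
# Children regress toward the mean; the 0.6 slope and SD-6 noise are assumed
child = [170 + 0.6 * (p - 170) + random.gauss(0, 6) for p in parent]

r = pearson_r(parent, child)
sd_child = statistics.pstdev(child)  # error when guessing the average

# Least-squares line predicting child height from parent height
slope = r * sd_child / statistics.pstdev(parent)
intercept = statistics.fmean(child) - slope * statistics.fmean(parent)
residuals = [c - (intercept + slope * p) for p, c in zip(parent, child)]
residual_sd = math.sqrt(sum(e * e for e in residuals) / n)

print(f"r = {r:.2f}")
print(f"error guessing the average:  {sd_child:.2f} cm")
print(f"error using parental height: {residual_sd:.2f} cm")
print(f"SD * sqrt(1 - r^2):          {sd_child * math.sqrt(1 - r * r):.2f} cm")
```

Even a respectable correlation trims the error by less than intuition suggests, which is why a modest correlation still leaves plenty of outliers without being invalidated by them.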

Some people really believe they have invalidated a correlation by drawing attention to a particular outlier. If you conceive of a correlation as an ellipse rather than a straight line you can see that the highest scorer on one variable will usually not be the highest scorer on the other variable. A match is guaranteed only with perfect correlation. Steve Hsu has explained this issue.
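As a rough illustration of the ellipse point (a sketch on synthetic data, not taken from Hsu's explanation), the simulation below draws groups of 100 people with two scores correlated at a chosen level and counts how often the same person comes top on both. The group size, the number of trials and the correlation values are all arbitrary assumptions.

```python
import math
import random

def top_scorer_matches(rho, n_people, rng):
    """Draw n_people correlated score pairs; True if one person tops both."""
    xs = [rng.gauss(0, 1) for _ in range(n_people)]
    # Second score shares a fraction rho of the first, plus independent noise
    ys = [rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in xs]
    best_x = max(range(n_people), key=xs.__getitem__)
    best_y = max(range(n_people), key=ys.__getitem__)
    return best_x == best_y

def match_rate(rho, n_people=100, n_trials=500, seed=3):
    """Share of simulated groups in which the two top scorers coincide."""
    rng = random.Random(seed)
    hits = sum(top_scorer_matches(rho, n_people, rng) for _ in range(n_trials))
    return hits / n_trials

rates = {rho: match_rate(rho) for rho in (0.3, 0.6, 0.9, 1.0)}
for rho, rate in rates.items():
    print(f"rho={rho}: same top scorer in {rate:.0%} of groups")
```

Only at a correlation of exactly 1.0 does the top scorer on one measure always top the other, so a single mismatched outlier says very little about the strength of the relationship.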

Correlation is not causation, but you are more likely to find a cause in a correlated variable than in an uncorrelated one. Search where there is at least a trace of a putative connective tissue. If you think it was the tomato that upset your digestion, start your controlled trial on tomatoes.

Correlation is not causation, but sometimes a finding is suggestive, like a trout in the milk. It does not prove that the milk was watered, but it makes you suspicious.

The “correlation is not causation” mantra is true as far as it goes, but it tends to be used so as to argue that, despite many correlations linking A with B being found in different circumstances, these will somehow never suffice to strongly suggest a causal link between A and B. On the contrary, correlation is a necessary feature of causation, but not a sufficient proof. Correlation is not always causation, but it helps find causes. Correlation is a pre-condition of causality.

Michael Woodley has set a challenge: “Sure, correlation does not equal causation, but find me just one single instance of a causal relationship where there is no correlation (just one would suffice).”

Whilst it is true that correlation does not necessarily equate to causation, all causally related variables will be correlated. Thus correlation is always necessary (but not in and of itself sufficient) for establishing causation.

Woodley continues:

The claim that ‘correlation does not equal causation’ is therefore meaningless when used to counter the results of correlative studies in which specific causal inferences are being made, as the inferred pattern of causation necessarily supervenes upon correlation amongst variables. Whether the variables being considered are in actuality causally associated as per the inference is another matter entirely.

The correct critique of such findings therefore is from mediation, i.e. the idea that a given correlation might be spurious owing to the presence of ‘hidden’ variables that are generating the apparent correlation. A famous example is yam production and national IQ, which across countries correlate negatively. It would be wrong to say that yam production somehow inhibits IQ, as the association will in fact turn out to be mediated by something like temperature and latitude. These variables are in turn proxies for historical and ecological trends that make the sort of countries that yield fewer yams the sort of countries that are typically populated by higher ability people, and vice versa. The causation in this case is via additional variables, which cause the covariance between the two variables of interest, without there being a direct effect of one on the other.
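The hidden-variable pattern described above can be sketched with synthetic numbers (this illustration is not part of Woodley's text and uses no real yam or IQ data). A hidden variable pushes `a` up and `b` down; neither affects the other. The raw correlation is clearly negative, but the partial correlation, which removes the linear influence of the hidden variable, falls to roughly zero. The names `hidden`, `a` and `b` and all the numbers are assumptions for the example.

```python
import math
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation computed from first principles."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_ab, r_ah, r_bh):
    """Correlation of a and b after removing the linear effect of h."""
    return (r_ab - r_ah * r_bh) / math.sqrt((1 - r_ah ** 2) * (1 - r_bh ** 2))

random.seed(4)
n = 5000
# Synthetic data: 'hidden' drives both variables; a and b never touch each other
hidden = [random.gauss(0, 1) for _ in range(n)]
a = [h + random.gauss(0, 1) for h in hidden]
b = [-h + random.gauss(0, 1) for h in hidden]

r_ab = pearson_r(a, b)
r_ab_given_hidden = partial_r(r_ab, pearson_r(a, hidden), pearson_r(b, hidden))

print(f"raw correlation of a and b:        {r_ab:.2f}")
print(f"after controlling for the hidden:  {r_ab_given_hidden:.2f}")
```

The catch in practice is that the hidden variable has to be measured before it can be controlled for, which is where theory has to guide the choice of what to measure.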

Properly constructed multivariate models can use these patterns of mediation to infer the likelihood of causation going in one direction or another. Thus, it is possible to actually test causal inference amongst a population of correlated variables. By far the best way of doing this is to compare the fits of models containing specific theoretically prescribed patterns of causal inference against (preferably many) alternative theoretically plausible models, in which alternative patterns of causation are inferred (Figueredo & Gorsuch, 2007).

Sir William Gemmell Cochran termed this “Fisher’s Dictum”:

“About 20 years ago, when asked in a meeting what can be done in observational studies to clarify the step from association to causation, Sir Ronald Fisher replied: ‘Make your theories elaborate.’ The reply puzzled me at first, since by Occam’s razor, the advice usually given is to make theories as simple as is consistent with known data. What Sir Ronald meant, as subsequent discussion showed, was that when constructing a causal hypothesis one should envisage as many different consequences of its truth as possible, and plan observational studies to discover whether each of these consequences is found to hold.” (Cochran, 1965, §5).


James Thompson
About James Thompson

James Thompson has lectured in Psychology at the University of London all his working life. His first publication and conference presentation was a critique of Jensen’s 1969 paper, with Arthur Jensen in the audience. He also taught Arthur how to use an English public telephone. Many topics have taken up his attention since then, but mostly he comments on intelligence research.