The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
James Thompson Blogview

Archimedes in bath

As readers of this blog will know, it is usually Woodley of Menie who darkens these pages with talk of genetic ruin, while James Flynn is the plucky New Zealander bringing tidings of comfort and joy about rising intelligence. Now, after my foolishly letting the two of them talk unhindered together for two hours over a large convivial dinner in Montreal last July, while the rest of us swopped stories and jokes, Flynn appears to have gone over to the dark side.

In truth, it must be admitted that Woodley and Flynn go back some years now, and Flynn had come to the conference with his talk on Piaget already prepared, but it is funny what bedfellows one makes when one follows the data. James Flynn had long expressed surprise at the continuing rise in intelligence scores, and had charted their stagnation and then decline, though that has not happened in all countries, or not in the same way. However, here in 2017 we have none other than James Flynn worrying that our top thinkers have been decimated. What is going on?

I have never troubled you too much about Piaget. He stands outside the psychometric canon, and is no believer in proper samples. He studied his own children in great detail, thus being totally biased, unrepresentative, and at the same time very interesting. He was able to tease out what his children did not know, and what they could not work out at each stage in their development. Even as an undergraduate I was cautious about his work, and his stages of development. My dutiful rendering of his theories got me my highest ever marks, but I did not follow others in slavish admiration of his approach. Be reassured, the stages don’t just happen like sedimentary deposits. Brighter children go through them faster. However, his experiments have charm, and depth. Perhaps Bärbel Inhelder had a good influence on him, particularly on the tasks we will be looking at now.

The formal operations stage, achieved around 12 years of age (yes, sooner for brighter kids, and much later for the slower ones) was considered the apotheosis of cognition. Once you got there, you could start being a scientist. Not much point in teaching physics before that. What is lovely about these tasks, from our point of view, is that they are potential markers of the scientific frame of mind, the basic stepping stones of the empirical project. They have had the same logic (and probably level of difficulty) for ever, and certainly since Archimedes. As such, we can say that if a child cannot solve these problems, they can probably not solve the whole larger set of problems which require the scientific method.

Equilibrium in balance

Robert Siegler (1979) gave children a balance beam task in which some discs were placed either side of the centre of balance. The researcher changed the number of discs or moved them along the beam, each time asking the child to predict which way the balance would go.

He studied the answers given by children from five years upwards, concluding that they apply rules which develop in the same sequence as, and thus reflect, Piaget’s findings. Like Piaget, he found that eventually the children were able to take into account the interaction between the weight of the discs and the distance from the centre, and so successfully predict balance. However, this did not happen until participants were between 13 and 17 years of age.
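The interaction between weight and distance that the older children grasp can be written down in a few lines. This is only a minimal sketch, not Siegler's procedure: the function names and the (weight, distance) representation are assumptions made for illustration, and the weight-only rule is included simply as an example of a less mature strategy.

```python
def torque(discs):
    """Turning effect of one arm: sum of weight x distance from the centre."""
    return sum(weight * distance for weight, distance in discs)


def predict_by_torque(left, right):
    """Mature rule: compare weight x distance on each side."""
    left_torque, right_torque = torque(left), torque(right)
    if left_torque > right_torque:
        return "left"
    if right_torque > left_torque:
        return "right"
    return "balance"


def predict_by_weight_only(left, right):
    """Immature rule (hypothetical): ignore distance, compare weight alone."""
    left_weight = sum(weight for weight, _ in left)
    right_weight = sum(weight for weight, _ in right)
    if left_weight > right_weight:
        return "left"
    if right_weight > left_weight:
        return "right"
    return "balance"


# Three discs close to the centre on the left, one disc far out on the right.
left_arm = [(3, 1)]    # (weight, distance)
right_arm = [(1, 4)]
print(predict_by_weight_only(left_arm, right_arm))  # left side judged heavier
print(predict_by_torque(left_arm, right_arm))       # right side actually goes down
```

The two rules disagree exactly on the conflict items, where one side has more weight and the other more distance, which is why such items separate the stages.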

Pendulum task

Formal operational thinking has also been tested experimentally using the pendulum task (Inhelder & Piaget, 1958). The method involved a length of string and a set of weights. Participants had to consider three factors (variables): the length of the string, the heaviness of the weight and the strength of push. The task was to work out which factor was most important in determining the speed of swing of the pendulum.

Participants can vary the length of the pendulum string, and vary the weight. They can measure the pendulum speed by counting the number of swings per minute. To find the correct answer the participant has to grasp the idea of the experimental method, that is, to vary one variable at a time (e.g. trying different lengths with the same weight). A participant who tries different lengths with different weights is likely to end up with the wrong answer.
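The logic of varying one factor at a time can be sketched with the textbook formula for a simple pendulum, in which the period depends only on the length of the string. This is an idealised illustration under the small-swing approximation, not a model of the Inhelder and Piaget apparatus; the function name and parameter names are assumptions.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def swings_per_minute(length_m, weight_kg=0.1, push=1.0):
    """Ideal simple pendulum, small swings: period = 2*pi*sqrt(length / g).

    weight_kg and push are accepted but deliberately unused, because in
    this idealised model neither of them affects the period.
    """
    period_s = 2 * math.pi * math.sqrt(length_m / G)
    return 60.0 / period_s


# Controlled comparison 1: same length, different weights -> same speed.
print(swings_per_minute(1.0, weight_kg=0.1))
print(swings_per_minute(1.0, weight_kg=0.5))

# Controlled comparison 2: same weight, different lengths -> different speed.
print(swings_per_minute(1.0, weight_kg=0.1))   # roughly 30 swings a minute
print(swings_per_minute(0.25, weight_kg=0.1))  # roughly 60 swings a minute
```

A participant who shortens the string and changes the weight in the same trial sees the speed change but has no way of telling which factor was responsible.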

Volume and heaviness (Archimedes)

The way this has been studied by Michael Shayer is a whole lesson in itself. Students are taken step by step through the fundamental problems presented by floating and sunken objects. Sunken objects displace a quantity of water equal to their volume; floating objects displace a quantity of water equal to their weight. This is important because children, and many adults, think of volume and weight as highly correlated in everyday life. Of course, the dissection of these properties led to a milestone in the history of science: Archimedes’ principle of buoyancy. Archimedes realized that he could determine whether the king’s crown was made of pure gold by examining the amount of water it displaced.
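The distinction between volume and weight can be made concrete with a small sketch. It assumes fresh water at 1 g per cubic centimetre and a density for pure gold of about 19.3 g per cubic centimetre; the function names, the tolerance and the example figures are hypothetical and are not taken from Shayer's materials.

```python
WATER_DENSITY = 1.0   # g per cm^3 (assumed fresh water)
GOLD_DENSITY = 19.3   # g per cm^3 (approximate value for pure gold)


def displaced_volume_cm3(mass_g, volume_cm3):
    """Volume of water pushed aside by a solid object placed in water."""
    density = mass_g / volume_cm3
    if density > WATER_DENSITY:
        # Sinks: displaces water equal to its own volume.
        return volume_cm3
    # Floats: displaces water equal to its own weight.
    return mass_g / WATER_DENSITY


def looks_like_pure_gold(mass_g, displaced_cm3, tolerance=0.5):
    """Crown test: a fully submerged object's density is mass / displaced volume."""
    density = mass_g / displaced_cm3
    return abs(density - GOLD_DENSITY) <= tolerance


print(displaced_volume_cm3(500, 100))     # dense block sinks, displaces its volume
print(displaced_volume_cm3(60, 100))      # light block floats, displaces its weight
print(looks_like_pure_gold(1000, 51.8))   # displacement consistent with gold
print(looks_like_pure_gold(1000, 65.0))   # too much water displaced for pure gold
```

Two objects of the same volume can therefore displace different amounts of water, which is precisely the point at which the everyday correlation between size and heaviness misleads.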

Shayer takes the children through 12 experimental steps before they are presented with the Archimedes problem. This deals with a concern which I voiced to James Flynn in the question session after his talk: I assumed that the massive fall in students solving this task could be due to poor science teaching. The Shayer procedure is an extended, hands-on tutorial, and goes some considerable way to countering a lack of science teaching as an explanation for poor performance.
Having read these brief descriptions, you might like to rank them in order of difficulty.

IQ decline and Piaget: Does the rot start at the top? James R. Flynn, Michael Shayer. Intelligence, Volume 66, January–February 2018, Pages 112–121.

What James Flynn and Michael Shayer now report is contained in their Table 5, which I find difficult to understand. There are 3 tests, and for each test we have 3 factors jostling for position in a jumble of figures: the level of Piagetian development of the children, where “3A and above” relates to early Formal Operations and 3b and below Concrete Operations; the dates at which the data were collected; and whether the results of the Piagetian tests were different from what would be expected from test intelligence results at that time. Frankly, there were too many variables at one time for this reader. It would have been better as a Figure, showing the historical changes pictorially.

Bottom line: the scores have crashed dramatically.

Piagetian collapse

The authors say:

In sum: at one time the best of Britons (aged 12–14) could cope with items on the formal level and blended into a smooth curve of performance. Now these items are beyond many of them and register as a huge decimation of high scorers.

• Category: Science • Tags: IQ 

Birthday candle

Like her Royal Highness, The Queen, I have two birthdays, though only as a scribbler. The first is my “Psychological Comments” blogspot birthday, 22nd November 2012, and the second is 12 December 2016, when I joined Unz. My republican sentiments, in the French rather than American sense of that word, suggest that as a good citizen I must eschew this plurality of procreative obsequiences and elect one of them, so I have transitioned to the latter date. It will be more modest to have only one birthday, and will simplify the statistics. It also allows me, with the tolerance currently accorded to deluded persons, to decide that my year has transitionally stretched from 22 November 2016 to 12 December 2017. Why not? How dare you question my deep chronological convictions?

By my 4th Blog Birthday on 22 November I had achieved 1,018,000 pageviews and by the time I moved to Unz on 12 December 2016 it finished at 1,044,000. Five thousand of those pageviews were probably caused by hoovering up all my previous posts, but even machines can learn as they flick through the pages, can’t they? Pedants should knock that number off. In fact, the blog rattled on for readers who had not heard of my move, so by January 2017 it had reached 1,098,000, a clear instance of cultural lag.

Perhaps those readers were in fact lonely bots who had developed refined tastes after imbibing four years of my wisdom. Anyway, I am calling my blogspot total as a midpoint 1,071,000.

At Unz I had 25,991 pageviews in December 2016, and now the total for 2017 up until yesterday 12 December 2017 is 395,125 so my Unz year is a total of 421,116. My 5 year total is 1,492,116, which is better than a poke in the eye with a sharp stick. These things are not precise, so you can call it “one and a half million” if you wish to recruit new readers, and don’t mind a little salesmanship.

Here are some more results:

Top 20 posts

Blog birthday 5 top 10 posts
Blog birthday 5 next to 20 posts

Top 10 readership nations

Blog birthday 5 top countries

Age profile

Blog birthday 5 age range

Last year I said:

I have written 755 posts over these 4 years. I have achieved more readers than I imagined possible. Getting over a million page views is a big deal for me. I am sure that my blog has been read more in 4 years than my publications in 48 years.

Now I say:

I have written 840 posts over 5 years, and got 1,492,000 pageviews, more than I ever expected.

Ron Unz told me something I did not know, which was that in the first 4 years of blogging I had written 600,000 words. I had not intended to.

My four years of 1,018,000 page views divided by 755 posts gave me 1348 views per post.

This Unz year has given me 421,116 pageviews divided by 82 posts, thus 5135 views per post. It seems my lethargy has been rewarded. More startling, when a post is popular it climbs to readership levels I have never before attained. 22,540 for example. The World’s IQ got 825 comments, a record for me.

What is missing from the statistics so far? In the past I saw no need to mention the comments as a separate category. I valued those I got, but the blogspot comment function was not that easy to use, and had a distressing habit of losing everything when you wanted to submit your remarks. Given the difficulty of leaving comments, I was proud of the fact that my popular post on the 7 Tribes of Intellect had received an all-time record of 63 comments. The total of the top 10 all-time comments came to 282 and even allowing two comments for every one of the other 745 posts I ever wrote, that would be a 4 year total of 1800 at the very most. How does this compare with my year at Unz? Here I have received 8,400 comments, and those comments amount to an impressive 1,141,898 words.
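For pedants who wish to check the sums, the arithmetic behind these figures can be reproduced in a few lines. All the numbers are the ones quoted in this post; only the variable names are invented for the purpose.

```python
# Blogspot totals quoted in the post.
blogspot_at_move = 1_044_000    # 12 December 2016
blogspot_jan_2017 = 1_098_000   # after a month of "cultural lag"
blogspot_midpoint = (blogspot_at_move + blogspot_jan_2017) // 2

# Unz totals quoted in the post.
unz_dec_2016 = 25_991
unz_2017 = 395_125
unz_year = unz_dec_2016 + unz_2017
five_year_total = blogspot_midpoint + unz_year

# Views per post, before and after the move.
views_per_post_blogspot = 1_018_000 / 755
views_per_post_unz = unz_year / 82

# Generous upper estimate of four years of blogspot comments:
# the top 10 posts plus two comments for each of the other 745 posts.
comment_upper_bound = 282 + 2 * 745

print(blogspot_midpoint)             # 1071000
print(unz_year)                      # 421116
print(five_year_total)               # 1492116
print(int(views_per_post_blogspot))  # 1348
print(int(views_per_post_unz))       # 5135
print(comment_upper_bound)           # 1772, within the "1800 at the very most"
```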

Blog birthday 5 summary figure

Looking at the figure, it all seems simple. Pageviews have shown a steady and rapid increase. Twitter followers have shown a steady but mild increase. The number of posts has decreased somewhat, though they may have all been a bit longer. Comments and particularly comment words have increased very sharply. Thank you to readers.

Moving to Unz has been a strange experience. I have spent much more time reading comments. I have not always understood them, and have had to follow the links and learn new things. On Unz as a whole there are many postings way outside my sphere of knowledge and comfort zone. There are many which are critical of the concept of intelligence, or intelligence testing, and I have engaged with some of them, but by no means all of them. Live and let live.

I have maintained all my operating rules as before: almost every author whose work I comment upon is sent an email with a link to the post and an invitation to A) correct any errors immediately and B) to make longer and more general comments, often in the form of an “Author Replies” posting. Why do I not invite comment from all authors? In a very few cases I think that my pointing out a particularly critical posting might be tantamount to harassment, so discretion is best.

When commentators point out an error I have thanked them for it, and have put in the correction as soon as possible. Probably, we should show in the text that a correction has been made, but that is evident from the comments.

Ron Unz and I agreed that moderation of comments would be extremely light. If someone was really intending to foment violence we would drop them, but otherwise free speech should rule, and it has. Keep saying what you really think. Although it is entirely up to you, if you find that a fellow commentator has made an egregious error, try to be a little kinder as you bring them back to the path of righteousness. Why not assume that the other party is intelligent, but temporarily not in possession of the full facts? That way you can correct their errors quickly, without incurring their understandable wrath when you question their mental ability.

I hope commentators will keep giving their views, kindly or otherwise.

Now to the future. I have a long list of intended posts, and a backlog of papers which need to be read. All in good time. Life has other distractions, and even while keeping blogging, I will continue to enjoy them.

• Category: Science • Tags: Blogging, Hbd 

empty podia

It is sad to hear from Chanda Chisala that our double act will no longer be available for hire, denying us both the prospect of a lecture tour, but if this really is his last word, that is a pity, because debates generally reveal new sources of data, and although personal positions rarely change immediately, they can change as contrary data accumulates. Perhaps Chisala meant “last word” in the rock star sense, and we may yet get on the road for a farewell tour. In that hope, I have given up my earlier plan of just letting him have the last word, and after a delay I make these comments in the hope of tempting him back out of retirement.

My explanation for the slow pace of change in viewpoints in behavioural research is that the delay is mostly caused by effort justification. Reading papers, finding and assembling data, marshalling arguments and finding rejoinders to points raised by critics all require effort, and few people like to see their work wasted. They tend to defend their positions (and if they have given public lectures on a topic, their defensiveness is boosted). Hence my standard question to researchers: “If you are just about to put finishing touches to your sandcastle, do you welcome the wave that destroys it?” The scientific ideal is that a wave of new findings is to be welcomed precisely because it overturns sandcastles built on shaky foundations (although they may have been the best available at the time). Easier said than done.

On the contrary, academic debates should not be determined by the amount of effort put in by the participants, but by the results of studies and the accumulation of observations. However, if this really is Chisala’s last word, then a recap is appropriate.

To summarize the general debate: under what circumstances do real world achievements call into question group tests of intellect? One answer is: when more of the group in question have real-life achievements above what would be expected from their measured average intelligence. In my view there should be no doubt that real world achievements are a better measure of intelligence than predictive assessments.

One straightforward approach is to follow a standard procedure. Take the average intelligence score for the nation; then take the best estimate of the nation’s total population; then calculate how many citizens are above a criterion, say Greenwich Mean Intelligence plus two standard deviations (IQ 130) and then compare that number of bright persons with the number of persons who win intellectual prizes. That last category could include a broad range of achievements: Chess, Go, Scrabble, Maths Olympiads and Fields Medals, Nobel prizes in science, science publications in the best journals, patents, science based companies, and so on. For example, being employed in the research departments of companies working on artificial intelligence, on new computer chips, on new materials, on new drugs, and other breakthrough research. Given that companies will usually select the brightest persons, all this can provide us with another source of intellectual achievement rankings.
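The first three steps of that procedure can be sketched as follows. This is a minimal illustration that assumes scores within a nation are normally distributed with a standard deviation of 15 (so that IQ 130 is two standard deviations above a Greenwich mean of 100); the function name and the example means and population are hypothetical, not data for any real country.

```python
from statistics import NormalDist


def expected_above(mean_iq, population, threshold=130.0, sd=15.0):
    """Expected number of people scoring above the threshold.

    Assumes a normal distribution of scores with the given mean and
    standard deviation, applied to the whole population.
    """
    share_above = 1.0 - NormalDist(mu=mean_iq, sigma=sd).cdf(threshold)
    return share_above * population


# Two hypothetical nations of one million people each.
print(round(expected_above(100, 1_000_000)))  # mean at the Greenwich norm
print(round(expected_above(85, 1_000_000)))   # mean one standard deviation lower
```

The expected count would then be set beside the observed number of prize winners and high achievers. Because the tail of a normal curve thins out quickly, a modest difference in the mean produces a large difference in the expected count, so the comparison is sensitive to errors in both the mean and the population estimate.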

Naturally, any standard procedure has to overcome some difficulties. Are we studying nations, or racial groups, or both? For example, the South African maths team is composed of people with varying genetic backgrounds. Chisala makes the same point about the Canadian chess team. One could try to deduce these genetic backgrounds by surnames, not an error free procedure, or by looking at photos, which is better, but not always precise. It is for this reason that there is a choice about whether we should be counting the wins for national teams, or the wins for individuals, and then classifying the individuals by race. Individuals are most probably preferable.

For example, in a ranking of South African Maths contestants there is a “J.M. MacFie”. My first boss was John McFie and he was European. His wife was African. Should we look at the photos of each contestant, or better still their 23andMe results? The hereditarian viewpoint suggests that such genetic results are indeed required. Chisala takes me to task about whether to measure country team results or the individuals. Counting country teams is quicker, but looking at each individual would give better detail. I was hoping to be guided by photos in the individual cases, but I am happy with either countries or individuals, once one can agree a method which identifies genetic background.

An interesting point which arises from this is how detailed we should be about genetic groups. I have followed the practice of discussing Sub-Saharan Africans, because that is what the debate has focussed on, and because their intelligence test results are pretty much the same. North Africa gets somewhat higher scores, and tends to be left out of these discussions, but could certainly be included. More interestingly, how much granularity should one require about Sub-Saharan sub-groups? I would say that if we have data for genetic sub-groups it is essential to use it. For example, are there Yoruba, Igbo or other tribes who are cognitive elites? It would be fascinating to find them, and I have already speculated, in previous posts, about their likely characteristics.

Should our population calculations omit women and youngsters? I am against that. First, it would introduce contentious further calculations to take account of varying age structures in different populations. Worse, it would mean that the moment a woman or a 13-year-old becomes a Chess Grand Master or excels in any intellectual endeavour, we would have to re-do all our national calculations. There is already some doubt about the total populations anyway, particularly in any country where bureaucracy is weak. I suggest we keep things as simple as possible, and thus reduce error terms. However, if anyone wants to do an international men-only, adults-only calculation and update it every 5 years, that is their choice. I prefer to keep it simple, and do the same total calculations for all countries.

How far should the net be cast as regards intellectual achievements? I suggest as far and wide as possible, or it will be assumed that some results are being held back. I favour those achievements which are in a “universal language” like maths, science and chess. There will always be some doubt about whether people in poor countries have access to knowledge and training, though the spread of internet access goes a long way to dealing with this. (In fact, it should level the playing field in terms of access to knowledge). Poker, Bridge, Backgammon, and Mahjong could be added to the list, because there are international competitions and rankings. I am not suggesting anyone should take part in such activities. Live and let live.

In summary, any conclusions about group intelligence can be called into question if a significant number of that group have exceptional real-life achievements.

Now to go through some particulars.

Individual Chess rankings

My apologies regarding Kenny Solomon. FIDE does indeed rate him as a Grandmaster with a rating of 2412, and World Rank (all players) 2716, World Rank (active players) 1948. As far as I can see there are no Africans in the top 100 players at the moment. Once again, I have just glanced at the country names and their own individual names, and not looked at photos or biographies. Doing a genetic background analysis of the top 1000 players and their ratings would be instructive.

• Category: Science • Tags: Africans, Chanda Chisala, IQ, Scrabble 

World university rankings 2018

The Times Higher Education World University Rankings 2018 are now out, so I had a glance at them. They are claiming theirs is the best, because it covers 1000 research-intensive seats of higher learning, and because it was audited by PricewaterhouseCoopers. What do accountants know about universities? I suppose they checked the figures on the assessments already determined, but the essence of a university is what it discovers. Teaching is a necessity for self-preservation, grant income likewise, citations are a delayed measure of discovery, and all the rest is noise. At least this ranking does not spend too much time on the student experience, whatever that is.

Anyway, the top 25 universities are:

Oxford, Cambridge, CalTech and Stanford, MIT, Harvard, Princeton, Imperial College, Chicago, Zurich, Pennsylvania, Yale, Johns Hopkins, Columbia, UCLA, UCL, Duke, California Berkeley, Cornell, Northwestern, Michigan, Singapore and Toronto, Carnegie Mellon, LSE and Washington.

The score is: UK 5, USA 18, Europe, Canada and Singapore 1 each. It is good to see Britain leading the pack, but the wider picture is the rise of Chinese universities (27th and 30th places), and it will be interesting to see how soon they enter the top 25. Will they lack curiosity and creativity, or will they just march through the institutions, casting aside all before them? IQ results suggest the latter, but they might be conformist, constrained and bound by saucy doubts and fears. We shall see.

As for the top 25 in Psychology, it is good to see my institution gets a good ranking at second place to Stanford. I cannot claim to have contributed to it, because I doubt these measures include bloggers, although blogging is the new publishing, and has a greater reach than lecturing.

The pecking order in Psychology is Stanford, UCL, Princeton, Harvard, Duke, and then 15 of the top 25 are in the US, the other mentions being British Columbia, Toronto, Amsterdam, Karolinska, and Edinburgh.

So, the most highly ranked psychology comes from the United States and Canada, with four from the old continent: 2 British, one Dutch, one Swedish. Basically, Britain and overseas Britain. Of course, that might give just a bit too much credence to the witterings of Anglo-Saxon minds, but why not? The list of influential achievements supports it. The Dutch and Swedish are close in history and ancient ancestry, and so cousins in the European sense of that concept, and can be welcomed without resentment, and just a bit of condescension.

Europe was the first mover with its ancient universities, but has lost its way. Perhaps unreformed religion got in the way. Neither Africa nor South America have reached world status. To find the up and rising universities one can look at the top 100, but that mostly takes in more of the US, Europe and former British colonies. Hong Kong and South Korea show up above 100 but the general point is already made. Setting aside a place for independent thinking is an Anglo-Saxon/European pastime. The field is open for other cultures to develop these seats of learning and pull in the best minds from all over the world. Saudi Arabia has tried that, with oil money, but the general Arab representation is very weak. Perhaps the institutions should be classified by the nominal religions of the host countries. Comparisons are odious, but very interesting.

The demonic possession of low ability

By some oversight, or a lack of careers guidance, I have never become a world leader. I regret not having been able to pass a law forcing journalists to provide a link to the research findings they are reporting. I know that The Search for Truth is a minority interest, but what stops them from linking their scribbles to the original research?

It is hardly fitting, on a day of rest, to be told that in the United Kingdom a large number of child abuse cases are due to the perpetrators believing in demon possession and witchcraft. The Daily Telegraph, normally a newspaper of record, reported:

Figures released by the Department for Education show that 1,460 cases in England included concerns about abuse which was “linked to faith and belief” during the year to March 2017.

The guidance states that such abuse includes the belief that children are witches or possessed by a spirit, demon or the devil, as well as “ritual or muti murders where the killing of children is believed to bring supernatural benefits or the use of their body parts is believed to produce potent magical remedies”.

In other cases, the guidance said, magic or witchcraft is used “to create fear in children to make them more compliant when they are being trafficked for domestic slavery or sexual exploitation”.
Examples include the superstition that calling a wrong number can bring malevolent spirits into the home.

Children can also be scapegoated for misfortune which has befallen other members of the family, such as unemployment or poverty.

The guidance came about following concerns raised about belief in witchcraft among “migrant African communities in England”, the document said.

Full newspaper article here:

Intrigued, I searched for the relevant report, said to be from the UK Department for Education, but my searches failed to come up with any government report. Perhaps it is being held back till an official launching party later next week. Even more likely, it is filed somewhere obscure. Perhaps no-one knows.

Many newspapers, in their various ways, covered the story, but the statistics and the source were not available. Perhaps it was felt that mere citizens should enquire no further, and the agreed summary was all we should see. There were previous reports and guidance from about 2012 onwards. As to this year’s findings, as far as I could find, silence.

You can get the flavour by looking at an Action Plan launched in August 2012.

The practice of torturing children on the basis of a religious delusion has been given an acronym:

Child abuse linked to faith or belief (CALFB) is not a recent phenomenon; nor specific to any given culture or faith.

The figures for which cultures or faiths do this sort of thing are not given.

On a more general note, a charity, NSPCC, says that there were 58,000 children identified as needing protection from abuse in the UK in 2016.

Not only are there problems in getting good data from families, but some of the definitions, such as for emotional abuse, are elastic. The charity wants money, so cannot be considered dispassionate in these matters. However, they set out their referral routes and data sources carefully.

There were 61 child homicides across the UK in 2015/16, fewer than 6 in a million children up to age 17. This is good news, and the trend is downwards. There may be some under-recording when the cause of death cannot be proved, and this covers another 60 cases. Again, the trend is downwards.

Witchcraft cruelty and neglect England

However, cases of cruelty and neglect have increased. Also, based on a 2009 survey, 18.6 per cent of 11 to 17 year olds say they have experienced some type of severe maltreatment, so either they have been missed in the other data, or they are reporting severity in an exaggerated manner. Severe maltreatment would probably leave them incapable.

As regards the general category of perpetrators, hell seems to be siblings, peers and the community. Unfortunately, the table does not distinguish between parents and guardians. Perhaps it is no longer recorded as a distinction.

Witchcraft perps by family connection

In fact, in a later, un-numbered table we get some figures, which go against the predicted direction of more violence being perpetrated by genetically unrelated persons.

Witchcraft severe perpetrators table

The Police have some guidance:

As one might expect by now, all this is a story half-told. Seeing the detail of the child abuse cases, and looking at the genetic, cultural and religious background of the perpetrators would allow us to see if the abuse is indeed, as suggested, to be found in equal proportions in all religions and genetic groups. The actual exemplars are mostly African, which might be a coincidence, but absent proper data, one cannot be sure. In the wider picture, witchcraft is a minor aspect of child abuse, though it makes the heart sink because it appears to be a migration related problem, and why add problems to an already difficult area?

Of course, we could try an oblique approach. What sort of people believe that a child has been possessed by the devil? What sort of people believe that a possessed child is to be treated by being tortured with a pair of pliers? What sort of person is generally prone to superstition?

Superstition and other credulities have been studied, and the initial research suggested that people who have difficulties computing probabilities in a game of chance are also prone to misunderstanding coincidences in real life, hence prone to superstitions. This lovely study stood as an explanation for a long time, until another researcher took the obvious step (in retrospect) of checking how bright the subjects were, and found that their scholastic attainments explained both their miscalculation of chance games and their miscalculation of chance events in life. Intelligence strikes again. Get the whole, very interesting, story here.

I think that it will be found that child abuse “linked to faith or belief” is in fact based on low ability, and while no group has a monopoly of low ability, it would be salutary to see the backgrounds of the miscreants.

In fact, a paper on this very subject, looking at child abuse between 1992 and 2000, a period before the highest mass immigration, finds that:

• Category: Science • Tags: Child Abuse 
An algorithm that learns, tabula rasa, superhuman proficiency in challenging domains.

AlphaGo Zero after 36 hours

It is usual to distinguish between biological and machine intelligence, and for good reason: organisms have interacted with the world for millennia and survived, machines are a recent human construction, and until recently there was no reason to consider them capable of intelligent behaviour.

Computers changed the picture somewhat, but until very recently, whenever artificial intelligence was tried, it proved disappointing. As computers and programs increased in power and speed a defensive trope developed: a computer will never write a poem/enjoy strawberries/understand the wonder of the universe/play chess/have an original thought.

When IBM’s Deep Blue beat Kasparov there was a moment of silence. The best that could be proffered as an excuse was that chess was an artificial world in which reality was bounded, and subject to rules. At this point, from a game playing point of view, Go with its far greater complexity seemed an avenue of salvation for human pride. When AlphaGo beat Lee Sedol at Go, humans ran out of excuses. Not all of them. Some were able to retaliate: it’s only a game: real problems are more fuzzy than that.

Perhaps. Here is the paper. For those interested in the sex ratio at the forefront of technology, there are 17 authors, and I previously assumed that one was a woman, but no, all 17 are men.

AlphaGo used supervised learning. It had some very clever teachers to help it along the way. AlphaGo Zero reinforced itself.

By contrast, reinforcement learning systems are trained from their own experience, in principle allowing them to exceed human capabilities, and to operate in domains where human expertise is lacking.

AlphaGo Fan used two deep neural networks: a policy network that outputs move probabilities and a value network that outputs a position evaluation. The policy network was trained initially by supervised learning to accurately predict human expert moves, and was subsequently refined by policy-gradient reinforcement learning. The value network was trained to predict the winner of games played by the policy network against itself. Once trained, these networks were combined with a Monte Carlo tree search to provide a lookahead search, using the policy network to narrow down the search to high-probability moves, and using the value network (in conjunction with Monte Carlo rollouts using a fast rollout policy) to evaluate positions in the tree.

Our program, AlphaGo Zero, differs from AlphaGo Fan and AlphaGo Lee in several important aspects. First and foremost, it is trained solely by self-play reinforcement learning, starting from random play, without any supervision or use of human data. Second, it uses only the black and white stones from the board as input features. Third, it uses a single neural network, rather than separate policy and value networks. Finally, it uses a simpler tree search that relies upon this single neural network to evaluate positions and sample moves, without performing any Monte Carlo rollouts. To achieve these results, we introduce a new reinforcement learning algorithm that incorporates lookahead search inside the training loop, resulting in rapid improvement and precise and stable learning. Further technical differences in the search algorithm, training procedure and network architecture are described in Methods.

How shall I describe the new approach? I can only say that it appears to be a highly stripped down version of what had formerly (in AlphaGo Fan and AlphaGo Lee) seemed a logical division of computational and strategic labour. It cuts corners in an intelligent way, and always looks for the best way forwards, often accepting the upper confidence limit in a calculation. While training itself it also develops the capacity to look ahead at future moves. If you could glance back at my explanation of what was going on in those two programs, the jump forwards for AlphaGo Zero will make more sense.
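The "upper confidence limit" remark can be made concrete. What follows is a minimal sketch, assuming the selection rule described in the paper's Methods: each candidate move is scored by its average value so far (Q) plus an exploration bonus (U) that grows with the network's prior probability for that move and shrinks as the move is visited more often. The function name, the constant `c_puct = 1.0` and the toy numbers are my own assumptions for illustration, not values taken from the paper.

```python
import math

def select_move(priors, visit_counts, mean_values, c_puct=1.0):
    """Return the index of the move with the highest Q + U score."""
    total_visits = sum(visit_counts)
    best_move, best_score = None, -math.inf
    for move, (p, n, q) in enumerate(zip(priors, visit_counts, mean_values)):
        # Exploration bonus: large for moves the network likes but has rarely tried.
        u = c_puct * p * math.sqrt(total_visits) / (1 + n)
        score = q + u
        if score > best_score:
            best_move, best_score = move, score
    return best_move

# Toy position with three candidate moves (all numbers hypothetical).
priors = [0.6, 0.3, 0.1]        # network's prior probability for each move
visit_counts = [10, 2, 0]       # how often the search has tried each move
mean_values = [0.5, 0.4, 0.0]   # average outcome seen so far for each move

chosen = select_move(priors, visit_counts, mean_values)
greedy = select_move(priors, visit_counts, mean_values, c_puct=0.0)
```

In this toy case the search picks the second move: its average is slightly lower than that of the first, but it has been tried so rarely that its optimistic bound is higher. With the bonus switched off (`c_puct=0.0`) the choice falls back to the move with the best average.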

Training started from completely random behaviour and continued without human intervention for approximately three days. Over the course of training, 4.9 million games of self-play were generated, using 1,600 simulations for each MCTS, which corresponds to approximately 0.4 s thinking time per move.

Well, forget the three days that get all the headlines. This tabula rasa, self-teaching, deep learning, network played 4.9 million games. This is an effort of Gladwellian proportions. I take back anything nasty I may have said about practice makes perfect.

More realistically, few players complete each move in 0.4 seconds, and even those who spend a lifetime on the game will not amass 4.9 million contests. One recalls Byron’s lament:

When one subtracts from life infancy (which is vegetation), sleep, eating and swilling, buttoning and unbuttoning – how much remains of downright existence? The summer of a dormouse.

The authors continue:

AlphaGo Zero discovered a remarkable level of Go knowledge during its self-play training process. This included not only fundamental elements of human Go knowledge, but also non-standard strategies beyond the scope of traditional Go knowledge.

AlphaGo Zero rapidly progressed from entirely random moves towards a sophisticated understanding of Go concepts, including fuseki (opening), tesuji (tactics), life-and-death, ko (repeated board situations), yose (endgame), capturing races, sente (initiative), shape, influence and territory, all discovered from first principles. Surprisingly, shicho (‘ladder’ capture sequences that may span the whole board)—one of the first elements of Go knowledge learned by humans—were only understood by AlphaGo Zero much later in training.

Here is their website’s explanation of AlphaGo Zero.

Alpha Go 40 blocks

The figures show how quickly Zero surpassed the previous benchmarks, and how it rates in Elo rankings against other players.

The team concludes:

Our results comprehensively demonstrate that a pure reinforcement learning approach is fully feasible, even in the most challenging of domains: it is possible to train to superhuman level, without human examples or guidance, given no knowledge of the domain beyond basic rules. Furthermore, a pure reinforcement learning approach requires just a few more hours to train, and achieves much better asymptotic performance, compared to training on human expert data. Using this approach, AlphaGo Zero defeated the strongest previous versions of AlphaGo, which were trained from human data using handcrafted features, by a large margin. Humankind has accumulated Go knowledge from millions of games played over thousands of years, collectively distilled into patterns, proverbs and books. In the space of a few days, starting tabula rasa, AlphaGo Zero was able to rediscover much of this Go knowledge, as well as novel strategies that provide new insights into the oldest of games.

• Category: Science • Tags: Artificial Intelligence, Intelligence 


As is my usual custom, I wrote to the authors whose work I had commented upon in my previous post:

I asked John Protzko how long the effects of intelligence boosting interventions lasted. He said that he thought this “fadeout” effect was likely to happen somewhere between 3 and 5 years after the intervention had finished.

Here is the paper he wrote on this issue, and his discussion considers various possible explanations for apparently real gains eventually fading away.

I speculate that when a skill is acquired, but is slightly out of reach of one’s “real” intellectual levels, it cannot be fully internalized, and therefore fails for lack of integration into everyday skills, in the way that attending a conference provides a temporary boost to intellectual excitement and apparent understanding of complex problems, but soon fades to humdrum insensibility. Protzko describes a version of this in paragraph 4.1.5. Genetic set point, and argues against it.

Protzko’s preferred explanation involves the concept of “scaffolding”: an environmental effect of a cognitively demanding environment being required to sustain and develop the cognitive skills acquired by the special intervention.

Elliot Tucker-Drob comments on my remarks about the Ritchie and Tucker-Drob paper:

I, for one, am most convinced by the policy change approach, as it is as close to an experiment as one can get. I understand your concerns. I do not entirely agree with them, but they are soundly argued. I would say that the results of the policy change will not stare us in the face in raw data because the policy only affects the unobserved subgroup of individuals who would have otherwise not completed the new minimum schooling requirement.

So, for example, if raising minimum schooling by 1 year increases schooling by 1 year in 10 percent of the population, and 1 additional year of schooling raises IQ by 1.5 points, we would only expect to see an effect of roughly .15 IQ points in the population as a whole.

The instrumental variable methodology that economists tend to use reverse engineers this math. It rescales the small population IQ boost associated with a policy change relative to amount by which average education was increased in the population (e.g. a 1 year raise in years of minimum compulsory education that only affects 10% of people amounts to a .10 year increase in the population as a whole) to get at IQ points per year.

The logic, and math, behind this has been formally worked out, and it tends to be very robust. However, as we mention in our discussion, the treatment effect is what is termed a “Local Average Treatment Effect,” (known as a LATE by many economists) that may not generalize to people who would exceed the minimum schooling level even in the absence of the policy.

Those are just my thoughts about why my money is on the policy change design, even if it isn’t particularly conspicuous in raw historical national intelligence data.
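The arithmetic in Elliot's example can be written out in a few lines. This is only a sketch of the rescaling he describes (population IQ gain divided by population schooling gain), using the illustrative numbers from his email; the function names are hypothetical, and a real instrumental variable analysis would estimate both quantities from data, with standard errors.

```python
def population_effect(share_affected, iq_per_year, years_added=1.0):
    """Average IQ gain across the whole population when only a share is affected."""
    return share_affected * years_added * iq_per_year

def rescaled_estimate(population_iq_gain, population_schooling_gain):
    """IQ points per year of schooling, recovered from the two population-level changes."""
    return population_iq_gain / population_schooling_gain

share = 0.10                                  # 10 percent stay on for the extra year
iq_gain = population_effect(share, 1.5)       # roughly 0.15 IQ points overall
schooling_gain = share * 1.0                  # 0.10 years of schooling overall
recovered = rescaled_estimate(iq_gain, schooling_gain)   # back to 1.5 points per year

print(f"Population IQ gain: {iq_gain:.2f}")
print(f"Recovered effect per year of schooling: {recovered:.2f}")
```

The recovered figure applies to the 10 percent who changed their behaviour because of the policy, which is the "local" part of the Local Average Treatment Effect.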

I replied that one might have expected a very visible and permanent rise in intelligence scores after the school reform had been introduced, and drew a crude sketch of the predicted results in a figure.

Elliot replied:

A reason you might not see the abrupt step function in national data (as in the figure you attached) is if there is variable roll-out across school districts (as there was in Brinch & Galloway’s study). This would result in a smoothing over time, but the step function would become apparent when you center each school district’s time-series data around the date of policy implementation.

Here is an example of how the smoothing happens when there is variation in change point (this comes from a paper on terminal decline).

education years step function
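The smoothing Elliot describes is easy to reproduce with made-up numbers. The sketch below assumes ten hypothetical districts that each gain 1.5 IQ points in the year they implement the reform, with implementation staggered over 1960 to 1969; none of these figures comes from Brinch & Galloway. The national average by calendar year creeps up gradually, while the average by years since implementation shows the step.

```python
def district_series(change_year, years, baseline=100.0, jump=1.5):
    """Scores for one district: flat, then a permanent jump from change_year on."""
    return [baseline + (jump if year >= change_year else 0.0) for year in years]

years = list(range(1955, 1976))
change_years = list(range(1960, 1970))        # hypothetical staggered roll-out
panel = {cy: district_series(cy, years) for cy in change_years}

# National average by calendar year: the step is smeared over a decade.
national = [
    sum(panel[cy][i] for cy in change_years) / len(change_years)
    for i in range(len(years))
]

def centred_average(panel, years, offsets):
    """Average score by years since each district's own implementation date."""
    result = {}
    for k in offsets:
        values = [series[years.index(cy + k)] for cy, series in panel.items()]
        result[k] = sum(values) / len(values)
    return result

centred = centred_average(panel, years, range(-5, 6))

for year, score in zip(years, national):
    print(year, round(score, 2))
for offset, score in centred.items():
    print(offset, round(score, 2))
```

In the calendar-year series the halfway point is reached only in 1964; in the centred series the whole 1.5 points appears between offset -1 and offset 0.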

I replied: I can see now that the change would be far less abrupt than I imagined, but it would have to be detectable. I look forward to anything which might come up later which can be studied to confirm the expected increase in ability.

Stuart Ritchie comments:

There really were a lot of unexpected results in the meta-analysis. For instance, I expected that the “fade-out” would really be quite substantial in the “Control Prior Intelligence” and “Policy Change” studies, but the long-range ones, such as the former design done in the Lothian Birth Cohorts, still appear to show effects (and for the Lothian Birth Cohort, it’s even more believable since, at least for the Moray House Test, it’s the exact same measure at the early and outcome tests).

Another unexpected result was the size of the effect in the “School Age Cutoff” design. I’m very sceptical that an effect as large as that will persist later into life. One possibility is that there is a substantial, but not total “fadeout” of the effect after the completion of school, but it’s obviously rather tricky to test for long-term effects using this particular research design, which compares adjacent years.

My thanks to these three authors for their additional remarks.

• Category: Science • Tags: Intelligence, IQ 
Stay even longer at school?

Mozart effect

Although it is popular for people to claim that they don’t know what intelligence is, most people show an interest in boosting their intelligence. Funny, that. These schemes come around every few years: getting babies in the womb to listen to Mozart, taking vitamins and concentration enhancing drugs, counting backwards in the N back procedure: that sort of thing. Doug Detterman, founder editor of Intelligence, who waded through 50 years of this moonshine, said that he was not against finding something, but simply had to note that on close examination all these schemes had turned out to be a disappointment. Many techniques produce some effects, but few generalize and persist.

A common thread in this wishful thinking is that the procedures should be easy and fast: 20 training sessions, a couple of months of practice, that sort of thing. Unlikely. However, a stronger case has been made for that boring activity: staying longer at school. Whenever people intend to waste their time boosting their intelligence with the latest training technique, I tell them to learn something useful like statistics, experimental methods, genetics, computer programming, maths, game theory, physics and even history and philosophy. I don’t for a moment imagine it will boost their intellects, merely give them some content and some tools for thinking. On that note, Tony Flew’s “Thinking about thinking” Fontana, 1975, is a good start.

However, could studying difficult subjects boost intelligence? I looked at this argument some time ago, and came to the conclusion that it probably boosted intelligence by 0.6 IQ points. I was a bit doubtful about it doing any more than that, because we did not have a long data series which would put the matter beyond dispute. I did not accept the authors’ estimate of an enormous 3.7 IQ points per year gain, simply because of a lack of historical data. That is to say, if the Norwegian schooling reform really boosts IQ, then the long data series before and after the reform would show a sustained upward tick in the national intelligence. I haven’t been able to find those data, though they may exist. I give my argument in the link below. You will see that in that post I cover a paper on this topic by Stuart Ritchie and colleagues, of which more below.

However, the caravan moves on, and now we have two papers saying that schooling boosts intelligence. You wait ages for a bus, and then two come along.

Raising IQ among school-aged children: Five meta-analyses and a review of randomized controlled trials. John Protzko. Developmental Review, 46, 2017, 81-101.

There have been 36 RCTs attempting to raise IQ in school-aged children. Nutrient supplementation includes multivitamins, iron, iodine, and zinc. Training includes EF and reasoning training, and learning a musical instrument. We meta-analyze this literature to provide a best-evidence summary to date. Multivitamin & iodine supplementation, and learning a musical instrument, raise IQ.

In this paper, we examine nearly every available randomized controlled trial that attempts to raise IQ in children from once they begin kindergarten until pre-adolescence. We use meta-analytic procedures when there are more than three studies employing similar methods, reviewing individual interventions when too few replications are available for a quantitative analysis. All studies included in this synthesis are on non-clinical populations. This yields five fixed-effects meta-analyses on the roles of dietary supplementation with multivitamins, iron, and iodine, as well as executive function training, and learning to play a musical instrument. We find that supplementing a deficient child with multivitamins raises their IQ, supplementing a deficient child with iodine raises their IQ, and learning to play a musical instrument raises a child’s IQ. The role of iron, and executive function training are unreliable in their estimates. We also subject each meta-analytic result to a series of robustness checks. In each meta-analysis, we discuss probable causal mechanisms for how each of these procedures raises intelligence. Though each meta-analysis includes a moderate to small number of studies (< 19 effect sizes), our purpose is to highlight the best available evidence and encourage the continued experimentation in each of these fields.

The author is looking at controlled trials in which individuals are given intelligence tests before and after programs which are at least two weeks long, in children aged 5 through to pre-adolescence, looking at effect sizes at immediate post-test. No follow-ups are mentioned, not even 6 months later, the usual minimum for a clinical intervention.

The multivitamin study produces such a small effect that it is silly to test it for significance. Protzko says that there was “an incredibly small but significant effect of taking multivitamins on IQ (g = 0.097, 95% CI = 0.006 to 0.187; see Fig. 1).” The abstract says: “We find that supplementing a deficient child with multivitamins raises their IQ,” which I think exaggerates what was found, which was incredibly small. Furthermore, only 3 of the 17 studies have more than 100 people in the active treatment condition, and that is rather small even for a controlled trial.

Iodine supplementation shows a gain of half a standard deviation, but the author is rightly cautious about the papers, saying it only works for iodine-deficient children, and also that the effect goes down from g = 0.5 to g = 0.3 when one outlier study is removed. Still a sizeable effect for the target population, though not something which will have general application.

Under “environmental changes” or what I would call experimental enrichment studies, there were no effects for balance and coordination exercises, home academic support as part of the Abecedarian project (bang goes one theory about family environments), reasoning training (rejected because it is too much like “teaching to the test”), executive function training (a slight effect, marred by publication bias) and finally music lessons.

We found that teaching a child a musical instrument raises their IQ by over a third of a standard deviation (g = 0.421, 95% CI = 0.196 to 0.646; see Fig. 5). In addition, there was no evidence for heterogeneity in this sample (Q(5) = 4.68, p > 0.466; I² = 0%).

The author links this, speculatively, to rhythm perception and discrimination. However, sample sizes are small, ranging from 10 to 32 children in the training condition.
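To put these effect sizes on a familiar scale, here is a rough conversion from g to IQ points. It assumes the conventional IQ standard deviation of 15, which is my assumption rather than something stated in the excerpts above, and it makes no correction for range restriction. The g values are the ones quoted from the paper.

```python
IQ_SD = 15.0  # assumption: conventional IQ scale with a standard deviation of 15

def g_to_iq_points(g, sd=IQ_SD):
    """Convert a standardised effect size into IQ points on the assumed scale."""
    return g * sd

effects = {
    "multivitamins": 0.097,
    "iodine (all studies)": 0.5,
    "iodine (outlier removed)": 0.3,
    "music lessons": 0.421,
}

iq_points = {name: g_to_iq_points(g) for name, g in effects.items()}

for name, points in iq_points.items():
    print(f"{name}: about {points:.1f} IQ points")
```

On that assumption the multivitamin effect is about one and a half IQ points, and the music effect a little over six.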

In his general discussion the author makes a case for the effects of education on intelligence, whilst conceding that because most of the studies are on poor children there is a range restriction which may affect the wider applicability of the results.

He concludes:

Studies supplementing inadequate diets with multivitamins raised IQ. Studies remediating mild iodine deficiency raised IQ. Studies involving teaching a musical instrument raised IQ. After correcting for range restriction, this corresponded to an increase of 4 IQ points in the population.

My conclusion is that Protzko has made a reasonable case, in a carefully argued paper, but everything he reports is about immediate effects, and before coming to any conclusions it would be good to know whether the effects last more than 6 months.

• Category: Science • Tags: IQ, Nutrition 

children in queue all 5

A UK charity, The Sutton Trust, has urged universities to take in students with grades which are two levels below the usual entry requirements, arguing that some students are capable of doing well at university, but have low scholastic attainments because of environmental circumstances: being poor, being at a bad school, and having to look after sick relatives.

At this point you may imagine that I am lining you up to tell you, once again, that intelligence tests were expressly designed to identify true ability, and that their use has led to the better detection of such bright, but adventitiously disadvantaged youngsters. Not so. No testing of ability is involved.

It seems that many UK universities have such “contextual” programs, but the University of Bristol features prominently in this debate so I will use it as an example. It describes its Contextual Offers requirements thus: being in the bottom 40% of schools in terms of attainments or progression to higher education; or living in a postcode area with low progression to higher education; or having completed a University of Bristol outreach program; or having spent more than 3 months in local authority care. All of these are misfortunes, but none of them include any measures of cognitive ability.

As is my usual habit, at this point I give you due warning that you may wish to stop reading here. However, a very striking claim is made: that such students do just as well as students admitted in the usual way, based on scholastic attainments. Worth a look? I thought so.

The Sutton Trust research measured scholastic attainments using the best 3 A Level results. That in fact underplays the attainments of the best candidates, who do 4 or 5 A levels, and sometimes another half A level. This will have an impact on their calculations about the differences between normal and contextual admissions at the most prestigious universities, which are most able to demand the highest academic attainments. The authors then calculate what the entrance requirements were likely to have been, using an approximation, since they did not have actual data. Further, I cannot find any evidence that they know which degree courses were followed by candidates. Some disciplines are more demanding than others, and command correspondingly higher premiums in the occupational marketplace.

Anyway, on page 28 of their publication they lay out their case for these offers not being associated with poor outcomes. They say:

But is there any evidence that universities who appear to be more likely to contextualise are also more likely to see higher dropout rates, lower degree completion rates and lower percentages of students getting firsts or 2:1s? We find little evidence of this, at least using the two potential measures of contextualisation described above.

Figure 4.5 presents correlation coefficients showing the relationship between the degree of contextualisation evident from Figures 4.3 and 4.4 with average degree outcomes for students on these courses. If contextualisation were adversely affecting degree outcomes, then we would expect to see a negative correlation between the percentage of courses on which universities appear to offer lower entry grades to students from low participation neighbourhoods and dropout rates, and a positive correlation with degree completion and degree class. By contrast, we would expect to see a positive correlation between the difference in A-level grades of students from low participation neighbourhoods and the standard offer and dropout rates, and a negative correlation between this gap and degree completion and degree class results.
Sutton Trust contextualized offers

It is not easy to make sense of these correlations, other than to observe that they are rather small, particularly with only 25 data points. None of these correlations are remotely significant in the statistical sense. The bigger problem is to understand what they mean.
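For readers who want to check how large a correlation would need to be with 25 data points, here is a minimal sketch using the standard t test for a Pearson correlation. The critical value 2.069 is the two-tailed 5% point of Student's t with 23 degrees of freedom; the example correlation of 0.20 is hypothetical and is not taken from the Sutton Trust figure.

```python
import math

T_CRIT_DF23 = 2.069  # two-tailed 5% critical value of Student's t, 23 degrees of freedom

def t_from_r(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def critical_r(t_crit, n):
    """Smallest absolute correlation that reaches the given critical t with n points."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

n = 25
threshold = critical_r(T_CRIT_DF23, n)   # about 0.40
example_t = t_from_r(0.20, n)            # hypothetical correlation of 0.20

print(f"|r| must exceed about {threshold:.2f} to reach p < .05 with n = {n}")
print(f"r = 0.20 gives t = {example_t:.2f}, against a critical value of {T_CRIT_DF23}")
```

With only 25 universities, a correlation has to be close to 0.4 in absolute size before it clears the conventional threshold.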

Administrative data from 25 universities which admit some students on this “contextual” basis have been examined, even though we do not know what percentage of students were so admitted. It seems likely to be a small percentage, say about 3% of the whole student body. These students would be worth studying as a group, if one could identify them. I imagined, hearing the bold statements during radio interviews, that such a comparison between normal admissions students and “contextual” admissions students had been carried out, and that I could see the results of a t test comparison. Instead, assumptions have been drawn from a chain of prior assumptions, as shown in Table 4.1, which shows very little impact from all these “contextual” factors, except perhaps the “free school meals” low income measure. There is little difference between the implied groups, which leaves me bewildered about why they are considered important factors for contextual admission in the first place. Their whole argument hinges on the assumption that children from poor families will be, in intellectual terms, no different from children from wealthier families. The possibility that families who do not require free meals are brighter does not figure in these discussions.

In my view the results provide no grounds for saying that one should discard scholastic attainment as a way of regulating entry to university. On the contrary, this publication asserts but does not prove that being poor is an indicator of undetected ability. If so, why not use the tests designed to identify such talents?

• Category: Science • Tags: Academia, Political Correctness 
Test results of 550,492 individuals in 123 countries

CHILD-EXAMS-head in hands

Few subjects arouse as much ire as national IQs. Questions are asked about the cultural appropriateness of the tests, whether they have sufficient scope to assess the different talents of racial and cultural groups, the representativeness and size of the samples, and even whether those results are reported correctly.

National scholastic achievements, on the other hand, are greeted with widespread publicity, discussed anxiously in government and educational circles, and sometimes rather naively accepted as an unerring measure of a nation’s educational system. In some ways this is understandable, because PISA and similar studies are well-funded, are global in scope, and repeated at regular intervals, allowing progress to be monitored. Yes, every test can be gamed, and national results vary considerably in coverage, representativeness, and probably also in levels of cheating. However, these are matters for the sort of people who read the supplementary annexes, and persons of that sort cannot be considered normal.

Every test, whether “school near” like those designed for PISA or “school far” like those designed for intelligence testing, is subject to the same concerns about sampling, measurement invariance, individual item analysis, and the appropriateness of summary statistics. Why the difference in public response to these two different points on the assessment spectrum? Perhaps it is as simple as noting that in scholastic attainment there is always room to do better (or to blame the quality of schooling) whereas in intelligence testing there is an implication of an immutable restriction, unfairly revealed by tricky questions of doubtful validity.

Perhaps it is a matter of publicity. PISA has the money for brochures, briefing papers, press conferences, meetings with government officials. Richard Lynn put his list together in his study, and came up with results that many were happy to bury.

Now we have David Becker taking over the database, and doing the whole thing again. Here is the 3rd major iteration of his revision. He tells me:

In the last six months, I have been able to increase the number of sources used from 253 to 357 and the number of nations from 92 to 123, and also to make many improvements in the methods. At present, the database contains samples to a total of 550,492 individuals.

Here is a diagram showing the relationship between the newly established IQ values (David Becker, X axis) to those of Lynn and Vanhanen on Y axis (L & V).

Becker and Lynn compared

The correlation is .90 for 305 Comparisons. The average of the IQ variations is only 1.07 with a standard deviation of 5.86. This means that around 75 % of the new IQ measurements do not deviate more than 5.86 points from the original measurements. A deliberate manipulation of the figures by Lynn and Vanhanen, as the two scientists have often been accused of, I cannot confirm with the best will. More still, looking at the polygono shuffle (dotted) in the diagram, it can be seen that, compared to me, Lynn and Vanhanen showed higher values, especially in the IQ-Weak African and IQ-Strong Northeast Asian samples. (My note: slightly over-estimated African and East Asian intelligence).

However, it was the hypothesis that Lynn and Vanhanen wanted to test that countries with higher IQ would also have greater economic strength (GDP / head). However, these countries are, in particular, in the range from 95 to 105. Japan as a country in North-East Asia with the highest GDP / head of $41,300 and a national IQ of 104-107, for example, is far below the US with a GDP / head of $57,400 and a national IQ of 97-99. If Lynn and Vanhanen were deliberately increasing North East Asian IQs, then this would have reduced the support for their own hypothesis.

In PISA style, here are the highlights:

1. The population-weighted cross-national mean IQ-score is 89.03, with SD of 12.89, for 123 nations. There are roughly 550,000 individuals in the included samples.

2. The countries of Latvia and Belarus are new in the dataset and are included in the geographic means, but Latvia still has poor data quality.

3. At the level of records (source), my re-estimated (DB) and Richard’s original (L&V) data give:

DB: M=85.58; SD=13.73; N=358
L&V: M=85.36; SD=12.69; N=315

They are highly similar. The mean difference was estimated for 314 records as only 1.06, with a SD of 5.84. 75% of the re-estimated IQs are within this SD.

4. But I would also emphasize that there are some other re-estimated scores which are more than 15 IQ points away from Richard’s and the reason for this has to be determined urgently. Especially scores from Coloured Progressive Matrices (the new ones) are sometimes implausible.

So, it is overall important for me to say that this is a work in progress and the dataset is more suitable to find global patterns rather than the exact IQs of single nations.

In the spirit of open science, here is Becker’s work in progress for you to look through.
Here is the entire spreadsheet

Read the Manual 1.1 to get an understanding of the basic terms and categories.

On the spreadsheet the simplest summary is in “Favorites” and there is more background material in “Collection”. The nitty-gritty is in “Calculations” but there is even more detail in the further tabs.

For users who want quick access to IQ lists the table “FAVORITES” is recommended. It contains the final estimated national IQ-scores without additional information.

National IQ-scores in column D (IQ(DB)) are based solely on data repeatedly checked and partly recalculated by David Becker. For these scores the highest possible amount of additional information is provided. Therefore, these scores are best suited to those who want to focus on rechecked data from transparent and highly standardized methods.

National IQ-scores in column E (IQ(L&V)) were taken from the latest version of the dataset from Lynn and Vanhanen (2012, Table 2.1). All necessary corrections were done by Lynn & Vanhanen without revision by David Becker. Therefore, these scores are best suited to those who prefer to use original instead of revisited data, and the highest possible number of sources per nation.

National IQ-scores in column F (M(DB,L&V)) were calculated by unweighted means of column D and E. Therefore, these scores are best suited to those who prefer to use original and revisited data combined.

National IQ-scores in column G (IQ(L&Vo)) were calculated by David Becker as means from every single source which was not revisited. All necessary corrections were done by Lynn and Vanhanen without revision by David Becker.

National IQ-scores in column H (M(DB,L&Vo)) were calculated as means from data in columns D and G, weighted by number of sources. If a national IQ-score is not given in column D, the score from column E is used. For these scores the highest possible number of sources was included in the table “RECORDS”. Therefore, these scores are best suited to those who focus on a compromise between quality and quantity.
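The column definitions for F and H can be sketched as two small functions. This is an illustration of the descriptions above, not Becker's actual spreadsheet formulas: the function names and the example numbers are hypothetical, and the handling of a missing column G value is my own assumption, since the description does not cover that case.

```python
def column_f(iq_db, iq_lv):
    """Column F: unweighted mean of column D (IQ(DB)) and column E (IQ(L&V))."""
    return (iq_db + iq_lv) / 2

def column_h(iq_db, n_db, iq_lvo, n_lvo, iq_lv):
    """Column H: mean of columns D and G weighted by number of sources.

    Falls back to column E (iq_lv) when column D is missing, as described above.
    """
    if iq_db is None:
        return iq_lv
    if iq_lvo is None or n_lvo == 0:
        # Assumption, not stated in the description: with no unrevised
        # sources, use the re-estimated score on its own.
        return iq_db
    return (iq_db * n_db + iq_lvo * n_lvo) / (n_db + n_lvo)

# Hypothetical nation: three rechecked sources averaging 90,
# one unrevised source at 96, and an original L&V score of 94.
f_score = column_f(90.0, 94.0)
h_score = column_h(90.0, 3, 96.0, 1, 94.0)
h_fallback = column_h(None, 0, 96.0, 1, 94.0)

print(f_score, h_score, h_fallback)
```

Weighting by number of sources pulls the combined score towards whichever estimate rests on more samples, which is the compromise between quality and quantity that the description mentions.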

• Category: Science • Tags: IQ, Psychometrics 
James Thompson
About James Thompson

James Thompson has lectured in Psychology at the University of London all his working life. His first publication and conference presentation was a critique of Jensen’s 1969 paper, with Arthur Jensen in the audience. He also taught Arthur how to use an English public telephone. Many topics have taken up his attention since then, but mostly he comments on intelligence research.