This has been my best ever year, with 448,525 pageviews, an average of almost 9000 pageviews per post. These posts provoked 1.25 million words of comments, another all-time record, not bad for a mere 50 posts. The range of comment was very broad, the positions adopted often diametrically opposed, and quality of the best commentators outstanding. I wanted to reward those who stuck to the argument and provided references in support of their views, and Ron Unz had already noted those commentators, and arranged a way of doing so. Thanks to them, and to all of you for reading, and writing enthusiastically in response.

The top ten posts are shown below:

The world’s IQ of 82 drew the most attention any of my posts has ever received. Readership at 78,663 was almost 8 times higher than my most popular post last year. A viral post, it would seem. Global politics ought to take account of human capital. The next iteration of the world IQ database will probably show that IQ 86 is the best estimate, particularly if very low scores in some studies are excluded. In strict terms there is no justification for such an exclusion, since all results must be accepted on an equal basis, but tests of study quality are appropriate in my view, and I think a quality-weighted estimate will be around IQ 86. This very detailed and extensively documented and explained collection of papers on national intelligence is now a great resource for researchers. Contributions and referees are welcomed, as further relevant results are tracked down and included.

Although he has blocked me and so will not be able to read this, I am most grateful to Nassim Nicholas Taleb. He wrote an attack on intelligence testing which was so silly I decided to ignore it. Some of you asked me for my opinion, so somewhat against my will I went through it, correcting what I saw as the most glaring errors. That drew in a large number of readers, and overnight gained me 350 new Twitter followers. The moral is: rebut promptly, and never avoid an open goal.

I have always been fascinated by human errors, including my own. An aeroplane cockpit should be finely tailored to the human mind. When it is not, passengers die. The saga of the Boeing 737 Max 8 and its internal contradictions was a real-life HAL 9000, as in Stanley Kubrick’s “2001: A Space Odyssey”, a rogue computer that attacked the crew. Boeing produced software which had a mind of its own. When I called foul on it, I worried I was jumping the gun. I was a Boeing fan, and had previously lambasted the Airbus side-stick as being opaque, but the MAX software fix was terribly flawed, and would have been picked up by old-style Boeing engineers. Operators must have an accurate conception of how controls work. The comments on these posts were an education: there was much to learn from knowledgeable readers with technical information to impart.

Other posts gained readership by taking on various critics of the study of group differences and of intelligence testing. This is debate as it should be: an exchange of views about supportive evidence. I argue that genetics is a source of individual and group differences, not the sole cause, but an important one. On intelligence testing, the surprising finding is that intelligence is ridiculously easy to measure, and adequate tests take only a few minutes. Although these posts of mine were popular here, I am under no illusions: the criticisms which I rated as mistaken will have been read and believed by far more people than I reached in these refutations.

In the list above, look at the last column. Readers are having a proper read: they dwell a fair time on each post. The Boeing posts were a particularly good long read.

Finally, for aficionados, look at what is not in the Top Ten: for the first time “The 7 Tribes of Intellect” got pushed into 11th place, albeit missing inclusion by a mere 39 pageviews.

Some readers just dip in: I suppose they came to visit the “viral” posts and then, edified, went elsewhere. At the other end of the spectrum, there were very welcome repeat visits from loyal readers.

Bluntly, some people read me by mistake: I should have called the blog “Supplementary Statistics Appendix 3” to give them fair warning of what to expect. I hope they found many other more congenial places to satisfy their interest in psychology.

Here is a clue: although readers span the age range, with a big peak among those who are beginning their careers (and liberating themselves from politically correct educators?), 83% of them are men, probably with a technical approach to life. They would like to know how things work, and expect to handle some numbers along the way.

Here is a new finding. Turks came to visit, but did not stay long. Why? Perhaps they wanted to know what evolution was about, and then got cut off. Any ideas?

The US predominates, as in real life. On a broader perspective, out of a total of 199 thousand readers, 119 thousand came from the British Empire (as was). Good to see cultural continuity. Readers in Sweden and Germany feel it worth having a look. France…. well, I am pleased to welcome inquisitive Franks.

I know there are many other far more popular psychology blogs, but given the necessarily technical nature of intelligence research, I think this is a good showing.

Twitter followers have risen from 4,500 last year to 5,825 now. I have not totally lost interest in Twitter, but have tweeted less, restricting myself to doing so when putting up new posts on the blog. I have given up looking at Twitter impact statistics, while still being interested in my blog statistics. Also, I have let many tweets pass me by, without responding. Each to their own opinion.

As always, I favour blog commentators who are evidence-based, responsive to counter-arguments, tough on all claims but kind to other commentators. To those who went out of their way to explain as well as argue, I appreciate your work enormously. Some of the exchanges should be written up and posted in their own right.

If you can, please ensure that your anonymous handle is distinctive; that will help put your line of reasoning into context. If your name is Legion you are still as safe as when you call yourself Anon.

Back to the blog. 1,132,000 pageviews is more than I ever hoped to achieve. Thanks for reading, and Happy New Year.

  1. dearieme says:

    Thank YOU, old fruit.

    Taleb is a strange one. “He wrote an attack on intelligence testing which was so silly …”

    There’s something pleasingly circular about being stupid on the subject of intelligence tests.
    As a schoolboy I wanted to be good at French. I wasn’t but I didn’t claim the examiners were frauds.

    But I digress. Is there, in your view, a critic of intelligence testing who makes good points?

    • Replies: @James Thompson
  2. @dearieme

    Yes, critiques which correct the notion of absolute precision in test scores; critiques which explain that the predictive power is good but not absolute.

    • Replies: @res, @Twodees Partain
  3. Thank you Dr. Thompson, thanks a lot. As always, I’ve read your article from the first to the last line. Reminds me of that Maxwell Coffee ad, blues legend Mississippi John Hurt sings about in his Maxwell Coffee Blues and which says about Maxwell Coffee: Good ’til the Last Drop! 

    (At least one of the first songs ever written for a radio-ad).

    Your 737 Max-posts and the comments were quite easily my most thrilling read of 2019. And a story the established media missed out on in its entirety. – Which was one of the great miracles of this year. Blogs like yours are now the places, where even the hip (and hot) stuff is dug up. More than once, while reading these comments, I thought to myself: What are they going to do – at the Atlantic Monthly now? – Just buy and edit what had been published here? – They could have tried to do so, but instead, they chose to write – nothing! of any interest in this affair. A “Changing of the Guards” (Bob Dylan).

    • Replies: @James Thompson
  4. res says:

    Happy Unziversary! And Happy New Year!

    Would it be possible to create a table with unique page views and number of comments? I suspect “The 7 Tribes of Intellect” would have a high ratio. Perhaps a decent measure of timeless quality? Another measure might be number of page views received N months after publication.

    Speaking of that post, I notice there is a “Classics” button on your archive page. But that page has nothing more recent than 2015. Unless Classics just means “Old” would it be possible to add some of your best newer posts to that page? Or alternatively, create a “Best Of” or “Start Here” page which would provide a more easily navigable introduction to your oeuvre?

    P.S. I share your curiosity about Turkey. Does anyone have any ideas? If referral link information is available that might tell you something.

    P.P.S. I went back and looked at “The 7 Tribes of Intellect.” Is there any chance of turning your May 22, 2019 comment into a larger post? I think that would be an interesting topic if you would expand on it more. Link for anyone curious:

    • Replies: @James Thompson
  5. res says:
    @James Thompson

    Gathering a list of some of the best critiques (and critics) would be useful for “steelmanning” other critics. More on steelmanning with some interesting comments:

    I’m tempted to start using Logical Fallacy Referee images here:
    Would that be too over the top?

    • Replies: @res
  6. res says:

    Here is an older (2009) version of the Logical Fallacy Referee idea :

  7. @res

    Waste of talent is very sad, but I think the larger task is to update the whole post, and I can’t face that at the moment. However, it will be added to my To Do list, if such things still exist.

    • Replies: @res
  8. I must update the classics.

  9. res says:
    @James Thompson

    The optimistic take would be how much better we are about that now. The example you gave was caused by lack of opportunities for women at the time. Do you have (m)any examples of IQ tests enabling talent to flourish? One that comes to mind for me is a Swedish friend who had opportunities open up for him after taking their mandatory military service tests. He ended up working at various high end Silicon Valley companies.

  10. I found one very high ability child in the first year of my career. I think he was referred to Child Psychiatry because of misbehaving at school. Although the Wechsler Intelligence Scale at age 8 or 9 (I forget his age) correlates only about .6 to .7 with adult intelligence, it was safe to assume he was high ability. I explained this to his mother, and then got his father to ring me up so as to give them both guidance about getting better reading material and educational opportunities. Dad was very excited, saying he knew his kid was bright, but did not know he was IQ 145 bright. Sadly, never knew what happened to him in later life. He was a black Afro-Caribbean boy.

  11. CanSpeccy says: • Website

    On intelligence testing, the surprising finding is that intelligence is ridiculously easy to measure, and adequate tests take only a few minutes.

    Good God! Such clear and conclusive arguments to disabuse you of that silly notion, and you have absorbed not a word of it.

    You have yet even to recognize the need for an operational definition of intelligence that corresponds with the meaning of the word as defined by the Oxford English Dictionary: The ability to acquire and use knowledge and skills.

    The claim that IQ tests measure intelligence by a single number is asinine. Intelligence is multi-faceted and individual intelligence differs according to the task or skill to be demonstrated.

    This view is not only obvious to anyone with any common sense or knowledge of humanity, but is manifest in the modular structure of the brain, each module subject to its own set of genes, and educational and environmental influences.

  12. @James Thompson

    “the best commentators outstanding.”

    Here’s a comment on that fragment of a line in your article: The word that would serve you better is ‘commenter’. A commentator is a person who provides commentary to the audience of an event such as a ball game, or some other contest, while a person making comments in a discussion is a commenter, FWIW.

    Congrats on three years at UR.

    • Replies: @dearieme
  13. I like it. Commenter. Thanks.

  14. dearieme says:
    @Twodees Partain

    And while we’re at it people who attend conferences are attenders not attendees.

    • Replies: @CanSpeccy
  15. CanSpeccy says: • Website

    And while we’re at it people who attend conferences are attenders not attendees.

    Not so, according to the ultimate authority, the Oxford English Dictionary.

    An attendee is:

    a person who attends a meeting, etc.

    Whereas an attender is:

    a person who goes to a place or an event, often on a regular basis

    • Replies: @dearieme
  16. Thank you for all of your efforts to share your knowledge with the masses. I have learned a lot reading your work here, and even before you joined Unz.

  17. dearieme says:

    Oxford? Pah. All it’s doing is adopting the cretinous position that if enough dimwits repeat an error that makes the error right.

    • Replies: @res, @CanSpeccy
  18. res says:

    My copy of the Compact OED does not even have the word attendee. For attender the third definition is the meaning you have in mind.
    3. = Attendant
    And gives a 1704 example: “I was a constant attender at Councils.”

    My Unabridged Webster’s also does not have attendee and offers a single definition for attender: “one who attends.”

    This page looks like a useful account:

    attendee (n.)
    “one who attends” (something), 1951, from attend + -ee. Attender (mid-15c. as “observer,” 1704 as “one who attends”) and attendant (1640s as “one present at a public proceeding”) are older, but they had overtones of “one who waits upon.”

    Another useful account:

    Attender was originally the only word for a person attending. [1] As with most nouns formed from verbs, as payer, trainer, employer, it was the receiver of action that was formed with -ee, as with payee, trainee, employee. In the 1980s with the advent of spell-checkers, the word attender was erroneously flagged as misspelled and attendee was its replacement. Since then attender is no longer in popular usage.

    Another perspective:

    When creating recipient nouns, keep in mind that a recipient is one to whom something is given or one for whom something is done. So, for example, the relatively new word attendee, indicating one who attends, is questionable because one does not receive attendance. The word technically should be attender (but, of course, it’s not).

    TLDR: What you said.

    • Replies: @keuril
  19. Is there any chance you might make a brief blog post about the Bret Stephens Affaire?

    Apparently he mentioned high intelligence as a part of “Jewish Genius” in a New York Times opinion piece, and now controversy rages.

    Worse, Stephens mentioned the research of Harpending and Cochran on Ashkenazi Jewish intelligence.

    Those of us who are not experts in the field have trouble telling what to make of all this. All we can gather is that there is an ever growing domain of topics that we are not supposed to talk about unless we know in advance what the correct opinions are.

    Thank You and more grease to your elbow in 2020!

  20. You need to see Twitter as something that brings in fans while it’s not that hard work. It’s just a marketing tool like so much else and people love Twitter because they have a direct line to the person who claims stuff.

  21. CanSpeccy says: • Website

    Oxford? Pah. All it’s doing is adopting the cretinous position that if enough dimwits repeat an error that makes the error right.

    No, the Oxford Dictionary defines words based on common usage.

    Naturally, the Oxford does not have precise information on common usage, and thus, presumably, relies largely on usage to be found in print, although today it may also refer to usage in Internet pages. But either way, it is basing its definitions on usage by the more literate members of society, i.e., not cretins.

    As for the words in question, the definition of attender parallels that of a skier, a biker, a drinker, or a drug taker, in all cases the er ending meaning the habitual performance of the act referred to, namely, skiing, biking, drinking etc. Thus, an attender is a habitual attender at something or other, church perhaps, or meetings of alcoholics anonymous. Likewise, someone attending a conference might be a conference attender if they are in the habit of attending conferences, but if the reference is to attendance at a particular conference, then a person attending is not an attender but an attendee–unless, that is, they attend the same conference repeatedly, i.e., with breaks during which they are doing something else. In that case, then yes, they would be an attender, as well as an attendee.

    But you cannot get away from the principle of usage as a basis for the definition of words, else no one would know what you were talking about.

    • Replies: @dearieme
  22. dearieme says:

    you cannot get away from the principle of usage as a basis for the definition of words, else no one would know what you were talking about.

    That’s plain illogical – how would it cope when a word is becoming wrongly used? Which usage should it opt for?

    A franker work would list a classical meaning and then say something like “among the uneducated it is also used to mean …”.

    • Replies: @res, @CanSpeccy
  23. res says:

    I wonder what the current full OED (not the simplified web version CanSpeccy is referencing) would say about attendee and attender. Does anyone have access to a recent paper or electronic version?

    This page talks about the OED definition; so apparently it is in more recent editions than mine.

    Interestingly, the earliest example of the word in the Oxford English Dictionary is a 1961 citation from the granddaddy of M-W dictionaries, Webster’s Third New International Dictionary of the English Language, Unabridged.

    But we’ll skip to the next (and far more entertaining) example, from The Swinging Set (1967), by William Breedlove and Jerrye Breedlove: “The only attendees who bother with clothes.”

    The OED says the usage originated in the US and is chiefly American. However, a third of the dictionary’s citations are from British sources, including this one from the April 3, 1980, issue of the Financial Times:

    “Some attendees view this flexibility as an opportunity to negotiate favourable terms.”

    Some more definitions at

  24. JackOH says:

    Congratulations, Prof. Thompson. A non-expert, I didn’t know I needed to read something about intelligence research until I read your columns. It really does add to one’s thinking about disparate issues, such as school funding, how we wish our society to be run, the quality(ies) of our political leadership, etc.

  25. Flemur says:

    Third Year at Unz

    Thank you, don’t stop now!

    People have probably seen this:

    65% of Americans Think They Are More Intelligent Than Average
    (“I am more intelligent than the average person.”)

    Ho ho! Those ignorant egotistical Americans!

    With a world average IQ of 86, 78% of Americans really are more intelligent than the average person.
    (US average IQ 98, stddev 15)
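
The 78% figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes US scores are normally distributed with the mean and standard deviation quoted in the comment; the variable names are illustrative only.

```python
from statistics import NormalDist

# Figures quoted in the comment; the normal model itself is an assumption.
US_MEAN = 98
SD = 15
WORLD_MEAN = 86

us_iq = NormalDist(mu=US_MEAN, sigma=SD)

# Share of the assumed US distribution lying above the world average.
share_above = 1 - us_iq.cdf(WORLD_MEAN)

print(f"z = {(WORLD_MEAN - US_MEAN) / SD:.2f}")
print(f"Share of Americans above IQ {WORLD_MEAN}: {share_above:.1%}")
```

The world average sits 0.8 standard deviations below the US mean, so under these assumptions a little under four in five Americans score above it, in line with the figure quoted.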

    • Replies: @res, @CanSpeccy
  26. res says:

    65% of Americans Think They Are More Intelligent Than Average
    (“I am more intelligent than the average person.”)

    I’m surprised the proportion is so low. Compare to something like driving ability:

    Svenson (1981) surveyed 161 students in Sweden and the United States, asking them to compare their driving skills and safety to other people’s. For driving skills, 93% of the U.S. sample and 69% of the Swedish sample put themselves in the top 50%; for safety, 88% of the U.S. and 77% of the Swedish put themselves in the top 50%.[29]

    McCormick, Walkey and Green (1986) found similar results in their study, asking 178 participants to evaluate their position on eight different dimensions of driving skills (examples include the “dangerous–safe” dimension and the “considerate–inconsiderate” dimension). Only a small minority rated themselves as below the median, and when all eight dimensions were considered together it was found that almost 80% of participants had evaluated themselves as being an above-average driver.[30]

    Perhaps people are more realistic about intelligence because of all the feedback they get in school?

  27. CanSpeccy says: • Website

    That’s plain illogical – how would it cope when a word is becoming wrongly used?

    Common usage is correct usage.

    If usage changes, then past common usage becomes obsolete, much to the disgust of those who were brought up with the old usage. For example, the annoying American habit of referring to data as a singular rather than a plural noun, a usage which, according to a Google search has now overtaken the obviously more educated (i.e., Latin aware) former use as a plural noun.

    That transition, based on past and present Google searches for “data is” versus “data are” has occurred in only the last two or three years — evidence clearly, of the dimming of the American mind. You can see this ugly trend plotted graphically, here.

    My paper copy of the Shorter OED, a gift to me from the Oxford University Press, includes neither “attendee” nor “attender” suggesting that most sensible people will avoid using such jargonistic terms, as it is clearer to be explicit, as in “those attending,” or “those who attended,” or “those who attended repeatedly, occasionally, from time to time, etc.”

    • Replies: @res
  28. CanSpeccy says: • Website

    65% of Americans Think They Are More Intelligent Than Average

    They think it, or they say they think it? The question reminds me of the story about Joseph Patrick Kennedy when boarding ship for Britain in 1938 as the newly appointed US Ambassador. “Do you think you’re up to the job” a reporter asked, to which Kennedy replied, “If Marlene Dietrich asked you to go to bed with her, would you say you’re not very good at it?”

    • Replies: @res
  29. res says:

    My paper copy of the Shorter OED, a gift to me from the Oxford University Press, includes neither “attendee” nor “attender” suggesting that most sensible people will avoid using such jargonistic terms, as it is clearer to be explicit, as in “those attending,” or “those who attended,” or “those who attended repeatedly, occasionally, from time to time, etc.”

    Does it include the definition of attendant which matches this?

    It seems odd to me that your dictionary does not include attender which has a long history and is ranked about 41,000 on this word frequency list (note it is a sample of every seventh word):
    Presumably your dictionary has quite a few more words than that.

    Which version of the Shorter OED do you have? There were major versions in 1933 and 1993 with additional revisions in 2002 and 2007.
    Looking more closely, there were multiple editions before 1993:

    P.S. If anyone cares, my copy of the compact OED is based on the 1933 OED1 (I hadn’t realized the source material was that old). The newer version from 1991 is based on the 1989 OED2:

    P.P.S. I enjoy the status seeking/asserting behavior: “a gift to me from the Oxford University Press.” There is a lot of that in your comments. It must be important to you.

    • Replies: @CanSpeccy
  30. res says:

    Good point. Do you have a source for that quote? When I do a web search the only place I see it is your blog.

    Much on the Dietrich-Kennedy (both Joe and John) connection.
    But was she telling the truth? (fifth paragraph from the end)

  31. CanSpeccy says: • Website

    Do you have a source for that quote? When I do a web search the only place I see it is your blog.

    Some book, read probably forty or more years ago. Have read a lot of biographies in my time. But off hand I don’t recall the source — but I’m not so good as to have made it up.

    Which version of the Shorter OED do you have?

    The most recent, I believe, i.e., the Sixth Edition.

    P.P.S. I enjoy the status seeking/asserting behavior: “a gift to me from the Oxford University Press.” There is a lot of that in your comments. It must be important to you.

    No, it’s purely for your acknowledged enjoyment.

  32. CanSpeccy says:

    It seems odd to me that your dictionary does not include attender which has a long history and is ranked about 41,000 on this word frequency list (note it is a sample of every seventh word):
    Presumably your dictionary has quite a few more words than that.

    At a rough estimate, the 6th edition of the SOED has something like 10 million words, but not ten million different words.

    Looking again at the entry for “attend,” I see that after 12 definitions with literary examples, there is the following:

    attendee noun = ATTENDANT noun 3 M20. attender noun LME.

    but with no definitions.

    So, according to the dictionary, attender and attendee are synonymous with attendant (when attendant is used as a noun), although the opinion of those who believe there is a difference in the meaning of these words presumably reflects actual usage, at least in their own circle.

    In a separate entry, attendant as a noun is defined as:

    a person in attendance or providing service, a servant, etc.

    • Replies: @res, @res
  33. res says:

    Thanks for following up.

    This page claims 600,000 words, but does not make clear whether they mean headwords only or not.

    From the same page some more detailed statistics for the fourth edition:

    Includes 97,600 headwords, 25,250 variant spellings, 500,000 definitions, 87,400 illustrative quotations and 7,333 sources of quotations (including 5,519 individual authors).

    Dictionaries being descriptive vs. prescriptive is a topic of debate. I prefer a mix with an effort to be clear about which is which. Even if only by being explicit about nonstandard usage.

    Historically English language dictionaries were more prescriptive. Today they are more descriptive.

    • Replies: @CanSpeccy
  34. CanSpeccy says: • Website

    Historically English language dictionaries were more prescriptive. Today they are more descriptive.

    Usage determines the generally understood meaning of a word. Therefore descriptive definitions are the essential key to any language. However, usage is not uniform among users of a single language. Hence the discomfort or irritation expressed by Dearieme, and in fact just about everyone, at usages which (an American would say “that”) conflict with their own.

    Prescriptive definitions reflect usage by particular user groups who, for whatever reason, seek precision or consistency in language use. Thus, for example, those with knowledge of the classical languages seek consistency in the use of words derived from Greek and Latin with their original Greek and Latin meanings, e.g., data “are,” not “is”, whereas those with technical expertise are inclined to demand consistency in usage with technical definitions, hence, for example, resistance to the use of the term “efficiency” in a broad sense meaning “competence.”

    Because language use betrays both social and geographic origins, education, employment, personal interests etc., it is perhaps the most important single means by which people weigh one another up. This probably explains the irritation or horror that some people feel about certain word uses: there’s no way they will be seen dead associating with such ignorance, provincialism, or whatever.

    Prescriptive word definitions provide a basis for claims to “educated” or “correct” usage, but they rightly cut no ice with people who have a habit of “incorrect” usage. One might as well just accept that at Oxford, where what a person knows is the key to their social standing, they say “data are” whereas at Cambridge Mass., where all that counts is money, power and SAT scores, they most likely, annoyingly, insistently, and ignorantly say “data is.”

  35. res says:

    So through happenstance I now have a copy of the first volume of the Shorter OED Fifth edition. It has the same derivative word section at the end of attend as yours does. Probably obvious, but to be clear, that indicates attendee comes from the mid 20th century and attender comes from late Middle English.

    The information section at the beginning of the dictionary notes that for derivatives: “regular formations with readily deducible meanings…may be left undefined.”

    That definition is saying that attendee corresponds to the third noun definition of attendant (the definition you gave is the first noun definition).
    3 A person who is present at a meeting etc. M17

    I am not sure if attender is meant to refer to the same definition of attendant or if it is considered obvious and left undefined. I am guessing the latter.

  36. keuril says:

    When creating recipient nouns, keep in mind that a recipient is one to whom something is given or one for whom something is done. So, for example, the relatively new word attendee, indicating one who attends, is questionable because one does not receive attendance.

    Actually “attendee” is analogous to a number of other English nouns with “-ee” endings, such as refugee, devotee, retiree, absentee, and escapee. For more info, Google “retiree ergative”. And I also recommend reading the Wiki entry on ergative languages. English is not fundamentally an ergative language, but this is an ergative-like feature that has emerged and may continue to gain ground. Languages are always changing.

    • Replies: @res
  37. res says:

    Interesting perspective. From my quick look around it looks like a key distinction is whether the verb in question is transitive or intransitive.

  38. Factorize says:

    Dr. Thompson, I am interested in your psychometric assessment of the commenters (or possibly commentators for some) on your blog. Of particular interest to me would be the reasoning that you might apply to arrive at such an assessment. For example, one might use textual analysis tools that analyzed spelling, grammar, other textual features (e.g. length of clauses, sentence length, etc.), or features more related to the correctness (for some correctness might be motivated by political considerations) of ideas as described in the psychometric literature. In the virtual era, people are often confronted with the problem of trying to assess the intellectual ability of posters to blogs and typically apply simple heuristics such as correctness of grammar or spelling that might not provide useful results.

    A “forum” intelligence test does have a certain appeal as it speaks directly to the idea of a practical measure of intelligence that avoids what some might consider to be sterile traditional intelligence testing. Such a test would allow for a more multi-dimensional conception of ability in which a range of ways of knowing could be identified and be recognized as enriching collective online discussion.

    • Replies: @James Thompson
  39. @Factorize

    Interesting question. Psychiatrists sometimes used to ask me for IQ estimates on the basis of letters their patients or patient’s relatives sent them. Even though I was reluctant to estimate, I found that spelling and formal aspects influenced me, and then word frequencies. Less often, Police asked me for estimates on some cases.

    There are reading formulas used to estimate the readability of instructions and texts. The Flesch formula was one I used when studying the readability of medical leaflets. Applying that, or more recent variants, to comments would give very rough estimates.
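
For readers who want to try this on a comment, here is a rough sketch of the standard Flesch Reading Ease score. The formula is the published one; the syllable counter is a crude vowel-group heuristic assumed here for illustration, and the function names are hypothetical, so treat the results as approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Heuristic assumption: a trailing 'e' is usually silent, except in '-le'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    sample = "The cat sat on the mat. It was a warm day."
    print(round(flesch_reading_ease(sample), 1))
```

Short sentences built from short words score high; long sentences full of polysyllabic terms score low or even negative. As noted above, this measures the readability of the text, not the ability of the writer, so it gives at best a very rough estimate.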

    Given that my commenters have willingly come to a somewhat technical website it would probably be better to assume at least high average ability, and then judge comments by the quality of arguments. Something like the Watson-Glaser test would give an idea of how this might be done.

    Thanks. I will dwell on that further!

    • Replies: @res
  40. res says:
    @James Thompson

    Something like the Watson-Glaser test

    Thanks for that. I noticed that you wrote some more about it in this comment:

    This is the basis of the Watson-Glaser test, which has the merit of considerable face validity. You read a short document, and then have to answer questions about what can be concluded in terms of inferences, deductions, interpretations, and arguments. This is very like the decisions carried out by managers, and so is often used in management selection.

    Anyone taking the test can see that it is legitimately facing applicants with the sorts of problems they will encounter in their work. Law firms often use it in selecting applicants.

    On samples of 20,000 students it correlates well (r=0.74) with the Otis Mental Ability Tests.

    On a tiny sample of Wechsler Verbal IQ it correlates r=0.41 with the Wechsler Adult Intelligence Scale, but that is because of a correlation of r=0.5 with the Verbal scale.

    On the down side, it can take over an hour, but the benefit is that you know with some certainty whether the candidate can cope with the sorts of problems which will land on their desk.

    This seems like a decent overview to me. If you know of anything better I would be interested.

    I wonder what the correlation with the LSAT is? It looks like its use in the legal field is common enough to inspire a diatribe like this:

    Is the Watson-Glaser test used much for personnel selection in the UK?

    • Replies: @James Thompson
  41. @res

    The candidate was rightly rejected, in my opinion.

    • Replies: @res
  42. res says:
    @James Thompson

    The candidate was rightly rejected, in my opinion.

    Hard to be sure without knowing more of the TRUE circumstances, but that seems like the way to bet.

    It is interesting how often anti-test diatribes are self defeating given what they tend to demonstrate about reasoning skills.

    This seems like the key excerpt:

    Conversely, there are essential lawyer skills which are not tested in the Watson Glaser. Sure, graduate recruiters could argue that’s why there are other assessment day activities in place to do that — but that argument does not work where a student has performed well overall but is not offered a training contract or vacation scheme because they failed the test.

    I have personal experience of this. Having done several of these tests myself, I had on a number of occasions passed the online test but failed it when having a second go at the assessment days. This shows the test is somewhat unreliable because at one point my potential success is rated highly while at other times it isn’t. I was even rejected from a vacation scheme at a top international firm solely based on the outcome of the Watson Glaser test. It felt like a massive blow, and ultimately led to my disbelief in the assessment method.

    So there were other testing activities. I wonder what “well overall” translates to (I think statements like that are often a tell that the person has no real sense of how others did in comparison). For me the interesting question (unanswered here) is how they combine the different results (and presumably, those of others in competition) to come up with a decision. Do they use soft/hard thresholding, weighted averages, or ? I could see the rejectee having a point if a hard threshold seemed excessively high, but there is too little information to even begin trying to assess the justice of such a complaint here.
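
    To make the distinction concrete, here is a minimal sketch of two of the combination rules mentioned: a hard cutoff on one exercise versus a weighted average across all of them. The candidates, exercise names, scores, cutoffs and weights are all invented for illustration; nothing here describes how any actual firm decides.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    scores: Dict[str, float]  # hypothetical standardized (z) scores per exercise


def passes_hard_threshold(c: Candidate, cutoffs: Dict[str, float]) -> bool:
    """Hard thresholding: fail any single cutoff and the candidate is out."""
    return all(c.scores[k] >= cut for k, cut in cutoffs.items())


def weighted_composite(c: Candidate, weights: Dict[str, float]) -> float:
    """Compensatory rule: strengths can offset weaknesses."""
    total_weight = sum(weights.values())
    return sum(w * c.scores[k] for k, w in weights.items()) / total_weight


def select_top_n(cands: List[Candidate], weights: Dict[str, float], n: int) -> List[Candidate]:
    """Pick roughly N people by ranking on the composite, whatever the pool quality."""
    ranked = sorted(cands, key=lambda c: weighted_composite(c, weights), reverse=True)
    return ranked[:n]


a = Candidate("A", {"reasoning_test": -0.2, "interview": 1.5, "group_exercise": 1.0})
b = Candidate("B", {"reasoning_test": 0.3, "interview": 0.2, "group_exercise": 0.1})
cutoffs = {"reasoning_test": 0.0}
weights = {"reasoning_test": 1.0, "interview": 1.0, "group_exercise": 1.0}

a_passes = passes_hard_threshold(a, cutoffs)   # A misses the test cutoff
b_passes = passes_hard_threshold(b, cutoffs)
winner = select_top_n([a, b], weights, 1)[0]   # but A leads on the composite
```

    With these made-up numbers candidate A is rejected under the hard cutoff yet ranks first on the composite, which is the situation the complaint describes.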

    If I understand correctly, you have extensive experience in personnel evaluation. Can you offer any thoughts or references on typical approaches for analyzing multi-factorial data in selection situations? Both in terms of yes/no decisions for individuals and how things work if one is trying to choose roughly N people but has an especially strong or weak talent pool?

    My experiences have generally been limited to interviewing job candidates and then hashing out the decision in a meeting with the other interviewers later. Incredibly subjective (especially given that these were mostly engineering jobs, the resume pre-screen is probably more objective though), but seemed to work fairly well most of the time. The most interesting part of the process is seeing what everyone else thinks. The non-simultaneous nature of the interview process (in contrast to something like “assessment days”) adds to the complexity of the decisions.

    • Replies: @James Thompson
  43. @res

    I have little experience of personnel evaluation, other than in the very specific context of evaluating retired/sacked executives to find new placements or occupations for them. However, I have been impressed by the work of Hunter, Schmidt and Kuncel, who have enormous data sets that have been accumulated over the years, and have done careful work showing the power of intelligence tests in selecting applicants for a very wide range of jobs.

    My own experience of selecting clinical psychologists was sobering. I wanted it done on objective academic standards plus a wide-ranging personal interview, to see if they had any “bounce” in them: that is to say, had the capacity to pick up new topics and discuss them, and respond productively to challenges about their cherished beliefs about psychology. I think that my own personal record in selection was average. I picked up some very good psychologists, but once picked a very academic one who was no good with patients. We both worked that out in about four months, and I put him on to the task of writing papers.

  44. Factorize says:

    Might someone help out? What causes differences in twins’ IQ? If you have twins who are raised together and twins who are raised apart, then what will cause the separation in their intelligence? Not genetics. From what I understand of the literature, shared environment contributes very little to the similarity (or difference) in twin psychology. Non-shared environment?

    Would the difference be mostly related to non-shared environment? That is would twins raised together be expected to have more similar non-shared environments than those raised apart? For example, wouldn’t twins raised together who attended a private school be expected to have a greatly reduced range of possible non-shared environments than twins raised apart who could be exposed to a fuller range of potential non-shared environments?

    • Replies: @James Thompson
  45. @Factorize

    Good questions. I describe non-shared environments as personal niches: self created settings in which individual interests can flourish. Perhaps private schools encourage diversity in these pursuits.

  46. Factorize says:

    Doctor Thompson,

    I feel great! I have learned a great deal about human differences in my psychometrics course and have developed a better understanding of the methodologies applied in this field. I finally have a good grounding in the vocabulary and the basic ideas related to testing.

    The errors of logic and basic facts related to most public discourse of psychometrics (as are all too often demonstrated by posters on this blog) are now clear to me. Without a preliminary introduction to psychometrics, public discussion about this contentious topic almost inevitably would be expected to become hopelessly mired in confusion: This is typically what occurs.

    It is critically important to have a very good understanding of technical terms such as reliability, validity, correlation, etc. (in their various different meanings) in order for productive conversation to occur. Further, a basic education into the main findings of more than a century of psychometric research allows one to be directed to those aspects of the literature that are still open to legitimate discussion. I was somewhat surprised about the extent to which test bias is not regarded by the psychometric community as an open topic. The lingering memory of “regatta” has biased my perception. I was also disappointed that my course has strongly avoided exploring the more quantitative aspects of psychometric science. However, in total I am very glad that I have finally clarified a great many of my uncertainties related to human differences and to technical aspects of their measurement.

    • Replies: @James Thompson
  47. @Factorize

    I am very glad to hear of all these achievements. Certainly, many critics enter the field with insufficient preparation, and this wastes time. As to test bias, I think this has been worked on considerably. Arthur Jensen’s 1980 “Bias in mental testing” is a classic for very good reasons: he clearly sets out the principles which might distinguish a biased item from one which detects a true difference. A biased item will be a poorer predictor of real life achievements.

  48. Factorize says:

    Doctor Thompson I greatly appreciate your comment and encouragement.

    I have been thinking about the WAIS-IV (as I am preparing an assignment about this instrument; actually in the last few days I have finally secured a copy of the WAIS-IV technical manual. While my approach might not have been entirely legal, it is a tough document to acquire otherwise).

    In particular, I have been contemplating reliability of the WAIS-IV. After carefully watching online videos of testees being evaluated for the Block Design subtest, it occurred to me that some simple tips would greatly improve their performance. One of the most obvious tips would be to immediately turn all the cubes to the red-white split side facing up especially as the items became more difficult. It is easy to guess that the hardest 3 x 3 design would almost certainly have all splits on top: Including solid colors would make the challenge too easy. There are several other tips that would also be helpful for Block Design and would further increase scores. My best guesstimate is that I could increase typical scores on this subtest close to the ceiling after about 1 hour of coaching.

    I consulted my psychometrics tutor on this question and was quite surprised by the response. Apparently, my tutor considers that any such coaching would be considered immoral and that such preparation would invalidate the test result. Yet, Kohs blocks are widely available online and online videos that I have seen show all the block designs for one form of the WAIS-IV. Block Design is the first subtest administered during the WAIS-IV and is one of only 10 core tests that comprise the WAIS-IV’s full scale IQ.

    Given the above, I am very unsure how Block Design could be considered a reliable measure of intelligence. Theoretically, g should essentially be a fixed quantity. Any valid test of it should not exhibit large amounts of variation on retest, even with moderate coaching. A test that could be shown to have such variation would almost by definition not be valid. How can Block Design be valid when even minimal coaching would likely greatly increase the score?

    One reliability methodology that is often used is test-retest: Block Design has a reasonable test-retest correlation. Other types of reliability (such as internal consistency etc.) are likewise probably also entirely acceptable. However, these measurements are only based on retesting testees after the passage of time; there is no coaching given before retesting. Thus, such reliabilities only demonstrate the law of psychometric inertia (i.e., without instruction people tend to maintain the same psychometric state through time). I do not find such reliabilities persuasive.

    What probably should be reported is “robust test-retest reliability” which is to say the correlation that results when coaching is conducted before the retest. Interestingly, coaching would likely increase the reliability of the subtest as then true g differences would tend to differentiate examinees and not psychometrically hollow tricks. Coachable increases on other WAIS-IV subtests might likewise be possible, though the Block Design presents the most obvious opportunity to demonstrate the effect.
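
    For readers who want to see the quantity under discussion, the sketch below computes test-retest reliability as a Pearson correlation between two sittings. All scores are invented. It illustrates one property of the statistic: a coaching gain that is the same for everyone leaves the correlation unchanged, while gains that differ between people change the rank order and lower it in this example. How coaching gains are actually distributed is an empirical question the sketch does not answer.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between paired scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical subtest scaled scores for seven examinees.
first_sitting = [6, 8, 9, 10, 11, 12, 14]
retest_uncoached = [7, 8, 10, 10, 12, 12, 15]

# Same retest with a uniform +4 coaching gain for everyone.
retest_uniform_gain = [s + 4 for s in retest_uncoached]

# Same retest with gains that vary from person to person.
uneven_gains = [6, 0, 5, 0, 4, 0, 1]
retest_uneven_gain = [s + g for s, g in zip(retest_uncoached, uneven_gains)]

r_uncoached = pearson_r(first_sitting, retest_uncoached)
r_uniform = pearson_r(first_sitting, retest_uniform_gain)
r_uneven = pearson_r(first_sitting, retest_uneven_gain)
```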

    • Replies: @James Thompson
  49. @Factorize

    You surprise me. Of course coaching can boost scores to some extent. The whole idea of keeping tests secret is so as to reproduce as closely as possible the ideal test situation: someone trying to solve a problem they have not seen before. Once the items are well-known they lose some of their bite: they are not real tests of problem-solving. They become tests of acquired knowledge.

    The real test of coaching is to coach with test A and then see if there has been a real improvement with test B. Usually there is little carry-over, that is, little skill transfer.

    Send the manual back.

  50. Factorize says:

    Doctor Thompson, I hope by surprised you do not mean disappointed.

    My understanding is that a test of g that is not highly stable is not reliable. From my psychometrics course, this is merely a restatement of a definition. When I step on a scale and it reads 180 pounds, then if I step back on that scale within a few minutes and I have not indulged in a few cheeseburgers etc., then I expect that I should weigh in around 180, perhaps 182 pounds. The scale is reasonably reliable. If I take the Block Design subtest as a measure of my g, then I expect that another sitting of this subtest an hour later should also report a reasonably consistent score with the first testing. This test for g would then be considered reliable.

    However, during that hour between retesting, if I were coached to recognize certain features of the blocks and on retesting I had substantially increased my test performance on the test, then I would claim that Block Design is not a reliable or valid test of g. A legitimate test of g simply should not allow for easy manipulation of scores. By definition g is essentially a near fixed property of the mind. An IQ test that were not highly consistent from one testing to the next would lose all meaningfulness. Any readily explainable trick that could significantly increase a score on an IQ subtest would invalidate it. My current hunch is that Block Design could be gamed in this way.

    This is likely why my textbook directly quotes Wechsler on this very point of IQ constancy: “The constancy of the IQ is the basic assumption of all scales where relative degrees of intelligence are defined in terms of it.”

    I acknowledge your mention of the need for skill transfer though the tricks that I had in mind should be transferable to any Block Design configuration. I am not merely saying that I could improve performance on a single set of block patterns. My tricks would be applicable to any potential block arrangement. I do not understand why this more robust version of reliability has not been part of psychometric research during the last century.

    Without such verification, we are left with the supposed anomaly that IQ has increased by ~30 points in the last century, and yet many might reasonably question whether g has increased at all. Notably, when you return to the original research on the Kohs blocks, the times required for top scores were much longer than today. One could only guess that when this test was actually novel, people would have required more time to find the solutions. Perhaps one way to move beyond the problem of gaming the test would be to turn it into a test of learning: those people who merely demonstrated very high ability without any evidence of improvement on a learning dimension would receive lower scores than those with very low ability who nevertheless showed improvement.

    The Perceptual Reasoning Index is the most vulnerable to tricks because it is content empty. The Verbal Comprehension Index could also be gamed, though it would take much more effort as it would require carefully studying the dictionary etc. If it were up to me to design an intelligence test, then I would require that testees be highly familiar with the content of the tests and have devoted a reasonable amount of effort preparing for the test.

    In Spearman’s 1904 paper, he all but gave up on the so-called village children because he could see that it would be difficult for a g factor to emerge when these children had so little refinement of their abilities. It was only when he used the school ratings of the private school students that his factor analysis was so on the mark.

    The Perceptual Reasoning Index of WAIS-IV is highly vulnerable to intervention. It is not overly surprising to me that Matrix Reasoning (i.e., in addition to the Block Design subtest) has also seen large temporal increases in scores. There are also simple tricks that can be learned for this subtest to increase performance. In fact, a computer has been able to reduce the test to only 3 or 4 universal rules. We do not need to wait a century for a temporal increase in IQ of 2 standard deviations in the Perceptual Reasoning Index; this increase could be achieved after a few hours of instruction. I could then go and retrieve my Nobel in psychometrics. Yet, of course, the gains would be entirely hollow; human g would not have increased in the least.

    Perhaps if there is anyone on the thread interested in investigating the reliability of Block Design, then they could purchase the Kohs blocks online for ~$30 and we could then run through some coaching to test the hypothesis.

  51. All tests show a re-test effect, almost always a gain. In fact, scores even a year later can be slightly higher. People learn things, and become familiar with test items. This is a fact of life, and is coped with by having alternate forms (form B) and at least 6 to 12 months between tests.

    Despite all this one can extract a g factor from a wide variety of intellectual tasks. Finding other tests has always been important. Inspection time, choice reaction time, flicker fusion, and all the rest of it. Nothing which is strong enough to replace ordinary testing.

  52. Factorize says:

    This is awesome!

    There is a full suite of neuropsych/psychometric tests online. The freeware is called PEBL. This could form the basis of an intelligence test that made sense to me. For example, one of over 50 tests that they have included is a paired associates test. What I really appreciate about these tests is that one would not merely have to take a test once which was entirely unfamiliar. One could repeatedly take these tests and then see what happens to the scores. Many of these subtests show substantial test retest practice effects. This is the problem that I suggested that the Block Design on WAIS-IV would be subject to if testees had greater access to the tests; this is especially applicable in the context of computer administration.

    The letterdigit test on PEBL is similar to the WAIS-IV’s Coding subtest, though I think the PEBL version is probably better. With the PEBL version each response is evaluated in microseconds. This approach would show exactly the learning process in action. In fact, I can trace out my learning curve over the 10 trials that I have tried. There is a dramatic practice effect that occurs. My mean reaction time almost monotonically decreased with a 50% reduction in mean RT over the 10 trials.
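
    A learning curve of this kind could be summarized along the following lines. This is only a sketch: the reaction times are invented to mimic a 50% reduction over 10 trials, and the function and its field names are hypothetical, not PEBL output or a PEBL API.

```python
from statistics import mean
from typing import Dict, List


def summarize_practice(trials: List[List[float]]) -> Dict[str, object]:
    """Summarize per-trial mean reaction times (ms) across repeated trials."""
    if len(trials) < 2 or any(not t for t in trials):
        raise ValueError("need at least two non-empty trials")
    trial_means = [mean(t) for t in trials]
    first, last = trial_means[0], trial_means[-1]
    return {
        "trial_means": trial_means,
        # Proportional drop in mean RT from the first to the last trial.
        "reduction": (first - last) / first,
        # True only if no trial was slower than the one before it.
        "monotonic": all(b <= a for a, b in zip(trial_means, trial_means[1:])),
    }


# Invented data: two responses per trial, centred on a falling mean RT.
hypothetical_means = [2000, 1800, 1650, 1500, 1400, 1300, 1200, 1150, 1050, 1000]
hypothetical_trials = [[m - 50, m + 50] for m in hypothetical_means]

summary = summarize_practice(hypothetical_trials)
```

    Reporting the whole curve rather than a single total time is the extra information being argued for here.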

    This has to be the future of IQ testing. With the current paper and pencil approach, all they can do is measure the total time taken. Yet, they have no insight into the microsecond scale learning that this free program offers. With this approach, future IQ tests might have widely available online software such as PEBL that people could freely access and then when the actual testing were required testees could go to a testing center for invigilation. Perhaps what could also be done is that pre-testing could all be done in advance and then this information could be transmitted to the psychometrician. The actual testing would then simply need to sample the reported scores to check for accuracy. Psychometric testing that was applied in this way might allow for much more psychometric accuracy. The actual at home portion of the testing might take upwards of ~ 10 hours in total, while the invigilated portion of the testing would then still be set for around an hour. The personal best performances (if verified) could then be recognized as the performance standard of the individual that would be possible in an optimized environment.

    One other thing that I found interesting is that I noticed that I did quite a bit better on the addition test on PEBL than I expect I would do in person. What I think happens is that when you are around people you will tend to speak SLOWLY so that everyone understands you, and you will maintain a calm and collected composure. Yet, with technology you do not need to speak slowly or to maintain a rapport with the computer: you can simply focus on the task. There is also the comforting feature that the computer can present the questions at a precise rate and this style of presentation has a very pleasant feel. In addition, the method used in all of the PEBL tests almost completely avoids tricks. For example, with the letterdigit test, one must answer each item in sequence and each response is recorded to the microsecond. One can observe the exact speed for each letter and what learning strategies might have been prepared.

    21st Century cognitive ability testing has arrived.

  53. Factorize says:

    Might anyone be aware whether there could be any connection between Spearman’s 1904 factor theory and the women’s suffrage movement of the early 20th Century? The fairer gender at an earlier time were not recognized as the smarter gender. Might g theory have helped to clarify the roughly equal intellectual standing of women and men and, thus, have helped women to achieve the right to vote?

  54. Factorize says:

    Doctor Thompson, this is extremely exciting!

    I am finally getting the feel for the psychometric style of thinking. For those kids out there who want to get on the fast track to psychometric understanding I would suggest first reading a popular best seller about psychometrics (over the last 50 years or so there have been only a few such books that have captured mainstream public interest), next taking an online psychometrics course from an open university, and then starting to read extensively in the psychometric research in specialist journals etc. (A tremendous resource is .)

    The website is a remarkable legacy for a man who devoted almost his entire existence trying to educate others about psychometric truths especially as they relate to education. Decade after decade he reiterated the same message that students have different abilities to learn, though the overwhelming inertia of human ignorance resisted him at every step. Ironically, at the level of mainstream public discourse, there has been minimal demonstration of learning about basic concepts in psychometrics since Spearman’s breakthrough research in 1904 into the factor structure of human intelligence.

    Mainstream opinion has displayed a learning rate concerning learning theory that optimistically could be described as remedial. Considering that even foundational facts about intelligence theory have not been encoded into the mass consciousness, it would be difficult to even assign an IQ of 70 to the public’s psychometric understanding. For those wanting to diligently apply themselves to studying psychometrics who might be worried that they might not be smart enough, fear not: after over 100 years of intelligence research, typical citizens apparently have learned nothing. It is almost impossible to imagine an operational definition of success that could be any lower.

    Yet, life does not need to be like that; this is the 21st Century! As the Jensen site illustrates in our virtual era of open knowledge (with ZERO marginal cost), a person’s entire lifetime of accumulated knowledge can be shared with all those interested in enlightenment. This is amazing! I consulted my uni’s print library catalog and they only had one Jensen book: Bias in Mental Testing. Open knowledge, open data, smarter people, better world, yeah!

    I, Factorize, declare a new era of human consciousness. A world of higher intelligence, a world in which the lifetime accumulated knowledge of intellectual giants can more readily be incorporated into global human consciousness. An eternity of ignorance can finally end. While some might rightfully respond, “Yeah, where have you been for the last 25 years?”, the Jensen site has helped crystallize for me how powerful even a single person’s dedicated efforts can be (especially when magnified by open internet access). Jensen must have been more than slightly disturbed that his message, largely unchanged for decades, was unable to connect with the mainstream. In a pre-internet era, our great thinkers were expected to evangelize until they were wearied by the struggle. Hopefully now with the power of the internet knowledge can be transferred with much less resistance.

    (One among many notable Jensen works. This one describes his view on school integration, and how counter-intuitively integration can be understood as a form of egalitarian racism.$b654871&view=1up&seq=346 )

  55. Factorize says:

    After receiving a good grounding in the basics of psychometrics, I have been able to access a great deal of the research literature. One question that has me stumped at the moment is about the factor structure of human intelligence. This has been argued about endlessly for over a century. First Spearman wanted to stay with a g only model of intelligence, and then group factors were introduced etc. However, through all of these years of argument it is still unclear what is the best model of intelligence. A newly reemerging idea is that g loads directly onto the tests in a so-called bifactor model.

    If this were in fact true, then g would have a truly enormous influence on intelligence. The broad factors would have almost no relevance. This might be a somewhat arcane reason to start marching in the streets, though the real world impact of such a g centric perspective would be of profound importance. As it is now the hierarchical models somewhat mute the influence of g through the broad factors.
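
    For readers less familiar with the two model families, they can be written side by side as follows. The notation is a conventional textbook form chosen for this sketch (the symbols are assumptions, not taken from any paper cited in the thread): $x_i$ is the score on test $i$, $k(i)$ is the broad factor that test $i$ belongs to, and $\varepsilon_i$, $u_k$ are residuals.

```latex
% Higher-order (hierarchical) model: g reaches the tests only through the broad factors F_k.
\begin{align}
  x_i &= \lambda_i F_{k(i)} + \varepsilon_i, \\
  F_k &= \gamma_k\, g + u_k,
\end{align}
% so the implied g loading of test i is the product \lambda_i \gamma_{k(i)}.

% Bifactor model: g and the group factors S_k both load directly on the tests.
\begin{equation}
  x_i = \lambda^{(g)}_i\, g + \lambda^{(s)}_i\, S_{k(i)} + \varepsilon_i,
  \qquad \operatorname{Cov}(g, S_k) = 0 .
\end{equation}
```

    In the first form the influence of g on a test is filtered through $\gamma_{k(i)}$, which is the "muting" referred to above; in the second, g has its own direct loading on every test.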

    What I am unsure of is why this question has been so unresolvable to date. If g were so powerfully important, then would not some of the clinical neuropsych studies help to clarify the role of g? For example, in one study that I read mild/moderate and severe TBI had surprisingly small effects on verbal ability. Nevertheless, verbal ability has fairly high g loading. Is this study consistent with a bifactor model? In the same study, TBI had substantial effects on processing speed subtests, yet this index has the lowest loadings of g.

    Further, the way that this idea relates to the recent genetic revision of Spearman’s 1904 article has caught my attention. Should not this line of research finally be able to determine the correct factor structure of human intelligence? If variants can be identified according to whether they are g or s related, does this not imply that a provable factor model of human intelligence might finally be established? The article did not seem to clarify the point. Perhaps a Mendelian randomization type approach could be used. Here the effects of manipulations in g and s genetic variants could be observed in human phenotype data. This could also be of considerable importance for future genetic selection efforts. Some people might have quite high genetic g, yet fairly low genetic scores on some of the broad factors. This would add yet another tunable parameter in genetic enhancement.

    Thinking beyond the theoretical models with an analogy might help. Consider my computer. What happens to speed of my hard drive access if I were to upgrade my CPU? Does the speed relate more to the internal processing ability of my drive or is there a relatively significant input from the CPU upgrade. The bifactor model seems to suggest (in the related instance of human memory retrieval, for example vocabulary) that enhancing g (upgrading the CPU in the computer analogy) should allow for significantly improved human performance. If this reasoning were shown to be true, then it could be extended out to include other of the broad factors in order that a correct model could finally be reported. After over 100 years of debate it would be of great importance to the field of psychometrics if this question could be resolved. It would also vindicate Spearman’s insight that to resolve this question a deeper insight into human psychophysiology would be needed.

  56. Very glad about all these developments. It would be worthwhile your writing up these main points so that we could give some general guidelines.

  57. Factorize says:

    Dr. Thompson, have you seen this one ? Impressive. We have drifted away from the core psychometric focus of the thread lately; perhaps this article could move us back on track. The article is quite startling; it found that SES has an independent effect on brain development above and beyond PGS EDU. This research raises ethical questions for me related to the nature of social stratification and its long term impact on the brain development of those from different backgrounds.

  58. Yes, I had seen a similar finding in a paper in November, and No, I haven’t posted on it yet, nor on this one. I am a bit behind things, for various reasons.

  59. Factorize says:

    I am feeling super-great about my recent breakthrough in understanding many of the ideas of psychometrics. Unfortunately, such sense of accomplishment can often be absent in education as typical paced class environments usually only bring students up to a minimal level of comprehension before moving on to some other topic. With self-paced online learning one can go off-road and develop mastery of subjects that are of particular interest. I am somewhat disappointed, though, that most of my courses have avoided a more quantitative approach. I expect that as online learning mainstreams a diversity of formats will become available for those with different learning styles (for example, the bubble animations showing economic development trajectories, etc.).

    Getting back to writing up some of my main observations, I would reiterate the importance of having a good base knowledge to work from. I have tried for some time to penetrate the thinking pattern of psychometrics, though this effort had not been successful until I studied the topic formally in my course. It is so critically important to have a very firm grasp of the main psychometric terminology: reliability, validity, g etc. Once you understand the vocabulary, you are all set. Jensen’s book The g factor now reads like a romance novel to me. I can breeze through it while maintaining a good level of comprehension. Without knowing the terminology, the book was a real struggle to read.

    Much of the psychometric literature endlessly repeats the base message about the history of intelligence testing, basic descriptions about the meaning of g etc. As soon as you learn the basic message once, you can start speed reading. What is quite startling to me is that despite public service messages from leaders in the psychometric community being provided for many decades, very minimal uptake is typically demonstrated by the mainstream community. For example, endless debates on this blog have questioned intelligence research at the level of g. In all of my readings about intelligence, I have not found any discussion that cast doubt on the bedrock idea of general ability. The current debate is more one step down at the level of the group factors. I am disappointed that the research effort has not been able to make more forward progress. At this time, the name brand IQ tests still have not been able to coalesce the terminology related to the group factors, even while the most dominant existing theory of intelligence (CHC Theory) has created an impressive model with substantial empirical support.

    g theory has always seemed quite dismal to me: people have different amounts of g, which is largely fixed, resulting in a life potential that is highly constrained. Yet some of the research that I have encountered appears to rebut this interpretation. For example, some research into reading ability found that even deficiencies in narrow factors can sometimes hinder reading acquisition (I think this was beyond classical reading disability such as dyslexia). Psychometric research, for one of the few times that I am aware of, has been able to unleash human potential by first creating an accurate factor model of human ability (CHC theory), and then creating interventions for those who have weaknesses in some specific skills.

    From this perspective, we can see that some developmental pathways might be blocked and that some intervention could help unblock them, so that people could be placed back onto the normal development track. Perhaps my suggestion above about how to accelerate learning psychometrics might help unblock barriers for some. One could imagine that a great many such “blocks” might exist that hold people back. Considering how important mature, well-informed discussion about psychometrics is for our collective future, such guidance might be of considerable value if it were able to divert the typically ill-informed public commentary to a discourse reflective of findings from the research literature.

    I have been working away of late on a course assignment about WAIS-IV. It is truly impressive how much of the research literature I can access through my virtual credentials. WAIS-V is expected possibly by next year. I will be very interested to see how psychometric research over the last 10 years will be incorporated into this updated psychometric instrument. I would really like to see a transition to a full adoption of CHC theory (and other changes), though I realize that with an existing legacy from previous WAIS/WISC/etc. versions a certain conservative inertia becomes difficult to escape.

  60. Factorize says:

    Doctor Thompson,

    How would you go about creating a national IQ test using Spearman’s “indifference of the indicator”? I want to make sure that I am correct about this.

    Roughly, you want to find some “tests of ability” or indicators that are correlated with intelligence (e.g., level of obesity in a nation, PISA scores, teenage fertility rates, etc.), that is, a variety of measures that to some degree correlate with IQ. Something that I found interesting is that national female homicide rates correlate quite a bit more strongly than national male homicide rates with national IQ. My guess here is that males might often be more highly impaired by drugs and alcohol than women. I am wondering now which PISA score might be best to use as well. For example, should I use the 50th percentile result or the 10th percentile result? Would either of these be expected to be more highly correlated with IQ? There is an endless wealth of potential statistics available, so I would want to choose carefully to find the “best”, or at least a better one. After standardizing the results, they could be summed up for each nation and then perhaps standardized again to give a national standard score. I can see what Spearman meant about using somewhat ridiculous measures to arrive at a cognitive ability measure that would actually be expected to be quite accurate. This little trick is almost magical in its simplicity and in its power to develop psychometric awareness of what is happening all around. From Jensen’s g factor, it would likely be best to sum up each specific standard score for each nation (e.g., a nation’s obesity level) weighted by its g loading. Is the g loading here its correlation with IQ/g? Might you have a list of these correlations handy? I am sure that you posted these many times previously, though I am not able to locate them.
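
The standardize, weight, sum, and re-standardize procedure described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (generated from a made-up latent variable, not real national statistics), the function names `zscore` and `composite_score` are hypothetical, and each weight is assumed to be the indicator's correlation with g, with a negative sign for indicators that run opposite to ability.

```python
import numpy as np

def zscore(x):
    """Standardize each column to mean 0 and sample SD 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

def composite_score(indicators, loadings):
    """Sum standardized indicators weighted by assumed g loadings,
    then standardize the sum again."""
    z = zscore(indicators)
    raw = z @ np.asarray(loadings, dtype=float)
    return (raw - raw.mean()) / raw.std(ddof=1)

# Synthetic illustration only: 200 made-up "nations" and 3 made-up indicators.
rng = np.random.default_rng(0)
n_nations = 200
latent = rng.normal(size=n_nations)               # stand-in for national g
assumed_loadings = np.array([0.8, -0.6, -0.5])    # hypothetical values
noise = rng.normal(size=(n_nations, 3)) * np.sqrt(1 - assumed_loadings**2)
indicators = latent[:, None] * assumed_loadings + noise

composite = composite_score(indicators, assumed_loadings)
r_with_latent = np.corrcoef(composite, latent)[0, 1]
print(f"correlation of composite with latent variable: {r_with_latent:.2f}")
```

In this toy setup the composite tracks the latent variable more closely than any single indicator does, which is the point of aggregating many imperfect measures. With real data the loadings would have to be estimated, and the result would depend heavily on which indicators are chosen.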

    To apply factor analysis, one could simply take the results matrix combined for all nations and form a correlation matrix, etc.; in R one can almost just sit back and let the R packages do all the work for you. This is something I am also somewhat uncertain about: what then? One largish factor should remain: the g factor. Removing the g factor (if possible) might be of interest; this would be the residuum without g. Perhaps additional structure could be found. I am unclear about how one would proceed with such analysis. One is perhaps venturing into various more complex types of factor and other analyses that I am not yet conversant with. Being able to do such analysis and create realistic models would be a powerful tool to gain further understanding from datasets. For example, how might the factor loadings found for obesity and teenage fertility be related in a factor solution?
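
As a rough sketch of the "what then?" step, the Python below forms a correlation matrix, takes its first principal component as a stand-in for g, and then looks at the correlations left over once that component is removed. Note the substitution: a principal component is not a true common-factor solution, and a dedicated factor-analysis routine would estimate communalities rather than use the unit diagonal. The data are synthetic and the function names `first_factor` and `residual_correlations` are hypothetical.

```python
import numpy as np

def first_factor(corr):
    """Return loadings on the first principal component of a correlation
    matrix, plus the share of total variance it accounts for."""
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    loadings = eigvecs[:, -1] * np.sqrt(eigvals[-1])
    if loadings.sum() < 0:                       # the sign is arbitrary; fix it
        loadings = -loadings
    return loadings, eigvals[-1] / corr.shape[0]

def residual_correlations(corr, loadings):
    """Correlation matrix with the first component's contribution removed.
    Off-diagonal entries are the residual correlations of interest."""
    return corr - np.outer(loadings, loadings)

# Synthetic illustration only: 1000 made-up cases, 5 made-up indicators.
rng = np.random.default_rng(1)
n_cases = 1000
latent = rng.normal(size=n_cases)
true_loadings = np.array([0.8, 0.7, 0.6, 0.5, 0.4])   # hypothetical values
noise = rng.normal(size=(n_cases, 5)) * np.sqrt(1 - true_loadings**2)
data = latent[:, None] * true_loadings + noise

corr = np.corrcoef(data, rowvar=False)
g_loadings, variance_share = first_factor(corr)
resid = residual_correlations(corr, g_loadings)

off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
print("loadings on first component:", np.round(g_loadings, 2))
print("share of variance:", round(float(variance_share), 2))
print("largest residual correlation:", round(float(np.abs(resid[off_diagonal]).max()), 2))
```

Because this synthetic data set contains only one latent variable, the residual correlations carry no further structure. With real indicators, clusters of sizable residuals (say, between obesity and teenage fertility) would be the hint that further factors are worth extracting.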

    Any comments you might provide for these questions would be greatly appreciated. Factor analysis has become a highly specialized technical field, though with the current software much of it is open to those with fairly minimal quantitative backgrounds (which possibly has lowered the barriers somewhat beyond optimal).

