The Flynn Effect is important to understand; it is better understood now than ever before, but there is more to research; and it is probably more limited in its real-world consequences than people imagine, though the long-term consequences are still being debated.
Say you take any test of ability, and as an example let us take a vocabulary test. The test requires you not just to say whether you have heard a word, but to show that you know what it means by giving an accurate synonym. On the basis of this test your total word store can be calculated, and an accurate estimate made of your intelligence. Now suppose the test is given to a group of students, and several decades later they are followed up to see if their early verbal intelligence scores are associated with later success in life. Many studies reveal that general mental ability (vocabulary, verbal reasoning, spatial matching, mathematical progressions and so on) tested at around age 11 remains stable over the life course, and that early intelligence gives a good indication of later achievements.
So, the Flynn effect be damned: intelligence tests are good predictors of later achievement, even when the scores are not known to anyone, including potential employers, and so cannot have a self-confirming influence. It is because of results like these that researchers know that the Flynn effect does not diminish the predictive power of intelligence. The Flynn effect becomes important in the comparison of cohorts. That is to say, do 11-year-olds nowadays have better vocabularies than 11-year-olds six decades ago? Comparing across decades is problematic. The problem is that language has changed somewhat, so some words in the vocabulary test need to be changed, generally every decade, so that they can do their job of sorting out the test takers accurately according to their vocabularies. Tests and exams are designed to find the best predictors right now for reliable predictions of later achievements. They are less good at cohort comparisons. In fact, all exams have to tune themselves to the problems of the present, in order to be faithful indicators of ability. That fine tuning makes them partly children of their time, good at what they do (for example, picking the brightest to go on to the most demanding universities) but less good at allowing accurate comparisons to be made across different versions of the exam over the decades.
All is not lost, because some test items stay the same. Digit Span has the same basic format, though the test has been improved by giving more trials so as to boost reliability. Coding tasks are very similar to what they have always been. Reaction times to simple or complex stimuli are also pretty standard. (In fact, the different technologies used over the last century have been difficult to compare, but that is a technical issue due to quirks of the equipment, not a conceptual one, because the task remains the same.) Simple arithmetic follows the same rules as always, so it provides a good benchmark.
So, in judging whether the Flynn Effect is a real change in intelligence, it is preferable to go for the most unchanging of the test procedures, and to look at raw scores wherever possible. I will not go into all those matters just at the moment, but you can find them in the Archive under “Flynn Effect”. As a rule of thumb, tests of Digit Span and Maths have shown little change across cohorts.
I would like you to assume for a moment that the apparent rise in intelligence in the last 8 decades is a true finding, and not just a case of IQ inflation. That is, assume that people are really brighter now, not just better at answering questions designed to measure their intelligence. If so, what follows?
Well, we should be living in the golden age of the intellect. Our rate of innovation should be increasing, problems should be being solved at a faster rate than ever before, and instruction manuals should be much shorter, if even needed at all. Also, we should have a good understanding of probability, sampling theory, and tests of statistical significance.
These matters have been much debated of late. Michael Woodley argues that the rate of innovation has decreased, that vocabularies are decreasing, that reaction times are slowing, that the more intellectually demanding task of Digits Backwards is decreasing somewhat while the easier Digits Forwards is increasing slightly, and that sensory discrimination is blunted. James Flynn argues that judging innovation in the short term is very difficult, because it takes time to evaluate the true impact and power of new ideas, many of which require proofs not yet available. He also doubts that the supposed increases in ability are real in the deep sense of current generations being brighter than their grandparents. He suggests that there has been a shift from concrete to abstract frames of reference, mostly due to schooling.
At this stage it is apposite to introduce a quiet couple you will not have heard of: Olev and Aasa Must. They are psychologists at the University of Tartu, Estonia. Estonia has many positive characteristics, but the best from my point of view is that they take intelligence seriously. Estonia bothered to translate and re-norm the Yerkes test in 1934. This has proved a psychometric gold mine.
In 2013, together with William Shiu, Alexander Beaujean, and Jan te Nijenhuis, Olev and Aasa Must used item response theory to delve into the Estonian version of the Yerkes 1919 National Intelligence Test, which was given in Estonia in 1934 and again in 2006. They found that, using only the invariant (stable) items, there was a Flynn effect on all but one subtest. There was much variability in the strength of the effect, ranging from an effect size of 0.24 (3.60 IQ points) to 1.05 (15.75 IQ points). There was a decrease in variability across time for all subtests, although only two showed a large decrease. Overall, the study suggested a real Flynn effect in Estonia, and of course the effect is not likely to be specific to that country, but to be part of a general trend revealed by the careful item-by-item collection of intelligence test data in 1934. In 2016 they published further work on the Flynn effect.
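For readers who want to check the arithmetic behind those figures, here is a minimal sketch. It assumes the conventional IQ scale with a standard deviation of 15, which is what the quoted pairs (0.24 and 3.60, 1.05 and 15.75) imply. The function names are mine, for illustration only, and the per-decade figure is simply a linear average over the 72 years between the two testings, not something the study itself reports here.

```python
IQ_SD = 15.0  # assumption: conventional IQ scale, standard deviation of 15


def effect_size_to_iq_points(d, sd=IQ_SD):
    """Convert a standardized effect size (in SD units) to IQ points."""
    return d * sd


def iq_points_per_decade(d, start_year, end_year, sd=IQ_SD):
    """Average gain per decade, assuming (hypothetically) a linear trend."""
    decades = (end_year - start_year) / 10.0
    return effect_size_to_iq_points(d, sd) / decades


# The smallest and largest subtest effects quoted above, 1934 to 2006.
for d in (0.24, 1.05):
    total = effect_size_to_iq_points(d)
    rate = iq_points_per_decade(d, 1934, 2006)
    print(f"d = {d:.2f}: {total:.2f} IQ points, about {rate:.2f} per decade")
```

On these assumptions the gains run from roughly half an IQ point to a little over two IQ points per decade, depending on the subtest.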
So, what is the real effect of the Flynn Effect? It seems to have been very positive for well-organized and therefore wealthy countries; but currently to be fading or even reversing in those countries; while at the same time it is rising in poorer countries which are now becoming better organized in delivering schooling and health to their populations. Their rate of rising is slow: convergence with wealthy countries will take somewhere between 60 years and never.
Olev Must muses on whether the reversing Flynn effect will have real consequences for those societies. In particular, he wonders about societies living for decades or indeed centuries with negative Flynn effects. Modern societies (in North America and in Europe) are adapted to having positive Flynn effects, at least in the 19th and 20th centuries. Western societies have come to believe that younger cohorts are promising, and more able than older ones. Educational systems, professional development, and innovative projects all work on this assumption.
It will be catastrophic to acknowledge and experience that one generation, the next generation, and the third generation are each less mentally agile than the previous one. Olev Must wonders why Finland's results in PISA have been dropping since 2012. Is this the result of high migration? Will this have some other consequences, particularly for the economy? Does the Finnish educational model no longer work?
We should not rush things. These predictions will be testable with each generation, but the current trend of the Negative Flynn Effect is sufficient to make some intelligence researchers worried.