The point is underscored by what Herrnstein and Murray call the "Flynn Effect": IQ has been rising about 3 points every 10 years worldwide. Since World War II, IQ in many countries has gone up 15 points, about the same as the gap separating Blacks and Whites in this country. And in some countries, the rise has been even more dramatic. For example, average IQ in Holland rose 21 points between 1952 and 1982.
--------------------------------------------------------------------------------
Steven Johnson, "Dome Improvement," WIRED MAGAZINE, MAY 2005 pp. 102-105
-------------------------------------------------
Stephen Jay Gould's The Mismeasure of Man or Howard Gardner's work on multiple intelligences or any critique of The Bell Curve is liable to dismiss IQ as merely phrenology updated, a pseudoscience fronting for a host of racist and elitist ideologies that dare not speak their names.
These critics attack IQ itself - or, more precisely, what intelligence scholar Arthur Jensen called g, a measure of underlying "general" intelligence. Psychometricians measure g by performing a factor analysis of multiple intelligence tests and extracting a pattern of correlation between the measurements. (IQ is just one yardstick.) Someone with greater general intelligence than average should perform better on a range of different tests.
Unlike some skeptics, James Flynn didn't just dismiss g as statistical tap dancing. He accepted that something real was being measured, but he came to believe that it should be viewed along another axis: time. You can't just take a snapshot of g at one moment and make sense of it, Flynn says.
You have to track its evolution. He did just that. Suddenly, g became much more than a measure of mental ability. It revealed the rising trend line in intelligence test scores. And that, in turn, suggested that something in the environment - some social or cultural force - was driving the trend.
Significant intellectual breakthroughs - to paraphrase the John Lennon song - are what happen when you're busy making other plans. So it was with Flynn and his effect. He left the US in the early 1960s to teach moral philosophy at the University of Otago in New Zealand. In the late '70s, he began exploring the intellectual underpinnings of racist ideologies. "And I thought: Oh, I can do a bit about the IQ controversies," he says. "And then I saw that Arthur Jensen, a scholar of high repute, actually thought that blacks on average were genetically inferior - which was quite a shock. I should say that Jensen was beyond reproach - he's certainly not a racist. And so I thought I'd better look into this." This inquiry led to a 1980 book, Race, IQ, and Jensen, that posited an environmental - not genetic - explanation for the black-white IQ gap. After finishing the book, Flynn decided that he would look for evidence that blacks were gaining on whites as their access to education increased, and so he began studying US military records, since every incoming member of the armed forces takes an IQ test.
Sure enough, he found that blacks were making modest gains on whites in intelligence tests, confirming his environmental explanation. But something else in the data caught his eye. Every decade or so, the testing companies would generate new tests and re-normalize them so that the average score was 100. To make sure that the new exams were in sync with previous ones, they'd have a batch of students take both tests. They were simply trying to confirm that someone who tested above average on the new version would perform above average on the old, and in fact the results confirmed that correlation. But the data also brought to light another pattern, one that the testing companies ignored. "Every time kids took the new and the old tests, they did better on the old ones," Flynn says. "I thought: That's weird."
The testing companies had published the comparative data almost as an afterthought. "It didn't seem to strike them as interesting that the kids were always doing better on the earlier test," he says. "But I was new to the area." He sent his data to the Harvard Educational Review, which dismissed the paper for its small sample size. And so Flynn dug up every study that had ever been done in the US where the same subjects took a new and an old version of an IQ test. "And lo and behold, when you examined that huge collection of data, it revealed a 14-point gain between 1932 and 1978." According to Flynn's numbers, if someone testing in the top 18 percent the year FDR was elected were to time-travel to the middle of the Carter administration, he would score at the 50th percentile.
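The percentile claim can be checked with a short calculation. The sketch below assumes IQ scores follow a normal curve with a mean of 100 and a standard deviation of 15; the standard deviation is a conventional value and is not stated in the article. The function name is made up for illustration.

```python
from statistics import NormalDist

# Assumptions (not stated in the article): IQ is normally distributed,
# each norming sets the mean to 100, and the standard deviation is 15.
IQ_MEAN = 100.0
IQ_SD = 15.0
GAIN_1932_TO_1978 = 14.0  # points, the gain Flynn reports for 1932-1978


def share_at_or_above_later_median(gain, mean=IQ_MEAN, sd=IQ_SD):
    """Fraction of the earlier population that reaches the later median.

    On the earlier scale, the later median sits `gain` points above the
    earlier mean, so we measure the upper tail beyond mean + gain.
    """
    earlier_norms = NormalDist(mu=mean, sigma=sd)
    return 1.0 - earlier_norms.cdf(mean + gain)


top_share = share_at_or_above_later_median(GAIN_1932_TO_1978)
print(f"Share of 1932 test takers at or above the 1978 median: {top_share:.1%}")
```

Under these assumptions the share comes out at about 17.5 percent, which rounds to the article's "top 18 percent". A different standard deviation would shift the figure.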
When Flynn finally published his work in 1984, Jensen objected that Flynn's numbers were drawing on tests that reflected educational background. He predicted that the Flynn effect would disappear if one were to look at tests - like the Raven Progressive Matrices - that give a closer approximation of g, by measuring abstract reasoning and pattern recognition and eliminating language altogether. And so Flynn dutifully collected IQ data from all over the world. All of it showed dramatic increases. "The biggest of all were on Ravens," Flynn reports with a hint of glee still in his voice.
The trend Flynn discovered in the mid-'80s has been investigated extensively, and there's little doubt he's right. In fact, the Flynn effect is accelerating. US test takers gained 17 IQ points between 1947 and 2001. The annual gain from 1947 through 1972 was 0.31 IQ point, but by the '90s it had crept up to 0.36.
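As a rough consistency check on these figures, the arithmetic below reads the 1947-1972 rate as 0.31 point per year and asks what average rate the remaining years would need to reach the 17-point total. It is a back-of-the-envelope sketch, not Flynn's method. The variable names are illustrative.

```python
# Figures taken from the paragraph above.
TOTAL_GAIN = 17.0            # IQ points gained, 1947-2001
START, SPLIT, END = 1947, 1972, 2001
EARLY_RATE = 0.31            # points per year, 1947-1972

# Points accounted for by the early period.
early_gain = EARLY_RATE * (SPLIT - START)

# Average annual rate the later period must have had to reach the total.
later_rate = (TOTAL_GAIN - early_gain) / (END - SPLIT)

# Average annual rate over the whole span, for comparison.
overall_rate = TOTAL_GAIN / (END - START)

print(f"Early gain:   {early_gain:.2f} points")
print(f"Later rate:   {later_rate:.3f} points/year")
print(f"Overall rate: {overall_rate:.3f} points/year")
```

The implied 1972-2001 average is a little under 0.32 point per year. That is above the early rate and below the 0.36 reported for the '90s, which fits a rate that was still climbing.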
Though the Flynn effect is now widely accepted, its existence has in turn raised new questions. The most fundamental: Why are measures of intelligence going up? The phenomenon would seem to make no sense in light of the evidence that g is largely an inherited trait. We're certainly not evolving that quickly.
The classic heritability research paradigm is the twin adoption study: Look at IQ scores for thousands of individuals with various forms of shared genes and environments, and hunt for correlations. This is the sort of chart you get, with 100 being a perfect match and 0 pure randomness:
The same person tested twice 87
Identical twins raised together 86
Identical twins raised apart 76
Fraternal twins raised together 55
Biological siblings 47
Parents and children living together 40
Parents and children living apart 31
Adopted children living together 0
Unrelated people living apart 0
After analyzing these shifting ratios of shared genes and the environment for several decades, the consensus grew, in the '90s, that heritability for IQ was around 0.6 - or about 60 percent. The two most powerful indications of this are at the top and bottom of the chart: Identical twins raised in different environments have IQs almost as similar to each other as the same person tested twice, while adopted children living together - shared environment, but no shared genes - show no correlation. When you look at a chart like that, the evidence for significant heritability looks undeniable.
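One textbook way to turn twin correlations into a heritability estimate is Falconer's formula, which doubles the difference between the identical-twin and fraternal-twin correlations. The article does not say which method produced the 0.6 consensus, so the sketch below only illustrates how figures like those in the chart can yield a number in that range. The correlations are rescaled from the chart's 0-100 scale to 0-1.

```python
# Correlations from the chart, rescaled from 0-100 to 0-1.
R_IDENTICAL_TOGETHER = 0.86
R_FRATERNAL_TOGETHER = 0.55
R_IDENTICAL_APART = 0.76


def falconer_heritability(r_identical, r_fraternal):
    """Falconer's formula: h^2 = 2 * (r_identical - r_fraternal).

    Identical twins share all their genes and fraternal twins about half,
    so doubling the gap in correlation approximates the genetic share.
    Using this formula here is an assumption; the article names no method.
    """
    return 2.0 * (r_identical - r_fraternal)


h2_falconer = falconer_heritability(R_IDENTICAL_TOGETHER, R_FRATERNAL_TOGETHER)

# The correlation for identical twins raised apart is often read as a
# second, direct estimate, since only genes are shared.
h2_raised_apart = R_IDENTICAL_APART

print(f"Falconer estimate:       {h2_falconer:.2f}")
print(f"Twins-raised-apart view: {h2_raised_apart:.2f}")
```

With the chart's numbers, Falconer's formula gives 0.62, close to the "around 0.6" figure, while the twins-raised-apart correlation points somewhat higher.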
Four years ago, Flynn and William Dickens, a Brookings Institution economist, proposed another explanation, one made apparent to them by the Flynn effect. Imagine "somebody who starts out with a tiny little physiological advantage: He's just a bit taller than his friends," Dickens says. "That person is going to be just a bit better at basketball." Thanks to this minor height advantage, he tends to enjoy pickup basketball games. He goes on to play in high school, where he gets excellent coaching and accumulates more experience and skill. "And that sets up a cycle that could, say, take him all the way to the NBA," Dickens says.
Now imagine this person has an identical twin raised separately. He, too, will share the height advantage, and so be more likely to find his way into the same cycle. And when some imagined basketball geneticist surveys the data at the end of that cycle, he'll report that two identical twins raised apart share an off-the-charts ability at basketball. "If you did a genetic analysis, you'd say: Well, this guy had a gene that made him a better basketball player," Dickens says. "But the fact is, that gene is making him 1 percent better, and the other 99 percent is that because he's slightly taller, he got all this environmental support." And what goes for basketball goes for intelligence: Small genetic differences get picked up and magnified in the environment, resulting in dramatically enhanced skills. "The heritability studies weren't wrong," Flynn says. "We just misinterpreted them."
Dickens and Flynn showed that the environment could affect heritable traits like IQ, but one mystery remained: What part of our allegedly dumbed-down environment is making us smarter? It's not schools, since the tests that measure education-driven skills haven't shown the same steady gains. It's not nutrition - general improvement in diet leveled off in most industrialized countries shortly after World War II, just as the Flynn effect was accelerating.
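The basketball story can be mimicked with a toy feedback loop. This is an illustrative sketch, not the published model: the update rule, the function name and the feedback value of 0.99 are all assumptions, with 0.99 chosen only to reproduce the 1 percent versus 99 percent split in the quote.

```python
def run_feedback(genetic_edge, feedback, rounds):
    """Iterate a toy loop in which skill attracts support and support builds skill.

    Hypothetical update rule (an assumption, not taken from the article):
      support = feedback * skill      # coaching and playing time track skill
      skill   = genetic_edge + support
    For feedback < 1 the loop settles at genetic_edge / (1 - feedback).
    """
    skill = genetic_edge
    for _ in range(rounds):
        support = feedback * skill
        skill = genetic_edge + support
    return skill


GENETIC_EDGE = 1.0  # arbitrary units, hypothetical
FEEDBACK = 0.99     # hypothetical, picked to match the quoted 1/99 split

# Twin A and twin B share the same innate edge but go through the loop
# for different lengths of time, standing in for separate upbringings.
twin_a_skill = run_feedback(GENETIC_EDGE, FEEDBACK, rounds=2000)
twin_b_skill = run_feedback(GENETIC_EDGE, FEEDBACK, rounds=3000)

# Share of the final skill that is the innate edge itself.
genetic_share = GENETIC_EDGE / twin_a_skill

print(f"Twin A skill: {twin_a_skill:.2f}")
print(f"Twin B skill: {twin_b_skill:.2f}")
print(f"Innate share of final skill: {genetic_share:.1%}")
```

Both twins converge on nearly the same skill level, so a study comparing them would see a strong match, even though in this toy setup the innate edge accounts for only about 1 percent of the final skill directly.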
Most cognitive scholars remain genuinely perplexed. "I find it a puzzle and don't have a compelling explanation," wrote Harvard's Steven Pinker in an email exchange. "I suspect that it's either practice at taking tests or perhaps a large number of disparate factors that add up to the linear trend."
Flynn has his theories, though they're still speculative. "For a long time it bothered me that g was going up without an across-the-board increase in other tests," he says. If g measured general intelligence, then a long-term increase should trickle over into other subtests. "And then I realized that society has priorities. Let's say we're too cheap to hire good high school math teachers. So while we may want to improve arithmetical reasoning skills, we just don't. On the other hand, with smaller families, more leisure, and more energy to use leisure for cognitively demanding pursuits, we may improve - without realizing it - on-the-spot problem-solving, like you see with Ravens."
When you take the Ravens test, you're confronted with a series of visual grids, each containing a mix of shapes that seem vaguely related to one another. Each grid contains a missing shape; to answer the implicit question posed by the test, you need to pick the correct missing shape from a selection of eight possibilities. To "solve" these puzzles, in other words, you have to scrutinize a changing set of icons, looking for unusual patterns and correlations among them.
This is not the kind of thinking that happens when you read a book or have a conversation with someone or take a history exam. But it is precisely the kind of mental work you do when you, say, struggle to program a VCR or master the interface on your new cell phone.