Linda S. Gottfredson
Intelligence 31 (2003) 415–424
Sternberg disputes not a single point in my critique of his work on practical intelligence. Instead, he discusses his broader theory of successful intelligence and answers self-posed objections from unspecified critics. His discussion exhibits the same problematic mode of argument and use of evidence that my critique had documented: it repeats the unsubstantiated claims that critics question as if merely repeating them somehow rebutted the critics; it ridicules rather than answers critics while claiming to do the reverse; and it spuriously validates Sternberg’s theory by reporting evidence selectively and inaccurately.
Sternberg’s (2003) ‘‘Reply to Gottfredson’’ addresses none of the many errors I identified in his accounts of evidence for practical intelligence theory: for instance, misreporting data, consistently overstating supportive results, and ignoring evidence that contradicts the theory. Instead, Sternberg peremptorily dismisses my analysis with two mere mentions, one each in his opening and closing paragraphs, that imply unscientific behavior on my part. He then claims to set the record straight by highlighting those aspects of his theory that, in fact, were not relevant to my analysis of practical intelligence (e.g., componential analyses of ‘‘analytical’’ ability), and he devotes only a single paragraph to practical intelligence itself.
Sternberg has sidestepped my critique. To appear to be facing criticisms, however, he poses his own set of ‘‘various criticisms. . .received over the years.’’ Half the 10 items are straw men that provide Sternberg the opportunity to restate his unsubstantiated claims as if they constituted additional support for his theory. They yield fresh examples only of Sternberg’s overstatement and error.
2. Non-engagement with contrary evidence
My analysis of major problems with Sternberg’s program of research on practical intelligence examined two theoretical and six empirical claims about practical intelligence for which Sternberg and his colleagues allege support. Sternberg’s reply does not dispute my rendition of them. I provided evidence that each of the eight claims is either false or unsubstantiated. Had I ‘‘pervasively misrepresented’’ Sternberg’s ideas and evidence, as he claims, one might have expected his reply to hold me accountable with a trenchant list of misstatements. However, Sternberg identifies none and relies instead on labeling his work as ‘‘positive’’ (his emphasis) and those who question it as ‘‘less than fully constructive.’’
The errors and self-contradictions that I documented ranged from the seemingly minor to egregious, but they were consistent in overstating the evidence for practical intelligence and understating it for g. Here are some examples, one for each of the six empirical claims, that Sternberg should have refuted had they been wrong.
1. Unexplained self-contradiction (on implicit theories of intelligence): Without explanation, Sternberg attributes to an early study a conclusion favoring practical intelligence theory when its authors (he was lead author) had actually reached the opposite conclusion, which favored g theory.
2. Failure to consider directly relevant evidence that vitiates his claim (that there must be a separate practical intelligence because g does not predict performance on certain simple or highly practiced tasks): Sternberg ignores the extensive research on experience, personality, and other non-g predictors of performance by g theorists themselves, which can explain the phenomena he says require positing a practical intelligence.
3. Selective use of less-relevant but more supportive evidence (on age trends in fluid and crystallized g): Sternberg cites less-relevant evidence while dismissing the more relevant when the former is consistent with a favored claim but the latter directly contradicts it.
4. No-lose interpretations (on the validity of tacit knowledge tests): Sternberg interprets even contradictory results as consistent support for his theory by positing that both ‘‘A’’ (‘‘domain generality’’) and ‘‘not A’’ (‘‘domain specificity’’) constitute evidence favoring the theory.
5. Misreported results (on the independence of IQ and tacit knowledge): Sternberg incorrectly reports correlations as not significant when they actually are, resulting in more consistent support for his preferred claim.
6. Skewed summary of results (on the predictive validity of tacit knowledge relative to g): Sternberg’s summaries of evidence routinely report only the largest criterion-related correlations for his tests but the lowest for competing ones, thereby making the former appear more predictive than the latter when the opposite is true.
Sternberg’s failure to engage such points mirrors his disinclination to engage unwelcome evidence in either the broader literature on g or his own research program. As described in my critique, his two theoretical claims gain plausibility only by substituting misleading labels (g is only ‘‘academic’’ and just one type of ‘‘flexible’’ expertise or ‘‘achievement’’) for the century of pertinent contrary evidence (that g is actually a highly general, stable, and heritable trait of individuals, regardless of their circumstances). His six empirical claims seem credible only when, as illustrated in the examples above, he focuses on positive results and ignores or misreports disconfirming evidence.
3. Faux engagement wielding faux evidence
Some scholars have questioned whether tacit job knowledge tests really measure another form of intelligence, ‘‘practical’’ or otherwise, and others point up the ‘‘vacuous,’’ ‘‘pseudoempirical,’’ ambiguous, and jargon-laden character of triarchic theory in general (Kline, 1991, 1998, pp. 141–142; Messick, 1992, pp. 377–380; Rabbitt, 1988, p. 178). I listed these and yet other problems in my critique. Ignoring them all, Sternberg instead poses for himself a set of 10 ‘‘criticisms’’ from unspecified sources. Half are straw men, and only two of the remainder directly address the issue at hand — his claims for practical intelligence (I have switched their order below for ease of presentation).
3.1. Straw men
I have never seen critics assert any of the first five self-posed criticisms that Sternberg answers, and they seem meant to cast ridicule upon his critics.
Criticism 1: There is much more evidence in favor of g theory than in favor of the triarchic theory. There is, of course, more evidence regarding g than triarchic theory, but that is not the issue. The issue is that Sternberg and his colleagues tend to treat their small collection of evidence as being just as dispositive as that for g. They also have much less evidence than they routinely imply they do. One especially important example will suffice. Sternberg has repeatedly implied that he has evidence for a general factor of practical intelligence that is largely independent of g and that predicts life success at least as well as g, if not better. Every element of that claim is demonstrably false. Because no one has collected the requisite data for extracting a general factor of practical intelligence, there is no evidence that one even exists, let alone one that is independent of g or as good a predictor. The small set of tacit knowledge studies seldom measured workers’ IQ, and they represent but a thin and atypical slice of both the IQ distribution and the world of work, let alone of ‘‘everyday life.’’
Criticism 2: Intelligence is fixed, not flexible. No g theorist claims that g is ‘‘fixed.’’ This is a canard and distracts readers from the pertinent point, which is that individual differences in g become highly stable and more heritable by adolescence. These facts mean that g is not just some culture-specific and situation-specific form of developing expertise that is comparable, as Sternberg suggests, to learning the ropes on a particular job (tacit knowledge).
Criticism 3: The triarchic theory says that intelligence is all relative, and that is not scientific. I do not object to Sternberg offering the vague proposition that ‘‘intelligence is all relative,’’ because he applies the term ‘‘intelligence’’ broadly to general competence or overall life success in a culture (although this strips the term of most useful meaning). It is inappropriate, however, for him to suggest that evidence shows the general factor of intelligence to be a cultural artifact: ‘‘Western and related forms of schooling may, in part, create the g phenomenon by providing a [particular] kind of schooling’’ (Sternberg et al., 2000, p. 9). Evidence proves otherwise, as I noted in my critique.
Criticism 4: Believing in the value of the triarchic theory somehow diminishes the contributions of psychometric theorists and researchers. It is not Sternberg’s ‘‘believing in triarchic theory’’ that diminishes competing theories and theorists, but his casting of gratuitous aspersions on them: being ‘‘quasi-scientific,’’ ‘‘g-ocentric,’’ ‘‘creating a kind of night of the living dead,’’ and such (Science and pseudoscience, 1999, p. 27; Sternberg, 1997, pp. 54–55; Sternberg & Wagner, 1993, p. 1). For Sternberg to deny that he ‘‘trashes’’ his critics (his term) is to deny the obvious. Nor does he refrain from it in his reply to my critique. Even favorable reviews of his work lament his ‘‘unwarranted personal assaults’’ (Herklots, 2001, pp. 225–226): ‘‘Unfortunately, the inclusion of such caustic asides will prevent this reviewer from recommending this otherwise exceptional book to any parent, politician, or first-year graduate student.’’
Criticism 5: The triarchic theory is just wrong. Again, as in the previous four self-posed ‘‘criticisms,’’ Sternberg has attributed a patently silly complaint to his critics that none would ever make. It is a form of ridicule, not argument.
The remaining five ‘‘criticisms’’ are mostly diversionary. Sternberg chooses to address one technical issue about which there is legitimate debate (correcting for statistical artifacts) while ignoring the more serious lapses about which there is none (e.g., mistaken and selective reporting of results). He selects four substantive points to argue, but only one (the meaning of tacit knowledge) directly relates to his own theory. All nonetheless provide him an opportunity to restate his disputed claims as if their mere repetition transformed them into evidence against his critics.
Criticism 6: g correlates with many things but the triarchic theory says it does not. Sternberg agrees that g is likely to predict many things to some extent, so that is not the issue. Rather, it is that he alleges support for his assertion that g is ‘‘only a tiny and not very important part’’ (Sternberg, 1997, p. 11) of the intellectual spectrum (not true) and that it has little value in the real world of practical affairs (not true), especially relative to his hypothesized general factor of practical intelligence (a claim never tested, let alone substantiated). The more important effect of his answer to this self-posed ‘‘criticism,’’ however, is to plant doubts about the meaning of g’s predictive validity, perhaps especially when it is strong. Sternberg does this by suggesting that g’s ability to predict life success results only from the game of life having been rigged by an entrenched power structure that arbitrarily rewards some people (‘‘people with green skin,’’ to take Sternberg’s example) rather than others. He does not say how this notion comports with the literatures showing that higher g people actually are more competent in performing core tasks in everyday life (e.g., protecting one’s health) or on the job, regardless of social advantage and even compared to lower IQ siblings growing up in the same household.
Criticism 7: The triarchic theory does not acknowledge the causal power of g. Sternberg’s answer is simply to reassert the falsehood that g has no causal force. However, differences in g have been shown (including experimentally) to cause differences in later performance, both in school and on the job, by any ordinary meaning of the term cause, and many employers have profited handsomely from acting on that assumption when hiring workers.
Criticism 8: The triarchic theory fails to acknowledge the importance of genetic factors in intelligence. In responding to this (accurate) criticism, Sternberg first points out that heritability estimates cannot be generalized beyond the sorts of samples from which they were calculated (as I myself had explained). This is hardly a reason to ignore them, however. We don’t throw out a map of Asia just because it doesn’t include Europe. In addition, contrary to what Sternberg implies, behavior geneticists themselves counsel proper interpretation of heritability estimates, and such estimates are not ‘‘often. . .misinterpreted in the literature on intelligence.’’ Sternberg then invokes the most bizarre extremes in rearing environments (e.g., being ‘‘locked in a closet’’) to justify ignoring the role of genes in typical circumstances. We already know that variations in typical family environments produce no lasting differences in intelligence. Sternberg has published reviews of that evidence in his own edited books (e.g., Scarr, 1997). Even being ‘‘locked in a closet’’ may have no permanent effects on mental ability: ‘‘Isabelle,’’ whatever the tragic consequences she may have suffered, soon developed normal intelligence after release from confinement in an attic where she had lived with virtually no mental stimulation (e.g., no toys, no speech) for the first 6 years of her life (Jensen, 1981). In short, Sternberg has given specious reasons to defend ignoring crucial evidence, namely evidence on the heritability of g, that eviscerates his claim that g is really just one among various culturally specific forms of knowledge on a par with specific sorts of ‘‘tacit knowledge.’’
Criticism 9: If one corrects for restriction in range and attenuation, one will find tacit-knowledge measures correlate with g. Sternberg is here defending his failure to estimate the effects of restriction in range on IQ in his samples (Yale undergraduates, psychology professors, managers with an average IQ at the 90th percentile, and the like). He implies, first, that correcting for unreliability and restriction in range is typically just a self-serving exercise; second, that it would not increase the correlations between tacit knowledge and IQ very much anyway; third, that he lacks the necessary information with which to correct for restriction in range; and finally, that critics might be hoist with their own petard were he to make such corrections. He is wrong on the first and third counts. Professional test standards in employee selection recommend such corrections when evaluating theories; researchers commonly make them because reasonable estimates of unreliability and restriction in range on IQ usually can be made; and corrections often do make a difference when the students and workers studied had been selected into their positions partly on the basis of intellectual competence. It is these substantial increases in estimated correlations with g that Sternberg (1997, p. 225) has elsewhere sought to diminish as merely ‘‘jacked up.’’ As for the second and last points, Sternberg misleads us. The reliabilities of tacit job knowledge tests are not ‘‘about .9’’ (Sternberg, this issue). I could locate no test–retest reliabilities for his tacit job knowledge tests, but the eight reported internal consistency reliabilities range between .66 and .85, the median being .75 (Gottfredson, 2003). As Brody’s (this issue) critique of Sternberg’s research on the Sternberg Triarchic Abilities Test (STAT) in educational settings showed, corrections for unreliability increase considerably the correlations among Sternberg’s three ‘‘distinct’’ intelligences.
Sternberg’s final argument is that correcting for statistical artifacts could make the one negative correlation between tacit knowledge and IQ even larger, which would only strengthen his case. As I shall discuss shortly, there is no reason to believe that the measure of ‘‘herbal knowledge’’ in question (traditional beliefs about the causes and cures of illness) among Kenyan children represents any sort of general mental competence. Thus, none of Sternberg’s four reasons justifies his failure to correct for statistical artifacts, especially when such artifacts can be counted on to render his results more compatible with his theory. In any case, Sternberg cannot rule out the plausible hypothesis that differences in tacit knowledge reflect mostly differences in g until he makes those corrections (I myself am not yet convinced that the various tacit knowledge tests assess primarily intellectual differences, because they often emphasize self-promotion regardless of competence).
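To make the arithmetic behind the reliability point concrete, the following is a minimal sketch of the standard (Spearman) correction for attenuation. Only the tacit knowledge reliabilities (.66, .75, .85, and the claimed ‘‘about .9’’) come from the discussion above. The observed correlation of .25 and the IQ-test reliability of .90 are illustrative assumptions, not values reported for any particular sample, and the function name is hypothetical.

```python
import math


def disattenuate(r_obs, rel_x, rel_y):
    """Spearman's correction for attenuation: r_obs / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_obs / math.sqrt(rel_x * rel_y)


# ASSUMPTION: an illustrative observed tacit knowledge-IQ correlation.
R_OBS = 0.25
# ASSUMPTION: a plausible IQ-test reliability; not taken from the text.
IQ_RELIABILITY = 0.90

# Tacit knowledge test reliabilities as cited in the text.
TK_RELIABILITIES = {
    "lowest reported (.66)": 0.66,
    "median reported (.75)": 0.75,
    "highest reported (.85)": 0.85,
    "claimed 'about .9'": 0.90,
}

corrected = {
    label: disattenuate(R_OBS, rel, IQ_RELIABILITY)
    for label, rel in TK_RELIABILITIES.items()
}

for label, r in corrected.items():
    print(f"{label}: corrected r = {r:.3f}")
```

Under these assumptions, the lower the actual reliability of the tacit knowledge test, the more the observed correlation understates the underlying one, which is why the gap between a median of .75 and a claimed ‘‘about .9’’ is not a trivial detail.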
Criticism 10: Practical intelligence is really job knowledge or personality or something other than intelligence. Sternberg defends his claim that tacit knowledge measures a new intelligence and not personality, job knowledge, or the like by first using ridicule to impugn those alternatives and then making new false claims while reasserting previous falsehoods. First, the ridicule. Some critics (Schmidt & Hunter, 1993) have indeed suggested that tacit knowledge tests measure job knowledge rather than an underlying ability to acquire knowledge. Sternberg makes them look foolish by stating that ‘‘These [Alaskan and Kenyan] children have no formal jobs so it is not clear what it would mean to say that their knowledge is ‘job knowledge.’’’ However, the forum of debate with those critics — and Sternberg’s own article in it (Sternberg & Wagner, 1993) — had been explicitly limited to tacit knowledge tests in the workplace, partly because that was the only sort of tacit knowledge test that had yet been developed. The Kenyan study of herbal knowledge was published only years later, and the Alaskan study is new and still unpublished.
Sternberg then further disputes this 10th ‘‘criticism’’ by making new errors while reasserting old ones. To illustrate, all four claims that Sternberg asserts in the several sentences directly following the foregoing implicit ridicule are false and misleading in one or more ways. The first two assertions introduce new falsehoods; the last two are ones I had already analyzed in my critique.
First assertion: ‘‘Tests of practical intelligence do not correlate significantly with tests of personality. . ..’’
The facts: Only one tacit knowledge test (not ‘‘tests’’) has ever been reported to have been examined in relation to personality, and then in only one sample of 45 managers. Sternberg has overstated his evidence by implying a plurality of studies that does not exist.
Second assertion: ‘‘We used to believe that tacit knowledge is all domain specific. Our research has convinced us that this belief was incorrect.’’
The facts: Sternberg has never clarified what his protean term ‘‘domain specific’’ actually means, nor is it clear what he means here by the phrase ‘‘all domain specific.’’ Far from ever retracting that belief (I have seen no such retraction), Sternberg and his colleagues continue to describe tacit knowledge as highly specific: ‘‘tacit knowledge is always wedded to particular uses in particular situations or in classes of situations’’ (Sternberg, Wagner, Williams, & Horvath, 1995, p. 917; see also Sternberg et al., 2000, pp. 107–108). Although it is not clear what criticism he is trying to rebut when he asserts this supposed change of view, this assertion suggests a responsiveness to evidence on practical intelligence that Sternberg has, in fact, yet to demonstrate.
Third assertion: ‘‘We found high correlations between different subtests and tests of tacit knowledge, although we did not find high correlations of these tests with g-based measures.’’
The facts: As discussed at length in my critique, in only two of the four tacit knowledge tests (management and academic psychology) did the subscales of a tacit knowledge test correlate among themselves, and only two studies administered different tacit knowledge tests to the same sample. The correlation between tacit knowledge in management and in academic psychology was fairly high (.58), but that was in a sample of Yale undergraduates, not workers. In the only study of workers that administered two different tests, the correlations in its three samples (between knowledge for management and military leadership) were notably smaller: -.06, .32, and .36. As for the correlations between IQ and tacit knowledge, they were .14 in the study of 45 managers, .02 to .25 in the three samples of Army officers, and -.04 to .40 in five samples of college students or Air Force trainees. Why, then, should the former set (-.06 to .36 for workers, .58 for students) be labeled high and the latter not (.02 to .25 for workers, -.04 to .40 for students), especially when the latter are all artificially depressed by severe restriction in range on IQ?
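How much restriction in range might depress such correlations can be sketched with the standard correction for direct restriction on one variable (often called Thorndike's Case II). The observed correlations below are the ones cited in the preceding paragraph. The ratio of unrestricted to restricted IQ standard deviations (1.5) is purely an assumption for illustration, since no such ratio is reported here, and treating the selection as direct restriction on IQ is itself a simplifying assumption. The function name is hypothetical.

```python
import math


def correct_range_restriction(r_obs, sd_ratio):
    """Correct a correlation for direct range restriction on one variable.

    sd_ratio is the unrestricted SD divided by the restricted SD (U).
    Formula: r_c = r * U / sqrt(1 + r^2 * (U^2 - 1)).
    """
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    return (r_obs * sd_ratio) / math.sqrt(1 + r_obs**2 * (sd_ratio**2 - 1))


# ASSUMPTION: the IQ standard deviation in the full population is 1.5 times
# that in the selected samples. This value is illustrative only.
ASSUMED_SD_RATIO = 1.5

# Observed IQ-tacit knowledge correlations as cited in the text.
OBSERVED = {
    "45 managers": [0.14],
    "Army officers (range endpoints)": [0.02, 0.25],
    "students/trainees (range endpoints)": [-0.04, 0.40],
}

for sample, rs in OBSERVED.items():
    adjusted = [correct_range_restriction(r, ASSUMED_SD_RATIO) for r in rs]
    pairs = ", ".join(f"{r:+.2f} -> {a:+.3f}" for r, a in zip(rs, adjusted))
    print(f"{sample}: {pairs}")
```

The correction leaves the sign of a correlation unchanged and moves it away from zero whenever the assumed ratio exceeds 1, so the larger observed values grow most in absolute terms. How large the true ratio is in each sample is an empirical question that this sketch does not answer.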
Fourth assertion: ‘‘Thus, the tacit-knowledge tests seem to yield a general factor that is different from psychometric g.’’
The facts: As just indicated, not enough data have ever been collected to extract a general factor of practical intelligence, and the data that do exist contradict the claim as often as they support it. Sternberg merely implies with the label ‘‘general’’ what he cannot show, namely, that practical intelligence is a second higher order factor, alongside g, at the apex of the hierarchical structure of mental abilities. This claim is the linchpin of Sternberg’s evidence for practical intelligence theory, and yet it remains unsubstantiated. He inserts it here, as if solid as steel, in the guise of rebutting critics on a smaller question. My answer to that question — what do tacit knowledge tests really measure? — was that we do not yet know. The four tacit knowledge tests for jobs (sales, management, academic psychology, and military leadership) have been so diverse and changed so much over the years, in both content and mode of scoring, that it is hard to guess what they might be measuring individually or collectively. Despite all Sternberg’s references to convergent-discriminant validation, he has done little of it to clarify the meaning of tacit knowledge. His ‘‘converging operations’’ consist mostly of reporting evidence in a highly selective manner.
4. Practical intelligence in nonwork settings
My critique focused on Sternberg’s claims regarding tacit knowledge in work settings because that is where he and his colleagues have produced the fullest body of evidence on practical intelligence. Sternberg’s reply mentions several other studies meant to tap tacit knowledge, so I will briefly comment on two of them. I cannot comment on the study of tacit knowledge among Alaskan children, because there is no information yet available on it, nor need I recapitulate the problems with the other studies that Sternberg offers as rebuttal but whose flaws Brody and I had already detailed in our critiques, namely, the STAT studies in school settings (Brody, 2003) and racetrack handicapping, supermarket shopping, street vending, and related studies in everyday settings (Gottfredson, 2003).
4.1. Herbal knowledge among Kenyan children
Contrary to what Sternberg implies, this study did not measure skills or knowledge that actually enhance health, but only beliefs about illness and herbal treatments that are widely held in the rural village studied. For example, one of the answers scored as correct on the inventory was to agree that the ‘‘evil eye’’ is a likely cause of a baby’s crying and stomachache. Herbal knowledge scores correlated negatively not only with several tests of IQ and achievement, but also with parents’ social class. We might expect a belief in myths, superstitions, and other questionable folk ‘‘knowledge’’ to correlate negatively with both IQ and social class in the United States too, but that could hardly be said to dissipate the positive manifold of cognitive tests. Only a bona fide ability test could do that, and nothing suggests that adhering to folk beliefs about illness in a society undergoing modernization reflects a form of mental competence. If we construe such adherence as a form of backwardness, then the study shows the usual positive manifold.
4.2. Practical intelligence among Russian adults
Sternberg claims that ‘‘among Russian adults, although both academic and practical intelligence predicted mental and physical health, practical intelligence was the better predictor.’’ What was the measure of ‘‘practical intelligence?’’ Turning to the study itself, we see two: first, self-ratings of competence in social situations, running a household, and solving sudden problems, and second, typicality of preferred response to three financial challenges and locating medical services. What were the outcome measures of ‘‘mental and physical health?’’ Three items on self-reported physical health, depression, and anxiety, and five on self-efficacy. What we may have here is mostly self-confidence predicting a sense of self-efficacy, which is hardly news. How big were the criterion-related correlations? Uniformly small. The zero-order correlations of the three triarchic abilities with the eight outcomes averaged .07 (analytical), .03 (creative), and .14 (practical). In short, only by labeling variables advantageously and not reporting any numbers does this study seem to provide evidence that (a) ‘‘practical intelligence’’ (more than analytical intelligence) (b) ‘‘predicted’’ (c) ‘‘mental and physical health.’’
5. Chameleon aims
Sternberg’s reply repeatedly suggests that the aim of his research program is really quite modest — to show that ‘‘there is more to intelligence than g. . .and what this ‘more’ might be.’’ He is simply ‘‘going beyond g.’’ He implies that intelligence researchers, rather than seeking to ‘‘expand Spearman’s theory,’’ prefer to ‘‘accept it in close to its original form.’’ The contrast is false. Intelligence researchers have been no more content to rest on Spearman’s laurels than Sternberg has been to accept practical intelligence as a mere supplement to g. When writing for general audiences, he describes himself as a ‘‘revolutionary’’ (Sternberg, 1996), a veritable warrior against the forces of intellectual darkness, some of which have ‘‘created a kind of night of the living dead’’ (Sternberg, 1997, p. 55).
Recall, likewise, the central claim of his book on the topic, Practical intelligence in everyday life (Sternberg et al., 2000, pp. xi–xii).
[W]e argue that practical intelligence is a construct that is distinct from general intelligence and that. . .[it] is at least as good a predictor of future success as is the academic form of intelligence that is commonly assessed by tests of so-called general intelligence. Arguably, practical intelligence is a better predictor of success.
Far from being modest, this is a bold claim — that there are at least two general factors of intelligence, g and practical intelligence. It is also a highly implausible one in view of the vast and dense nomological network already established for the general factor of intelligence and the lack of evidence for practical intelligence. Sternberg’s reply deflects scrutiny of his implausible claim by focusing readers on a muted version of it.
No one has yet demonstrated that practical intelligence rests on scientifically valid evidence or that it is even a useful construct. No one has yet shown that the various tests of tacit knowledge, the ‘‘important aspect’’ of practical intelligence, measure anything that is not already effectively captured by measures of personality, interests, cognitive abilities, specialized knowledge, and other well-studied human traits and competencies. Establishing practical intelligence within or beside that pantheon would require carefully specified constructs, clear distinctions among types of predictors and criteria, testable hypotheses, well-described measures, fully described data collection methods, fully and accurately reported results, comparison and aggregation of results across studies, conclusions that fit the evidence, and more representative samples of tasks and people. Not the illusion of them, but the reality.