Predicting Complex Performance
October 25, 2007
Newcomers to testing are often surprised to learn how modest the relationship is between performance on tests used to predict job performance or college success and actual performance on the job. Normally the correlation is around .30, and rarely is it above .40. Because the proportion of variance explained is the square of the correlation, even a correlation of .40 implies that test scores account for only about 16 percent of the variation in individual performance; at .30, the figure is just 9 percent. Roughly 85 to 90 percent of the variance in actual performance is not predictable from or explained by test scores. The lion’s share of the variance in college or job performance must be explained by other factors.
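The arithmetic behind these percentages is simply the squared correlation (the coefficient of determination). A minimal sketch in Python, with a function name of my own choosing:

```python
# The proportion of criterion variance accounted for by a predictor
# is the square of its correlation with the criterion.
def variance_explained(r: float) -> float:
    """Fraction of variance in performance explained by a predictor
    that correlates r with performance."""
    return r ** 2

for r in (0.30, 0.40, 0.50):
    print(f"r = {r:.2f}: {variance_explained(r):.0%} explained, "
          f"{1 - variance_explained(r):.0%} unexplained")
```

For the validity coefficients the essay cites, this prints 9, 16, and 25 percent explained, respectively.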
The above summary is, for technical reasons, a bit too pessimistic. Without going into all the gory details, suffice it to say that the actual relationship between tests and job performance is higher than the .30 typically observed. There are several reasons for this, but three are particularly important. First, the less-than-perfect reliability of the test lowers the observed correlation between tests and performance. For professionally developed tests, most of which have reliabilities in the neighborhood of .90, the effect is relatively small, but it can be accurately estimated.
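The classical correction for attenuation estimates what the correlation would be if both test and criterion were measured without error. A sketch under illustrative assumptions (the criterion reliability of .60, typical of supervisor ratings, is my example, not a figure from the essay):

```python
from math import sqrt

def disattenuate(r_xy: float, rel_x: float, rel_y: float = 1.0) -> float:
    """Classical correction for attenuation: the correlation the test
    and criterion would show if both were perfectly reliable."""
    return r_xy / sqrt(rel_x * rel_y)

# Observed validity .30, test reliability .90, criterion reliability .60
# (illustrative values only).
print(round(disattenuate(0.30, 0.90, 0.60), 3))  # -> 0.408
```

Even modest unreliability in the criterion, as this shows, depresses the observed correlation noticeably more than the test's own unreliability does.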
The second, more important reason has to do with the phenomenon of self-selection. In general, students tend to gravitate toward courses and majors that are better suited to their background and ability. Students who obtain scores between, say, 300 and 400 on the SAT-Math test are unlikely to major in mathematics or physics, and are in fact likely to avoid any courses involving substantial mathematical content. At the opposite end, students with high math test scores are more likely to take courses with demanding mathematical content. As a consequence, low-scoring students, having chosen less demanding courses, often obtain quite high grade point averages, while high-scoring students, having chosen more demanding ones, often earn modest grade point averages. The net result is a lowering of the correlation between test scores and grades.
The final reason is known in the technical literature as the “restriction of range” problem. Other things being equal, the more restricted the range of test scores or grades, the lower the estimated correlation between the two. As one goes up the educational ladder, the range of scholastic ability becomes smaller and smaller. Struggling or disaffected students drop out of high school; many who do graduate never go on to college; many who enroll in college never finish. This restriction is further exacerbated by grade inflation. Again, the net effect is a lowering of the estimated relationship between tests and grades.
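The standard adjustment for this problem is Thorndike's Case II formula, which estimates the correlation in the full, unrestricted population from the correlation observed in the selected group. A sketch, with the ratio of standard deviations chosen purely for illustration:

```python
from math import sqrt

def correct_range_restriction(r: float, u: float) -> float:
    """Thorndike's Case II correction for direct range restriction.
    r: correlation observed in the selected (restricted) group.
    u: SD of predictor in unrestricted population / SD in selected group.
    Returns the estimated correlation in the unrestricted population."""
    return r * u / sqrt(1 - r**2 + r**2 * u**2)

# If the selected group's score SD is two-thirds of the full applicant
# pool's (u = 1.5), an observed r of .30 rises to about .43 (illustrative).
print(round(correct_range_restriction(0.30, 1.5), 3))  # -> 0.427
```

The more severe the selection (the larger u), the larger the upward correction, which is why correlations shrink as one climbs the educational ladder.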
When technical adjustments are made for these three factors, the correlation between test scores and performance turns out to be closer to .50 than .30. But even a true correlation as high as .50 means that only approximately 25 percent of the variance in performance is explained by test scores, and 75 percent of the variance must be explained by other factors.
What are some of these other factors that affect performance in college? A candidate list would include at least the following: creativity, emotional and social maturity, time management, good health, efficient study habits and practices, and the absence of personal, family and social problems. There are precious few standardized instruments to measure such attributes. And even if these instruments could be developed, their formal use in college admissions and in employment would no doubt be viewed with skepticism. In the absence of such measures, college admissions officials and employment interviewers rely on a host of other methods such as interviews and letters of recommendation, which in turn have their own problems.
The conclusion here is clear. We cannot materially improve prediction by constructing more reliable tests of the cognitive abilities we already measure since professionally developed tests of human abilities appear to have reached a reliability asymptote (around .90) that has not changed in over 75 years of experience with standardized testing. But even if we could construct tests with reliabilities as high as .95, we would increase the predictive validity only marginally.
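The "only marginally" claim can be checked directly: under classical test theory, observed validity scales with the square root of the test's reliability (criterion reliability held fixed). A sketch with the essay's figures:

```python
from math import sqrt

def validity_at(rel_new: float, rel_old: float, r_observed: float) -> float:
    """Observed validity after raising test reliability from rel_old to
    rel_new, with the true validity and criterion reliability unchanged.
    Observed validity is proportional to sqrt(test reliability)."""
    return r_observed * sqrt(rel_new / rel_old)

# Raising reliability from .90 to .95 lifts an observed validity of .30
# to only about .308 -- a gain of less than .01.
print(round(validity_at(0.95, 0.90, 0.30), 3))  # -> 0.308
```

So even a heroic engineering effort on reliability buys almost nothing in prediction, which is the essay's point.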
If we want to increase our ability to predict who will and who will not succeed in college, on the job or in a profession, we will have to consider more than cognitive tests or tests of content knowledge and look instead to the myriad other factors that enter into the equation. A complex criterion (college grades, on-the-job performance) requires an equally complex set of predictors. Stated differently, performance that is a function of many abilities and attributes cannot be predicted well by instruments that assess a single construct.