## The Think-Aloud Protocol: A High Yield/Low Stakes Assessment

### October 25, 2007

A verbal or “think-aloud” protocol is a transcribed record of a person’s verbalizations of her thinking while attempting to solve a problem or perform a task. In their classic book, *Verbal Reports as Data*, Ericsson & Simon liken the verbal protocol to observing a dolphin at sea. Because he occasionally goes under water, we see the dolphin only intermittently, not continuously. We must therefore infer his entire path from those times we do see him. A student’s verbalizations during problem solving are surface accounts of her thinking. There are no doubt “under water” periods that we cannot observe and record; but with experience, the analysis of students’ verbalizations while trying to perform a task or solve a problem offers powerful insights into their thinking.

The following problem is an item from a retired form of the SAT-Math test:

If X is an odd number, what is the sum of the next two odd numbers greater than 3X + 1?

(a) 6X + 8

(b) 6X + 6

(c) 6X + 5

(d) 6X + 4

(e) 6X + 3

Fewer than half of SAT test takers answered this item correctly, and the percentage who actually solved it is no doubt smaller, since some students guessed the correct alternative. To solve the problem, the student must reason as follows:

If *X* is an odd number, 3*X* is also odd, and 3*X* + 1 must be even. The next odd number greater than 3*X* + 1 is therefore 3*X* + 2. The next odd number after that is 3*X* + 4. The sum of these two numbers is 6*X* + 6, so the correct answer is option (b).
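The parity argument can also be checked by brute force. The following is a minimal Python sketch, not part of the original item or study: the function names and the sample range of odd values are assumptions made for illustration. It computes the sum directly for many odd values of *X* and reports which of the five answer options agrees every time.

```python
def next_two_odds_after(n):
    """Return the two smallest odd integers strictly greater than n."""
    first = n + 1 if n % 2 == 0 else n + 2
    return first, first + 2


def sum_next_two_odds(x):
    """Sum of the next two odd numbers greater than 3x + 1, for odd x."""
    if x % 2 == 0:
        raise ValueError("x must be odd")
    first, second = next_two_odds_after(3 * x + 1)
    return first + second


# The five answer options from the item, written as functions of x.
OPTIONS = {
    "a": lambda x: 6 * x + 8,
    "b": lambda x: 6 * x + 6,
    "c": lambda x: 6 * x + 5,
    "d": lambda x: 6 * x + 4,
    "e": lambda x: 6 * x + 3,
}


def matching_options(xs):
    """Labels of the options that equal the brute-force sum for every x in xs."""
    xs = list(xs)
    return [
        label
        for label, formula in OPTIONS.items()
        if all(formula(x) == sum_next_two_odds(x) for x in xs)
    ]


if __name__ == "__main__":
    odd_values = range(-99, 100, 2)  # assumed sample of odd integers
    print(matching_options(odd_values))  # prints ['b']
```

A check like this confirms the answer but says nothing about how a student arrives at it, which is the gap the think-aloud protocol is meant to fill.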

The only knowledge required to solve the problem is awareness of the difference between odd and even integers and the rules for simple algebraic addition. Any student who has had an introductory course in Algebra possesses this knowledge, yet more than half could not solve the problem.

Test development companies have data on thousands of such problems from many thousands of students. But for each exercise, the data are restricted to a simple count of the number of students who chose each of the five alternatives. Such data can tell us precious little about how students go about solving such problems or the many misconceptions they carry around in their heads about a given problem’s essential structure. Perhaps more than any other tool in an instructor’s armamentarium, the think-aloud protocol is the prototypical high yield/low stakes assessment.

In a series of studies I conducted some time ago at the University of Pittsburgh, I was interested in why so many students perform well in high school algebra and geometry, but poorly on the SAT-Math test. The performance pattern of *high grades/low test scores* is an extremely common one, and the reasons underlying the pattern are many and varied. The 28 students that I studied allowed me to record their verbalizations as they attempted to solve selected math items taken from retired forms of the SAT. All of the students had obtained at least a “B” in both Algebra I and Geometry. Here is the protocol of one student (we will call him R) attempting to solve the above problem. In the transcription, S and E represent the student and the experimenter, respectively.

1. S: If *X* is an odd number, what is the sum of the next two odd numbers greater than 3*X* plus 1?

2. (silence)

3. E: What are you thinking about?

4. S: Well, I’m trying to reason out this problem. Uh, ok I was . . . If *X* is an odd number, what is the sum of the next two odd numbers greater than 3*X* plus one? So . . . I don’t know, let’s see.

5. (long silence)

6. S: I need some help here.

7. E: Ok, hint: If *X* is an odd number, is 3*X* even or odd?

8. S: Odd.

9. E: OK. Is 3*X* plus 1 even or odd?

10. S: Even.

11. E: Now, does that help you?

12. S: Yeah. (long silence)

13. E: Repeat what you know.

14. S: Uh, let’s see . . . uh, 3*X* is odd, 3*X* plus 1 is . . . even.

15. (long silence)

16. E: What is the next odd number greater than 3*X* plus 1?

17. S: Three? Put in three for *X* . . . and add it. So it would be 10?

18. E: Well, we’ve established that 3*X* plus 1 is even, right?

19. S: Yeah.

20. E: Now, what is the next odd number greater than that?

21. S: Five?

22. E: Well, *X* can be ANY odd number, 7 say. So if 3*X* plus 1 is even, what is the next odd number greater than 3*X* plus 1?

23. S: I don’t know.

24. E: How about 3*X* plus 2?

25. S: Oh, oh. Aw, man.

26. (mutual laughter)

27. S: I was trying to figure out this 3*X* . . . I see it now.

28. E: So what’s the next odd number after 3*X* plus 2?

29. S: 3*X* plus 3.

30. E: The next ODD number.

31. S: Next ODD number? Oh, oh. You skip that number . . . 3*X* plus 4. So let’s see . . .

32. (long silence)

33. E: Read the question.

34. S: (inaudible) Oh, you add. Let’s . . . It’s b. It’s 6*X* plus 6. Aw, man.

Two points are readily apparent from this protocol. The first is that R could not generate on his own a goal structure for the problem. Yet, when prompted, he provided the correct answers to all relevant subproblems. Second, R does not appear to apprehend the very structure of the problem. This is so despite the fact that the generic character of the correct answer (that is, in the expression 6*X* + 6, *X* may be *any* odd number) can be deduced from the answer set. R tended to represent the problem internally as a problem with a specific rather than a general solution. Hence, in responding (line 17) to my query with a specific number (i.e., 10), he was apparently substituting the specific odd number 3 into the equation 3*X* + 1. (Even here the student misunderstood the question and simply gave the next integer after 3*X*, rather than responding with 11, the correct specific answer to the question. This was obviously a simple misunderstanding that was later corrected.) The tendency to respond to the queries with specific numeric answers rather than an algebraic expression in terms of *X* was common. A sizable plurality of the 28 students gave specific, numeric answers to this same query.
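To make the specific-versus-general distinction concrete, here is a small Python sketch that walks the subgoals for one chosen value of *X*. The helper name `trace_subgoals` and its output labels are hypothetical, introduced only for illustration. With *X* = 3, it shows that R's answer of 10 is the value of 3*X* + 1 itself, while 11 is the number the query actually asked for.

```python
def trace_subgoals(x):
    """Walk the item's subgoals for one specific odd x (hypothetical helper)."""
    if x % 2 == 0:
        raise ValueError("x must be odd")
    base = 3 * x + 1        # even, because 3x is odd when x is odd
    first_odd = base + 1    # next odd number greater than 3x + 1
    second_odd = base + 3   # the odd number after that
    return {
        "3x + 1": base,
        "first odd": first_odd,
        "second odd": second_odd,
        "sum": first_odd + second_odd,
        "6x + 6": 6 * x + 6,  # the general answer, evaluated at this x
    }


if __name__ == "__main__":
    print(trace_subgoals(3))
    # {'3x + 1': 10, 'first odd': 11, 'second odd': 13, 'sum': 24, '6x + 6': 24}
```

A single substitution yields one number, not an answer option. Only the last two entries, which agree for every odd *X*, connect the specific case to the general expression.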

This protocol is also typical in its overall structure. Students were generally unable to generate on their own the series of sub-goals that lead to a correct solution. But they experienced little difficulty in responding correctly to each question posed by the experimenter. The inability to generate an appropriate plan of action and system of sub-goals, coupled with the ability to answer correctly all sub-questions necessary for the correct solution, characterizes the majority of protocols for these students.

Compare the above with the following protocol of one of only two students in the sample who obtained “A’s” in both Algebra and Geometry and who scored above 600 on the SAT-Math. The protocols for the above problem produced by these two students were virtually identical.

(Reads question; rereads question)

S: Let’s see. If *X* is odd, then 3*X* must be . . . odd. And plus 1 must be even. Is that right? Yeah . . . So . . . what is the sum of the next two odd . . . so that’s . . . 3*X* plus 2 and . . . 3*X* plus . . . 4. So . . . you add. It’s b, it’s 6*X* plus 6. (Total time: 52 seconds)

As a general rule, problems like the one above (that is, problems that require relevant, organized knowledge in long-term memory and a set of readily available routines that can be quickly searched during problem solving) presented extreme difficulties for the majority of the students. For many of these students, subproblems requiring simple arithmetic and algebraic routines such as the manipulation of fractions and exponents represented major, time-consuming digressions. In the vernacular of cognitive psychologists, the procedures were never routinized or “automated.” The net effect was that much solution time and in fact much of the students’ working memory were consumed in solving routine intermediate problems, so much so that they often lost track of where they were in the problem.

A careful analysis of these protocols, coupled with observations of algebra classes in the school these students attended, led me to conclude that their difficulties were traceable to how knowledge was initially acquired and stored in long-term memory. The knowledge they acquired about algebra and geometry was largely inert, and was stored in memory as an unconnected and unintegrated list of facts that were largely unavailable during problem solving.

The above insights into student thinking could not have been made from an examination of responses to multiple-choice questions, nor even from responses to open-ended questions where the student is required to “show your work.” For, as any teacher will attest, such instructions often elicit unconnected and undecipherable scribbles that are impossible to follow.

For instructors who have never attempted this powerful assessment technique, the initial foray into verbal protocol analysis may be labor intensive and time-consuming. For students, verbalizing their thoughts during problem solving will be distracting at first, but after several practice problems they quickly catch on and the verbalizations come naturally with fewer and fewer extended silences.

In many circumstances, the verbal protocol may well be the only reliable road into a student’s thinking. It is unquestionably a high yield, low stakes road. I invite teachers to take the drive. They will almost certainly encounter bumps along the way, and a detour or two. But the scenery will intrigue and surprise. Occasionally it will even delight and inspire.