Accurately assessing standardized testing

One of the areas that I consider to be within my professional duties as a K-14 educator is keeping up with juvenile literature. This means that I sometimes have to and sometimes get to read children's books. In the 'have to' department, I have read all 11 installments of Jeff Kinney's Diary of a Wimpy Kid and I'm still searching for a redeeming quality within the main character, Greg Heffley, or his family. In the 'get to' department, I have also read everything Theodor Geisel, aka Dr. Seuss, ever wrote and my only regret is that he won't be writing any more.

Also in that latter department is the Theodore Boone series written by John Grisham. That Grisham, one of the most popular authors of adult fiction in the world, even writes juvenile literature would probably come as a surprise to many. Nevertheless, the main character is described as a "kid lawyer," and his reasonably credible adventures do qualify as at least 'legal thriller-lite.'

The most recent installment, the sixth, came out just this last May and was entitled The Scandal. It tells the story of a middle school in Boone's mythical town in which teachers cheat on standardized tests. The form that cheating takes is identical to cheating which has actually taken place in various parts of the country. Which is troubling, to say the least.

Recently, Jay Greene, an education professor at the University of Arkansas, wrote an equally troubling editorial about the use of standardized test scores for purposes of school accountability. Specifically, he noted that schools which manage to boost test scores, even in the absence of cheating of any kind, usually fail to see those improvements result in any sort of positive life outcomes such as increases in high school graduation rates, postsecondary school enrollment or job earnings. The boost to student achievement as measured by the test scores, in other words, seems isolated, having little or no effect on any other aspect of a student's life either then or later. Thus, unless you value the test score in and of itself, as a good thing on its own merits even if it produces no synergy, the gain is pointless.

Another way of looking at both the reality of a very small handful of educators cheating on tests and the apparent emptiness of improved test scores, however, is that they are both illustrations of Campbell's Law, which states: "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor." One example of Campbell's Law would be the polls in the recent election. As polls became the focus of campaigns, and were even used by campaigns to influence public opinion and how people would actually vote, their ability to accurately predict election outcomes, their original purpose, became corrupted.


(If Campbell's Law sounds familiar, that is because it is often described as the social science equivalent of the Heisenberg Uncertainty Principle, which is popularly taken to mean that the very act of observing certain physical elements changes the behavior of those elements. I'd say more about this if I understood it. I don't.)

In other words, the more attention we give and the higher the stakes we put on the outcomes of standardized testing, the more likely it is that those outcomes will no longer produce anything meaningful. As the stakes grow higher with standardized testing, the value of that testing falls. If, as Lord Acton said, "Power tends to corrupt; absolute power corrupts absolutely," might it also then be true that standardized testing devalues test scores; high-stakes standardized testing devalues them absolutely?

Perhaps. And, as I've tried to make clear, this doesn't necessarily require anyone to cheat. It doesn't because there are perfectly licit ways to improve test scores which don't actually improve any underlying academic ability.

One of the challenges of the most recent manifestation of standardized testing, the Smarter Balanced Assessment (SBA), is that it is no longer paper-and-pencil but technology-based. Students no longer pull out their newly sharpened #2 pencils to take the exams but plop down, instead, in front of a computer screen. How you take a test can affect your performance just as much as how well you know its content. Faced with the novelty of taking the test on a computer for the first time, many students, it is not unreasonable to think, did more poorly than they otherwise would have. At least for the first year. They'll likely be more familiar with the format the second time, and so test scores may go up a bit for that reason alone. Add to this the urgent work of many school administrators to provide more technology, and in particular technology that resembles the format of the SBA tests, and you can see how test scores might nudge up a bit through that method alone. And this is just one of any number of such techniques.

But notice that this greater facility with a particular technology or particular testing format is not likely to boost actual knowledge, skills, graduation rates, placement in postsecondary schools or later job earnings. Which is precisely what Jay Greene has noted in his review of the research literature. Add to this the reality of occasional or perhaps more than occasional cheating according to Dubner and Levitt in their Freakonomics work, as highlighted by Grisham, and standardized testing scores can indeed rise with no discernible positive effect on anything or anybody.

Which leaves us in the uncomfortable position of standardized testing accurately reflecting student learning but only so long as nobody pays too much attention to it. Just how you get your arms around that conundrum is a job beyond the wiles of Greene or Dubner/Levitt or even Heisenberg. Dr. Seuss, where are you when we need you?

