Neither higher, deeper, nor more rigorous: One-type-fits-all national tests, Part 1

May 21, 2016 by Richard Phelps

Some say that now is a wonderful time to be a psychometrician, a testing and measurement professional. There are jobs aplenty, with high pay and great benefits. Work is available in the private sector at test development firms; in recruiting, hiring, and placement for corporations; in public education agencies at all levels of government; in research and teaching at universities; in consulting; and in many other spots.

Moreover, there exist abundant opportunities to work with new, innovative, “cutting edge” methods, techniques, and technologies. The old, fuddy-duddy paper-and-pencil tests with their familiar multiple-choice, short-answer, and essay questions are being replaced by new-fangled computer-based, internet-connected tests with graphical interfaces and interactive test item formats.

In educational testing, the Common Core State Standards Initiative (CCSI) and its associated tests, developed by the Smarter Balanced Assessment Consortium (SBAC) and the Partnership for Assessment of Readiness for College and Careers (PARCC), have encouraged the movement toward “21st century assessments”. Much of the torrential rain of funding that has burst forth from federal and state governments and from clouds of wealthy foundations has pooled in the pockets of psychometricians.

At the same time, however, the country’s most authoritative psychometricians—the very people who would otherwise have been available to guide, or caution against, the transition to the newer standards and tests—have been co-opted. In one fashion or another, they now work for the CCSI. Some work for the SBAC or PARCC consortia directly, some work for one or more of the many test development firms hired by the consortia, and some help the CCSI in other capacities. Likely, they have all signed confidentiality agreements (i.e., “gag orders”).

Psychometricians who once had been very active in online chat rooms and other open discussion forums on assessment policy no longer are, except to proffer canned promotions for the CCSI entities they now work for. They are being paid well. They may be doing work they find new, interesting, and exciting. But, with their loss of independence, society has lost perspective.

Perhaps the easiest vantage point from which to see this loss of perspective is in the decline of adherence to test development quality standards, those that prescribe the behavior of testing and measurement professionals themselves.[1] Over the past decade, for example, the International Test Commission (ITC) alone has developed several sets of standards.

Perhaps the oldest set of test quality standards was established originally by the American Psychological Association (APA) and was updated most recently in 2014—the Standards for Educational and Psychological Testing (AERA, APA, & NCME, 2014). It contains hundreds of individual standards. The CCSI as a whole, and the SBAC and PARCC tests in particular, fail to meet many of them.

The problem starts with what many professionals consider the testing field’s “prime directive”—Standard 1.0 (AERA, APA, & NCME, 2014, p. 23). It reads as follows:

“Clear articulation of each intended test score interpretation for a specified use should be set forth, and appropriate validity evidence in support of each intended interpretation should be provided.”

That is, a test should be validated for each purpose for which it is intended to be used before it is used for that purpose. Before it is used to make important decisions. And, before it is advertised as serving that purpose.

Just as states were required by the Race to the Top competition for federal funds to accept Common Core standards before they had even been written, CCSI proponents have boasted about their new consortium tests’ wonderful benefits since before test development even began. They claimed unproven qualities for then-nonexistent tests because most CCSI proponents do not understand testing, or they are paid not to understand.

In two fundamental respects, the PARCC and SBAC tests will never match their boosters’ claims nor meet basic accepted test development standards. First, single tests are promised to measure readiness for too many and too disparate outcomes—college and careers—that is, all possible futures. It is implied that PARCC and SBAC will predict success in art, science, plumbing, nursing, carpentry, politics, law enforcement, …any future one might wish for.

This is not how it is done in educational systems that manage multiple career pathways well. There, in Germany, Switzerland, Japan, Korea, and, unfortunately, only a few jurisdictions in the U.S., a range of different types of tests is administered, each appropriately designed for its target profession. Aspiring plumbers take plumbing tests. Aspiring medical workers take medical tests. And those who wish to prepare for more advanced degrees might take more general tests that predict their aptitude to succeed in higher education institutions.

But that isn’t all. SBAC and PARCC are said to be aligned to the K-12 Common Core standards, too. That is, they both summarize mastery of past learning and predict future success. One test purports to measure how well students have done in high school and how well they will do in either the workplace or college: three distinctly different environments and two distinctly different time periods.

PARCC and SBAC are being sold as replacements for state high school exit exams, for 4-year college admission tests (e.g., the SAT and ACT), for community college placement tests (e.g., COMPASS and ACCUPLACER), and for vocational aptitude tests (e.g., the ASVAB). Problem is, these are very different types of tests. High school exit exams are generally not designed to measure readiness for future activity but, rather, to measure how well students have learned what they were taught in elementary and secondary schools. We have high school exit exams because citizens believe it important for their children to have learned what is taught there. Learning Civics well in high school, for example, may not correlate highly with how well a student does in college or career, but many nonetheless consider it important for our republic that its citizens learn the topic.

High school exit exams are validated by their alignment with the high school curriculum, or content standards. By contrast, admission or aptitude tests are validated by their correlation with desired future outcomes—grades, persistence, productivity, and the like in college—that is, by their predictive validity. In their pure, optimal forms, a high school exit exam, a college admission test, and a vocational aptitude test bear only a slight resemblance to one another. They are different tests because they have different purposes and, consequently, require different validations.
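To make the distinction concrete, here is a minimal sketch, in Python with invented numbers (none of this comes from any actual test or dataset), of how predictive validity is typically summarized: as the correlation between test scores and a later outcome, such as first-year college grades. Content validation, by contrast, uses no outcome data at all; it rests on expert judgments that test items match the curriculum.

```python
# Illustrative sketch only: predictive validity is commonly summarized
# as the Pearson correlation between test scores and a later outcome.
# The scores and GPAs below are invented for demonstration.

from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical admission-test scores and the same students' first-year GPAs
scores = [1050, 1190, 1280, 1360, 1420, 1510]
gpas = [2.4, 2.9, 2.7, 3.3, 3.5, 3.8]

print(f"predictive validity (r) = {pearson_r(scores, gpas):.2f}")
```

A high correlation would support using a test to predict college performance; it would say nothing about whether that test covers the high school curriculum. That is the crux: one statistic cannot validate both uses.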

Reference

American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME). (2014). Standards for Educational and Psychological Testing. Washington, DC: AERA.

[1] (e.g., the International Test Commission’s (ITC) Guidelines for Test Use; the ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores; the ITC Guidelines on the Security of Tests, Examinations, and Other Assessments; the ITC’s International Guidelines on Computer-Based and Internet-Delivered Testing; the European Federation of Psychologists’ Associations (EFPA) Test Review Model; and the Standards of the Joint Committee on Testing Practices)

