There are 11 million licensed drivers in NY State. Unless they took a Driver’s Ed course, these motor vehicle operators (5% of whom are between 16 and 19 years old) willingly, sometimes gleefully, subjected themselves to a high-stakes standardized test. Some did it more than once. Again and again, they went back to make sure they passed a standardized test that would have a dramatic impact on their personal freedom and daily lives. Despite the significant impact of this test, there are no real public outcries about the quality, validity, or reliability of the NYS Driver’s Test. In fact, the technical documents supporting the design and psychometrics of the permit test don’t appear to be publicly available. This lack of interest in the quality of the test may be because it’s short (20 questions) or because it’s followed by a road test scored by an assessor trained in the rules and basics of the road who gives the test taker immediate feedback on any mistakes. In either case, we accept the presence of a standardized test as part of the transition to responsible adulthood.
Americans have an odd relationship with standardized tests. We expect that the service providers we interact with, from cosmetologists to real estate agents, are duly licensed to do their jobs. We require that doctors, lawyers, teachers, and others who are members of a board or a profession meet certification criteria. In almost all of these cases, the license or the certification is only awarded after successfully passing one or more standardized tests. There are likely a variety of reasons why we’re comfortable with some standardized tests and not others: the age of the test taker, the degree to which the test taker wants or seeks to take the test, the degree to which we believe the test measures something important, the test taker’s ability to prepare for the test, how the results will be used, or a fear of test-taking. Some of the discomfort may come from the fact that our field suffers from what Popham (2004) calls “assessment illiteracy.” He goes so far as to claim, “the vast majority of educators reside in blissful ignorance” when it comes to understanding the design and nature of standardized tests. In 2004, he got 5 million hits from Google when he searched for “educational assessment.” In 2014, the same search returns 159 million, and the need for assessment literacy has never been higher.
While it is impossible to explain the complexity of large-scale assessment in a single column, I’d like to offer an invitation for readers to invest time in their own assessment literacy. There are several available resources that can provide a NY educator with a better understanding of standardized test design and a deeper understanding of what the NYS standardized tests are and are not.
The best starting point for learning about standardized testing is a document referenced in the APPR documentation. The 1999 Standards for Educational and Psychological Testing was published by the American Psychological Association (APA), American Educational Research Association (AERA), and the National Council on Measurement in Education (NCME) and provides the foundation for testing. Each section of the text defines a psychometric concept (e.g., validity, reliability, fairness) and sets out the limits of that concept. While not written to explain how testing works, it is the official source for understanding testing concepts. I also find myself going back to Popham’s ASCD book, “The Truth about Testing,” an overview of standardized testing that is free of the “noise” created by recent policy like APPR and RttT. Finally, membership in NCME costs $70 a year and provides access to numerous field-friendly and scholarly texts on standardized tests. A more practical resource for improving New York State-specific assessment literacy is the set of NY Testing Program Technical Reports. Like the Testing Standards, each section of a technical report explains a psychometric concept and presents the statistics for that concept from the test being reported.
Social media is awash with claims that the NYS Tests are unfair, invalid, or unreliable. These technical reports provide evidence for, or against, those claims. For example, several groups are claiming the tests are too long. A statistic known as “speededness” describes how many students left items blank at the end of the test, giving us an actual count of how many students finished and how many ran out of time. Additionally, NYSED has released items from the 2012-2013 and 2013-2014 assessments that include explicit, annotated alignment between the items’ demands and the CCLS. A thoughtful read of these resources can empower educators who wish to make claims against the misuse of standardized tests.
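For readers who like to see the arithmetic, here is a rough sketch of how a speededness-style figure could be computed from raw answer sheets. It is not the procedure used in the NYS technical reports: the function names, the sample data, and the rule that a student "ran out of time" if one or more items at the very end of the test are blank are all my own assumptions for illustration.

```python
from typing import Optional, Sequence


def trailing_blanks(responses: Sequence[Optional[str]]) -> int:
    """Count the unanswered items in a row at the END of one answer sheet.

    None stands for a blank item. Blanks in the middle of the test are
    ignored, because a skipped item is not evidence of running out of time.
    """
    count = 0
    for answer in reversed(responses):
        if answer is not None:
            break
        count += 1
    return count


def not_reached_rate(sheets: Sequence[Sequence[Optional[str]]],
                     min_blank: int = 1) -> float:
    """Share of students with at least `min_blank` blank items at the end.

    The threshold is an illustrative assumption, not an official cutoff.
    """
    if not sheets:
        return 0.0
    flagged = sum(1 for sheet in sheets if trailing_blanks(sheet) >= min_blank)
    return flagged / len(sheets)


# Hypothetical answer sheets for a four-item test.
sheets = [
    ["A", "B", "C", "D"],    # answered everything
    ["A", None, "C", "D"],   # skipped an item but reached the end
    ["A", "B", None, None],  # last two items blank
    ["A", "B", "C", None],   # last item blank
]

rate = not_reached_rate(sheets)
print(f"{rate:.0%} of students left items blank at the end of the test")
```

The one design choice worth noticing is that only blanks at the end of the test count. A student who skips a hard question in the middle and still answers the last one had enough time, so lumping all blanks together would overstate how rushed the test was.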
The use of standardized tests as the primary means to ascribe growth and attainment for students is highly problematic, and the problems have been documented extensively (Berliner and Glass, 2014). Seeking out information about standardized tests to improve assessment literacy isn’t a concession or an endorsement of over-testing or bad policy. It is important for educators to deepen their understanding of these tests by looking at SED annotations and reports, studies from the field of psychometrics, and long-form analysis rather than relying primarily on social media or impressionistic observations.
I named my column in NY ASCD, where this post was originally published, “Pushing at the Boundaries of Assessment” because we wanted to carve out space to poke at our understanding of what it means to capture evidence of student learning. It’s challenging, though, to push at boundaries if we don’t know where they are. These boundaries of standardized tests are defined by and for our profession. Members of our profession have the obligation to separate myth from truth, hyperbole from fact. It is by learning and understanding the research and thinking behind the tests that we can truly be prepared to lead with knowledge and to answer the difficult questions. It is good to question and push at the boundaries. It is responsible to be well informed, even if it isn’t our own particular area of expertise.