Spoiler: I have no idea why NYSED made the decision to lift time limits on the state tests. I was not in the room. I’m a wild speculator, just like anyone else outside NYSED who is writing or talking about the changes in time limits.
Caveat: I live in the land of authentic, performance-based, portfolio assessment. I’m a large-scale testing tourist.
Bias: I believe that those who go to work for any state education department keep their humanity cards. I believe, like all of us, they are trying to make the best decisions possible with the information they have, within the constraints they see. So, when it comes to this decision, I believe they are trying to attend to Opt Out members’ concerns in ways that make sense and are doable given policy constraints.
So let’s look at testing time.
The first given we have to accept is that the goal of testing time limits is to provide students enough time to do their best but not so much time that the test takes over a student’s entire day. Most tests students take have a built-in time limit, such as one period or one class block. Since large-scale tests like NYS’ are given in 700 school districts, there is no pre-existing limit, so NYSED has to figure one out. Unlike the Regents exams, for which classes are paused for a week, the 3-8 tests happen during a school day. Thanks to the intertubes, there’s a paper trail we can follow to see the thinking behind the current limits for the 3-8 tests.
From the 2015 Test Administrator’s Guide
This is what 5th graders are expected to do with that time. From the Teacher’s Guide to the 5th Grade ELA Test:
So how is NYSED, or any test designer, supposed to know how much time is the right amount of time? There are a few things we can look to.
First, they estimate how long it should take students to take the test. From the 2014 NYS Testing Program Technical Report.
To review the math for 5th graders:
42 questions that are estimated to take one minute each: 42 minutes
6 passages that are estimated to take 5 minutes each: 30 minutes
42 + 30 = 72 minutes
(which leaves 18 minutes of “extra” time given the 90-minute limit for students *without* extended time)
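For anyone who wants to check the arithmetic or try other grade levels, here is a minimal sketch of the rule-of-thumb estimate. It assumes only the figures listed above (one minute per question, 5 minutes per passage, a 90-minute limit); the function names are mine and do not come from any NYSED document.

```python
# Rule-of-thumb estimate of testing time, using the figures from this post.
# The constants are the post's estimates, not official NYSED parameters.
MINUTES_PER_QUESTION = 1   # "one minute per multiple-choice item"
MINUTES_PER_PASSAGE = 5    # estimated reading time per passage
TIME_LIMIT_MINUTES = 90    # limit for students without extended time


def estimated_minutes(questions, passages):
    """Estimated minutes needed to read the passages and answer the questions."""
    return questions * MINUTES_PER_QUESTION + passages * MINUTES_PER_PASSAGE


def spare_minutes(questions, passages, limit=TIME_LIMIT_MINUTES):
    """Minutes left over once the estimated time is subtracted from the limit."""
    return limit - estimated_minutes(questions, passages)


# Grade 5 figures quoted above: 42 questions, 6 passages.
print(estimated_minutes(42, 6))
print(spare_minutes(42, 6))
```

Swapping in a different question or passage count shows how quickly the "extra" time shrinks as a booklet gets longer.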
These rules of thumb are referenced in several texts and guidance documents around test design. I found references going back to studies in 1973 that speak to “one-minute per multiple choice item.” All of that said, test designers need to find out if the “rule of thumb” actually holds. So they look at a statistic called speededness (Tech report, page 46):
The technical report concludes:
The industry standard general rule of thumb is that omit rates for multiple-choice items should be less than 5.0%. Omit rates across multiple-choice and constructed-response items on the Grades 3–8 Common Core ELA and Mathematics Tests typically ranged from 0% to 3%. As may be expected, omit rates tended to increase for items at the end of the test booklets, and only for ELA Grade 3 in Books 1 and 3 did items initially exceed that 5% threshold. It was for that reason that the last two operational items in Book 1 (both MC) and the last operational item in Book 3 (a 4-point CR item) were dropped from scoring and all analyses presented herein. In general, omit rates rarely exceeded 3%, even for the last items within a booklet. That is, these omit rates remained within the acceptable range for large-scale achievement tests. In summary, the low omit rates observed across entire forms are consistent with tests that are not speeded. [emphasis mine]
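The omit-rate check the report describes can be sketched in a few lines. This is only an illustration of the 5% rule of thumb quoted above: the response data is made up, the item names are hypothetical, and I am assuming a blank answer is recorded as `None`. The actual analysis in the technical report is surely more involved.

```python
# Flag items whose omit rate exceeds the 5% rule of thumb quoted above.
# All data below is invented for illustration; None stands for a blank answer.
OMIT_THRESHOLD = 0.05


def omit_rate(responses):
    """Fraction of students who left the item blank."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(r is None for r in responses) / len(responses)


def flag_possibly_speeded(responses_by_item, threshold=OMIT_THRESHOLD):
    """Return the names of items whose omit rate exceeds the threshold."""
    return [
        item
        for item, responses in responses_by_item.items()
        if omit_rate(responses) > threshold
    ]


# Two hypothetical end-of-booklet items, 100 students each.
sample = {
    "item_41": ["A"] * 97 + [None] * 3,   # 3% omitted
    "item_42": ["B"] * 92 + [None] * 8,   # 8% omitted
}
print(flag_possibly_speeded(sample))
```

In this made-up sample only the last item crosses the line, which mirrors the pattern the report describes: omits pile up at the end of a booklet.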
In other words: generally speaking, students weren’t rushed. They had enough time. When they didn’t, items were removed from students’ scores if there was a slight indication they were rushed. The statistics, though, present a different picture than the stories of stressed students told by those who presented at forums, wrote letters, or posted on social media. It’s these stories that I suspect informed SED’s decision. From the memo to the field: this change in policy may help alleviate the pressures that some students may experience as a result of taking an assessment they must complete during a limited amount of time.
There’s a great text on test blueprints that gets into a lot of this and, more importantly, speaks to how issues such as time limits are things that teachers need to consider when designing their own tests. So while SED’s announcement will likely generate conversation about large-scale tests and time limits, here’s hoping a little bit of that seeps over into conversations about teacher-created tests.
Postscript: I can’t find a source to cite at the moment, but I know I’ve read texts about the unintended consequences of untimed tests. I’m going to try and dig them up over the weekend for another post on this scintillating topic.
Second Postscript: The question came up around why 42 items – or why so many items in Book 1? That’s an issue of validity. Which could make for another post or textbook…
Third Postscript: So what about students with extended time? I’m trying to figure that out based on how our neighbor-ish over in Massachusetts handles it. Some tweets on it.