Yesterday, the University of Texas at Austin Meadows Center for Preventing Educational Risk released part one of the State of Texas Assessments of Academic Readiness (STAAR) readability study required by the 86th Legislature’s House Bill (HB) 3. The mandated study follows numerous reports that STAAR test items were above the reading level of students taking the test, such as this peer-reviewed study by Texas A&M University-Commerce researchers.
Here are the three main questions of the study and the answers gathered by its authors:
Question 1: Are the items on the 2019 STAAR tests (and the tests as a whole) aligned to grade-level Texas Essential Knowledge and Skills (TEKS)?
Answer: Most test items align with the TEKS they are precoded to match. Across all 17 tests analyzed, eight questions were not in alignment, meaning they did not adequately assess the standards they were meant to address. As for the tests as a whole, all questions were found to be in alignment with grade-level TEKS.
Question 2: Are the items on the tests at a grade-appropriate readability level?
Answer: Due to a lack of research in the area, the authors used several different methods to try to measure the readability of each test item. The methods produced different results, which meant that none of them was a reliable indicator of readability. Therefore, the study is inconclusive about the grade-level readability of test items and provides no further insight in this area. Because parents and advocates have expressed concerns about the readability of mathematics test items, this lack of findings is rather unsatisfying.
Question 3: Are the passages on the reading and writing tests at a grade-appropriate readability level?
Answer: The authors developed their own “test” to determine if a passage was grade-level appropriate in readability. To pass the test, each passage had to meet two out of three measures: sentence length and difficulty, syntactic simplicity or “syntax,” and vocabulary load or “narrativity.” For syntax and narrativity, the authors used a measure called “Coh-Metrix” that can be based on either English/Language Arts (ELA) norms or social studies norms, depending on the genre of the text.
While many passages met grade level for sentence length/difficulty and syntax, only 31% of passages fell within or below the specified grade band for narrativity when using the ELA norms. However, because each passage only had to meet two out of the three criteria, 86% of writing and reading passages were found to be grade-appropriate. Additionally, the authors note that the passages are more of the informational genre and thus could be evaluated using the social studies norms, which produce higher readability results, though scores on the narrativity index remain comparatively low.
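To see how a two-out-of-three rule can report a high overall pass rate even when one measure rarely passes, here is a minimal sketch. It is not the authors' code: the class and function names are hypothetical, and the four sample passages are made up purely for illustration, not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class PassageResult:
    """Pass/fail outcome on each of the three measures (hypothetical names)."""
    sentence_ok: bool     # sentence length and difficulty
    syntax_ok: bool       # syntactic simplicity ("syntax")
    narrativity_ok: bool  # vocabulary load ("narrativity")


def is_grade_appropriate(result: PassageResult, required: int = 2) -> bool:
    """Return True when at least `required` of the three measures are met."""
    met = sum([result.sentence_ok, result.syntax_ok, result.narrativity_ok])
    return met >= required


def share(flags: list) -> float:
    """Fraction of True values in a list of booleans."""
    return sum(flags) / len(flags)


# Made-up sample, for illustration only (not data from the study).
passages = [
    PassageResult(True, True, False),
    PassageResult(True, True, False),
    PassageResult(True, True, True),
    PassageResult(False, True, False),
]

overall_rate = share([is_grade_appropriate(p) for p in passages])
narrativity_rate = share([p.narrativity_ok for p in passages])

print(f"Pass two-of-three rule: {overall_rate:.0%}")
print(f"Pass narrativity alone: {narrativity_rate:.0%}")
```

In this invented sample, most passages clear the overall rule because sentence length and syntax carry them, while only one in four meets the narrativity measure. Raising `required` to 3 would make narrativity a hard requirement.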
This study leaves many questions unanswered. Is it acceptable that some test items are not correctly aligned to the TEKS? Are the STAAR test items, such as those on the mathematics tests, at the appropriate readability level? Is the two-out-of-three criterion valid when it allows narrativity to slip through the cracks? Is it good practice to allow the majority of a test to use vocabulary that is outside the scope of commonly used language for a particular grade level?
The second part of this legislatively mandated study should surface by February 1, 2020. Stay tuned to ATPE’s Teach the Vote as we track the implementation of this important provision in HB 3. Read more about changes to student testing that resulted from the 2019 legislative session here and here on our blog.