Mar 3, 2013

The Value-Added Scandal

A scandal broke out at the 2013 AACTE conference. A session brought together several proponents of the value-added model for evaluating the effectiveness of teacher preparation programs from Louisiana, Ohio, and Connecticut. After their upbeat reports on how their respective states implement the model, Cory Koedel, a respected researcher, reported that all of their models rest on an incorrect interpretation of the statistics. Judge the quality of his evidence for yourself in an earlier paper on the same subject. Previous studies contained a significant clustering error; to prove it, Koedel and his co-authors employed a clever technique: they assigned teachers to teacher preparation programs at random and still found what looked like statistically significant “differences” between the purely imaginary programs. Random assignment should have shown no differences at all, so the fact that it did exposes the clustering error. After adjusting for the error, they re-ran the analysis for the real programs and found virtually no statistical differences among them. That may be because we still do not have large enough samples, because teacher preparation programs are similarly effective, or because other factors, such as school leadership or culture, matter more than we thought. There may very well be differences among teacher preparation programs, but the test scores of their graduates’ students are not a good way of measuring them. The variance within each program is far more important than the variance between programs.
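The falsification experiment is easy to reproduce in miniature. The sketch below is only an illustration of the general idea, not Koedel’s actual model or data: it simulates students clustered under teachers, labels the teachers with purely imaginary “programs” at random, and then runs a naive student-level F-test that ignores the clustering. Because students of the same teacher share that teacher’s effect, the naive test “finds” program differences far more often than the nominal 5% false-positive rate.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

def naive_program_test(n_programs=5, teachers_per_program=20,
                       students_per_teacher=25,
                       teacher_sd=1.0, student_sd=1.0):
    """One replication: teachers are randomly labeled with programs, and
    student scores depend only on teacher effects (no true program effect).
    Returns the p-value of a student-level F-test that ignores clustering."""
    groups = []
    for _ in range(n_programs):
        # Each teacher contributes a random effect shared by all their students,
        # which makes scores within a classroom correlated.
        teacher_effects = rng.normal(0, teacher_sd, teachers_per_program)
        scores = (np.repeat(teacher_effects, students_per_teacher)
                  + rng.normal(0, student_sd,
                               teachers_per_program * students_per_teacher))
        groups.append(scores)
    return f_oneway(*groups).pvalue

# Under the null (random program labels) a valid 5%-level test should reject
# about 5% of the time; the clustered errors make the naive test reject
# far more often than that.
rejections = sum(naive_program_test() < 0.05 for _ in range(500))
print(f"naive rejection rate: {rejections / 500:.2f}")
```

Correcting the error amounts to acknowledging that the effective sample size is closer to the number of teachers than to the number of students, for example by testing at the teacher level or using cluster-robust standard errors.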

The scandal is not the error itself, but the fact that we now have an entire federal policy built on it. Both Race to the Top and the NCLB waivers regime push states to develop teacher preparation evaluation systems based on the value-added model. Many states, including Rhode Island, have it as at least a stated goal. Louisiana has already ranked its teacher preparation programs, and may have closed some, based on the same error. Of course, Koedel’s discovery came after RttT became fact, but the question remains: how can the federal government implement such a significant policy on a national scale without proper piloting and without rigorous analysis of the underlying statistical methods? I don’t think the individual states can be blamed for adopting the idea; after all, the US Department of Education demands it – and puts its name and reputation behind it.

The next step would be to re-analyze Louisiana’s and other states’ data to see whether correcting the error produces different results. I don’t believe it can, even in theory, because of Koedel’s falsification experiment. It is just too darn compelling. But what is next policy-wise? The emperor has been shown to have no clothes. Will the policy change, or will the Department keep pretending that nothing has happened?

1 comment:

  1. Exactly the same thing happened in the 1960s when the USOE conducted the historically significant First-Grade Studies. After comparing the effects of 27 different methodologies for teaching reading, the researchers found more variance WITHIN an approach than they did BETWEEN approaches. In other words, teachers make the difference, not methodological approaches.

    Will the DOE keep pretending? Of course. Politics always trumps pedagogy.

    Bob Rude
    M.Ed. in Reading Program Director