Feb 24, 2017

The psychometrics of simplicity

I am not a psychometrician, so my friends who actually are will probably laugh at me. It’s OK, bring it on. Lack of expertise has never stopped anyone from expressing an opinion. I just want to make a case for simple instruments against complex instruments, in the context of teacher preparation.

Please take a look at an observation form I helped design with my colleagues at the University of Northern Colorado. It was years ago, and I still like it. And take another look at the nine-page, 32-item form that the COE at Sac State currently uses. It is very good, clearly worked on for years, but it is still too long. And just for kicks, here is the 77-page document describing the Danielson framework, perhaps the most dominant teacher evaluation platform in the country.

The longer, more detailed rubrics are, in theory, more reliable. They do not just name a domain, but contain specific behaviors or other observable indicators that are associated with skills or competencies. Students either answer questions, or not. A teacher either has stated learning objectives or not, etc. The short rubrics I like tend to be holistic, more subjective, and more difficult to justify. 

However, the context of use is everything. Those are not laboratory instruments. Supervisors and cooperating teachers use them in the field, where they observe someone’s lesson. These are situations where you have to keep your eyes wide open for tiny nuances of interaction, and at the same time you have to go through a long checklist. It’s very basic: the observer runs out of brain resources.

What we have noticed for ages is that data coming from longer rubrics tends to be flat and uninteresting. If you have a four-point scale, everyone will be at about 3 in the beginning, and 3.5 at the end of student teaching. That is because human beings are unable to make multiple evaluative decisions over short periods of time. Supervisors and cooperating teachers tend to make up their minds holistically about the way someone is teaching, and then simply justify their overall impression through the rubric. Most of them are experienced, wise people. What sets an expert apart from a novice is exactly the ability to make non-analytical, synthetic, holistic judgments. One can debate whether their image of good teaching is accurate, but that is how they form opinions. Novices go through checklists, because they are not yet able to synthesize quickly. So we force experts to behave like novices.

It is often done on the premise that an evaluation rubric is a pedagogical instrument, intended to remind pre-service teachers of what is important. But I am not sure the argument works. We should encourage our novice teachers to develop the ability to think holistically, to synthesize knowledge. The checklists create the false impression that if you only do all those things, you will teach well. Well, either the checklist has to be a hundred pages long, or it should not exist. There are just too many possible indicators. An isolated action does not have meaning outside of the relational context. Yes, as a rule one should not give a long lecture to sixth graders. But man, I have seen such brilliant exceptions. A hostile classroom atmosphere is not always the fault of the teacher, and therefore not a reflection on his or her skills. Etc., etc., etc. For a hundred-page checklist we can provide a thousand-page list of exceptions.

Another consideration is economic. For something like a Danielson-inspired instrument to work, one needs significant resources committed to constant training and retraining of evaluators to ensure inter-rater reliability. If you have a large teacher preparation program like ours, that is almost impossible to do. Supervisors are many, and they change often; cooperating teachers are a multitude, and they are busy and change constantly. Whatever precious resources we have are better spent on PD at higher levels, for example, on co-teaching models or on cognitive coaching. Training evaluators to use the rubrics correctly feels like a waste of time.
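For readers who have not met the term, here is a minimal sketch of one common way to put a number on inter-rater agreement, Cohen's kappa, written in Python. The function name and the scores are invented for illustration; none of this comes from our actual observation data, and a real program would likely rely on an established statistics package instead.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items.

    Kappa compares the observed share of identical scores with the
    share one would expect by chance, given how often each rater
    uses each score.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Share of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's own score frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters gave one identical score to every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical scores on a four-point scale for eight observed lessons.
supervisor = [3, 3, 4, 3, 2, 3, 4, 3]
cooperating_teacher = [3, 4, 3, 3, 3, 3, 4, 3]

kappa = cohen_kappa(supervisor, cooperating_teacher)
print(round(kappa, 2))
```

In this made-up example the two raters give the same score on five lessons out of eight, yet kappa comes out at roughly 0.2. When nearly everyone gets a 3, most of the agreement is what chance alone would produce, which is one way to see why flat rubric data tells us so little.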

With a short holistic rubric, we embrace the strength of holistic assessment and avoid the negative side of indicator-rich instruments. One can easily keep in mind the four or five main domains, and give an honest expert opinion on how a teacher candidate is doing. The shorter rubrics also leave more time for qualitative feedback, which is always more important. You have the time to write “pay attention to how you move around the classroom; it may be distracting the children” or “some children did not understand the assignment,” or something like this, because you don’t have to run through 45 indicators. With long rubrics, we also do not observe a good number of items at all, because they are not all evident in every lesson. But we feel compelled to enter some random number, so the cell is not empty.



In my opinion, it is much better to have good subjective data than poor objective data. This is especially true because the indicator-based, objective, and detailed rubrics are not really validated by research, contrary to what Danielson and others claim. In other words, we do not really know that if a student teacher has written, for example, the unit learning outcomes as “related to ‘big ideas’ of the discipline,” it will really help kids learn. We may have a professional consensus about it, but we do not know it for a fact. The studies on value-added measures of teaching effectiveness are in their infancy. And even theoretically we are unable to disaggregate teacher behavior into small indicators to show the relative weight of, say, communication style vs. mastery of material vs. careful planning of instruction. Underneath all the sophistication is the same gut feeling that we acquire with experience. OK, it is a collective gut feeling, but professionals have been known to be wrong collectively. Just to remind the hard-line psychometricians: the semantic hypothesis is still a hypothesis.

Another practical consideration in favor of short holistic rubrics is this: teacher preparation programs do not have time to look at all the data we generate. The more items you have, the more work it takes to process and interpret the data. The fewer the data points, the easier they are to comprehend. Data usually supports or contradicts suspicions we already have. It cannot do much more with the technologies we have today. When we develop AI, the neural network technology, let’s talk again. For now, we may be better off admitting that we use very limited data collection techniques, and that our dreams of a data-informed continuous improvement process may be a bit premature. So we need to bring the expectations to where the technology is; otherwise we produce a lot of needless work and unprocessed data.

Feb 19, 2017

Taking stock of the good things

I spent some time last week working on several student complaints, and some of my colleagues felt sorry for me for doing this in my first couple of weeks on the job. I am thankful, but student complaints are unique learning opportunities that present a sharply focused view of the organization’s culture. Dealing with the unhappy ones presents the view from behind, so to speak. Just as an army platoon’s speed is the speed of its last straggling soldier, the least happy students show what is possible, what works and what does not. I came out of these situations in high spirits. The system definitely works, and it works very well overall. Students are given second, third, and fourth chances; they are treated fairly; the expectations remain high; and the rules are flexible enough to accommodate the diverse student body. The errors we make are minor and not systemic. Faculty and administrators spend a lot of time on individual students’ problems, and the solutions are reasonable. I feel really good about the College, and its faculty and staff.

Consider the phenomenon of the general complaint – the discourse of dissatisfaction that permeates any human society. The intensity of the general complaint is not related to the actual health of the organization. For example, at university A people are nonchalant about half of all students skipping any given class. At university B people are greatly upset about a discrepancy between a syllabus and a handbook. If you judge by the level of complaint alone, A looks better than B, while in fact B is light years ahead of A. This is why what matters is not how much people complain, but what they complain about. There is a Russian proverb: “For some, pears are too small, for others – borscht is too thin.” I am sure there is an English equivalent, but I cannot think of one right now. I hope it makes sense that the three student complaint cases made me rather happy. The College has figured out most of the structural problems that exist at any place like this. In professional preparation, we sometimes have to tell people “No, you cannot go forward,” which cannot please. That is inevitable; how we deal with it can vary greatly. There are also inherent issues with field placements and communication with field mentors. All colleges of our size have those, and such tensions have purely economic underpinnings. Yet some deal with them with more grace than others. We do a good job.

Of course, I am not blind and see the shortcomings, the bottlenecks, the weak spots. In fact, people tend to focus on the problems, because they are so immediate and pressing. We all get quickly used to the good things, which is why I value the “new eye” experience so much. To move forward, it is extremely important to stay aware of the tremendous achievements you have. To move a ship, one needs to take stock of the whole thing: the beauty, the structural integrity, the integrative characteristics. One cannot just focus on the leaks and the rust spots. So, here are just some of the good things to appreciate: We live in one of the most affluent, advanced, democratic, and diverse parts of the world. Despite occasional budget cuts, public colleges still enjoy the support of the public. We have thousands of successful, well-connected alums. We help teachers, principals, counsellors, school psychologists – they provide the backbone of today’s economy, and these professions are not in danger of being outsourced or replaced by robots. So we own the future. We have a nice campus, with beautiful trees, modern technology and many amenities. I am not going to panic about the policy manual being outdated, or the committee structure being imperfect, or the assessment system being too cumbersome. Those may be annoying, but they are objectively small problems. We will plug all the holes as we go, no big deal. We have bigger fish to fry, and I am happy to report that it looks like we’re totally ready for it.

Feb 13, 2017

Searching for a vision

Some may believe that politics is the most human of all arts, but they are wrong. Primates do a lot of politicking; they form coalitions and orchestrate coups d’état. Those behaviors are normal, and they probably become more intense when resources are scarce (let’s say, drought or budget cuts). However, faced with a common external threat, chimps tend to act as a unified force.

Academia is one of many primate habitats, and every single institution I know has its share of internal politics, which by necessity includes factions. After all, one cannot advance one’s interests and agenda without friends. So, coalitions naturally form and engage in various levels of competition. The acceptable level of factional struggle is one that does not take away too much time and effort from doing the work for our students, and from moving forward as an institution. I know this is a vague definition, but it does the job. We have a given set of time and intellectual resources. How they are expended matters. If too much is dedicated to internal politics, the task of development is threatened. In the most severe cases, even the routine maintenance of operations could suffer, but that is rare.

A new dean’s worst mistake is to get immediately entangled in the micro-politics by simply joining one of the coalitions, and letting it become the sole source of support and information. And let’s keep in mind, such a move is very tempting, for if you want to advance any kind of agenda, you should have people to rely on. So it works in the short run, but in the long run, the move is self-defeating, for it leaves the structural arrangements unchanged. Machiavelli had plenty to say about that. In his time, excessive internal struggle meant losing wars to neighbors. In our times, it means stagnation, and eventual loss of competitiveness. Authority based on trying to be objective and even-handed, fair to all factions, is slower to build, but ultimately it is more stable and productive. The first rule of conduct – I cannot belong to any faction, but will try to listen to and understand everyone. The basic English common law principle is to hear both sides, while trying to be impartial. It is perhaps one of the best ideas ever; it was designed to contain our natural tendency to color all information depending on whether it comes from a friend or from a foe.

A dean’s conduct is important, but not critical. The most important thing is to have a common vision, a big goal for all of us. If we get it, the micro-politics will be kept in check; they will never become destructive, or take too much of anyone’s time. They will still exist, but can even be a positive force where all factions find niches, and compete on how much they contribute to the common good. That is not just theory; I have seen this happen, and it works. We don’t all have to be friends. In fact, I find the utopian images of human brotherhood dangerous and ultimately destructive. I like the pragmatic, good-enough, yet inspired communities. We tend to share the same values, and that is 80% of the way to flourishing.

To convert shared values into a vision is not an easy or fast project. There are some fascinating things to know about how visions cannot be too precise, and have to stay a little blurred. I even wrote a paper about that a couple of years ago. The term “vision” evokes the use of visual imagery; it cannot be limited to words. It is a mental picture of the not-so-distant future where we all want to be.