In every complex human profession, the preparation program is very broad. Teachers need to have hundreds of skills, big and small, and to know thousands of things. Therefore, professional organizations create standards that are laundry lists of the skills we designate as important. For example, the California Teacher Performance Indicators is a 14-page document with 6 domains and some 45 items just in general pedagogy, plus who knows how many subject-specific items. An element can look like this: “Maintain high expectations for learning with appropriate support for the full range of students in the classroom.” Or take the NASP Standards for the Credentialing of School Psychologists; it looks similar. Both organizations (and countless others) have tried to reduce the number of indicators for practical reasons, but in doing so they made the items compound. For example, a school psychologist has to “in collaboration with others, demonstrate skills to promote services that enhance learning, mental health, safety, and physical wellbeing through protective and adaptive factors and to implement effective crisis preparation, response, and recovery.” To figure out whether a student actually meets this standard, you need to observe his or her collaboration with others, not in general, but in the very specific act of promoting services that enhance learning. Then the same thing for promoting services that enhance mental health. Then observe a completely different act of promoting safety, again in collaboration with someone. So there are literally hundreds of actual indicators you need not only to observe, but to observe long enough to gauge the level of sophistication. In other words, proving attainment of these standards is physically impossible. I understand the intent of the standard writers: you don’t want to leave gaping holes in preparation. Moreover, you do not want to be accused of being less rigorous than the last iteration. However, I think they should start considering the actual lives of the documents they create, including the unintended consequences. Something is wrong with the very premise of standardizing complex professions. Weren’t we supposed to learn this from the epic failure of Frederick Taylor’s “scientific management”?
Yet we are where we are. What is an accredited program to do? Well, we design evaluation forms, mostly used at the end of the program, in which we try to “cover” the bulk of the standard items. Our student teaching supervisor observes students six times per semester, for less than an hour each time. That is actually much higher than the national average of about 3-4 observations. School Psychology does even more intensive observation, and so do Counseling, Leadership, and other professional programs. Still, what is the chance that one would see anything showing that the student teacher “Participates in school, district, or professional academic community learning events; uses professional learning to support student learning”? What if there were no community learning events? We did not actually observe it; it is all secondary information. And yet we have to meet every standard, or lose accreditation. In some fields, the loss of accreditation is a death warrant for a program.
To cope, we pretend. Instead of actually observing some behavior, we hope that someone, somewhere, in one of the classes, possibly said something about professional learning communities. Of course, as the supervisor, I never observed that one little thing, but I will check the little box of hope. Yes, we have evaluation forms with 40 or 99 items, of which I can really speak to only about 10 with certainty. The forms should be more holistic (which is exactly what our colleagues made them), while still retaining plausible coverage of the standard items. Time is precious, and even if I did not observe all the items, I will check them all, for we are required to collect the assessment data. As a result, the granular data is of very little use. In theory, we should make program development decisions based on the data. For example, one semester we should see: oh look, this semester we scored lower on “Plan instruction that promotes a range of communication strategies and activity modes between teacher and student and among students that encourage student participation in learning.” We ought to do something! But nothing like this ever happens. The data is flat and boring. The forms can be very effective pedagogical instruments, a chance to talk to the future teacher or psychologist about how they are doing. As a data source, they are fairly weak. We rarely learn anything about our programs from the compliance data that we did not already know.
I think we should give up on the idea of standards altogether. Professional organizations can concentrate their intellectual resources on the development of good observation protocols, evaluation instruments, and tests. Just cut out the middle construct. Instead of asking what a good teacher or counselor is, ask what a good one looks like. It is a big shift, if you think about it. How does a very perceptive professional actually recognize competence in others? There are probably telltale signs of a potentially good teacher, a struggling teacher, and a hopeless teacher: eye contact, body posture, speech patterns, the kids’ behavioral cues, the kinds of interactions, the ability to pause, the measured display of emotion. Why pretend that we came to these signs through some theory? I think if we gave due attention to the act of recognition, analyzed it, and distilled it, we could come up with much better instruments and much better data. We should concentrate on what is visible, and on which signs are most meaningful.
This is where we should try the neural network approach, the first realistic application of artificial intelligence. I will skip the explanation of how it works; read Wikipedia for that. Fundamentally, the process is this: ask several master teachers to rank video clips of beginner teachers. Don’t ask why; we are studying the master teachers’ perceptions, not the beginner teachers’ behavior. A neural network can then detect what the highly ranked clips have in common. It will “learn” the traits, which are basically patterns within a massive data set. The clips that rise to the top through the collaboration of humans and the AI will become the models to analyze and imitate. Neural networks are actually not very good at explaining why they selected particular cases. That is a feature that makes them uncomfortable for humans to use: they really work, but no one really knows how. They can also reproduce human prejudice, because the initial judgments come from humans, so we have to be careful. Yet my point is this: human intuition is also uncomfortably opaque. As a species, we evolved not to analyze but to synthesize information. We process whole images and search for cues that reveal patterns. An ancient hunter did not have a checklist of hunting procedures when he taught his son. He presented the whole of his practice and alerted his son to cues that others do not see.
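For readers who want a concrete picture, here is a minimal sketch of the kind of learning I have in mind. It is not a real system: the clip features, the ClipScorer network, and the preference data are all hypothetical placeholders, and it assumes PyTorch. The only idea it illustrates is that master teachers supply pairwise judgments (“this clip shows stronger teaching than that one”) and a small network learns a score that reproduces those judgments.

```python
# A toy pairwise-ranking sketch (hypothetical, not a working system).
# "Clips" here are stand-in feature vectors; in practice they would be
# embeddings produced from classroom video.
import torch
import torch.nn as nn

FEATURE_DIM = 32  # assumed size of a clip's feature vector


class ClipScorer(nn.Module):
    """Maps a clip's feature vector to a single teaching-quality score."""

    def __init__(self, dim: int = FEATURE_DIM):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)


def train(preferences, epochs: int = 20) -> ClipScorer:
    """preferences: list of (better_clip, worse_clip) feature pairs, each one
    encoding a master teacher's judgment that the first clip shows stronger
    teaching than the second."""
    model = ClipScorer()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        for better, worse in preferences:
            # RankNet-style loss: the preferred clip should get a higher score.
            loss = -torch.nn.functional.logsigmoid(model(better) - model(worse))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


if __name__ == "__main__":
    torch.manual_seed(0)
    # Fabricated data: 20 random "clips", with the first 10 preferred over the rest.
    clips = [torch.randn(FEATURE_DIM) for _ in range(20)]
    prefs = [(clips[i], clips[j]) for i in range(10) for j in range(10, 20)]
    model = train(prefs)
    with torch.no_grad():
        ranking = sorted(range(len(clips)),
                         key=lambda i: model(clips[i]).item(), reverse=True)
    # The top-scoring clips would become the models to analyze and imitate.
    print("Clips ranked by learned score:", ranking)
```

In a real project the feature vectors would come from a video model, and the hard work would be probing which features drive the learned scores, which is exactly where the opacity problem I mentioned shows up.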