
Course Evaluations

01 Oct | Posted in Faculty, Students

Here’s a subject that requires periodic comment. A recent NPR tweet referred to two interesting new efforts to demonstrate the relatively low value of student ratings of faculty. The results are worth attention, though they also have their own risks of misinterpretation.

A Berkeley statistician notes that in many courses only half the students fill out the forms, and they are likely to be either unusually pleased or unusually displeased. He also notes wide variations in course types, from big lectures to upper division seminars, yet the same forms are often used regardless of specifics. His conclusion: student evaluations of teaching don’t mean what they seem to mean, but sheer inertia maintains the system.

Piling on, a Swiss economist tested student results in ensuing courses in the field, finding that those who had worked with highly rated teachers did worse in the next round, suggesting that popularity reflected a lack of rigor (exception: unusually able students like demanding instructors). Overall conclusion: ratings reflect leniency, and they actually work to students’ ultimate disadvantage.

And I would add another related point. Polls of contemporary US students suggest the most important criterion in their reactions to faculty is likability. This one has long bothered me.

So what’s the conclusion? Most obviously – and I agree with this fully; as provost I tried to promote it with some success – don’t rely on student evaluations alone. Use peer visits, periodically assess curricula and other teaching materials, and combine all this in any effort to determine who’s a really good teacher.

And in some cases, following the Swiss example, do add some effort to determine next-stage student performance, recognizing that sequential opportunities vary with discipline, and the method would not work easily for many general education offerings.

This said, however, a couple of caveats. Several faculty have already noted the NPR essay with characteristically dismissive comments about student input. We know that faculty are often uncomfortable with the measure, partly though not entirely for validity reasons. But I would argue strongly against any attempt to roll back student voice – while, again, fully agreeing that the results should comprise only one component of a larger exercise.

And let’s acknowledge also the limitations of other options. Peer evaluations can be great in units with a culture of collegial rigor. But they can also turn into love fests, perhaps particularly when tenured faculty are evaluating tenured colleagues. Student cautions here are especially essential.

Let’s note also a couple of aspects of administrative input that the recent articles have downplayed or ignored. First, administrators are not always stupid. They are capable for example of noting different course formats, and adjusting expectations accordingly. We know that large, required courses do worse than small courses in the discipline; we know that some types of students (MBAs, for example) are often nastier than others. And we can evaluate the evaluations accordingly.

Most important: it is true that we make a bit of a fuss about high ratings, usually noting them (though along with lots of other evidence) when we single out teaching excellence. But the most important use is at the other end: students, who are usually pretty generous, help us identify faculty whose teaching is a problem, who need advice or (in case of repeated problems) other measures in response. This is what I particularly sought in scanning results, and what I passed on to deans or department chairs with special urgency. Even here, of course, students may be off, but in most instances consistently subpar ratings do indicate issues that require attention, issues that go beyond unwelcome rigor.

Finally, let’s also remember that students, themselves, are skeptical about ratings – but in this case, skeptical that their opinions are seriously considered at all. I think they are wrong – again, I always took the results into account, as department chair, dean or provost – but we need to make our case here carefully. Too much dismissive rhetoric may confirm worst fears, reducing student attention precisely where we want to encourage it.