“For all those idiot principals, this is just another way to play ‘gotcha.’”


A while back, while blogging for Education Week, I ripped into the over-hyped marketing for Robert Marzano’s teacher evaluation system.

That system is now in use in districts around the country, and just a few days ago, I heard from an educator familiar with how the system is being implemented. She writes:

With a well-trained principal who sees this as a tool to help struggling teachers identify their weaknesses, and then make suggestions for improvements, this is a useful system because it provides a common vocabulary. For all those idiot principals, this is just another way to play “gotcha.”

Ouch. May you and I never be accused of playing “gotcha” with the teachers we’ve been entrusted to lead.

Here’s the real problem: Any tool that gives us greater potential to be more effective in our work…also places upon us a greater obligation to use that tool responsibly. And we’re not automatically cut out to wield greater power responsibly. We need to work up to it.

Changing the tools does not magically give the user the power to use them effectively. I’m OK with a hammer and decent with a staple gun, but heaven help me if someone hands me a full-size nail gun.

As principals around the country gain greater power through new evaluation systems, it’s our job to make sure that we develop the skill and perspective we need to handle this power responsibly and with the best interests of students in mind.

When You Don’t Have Enough Evidence

Over the past few years, many states have adopted much more stringent teacher evaluation requirements. We have new rubrics, more required observations, and more complex criteria on which to rate teachers.


For principals, the impact is undeniable: The teacher evaluation workload has grown dramatically. What used to be a perfunctory process of filling out a form is now a year-long process of gathering evidence on a huge range of criteria. What used to be a pass/fail process is now a detailed rating process that demands much more evidence.

I think this is a good change, but it’s one we have to handle smartly if we want to avoid being crushed under this new workload.

So I adopted a simple approach: I’m going to gather as much evidence as I can, and the teacher can supply evidence too, but if I don’t have evidence on something, I’m going to assume it’s satisfactory (3 on a 4-point scale).

If I believe a teacher’s performance in a given area is not satisfactory, I should go to the trouble of gathering evidence to back up my assertion. If a teacher believes their practice is exemplary and deserves a 4 out of 4, they should have readily available evidence. If I believe their practice is exemplary, I should have no trouble pulling out a few examples to showcase.

But what if I don’t have enough evidence? Let’s face it: this is often the case.

When it’s time to write the evaluation, we can’t include evidence we don’t have, and we shouldn’t try to mine our observation notes for patterns that aren’t there.

What we most certainly should not do is force ourselves to gather one piece of evidence for every component for every teacher. One piece of evidence may not be nearly enough, or it may be too much.

The Danielson Framework for Teaching has 22 components in 4 domains. If we focus on gathering one piece of evidence for every criterion—an enormous task for a staff of 30 or 40 teachers—what value does that add to the process?

Not much. There’s a word for a lone piece of evidence: anecdotal.

If, at the end of the year, we find that we need more evidence to provide a justifiable rating on every criterion, the time to address that problem was months ago, not in our “creative writing” process at the last minute.

If I’m going to give a rating that has a negative impact on a teacher’s self-concept or employment situation, I want to have at least three specific pieces of evidence that I’ve documented in writing. I want dates and times, and I want to be sure anyone who reads the evaluation (the teacher, my boss, the union, human resources, a hearing judge) will agree with me.

This means I have to be in classrooms much more often than the evaluation process requires. Two formal observations won’t generate the kind of evidence I need.

We need to be in classrooms virtually every day, for more than a few minutes, to know as much as we need to know about the teaching and learning taking place in our classrooms.

We need to take good notes, gather good evidence, and most importantly, talk with teachers about their teaching and with students about their learning.

If we’re smart about our evidence-gathering, we can be more effective as instructional leaders and write evaluations that we can stand behind. 
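For readers who like to see a rule written out, here is a minimal Python sketch of the default described above: no evidence means a 3, and a rating below 3 stands only with at least three documented pieces of evidence. The function name, the constants, and the choice to fall back to a 3 when a low rating is under-documented are my own illustrative assumptions, not part of any evaluation system. So is the simplification that any evidence at all is enough to support a 3 or a 4.

```python
SATISFACTORY = 3                 # default on the 4-point scale
MIN_EVIDENCE_FOR_NEGATIVE = 3    # documented pieces wanted before rating below 3


def rating_to_record(proposed: int, evidence: list[str]) -> int:
    """Return the rating to write down for one component.

    `proposed` is the evaluator's impression (1-4); `evidence` is a list
    of documented notes (ideally with dates and times).
    """
    if not 1 <= proposed <= 4:
        raise ValueError("ratings run from 1 to 4")

    # No evidence at all: assume satisfactory.
    if not evidence:
        return SATISFACTORY

    # A negative rating needs at least three documented pieces of evidence.
    # Assumption: without them, fall back to satisfactory rather than guess.
    if proposed < SATISFACTORY and len(evidence) < MIN_EVIDENCE_FOR_NEGATIVE:
        return SATISFACTORY

    # Assumption: any evidence is enough to let a 3 or a 4 stand.
    return proposed
```

The point of the sketch is the asymmetry: the burden of proof sits with whoever wants to move the rating away from 3, and it is heaviest when the rating could hurt the teacher.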

How to Set Meaningful Professional Growth Goals

In many districts, the annual evaluation process for teachers involves setting both student growth goals and professional growth goals. I’ve found that professional growth goals are often fuzzier, and the administrator’s role in ensuring that the goal is meaningful and challenging is even more important. What do you look for?

1. Strong personal commitment
A professional growth goal should be something at the top of the teacher’s list – in other words, it should matter to them personally. Otherwise, the evaluation process will be nothing more than a formality.

2. Connection to school and team goals
A professional growth goal should have relevance to the broader work of the school and the teacher’s teammates. While everyone has their particular interests, the work you are supervising should have specific relevance to your school’s current areas of focus, and should involve relevant colleagues.

3. Well-defined evidence
It should be easy to determine, at the end of the year, whether the goal was accomplished or not. For example, goals such as “I will get better at…” are incomplete, because they don’t give clear criteria that enable the evaluator and the evaluated to agree on whether the goal was met. (A student growth goal should be measurable, but professional growth goals shouldn’t necessarily include quantitative measures.)

4. Growth orientation
The goal should emphasize (as the title suggests) professional growth, not just the completion of an agreed-upon project such as rearranging desks for group work.

It might seem that #1 and #2 are in tension, as are #3 and #4. A good professional growth goal, though, can meet all of these criteria and provide a meaningful challenge and direction for the teacher’s efforts for the year.

Here are a few examples:

Goal: Increase skill in using writing workshop instructional model, with particular attention to modeling the writing process using my own work. By the end of the year, I will model three lessons for my grade-level team, and will develop a portfolio of my own writing that I have revised in front of students.

Goal: Increase collaborative learning in math by creating project-based lessons to allow students to work in groups. By the end of the year, I will develop, teach, and evaluate 6 lessons, and share them with our school’s math council for feedback.

Goal: Increase positive communication with parents of struggling students. By the end of the year, I will make at least 5 positive contacts with my 10 lowest-performing students’ families, and will update the student support team on their progress.

How do you help your staff develop meaningful professional growth goals?

WSJ: Where Teacher Report Cards Fall Short

In this Wall Street Journal piece on teacher accountability for student growth, Carl Bialik gives a concise round-up of the current debate on tying teacher evaluations to test scores:

One perplexing finding: A large proportion of teachers who rate highly one year fall to the bottom of the charts the next year. For example, in a group of elementary-school math teachers who ranked in the top 20% in five Florida counties early last decade, more than three in five didn’t stay in the top quintile the following year, according to a study published last year in the journal Education Finance and Policy.

Meanwhile, the District of Columbia began evaluating teachers based on test scores last school year, and fired more than 150 teachers after the school year because of poor performance. Test scores count for 50% of teacher ratings in subjects that are tested.

A report from the Department of Education released last month shows that even with three years of data, one in four teachers is likely to be misclassified because unrelated variables creep in.

Even with these questions, relying on student test scores to create a quantitative assessment of teachers might be better than the current standard practice. At many schools, principals grade teachers based on a few minutes of classroom observation (and then give most of them high scores).
Carl Bialik, Needs Improvement: Where Teacher Report Cards Fall Short (WSJ)

One of the most visually arresting elements of the article is this chart that shows the instability of value-added ratings from year to year:
[WSJ chart. Source: WSJ.com]

As you can see, the idea that excellence in teaching (as defined by impact on student learning) is a stable construct is not supported by the data from this study. Value-added is not ready for prime time.

But is it better than what we currently use for teacher evaluation?

But even skeptics of test-score-based evaluations acknowledge that a uniform, data-based approach for ranking teachers could be superior to subjective methods—such as principals’ observations—that still predominate in schools. “Damn near anything is going to be an improvement on the status quo,” says Daniel Willingham, a cognitive psychologist at the University of Virginia.

Indeed, the sad state of teacher evaluation is not due to a lack of data, but a lack of diligence on the part of school administrators. But I would disagree sharply with Willingham’s prescription for change. Far too much energy has been devoted in recent months to making a show of doing something – anything – to address issues of teacher performance. We seem to be enamored of anything that uses data and promises accountability, even if the results are patent nonsense, as in the illustration above.

We like this approach because it’s neat and quantitative, even if it’s dead wrong. Truly improving instruction is a lot more complicated. In order to do a better job of identifying effective and ineffective teachers, and ensuring that the latter make the necessary improvements, principals will have to make time to observe instruction, provide feedback, and take responsibility for the quality of teaching in their schools.

See this follow-up post for more analysis from Bialik.

Kim Marshall on Teacher Accountability

In this op-ed in EdWeek, Kim Marshall advocates for team-based accountability for student growth as a part of teacher evaluations:

So why are folks still talking about individual merit pay when it’s clear that it won’t work? Because the idea of holding teachers accountable for their students’ test scores sounds so obvious—and U.S. Secretary of Education Arne Duncan and a bunch of powerful politicians are enabling that gut feeling.

First, those who advocate performance-based accountability are absolutely right that student achievement needs to be front and center. It’s not enough to observe teachers’ classroom performance; we need evidence that students have learned.

Second, research has clearly established that teachers and principals make a huge difference to student achievement. They shouldn’t be ducking responsibility.

Third, when people are acknowledged for a job well done, it’s affirming and energizing. That’s true even for idealistic and intrinsically motivated educators.
–Kim Marshall, Merit Pay or Team Accountability?

As always, great thoughts from Kim Marshall.
