Resources

Teacher Calibration Protocols

Calibration protocols are procedures used to increase calibration among appraisers and across campuses throughout the year. When used strategically, they can help increase scoring accuracy by giving appraisers opportunities to practice collecting defensible evidence for ratings. They also help appraisers develop a deeper understanding of what effective instruction looks like across a variety of contexts and ensure that all appraisers in the district are aligned in how they evaluate teachers.

At times, these protocols may be used to coach and develop appraisers through practice and feedback. At other times, the protocols may be used to evaluate whether appraisers are appropriately calibrated to the district’s standards.

  • Create a schedule of calibration activities that span the year.
  • Designate times for appraisers at different campuses to calibrate together.
  • Ensure that district leaders are certified and calibrating alongside campus appraisers.
  • Decide what it means for two appraisers to be calibrated to each other on a given calibration activity (e.g., do they need to match on every rating, be within one point on each dimension, or match on a certain percentage of ratings?).
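
The last decision above, defining what counts as "calibrated," can be made concrete with a small calculation. The following is a minimal sketch, not a prescribed method: the function names, the dimension labels, and the default thresholds (75% exact match, all ratings within one point) are illustrative assumptions that each district would replace with its own definitions.

```python
from typing import Dict


def agreement_summary(ratings_a: Dict[str, int], ratings_b: Dict[str, int]) -> Dict[str, float]:
    """Compare two appraisers' ratings on the dimensions they both scored."""
    shared = sorted(set(ratings_a) & set(ratings_b))
    if not shared:
        raise ValueError("The two appraisers have no rubric dimensions in common.")
    # Count dimensions where the two ratings are identical.
    exact = sum(1 for d in shared if ratings_a[d] == ratings_b[d])
    # Count dimensions where the two ratings differ by at most one point.
    within_one = sum(1 for d in shared if abs(ratings_a[d] - ratings_b[d]) <= 1)
    n = len(shared)
    return {"exact_match_rate": exact / n, "within_one_rate": within_one / n}


def is_calibrated(
    ratings_a: Dict[str, int],
    ratings_b: Dict[str, int],
    min_exact_rate: float = 0.75,     # assumed threshold, set by the district
    require_within_one: bool = True,  # assumed rule, set by the district
) -> bool:
    """Apply a hypothetical district rule for 'calibrated to each other'."""
    summary = agreement_summary(ratings_a, ratings_b)
    if require_within_one and summary["within_one_rate"] < 1.0:
        return False
    return summary["exact_match_rate"] >= min_exact_rate


# Hypothetical ratings from one co-observation (dimension labels are made up).
appraiser_1 = {"dimension_a": 3, "dimension_b": 4, "dimension_c": 2, "dimension_d": 3}
appraiser_2 = {"dimension_a": 3, "dimension_b": 3, "dimension_c": 2, "dimension_d": 3}

print(agreement_summary(appraiser_1, appraiser_2))
# {'exact_match_rate': 0.75, 'within_one_rate': 1.0}
print(is_calibrated(appraiser_1, appraiser_2))
# True
```

Under these assumed thresholds the two appraisers count as calibrated: they match exactly on three of four dimensions and are within one point on the fourth. A stricter district could set `min_exact_rate=1.0` to require a match on every rating.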

During calibration debriefs, discuss evidence before sharing ratings so that everyone is aligned about not only what the rating should be but why. Quality evidence is objective and states exactly what the teacher said or did, or what students said or did.

  • Low-quality evidence: The teacher checked for understanding after modeling one problem.
  • High-quality evidence: After modeling one problem, the teacher asked each student to attempt a second problem on individual whiteboards (15 x 24). The teacher circulated as students worked and wrote down common errors she was seeing in students’ work. Then, all students raised their boards so she could see how many students were able to complete the problem successfully on their own. 15 out of 22 students were successful.

Should ratings collected during calibration protocols be used as formal ratings?

While the decision is entirely up to the district or school, ratings collected during a calibration activity are likely to be accurate because each rating was agreed upon by more than one person. For that reason, we recommend using them as formal ratings if that makes sense for your district or school. As you decide, be sure to solicit teacher input.

If two appraisers disagree about a rating, how do we decide who is “right”?

Using evidence collected during the observation, appraisers should discuss which rating makes the most sense based on the teacher observation rubric and then come to a consensus. The practice of debating and grounding discussion in evidence is perhaps the most important part of calibration activities because it promotes a deeper understanding of how to appraise instruction using the rubric. When in doubt, rely on scripted evidence.

What should we do if appraisers don’t calibrate to each other during a calibration activity?

Districts should not be concerned if appraisers aren’t calibrated during a single calibration activity. Continue engaging in calibration activities so that appraisers become increasingly aligned over time. If a trend emerges in which appraisers or campuses are consistently not calibrated, the district and/or campus should create a plan to increase appraiser validity and reliability. Next steps could include the following:

  • Re-train appraiser(s) on the district’s teacher observation rubric.
  • Norm on what constitutes each performance level on the rubric for a specific subject or grade level.
  • Until calibration is established or re-established, have two appraisers conduct each scored observation.
  • Assign each teacher two appraisers and use the average scores of both appraisers.
  • Increase individualized coaching of appraisers who are not highly calibrated.
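
For the option of assigning each teacher two appraisers and averaging their scores, the arithmetic can be sketched as follows. This is an illustration under stated assumptions only: the function name and dimension labels are hypothetical, and the sketch assumes both appraisers scored the same set of rubric dimensions on a numeric scale.

```python
from typing import Dict


def average_appraiser_scores(scores_a: Dict[str, int], scores_b: Dict[str, int]) -> Dict[str, float]:
    """Average two appraisers' scores for one teacher, dimension by dimension."""
    # Assumption: both appraisers scored exactly the same dimensions.
    if set(scores_a) != set(scores_b):
        raise ValueError("Both appraisers must score the same rubric dimensions.")
    return {dim: (scores_a[dim] + scores_b[dim]) / 2 for dim in scores_a}


# Hypothetical scores for one teacher from two appraisers.
first = {"dimension_a": 3, "dimension_b": 2}
second = {"dimension_a": 4, "dimension_b": 2}

print(average_appraiser_scores(first, second))
# {'dimension_a': 3.5, 'dimension_b': 2.0}
```

Averaging dimension by dimension, rather than averaging only an overall total, keeps visible where the two appraisers differ, which can feed back into later calibration conversations.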

*All protocols are suggestions. Districts are encouraged to adapt these protocols to meet their needs.

| Protocol Name | Time Estimate | Calibration Protocol Description | When would this be useful? |
| --- | --- | --- | --- |
| Co-Observation | 30-45 minutes | Two or more appraisers observe the same live lesson at the same time, score 2-3 predetermined rubric dimensions, and then use the evidence collected to norm on ratings. | This can be used by appraiser managers as a coaching tool to develop appraisers’ accuracy in rating and their ability to collect high-quality evidence. It can be used by peers to increase their calibration to each other. It can also be used to assess how calibrated to the rubric an appraiser is. |
| Single Dimension Walkthrough | 60-90 minutes | Two or more appraisers conduct short co-observations of multiple teachers (districts select time for short observations such as 5 minutes, 10 minutes, etc.). Appraisers rate each teacher on only one rubric dimension. | |
| Campus Walkthrough | 3-6 hours | Campus leadership team conducts short (10-15 minute) observations across many or all classrooms on a campus. | Full campus walkthroughs can provide leadership teams a view of strengths and areas of weakness in instructional practices across their entire campus, especially if appraisers score teachers they don’t normally observe. This protocol can help increase alignment across a campus’ leadership team. |
| Student Actions vs. Teacher Actions Co-Observation | 30-45 minutes | Two or more appraisers observe the same lesson (either live or videoed). One person scripts only what students say and do. The other person scripts only what the teacher says and does. | This protocol is useful for developing appraisers’ ability to collect quality evidence using not only teacher actions but also student actions. The debrief conversation will help appraisers develop a deeper understanding of the teaching rubric. |
| Virtual Synchronous Lesson Co-Observation | 30-45 minutes | Two or more appraisers observe the same lesson (either live or videoed). One person scripts only what students say and do. The other person scripts only what the teacher says and does. | These protocols are especially useful to train appraisers to evaluate instruction in a new context (virtual) and using an adapted virtual instruction rubric. If your district needs to conduct scored observations virtually, we recommend implementing calibrated co-observations of virtual instruction. This can be used by appraiser managers to develop appraisers’ accuracy and ability to use high-quality evidence to rate teachers using the observation rubric. |
| Virtual Asynchronous Co-Observation | Varies | Two or more appraisers collect evidence on a few predetermined rubric dimensions using asynchronous instruction, and then discuss ratings together. | These protocols are especially useful to train appraisers to evaluate instruction in a new context (virtual) and using an adapted virtual instruction rubric. If your district needs to conduct scored observations virtually, we recommend implementing calibrated co-observations of virtual instruction. This can be used by appraiser managers to develop appraisers’ accuracy and ability to use high-quality evidence to rate teachers using the observation rubric. |