
The Benchmark Problem Nobody in Education Talks About

Article by
Milo
ESL Content Coordinator & Educator
Ask a room full of teachers whether their students are making progress, and most will say yes. Ask them what they are measuring that progress against, and the answer gets complicated fast. Against last term's results? Against the class average? Against what a student of this age is generally expected to know? Each of these is a different yardstick, and depending on which one a school uses, or whether it uses one consistently at all, a child can appear to be thriving while quietly falling behind the rest of the country.
This is the benchmark problem. It does not make headlines the way funding cuts or curriculum overhauls do, but it sits underneath a lot of the other problems that do.
When a Grade Stops Meaning What It Used To
There is a reasonable assumption that a B means something; that it represents a level of competence, a position on a scale, a reliable signal to the student, the parent, and the next teacher in line. In practice, that assumption has been eroding for years.
A 2024 study by the Equitable Grading Project, one of the largest investigations of grading practices ever conducted, examined more than 33,000 middle and high school grades across two academic years. It found that nearly 60% of student grades did not match the course knowledge those students demonstrated on corresponding standardised tests. Two-thirds of the mismatches were inflated. The grade said one thing. The evidence said another.
This is not a story about dishonest teachers. Grading methodology varies from teacher to teacher, and grades themselves have long absorbed non-academic factors, such as participation, effort, behaviour, and extra credit, that have nothing to do with whether a student has actually grasped the material. As researchers at the Equitable Grading Project note, current grading practices in many schools are simply out of step with what contemporary assessment knowledge tells us about measuring learning accurately.
The result is a system in which a student can move through year levels feeling adequately prepared, receive consistent encouragement from well-meaning teachers, and still arrive at a national assessment or a secondary school entrance requirement significantly below where they needed to be. Nobody deceived them. The benchmarks just were not there to tell the truth.
The Inconsistency Nobody Has Fixed
Grading variation between teachers is not a new finding. Writing in Current Issues in Education, education researcher Danielle Iamarino traced the problem to a warning issued as far back as 1996: without shared grading standards, grades given by one teacher can mean something entirely different from grades given by another, even when both are teaching students of the same age in the same subject. Without a shared reference point, teachers improvise. One prioritises attendance. Another weighs effort heavily. A third marks strictly against the curriculum outcomes. All three call the result a grade. None of them is describing the same thing.
What external benchmarks do (the kind that are curriculum-aligned, nationally normed, and applied consistently across year levels) is give that improvisation a floor. They do not replace teacher judgement; they contextualise it. A teacher who suspects a student is underperforming now has something to point to beyond instinct. A teacher who believes a student is excelling has a way to verify that belief against something beyond the classroom walls.
This is precisely the gap that providers such as AAS are designed to address, offering schools assessments that sit alongside national benchmarks, giving teachers and school leaders a consistent reference point that individual grading, however careful, cannot reliably provide on its own.
What Happens When the Reference Point Is Missing
The absence of consistent benchmarks does not just affect reporting. It affects instruction. A teacher who does not know where a student stands relative to a national standard has no reliable way to calibrate what comes next: whether to push harder, slow down, revisit a concept, or move on. The classroom becomes self-referential: students are doing well compared to each other, or compared to last month, but the question of whether that progress is sufficient relative to where they need to be remains unanswered.
This matters most at transition points. The move from primary to secondary school. The shift into senior years. These are moments when a student's actual academic position becomes suddenly visible, often for the first time, through external assessment. For many students, that visibility is a shock. Not because they were not working. Because nobody gave them an honest reference point while there was still time to respond to it.
Reporting in Education Week in October 2024 highlighted the scale of this mismatch, noting that grading methodology varies from teacher to teacher and grades tend to include a mix of non-academic factors, with grade inflation having picked up measurably since the pandemic. The College Board, meanwhile, found that for the third consecutive year, average SAT scores declined even as school grades trended upward. Students were being told they were doing well. The external evidence disagreed.
What Consistent Benchmarks Actually Give Teachers
The argument for shared academic benchmarks is not an argument against teacher autonomy. It is an argument for giving teachers better information. There is a difference between a teacher who grades freely and a teacher who grades freely but also has access to a curriculum-aligned external measure twice a year. The second teacher can do something the first cannot: they can tell whether their classroom's internal progress is tracking toward an external standard, or drifting away from it.
For curriculum coordinators, consistent benchmarking across year levels also makes something else possible: genuine longitudinal tracking. Not just whether a student improved from last term, but whether their trajectory over three years places them where they need to be at the end of secondary school. That kind of visibility requires a consistent scale, applied consistently over time, against a reference point that does not shift every time a teacher changes or a new grading policy comes into effect.
The Fordham Institute, writing on grade inflation and its consequences, put the stakes plainly: if students in under-resourced schools receive grades that do not reflect their actual performance, they may not receive the support and interventions they need to improve. The benchmark is not a bureaucratic formality. It is, in many cases, the mechanism by which a student gets help before it is too late.
The Conversation Schools Are Not Having
The reason this problem does not get talked about is that it is uncomfortable. Acknowledging that grades are inconsistent means acknowledging that students and parents have been receiving signals that were not always reliable. Acknowledging that benchmarks are missing means acknowledging that some schools have been navigating without a compass and calling it professional judgement.
None of this requires blame. Teachers work within the systems they are given. Grading cultures are inherited as much as they are chosen. But the conversation about what "on track" actually means (against what standard, measured how, and reported in a way that travels intact from one year level to the next) is one that more schools need to be having.
The benchmark problem is not unsolvable. It is just easier to leave unexamined. Until the external assessment arrives and a student discovers, without warning, that the grade they were given and the knowledge they actually have are not quite the same thing.
2026 Notion4Teachers. All Rights Reserved.








