Assessment & Data: Norm-Referenced vs. Criterion-Referenced Tests
What Nobody Tells You About How Your Child Is Being Scored
There’s a question most parents never think to ask — and most teachers weren’t taught to answer clearly:
When my child gets a score on a test, what does that score actually mean?
Not “did they pass?” Not “is that a good grade?” But: what is this score actually measuring, and compared to what?
The answer depends entirely on what kind of test your child took. And there are two fundamentally different kinds: norm-referenced and criterion-referenced.
Understanding the difference won’t just make you a more informed parent or educator — it will change how you interpret every test score you ever see again.
Two Tests Walk Into a Room...
Let’s start with a quick mental image.
Test #1: After the test, students are ranked from highest to lowest score. Half the class will always land below the median, "below average" by definition, even if every single student answered 90% of the questions correctly.
Test #2: After the test, every student who demonstrates the required skills passes. It’s theoretically possible for every student to get an “A.” It’s also possible for everyone to fail.
Test #1 is norm-referenced. Test #2 is criterion-referenced.
Same students. Same classroom. Completely different logic.
Norm-Referenced Tests: You vs. Everyone Else
A norm-referenced test (NRT) is designed to spread students out and rank them against each other — or against a large representative sample called a “norm group.”
The score you get isn’t really about what you know. It’s about where you land compared to others.
The Classic Example: The SAT (and most standardized national tests)
When the SAT reports a “percentile rank,” that’s norm-referencing in action. A score in the 72nd percentile doesn’t mean you got 72% correct — it means you scored higher than 72% of test-takers.
Other Examples:
IQ tests
The ACT
Iowa Assessments
Cognitive Ability Tests (CogAT)
Most gifted-and-talented screening tools
How the Scores Look:
Percentile ranks (you’re in the top 30%)
Stanines (a 1–9 scale)
Normal Curve Equivalents (NCEs)
Scale scores compared to a national average
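To make the rank-versus-mastery distinction concrete, here is a minimal sketch in Python. The class scores are invented for illustration; the stanine bands follow the conventional percentile breakdown (4/11/23/40/60/77/89/96). Notice that a student who got 90% of the questions right still lands at the bottom percentile, because everyone else scored even higher.

```python
from bisect import bisect_right

def percentile_rank(score, all_scores):
    """Percent of test-takers who scored strictly below this score."""
    below = sum(1 for s in all_scores if s < score)
    return round(100 * below / len(all_scores))

def stanine(pct_rank):
    """Map a percentile rank onto the standard 1-9 stanine bands."""
    cuts = [4, 11, 23, 40, 60, 77, 89, 96]  # upper bounds of stanines 1-8
    return bisect_right(cuts, pct_rank) + 1

# A hypothetical class where everyone did well in absolute terms:
scores = [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]  # all at least 90% correct

print(percentile_rank(90, scores))  # 0  -- bottom of the group despite 90% correct
print(percentile_rank(95, scores))  # 50 -- dead average, despite 95% correct
print(stanine(50))                  # 5  -- the middle stanine band
```

The point of the sketch is that `percentile_rank` never looks at how many questions were answered correctly in absolute terms; it only counts who is above and below you.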
The Uncomfortable Truth About NRTs
Here’s something that should sit with you: on a perfectly designed norm-referenced test, someone has to be at the bottom. Always. By design. Even if the “bottom” students are doing well in absolute terms, they’re below average relative to the group.
This is why norm-referenced scores can feel demoralizing in ways that don’t reflect actual learning. A student who made tremendous growth might still be in the 20th percentile — because everyone else grew, too.
For Teachers: NRTs are useful for comparing your students to state or national populations. They’re less useful for knowing whether students mastered your curriculum.
For Parents: When you see a percentile score, remember — it tells you rank, not mastery. Your child in the 45th percentile might know the material. They just know it slightly less than the student next to them.
Criterion-Referenced Tests: You vs. a Standard
A criterion-referenced test (CRT) measures student performance against a fixed set of skills, standards, or learning objectives — not against other students.
The question isn’t “How did you do compared to everyone else?” It’s “Did you demonstrate what you were supposed to learn?”
The Classic Example: Your State’s End-of-Year Test
Most state accountability tests — the ones your school is graded on — are criterion-referenced. They’re built around specific grade-level standards (often the Common Core or state equivalents). Did the student demonstrate mastery of 7th grade reading standards? Yes or no? To what degree?
Other Examples:
State standardized tests (PARCC, STAAR, MCAS, etc.)
AP exams (1–5 score scale tied to content mastery)
Driving tests (you meet the standard, or you don’t)
Most classroom tests and quizzes
Curriculum-based assessments like those in Amplify ELA
How the Scores Look:
Proficiency levels (Below Basic / Basic / Proficient / Advanced)
Percent correct (you got 84% of the questions right)
Mastery indicators (Met standard / Did not meet standard)
Scaled scores tied to cut points
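The same kind of sketch shows how criterion-referenced scoring differs: every student is compared to fixed cut points, never to each other. The cut points below are made up for illustration; real state tests publish their own cut scores, usually on a scaled-score metric rather than percent correct.

```python
from bisect import bisect_right

# Hypothetical cut points (percent correct) separating the four levels.
CUT_POINTS = [50, 70, 85]
LEVELS = ["Below Basic", "Basic", "Proficient", "Advanced"]

def proficiency_level(percent_correct):
    """Judge a score against fixed criteria, not against other students."""
    return LEVELS[bisect_right(CUT_POINTS, percent_correct)]

# Unlike a norm-referenced test, the whole class can land in the top band:
class_scores = [88, 91, 95, 86, 90]
print([proficiency_level(s) for s in class_scores])  # all "Advanced"
```

Under this logic there is no forced bottom: if every student clears the cut, every student is "Advanced," which is exactly the property the next section calls the power of criterion-referenced tests.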
The Power and the Limitation
The power of criterion-referenced tests: theoretically, every student can succeed. If you teach the standards well, every student can reach proficiency. There’s no artificial “loser.”
The limitation: the quality of the test depends entirely on how well the criteria were written. A poorly designed rubric or a misaligned standard renders the whole thing meaningless — regardless of how students scored.
For Teachers: CRTs are your diagnostic tool. They tell you which standards were mastered and which need reteaching. They’re the foundation of data-driven instruction.
For Parents: When your child’s report card says “Proficient” or “Meets Standard,” that’s criterion-referencing. It means your child demonstrated the required skills — not that they beat 60% of the class.
Why This Matters for Equity
Here’s where it gets important — especially in schools serving historically underserved communities.
Norm-referenced tests were originally designed to sort students — to identify who was “gifted,” who qualified for special programs, who got into elite colleges. The sorting is baked in. That means students from under-resourced schools and communities will often rank lower — not because they haven’t learned, but because they had less access to the resources that fuel those scores.
Criterion-referenced tests, at their best, ask a different question: Did this student learn what we taught? That’s a more equitable question — as long as the teaching was equitable, too.
This is one reason many education researchers and advocates push for proficiency-based and standards-based grading: it shifts the conversation from “where does this student rank?” to “what has this student mastered, and what do they still need?”
That’s not just a technical distinction. It’s a different philosophy about what school is for.
A Note for Teachers: Both Tests Belong in Your Toolkit
It’s tempting to treat this as a competition — norm-referenced bad, criterion-referenced good. But in practice, skilled educators use both, for different purposes.
Use norm-referenced data when you want to:
Understand how your students compare to a broader population
Make decisions about program placement or enrichment
Identify students who may be significantly above or below grade level compared to peers nationally
Use criterion-referenced data when you want to:
Know whether your students mastered a specific standard
Decide whether you need to reteach something
Provide grades and feedback grounded in actual learning targets
Track growth toward clear benchmarks
The danger comes when people confuse the two, or when norm-referenced rankings get treated as if they were statements about absolute knowledge. They're not. A student in the 40th percentile might know quite a lot; the ranking only tells you that others scored higher.
The Takeaway
The next time you see a test score — whether it’s your own student’s benchmark results, your child’s report card, or a standardized test report — ask yourself two questions:
What was this score compared against? A group of other students, or a standard?
What does it actually tell us? Rank, or mastery?
Those two questions will tell you more than the number itself.
Assessment isn’t neutral. Every test is built on assumptions about what matters, what counts as knowledge, and how performance should be defined. Understanding the framework behind the score is the first step toward using that score wisely — and advocating for students when the framework doesn’t serve them.