A High School Music Teacher Unravels a School Assessment Conundrum

John P. W. Hudson
Retired Music Teacher

A recent IAE Newsletter article described how the Olympics solve an evaluation problem that educators also confront: how to fairly combine objective and subjective assessments, when the two seem incompatible (Sylwester, October, 2014). I experienced confusion at trying to combine subjective and objective assessments during my very first semester as a teacher. I had no idea what the underlying issues, causes, and conflicts were. I just felt an uneasy sensation when trying to combine them. I pieced together my solution over many years, and only in retirement have I gained clarity about what I did.

The Olympics use objective (quantitative, measurable, or countable) assessment processes for events decided by distance, speed, or points scored, such as running, swimming, jumping, throwing, and basketball. In objectively assessed races, for example, the athletes' goal is to run from the moment the starter's gun goes off and reach the finish line in the shortest time. Because running times can be measured with precision calibrated to thousandths of a second, the validity of Olympic objective measurements is considered a gold standard.

The Olympics use subjective assessment processes for an expressive (qualitative, not precisely measurable) performance, such as in gymnastics, figure skating, and diving. A team of competent, credible judges makes this subjective judgment using a ten-point scale. The athletes' goal is to increase their subjective score by doing all of the required movements with style and grace. Events such as diving include both required and diver-selected dives. A diver can obtain a higher score by selecting an especially difficult dive and doing it very well. Similarly, a figure skater can gain points by doing a quadruple Lutz rather than a triple Lutz.

The requirement that a judge produce a numerical score allows the scores of the various judges to be compared and an average computed. This quantification of a subjective performance is a difficult challenge for judges but is essential in scoring such Olympic events.

The veracity of subjective scores is completely dependent upon the judges' credibility and the validity of the judges' evaluative criteria (Moursund & Sylwester, 10/9/2015). However, the subjective assessments of the judges can be, and have been, swayed by bias and nefarious influences. This problem is partially addressed by having a large number of judges and throwing out the highest and lowest scores that a performer receives.
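The averaging and trimming described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `trimmed_average` and the sample panel of scores are invented here, and actual Olympic scoring rules vary by sport and are more elaborate than this.

```python
from statistics import mean


def trimmed_average(scores):
    """Average a panel's scores after dropping one highest and one lowest.

    Hypothetical helper for illustration; not an official scoring rule.
    """
    if len(scores) < 3:
        raise ValueError("need at least three scores to drop the highest and lowest")
    ordered = sorted(scores)
    # Discard the single lowest and single highest score, then average the rest.
    return mean(ordered[1:-1])


# Invented panel: four judges roughly agree, one scores far lower.
panel = [8.5, 9.0, 9.0, 9.5, 6.0]
print("plain average:  ", round(mean(panel), 2))
print("trimmed average:", round(trimmed_average(panel), 2))
```

In this invented panel, the single outlying score pulls the plain average down, while the trimmed average stays close to what most of the judges awarded. Trimming limits the influence of one biased judge, but it cannot correct a bias shared by the whole panel.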

Teachers Assessing Students

I was a band teacher. I assessed both objective and subjective aspects of the performance and the learning of each of my individual students. Certainly all teachers in the fine and performing arts face this dual assessment challenge. Most, if not all, other teachers also face this challenge. This is obvious in courses that require writing and/or oral presentations. And how about the challenge of assessing a student's progress in learning to read—with understanding—across the curriculum?

In my subjective judgments of individual students, I was the only judge. I needed to score each student fairly, reliably, and validly, so that all students were treated equitably.

My fledgling teaching assignment was to direct elementary and secondary school bands and, of course, to assign a letter grade to each student. I could objectively measure correct notes played, but the expressive aspects of playing required my subjective judgment. My challenge was to combine objective and subjective assessments into a single letter grade, when objective and subjective evaluations are fundamentally incompatible. Objective assessments are anchored to countable or measurable acts, but subjective assessments have no such tether, and are vulnerable to interpretation, the antithesis of objective reporting.

After years of grappling with the thorns of this opaque dilemma, I attempted to skirt the oil-and-water conundrum altogether by making all of my assessments objective. I taught and tested musical theory, counted correct notes, and presumed that a high correlation would exist with expressive performance.

This assumption was definitely wrong. The year in which I emphasized only objective measures, my students scored very high on their music exams, but the band ranked poorly in festival competition. The students and I were despondent. I had traded expressive rehearsal for test rehearsal, and irrespective of their high letter grades, the results were counter to our real goals: excellent musicianship, stellar performances, and fulfilled, joyful learners.

I then considered the idea of using only subjective assessments, but I had no concrete foundation for validity. I realized this would not be appropriate for the wide range of students in my band classes.

Powerful Lessons

When I had naïvely used only objective assessment, I inadvertently learned that whatever is being assessed tends to become the curriculum. Having witnessed this tendency of assessment to drive curriculum, I decided to lever that effect to my advantage.

Music festivals use criterion-referenced assessment terraced into three levels—bronze, silver, and gold—hierarchically organized as descriptive statements within rubrics. Tuning, balance, blend, and many other expressive elements are combined with objective assessments as in the Olympics, but correct notes, correct rhythms, and other countables are described rather than counted. To describe countables is to put objectively derived evidence of learning on the same footing as evidence of subjective expression.

Instead of numbers as a bridge between subjective and objective assessments, rich descriptions of authentic evidence can be used. Epiphany is a word that evokes moments of heavenly streaming light, and, despite searching desperately for a less clichéd simile, I can't find a better way to describe how I felt at that pivotal moment of understanding.

On the day our band performed poorly at festival, I put a copy of the evaluative criteria away for study over the summer. I decided the students should read it too, so I presented it to them in September. I let the festival assessment criteria become the performance curriculum, which had the advantage of being clearly described by the most successful band directors over many years. The effect was immediate. With clear objectives, the students and I had a clear ladder to climb; our daily learning deepened, and performance swelled with newfound understanding. Describing countables had erased the need to use numbers to score subjective performances. Students could see their path to success clearly. Our performances that year satisfied students, parents, staff, and me.

Applying letter grades was a simple matter of equating gold with 'A', silver with 'B', and bronze with 'C', 'C' plus, or 'C' minus. Authentic expression was finally on an equal footing with objective assessments. Problem solved. Almost.

Wilkerson's Warning

Letter grades are still necessary for reporting. Although they were fairly derived in my case, I witnessed over my career the damaging effects of translating authentic evidence into scores, which hides important evidence from reporting by using proxies (scores and letter grades). For example, students' identity can be so completely subsumed into academic performance that anything less than an 'A' is too often seen as a failure. Scores being corruptible and vulnerable to negative influences only magnified the tragedy of the fallen, broken body of a senior high school student I saw, only moments after her decision to end her life. Her shame of low marks, insufficient to enter university, had overwhelmed her. (It is for her that I dedicated my studies to finding a solution.)

Isabel Wilkerson said, in her book The Warmth of Other Suns: "Submerging an individual into abstractions is bad science... [and leads to] ...dehumanization" (Wilkerson, 2010). After years of suffering the angst of judging others whose musicianship sometimes approached my own, I came to understand that scores aren't real evidence, but rather are just a proxy: a way to translate the descriptive language of subjective expression into the hard-wired language of the objective. Scores had become trite, demoralizing, unsettlingly impersonal, and I was glad to be rid of them; but still, using letter grades to describe expression similarly reduces performances to mere abstractions, with the tragic consequences Isabel Wilkerson warns us about.

I now believe that Wilkerson's warning can be heeded by making descriptions the common measure for combining objective and subjective evidence of learning, rather than scores. I believe all demonstrations of learning are performances, whether they be spelling, math questions, historical dates, or dance moves, and all can be anecdotally described. All performances are empirical evidence of learning, so they should be reported in real terms. Even tests and examinations can be described in terms of what students can do with their knowledge. Naming accomplishments is as easy as counting scores, and is much more powerful and meaningful for reporting and growth than is the use of letter grades.

That said, letter grades are deeply embedded in our society's expectations of how schools should report student performance. We know them, and we accept them as currency between teacher, student, and parent. Students accept them as imperfect representations of their efforts, and the grades give them a common language for comparing themselves with others and for tracking their own progress. However, I believe this social ranking effect of letter grades is not healthy. It can be ammunition for bullying by peers or parents and, as the suicide I witnessed attests, it can be dehumanizing. I believe students should compare themselves to a standard, not to each other.

Hope exists. British Columbia, Canada, no longer uses letter grades in primary grades. Descriptive evidence is presented instead, with a guide for comparative purposes (does not yet meet expectations, meets expectations, fully meets expectations, exceeds expectations). It is my hope that letter grades will eventually be replaced by descriptive reports at every level.

Every year, I understood more deeply the power of assessment. When used as a tool, it is a potent motivator and provides clear direction for curricular design. More important, by valuing uncountable or expressive elements, a very wide door opens to a humane education: one in which empathy, teamwork, imagination, creativity, energy, and passion have equal weight with knowledge.

I loved designing my lessons with a passion for student growth in mind, which to me is a much larger lesson than the knowledge or skills to be learned. Most of all, I had used assessment as a powerful ally to honour the young lives in my care with the dignity, power, and control they deserve. The school's job, after all, is to be everything it can be in service of the learners, their families, their community, and the world.

John Hudson is a retired teacher who taught music from kindergarten to grade twelve, and also taught academic subjects such as English and Business Education. He spent twenty years at the secondary level before moving to elementary, where he taught music K-7, and also classroom grades five, six, and seven. After completing a Master's degree in Education, his passion for educational reform led to his being invited to Shenzhen, China, in 2007 to demonstrate and explain Western teaching methods for two years. His first book is Pathways Between Eastern and Western Education (Information Age Publishing, 2009). He has retired in Surrey, British Columbia, Canada, but still enjoys teaching on call for the Richmond School District and Southridge Private School. Besides writing about educational reform, he enjoys spending time with his family, photography, playing in various bands, and travelling the world with Debbie, his wife of 43 years.

