Thursday, October 01, 2009

Problems with the New Barrymore Awards: Part II

In my first article on the problems with the new Barrymore Awards voting process, I pointed out how the new system’s assignment of voters enabled clustering of awards around certain productions to a degree unseen in seasons past.

Here, I will show how the new process itself cannot fulfill its stated goal of recognizing the best performance, design element, or production over an entire season. And I would say that this holds true even if everything I wrote in the first article proves false.

First things first: I realize that the Barrymores do not, in name, designate the “best” anything (e.g. director), but instead give awards signifying “outstanding” sound design, “outstanding” performance by a leading actress, etc. But this circumlocution merely equivocates on a term.

Under the old system of voting, only one performance or design element received the top number of votes from the judges, and similarly, the new system yields a “highest score” from the voters. In each process, someone is (or will be) collectively regarded as the “best” of the season. Of course, people can always pretend otherwise.

However, I would argue that only the old system could legitimately recognize the best performances and design elements of a season. By contrast, the new process cannot even convey a standard of excellence, let alone reward the most outstanding anything of the season.

Who Decides and How?

This year’s new system of voting sent eight randomly assigned voters (out of 62) to see each show, with each voter seeing 12 to 20 shows over the season. Their instructions encouraged them to treat each show on its own merits and to score each performance or design element on a scale of 0 to 100, with rough-and-ready categories (like “poor: 0-20”) guiding their scores.
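To see why scores gathered this way resist comparison, consider a back-of-the-envelope simulation. The 62 voters, the eight-person panels, and the 0-100 scale come from the process just described; everything else (the number of shows, each voter’s personal leniency, the amount of noise) is my own illustrative assumption, sketched in Python rather than asserted as fact.

```python
import random

random.seed(1)

NUM_VOTERS = 62          # voter pool described above
VOTERS_PER_SHOW = 8      # randomly assigned to each production
NUM_SHOWS = 40           # hypothetical season size (my assumption)

# Hypothetical "true" quality of each show on the 0-100 scale (assumption).
true_quality = {s: random.uniform(50, 90) for s in range(NUM_SHOWS)}

# Each voter has a personal leniency (an offset on the 0-100 scale), since the
# rough categories give no shared, observer-independent anchor (assumption).
leniency = {v: random.gauss(0, 10) for v in range(NUM_VOTERS)}

def score(voter, show):
    # One voter's score: underlying quality, shifted by that voter's
    # leniency, plus a little show-to-show noise.
    return true_quality[show] + leniency[voter] + random.gauss(0, 5)

# The new process: average the scores of the eight randomly assigned voters.
panel_average = {}
for show in range(NUM_SHOWS):
    panel = random.sample(range(NUM_VOTERS), VOTERS_PER_SHOW)
    panel_average[show] = sum(score(v, show) for v in panel) / VOTERS_PER_SHOW

print("Best show by underlying quality:", max(true_quality, key=true_quality.get))
print("Winner under the new process:   ", max(panel_average, key=panel_average.get))
```

Across repeated runs of this toy model, the show with the highest panel average is often not the show with the highest underlying quality, simply because of which eight voters happened to be assigned to which production.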

The judging of figure skating in the Olympics attempts something similar, assigning point values to each performer considered individually. But there, the judges possess pre-determined objective criteria (difficulty of routine, number of specific movements performed) that form part of their scoring.

However, because theatre lacks any such observer-independent objective criteria, the new Barrymore system more closely resembles trying to determine the fastest runner by taking each competitor in isolation, letting a handful of people watch him run, and then selecting a different batch of observers to evaluate the next sprinter. Imagine this process without a stopwatch and you understand how they determined this year’s awards.

As such, this quantified system can only encourage thinking about excellence; without a frame of reference or any cross-comparison, it cannot possibly measure it adequately. Like obscenity, we must trust the voters to just “know it when they see it.”

How the old system of judges solved this problem

When it comes to art, this might be the best any of us can do, and the judges of the old system operated similarly. However, unlike the judges, the voters do not see every eligible show, which, in a qualitative analysis, is the only thing that could give them a frame of reference to properly vote for the “most outstanding X of an entire season.” Instead, they cast a once-and-done fixed vote that they cannot later rescind or alter.

The old system of judges who had seen every eligible production could, no matter how flawed otherwise, at least introduce a frame of reference for cross-comparison. Yes, they also lacked “objective criteria,” but unlike the rotating batches of observers watching isolated runners, the judges at least possessed the advantage of seeing and evaluating every show. At the end of the year, after marshalling a continually refined set of theatre-evaluating experiences, they could then confidently cast a vote for excellence.

But now, the new system has transferred the power of the judges to an even smaller group while losing the one advantage the judges conferred: cross-comparison. Even assuming bias on the part of all judges, the fact that they had seen every eligible show still gave the old system a level of quality control that the new process lacks.
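Here is the same toy model adjusted for a stylized version of the old process; again, only a sketch under my own assumptions (the number of judges falls within the 10-to-17 range mentioned below, and the season size and per-judge biases are invented for illustration). Because every judge scores every show, a judge’s personal bias shifts all of that judge’s scores equally and therefore cancels out of their own ranking.

```python
import random

random.seed(1)

NUM_JUDGES = 14          # within the 10-to-17 range mentioned below (assumption)
NUM_SHOWS = 40           # hypothetical season size (my assumption)

# Same toy assumptions as before: hypothetical show quality and per-judge bias.
true_quality = {s: random.uniform(50, 90) for s in range(NUM_SHOWS)}
bias = {j: random.gauss(0, 10) for j in range(NUM_JUDGES)}

def score(judge, show):
    return true_quality[show] + bias[judge] + random.gauss(0, 5)

# A stylized old process: every judge sees every show, then votes for the show
# they personally scored highest. The judge's bias shifts all of their scores
# equally, so it drops out of their own ranking.
votes = {}
for judge in range(NUM_JUDGES):
    top_pick = max(range(NUM_SHOWS), key=lambda s: score(judge, s))
    votes[top_pick] = votes.get(top_pick, 0) + 1

print("Best show by underlying quality:", max(true_quality, key=true_quality.get))
print("Winner by the judges' votes:    ", max(votes, key=votes.get))
```

In this sketch the judges’ collective vote tends to land on, or very near, the show with the highest underlying quality, not because the judges are any less biased, but because each judge’s bias applies to every show equally and so cannot tilt their own ranking.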

A sports analogy clarifies the problem

So rather than 10 to 17 judges deciding all the awards after a period of reflection, this season the first (and isolated) impressions of eight individuals decided each and every award. And because of the random distribution of the voters, no two decisions were even made by the same group of voters.

To borrow another analogy from sports, the new process resembles letting one set of judges decide the gold medal, a second set the silver, and a third set the bronze. Whoever thought that spreading the responsibility for each award (though never for the awards as a whole) across new random groups actually increased the rigor and integrity of the Barrymore process needs to take a course in qualitative analysis.

In order to rank something as “the most outstanding X” of the year, one needs a large sample, not of voters each seeing isolated shows, but of the total number of shows seen by each voter.

By contrast, pretending that the voters should only treat a show on its own merits means asking them to ignore every single show or theatre experience any of them has ever had. But each voter can only know excellence through past exposure to it. And since no one can ever ignore the totality of their experience when making judgments about excellence, why wouldn’t Silvante want to buttress the system’s ability to truly reward it by ensuring that every person who votes on the awards possesses the same theatre-going experiences that season?

Qualitative analysis versus quantifiable metrics

Qualitative notions like “best” and “outstanding” must involve a comparison. But eliminating the group of judges who could make those comparisons removed any possibility of the new system rendering such judgments. At best, the new awards can only stipulate which performance, production, or design element earned the highest score from a randomly assigned group of voters who never again voted on another production as a unit. Perhaps they should change the name of each award from “Outstanding Actor” to “Highest Voted-Upon Performance,” a meaningless moniker to signify a process that could not otherwise ensure that it rewarded excellence.

Stay tuned for Part III in this series, where I discuss the potential for using quantitative analysis to judge art.
