A couple years ago, the Mariners’ brilliant pitcher Felix Hernandez had another brilliant season: first in the AL in ERA, second in the AL in strikeouts, first in innings pitched. And he went 13-12, because he was on a team with a historically bad offense that tended to be even worse when he was pitching. Roy Halladay, his NL peer, was asked if he deserved to win the Cy Young, and he said something interesting:
“It’s tough,” Halladay said. “Obviously Felix’s numbers are very, very impressive. But I think, ultimately, you look at how guys are able to win games. Sometimes the run support isn’t there, but you sometimes just find ways to win games. I think the guys that are winning and helping their teams deserve a strong look, regardless of how good Felix’s numbers are. It definitely could go either way; it’s going to be interesting. But I think when teams bring guys over, they want them to, ultimately at the end of the day, help them win games.”
Halladay, like Hernandez, is one of the best pitchers in baseball, and like any good pitcher, would know the frustrations of pitching well and not getting a win out of it (or pitching poorly and getting bailed out by his team). Yet Halladay, closer to the game than any journalist or stat geek could ever be, went with this odd mystical conceit that a pitcher can “sometimes just find ways to win games” beyond, you know, pitching better than anyone else in baseball.
I don’t think Halladay is dumb. Good pitchers aren’t, and Halladay is particularly crafty. I suspect that belief is what he needs to get by: the confidence that even when he’s getting raked, he can do something magic to win. Many great athletes have an ability to be completely oblivious to reality when they need to. Sometimes it works; sometimes their coaches have to step in. And it’s a reminder of how distorting that perspective can be, particularly when your job depends on it.
I was reminded of this when the weird backlash against FiveThirtyEight and Nate Silver began; I was reminded again when Silver’s model proved impressively accurate last night. It’s why I like sports—they’re a safe space to learn and popularize things like probability and the intelligent use of data. The Cy Young is relatively trivial, possible contract bonuses aside, an award to recognize what’s generally obvious. I’m really not joking when I say that fantasy baseball taught me a lot about how to deal with data—and a lot of that came from Baseball Prospectus, where Silver’s legend began.
Last week he was made fun of for being a wizard; today, he’s celebrated as one. But Daniel Engber has some worthwhile notes of caution, one of which looks like an error I made in assessing my home state:
But Montana is the most telling case: According to Silver’s polling average, the Democratic candidate, Jon Tester, led by 1.4. But Silver’s model, which uses “state fundamentals” among other factors, guessed the polls were wrong, and gave his opponent Denny Rehberg a 66 percent chance of winning. So far as I can tell, this was the only contest in which the magic model went against the averaged polls. Guess what? The projections might have been a little off: Tester has a 5-point lead.
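To see how a model can overrule a narrow polling lead, here is a toy calculation. It is emphatically not Silver’s model, which isn’t public. It simply assumes a normally distributed polling error and a weighted average of the polls with a “fundamentals” estimate. The only number taken from the passage above is Tester’s 1.4-point polling lead; the fundamentals margin, the weight, and the error size are hypothetical values picked for illustration.

```python
import math


def win_probability(margin, sigma):
    """Chance the true margin is above zero, assuming normal error.

    margin: expected lead in percentage points (positive = ahead).
    sigma:  assumed standard deviation of the forecast error, in points.
    """
    return 0.5 * (1.0 + math.erf(margin / (sigma * math.sqrt(2.0))))


def blended_margin(poll_margin, fundamentals_margin, poll_weight):
    """Weighted average of the polling margin and a fundamentals margin."""
    return poll_weight * poll_margin + (1.0 - poll_weight) * fundamentals_margin


# From the quoted passage: Tester led the polling average by 1.4 points.
POLL_MARGIN = 1.4

# Hypothetical assumptions, chosen only to illustrate the mechanism.
FUNDAMENTALS_MARGIN = -6.0  # fundamentals favoring Rehberg by 6 points
POLL_WEIGHT = 0.6           # how much the blend trusts the polls
SIGMA = 4.0                 # assumed forecast error, in points

polls_only = win_probability(POLL_MARGIN, SIGMA)
blended = blended_margin(POLL_MARGIN, FUNDAMENTALS_MARGIN, POLL_WEIGHT)
with_fundamentals = win_probability(blended, SIGMA)

print(f"Tester, polls only:        {polls_only:.0%}")
print(f"Tester, with fundamentals: {with_fundamentals:.0%}")
```

Under these made-up inputs the polls alone make Tester a modest favorite, while the blend makes him an underdog. The point is only that a small polling lead leaves plenty of room for other inputs to flip the call, and for the call to be wrong.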
In Virginia, I had my own little vague idea of “state fundamentals”; Silver tries to quantify it, but there’s still the problem of translating qualitative evaluations into quantitative comparisons: “An alternative forecast of the outcome that avoids polls and instead looks at the partisan environment of a state, public fundraising totals, statistical measures of left-right ideology and candidate quality, and other quantifiable factors.”
Note that: “statistical measures of left-right ideology.” Yesterday I posted about some research done by Boris Schor of the University of Chicago that uses such statistical measures, and as an academic, he goes deeply into his methodology. At the heart of it are real people doing research on hundreds of candidates, making judgements about their ideology and about what the spectrum of left-right ideology even is: a knotty qualitative judgement that endless books have been written about, boiled down to a graph. Silver isn’t averse to this kind of work, though his model is a black box; in The Signal and the Noise, he praises the work of the Cook Political Report, which uses fine-grained candidate interviews as part of their methodology:
Wasserman’s knowledge of the nooks and crannies of political geography can make him seem like a local, and Kapanke [a candidate] was happy to talk shop about the intricacies of his district—just how many voters he needed to win in La Crosse to make up for the ones he’d lose in Eau Claire. But he stumbled over a series of questions on allegations that he had used contributions from lobbyists to buy a new set of lights for the Loggers’ ballpark.
The Cook Report didn’t upgrade Kapanke’s rating, and he lost a race similar candidates won in similar districts. The Cook Report relies a lot on analysis, but they supplement it with, you know, journalism.
Which FiveThirtyEight does, too. Here’s a post on Virginia, based on a conversation with two polisci experts from the state; here’s one with Jon Ralston, the Rich Miller of Nevada; Pennsylvania; and so forth.
All the heat about Silver’s presidential predictions really misses a lot about the FiveThirtyEight project (which isn’t just Silver); they’re trying to do what the pundits who question them do, using some methods that are different, and some that really aren’t. The projections themselves use some difficult math, run through FiveThirtyEight’s secret sauce, but in and of themselves they don’t really explain anything about why the polling shows what it does, or why public fundraising totals (clearly, in the Citizens United era, a more difficult measure) are what they are. They’re excellent tools to explain things, but obviously Silver and his crew aren’t interested in the raw numbers alone, or they wouldn’t be talking to people with a better grasp of the underlying structures that lead to the numbers FiveThirtyEight bends to their model.
And it’s a model that journalists should have a lot of respect for, but without rushing, after the election, to the other extreme, treating it as a magic oracle. Not all his projections came in, and that’s okay—when things break, there’s a reason, and it’s something to learn from. It’s not magic, it’s logic.
And there are people who know more than Silver. That Sarah Jessica Parker dinner that so many people made fun of? That was literally calculated:
For the general public, there was no way to know that the idea for the Parker contest had come from a data-mining discovery about some supporters: affection for contests, small dinners and celebrity. But from the beginning, campaign manager Jim Messina had promised a totally different, metric-driven kind of campaign in which politics was the goal but political instincts might not be the means. “We are going to measure every single thing in this campaign,” he said after taking the job. He hired an analytics department five times as large as that of the 2008 operation, with an official “chief scientist” for the Chicago headquarters named Rayid Ghani, who in a previous life crunched huge data sets to, among other things, maximize the efficiency of supermarket sales promotions.
FiveThirtyEight can do a lot; I don’t think it could predict that “affection for contests, small dinners, and celebrity” would zero in on the former star of Sex and the City as a money tree for the Obama campaign. Silver’s small team is incredibly impressive, but you get what you pay for, and that kind of data (and data analysis) scales. Thanks to Silver and people like him, journalism made a big leap in its understanding, analysis, and presentation of data (which the New York Times also did an extraordinary job with). But the campaigns made a bigger leap yet. While some journalists are going to have to adjust to the FiveThirtyEight era, the next challenge for those who are already there will be catching up to the campaigns.