The Talk That Wasn’t That Great — Bayes rules and all that
When the facts change, we change our opinions. Or at least we should, rational people would agree. But how much should we change our opinions, exactly? A precise mathematical answer can be obtained by applying Bayes’ rule. It tells you how different your belief in something should be, given what it was before and any new evidence. It tells you how to quantify your estimates and educated guesses. This of course is unbelievably useful if you have to make decisions and you don’t have all the information, so you’re faced with some uncertainty. Or, to say the exact same thing in fewer words, it’s really useful for making decisions. Most decisions are made, as mathematicians would say, “under uncertainty” — it’s extremely rare that anyone has all the relevant information on hand. That’s why Bayes’ rule is applied in so many domains, from military planning and cryptanalysis to scientific hypothesis testing to the dictation software and the spam filter on your computer. It will help you answer questions like: what are the likely effects of my decision? What are the likely causes of the situation I see in front of me? What probably happened while I wasn’t looking? Who really sent this message, a spam bot or a dear friend? Did you just say “recognize speech” or “wreck a nice beach”?
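For the curious, the rule itself fits in one line. Here’s a minimal sketch applying it to the spam question, with every number made up purely for illustration:

```python
# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E),
# where P(E) = P(E|H) * P(H) + P(E|~H) * P(~H).

def posterior(prior, likelihood, likelihood_given_not):
    """Probability of hypothesis H after seeing evidence E."""
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 80% of your mail is spam (the prior), and the
# word "lottery" shows up in 20% of spam but only 0.1% of real mail.
p_spam = posterior(prior=0.8, likelihood=0.20, likelihood_given_not=0.001)
print(round(p_spam, 4))  # → 0.9988
```

A message containing “lottery” goes from 80% likely spam to essentially certain spam — that’s the belief update the rule quantifies.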
That much (minus speech recognition) was covered by Sharon McGrayne at tonight’s Town Hall talk promoting her new book The Theory That Would Not Die about the history, controversy and applications of Bayes’ rule. And my prior belief was that it would be an awesome talk, based on my general love of Bayes’ rule, and you know, its history, and the (former) controversy, and its many many many (did I say this before?) applications.
Unfortunately, like a good rational person I must revise my belief based on the incontrovertible evidence I gathered while attending the talk tonight. I would now rate it as kind of okay, and probably wouldn’t recommend it. Here’s why.
Remember, the whole point of reasoning like a Bayesian (someone who applies Bayes’ rule) is that you take into account the evidence, yes, but also your prior beliefs. It’s those beliefs you’re constantly updating, essentially putting a number on how likely you think they are given new evidence. Clearly then, the belief you end up with depends very much not only on the new information, but also on where you started: what mathematicians would call your “prior”. If you remove that part about incorporating priors into how you revise your beliefs, then there’s no controversy. It’s the same as doing regular (“frequentist”) statistics — you just calculate the share of the evidence that’s in front of you, pro and con, and base your belief on those ratios. And that is much less powerful when it comes to decision-making, because you would be ignoring all manner of information that could be encoded in your prior (textbook and common-sense knowledge, gut feelings, centuries of research that came before your experiment, the reliability of sources, etc.). That which makes it powerful also makes it controversial: how come you get to put a number on something as subjective as a gut feeling? How come your result means anything objectively? You’d think that someone who wrote a book about Bayes’ rule would mention priors and their importance a little bit earlier than the 45th minute of the talk.
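To see why the prior matters so much, run the same piece of evidence through two different starting beliefs. Again, the numbers are invented for illustration: imagine a diagnostic test that’s right 90% of the time, given to a skeptic (who thinks the condition is rare) and to a believer (who thinks it’s a coin flip):

```python
# Same Bayes' rule helper as before.
def posterior(prior, likelihood, likelihood_given_not):
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# One positive test result (90% true-positive, 10% false-positive, assumed),
# fed into two different priors:
skeptic = posterior(prior=0.01, likelihood=0.9, likelihood_given_not=0.1)
believer = posterior(prior=0.5, likelihood=0.9, likelihood_given_not=0.1)
print(round(skeptic, 3), round(believer, 3))  # → 0.083 0.9
```

Identical evidence, wildly different conclusions: the skeptic lands at about 8% while the believer lands at 90%. That gap is exactly what the frequentists found objectionable, and exactly where the Bayesian’s extra information lives.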
Instead, McGrayne focused on the history of the development of the theorem, and how for two centuries people fought about its acceptability. That much was quite illuminating. Nobody ever said that it was wrong. The thing is a one-liner that can be proved from the general axioms of probability in a whopping three lines. What statisticians argued about was not its truth but its use. And whether it was a “proper” subject of study.
And McGrayne did a good job of poking fun at the academicians, while detailing all the long-time uses of Bayesian statistics by practitioners with the right resources and high stakes in getting the right answer, including of course the military, but also paternity lawyers, insurance agents, and my personal hero, founder of computer science and gay martyr Alan Turing who single-handedly won WWII*. It’s completely un-ironic that the ultimate success of Bayesian statistics in the last twenty years has only been made possible by the explosion of both computing power and clever computational algorithms that can do the required tedious calculations in situations with much uncertainty about a great many interconnected variables.
But even as she talked about that final triumphant chapter in Bayes’ rule’s long history, somehow McGrayne managed not to say the important parts. She off-handedly threw around technical terms like Gibbs sampling, and explained with coy humor where that name came from, rather than telling us what it is and why it’s so useful. In the space of five minutes she mentioned Markov chains, Monte Carlo methods, sampling and graphical models, but I don’t think anybody could follow what those things are, let alone why any of them matter.
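For what it’s worth, the core idea behind those Monte Carlo methods fits in a dozen lines. Here’s a toy Metropolis sampler (a close relative of Gibbs sampling, and the simplest flavor of Markov chain Monte Carlo), written from scratch for illustration rather than taken from anything McGrayne presented:

```python
import math
import random

# Target: a bell-curve density we only know up to a constant. In real
# Bayesian problems this is the posterior, which is usually impossible
# to work out exactly -- but easy to evaluate at any single point.
def target(x):
    return math.exp(-x * x / 2)

def metropolis(n_samples, step=1.0, seed=42):
    """Random-walk Metropolis: a Markov chain whose visits, in the long
    run, are distributed according to the target density."""
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n_samples):
        proposal = x + rng.uniform(-step, step)
        # Always accept moves toward higher density; accept downhill
        # moves with probability equal to the density ratio.
        if rng.random() < target(proposal) / target(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(50_000)
print("chain mean:", round(sum(samples) / len(samples), 3))
```

The point is that you never need to integrate anything: the chain wanders around, spending time in each region in proportion to how probable it is, and averaging its visits approximates the answer. That’s the “tedious calculation” the computing explosion made feasible.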
Which is too bad. Perhaps the book, in addition to relating the historical anecdotes surrounding this simple, elegant and useful theorem, will actually do it all justice. But I’m no longer willing to bet on it.
If you’re not completely afraid of math, a great introduction to Bayes’ rule can be found at betterexplained.com. It’s explained from the point of view of somebody using statistics to analyze experiments in fields like medicine.
*Not intended to be a factual statement.