# Two Unusual Papers on Monte Carlo Simulation

For Bayesian inference, Markov Chain Monte Carlo (MCMC) methods were a huge breakthrough. These methods provide a principled way of simulating from a posterior probability distribution, and are useful for integrating over distributions that are analytically intractable. Usually MCMC methods are run on computers, but I recently read two papers that apply Monte Carlo simulation in interesting ways.

The first is Markov Chain Monte Carlo with People. MCMC with people is somewhat similar to playing the game of telephone–there is input “data” (think of the starting word in the telephone game) that is transmitted across stages where it can be modified and then output at the end. In the paper the authors construct a task so that human learners approximately follow an MCMC acceptance rule. I have summarized the paper in slightly more detail here.
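For readers who have not seen an acceptance rule written down, here is a minimal sketch of a random-walk Metropolis sampler in Python. It is an illustration of the general idea only, not the procedure from the paper (whose human task may correspond to a different acceptance rule). The names `metropolis` and `log_standard_normal`, the step size, and the standard normal target are all assumptions chosen for the example.

```python
import math
import random


def metropolis(log_target, start, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target (illustrative)."""
    rng = random.Random(seed)
    current = start
    samples = []
    for _ in range(n_steps):
        # Propose a nearby point by adding Gaussian noise to the current state.
        proposal = current + rng.gauss(0.0, step_size)
        # Accept with probability min(1, target(proposal) / target(current)).
        log_ratio = log_target(proposal) - log_target(current)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        # A rejected proposal repeats the current state in the chain.
        samples.append(current)
    return samples


def log_standard_normal(x):
    # Unnormalized log density; the normalizing constant cancels in the ratio.
    return -0.5 * x * x


# Hypothetical usage: draw from a standard normal and summarize the chain.
draws = metropolis(log_standard_normal, start=0.0, n_steps=20000)
mean = sum(draws) / len(draws)
```

The point of the sketch is that the sampler only ever needs to compare two states at a time, which is what makes it plausible to ask a person, rather than a computer, to make that comparison.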

The second paper is even less conventional: the authors approximate the value of π using a “Mossberg 500 pump-action shotgun as the proposal distribution.” Their simulated value is 3.131, within 0.33% of the true value. As the authors state, “this represents the first attempt at estimating π using such method, thus opening up new perspectives towards computing mathematical constants using everyday tools.” Who said statistics has to be boring?
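The textbook version of this experiment is easy to sketch in Python. Note that this is not the paper's procedure: it swaps the shotgun pellets for uniform pseudo-random points in the unit square and counts the fraction that land inside a quarter circle. The function name `estimate_pi` and the sample size are assumptions made for illustration.

```python
import math
import random


def estimate_pi(n_points, seed=0):
    """Estimate pi from the share of uniform points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        # The quarter circle of radius 1 covers pi/4 of the unit square.
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_points


# Hypothetical usage: compare the estimate with math.pi.
approx = estimate_pi(200_000)
relative_error = abs(approx - math.pi) / math.pi
```

With uniform points the error shrinks roughly with the square root of the number of samples, so a few hundred thousand points are typically enough to beat the shotgun.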

# Statistical Thinking and the Birth of Modern Computing

John von Neumann and the IAS computer, 1945

What do fighter pilots, casinos, and streetlights all have in common? These three disparate topics are all the subject of statistical thinking that led to (and benefitted from) the development of modern computing. This process is described in Turing’s Cathedral by George Dyson, from which most of the quotes below are drawn. Dyson’s book focuses on Alan Turing far less than the title would suggest, in favor of John von Neumann’s work at the Institute for Advanced Study (IAS). Von Neumann and the IAS computing team are well-known for building the foundation of the digital world, but before Turing’s Cathedral I was unaware of the deep connection with statistics.

Statistical thinking first pops up in the book with Julian Bigelow’s list of fourteen “Maxims for Ideal Prognosticators” for predicting aircraft flight paths, dated December 2, 1941. Here is a subset (p. 112):

7. Never estimate what may be accurately computed.

8. Never guess what may be estimated.

9. Never guess blindly.

This early focus on estimation will reappear in a moment, but for now let’s focus on the aircraft prediction problem. With the advent of radar it became possible to fly sorties at night or in weather with poor visibility. In a dark French sky or over a foggy Belgian city it could be tough to tell who was who until,

otherwise adversarial forces agreed on a system of coded signals identifying their aircraft as friend or foe. In contrast to the work of wartime cryptographers, whose job was to design codes that were as difficult to understand as possible, the goal of IFF [Identification Friend or Foe] was to develop codes that were as difficult to misunderstand as possible…. We owe the existence of high-speed digital computer to pilots who preferred to be shot down intentionally by their enemies rather than accidentally by their friends. (p. 116)

In statistics this is known as the distinction between Type I and Type II errors, which we have discussed before. Pilots flying near their own lines likely figured there was a greater probability that their own forces would make a mistake than that the enemy would detect them–and going down as a result of friendly fire is no one’s idea of fun. This emergence of a cooperative norm in the midst of combat is consistent with stories from other conflicts in which the idea of fairness is used to compensate for the rapid progress of weapons technology.

Chapter 10 of the book (one of my two favorites along with Chapter 9, Cyclogenesis) is entitled Monte Carlo. Statistical practitioners today use this method to simulate statistical distributions that are analytically intractable. Dyson weaves the development of Monte Carlo in with a recounting of how von Neumann and his second wife Klari fell in love in the city of the same name. A full description of this method is beyond the scope of this post, but here is a useful bit:

Monte Carlo originated as a form of emergency first aid, in answer to the question: What to do until the mathematician arrives? “The idea was to try out thousand of such possibilities and, at each stage, to select by chance, by means of a ‘random number’ with suitable probability, the fate or kind of event, to follow it in a line, so to speak, instead of considering all branches,” [Stan] Ulam explained. “After examining the possible histories of only a few thousand, one will have a good sample and an approximate answer to the problem.”

For a more comprehensive overview of this development in the context of Bayesian statistics, check out The Theory That Would Not Die.

The third and final piece of the puzzle for our post today is the well-known but not sufficiently appreciated distinction between correlation and causation. Philip Thompson, a meteorologist who joined the IAS group in 1946, learned this lesson at the age of 4 and counted it as the beginning of his “scientific education”:

[H]is father, a geneticist at the University of Illinois, sent him to post a letter in a mailbox down the street. “It was dark, and the streetlights were just turning on,” he remembers. “I tried to put the letter in the slot, and it wouldn’t go in. I noticed simultaneously that there was a streetlight that was flickering in a very peculiar, rather scary, way.” He ran home and announced that he had been unable to mail the letter “because the streetlight was making funny lights.”

Thompson’s father seized upon this teachable moment, walked his son back to the mailbox and “pointed out in no uncertain terms that because two unusual events occurred at the same time and at the same place it did not mean that there was any real connection between them.” Thus the four-year-old learned a lesson that many practicing scientists still have not. This is also the topic of Chapter 8 of How to Lie with Statistics and a recent graph shared by Cory Doctorow.

The fact that these three lessons on statistical thinking coincided with the advent of digital computing, along with a number of other anecdotes in the book, impressed upon me the deep connection between these two fields of thought. Most contemporary Bayesian work would be impossible without computers. It is also possible that digital computing would have come about much differently without an understanding of probability and the scientific method.

# More Error in Art: Fake Rothkos

Value of Bert Cooper’s Rothko, 1960s and Today

We have talked before about error in art, and the two types of errors that can occur–false positives and false negatives. At the time, we discussed this problem from the perspective of scholars and museums. Today, we consider the problem from the perspective of buyers. How much should you pay for a famous painting such as a Rothko?

In statistics and decision sciences, to answer this question we must decide on a “loss function”–what do you lose if you over- or under-estimate the true value? For an art buyer, undervaluing the painting is actually an advantage. Overvaluing it, on the other hand, can lead to significant losses, so we would penalize that more in our loss function.

Consider this example from Jay Livingston:

Maybe Shamus can find some solace in a Times story that ran the same day about a Manhattan art gallery that had been selling expensive forgeries.  I know that in art, quality and value are two very different things.  Still, I had to stop and wonder when I read about

Domenico and Eleanore De Sole, who in 2004 paid \$8.3 million for a painting attributed to Mark Rothko that they now say is a worthless fake. One day a painting is worth \$8.3 million; the next day, the same painting – same quality, same capacity to give aesthetic pleasure or do whatever it is that art does – is “worthless.”*  Art forgery also makes me wonder about the buyer’s motive.  If the buyer wanted only to have and to gaze upon something beautiful, something with artistic merit, then a fake Rothko is no different than a real Rothko.  It seems more likely that what the buyer wants is to own something valuable – i.e., something that costs a lot. Displaying your brokerage account statements is just too crude and obvious.  What the high-end art market offers is a kind of money laundering. Objects that are rare and therefore expensive, like a real Rothko, transform money into something more acceptable – personal qualities like good taste, refinement, and sophistication.

Andrew Gelman responds:

I’m in sympathy with Livingston’s general point—I too am happy to mock people who happen to have more money than I do—and Rothko’s art has always seemed pretty pointless to me. I mean, sure, it can look fine on the wall, but it hardly seems like something special to me.

But I think Livingston’s going too far, in that he’s forgetting the natural human desire not to get ripped off.

If stories of art rip-offs become more common, we can expect the price paid for (even real) paintings to go down. This will not be because their underlying value–aesthetically, as a piggy bank, or as a way of showing off–has decreased, but because buyers have adjusted their loss function to account for the probability of forgery.
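To make the loss-function argument concrete, here is a rough Python sketch. The \$8.3 million figure comes from the De Sole story above; everything else (the penalty weights, the forgery probabilities, the candidate bids, and the names `asymmetric_loss` and `best_bid`) is a hypothetical assumption chosen only to illustrate how a buyer who penalizes overpaying more heavily would react to a higher chance of forgery.

```python
def asymmetric_loss(bid, true_value, over_penalty=3.0, under_penalty=1.0):
    """Linear loss that penalizes overvaluing more than undervaluing (assumed weights)."""
    error = bid - true_value
    if error > 0:
        return over_penalty * error
    return under_penalty * -error


def best_bid(possible_values, probabilities, candidate_bids):
    """Pick the candidate bid with the smallest expected loss."""
    def expected_loss(bid):
        return sum(
            p * asymmetric_loss(bid, value)
            for value, p in zip(possible_values, probabilities)
        )
    return min(candidate_bids, key=expected_loss)


# Hypothetical scenario, in millions of dollars: the painting is either a
# worthless fake (0.0) or a real Rothko (8.3).
values = [0.0, 8.3]
bids = [0.0, 2.0, 4.0, 6.0, 8.3]

bid_when_fakes_are_rare = best_bid(values, [0.1, 0.9], bids)
bid_when_fakes_are_common = best_bid(values, [0.3, 0.7], bids)
```

Under these made-up numbers, raising the forgery probability pushes the loss-minimizing bid down even though the value of a genuine painting is unchanged, which is the mechanism described above.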

# What Are the Chances Your Vote Will Matter?

Only one vote matters. In the United States, the vote that gives a presidential candidate the majority in the state that tips the electoral college decides it all. Nevertheless, about 122 million US voters went to the polls for the 2008 Presidential election.

If the only benefit you get from voting is your candidate winning, this behavior is totally irrational. Voters spend precious time and effort traveling to the polls or arranging for mail-in ballots, with very small odds that this will make any difference in the final outcome. Of course, the simplest explanation is that this argument is wrong and voting can be rational, but you could also say that voting is self-expression.

In a recent paper (gated), Douglas VanDerwerken takes a slightly different approach. He estimates a one in 2.5 million chance that his vote will matter this year, given that he lives in North Carolina (a competitive state in 2008, and likely in 2012 too).* But then he points out that, “Even if your vote does not have an effect on the election, it can certainly have an effect on you.” His broader message is that:

Statistics is not divorced from subjectivity, nor from morality. What you decide depends on your moral axioms.

We can use statistics to inform our objective calculations, and our subjective intuitions, but decision-making is not a “plug and chug” process. In summarizing data, the statistician makes important decisions about how to abstract away from reality and what message to send. When that information serves as an input for further decision-making–which always involves trade-offs–the statistician bears some responsibility for the outcome. Once again we are reminded that statistics is a rhetorical practice. (See also here and here.)
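The purely instrumental calculation that this post argues is incomplete can be sketched in a few lines of Python. The one-in-2.5-million probability is VanDerwerken's estimate quoted above; the dollar figures for the cost of voting and the personal benefit of a win are hypothetical placeholders, as are the function names.

```python
import math


def expected_net_benefit(p_decisive, benefit_if_decisive, cost_of_voting):
    """Expected payoff if the only benefit of voting is changing the outcome."""
    return p_decisive * benefit_if_decisive - cost_of_voting


def break_even_benefit(p_decisive, cost_of_voting):
    """Benefit a decisive vote must deliver for voting to break even."""
    return cost_of_voting / p_decisive


p = 1 / 2_500_000  # estimate quoted in the post

# Hypothetical values: a $10 cost of getting to the polls and a $1,000
# personal benefit if your candidate wins.
net = expected_net_benefit(p, benefit_if_decisive=1_000.0, cost_of_voting=10.0)
needed = break_even_benefit(p, cost_of_voting=10.0)
```

On these assumptions the expected payoff is negative unless the win is worth tens of millions of dollars to you personally, which is why the effect of voting on the voter, and not only on the election, has to enter the calculation.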

________________________________

*Full disclosure: Doug teaches the lab section of a Duke statistics course in which I am currently enrolled.

# Gelman’s Five Essential Books on American Elections

The Browser today has an interview with Andrew Gelman, one of the best-informed researchers on American elections (and other things). His selections are a bit eclectic, but readers of this blog might find them interesting. The one that I am most likely to read is The 480, a novel about political consultants.

# PyCon 2012 Video Round-Up

The videos from PyCon 2012 are posted. Here are the ones I plan to watch, along with their summaries:

Checking Mathematical Proofs Written in TeX

ProofCheck is a set of Python scripts which parse and check mathematics written using TeX. Its homepage is http://www.proofcheck.org. Unlike computer proof assistants which require immersion in the equivalent of a programming language, ProofCheck attempts to handle mathematical language formalized according to the author’s preferences as much as possible.

Sketching a Better Product

If writing is a means for organizing your thoughts, then sketching is a means for organizing your thoughts visually. Just as good writing requires drafts, good design requires sketches: low-investment, low-resolution braindumps. Learn how to use ugly sketching to iterate your way to a better product.

Bayesian Statistics Made (as) Simple (as Possible)

This tutorial is an introduction to Bayesian statistics using Python. My goal is to help participants understand the concepts and solve real problems. We will use material from my (nb: Allen Downey’s) book, Think Stats: Probability and Statistics for Programmers (O’Reilly Media).

SQL for Python Developers

Relational databases are often the bread-and-butter of large-scale data storage, yet they are often poorly understood by Python programmers. Organizations even split programmers into SQL and front-end teams, each of which jealously guards its turf. These tutorials will take what you already know about Python programming, and advance into a new realm: SQL programming and database design.

Web scraping: Reliably and efficiently pull data from pages that don’t expect it

Exciting information is trapped in web pages and behind HTML forms. In this tutorial, you’ll learn how to parse those pages and when to apply advanced techniques that make scraping faster and more stable. We’ll cover parallel downloading with Twisted, gevent, and others; analyzing sites behind SSL; driving JavaScript-y sites with Selenium; and evading common anti-scraping techniques.

Some of it may be above my head at this stage, but I think it’s great that the Python community makes all of these resources available.