Statistical Thinking and the Birth of Modern Computing

John von Neumann and the IAS computer, 1945

What do fighter pilots, casinos, and streetlights have in common? All three were subjects of statistical thinking that led to (and benefitted from) the development of modern computing. This process is described in Turing’s Cathedral by George Dyson, from which most of the quotes below are drawn. Dyson’s book focuses on Alan Turing far less than the title would suggest, in favor of John von Neumann’s work at the Institute for Advanced Study (IAS). Von Neumann and the IAS computing team are well known for building the foundation of the digital world, but before reading Turing’s Cathedral I was unaware of the deep connection with statistics.

Statistical thinking first pops up in the book with Julian Bigelow’s list of fourteen “Maxims for Ideal Prognosticators,” dated December 2, 1941, for predicting aircraft flight paths. Here is a subset (p. 112):

7. Never estimate what may be accurately computed.

8. Never guess what may be estimated.

9. Never guess blindly.

This early focus on estimation will reappear in a moment, but for now let’s focus on the aircraft prediction problem. With the advent of radar it became possible to fly sorties at night or in weather with poor visibility. In a dark French sky or over a foggy Belgian city it could be tough to tell who was who until,

otherwise adversarial forces agreed on a system of coded signals identifying their aircraft as friend or foe. In contrast to the work of wartime cryptographers, whose job was to design codes that were as difficult to understand as possible, the goal of IFF [Identification Friend or Foe] was to develop codes that were as difficult to misunderstand as possible…. We owe the existence of high-speed digital computers to pilots who preferred to be shot down intentionally by their enemies rather than accidentally by their friends. (p. 116)

In statistics this is known as the distinction between Type I and Type II errors, which we have discussed before. Pilots flying near their own lines likely figured there was a greater probability that their own forces would make a mistake than that the enemy would detect them–and going down as a result of friendly fire is no one’s idea of fun. This emergence of a cooperative norm in the midst of combat is consistent with stories from other conflicts in which the idea of fairness is used to compensate for the rapid progress of weapons technology.
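To make the two error types concrete, here is a minimal sketch in Python. It assumes the null hypothesis is “the aircraft is friendly,” so a Type I error is firing on a friend and a Type II error is holding fire on a foe. The function name and the eight encounters are made up for illustration; none of this comes from Dyson’s book.

```python
def error_rates(truth, decisions):
    """Return (type_i_rate, type_ii_rate) for a list of encounters.

    Assumption: the null hypothesis is "friend", so
      Type I  = firing on a friend (false alarm)
      Type II = holding fire on a foe (miss)
    """
    friend_decisions = [d for t, d in zip(truth, decisions) if t == "friend"]
    foe_decisions = [d for t, d in zip(truth, decisions) if t == "foe"]
    type_i = friend_decisions.count("fire") / len(friend_decisions)
    type_ii = foe_decisions.count("hold") / len(foe_decisions)
    return type_i, type_ii


# Hypothetical encounters: four friends followed by four foes.
truth = ["friend"] * 4 + ["foe"] * 4
decisions = ["hold", "hold", "hold", "fire", "fire", "fire", "hold", "fire"]

type_i, type_ii = error_rates(truth, decisions)
print(f"Type I (friendly fire) rate: {type_i:.2f}")
print(f"Type II (missed foe) rate:   {type_ii:.2f}")
```

An IFF code that is hard to misunderstand is, in these terms, a way of pushing the Type I rate down without simply telling gunners to hold fire more often, which would push the Type II rate up.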

Chapter 10 of the book (one of my two favorites along with Chapter 9, Cyclogenesis) is entitled Monte Carlo. Statistical practitioners today use this method to simulate statistical distributions that are analytically intractable. Dyson weaves the development of Monte Carlo in with a recounting of how von Neumann and his second wife Klari fell in love in the city of the same name. A full description of this method is beyond the scope of this post, but here is a useful bit:

Monte Carlo originated as a form of emergency first aid, in answer to the question: What to do until the mathematician arrives? “The idea was to try out thousands of such possibilities and, at each stage, to select by chance, by means of a ‘random number’ with suitable probability, the fate or kind of event, to follow it in a line, so to speak, instead of considering all branches,” [Stan] Ulam explained. “After examining the possible histories of only a few thousand, one will have a good sample and an approximate answer to the problem.”

For a more comprehensive overview of this development in the context of Bayesian statistics, check out The Theory That Would Not Die.
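Ulam’s description of following one random history at a time can be sketched in a few lines of Python. This is a toy of my own, not the IAS group’s actual calculation: a particle starts at the near edge of a one-dimensional slab, and at each stage a random number decides whether it is absorbed or steps left or right. The slab thickness, absorption probability, and function names are all assumptions chosen for illustration.

```python
import random


def simulate_history(rng, thickness=5, p_absorb=0.1, max_steps=10_000):
    """Follow one particle; return True if it escapes through the far side."""
    position = 0
    for _ in range(max_steps):
        if rng.random() < p_absorb:
            return False  # absorbed inside the slab
        position += 1 if rng.random() < 0.5 else -1
        if position < 0:
            return False  # wandered back out the near side
        if position >= thickness:
            return True  # made it through
    return False  # treat a very long history as not escaping


def estimate_escape_probability(n_histories=5_000, seed=0, **slab):
    """Average over many random histories instead of enumerating all branches."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    escapes = sum(simulate_history(rng, **slab) for _ in range(n_histories))
    return escapes / n_histories


print(estimate_escape_probability())
```

A few thousand histories give only an approximate answer, as Ulam says, but the estimate tightens as the number of histories grows, and no closed-form solution is ever needed.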

The third and final piece of the puzzle for our post today is the well-known but not sufficiently appreciated distinction between correlation and causation. Philip Thompson, a meteorologist who joined the IAS group in 1946, learned this lesson at the age of 4 and counted it as the beginning of his “scientific education”:

[H]is father, a geneticist at the University of Illinois, sent him to post a letter in a mailbox down the street. “It was dark, and the streetlights were just turning on,” he remembers. “I tried to put the letter in the slot, and it wouldn’t go in. I noticed simultaneously that there was a streetlight that was flickering in a very peculiar, rather scary, way.” He ran home and announced that he had been unable to mail the letter “because the streetlight was making funny lights.”

Thompson’s father seized upon this teachable moment, walked his son back to the mailbox and “pointed out in no uncertain terms that because two unusual events occurred at the same time and at the same place it did not mean that there was any real connection between them.” Thus the four-year-old learned a lesson that many practicing scientists still have not. This is also the topic of Chapter 8 of How to Lie with Statistics and a recent graph shared by Cory Doctorow.

The fact that these three lessons on statistical thinking coincided with the advent of digital computing, along with a number of other anecdotes in the book, impressed upon me the deep connection between these two fields of thought. Most contemporary Bayesian work would be impossible without computers. It is also possible that digital computing would have come about much differently without an understanding of probability and the scientific method.

The Politics of WiFi Names

Shorter than a tweet at only 32 characters but having a lifespan of years and visible to all who pass within about 30m, your SSID – the name of your WiFi router – may have more reach than your twitter account.

The lifespan of a tweet is about one hour. After that it sinks into oblivion. Estimates of the average number of twitter followers range from 26 to 127. A single tweet will reach 50 or so people. If you live in a city your neighbours, your guests, the guests of your neighbours, even the pizza guy all have a good chance of seeing your wifi router. If you live near a busy street, even better!

Are WiFi names being chosen to make points? Catching the bus home in Buenos Aires recently, I was idly trying to find an open hotspot when I noticed the names of 3 politicians appear as SSIDs within just a few blocks – Moreno, Cristina and Nestor. Each of these could well be a coincidence, but taken together amongst all the linksys, FT7898, Motorola SSIDs they stood out as tokens of political allegiance. I decided to look deeper into the data: are WiFi SSIDs being used to fly political colours?

Not mine

That’s from a post at OpenSignalMaps that’s worth reading in full. As the name implies, they have some really cool maps. The conclusion of their study (from Buenos Aires):

What does seem clear is that, of the routers clearly showing political sentiment, the sentiments are mostly positive with 114 supporting either Cristina, Nestor or Peron and 4 against. Another 580 we’ve tentatively put down as positive. Though CFK enjoys support from many quarters (and won a 54% majority recently) we would expect more routers to be named unfavourably. In Argentina, it seems people are more comfortable broadcasting positive sentiments via WiFi names.

[via Hacker News]

Wednesday Nerd Fun: Build Your Own Turing Machine

I sent this around to a few folks last week, but thought I would share it here as well. If you are not quite nerdy enough to know what a Turing Machine is (hint–you have already used one), check out the Wikipedia page on it.
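If you would rather see the idea in code before looking at the hardware, here is a minimal sketch of a Turing machine simulator in Python. The rule table, state names, and the choice of binary increment as the example program are my own assumptions for illustration; they have nothing to do with how Mike’s machine is programmed.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "L" or "R". The machine stops in the state "halt".
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Example program: add 1 to a binary number written on the tape.
INCREMENT = {
    ("start", "0"): ("0", "R", "start"),  # scan right to the end of the number
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),  # step back onto the last digit
    ("carry", "1"): ("0", "L", "carry"),  # 1 + 1 = 0, carry the 1
    ("carry", "0"): ("1", "L", "halt"),   # absorb the carry and stop
    ("carry", "_"): ("1", "L", "halt"),   # carry past the leftmost digit
}

print(run_turing_machine(INCREMENT, "1011"))  # 11 + 1 = 12, i.e. 1100
```

Everything the machine does is in the rule table: read a symbol, write a symbol, move one cell, change state. That is the whole concept, which is part of what makes it so striking to see built in hardware.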

The version of the machine shown below was built by Mike Davey, who offers this description of himself and the project:

I live in northeast Wisconsin and love to build things. I’ve always liked to make things, take things apart, and see how stuff works. I’ve made all sorts of things, from a CNC router to a Gingery metal lathe, from a greenhouse to furniture; it doesn’t really matter, I find it all enjoyable. I’m also fortunate that, while they may not always understand what I’m building, my family has always been supportive.

The Turing machine came about from a long interest in the history of computers. It’s amazing how groundbreaking computer concepts that were developed during the 40’s and 50’s are now often taken for granted. Something that today seems as basic as the flip-flop or a stack were hard-won ideas in their day. The Turing machine is that type of concept; although it seems almost trivial today, it is still conceptually so powerful.

While thinking about Turing machines I found that no one had ever actually built one, at least not one that looked like Turing’s original concept (if someone does know of one, please let me know). There have been a few other physical Turing machines like the Lego of Doom, but none were immediately recognizable as Turing machines. As I am always looking for a new challenge, I set out to build what you see here.

Here is Mike’s Turing Machine:

And in true WNF spirit, here’s the Lego of Doom project that Mike mentions, set to the theme of the A-Team: