Two Unusual Papers on Monte Carlo Simulation

For Bayesian inference, Markov Chain Monte Carlo (MCMC) methods were a huge breakthrough. These methods provide a principled way to simulate from a posterior probability distribution, and are useful for integrating over distributions that are analytically intractable. MCMC is usually run on computers, but I recently read two papers that apply Monte Carlo simulation in more unusual ways.

The first is Markov Chain Monte Carlo with People. MCMC with people is somewhat similar to the game of telephone: there is input “data” (think of the starting word in the telephone game) that is transmitted across stages, where it can be modified, and then output at the end. In the paper, the authors construct a task so that human learners approximately follow an MCMC acceptance rule. I have summarized the paper in slightly more detail here.
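For readers who have not seen an acceptance rule in code, here is a minimal sketch of a random-walk Metropolis sampler. It is the ordinary computer version, not the human task from the paper, and the target (a standard normal), the step size, and the function names are all assumptions made for illustration.

```python
import math
import random


def metropolis(log_density, start, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    current = start
    samples = []
    for _ in range(n_steps):
        # Propose a nearby state from a symmetric Gaussian random walk.
        proposal = current + rng.gauss(0.0, step_size)
        # Accept with probability min(1, p(proposal) / p(current)).
        log_ratio = log_density(proposal) - log_density(current)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        # A rejected proposal repeats the current state in the chain.
        samples.append(current)
    return samples


def standard_normal_log_density(x):
    # Unnormalized log density; the normalizing constant cancels in the ratio.
    return -0.5 * x * x


samples = metropolis(standard_normal_log_density, start=0.0, n_steps=20000)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
```

With enough steps the sample mean and variance should land close to 0 and 1, the moments of the assumed target.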

The second paper is even less conventional: the authors approximate the value of π using a “Mossberg 500 pump-action shotgun as the proposal distribution.” Their simulated value is 3.131, within 0.33% of the true value. As the authors state, “this represents the first attempt at estimating π using such method, thus opening up new perspectives towards computing mathematical constants using everyday tools.” Who said statistics has to be boring?
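For comparison, the textbook way to estimate π by simulation is sketched below. It scatters uniform pseudo-random points over the unit square instead of shotgun pellets, so it is a simpler stand-in rather than the authors' procedure; the function name, seed, and sample size are my own choices.

```python
import math
import random


def estimate_pi(n_points, seed=0):
    """Estimate pi from the share of uniform points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area divided by unit-square area equals pi / 4.
    return 4.0 * inside / n_points


estimate = estimate_pi(100_000)

# Relative error of the shotgun estimate reported in the paper (3.131),
# expressed as a percentage of the true value.
reported_error_pct = abs(3.131 - math.pi) / math.pi * 100
```

The last line recomputes the error of the paper's reported value, which comes out at roughly a third of a percent.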

 

Schneier on Data and Power

Data and Power is the tentative title of a new book, forthcoming from Bruce Schneier. Here’s more from the post describing the topic of the book:

Corporations are collecting vast dossiers on our activities on- and off-line — initially to personalize marketing efforts, but increasingly to control their customer relationships. Governments are using surveillance, censorship, and propaganda — both to protect us from harm and to protect their own power. Distributed groups — socially motivated hackers, political dissidents, criminals, communities of interest — are using the Internet to both organize and effect change. And we as individuals are becoming both more powerful and less powerful. We can’t evade surveillance, but we can post videos of police atrocities online, bypassing censors and informing the world. How long we’ll still have those capabilities is unclear….

There’s a fundamental trade-off we need to make as society. Our data is enormously valuable in aggregate, yet it’s incredibly personal. The powerful will continue to demand aggregate data, yet we have to protect its intimate details. Balancing those two conflicting values is difficult, whether it’s medical data, location data, Internet search data, or telephone metadata. But balancing them is what society needs to do, and is almost certainly the fundamental issue of the Information Age.

There’s more at the link, including several other potential titles. The topic will likely interest many readers of this blog. The book will probably build on his ideas about inequality and online feudalism, discussed here.

Who says North is “up”?

There are several childhood lessons that I trace back to dinners at Outback Steakhouse: the deliciousness of cheese fries, the inconvenience of being in the middle of a wraparound booth, and the historical contingency of North as “up” on maps.

Who started using the NESW arrangement that is virtually omnipresent on maps today? Was it because civilization as we now know it developed in the Northern hemisphere? (Incidentally, that’s why clocks run clockwise: a sundial in the Southern hemisphere goes the other way around.)

That doesn’t appear to be the case, according to Nick Danforth, who recently took on this question at Al Jazeera America (via Flowing Data):

There is nothing inevitable or intrinsically correct — not in geographic, cartographic or even philosophical terms — about the north being represented as up, because up on a map is a human construction, not a natural one. Some of the very earliest Egyptian maps show the south as up, presumably equating the Nile’s northward flow with the force of gravity. And there was a long stretch in the medieval era when most European maps were drawn with the east on the top. If there was any doubt about this move’s religious significance, they eliminated it with their maps’ pious illustrations, whether of Adam and Eve or Christ enthroned. In the same period, Arab map makers often drew maps with the south facing up, possibly because this was how the Chinese did it.

So who started putting North up top? According to Danforth, that was Ptolemy:

[He] was a Hellenic cartographer from Egypt whose work in the second century A.D. laid out a systematic approach to mapping the world, complete with intersecting lines of longitude and latitude on a half-eaten-doughnut-shaped projection that reflected the curvature of the earth. The cartographers who made the first big, beautiful maps of the entire world, Old and New (men like Gerardus Mercator, Henricus Martellus Germanus and Martin Waldseemuller) were obsessed with Ptolemy. They turned out copies of Ptolemy’s Geography on the newly invented printing press, put his portrait in the corners of their maps and used his writings to fill in places they had never been, even as their own discoveries were revealing the limitations of his work.

Ptolemy probably had his reasons, but they are lost to history. As Danforth concludes, “The orientation of our maps, like so many other features of the modern world, arose from the interplay of chance, technology and politics in a way that defies our desire to impose easy or satisfying narratives.” Yet another example of a micro-institution that rules our world.

Github for Government

What happens when you combine open source software, open data, and open government? For the city of Munich, the switch to open source software has been a big success:

In one of the premier open source software deployments in Europe, the city migrated from Windows NT to LiMux, its own Linux distribution. LiMux incorporates a fully open source desktop infrastructure. The city also decided to use the Open Document Format (ODF) as a standard, instead of proprietary options.

As of November last year, the city saved more than €11.7 million because of the switch. More recent figures were not immediately available, but cost savings were not the only goal of the operation. It was also done to be less dependent on manufacturers, product cycles and proprietary OSes, the council said.

We’ve talked before about how more city governments could follow the open data, open government initiatives of NYC, using tech to benefit citizens rather than (only) creating initiatives to attract tech companies to the area. This shift in emphasis, toward harnessing the power of technology for widespread gains in happiness, is likely to become even more important following recent protests against tech employees in the Bay Area.

Open data and open government will take the principles of open source and use them to make an even bigger social and political impact. One tool from open source that can be adapted for use by these newer movements is Github. We will continue to follow these trends here, and if you are interested you can also check out Github and Government for more success stories.

Uncle Bob on Public Policy and Software Professionalism

Software developers need to develop their own professional standards, or politicians will do it for them. That’s what “Uncle” Bob Martin argues in this interview, starting at about 28:00:

Healthcare.gov was awful. That’s a case where a software failure interfered with a public policy. Whether you agree with that policy or not that should scare the hell out of you, because the next public policy may be one much more important and if our software can’t cope with it we could be in a really deep, deep hole.

At some point or another, some software team is going to screw up so badly that there is a disaster of tremendous loss of life. At that point the politicians of the world will decide they have to do something about it. If we are not there with a set of minimum standards that we follow, practices that we follow, if we can’t convince those politicians that we have been behaving professionally and that this was an accident–if we can’t convince them that we weren’t being negligent–then they’ll have no choice but to regulate us. They’ll pass laws about which languages we use, what platforms we can program on, what books we have to read, and so on. It will not be a good outcome. I don’t want to be a civil servant.

The Economy That Is Stanford

Five of the six most-visited websites in the world are here, in ranked order: Facebook, Google, YouTube (which Google owns), Yahoo! and Wikipedia. (Number five is a Chinese-language site.) If corporations founded by Stanford alumni were to form an independent nation, it would be the tenth largest economy in the world, with an annual revenue of $2.7 trillion, as some professors at that university recently calculated. Another new report says: ‘If the internet was a country, its gross domestic product would eclipse all others but four within four years.’

That’s from this London Review of Books piece by Rebecca Solnit. The October 2012 research report behind the claim, which relies on survey data, is here. Solnit’s piece is interesting throughout, including a discussion of parallels and differences between the tech boom and the Gold Rush.

Two Great Talks on Government and Technology

If you are getting ready to travel next week, you might want to have a couple of good talks/podcasts handy for the trip. Here are two that I enjoyed, on the topic of government and technology.

The first is about how technology can help governments. Ben Orenstein of “Giant Robots Smashing Into Other Giant Robots” discusses Code for America with Catherine Bracy. Catherine recounts some ups and downs of CfA’s partnerships with cities throughout America and internationally. CfA fellows commit a year to help local governments with challenges amenable to technology. One great example that the podcast discusses is a tool for parents in Boston to see which schools they could send their kids to when the city switched from location-based school assignment to allowing students to attend schools throughout the city. (Incidentally, the school matching algorithm that Boston used was designed by some professors in economics at Duke, who drew on work for which Roth and Shapley won the Nobel Prize.)
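The matching work mentioned above is usually associated with deferred acceptance, so here is a minimal sketch of the student-proposing version. It is not Boston's production system: the student names, school names, priorities, and capacities are all hypothetical, and the sketch assumes every school ranks every student who might apply to it.

```python
def deferred_acceptance(student_prefs, school_prefs, capacities):
    """Student-proposing deferred acceptance; returns {school: [students]}."""
    # rank[school][student] = position of the student in the school's priority list.
    rank = {
        school: {student: i for i, student in enumerate(order)}
        for school, order in school_prefs.items()
    }
    next_choice = {student: 0 for student in student_prefs}
    held = {school: [] for school in school_prefs}
    unmatched = list(student_prefs)
    while unmatched:
        student = unmatched.pop()
        prefs = student_prefs[student]
        if next_choice[student] >= len(prefs):
            continue  # list exhausted: the student stays unassigned
        school = prefs[next_choice[student]]
        next_choice[student] += 1
        # The school tentatively holds its highest-priority applicants so far.
        held[school].append(student)
        held[school].sort(key=lambda s: rank[school][s])
        if len(held[school]) > capacities[school]:
            # Over capacity: release the lowest-priority applicant to try again.
            unmatched.append(held[school].pop())
    return held


# Hypothetical example: three students, two schools, three seats in total.
student_prefs = {
    "ana": ["north", "south"],
    "ben": ["north", "south"],
    "cam": ["south", "north"],
}
school_prefs = {
    "north": ["ben", "ana", "cam"],
    "south": ["ana", "cam", "ben"],
}
capacities = {"north": 1, "south": 2}
assignment = deferred_acceptance(student_prefs, school_prefs, capacities)
```

Acceptances are only tentative until the loop ends, which is what "deferred" refers to: a student held by a school can still be displaced by a later applicant with higher priority.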

The second talk offers another point of view on techno-politics: when government abuses technology. Steve Klabnik’s “No Secrets Allowed” talk from Golden Gate Ruby Conference discusses recent revelations regarding the NSA and privacy. In particular, he explains why “I have nothing to hide” is not an appropriate response. The talk is not entirely hopeless, and includes recommendations such as using Tor. The Ruby Rogues also had a roundtable discussing Klabnik’s presentation, which you can find here.

Other recommendations are welcome.

The Economics of Movie Popcorn

The Smithsonian’s Food & Think blog recounts a long history of movie theaters’ objections to popcorn. They wanted to be as classy as live theaters. Nickelodeons lacked the ventilation required for popcorn machines. Moreover, crunchy snacks would have been unwelcome during silent films.

But moviegoers still wanted their popcorn, and street vendors met their demand. This led to signs asking patrons to check their coats and their corn at the theater entrance.

Eventually, movie theater owners realized that if they cut out the middleman, their profits would skyrocket. For many theaters, the transition to selling snacks helped save them from the crippling Depression. In the mid-1930s, the movie theater business started to go under. “But those that began serving popcorn and other snacks,” Smith explains, “survived.” Take, for example, a Dallas movie theater chain that installed popcorn machines in 80 theaters, but refused to install machines in their five best theaters, which they considered too high class to sell popcorn. In two years, the theaters with popcorn saw their profits soar; the five theaters without popcorn watched their profits go into the red.

Much more here, including how movie theater demand changed the types of popcorn that are grown.


Popcorn and other concessions are important to theaters because a large percentage of ticket sales (especially during the first couple of weeks after a movie premieres) goes to the studio. Recent figures I’ve seen are that concession sales are 80-90 percent profit, whereas in the opening weekend only about 20 percent of the ticket price goes to the theater. This means that concessions can make up nearly half the profit for a theater. No wonder they try to keep viewers from bringing in their own refreshments.
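To see how those percentages can add up to "nearly half," here is a back-of-the-envelope sketch. The ticket price and the average snack spend per patron are hypothetical numbers I picked for illustration; only the 20 percent ticket share and the 80-90 percent margin come from the figures above. It also treats the theater's cut of the ticket as pure profit, which ignores operating costs.

```python
def concession_profit_share(ticket_price, theater_ticket_share,
                            concession_spend, concession_margin):
    """Fraction of a theater's per-patron profit that comes from concessions."""
    ticket_profit = ticket_price * theater_ticket_share
    concession_profit = concession_spend * concession_margin
    return concession_profit / (ticket_profit + concession_profit)


# Per-patron figures; the dollar amounts are hypothetical.
share = concession_profit_share(
    ticket_price=10.00,         # hypothetical ticket price in dollars
    theater_ticket_share=0.20,  # opening-weekend share quoted above
    concession_spend=2.20,      # hypothetical average snack spend in dollars
    concession_margin=0.85,     # midpoint of the 80-90 percent range quoted above
)
```

Under these assumptions a patron who spends far less on snacks than on the ticket still contributes almost as much profit at the concession stand as at the box office.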

Small bags of popcorn have now turned into buckets, perhaps in an effort to justify charging $8-10 rather than the nickel such snacks sold for when “talkies” were new. This transition is covered in the book Why Popcorn Costs So Much at the Movies and an interview with the author is here.

 

Visualizing Political Unrest in Egypt, Syria, and Turkey

The lab of Michael D. Ward et al. now has a blog. The inaugural post describes some of the lab’s ongoing projects that may come up in future entries, including modeling of protests, insurgencies, and rebellions; event prediction (such as IED explosions); and machine learning techniques.

The second post compares two event data sets–GDELT and ICEWS–using recent political unrest in the Middle East as a focal point (more here):

We looked at protest events in Egypt and Turkey in 2011 and 2012 for both data sets, and we also looked at fighting in Syria over the same period…. What did we learn from these limited comparisons? First, we found out firsthand what the GDELT community has been saying: the GDELT data are in BETA and currently have a lot of false positives. This is not optimal for a decision making aid such as ICEWS, in which drill-down to the specific events resulting in new predictions is a requirement. Second, no one has a good ground truth for event data, though we have some ideas on this and are working on a study to implement them. Third, geolocation is a boon. GDELT seems especially good at this, even with a lot of false positives.

The visualization, which I worked on as part of the lab, can be found here. It relies on CartoDB to serve data from GDELT and ICEWS, with some preprocessing done using MySQL and R. The front end is JavaScript, using a combination of d3 for timelines and Torque for maps.


GDELT (green) and ICEWS (blue) records of protests in Egypt and Turkey and conflict in Syria

If you have questions about the visualizations or the technology behind them, feel free to mention them here or on the lab blog.

Visualizing the BART Labor Dispute

Labor disputes are complicated, and the BART situation is no different. Negotiations resumed this week after the cooling-off period called for by the governor of California as a result of the July strikes.

To help get up to speed, check out the data visualizations made by the Bay Area d3 User Group in conjunction with the UC Berkeley VUDLab. They have a roundup of news articles, open data, and open source code, as well as links to all the authors’ Twitter profiles.

The infographics address several key questions relevant to the debate, including how much BART employees earn, who rides BART and where, and the cost of living for BART employees.


More here.