Who says North is “up”?

There are several childhood lessons that I trace back to dinners at Outback Steakhouse: the deliciousness of cheese fries, the inconvenience of being in the middle of a wraparound booth, and the historical contingency of North as “up” on maps.

Who started using the NESW arrangement that is virtually omnipresent on maps today? Was it due to the fact that civilization as we now know it developed in the Northern hemisphere? (Incidentally, that’s why clocks run clockwise–a sundial in the Southern hemisphere goes the other way around.)

That doesn’t appear to be the case according to Nick Danforth, who recently took on this question at Al Jazeera America (via Flowing Data):

There is nothing inevitable or intrinsically correct — not in geographic, cartographic or even philosophical terms — about the north being represented as up, because up on a map is a human construction, not a natural one. Some of the very earliest Egyptian maps show the south as up, presumably equating the Nile’s northward flow with the force of gravity. And there was a long stretch in the medieval era when most European maps were drawn with the east on the top. If there was any doubt about this move’s religious significance, they eliminated it with their maps’ pious illustrations, whether of Adam and Eve or Christ enthroned. In the same period, Arab map makers often drew maps with the south facing up, possibly because this was how the Chinese did it.

So who started putting North up top? According to Danforth, that was Ptolemy:

[He] was a Hellenic cartographer from Egypt whose work in the second century A.D. laid out a systematic approach to mapping the world, complete with intersecting lines of longitude and latitude on a half-eaten-doughnut-shaped projection that reflected the curvature of the earth. The cartographers who made the first big, beautiful maps of the entire world, Old and New — men like Gerardus Mercator, Henricus Martellus Germanus and Martin Waldseemuller — were obsessed with Ptolemy. They turned out copies of Ptolemy’s Geography on the newly invented printing press, put his portrait in the corners of their maps and used his writings to fill in places they had never been, even as their own discoveries were revealing the limitations of his work.

Ptolemy probably had his reasons, but they are lost to history. As Danforth concludes, “The orientation of our maps, like so many other features of the modern world, arose from the interplay of chance, technology and politics in a way that defies our desire to impose easy or satisfying narratives.” Yet another example of a micro-institution that rules our world.

Visualizing the Indian Buffet Process with Shiny

(This is a somewhat more technical post than usual. If you just want the gist, skip to the visualization.)

N customers enter an Indian buffet restaurant, one after another. It has a seemingly endless array of dishes. The first customer fills her plate with a Poisson(α) number of dishes. Each successive customer i tastes the previously sampled dishes in proportion to their popularity (the number of previous customers who have sampled the kth dish, m_k, divided by i). The ith customer then samples a Poisson(α/i) number of new dishes.

That’s the basic idea behind the Indian Buffet Process (IBP). On Monday Eli Bingham and I gave a presentation on the IBP in our machine learning seminar at Duke, taught by Katherine Heller. The IBP is used in Bayesian non-parametrics to put a prior on (exchangeability classes of) binary matrices. The matrices usually represent the presence of features (“dishes” above, or the columns of the matrix) in objects (“customers,” or the rows of the matrix). The culinary metaphor is used by analogy to the Chinese Restaurant Process.
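The buffet story translates almost line for line into code. Here is a minimal sketch in Python, not the code behind the Shiny app: the function names sample_poisson and sample_ibp are my own, new dishes are drawn at rate α/i as in the standard formulation, and the Poisson sampler is a simple textbook method that assumes small rates.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's method (fine for small rates)."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(n_customers, alpha, seed=None):
    """Return one IBP draw as a list of 0/1 rows (customers) by columns (dishes)."""
    rng = random.Random(seed)
    dish_counts = []  # dish_counts[k] is m_k, how many customers have taken dish k
    rows = []
    for i in range(1, n_customers + 1):
        # Revisit each existing dish with probability m_k / i.
        row = [1 if rng.random() < m_k / i else 0 for m_k in dish_counts]
        for k, took in enumerate(row):
            dish_counts[k] += took
        # Then try a Poisson(alpha / i) number of brand-new dishes.
        n_new = sample_poisson(alpha / i, rng)
        row.extend([1] * n_new)
        dish_counts.extend([1] * n_new)
        rows.append(row)
    # Pad earlier rows with zeros so the matrix is rectangular.
    n_dishes = len(dish_counts)
    return [row + [0] * (n_dishes - len(row)) for row in rows]


matrix = sample_ibp(n_customers=10, alpha=10.0, seed=1)
for row in matrix:
    print("".join("#" if cell else "." for cell in row))
```

Each printed row is a customer and each column a dish, so the picture has the same layout as the binary matrices described above. The first customer has no existing dishes to revisit, so her row is just the Poisson(α) draw.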

Although the visualizations in the main paper summarizing the IBP are good, I thought it would be helpful to have an interactive visualization where you could change α and N to see what a random matrix with those parameters looks like. For this I used Shiny, although it would also be fun to do in d3.

One realization of the IBP, with α=10.

In the example above, the first customer (top row) sampled seven dishes. The second customer sampled four of those seven dishes, and then four more dishes that the first customer did not try. The process continues for all 10 customers. (Note that this matrix is not sorted into its left-ordered form. The app also sometimes gives an error if α << N, but I wanted users to be able to choose arbitrary values of N so I have not changed this yet.) You can play with the visualization yourself here.

Interactive online visualizations like this can be a helpful teaching tool, and building them can also improve your own understanding of the underlying model. If you would like to make another visualization of the IBP (or another machine learning tool that lends itself to graphical representation) I would be happy to share it here. I plan to add the Chinese restaurant process and a Dirichlet process mixture of Gaussians soon. You can find more about creating Shiny apps here.

Constitutional Forks Revisited

Around this time last year, we discussed the idea of a constitutional “fork” that occurred with the founding of the Confederate States of America. That post briefly explains how forks work in open source software and how the Confederates used the US Constitution as the basis for their own, with deliberate and meaningful differences. Putting the two documents on Github allowed us to compare their differences visually and confirm our suspicions that many of them were related to issues of states’ rights and slavery.

Caleb McDaniel, a historian at Rice who undoubtedly has a much deeper and more thorough knowledge of the period, conducted a similar exercise and also posted his results on Github. He was faced with similar decisions of where to obtain the source text and which differences to retain as meaningful (for example, he left in section numbers where I did not). My method identifies 130 additions and 119 deletions when transitioning between the USA and CSA constitutions, whereas the stats for Caleb’s repo show 382 additions and 370 deletions.
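For readers who have not looked at diff stats before, counts like these come from comparing the two texts line by line. The sketch below uses Python's standard difflib module on made-up placeholder lines. It is not how either repository's numbers were produced (those come from Git), and the function name count_changes is my own.

```python
import difflib


def count_changes(old_lines, new_lines):
    """Count lines added and deleted when going from old_lines to new_lines."""
    additions = deletions = 0
    for line in difflib.ndiff(old_lines, new_lines):
        if line.startswith("+ "):
            additions += 1
        elif line.startswith("- "):
            deletions += 1
        # Lines starting with "  " are unchanged; "? " lines are only hints.
    return additions, deletions


# Placeholder text, not the actual constitutions.
old = ["clause one", "clause two", "clause three"]
new = ["clause one", "clause two, amended", "clause three", "clause four"]

additions, deletions = count_changes(old, new)
print(f"{additions} additions, {deletions} deletions")
```

Note that an edited line counts as one deletion plus one addition, which is why choices such as keeping or dropping section numbers can move the totals so much.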

What should we draw from these projects? In Caleb’s words:

My decisions make this project an interpretive act. You are welcome to inspect the changes more closely by looking at the commit histories for the individual Constitution files, which show the initial text as I got it from Avalon as well as the changes that I made.

You can take a look at both projects and conduct a difference-in-differences exploration of your own. More generally, these projects show the need for tools to visualize textual analyses, as well as the power of technology to enhance understanding of historical and political acts. Caleb’s readme file has great resources for learning more about this topic including the conversation that led him to this project, a New York Times interactive feature on the topic, and more.

Don’t Forget Your Forever Stamps

The price of a first-class US stamp is set to increase from 46 to 49 cents on January 26. In the spirit of Cosmo Kramer’s Michigan bottle redemption plan (see below), Allison Schrager and Ritchie King ran the numbers on whether it would be possible to profit from Forever Stamp arbitrage.

Could the scheme make money? Maybe–if you get the timing right and pay low interest on capital:

Assuming we sell all 10 million stamps for the bulk discount price of $0.475 each, our profit will be $150,000. Subtract out the $399 for the distributor database. Let’s also assume we spent the $3,500 for Check Stand Program plus, say, $300 to make the 100 displays for advertising in stores. That gives us $145,801.

If we do manage to shift the stamps in a month, the interest on our debt will be $29,000. That brings our profits to $116,801. Then we’ll return the equity to our shareholders, along with 50% of the profits.

That leaves us with the other 50%: $58,400.50. If you look at that as a profit on the $4.6 million initial outlay, it’s not very much: less than 1.3%. But remember, all that outlay was leveraged. So if you look at it as a return on our investment—$33.25 for shipping—it’s 175,541%.
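The arithmetic in the quote is easy to check. This short Python sketch uses only the figures quoted above plus the 46-cent purchase price. The one assumption is mine: the 175,541% figure appears to be net of the $33.25 shipping cost, which the quote does not spell out.

```python
from decimal import Decimal

stamps = 10_000_000
buy_price = Decimal("0.46")    # price per stamp before the increase
sell_price = Decimal("0.475")  # bulk discount resale price

gross_profit = stamps * (sell_price - buy_price)           # 150,000
fixed_costs = Decimal(399) + Decimal(3500) + Decimal(300)  # database, Check Stand Program, displays
interest = Decimal(29_000)                                 # one month of interest, as quoted

net_profit = gross_profit - fixed_costs - interest         # 116,801
our_share = net_profit / 2                                 # 58,400.50

outlay = stamps * buy_price                                # 4,600,000
shipping = Decimal("33.25")

return_on_outlay = 100 * our_share / outlay
# Assumption: the quoted percentage subtracts the shipping cost from the gain.
return_on_shipping = 100 * (our_share - shipping) / shipping

print(f"our share: ${our_share:,.2f}")
print(f"return on outlay: {return_on_outlay:.2f}%")
print(f"return on shipping: {return_on_shipping:,.0f}%")
```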

What Can We Learn from Games?

This holiday season I enjoyed giving, receiving, and playing several new card and board games with friends and family. These included classics such as cribbage, strategy games like Dominion and Power Grid, and the whimsical Munchkin.

Can video and board games teach us more than just strategy? What if games could teach us not to be better thinkers, but just to be… better? A while ago we discussed how Monopoly was originally designed as a learning experience to promote cooperation. Lately I have learned of two other such games in a growing genre and wanted to share them here.

The first is Depression Quest by Zoe Quinn (via Jeff Atwood):

Depression Quest is an interactive fiction game where you play as someone living with depression. You are given a series of everyday life events and have to attempt to manage your illness, relationships, job, and possible treatment. This game aims to show other sufferers of depression that they are not alone in their feelings, and to illustrate to people who may not understand the illness the depths of what it can do to people.

The second is Train by Brenda Romero (via Marcus Montano) described here with spoilers:

In the game, the players read typewritten instructions. The game board is a set of train tracks with box cars, sitting on top of a window pane with broken glass. There are little yellow pegs that represent people, and the player’s job is to efficiently load those people onto the trains. A typewriter sits on one side of the board.

The game takes anywhere from a minute to two hours to play, depending on when the players make a very important discovery. At some point, they turn over a card that has a destination for the train. It says Auschwitz. At that point, for anyone who knows their history, it dawns on the player that they have been loading Jews onto box cars so they can be shipped to a World War II concentration camp and be killed in the gas showers or burned in the ovens.

The key emotion that Romero said she wanted the player to feel was “complicity.”

“People blindly follow rules,” she said. “Will they blindly follow rules that come out of a Nazi typewriter?”

I have tried creating my own board games in the past, and this gives me renewed interest and a higher standard. What is the most thought-provoking moment you have experienced playing games?

Political Forecasting and the Use of Baseline Rates

As Joe Blitzstein likes to say, “Thinking conditionally is a condition for thinking.” Humans are not naturally good at this skill. Consider the following example: Kelly is interested in books and keeping things organized. She loves telling stories and attending book clubs. Is it more likely that Kelly is a bestselling novelist or an accountant?

Many of the “facts” about Kelly in that story might lead you to answer that she is a novelist. Only one–her sense of organization–might have pointed you toward an accountant. But think about the overall probability of each career. Very few bookworms become successful novelists, and there are many more accountants than (successful) authors in the modern workforce. Conditioning on the baseline rate helps make a more accurate decision.
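To see how much the baseline matters, here is a small worked example in Python. Every number in it is hypothetical and chosen only for illustration: 1,000 bestselling novelists against 1,000,000 accountants, with the description fitting 90 percent of novelists and only 5 percent of accountants. The function name posterior is also my own.

```python
from fractions import Fraction


def posterior(prior_a, likelihood_a, prior_b, likelihood_b):
    """P(A | evidence) when A and B are the only two candidates (Bayes' rule)."""
    weight_a = prior_a * likelihood_a
    weight_b = prior_b * likelihood_b
    return weight_a / (weight_a + weight_b)


# Hypothetical head counts and likelihoods, for illustration only.
novelists = 1_000
accountants = 1_000_000
total = novelists + accountants

p_novelist = posterior(
    prior_a=Fraction(novelists, total),
    likelihood_a=Fraction(9, 10),   # description fits 90% of novelists
    prior_b=Fraction(accountants, total),
    likelihood_b=Fraction(1, 20),   # description fits 5% of accountants
)
print(f"P(novelist | description) = {float(p_novelist):.3f}")
```

Even though the description fits novelists eighteen times better in this toy setup, the thousandfold difference in baseline rates leaves the probability of "novelist" under 2 percent.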

I make a similar point–this time applied to political forecasting–in a recent post on the blog of Mike Ward’s lab (of which I am a member):

One piece of advice that Good Judgment forecasters are often reminded of is to use the baseline rate of an event as a starting point for their forecast. For example, insurgencies are a very rare event on the whole. For the period January, 2001 to August, 2013, insurgencies occurred in less than 10 percent of country-months in the ICEWS data set.

From this baseline, we can then incorporate information about the specific countries at hand and their recent history… Mozambique has not experienced an insurgency for the entire period of the ICEWS dataset. On the other hand, Chad had an insurgency that ended in December, 2003, and another that extended from November, 2005, to April, 2010. For the duration of the ICEWS data set, Chad has experienced an insurgency 59 percent of the time. This suggests that our predicted probability of insurgency in Chad should be higher than for Mozambique.
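The 59 percent figure for Chad can be reproduced from the dates in the quote. This sketch counts country-months inclusively and assumes that the first insurgency was already under way in January 2001, when the period begins. The helper name months_inclusive is my own.

```python
def months_inclusive(start, end):
    """Number of calendar months from start to end, both given as (year, month)."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1]) + 1


# Period covered in the quote: January 2001 to August 2013.
total_months = months_inclusive((2001, 1), (2013, 8))

# Chad's two insurgency spells. The start of the first is an assumption:
# it is taken to be ongoing from the first month of the period.
spells = [((2001, 1), (2003, 12)), ((2005, 11), (2010, 4))]
insurgency_months = sum(months_inclusive(start, end) for start, end in spells)

chad_rate = insurgency_months / total_months
print(f"{insurgency_months} of {total_months} months = {chad_rate:.0%}")
```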

I started writing that post before rebels in Mozambique broke their treaty with the government. Maybe I spoke too soon, but the larger point is that baselines are the starting point–not the final product–of any successful forecast.

Having more data is useful, as long as it contributes more signal than noise. That’s what ICEWS aims to do, and I consider it a useful addition to the toolbox of forecasters participating in the Good Judgment Project. For more on this collaboration, as well as a map of insurgency rates around the globe as measured by ICEWS, see the aforementioned post here.

A Chrome Extension for XKCD Substitutions

This morning’s XKCD had some fun suggestions for replacing key phrases to make news articles more fun:

Regular readers may recall my Doublespeak Chrome extension, which works on the same principle. In short order, I was able to create a new app, XKCDSub, that works the same way: install the extension, and when you click its icon it will open your current page in a new tab with the phrases replaced. Here is an example of the extension in action on Elon Musk’s Wikipedia page:

[Image: the extension applied to Elon Musk’s Wikipedia page]

The code is open source on Github. You can find it in the Chrome Web Store here.
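The substitution step itself is simple. The extension is written in JavaScript, so the following Python sketch is not its code, only an illustration of the idea: it uses a handful of pairs from the comic, and the names SUBSTITUTIONS and xkcd_substitute are hypothetical.

```python
import re

# A few pairs from the comic; the full list is longer.
SUBSTITUTIONS = {
    "witnesses": "these dudes I know",
    "allegedly": "kinda probably",
    "senator": "elf-lord",
    "election": "eating contest",
    "electric": "atomic",
    "space": "spaaace",
    "car": "cat",
}


def xkcd_substitute(text, table=SUBSTITUTIONS):
    """Replace whole-word matches of each key, keeping a leading capital letter."""
    # Longest phrases first, so a longer key wins over a shorter one it contains.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(key) for key in keys) + r")\b",
        re.IGNORECASE,
    )

    def replace(match):
        original = match.group(0)
        replacement = table[original.lower()]
        if original[0].isupper():
            replacement = replacement[0].upper() + replacement[1:]
        return replacement

    return pattern.sub(replace, text)


print(xkcd_substitute("The senator allegedly bought an electric car."))
```

The word-boundary anchors keep "car" from matching inside words like "carpet", which is the main thing a naive find-and-replace gets wrong.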

The Economics of Movie Popcorn

The Smithsonian’s Food & Think blog recounts a long history of movie theaters’ objections to popcorn. They wanted to be as classy as live theaters. Nickelodeons didn’t have the ventilation required for popcorn machines. Moreover, crunchy snacks would have been unwelcome during silent films.

But moviegoers still wanted their popcorn, and street vendors met their demand. This led to signs asking patrons to check their coats and their corn at the theater entrance.

Eventually, movie theater owners realized that if they cut out the middleman, their profits would skyrocket.  For many theaters, the transition to selling snacks helped save them from the crippling Depression. In the mid-1930s, the movie theater business started to go under. “But those that began serving popcorn and other snacks,” Smith explains, “survived.” Take, for example, a Dallas movie theater chain that installed popcorn machines in 80 theaters, but refused to install machines in their five best theaters, which they considered too high class to sell popcorn. In two years, the theaters with popcorn saw their profits soar; the five theaters without popcorn watched their profits go into the red.

Much more here, including how movie theater demand changed the types of popcorn that are grown.

Popcorn and other concessions are important to theaters because a large percentage of ticket sales (especially during the first couple of weeks after a movie premieres) go to the studio. Recent figures I’ve seen are that concession sales are 80-90 percent profit, whereas in the opening weekend only about 20 percent of the sale price goes to the theater. This means that concessions can make up nearly half the profit for a theater–no wonder they try to keep viewers from bringing in their own refreshments.
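A rough back-of-the-envelope version of that claim, in Python. The ticket price and concession spending per customer are hypothetical numbers I picked for illustration; only the 20 percent ticket share and the 80-90 percent concession margin come from the figures above. For simplicity the sketch treats the theater's share of the ticket as profit and ignores all other costs.

```python
def concession_share(ticket_price, theater_cut, concession_spend, concession_margin):
    """Fraction of per-customer profit that comes from concessions."""
    ticket_profit = ticket_price * theater_cut
    concession_profit = concession_spend * concession_margin
    return concession_profit / (ticket_profit + concession_profit)


# Hypothetical customer: a $10 ticket and $2 spent at the concession stand.
share = concession_share(
    ticket_price=10.00,
    theater_cut=0.20,        # theater keeps about 20% on opening weekend
    concession_spend=2.00,
    concession_margin=0.85,  # midpoint of the 80-90% range
)
print(f"concessions: {share:.0%} of profit per customer")
```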

Small bags of popcorn have now turned into buckets, perhaps in an effort to justify charging $8-10 rather than the nickel such snacks sold for when “talkies” were new. This transition is covered in the book Why Popcorn Costs So Much at the Movies, and an interview with the author is here.

 

Visualizing Political Unrest in Egypt, Syria, and Turkey

The lab of Michael D. Ward et al. now has a blog. The inaugural post describes some of the lab’s ongoing projects that may come up in future entries, including modeling of protests, insurgencies, and rebellions, event prediction (such as IED explosions), and machine learning techniques.

The second post compares two event data sets–GDELT and ICEWS–using recent political unrest in the Middle East as a focal point (more here):

We looked at protest events in Egypt and Turkey in 2011 and 2012 for both data sets, and we also looked at fighting in Syria over the same period…. What did we learn from these, limited comparisons?  First, we found out first hand what the GDELT community has been saying: the GDELT data are in BETA and currently have a lot of false positives. This is not optimal for a decision making aid such as ICEWS, in which drill-down to the specific events resulting in new predictions is a requirement. Second, no one has a good ground truth for event data — though we have some ideas on this and are working on a study to implement them. Third, geolocation is a boon. GDELT seems especially good a this, even with a lot of false positives.

The visualization, which I worked on as part of the lab, can be found here. It relies on CartoDB to serve data from GDELT and ICEWS, with some preprocessing done using MySQL and R. The front-end is JavaScript, using a combination of d3 for timelines and Torque for maps.

GDELT (green) and ICEWS (blue) records of protests in Egypt and Turkey and conflict in Syria

If you have questions about the visualizations or the technology behind them, feel free to mention them here or on the lab blog.

Review: RubyMotion iOS Development Essentials

RubyMotion is a continuing topic of interest on this blog, and I will likely have more posts about it in the near future. At this stage I am still getting comfortable with iOS development, but I would much rather be doing it in the friendly playground of Ruby than in the Objective-C world. In addition to the RubyMotion book from PragProg, the next resource I would recommend is RubyMotion iOS Development Essentials.*

The book takes a “zero-to-deploy” approach, which is great for beginners trying to get their first app into the App Store. The first few chapters will be redundant for developers who have worked with RubyMotion before, but they provide a helpful introduction to how RM works and the Model-View-Controller paradigm.

For several chapters the book uses the same example application, a restaurant recommender reminiscent of Yelp. Demonstrating code by building up from a simple application is a nice way of presenting the material. By the time readers have worked through these chapters they will have an example app that is more interesting than many of the toy apps in shorter tutorials.

Later chapters will benefit novice and experienced developers alike, because they fill a gap in the RubyMotion literature. Many tutorials overlook the process of testing RM code, and testing iOS in general can be challenging. The testing chapter of this book goes over unit testing, functional testing, and tests that rely on device events such as gestures.

My favorite chapter in the book was chapter 6, which goes over device capabilities. At 46 pages this is the longest chapter in the book, covering GPS, gestures, Core Data, using the Address Book, and more. I especially enjoyed working through the section on accessing the camera and Photo Library. This is difficult to test on the simulator since there is (obviously) no access to a built-in camera (as with certain iOS devices including some iPod Touch models), but the example app covers how to handle this gracefully.

Stylistically, it can be a challenge to lay out a book that uses iOS API jargon like UIImagePickerControllerSourceTypePhotoLibrary. There were some gripes with the authors’ choice of two-space indenting, but that is my preference so it did not bother me. One addition I would have preferred would be additional formatting for the code, using either colors (for the e-book) or bolding (for the print version) to distinguish function names and help the reader keep their place in the code. The apps themselves rely mainly on iOS defaults. This is common in tutorials, but it also helps them look natural in iOS 7. Most of the time I was working through the book I used the iOS 6.1 simulator, but it was no problem to upgrade to iOS 7.

As a whole this book is a thorough introduction to RubyMotion development. It has several key features that are missing from other RubyMotion tutorials, including an emphasis on testing code. This book makes a great resource for new RubyMotion developers, or developers who want to use more of the device capabilities.

*Note: For this review I relied on the e-book version, which was provided as a review copy by Packt Publishing.