RubyMotion for Complete Beginners

According to the RubyMotion guide for getting started, “RubyMotion is a toolchain that permits the development of iOS applications using the Ruby programming language.” In less formal terms, it lets you write iOS apps in Ruby using your favorite development environment rather than Apple’s unpopular Xcode IDE. This post assumes you have gone through the guide above but don’t have much other iOS development experience.
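To give a flavor of what that looks like, here is a minimal sketch of a RubyMotion app delegate, along the lines of the standard “motion create” template (the label code is my own illustration, not something from the guide):

```ruby
# app/app_delegate.rb -- a minimal RubyMotion "hello world" sketch
class AppDelegate
  def application(application, didFinishLaunchingWithOptions:launchOptions)
    @window = UIWindow.alloc.initWithFrame(UIScreen.mainScreen.bounds)
    @window.rootViewController = UIViewController.alloc.init

    # Drop a label onto the root view, just to prove we are alive
    label = UILabel.alloc.initWithFrame(CGRectMake(20, 100, 280, 40))
    label.text = "Hello from Ruby!"
    @window.rootViewController.view.addSubview(label)

    @window.makeKeyAndVisible
    true
  end
end
```

Running rake from the project directory compiles the app and launches it in the iOS simulator, without ever opening the Xcode IDE.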

My number one recommendation for anyone coming to RubyMotion for the first time would be RubyMotion: iOS Development with Ruby by Clay Allsopp. This book has the quality we have come to expect from the Pragmatic Programmers. The code examples are clear and well-documented, encouraging you to work hands-on with RubyMotion from the first chapter.  The book website over at PragProg also includes a discussion forum where the author personally answers questions.

The Pragmatic Programmers also put out a RubyMotion screencast. Screencasts are popular within the Ruby-Rails community and seem to already be fairly widespread within the RubyMotion world.

My favorite RubyMotion screencast to date is Jamon Holmgren’s tutorial for making an app that displays several YouTube videos. In all I am pretty sure I wrote just over 100 lines of code for this demo app. I am certain that in Objective-C the app would have required much more code and would have been less fun to write. This tutorial uses the ProMotion tool that Jamon also wrote, which helps organize your app’s behavior around “screens.” You can find a tutorial for getting started with ProMotion here.
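For a rough idea of what the “screens” abstraction looks like, here is a sketch loosely in the spirit of ProMotion (the class names below are my own illustration, and ProMotion’s API has changed across versions, so treat the details as approximate):

```ruby
# A ProMotion-style sketch: the app delegate opens a root screen instead
# of wiring up UIWindow by hand, and each screen encapsulates one view.
class AppDelegate < PM::Delegate
  def on_load(app, options)
    open VideosScreen.new(nav_bar: true)
  end
end

class VideosScreen < PM::Screen
  title "Videos"

  def on_load
    # Build the table of YouTube videos here, or push further screens
    # with `open AnotherScreen`.
  end
end
```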

If you would like to know a bit more about the background of RubyMotion and the people who use it, there are two podcast episodes I would recommend. The first is the Ruby Rogues interview with Laurent Sansonetti, the creator of RubyMotion. The second is Episode 29 of the Giant Robots Smashing into Other Giant Robots podcast (“The Most Ironic iOS Developer”). In that episode Ben Orenstein interviews two thoughtbot developers, one of whom works primarily in RubyMotion while the other uses Objective-C almost exclusively but has some experience with RubyMotion. They give some nice perspective on the pros and cons of RubyMotion for iOS development, and the show notes provide a number of other resources.

To keep up with new resources in the RubyMotion community, check out RubyMotion Tutorials. The code that I have written during this initial learning process is on GitHub.

Update 1: I forgot to mention that RubyMotion also offers a generous discount for an educational license if you do not plan to sell your apps on the App Store.

Update 2: Jamon Holmgren tweeted a new version of his tutorial this morning (1/30/13).

The Political Economy of Scrabble: Currency, Innovation, and Norms

Scrabble Christmas ornaments made by Jennifer Bormann, 2011

In Scrabble, there is a finite stock of resources (letter tiles) that players use to create value (points) for themselves. Similarly, in the real world matter cannot be created, so much of human effort goes into rearranging the particles that already exist into more desirable combinations. The way we keep track of how desirable those new combinations are in the economy is with money. Fiat currency has no intrinsic value–it is simply declared to be worth a certain amount. Sometimes this value changes in response to other currencies. Other times, governments try to hold it fixed. The “law of Scrabble” has remained unchanged since the game’s introduction in 1938–but that may be about to change.

Like any well-intentioned dictator, Scrabble inventor Alfred Butts tried to base the value of his fiat money–er, tiles–on a reasonable system:  the frequency of their appearance on the front page of the New York Times. As the English language and the paper of record have evolved over the years, though, the tiles’ stated value has remained static. This has opened the door for arbitrage opportunities, although some players try to enforce norms to discourage this type of play:

What has changed in the intervening years is the set of acceptable words, the corpus, for competitive play. As an enthusiastic amateur player I’ve annoyed several relatives with words like QI and ZA, and I think the annoyance is justified: the values for Scrabble tiles were set when such words weren’t acceptable, and they make challenging letters much easier to play.

That is a quote from Joshua Lewis, who has proposed updating Scrabble scoring using his open source software package Valett. He goes on to say:

For Scrabble, Valett provides three advantages over Butts’ original methodology. First, it bases letter frequency on the exact frequency in the corpus, rather than on an estimate. Second, it allows one to selectively weight frequency based on word length. This is desirable because in a game like Scrabble, the presence of a letter in two- or three-letter words is valuable for playability (one can more easily play alongside tiles on the board), and the presence of a letter in seven- or eight-letter words is valuable for bingos. Finally, by calculating the transition probabilities into and out of letters it quantifies the likelihood of a letter fitting well with other tiles in a rack. So, for example, the probability distribution out of Q is steeply peaked at U, and thus the entropy of Q’s outgoing distribution is quite low.
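To make that methodology concrete, here is a rough Ruby sketch (my own, not Valett itself, which is Lewis’s package) of two of the statistics he describes: letter counts broken down by word length, and the entropy of each letter’s outgoing transition distribution. The word list file name is hypothetical.

```ruby
# Tally letters by word length and measure how predictable each letter's
# successor is, given a plain-text word list (one word per line).
words = File.readlines("wordlist.txt")
            .map { |w| w.strip.upcase }
            .select { |w| w.match?(/\A[A-Z]+\z/) }

freq_by_length = Hash.new { |h, k| h[k] = Hash.new(0) }
transitions    = Hash.new { |h, k| h[k] = Hash.new(0) }

words.each do |word|
  word.each_char { |c| freq_by_length[word.length][c] += 1 }
  word.chars.each_cons(2) { |a, b| transitions[a][b] += 1 }
end

# Letters that matter most for playability: the most common letters
# appearing in two-letter words.
puts freq_by_length[2].max_by(5) { |_letter, n| n }.inspect

# Shannon entropy (in bits) of each letter's outgoing distribution.
# Q's should be near zero, since Q is almost always followed by U.
entropy = transitions.transform_values do |outgoing|
  total = outgoing.values.sum.to_f
  outgoing.values.sum { |n| pr = n / total; -pr * Math.log2(pr) }
end

puts entropy.min_by(3) { |_letter, bits| bits }.inspect
```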

Lewis’s idea seems to fit with a recent finding by Peter Norvig of Google. Norvig was contacted last month by Mark Mayzner, who studied the same kind of information as the Valett package but did it back in the early 1960s. Mayzner asked Norvig whether his group at Google would be interested in updating those results from five decades ago using the Google Corpus Data. Here’s what Norvig has to say about the process:

The answer is: yes indeed, I (Norvig) am interested! And it will be a lot easier for me than it was for Mayzner. Working 60s-style, Mayzner had to gather his collection of text sources, then go through them and select individual words, punch them on Hollerith cards, and use a card-sorting machine.

Here’s what we can do with today’s computing power (using publicly available data and the processing power of my own personal computer; I’m not relying on access to corporate computing power):

1. I consulted the Google books Ngrams raw data set, which gives word counts of the number of times each word is mentioned (broken down by year of publication) in the books that have been scanned by Google.

2. I downloaded the English Version 20120701 “1-grams” (that is, word counts) from that data set given as the files “a” to “z” (that is, http://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-1gram-20120701-a.gz to http://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-1gram-20120701-z.gz). I unzipped each file; the result is 23 GB of text (so don’t try to download them on your phone).

3. I then condensed these entries, combining the counts for all years, and for different capitalizations: “word”, “Word” and “WORD” were all recorded under “WORD”. I discarded any entry that used a character other than the 26 letters A-Z. I also discarded any word with fewer than 100,000 mentions. (If you want you can download the word count file; note that it is 1.5 MB.)

4. I generated tables of counts, first for words, then for letters and letter sequences, keyed off of the positions and word lengths.

Here is the breakdown of word lengths that resulted (average=4.79):

Norvig’s distribution of word lengths
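For the curious, the condensing step Norvig describes (step 3) is conceptually simple. Here is a rough Ruby sketch of it, assuming the unzipped 1-gram files are tab-separated word/year/match_count/volume_count rows (the published Ngrams layout) and glossing over the memory demands of a 23 GB input:

```ruby
# Fold the per-year, per-capitalization counts into one count per word,
# keeping only plain A-Z words mentioned at least 100,000 times.
counts = Hash.new(0)

Dir.glob("googlebooks-eng-all-1gram-20120701-*") do |path|
  File.foreach(path) do |line|
    word, _year, match_count, _volumes = line.split("\t")
    word = word.upcase
    next unless word.match?(/\A[A-Z]+\z/)   # drop punctuation, digits, POS tags
    counts[word] += match_count.to_i        # combine all years and capitalizations
  end
end

counts.reject! { |_word, n| n < 100_000 }

# Count-weighted average word length (Norvig reports 4.79)
total   = counts.values.sum.to_f
avg_len = counts.sum { |word, n| word.length * n } / total
puts format("average word length: %.2f", avg_len)
```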

Sam Eifling then took Norvig’s results and translated them into updated Scrabble values:

While ETAOINSR are all, appropriately, 1-point letters, the rest of Norvig’s list doesn’t align with Scrabble’s point values….

This potentially opens a whole new system of weighing the value of your letters….  H, which appeared as 5.1 percent of the letters used in Norvig’s survey, is worth 4 points in Scrabble, quadruple what the game assigns to the R (6.3 percent) and the L (4.1 percent) even though they’re all used with similar frequency. And U, which is worth a single point, was 2.7 percent of the uses—about one-fifth of E, at 12.5 percent, but worth the same score. This confirms what every Scrabble player intuitively knows: unless you need it to unload a Q, your U is a bore and a dullard and should be shunned.

However, Norvig’s counts include repeats of words like “THE”–not much fun to play in Scrabble, and certainly not something you would play with the frequency it appears in the text corpus (about one word in every fourteen). With the help of his friend Kyle Rimkus, Eifling conducted a letter-frequency survey of the words in the Scrabble dictionary and came up with these revisions to the scoring system:

Image from Slate

Eifling points out that Q and J seem quite undervalued in the present scoring system. So what is an entrepreneurial player to do? “Get rid of your J and your Q as quickly as possible, because they’re just damn hard to play and will clog your rack. The Q, in fact, is the worst offender,” he says.

Now, as with any proposed policy update that challenges long-standing norms, there has been some pushback against these recent developments. Stefan Fatsis at Slate quotes the old guard of Scrabble saying that the new values “take the fun out” of the game. Fatsis seems to hope that the imbalance between stated and practical values will persist:

Quackle co-writer John O’Laughlin, a software engineer at Google, said the existing inequities also confer advantages on better players, who understand the “equity value” of each tile—that is, its “worth” in points compared with the average tile. That gives them an edge in balancing scoring versus saving letters for future turns, and in knowing which letters play well with others. “If we tried to equalize the letters, this part of the game wouldn’t be eliminated, but it would definitely be muted,” O’Laughlin said. “Simply playing the highest score available every turn would be a much more fruitful strategy than it currently is.”

In political economy this is known as rent-seeking behavior. John Chew, a doctoral student in mathematics at the University of Toronto and co-president of the North American Scrabble Players Association, went so far as to call Valett a “catastrophic outrage.”

Who knew that the much-beloved board game could provoke such strong feelings? With a fifth edition of the Scrabble dictionary due in 2014, it is possible but highly unlikely that the new findings will get an official response. A more probable outcome is that we begin to see “black market” Scrabble valuations that incorporate the new data, much as underground economies emerge in states with strict official control over the value of their money. Yet again, evidence for politics in everyday life.

For more fun with letter games, data, and coding, check out Jeff Knupp’s guide to “Creating and Optimizing a Letterpress Cheating Program in Python.”

Micro-Institutions Everywhere: Parking and Snow

Jeff Ely reports the problem:

You dig your car out of the snow, run an errand or two and come back home to discover…someone else has parked in “your” spot! This free rider problem reduces your incentive to dig your car out in the first place. If only property rights could be enforced, your incentives would be good.

According to the Washington Post, some cities use the legal system, others employ norms:

Boston has codified its citizens’ right to benefit from their backbreaking snow-clearing labor; a city law says that if you dig out your car in a snow emergency, a lawn chair or trash can renders the spot yours for at least two days while you’re away at work. In Chicago, blocking a parking spot is illegal, but city officials acknowledge an informal rule of dibs if you’ve done the digging.

During the DC Snowpocalypse of 2010, residents were unsure which method was best. The desire for some level of temporary property rights was there, but enforcement methods varied:

“I know this is public property, but if you spent hours laboring, I mean, come on, I think you have the right to say that is my spot,” said Tanya Barbour, who spent two hours Sunday shoveling free her silver Ford Expedition in the 1500 block of T Street NW. “If someone had clearly taken the time to shovel it out, I would not take that spot because I would not want that done to me.”

Across the District and in the Maryland suburbs Monday, many were not relying on Barbour’s honor system. Some used Boston-style markers — lawn chairs, recycling bins, orange cones, a mattress, even two bar stools with a Swiffer on top — to try to save spots along residential streets.

With Durham facing a forecast of “icy mix” for this afternoon, you can bet we here at YSPR will be on the lookout for emergent norms.

The Economist on Internet Politics

On Monday I gave a round-up of my posts on internet politics over the past year or so. Recently The Economist wrote a similar review. It is worth reading in full if this topic interests you. In this post we will discuss a few key points from that article, demonstrating the increasing relevance of politics online.

SOPA was not the only bill introduced that would have infringed on internet freedoms:

The success at the ITU conference in Dubai capped a big year for online activists. In January they helped defeat Hollywood-sponsored anti-piracy legislation, best known by the acronym SOPA, in America’s Congress. A month later, in Europe, they took on ACTA, an obscure international treaty which, in seeking to enforce intellectual-property rights, paid little heed to free speech and privacy. In Brazil they got closer than many would have believed possible to securing a ground-breaking internet bill of rights, the “Marco Civil da Internet”. In Pakistan they helped to delay, perhaps permanently, plans for a national firewall, and in the Philippines they campaigned against a cybercrime law the Supreme Court later put on hold.

The internet is indeed developing its own political culture:

The internet is nothing if not an exercise in interconnection. Its politics thus seems to call out for a similar convergence, and connections between the disparate interest groups that make up the net movement are indeed getting stronger. Beyond specific links, they also share what Manuel Castells, a Spanish sociologist, calls the “culture of the internet”, a contemporary equivalent of the 1960s counter-culture (in which much of the environmental movement grew up).

There are even political parties who make advocating for internet freedoms a key part of their platform:

In some countries the nascent net movement has spawned “pirate parties” that focus on net-policy issues; the first, in Sweden, was descended from the Pirate Bay, a site created to aid file sharing after Napster, a successful music-sharing scofflaw, was shut down. Pirate Party International, an umbrella group, already counts 28 national organisations as members. Most are small, but Germany’s Piratenpartei, founded in 2006, has captured seats in four regional parliaments.

One of the leaders of the German Pirate Party, Marina Weisband, even used a computing metaphor when asked about her party’s platform: “We don’t offer a ready-made programme, but an entire operating system.” This is similar to an idea from Reid Hoffman that we have discussed before.

Political parties and international treaties are not the only signs of a political life on the internet. Like any venue in which people have to develop a common life, norms and expectations of behavior have already begun to form. In the coming years and decades it will become even more evident that the internet is another area in which the politics of everyday life will play out. We should pay attention.

Internet Politics Round-Up

The January, 2012, Wikipedia blackout page. From @brainpicker on Instagram.

2012 was a busy year for followers of internet politics. The SOPA controversy began in late 2011 and really picked up steam with the blackout protest on January 18. Later that month we shared news of the arrest of an Iranian programmer. In February, Jody Baumgartner put out a CFP on “The Internet and Campaign 2012”. Henry Farrell’s literature review, which we discussed in May, provided a solid foundation for the academic study of online politics.

In the second half of the year I argued that the internet is like a new geography to which freedom-seeking individuals and organizations are fleeing, in a parallel to James Scott’s ideas from The Art of Not Being Governed. These online communities have strong norms, including a high value placed upon reputations and expectations for new members. Given these freedoms and norms, a certain level of internet crime may be acceptable. Despite security concerns, we also discussed how democracies are already using online voting and whether that technology might eventually become widespread in the US.

The last major event for internet politics in 2012 was the WCIT meeting in December. Prior to the meeting, proponents of internet freedom were concerned about the potential encroachment on human rights.  Given events like the internet shutdown in Syria and other dictatorial online activities, these fears were well-founded. However, thanks to members of the US delegation like Eli Dourado, a large number of countries refused to sign the updated treaty.

Major developments like these continue to suggest that the internet is a worthy subject of research for social scientists. On Wednesday we will discuss another review piece that brings home some of these points and raises new issues of interest.

Statistical Thinking and the Birth of Modern Computing

John von Neumann and the IAS computer, 1945

What do fighter pilots, casinos, and streetlights all have in common? These three disparate topics are all the subject of statistical thinking that led to (and benefitted from) the development of modern computing. This process is described in Turing’s Cathedral by George Dyson, from which most of the quotes below are drawn. Dyson’s book focuses on Alan Turing far less than the title would suggest, in favor of John von Neumann’s work at the Institute for Advanced Study (IAS). Von Neumann and the IAS computing team are well-known for building the foundation of the digital world, but before Turing’s Cathedral I was unaware of the deep connection with statistics.

Statistical thinking first pops up in the book with Julian Bigelow’s list of fourteen “Maxims for Ideal Prognosticators” for predicting aircraft flight paths on December 2, 1941. Here is a subset (p. 112):

7. Never estimate what may be accurately computed.

8. Never guess what may be estimated.

9. Never guess blindly.

This early focus on estimation will reappear in a moment, but for now let’s focus on the aircraft prediction problem. With the advent of radar it became possible to fly sorties at night or in weather with poor visibility. In a dark French sky or over a foggy Belgian city it could be tough to tell who was who until,

otherwise adversarial forces agreed on a system of coded signals identifying their aircraft as friend or foe. In contrast to the work of wartime cryptographers, whose job was to design codes that were as difficult to understand as possible, the goal of IFF [Identification Friend or Foe] was to develop codes that were as difficult to misunderstand as possible…. We owe the existence of high-speed digital computer to pilots who preferred to be shot down intentionally by their enemies rather than accidentally by their friends. (p. 116)

In statistics this is known as the distinction between Type I and Type II errors, which we have discussed before. Pilots flying near their own lines likely figured there was a greater probability that their own forces would make a mistake than that the enemy would detect them–and going down as a result of friendly fire is no one’s idea of fun. This emergence of a cooperative norm in the midst of combat is consistent with stories from other conflicts in which the idea of fairness is used to compensate for the rapid progress of weapons technology.

Chapter 10 of the book (one of my two favorites, along with Chapter 9, Cyclogenesis) is entitled Monte Carlo. Statistical practitioners today use this method to simulate statistical distributions that are analytically intractable. Dyson weaves the development of Monte Carlo in with a recounting of how von Neumann and his second wife Klari fell in love in the city of the same name. A full description of this method is beyond the scope of this post, but here is a useful bit:

Monte Carlo originated as a form of emergency first aid, in answer to the question: What to do until the mathematician arrives? “The idea was to try out thousands of such possibilities and, at each stage, to select by chance, by means of a ‘random number’ with suitable probability, the fate or kind of event, to follow it in a line, so to speak, instead of considering all branches,” [Stan] Ulam explained. “After examining the possible histories of only a few thousand, one will have a good sample and an approximate answer to the problem.”
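Ulam’s idea of sampling a few thousand random histories instead of enumerating every branch is easy to see in a toy example (mine, not the book’s): estimating pi by throwing random points at a square.

```ruby
# Estimate pi by sampling points uniformly in the unit square and
# counting the share that falls inside the quarter circle of radius 1.
samples = 100_000
inside  = 0

samples.times do
  x, y = rand, rand
  inside += 1 if x * x + y * y <= 1.0
end

puts format("pi is roughly %.3f", 4.0 * inside / samples)
```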

For a more comprehensive overview of this development in the context of Bayesian statistics, check out The Theory That Would Not Die.

The third and final piece of the puzzle for our post today is the well-known but not sufficiently appreciated distinction between correlation and causation. Philip Thompson, a meteorologist who joined the IAS group in 1946, learned this lesson at the age of 4 and counted it as the beginning of his “scientific education”:

[H]is father, a geneticist at the University of Illinois, sent him to post a letter in a mailbox down the street. “It was dark, and the streetlights were just turning on,” he remembers. “I tried to put the letter in the slot, and it wouldn’t go in. I noticed simultaneously that there was a streetlight that was flickering in a very peculiar, rather scary, way.” He ran home and announced that he had been unable to mail the letter “because the streetlight was making funny lights.”

Thompson’s father seized upon this teachable moment, walked his son back to the mailbox and “pointed out in no uncertain terms that because two unusual events occurred at the same time and at the same place it did not mean that there was any real connection between them.” Thus the four-year-old learned a lesson that many practicing scientists still have not. This is also the topic of Chapter 8 of How to Lie with Statistics and a recent graph shared by Cory Doctorow.

The fact that these three lessons on statistical thinking coincided with the advent of digital computing, along with a number of other anecdotes in the book, impressed upon me the deep connection between these two fields of thought. Most contemporary Bayesian work would be impossible without computers. It is also possible that digital computing would have come about much differently without an understanding of probability and the scientific method.

What’s the Best Way to Learn? Just-In-Time versus Just-In-Case

Illustration of an 18th-century classroom

You will never be dumber than you are right now. You will also never have more time than you do right now. Thus, you have a relative abundance of time and a relative dearth of knowledge. How do we strike a balance between these resources to optimally leverage them for learning?

These questions came up as I listened to two episodes of the Ruby Rogues podcast. In episode 70 David brings up just-in-time versus just-in-case learning. David’s ideas were prompted by Katrina Owen, who has a list of learning resources here. The other thought-provoking episode (responsible for the above paragraph) was number 87, in which the rogues discuss Sandi Metz’s new book, Practical Object-Oriented Design in Ruby. (I had the pleasure of meeting Sandi last night at a local Ruby meetup, after a draft of this post was written.) Here’s Chuck riffing off of a quote from the book:

“Practical design does not anticipate what will happen to your application. It merely accepts that something will and that in the present, you cannot know what. It doesn’t guess the future. It preserves your options for accommodating the future.” And so, what that says to me is you don’t always have enough information. You may never have enough information. You will never have less information than you have now. So make the design decisions that you feel like you have to and defer the rest, until you don’t have to anymore. And so it was basically, “Here are some rules. But use your best judgment because you’re going to get more information that’s going to inform you better later.” And so, that kind of opens things up. Here are the rules but if you have the information that says that you have to break them, then break them.

The just-in-time and just-in-case distinctions are useful in answering the question I posed at the beginning. But before I give concrete examples I think it is important to introduce another dimension to our learning classification: formal and informal. Being the good social scientists that we are, we can now formulate a two-by-two table.

Just-in-case learning is done well ahead of the time that it is needed for practical purposes. Children learn English (or whatever their native language is) without thought for or anticipation of the letters, emails, and blog posts they will write in years to come. In a formal setting this can lead to the use of toy problems to make the skill seem practical. Students in an algebra class may have trouble seeing ‘the point’ of those skills until much later–and even then they may not fully recognize where that learning originated.

Just-in-time learning occurs at or very near the point of need. I could ask for travel directions to your house when we first meet, but that would be useless until you actually invite me (not to mention presumptuous). It is better to learn something like that when I can use it right away, since it has little value in the abstract. Programming–for me at least–has been much more of a just-in-time skill. I have taken one formal course in the topic and am currently enrolled in another. But the great benefit of these courses is that you get to put your skills to work immediately.

To answer the question we started with, I think that we need to place more value on just-in-time learning and less on just-in-case learning. As the quote from Sandi’s book points out, we live in a world of uncertainty. There are some skills that you simply cannot learn at a just-in-time pace (math being the main one that comes to mind). But for the plethora of other cases that our modern world and its tools make available, learning at the point of need is satisfactory and perhaps even superior. That is why we need to develop more avenues for just-in-time learning. Programmers have this in spades with sites like StackOverflow, but many other skill areas do not. Sites like Coursera also have a chance to provide a middle road between the categories in the table above. The ability to iterate quickly and pick up new skills on the fly will be increasingly valuable in the years to come.

Taxes, Moonshine, and State Building

I have to admit an ulterior motive behind Friday’s post. We discussed the Alchian-Allen theorem, which states that adding a fixed cost (usually but not necessarily for transportation) to the price of a good leads consumers to purchase more of the high-quality good relative to the lower-quality one. Although I hope that discussion was interesting enough in its own right, it also serves as background for today. This post discusses the role of the postbellum US internal revenue system, whose liquor tax collection efforts in the Mountain South constitute a slow but eventually successful state-building effort.

We have talked about state building before, and this post continues in that vein. The post-Civil War Mountain South was an extraordinarily difficult-to-govern region, both because of its geography and because of its population’s unwillingness to cooperate with the central government. (Regular readers will recall that these two characteristics are often correlated.) Our discussion will largely be based on the paper “The Revenue: Federal Law Enforcement in the Mountain South, 1870-1900” by Wilbur R. Miller (1989, JSTOR).

Here’s the connection with the Alchian-Allen theorem. Corn is a commodity, and the quality difference between any two cobs is negligible. However, distilling whiskey from corn both raises its value and makes it easier to transport. According to research from Cornell’s David Pimentel, “An acre of U.S. corn yields about 7,110 pounds of corn for processing into 328 gallons of ethanol.” That works out to roughly 21.7 pounds of corn per gallon of pure alcohol.

Even if moonshine were only 100-proof (i.e., half alcohol), you have reduced the weight to be transported from roughly 11 pounds of corn to about 7.5 pounds for a gallon of “mountain dew”–roughly a 30 percent reduction. Distilling to a higher proof could cut the shipped weight by well over 60 percent, drastically reducing transportation costs for Appalachian farmers. This also explains why bootleggers would distill to a high alcohol content for distribution and expect consumers to dilute the brew after it had been transported–it makes no sense to pay to ship water or other dilutions.
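Here is the back-of-the-envelope arithmetic behind those figures, using the Pimentel numbers quoted above and assumed approximate densities (8.34 pounds per gallon for water, 6.59 for ethanol):

```ruby
# Rough check of the weight savings from shipping moonshine instead of corn.
corn_per_gal_alcohol = 7110.0 / 328                # ~21.7 lbs of corn per gallon of pure alcohol

corn_for_100_proof  = corn_per_gal_alcohol / 2.0   # a 100-proof gallon is half alcohol: ~10.8 lbs of corn
weight_of_100_proof = (8.34 + 6.59) / 2.0          # ~7.5 lbs per gallon of 100-proof spirits

savings = 1 - weight_of_100_proof / corn_for_100_proof
puts format("shipping moonshine saves about %.0f%% of the weight", savings * 100)
```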

But all of these factors were in place long before the Union victory, so why did moonshining become more common afterward? The answer is taxes. From Miller’s article:

The beer tax, which remained low, was collected easily. There was some traffic in untaxed tobacco, but whiskey taxes were the most difficult to collect. Moonshining developed as soon as the excises were imposed, and so many distillers evaded whiskey taxes that in 1868 the commissioner of internal revenue requested reduction of the tax from $2.00 to fifty cents per gallon, thinking the lower excise would be easier to collect. Indeed, the lower rate resulted in increased revenues, but the government continued to lose potential revenue to moonshiners. (p. 197)

Resistance to the taxes, even when they were reduced, had both an economic and a political logic:

Moonshiners argued that they could not afford to absorb a whiskey tax of ninety cents per gallon or to pass it on to their customers. They believed that economic survival depended on evading the tax, but illegal distilling exposed them to confiscation or destruction of their still or to imprisonment. Mountain dwellers resisted the tax. They believed that a man had a fundamental “right to do pretty much what he pleases,” arguing that “a farmer should have the same right to boil his corn into ‘sweet mash’ as to boil it into hominy.” (p. 200)

It did not help the government’s cause that revenue agents were often dishonest, responding to incentives of their own:

U.S. commissioners, who held preliminary hearings and issued warrants, and deputy marshals (often appointed as deputy revenue collectors), who served warrants and made arrests, received fees for each of their duties. Fee payment encouraged deputies to swear out warrants on doubtful testimony, leading to the arrest of innocent men. (p. 203)

Liquor taxes were serious business, though. The federal government carried a substantial amount of war debt and in 1895 the Supreme Court ruled that existing income taxes were unconstitutional, forcing the government to rely even more heavily on taxing goods like tobacco and whiskey:

The national government had developed a commitment to collecting taxes that it never had toward enforcing civil rights, because it was directly interested in the funds received. The whiskey tax became the largest domestic source of revenue. The excise on distilled spirits grew from 48 percent of all internal revenue in 1876 to 59 percent in 1982. (p. 214)

When Congress raised the whiskey tax to $1.10 per gallon (up from 90 cents) in response to the income tax decision, receipts immediately dropped and took five years to recover to their 1893 level. At the turn of the century, states began to enact prohibition laws, which made revenue collection and enforcement even more difficult. Every “moonshine state” except Kentucky and Missouri was dry by 1916. Of course, this did not eliminate the moonshine business–it just made it that much more profitable.

For more on the topic of taxes, government control, and conflict, check out this working paper abstract from Anna Schultz, Nils Metternich, and Michael D. Ward (friends and colleagues, all). To follow up on the Alchian-Allen theorem and prohibition of various types, see this post by azmytheconomics. If you want to know more about moonshiners in the 20th century, check out Last Call: The Rise and Fall of Prohibition by Dan Okrent and Bruce Yandle’s “Bootleggers and Baptists” theory (wiki, pdf, podcast).

Why Does Manhattan Have the Best Shrimp?

Fulton Fish Market in New York City

I’ll ask the question in the same form that I originally heard it: “Why does Manhattan have the best shrimp?”* It makes sense that Portland (Maine) or Boston would have great lobster–the shellfish are harvested nearby. But even though shrimping locations are far from New York City, Manhattan restaurants can regularly get high-quality, plump shrimp. Likewise for juicy red apples and other nice produce–all from many kilometers or even continents away. If the “buy local” movement is all it’s cracked up to be, shouldn’t the location where these items are harvested be the best spot to purchase them, rather than the concrete jungle?

The answer involves transportation costs. Put yourself in the shoes of a Central American farmer for a moment. You have harvested a bumper crop of avocados or bananas, or another perishable crop. In your local marketplace you can sell the produce for just the cost of gas for a large truck. However, so can every other farmer in the local growing area. That means your local market has a lot of bananas and avocados, which drives down the price. The same item you could sell for 10 cents near home might go for $0.50 to $1.00 in a large US city.**

Having decided to ship your goods to the States, you have to decide which ones to send. Do you want to send the so-so items that will sell at the low end of the price range above? That would still be better than what you can get at home, but it would barely cover the additional shipping costs. Keep in mind that the cost to ship a crate won’t depend on the quality of the product: it costs the same to send a box of rotting bananas as it does to send the world’s greatest bananas. This is why it makes sense to send the best: since the cost of shipping is fixed, you might as well ship the product on which you will make the greatest profit.

In economics this is known as the Alchian-Allen theorem,*** which Wikipedia summarizes as:

It states that when the prices of two substitute goods, such as high and low grades of the same product, are both increased by a fixed per-unit amount such as a transportation cost or a lump-sum tax, consumption will shift toward the higher-grade product. This is true because the added per-unit amount decreases the relative price of the higher-grade product.
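A toy calculation makes the mechanism clear (the prices here are hypothetical, like those in the example above):

```ruby
# A fixed per-unit shipping charge raises both prices but lowers the
# *relative* price of the premium good.
premium, standard, shipping = 1.00, 0.50, 1.00

at_the_farm  = premium / standard                            # 2.0: premium costs twice as much locally
after_export = (premium + shipping) / (standard + shipping)  # ~1.33 once shipping is added

puts at_the_farm, after_export
```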

The moral of the story is that under linear transportation costs, it is profitable to send the highest quality goods to the market that is best able to pay for quality. This is why large cities in developed countries (especially in North America and Europe) can have excellent produce shipped in from virtually anywhere. In other words, it makes good economic sense for Manhattan to get the best shrimp.

* I have never been to New York City or Boston, so I am largely speculating. Based on available evidence I believe they exist and have quality produce, as do other large cities I have visited, such as San Francisco. American supermarkets in smaller cities have improved substantially over the last few decades, but not everyone can have the best.

** These numbers are made up. If you have data, let’s talk.

*** There has been some controversy over the theorem, but more recent work supports it. Tyler Cowen has discussed it in a podcast with Russ Roberts and a paper with Alex Tabarrok.

Review: Everything is Obvious

Everything is Obvious (Once You Know the Answer), by Duncan Watts, had been on my wishlist for a while before my sister gave it to me for my birthday. I was already sympathetic to the book’s key point: many conclusions of social science research that seem obvious in retrospect could not have been distinguished from other equally likely hypotheses a priori. In this post I briefly comment on one example from the book, its core argument, and its organization and style.

My favorite example from the book comes from Paul Lazarsfeld’s discussion of The American Soldier. That report studied over 600,000 troops during and just after WWII. Lazarsfeld listed six key findings, which I quote here directly:

  1. Better educated men showed more psycho-neurotic symptoms than those with less education. (The mental instability of the intellectual as compared to the more impassive psychology of the man-in-the-street has often been commented on.)
  2. Men from rural backgrounds were usually in better spirits during their Army life than soldiers from city backgrounds. (After all, they are more accustomed to hardships.)
  3. Southern soldiers were better able to stand the climate in the hot South Sea Islands than Northern soldiers (of course, Southerners are more accustomed to hot weather).
  4. White privates were more eager to become non-coms than Negroes. (The lack of ambition among Negroes is almost proverbial.)
  5. Southern Negroes preferred Southern to Northern white officers. (Isn’t it well known that Southern whites have a more fatherly attitude toward their “darkies”?)
  6. As long as the fighting continued, men were more eager to be returned to the States than they were after the German surrender. (You cannot blame people for not wanting to be killed.)

Spend a minute thinking about these results. Although some are undoubtedly politically incorrect, others seem intuitive, don’t they? As Watts says, a reader could easily imagine that, “Rural men in the 1940s were accustomed to harsher living standards and more physical labor than city men, so naturally they had an easier time adjusting. Why did we need such a vast and expensive study to tell me what I could have figured out on my own?” (p. xvi)

Lazarsfeld was playing a trick, though. In fact, the study’s conclusions were just the opposite–and with the benefit of hindsight those findings could seem just as obvious. What seems like common sense after the fact was just one of several possibilities before the research was conducted.

According to Watts, common sense biases like these come in three forms. The first error is imagining others’ motives to be like our own if we were put in their situation–what I call armchair theorizing. The second error is failing to recognize that “more is different”: models that portray the actions of groups as a simple linear aggregation of individuals are often dead wrong at predicting the outcomes of social processes. More often, social dynamics are nonlinear (see, for example, research on “information cascades”). Third, we often suffer from hindsight bias, failing to recognize that historical outcomes we now take for granted were highly contingent and not easily predictable beforehand.

The book is divided into two parts: “Common Sense” and “Uncommon Sense”. The first part focuses on examples of the three common sense biases at work, and the second provides recommendations for social science research. Both parts refer heavily to the author’s previous work and other “big think” books–a genre that I hadn’t realized this book belonged to before reading.

Overall, Everything is Obvious is a quick read that will bring a social science outsider up to speed on cognitive biases and suggest some resources for learning more. For the practicing social scientist already familiar with the work of Daniel Kahneman and others like him, I would recommend skipping the first half in favor of the chapter summaries in the appendix, and reading the second half with a pen in hand to mark areas you want to follow up on in the original literature. You can also read Andrew Gelman’s review here.