Technology and Government: San Francisco vs. New York

In a recent PandoMonthly interview, John Borthwick made an intriguing point. Many cities are trying to copy the success of Silicon Valley/Bay Area startups by being like San Francisco: a hip, fun urban area designed to attract young entrepreneurs and developers (Austin comes to mind). However, the relationship between tech workers and other residents is a strained one: witness graffiti to the effect of “trendy Google professionals raise housing prices” and the “startup douchebag” caricature.

New York, on the other hand, has a smaller startup culture (“Silicon Alley”) but much closer and more fruitful ties between tech entrepreneurs and city government. Mayor Bloomberg has been at the heart of this, with his Advisory Council on Technology and his 2012 resolution to learn to code. Bloomberg’s understanding of technology and relationship with movers and shakers in the industry will make him a tough act to follow.

Does this mean that the mayors of Chicago, Houston, or Miami need to be writing JavaScript in their spare time? Of course not. But making an effort to understand and relate to technology professionals could yield great benefits.

Rather than trying to become the next Silicon Valley (a very tall order) it would be more efficacious for cities to follow New York’s model: ask not what your city can do for technology, but what technology can do for your city. Turn bus schedule PDFs into a user-friendly app or–better yet, for many low-income riders–a service that lets you text a stop number and see when the next bus will arrive. Instead of calling the city to set up services like water and garbage collection, add a form to the city’s website. The opportunities to make city life better for all citizens–not just developers and entrepreneurs–are practically boundless.
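
To make the bus idea concrete, here is a minimal sketch of a text-for-arrival-times service. The POST-with-a-“Body”-field request and TwiML reply are real Twilio-style SMS webhook conventions, but the stop database and arrival data are hypothetical placeholders; treat this as a sketch, not a finished service.

```python
# A minimal sketch of an SMS bus-arrival service. The Twilio-style webhook
# shape (a POST with a "Body" form field, answered with TwiML XML) is real;
# the stop database below is a hypothetical stand-in for parsed schedule
# data (ideally GTFS feeds rather than PDFs).
from flask import Flask, request

app = Flask(__name__)

# Hypothetical lookup table: stop number -> upcoming arrivals.
NEXT_ARRIVALS = {
    "1234": ["Route 10 in 4 min", "Route 10 in 19 min"],
}

@app.route("/sms", methods=["POST"])
def sms_reply():
    stop_id = request.form.get("Body", "").strip()  # rider texts a stop number
    arrivals = NEXT_ARRIVALS.get(stop_id)
    message = "; ".join(arrivals) if arrivals else "Unknown stop number."
    # Reply in TwiML so the SMS gateway texts the answer back to the rider.
    return (f"<Response><Message>{message}</Message></Response>",
            200, {"Content-Type": "application/xml"})

if __name__ == "__main__":
    app.run()
```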

I was happy to see San Francisco take a small step in the right direction recently with the Open Law Initiative, but there is more to be done, and not just in the Bay Area. Major cities across the US and around the world could benefit from the New York model.

How Traveling Salesmen Complicate the Traveling Salesman Problem

The traveling salesman problem is simple in its setup but remarkably complicated to solve. You need to visit a number of cities, say 10, and want to find the shortest route that visits all of them exactly once and brings you back to where you started. From a list of routes it is easy to find the shortest one, but it is incredibly hard to verify that it is the shortest of all possible routes.
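
To see why, consider a minimal brute-force sketch with made-up coordinates. Picking the best tour from a list is a one-line comparison, but the list itself grows factorially: with n cities there are (n-1)!/2 distinct tours, so enumeration is hopeless beyond a dozen or so cities.

```python
# Brute-force TSP: a minimal sketch with made-up city coordinates.
# With n cities there are (n-1)!/2 distinct tours, so this is only
# feasible for very small n -- which is exactly the point.
from itertools import permutations
from math import dist, inf
import random

random.seed(42)
cities = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(8)]

def tour_length(order):
    # Total distance of the closed tour visiting cities in this order.
    return sum(dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))

best_order, best_len = None, inf
for perm in permutations(range(1, len(cities))):  # fix city 0 as the start
    order = (0,) + perm
    length = tour_length(order)
    if length < best_len:
        best_order, best_len = order, length

print(best_order, round(best_len, 1))
```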

Finding a solution gets even more difficult when you go from a (mathematically) feasible solution to one that can be implemented in the real world. That is because you have to incorporate a notoriously unreliable component into your plans: human beings.

[I]n trying to apply this mathematics to the real world of deliveries and drivers, UPS managers needed to learn that transportation is as much about people and the unique constraints they impose, as it is about negotiating intersections and time zones….

For one thing, humans are irrational and prone to habit. When those habits are interrupted, interesting things happen. After the collapse of the I-35 bridge in Minnesota, for example, the number of travelers crossing the river, not surprisingly, dropped; but even after the bridge was restored, researcher David Levinson has noted, traffic levels never got near their previous levels again. Habits can be particularly troublesome for planning fixed travel routes for people, like public buses, as noted in a paper titled, “You Can Lead Travelers to the Bus Stop, But You Can’t Make Them Ride,” by Akshay Vij and Joan Walker of the University of California. “Traditional travel demand models assume that individuals are aware of the full range of alternatives at their disposal,” the paper reads, “and that a conscious choice is made based on a tradeoff between perceived costs and benefits.” But that is not necessarily so.

People are also emotional, and it turns out an unhappy truck driver can be trouble. Modern routing models incorporate whether a truck driver is happy or not—something he may not know about himself. For example, one major trucking company that declined to be named does “predictive analysis” on when drivers are at greater risk of being involved in a crash. Not only does the company have information on how the truck is being driven—speeding, hard-braking events, rapid lane changes—but on the life of the driver….

In other words, the traveling salesman problem grows considerably more complex when you actually have to think about the happiness of the salesman.

That’s from Tom Vanderbilt over at Nautilus, and the whole thing is worth a read. Oh, and there’s also an app for that.

Currency and Conflict

According to Lebanon’s Daily Star:

Traders across Syria reported widely fluctuating rates and two currency dealers in Damascus, where the pound appeared to be hit hardest, said it fell below 200 to the dollar for the first time in what one described as panic buying of the U.S. currency.

On Monday evening the pound traded at 205 to the dollar, down 20 percent in four days and 77 percent down since the start of the anti-Assad uprising in March 2011 when it was at 47.

The idea of examining currency prices over the course of a conflict is interesting. There are a number of confounders, of course. For one, regimes can often intervene–say, by spending foreign reserves–to prop up the currency’s value. Other events besides the conflict itself can also drive currency fluctuations, especially when the conflict is relatively minor.

One nice case (from strictly a research perspective) is the US Civil War, when both the Union and Confederacy issued their own notes. Jeffrey Arnold‘s project, “Pricing the Costly Lottery: Financial Market Reactions to Battlefield Events in the American Civil War,” leverages this fact to see how markets responded to successes and failures of either side. We discussed this project before when it was presented as a poster at PolMeth 2012, and Jeffrey’s website now has his MPSA 2013 slides.

Here’s his abstract, and one of my favorite graphs:

What role does combat play in resolving the disagreement that initiated war? Bargaining theories of war propose two mechanisms, the destruction of capabilities and the revelation of private information. These mechanisms are difficult to analyze quantitatively because the mechanisms are observationally equivalent, the participants’ expectations are unobservable, and there is a lack of data on battles. With new methods and new data on the American Civil War, I address these challenges. I estimate the information revealed by combat with a model of Bayesian learning. I use prices of Union and Confederate currencies to measure public expectations of war duration and outcome. Data on battlefield events come from detailed data on the outcomes and casualties of the battles of the American Civil War. The results suggest that battle outcomes rather than casualties or information revelation had the largest influence on the expected duration of the American Civil War.

[Figure: Confederate and Union currency prices]
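
Arnold’s model is far more sophisticated than this, but the flavor of Bayesian learning from battle outcomes can be conveyed with a toy beta-binomial sketch. The prior, the outcome sequence, and the interpretation below are all invented for illustration; this is not Arnold’s model.

```python
# Toy beta-binomial updating: NOT Arnold's model, just an illustration of
# how beliefs about one side's strength might update with battle outcomes.
# Belief: p = probability the Union wins a given battle.
wins, losses = 0, 0          # observed Union battle outcomes
alpha, beta = 1.0, 1.0       # uniform Beta(1, 1) prior over p

for outcome in [1, 0, 1, 1, 0, 1]:   # invented sequence: 1 = Union victory
    wins += outcome
    losses += 1 - outcome

alpha_post, beta_post = alpha + wins, beta + losses
posterior_mean = alpha_post / (alpha_post + beta_post)
print(f"Posterior mean of p after {wins + losses} battles: {posterior_mean:.2f}")
```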

Micro-Institutions Everywhere: Defining Death

From the BBC:

In the majority of cases in hospitals, people are pronounced dead only after doctors have examined their heart, lungs and responsiveness, determining there are no longer any heart and breath sounds and no obvious reaction to the outside world….

Many institutions in the US and Australia have adopted two minutes as the minimum observation period, while the UK and Canada recommend five minutes. Germany currently has no guidelines and Italy proposes that physicians wait 20 minutes before declaring death, particularly when organ donation is being considered….

But the criteria used to establish brain death have slight variations across the globe.

In Canada, for example, one doctor is needed to diagnose brain death; in the UK, two doctors are recommended; and in Spain three doctors are required. The number of neurological tests that have to be performed vary too, as does the time the body is observed before death is declared.

Micro-Institutions Everywhere: Virus Naming

Giant stuffed microbes make the lethal loveable

The alphabet soup of names for new pathogens rivals Pentagonese. AIDS. SARS. MRSA. Where do these names come from? For viruses, one major source of influence is the International Committee on Taxonomy of Viruses (ICTV).

Their latest innovation is MERS, the name for a new form of coronavirus that was first reported in September 2012. Until the committee settled on a name, the virus went by various abbreviations: hCov-EMC, HCOV, NCoV, and nCoV (the last two referring to a “novel coronavirus”).

Coming up with a good name is tricky. It should be descriptive and memorable, but naming a virus after a geographic area has major downsides:

Historically, many infectious disease agents—or the diseases themselves—have been named after the place where they were first found. But increasingly, scientists and public health officials have shied away from that system to avoid stigmatizing a particular country or city. When a serious new type of pneumonia started spreading from Asia in 2003, officials at WHO coined the term severe acute respiratory syndrome (SARS) to prevent the disease from being named “Chinese flu” or something similar. (As it happened, the name ruffled feathers in Hong Kong anyway, because the city’s official name is Hong Kong SAR, for special administrative region—a fact that WHO had overlooked.)…

The new name is only a recommendation—one which the study group hopes will be adopted widely but which it has no power to enforce, Gorbalenya says. That’s because ICTV has the authority only to classify and name entire virus species….

For more, check out this post from Science.

Risk, Overreaction, and Control

How many people died because of the September 11 attacks? The answer depends on what you are trying to measure. The official estimate is around 3,000 deaths as a direct result of the hijackings: at the World Trade Center, at the Pentagon, and in Pennsylvania. Those attacks were tragic, but their effect was compounded by overreaction to terrorism. Specifically, enough Americans substituted driving for flying in the remaining months of 2001 to cause 350 additional deaths from road accidents.

David Myers was the first to raise this possibility in a December 2001 essay. In 2004, Gerd Gigerenzer collected data and arrived at the estimate of 350 additional deaths, the result of what he called “dread risk”:

People tend to fear dread risks, that is, low-probability, high-consequence events, such as the terrorist attack on September 11, 2001. If Americans avoided the dread risk of flying after the attack and instead drove some of the unflown miles, one would expect an increase in traffic fatalities. This hypothesis was tested by analyzing data from the U.S. Department of Transportation for the 3 months following September 11. The analysis suggests that the number of Americans who lost their lives on the road by avoiding the risk of flying was higher than the total number of passengers killed on the four fatal flights. I conclude that informing the public about psychological research concerning dread risks could possibly save lives.
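
The intuition behind a figure like this is just relative-risk arithmetic: driving is far deadlier per mile than flying, so shifting miles from planes to cars raises expected deaths. Here is a back-of-envelope sketch; the rates and mileage below are illustrative assumptions of mine, not Gigerenzer’s data.

```python
# Back-of-envelope relative-risk arithmetic. All numbers are illustrative
# assumptions, not Gigerenzer's: US driving around 2001 killed very roughly
# 1.5 people per 100 million vehicle-miles, while commercial flying killed
# orders of magnitude fewer per passenger-mile.
driving_deaths_per_mile = 1.5 / 100_000_000
flying_deaths_per_mile = 0.02 / 100_000_000  # assumed, far lower than driving

substituted_miles = 20_000_000_000  # assumed extra miles driven post-9/11

extra_deaths = substituted_miles * (driving_deaths_per_mile
                                    - flying_deaths_per_mile)
print(f"Expected additional road deaths: {extra_deaths:.0f}")
```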

Does the same effect carry over to other countries and attacks? Alejandro López-Rousseau looked at how Spaniards responded to the March 11, 2004, train bombings in Madrid. He found that activity across all forms of transportation decreased–travelers did not substitute driving for riding the train.

What could explain these differences? One possibility is that Americans are less willing to forgo travel than Spaniards; perhaps more of their travel is for business and cannot be delayed. Another is that Spanish citizens are more accustomed to terrorist attacks and understand that substituting driving is riskier than continuing to take the train. There are many other differences that we have not considered here–the magnitude of the two attacks, feelings of being “in control” while driving, varying cultural attitudes.

This post is simply meant to make three points. First, reactions to terrorism can cause additional deaths if relative risks are not taken into account. Second, cultures respond to terrorism in different ways, perhaps depending on their previous exposure to violent extremism. Finally, the task of explaining differences is far more difficult than establishing patterns of facts.

(For more on the final point check out Explaining Social Behavior: More Nuts and Bolts for the Social Sciences, which motivated this post.)

Python for Political Scientists, Spring 2013 Recap

This spring Josh Cutler‘s Python course was back by popular demand. (This time it was known as “Computational Political Economy” but I like the less formal title.) I participated this time around as a teaching assistant rather than student, and it was a thoroughly enjoyable experience. The course syllabus and schedule are on GitHub.

Class participants were expected to arrive with a basic familiarity with Python, gained by working through Zed Shaw’s Learn Python the Hard Way over Christmas break. Each Tuesday Josh would walk them through a new computer science concept and explain how it could be used for social science research. These topics included databases, networks, web scraping, and linear programming. On Thursdays they would come to a lab session and work together in small groups to solve problems or answer questions based on some starter code that I supplied. I generally tried to make the examples relevant and fun, but you would have to ask them whether I succeeded.

The class ended this past Saturday with final presentations, which were all great. The first project scraped data from the UN Millennium Development Goal reports and World Bank statistics to compare measures of maternal mortality in five African countries and show how they differed–within the same country! This reminded me of Morten Jerven’s book Poor Numbers on the inaccuracy of African development statistics (interview here).
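
For a flavor of what that kind of data collection involves, here is a minimal sketch pulling maternal mortality figures from the World Bank’s public API. The indicator code and JSON shape are as I recall them, and this is emphatically not the presenter’s code.

```python
# A minimal sketch of pulling maternal mortality data from the World Bank
# API -- not the presenter's code. The indicator code (SH.STA.MMRT) and the
# JSON shape are from memory; verify against the current API docs.
import requests

URL = ("https://api.worldbank.org/v2/country/{iso}/indicator/"
       "SH.STA.MMRT?format=json&date=2000:2010")

for iso in ["KEN", "GHA", "NGA"]:  # illustrative country codes
    metadata, records = requests.get(URL.format(iso=iso), timeout=30).json()
    for rec in records or []:
        if rec["value"] is not None:
            print(iso, rec["date"], rec["value"])
```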

In the second presentation, simulated students were treated with one of several education interventions to see how their abilities changed over time. These interventions could be applied uniformly to everyone or targeted at those in the bottom half of the distribution. Each child in the model had three abilities that interacted in different ways, and interventions could target just one of these abilities or several in combination. Combining these models with empirical data on programs like Head Start is an interesting research program.
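
The presenters’ model had three interacting abilities; a stripped-down sketch with a single scalar ability (all numbers invented, not theirs) still shows the uniform-versus-targeted comparison:

```python
# An abstract sketch of uniform vs. targeted interventions -- invented
# numbers, not the presenters' model. Each child has one scalar ability;
# the targeted policy treats only the bottom half of the distribution.
import random
import statistics

random.seed(0)
abilities = [random.gauss(100, 15) for _ in range(1000)]

def uniform(pop, boost=2.0):
    return [a + boost for a in pop]

def targeted(pop, boost=4.0):
    cutoff = statistics.median(pop)
    return [a + boost if a < cutoff else a for a in pop]

for policy in (uniform, targeted):
    treated = abilities
    for _ in range(5):  # five periods of intervention
        treated = policy(treated)
    print(policy.__name__, round(statistics.mean(treated), 1))
```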

The third presentation also used a computational model. Finding equilibrium networks of interstate alliances is incredibly difficult (if not impossible) to do analytically when the number of states is large. The model starts with pre-specified utility functions and runs until the network reaches an equilibrium. Changing the starting values allows for the discovery of multiple equilibria. This model will also be combined with empirical data in the future. A toy version of the search is sketched below.
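
Here is that toy version, with an invented utility function rather than the presenter’s: flip any link that raises the two parties’ joint utility, and stop when no flip helps.

```python
# Toy best-response dynamics for alliance networks -- invented utilities,
# not the presenter's model. Each accepted flip raises the joint utility of
# the two states involved, so total utility rises and the loop terminates.
import itertools, math, random

random.seed(1)
N = 6
links = {frozenset(p): random.random() < 0.5
         for p in itertools.combinations(range(N), 2)}

def utility(i):
    degree = sum(links[frozenset((i, j))] for j in range(N) if j != i)
    return math.sqrt(degree) * 2.0 - 0.9 * degree  # diminishing returns, linear cost

changed = True
while changed:
    changed = False
    for i, j in itertools.combinations(range(N), 2):
        key = frozenset((i, j))
        before = utility(i) + utility(j)
        links[key] = not links[key]          # tentatively flip the link
        if utility(i) + utility(j) <= before:
            links[key] = not links[key]      # revert if no joint gain
        else:
            changed = True

print(sorted(tuple(sorted(k)) for k, v in links.items() if v))
```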

For the fourth and final presentation, one participant collected data on campaign events in Germany for each of the political parties during the current election cycle. This reminded me of a Washington Post website (now taken down) detailing campaign visits in 2008 that I scraped last year and used in lab once this semester.

These examples show the wide variety of uses for programming in social science. From saving time in data collection to building models that would be impossible without algorithmic help, a little bit of programming knowledge goes a long way. Hopefully courses like this will become more prominent in social science graduate (and undergraduate) programs over the coming years. Thanks to Josh and all the class participants for making it a great semester!

____________

Note: I am happy to give each of the presenters credit for their work, but did not want to reveal their names here due to privacy concerns. If you have questions about any of the presentations I can put you in touch with their authors via email.

What Can Novels Teach Us?

Is it worthwhile for a social scientist to read fiction? What can novels teach us about human behavior? This post summarizes the work of several authors who would answer the first question with a resounding “yes,” and describes their arguments about how novels help us understand social behavior.

Most recently I had the pleasure of reading Michael Suk-Young Chwe‘s new book, Jane Austen, Game Theorist. Austen herself likely would have preferred the term “imaginist,” which is how the title character in Emma describes herself, referring to her strategic thinking abilities. Chwe’s argument in the book is that Austen is systematically analyzing strategic thinking through her novels. Austen certainly understood that novels could help teach social behavior: she writes in Northanger Abbey that novels contain “the most thorough knowledge of human nature [and] the happiest delineation of its varieties.” On Wednesday we will take a more detailed look at Chwe’s argument. In the meantime you can find a presentation summarizing the book here.

Austen would be in good company with Ariel Rubinstein. The central thesis of his recent book, Economic Fables, is straightforward: “Economic models are not more, but also not less, than stories–fables.” (You can read the book for free here, or see Ariel explain the motivation behind the book in this video.) Rubinstein’s view is actually the converse of Austen’s: he is not arguing that works of fiction are illustrative of human behavior, but that many social science models are themselves useful fictions. (Ed Leamer has advanced a similar view with a more practical twist in his book, Macroeconomic Patterns and Stories.)

Tyler Cowen helps to identify the key differences and similarities between models and novels in his paper, “Is a Novel a Model?” Here is the abstract:

I defend the relevance of fiction for social science investigation. Novels can be useful for making some economic approaches — such as behavioral economics or signaling theory — more plausible. Novels are more like models than is commonly believed. Some novels present verbal models of reality. I interpret other novels as a kind of simulation, akin to how simulations are used in economics. Economics can, and has, profited from the insights contained in novels. Nonetheless, while novels and models lie along a common spectrum, they differ in many particulars. I attempt a partial account of why we sometimes look to models for understanding, and other times look to novels.

This interview with Tyler contains a summary of his perspective on novels and much more.

Cowen’s former GMU Economics colleague Russ Roberts also agrees that novels are useful for understanding social behavior–so much so that he has written three of them. Each of the novels illustrates one main economic lesson, and all of them support the idea of free markets for solving problems. Roberts interviewed Rubinstein on the EconTalk podcast, where they discuss some of the ideas that led to Rubinstein’s new book.

Overall this attention to useful fictions is a positive development for social science. Novels can help reach a much wider audience than journal articles and many nonfiction books. One danger–which we are far from now but which still exists–is that we come to value the elegance of the novel itself (the language it uses) rather than the lessons it teaches. Another downside is that it is difficult to convey the policy relevance of a novel. Nevertheless, teaching lessons about human behavior in an enjoyable and memorable form is a huge step forward from most contemporary social science.

Micro-Institutions Everywhere: Book ID Numbers

Pink identifies the prefix, currently only 978 or 979. Purple is the registration group element, identifying the geographical source of the book (1-5 digits). In light green is the publisher or imprint’s ID, up to 7 digits. In yellow is the publication element, identifying the edition or format of the book. Highlighted in grey is the check digit, used to verify the number. “5” in red identifies US dollars as the currency for the price, highlighted in dark green.

If you are a bookworm like me, you have evidence of this micro-institution all around you. Grab a nearby book and look at the back cover, or a couple of pages inside the front cover. You will see a series of numbers that uniquely identify the book: its International Standard Book Number (ISBN). That 10- or 13-digit number serves as the worldwide identifier for books, helping customers at online retailers like AbeBooks, Half.com, and Amazon be sure that they are purchasing the right reading material without physically inspecting the product.
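
The check digit mentioned in the caption above is mechanical enough to verify yourself: in an ISBN-13, the digits are weighted alternately 1 and 3, and the check digit brings the weighted sum to a multiple of 10. A quick sketch:

```python
# ISBN-13 validation: weight digits alternately 1 and 3; a valid ISBN's
# weighted sum (including the check digit) is a multiple of 10.
def isbn13_check_digit(first12: str) -> int:
    weighted = sum(int(d) * (1 if i % 2 == 0 else 3)
                   for i, d in enumerate(first12))
    return (10 - weighted % 10) % 10

def is_valid_isbn13(isbn: str) -> bool:
    digits = isbn.replace("-", "")
    return (len(digits) == 13 and digits.isdigit()
            and isbn13_check_digit(digits[:12]) == int(digits[12]))

print(is_valid_isbn13("978-0-306-40615-7"))  # a commonly cited valid example
```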

Ironically, it is those same online marketplaces and their accompanying e-readers that now endanger the future of the ISBN. The supply of ISBNs is finite, you understand, and demand is high:

The International Standard Book Number (ISBN), invented in Britain in 1965, took off rapidly as an international system for classifying books, with 150 agencies (one per country, with two for bilingual Canada) now issuing the codes. Set up by retailers to ease their distribution and sales, it increasingly hampers new, small and individual publishers. Yet digital publishing is weakening its monopoly.

Publishers who were in at the beginning got great blocks of ISBNs. Many have plenty still in stock. Some countries, including Canada, Hungary and Croatia, make them free to bolster book publishing. But in Britain, America and Japan, where ISBNs are needed for any hope of mainstream publication, they are costly.

Self-published writers understandably do not want to pay for a costly ID number when they are making small margins off of an e-book. If they are only selling through a single retailer (say, an Amazon Kindle edition) there is little incentive to get a unique number–customers will be able to find the book without it. And alternatives are cropping up:

Amazon has introduced the Amazon Standard Identification Number (ASIN). Digital Object Identifiers (DOI) tag articles in academic journals. Walmart, an American supermarket chain, has a Universal Product Code (UPC) for everything it stocks—including books. Humans are also getting labels: the Open Researcher and Contributor ID system (ORCID) identifies academics by codes, not their names. And ISBNs are not mandatory at Google Books.

It is foreseeable that one of these options will emerge as a privately-provided institution, replacing the ISBN. The transition is unlikely to be smooth, however–switching equilibria rarely is. As you trace your finger across an ISBN on a printed page, you are not only touching a micro-institution. You may be holding history in your hands.

Statistical Thinking and the Birth of Modern Computing

John von Neumann and the IAS computer, 1945

What do fighter pilots, casinos, and streetlights all have in common? These three disparate topics are all the subject of statistical thinking that led to (and benefited from) the development of modern computing. This process is described in Turing’s Cathedral by George Dyson, from which most of the quotes below are drawn. Dyson’s book focuses on Alan Turing far less than the title would suggest, in favor of John von Neumann’s work at the Institute for Advanced Study (IAS). Von Neumann and the IAS computing team are well-known for building the foundation of the digital world, but before Turing’s Cathedral I was unaware of the deep connection with statistics.

Statistical thinking first pops up in the book with Julian Bigelow’s list of fourteen “Maxims for Ideal Prognosticators” for predicting aircraft flight paths on December 2, 1941. Here is a subset (p. 112):

7. Never estimate what may be accurately computed.

8. Never guess what may be estimated.

9. Never guess blindly.

This early focus on estimation will reappear in a moment, but for now let’s focus on the aircraft prediction problem. With the advent of radar, it became possible to fly sorties at night or in weather with poor visibility. In a dark French sky or over a foggy Belgian city it could be tough to tell who was who until,

otherwise adversarial forces agreed on a system of coded signals identifying their aircraft as friend or foe. In contrast to the work of wartime cryptographers, whose job was to design codes that were as difficult to understand as possible, the goal of IFF [Identification Friend or Foe] was to develop codes that were as difficult to misunderstand as possible…. We owe the existence of high-speed digital computer to pilots who preferred to be shot down intentionally by their enemies rather than accidentally by their friends. (p. 116)

In statistics this is known as the distinction between Type I and Type II errors, which we have discussed before. Pilots flying near their own lines likely figured there was a greater probability that their own forces would make a mistake than that the enemy would detect them–and going down as a result of friendly fire is no one’s idea of fun. This emergence of a cooperative norm in the midst of combat is consistent with stories from other conflicts in which the idea of fairness is used to compensate for the rapid progress of weapons technology.

Chapter 10 of the book (one of my two favorites along with Chapter 9, Cyclogenesis) is entitled Monte Carlo. Statistical practitioners today use this method to simulate statistical distributions that are analytically intractable. Dyson weaves the development of Monte Carlo in with a recounting of how von Neumann and his second wife Klari fell in love in the city of the same name. A full description of this method is beyond the scope of this post, but here is a useful bit:

Monte Carlo originated as a form of emergency first aid, in answer to the question: What to do until the mathematician arrives? “The idea was to try out thousands of such possibilities and, at each stage, to select by chance, by means of a ‘random number’ with suitable probability, the fate or kind of event, to follow it in a line, so to speak, instead of considering all branches,” [Stan] Ulam explained. “After examining the possible histories of only a few thousand, one will have a good sample and an approximate answer to the problem.”
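
The classic classroom illustration of the method, sketched below, estimates pi by random sampling rather than by “considering all branches.” This example is mine, not one from the book.

```python
# A minimal Monte Carlo sketch in Ulam's spirit: estimate pi by sampling
# random points in the unit square and counting how many land inside the
# quarter circle, instead of computing the area analytically.
import random

random.seed(0)
n = 100_000
inside = sum(random.random() ** 2 + random.random() ** 2 <= 1.0
             for _ in range(n))
print("pi is approximately", 4 * inside / n)
```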

For a more comprehensive overview of this development in the context of Bayesian statistics, check out The Theory That Would Not Die.

The third and final piece of the puzzle for our post today is the well-known but not sufficiently appreciated distinction between correlation and causation. Philip Thompson, a meteorologist who joined the IAS group in 1946, learned this lesson at the age of 4 and counted it as the beginning of his “scientific education”:

[H]is father, a geneticist at the University of Illinois, sent him to post a letter in a mailbox down the street. “It was dark, and the streetlights were just turning on,” he remembers. “I tried to put the letter in the slot, and it wouldn’t go in. I noticed simultaneously that there was a streetlight that was flickering in a very peculiar, rather scary, way.” He ran home and announced that he had been unable to mail the letter “because the streetlight was making funny lights.”

Thompson’s father seized upon this teachable moment, walked his son back to the mailbox and “pointed out in no uncertain terms that because two unusual events occurred at the same time and at the same place it did not mean that there was any real connection between them.” Thus the four-year-old learned a lesson that many practicing scientists still have not. This is also the topic of Chapter 8 of How to Lie with Statistics and a recent graph shared by Cory Doctorow.

The fact that these three lessons on statistical thinking coincided with the advent of digital computing, along with a number of other anecdotes in the book, impressed upon me the deep connection between these two fields of thought. Most contemporary Bayesian work would be impossible without computers. It is also possible that digital computing would have come about much differently without an understanding of probability and the scientific method.