Political Forecasting and the Use of Baseline Rates

As Joe Blitzstein likes to say, “Thinking conditionally is a condition for thinking.” Humans are not naturally good at this skill. Consider the following example: Kelly is interested in books and keeping things organized. She loves telling stories and attending book clubs. Is it more likely that Kelly is a bestselling novelist or an accountant?

Many of the “facts” about Kelly in that story might lead you to answer that she is a novelist. Only one–her sense of organization–might have pointed you toward an accountant. But think about the overall probability of each career. Very few bookworms become successful novelists, and there are many more accountants than (successful) authors in the modern workforce. Conditioning on the baseline rate helps make a more accurate decision.
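To make the base-rate logic concrete, here is a minimal sketch of the Kelly example using Bayes' rule. Every number in it is hypothetical: I assume one bestselling novelist for every 1,000 accountants, and that the description of Kelly fits a novelist 20 times better than it fits an accountant. The function name is mine, and the sketch treats the two careers as the only candidates.

```python
def posterior_novelist(prior_novelist, prior_accountant, like_novelist, like_accountant):
    """P(novelist | description), treating the two careers as the only options."""
    weighted_novelist = prior_novelist * like_novelist
    weighted_accountant = prior_accountant * like_accountant
    return weighted_novelist / (weighted_novelist + weighted_accountant)

# Hypothetical inputs: priors are relative head counts, likelihoods are
# how well the description of Kelly fits each career.
p = posterior_novelist(
    prior_novelist=1,
    prior_accountant=1000,
    like_novelist=0.8,
    like_accountant=0.04,
)
print(f"P(novelist | description) = {p:.3f}")
```

Even with a description that favors the novelist twenty to one, the thousand-to-one base rate dominates under these assumed numbers, and accountant remains by far the better guess.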

I make a similar point–this time applied to political forecasting–in a recent post on the blog of Mike Ward’s lab (of which I am a member):

One piece of advice that Good Judgment forecasters are often reminded of is to use the baseline rate of an event as a starting point for their forecast. For example, insurgencies are a very rare event on the whole. For the period January, 2001 to August, 2013, insurgencies occurred in less than 10 percent of country-months in the ICEWS data set.

From this baseline, we can then incorporate information about the specific countries at hand and their recent history… Mozambique has not experienced an insurgency for the entire period of the ICEWS dataset. On the other hand, Chad had an insurgency that ended in December, 2003, and another that extended from November, 2005, to April, 2010. For the duration of the ICEWS data set, Chad has experienced an insurgency 59 percent of the time. This suggests that our predicted probability of insurgency in Chad should be higher than for Mozambique.
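The 59 percent figure for Chad can be roughly reproduced from the dates quoted above. This is only a sketch under two assumptions of mine: that the first insurgency was already under way when the data begin in January 2001, and that both endpoints of each spell count as insurgency months. The helper name is hypothetical and this is not the code used for the original post.

```python
def months_inclusive(start, end):
    """Count calendar months from start to end, each a (year, month) pair, inclusive."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1]) + 1

# Observation window quoted in the post: January 2001 to August 2013.
window = ((2001, 1), (2013, 8))

# Assumption: the first spell was ongoing at the start of the window.
chad_spells = [((2001, 1), (2003, 12)), ((2005, 11), (2010, 4))]

total_months = months_inclusive(*window)
chad_months = sum(months_inclusive(start, end) for start, end in chad_spells)
chad_rate = chad_months / total_months
mozambique_rate = 0 / total_months  # no insurgency months in the window

print(f"Chad: {chad_months} of {total_months} months ({chad_rate:.0%})")
print(f"Mozambique: {mozambique_rate:.0%}")
```

Under those assumptions the count is 90 of 152 months, which matches the 59 percent quoted above.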

I started writing that post before rebels in Mozambique broke their treaty with the government. Maybe I spoke too soon, but the larger point is that baselines are the starting point–not the final product–of any successful forecast.

Having more data is useful, as long as it contributes more signal than noise. That’s what ICEWS aims to do, and I consider it a useful addition to the toolbox of forecasters participating in the Good Judgment Project. For more on this collaboration, as well as a map of insurgency rates around the globe as measured by ICEWS, see the aforementioned post here.

Review: Everything is Obvious

Everything is Obvious (Once You Know the Answer), by Duncan Watts, had been on my wishlist for a while before my sister gave it to me for my birthday. I was already sympathetic to the book’s key point: many conclusions of social science research that seem obvious in retrospect could not have been distinguished from other equally likely hypotheses a priori. In this post I briefly comment on one example from the book, its core argument, and its organization and style.

My favorite example from the book comes from Paul Lazarsfeld’s discussion of The American Soldier. That report studied over 600,000 troops during and just after WWII. Lazarsfeld listed six key findings, which I quote here directly:

  1. Better educated men showed more psycho-neurotic symptoms than those with less education. (The mental instability of the intellectual as compared to the more impassive psychology of the man-in-the-street has often been commented on.)
  2. Men from rural backgrounds were usually in better spirits during their Army life than soldiers from city backgrounds. (After all, they are more accustomed to hardships.)
  3. Southern soldiers were better able to stand the climate in the hot South Sea Islands than Northern soldiers (of course, Southerners are more accustomed to hot weather).
  4. White privates were more eager to become non-coms than Negroes. (The lack of ambition among Negroes is almost proverbial.)
  5. Southern Negroes preferred Southern to Northern white officers. (Isn’t it well known that Southern whites have a more fatherly attitude toward their “darkies”?)
  6. As long as the fighting continued, men were more eager to be returned to the States than they were after the German surrender. (You cannot blame people for not wanting to be killed.)

Spend a minute thinking about these results. Although some are undoubtedly politically incorrect, others seem intuitive, don’t they? As Watts says, a reader could easily imagine that, “Rural men in the 1940s were accustomed to harsher living standards and more physical labor than city men, so naturally they had an easier time adjusting. Why did we need such a vast and expensive study to tell me what I could have figured out on my own?” (p. xvi)

Lazarsfeld was playing a trick, though. In fact, the study’s conclusions were just the opposite–and with the benefit of hindsight those could seem just as obvious. What seems like common sense after the fact was just one of several possibilities before the research was conducted.

According to Watts, common sense biases like these come in three forms. The first error is imagining others’ motives to be like our own if we were put in their situation–what I call armchair theorizing. The second error is failing to recognize that “more is different.” Models that portray the actions of groups as a simple linear aggregation of individuals are often dead wrong at predicting the outcomes of social processes. More often, social dynamics are nonlinear (see, for example, research on “information cascades”). Thirdly, we often suffer from hindsight bias, failing to recognize that historical outcomes we now take for granted were highly contingent and not easily predictable beforehand.

The book is divided into two parts: “Common Sense” and “Uncommon Sense”. The first part focuses on examples of the three common sense biases at work, and the second provides recommendations for social science research. Both parts refer heavily to the author’s previous work and other “big think” books–a genre that I hadn’t realized this book belonged to before reading.

Overall, Everything is Obvious is a quick read that will bring a social science outsider up to speed in cognitive biases and suggest some resources for learning more. For the practicing social scientist already familiar with the work of Daniel Kahneman and others like him, I would recommend skipping the first half in favor of the chapter summaries in the appendix, and reading the second half with a pen in hand to mark areas that you want to follow up on in the original literature. You can also read Andrew Gelman’s review here.

How Could Hurricane Sandy Affect the Election?

Many political scientists and commentators have been asking this question in recent days. For one answer, you can see Mike Munger’s thoughts in Duke Today yesterday. In this post I will consider one unlikely but interesting scenario: what could happen if elections in New Jersey or New York are delayed until after Tuesday?

According to Article II, Section 1, Clause 4 of the US Constitution,

The Congress may determine the Time of chusing (sic) the Electors, and the Day on which they shall give their Votes; which Day shall be the same throughout the United States.

Obviously the founders felt that it was important that polls be held simultaneously across the states. However, there is some leeway at the state level for special circumstances like a natural disaster.

How would voters in New York and New Jersey be affected if the election were already decided before they cast their ballots? These are fairly predictable states, unlikely to tip the election on their own. If one candidate already had enough electoral votes for victory and there were zero chance that your vote would matter, I can think of no explanation for turning out other than that voting is a form of self-expression.

There are two examples that help shed light on this question. One that came to mind was a paper by Thomas Jensen and Asger Lau Andersen on how exit polls affect voter turnout. Their approach is a game theoretic model, but they cite a 2009 referendum vote in Denmark as a motivating example (ungated):

In order to pass, the proposal therefore had to overcome two obstacles: One, a majority of the votes cast in the referendum must be in favor of the proposal. And two, at least 40% of all eligible voters must vote in favor of the proposal. In the weeks preceding the June 7 election, there was no doubt that only the latter of these requirements had the potential to become binding. In a Gallup poll released a week before the election, 84% of respondents indicated that they approved of the proposal to change the law. However, only 40.2% responded that they would show up at polls and vote in favor of the proposal.

On the afternoon of the election day, TV2, a major Danish TV channel, published the results of an exit poll, which predicted that 37.9% of all eligible voters would cast a vote in favor of the proposal to change the law. However, during the evening the situation turned around with pollsters reporting a considerable increase in turnout. In the end, the official result was that 45.1% of all eligible voters had voted in favor of the proposal, which corresponded to 85.4% of all votes cast. Thus, the proposal passed with a comfortable margin.
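The figures in that passage imply a turnout number that the authors do not state directly. Here is a short sketch of the arithmetic, using only the percentages quoted above; the function and variable names are my own, and the two-part rule is my paraphrase of the requirements as the passage describes them.

```python
THRESHOLD = 0.40  # required share of all eligible voters voting in favor

def meets_eligibility_rule(yes_share_of_eligible):
    """Second hurdle: at least 40% of all eligible voters vote yes."""
    return yes_share_of_eligible >= THRESHOLD

def passes(yes_share_of_cast, yes_share_of_eligible):
    """Both hurdles: a majority of votes cast and the eligibility threshold."""
    return yes_share_of_cast > 0.5 and meets_eligibility_rule(yes_share_of_eligible)

exit_poll_ok = meets_eligibility_rule(0.379)  # afternoon exit poll projection
final_ok = passes(yes_share_of_cast=0.854, yes_share_of_eligible=0.451)

# yes / eligible = (yes / cast) * (cast / eligible), so turnout is the ratio.
implied_turnout = 0.451 / 0.854
# Lowest turnout that clears the threshold at the final yes share of votes cast.
min_turnout_needed = THRESHOLD / 0.854

print(exit_poll_ok, final_ok)
print(f"implied turnout: {implied_turnout:.1%}, minimum needed: {min_turnout_needed:.1%}")
```

On these numbers the exit poll projection fell short of the threshold while the final result cleared it, with an implied turnout of roughly 53 percent against a minimum of roughly 47 percent.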

So in this instance, receiving information about the likelihood of a particular outcome before polls closed affected the vote. But that still does not answer the question of how voters react when they have zero chance of changing the result.

For that, we turn to an example from France. The French government tries to avoid the fate of the Danish referendum by taking several strong measures. From Wikipedia:

Elections are always held on Sundays in France. The campaigns end at midnight the Friday before the election; then, on election Sunday, by law, no polls can be published, no electoral publication and broadcasts can be made. The voting stations open at 8 am and close at 6 pm in small towns or at 8 pm in cities, depending on prefectoral decisions. By law, publication of results or estimates is prohibited prior to that time; such results are however often available from the media of e.g. Belgium and Switzerland, or from foreign Internet sites, prior to that time. The first estimate of the results are thus known at Sunday, 8pm, Paris time; one consequence is that voters in e.g. French Guiana, Martinique and Guadeloupe knew the probable results of elections whereas they had not finished voting, which allegedly discouraged them from voting. For this reason, since the 2000s, elections in French possessions in the Americas, as well as embassies and consulates there, are held on Saturdays as a special exemption.

Gaining information about the expected outcome of an election does appear to affect turnout. It seems that if voters can still make a difference, they are more likely to show up and vote for their desired outcome, while if the election is already decided they will stay home. As I said at the outset, this is an unlikely scenario, but you can bet that political scientists will be keeping an eye on this chance for a true natural experiment.

New Conflict Forecasting Website

Wardlab is the working group run by Michael D. Ward. The lab has a new website: mdwardlab.com. You can find out about our ongoing projects, download software packages, or follow the Conflict Forecast blog.

The team includes some really smart people, several of whom have their own websites.

The site is still in a beta version, but many in this blog’s audience are interested in political forecasting and conflict, so I thought I would go ahead and share.

PolMeth 2012 Round-Up, Part 2

A Map from Drew Linzer’s Votamatic

Yesterday I discussed Thursday’s papers and posters from the 2012 Meeting of the Political Methodology Society. Today I’ll describe the projects I saw on Friday, again in the order listed in the program. Any attendees who chose a different set of panels are welcome to write a guest post or leave a comment.

First I attended the panel for Jacob Montgomery and Josh Cutler’s paper, “Computerized Adaptive Testing for Public Opinion Research.” (pdf; full disclosure: Josh is a coauthor of mine on other projects, and Jacob graduated from Duke shortly before I arrived.) The paper applies a strategy from educational testing to survey research. On the GRE, if you get a math problem correct, the next question will be more difficult. Similarly, when testing for a latent trait like political sophistication, a respondent who can identify John Roberts likely also recognizes Joe Biden. Leveraging this technique can greatly reduce the number of survey questions required to accurately place a respondent on a latent dimension, which in turn can reduce non-response rates and/or survey costs.
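To give a flavor of why adaptive questioning saves questions, here is a toy sketch. It is not the method from the paper: real computerized adaptive testing relies on item response theory with probabilistic answers, whereas this sketch assumes a respondent answers an item correctly exactly when their ability is at least the item's difficulty. Under that simplifying assumption, adaptive selection reduces to a binary search over items sorted by difficulty. All names and numbers are hypothetical.

```python
def adaptive_placement(difficulties, answers_correctly):
    """Place a respondent among items sorted by ascending difficulty.

    answers_correctly takes an item index and returns True or False.
    Returns (number of items the respondent can answer, questions asked).
    """
    lo, hi = 0, len(difficulties)  # items below lo are passed, items from hi up are failed
    asked = 0
    while lo < hi:
        mid = (lo + hi) // 2  # next question: the middle of the unresolved range
        asked += 1
        if answers_correctly(mid):
            lo = mid + 1  # correct answer, so move on to harder items
        else:
            hi = mid  # wrong answer, so move back to easier items
    return lo, asked

# Hypothetical battery of 15 items with evenly spaced difficulties.
difficulties = [i / 16 for i in range(1, 16)]
ability = 0.6  # hypothetical respondent

placement, asked = adaptive_placement(
    difficulties, lambda i: ability >= difficulties[i]
)
print(f"placed above {placement} of {len(difficulties)} items using {asked} questions")
```

In this toy setup the respondent is placed with 4 questions instead of 15, which is the intuition behind the savings in survey length.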

Friday’s second paper was also related to survey research: “Validation: What Big Data Reveal About Survey Misreporting and the Real Electorate” by Stephen Ansolabehere and Eitan Hersh (pdf). This was the first panel I attended that provoked a strong critical reaction from the audience. There were two major issues with the paper. First, the authors contracted out the key stage in their work–validating data by cross-referencing other data sets–to a private, partisan company (Catalist) in a “black box” way, meaning they could not explain much about Catalist’s methodology. At a meeting of methodologists this is very disappointing, as Sunshine Hillygus pointed out. Second, their strategy for “validating the validator” involved purchasing a $10 data set from the state of Florida, deleting a couple of columns, and seeing whether Catalist could fill those columns back in. Presumably they paid Catalist more than $10 to do this, so I don’t see why that would be difficult at all. Discussant Wendy Tam Cho was my favorite for the day, as she managed to deliver a strong critique while maintaining a very pleasant demeanor.

In the afternoon, Drew Linzer presented on “Dynamic Bayesian Forecasting of Presidential Elections in the States” (pdf). I have not read this paper, but thoroughly enjoyed Linzer’s steady, confident presentation style. The paper is also accompanied by a neat election forecast site, which is the source of the graphic above. As of yesterday morning, the site predicted 334 electoral votes for Obama and 204 for Romney. One of the great things about this type of work is that it is completely falsifiable: come November, forecasters will be right or wrong. Jamie Monogan served as the discussant, and helped to keep the mood light for the most part.

Jerry Reiter of the Duke Statistics Department closed out the afternoon with a presentation on “The Multiple Adaptations of Multiple Imputation.” I was unaware that multiple imputation was still considered an open problem, but this presentation and a poster by Ben Goodrich and Jonathan Kropko (“Assessing the Accuracy of Multiple Imputation Techniques for Categorical Variables with Missing Data”) showed me how wrong I was. Overall it was a great conference and I am grateful to all the presenters and discussants for their participation.

Transportation as an information problem

In some previous posts I looked at Joe’s question about the causes of traffic and compared them to mass transit (i.e., rail) options. I recently returned from an excellent trip to the Bay Area–in which I found myself using BART even less than expected–and realized that I neglected to mention a very important issue when it comes to transportation decisions: information problems. In the language of social science, “information problem” is a term used to describe a situation in which a decision-maker has less information than they need to make an efficient decision*. There is a whole branch of study devoted to these issues, but here I apply them to traffic/transportation.

Before, we talked about issues of allocation: supply and demand of road space. Often, however, even at high-traffic times there is still road space somewhere; it just happens not to be on the road you are on. If you had more information about which roads were crowded and which weren’t, you could choose an alternate route. (Obviously the geographic layout of the area you’re attempting to traverse makes a difference, but in most US metro areas there are multiple viable options for getting from point A to point B.)

I’ve been debating with myself whether supply/demand or information problems are more fundamental to the issue of traffic, and haven’t reached a firm conclusion, but I can say that given the current state of transportation supply we can reach more efficient outcomes by improving access to information. There are basically three ways that I can think of to use information in deciding your transportation means/route:

1. Habit–In situations of uncertainty, risk-averse individuals (i.e., most people) will often simply make the decision they made the time before. In the most drastic biological terms, this is akin to natural selection: I didn’t die last time I made this choice, so it’s probably OK this time. The problem is that being “OK” and merely surviving doesn’t get us very near to making an optimal decision. We observe people using habit quite often on daily commutes, and to some extent this is rational: they may spend more time in traffic, but they take less time to actually think about their decision. When time is valuable, though, habit is often a poor decision-maker.

2. Experience–Continuing to think about the average daily commuter, accumulated experience over multiple cycles of traffic may help to make more efficient decisions. These cycles may be daily (“if I leave at 6am there is less traffic than 8am”), weekly (in many US cities, rush hour starts earlier on Friday afternoons than it does Mon-Thurs), or even annually (think Memorial Day). The problem with experience is that it takes multiple cycles to become accustomed to patterns, and humans aren’t great at detecting patterns. On the plus side, experience can be pretty good at recognizing recurring events that don’t conform to a set time schedule, like baseball games or concerts, but is bad at detecting rare events like accidents.

3. Real-time info–Nowadays real-time traffic info is available for many major US metro areas through both city-specific services and Google Maps. In any given city I tend to think that local info (e.g. TranStar in Houston and 511 in the Bay Area for automobiles, and various iPhone apps for Metro transit in SF/DC) is better than what’s on Google Maps, but Google Maps is better than nothing if you’re in a new place. As people become more aware of these services I think they will be able to make more efficient decisions about whether and where to drive. Real-time info, alas, cannot look into the future and tell you whether traffic is getting thicker or thinner on any stretch of road at a given point in time.

The best solution is probably combining your own experience with real-time info. For example, if you see that a piece of road is not congested at 3:30pm but know that it will become clogged within the next hour, that’s better than assuming it will be traffic-free. But real-time info beats habit or experience because it can show you the results of (relatively) unpredictable events like accidents or weather.
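Here is one hypothetical way to sketch that combination: scale the live travel time on each route by how that route usually changes over the coming hour. The scaling rule, the route names, and every number below are my own assumptions for illustration, not a description of how any traffic service works.

```python
def expected_minutes(current_minutes, typical_now, typical_during_trip):
    """Adjust a live travel time by the route's usual change over the trip window."""
    return current_minutes * (typical_during_trip / typical_now)

# Hypothetical 3:30pm readings. The highway is clear now but usually
# doubles in travel time within the hour; surface streets stay flat.
routes = {
    "highway": expected_minutes(current_minutes=20, typical_now=20, typical_during_trip=40),
    "surface streets": expected_minutes(current_minutes=28, typical_now=30, typical_during_trip=30),
}

best = min(routes, key=routes.get)
print(best, routes[best])
```

Real-time info alone would pick the highway (20 minutes against 28), while adding experience flips the choice to surface streets under these assumed numbers.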

A lot of people think that prediction is one of the main benefits of social science, and complain when it fails (for example, a spate of recent news pieces on “why didn’t political scientists predict the Arab Spring?”). I for one agree with what Marc Morjé Howard wrote recently at the Monkey Cage comparing the Arab Spring to the 1989 revolutions in Eastern Europe:

Neither set of movements was predicted—even by experts.  Although for some this may raise questions about the value of “expertise,” in my view it puts into question the importance of prediction.  Contingent events and human behavior in unknown situations are impossible to predict.  The fact that most scholars failed to predict the particular decisions made by leaders like Gorbachev, Ceausescu, Ben-Ali, or Mubarak does not necessarily mean that they did not understand the regime or society.  And it certainly does not mean we should stop studying countries, areas, and languages.  Social science still has much to offer.

I could go on but this is over 800 words so I’ll stop. This will likely be a conversation that I pick up again in the future, so please feel free to throw in your $0.02 in the comments.


*This definition is not uncontroversial. My wording is mainly intended to reflect the consequences of the information problem without including scenarios described as “rational irrationality” or “limited attention.”