What Did Manifest Destiny Look Like?

“Manifest Destiny was the belief widely held by Americans in the 19th century that the United States was destined to expand across the continent. The concept, born out of ‘a sense of mission to redeem the Old World’, was enabled by ‘the potentialities of a new earth for building a new heaven.’” (Wikipedia, citing Frederick Merk)

Now, Michael Porath has told the story of Manifest Destiny in a series of 141 maps. The main technical trick is that Porath designed the site in HTML5, so it has some nice interactive features. The maps appear on a single page in four columns, but you can click any of them for a close-up with an explanation of the changes, or mouse over a region to see which political entity it belonged to at the time (e.g. unorganized territory, Spanish colony).

There are two additions that I think would improve this project. The first is a sense of time scale: some of the maps are only a month apart (January and February 1861, for example) while others are separated by several decades (March 1921 and January 1959). Adding a time scale would allow for a second feature: an animation showing areas of change and continuity over time. An excellent example of this is David Sparks’ choropleth maps of presidential voting over time. I do not know whether this could be done within Porath’s HTML5 setup, but it is often useful to think about changes to graphical displays (additions or subtractions) that would help convey meaningful information. What other suggestions do you have for these maps?

Afghanistan Casualties Over Time and Space

The data comes from the Defense Casualty Analysis System for Operation Enduring Freedom. Here it is over time:

Notice the seasonality of deaths in Afghanistan, likely due to the harsh winters. Here is the same data plotted across space, by service members’ home towns:

Not surprisingly, hometowns of OEF casualties are similar to those of service members killed in Iraq.
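
If you want to reproduce the over-time chart, here is a minimal sketch in R, assuming the DCAS records have been exported to a CSV with one row per casualty and a date-of-death column; the file and column names below are placeholders rather than actual DCAS field names:

# Placeholder file and column names: adjust to match your DCAS export
oef <- read.csv('oef_casualties.csv', stringsAsFactors=FALSE)
oef$death_date <- as.Date(oef$death_date, "%m/%d/%Y")

# Count deaths per month and plot the series to see the seasonal pattern
oef$month <- format(oef$death_date, "%Y-%m")
monthly <- table(oef$month)
plot(as.Date(paste(names(monthly), "01", sep="-")), as.numeric(monthly),
  type='l', xlab='', ylab='Deaths per month')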

Eliminate File Redundancy with Ruby

Say you have a file with many repeated, unnecessary lines that you want to remove. For safety’s sake, you would rather make an abbreviated copy of the file than overwrite the original. Ruby makes this a cinch. You just iterate over the file, putting every line the script has already “seen” into a hash (Ruby’s dictionary). If a line is not in the hash, it must be new, so write it to the output file. Here’s the code, designed with .tex files in mind but easily adaptable:

puts 'Filename?'
filename = gets.chomp

# Read filename.tex and write the de-duplicated copy to filename2.tex
input = File.open(filename+'.tex')
output = File.open(filename+'2.tex', 'w')

# Hash of lines that have already been written to the output
seen = {}

input.each do |line|
  if (seen[line])
    # duplicate line; skip it
  else
    output.write(line)
    seen[line] = true
  end
end

input.close()
output.close()

Where would this come in handy? Well, the .tex extension probably already gave you a clue that I am reducing redundancy in a \LaTeX file. In particular, I have an R plot generated as a tikz graphic. The R plot includes a rug at the bottom (tick marks indicating data observations), but the data set includes over 9,000 observations, so many of the tick marks are drawn right on top of each other. The \LaTeX compiler got peeved at having to draw so many lines, so Ruby helped it out by eliminating the redundancy. One special tweak for using the script above on tikz graphics files is to change the line

if (seen[line])

to

if (seen[line]) && !(line.include? 'node') && !(line.include? 'scope') && !(line.include? 'path') && !(line.include? 'define')

if your plot has multiple panes (e.g. par(mfrow=c(1,2)) in R), so that the script does not discard seemingly redundant lines that are actually specifying new panes. The modified line is a little long and messy, but it works, and that was the main goal here. The resulting \LaTeX file compiles easily and more quickly than it did with all those redundant lines, thanks to Ruby.
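
For context, a tikz graphic like this one is typically produced with R’s tikzDevice package, and each rug tick ends up as its own drawing command in the output file, which is where the thousands of near-identical lines come from. A minimal sketch with placeholder data and file name:

library(tikzDevice)
x <- rnorm(9000)   # placeholder for the real data
tikz('rugplot.tex', width=5, height=4)
plot(density(x), main='')
rug(x)             # one tick mark per observation; overlapping ticks produce duplicate lines
dev.off()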

What Huntington Forgot

Earlier this week I complained on Twitter about having to read Huntington’s “Clash of Civilizations” for the umpteenth time when virtually every serious IR scholar knows he’s wrong. I really despise the “wrong but influential” line of argument, because it so often means that someone has been given influence in the literature for being both inflammatory and wrong, which is different from being wrong but influential on policy.

The gist of Huntington’s argument is that in the post-Cold War period civilization will increasingly become a salient referent for conflict. While I do not have time to pick apart this argument in full, it rests largely on civilization as a cultural constant, since most of the civilizations he recognizes reach back at least 1,000 years. To see how this assumption can be misleading, consider the map of world religions in 1895 below.

Huntington is known for being politically incorrect, but thankfully he did not use “heathen” as a category (the yellow regions on the map). This simple example is not definitive, but it illustrates that the way civilizations are conceived has changed in the century between the creation of this map and the appearance of Huntington’s article.

Meta-Blogging, Pt. 2: Weekly Trend in Tweets, Likes, and Comments

This post begins to describe the blog data collected (separately) by Anton Strezhnev and me. One of the first things I did was convert the date variable to R’s Date format so that I could do some exploration.

library(foreign)  # for read.dta()
# setwd('get your own')
monkey1 <- read.dta('finalMonkeyCageData.dta')

# Convert the date string to R's Date class
monkey1$newdate <- as.Date(monkey1$date, "%m/%d/%Y")

# Day of the week as a number: 0 = Sunday through 6 = Saturday
monkey1$weekdaynum <- format(monkey1$newdate, "%w")

day_abbr_list <- c("Sun","Mon","Tue","Wed","Thu","Fri","Sat")

# Stack the three boxplots in a single column
par(mfrow=c(3,1))

boxplot(monkey1$tweets ~ monkey1$weekdaynum, xaxt='n', xlab='', ylab="Tweets", col='blue')
axis(1, labels=day_abbr_list, at=1:7)

boxplot(monkey1$likes ~ monkey1$weekdaynum, xaxt='n', xlab='', ylab="Likes", col='red')
axis(1, labels=day_abbr_list, at=1:7)

boxplot(monkey1$comments ~ monkey1$weekdaynum, xaxt='n', xlab='', ylab="Comments", col='green')
axis(1, labels=day_abbr_list, at=1:7)

The result was this plot:

Monkey Cage Activity by Weekday

For tweets and likes it looks like earlier in the week (Sunday, Monday) is better, while comments get an additional bump on Saturday and Wednesday. In the next couple of posts we’ll look at how these three activities are correlated with page views, and how comments are distributed on the other blogs I scraped.
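
As a preview of the page-view question, the basic check is a one-liner once that variable is merged into the data; the pageviews column below is hypothetical, since it is not in the current data set:

# 'pageviews' is a placeholder column name, not yet in the data
cor(monkey1[, c('tweets', 'likes', 'comments', 'pageviews')],
  use='pairwise.complete.obs')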

Meta-Blogging, Pt. 1: Introduction

About a month ago, Joshua Tucker posted some hypotheses about the number of tweets and likes that posts get on The Monkey Cage. Anton Strezhnev took up his question, building a screen scraper in Python and making all of his data public. His tentative conclusion was that posts containing graphics are more likely to be “Liked” than tweeted.

Coincidentally, Josh Cutler is teaching a course on Python for the Duke Political Science Department this semester, and one of our assignments was to build a blog scraper.* I took Anton’s scraper as a starting point and built three more, to get data from Andrew Gelman’s blog, Freakonomics, and Modeled Behavior. The idea behind these choices was to make comparisons between economics and political science blogs, and to have gradations of “wonkiness,” which speaks to another of the proposed hypotheses. Although it’s pretty hard to escape wonkiness entirely in the academic blogosphere, here’s how I see the categorization:

Here’s what you can expect from this series (not necessarily in this order):

  • How do comments/tweets/likes correlate with page views?
  • How do comments predict (correlate with) tweets and likes?
  • What other factors predict tweets and likes? (post length, images, time since previous post)
  • What predicts comments? (same potential explanations)
  • Are there author- or category-specific factors on the blogs?

My goal for this series, besides just answering the questions, is to demonstrate the process of research to a broader audience. My own research process is decidedly imperfect, but by making everything–data collection scripts, data files, and R code for analysis–public, you will be able to see where judgment calls were made, or where mistakes might have crept in. If you have questions about the process, or criticism of my work, please comment on the posts or shoot me an email. Look for some preliminary charts later today, and real analysis to begin over the weekend/next week.

___________

*Note: I’m not sure how the term “scraper” emerged, but it refers to a script that collects information from websites without doing any permanent damage to them. Unless you forget to put in a time delay and crash the blog, that is, but I’m not naming any names.