What Can Software Developers Learn from Tigers?

Object-oriented programming is a powerful way of modeling the world. Objects encapsulate data and behavior, and can interact and be composed in many useful ways.

As developers, one question we often consider is which types of objects (in both the technical and non-technical senses of the word) to privilege by building them into our systems. Is an Address a proper object, or is it just a bit of data that can be encapsulated under a User? Are PowerUsers and NewUsers different enough to merit their own classes? (Probably not.)

But as our systems evolve, it can become difficult for existing object models to respond to new requirements. That’s apparently what is currently happening with tiger species. Because there are nine recognized tiger species, preservation efforts are spread across the six non-extinct species. A new proposal based on examination of DNA suggests that there should only be two tiger species.

As one researcher quoted in the piece says, “It’s really hard to distinguish between tigers…. The taxonomies are based on data from almost a hundred years ago.” Although your object model may not have legacy code that is quite that old, this case demonstrates the importance of reconsidering what traits and behaviors you allow to be first-class citizens in your system.

In the case of tigers, reducing the number of recognized species to two (one inhabiting continental Asia, and another the archipelago of Indonesia) would allow conservationists to try more flexible strategies. One example mentioned in the article is moving tigers within the same (redefined) species from one area to another (the updated definition of a continental Asian tiger would include the Amur tiger of Russia, the Bengal of India, and the South-China tiger). To a non-expert, it seems that interbreeding between these population groups would also help increase their numbers and perhaps increase genetic diversity.

Figuring out how to classify real-life entities can be very difficult. For tigers, what characteristics define a species? A century ago it might have been their physical appearance, while today we can look at the genetic level. In software, we have to think hard about the taxonomies we choose because they quickly become metaphors we live by.

The lesson for developers is that making your object model too fine-grained can introduce unexpected constraints when requirements change. To paraphrase Keynes, “When my information changes, I alter my code.”
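To make the lesson concrete, here is a minimal Python sketch of a coarse-grained model. The names Tiger, Region, and can_relocate are hypothetical and invented for illustration; the idea is simply that the finer-grained label (Amur, Bengal, South China) lives on as plain data instead of being baked into the type hierarchy.

```python
from dataclasses import dataclass
from enum import Enum


class Region(Enum):
    """Coarse grouping, mirroring the proposed two-way split."""
    CONTINENTAL_ASIA = "continental Asia"
    INDONESIAN_ARCHIPELAGO = "Indonesian archipelago"


@dataclass
class Tiger:
    population: str  # finer-grained label kept as data, e.g. "Amur"
    region: Region


def can_relocate(tiger: Tiger, destination: Tiger) -> bool:
    """Hypothetical rule: relocation is allowed within the same coarse grouping."""
    return tiger.region is destination.region


amur = Tiger("Amur", Region.CONTINENTAL_ASIA)
bengal = Tiger("Bengal", Region.CONTINENTAL_ASIA)
print(can_relocate(amur, bengal))
```

Had Amur and Bengal each been their own class, a rule like this would need to know about every pair of classes. With the label stored as data, changing the taxonomy means editing values rather than restructuring the model.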

The Future of Imagination

Venkatesh Rao had a great piece last week on imagination as a survival skill. Here is the gist:

I suspect failure-to-self-actualize will become the leading cause of death (or madness) in the developed world.

Rao defines “self-actualization” as

the imaginative embodiment of internal realities (what the daemon feels) in the form of a dent in the universe: a surprising and free external reality that actualizes a new possibility for all

and “imagination” as

the ability to create unpredictable new meaning while generating more freedom than you consume.

The post is very good, worth reading twice. However, there is one key shift that Rao overlooks. He focuses on imagination as an essential ability for the wealthy (the “one percent”), but it has even bigger implications for a future of 100 percent unemployment.

This change is coming, slowly but surely. It’s hard to imagine a future where “the robots take over” entirely, but we are already seeing a society where the poorest have more leisure time than the wealthiest. As Arnold Kling writes:

The prediction I would make is that we would see a lot more leisure. For those whose skill adaptation is adequate, that leisure will take the form of earlier retirement, later entry into the work force, or shorter hours. For those whose skill adaptation is inadequate, that leisure will show up as unemployment or reluctant withdrawal from the labor force.

I think that if you look only at males in isolation, you will see this in the data. That is, men are working much less than they used to. For some men, this leisure is very welcome, but for others it is not. In that sense, I think that we should look at the fears of the early 1960s not as quaint errors but instead as fairly well borne out.

The availability of inexpensive leisure (think cable TV and YouTube) has increased the reservation wage of low-wage workers. This has made unskilled individuals less willing to take near-minimum-wage jobs, as detailed in this New York Times article.

Self-actualization of highly skilled individuals as described by Rao has created so much freedom for those at the bottom of the income distribution that they now choose not to work. In their own words, though, the willingly unemployed do not seem to live fulfilling lives. Self-actualization is as important for them as it is for the wealthy, but they suffer from a failure of imagination.

Design Patterns for Cooking

Last week Alexey introduced the idea of cooking patterns:

A recipe is basically a fixed set of actions and ingredients, while cooking techniques are just the possible actions. If we invent cooking patterns – an abstraction on top of each ingredient / action pair – we could have more understanding of the dish we are preparing while keeping the flexibility in ingredient and technique choice.

Let’s take fritters as an example. Wikipedia says the following:

Fritter is a name applied to a wide variety of fried foods, usually consisting of a portion of batter or breading which has been filled with bits of meat, seafood, fruit, or other ingredients.

A pattern in its most obvious form. Notice the “wide variety”, a fixed ingredient (batter) and a list of possible variables (meat, seafood, vegetables, fruit) that could influence the fritters you end up making.

I find this idea very exciting, because I enjoy cooking and am also in the process of learning more about software design patterns.

Cooking patterns seem like an accessible way to introduce beginners to more abstract ideas about software, too. Algorithms are often described as “recipes,” and this is a nice way to build on that concept.
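As a rough sketch of how the fritter pattern might look in code, here is a small Python example. The class and field names are my own invention for illustration, not taken from Alexey's repo: the base and technique are fixed by the pattern, while the filling is the variable part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FritterPattern:
    """A fixed base (batter) and technique (frying) with a variable filling."""
    filling: str
    base: str = "batter"
    technique: str = "fry"

    def steps(self) -> list[str]:
        # The steps are the same for every fritter; only the filling changes.
        return [
            f"coat {self.filling} in {self.base}",
            f"{self.technique} until golden",
        ]


for filling in ("apple", "shrimp", "corn"):
    print(FritterPattern(filling).steps())
```

A recipe would hard-code one filling; the pattern keeps the structure and leaves the choice open.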

For leveling up your cooking skills, ChefSteps looks promising. Their resources include classes, projects, and an ingredients wiki. I have signed up for one class and plan to follow up on this recommendation after completing it.

If you are interested in cooking patterns, check out the GitHub repo or read the full article.

A Checklist for Using Open Source Software in Production

A great majority of the web is built on open source software. Approximately two-thirds of public servers on the internet run a *nix operating system, and over half of those are Linux. The most popular server-side programming languages also tend to be open source (including my favorite, Ruby). This post is about adding a new open source library to an existing code base. What questions should you ask before adding such a dependency to a production application?

The first set of questions is the most basic. A “no” to any of these should prompt you to look elsewhere.
  • Is the project written in a language you support? If not, is it compatible (e.g. through stdin/stdout or by compiling to your language of choice)?
  • Is the project in a version of the language you support? If it’s written in Python 3 and you only support Python 2, for example, using this library could lead to headaches.
  • Can you use the project in your framework of choice (e.g. Rails or Django)?
  • Are there conflicts with other libraries or packages you’re currently using? (This is probably the hardest question to answer, and you might not know until you try it.)
Assuming there are no immediate technical barriers, the next questions to ask are of the legal variety. Open source licenses come in many flavors. In the absence of a license, traditional copyright rules apply. Be especially careful if the project you are investigating uses the GPL: even basing your own code on a GPL-licensed project can have serious legal ramifications. There’s a great guide to OSS licenses on GitHub. If you’re the author or maintainer of an open source project, check out choosealicense.com.
The next thing to consider is whether and how the project is tested. If there is not an automated test suite, consider starting one as your first contribution to the project and be very reluctant to add the project to your application. Other related questions include:
  • Are there unit tests?
  • Are there integration tests?
  • What is the test coverage like?
  • Do the tests run quickly?
  • Are the tests clearly written?
Finally, by using an open source project you are also joining a community of developers. None of these questions is necessarily a show-stopper, but knowing the size of the community and the tone of its discourse can save you pain down the road.
  • Is the project actively maintained? When was the last commit?
  • Does the community have a civil, professional style of debate and discussion?
  • Is there only one developer/maintainer who knows everything? This doesn’t have to be a deal breaker. However, if there is a single gatekeeper you should make sure you understand the basics of the code and could fork the project if necessary.

This is by no means an exhaustive list, but these questions can serve as a useful checklist before adding an open source library as a dependency for your project.
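If you want to keep the checklist next to your code, one option is to encode it as data. The following Python sketch is only an illustration: the Question class, the evaluate function, and the choice of which questions count as blocking are all my assumptions (in particular, treating the license question as blocking), and the question wording is abbreviated from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    blocking: bool  # True: a "no" should prompt you to look elsewhere


CHECKLIST = [
    Question("Written in (or compatible with) a language you support?", True),
    Question("Targets a language version you support?", True),
    Question("Usable in your framework of choice?", True),
    Question("Free of conflicts with your current libraries?", True),
    Question("License acceptable for your use?", True),  # assumption: blocking
    Question("Has an automated test suite?", False),
    Question("Actively maintained?", False),
    Question("Civil, professional community?", False),
    Question("More than one maintainer?", False),
]


def evaluate(answers: dict[str, bool]) -> tuple[list[str], list[str]]:
    """Split "no" answers into blockers and warnings.

    Questions missing from `answers` are treated as "no".
    """
    blockers, warnings = [], []
    for question in CHECKLIST:
        if not answers.get(question.text, False):
            target = blockers if question.blocking else warnings
            target.append(question.text)
    return blockers, warnings
```

An empty list of blockers means the library passed the basic questions; any warnings are prompts for a closer look rather than automatic rejections.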

Falsehoods Programmers Believe

The first principle is that you must not fool yourself – and you are the easiest person to fool. – Richard Feynman

Programmers love to fool themselves. “This line has to work! I didn’t write that bug! It works on my machine!” But if ever there was a field where you can’t afford to fool yourself, it’s programming. (Unless of course you want to do something like lose $172,222 a second for 45 minutes).

Over the years I’ve enjoyed lots of articles that talk about false assumptions that programmers accept without really questioning them. I thought it would be helpful to have these collected in one place for reference purposes. If you know of articles that would be a good fit on this list, let me know and I will add them.

Falsehoods programmers believe…

Tirole on Open Source

Jean Tirole is the latest recipient of the Nobel prize in economics, as was announced Monday. For more background on his work, see NPR and the New Yorker. My favorite portion of Tirole’s work (and, admittedly, pretty much the only part I’ve read) is his work on open source software communities. Much of this is joint work with Josh Lerner. Below I share a few selections from his work that indicate the general theme.

There are two main economic puzzles to open source software. First, why would highly skilled workers who earn a substantial hourly wage contribute their time to developing a product they won’t directly sell (and how do they convince their employers, in some cases, to support this)? Second, given the scale of these projects, how do they self-govern to set priorities and direct effort?

The answer to the first question is a combination of personal reputation and the ability to develop complementary software (Lerner and Tirole, 2002, pp. 215-217). Most software work is “closed source,” meaning others can see the finished product but not the underlying code. For software developers, having your code out in the open gives others (especially potential collaborators or employers) the chance to assess your abilities. This is important to ensure career mobility. Open source software is also a complement to personal or professional projects. When there are components that are common across many projects, such as an operating system (Linux) or web framework (Rails), it makes sense for many programmers to contribute their effort to build a better mousetrap. This shared component can then improve everyone’s future projects by saving them time or effort. The collaboration of many developers also helps to identify bugs that may not have been caught by any single individual. Some of Tirole’s earlier work on collective reputations is closely related, as there appears to be an “alumni effect” for developers who participated in successful projects.

Tirole and Lerner’s answer to the second question revolves around leadership. Leaders are often the founders of or early participants in the open software project. Their skills and early membership status instill trust. As the authors put it, other programmers “must believe that the leader’s objectives are sufficiently congruent with theirs and not polluted by ego-driven, commercial, or political biases. In the end, the leader’s recommendations are only meant to convey her information to the community of participants.” (Lerner and Tirole, 2002, p. 222) This relates to some of Tirole’s other work, with Roland Benabou, on informal laws and social norms.

Again, this is only a small portion of Tirole’s work, but I find it fascinating. There’s more on open source governance in the archives. This post on reputation in hacker culture or this one on the Ruby community are good places to start.

Two Unusual Papers on Monte Carlo Simulation

For Bayesian inference, Markov Chain Monte Carlo (MCMC) methods were a huge breakthrough. These methods provide a principled way to simulate from a posterior probability distribution, and are useful for integrating distributions that are computationally intractable. Usually MCMC methods are run on computers, but I recently read two papers that apply Monte Carlo simulation in interesting ways.

The first is Markov Chain Monte Carlo with People. MCMC with people is somewhat similar to playing the game of telephone–there is input “data” (think of the starting word in the telephone game) that is transmitted across stages where it can be modified and then output at the end. In the paper the authors construct a task so that human learners approximately follow an MCMC acceptance rule. I have summarized the paper in slightly more detail here.
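For readers who have not seen an MCMC acceptance rule before, here is a minimal random-walk Metropolis sampler in Python. This is the standard computer version, shown only for orientation; it is not necessarily the rule used in the paper, and the function names, step size, and standard normal target are my own choices for illustration.

```python
import math
import random


def metropolis(log_target, start, steps, step_size=1.0, seed=0):
    """Random-walk Metropolis: propose a nearby point, then accept or reject it."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)  # on rejection the current point is repeated
    return samples


def log_standard_normal(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


samples = metropolis(log_standard_normal, start=0.0, steps=20000)
mean = sum(samples) / len(samples)
print(f"sample mean: {mean:.3f}")
```

In the paper, the accept-or-reject step is the part handed over to a human participant instead of a random number generator.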

The second paper is even less conventional: the authors approximate the value of π using a “Mossberg 500 pump-action shotgun as the proposal distribution.” Their simulated value is 3.131, within 0.33% of the true value. As the authors state, “this represents the first attempt at estimating π using such method, thus opening up new perspectives towards computing mathematical constants using everyday tools.” Who said statistics has to be boring?
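For comparison, the textbook way to estimate π by Monte Carlo needs nothing more exotic than a uniform random number generator. The Python sketch below is that textbook version, not the paper's method (which uses the shotgun's pellet pattern as a proposal distribution); the function name and sample size are arbitrary choices of mine.

```python
import random


def estimate_pi(n_points, seed=0):
    """Estimate pi from the fraction of uniform points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle has area pi/4 inside the unit square.
    return 4.0 * inside / n_points


print(estimate_pi(100_000))
```

The fraction of points landing inside the quarter circle approximates π/4, so multiplying by four gives the estimate.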

 

Schneier on Data and Power

Data and Power is the tentative title of a new book, forthcoming from Bruce Schneier. Here’s more from the post describing the topic of the book:

Corporations are collecting vast dossiers on our activities on- and off-line — initially to personalize marketing efforts, but increasingly to control their customer relationships. Governments are using surveillance, censorship, and propaganda — both to protect us from harm and to protect their own power. Distributed groups — socially motivated hackers, political dissidents, criminals, communities of interest — are using the Internet to both organize and effect change. And we as individuals are becoming both more powerful and less powerful. We can’t evade surveillance, but we can post videos of police atrocities online, bypassing censors and informing the world. How long we’ll still have those capabilities is unclear….

There’s a fundamental trade-off we need to make as society. Our data is enormously valuable in aggregate, yet it’s incredibly personal. The powerful will continue to demand aggregate data, yet we have to protect its intimate details. Balancing those two conflicting values is difficult, whether it’s medical data, location data, Internet search data, or telephone metadata. But balancing them is what society needs to do, and is almost certainly the fundamental issue of the Information Age.

There’s more at the link, including several other potential titles. The topic should interest many readers of this blog, and the book will likely build on Schneier’s ideas about inequality and online feudalism, discussed here.

Who says North is “up”?

There are several childhood lessons that I trace back to dinners at Outback Steakhouse: the deliciousness of cheese fries, the inconvenience of being in the middle of a wraparound booth, and the historical contingency of North as “up” on maps.

Who started using the NESW arrangement that is virtually omnipresent on maps today? Was it due to the fact that civilization as we now know it developed in the Northern hemisphere? (Incidentally, that’s why clocks run clockwise–a sundial in the Southern hemisphere goes the other way around.)

That doesn’t appear to be the case according to Nick Danforth, who recently took on this question at al-Jazeera America (via Flowing Data):

There is nothing inevitable or intrinsically correct — not in geographic, cartographic or even philosophical terms — about the north being represented as up, because up on a map is a human construction, not a natural one. Some of the very earliest Egyptian maps show the south as up, presumably equating the Nile’s northward flow with the force of gravity. And there was a long stretch in the medieval era when most European maps were drawn with the east on the top. If there was any doubt about this move’s religious significance, they eliminated it with their maps’ pious illustrations, whether of Adam and Eve or Christ enthroned. In the same period, Arab map makers often drew maps with the south facing up, possibly because this was how the Chinese did it.

So who started putting North up top? According to Danforth, that was Ptolemy:

[He] was a Hellenic cartographer from Egypt whose work in the second century A.D. laid out a systematic approach to mapping the world, complete with intersecting lines of longitude and latitude on a half-eaten-doughnut-shaped projection that reflected the curvature of the earth. The cartographers who made the first big, beautiful maps of the entire world, Old and New — men like Gerardus Mercator, Henricus Martellus Germanus and Martin Waldseemuller — were obsessed with Ptolemy. They turned out copies of Ptolemy’s Geography on the newly invented printing press, put his portrait in the corners of their maps and used his writings to fill in places they had never been, even as their own discoveries were revealing the limitations of his work.

Ptolemy probably had his reasons, but they are lost to history. As Danforth concludes, “The orientation of our maps, like so many other features of the modern world, arose from the interplay of chance, technology and politics in a way that defies our desire to impose easy or satisfying narratives.” Yet another example of a micro-institution that rules our world.

GitHub for Government

What happens when you combine open source software, open data, and open government? For the city of Munich, the switch to open source software has been a big success:

In one of the premier open source software deployments in Europe, the city migrated from Windows NT to LiMux, its own Linux distribution. LiMux incorporates a fully open source desktop infrastructure. The city also decided to use the Open Document Format (ODF) as a standard, instead of proprietary options.

As of November last year, the city saved more than €11.7 million because of the switch. More recent figures were not immediately available, but cost savings were not the only goal of the operation. It was also done to be less dependent on manufacturers, product cycles and proprietary OSes, the council said.

We’ve talked before about how more city governments could follow the open data, open government initiatives of NYC, using tech to benefit citizens rather than (only) creating initiatives to attract tech companies to the area. This shift in emphasis, toward harnessing the power of technology for widespread gains in happiness, is likely to become even more important following recent protests against tech employees in the Bay Area.

Open data and open government will take the principles of open source and use them to make an even bigger social and political impact. One tool from open source that can be adapted for use by these newer movements is GitHub. We will continue to follow these trends here; if you are interested, you can also check out GitHub and Government for more success stories.