Almost all working programmers agree that they want to work in a “clean” codebase, but very few agree on what this means in practice. One way to simplify these discussions is to choose (or create) a style guide and enforce it with lint rules. This will catch issues like indentation or the number of blank lines between functions.

Higher-level concerns such as code complexity are harder to catch automatically. Code Climate and similar tools attempt to address this, and are popular with many open source maintainers. If your team is unable or unwilling to use Code Climate, you may want to try computing your own metrics to measure code quality.

To start measuring code complexity in one medium-sized (45,000 LOC) Python project, I wrote a script that counts and plots function and method lengths. The full script is in this gist.

The first part recursively walks the project, finding all files with a desired extension (.py in my case):

import os

def walk_dir(dirname, ext):
    # recursively yield the path of every file under dirname that ends in ext
    for root, dirs, files in os.walk(dirname):
        for file in files:
            if file.endswith(ext):
                yield os.path.join(root, file)

Each file is then read and parsed with the standard library's ast module, which makes it easy to find the classes and functions defined at the top level of the file:

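import ast

def count_function_lengths(pth):
    # load file
    with open(pth, 'r') as f:
        source = f.read()

    # parse the AST of the file
    t = ast.parse(source)
    # top-level classes and functions defined in the file
    classes = [e for e in t.body if isinstance(e, ast.ClassDef)]
    funcs   = [e for e in t.body if isinstance(e, ast.FunctionDef)]
    # line on which each function definition starts
    line_numbers = [fn.lineno for fn in funcs]
    # (excerpt; the rest of the function is in the full script)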
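The excerpt gathers classes separately from top-level functions, but how the full script uses the classes list is not shown here. As a rough sketch of one way to reach methods too, the hypothetical helper below (function_start_lines is my name, not the script's) looks one level into each class body. It assumes methods are direct children of the class and, like the excerpt, it skips async functions.

```python
import ast


def function_start_lines(source):
    """Return (qualified name, start line) pairs for functions and methods."""
    tree = ast.parse(source)
    found = []
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            # Plain top-level function.
            found.append((node.name, node.lineno))
        elif isinstance(node, ast.ClassDef):
            # Methods defined directly in the class body.
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    found.append((f"{node.name}.{item.name}", item.lineno))
    return found
```

Nested functions and nested classes are deliberately left out to keep the sketch short; ast.walk would be the tool to reach for if those matter in your project.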

One catch is that although ast can easily tell you the line on which a function definition starts, I could not find any reliable way to find the line on which it ends. (One option would be to look for a return outside of any conditional blocks, but Python functions are not required to return anything, even though almost all do in practice.) To work around this, I sorted the lines on which each class and function began and treated each start line as the end of the definition before it. Blank lines and comments between definitions are therefore counted toward the preceding function, but the result is a useful first approximation.
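The workaround can be sketched roughly as follows. This is not the code from the full script: approximate_lengths and exact_lengths are hypothetical names, the sketch only covers top-level definitions, and treating the end of the file as the boundary for the last definition is my assumption. For comparison, the second function uses the end_lineno attribute that ast nodes carry on Python 3.8 and later, which gives the end line directly.

```python
import ast


def approximate_lengths(source):
    """Estimate function lengths as the gap to the next definition's start."""
    tree = ast.parse(source)
    # Top-level classes and functions, already in source order.
    defs = [
        node for node in tree.body
        if isinstance(node, (ast.ClassDef, ast.FunctionDef))
    ]
    starts = [node.lineno for node in defs]
    # Assumption: the last definition runs to the end of the file.
    end_of_file = len(source.splitlines()) + 1
    next_starts = starts[1:] + [end_of_file]
    return {
        node.name: next_start - node.lineno
        for node, next_start in zip(defs, next_starts)
        if isinstance(node, ast.FunctionDef)
    }


def exact_lengths(source):
    """Measure function lengths with end_lineno (requires Python 3.8+)."""
    tree = ast.parse(source)
    return {
        node.name: node.end_lineno - node.lineno + 1
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }
```

Running both on the same file shows how much the approximation overcounts: the gap between the two numbers is the blank lines, comments, and module-level code that sit between one definition and the next.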

We can then compute and plot a histogram of function lengths, measured in lines, for every function in a project. Here’s what this looks like for the example project I mentioned above:

Figure: Histogram of Function Lengths in a Python Project
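The plotting code is not shown in this post, so here is a minimal stand-in that sticks to the standard library: it groups lengths into fixed-width bins and renders them as text. The names bin_lengths and render_histogram and the default bin width of 10 lines are my assumptions, not part of the original script.

```python
from collections import Counter


def bin_lengths(lengths, bin_width=10):
    """Count how many functions fall into each bin of bin_width lines."""
    # Each length maps to the lower edge of its bin, e.g. 17 -> 10.
    counts = Counter((n // bin_width) * bin_width for n in lengths)
    return dict(sorted(counts.items()))


def render_histogram(binned, bin_width=10):
    """Return a text histogram with one row per non-empty bin."""
    rows = []
    for start, count in binned.items():
        label = f"{start:>4}-{start + bin_width - 1:<4}"
        rows.append(f"{label} {'#' * count} ({count})")
    return "\n".join(rows)
```

Calling print(render_histogram(bin_lengths(lengths))) on a list of function lengths gives a quick look at the distribution in a terminal. The binned counts can just as well be handed to whatever plotting library you prefer.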

At a glance we can easily see that some functions in this project are way too long (over 200 lines!). In practice this provides two main benefits. First, it makes our conversations around code complexity more concrete and objective. Second, it gives us a way to prioritize our refactoring efforts and measure our progress. Rather than debating which parts of the code need cleaning up, we can get right to work making the code more readable and maintainable. Happy coding!