How can you make one set of images “look like” another set? This problem comes up often in computer vision because collecting labeled training images is expensive. Suppose you already have a labeled set, either a publicly available one or one that your team has collected. If you can make it match the style of the new images you want to work with, your model is more likely to succeed, because what a deep learning model learns depends on the distribution of values in its training set. There are several increasingly advanced ways to do this, including Generative Adversarial Networks, but this post demonstrates one of the simplest: histogram matching.
The basic idea behind histogram matching is that you can get a very rough idea of what an image looks like by examining the histogram of its pixel values. For example, do they skew toward the darker or lighter end of the spectrum, and how much variance is there? If we can make the histograms of two images more similar, we will in turn make the appearance of the images more similar. We will try to make an image of London on a dreary day match the style of a picture of sunny Bondi Beach.
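To make the idea concrete, here is a minimal sketch of classic histogram matching for a single channel, using NumPy only. It is not necessarily the code behind this post's figures: the function name `match_histogram` is my own, and it assumes both inputs are 2-D arrays of pixel intensities. Each source value is mapped to the target value that sits at the same position in the cumulative distribution.

```python
import numpy as np

def match_histogram(source, target):
    """Remap source pixel values so their distribution follows target's.

    Hypothetical helper for illustration; works on one channel at a time.
    """
    # Unique source values, where each pixel falls among them, and their counts
    src_values, src_idx, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True
    )
    tgt_values, tgt_counts = np.unique(target.ravel(), return_counts=True)

    # Empirical cumulative distribution functions, scaled to (0, 1]
    src_cdf = np.cumsum(src_counts) / source.size
    tgt_cdf = np.cumsum(tgt_counts) / target.size

    # For each source value, look up the target value at the same CDF position
    mapped = np.interp(src_cdf, tgt_cdf, tgt_values)

    # Send every pixel through the lookup and restore the image shape
    return mapped[src_idx.ravel()].reshape(source.shape)
```

For a color image, one simple option is to call this once per channel. Note that this version needs the full set of target pixel values (or at least its histogram) at matching time.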
One trick for making this process more memory-efficient is that instead of storing all the pixel values, or even the full histograms, we can represent our matching model as a short series of polynomial coefficients. The question then becomes what order of polynomial is sufficient for the transformation.
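One way this could look in code is sketched below. The details are my assumptions rather than a description of the exact implementation: pixel values are taken to be 8-bit (0 to 255), the mapping is fitted between matching quantiles of the two images with `np.polyfit`, and the names `fit_matching_polynomial` and `apply_matching_polynomial` are made up for illustration.

```python
import numpy as np

def fit_matching_polynomial(source, target, order=3, n_quantiles=256):
    """Fit a polynomial that maps source intensities to target intensities.

    Assumes 8-bit pixel values; returns order + 1 coefficients.
    """
    # Sample both distributions at the same percentile levels
    levels = np.linspace(0, 100, n_quantiles)
    src_q = np.percentile(source, levels) / 255.0
    tgt_q = np.percentile(target, levels) / 255.0
    # Least-squares fit of target quantiles as a function of source quantiles
    return np.polyfit(src_q, tgt_q, order)

def apply_matching_polynomial(image, coeffs):
    """Apply a fitted polynomial to an image and return 8-bit output."""
    scaled = image.astype(np.float64) / 255.0
    matched = np.polyval(coeffs, scaled)
    # The polynomial can overshoot the valid range, so clip before converting
    return np.clip(matched * 255.0, 0, 255).astype(np.uint8)
```

With this setup, a third-order model for one channel is just four floating-point numbers, so it can be stored and applied to new images without keeping either the target pixels or the target histogram.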
The figure above is an example of what the transformation looks like using a third-order polynomial. As you can see, the pixel values of the matched image are a compromise between those of the source and target images. There are fewer pixel values at the low end of the range than in the source image, but more at the high end of the range than in the target image. In appearance, the matched image is bluer than the original but retains many of its details, such as the color of the bus and the bear.
How does the quality of the transformation vary as we increase the number of polynomial coefficients that we estimate?
As expected, increasing the order of the polynomial used to estimate the transformation makes the histogram curves a closer fit. However, we reach diminishing returns after the fourth order: the visual differences between images matched with a fourth-order and a fifth-order polynomial are negligible. This appears to be a common occurrence across the datasets where I have explored this technique. If you want to match two sets of images using a memory-efficient approach, this technique could work for you. All the code used to produce these examples is available on GitHub.
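If you want to check where the returns start to diminish on your own images, a rough numerical check can complement the visual comparison. The sketch below is an assumption-laden illustration, not the method used for the figures: it assumes 8-bit pixel values, fits the quantile-to-quantile mapping at each order, and reports the root-mean-square error of the fit. The function name `polynomial_fit_errors` is hypothetical.

```python
import numpy as np

def polynomial_fit_errors(source, target, max_order=5, n_quantiles=256):
    """Return {order: RMSE} for polynomial fits of the quantile mapping.

    Assumes 8-bit pixel values; errors are on the normalized 0 to 1 scale.
    """
    levels = np.linspace(0, 100, n_quantiles)
    src_q = np.percentile(source, levels) / 255.0
    tgt_q = np.percentile(target, levels) / 255.0

    errors = {}
    for order in range(1, max_order + 1):
        coeffs = np.polyfit(src_q, tgt_q, order)
        # Residual between the fitted curve and the target quantiles
        residual = np.polyval(coeffs, src_q) - tgt_q
        errors[order] = float(np.sqrt(np.mean(residual ** 2)))
    return errors
```

The error can only stay flat or shrink as the order grows, so the thing to look for is the order after which the improvement becomes too small to matter.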