Comparing Images

A practical problem that I have is to sort my digital photos. Some of them have been taken with analog cameras and they have been scanned. Some of them have been scanned several times, using different resolution, different providers or different technologies.

So one issue that occurs is having two directories which contain more or less the same images from different scans.

Some of them have been sorted or enriched with useful information or just rotated correctly, others have better resolution. Maybe it is good to use different scans to create a better image than the best scan.

For a few hundred films doing that manually is possible, but a lot of work. So I thought about automating it, at least partially.

There are several issues that make it more difficult. Scans are not very accurate, so they cut of a tiny random bit in the borders. So to match two images exactly it is necessary to scale and crop them.

Colors look quite different. And then of course resolutions are different. And sometimes they have been turned by an angle of 90 or 270 degrees. Some scans miss a few images that another one has.

So, how to start?

First all images are scaled to thumbnails of approximately 65536 pixels. It turns out to be 204×307, but could of course be anything around that size retaining the rough aspect ratio.

Now all thumbnail images from the two directories are read into memory. 80 thumbnail images are no big deal…

Images in portrait orientation are rotated by 90 and 270 degrees and both variants are used. So from here on all images are in landscape format. Upside down is not supported.

All images are scaled to exactly 204×307 in memory to allow comparison.

And the average r, g and b-values from all images from each directory are calculated. The r,g,b values of each pixel are multiplied or divided by a factor, which is the square root of the quotient of these averages and constrained to the range 0..255. So this partially neutralizes the effect of different colors.

Now for each thumbnail from the first directory is compared with each thumbnail from the second directory. This is done by calculating for each pixel the sum of the squares of the differences between the r, g and b values of the two images.

These values are added up and divided by the number of pixels (204×307 in my case). The square root of this is the comparison result. For images that are actually the same, comparison results between 30 and 120 are possible. Now some heuristic is used to find matches based on these values. Also an „interpolation“ is used, if consecutive images in both directories occur with the middle one missing. This usually brings good results. In some cases manual interaction is needed. So it is possible to mark images in an web interface and then the match that was found for them is revoked. Also it is possible to provide „hints“, which means that these images should be matched.

It took about a day and half to write this and it works sufficiently well for the purpose.

But there are much more sophisticated algorithms that really recognize objects. I have a program called hugin that combines a few images to a panorama. Usually it works quite well and it can even rotate, shift and scale images and do amazing things most of the time. Sometimes it just does not work with certain images. Also Google has of course very powerful software to recognize images and compare them and even create panoramas. If we thing about face recognition… There is really good stuff around. But the first approach with some improvements gave sufficiently good results and the program will only run a couple of hundred times, then it will not be needed anymore.

This is heavily inspired by the Perl-library Image::Compare, but since I am doing something slightly different, I do not use this library any more. But subsequently it has been written in Perl. I will happily provide my source code, but I have baked into it assumptions and conventions that I have introduced myself, but that are not universal, concerning the file names and directory structures for the images.

Share Button

Ketchup or Milestone

Bicycles have potentially very accurate speedometers. They measure distances to an accuracy of usually ten meters and they do this by counting the rotations of the front wheel. Now we can go into physics and see if the front wheel slips significantly, but I would not expect that to be relevant. What is a bit relevant is how much air there is in the tire. But bicycle tires usually take pressures between 4 and 6 bars and using this ensures that the resistance is lowest, the tire lasts longer and of course the measurement is more accurate. Only for cycling offroad and especially on snow or sand, lower pressures might be helpful.

Now usually the circumference of the front wheel can be configured to an accuracy of one millimeter. For this there are two methods and each method has its fans, who say that it is totally impossible to use the other method.

The milestone method, or more accurately the kilometer stone method, because we should of course use the metric system by default in an international context, works like this: The wheel circumference is roughly estimated and configured. Now we cycle long distances on roads with kilometer stones. This is not the case in every country and on every road, but in some countries national highways and even regional highways have it. Germany used to have it in the past, but they moved to a system that numbers sections and measures only within the section, which is useless for this purpose. Sweden and Switzerland do not have it. So it requires makeing use of the kilometer stones in a country that has them. Distances on road signs are very inaccurate in most countries, so they are absolutely useless for this purpose and no better than an estimation of the circumference. So now traveling 10 or better 20 kilometers on a road without steep slopes and without strong head wind allows to see the deviation and adjust the circumference appropriately. Repeating this a few times will yield the best result.

The other method is called ketchup method. It is easy. One spot on the wheel is marked with a lot of ketchup. Then it is carefully walked on a straight line for a few meters meters and the distance between the ketchup spots is measured.

Now what is the difference between the two methods?

Cyclists do not exactly follow a straight line, but go a little bit of zik-zak. This becomes more with head wind or slopes, but usually it is a value that is individual for the cyclist. So going from town A to town B, which is 100 kilometers, will yield (almost) exactly 100 kilometers with the speedometer configured by the kilometer stone method. But it will yield a bit more when configured by the ketchup method.

Both values have their relevance. The ketchup kilometers are the distance that was really traveled in the exact line. And the milestone kilometers are the distance that has been achieved in terms of the usefulness as a means of transport. The zik-zak is just a tiny inevitable inefficiency that does not bring us anywhere.

Now there are people who insist that one of the two methods is superior. And people who just do not care to measure to that accuracy.

But we can learn something from this for IT solutions. Often we have the choice, which information to show to the user. Quite often the ketchup value is chosen, which is a technical internal value, that is the most correct by some technical point of view. But it is good to think twice which value really matters for the user. And it is usually worth to go through a lot more effort and present that „milestone value“ and leave the „ketchup value“ for internal use and maybe advanced detail GUIs that show both values.

I do not care how you configure your bicycle speedometer, but I prefer the milestone method myself. But for user interfaces of software we should all think a bit and understand, what is the equivalent of ketchup and milestone method. At least if there is a significant difference, we should prefer milestone and make more useful user interfaces instead of „correct“ ones in some technical way that is irrelevant to the user.

Share Button