Combining multiple scans

When images are scanned multiple times, maybe there is a way to construct an image that is better than any of the scans from them.

In this case it is assumed, that one scan has a higher resolution, but another scan got the colors better.

It has already been found out, which two scans belong to the same original.

Also one of them has already been rotated to the desired position.

The rotation can easily be found. Since images are roughly rectangular and width is roughly 1.5 times the height, simply two rotations have to be tried. Images are compared pixelwise and the one that has less deviation is probably the right one.

Now the orientation is already the same. And when scaled to the same size, the images should be exactly the same, apart from inaccuracies. But that is not the case. They are scaled slightly differently and they are shifted a bit.

Now the shift and scaling can be expressed by four numbers as x=a*u+b and y=c*v+d, where (x,y) and (u,v) are coordinates of the two images. It could be a matrix, if rotation is also involved and this would work the same way, but professional scans do have shift and scaling problems, but much less rotation problems.

So to estimate a, b, c and d, it is possible to go through all the pixels (u,v) of one image. Now the pixels (x,y) = (u+du, v+dv) are compared for combinations of small du and dv. The result is a sum of squares of differences s in a range between 0 and 3*65536. Now low values are good, that means that the coordiates (u,v,x,y) are a good match and need to be recorded with a high weight. This is achieved by weighting them with (3*65536-s)^2. Now it is just two linear regressions to calculate a,b,c and d. Of course, the points near ues the borders need special care, but it is possible to ommit a bit near the boarder to avoid this issue.

Now the points (u,v) of one image are iterated and the approximate match (x,y) on the other image is found. Now for (r,g,b) there can be a function to transform them. To start, it can be linear again, so this will be three linear regressions for r, g, and b. Or it could involve a matrix multiplied to the input vector. Or some function, then a matrix multiplication then vector addition and then the inverse function. The result has to be constrained to RGB-values between 0 and 255.

Now in the end, for each pixel in one image there is a function that changes the colors more to what the other image has. This can be applied to the full resolution image. Also, a weighted average of the original values and the calculated values, to find the best results. Then this can be applied to the whole series.

A few experiments with the function and a bit research on good functions for this might be done, to improve. In the end of the day there is a solution that is good enough and works. Or maybe a few variants and then some manual work to pick the best.

Schreibe einen Kommentar

Antwort abbrechen