How to multiply long numbers

Assuming we have a multiplication of n-bit-numbers already available.
Other than typical compiled languages, which assume that the multiplication result fits into the same size as the two factors, we should take a closer look.
Unsigned n-bit-numbers as factors x and y fullfill

    \[0 \le x \le 2^n-1\]

    \[0 \le y \le 2^n-1\]

So we have for the product z=x*y

    \[0 \le z \le (2^n-1)^2 \le 2^{2n}-1\]

So we should be prepared for a 2n-bit long result. Assembly language provides for this. In Java we are kind of running out of luck, because we have the 64-bit-long as the largest possible value, so we cannot make use of the 64-bit capabilities of our CPU by getting a 128-bit-result. Even worse we do not have unsigned primitive types. For addition and subtraction it is possible to cheat by assuming that the longs are unsigned, because the addition for signed and unsigned numbers is almost the same, but for multiplication this is slightly more complex.

In C we do have better chances, but we need to go compiler and OS-specific to some extent. gcc on 64-bit platforms has a type __uint128_t and we can hope that something like this

__uint128_t mul(unsigned long long x, unsigned long long y) {
__uint128_t xx = (__uint128_t) x;
__uint128_t yy = (__uint128_t) y;
__uint128_t z = xx*yy;
return z;
}

is optimized by the compiler to one assembly language instruction that multiplies two unsigned 64-bit-integers into one unsigned 128-bit integer.

Multiplication of longer numbers is achieved by just assuming

    \[ x= \sum_{j=0}^{m-1}2^{nj}x_j, y= \sum_{j=0}^{m-1}2^{nj}y_j\]

with

    \[ \bigwedge_{j=0}^{m-1} 0\le x_j, y_j <2^n.\]

Now the product is

    \[ z = \sum_{j=0}^{m-1}\sum_{k=0}^{m-1}2^{n(j+k)}x_j y_k \]

where each x_j y_k summand needs to be split into a lower and an upper part such that

    \[x_j y_k = 2^n h(x_j y_k) + l(x_j y_k).\]

All these half products need to be added with propagating the carry bits to the next higher level.
This can be sorted out and done with a lot of additions with carry-bit, finally giving the desired result.

For reasonably short numbers this is a good approach, if it is done well, but it can be replaced by better algorithms for larger numbers.

The most obvious next step is to move to the Karatsuba algorithm.
Assuming we have two factors x=x_l + 2^m x_h and y = y_l + 2^m y_h the naïve approach of having to calculate the four products
x_h y_h, x_h y_l, x_l y_h, x_l y_l can be replaced by calculation of only three products x_h y_h, (x_l + x_h)(y_l+ y_h), x_l y_l, because what is actually needed is the sum x_h y_l+x_l y_h which can be obtained by

    \[(x_l + x_h)(y_l+ y_h)-x_l y_l-x_h y_h=x_h y_l+x_l y_h.\]

That can be iterated by subdividing the numbers subsequently into halves until a size is reached that can be handled well enough by the naïve approach.
An issue that needs to be observed is that x_l + x_h and y_l+ y_h need one bit more than their summands.

For even larger numbers the so called Toom-Cook algorithm, an extension of the idea of the Karatsuba-algorithm for more than two summands, can be useful and for numbers larger than that the Schönhage-Strassen multiplication proves to be faster. It is implemented in some libraries for long integer arithmetic, like GMP. Since around 2007 this can be improved by using the Fürer algorithm which is assymptotically faster, but I do not know of any useful implementations and for practical purposes the improvement does not seem to be very relevant.

Share Button

Testing Java- and C-programs with Ruby and Perl

It is very important to write good unit tests for software that is non-trivial and that is relied on by other pieces of software.
Often the logic of the software can easily be covered by the native testing facilities of the programming language, like JUnit for Java or, much less well known but available, CUnit for C. When a lot of framework code is involved or third party libraries are used heavily, there is almost no other way for certain tests, because setting up the environment cannot easily be achieved elsewhere.

But we also encounter cases where writing good unit tests in the same language as the library itself becomes a pain. We procrastinate the issue of writing them and end up with way too little or no unit tests at all. A lot of software deals with doing some calculations or transformations of numbers and strings, usually a lot of numbers and a lot of strings. Now Strings are the strength of the Perl programming language and are not really implemented very well or at least not very powerful or easy to use in most other languages. Specifically the String facilities of Perl are much superior to those of Java and C. Off course our software needs to perform well, it needs to integrate into the environment and follow the global corporate software standards, so Java or C or some other programming language is the choice that should not be challenged here for the productive software. But some tests of the functionality can more easily be achieved by iterating over some input data and creating output of the input data combined with the results. This can be perl code already or something really easy to parse. Perl is really the tool that can parse almost anything, but we do not really want to be distracted by unnecessary work but get our job done. So something like generating Perl code or CSV or YAML or JSON, but please not XML if not really needed, should do. Then we can pipe the output to perl or to a perl script and this will tell us, if everything is ok. When we know our platform, it can even be done that the Java- or C-Unit-test stores the output to a file or pipe and calls the Perl script on it and fails or succeeds depending on its output.

When it comes to numeric types, Ruby is very strong. It has unlimited size integers by default, which can be casted to n-bit-integers using constructions like
xx=x&(1<<n)-1,
it has Rational, LongDecimal (as an external gem) and Complex and is easily extendible.

Usually we can expect that corporate constraints on which tools and programming languages may be used are less restrictive when it comes to unit tests. Integrating this on a continuous integration platform is a job that needs to be addressed but it is worth the effort, if a lot of tests become easier with this approach. And doing tests in another language makes tests more credible.

Off course the general idea is applicable for other combinations. Look into Scala, F#, Clojure, JavaScript, Python and some others as well, if they seem to be more helpful than Ruby or Perl for your unit testing automation. But this does indeed raise the question if a world where corporate policies allowed Scala and Closure instead of Java, F# instead of C# and Elixir instead of Erlang and PL/I instead of Cobol would be better.

Share Button