Thread local variables

The idea of threads is to share almost all information between the threads of the same process. This implies some extra effort to avoid inconsistencies caused by simultaneous access to the same data. But sometimes it is necessary to have data private to one thread. Usually this is easily accomplished by using local variables and method or function parameters, which are naturally not visible to other threads, unless they are referenced by some global structures or by structures shared with other threads. But there are some cases where it is useful to have thread local variables.

The general idea is to call a function or method several times and to retain intermediate results somewhere. In object oriented programming, attributes may be used for this; in procedural programming, global variables or a structure that is passed around. In functional programming this is simply forbidden; a workaround is always possible, but it may be painful, complicated, inefficient or even impossible when existing functionality written by others has to be used.

Now the global variable fails miserably here, which is one of the reasons it should not be used. And so does its object-oriented blessed brother, the static attribute in Java or C++. Multiple threads may run simultaneous sessions with these functionalities. So thread local variables are the better approach.

They can easily be built manually. Just use a hash map and make sure access to it is thread safe, using RWLocks in C or a ConcurrentHashMap or some synchronized HashMap in Java. Then use the thread ID as key into the hash map and store some structure in it where all the bad stuff is kept. Or use multiple hash maps. A sketch of this approach follows below.
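
Here is a minimal sketch of this manual approach in C, using a small fixed-size table guarded by an rwlock instead of a real hash map; all names are illustrative, and a real implementation would need collision handling, growth and cleanup when threads exit:

#include <pthread.h>

#define SLOTS 1024

/* one slot per thread: the thread id as key, an opaque pointer as value */
typedef struct {
    pthread_t tid;
    int used;
    void *data;
} slot_t;

static slot_t table[SLOTS];
static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

/* retrieve the calling thread's private data, or NULL if none is set */
void *get_thread_data(void) {
    pthread_t self = pthread_self();
    void *result = NULL;
    pthread_rwlock_rdlock(&lock);
    for (int i = 0; i < SLOTS; i++) {
        if (table[i].used && pthread_equal(table[i].tid, self)) {
            result = table[i].data;
            break;
        }
    }
    pthread_rwlock_unlock(&lock);
    return result;
}

/* store private data for the calling thread; slots are never freed
   in this simplified sketch, so used slots always precede free ones */
void set_thread_data(void *data) {
    pthread_t self = pthread_self();
    pthread_rwlock_wrlock(&lock);
    for (int i = 0; i < SLOTS; i++) {
        if (!table[i].used || pthread_equal(table[i].tid, self)) {
            table[i].tid = self;
            table[i].used = 1;
            table[i].data = data;
            break;
        }
    }
    pthread_rwlock_unlock(&lock);
}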

But both languages provide better facilities for this purpose. In newer Java versions, ThreadLocal should be used. It needs an initializer and then provides a get method that accesses the thread local copy of the variable, creating it if it is not yet present. This approach is strongly recommended when dealing with DateFormat and NumberFormat, which are not thread safe in Java. Using synchronized on the format object or creating it on demand each time works as well, but the thread local approach seems to be the most straightforward and most efficient way. These particular examples should lose their importance, though. Using Joda-Time or the new Java-8 functionality for date and time should make SimpleDateFormat, DateFormat, Calendar and java.util.Date obsolete. And NumberFormat is still useful, but in many cases replaceable by String.format() and the C-like functionality.

In C the function pthread_key_create() can be used to create a thread local variable, and pthread_setspecific() and pthread_getspecific() allow storing and retrieving a thread local value and hence thread local data. Please read the man pages instead of memorizing the exact API.
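
A minimal sketch of how these calls fit together (error handling omitted; see the man pages for the exact semantics):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_key_t key;

/* called automatically when a thread that stored a value exits */
static void destructor(void *value) {
    free(value);
}

static void *worker(void *arg) {
    (void) arg;
    int *counter = malloc(sizeof *counter);
    *counter = 0;
    pthread_setspecific(key, counter);     /* store thread local data    */
    int *mine = pthread_getspecific(key);  /* retrieve it again later on */
    (*mine)++;
    printf("my counter: %d\n", *mine);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_key_create(&key, destructor);  /* create the thread local variable */
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_key_delete(&key);
    return 0;
}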

I might provide or link example programs for these concepts in some future posting.


O’Reilly Closes Its Activities in Germany

This article is of minor relevance to English speaking readers, because it concerns the German O’Reilly books, while the English O’Reilly books continue to be available.

For many years the O’Reilly publisher had a German branch in Cologne, where German language versions of the popular computer science books were published. These were often translations, but in some cases also titles that appeared exclusively in German, or titles that were later translated from German into English. The d-Punkt-Verlag will take over these activities and continue them under the O’Reilly name.

More about this at Informatik Aktuell.


How to multiply long numbers

Assume we have a multiplication of n-bit numbers already available.
Unlike typical compiled languages, which assume that the multiplication result fits into the same size as the two factors, we should take a closer look.
Unsigned n-bit numbers as factors x and y fulfill

    \[0 \le x \le 2^n-1\]

    \[0 \le y \le 2^n-1\]

So we have for the product z=x*y

    \[0 \le z \le (2^n-1)^2 \le 2^{2n}-1\]

So we should be prepared for a result 2n bits long. Assembly language provides for this. In Java we are kind of out of luck, because the 64-bit long is the largest primitive type available, so we cannot make use of the 64-bit capabilities of our CPU by getting a 128-bit result. Even worse, we do not have unsigned primitive types. For addition and subtraction it is possible to cheat by treating the longs as unsigned, because addition is almost the same for signed and unsigned numbers, but for multiplication this is slightly more complex.

In C we do have better chances, but we need to go compiler and OS-specific to some extent. gcc on 64-bit platforms has a type __uint128_t and we can hope that something like this

/* multiply two unsigned 64-bit integers into an unsigned 128-bit result */
__uint128_t mul(unsigned long long x, unsigned long long y) {
  __uint128_t xx = (__uint128_t) x;  /* widen the factors first */
  __uint128_t yy = (__uint128_t) y;
  __uint128_t z = xx*yy;             /* full 128-bit product */
  return z;
}

is optimized by the compiler to one assembly language instruction that multiplies two unsigned 64-bit-integers into one unsigned 128-bit integer.

Multiplication of longer numbers is achieved by writing

    \[ x= \sum_{j=0}^{m-1}2^{nj}x_j, y= \sum_{j=0}^{m-1}2^{nj}y_j\]

with

    \[ \bigwedge_{j=0}^{m-1} 0\le x_j, y_j <2^n.\]

Now the product is

    \[ z = \sum_{j=0}^{m-1}\sum_{k=0}^{m-1}2^{n(j+k)}x_j y_k \]

where each x_j y_k summand needs to be split into a lower and an upper part such that

    \[x_j y_k = 2^n h(x_j y_k) + l(x_j y_k).\]

All these half products need to be added, propagating the carry bits to the next higher level.
This can be sorted out and done with a lot of additions with carry bit, finally giving the desired result, roughly as in the following sketch.
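
A compact sketch of this schoolbook scheme in C, assuming gcc's __uint128_t as above and the numbers given as little endian arrays of 64-bit words; the function and parameter names are made up for illustration:

#include <stdint.h>

/* multiply two m-word numbers x and y (little endian arrays of
   64-bit words) into the 2m-word result z; each 128-bit partial
   product is split into its low and high half and the carries are
   propagated to the next higher word */
void mul_long(const uint64_t *x, const uint64_t *y, uint64_t *z, int m) {
    for (int i = 0; i < 2 * m; i++) {
        z[i] = 0;
    }
    for (int j = 0; j < m; j++) {
        uint64_t carry = 0;
        for (int k = 0; k < m; k++) {
            __uint128_t t = (__uint128_t) x[j] * y[k] + z[j + k] + carry;
            z[j + k] = (uint64_t) t;       /* low half l(x_j y_k) plus sums   */
            carry = (uint64_t) (t >> 64);  /* high half h(x_j y_k) plus carry */
        }
        z[j + m] = carry;
    }
}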

For reasonably short numbers this is a good approach, if it is done well, but it can be replaced by better algorithms for larger numbers.

The most obvious next step is to move to the Karatsuba algorithm.
Assuming we have two factors x=x_l + 2^m x_h and y = y_l + 2^m y_h, the naïve approach of calculating the four products
x_h y_h, x_h y_l, x_l y_h, x_l y_l can be replaced by calculating only three products x_h y_h, (x_l + x_h)(y_l+ y_h), x_l y_l, because what is actually needed is the sum x_h y_l+x_l y_h, which can be obtained as

    \[(x_l + x_h)(y_l+ y_h)-x_l y_l-x_h y_h=x_h y_l+x_l y_h.\]

That can be iterated by subdividing the numbers successively into halves until a size is reached that can be handled well enough by the naïve approach.
An issue that needs to be observed is that x_l + x_h and y_l+ y_h need one bit more than their summands.
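
As a small illustration of the trick, here is a single Karatsuba step applied at the word level, computing a 64x64-bit product from three 32x32-bit multiplications; in this sketch the extra bits of x_l+x_h and y_l+y_h are simply absorbed into a 128-bit intermediate, whereas a real long-number implementation has to handle them with explicit carries:

#include <stdint.h>

/* one Karatsuba step: 64x64 -> 128 bits with three multiplications */
__uint128_t mul64_karatsuba(uint64_t x, uint64_t y) {
    uint64_t xl = (uint32_t) x, xh = x >> 32;  /* x = 2^32 x_h + x_l */
    uint64_t yl = (uint32_t) y, yh = y >> 32;  /* y = 2^32 y_h + y_l */
    uint64_t ll = xl * yl;                     /* x_l y_l */
    uint64_t hh = xh * yh;                     /* x_h y_h */
    /* (x_l+x_h)(y_l+y_h) - x_l y_l - x_h y_h = x_h y_l + x_l y_h */
    __uint128_t mm = (__uint128_t) (xl + xh) * (yl + yh) - ll - hh;
    return ((__uint128_t) hh << 64) + (mm << 32) + ll;
}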

For even larger numbers the so-called Toom-Cook algorithm, an extension of the idea of the Karatsuba algorithm to more than two summands, can be useful, and for numbers larger than that the Schönhage-Strassen multiplication proves to be faster. It is implemented in some libraries for long integer arithmetic, like GMP. Since around 2007 this can be improved by using the Fürer algorithm, which is asymptotically faster, but I do not know of any useful implementations and for practical purposes the improvement does not seem to be very relevant.

Addition 2019-10-08: In 2019 an algorithm was discovered that is assumed to be asymptotically O(n \log n), which according to a conjecture of Schönhage is the best possible result. It has not yet been proven rigorously and the conjecture is still a conjecture. The question is whether, for practical purposes of multiplying numbers that still fit into real computers, there is a real gain from these algorithms, i.e. whether they become faster than Schönhage-Strassen on numbers that are not too long to be practically multiplied. So right now this is really mostly an issue of mathematics and theoretical informatics. See also the article "Maths whiz solves 48-year-old multiplication problem" and the paper itself.


Testing Java and C programs with Ruby and Perl

It is very important to write good unit tests for software that is non-trivial and that is relied on by other pieces of software.
Often the logic of the software can easily be covered by the native testing facilities of the programming language, like JUnit for Java or, much less well known but available, CUnit for C. When a lot of framework code is involved or third party libraries are used heavily, there is almost no other way for certain tests, because the environment cannot easily be set up elsewhere.

But we also encounter cases where writing good unit tests in the same language as the library itself becomes a pain. We procrastinate on writing them and end up with way too few unit tests or none at all. A lot of software deals with calculations or transformations of numbers and strings, usually a lot of numbers and a lot of strings.

Now strings are the strength of the Perl programming language and are not implemented very well, or at least not very powerfully or conveniently, in most other languages. Specifically the string facilities of Perl are much superior to those of Java and C. Of course our software needs to perform well, it needs to integrate into the environment and follow the global corporate software standards, so Java or C or some other programming language is the choice that should not be challenged here for the productive software. But some tests of the functionality can more easily be achieved by iterating over some input data and emitting the input data combined with the results. This can be Perl code already or something really easy to parse. Perl is really the tool that can parse almost anything, but we do not want to be distracted by unnecessary work; so something like generating Perl code or CSV or YAML or JSON, but please not XML if not really needed, should do. Then we can pipe the output to perl or to a Perl script and this will tell us if everything is ok. When we know our platform, the Java or C unit test can even store the output to a file or pipe and call the Perl script on it, failing or succeeding depending on its output. A sketch of this follows below.
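
A minimal sketch of the C side of such a test; the function transform() under test and the Perl script check_transform.pl are hypothetical placeholders:

#include <stdio.h>

/* hypothetical function under test */
extern long transform(long x);

/* dump input/output pairs as CSV; a small Perl script at the other
   end of a pipe does the actual checking */
int main(void) {
    for (long x = -1000; x <= 1000; x++) {
        printf("%ld,%ld\n", x, transform(x));
    }
    return 0;
}

Something like ./testdump | perl check_transform.pl then reports whether all pairs satisfy the expected properties.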

When it comes to numeric types, Ruby is very strong. It has unlimited size integers by default, which can be cast to n-bit integers using constructions like
xx = x & ((1 << n) - 1),
it has Rational, LongDecimal (as an external gem) and Complex, and it is easily extensible.

Usually we can expect corporate constraints on which tools and programming languages may be used to be less restrictive when it comes to unit tests. Integrating this with a continuous integration platform is a job that needs to be addressed, but it is worth the effort if a lot of tests become easier with this approach. And doing tests in another language makes the tests more credible.

Of course the general idea is applicable to other combinations. Look into Scala, F#, Clojure, JavaScript, Python and some others as well, if they seem to be more helpful than Ruby or Perl for your unit testing automation. But this does indeed raise the question whether a world where corporate policies allowed Scala and Clojure instead of Java, F# instead of C#, Elixir instead of Erlang and PL/I instead of Cobol would be better.


How to recover the Carry Bit

As frequent readers might have observed, I like the concept of the carry bit, because it allows for efficient implementations of long integer arithmetic, which I would like to use as the default integer type for most application development. Unfortunately such facilities are not available in high level languages like C and Java. But it is possible to recover the carry bit from what is available in C or Java, at some extra cost in performance, which may be negligible if the compiler optimizes this well. We assume gcc on a 64-bit Linux here; it should be possible to do similar things on other platforms.

So we add two unsigned 64-bit integers x and y and an incoming carry bit i\in\{0, 1\} to a result

    \[z\equiv x+y+i \mod 2^{64}\]

with

    \[0 \le z < 2^{64}\]

using the typical „unsigned long long“ of C. We assume that

    \[x=2^{63}x_h+x_l\]

where

    \[x_h \in \{0,1\}\]

and

    \[0 \le x_l < 2^{63}\]

. In the same way we assume y=2^{63}y_h + y_l and z=2^{63}z_h + z_l with the same kind of conditions for x_h, y_h, z_h or x_l, y_l, z_l, respectively.

Now we have

    \[0 \le x_l+y_l + i \le 2^{64}-1\]

and we can see that

    \[x_l + y_l + i = 2^{63}u + z_l\]

for some

    \[u\in \{0,1\}\]

.
And we have

    \[x+y+i = 2^{64}c+z\]

where

    \[c\in\{0,1\}\]

is the carry bit.
When looking just at the highest visible bit and the carry bit, this boils down to

    \[2c+z_h = x_h + y_h + u\]

This leaves us with eight cases to observe for the combination of x_h, y_h and u:

x_h  y_h  u  |  z_h  c
 0    0   0  |   0   0
 1    0   0  |   1   0
 0    1   0  |   1   0
 1    1   0  |   0   1
 0    0   1  |   1   0
 1    0   1  |   0   1
 0    1   1  |   0   1
 1    1   1  |   1   1

Checking all eight cases, we find that we always have

    \[c = x_h \wedge\neg z_h \vee y_h \wedge\neg z_h \vee x_h \wedge y_h \wedge z_h\]

or

    \[c = (x_h \vee y_h) \wedge\neg z_h \vee x_h \wedge y_h \wedge z_h.\]

So the result does not depend on u anymore, which allows calculating it by temporarily casting x, y and z to signed long long and using their signs.
We can express this as „use x_h \wedge y_h if z_h=1 and use x_h \vee y_h if z_h = 0“.

An incoming carry bit i does not change this; it still leaves x_l + y_l + i < 2^{64}, which is sufficient for making the previous calculations work.
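
A minimal sketch in C of this recovery, under the assumptions above; the function name is made up for illustration:

#include <stdint.h>

/* add x + y + i for i in {0,1}, store the 64-bit result in *z and
   recover the carry bit from the top bits of x, y and z */
unsigned add_with_carry(uint64_t x, uint64_t y, unsigned i, uint64_t *z) {
    *z = x + y + i;
    uint64_t xh = x >> 63, yh = y >> 63, zh = *z >> 63;
    /* c = (x_h or y_h) and not z_h, or x_h and y_h and z_h */
    return (unsigned) ((((xh | yh) & ~zh) | (xh & yh & zh)) & 1u);
}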

In a similar way subtraction can be dealt with.

The basic operations add, adc, sub, sbb, mul, xdiv (div is not available) have been implemented in this library for C. Feel free to use it according to the license (GPL). Addition and subtraction could be implemented in a similar way in Java, with the weirdness of declaring signed longs and using them as unsigned. For multiplication and division, native code would be needed, because Java lacks 128-bit integers. So the C implementation is cleaner.


Microsoft dropping Windows-Support

Microsoft’s CEO Satya Nadella announced that Microsoft is withdrawing its support for the MS-Windows platform and concentrating on its core business, which is „services – services – services“. For a transition period of a few months, support for MS-Windows can still be obtained at an increased rate from Microsoft, but development and support have become too inefficient to be profitable, so the platform will no longer be supported by Microsoft in the future. Negotiations are being conducted with Richard Stallman of the Free Software Foundation about open sourcing Windows and thus keeping it continuously available for the existing customer base. The FSF could take over the source code, continue the development as open source software and charge money for the support. But these talks seem to be extremely difficult, because Richard Stallman insists on not using the word „open source“ during the negotiations. Microsoft recommends using Linux as the operating system for both server and desktop and plans to complete the transition of its own infrastructure within the next 8 months.

Nadella points out that this will increase the profitability of Microsoft and provide benefit to the stockholders, because the core business, services, can now be dealt with in a much more focused and intense way.


Binary Data

Having discussed strings and text data to some extent, it is time to look at the other case, binary data.

Usually we think of arrays of bytes or sequences of bytes stored in some media.

Why bytes? 8-bit computers are no longer so common, but the byte is still the typical unit of binary data. This is actually how files on typical operating systems work: they have a length in bytes and that number of bytes as content. When we use the file system, we have to deal with this.

But assuming a perfect lossless world, the typical smallest unit of information is not a byte, but a bit. So we could actually assume that we have an array of bits of a given length l, and the length does not need to be a multiple of 8. This might be subtle, but it does describe the most basic case. Padding with 0-bits does not usually do the job; there need to be other provisions to deal with the bit length as opposed to the storage length. In case of files this could be achieved by storing the last three bits b of the bit length l in the first byte of the file; using the file size s, the bit length can then be recovered as l = 8(s-2)+b when b > 0 and l = 8(s-1) when b = 0.
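
A short sketch of this recovery in C, assuming exactly the layout just described (one header byte whose low three bits hold l mod 8, followed by the padded data bytes):

#include <stdint.h>

/* recover the bit length l from the total file size s in bytes and
   the header byte holding b = l mod 8 in its low three bits */
uint64_t bit_length(uint64_t s, uint8_t header) {
    uint8_t b = header & 0x7u;         /* l mod 8 */
    return b == 0 ? 8 * (s - 1)        /* last data byte fully used     */
                  : 8 * (s - 2) + b;   /* last data byte partially used */
}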

How about memory? Often the array of bytes is just fine. The approach of having an array of booleans as bits usually hurts, because typical programming languages reserve a whole byte or even a 32-bit word for each boolean, while we actually want a packed array. On the other hand we have 64-bit computers and we would like to use this when moving data around, so it might actually be better to think of an array of 64-bit integers plus information about how many of the bits of the last entry actually count. This does bring up the issue of endianness at some point.

But we actually want to use the data. In the good old C days we could use quite a powerful trick: we read an array of bytes and cast the pointer to a pointer to a struct. Then we could access the struct. Poor man's serialization, but it was efficient and worked especially well for records of fixed size, for both reading and writing. Of course variable length records can be dealt with as well, by imposing rules that derive the type of the next record from the contents of the current record.
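
A sketch of the trick in modern C; the record layout is a made-up example, and memcpy is used instead of the raw pointer cast to stay clear of alignment and strict-aliasing problems (packing and endianness still have to be pinned down for a real interchange format):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* hypothetical fixed-size record layout */
typedef struct {
    uint32_t id;
    uint16_t type;
    uint16_t length;
    uint8_t  payload[24];
} record_t;

/* read one record from a file: poor man's deserialization */
int read_record(FILE *f, record_t *r) {
    unsigned char buf[sizeof(record_t)];
    if (fread(buf, 1, sizeof buf, f) != sizeof buf) {
        return -1;
    }
    memcpy(r, buf, sizeof *r);
    return 0;
}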

The Perl programming language and subsequently the Ruby programming language have much more powerful and flexible mechanisms, the methods or functions called pack and unpack.


PI-Day

When using the arguably stupid American date format, today’s date looks like „03/14/15“ instead of 2015-03-14, which gives the first few digits of \pi.

Now we all know how to calculate \pi: we just use the Taylor series of \arcsin or \arctan, which unfortunately does not converge too well when used naïvely, but there are tricks to make it converge by calculating \arctan(x) or \arcsin(x) of some x with 0 < x < \frac{1}{2}, resulting in a fraction of \pi that can easily be multiplied to get \pi from it. This approach is quite ok and understandable with a relatively basic mathematical background, when allowing for some inaccuracy and intuition in understanding the Taylor theorem. But we can actually do better. \pi is somehow weird, being so irrational and even transcendental, and also in some other aspects, when investigating it.

So here are some cool approaches to calculating \pi that are efficient and beautiful, but require a little more advanced math, or trust in the ones who provided it (which should be ok for most people, but a no-go for mathematicians):

AGM

Use the arithmetic-geometric mean, also mentioned briefly in Geometric and Harmonic Rounding:

The arithmetic-geometric mean is included in long-decimal, and long-decimal uses the AGM-method for calculating pi.
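
As a small illustration, here is a sketch of the AGM-based Gauss-Legendre iteration in plain C doubles; each iteration roughly doubles the number of correct digits, so a handful of steps already exhausts double precision, and real use, as in long-decimal, needs arbitrary precision arithmetic:

#include <stdio.h>
#include <math.h>

/* Gauss-Legendre (AGM) iteration for pi in double precision */
int main(void) {
    double a = 1.0, b = 1.0 / sqrt(2.0), t = 0.25, p = 1.0;
    for (int i = 0; i < 5; i++) {
        double an = (a + b) / 2.0;      /* arithmetic mean */
        b = sqrt(a * b);                /* geometric mean  */
        t -= p * (a - an) * (a - an);
        a = an;
        p *= 2.0;
    }
    printf("%.15f\n", (a + b) * (a + b) / (4.0 * t));  /* approximates pi */
    return 0;
}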

Elliptic Integrals

Elliptic integrals are somewhat weird, because they are easy to write down, but not really easy to solve. But numerical approximations to their calculation can also help finding the value of \pi to some desired precision. Since elliptic functions, elliptic curves and elliptic integrals have fascinated mathematicians (and, at least in the case of elliptic curves over finite fields, also computer scientists) a lot, this must be cool.

Other

If you just want to calculate 10000 digits on your computer (should be Linux or something with Ruby preinstalled), just type:

gem install long-decimal
ruby -e 'require "long-decimal";puts(LongMath.pi(10000))'

That should give you some bedtime reading for the \pi-day. Enjoy it.
And please, do not memorize \pi to 10000 digits. It is possible, but there is stuff more worthwhile memorizing.

And please, do use a decent date format, even if you are American.


Why I still like Ruby

Some years ago Ruby in conjunction with Rails was an absolute hype. In the Rails User Group in Zürich we had meetups with 30 people every two weeks. The biweekly meetings have been retained, but often we are just five to ten people now. Is the great time of Ruby over, or is it just a temporary decline? Is Perl 6 now taking over, since it is ready?

Perl 6 is cool and could challenge Ruby, and it is now getting closer and closer to production readiness. This will happen next year. And this is what we will say next year as well. But Perl 6 is, or will be, very cool, stay tuned… What about Perl 5? At least Allison Randall still likes Perl (5). But it has other areas of strength than Ruby, so I will keep this opinionated writing focused on Ruby.

But the hype is right now in the area of JavaScript.

There are approaches to use server side JavaScript, which are interesting, because JavaScript needs to be dealt with on the client anyway and then it is tempting to have the same language on both sides.

There are also approaches to move functionality back from the server to the client, making the server less relevant; at least the percentage of development effort spent on the client grows and that spent on the server shrinks.

There are cool frameworks for client development in JavaScript, and in conjunction with HTML5 these frameworks can achieve a lot.

On the other hand we are exploring NoSQL databases like MongoDB, and MongoDB uses JavaScript with some extension library as its query language.

The general hype is toward functional programming and voilà, JavaScript is functional; it is the most functional of the languages that have been established for a long time, and it brought the functional paradigm to every developer and even to the more sophisticated web designer, all under the radar and long before we knew how cool functional programming is. On the other hand, Ruby (and by the way Perl and especially Perl 6) allows programming according to the functional paradigm as well, if used appropriately. But in JavaScript it is slightly more natural.

Then we have more contenders. Java is not dying any time soon and it is still strong on its web frameworks, JavaServerFaces and a million others, not so bad actually.

And the idea of Rails, or at least its name, has been taken over by the Groovy guys to develop Groovy on Grails, which integrates nicely with a Java enterprise backend.

And the more functional languages like F#, Scala, Haskell, Erlang and Elixir (and some others for the guys that don’t find Haskell theoretical enough) are around and gaining or retaining some popularity. Will Scala support the Erlang VM as an alternative backend, like JavaScript and the JVM are supported today (and the dotnet CLI in the past)? And Scala has a promising web framework with Play, which does web services right, which is what is actually needed for the modern client side JavaScript frameworks, just as an example.

PHP was always the ugly baby: why did they implement a second Perl, based on the subset of Perl they understood? Well, I am not advocating PHP at all, but there are some pretty decent PHP applications that work really well under high load, like Wikipedia.

The unequal twin of Ruby, Python, is also doing pretty well and has a web framework called Django. I have not investigated it at all, but it seems to exist.

So, where is ruby and rails? Ruby is a beautiful language, that has done many things better than any of the languages mentioned here:

It is easy to learn.

It has a nice, complete and consistent concept of object orientation.

It allows all the functional programming, most of it in a pretty natural way.

It has decent numerical types by default. Integers grow to arbitrary length, Rationals are included and Complex numbers are included. JavaScript does not even have integers; it just has one numerical type: double precision floating point.

It allows extension, even operator overloading, for new numerical types.

And it has a useful and easily accessible web framework in Rails, and actually some contenders as alternative web frameworks, like Camping, Sinatra and some others.

And the development effort and the amount of code needed for a certain functionality are still lower than in Java or C# or COBOL.

I do believe that the web applications that are mostly server based and are using just HTML or minimal JavaScript still have their place and will be discovered again as useful for many application areas.

To conclude, I think that Ruby did not become the replacement for Java that some saw in it and that would have been due, judging by the history of how other mainstream languages have moved to legacy in the past. But I do believe that it will play an important role for some years to come, beyond the current JavaScript hype. Of course it will be challenged again in the future and maybe something will replace the mainstream and make Ruby on Rails just another legacy area for web applications. But I don’t even have an idea what that will be. I don’t think JavaScript. Perl 6, Scala, F# or Clojure do have the technical potential to do so, but I do not see that happening in the near future. Maybe something new will come up? Or something old? Just remember, Ruby is about as old as Java.

We will see.

Read more in this discussion on Quora.

For those who are interested, I actually teach Ruby and it is possible to arrange trainings for two to five days, depending on the previous knowledge of the participants and the goals.


Indexing of long UTF-8 files or strings

The UTF-8 format has the disadvantage that the byte length of characters and code points varies. Accessing a given position counted in characters is only possible by scanning from the beginning or by providing an indexing structure. It is a good idea to find a balance between size and speed. So indexing blocks of several kilobytes and scanning through one block to reach the exact desired position should achieve both goals. The ideas described here can also be applied to compressing a long string or file, by compressing these multi-kilobyte blocks separately. In a way UTF-8 is itself a compression scheme, allowing many of the four byte long code points to be stored in one, two or three bytes. Let us just assume we have characters and bytes; it could be arbitrary length numbers and bytes. And let us talk about compressed and uncompressed blocks.

Grouping and indexing can be done by putting approximately a certain number of bytes into each compressed block, probably the largest number of whole characters that fit into the block. Now blocks can start at multiples of their maximum size, and for each block we keep meta information about its position in the chain of blocks, its uncompressed size and the position of its first character in the totally uncompressed file or string. If the indexing structure is kept in memory only, this is quite easy to achieve. If it is stored in the file, several approaches are possible: add an additional indexing file to each „compressed“ text file, add the meta information at the beginning of the file, thus limiting the number of blocks at creation time, or add the meta information to the end of the file. Or, in some file systems, put it into a second stream, which seems to be the cleanest approach, but is only applicable for some file systems that are not mainstream in the Unix and Linux world.
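
A sketch of what such per-block meta information and the block lookup could look like in C; the struct fields and names are illustrative assumptions:

#include <stdint.h>

/* per-block meta information for random access into a long UTF-8
   file: a binary search over first_char finds the right block, a
   short scan inside that block finds the exact character */
typedef struct {
    uint64_t first_byte;  /* offset of the block in the file     */
    uint64_t byte_len;    /* stored (possibly compressed) length */
    uint64_t first_char;  /* index of its first character        */
    uint64_t char_len;    /* number of characters in the block   */
} block_index_t;

/* find the index of the block containing character position pos */
uint64_t find_block(const block_index_t *idx, uint64_t nblocks, uint64_t pos) {
    uint64_t lo = 0, hi = nblocks;
    while (hi - lo > 1) {
        uint64_t mid = lo + (hi - lo) / 2;
        if (idx[mid].first_char <= pos) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return lo;
}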

The other way around the uncompressed data can be grouped into blocks and then stored in a dense way. Then the meta information needs to contain the address of the first byte of the block. Both approaches seem to be reasonable.

It should be observed that random access reading is possible with this approach, but random access writing is doomed to fail, so it cannot easily be supported, unless the blocks can be stored anywhere and the meta information is used to contain all the information: current block size in bytes, current block size in characters, current location of the first byte, position of the block in the chain, and number of its first character. When inserting characters, all meta information needs to be updated and the block in which the change happens needs to be rewritten completely, but there could be ways to achieve this.

The question arises whether this has been standardized in some way. I have not heard about such a standard, but I am observing that dealing with UTF-8 is always a mess or that this issue is just neglected. Databases have of course answered these kinds of questions in their way quite well, but I think there could be a useful standard for compressed random access text files, maybe even supported by the libc or by the Posix standard.

If a typical compression is applied, it is possible to have a common dictionary for frequently occurring blocks and their encoding in one place of the file and just store the encoded data in the actual blocks, avoiding duplication. That would require a third area for storing data.
