Exceptions to implement Program Logic

Sometimes it is conveniant to use exceptions for implementing the regular program logic.

Assuming we want to find some data and then process them. When no data is found, this step can be skipped, because there is nothing to do. So we could program something like this:


public Data readData(Param param) {
    Data data = db.read(param);
    if (data.isEmpty()) {
        throw new NotFoundException("nothing found");
    }
    return data;
}

public ProcessedData doWork(Param param) {
    try {
        Data input = readData(param);
        ....
        ....
        ....
        ProcessedData result = ....
        return result;
    } catch (NotFoundException nfex) {
        return ProcessedData.empty();
    }
}

And some other exceptions could also be handled in a similar way.

Of course some people say, that this is not good and an abuse of exceptions. But sometimes it is tempting.

So is this bad? And if so, why? Let’s find out.

This is some kind of weird obfuscation of the control flow, because throwing and catching of exceptions can be far apart and it can become quite unclear, from where in the stack which exceptions can be thrown. So there are good reasons to recommend using exceptions only for what they are meant for by their name. The Goto has never made it into Java and we are discouraged from using it in many other languages, like C. But languages like Java, C, Perl, Ruby and some others provide quite rich control flow relying neither on goto nor on exceptions by using „return“ anywhere in a function or method or subroutine, leaving loops with „break“ or „last“ or going to the next iteration with „next“ or „continue“. Perl and Java even allow to specify which of nested loops to leave with break or last. These mechanisms are very powerful and there is no urgent need to add exceptions or even gotos just to support the control flow.

Once moving to newer languages like Scala much of this is gone or at least strongly discouraged in a purely functional programming style. This makes programming Scala harder, and comes with benefits that might be worth the extra effort.

But in Java these functional purists have not become very strong yet, so using „break“, „continue“, „return“ etc. is still ok and quite powerful.

In Java there is another very major problem with exceptions. Many, if not most Java programs run in a framework or container like Spring, EJB/JEE, JBoss Fuse, for example. Now a piece of software becomes a software component, that can interact with other components through the framework. And exceptions are noticed by the framework. In many cases they have the effect that an ongoing transaction is marked as „rollback only“. So the whole processing continues normally, and when all the code from the components is finally done, the framework performs a rollback instead of a commit.

As long as exceptions are only used for handling errors or unsual situations, in which cases the rollback is probably the way to go anyway, everything is fine. But if we for example look up something and base the further processing on the outcome of this, then a NotFoundException will result in very counter intuitive behavior.

So the original rule of not abusing exceptions is actually not such a bad idea.

Share Button

How to use $ in Articles using WP QuickLaTeX

I use WP QuickLaTeX by Pavel Holoborodko in some of my articles to include mathematical formulas.
Now it can be an issue that the „$“-sign, that marks the beginning of a formula, is used as dollar sign.

This can be achieved by using [latexpage] in the beginning of the page. Sometimes it is desirable, to apply this only to part of the page. This is achieved using [latexregion].

The following

$ this is a dollar sign
[latexregion]
And this is a full line formular
$$a^2+b^2=c^2$$
And this is an inline formula $a^2+b^2=c^2$. With more stuff...
[/latexregion]
And here dollar signs $$$$ are dollar signs again.

results in:

$ this is a dollar sign

And this is a full line formular

    \[a^2+b^2=c^2\]

And this is an inline formula a^2+b^2=c^2. With more stuff…

And here dollar signs $$$$ are dollar signs again.

And yes, to show the

[latexregion]
...
[/latexregion]

above, I had to actually type

[latexregion]
...
[/latexregion]

Let’s stop the recursion here…

Links

Share Button

Borrow and Carry Bit for Subtraction

Similar to the usage of the carry bit when adding there are mechanisms for subtracting that allow to integrate the result of subtraction of the lower bits into the subtraction of the next higher block of bits, where necessary.

There are two ways to do this, that are trivially equivalent by a simple not operation:

  • borrow bit (also borrow flag)
  • carry bit (also carry flag)

Often CPUs use what is the carry bit for addition and interpret it as borrow bit for subtraction.

Here are the two ways:

Borrow Bit

It is assumed, that the CPU-word is n bits long, so calculations are performed modulo N=2^n. Further it is assumed, that the range 0\ldots N-1 is preferred, so all sign issues are excluded for the moment.

When subtracting two numbers x, y from each other like x-y and y > x, the provided result is

    \[x-y+N \equiv x-y \mod N\]

and the borrow bit is set (b=1), to indicate that the subtraction caused „underflow“, which had to be corrected by added N in order to get into the desired range.

In the „normal“ case where y \le x, the provided result is simply

    \[x-y\]

and the borrow bit is not set (b=0).

The next round of subtraction takes the borrow bit into account and calculates x-y-b, where the condition becomes y+b > x and the result is

    \[x-y-b+N \equiv x-y \mod N\]

or

    \[x-y-b\]

respectively. This is how some of the older readers used to do it in school on paper, but of course with N=10.

Now the typical integer arithmetic of current CPUs uses Two’s complement, which means that -y=(\mathrm{NOT}\; y)+1. Combining this with the previous results in calculating

    \[x-y = x + (\mathrm{NOT}\; y) + 1 - b \mod N\]

At this point some CPU-designers found it more natural, to use the carry bit c=1-b instead of the borrow bit b.

Carry Bit

When subtracting two numbers x, y from each other like x-y and we have y > x, the provided result is

    \[x-y+N \equiv x-y \mod N\]

and the carry bit is not set (c=0), to indicate that the subtraction caused „underflow“, which had to be corrected by added N in order to get into the desired range.

In the „normal“ case where y \le x, the provided result is simply

    \[x-y\]

and the carry bit is set (c=1).

The next round of subtraction takes the borrow bit into account and calculates x-y-1+c, where the condition becomes y+1-c > x and the result is

    \[x-y-1+c+N \equiv x-y \mod N\]

or

    \[x-y-1+c\]

respectively.

Now two’s complement with -y=(\mathrm{NOT}\; y)+1 this can be written as

    \[x-y = x + (\mathrm{NOT}\; y) + 1 - b \mod N\]

or with c=1-b

    \[x-y = x + (\mathrm{NOT}\; y) + c \mod N\]

These two ways are really equivalent and easily transformed into each other. Neither of them provides big advantages, apart from the fact that we have the unnecessary confusion, because it depends on the CPU design, which of the two variants is used.

Recovery of Borrow or Carry bit

The borrow bit is calculated and used during subtractions that are done at assembly language level, but higher languages like C or Java do provide access to this information. It is relatively easy to recover the carry bit in the case of addition based on x, y and x+y \mod N.

This possible as well for the subtraction. Quite easily the comparison between x and y or y+b could be done before the subtraction. This would work, but it is kind of inefficient, because under the hood the comparison is just a subtraction with discarding the result and keeping the flags. So the subtraction is performed twice. This might not be such a bad idea, because a compiler could recognize it or the work of subtracting twice could be neglectable compared to the logic for an „optimized“ recovery, based on some logic expression of certain bits from x, y and x-y \mod N.

Links

Share Button

Poor mans profiling with LOGs

We have professional profiling tools and we should use them.. They give really useful and extensive information.

So why bother about doing „poor man’s profiling“?

About ten years ago, running profilers was kind of constrained to very small examples and computers with really huge memory. We have this huge memory now on every development machine. But we can only run such tests on development and test systems.

Sometimes it is possible to copy data from the productive system to the test system and really simulate what is happening on the productive system. Often this is quite far away from what is really happening on the productive system and we often only know about a problem when it has already occurred.

What we usually do have are log files. Each entry has a time stamp and can somehow be connected to a line of code where it is created. We always know the class or the file or something like that from where the log statement was issued and since there is only a limited number of log statements, they can often be recognized. Sometimes it is a good idea to actually help finding where it was logged. For example could we artificially put the source code line number or some uniq number or a short random string into the source code for each log statement. Then we can find several pieces of information in the log statement:

  • The timestamp (please use ISO-format!!! And at least milliseconds)
  • The source code location (line number, class name, random string, relative position in file, meaningful description like entering/leaving methods etc.)
  • The thread and/or process
  • The payload information of the log

Now log statements analyzed by thread. The first step is to look from which pairs of locations in the source code we observe successive log statements from the same thread, ignoring all logging from other threads and processes. We can also estimate the time that was spent for going from the first to the second location. Aggregating this by basically adding up thes timestamp deltas for each such pair over all threads and the whole log file will already indicate some hotspots where time is spent. Sometimes it is even interesting to look at sequences of three log statements (from the same thread). This was useful, when DB-commits, i.e. one such pair like „begin commit“ to „commit done“ used around 50% of the time, if not more. This was not interesting without knowing the statement that was committed, which was found out by looking at the previous log entry of the same thread just before the commit.

Such analysis can be done with scripts for example in Perl, Ruby or Python. I use Perl for this kind of task. Maybe log analysis tools provide this out of the box or with a little bit of tweaking. A concept about what entries in log files look like that make recognizing the location of the program that created the log entry, are very helpful. Sometimes we actually want to see, with which kins of data the problem occurred, which means looking also into the payload.

Share Button

Rounding to Rational Numbers

Usually we think of rounding as a way of approximately expressing numbers with many decimal places by numbers with fewer decimal places.

Regular readers of this blog have already encountered, that this concept can be extended and generalized, as is mentioned in the article about Geometric and Harmonic Rounding and Residue Class Rounding or Rounding with Sum.
Now computers and humans can deal with rational numbers and sometimes that is the best way.

Idea 1: Read the Double as Rational

Most typical non-integer computer numbers are actually already rational numbers of the form \frac{n}{2^m} with a relatively large power of two in the denominator. But as soon as we perform divisions, we leave the accurate world and round, usually in the way that the machine throws in front of our feet out of the box. But all other floating point operations can result in rounding as well. So rational numbers can be a way to go. So we can just naturally convert a double or float number into a rational number and continue working with that.

Idea 2: Go for human readable fractions

Let us think of human readers. We are kind of comfortable with fractions like \frac{1}{12}, \frac{1}{10}, \frac{1}{8} and multiples of them, so basically every fraction that can be expressed with a denominator of 1, 2, 3, 4, 5, 6, 8, 10 or 12. The 12 is because of the omnipresence of the analog clock, the 10 because it is just one more decimal digit and the 8 because it is about the number of successive halving that we can still easily cope with. The idea would be to round to the nearest such fraction and to prefer the lower denominator when two adjacent values are equally far away. This works out well, because our set of allowed fractions between 0 and 1 is just

    \[\{0, \frac{1}{12}, \frac{1}{10}, \frac{1}{8}, \frac{1}{6}, \frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{3}{8}, \frac{2}{5}, \frac{5}{12}, \frac{1}{2}, \frac{7}{12}, \frac{3}{5}, \frac{5}{8}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \frac{5}{6}, \frac{7}{8}, \frac{9}{10}, \frac{11}{12}, 1\}\]

and it is more or less trivial to program such a rounding algorithm, by just hard coding the limits, normalizing to the interval [0,1) and finding the right slot by binary search, for example. This is what we usually want to make numbers human readable and understandable. If we want more accuracy we can often just use the trick of going to % or or just shifting units by multiples of 1000, depending on what we are measuring or counting. This is often a bit better than just decimal numbers, and we can more often solve the issue of rounding with sum when some of the values we want to round are the same and just won’t come out the same of our rounding procedure.

Idea 3: Use continuous fractions

With continuous fractions it is possible to express any real number in the form

    \[a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{ \ddots + \cfrac{1}{a_n} }}}\]

or

    \[a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{\ddots}}}}\]

For integers this is trivial, just use a_0. For negative numbers we just take the negative of the continuous fraction of the absolute value, so we can assume a non-integral positive number r_0.
This allows determining a_0 by just taking the integer part of it.
Then we continue with r_1 = \frac{1}{r_0 - a_0} and so on. Either we end up with an integer a_n at some point, which is the case for rational numbers or the continuous fraction continues infinitely, which is the case for irrational numbers. Now taking the continuous fraction just up to the first n elements, as in the first form and ignoring what comes after that, can in turn be converted into a rational number, by just doing some trivial rational arithmetic. It turns out, that this is a very good approximation for that size of the denominator. The funny thing is that this works even for irrational numbers under certain conditions. For example could we assume that we are dealing with numbers of the form x+y\sqrt{D} for some fixed rational D and rational numbers x and y. It is relatively easy to approximate x+y\sqrt{D}=x_0+y_0\sqrt{D} with the largest integer a_0 that is less or equal. Now we can calculate

    \[\frac{1}{x_k+y_k\sqrt{D}-a_k}=\frac{x_k-a_k-y_k\sqrt{D}}{(x_k-a_k)^2-y_k^2\cdot D}=x_{k+1}+y_{k+1}\sqrt{D}\]

and thus use this algorithm as another way to calculate rational approximations of square roots. In this case the continuous fraction becomes periodic and there is a surprising lot of interesting mathematics behind it, if you like to dig deeper.

Share Button

Accident Languages

Some commonly used languages have been quite well designed or at least would have been considered so at the time when they appeared. Even if they have their weaknesses, they should be good for some purposes.

Now beauty of programming languages is highly subjective. So I do not claim any universal truth to this. But I consider the popularity of certain languages as an accident, at least for the time when they became highly popular. This does not preclude that good stuff has been done with them or that they have matured to some extent or even that their potential replacements of the time of their appearance have lost relevance. I do not mention the classics COBOL and Fortran, because they were early pioneers of high level languages. And looking at these two languages, one of them does not deserve to be mentioned together with the other one.

Anyway, here we go:

Pascal

Pascal was in certain areas popular before C took off. To a great extent this was true because it was used as a teaching language and because Turbo Pascal ran well on a PC with 4.77 MHz and 256k Memory and Floppy Disks only in the early 1980s. This was the hardware that many people could afford with some pain (prices around 5000 DEM in Germany). And it transported some interesting concepts to a wider public. People who knew Algol 68 used to consider it a step backwards. Anyway, in the later 1980s C was available to a broader public, because affordable hardware that could run C-compilers reasonably fast and conveniently came into existence. C does have its flaws, but it is a well designed language and I personable consider the step from Pascal to C one of the most positive ever encountered.

Python

Python is a good language with a lot of power. But I see it in a similar space as Ruby, which to me appears as one of the most beautifully designed languages. So why use Python if Ruby could serve the same purpose. While Ruby took off with Rails for some time, Python kind of took over where people were looking for a replacement for Fortran and in Data science. It now has really superior libraries in many areas and has achieved a very strong position in Devops, where it has taken over to a large extent from Perl and Ruby. So Python has really gained traction and I do recommend learning it.

PHP

PHP came out as an easier to use Perl clone for web development. In the 1990s and early 2000s CGI was a common way to do web application development and Perl was strong in this area. I still consider Perl the better designed language, even though it does have its flaws and weaknesses. A more critical view would be that PHP was an inferior clone that copied only what they understood. In terms of building a great open source community it would have been more desirable, if the PHP guys had contributed to developing frameworks and libraries that would have made Perl better. But competition is a good thing also, even in open source. And PHP has clearly won the competition for the web space against Perl and now has good library and framework support. Some of the negative reputation of PHP comes from the fact, that unskilled developers write a lot of PHP code, because they came from the web design space and just added a bit of PHP to their HTML code. This is not to blame on the language. On the positive side I would mention that a lot of really great software has been written in PHP, some of which I use on a daily basis, like Wikipedia and its underlying software MediaWiki, WordPress and a lot of others.

VBA

VBA comes as a development and scripting language for MS-Office, which is a useful thing. Like PHP it attracts non-developers who write ugly code, because they use MS-Office and enrich it with a bit of VBA, which is not to blame on the language. I have not really tried it enough, so I let others speak:

We do have to admit, that sometimes non-developers „get the job done“ in a fraction of the time that professional developers need using VBA and Excel. The drawback is, that this solution is really limited, because it depends on some excel sheets on drive C: of a certain team member. Or of course shared ones, but who is using it when? How many copies are there around? Data and program are together in one file…
Anyway, a lot of useful stuff is done with VBA and as long as we do not miss the point when it is time to switch to something else, this might not even be a bad idea.

JavaScript

JavaScript was developed in 1995 and gave some functionality to the browser. The original idea was to enable applets, but applets have lost their relevance long ago and JavaScript itself can now do everything that applets could do and much more than that. JavaScript did transport interesting concepts to a wider public, like lambda expressions or anonymous functions, that had been around for ages in Lisp, but only been known to a small fraction of the software developers. But the legitimate question is, why a new language had to be invented for this, instead of adding an existing and mature language like Perl or Tcl to the browser. They were good candidates for this in 1995, I would not recommend to add them now, because we do have more adequate choices now. Anyway, JavaScript lacked and to my knowledge still lacks decent integer types. All numbers are double precision floating point, which does express integers of up to 53 bits accurately, but for counting pixels some kind of sufficiently long integer would have been a useful thing. There was some hope coming up that Dart might be established as a successor to JavaScript. But not even Google with Chrome was influential enough to get this established. It did survive as one of many languages that can be „compiled“ or „transpiled“ to JavaScript, as TypeScript, CoffeeScript, Clojure, Scala and many other languages can now be transpiled to JavaScript. I was told that even JavaScript itself is transpiled in order to support different languages of the language from one source code base. New hope is coming with WebAssembly, which will establish a common binary language (like JavaVM) on all browsers and allow compiling languages to this target instead of JavaScript. Again, as PHP and VBA, many non developers use JavaScript, again coming from the web design background. And again, we have today very good web applications that heavily rely on browser side logic, which is implemented in JavaScript or compiled to JavaScript. This allows for much better user experience and saves power on the server which no longer needs to do the rendering. Moving JavaScript to the server side with Node.js looks like a weird idea. We have better languages to work with on the server side than JavaScript. But again, competition between concepts, languages and technologies is also a good thing.

Conclusion

All these languages have somehow reached a relatively high popularity, even though they were not really the most advanced choices. But a lot of good stuff has been done with them and is still being done for example with JavaScript, Python and PHP today. That is what counts most.

Share Button

UUIDs revisited

UUIDs have proven useful in many circumstances.
We have basically two main variants:

  • The UUID is calculated as a combination of the Ethernet-MAC-address, the timestamp and a counter.
  • The UUID is calculated using a good random number generator

While variant 1 provides for a good uniqueness, there are some issues with it. Today we use mostly virtualized servers, which means that the MAC-address is coming from a configuration file and no longer guaranteed to be world wide unique. And we give away some information with the UUID, that we do not necessarily want to give away.

Variant 2 can be proven to have an acceptably low risk of collisions, but this is only true when using really good random number generators, which cannot always be guaranteed. Also it introduces an uncertainty in an area where we do not need it. We need to worry about this uniqueness, at least a little but, which is unnecessary.

So the question is, if we can rethink variant 1.

Assuming, that our software runs on our server farm. There may be a few hundred or thousand or even millions of virtual or physical servers. Now the organization does have a way to uniquely identify their servers. Of course we only need to consider the servers that are relevant for the application. Maybe an ID for the service instance instead of the server is even better. We may assume a numerical ID or something or have a table to map IP addresses and the like to such an ID. Some thinking is still required on how to do this. We can fill the digits that we do not need with random numbers.

Putting this ID instead of the MAC address solves the issue of configurable MAC address.

The next problem, that timestamps can be abused to find out something that should not be found out could be resolved by running the timestamp part or even the ID part (including a random number) through a symmetric encryption or simply some bijective function that is kept as a secret.

In many circumstances there is nothing wrong with customizing the UUID-generation to some „local“ standard, if this is well understood and carefully implemented.

Share Button

How to calculate Square Roots and Cubic Roots

The functions sqrt and sometimes even cbrt are commonly available, but it is nice to see how they can be calculated.

There are several approaches, but the most popular ones are Newton’s method and an algorithmic formulation of how roots are taken manually, for those old enough to still have learned it in school. Earlier measurements that I did many years ago showed that the Newton approximation is slower, but it would be worth to do newer measurements.

So we have an equation y = x^2 or y=x^3 and want to find x or a well defined approximation of x when we know y. Mathematically speaking we want to assume that y is constant and we want to find an x for which f(x)=x^2-y=0 or g(x)=x^3-y=0. If we guess such an x and then draw the tangent at the curve of the function at the point (x, f(x)) or (x, g(x)), then the intersection point of the tangent can be used as the next approximation. This method converges in the case of these two functions (and some others) and is reasonably fast. Now the tangent has the linear equation

    \[\frac{y-y_0}{x-x_0}=f'(x_0)\]

where y_0=f(x_0) and f'(x)=\frac{df(x)}{dx} is the derivative of f(x). We want to solve this equation for y=0 and thus we get

    \[x-x_0 = -\frac{f(x_0)}{f'(x_0)}\]

and thus

    \[x=x_0-\frac{f(x_0)}{f'(x_0)}\]

As an iteration rule

    \[\bigwedge_{i=0}^\infty x_{i+1}=x_i-\frac{f(x_i)}{f'(x_i)}\]

In case of the sqare root we can just start with an estimation by shifting half the length to the right, but avoiding zero, which is important because of the division. Then we get for an appropriate n

    \[x_0 = 2^{-n}y\]

    \[x_{i+1}=x_i-\frac{x_i^2-y}{2x_i}=\frac{x_i+y/x_i}{2}\]

The last form is quite intuitive, even without calculus. As I said this converges usefully fast and there is tons of math around to describe the behavior, speed, precision and convergence of the calculations performed in this algorithm. Written in Ruby just for integers, this is quite simple. Convergence is simply discovered by the fact that the result does not change any more, which may fail in some cases, where intermediate results oscillate between two values, but just for the purpose of benchmarking it seems to be sufficient:

def sqrt_newton(x)
  if (x == 0) then
    return 0
  end
  y0 = x
  u0 = x
  while (u0 > 0) do
    y0 >>= 1
    u0 >>= 2
  end
  y0 = [1, y0].max
  yi = y0
  yi_minus_1 = -1
  loop do
    yi_plus_1 = (yi + x/yi) >> 1;
    if (yi_minus_1 == yi_plus_1) then
      return [yi, yi_minus_1].min
    elsif (yi == yi_plus_1) then
      return yi
    end
    yi_minus_1 = yi
    yi = yi_plus_1
  end
end

The newton algorithm tends to oscillate between two approximations, so this termination criteria takes into account y_{i-1}, y_i and y_{i+1} and uses the lower of the two oscillating values. This results in calculating the largest integer y such that y^2\le x and (y+1)^2 > x.

For the third root we get for an appropriate n

    \[x_0 = 2^{-n}y\]

    \[x_{i+1}=x_i-\frac{x_i^3-y}{3x_i^2}=\frac{x_i+y/x_i^2}{3}\]

Again this is a useful way and there is math around on when to stop for a desired precision.
Read Wikipedia for the convergence issues.

There is another approach, that people used to know when doing calculations on paper was more important than today. For the decimal system it works like this:

1. 7 3 2 0 5
----------------------
/ 3.00 00 00 00 00
/\/ 1 = 20*0*1+1^2
-
2 00
1 89 = 20*1*7+7^2
----
11 00
10 29 = 20*17*3+3^2
-----
71 00
69 24 = 20*173*2+2^2
-----
1 76 00
0 = 20*1732*0+0^2
-------
1 76 00 00
1 73 20 25 = 20*17320*5+5^2
----------
2 79 75
(source Wikipedia)
We group the digits to the left and to the right of the decimal point in groups of two. The highest possible square of an integral number that is below or equal to the leftmost group (03 in the example above) is used for the first digit of the result (1 in the example above). This square is subtracted and the next group is appended (200 in the example). Assuming that y_n is the result already calculated and x_n is what we have achieved after the subtraction and the appending of the next group, we search for a digit z_n such that u_n = 20\cdot y_n\cdot z_n + z_n^2 \le x_n. z_n is chosen in such a way that it yields the maximum possible u_n wich is still \le x_n. Subtracting u_n from x_n and appending the next group allows for the next iteration.

Now this can be turned into an algorithm. The first approach is to just switch from decimal system to binary system. Then for each iteration step we have to deal just with the possible values of 0 and 1, which greatly simplifies the algorithm. Here is a simple ruby program that would do this:

def split_to_words(x, word_len)
  bit_pattern = (1 << word_len) - 1   words = []   while (x != 0 || words.length == 0) do     w = x & bit_pattern     x = x >> word_len
    words.unshift(w)
  end
  words
end

def sqrt_bin(x)
  if (x == 0) then
    return 0
  end
  xwords = split_to_words(x, 2)
  xi = xwords[0] - 1
  yi = 1
  1.upto(xwords.length-1) do |i|
    xi = (xi << 2) + xwords[i]     d0 = (yi << 2) + 1     r  = xi - d0     b  = 0     if (r >= 0) then
      b  = 1
      xi = r
    end
    yi = (yi << 1) + b   end   return yi end

It seems that the two solutions yield the same results, but the sqrt_newton outperforms sqrt_bin by a factor of two.

Now we should reconsider, if base 2 is really the best choice. Actually we can use any power of 2 as a base and efficiently work with that. Apart from the initial first step, which is done by using an extended version of sqrt_bin, the next steps are estimated by division and trying neighboring values to get the exact result. This makes use of the fact that the equation we need to solve
u_n = 2\cdot b\cdot y_n\cdot z_n + z_n^2 \le x_n with the maximum z_n fullfilling this equation, where b is the base to which we are working, witch was 10 or 2 above and could now be a power of 2. As soon as y_n\cdot b has a certain size, the influence of z_n^2 becomes less relevant. We can consider the maximum posible value for z_n, which is b-1 and thus solve 2\cdot b\cdot y_n\cdot z_n\le x_n and 2\cdot b\cdot y_n\cdot z_n\le x_n-(b-1)^2, each for the maximum z_n fullfilling the equation. This can be calculated by simple division. If the range between the two solutions is small enough, then each value in the range can be tried to find the actual accurate solution for 2\cdot b\cdot y_n\cdot z_n + z_n^2 \le x_n and this is more efficient than working just bitwise. This method sqrt_word seems to outperform sqrt_newton for longer numbers, for example around 60 decimal digits with word_length=16. So the most promising approach seems to be to optimize the implementation and parameters of sqrt_word. The issue of termination, which has been properly addressed in the newton implementation, is already dealt with in this implementation. For more serious analysis it would be interesting to implement the algorithms in C or even in assembly language. So this is the final result for square roots, with some checks added:

def check_is_nonneg_int(x, name)
  raise TypeError, "#{name}=#{x.inspect} must be Integer" unless (x.kind_of? Integer) && x >= 0
end

def check_word_len(word_len, name="word_len")
  unless ((word_len.kind_of? Integer) && word_len > 0 && word_len <= 1024)
    raise TypeError, "#{name} must be a positive number <= 1024"
  end
end

def split_to_words(x, word_len)
  check_is_nonneg_int(x, "x")
  check_word_len(word_len)
  bit_pattern = (1 << word_len) - 1
  words = []
  while (x != 0 || words.length == 0) do
    w = x & bit_pattern
    x = x >> word_len
    words.unshift(w)
  end
  words
end

def sqrt_bin(x)
  yy = sqrt_bin_with_remainder(x)
  yy[0]
end

def sqrt_bin_with_remainder(x)
  check_is_nonneg_int(x, "x")
  if (x == 0) then
    return [0, 0]
  end

  xwords = split_to_words(x, 2)
  xi = xwords[0] - 1
  yi = 1

  1.upto(xwords.length-1) do |i|
    xi = (xi << 2) + xwords[i]
    d0 = (yi << 2) + 1
    r  = xi - d0
    b  = 0
    if (r >= 0) then
      b  = 1
      xi = r
    end
    yi = (yi << 1) + b
  end
  return [yi, xi]
end

def sqrt_word(x, n = 16)
  check_is_nonneg_int(x, "x")
  check_is_nonneg_int(n, "n")

  n2 = n << 1
  n1 = n+1
  check_word_len(n2, "2*n")
  if (x == 0) then
    return 0
  end

  xwords = split_to_words(x, n2)
  if (xwords.length == 1) then
    return sqrt_bin(xwords[0])
  end

  xi = (xwords[0] << n2) + xwords[1]
  a  = sqrt_bin_with_remainder(xi)
  yi = a[0]
  if (xwords.length <= 2) then
    return yi
  end

  xi = a[1]
  2.upto(xwords.length-1) do |i|
    xi = (xi << n2) + xwords[i]
    d0 = (yi << n1)
    q  = (xi / d0).to_i
    j  = 10
    was_negative = false
    while (true) do
      d = d0 + q
      r = xi - (q * d)
      break if (0 <= r && (r < d || was_negative))
      if (r < 0) then
        was_negative = true
        q = q-1
      else
        q = q+1
      end
      j -= 1
      if (j <= 0) then
        break
      end
    end
    xi = r
    yi = (yi << n) + q
  end
  return yi
end

def sqrt_newton(x)
  check_is_nonneg_int(x, "x")
  if (x == 0) then
    return 0
  end
  y0 = x
  u0 = x
  while (u0 > 0) do
    y0 >>= 1
    u0 >>= 2
  end
  y0 = [1, y0].max
  yi = y0
  yi_minus_1 = -1
  loop do
    yi_plus_1 = (yi + x/yi) >> 1;
    if (yi_minus_1 == yi_plus_1) then
      return [yi, yi_minus_1].min
    elsif (yi == yi_plus_1) then
      return yi
    end
    yi_minus_1 = yi
    yi = yi_plus_1
  end
end

This is the approach that has been built into the LongDecimal library, ignoring Newton. The examples have been added to github.

The algorithms can be extended to cubic roots or any higher roots. In this case, the nth root of \sum_{j=0}^{m}a_j b^{n(m-j)} is calculated by starting with the maximal integral number z_0 with z_0^n \le a_0 and the subsequently finding numbers z_j fullfilling an equation of the form {n}\choose{k}\sum_{k=1}^n (by_j)^{n-k}z_j^k \le x_i. This is always easy to handle for base two, by just testing the two possible solutions. For higher bases and n=3 it involves solving an quadratic equation, once the numbers are high enough to neglect the term z_j^3. For n=4 it is just possible to take the square root of the square root. For higher values of n and bases other than 2 it becomes really difficult to tame this algorithm. So I intend to constrain myself to square roots and cube roots. I have not explored, if it is useful to calculate the cube root with a higher base than 2 and which approach provides the best performance for cube roots. Even the square root calculation can possibly be tuned a bit. Maybe this will be addressed in another article.

Share Button

Guava-Collections in Java-APIs

When we write APIs, that have parameters or as return values, it is a good idea to consider relying on immutable objects only. This applies also when collections are involved directly or indirectly as content of the classes that occur as return values or parameters. Changing what is given through the API in either direction can create weird side effects. It even causes different behavior, depending on weather the API works locally or via the network, because changes of the parameters are usually not brought back to the caller via some hidden back channel, unless we run locally. I use Java as an example, but it is quite an universal concept and applies to many languages. If we talk about Ruby, for example, the freeze method might be our friend, but it goes only one level deep and we actually want to deep-freeze.. Another story, maybe…

Now we can think of using Java’s own Collections.unmodifiableList and likewise. But that is not really ideal. First of all, these collections can still be modified by just working on the inner of the two collections:

List list = new ArrayList<>();
list.add("a");
list.add("b");
list.add("c");
List ulist = Collections.unmodifiableList(list);
list.add("d");
ulist.get(3)
-> "d"

Now list in the example above may already come from a source that we don’t really control or know, so it may be modified there, because it is just „in use“. Unless we actually copy the collection and put it into an unmodifiableXXX after that and only retain the unmodifiable variant of it, this is not really a good guarantee against accidental change and weird effects. Let’s not get into the issue, that this is just controlled at runtime and not at compile time. There are languages or even libraries for Java, that would allow to require an immutable List at compile time. While this is natural in languages like Scala, you have to leave the usual space of interfaces in Java, because it is also really a good idea to have the „standard“ interfaces in the APIs. But at least use immutable implementations.

When we copy and wrap, we still get the weird indirection of working through the methods of two classes and carrying around two instances. Normally that is not an issue, but it can be. I find it more elegant to use Guava in these cases and just copy into a natively immutable collection in one step.

For new projects or projects undergoing a major refactoring, I strongly recommend this approach. But it is of course only a good idea, if we still keep our APIs consistent. Doing one of 100 APIs right is not worth an additional inconsistency. It is a matter of taste, to use standard Java interfaces implemented by Guava or Guava interfaces and classes itself for the API. But it should be consistent.

Links

Share Button

Scala Days 2019

Scala Days had a tenth anniversary in 2019. It is an annual conference about Scala, which I like to visit, when possible.

There is some confusion as to what is meant:

  • Where this the 9th, 10th or 11th Scala Days?
  • Where the first Scala Days 9, 10 or 11 years ago?

The truth is, that the first Scala Days were organized in 2010 and took place in Lausanne. So it was 9 years ago, but the tenth event. For this reason it was done in Lausanne again, but in a much larger venue.

Just for the completeness:

This is the opening keynote by Martin Odersky. A video will be on youtube soon.

I visited the following talks:

And on the last day I visited the following talks and the diversity panel and the closing panel:

Here is the whole schedule.

Btw., links might expire when the next Scala Days will start using the site in early 2020. Where it will be in 2020 is still unknown, but since 2014 it has been a habit to do the even years in Berlin. I will try to come, if there are Scala Days 2020 in Europe.

Share Button