# How to calculate Square Roots and Cubic Roots

The functions sqrt and sometimes even cbrt are commonly available, but it is nice to see how they can be calculated.

There are several approaches, but the most popular ones are Newton’s method and an algorithmic formulation of how roots are taken manually, for those old enough to still have learned it in school. Earlier measurements that I did many years ago showed that the Newton approximation is slower, but it would be worth to do newer measurements.

So we have an equation or and want to find or a well defined approximation of when we know . Mathematically speaking we want to assume that is constant and we want to find an for which or . If we guess such an and then draw the tangent at the curve of the function at the point or , then the intersection point of the tangent can be used as the next approximation. This method converges in the case of these two functions (and some others) and is reasonably fast. Now the tangent has the linear equation

where and is the derivative of . We want to solve this equation for and thus we get

and thus

As an iteration rule

In case of the sqare root we can just start with an estimation by shifting half the length to the right, but avoiding zero, which is important because of the division. Then we get for an appropriate n

The last form is quite intuitive, even without calculus. As I said this converges usefully fast and there is tons of math around to describe the behavior, speed, precision and convergence of the calculations performed in this algorithm. Written in Ruby just for integers, this is quite simple. Convergence is simply discovered by the fact that the result does not change any more, which may fail in some cases, where intermediate results oscillate between two values, but just for the purpose of benchmarking it seems to be sufficient:
 def sqrt_newton(x)   if (x == 0) then     return 0   end   y0 = x   u0 = x   while (u0 > 0) do     y0 >>= 1     u0 >>= 2   end   y0 = [1, y0].max   yi = y0   yi_minus_1 = -1   loop do     yi_plus_1 = (yi + x/yi) >> 1;     if (yi_minus_1 == yi_plus_1) then       return [yi, yi_minus_1].min     elsif (yi == yi_plus_1) then       return yi     end     yi_minus_1 = yi     yi = yi_plus_1   end end 

The newton algorithm tends to oscillate between two approximations, so this termination criteria takes into account , and and uses the lower of the two oscillating values. This results in calculating the largest integer such that and .

For the third root we get for an appropriate n

Again this is a useful way and there is math around on when to stop for a desired precision.
Read Wikipedia for the convergence issues.

There is another approach, that people used to know when doing calculations on paper was more important than today. For the decimal system it works like this:
 1. 7 3 2 0 5 ---------------------- / 3.00 00 00 00 00 /\/ 1 = 20*0*1+1^2 - 2 00 1 89 = 20*1*7+7^2 ---- 11 00 10 29 = 20*17*3+3^2 ----- 71 00 69 24 = 20*173*2+2^2 ----- 1 76 00 0 = 20*1732*0+0^2 ------- 1 76 00 00 1 73 20 25 = 20*17320*5+5^2 ---------- 2 79 75 (source Wikipedia)
We group the digits to the left and to the right of the decimal point in groups of two. The highest possible square of an integral number that is below or equal to the leftmost group (03 in the example above) is used for the first digit of the result (1 in the example above). This square is subtracted and the next group is appended (200 in the example). Assuming that is the result already calculated and is what we have achieved after the subtraction and the appending of the next group, we search for a digit such that . is chosen in such a way that it yields the maximum possible wich is still . Subtracting from and appending the next group allows for the next iteration.

Now this can be turned into an algorithm. The first approach is to just switch from decimal system to binary system. Then for each iteration step we have to deal just with the possible values of and , which greatly simplifies the algorithm. Here is a simple ruby program that would do this:
 def split_to_words(x, word_len)   bit_pattern = (1 << word_len) - 1   words = []   while (x != 0 || words.length == 0) do     w = x & bit_pattern     x = x >> word_len     words.unshift(w)   end   words end

 

def sqrt_bin(x)   if (x == 0) then     return 0   end   xwords = split_to_words(x, 2)   xi = xwords[0] - 1   yi = 1   1.upto(xwords.length-1) do |i|     xi = (xi << 2) + xwords[i]     d0 = (yi << 2) + 1     r  = xi - d0     b  = 0     if (r >= 0) then       b  = 1       xi = r     end     yi = (yi << 1) + b   end   return yi end 

It seems that the two solutions yield the same results, but the sqrt_newton outperforms sqrt_bin by a factor of two.

Now we should reconsider, if base 2 is really the best choice. Actually we can use any power of 2 as a base and efficiently work with that. Apart from the initial first step, which is done by using an extended version of sqrt_bin, the next steps are estimated by division and trying neighboring values to get the exact result. This makes use of the fact that the equation we need to solve
with the maximum fullfilling this equation, where is the base to which we are working, witch was or above and could now be a power of . As soon as has a certain size, the influence of becomes less relevant. We can consider the maximum posible value for z_n, which is and thus solve and , each for the maximum fullfilling the equation. This can be calculated by simple division. If the range between the two solutions is small enough, then each value in the range can be tried to find the actual accurate solution for and this is more efficient than working just bitwise. This method sqrt_word seems to outperform sqrt_newton for longer numbers, for example around 60 decimal digits with word_length=16. So the most promising approach seems to be to optimize the implementation and parameters of sqrt_word. The issue of termination, which has been properly addressed in the newton implementation, is already dealt with in this implementation. For more serious analysis it would be interesting to implement the algorithms in C or even in assembly language. So this is the final result for square roots, with some checks added:
 def check_is_nonneg_int(x, name)   raise TypeError, "#{name}=#{x.inspect} must be Integer" unless (x.kind_of? Integer) && x >= 0 end

 def check_word_len(word_len, name="word_len")   unless ((word_len.kind_of? Integer) && word_len > 0 && word_len <= 1024)     raise TypeError, "#{name} must be a positive number <= 1024"   end end def split_to_words(x, word_len)   check_is_nonneg_int(x, "x")   check_word_len(word_len)   bit_pattern = (1 << word_len) - 1   words = []   while (x != 0 || words.length == 0) do     w = x & bit_pattern     x = x >> word_len     words.unshift(w)   end   words end def sqrt_bin(x)   yy = sqrt_bin_with_remainder(x)   yy[0] end def sqrt_bin_with_remainder(x)   check_is_nonneg_int(x, "x")   if (x == 0) then     return [0, 0]   end   xwords = split_to_words(x, 2)   xi = xwords[0] - 1   yi = 1   1.upto(xwords.length-1) do |i|     xi = (xi << 2) + xwords[i]     d0 = (yi << 2) + 1     r  = xi - d0     b  = 0     if (r >= 0) then       b  = 1       xi = r     end     yi = (yi << 1) + b   end   return [yi, xi] end def sqrt_word(x, n = 16)   check_is_nonneg_int(x, "x")   check_is_nonneg_int(n, "n")   n2 = n << 1   n1 = n+1   check_word_len(n2, "2*n")   if (x == 0) then     return 0   end   xwords = split_to_words(x, n2)   if (xwords.length == 1) then     return sqrt_bin(xwords[0])   end   xi = (xwords[0] << n2) + xwords[1]   a  = sqrt_bin_with_remainder(xi)   yi = a[0]   if (xwords.length <= 2) then     return yi   end   xi = a[1]   2.upto(xwords.length-1) do |i|     xi = (xi << n2) + xwords[i]     d0 = (yi << n1)     q  = (xi / d0).to_i     j  = 10     was_negative = false     while (true) do       d = d0 + q       r = xi - (q * d)       break if (0 <= r && (r < d || was_negative))       if (r < 0) then         was_negative = true         q = q-1       else         q = q+1       end       j -= 1       if (j <= 0) then         break       end     end     xi = r     yi = (yi << n) + q   end   return yi end 

def sqrt_newton(x)   check_is_nonneg_int(x, "x")   if (x == 0) then     return 0   end   y0 = x   u0 = x   while (u0 > 0) do     y0 >>= 1     u0 >>= 2   end   y0 = [1, y0].max   yi = y0   yi_minus_1 = -1   loop do     yi_plus_1 = (yi + x/yi) >> 1;     if (yi_minus_1 == yi_plus_1) then       return [yi, yi_minus_1].min     elsif (yi == yi_plus_1) then       return yi     end     yi_minus_1 = yi     yi = yi_plus_1   end end 

This is the approach that has been built into the LongDecimal library, ignoring Newton. The examples have been added to github.

The algorithms can be extended to cubic roots or any higher roots. In this case, the th root of is calculated by starting with the maximal integral number with and the subsequently finding numbers z_j fullfilling an equation of the form . This is always easy to handle for base two, by just testing the two possible solutions. For higher bases and it involves solving an quadratic equation, once the numbers are high enough to neglect the term . For it is just possible to take the square root of the square root. For higher values of and bases other than it becomes really difficult to tame this algorithm. So I intend to constrain myself to square roots and cube roots. I have not explored, if it is useful to calculate the cube root with a higher base than and which approach provides the best performance for cube roots. Even the square root calculation can possibly be tuned a bit. Maybe this will be addressed in another article.

# Flashsort in Ruby

Deutsch

There is a simple implementation of Flashsort in Ruby, after having already provided an implementation in C. The C-implementation is typically faster than the libc-function qsort, but this depends always on the data and on how well the metric-function has been written, that is needed on top of the comparison function for Flashsort. You can think of this metric function as some kind of monotonic hash function. So we have

This additionally needed function of method is not really there, apart from numerical values, so we really have to invest some time into writing it. This makes the use of Flashsort a bit harder. A good metric function is crucial for good performance, but for typical text files quite trivial implentations already outperform classical algorithms like Heapsort and Quicksort and Mergesort for larger amounts of data.

This blog article shows other sorting algorithms for Ruby.

# LongDecimal

Disclaimer: This article is an occasion, where you might need some of the presumably useless mathematics that you might have learned in school and university. If this bothers you, maybe you should wait for the next article in about two weeks time.

LongDecimal is a library that I have provided for Ruby. It is available as a ruby gem. It was originally intended to provide something like BigDecimal for Java. There is a BigDecimal, but it is not really the same. For writing finance applications, such a class is useful, so I wrote one that covers what Java’s BigDecimal has. It ended up by having a lot more, but we will get to that later.

So the general idea is that we do math with a subset of the rational numbers () . This is not quite the truth, because the actually carries information that we care about, so we would actually define

So we actually want to allow the numerator to be a multiple of and we use this to express the precision as to how many digits after the decimal point are explicitely part of our number. Having more decimal places after the decimal point expresses more precision.

Now we try to use mathematical operations , and on . It turns out that we have three different cases. The ring operations can be defined without problems, even though is not quite a ring, as we will see. But it is good enough for most purposes.

Addition and Subtraction actually lose information if , because we might have an input with lower precision and in the end pretend to have a result of the higher precision. But not losing numerical information is considered more important and implicit rounding should be avoided at all costs, at least for the basic operations.

is not a ring, but it is a Semiring. The zero is not universally unique, but we seem to have many zeros . This is not the problem, because only would act as an additive neutral element. What we lack are additive inverse elements. If we have an element with , there is no element , such that . The distributivity, required for a semiring, can be seen easily:

But since we do computer programming and not math and only use math as a tool to help us, it is kind of OK, that it is only a semiring and not a ring, as long as we know it.

Division is a special case, because it is not always possible to express the exact numerical value of the quotient in , for example , where the denominator is not a power of ten. To do such operations, a rule on how to round needs to be provided. This is cumbersome, because it blows up our formulas, so we define a set . Now the quotient of two elements of is a member of . And we have the rules

where and somehow try to estimate how precise the result of the division might be. The basic idea is to do the whole calculation that includes the division and round the result to the desired number of decimal places after the point and with the rounding mode desired.

Now the power is a hard one. Arbitrary powers can of course be defined and are supported, but most of the time, the exponent is actually an integer. These cases can be defined nicely. For exponents we actually get a result in and for negative exponents we get results in :

For non-integral exponents, the calculation of powers falls back to Ruby’s built in power and transforms elements of and involved into rational numbers. These are of limited use, but they are provided and work and can be used, when needed. There is a more general power function, that has additional parameters for the desired rounding and number of digits after the decimal point. While this library goes long ways to achieve decent accuracy and speed, there are certainly possible input parameters that will result in extremely long calculation times or results that are much less accurate than claimed. Such examples are „hard“ to find and should not harm the practical usefulness of the library too much. Similar libraries in the Java world like BigDecimal do not even try to calculate powers with arbitrary exponents and the Ruby builtin library BigDecimal (which is something slightly different) does have its issues when calculating arbitrary powers.

Rounding functions are there to convert a numerical type that is at least viewable as a subset of to . The actual rounding has to be implemented, but it has been done for , and the built in types of Ruby except for Complex (). For complex numbers, the real and the imaginary part are rounded and stuffed into a new complex number.

Rounding needs two pieces of information, the desired precision (number of decimal places after the decimal point) and the rounding mode. There are different methods for rounding, but they all follow the same basic rules. A special case is the round_to_allowed_remainders, which does a residue class rounding.

There are many rounding modes. Rounding can be towards 0, away from 0, towards infinity or towards negative infinity. This boils down to cutting off all digits but (or adding zeros) and possibly adjusting the result by one, if the cut off part contained anything but zeros. Other rounding modes take a mean between the two adjacent result candidates and decide by that which one to take, requiring an extra rule for the case that the value that needs to be rounded happens to be exactly on the border.

Generalized powers and all functions that return something irrational like square roots, cubic roots, exponential functions, logarithms and in the future also trigonometric functions needs to be calculated with the number of digits required and a rounding mode. Currently square roots (sqrt) and cube roots (cbrt) are calculated accurately according to these rounding parameters. For the transcendential functions (logarithms, exponential functions, power, trigonometric functions) minor deviations from the mathematically accurate result are still possible. Since the major usage of the library is expected to deal with the basic operations only, this is considered acceptable. To really work with the transcendental functions, using interval arithmetic in conjunction with long decimal would anyway be a better way, so the necessary guarantee to be given would be to provide a result that is close, but guaranteed to be lower or equal than the real mathematical result and one that is guaranteed to be greater or equal. Progress in this area is not going to happen very soon, unless someone would be volunteering to help with this or someone would be volunteering to sponsor the development.

Also it might be interesting to port this library to other languages, even to Java, because it has become much more sophisticated than Java’s BigDecimal library. Again this is unlikely to happen too soon without any help.

The current priority is to keep this library working with recent Ruby versions and to add the missing trigonometric functions.

Use it as follows:
gem install long-decimal
to install it. Then use it in your code with:
require "long-decimal"

A remark for people who are mathematically inclined: The definition of the natural numbers is not totally universal. Sometimes we have and sometimes we have . To avoid this, I am using , even though the index is kind of ugly. I agree with Dijkstra that we should prefer to include the 0 in the natural numbers.
Another remark for mathematically interested readers: If we were defining , we would actually have a ring. If we now replaced with a prime number , we would approach the realm of p-adic numbers (). This is well worth supporting by a library as well, but it is quite a different story and of course only of interest to a small group who actually knows p-adic numbers and works with them.

This blog:

# TruffleRuby

The language Ruby is one of the most beautiful languages. A lot of things can be done, it has a good level of abstraction, it has chosen some very good defaults, has provided some great ideas that I have not discovered in any other language that I know well and provides a lot of flexibility. But I could no longer recommend it for projects that might require a good performance. I won’t go into the issue of static typing vs. dynamic here. Ruby is following the dynamic typing path and if you think that is a bad idea at all, then it will never become your favorite. But this is an issue with pros and cons. The big disadvantage of Ruby is that it is not very good in terms of performance. The single threaded performance is somewhat better in many reasonable languages like Java, Python, Scala, Clojure, C, C#, F# and some others .. And it gets worse when we want to use multiple cores, because Ruby does not run them simultaneously, but uses a global lock which ensures that only one thread at a time is running. Or in case of JRuby just crashes or yields wrong results in certain mulithreaded programs that we could write.

One approach is to go for immutability as a default, which allows quite painless multithreading. Scala and Clojure follow this route, for example. It is hard to write good code with this constraint or to make good use of very local mutability without leaking it outside, but under these conditions multiple threads are just working fine without deadlocks, crashes or falsified results. Another approach is to just copy structures and leave its own copy to each thread. There are ways to do a lot on this path, but the copying costs a lot of memory and performance and it is not always a gain.

Now Ruby heavily relies on mutable structures for strings and collections. It is not reasonable to go for a total paradigm change in this aspect. But there are some ways to get good and safe and fast operations on these collections and strings without breaking this. One idea is to work with chunks of collections or strings. For strings, the string that we are working with is described as a concatenation of such strings. Many operations can be made by just concatenating multiple strings together and possibly replacing one of them with a copy that can be made as needed. This is called Rope. A similar approach can be applied to collections. Then a smart locking mechanism is applied to the shorter string or collections when needed, but many operations can avoid locking or block much less of the structure.

Also the compiler can analyze the program and simplify it to a great extent, compile it to the JVM, which in conjunction with hot spot optimizations will make it run really fast. Now this TruffleRuby is much faster than other Rubies, by a factor of about 10. It uses GraalVM and it actually supports a lot of C-extensions for libraries through the feature of GraalVM that they can be eventually compiled to the JVM. It does not work if extensions rely on implementation details of the Ruby structures in C and it often does not work for C-extensions that go to low level OS functionality. The current version of TruffleRuby is not really ready to use in conjunction with Ruby on Rails, which is kind of a no go, because Ruby is usually used in conjunction with Rails. My impression is that it will be possible to use it with Rails in a year or two.

Hearing of this in a talk by Benoit Daloze in the local Rails user group in Zürich was a great and positive surprise. Ruby gets interesting again.

# Logging

Deutsch

Software often contains a logging functionality. Usually entries one or sometimes multiple lines are appended to a file, written to syslog or to stdout, from where they are redirected into a file. They are telling us something about what the software is doing. Usually we can ignore all of it, but as soon as something with „ERROR“ or worse and more visible stack traces can be found, we should investigate this. Unfortunately software is often not so good, which can be due to libraries, frameworks or our own code. Then stack traces and errors are so common that it is hard to look into or to find the ones that are really worth looking into. Or there is simply no complete process in place to watch the log files. Sometimes the error shows up much later than it actually occurred and stack traces do not really lead us to the right spot. More often than we think logging actually introduces runtime errors, that were otherwise not present. This is related to a more general concept, which is called observer effect, where logging actually changes the business logic.

It is nice that log files keep to some format. Usually they start with a time stamp in ISO-format, often to the millisecond. Please add trailing zeros to always have 3 digits after the decimal point in this case. It is preferable to use UTC, but people tend to stick to local date and time zones, including the issues that come with switching to and from daylight saving time. Usually we have several processes or threads that run simultaneously. This can result in a wild mix of logging entries. As long as even multiline entries stay together and as long as beginning and end of one multiline entry can easily be recognized, this can be dealt with. Tools like splunk or simple Perl, Ruby or Python scripts can help us to follow threads separately. We could actually have separate logs for each thread in the first place, but this is not a common practice and it might hit OS-limitations on the number of open files, if we have many threads or even thousands of actors as in Erlang or Akka. Keeping log entries together can be achieved by using an atomic write, like the write system call in Linux and other Posix systems. Another way is to queue the log entries and to have a logger thread that processes the queue.

Overall this area has become very complex and hard to tame. In the Java world there used to be log4j with a configuration file that was a simple properties file, at least in the earlier version. This was so good that other languages copied it and created some log4X. Later the config file was replaced by XML and more logging frame works were added. Of course quite a lot of them just for the purpose of abstracting from the large zoo of logging frameworks and providing a unique interface for all of them. So the result was, that there was one more to deal with.

It is a good question, how much logic for handling of log files do we really want to see in our software. Does the software have to know, into which file it should log or how to do log rotation? If a configuration determines this, but the configuration is compiled into the jar file, it does have to know… We can keep our code a bit cleaner by relying on program functionality without code, but this still keeps it as part of the software.

Log files have to please the system administrator or whoever replaced them in a pure devops shop. And in the end developers will have to be able to work with the information provided by the logs to find issues in the code or to explain what is happening, if the system administrator cannot resolve an issue by himself. Should this system administrator have to deal with a different special complex setup for the logging for each software he is running? Or should it be necessary to call for developer support to get a new version of the software with just another log setting, because the configurations are hard coded in the deployment artifacts? Interesting is also, what happens when we use PAAS, where we have application server, database etc., but the software can easily move to another server, which might result in losing the logs. Moving logs to another server or logging across the network is expensive, maybe more expensive than the rest of this infrastructure.

Is it maybe a good idea to just log to stdout, maintaining a decent format and to run the software in such a way that stdout is piped into a log manager? This can be the same for all software and there is one way to configure it. The same means not only the same for all the java programs, but actually the same for all programs in all languages that comply to a minimal standard. This could be achieved using named pipes in conjunction with any hard coded log file that the software wants to use. But this is a dangerous path unless we really know what the software is doing with its log files. Just think of what weird errors might happen if the software tries to apply log rotation to the named pipe by renaming, deleting, creating new files and so on. A common trick to stop software from logging into a place where we do not want this is to create a directory with the name of the file that the software usually uses and to write protect this directory and its parent directory for the software. Please find out how to do it in detail, depending on your environment.

What about software, that is a filter by itself, so its main functionality is to actually write useful data to stdout? Usually smaller programs and scripts work like this. Often they do not need to log and often they are well tested relyable parts of our software installation. Where are the log files of cp, ls, rm, mv, grep, sort, cat, less,…? Yes, they do tend to write to stderr, if real errors occur. Where needed, programs can turn on logging with a log file provided on the command line, which is also a quite operations friendly approach. Named pipes can help here.

And we had a good logging framework in place for many years. It was called syslog and it is still around, at least on Linux.

A last thought: We spend really a lot of effort to get well performing software, using multiple processes, threads or even clusters. And then we forget about the fact that logging might become the bottle neck.

# Meaningless Whitespace in Textfiles

We use different file formats that are more or less tolerant to certain changes. Most well known is white space in text files.

In some programming languages white space (space, newline, carriage return, form feed, tabulator, vertical tab) has no meaning, as long as any whitespace is present. Examples for this are Java, Perl, Lisp or C. Whitespace, that is somehow part of String content is always significant, but white space that is used within the program can be combination of one or more of the white space characters that are in the lower 128 positions (ISO-646, often referred to as ASCII or 7bit ASCII. It is of course recommended to have a certain coding standard, which gives some guidelines of when to use newlines, if tabs or spaces are preferred (please spaces) and how to indent. But this is just about human readability and the compiler does not really care. Line numbers are a bit meaningful in compiler and runtime error messages and stack traces, so putting everything into one line would harm beyond readability, but there is a wide range of ways that are all correct and equivalent. Btw. many teams limit lines to 80 characters, which was a valid choice 30 years ago, when some terminals were only 80 characters wide and 132 character wide terminals where just coming up. But as a hard limit it is a joke today, because not many of us would be able to work with a vt100 terminal efficiently anyway. Very long lines might be harder to read, so anything around 120 or 160 might still be a reasonable idea about line lengths…

Languages like Ruby and Scala put slightly more meaning into white space, because in most cases a semicolon can be skipped if it is followed by a newline and not just horizontal white space. And Perl (Perl 5) is for sure so hard to compile that only its own implementation can properly format or even recognize which white space is part of a literal string. Special cases like having the language in a string and parsing and then executing that should be ignored here.

Now we put this program files into a source code management system, usually Git. Some teams still use legacy systems like subversion, source safe, clear case or CVS, while there are some newer systems that are probably about as powerful as git, but I never saw them in use. Git creates an MD5 hash of each file, which implies that any minor change will result in a new version, even if it is just white space. Now this does not hurt too much, if we agree on the same formatting and on the same line ending (hopefully LF only, not CR LF, even on MS-Windows). But our tooling does not make any difference between significant changes and insignificant formatting only changes. This gets worse, if users have different IDEs, which they should have, because everyone should use the IDE or editor, with which he or she is most efficient and the formal description of the preferred formatting is not shared between editors or differs slightly.

I think that each programming language should come with a command line diff tool and a command line formatting tool, that obey a standard interface for calling and can be plugged into editors and into source code management systems like git. Then the same mechanisms work for C, Java, C#, Ruby, Python, Fortran, Clojure, Perl, F#, Scala, Lua or your favorite programming language.

I can imaging two ways of working: Either we have a standard format and possibly individual formats for each developer. During „git commit“ the file is brought into the standard format before it is shown to git. Meaning less whitespace changes disappear. During checkout the file can optionally be brought into the preferred format of the developer. And yes, there are ways to deal with deliberate formatting, that for some reason should be kept verbatim and for dealing differently with comments and of course all kinds of string literals. Remember, the formatting tool comes from the same source as the compiler and fully understands the language.

The other approach leaves the formatting up to the developer and only creates a new version, when the diff tool of the language signifies that there is a relevant change.

I think that we should strive for this approach. It is no rocket science, the kind of tools were around for many decades as diff and as formatting tools, it would just be necessary to go the extra mile and create sister diff and formatting tools for the compiler (or interpreter) and to actually integrate these into build environments, IDEs, editors and git. It would save a lot of time and leave more time for solving real problems.

Is there any programming language that actually does this already?

How to handle XML? Is XML just the new binary with a bit more bloat? Can we do a generic handling of all XML or should it depend on the Schema?

# Loops with unknown nesting depth

We often encounter nested loops, like

for (i = 0; i < n; i++) {
for (j = 0; j < m; j++) {
doSomething(i, j);
}
}


This can be nested to a few more levels without too much pain, as long as we observe that the number of iterations for each level need to be multiplied to get the number of iterations for the whole thing and that total numbers of iterations beyond a few billions (, German: Milliarden, Russian Миллиарди) become unreasonable no matter how fast the doSomethings(...) is. Just looking at this example program

public class Modular {
public static void main(String[] args) {
long n = Long.parseLong(args[0]);
long t = System.currentTimeMillis();
long m = Long.parseLong(args[1]);
System.out.println("n=" + n + " t=" + t + " m=" + m);
long prod = 1;
long sum  = 0;
for (long i = 0; i < n; i++) {
long j = i % m;
sum += j;
sum %= m;
prod *= (j*j+1) % m;
prod %= m;
}
System.out.println("sum=" + sum + " prod=" + prod + " dt=" + (System.currentTimeMillis() - t));
}
}


which measures it net run time and runs 0 msec for 1000 iterations and almost three minutes for 10 billions ():

> java Modular 1000 1001 # 1'000
--> sum=1 prod=442 dt=0
> java Modular 10000 1001 # 10'000
--> sum=55 prod=520 dt=1
> java Modular 100000 1001 # 100'000
--> sum=45 prod=299 dt=7
> java Modular 1000000 1001 # 1'000'000
--> sum=0 prod=806 dt=36
> java Modular 10000000 1001 # 10'000'000
--> sum=45 prod=299 dt=344
> java Modular 100000000 1001 # 100'000'000
--> sum=946 prod=949 dt=3314
> java Modular 1000000000 1001 # 1'000'000'000
--> sum=1 prod=442 dt=34439
> java Modular 10000000000 1001 # 10'000'000'000
--> sum=55 prod=520 dt=332346


As soon as we do I/O, network access, database access or simply a bit more serious calculation, this becomes of course easily unbearably slow. But today it is cool to deal with big data and to at least call what we are doing big data, even though conventional processing on a laptop can do it in a few seconds or minutes... And there are of course ways to process way more iterations than this, but it becomes worth thinking about the system architecture, the hardware, parallel processing and of course algorithms and software stacks. But here we are in the "normal world", which can be a "normal subuniverse" of something really big, so running on one CPU and using a normal language like Perl, Java, Ruby, Scala, Clojure, F# or C.

Now sometimes we encounter situations where we want to nest loops, but the depth is unknown, something like

for ( = 0;  < ; ++) {
for ( = 0;  < ; ++) {

for ( = 0;  < ; ++) {
dosomething();
}

}
}


Now our friends from the functional world help us to understand what a loop is, because in some of these more functional languages the classical C-Style loop is either missing or at least not recommended as the everyday tool. Instead we view the set of values we iterate about as a collection and iterate through every element of the collection. This can be a bad thing, because instantiating such big collections can be a show stopper, but we don't. Out of the many features of collections we just pick the iterability, which can very well be accomplished by lazy collections. In Java we have the Iterable, Iterator, Spliterator and the Stream interfaces to express such potentially lazy collections that are just used for iterating.

So we could think of a library that provides us with support for ordinary loops, so we could write something like this:

Iterable range = new LoopRangeExcludeUpper<>(0, n);
for (Integer i : range) {
doSomething(i);
}


or even better, if we assume 0 as a lower limit is the default anyway:

Iterable range = new LoopRangeExcludeUpper<>(n);
for (Integer i : range) {
doSomething(i);
}


with the ugliness of boxing and unboxing in terms of runtime overhead, memory overhead, and additional complexity for development. In Scala, Ruby or Clojure the equivalent solution would be elegant and useful and the way to go...
I would assume, that a library who does something like LoopRangeExcludeUpper in the code example should easily be available for Java, maybe even in the standard library, or in some common public maven repository...

Now the issue of loops with unknown nesting depth can easily be addressed by writing or downloading a class like NestedLoopRange, which might have a constructor of the form NestedLoopRange(int ... ni) or NestedLoopRange(List li) or something with collections that are more efficient with primitives, for example from Apache Commons. Consider using long instead of int, which will break some compatibility with Java-collections. This should not hurt too much here and it is a good thing to reconsider the 31-bit size field of Java collections as an obstacle for future development and to address how collections can grow larger than elements, but that is just a side issue here. We broke this limit with the example iterating over 10'000'000'000 values for i already and it took only a few minutes. Of course it was just an abstract way of dealing with a lazy collection without the Java interfaces involved.

So, the code could just look like this:

Iterable range = new NestedLoopRange();
for (Tuple t : range) {
doSomething(t);
}


Btw, it is not too hard to write it in the classical way either:

        long[] n = new long[] {  };
int m1 = n.length;
int m  = m1-1; // just to have the math- matched...
long[] t = new long[m1];
for (int j = 0; j < m1; j++) {
t[j] = 0L;
}
boolean done = false;
for (int j = 0; j < m1; j++) {
if (n[j] <= 0) {
done = true;
break;
}
}
while (! done) {
doSomething(t);
done = true;
for (int j = 0; j < m1; j++) {
t[j]++;
if (t[j] < n[j]) {
done = false;
break;
}
t[j] = 0;
}
}


I have written this kind of loop several times in my life in different languages. The first time was on C64-basic when I was still in school and the last one was written in Java and shaped into a library, where appropriate collection interfaces were implemented, which remained in the project or the organization, where it had been done, but it could easily be written again, maybe in Scala, Clojure or Ruby, if it is not already there. It might even be interesting to explore, how to write it in C in a way that can be used as easily as such a library in Java or Scala. If there is interest, please let me know in the comments section, I might come back to this issue in the future...

In C it is actually quite possible to write a generic solution. I see an API like this might work:

struct nested_iteration {
/* implementation detail */
};

void init_nested_iteration(struct nested_iteration ni, size_t m1, long *n);
void dispose_nested_iteration(struct nested_iteration ni);
int nested_iteration_done(struct nested_iteration ni); // returns 0=false or 1=true
void nested_iteration_next(struct nested_iteration ni);


and it would be called like this:

struct nested_iteration ni;
int n[] = {  };
for (init_nested_iteration(ni, , n);
! nested_iteration_done(ni);
nested_iteration_next(ni)) {
...
}


So I guess, it is doable and reasonably easy to program and to use, but of course not quite as elegant as in Java 8, Clojure or Scala.
I would like to leave this as a rough idea and maybe come back with concrete examples and implementations in the future.

# When to use Scala and Ruby

There are many interesting languages that have their sweet spots and of course a larger set of languages than just two should be considered for new projects.

But Ruby and Scala are both very interesting languages that did not just pick up and sell concepts that were already known, but brought them to a new level and to new beauty. Interestingly, both were started by a single person and finally became community projects.

There are some differences to observe.

Ruby is mostly a dynamic language, which means that it is easier and more natural to change the program at runtime. This is not necessarily a bad thing and different Lisp variants including today’s Clojure have successfully used and perfected this kind of capability for many decades. Consequently more things happen at runtime, especially dynamic typing is used, which means that types only exist at runtime.

Scala is mostly a static language, which means that all program structures have to be created at compile time. But this has been brought to perfection in the sense that a lot of things that are typically available only in dynamic languages, can be done. The type system is static and it is in this sense more consistent and more rigorous than the type system of Java, where we sometimes encounter areas that cannot reasonably be covered by Generics and fall back to the old flavor of untyped collections. This does not happen too often, but the static typing of Scala goes further.

In general this gives more flexibility to Ruby and makes it somewhat harder to tame the ways to do similar things in a static way in Scala. But the type system at compile time of course helps to match things, to find a certain portion of errors and even to make the program more self explanatory without relying on comments. In IDEs it is hard to properly support Scala, but the most common IDEs have achieved this to a very useful level. This should not be overvalued, because there are enough errors that cannot be detected by just using common types. It is possible to always define more specific types which include tight constraints and thus perform really tight checking of certain errors at compile time, but the built in types and the types from common libraries are to convenient and the time effort for this is too high, so it does not seem to be the usual practice. In any case it is a recommended practice to achieve a good test coverage of non-trivial functionality with automated tests. They implicitly cover type errors that are detected by the compiler in Scala, but of course only to the level of the test coverage. Ruby is less overhead to compile and run. We just write the program and run it, while we need a somewhat time intensive compile step for Scala. If tests are included, it does not make so much of a difference, because running the tests or preceding them with a compile job is kind of a minor difference.

An interesting feature of Ruby is called „monkey patching“. This means that it is possible to change methods of an existing class or even of a single object. This can be extremely powerful, but it should be used with care, because it changes the behavior of the class in the whole program and can break libraries. Usually this is not such a bad thing, because it is not used for changing existing methods, but for adding new methods. So it causes problems only when two conflicting monkey patches occur in different libraries. But for big programs with many libraries there is some risk in this area. Scala tries to achieve the same by using „implicit conversions“. So a conversion rule is implicitly around and when a method is called on an object that does not exist in its type, the adequate conversion is applied prior to the method. This works at compile time. Most of the time it is effectively quite similar to monkey patching, but it is a bit harder to tame, because writing and providing implicit conversions is more work and harder to understand than writing monkey patches. On the other hand, Scala avoids the risks of Ruby’s monkey patching.

An increasingly important issue is making use of multiple CPU cores. Scala and especially Scala in combination with Akka is very strong on this. It supports a reasonably powerful and tamable programming model for using multiple threads. The C- or JavaSE-way is very powerful, but it is quite difficult to avoid shooting oneself into the foot and even worse there is a high likelihood that such errors show up in production, in times of heavy load, while all testing seemed to go well. This is the way to go in some cases, but it requires a lot of care and a lot of thinking and a team of skillful developers. There are more developers who think they belong to this group than are actually able to do this well. Of course Scala already filters out some less skilled developers, but still I think its aproach with Akka is more sound.
Ruby on the other hand has very little support for multithreading, and cannot as easily make use of multiple cores by using threads. While the language itself does support the creation of threads, for many years the major implementation had very little support for this in the sense that not actually multiple threads were running at the same time. This propagated into the libraries, so this will probably never become the strength of Ruby. The way to go is to actually start multiple processes. This is not so bad, because the overhead of processes in Ruby is much less than in JVM-languages. Still this is an important area and Scala wins this point.

Concerning web GUIs Ruby has Rails, which is really a powerful and well established way to do this. Scala does provide Play, which is in a way a lot of concepts from Rails and similar frameworks transferred to Scala. It is ok to use it, but rails is much more mature and more mainstream. So I would give this point to Ruby. Rails includes Active Record, about which I do have doubts, but this is really not a necessary component of a pure WebGUI, but more a backend functionality…

So in the end I would recommend to use Scala and Akka for the solution, if it is anticipated that a high throughput will be needed. For smaller solutions I would favor Ruby, because it is a bit faster and easier to get it done.

For larger applications a multi tier architecture could be a reasonable choice, which opens up to combinations. The backend can be done with Scala. If server side rendering is chosen, Ruby and Rails with REST-calls to the backend can be used. Or a single page application which is done in JavaScript or some language compiling to JavaScript and again REST-calls to the backend.

# Lazy Collections, Strings or Numbers

The idea is, that we have data that is obtained or calculated to give us on demand as much of it as we request. But it is not necessarily initially present. This concept is quite common in the functional world, where we in a way hide the deprecated concept of state in such structures, by the way in a way that lets use retain the benefits that led to the desire for statelessness.

Actually the concept is quite old. We have it for I/O in Unix and hence in Linux since the 1970ies. „Everything is a file“, at least as long as we constrain ourselves to a universal subset of possible file operations. It can be keyboard input, a named or anonymous pipe, an actual file, a TCP-connection, to name the most important cases. These are „lazy“ files, behave more or less like files as far as sequential reading is concerned, but not for random access reading. The I/O-concept has been done in such a way that it takes the case into account that we want to read bytes, but get only bytes. This can happen with files when we reach their end, but then we can obtain an indication that we reached the end of the file, while it is perfectly possible that we read less then we want in one access, but eventually get bytes including subsequent reads. Since the API has been done right, but by no means ideal, it generalizes well to the different cases that exist in current OS environments.

We could consider a File as an array of bytes. There is actually a way to access it in this way by memory-mapping it, but this assumes a physically present file. Now we could assume that we think of the array as a list that is optimized for sequential access and iterating, but not for random access. Both list types actually exist in languages like Java. Actually the random access structure can be made lazy as well, within certain constraints. If the source is actually sequential, we can just assume that the data is obtained up to the point where we actually read. The information about the total length of the stream may or may not be available, it is always available somehow in the case of structures that are completely available in memory. This random access on lazy collections works fine if the reason of laziness is to actually save us from doing expensive operations to obtain data that we do not actually need or to obtain them in parallel to the computation that processes the data. But we loose another potential drawback in this case. If the data is truly sequential, we can actually process data that is way beyond our memory capacity.

So the concept transfers easily from I/O-streams to lists and even arrays, most naturally to iterables that can be iterated only once. But we can easily imagine that this also applies to Strings, which can be seen a sequence of characters. If we do not constrain us to what a String is in C or Java or Ruby, but consider String to be a more abstract concept, again possibly dropping the idea of knowing the length or having a finite length. Just think of the output of the Unix command „yes“ or „cat /dev/zero“, which is infinite, in a theoretical way, but the computer won’t last forever in real life, of course. And we always interrupt the output at some time, usually be having the consumer shut down the connection.

Even numbers can be infinite. For real numbers this can happen only after the decimal point, for p-adic numbers it happens only before the decimal point, if you like to look into that. Since we rarely program with p-adic numbers this is more or less an edge case that is not part of our daily work, unless we actually do math research. But we could have integers with so many digits that we actually obtain and process them sequentially.

Reactive programming, which is promoted by lightbend in the Reactive Manifesto relies heavily on lazy structures, in this case data streams. An important concept is the so called „backpressure“, that allows the consumer to slow down the producer, if it cannot read the data fast enough.

Back to the collections, we can observe different approaches. Java 8 has introduced streams as lazy collections and we need to transform collections into streams and after the operation a stream back into a collection, at least in many real life situations. But putting all into one structure has some drawbacks as well. But looking at it from an abstract point of view this does not matter. The java8-streams to not implement a collection interface, but they are lazy collections from a more abstract point of view.

It is interesting that this allows us to relatively easily write nested loops where the depth of the nesting is a parameter that is not known at compile time. We just need a lazy collections of -tuples, where is the actual depth of the nesting and the contents are according to what the loops should iterate through. In this case we might or might not know the size of the collection, possibly not fitting into a 32-bit-integer. We might be able to produce a random member of the collection. And for sure we can iterate through it and stop the iteration wherever it is, once the desired calculation has been completed.

# How to create ISO Date String

It is a more and more common task that we need to have a date or maybe date with time as String.

There are two reasonable ways to do this:
* We may want the date formatted in the users Locale, whatever that is.
* We want to use a generic date format, that is for a broader audience or for usage in data exchange formats, log files etc.

The first issue is interesting, because it is not always trivial to teach the software to get the right locale and to use it properly… The mechanisms are there and they are often used correctly, but more often this is just working fine for the locale that the software developers where asked to support.

So now the question is, how do we get the ISO-date of today in different environments.

## Linux/Unix-Shell (bash, tcsh, …)

date "+%F"

## TeX/LaTeX

 \def\dayiso{\ifcase\day \or 01\or 02\or 03\or 04\or 05\or 06\or 07\or 08\or 09\or 10\or% 1..10 11\or 12\or 13\or 14\or 15\or 16\or 17\or 18\or 19\or 20\or% 11..20 21\or 22\or 23\or 24\or 25\or 26\or 27\or 28\or 29\or 30\or% 21..30 31\fi} \def\monthiso{\ifcase\month \or 01\or 02\or 03\or 04\or 05\or 06\or 07\or 08\or 09\or 10\or 11\or 12\fi} \def\dateiso{\def\today{\number\year-\monthiso-\dayiso}} \def\todayiso{\number\year-\monthiso-\dayiso} 
This can go into a file isodate.sty which can then be included by \include or \input Then using \todayiso in your TeX document will use the current date. To be more precise, it is the date when TeX or LaTeX is called to process the file. This is what I use for my paper letters.

## LaTeX

(From Fritz Zaucker, see his comment below):
 \usepackage{isodate} % load package \isodate % switch to ISO format \today % print date according to current format 

## Oracle

 SELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD') FROM DUAL; 
On Oracle Docs this function is documented.
It can be chosen as a default using ALTER SESSION for the whole session. Or in SQL-developer it can be configured. Then it is ok to just call
 SELECT SYSDATE FROM DUAL; 

Btw. Oracle allows to add numbers to dates. These are days. Use fractions of a day to add hours or minutes.

## PostreSQL

(From Fritz Zaucker, see his comment):
 select current_date; —> 2016-01-08 
 select now(); —> 2016-01-08 14:37:55.701079+01 

## Emacs

In Emacs I like to have the current Date immediately:
 (defun insert-current-date () "inserts the current date" (interactive) (insert (let ((x (current-time-string))) (concat (substring x 20 24) "-" (cdr (assoc (substring x 4 7) cmode-month-alist)) "-" (let ((y (substring x 8 9))) (if (string= y " ") "0" y)) (substring x 9 10))))) (global-set-key [S-f5] 'insert-current-date) 
Pressing Shift-F5 will put the current date into the cursor position, mostly as if it had been typed.

## Emacs (better Variant)

(From Thomas, see his comment below):
 (defun insert-current-date () "Insert current date." (interactive) (insert (format-time-string "%Y-%m-%d"))) 

## Perl

In the Perl programming language we can use a command line call
 perl -e 'use POSIX qw/strftime/;print strftime("%F", localtime()), "\n"' 
or to use it in larger programms
 use POSIX qw/strftime/; my \$isodate_of_today = strftime("%F", localtime()); 
I am not sure, if this works on MS-Windows as well, but Linux-, Unix- and MacOS-X-users should see this working.

If someone has tried it on Windows, I will be interested to hear about it…
Maybe I will try it out myself…

## Perl 5 (second suggestion)

(From Fritz Zaucker, see his comment below):
 perl -e 'use DateTime; use 5.10.0; say DateTime->now->strftime(„%F“);‘ 

## Perl 6

(From Fritz Zaucker, see his comment below):
 say Date.today; 
or
 Date.today.say; 

## Ruby

This is even more elegant than Perl:
 ruby -e 'puts Time.new.strftime("%F")' 
will do it on the command line.
Or if you like to use it in your Ruby program, just use
 d = Time.new s = d.strftime("%F") 

Btw. like in Oracle SQL it is possible add numbers to this. In case of Ruby, you are adding seconds.

It is slightly confusing that Ruby has two different types, Date and Time. Not quite as confusing as Java, but still…
Time is ok for this purpose.

## C on Linux / Posix / Unix

 #include #include #include 

 main(int argc, char **argv) { 

 char s[12]; time_t seconds_since_1970 = time(NULL); struct tm local; struct tm gmt; localtime_r(&seconds_since_1970, &local); gmtime_r(&seconds_since_1970, &gmt); size_t l1 = strftime(s, 11, "%Y-%m-%d", &local); printf("local:\t%s\n", s); size_t l2 = strftime(s, 11, "%Y-%m-%d", &gmt); printf("gmt:\t%s\n", s); exit(0); } 
This speeks for itself..
But if you like to know: time() gets the seconds since 1970 as some kind of integer.
localtime_r or gmtime_r convert it into a structur, that has seconds, minutes etc as separate fields.
stftime formats it. Depending on your C it is also possible to use %F.

## Scala

 import java.util.Date import java.text.SimpleDateFormat ... val s : String = new SimpleDateFormat("YYYY-MM-dd").format(new Date()) 
This uses the ugly Java-7-libraries. We want to go to Java 8 or use Joda time and a wrapper for Scala.

## Java 7

 import java.util.Date import java.text.SimpleDateFormat

 

... String s = new SimpleDateFormat("YYYY-MM-dd").format(new Date()); 
Please observe that SimpleDateFormat is not thread safe. So do one of the following:
* initialize it each time with new
* make sure you run only single threaded, forever
* use EJB and have the format as instance variable in a stateless session bean
* protect it with synchronized
* protect it with locks
* make it a thread local variable

In Java 8 or Java 7 with Joda time this is better. And the toString()-method should have ISO8601 as default, but off course including the time part.

## Summary

This is quite easy to achieve in many environments.
I could provide more, but maybe I leave this to you in the comments section.
What could be interesting:
* better ways for the ones that I have provided
* other databases
* other editors (vim, sublime, eclipse, idea,…)
* Office packages (Libreoffice and MS-Office)
* C#
* F#
* Clojure
* C on MS-Windows
* Perl and Ruby on MS-Windows
* Java 8
* Scala using better libraries than the Java-7-library for this
* Java using better libraries than the Java-7-library for this
* C++
* PHP
* Python
* Cobol
* JavaScript
* …
If you provide a reasonable solution I will make it part of the article with a reference…