# Orthodox Christmas 2018/2019

Orthodox Christmas in some countries, for example in Ukraine is on 7th of January.
So to all readers, who have Christmas on 7th of January:

С Рождеством! — Hyvää Joulua! — καλά Χριστούγεννα! — Buon Natale! — Prettige Kerstdagen! — З Рiздвом Христовим! — Merry Christmas! — Срећан Божић! — God Jul! — ¡Feliz Navidad! — ميلاد مجيد — クリスマスおめでとう ; メリークリスマス — Natale hilare! — Joyeux Noël! — God Jul! — Frohe Weihnachten! — Crăciun fericit! — Feliĉan Kristnaskon!

I created the message and the random ordering using Perl and the Schwartzian transform:
 #!/usr/bin/perl use Math::Random::Secure qw(irand); my @list = ( "Prettige Kerstdagen!",           "God Jul!",           "Crăciun fericit!",           "クリスマスおめでとう ; メリークリスマス",           "God Jul!",           "Feliĉan Kristnaskon!",           "Hyvää Joulua!",           "ميلاد مجيد",           "Срећан Божић!",           "καλά Χριστούγεννα!",           "З Рiздвом Христовим!",           "Natale hilare!",           "Buon Natale!",           "Joyeux Noël!",           "Frohe Weihnachten!",           "С Рождеством!",           "Merry Christmas!",           "¡Feliz Navidad!" ); my @shuffled = map{ $_->[0] } sort {$a->[1] <=> $b->[1]} map { [$_, irand() ] }                @list; print join(" — ", @shuffled); 

# How to draw lines, circles and other curves

These ideas were developed more than 30 years without knowing that they were already known at that time…

Today the graphics cards can easily do things like this in very little time. And today’s CPUs are of course really good at multiplying. So this has lost a lot of its immediate relevance, but it is a fun topic and why not have some fun…

Let us assume we have a two dimensional coordinate system and a visible area that goes from to and to . Coordinates are discrete.

In this world we can easily measure an angle against a (directed) line parallel to the -axis, for example up to an accuracy of :

• < \alpha < \frac{\pi}{2}(=90^\circ)x = 0 \land y > 0\implies \alpha = \frac{\pi}{2}x < 0 \land y > 0 \land |x| < |y|\implies \frac{\pi}{2}

So let us assume we have a curve that is described by a polynomial function in two variables and , like this:

We have to apply some math to understand that the curve behaves nicely in the sense that it does not behave to chaotic in scales that are below our accuracy, that it is connected etc. We might possibly scale and move it a bit by substituting something like for and for .

For example we may think of

• line:
• circle:
• eclipse:

We can assume our drawing is done with something like a king of chess. We need to find a starting point that is accurately on the curve or at least as accurately as possible. You could use knights or other chess figures or even fictive chess figures..

Now we have a starting point which lies ideally exactly on the curve. We have a deviation from the curve, which is . So we have . Than we move to and with . Often only two or three combinations of need to be considered. When calculating from for the different variants, it shows that for calculating the difference becomes a polynomial with lower degree, because the highest terms cancel out. So drawing a line between two points or a circle with a given radius around a given point or an ellipse or a parabola or a hyperbola can be drawn without any multiplications… And powers of -th powers of can always be calculated with additions and subtractions only from the previous -values, by using successive differences:

These become constant for , just as the th derivatives, so by using this triangle, successive powers can be calculated with some preparational work using just additions.
It was quite natural to program these in assembly language, even in 8-bit assembly languages that are primitive by today’s standards. And it was possible to draw such figures reasonably fast with only one MHz (yes, not GHz).

We don’t need this stuff any more. Usually the graphics card is much better than anything we can with reasonable effort program. Usually the performance is sufficient when we just program in high level languages and use standard libraries.

But occasionally situations occur where we need to think about how to get the performance we need:
Make it work,
make it right,
make it fast,
but don’t stop after the first of those.

It is important that we choose our steps wisely and use adequate methods to solve our problem. Please understand this article as a fun issue about how we could write software some decades ago, but also as an inspiration to actually look into bits and bytes when it is really helping to get the necessary performance without defeating the maintainability of the software.

# 2019 — Happy New Year

Gott nytt år! — Godt nytt år! — Felice anno nuovo! — Καλή Χρονια! — Щасливого нового року! — Срећна нова година! — С новым годом! — Feliĉan novan jaron! — Bonne année! — FELIX SIT ANNUS NOVUS — Gullukkig niuw jaar! — Un an nou fericit! — Frohes neues Jahr! — Happy new year! — ¡Feliz año nuevo! — Onnellista uutta vuotta! — عام سعيد

This was created by a Java-program:
 import java.util.Random; import java.util.List; import java.util.Arrays; import java.util.Collections;

 public class HappyNewYear { 

    public static void main(String[] args) {         List list = Arrays.asList("Frohes neues Jahr!",                                           "Happy new year!",                                           "Gott nytt år!",                                            "¡Feliz año nuevo!",                                           "Bonne année!",                                            "FELIX SIT ANNUS NOVUS",                                            "С новым годом!",                                           "عام سعيد",                                           "Felice anno nuovo!",                                           "Godt nytt år!",                                            "Gullukkig niuw jaar!",                                            "Feliĉan novan jaron!",                                           "Onnellista uutta vuotta!",                                           "Срећна нова година!",                                           "Un an nou fericit!",                                           "Щасливого нового року!",                                            "Καλή Χρονια!");         Collections.shuffle(list);         System.out.println(String.join(" — ", list));     } } 

# Christmas 2018

Feliĉan Kristnaskon! — Frohe Weihnachten! — God Jul! — Merry Christmas! — Joyeux Noël! — クリスマスおめでとう ; メリークリスマス — Срећан Божић! — Buon Natale! — Hyvää Joulua! — З Рiздвом Христовим! — ميلاد مجيد — С Рождеством! — Crăciun fericit! — ¡Feliz Navidad! — καλά Χριστούγεννα! — Natale hilare! — God Jul! — Prettige Kerstdagen!

As I said, I am learning some Python, so let’s use it. I created the message above with this program:
 #!/usr/bin/python3 import random arr = [     "Frohe Weihnachten!",     "Merry Christmas!",     "God Jul!",     "¡Feliz Navidad!",     "Joyeux Noël!",     "Natale hilare!",     "С Рождеством!",     "ميلاد مجيد",     "Buon Natale!",     "God Jul!",     "Prettige Kerstdagen!",     "Feliĉan Kristnaskon!",     "Hyvää Joulua!",     "クリスマスおめでとう ; メリークリスマス",     "Срећан Божић!",     "Crăciun fericit!",     "З Рiздвом Христовим!",     "καλά Χριστούγεννα!" ] random.shuffle(arr) print(" — ".join(arr)) print("\n") 

# Logging

Deutsch

Software often contains a logging functionality. Usually entries one or sometimes multiple lines are appended to a file, written to syslog or to stdout, from where they are redirected into a file. They are telling us something about what the software is doing. Usually we can ignore all of it, but as soon as something with „ERROR“ or worse and more visible stack traces can be found, we should investigate this. Unfortunately software is often not so good, which can be due to libraries, frameworks or our own code. Then stack traces and errors are so common that it is hard to look into or to find the ones that are really worth looking into. Or there is simply no complete process in place to watch the log files. Sometimes the error shows up much later than it actually occurred and stack traces do not really lead us to the right spot. More often than we think logging actually introduces runtime errors, that were otherwise not present. This is related to a more general concept, which is called observer effect, where logging actually changes the business logic.

It is nice that log files keep to some format. Usually they start with a time stamp in ISO-format, often to the millisecond. Please add trailing zeros to always have 3 digits after the decimal point in this case. It is preferable to use UTC, but people tend to stick to local date and time zones, including the issues that come with switching to and from daylight saving time. Usually we have several processes or threads that run simultaneously. This can result in a wild mix of logging entries. As long as even multiline entries stay together and as long as beginning and end of one multiline entry can easily be recognized, this can be dealt with. Tools like splunk or simple Perl, Ruby or Python scripts can help us to follow threads separately. We could actually have separate logs for each thread in the first place, but this is not a common practice and it might hit OS-limitations on the number of open files, if we have many threads or even thousands of actors as in Erlang or Akka. Keeping log entries together can be achieved by using an atomic write, like the write system call in Linux and other Posix systems. Another way is to queue the log entries and to have a logger thread that processes the queue.

Overall this area has become very complex and hard to tame. In the Java world there used to be log4j with a configuration file that was a simple properties file, at least in the earlier version. This was so good that other languages copied it and created some log4X. Later the config file was replaced by XML and more logging frame works were added. Of course quite a lot of them just for the purpose of abstracting from the large zoo of logging frameworks and providing a unique interface for all of them. So the result was, that there was one more to deal with.

It is a good question, how much logic for handling of log files do we really want to see in our software. Does the software have to know, into which file it should log or how to do log rotation? If a configuration determines this, but the configuration is compiled into the jar file, it does have to know… We can keep our code a bit cleaner by relying on program functionality without code, but this still keeps it as part of the software.

Log files have to please the system administrator or whoever replaced them in a pure devops shop. And in the end developers will have to be able to work with the information provided by the logs to find issues in the code or to explain what is happening, if the system administrator cannot resolve an issue by himself. Should this system administrator have to deal with a different special complex setup for the logging for each software he is running? Or should it be necessary to call for developer support to get a new version of the software with just another log setting, because the configurations are hard coded in the deployment artifacts? Interesting is also, what happens when we use PAAS, where we have application server, database etc., but the software can easily move to another server, which might result in losing the logs. Moving logs to another server or logging across the network is expensive, maybe more expensive than the rest of this infrastructure.

Is it maybe a good idea to just log to stdout, maintaining a decent format and to run the software in such a way that stdout is piped into a log manager? This can be the same for all software and there is one way to configure it. The same means not only the same for all the java programs, but actually the same for all programs in all languages that comply to a minimal standard. This could be achieved using named pipes in conjunction with any hard coded log file that the software wants to use. But this is a dangerous path unless we really know what the software is doing with its log files. Just think of what weird errors might happen if the software tries to apply log rotation to the named pipe by renaming, deleting, creating new files and so on. A common trick to stop software from logging into a place where we do not want this is to create a directory with the name of the file that the software usually uses and to write protect this directory and its parent directory for the software. Please find out how to do it in detail, depending on your environment.

What about software, that is a filter by itself, so its main functionality is to actually write useful data to stdout? Usually smaller programs and scripts work like this. Often they do not need to log and often they are well tested relyable parts of our software installation. Where are the log files of cp, ls, rm, mv, grep, sort, cat, less,…? Yes, they do tend to write to stderr, if real errors occur. Where needed, programs can turn on logging with a log file provided on the command line, which is also a quite operations friendly approach. Named pipes can help here.

And we had a good logging framework in place for many years. It was called syslog and it is still around, at least on Linux.

A last thought: We spend really a lot of effort to get well performing software, using multiple processes, threads or even clusters. And then we forget about the fact that logging might become the bottle neck.

# Carry Bit, Overflow Bit and Signed Integers

It has already been explained how the Carry Bit works for addition. Now there was interest in a comment about how it would work for negative numbers.

The point is, that the calculation of the carry bit does not have any dependency on the sign. The nature of the carry bit is that it is meant to be used for the less significant parts of the addition. So assuming we add two numbers and that are having and words, respectively. We assume that and make sure that and are both words long by just providing the necessary number of 0-words in the most significant positions. Now the addition is performed as described by starting with a carry bit of 0 and adding with carry , then and so on up to , assuming that is the least significant word and the most significant word, respectively. Each addition includes the carry bit from the previous addition. Up to this point, it does not make any difference, if the numbers are signed or not.

Now for the last addition, we need to consider the question, if our result still fits in words or if we need one more word. In the case of unsigned numbers we just look at the last carry bit. If it is 1, we just add one more word in the most significant position with the value of , otherwise we are already done with words.

In case of signed integers, we should investigate what can possibly happen. The input for the last step is two signed words and possibly a carry bit from the previous addition. Assuming we have -Bit-words, we are adding numbers between and plus an optional carry bit . If the numbers have different signs, actually an overflow cannot occur and we can be sure that the final result fits in at most words.

If both are not-negative, the most significant bits of and are both . An overflow is happening, if and only if the sum , which means that the result „looks negative“, although both summands were not-negative. In this case another word with value 0 has to be provided for the most significant position to express that the result is while maintaining its already correctly calculated result. It cannot happen that real non-zero bits are going into this new most significant word. Consequently the carry bit can never become 1 in this last addition step.

If both are negative, the most significant bits of and are both . An overflow is happening, if and only if the sum , which means that the result „looks positive or 0“, although both summands were negative. In this case another word with value or , depending on the viewpoint, has to be prepended as new most significant word. In this case of two negative summands the carry bit is always 1.

Now typical microprocessors provide an overflow flag (called „O“ or more often „V“) to deal with this. So the final addition can be left as it is in words, if the overflow bit is 0. If it is 1, we have to signal an overflow or we can just provided one more word. Depending on the carry flag it is for C=0 or all bits 1 ( or , depending on the view point) for C=1.

The overflow flag can be calculated by .
There are other ways, but they lead to the same results with approximately the same or more effort.

The following table shows the possible combinations and examples for 8-Bit arithmetic and :

x<0 or x≥0y<0 or y≥ 0(x+y)%2^8 < 0 or ≥ 0Overflow BitCarry Bitadditional word neededvalue additional wordExamples (8bit)
x≥0y≥0≥000no-0+0
63+64
x≥0y≥0<010yes064+64
127+127
x≥0y<0≥000 or 1no-65+(-1)
127+(-127)
x≥0y<0<000 or 1no-7+(-8)
127+(-128)
0+(-128)
x<0y≥0≥000 or 1no--9 + 12
-1 + 127
-127+127
x<0y≥0<000 or 1no--128+127
-128+0
-1 + 0
x<0y<0≥011yes-1-64 + (-65)
-128+(-128)
x<0y<0<001no--1 + (-1)
-1 + (-127)
-64 + (-64)

If you like, you can try out examples that include the carry bit and see that the concepts still work out as described.

# Christmas — Weihnachten — Рождество 2017

Bon nadal! — Priecîgus Ziemassvçtkus — З Рiздвом Христовим — Buon Natale — Bella Festas daz Nadal! — С Рождеством — Срећан Божић — καλά Χριστούγεννα — God Jul! — Feliĉan Kristnaskon — ميلاد مجيد — Feliz Navidad — Glædelig Jul — Fröhliche Weihnachten — Joyeux Noël — Hyvää Joulua! — クリスマスおめでとう ; メリークリスマス — Merry Christmas — Natale hilare — God Jul — Crăciun fericit — Prettige Kerstdagen — Nollaig Shona Dhuit!

Christmas Tree in Olten 2017

This message has been generated by a program again, this time I decided to use Scala:

import scala.util.Random object XmasGreeting {   val texts : List[String] = List( "Fröhliche Weihnachten",     "Merry Christmas", "God Jul", "Feliz Navidad", "Joyeux Noël",     "Natale hilare", "С Рождеством", "ميلاد مجيد", "Buon Natale", "God Jul!",     "Prettige Kerstdagen", "Feliĉan Kristnaskon", "Hyvää Joulua!",     "Glædelig Jul", "クリスマスおめでとう ; メリークリスマス",     "Nollaig Shona Dhuit!", "Bon nadal!", "Срећан Божић",     "Priecîgus Ziemassvçtkus", "Crăciun fericit", "Bella Festas daz Nadal!",     "З Рiздвом Христовим", "καλά Χριστούγεννα")   val shuffledTexts : List[String] = Random.shuffle(texts)   def main(args: Array[String]) : Unit = {     println(shuffledTexts.mkString(" — "))   } }

# Functional Programming: Article in another Blog

I wrote a Guest Blog Article about Functional Programming on Adesso’s Blog in German.

# XML

In the late 1990es there was a real hype about XML. Tons of standards evolved and it was a big deal to acquire sound knowledge of it.

It has been some success, because it is still around and very common almost 20 years later.

I would say that the idea of having a human readable and editable text format has mostly failed. Trivial XML can be edited manually without too much of a risk of breaking it, but then again simpler formats like JSON or even java-properties-Files or something along these lines would be sufficient and easier to deal with, unless it is the 1001st slightly different format that needs to be learned again. XML is different each time anyway, because it depends on the schema, so we have the problem on that side, but of course the general idea is well known.

For complex XML manual reading and editing becomes a nightmare, it is just so much harder to read for humans than any reasonably common programming languages of our time. It is text, but so involved that it feels like half binary. And who knows, maybe we can also edit binary files with a hex-editor. And real magicians, actually people with too much time in this case, can do so and keep the binary file correct and uncorrupted, at least for some binary formats. And they can do so in XML as well… But it is actually better to have a tool or a script to create and change non-trivial XML-configuration files.

Where XML is strong is for data exchange between systems. This is mostly transfer in space between different systems, but it can also be transfer in time, that is for storing information to be retrieved later. It gives a format that allows for some „type safety“, that is very versatile and that provides a lot of tool and script support around it. Even here we have to acknowledge that there are some drawbacks. Maintaining a XML interface involves some work for the schema files, adopting the software on the human side. It requires some CPU-overhead on the sending and mostly on the receiving side for creating and parsing XML. The libraries have been optimized but still they take a little bit of time. And then on the network size we transmit a multiple of the amount of data, if it is densely packed with tags.

But it is a format that is well understood, that works on pretty much any platform, over the network and also usually allows us to support different versions of the same interface simultaneously. For debugging it is good to have a format that is at least human readable, even if not very pleasant. Ideally the schema is defined in a way that is self documenting.

I wonder why approaches like in WML have not become more common. WML had a customized compressed format that was more friendly to low bandwidth cell phones.

XML is good for many purposes, but as always it is good to know other tools, like JSON and to decide when it is a case for XML and when not.

Some positive side effects of XML are that it helped some other standards to become more mainstream. UTF-8 was from the beginning the default encoding for XML and this is now a common standard encoding for any text. And with XML-schema it became common to encode dates within XML in the ISO-format, which helped this format in becoming generally known and commonly used for cases where one date format should work independently of the origin of the reader.

# Virtual machines

We all know that Java uses a „virtual machine“ that is it simulates a non-existing hardware which is the same independent of the real hardware, thus helping to achieve the well known platform independence of Java. Btw. this is not about virtualization like VMWare, VirtualBox, Qemu, Xen, Docker and similar tools, but about byte code interpreters like the Java-VM.

We tend to believe that this is the major innovation of Java, but actually the concept of virtual machines is very old. Lisp, UCSD-Pascal, Eumel/Elan, the Perl programming language and many other systems have used this concept long before Java. The Java guys have been good in selling this and it was possible to get this really to the mainstream when Java came out. The Java guys deserve the credit for bringing this in the right time and bringing it to the main stream.

Earlier implementations where kind of cool, but the virtual machine technology and the hardware were to slow, so that they were not really attractive, at least not for high performance applications, which are now actually a domain of Java and other JVM languages. Some suggest that Java or other efficient JVM languages like Scala would run even faster than C++. While it may be true to show this in examples, and the hotspot optimization gives some theoretical evidence how optimization that takes place during run time can be better than static optimization at compile time, I do not generally trust this. I doubt that well written C-code for an application that is adequate for both C and Java will be outperformed by Java. But we have to take two more aspects into account, which tend to be considered kind of unlimited for many such comparisons to make them possible at all.

The JVM has two weaknesses in terms of performance. The start-up time is relatively long. This is addressed in those comparisons, because the claim to be fast is only maintained for long running server applications, where start-up time is not relevant. The hotspot optimization requires anyway a long running application in order to show its advantages. Another aspect that is very relevant is that Java uses a lot of memory. I do not really know why, because more high level languages like Perl or Ruby get along with less memory, but experience shows that this is true. So if we have a budget X to buy hardware and then put software written in C on it, we can just afford to buy more CPUs because we save on the memory or we can make use of the memory that the JVM would otherwise just use up to make our application faster. When we view the achievable performance with a given hardware budget, I am quite sure that well written C outperforms well written Java.

The other aspect is in favor of Java. We have implicitly assumed until now that the budget for development is unlimited. In practice that is not the case. While we fight with interesting, but time consuming low level issues in C, we already get work done in Java. A useful application in Java is usually finished faster than in C, again if it is in a domain that can reasonably be addressed with either of the two languages and if we do not get lost in the framework world. So if the Java application is good enough in terms of performance, which it often is, even for very performance critical applications, then we might be better off using Java instead of C to get the job done faster and to have time for optimization, documentation, testing, unit testing.. Yes, I am in a perfect world now, but we should always aim for that. You could argue that the same argument is valid in terms of using a more high-level language than Java, like Ruby, Perl, Perl 6, Clojure, Scala, F#,… I’ll leave this argument to other articles in the future and in the past.

What Java has really been good at is bringing the VM technology to a level that allows real world high performance server application and bringing it to the main stream.
That is already a great achievement. Interestingly there have never been serious and successful efforts to actually build the JavaVM as hardware CPU and put that as a co-processor into common PCs or servers. It would have been an issue with the upgrade to Java8, because that was an incompatible change, but other than that the JavaVM remained pretty stable. As we see the hotspot optimization is now so good that the urge for such a hardware is not so strong.

Now the JVM has been built around the Java language, which was quite legitimate, because that was the only goal in the beginning. It is even started using the command line tool java (or sometimes javaw on MS-Windows 32/64 systems). The success of Java made the JVM wide spread and efficient, so it became attractive to run other languages on it. There are more than 100 languages on the JVM. Most of them are not very relevant. A couple of them are part of the Java world, because they are or used to be specific micro languages closely related to java to achieve certain goals in the JEE-world, like the now almost obsolete JSP, JavaFX, .

Relevant languages are Scala, Clojure, JRuby, Groovy and JavaScript. I am not sure about Jython, Ceylon and Kotlin. There are interesting ideas coming up here and there like running Haskell under the name Frege on the JVM. And I would love to see a language that just adds operator overloading and provides some preprocessor to achieve this by translating for example „(+)“ in infix syntax to „.add(..)“ mainstream, to allow seriously using numeric types in Java.

Now Perl 6 started its development around 2000. They were at that time assuming that the JVM is not a good target for a dynamic language to achieve good performance. So they started developing Parrot as their own VM. The goal was to share Parrot between many dynamic languages like Ruby, Python, Scheme and Perl 6, which would have allowed inter-language inter-operation to be more easily achievable and using libraries from one of these languages in one of the others. I would not have been trivial, because I am quite sure that we would have come across issues that each language has another set of basic types, so strings and numbers would have to be converted to the strings and numbers of the library language when calling, but it would have been interesting.

In the end parrot was a very interesting project, theoretically very sound and it looked like for example the Ruby guys went for it even faster than the the Perl guys, resulting in an implementation called cardinal. But the relevant Perl 6 implementation, rakudo, eventually went for their own VM, Moar. Ruby also did itself a new better VM- Many other language, including Ruby and JavaScript also went for the JVM, at least as one implementation variant. Eventually the JVM proved to be successful even in this area. The argument to start parrot in the first place was that the JVM is not good for dynamic languages. I believe that this was true around 2000. But the JVM has vastly improved since then, even resulting in Java being a serious alternative to C for many high performance server applications. And it has been improved for dynamic languages, mostly by adding the „invoke_dynamic“-feature, that also proved to be useful for implementing Java 8 lambdas. The experience in transforming and executing dynamic languages to the JVM has grown. So in the end parrot has become kind of obsolete and seems to be maintained, but hardly used for any mainstream projects. In the end we have Perl 6 now and Parrot was an important stepping stone on this path, even if it becomes obsolete. The question of interoperability between different scripting languages remains interesting…