Orthodox Christmas 2018/2019

Orthodox Christmas in some countries, for example in Ukraine is on 7th of January.
So to all readers, who have Christmas on 7th of January:


С Рождеством! — Hyvää Joulua! — καλά Χριστούγεννα! — Buon Natale! — Prettige Kerstdagen! — З Рiздвом Христовим! — Merry Christmas! — Срећан Божић! — God Jul! — ¡Feliz Navidad! — ميلاد مجيد — クリスマスおめでとう ; メリークリスマス — Natale hilare! — Joyeux Noël! — God Jul! — Frohe Weihnachten! — Crăciun fericit! — Feliĉan Kristnaskon!

I created the message and the random ordering using Perl and the Schwartzian transform:

#!/usr/bin/perl
use Math::Random::Secure qw(irand);
my @list = ( "Prettige Kerstdagen!",
          "God Jul!",
          "Crăciun fericit!",
          "クリスマスおめでとう ; メリークリスマス",
          "God Jul!",
          "Feliĉan Kristnaskon!",
          "Hyvää Joulua!",
          "ميلاد مجيد",
          "Срећан Божић!",
          "καλά Χριστούγεννα!",
          "З Рiздвом Христовим!",
          "Natale hilare!",
          "Buon Natale!",
          "Joyeux Noël!",
          "Frohe Weihnachten!",
          "С Рождеством!",
          "Merry Christmas!",
          "¡Feliz Navidad!" );
my @shuffled = map{ $_->[0] }
               sort {$a->[1] <=> $b->[1]}
               map { [ $_, irand() ] }
               @list;
print join(" — ", @shuffled);

Share Button

How to draw lines, circles and other curves

These ideas were developed more than 30 years without knowing that they were already known at that time…

Today the graphics cards can easily do things like this in very little time. And today’s CPUs are of course really good at multiplying. So this has lost a lot of its immediate relevance, but it is a fun topic and why not have some fun…

Let us assume we have a two dimensional coordinate system and a visible area that goes from x_{\min} to x_{\max} and y_{\min} to y_{\max}. Coordinates are discrete.

In this world we can easily measure an angle against a (directed) line parallel to the x-axis, for example up to an accuracy of 45^\circ=\frac{\pi}{4}:

  • y=0 \wedge x > 0 \implies \alpha = 0 (= 0^\circ)
  • 0 < y < x \implies 0 < \alpha < \frac{\pi}{4}(=45^\circ)
  • 0 < y = x \implies \alpha = \frac{\pi}{4}
  • 0 < x < y \implies \frac{\pi}{4} < \alpha < \frac{\pi}{2}(=90^\circ)
  • x = 0 \land y > 0\implies \alpha = \frac{\pi}{2}
  • x < 0 \land y > 0 \land |x| < |y|\implies \frac{\pi}{2} < \alpha < \frac{3\pi}{4}(=135^\circ)
  • x < 0 \land y > 0 \land -x = y\implies \alpha = \frac{3\pi}{4}(=135^\circ)

So let us assume we have a curve that is described by a polynomial function in two variables x and y, like this:

    \[f(x, y) = \sum_{j=0}^m\sum_{k=0}^n a_{j,k}x^jy^k = 0\]

We have to apply some math to understand that the curve behaves nicely in the sense that it does not behave to chaotic in scales that are below our accuracy, that it is connected etc. We might possibly scale and move it a bit by substituting something like c_1u+c_2 for x and c_3v+c_4 for y.

For example we may think of

  • line: f(x,y)=ax+by+c
  • circle: f(x, y)=x^2+y^2-r^2
  • eclipse: f(x, y)=\frac{x^2}{a^2}+\frac{y^2}{b^2}-1

We can assume our drawing is done with something like a king of chess. We need to find a starting point that is accurately on the curve or at least as accurately as possible. You could use knights or other chess figures or even fictive chess figures..

Now we have a starting point (x_0, y_0) which lies ideally exactly on the curve. We have a deviation from the curve, which is f(x_0, y_0)=d_0. So we have f(x_n, y_n)=d_n. Than we move to x_{n+1}=x_n + s and y_{n+1}=y_n+t with s, t = \{-1, 0, 1\}. Often only two or three combinations of (s, t) need to be considered. When calculating d_{n+1} from d_n for the different variants, it shows that for calculating d_{n+1}-d_n the difference becomes a polynomial with lower degree, because the highest terms cancel out. So drawing a line between two points or a circle with a given radius around a given point or an ellipse or a parabola or a hyperbola can be drawn without any multiplications… And powers of n-th powers of x can always be calculated with additions and subtractions only from the previous x-values, by using successive differences:
d_{m,1}=(x-m)^n-(x-m-1)^n
d_{m,l+1}=d_{m+1,l}-d_{m,l}
These become constant for l=n, just as the lth derivatives, so by using this triangle, successive powers can be calculated with some preparational work using just additions.
It was quite natural to program these in assembly language, even in 8-bit assembly languages that are primitive by today’s standards. And it was possible to draw such figures reasonably fast with only one MHz (yes, not GHz).

We don’t need this stuff any more. Usually the graphics card is much better than anything we can with reasonable effort program. Usually the performance is sufficient when we just program in high level languages and use standard libraries.

But occasionally situations occur where we need to think about how to get the performance we need:
Make it work,
make it right,
make it fast,
but don’t stop after the first of those.

It is important that we choose our steps wisely and use adequate methods to solve our problem. Please understand this article as a fun issue about how we could write software some decades ago, but also as an inspiration to actually look into bits and bytes when it is really helping to get the necessary performance without defeating the maintainability of the software.

Share Button

2019 — Happy New Year

Gott nytt år! — Godt nytt år! — Felice anno nuovo! — Καλή Χρονια! — Щасливого нового року! — Срећна нова година! — С новым годом! — Feliĉan novan jaron! — Bonne année! — FELIX SIT ANNUS NOVUS — Gullukkig niuw jaar! — Un an nou fericit! — Frohes neues Jahr! — Happy new year! — ¡Feliz año nuevo! — Onnellista uutta vuotta! — عام سعيد

This was created by a Java-program:

import java.util.Random;
import java.util.List;
import java.util.Arrays;
import java.util.Collections;

public class HappyNewYear {

    public static void main(String[] args) {
        List list = Arrays.asList("Frohes neues Jahr!",
                                          "Happy new year!",
                                          "Gott nytt år!", 
                                          "¡Feliz año nuevo!",
                                          "Bonne année!", 
                                          "FELIX SIT ANNUS NOVUS", 
                                          "С новым годом!",
                                          "عام سعيد",
                                          "Felice anno nuovo!",
                                          "Godt nytt år!", 
                                          "Gullukkig niuw jaar!", 
                                          "Feliĉan novan jaron!",
                                          "Onnellista uutta vuotta!",
                                          "Срећна нова година!",
                                          "Un an nou fericit!",
                                          "Щасливого нового року!", 
                                          "Καλή Χρονια!");
        Collections.shuffle(list);
        System.out.println(String.join(" — ", list));
    }
}

Share Button

Christmas 2018


Feliĉan Kristnaskon! — Frohe Weihnachten! — God Jul! — Merry Christmas! — Joyeux Noël! — クリスマスおめでとう ; メリークリスマス — Срећан Божић! — Buon Natale! — Hyvää Joulua! — З Рiздвом Христовим! — ميلاد مجيد — С Рождеством! — Crăciun fericit! — ¡Feliz Navidad! — καλά Χριστούγεννα! — Natale hilare! — God Jul! — Prettige Kerstdagen!

As I said, I am learning some Python, so let’s use it. I created the message above with this program:

#!/usr/bin/python3
import random
arr = [
    "Frohe Weihnachten!",
    "Merry Christmas!",
    "God Jul!",
    "¡Feliz Navidad!",
    "Joyeux Noël!",
    "Natale hilare!",
    "С Рождеством!",
    "ميلاد مجيد",
    "Buon Natale!",
    "God Jul!",
    "Prettige Kerstdagen!",
    "Feliĉan Kristnaskon!",
    "Hyvää Joulua!",
    "クリスマスおめでとう ; メリークリスマス",
    "Срећан Божић!",
    "Crăciun fericit!",
    "З Рiздвом Христовим!",
    "καλά Χριστούγεννα!"
]
random.shuffle(arr)
print(" — ".join(arr))
print("\n")

Share Button

Indexing of Arrays and Lists

We index arrays with integers. Lists also, at least the ones that allow random access. And sizes of collections are also integers.
This allows for 2^{31}-1=2147483647 entries in Java and typical JVM languages, because integers are actually considered to be 32bit. Actually we could think of one more entry, using indices 0..2147483647, but then we would not be able to express the size in an signed integer. This stuff is quite deeply built into the language, so it is not so easy to break out of this. And 2’000’000’000 entries are a lot and take a lot of time to process. At least it was a lot in the first few years of Java. There should have been an unsigned variant of integers, which would in this case allow for 4’000’000’000 entries, when indexing by an uint32, but that would not really solve this problem. C uses 64 bit integers for indexing of arrays on 64 bit systems.

It turns out that we would like to be able to index arrays using long instead of int. Now changing the java arrays in a way that they could be indexed by long instead of int would break a lot of compatibility. I think this is impossible, because Java claims to retain very good backward compatibility and this reliability of both the language and the JVM has been a major advantage. Now a second type of arrays, indexed by long, could be added. This would imply even more complexity for APIs like reflection, that have to deal with all cases for parameters, where it already hurts that the primitives are no objects and arrays are such half-objects. So it will be interesting, what we can find in this area in the future.

For practical use it is a bit easier. We can already be quite happy with a second set of collections, let them be called BigCollections, that have sizes that can only be expressed with long and that are indexed in cases where applicable with longs. Now it is not too hard to program a BigList by internally using an array of arrays or an array of arrays of arrays and doing some arithmetic to calculate the internal indices from the long (int64) index given in the API. Actually we can buy some performance gain when resizing happens, because this structure, if well done, allows for more efficient resizing. Based on this all kinds of big collections could be built.

Share Button

Tip: how to make Thunderbird use ISO Date Format

After an upgrade, my Thnderbird started to use a weird US-date format and forgot my setting for this. I did not find any setting either. But this seems to work fine:

Create a shell script

#!/bin/bash

export LC_TIME=sv_SE.utf8
export LC_DATE=sv_SE.utf8

exec /usr/bin/thunderbird

and make sure that this gets called when thunderbird is started instead of directly /usr/bin/thunderbird.

Share Button

Clojure Exchange 2018

I visited Clojure Exchange 2018 in London.
Since there was only one track and I attended all talks, it is easy to just refer to the schedule.

Interesting topics, that came up multiple times in different flavors where immutability, stories of building real life applications, music, java and clojure and the transition, clojure script, emacs and cider…

I did a lightning talk myself about Some Thoughts about Immutability and its Limits.

Links

  • Clojure
  • Clojure Exchange 2018
  • Clojure Exchange 2016
  • Clojure
  • Share Button

Devoxx Kiew 2018

In the end of 2018 the number of conferences is kind of high. A great highlight is the Devoxx BE in Antwerp. But it has now five partner conferences in London, Paris, Krakow, Morocco and Kiev. So I decided to have a look at the one in Kiev.

How was it in comparison to the one in Belgium? What was better in Kiev: The food was way better, the drinks in the first evening (Whisky and Long Drinks vs. Belgium Beer) might be considered better, there were more people engaged to help the organizers…
What was better in Belgium: There were still a bit more speeches. While the location in Kiev was really great, in Belgium the rooms were way better for the purpose of providing a projection visible for everybody and doing a video recording that did not disturb the audience.
The quality of the speeches was mostly great in both locations. In Kiev they gamified the event a bit more..

Generally there was a wide range of topics and the talks were sorted into the following thematic groups:

  • Methodology & Culture
  • JVM Languages
  • Server Side
  • Architecture & Security
  • Mobile & IoT
  • Machine Learning & AI
  • Big Data & Data Mining
  • Cloud, Containers & Infrastructure
  • Modern Web & UX

See the schedule for the distribution…

I attended on Friday:

I attended on Saturday:

A lot to learn.

Share Button

Devoxx Antwerp 2018

In 2018 I am visiting a few conferences. A great highlight is the Devoxx BE in Antwerp, which I had the privilege of visiting 2012, 2013, 2014, 2015, 2016 and 2017.

As it should be, it is not just the same every year, but content and speakers change a bit from year to year.

Some topics that got a lot of attention were functional programming, artificial intelligence, Big Data, Machine Learning, clouds, JVMs, Kotlin

There was less about other JVM languages (apart from Kotlin), so Scala, Clojure, Groovy or Ceylon were covered little or not at all and Android used to be more present in other years. I would say that Ceylon has become irrelevant, probably because Kotlin was too similar and came out the same time and won. Groovy has its niche, Clojure has its niche, Scala and Kotlin have become mature and are now the two mainstream alternatives to Java, but themselves much smaller than Java. This was represented in the conference, taking into account that Scala has its own large conferences, like Scala Days, Scala Exchange, Scala World and a lot more.

Some side issues that might worry some of us did come up occasionally. Was it bad, that IBM bought Red Hat? At least they paid around 34’000’000’000 USD, which is more than 2’500’000 USD per employee. There are probably no other assets in terms of buildings, patents, hardware or whatever, that would justify this price, so IBM probably will have an interest to keep a large number of these employees and not scare them away by too much „IBM-culture“. We will see, but no reason to get immediately worried. Oracle wants money for running their JVM in production after more than 6 months. This can be avoided by always switching to the newest version or by relying on the JDKs offered by alternative sources like Amazon, RedHat…

Microsoft was a sponsor and had a booth. Their topic was not MS-Windows and MS-Office and MS-SQL-Server, but Azure, which can be used with Linux and Java and PostgreSQL, for example. The company did change a bit since the days of Steve Ballmer and we will see if this is an excursion or a continuous direction.

And James Gosling was there at the opening, as a surprise.

Generally there was a wide range of topics and the talks were sorted into the following thematic groups:

  • Methodology & Culture
  • Java Language
  • Programming languages
  • Architecture & Security
  • Big Data & Machine Learning
  • Mind the Geek
  • Server Side Java
  • Modern Web & UX
  • Cloud, Containers & Infrastructure
  • Mobile & IoT

See the schedule for the distribution…

I attended on Wednesday:

I attended on Thursday:

I attended on Friday:

It was a great conference. A lot of new ideas.

Share Button

Logging

Deutsch

Software often contains a logging functionality. Usually entries one or sometimes multiple lines are appended to a file, written to syslog or to stdout, from where they are redirected into a file. They are telling us something about what the software is doing. Usually we can ignore all of it, but as soon as something with „ERROR“ or worse and more visible stack traces can be found, we should investigate this. Unfortunately software is often not so good, which can be due to libraries, frameworks or our own code. Then stack traces and errors are so common that it is hard to look into or to find the ones that are really worth looking into. Or there is simply no complete process in place to watch the log files. Sometimes the error shows up much later than it actually occurred and stack traces do not really lead us to the right spot. More often than we think logging actually introduces runtime errors, that were otherwise not present. This is related to a more general concept, which is called observer effect, where logging actually changes the business logic.

It is nice that log files keep to some format. Usually they start with a time stamp in ISO-format, often to the millisecond. Please add trailing zeros to always have 3 digits after the decimal point in this case. It is preferable to use UTC, but people tend to stick to local date and time zones, including the issues that come with switching to and from daylight saving time. Usually we have several processes or threads that run simultaneously. This can result in a wild mix of logging entries. As long as even multiline entries stay together and as long as beginning and end of one multiline entry can easily be recognized, this can be dealt with. Tools like splunk or simple Perl, Ruby or Python scripts can help us to follow threads separately. We could actually have separate logs for each thread in the first place, but this is not a common practice and it might hit OS-limitations on the number of open files, if we have many threads or even thousands of actors as in Erlang or Akka. Keeping log entries together can be achieved by using an atomic write, like the write system call in Linux and other Posix systems. Another way is to queue the log entries and to have a logger thread that processes the queue.

Overall this area has become very complex and hard to tame. In the Java world there used to be log4j with a configuration file that was a simple properties file, at least in the earlier version. This was so good that other languages copied it and created some log4X. Later the config file was replaced by XML and more logging frame works were added. Of course quite a lot of them just for the purpose of abstracting from the large zoo of logging frameworks and providing a unique interface for all of them. So the result was, that there was one more to deal with.

It is a good question, how much logic for handling of log files do we really want to see in our software. Does the software have to know, into which file it should log or how to do log rotation? If a configuration determines this, but the configuration is compiled into the jar file, it does have to know… We can keep our code a bit cleaner by relying on program functionality without code, but this still keeps it as part of the software.

Log files have to please the system administrator or whoever replaced them in a pure devops shop. And in the end developers will have to be able to work with the information provided by the logs to find issues in the code or to explain what is happening, if the system administrator cannot resolve an issue by himself. Should this system administrator have to deal with a different special complex setup for the logging for each software he is running? Or should it be necessary to call for developer support to get a new version of the software with just another log setting, because the configurations are hard coded in the deployment artifacts? Interesting is also, what happens when we use PAAS, where we have application server, database etc., but the software can easily move to another server, which might result in losing the logs. Moving logs to another server or logging across the network is expensive, maybe more expensive than the rest of this infrastructure.

Is it maybe a good idea to just log to stdout, maintaining a decent format and to run the software in such a way that stdout is piped into a log manager? This can be the same for all software and there is one way to configure it. The same means not only the same for all the java programs, but actually the same for all programs in all languages that comply to a minimal standard. This could be achieved using named pipes in conjunction with any hard coded log file that the software wants to use. But this is a dangerous path unless we really know what the software is doing with its log files. Just think of what weird errors might happen if the software tries to apply log rotation to the named pipe by renaming, deleting, creating new files and so on. A common trick to stop software from logging into a place where we do not want this is to create a directory with the name of the file that the software usually uses and to write protect this directory and its parent directory for the software. Please find out how to do it in detail, depending on your environment.

What about software, that is a filter by itself, so its main functionality is to actually write useful data to stdout? Usually smaller programs and scripts work like this. Often they do not need to log and often they are well tested relyable parts of our software installation. Where are the log files of cp, ls, rm, mv, grep, sort, cat, less,…? Yes, they do tend to write to stderr, if real errors occur. Where needed, programs can turn on logging with a log file provided on the command line, which is also a quite operations friendly approach. Named pipes can help here.

And we had a good logging framework in place for many years. It was called syslog and it is still around, at least on Linux.

A last thought: We spend really a lot of effort to get well performing software, using multiple processes, threads or even clusters. And then we forget about the fact that logging might become the bottle neck.

Share Button