# Devoxx UA and Devoxx BE 2019

In 2019 I visited Devoxx UA in Kiev and Devoxx BE in Antwerp.
Traveling was actually a little story by itself, so for now we can just assume that I magically was at the locations of DevoxxUA and DevoxxBE.

In Kiew I attended the following talks:

On Wednesday I attended the following talks in Antwerp:

On Thursday I attended the following talks in Antwerp:

On Friday I attended the following talks in Antwerp:

That’s it…
As always, a lot of these topics deserve an article in this blog. And a lot of video recordings from the conference are worth viewing.

# Checked Exceptions in Java

In Java it is possible to declare a method with a „throws“-clause. For certain exceptions, that are not extending „RuntimeException“ or „Error“, this is actually required.

What looked like a good idea 25 years ago has proven to be a dead end. I do not know of any other major programming language that opts for declaring exceptions in this way. Slightly newer frameworks extend all their exceptions from RuntimeException, thus avoiding the need to declare them. Even in relatively early Java there was a weird way of working with exceptions in EJB, when it was required to write an interface and an implementation for the EJB. But it was strongly discouraged to let the implementation implement the interface, because it threw different exceptions. It was not the only weird thing about early EJB, of course. But without checked exceptions it would at least have been possible to let the implementation implement its interface.

We are now able to use Java 13 and as of Java 8 lambdas were introduced. With the introduction of lambdas the declared exceptions became especially painful and for this reason even Oracle has created twins for some essential exceptions that derive from RuntimeException, especially IOException.

We should face it: The throws clause has turned out to be a mistake and we should avoid this mistake by just using exceptions that do not have to be declared, at least in our APIs. It is not the only mistake, see Criticism of Java. Some of my other favorites are the lack of operator overloading for numeric types, the weird concept of Serializable and the lack of natively immutable collections and the lack of a convenient way to write some collections as code. But these issues are being worked on and we will eventually see some progress.

# Devoxx Kiew 2018

In the end of 2018 the number of conferences is kind of high. A great highlight is the Devoxx BE in Antwerp. But it has now five partner conferences in London, Paris, Krakow, Morocco and Kiev. So I decided to have a look at the one in Kiev.

How was it in comparison to the one in Belgium? What was better in Kiev: The food was way better, the drinks in the first evening (Whisky and Long Drinks vs. Belgium Beer) might be considered better, there were more people engaged to help the organizers…
What was better in Belgium: There were still a bit more speeches. While the location in Kiev was really great, in Belgium the rooms were way better for the purpose of providing a projection visible for everybody and doing a video recording that did not disturb the audience.
The quality of the speeches was mostly great in both locations. In Kiev they gamified the event a bit more..

Generally there was a wide range of topics and the talks were sorted into the following thematic groups:

• Methodology & Culture
• JVM Languages
• Server Side
• Architecture & Security
• Mobile & IoT
• Machine Learning & AI
• Big Data & Data Mining
• Cloud, Containers & Infrastructure
• Modern Web & UX

See the schedule for the distribution…

I attended on Friday:

I attended on Saturday:

A lot to learn.

# Devoxx 2017

I was lucky to get a chance to visit Devoxx in Antwerp the sixth time in a row. As always there were interesting talks to listen to. Some issues that were visible across different talks:

Java 9 is now out and Java 8 will soon go into the first steps of deprecation. The step of moving to Java 9 is probably the hardest in the history of Java. There were features in the past that brought very significant changes, but they were usually kind of optional, so adoption could be avoided or delayed. Java 9 brings the module system and a new level of abstraction in that classes of modules can be made public to other modules selectively or globally. Otherwise they can be by themselves declared as public, but only be visible within the module. This actually applies to classes of the standard library, that were always declared as being private, but that could not be efficiently hidden away from external usage. Now they suddenly do not work any more and a lot of software has some difficulty and needs to be adjusted to avoid these internal classes. Beyond that a lot of talks were about Java 9, for example also covering somewhat convenient methods for writing constant collections in code. Future releases will follow a path that is somewhat similar to that of Perl 5. Releases will be created roughly every half year and will include whatever is ready for inclusion at that time. Some releases will be supported for a longer time than others.

Then there was of course security security security… Even more important than in the past.

And a lot more…

I listened to the following talks:
Wednesday, 2017-11-08

Thursday, 2017-11-09

Friday, 2017-11-10

Previous years:

Btw. I always came to Devoxx by bicycle, at least part of the way…

# Conference Talks

Referring to the Swiss Perl Workshop, it is now time to collect all the conference talks that I have given so far and that have been uploaded as video.

# UTF-16 Strings in Java

Deutsch

Strings in Java and many other JVM-languages consist of Unicode content and are encoded as utf-16. It was fantastic to already consider Unicode when introducing Java in the 90es and to make it the only reasonable way to use strings, so there is no temptation to start with a „US-ASCII“-version of a software that never really gets enhanced to deal properly with non-English content and data. And it also avoids having to deal with many kinds of String encodings within the software. For outside storage in databases and files and for communication with other processes and over the network, these issues of course remain. But Java strings are always encoded utf-16. We can be sure. Most common languages can make use of these Strings easily and handling common letter based languages from Europe and western Asia is quite strait forward. Rendering may still be a challenge, but that is another issue. A major drawback of this approach is that more memory is used. This usually does not hurt too much. Memory is not so expensive and a couple of Strings even with utf-16 will not be too big. With 64-bit Java, which should be used these days, the memory limitations of the JVM are not relevant any more, they can basically use as much memory as provided.

But some applications to hit the memory limits. And since usually most of the data we are dealing with is ultimately strings and combinations of relatively long strings with some reference pointers, some integer numbers and more strings, we can say that in typical memory intensive applications strings actually consume a large fraction of the memory. This leads to the temptation of using or even writing a string library that uses utf-8 or some other more condensed internal format, while still being able to express any Unicode content. This is possible and I have done it. Unfortunately it is very painful, because Strings are quite deeply integrated into the language and explicit conversions need to be added in many places to deal with this. But it is possible and can save a lot of memory. In my case we were able to abandon this approach, because other optimizations, that were less painful, proved to be sufficient.

An interesting idea is to compress strings. If they are long enough, algorithms like gzip work on a single string. As with utf-8, selectively accessing parts of the string becomes expensive, because it can only be achieved by parsing the string from the beginning or by adding indexing structures. We just do not know which byte to go to for accessing the n-th character, even with utf-8. In reality we often do not have long strings, but rather many relatively short strings. They do not compress well by themselves. If we know our data and have a rough idea about the content of our complete set of strings, custom compression algorithm can be generated. This allows good results even for relatively short strings, as long as they are within the „language“ that we are prepared for. This is more or less similar to the step from utf-16 to utf-8, because we replace common byte sequences by shorter byte sequences and less common sequences may even get replaced by something longer. There is no gain in utf-8, if we have mostly strings that are in non-Latin alphabets. Even Cyrillic or Greek, that are alphabets similar to the Latin alphabet, will end up needing two bytes for each letter, which is not at all better than utf-16. For other alphabets it will even become worse, because three or four bytes are needed for one symbol that could easily be expressed with two bytes in utf-16. But if we know our data well enough, the approach with the specific compression will work fine. The „dictionary“ for the compression needs to be stored only once, maybe hard-coded in the software, and not in each string. It might be of interest to consider building the dictionary dynamically at run-time, like it is done with gzip, but keeping it in a common place for all strings and thus sharing it. The custom strings that I used where actually using a hard coded compression algorithm generated using a large amount of typical data. It worked fine, but was just too clumsy to use because Java is not prepared to replace String with something else without really messing around in the standard run-time libraries, which I would neither recommend nor want.

It is important to consider the following issues:

1. Is the memory consumption of the strings really a problem?
2. Are there easier optimizations that solve the problem?
3. Can it just be solved by adding more hardware? Yes, this is a legitimate question.
4. Are there solutions for the problem in the internet or even within the current organization?
5. A new String class is so fundamental that excellent testing is absolutely mandatory. The unit tests should be very extensive and complete.

# How to create ISO Date String

It is a more and more common task that we need to have a date or maybe date with time as String.

There are two reasonable ways to do this:
* We may want the date formatted in the users Locale, whatever that is.
* We want to use a generic date format, that is for a broader audience or for usage in data exchange formats, log files etc.

The first issue is interesting, because it is not always trivial to teach the software to get the right locale and to use it properly… The mechanisms are there and they are often used correctly, but more often this is just working fine for the locale that the software developers where asked to support.

So now the question is, how do we get the ISO-date of today in different environments.

## Linux/Unix-Shell (bash, tcsh, …)

date "+%F"

## TeX/LaTeX

 \def\dayiso{\ifcase\day \or 01\or 02\or 03\or 04\or 05\or 06\or 07\or 08\or 09\or 10\or% 1..10 11\or 12\or 13\or 14\or 15\or 16\or 17\or 18\or 19\or 20\or% 11..20 21\or 22\or 23\or 24\or 25\or 26\or 27\or 28\or 29\or 30\or% 21..30 31\fi} \def\monthiso{\ifcase\month \or 01\or 02\or 03\or 04\or 05\or 06\or 07\or 08\or 09\or 10\or 11\or 12\fi} \def\dateiso{\def\today{\number\year-\monthiso-\dayiso}} \def\todayiso{\number\year-\monthiso-\dayiso} 
This can go into a file isodate.sty which can then be included by \include or \input Then using \todayiso in your TeX document will use the current date. To be more precise, it is the date when TeX or LaTeX is called to process the file. This is what I use for my paper letters.

## LaTeX

(From Fritz Zaucker, see his comment below):
 \usepackage{isodate} % load package \isodate % switch to ISO format \today % print date according to current format 

## Oracle

 SELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD') FROM DUAL; 
On Oracle Docs this function is documented.
It can be chosen as a default using ALTER SESSION for the whole session. Or in SQL-developer it can be configured. Then it is ok to just call
 SELECT SYSDATE FROM DUAL; 

Btw. Oracle allows to add numbers to dates. These are days. Use fractions of a day to add hours or minutes.

## PostreSQL

(From Fritz Zaucker, see his comment):
 select current_date; —> 2016-01-08 
 select now(); —> 2016-01-08 14:37:55.701079+01 

## Emacs

In Emacs I like to have the current Date immediately:
 (defun insert-current-date () "inserts the current date" (interactive) (insert (let ((x (current-time-string))) (concat (substring x 20 24) "-" (cdr (assoc (substring x 4 7) cmode-month-alist)) "-" (let ((y (substring x 8 9))) (if (string= y " ") "0" y)) (substring x 9 10))))) (global-set-key [S-f5] 'insert-current-date) 
Pressing Shift-F5 will put the current date into the cursor position, mostly as if it had been typed.

## Emacs (better Variant)

(From Thomas, see his comment below):
 (defun insert-current-date () "Insert current date." (interactive) (insert (format-time-string "%Y-%m-%d"))) 

## Perl

In the Perl programming language we can use a command line call
 perl -e 'use POSIX qw/strftime/;print strftime("%F", localtime()), "\n"' 
or to use it in larger programms
 use POSIX qw/strftime/; my \$isodate_of_today = strftime("%F", localtime()); 
I am not sure, if this works on MS-Windows as well, but Linux-, Unix- and MacOS-X-users should see this working.

If someone has tried it on Windows, I will be interested to hear about it…
Maybe I will try it out myself…

## Perl 5 (second suggestion)

(From Fritz Zaucker, see his comment below):
 perl -e 'use DateTime; use 5.10.0; say DateTime->now->strftime(„%F“);‘ 

## Perl 6

(From Fritz Zaucker, see his comment below):
 say Date.today; 
or
 Date.today.say; 

## Ruby

This is even more elegant than Perl:
 ruby -e 'puts Time.new.strftime("%F")' 
will do it on the command line.
Or if you like to use it in your Ruby program, just use
 d = Time.new s = d.strftime("%F") 

Btw. like in Oracle SQL it is possible add numbers to this. In case of Ruby, you are adding seconds.

It is slightly confusing that Ruby has two different types, Date and Time. Not quite as confusing as Java, but still…
Time is ok for this purpose.

## C on Linux / Posix / Unix

 #include #include #include 

 main(int argc, char **argv) { 

 char s[12]; time_t seconds_since_1970 = time(NULL); struct tm local; struct tm gmt; localtime_r(&seconds_since_1970, &local); gmtime_r(&seconds_since_1970, &gmt); size_t l1 = strftime(s, 11, "%Y-%m-%d", &local); printf("local:\t%s\n", s); size_t l2 = strftime(s, 11, "%Y-%m-%d", &gmt); printf("gmt:\t%s\n", s); exit(0); } 
This speeks for itself..
But if you like to know: time() gets the seconds since 1970 as some kind of integer.
localtime_r or gmtime_r convert it into a structur, that has seconds, minutes etc as separate fields.
stftime formats it. Depending on your C it is also possible to use %F.

## Scala

 import java.util.Date import java.text.SimpleDateFormat ... val s : String = new SimpleDateFormat("YYYY-MM-dd").format(new Date()) 
This uses the ugly Java-7-libraries. We want to go to Java 8 or use Joda time and a wrapper for Scala.

## Java 7

 import java.util.Date import java.text.SimpleDateFormat

 

... String s = new SimpleDateFormat("YYYY-MM-dd").format(new Date()); 
Please observe that SimpleDateFormat is not thread safe. So do one of the following:
* initialize it each time with new
* make sure you run only single threaded, forever
* use EJB and have the format as instance variable in a stateless session bean
* protect it with synchronized
* protect it with locks
* make it a thread local variable

In Java 8 or Java 7 with Joda time this is better. And the toString()-method should have ISO8601 as default, but off course including the time part.

## Summary

This is quite easy to achieve in many environments.
I could provide more, but maybe I leave this to you in the comments section.
What could be interesting:
* better ways for the ones that I have provided
* other databases
* other editors (vim, sublime, eclipse, idea,…)
* Office packages (Libreoffice and MS-Office)
* C#
* F#
* Clojure
* C on MS-Windows
* Perl and Ruby on MS-Windows
* Java 8
* Scala using better libraries than the Java-7-library for this
* Java using better libraries than the Java-7-library for this
* C++
* PHP
* Python
* Cobol
* JavaScript
* …
If you provide a reasonable solution I will make it part of the article with a reference…

# Conversion of ASCII-graphics to PNG or JPG

Images are usually some obscure binary files. Their most common formats, PNG, SVG, JPEG and GIF are well documented and supported by many software tools. Libraries and APIs exist for accessing these formats, but also a phantastic free interactive software like Gimp. The compression rate that can reasonably be achieved when using these format is awesome, especially when picking the right format and the right settings. Tons of good examples can be found how to manipulate these image formats in C, Java, Scala, F#, Ruby, Perl or any other popular language, often by using language bindings for Image Magick.

There is another approach worth exploring. You can use a tool called convert to just convert an image from PNG, JPG or GIF to XPM. The other direction is also possible. Now XPM is a text format, which basically represents the image in ASCII graphics. It is by the way also valid C-code, so it can be included directly in C programms and used from there, when an image needs to be hard coded into a program. It is not generally recommended to use this format, because it is terribly inefficient because it uses no compression at all, but as intermediate format for exploring additional ways for manipulating images it is of interest.
An interesting option is to create the XPM-file using ERB in Ruby and then converting it to PNG or JPG.

# How to recover the Carry Bit

As frequent readers might have observed, I like the concept of the Carry Bit as it allows for efficient implementations of long integer arithmetic, which I would like to use as default integer type for most application development. And unfortunately such facilities are not available in high level languages like C and Java. But it is possible to recover the carry bit from what is available in C or Java, with some extra cost of performance, but maybe neglectable, if the compiler does a good optimization on this. We might assume gcc on a 64-Bit-Linux. It should be possible to do similar things on other platforms.

So we add two unsigned 64-bit integers and and an incoming carry bit to a result

with

using the typical „long long“ of C. We assume that

where

and

. In the same way we assume and with the same kind of conditions for , , or , , , respectively.

Now we have

and we can see that

for some

.
And we have

where

is the carry bit.
When looking just at the highest visible bit and the carry bit, this boils down to

This leaves us with eight cases to observe for the combination of , and :

x_hy_huz_hc
00000
10010
01010
11001
00110
10101
01101
11111

Or we can check all eight cases and find that we always have

or

So the result does not depend on anymore, allowing to calculate it by temporarily casting , and to (signed long long) and using their sign.
We can express this as „use if and use if „.

An incoming carry bit does not change this, it just allows for , which is sufficient for making the previous calculations work.

In a similar way subtraction can be dealt with.

The basic operations add, adc, sub, sbb, mul, xdiv (div is not available) have been implemented in this library for C. Feel free to use it according to the license (GPL). Addition and subtraction could be implemented in a similar way in Java, with the weirdness of declaring signed longs and using them as unsigned. For multiplication and division, native code would be needed, because Java lacks 128bit-integers. So the C-implementation is cleaner.

# Logging

English

Software enthält häufig eine Log-Funktionalität. Üblicherweise werden dort ein- oder mehrzeilige Einträge in eine Datei, nach syslog oder in die Standardausgabe geschrieben (und letzlich in eine Datei umgeleitet), die etwas darüber sagen, was die Software so macht. Normalerweise kann man das alles ignorieren, aber sobald dort etwas mit „ERROR“ auftritt oder schlimmer noch sogenannte Stacktraces, ist eigentlich angesagt, dass man dem Problem auf den Grund geht. Da nun leider Software häufig von minderer Qualität ist, was durchaus von den Unzulänglichkeiten der verwendeten Libraries und Frameworks kommen kann, sieht man das leider recht oft, teilweise so oft, dass man denen nicht mehr wirklich nachgeht. Ärgerlich ist vor allem, dass der eigentliche Fehler oft vorher an einer ganz anderen Stelle aufgetreten ist, sich aber in der Log-Datei nicht nachvollziehen lässt.

Nun ist es schön, dass die Log-Dateien einen gewissen Aufbau der Einträge einhalten. Meist beginnen sie mit einer Zeitangabe im ISO-Format, oft auf die Millisekunde genau. Da in der Regel mehrere Prozesse oder mehrere Threads gleichzeitig laufen, werden deren Einträge natürlich mehr oder weniger wild gemischt. Das ist gut erträglich, wenn auch mehrzeilige Log-Einträge (z.B. Stack-Traces) zusammen bleiben und wenn sich Anfang und Ende von mehrzeiligen Log-Einträgen gut erkennen lassen. Man kann dann mit Splunk oder mit Perl- oder Ruby-Scripten die Log-Dateien analysieren und jeweils für einzelne Threads die Abläufe nachvollziehen. Das Zusammenhalten mehrzeiliger Einträge lässt sich realisieren, indem man ein atomares write verwendet oder indem man einen „Logger-Thread“-hat und alle Log-Einträge diesem Thread in einer Queue zur Verfügung stellt, die dieser dann abarbeitet und herausschreibt.

Insgesamt ist das ganze Thema aber unübersichtlich und unhandhabbar geworden. In der Java-Welt hatte man früher log4j und eine relativ einfache Konfigurations-Datei im properties-Format. Das wurde dann irgendwann durch XML ersetzt und es kamen andere Logging-Frameworks dazu und jeweils immer wieder etwas, was alle diese vereinheitlichte. Letztlich wurde es dadurch komplizierter und unübersichtlicher.

Es stellt sich die Frage, wie viel von der Logik für die Handhabung der Log-Dateien in die Software selbst gehört. Muss die Software wissen, in welche Datei geloggt wird? Muss sie Log-Rotation kennen? Muss es sein, dass eine Software überhaupt nicht startet, nur weil man es nicht schafft, die komplizierte Log-Konfiguration richtig einzustellen? Letztlich müssen die Log-Dateien am Ende den Systemadministrator zufriedenstellen, der die Software auf dem Produktivsystem betreibt. Soll man diesem zumuten, für jede Software eine spezielle Konfiguration des Logging zu pflegen? Oder für diesen Zweck jeweils von den Entwickerln eine neue Software-Version anzufordern, weil die Konfiguration als Teil des Deployments „hardcodiert“ ist? Interessant wird es auch dann, wenn man auf Technologien wie „Platform as a Service“ (PAAS) setzt, wo man Applikationsserver, Framework, Datenbank u.s.w. zur Verfügung hat, aber wo die Software ohne weiteres den Server wechseln kann und damit Dateien verloren gehen.

Ist es vielleicht einfach die bessere Lösung, unter Einhaltung eines vernünftigen Formats nach stdout zu loggen die Software so laufen zu lassen, dass deren Standardausgabe (stdout oder stderr) in einen logmanager geleitet wird? Dieser kann dann für alle Software, die auf dem System läuft, verwendet werden und vom Systemadministrator einheitlich konfiguriert werden. Einheitlich heißt nicht nur, für alle Java-Programme gleich, sondern für alle Software, die sich dazu bringen lässt, ihre Log-Ausgabe nach stdout zu schreiben und gewisse Mindestanforderungen an das Format einzuhalten. Im Prinzip ließe sich mit named-Pipes auch eine beliebig hard-codierte Log-Datei unterstützen, aber das macht die Dinge nur komplizierter. Wenn dann das Logging-Framework der Software selbst noch Log-Rotation unterstützt, kann das natürlich durcheinandergeraten, wenn etwa nach n geschriebenen Bytes oder m Sekunden die named-pipe geschlossen, umbenannt und eine neue Datei angelegt wird, was je nach Schreibrechten in dem Verzeichnis zu verschiedenen Fehler führt.

Was ist jetzt mit Software, die selbst als Filter fungiert, wo also stdout Teil der Funktionalität ist? Solche Software findet man üblicherweise in Form kleinerer Programme oder Skripte, die nicht unbedingt Log-Dateien schreiben müssen oder in Form von gut ausgetesteter Software, die Teil der Installation ist. Oft kann man hier mit stderr für Log-Ausgaben, in diesem Fall meist für Fehlermeldungen, auskommen. Oder man könnte eine Log-Datei auf der Kommandozeile angeben, was dann auch wieder für den Systemadminstrator leicht zugänglich wäre und auf eine named-pipe umleiten könnte.