Devoxx UA and Devoxx BE 2019

In 2019 I visited Devoxx UA in Kiev and Devoxx BE in Antwerp.
Traveling was actually a little story by itself, so for now we can just assume that I magically was at the locations of DevoxxUA and DevoxxBE.

In Kiew I attended the following talks:

On Wednesday I attended the following talks in Antwerp:

On Thursday I attended the following talks in Antwerp:

On Friday I attended the following talks in Antwerp:

That’s it…
As always, a lot of these topics deserve an article in this blog. And a lot of video recordings from the conference are worth viewing.

Links

Share Button

Checked Exceptions in Java

In Java it is possible to declare a method with a „throws“-clause. For certain exceptions, that are not extending „RuntimeException“ or „Error“, this is actually required.

What looked like a good idea 25 years ago has proven to be a dead end. I do not know of any other major programming language that opts for declaring exceptions in this way. Slightly newer frameworks extend all their exceptions from RuntimeException, thus avoiding the need to declare them. Even in relatively early Java there was a weird way of working with exceptions in EJB, when it was required to write an interface and an implementation for the EJB. But it was strongly discouraged to let the implementation implement the interface, because it threw different exceptions. It was not the only weird thing about early EJB, of course. But without checked exceptions it would at least have been possible to let the implementation implement its interface.

We are now able to use Java 13 and as of Java 8 lambdas were introduced. With the introduction of lambdas the declared exceptions became especially painful and for this reason even Oracle has created twins for some essential exceptions that derive from RuntimeException, especially IOException.

We should face it: The throws clause has turned out to be a mistake and we should avoid this mistake by just using exceptions that do not have to be declared, at least in our APIs. It is not the only mistake, see Criticism of Java. Some of my other favorites are the lack of operator overloading for numeric types, the weird concept of Serializable and the lack of natively immutable collections and the lack of a convenient way to write some collections as code. But these issues are being worked on and we will eventually see some progress.

Links

Share Button

Exceptions to implement Program Logic

Sometimes it is conveniant to use exceptions for implementing the regular program logic.

Assuming we want to find some data and then process them. When no data is found, this step can be skipped, because there is nothing to do. So we could program something like this:


public Data readData(Param param) {
    Data data = db.read(param);
    if (data.isEmpty()) {
        throw new NotFoundException("nothing found");
    }
    return data;
}

public ProcessedData doWork(Param param) {
    try {
        Data input = readData(param);
        ....
        ....
        ....
        ProcessedData result = ....
        return result;
    } catch (NotFoundException nfex) {
        return ProcessedData.empty();
    }
}

And some other exceptions could also be handled in a similar way.

Of course some people say, that this is not good and an abuse of exceptions. But sometimes it is tempting.

So is this bad? And if so, why? Let’s find out.

This is some kind of weird obfuscation of the control flow, because throwing and catching of exceptions can be far apart and it can become quite unclear, from where in the stack which exceptions can be thrown. So there are good reasons to recommend using exceptions only for what they are meant for by their name. The Goto has never made it into Java and we are discouraged from using it in many other languages, like C. But languages like Java, C, Perl, Ruby and some others provide quite rich control flow relying neither on goto nor on exceptions by using „return“ anywhere in a function or method or subroutine, leaving loops with „break“ or „last“ or going to the next iteration with „next“ or „continue“. Perl and Java even allow to specify which of nested loops to leave with break or last. These mechanisms are very powerful and there is no urgent need to add exceptions or even gotos just to support the control flow.

Once moving to newer languages like Scala much of this is gone or at least strongly discouraged in a purely functional programming style. This makes programming Scala harder, and comes with benefits that might be worth the extra effort.

But in Java these functional purists have not become very strong yet, so using „break“, „continue“, „return“ etc. is still ok and quite powerful.

In Java there is another very major problem with exceptions. Many, if not most Java programs run in a framework or container like Spring, EJB/JEE, JBoss Fuse, for example. Now a piece of software becomes a software component, that can interact with other components through the framework. And exceptions are noticed by the framework. In many cases they have the effect that an ongoing transaction is marked as „rollback only“. So the whole processing continues normally, and when all the code from the components is finally done, the framework performs a rollback instead of a commit.

As long as exceptions are only used for handling errors or unsual situations, in which cases the rollback is probably the way to go anyway, everything is fine. But if we for example look up something and base the further processing on the outcome of this, then a NotFoundException will result in very counter intuitive behavior.

So the original rule of not abusing exceptions is actually not such a bad idea.

Share Button

Guava-Collections in Java-APIs

When we write APIs, that have parameters or as return values, it is a good idea to consider relying on immutable objects only. This applies also when collections are involved directly or indirectly as content of the classes that occur as return values or parameters. Changing what is given through the API in either direction can create weird side effects. It even causes different behavior, depending on weather the API works locally or via the network, because changes of the parameters are usually not brought back to the caller via some hidden back channel, unless we run locally. I use Java as an example, but it is quite an universal concept and applies to many languages. If we talk about Ruby, for example, the freeze method might be our friend, but it goes only one level deep and we actually want to deep-freeze.. Another story, maybe…

Now we can think of using Java’s own Collections.unmodifiableList and likewise. But that is not really ideal. First of all, these collections can still be modified by just working on the inner of the two collections:

List list = new ArrayList<>();
list.add("a");
list.add("b");
list.add("c");
List ulist = Collections.unmodifiableList(list);
list.add("d");
ulist.get(3)
-> "d"

Now list in the example above may already come from a source that we don’t really control or know, so it may be modified there, because it is just „in use“. Unless we actually copy the collection and put it into an unmodifiableXXX after that and only retain the unmodifiable variant of it, this is not really a good guarantee against accidental change and weird effects. Let’s not get into the issue, that this is just controlled at runtime and not at compile time. There are languages or even libraries for Java, that would allow to require an immutable List at compile time. While this is natural in languages like Scala, you have to leave the usual space of interfaces in Java, because it is also really a good idea to have the „standard“ interfaces in the APIs. But at least use immutable implementations.

When we copy and wrap, we still get the weird indirection of working through the methods of two classes and carrying around two instances. Normally that is not an issue, but it can be. I find it more elegant to use Guava in these cases and just copy into a natively immutable collection in one step.

For new projects or projects undergoing a major refactoring, I strongly recommend this approach. But it is of course only a good idea, if we still keep our APIs consistent. Doing one of 100 APIs right is not worth an additional inconsistency. It is a matter of taste, to use standard Java interfaces implemented by Guava or Guava interfaces and classes itself for the API. But it should be consistent.

Links

Share Button

hashCode, equals and toString

In many programming languages we are urged to define methods hashCode, equals and toString. They are named like this in Java and in many JVM languages or they use similar names. Some languages like Perl and Scala provide decent mechanisms for the language to figure these out itself, which we do most of the time in Java as well by letting the IDE create it for us or by using a library. This solution is not really as good as having it done without polluting our source code and without using mechanisms like reflection, but it is usually the best choice we have in Java. It does have advantages, because it gives us some control over how to define it, if we are willing to exercise this control.

So why should we bother? equals is an „obvious“ concept that we need all the time by itself. And hashCode we need, when we put something into HashMaps or HashSets or something like that. It is very important to follow the basic contract, that hashCode and equals must be compatible, that is

    \[\forall a, b : a.\mathrm{equals}(b) \implies a.\mathrm{hashCode}() == b.\mathrm{hashCode}()\]

And equals of course needs to be an equivalence relation.
There has been an article in this blog about „Can hashCodes impose a security risk?„, which covers aspects that are not covered here again.

An important observation is that these do not fit together with mutability very well. When we mutate objects, their hashCode and equals methods yield different results than before, but the HashSet and HashMap assume that they remain constant. This is not too bad, because usually we actually use very immutable objects like Strings and wrapped primitive numbers as keys for Maps. But as soon as we actually write hashCode and equals, this implies that we are considering the objects of this type to be members of HashMaps or HashSets as keys and the mutability question arises. One very ugly case is the object that we put into the database using Hibernate or something similar. Usually there is an ID field, which is generated, while we insert into the database using a sequence, for example. It is good to use a sequence from the database, because it provides the most robust and reliable mechanism for creating unique ids. This id becomes then the most plausible basis for hashCode, but it is null in the beginning. I have not yet found any really satisfying solution, other than avoiding Hibernate and JPAx. Seriously, I do think, that plain JDBC or any framework like MyBatis or Slick with less „magic“ is a better approach. But that is just a special case of a more general issue. So for objects that have not yet made the roundtrip to the database, hashCode and equals should be considered dangerous.

Now we have the issue that equality can be optimized for hashing, which would be accomplished by basing it on a minimal unique subset of attributes. Or it could be used to express an equality of all attributes, excluding maybe some kind of volatile caching attributes, if such things apply. When working with large hash tables, it does make a difference, because the comparison needs to look into a lot more attributes, which do not change the actual result at least for each comparison that succeeds. It also makes a difference, in which order the attributes are compared for equality. It is usually good to look into attributes that have a larger chance of yielding inequality, so that in case of inequality only one or only few comparisons are needed.

For the hashCode it is not very wrong to base it on the same attributes that are used for the equals-comparison, with this usual pattern of calculating hash codes of the parts and multiplying them with different powers of the some two-digit prime number before adding them. It is often a wise choice to chose a subset of these attributes that makes a difference most of the time and provides high selectivity. The collisions are rare and the calculation of the hash code is efficient.

Now the third method in the „club“ is usually toString(). I have a habit of defining toString, because it is useful for logging and sometimes even for debugging. I recommend making it short and expressive. So I prefer the format
className(attr1=val1 attr2=val2 att3=val3)
with className the name of the actual class of this object without package, as received by
getClass().getSimpleName()
and only including attributes that are of real interest. Commas are not necessary and should be avoided, they are just useless noise. It does not matter if the parantheses are „()“ or „[]“ or „{}“ or „«»“, but why not make it consistent within the project. If attribute values are strings and contain spaces, it might be a good idea to quote them. If they contain non-printable characters or quotation marks, maybe escaping is a good idea. For a real complete representation with all attributes a method toLongString() can be defined. Usually log files are already too much cluttered with noise and it is good to keep them consise and avoid noise.

Share Button

Not all projects are on ideal paths II (Tim Finnerty)

This is another story of a project, that did not go as well as it could have gone while I was there. From unsuccessful projects we can learn a lot, so there will be stories like this once in a while. The first one was about Tom Rocket.

Tim Finnerty

It was the time, when all the cool companies tried to introduce Java. And some of the new Java projects failed, causing the companies to go back to C, which again scared other companies from doing this step. But some companies did not get scared by this. They embraced the new Java-fashion at a time when it still was not clear whether or not this was a good idea. What could possibly go wrong?

Well, in those days the experienced guys did not want to move to Java. It was slow, it was unreliable, not mature,… Maybe for Applets, maybe more generally for GUIs to get rid of VisualBasic, but Java on the server was considered a bad joke. For the server real people used real C, of course on Unix or maybe Linux, which was not really such a bad idea in those days. But there were the young people. Or the ones who had stayed young. They often just had finished their education and firmly believed that by just using technology „xyz“ everything would become great. xyz can be a development method (spiral model in those days, agile today), an architecture (microservices), a paragdigm („OO“, „functional“), a framework („enterprise edition“), a tool or a programming language (yeah, Java).. Often this first enthusiasm ends in a disaster: The money has been spent, the developers are leaving and the software is further away from being useful than anybody would like to admit. In lucky cases there is still some money left to do it right. Maybe even to do it right in Java.

That is were we are coming in. I do not really know the earlier history, but according to the management it was a total disaster. Now Tim Finnerty (real name known to the author) was the new technical lead, team architect or whatever this role is called. He embraced the new technology, but promised to not overstrech it. A bit of old school. Sounds good, because it is exactly what managers want to hear. No more risky experiements, but this time it needs to become a success.

So Tim Finnerty defined, how we had to work. He knew Java, he knew databases and especially Oracle, he knew the web, he even knew Perl. And he knew OO. Better than anybody else, so we did the real thing. Great head start. And everybody had to program according to Tim’s rules.

Of course we were using Java enterprise edition. That meant, that we were programming against some Weblogic application server, that was hard to install, hard to run, required a few minutes of startup time for each minor change of the software that we were writing and forced to a very archaic and primitive programming model. But that was cool and it was the future, which unfortunately proved to be true. Even though it has at least become usable by now. So far nothing to blame on Tim, because it is kind of the stack that everybody used.

Now to the OO and the database. Each database table represented a „Business Object“, with a name like XyzBO. So most of the time, we wrote a class XyzBO plus a few more to fulfill the greed and need for boilerplate code of the old J2EE-world. XyzBO was a enterprise java bean. A stateless session bean, to be accurate. Which meant, that we wrote methods of this EJB, which were basically procedures of the pre-OO-world. But within we could of course use the whole OO-toolset. Which we did. So the class to represent any data from the database was actually called Data. It was a minor subset of the standard Perl data structures, which meant that Data was a list of hash maps, which could behave just like a hash map if it had only one entry. Database queries returned Data, or of course null, if nothing was found. Nobody would ever want to use an empty collection. Pretty much the opposite of what we are now doing the hard way by introducing Optional or Option to avoid the null. But it was easy to just write
if (data == null) { data = new Data(); }
at least for the ones who new this trick.
So data resembled the content of the database or of the query result, with the column names as keys and the values as objects. When working with these, it was really easy. Just know the attribute name accurately. Get the value from data. See if it is null. If not, cast it to the real type, and voila….

The database was designed according to Tim’s advice, he had to review every table. It was mandatory to have as unique key and as first column a string of around 700 characters, which was called HANDLE. Each table had a business primary key, which was always consisting of several columns. Because the system allowed multiple instances of the software to run on the same database, there was always one column called „SITE“ in this logical primary key. But there were no primary keys defined in the database. The unique HANDLE was enough. It contained the name of the BO, like XyzBO, followed by a colon and followed by the concatenation of all the logical primary keys, separated with commas. The date had to be converted to a string using a local US format, not ISO, of course. All foreign keys were defined using HANDLE. In the end more than half of the DB space was wasted for this stupidity. But each single handle value started with an XyzBO:, to remind us that we were programming OO.

And now booleans. It was forbidden to use the boolean type of Java. All booleans were strings containing the words „true“ and „false“. This went like that all the way to the web interface.

At that time web frameworks did not yet exist or were at least unknown. So the way to go was to write JSPs, which contained kind of dynamic web pages and to write servlets to control the flow and access the EJBs. Now according to Tim it was too hard to learn servlets as well, so it was forbidden to use them and instead a connection-JSP had to be written, which did not display anything, but only contained a <% in the beginning and a %> in the end and the code between.

A lot of small and larger stupidities, that were forced on the team. Most people were new to Java and to OO and did not even realize that there was anything wrong, apart from the fact, that it was kind of hard to get stuff done. Some stupidities were due to the fact that the early J2EE really sucked, but mostly it was Tim, who forced everyone to his level. This story happened a long time ago. Tim has already retired. I would say he is one of the guys, who retire as a Junior.

There is (almost) nothing wrong with stupidity. And there is (almost) nothing with arrogance. But the combination really sucks. Especially if it it taken serious by the manager or has to be taken serious by the team.

Share Button

2019 — Happy New Year

Gott nytt år! — Godt nytt år! — Felice anno nuovo! — Καλή Χρονια! — Щасливого нового року! — Срећна нова година! — С новым годом! — Feliĉan novan jaron! — Bonne année! — FELIX SIT ANNUS NOVUS — Gullukkig niuw jaar! — Un an nou fericit! — Frohes neues Jahr! — Happy new year! — ¡Feliz año nuevo! — Onnellista uutta vuotta! — عام سعيد

This was created by a Java-program:

import java.util.Random;
import java.util.List;
import java.util.Arrays;
import java.util.Collections;

public class HappyNewYear {

    public static void main(String[] args) {
        List list = Arrays.asList("Frohes neues Jahr!",
                                          "Happy new year!",
                                          "Gott nytt år!", 
                                          "¡Feliz año nuevo!",
                                          "Bonne année!", 
                                          "FELIX SIT ANNUS NOVUS", 
                                          "С новым годом!",
                                          "عام سعيد",
                                          "Felice anno nuovo!",
                                          "Godt nytt år!", 
                                          "Gullukkig niuw jaar!", 
                                          "Feliĉan novan jaron!",
                                          "Onnellista uutta vuotta!",
                                          "Срећна нова година!",
                                          "Un an nou fericit!",
                                          "Щасливого нового року!", 
                                          "Καλή Χρονια!");
        Collections.shuffle(list);
        System.out.println(String.join(" — ", list));
    }
}

Share Button

Indexing of Arrays and Lists

We index arrays with integers. Lists also, at least the ones that allow random access. And sizes of collections are also integers.
This allows for 2^{31}-1=2147483647 entries in Java and typical JVM languages, because integers are actually considered to be 32bit. Actually we could think of one more entry, using indices 0..2147483647, but then we would not be able to express the size in an signed integer. This stuff is quite deeply built into the language, so it is not so easy to break out of this. And 2’000’000’000 entries are a lot and take a lot of time to process. At least it was a lot in the first few years of Java. There should have been an unsigned variant of integers, which would in this case allow for 4’000’000’000 entries, when indexing by an uint32, but that would not really solve this problem. C uses 64 bit integers for indexing of arrays on 64 bit systems.

It turns out that we would like to be able to index arrays using long instead of int. Now changing the java arrays in a way that they could be indexed by long instead of int would break a lot of compatibility. I think this is impossible, because Java claims to retain very good backward compatibility and this reliability of both the language and the JVM has been a major advantage. Now a second type of arrays, indexed by long, could be added. This would imply even more complexity for APIs like reflection, that have to deal with all cases for parameters, where it already hurts that the primitives are no objects and arrays are such half-objects. So it will be interesting, what we can find in this area in the future.

For practical use it is a bit easier. We can already be quite happy with a second set of collections, let them be called BigCollections, that have sizes that can only be expressed with long and that are indexed in cases where applicable with longs. Now it is not too hard to program a BigList by internally using an array of arrays or an array of arrays of arrays and doing some arithmetic to calculate the internal indices from the long (int64) index given in the API. Actually we can buy some performance gain when resizing happens, because this structure, if well done, allows for more efficient resizing. Based on this all kinds of big collections could be built.

Share Button

Devoxx Kiew 2018

In the end of 2018 the number of conferences is kind of high. A great highlight is the Devoxx BE in Antwerp. But it has now five partner conferences in London, Paris, Krakow, Morocco and Kiev. So I decided to have a look at the one in Kiev.

How was it in comparison to the one in Belgium? What was better in Kiev: The food was way better, the drinks in the first evening (Whisky and Long Drinks vs. Belgium Beer) might be considered better, there were more people engaged to help the organizers…
What was better in Belgium: There were still a bit more speeches. While the location in Kiev was really great, in Belgium the rooms were way better for the purpose of providing a projection visible for everybody and doing a video recording that did not disturb the audience.
The quality of the speeches was mostly great in both locations. In Kiev they gamified the event a bit more..

Generally there was a wide range of topics and the talks were sorted into the following thematic groups:

  • Methodology & Culture
  • JVM Languages
  • Server Side
  • Architecture & Security
  • Mobile & IoT
  • Machine Learning & AI
  • Big Data & Data Mining
  • Cloud, Containers & Infrastructure
  • Modern Web & UX

See the schedule for the distribution…

I attended on Friday:

I attended on Saturday:

A lot to learn.

Share Button

Devoxx Antwerp 2018

In 2018 I am visiting a few conferences. A great highlight is the Devoxx BE in Antwerp, which I had the privilege of visiting 2012, 2013, 2014, 2015, 2016 and 2017.

As it should be, it is not just the same every year, but content and speakers change a bit from year to year.

Some topics that got a lot of attention were functional programming, artificial intelligence, Big Data, Machine Learning, clouds, JVMs, Kotlin

There was less about other JVM languages (apart from Kotlin), so Scala, Clojure, Groovy or Ceylon were covered little or not at all and Android used to be more present in other years. I would say that Ceylon has become irrelevant, probably because Kotlin was too similar and came out the same time and won. Groovy has its niche, Clojure has its niche, Scala and Kotlin have become mature and are now the two mainstream alternatives to Java, but themselves much smaller than Java. This was represented in the conference, taking into account that Scala has its own large conferences, like Scala Days, Scala Exchange, Scala World and a lot more.

Some side issues that might worry some of us did come up occasionally. Was it bad, that IBM bought Red Hat? At least they paid around 34’000’000’000 USD, which is more than 2’500’000 USD per employee. There are probably no other assets in terms of buildings, patents, hardware or whatever, that would justify this price, so IBM probably will have an interest to keep a large number of these employees and not scare them away by too much „IBM-culture“. We will see, but no reason to get immediately worried. Oracle wants money for running their JVM in production after more than 6 months. This can be avoided by always switching to the newest version or by relying on the JDKs offered by alternative sources like Amazon, RedHat…

Microsoft was a sponsor and had a booth. Their topic was not MS-Windows and MS-Office and MS-SQL-Server, but Azure, which can be used with Linux and Java and PostgreSQL, for example. The company did change a bit since the days of Steve Ballmer and we will see if this is an excursion or a continuous direction.

And James Gosling was there at the opening, as a surprise.

Generally there was a wide range of topics and the talks were sorted into the following thematic groups:

  • Methodology & Culture
  • Java Language
  • Programming languages
  • Architecture & Security
  • Big Data & Machine Learning
  • Mind the Geek
  • Server Side Java
  • Modern Web & UX
  • Cloud, Containers & Infrastructure
  • Mobile & IoT

See the schedule for the distribution…

I attended on Wednesday:

I attended on Thursday:

I attended on Friday:

It was a great conference. A lot of new ideas.

Share Button