Christmas 2016

Christmas Tree 2016

З Рiздвом Христовим − کريسمس مبارک − Nollaig Shona Dhuit − Kellemes Karácsonyi Ünnepeket − Feliz Natal − Gëzuar Krishtlindjet − Joyeux Noël − καλά Χριστούγεννα − Vesele Vianoce − Fröhliche Weihnachten − Vesele bozicne praznike − Bon nadal − God Jul − Glædelig Jul − Честита Коледа − Feliz Navidad − Sretan božić − Häid jõule − Zalig Kerstfeest − Prettige Kerstdagen − Merry Christmas − 즐거운 성탄, 성탄 축하 − Срећан Божић − Crăciun fericit − 圣诞快乐 − Feliĉan Kristnaskon − Buon Natale − Priecîgus Ziemassvçtkus − Veselé Vánoce − ميلاد مجيد − God Jul − क्रिसमस मंगलमय हो − Natale hilare − Wesołych Świąt Bożego Narodzenia − Hyvää Joulua − Gledhilig jól − Gleðileg jól − С Рождеством − Selamat Hari Natal − クリスマスおめでとう ; メリークリスマス − Su Šventom Kalėdom − Bella Festas daz Nadal − Mutlu Noeller

Times have changed. In 2015 I used a Perl 5 program to generate the Christmas greetings in some arbitrary order. But in 2016 Perl 6 has become production ready, so it should be used instead. It is much shorter anyway:


#!/usr/bin/env perl6
# the greetings, in alphabetical order
my @texts = ( 'Bella Festas daz Nadal', 'Bon nadal', 'Buon Natale', 'Crăciun fericit', 'Feliz Natal', 'Feliz Navidad', 'Feliĉan Kristnaskon', 'Fröhliche Weihnachten', 'Gledhilig jól', 'Gleðileg jól', 'Glædelig Jul', 'God Jul', 'God Jul', 'Gëzuar Krishtlindjet', 'Hyvää Joulua', 'Häid jõule', 'Joyeux Noël', 'Kellemes Karácsonyi Ünnepeket', 'Merry Christmas', 'Mutlu Noeller', 'Natale hilare', 'Nollaig Shona Dhuit', 'Prettige Kerstdagen', 'Priecîgus Ziemassvçtkus', 'Selamat Hari Natal', 'Sretan božić', 'Su Šventom Kalėdom', 'Vesele Vianoce', 'Vesele bozicne praznike', 'Veselé Vánoce', 'Wesołych Świąt Bożego Narodzenia', 'Zalig Kerstfeest', 'καλά Χριστούγεννα', 'З Рiздвом Христовим', 'С Рождеством', 'Срећан Божић', 'Честита Коледа', 'ميلاد مجيد', 'کريسمس مبارک', 'क्रिसमस मंगलमय हो', 'クリスマスおめでとう ; メリークリスマス', '圣诞快乐', '즐거운 성탄, 성탄 축하');
# pick(*) draws all elements of the list in random order
my @shuffled = @texts.pick(*);
say @shuffled.join(" − ");


HTTPS

This blog is now using HTTPS. The new URL is https://brodowsky.it-sky.net/. The old URL http://brodowsky.it-sky.net/ is no longer supported, but it is automatically forwarded to the HTTPS URL.

If you would like to read more about changing the links within the blog, you can find information on Vladimir’s Blog, including a recipe, both in German.


Is Java becoming non-free?

We are kind of used to the fact that Java is „free“.
It has been free in the sense of „free beer“ pretty much forever.
And more recently also „free“ in the sense of „free speech“.

In spite of articles claiming that „Oracle is going to monetize on Java“, such as this one, Java remains free, at least for now; the article itself says so as well.
But it seems that they are looking for loopholes. For example, we download and install Java SE including X, Y and Z, because it comes like that, agree to a hundred pages of license text and confirm having read and understood everything, as always… Now we really need X, which is the JDK and actually free. But we just accidentally also installed Y and Z, which we do not need, but which carry a price tag that they may try to collect on.

Even if nothing really happens, issues like this help to undermine the trust in the platform in general, not only for Java, but also for other JVM languages. Eventually there could be forks like those we have seen with LibreOffice vs. OpenOffice or with MariaDB vs. MySQL, which more or less took over by avoiding the ties to Oracle. Solaris seems to have a similar fork, but in this case people are just moving to Linux anyway, so the issue is less relevant.

These prospects are not desirable, but I think we do not have to panic, because there are ways to solve this that will be pursued if necessary. Maybe it is a good idea to be more careful when installing software. And to think twice when starting a new project whether Oracle or PostgreSQL is the right DB product in the long term, taking into consideration Oracle’s attitude towards loyal long-term customers.

It is regrettable. Oracle has great technology from its own history and from Sun: databases, Java including the surrounding universe, Solaris and hardware. Let us hope that they will stay reasonable, at least with Java.


JMS

Java has never been just a language; it brought us libraries and frameworks as well. Some of them proved to be bad ideas, some became hyped without having any obvious advantages, but some were really good.

In the JEE stack, messaging (JMS) was included pretty much from the beginning. In those days, when Java belonged to Sun Microsystems and Sun did not yet belong to Oracle, one aim was to support databases, which meant mostly Oracle at the time, via JDBC, and so-called message oriented middleware, which was available in the IBM world, via JMS. JMS is a common interface for messaging, that is, for sending micro-email-like messages not between humans, but between software components. It can be used within one JVM, but also between geographically distant servers, provided a safe network connection exists. Since we all know email, this is in principle not too hard to understand, but the question is what it really means and whether it brings us something that we do not already have otherwise.

We do have web services as an established way to communicate between different servers across the network, and of course they can also be used locally, if desired. Web services are neither the first nor the only way to communicate between servers, nor are they the most efficient way, but I would say that they are the way we do it in typical distributed applications that are not tied to any legacy. In principle web services are network capable and synchronous. This is well understood and works fine for many applications. But it also forces us to block processes or threads while waiting for responses, thus occupying valuable resources, and we tend to lose responsiveness because of the waiting. It should be observed that DB access is typically only available synchronously. This is understandable because of the transactions, but it also blocks resources to a huge extent, since we know that the performance of many applications is DB driven.

Now message based software architectures think mostly asynchronously. Sending a message is „fire and forget“. There is such a thing as making messages transactional, but this has to be understood correctly: there is one transaction for sending the message, and it guarantees that the message is sent. Delivery guarantees can only be given to a limited extent, because we do not know anything about the other side, not even whether it is working at all; this is not checked as part of the transaction. We can imagine, though, that the messaging system has its own transactional database and stores the message there within the sending transaction. It then retries delivering the message until it succeeds, and the message is deleted from this store as part of the receiving transaction. Both of these transactions can be part of a distributed transaction and thus be combined with other transactions, usually against databases. This is what we usually have in mind when talking about transactional messaging. I have to mention that the distributed transaction, usually based on the so-called two-phase commit, is not quite as waterproof as we might hope: it can be broken by constructing a worst case scenario regarding the timing of failures of network and systems. But for practical purposes it is reasonably good to use.
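To make this a bit more concrete, here is a minimal sketch in Java using the classic JMS API of sending a message within a transacted session. How the ConnectionFactory is obtained (typically via JNDI) and the queue name „example.orders“ are assumptions for the illustration, not part of any particular product.

import javax.jms.*;

public class OrderSender {
    // sketch: „fire and forget“ within a transacted session
    public static void sendOrder(ConnectionFactory factory, String text) throws JMSException {
        Connection connection = factory.createConnection();
        try {
            // first parameter true: the session is transacted
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            Queue queue = session.createQueue("example.orders");
            MessageProducer producer = session.createProducer(queue);
            producer.send(session.createTextMessage(text));
            // the commit only guarantees that the message has been handed over,
            // not that the receiver has processed or even received it
            session.commit();
        } finally {
            connection.close();
        }
    }
}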

While it is extremely interesting to investigate purely message based architectures, especially in conjunction with the functional paradigm, this need not be the only choice. Often it is a good option to use a combination of messaging and synchronous services.

We should observe that messaging is a more abstract concept. It can be implemented by some middleware and even be accessible through a standardized interface like JMS. But it can also be more abstract, as in a queuing system or in what Akka uses for its internal communication. And messaging is not limited to Java or JVM languages. Interoperability does impose some constraints on how to use it, because it rules out Object messages, which store serialized Java objects, but there are ways to address this by using JSON, BSON, XML or Protocol Buffers as message contents.

What is interesting about JMS and messaging in general are the two major communication modes. We can have queues, which are point-to-point connections, or we can have „topics“, which are channels into which messages are sent and which deliver each message to all current subscribers of the topic. Topics are interesting for notifying different components about an event happening in the system, while details about the event may have to be queried via synchronous services or requested by further messaging via queues.
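As an illustration of the publish/subscribe mode, here is a sketch of a topic subscriber, again with the classic JMS API; the topic name „example.events“ and the way the ConnectionFactory is obtained are assumptions.

import javax.jms.*;

public class EventListener {
    // sketch: subscribe to a topic; every current subscriber receives each message
    public static void listenForEvents(ConnectionFactory factory) throws JMSException {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("example.events");
        MessageConsumer consumer = session.createConsumer(topic);
        consumer.setMessageListener(message -> {
            try {
                System.out.println("event: " + ((TextMessage) message).getText());
            } catch (JMSException e) {
                e.printStackTrace();
            }
        });
        // messages are only delivered after the connection has been started
        connection.start();
    }
}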

Generally JMS in Java has different implementations: usually there are those coming with the application servers, and there are also some standalone implementations. They can be operated via the same interface, at least as long as we restrict ourselves to the common set of functionality. So we can exchange the JMS implementation for the whole platform (which is a nightmare in real life), but we cannot mix them, because the wire protocols are usually incompatible. There is now something like a standard network protocol for messaging (AMQP), which is followed by some, but not all implementations.

As skeptical as I am about the Java Enterprise Edition, I do find the JMS part of enterprise Java very interesting and worth exploring for projects whose size and characteristics justify it.


Clojure Exchange 2016

I have just visited Clojure Exchange. Since it had only one track, there is no point in listing the talks I attended; they can easily be seen on the web page of the conference.

It was interesting, there were many great talks, and I also met great people among the other participants.


Devoxx 2016 Visit

As already written in Devoxx 2016, I visited Devoxx in Antwerp in 2016.

Hot topics were Java 9 and the functional features of Java 8, but there was a wide range of talks. As in previous years, visitors can afterwards watch online all the talks that they missed or found interesting enough to re-watch. In earlier years this was done with „Parleys“ and was only available to visitors or to those who paid for it, while it is now available on YouTube for everybody. Since the conference had been sold out long before it started, this does not seem to stop people from buying tickets.

So here is what I did.
Wednesday:

Thursday:

Friday:

Find Links here….

I guess that’s it for today… I hope to visit Antwerp for Devoxx next year again.


Devoxx 2016

I am going to the Devoxx in Antwerp 2016.
Updates about what I did will follow soon.

As a starter, here is my Devoxx talk. Let this be the main content of this posting, which is mostly video instead of text. Here is the GitHub repo with the code examples.

Other Links:


How to calculate transcendental functions

There is sometimes a need to calculate transcendental functions like \sin, \exp, \log or \tan^{-1}. We get them from the library, and the library relies on implementations in the CPU for most of them. This is true if we want them in the „double“ format, which is the standard way of doing floating point arithmetic. But it can be interesting how they can be calculated to a given precision, or how to calculate functions that are not in the library and not easily composed from library functions. There are many ways to do this, and actually the naïve way of using the Taylor series

    \[f(x) = \sum_{j=0}^\infty a_j (x-x_0)^j\]

is often not such a bad idea, if done correctly.
We know from math what to use for the coefficients a_j and for which ranges of x the series converges. For limited fixed precision it is possible to tune the coefficients a bit and get better results with a fixed number of summands. For arbitrary precision we need to be more flexible and cannot prepare for one exact precision.
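To illustrate this, here is a minimal sketch in Java of the Taylor series for \exp using BigDecimal. It assumes the argument has already been reduced to a small range like |x| \le \frac{1}{2}, so that the terms shrink quickly, and it works with a few extra digits internally; the class name ExpTaylor is made up for the example.

import java.math.BigDecimal;
import java.math.MathContext;

public class ExpTaylor {
    // sketch: exp(x) as the sum of x^j/j!, assuming |x| is already small
    public static BigDecimal exp(BigDecimal x, MathContext mc) {
        // use some extra precision for the intermediate results
        MathContext inner = new MathContext(mc.getPrecision() + 10);
        BigDecimal threshold = BigDecimal.ONE.movePointLeft(inner.getPrecision());
        BigDecimal term = BigDecimal.ONE; // x^0/0! = 1
        BigDecimal sum = BigDecimal.ONE;
        for (int j = 1; term.abs().compareTo(threshold) > 0; j++) {
            // term_j = term_{j-1} * x / j
            term = term.multiply(x, inner).divide(BigDecimal.valueOf(j), inner);
            sum = sum.add(term, inner);
        }
        return sum.round(mc);
    }
}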

Now mathematically we can often have a converging series, for example if we have

    \[f(x) = \sum_{j=1}^\infty \frac{x^j}{j^2}.\]

This converges for |x|\le 1, which can be proved easily, but the convergence is not necessarily computer friendly: for |x|=1 it converges slowly. To give an idea, if we are calculating with 100 digits after the decimal point, then at j=10^{50} the single terms are still in the area of our desired precision, and since they shrink only slowly, we would have to go much further. This is impossible to use.

As a rule of thumb, the coefficients are not our friends. They may or may not converge towards zero, but we really have to rely on the (x-x_0)^j part to get diminishing summands. A good idea is to require |x-x_0| \le \frac{1}{2} if the coefficients are bounded, which they usually are in real life examples; that means there is a bound C>0 such that |a_j|<C for each j. So we absolutely need to use some mathematical knowledge about the function in order to get reasonable convergence.

In the case of periodic functions like the trigonometric functions, we can normalize x to values within one period, but that reduces x or x-x_0 only to a range like [-\pi, \pi). Using common trigonometric identities, we can reduce this further to the range [0, \pi/2], which is still not good enough. Here we have to use formulas like \sin(3x)=3\sin(x)-4\sin^3(x) and similar formulas for the other trigonometric functions, which allow us to move to smaller values of x. For the exponential function we have an even easier way. Let n be a natural number such that |\frac{x}{n}| < \frac{1}{2}. Then we let y=\frac{x}{n} and calculate z=e^y=\exp(y). Now we have \exp(x)=e^x=e^{ny}=(e^y)^n=z^n, so we just need to take the n-th power of the intermediate result. This can be done using algorithms like square-and-multiply or even some improvements over that.
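A sketch of this reduction for \exp in Java might look as follows. It reuses the hypothetical ExpTaylor sketch from above for the small argument, chooses n as a power of two so that the n-th power reduces to repeated squaring, and assumes x lies in a moderate range so that n fits into an int.

import java.math.BigDecimal;
import java.math.MathContext;

public class ExpReduced {
    // sketch: exp(x) = (exp(x/n))^n with n a power of two and |x/n| < 1/2
    public static BigDecimal exp(BigDecimal x, MathContext mc) {
        MathContext inner = new MathContext(mc.getPrecision() + 10);
        BigDecimal half = new BigDecimal("0.5");
        BigDecimal y = x;
        int n = 1;
        while (y.abs().compareTo(half) >= 0) {
            y = y.divide(BigDecimal.valueOf(2), inner);
            n *= 2;
        }
        BigDecimal z = ExpTaylor.exp(y, inner); // the Taylor sketch from above
        // z^n by repeated squaring, since n is a power of two
        BigDecimal result = z;
        for (int k = n; k > 1; k /= 2) {
            result = result.multiply(result, inner);
        }
        return result.round(mc);
    }
}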

In the end we end up writing a lot of code for different cases, optimized in different ways for the given function. For example the power function p(x,y)=x^y has two parameters and quite wild behavior, and writing an implementation that provides reasonable performance and precision requires handling a lot of cases. Just look at the power function of the standard Java library, which is written in native C code. Its beauty is not conciseness, but with some understanding of what it takes to do this well, you might eventually appreciate the given implementation, not only when using it, but also when reading it.

Dealing with the precision is a delicate question, which again requires mathematics. As a general rule, we usually need more precision for intermediate results than for the final result. A good tool is to take the derivative, or the partial derivatives in the case of functions with multiple parameters, to see how much changes of a parameter influence changes of the value. Taylor’s theorem gives definite, but possibly hard to apply answers. And it can also be useful to look at lower and upper bounds for the operations performed.

When writing such functions, unit tests are a big deal. Often they are not so hard to write: we can rely on inverse functions, or we can increase the precision and check that the lower precision result is at least as precise as it claims to be. In some cases existing implementations for double can be used to check whether the calculation is correct at smaller precisions.
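For example, a test of the hypothetical ExpTaylor sketch from above against the double implementation of the standard library could look like this (JUnit 4):

import java.math.BigDecimal;
import java.math.MathContext;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class ExpTest {
    @Test
    public void compareWithDouble() {
        // the 30 digit result should agree with Math.exp up to double rounding
        for (double x = -0.5; x <= 0.5; x += 0.01) {
            BigDecimal result = ExpTaylor.exp(new BigDecimal(x), new MathContext(30));
            assertEquals(Math.exp(x), result.doubleValue(), 1.0e-14);
        }
    }
}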

Most of all it is important to think and use some mathematics, or to get help from somebody with appropriate knowledge.

Just to give you a hint: there are tons of transcendental functions that do not exist in standard libraries and that may be interesting to use. For some of them there are libraries. For others we still need to find libraries or write them.


Unit Testing in a non-perfect World

Test Driven Development

We all know how good test driven development is and that we should move in that direction.

How much coverage

There are some serious obstacles. Most of all, we have an obligation to actually finish software, and resources are usually limited. If they were not limited by money and time constraints, they would hit the limit of efficient team sizes and organizational structures.

We can just look at a simple application that does „CRUD“ operations. Ideally we start with a known data set and reset the database to exactly this content before starting the tests, maybe even before each single test… If we have a huge and well managed server farm to run the tests, this may be possible. For the „read“ methods we need to write some tests that succeed in reading, probably performing a few reads with a single read method to cover different outcomes of a successful read or different parameter combinations. Then there are unsuccessful reads that just do not find anything and return null or an empty result collection, or even fail with an exception. It is also of interest to check the maximum and minimum allowed values, if there are such limits. So we end up writing five to a few dozen test methods for a single read method. And this is the simple case.

For delete and update we should create our own data at the beginning of the test. Probably there are dependencies and constraints in conjunction with other data, so it is necessary to cover these as well. Create and update actually need a variety of at least two values for each of the most simple attributes of the created data object, just to deal with not-null constraints. Usually we have more constraints on attributes, concerning lengths, value ranges and some kind of compatibility with other data. So there will be up to around ten tests for each attribute of the created or updated entity, with both successful and unsuccessful operations expected. In total we will end up writing hundreds or even thousands of unit test methods just to obtain the most basic coverage of a relatively simple „CRUD“ application.

Writing many similar tests is not so difficult, and it would be interesting to explore ways to cut down on the repetitive work involved: using less verbose languages for writing the tests, generating them partially with scripts, or simply writing very powerful helper methods in the test class that just get called with slightly different parameters to do all the tests, as in the sketch below. It will be a lot of work to write the tests in any case. I think 60% of the time for the unit tests and 40% for the actual code is a reasonable estimate for a relatively fair coverage of most of the code.
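Just as an illustration of the helper method idea, here is a hypothetical sketch in Java with JUnit 4; the repository class, the method findByName and the test data are all assumptions, not a real API.

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class CustomerRepositoryTest {

    private final CustomerRepository repository = new CustomerRepository(); // hypothetical

    // one powerful helper, called with slightly different parameters
    private void checkFindByName(String name, int expectedCount) {
        assertEquals(expectedCount, repository.findByName(name).size());
    }

    @Test public void findExistingName() { checkFindByName("Smith", 2); }
    @Test public void findMissingName()  { checkFindByName("Nobody", 0); }
    @Test public void findEmptyName()    { checkFindByName("", 0); }
}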

In practice we should really prioritize our unit testing efforts, because spending 2.5 times as much time on the whole thing as on the code itself is simply not always possible. On the other hand, the time we save in the long run with good testing is even more than what we spend, if we do the unit test development well.

But there are some aspects to think of:

  • Which parts of the application are fairly stable?
  • Which parts of the application are used and relied on heavily by other parts of the application?
  • Which parts of the application are used a lot by end users?
  • Which parts of the application are high risk because they have more inner complexity?
  • Which parts of the application actually showed errors? Fix the errors by writing a test to expose them first.
  • Which parts of the application are high risk in terms of reputation, money loss or data loss if they go wrong?
  • Which parts of the application are undergoing internal changes, while retaining the API?
  • Which parts of the application are migrated to another platform, OS, DB, architecture …, while retaining the API?

It is good to focus primarily on areas based on these questions and to do reduced testing for areas that are less critical.
The first question is quite delicate, because it exposes a contradiction we need to cope with. We should be agile and change the application easily when the requirements or the architecture are understood better. But with tests, even this effort multiplies by 2.5 or whatever factor we have for updating the unit tests. Or worse, it leads to disabling the unit tests or to a loss of agility. In areas that change quickly it may be better to write the complete set of tests only once they have become relatively stable.

Database

The next issue is the database. Typical organizations like to provide one DB instance and schema for the whole development team, because database instances and schemata are seen as expensive resources. They are hard to maintain, and for various reasons it is often difficult to install a local database on each developer’s machine. If it is Oracle, DB2 or MS SQL Server, some know-how is needed to install it, and maybe there are even some constraints in terms of the OS. MariaDB and PostgreSQL seem to be somewhat easier to install and involve fewer license issues, but even that is an effort. This can be overcome by virtualization: an image with the DB setup can be developed once and then copied to each team member. There are interesting and good ways to do something like this, so it is becoming less of an issue, but it is still very unusual.

There are two ways out of this. One way is to use different DB products for development and production. This is somewhat dangerous, because databases are so different that today’s common abstractions do not hide the differences, and we also might pay a high price in terms of performance if we do not use DB-specific features. So it requires extra development effort to support both DB types, and it remains very important to regularly run tests against the DB that is used in production anyway. Still, it may be helpful to move part of the tests to such a similar-but-not-equal local environment.

The regular development DB is unfortunately often shared between many developers. If tests run simultaneously from different development machines against the same DB, they will usually interfere, and some tests will fail just because of that; not all the time, but sporadically. This can be avoided by some team organization and some kind of reservation of the DB, but that is painful, so we just run the tests and, if they fail, assume it was someone else testing at the same time. It is possible to write the tests in such a way that they can withstand this, but compared to the effort of using a virtual image with a working DB instance, this extra effort is not justifiable.

So what we should aim for is a dedicated DB schema for each developer. Ideally it should use the DB software product used for production. It can be local, on some DB server or in a virtual image.


Modular Arithmetic

We have some articles in this blog about the integer types of typical programming languages and how they work. Time to introduce the underlying mathematical concepts, which have been covered implicitly until now, since they are also interesting in many other respects. And besides, this is a very beautiful area of mathematics.

Mathematics as we learn it in school is mostly inspired by what is needed for physics. This was quite a good choice 100 years ago, because it gave some motivation to why we do certain things, and it was the area where math was applied; of course also chemistry and engineering, but these use somewhat similar aspects of mathematics as physics. Now physics and chemistry make use of quite interesting areas of mathematics like group theory or non-Euclidean geometry, but these are advanced areas beyond what we typically learn in school, at least in the countries where I went to school. So school is about real numbers, some trigonometry, real analysis (calculus) and maybe complex numbers.

For more than 50 years mathematics has been heavily used in informatics as well; if we abstract informatics away from computers, even longer, because for example algorithms and cryptography have been in use for several thousand years, but they were a small niche and only became mainstream through the existence of computers. For informatics and computer science we need different areas of mathematics. Analysis is not so important, though not irrelevant. One area is information theory, which is based on probability theory and statistics. Numerical calculation has to a great extent remained a domain of mathematics itself, so this connection may be strong, but it is applied mathematicians using computers and using knowledge from IT to program them better, not the other way round. Still, numerical analysis is somewhat important, but not really what most of us need very often.

The areas of mathematics that are really interesting for informatics are discrete mathematics, algebra and number theory. There is enough material about this on the web, but for now we will deal with modular arithmetic, which lies in the intersection of discrete mathematics, algebra and number theory.

We start with the integral numbers:

    \[{\Bbb Z} = \{\ldots,-3, -2, -1, 0, 1, 2, 3, 4,\ldots\}\]

Now we take any positive integral number m \in {\Bbb N} with m \ge 2.
We say that two integral numbers x and y are congruent modulo m:

    \[x \equiv y \pmod m\]

if and only if x-y can be divided by m. We might also say that there is a k \in {\Bbb Z} such that y = x + k m.
Now we can make interesting observations:
We assume, that we have pairs of numbers such that

    \[u \equiv v \pmod m\]

and

    \[x \equiv y \pmod m\]

Then we can observe that also

    \[u+x \equiv v+y \pmod m\]

    \[u-x \equiv v-y \pmod m\]

    \[u\cdot x \equiv v\cdot y \pmod m\]

This can be proven easily.
We assume as above

    \[y = x + k \cdot m\]

and similarly

    \[v = u + l \cdot m\]

Then we have

    \[y+v = x+u+(k+l)m \equiv x+u \pmod m\]

    \[y-v = x-u+(k-l)m \equiv x-u \pmod m\]

    \[yv = xu+(xl+ku+klm)m \equiv x\cdot u \pmod m\]

We call the set of all numbers of \Bbb Z that are congruent to each other a remainder class and write this as

    \[\bar x = x + m{\Bbb Z}\]

There are exactly m remainder classes modulo m and usually we use a representation system of

    \[0,1,\ldots,m-1\]

or for even m we often use

    \[-\frac{m}{2}, -\frac{m}{2}+1,\ldots,-1,0,1,\ldots,\frac{m}{2}-1\]

or for odd m we often use

    \[-\frac{m-1}{2}, -\frac{m-1}{2}+1,\ldots,-1,0,1,\ldots,\frac{m-1}{2}\]

We encounter these representation systems when we do division with remainder, written as % in many programming languages, but it is necessary to do some quick research on which representation system % uses and which one we want, and possibly to adjust the result. The corresponding division may then not be /, but we can obtain it by subtracting our remainder from the dividend and dividing that, which is an exact division. A sketch of this adjustment follows below.
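In Java, for example, % takes the sign of the dividend, so -7 % 3 yields -1. A small sketch of adjusting this to the representation system 0,1,\ldots,m-1, together with the matching division:

public class Mod {
    // remainder in the representation system 0..m-1 (for m > 0);
    // since Java 8 the standard library offers Math.floorMod for the same purpose
    public static int mod(int x, int m) {
        int r = x % m;
        return r < 0 ? r + m : r;
    }

    // the corresponding division: subtract the remainder, then divide exactly
    public static int div(int x, int m) {
        return (x - mod(x, m)) / m;
    }
}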

Now we need to define a ring. A ring R is a set with operations + and \cdot such that the following rules apply:

  1. For any members x, y \in R we also have x+y \in R, x-y \in R and x\cdot y \in R. This is usually not mentioned, because it is part of how we define these operations in the first place in most mathematical texts.
  2. Addition is commutative: For any members x, y \in R we have x+y=y+x.
  3. Addition has a neutral element 0: For any member x \in R we have x+0=0+x=x.
  4. Addition has inverse elements: For any member x \in R we have a member x'\in R such that x+x'=x'+x=0. Usually we write -x for this inverse element of x and we write x-y instead of x+(-y).
  5. Addition is associative: For any members x, y, z \in R we have (x+y)+z=x+(y+z). We can omit the parentheses here and write x+y+z instead.
  6. Multiplication has a neutral element 1: For any member x \in R we have x\cdot 1=1\cdot x=x.
  7. Multiplication is associative: For any members x, y, z \in R we have (x\cdot y)\cdot z=x\cdot (y\cdot z). We can omit the parentheses here and write x\cdot y\cdot z or even xyz instead.
  8. Multiplication in conjunction with addition is distributive: For any members x, y, z \in R we have (x + y)\cdot z = x\cdot z + y\cdot z and z\cdot (x+y)=z\cdot x + z\cdot y.

If the multiplication is also commutative, we call it a commutative ring. If there is a multiplicative inverse for any element other than 0, we call it a skew field. And if both conditions hold, we call it a field.

Now we can see that \Bbb Z is actually a commutative ring.

And these remainder classes modulo m also form a ring. We call it {\Bbb Z}/m{\Bbb Z} or sometimes also {\Bbb Z}_m, but I do not use the second form, because it is ambiguous with something else (p-adic numbers). If m=p is a prime number, then {\Bbb Z}/p{\Bbb Z} is actually a field and in this case we may write {\Bbb F}_p instead of {\Bbb Z}/p{\Bbb Z}. Or GF(p) in some literature, if you prefer that. Why is it a field?

Now there is an extension of the Euclidean algorithm that calculates, along with the gcd of two numbers, coefficients u and v such that g=\gcd(x,y)=ux+vy. So these numbers exist. For a prime number p and a remainder class \bar x \ne 0 we know that x is not a multiple of p, and since p is prime we have

    \[1=\gcd(x,p)=ux+vp.\]

This yields a multiplicative inverse for \bar x, because

    \[u\cdot x \equiv 1 \pmod p.\]
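A sketch of the extended Euclidean algorithm in Java, yielding the coefficients u and v and thus the multiplicative inverse modulo a prime p:

public class ExtendedEuclid {
    // returns {g, u, v} with g = gcd(a, b) = u*a + v*b
    public static long[] extendedGcd(long a, long b) {
        long oldR = a, r = b;
        long oldU = 1, u = 0;
        long oldV = 0, v = 1;
        while (r != 0) {
            long q = oldR / r;
            long t;
            t = r; r = oldR - q * r; oldR = t;
            t = u; u = oldU - q * u; oldU = t;
            t = v; v = oldV - q * v; oldV = t;
        }
        return new long[] { oldR, oldU, oldV };
    }

    // inverse of x modulo a prime p: the u with u*x ≡ 1 (mod p)
    public static long inverseMod(long x, long p) {
        long u = extendedGcd(x, p)[1];
        return Math.floorMod(u, p); // normalize into 0..p-1
    }
}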

Now we often encounter m as a power of 2, and this modular arithmetic, at least for +, - and *, is what is sold to us as the integer arithmetic of Java, C or C#.

On the other hand it can be interesting and useful to use modular arithmetic for other values of m. Most interesting are prime numbers, which can be relatively small like 2, 3 or 5, but also really big. For non-prime m we have zero divisors, that is numbers x, y \not\equiv 0 \pmod m such that x\cdot y \equiv 0 \pmod m. This breaks a fundamental property that we are used to from integers and fields, but it is perfectly correct for such a modular ring.

In our daily life modular arithmetic is actually quite common: we have the week days with m=7, the hours of the clock with m=12 or m=24, the minutes and seconds of the clock with m=60, and quite a bit of m=2, which we do not really see as modular arithmetic, but rather as boolean arithmetic with + being the „exclusive or“, \cdot being the „and“, etc.
