# Phone Numbers and E-Mail Addresses

Most of the data that we deal with are strings, numbers or booleans, and combinations of these into classes and collections. Dates can be expressed as strings or numbers, but they have enough specific logic to be seen as a fourth group of data. All these have interesting aspects, some of which have been discussed in this blog already.

Now phone numbers are, in a naïve approach, numbers or strings, but very soon we see that they have their own specific aspects. The same applies to email addresses, which can be represented as strings.

Often projects go by their own „simplified“ specification of what an email address or a phone number is and of how to parse, compare and render them. At the end of the day the simplification is harder to tame than the real solution, because it needs to be maintained and specified by the project team rather than being based on a proven library. And once in a while „edge cases“ occur that cannot be ignored and that make the „home grown“ library even more complex.

Behind phone numbers and email addresses there are well defined and established standards, but they are hard to understand thoroughly within the constrained time budget of a typical „business project“, because that time should be allocated to enhancing the business logic and not to reinventing the basics. Unless there is a real need to do so, of course.

Just to give an idea: When phone numbers are parsed or provided by user input, they can start with a „+“ sign or use some country specific logic to express which country they belong to. And then the „+1“, for example, does not stand for the United States alone, but also for Canada and some smaller countries that are in some way associated with the United States or Canada. Further analysis of the number is required to find out which one it is. The prefix for international numbers is often „00“, but in the United States it is „011“, and there were and are some other variants that are still frequently used. Some people like to write something like „+49(0)431 77 88 99 11 1“ instead of „+49 431 77 88 99 11 1“. We can constrain the input to the variants we happen to think of and force the supplier of data to comply, but why bother? Why not accept legitimate formats, as long as they are correct and unambiguous?

Now for email addresses there is the famous one-page regular expression for recognizing correct email addresses, which is not even totally complete by itself. Find it at the bottom of the article…

Of course it includes some rarely used variants of email addresses that were once used and have not been completely abolished officially, but it is hard to draw an exact border for this.

So the general recommendation is to find a good library for working with email addresses and phone numbers. Maybe the library can even, to some extent, eliminate input strings that formally comply with the format but are known to be incorrect, by knowing about numbering schemes worldwide or about email domains, or even by performing lookups.

Another strong recommendation is to store data like email addresses and phone numbers in a technical format. In the example of phone numbers this means always starting with a „+“, followed by digits only. For input, any positioning of spaces is accepted; for output, the library knows how to format the number correctly. This allows selecting by phone number without dealing with complex formatting, by just using the technical format in the query as well.
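To make the idea of a technical format a bit more tangible, here is a minimal sketch in Python. It is deliberately naïve and only illustrates the target format („+“ followed by digits); the function name `to_technical_format`, the parameter `international_prefix` and the accepted separators are assumptions made for this illustration. It knows nothing about numbering plans, so in a real project a proven library should do this job.

```python
import re


# Hypothetical helper for illustration only; not part of any library.
def to_technical_format(raw: str, international_prefix: str = "00") -> str:
    """Reduce user input to the technical format: '+' followed by digits only."""
    # Drop the optional trunk zero that some people write as "+49(0)431 ...".
    s = raw.strip().replace("(0)", "")
    # Drop spaces and a few common separators (assumed set).
    s = re.sub(r"[\s()./-]", "", s)
    # Replace the international prefix, e.g. "00" in many countries, "011" in the United States.
    if s.startswith(international_prefix):
        s = "+" + s[len(international_prefix):]
    # E.164 allows at most 15 digits; the lower bound of 7 is a loose assumption.
    if not re.fullmatch(r"\+[1-9][0-9]{6,14}", s):
        raise ValueError(f"not an unambiguous international number: {raw!r}")
    return s


print(to_technical_format("+49(0)431 77 88 99 11 1"))  # +49431778899111
print(to_technical_format("0049 431 77 88 99 11 1"))   # +49431778899111
```

Because both spellings end up as the same string, a query can normalize the search term in the same way and compare strings directly.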

For Java (and thus for many JVM languages), C++ and JavaScript there is an excellent library from Google for dealing with phone numbers, libphonenumber. For email addresses something like the EmailValidator of Apache Commons Validator is a way to go.

Keep in mind that for email addresses and phone numbers, the ultimate way of verification is to send a link or a code to the address or number, which the user then needs to open or enter. At the end of the day it is insufficient to rely only on formal verification without this final step.

But still issues remain for transforming data into a canonical technical format for storing them, formatting data for display etc. And there is a huge added value, if we can reliably recognize formally false entries early, when the user can still easily react to it, rather than waiting for an email/SMS/phone call being processed, which may fail when the user is no longer on our „registration site“. And we can process data which has already been verified by a third party, but still we want to parse it to recognize obvious errors.

The concrete libraries may be outdated by the time you are reading this, or they may not be applicable for the language environment that you are using, but please make an effort to find something similar.

So, please use good libraries, which are likely to be found for the environment that you are using, and write yourself what creates value for your project or organization. Unless your goal is really to write a better library. It is better to invest the time into areas where there are still no good libraries around.

And as always, you may understand email addresses and phone numbers as an example for a more general idea.

## E-Mail Regex

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t] )+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?: \r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:( ?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\0 31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\ ](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+ (?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?: (?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z |(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n) ?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\ r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n) ?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t] )*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])* )(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t] )+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*) *:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+ |\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r \n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?: \r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t ]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031 ]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\]( ?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(? :(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(? 
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\". \000-\031]+(?:(? :(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)? [ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]| \\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<> @,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|" (?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t] )*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\ ".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(? :[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\". \000- \031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|( ?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,; :\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([ ^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\" . \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[\ ]\r\$|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\ [\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\ r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\] |\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\". \0 00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\ .|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@, ;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(? :[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])* (?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\". 
\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[ ^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$ ]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*( ?:(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\ ".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:( ?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[ $"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t ])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t ])+|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(? :\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+| \Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?: [^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[\ ]]))|"(?:[^\"\r\$|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n) ?[ \t])*(?:@(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$" ()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n) ?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<> @,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@, ;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t] )*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\ ".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)? (?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\". \[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?: \r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[$"()<>@,;:\\".\[$]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t]) *))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". 
\000-\031]+(?:(?:(?:\r\n)?[ \t]) +|\Z|(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\ .(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\". \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z |(?=[$"()<>@,;:\\".\[$]))|$([^\[$\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:( ?:\r\n)?[ \t])*))*)?;\s*)

# Ranges of Dates and Times

In software we often deal with ranges of dates and times.

Let us look at it from the perspective of an end user.

When we say something like „from 2019-03-07 to 2019-03-10“ we mean the set of all timestamps $t$ such that

$$\text{2019-03-07} \le t \le \text{2019-03-10}$$

or more accurately:

$$\text{2019-03-07T00:00:00} \le t < \text{2019-03-11T00:00:00}$$

It is important that we mean to include the whole 24 hour day of 2019-03-10. By the way, please try to get used to the ISO date format even when writing normal human readable texts, it just makes sense…

Now when we are not talking about dates, but about times or instants of time, the interpretation is different.
When we say something like „from 07:00 to 10:00“ or „from 2020-03-10T07:00:00+TZ to 2020-04-11T09:00:00+TZ“, we actually mean the set of all timestamps $t$ such that

$$\text{07:00} \le t < \text{10:00}$$

or

$$\text{2020-03-10T07:00:00+TZ} \le t < \text{2020-04-11T09:00:00+TZ}$$

respectively. It is important that we have to add one day in the case of date-only ranges (accuracy of one day), and that we do not in the case of finer grained date/time information. Whether the upper bound is included or not does not matter much in our everyday life, but it turns out that the most useful convention is usually not to include the upper bound. If you prefer to have all options, it is a better idea to employ an interval library, i.e. to find one or to write one. But for most cases it is enough to exclude the upper limit. This guarantees disjoint adjacent intervals, which is usually what we want. I have seen people write code that adds 23:59:59.999 to a date and compares with $\le$ instead of $<$, but this is an ugly hack that needs a lot of boilerplate code and a lot of time to understand. Use the exclusive upper limit, because we have it.
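The effect of excluding the upper limit can be illustrated with a few lines of Python. This is just a sketch; the helper name `in_range` and the sample intervals are made up for this illustration.

```python
from datetime import datetime


def in_range(t: datetime, lower: datetime, upper: datetime) -> bool:
    """True if lower <= t < upper, i.e. the upper bound is excluded."""
    return lower <= t < upper


# Two adjacent intervals, 07:00 to 10:00 and 10:00 to 13:00.
morning = (datetime(2020, 3, 10, 7, 0), datetime(2020, 3, 10, 10, 0))
midday = (datetime(2020, 3, 10, 10, 0), datetime(2020, 3, 10, 13, 0))

boundary = datetime(2020, 3, 10, 10, 0)
# An instant after 09:59:59.999, which a "<= 09:59:59.999" comparison would miss.
late = datetime(2020, 3, 10, 9, 59, 59, 999500)

print(in_range(boundary, *morning), in_range(boundary, *midday))  # False True
print(in_range(late, *morning))                                   # True
```

The boundary instant belongs to exactly one of the two adjacent intervals, and no instant between them is lost.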

Now the requirement is to add one day to the upper limit to get from the human readable form of date-only ranges to something computers can work with. It is a good thing to agree on where this transformation is made, and to do it in such a way that it behaves correctly even on those dates where daylight saving starts or ends, because adding one day might actually mean „23 hours“ or „25 hours“. If we need to be really accurate, sometimes leap seconds need to be taken into account.

Yet another issue has come up here: local time is much harder than UTC. We need to work with local time on all kinds of user interfaces for humans, with very few exceptions, for example for pilots, who actually work with UTC. But local date and time is ambiguous for one hour every year and at least a bit special to handle on the two days where daylight saving starts and ends. Convert dates to UTC and work with that internally. And convert them to local date and time on all kinds of user interfaces where it makes sense, including documents that are printed or provided as PDFs, for example. When we work with dates without time, we need to add one day to the upper limit and then round it to the nearest midnight of our time zone, or know when to add 23, 24 or 25 hours, respectively. We do not want to know this ourselves, so we need to use modern time libraries like the java.time classes in Java, for example.

Working with date and time is hard. It is important to avoid making it harder than it needs to be. Here are some recommendations:

• Try to use UTC for the internal use of the software as much as possible
• Use local date or time or date and time in all kinds of user interfaces (with few exceptions)
• Add one day to the upper limit and round it to the nearest midnight of local time exactly once in the stack
• Exclude the upper limit in date ranges
• Use ISO-date formats even in the user interfaces, if possible
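The recommendations above can be sketched in Python with the standard library. This is only an illustration under some assumptions: the function names are made up, and it assumes that midnight exists and is unambiguous in the given time zone, which is not guaranteed everywhere.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo
from zoneinfo import ZoneInfo


def date_range_to_utc(first_day: date, last_day: date, tz: tzinfo):
    """Turn an inclusive range of local dates into a half-open range of UTC instants."""
    lower = datetime.combine(first_day, time.min, tzinfo=tz)
    # Add one day to the upper limit and take local midnight of that day.
    upper = datetime.combine(last_day + timedelta(days=1), time.min, tzinfo=tz)
    return lower.astimezone(timezone.utc), upper.astimezone(timezone.utc)


def hours_in_local_day(day: date, tz: tzinfo) -> float:
    """Length of one local calendar day in hours, derived from the UTC range."""
    lower, upper = date_range_to_utc(day, day, tz)
    return (upper - lower).total_seconds() / 3600


def example_zurich() -> float:
    """Daylight saving started on 2020-03-29 in this zone (needs time zone data installed)."""
    return hours_in_local_day(date(2020, 3, 29), ZoneInfo("Europe/Zurich"))
```

With time zone data available, `example_zurich()` should yield 23.0 rather than 24.0, which is exactly the case where naïvely adding 24 hours would go wrong. The conversion happens in one place, and everything behind it works with UTC and an exclusive upper limit.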

# Functional Scala London 2019

In December 2019 I attended the conference Functional Scala in London, which was initiated and managed by John de Goes. See Skillsmatter about what happened to Scala Exchange. Of course a large part of the conference was related to ZIO, which seems to be a very dynamic part of the ecosystem surrounding Scala.

It was a single track conference with a lot of talks, so I have attended all of them:
Day 1 (2019-12-12)

• KEYNOTE: XS — A Collections CLI [Paul Phillips] (Video)
• Introduction to Interruption [Jakub Kozlowski] (Video)
• Making Algorithms work with Functional Scala [Karl Brodowsky] (Video)
• Solving the Scala Notebook Experience [Jeremy Smith & Jonathan Indig] (Video)
• Mixing Scala & Kotlin [Alexey Soshin] (Video)
• Prototyping the Future with Functional Scala [Mike Kotsur] (Video)
• Test Effects: First Class [Adam Fraser] (Video)
• Let’s Gossip! [Dejan Mijic & Przemyslaw Wierzbicki] (Video)
• Ray Tracing with ZIO [Pierangelo Cecchetto] (Video)
• Invertible Programs [Sergei Shabanau] (Video)
• Hyper-pragmatic Pure FP Testing with DIStage-Testkit [Pavel Shirshov & Kai] (Video)
• KEYNOTE: Unleash Your Fury [Jon Pretty] (Video)

Day 2 (2019-12-13)

• Modern Data-Driven Applications with ZIO Streams [Itamar Ravid] (Video)
• Functional Architecture [Piotr Golebiewski] (Video)
• ZIO Chunk: A Fast, Pure Alternative to Arrays [Aleksandra A. Holubitska]
• Caliban: Designing a Functional GraphQL Library [Pierre Ricadat] (Video)
• Macros and Environmental Effects [Maxim Schuwalow] (Video)
• Streaming Analytics with Scala and Spark [Bas Geerdink] (Video)
• ZIO Actors [Mateusz Sokol] (Video)
• Adventures in Type-safe Error Handling [Jacob Wang]
• Composition using Arrows and Monoidal Categories [Oleg Nizhnik]
• Practical Logic(al) Programming with Dotty [Lander Lopez]
• Next-Level Type Safety: An Intro to Generalized Algebraic Data Types [Matthias Berndt]
• KEYNOTE: The Many Faces of Modularity [Eric Torreborre]

See Agenda

Maybe I will write more about some topics.

Talks will be on YouTube in the near future.

# Visit to reClojure in London 2019

On 2019-12-02 I visited the conference reClojure.

This was an admirable community effort to create a replacement for Clojure Exchange, which simply did not happen because of the bankruptcy of Skillsmatter.

There was only one track, so the schedule is exactly what I visited.

I will just copy it below, because schedules from conference sites usually disappear after some time:

• Building stuff with Clojure and 3D Printing. Clément Salaün.
How to design objects with Clojure, OpenSCAD and then 3D print them. This talk covers the motivations, basic concepts and features with a live demo.
• Clojure Art. Karl Brodowsky.
Teaching or learning Clojure using images has been proven to be fun and beneficial! In this talk, learn how.
• Growing Mobile Apps with ClojureScript and React Native. Daniel Neal.
Starting things is fun, but growing them can be a real challenge – and mobile apps are no different…
• Live Coding a Mandelbrot Renderer. Peter Westmacott.
In this talk, Peter will demonstrate live coding of a fractal renderer, with the aim to show how complex beauty can emerge from simple mathematical rules and a little code.
• Pizza Party Lunch (Thank You uSwitch!)
Short 10 minute talks. Various Speakers.
• Unleash the power of the REPL. Dana Borinski.
Return to basics and dive into how to leverage the REPL to solve problems and debug more quickly – and with the added bonus of honing our Clojure skills!
• Generating Generators. Andy Chambers.
Generating data for use in tests can be laborious and boring. However, using the database’s information schema you can alleviate that! Discover the ways to achieve this.
• Living in a Box. Life in Containers with the JVM. Matthew Gilliard.
A focus on how containers and the JVM interact and what implications are there for Clojure Developers. Get the best results from the work gone into OpenJDK container support.
• Closing Keynote – Code, meet data! Malcolm Sparks.
Computers have 3 jobs: Input, process, output. How have we made such a mess of something so fundamental? Observations, opportunities for Clojurists and hope for the future.

There is a YouTube channel for reClojure, where we can now find recordings of the talks.

# How to get rid of these HTML-entities in Files

It has been written here that HTML-entities (these &auml; etc.) should be avoided, with the exception of those that we need due to the HTML syntax, like &lt;, &gt;, &amp; and maybe &quot; and &nbsp;. They were already mostly obsolete more than 20 years ago, but in those days we still did not automatically use UTF-8 or UTF-16, but often an 8-bit character encoding that could express only up to 256 characters, in reality around 200 due to control characters. At least these 200 could be used. That was enough for web pages in those days, and texts in German, French, Russian, Greek, Hebrew, Arabic and many other languages could well be written, as long as only one language or a few similar languages were used. For the rare occasions that required some characters that were not in this character set, it was an option to rely on these HTML-entities. Or for typing HTML pages on a US keyboard without any good tool support.

But now Unicode has been around for more than 25 years and more than 90% of the web pages use UTF-8.

Now some people think that these HTML-entities are kind of necessary or at least „safer“, and I still see people writing HTML code with them these days. Or tools by relatively well known companies that produced such output not so long ago… It is a good thing to have some courage and to change something like this to a readable and natural format. Or, more generally, to try out whether a simpler or better solution works. Reasonable courage is good for this; too much of a good thing can go bad, as so often…

So, please teach your colleagues not to use these ugly HTML-entities where UTF-8 characters are the better option.

And here is a Perl script that converts the HTML-entities, with the exceptions mentioned above, to UTF-8. In the project conversion-utils some more such scripts might be added. The script is a bit too long to be pasted inline in a code block, so it is better to find the current version on GitHub.

Then you can do something like this:

    git commit
    for file in *.html ; do
        echo $file
        mv $file ${file}~entities~
        html2utf8 < ${file}~entities~ > $file
        echo /$file
    done
    git diff

to convert all files in a directory. I assume that you are using Linux or at least have bash like for example in cygwin.
There are other tools to do the same thing, I am sure. Just use anything that works for you to get away from this unreadable crap.
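For readers who prefer Python, the same idea can be sketched with the standard library. This is not the Perl script mentioned above, just an illustration; the name `entities_to_utf8` and the exact set of protected characters are assumptions based on the exceptions listed at the beginning of this article.

```python
import html
import re
from html.entities import html5

# Entities that are needed due to the HTML syntax and therefore stay untouched.
KEEP_NAMES = {"lt", "gt", "amp", "quot", "nbsp"}
# Characters that must never appear unescaped as a result of a conversion.
PROTECTED_CHARS = {"<", ">", "&", '"', "\u00a0"}

ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def entities_to_utf8(text: str) -> str:
    """Replace HTML-entities by the characters they stand for, except the protected ones."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in KEEP_NAMES:
            return match.group(0)
        # Leave unknown named entities as they are.
        if not name.startswith("#") and name + ";" not in html5:
            return match.group(0)
        decoded = html.unescape(match.group(0))
        # Numeric forms like &#60; would otherwise turn into markup.
        if decoded in PROTECTED_CHARS:
            return match.group(0)
        return decoded

    return ENTITY.sub(replace, text)


print(entities_to_utf8("Gr&uuml;&szlig;e &amp; K&auml;se &lt;b&gt;"))
```

The sketch works on a string; reading and writing the files as UTF-8 is left to the caller.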

# Devoxx UA and Devoxx BE 2019

In 2019 I visited Devoxx UA in Kiev and Devoxx BE in Antwerp.
Traveling was actually a little story by itself, so for now we can just assume that I magically was at the locations of DevoxxUA and DevoxxBE.

In Kiev I attended the following talks:

On Wednesday I attended the following talks in Antwerp:

On Thursday I attended the following talks in Antwerp:

On Friday I attended the following talks in Antwerp:

That’s it…
As always, a lot of these topics deserve an article in this blog. And a lot of video recordings from the conference are worth viewing.

# Travelling to Devoxx UA and Devoxx BE 2019

Travelling to this year's Devoxx conferences is worth its own short article, even though it is not very IT oriented material…

On the last weekend in October I brought a bicycle to a place near Frankfurt, let’s call it FrankfurtE. I took the train to Karlsruhe and then back from FrankfurtE, but cycled from Karlsruhe to FrankfurtE.

On Wednesday evening 2019-10-30 I took a night train from Basel to Hannover and then a day train to some place in North Rhine-Westphalia for private reasons, then in the evening a flight from Düsseldorf to Vienna and from there to Kiev. There was a problem with the plane, so we returned to Vienna. Austrian Airlines did a good job on rebooking, because they told us that they would find different alternative connections for all of us and send them to our phones, if possible. So there was no need to stand in a line for hours. Plus a decent hotel next to the terminal was booked. I had to translate all this information to Russian, because many of the passengers were only comfortable with Ukrainian and Russian and the guys in the airport only with German and English.

Then on 2019-11-01 I flew back from Vienna to Frankfurt and then finally to Kiev, where I could visit the second half of the first day of Devoxx UA and the second day.

On Sunday 2019-11-03 I flew from Kiev to Frankfurt, picked up my bicycle in FrankfurtE and took a train to Trier, where I stayed for the night.

On Monday 2019-11-04 I cycled to Marche-en-Famenne.

On Tuesday 2019-11-05 I cycled to Antwerp, where I participated in Devoxx Belgium.

On Friday 2019-11-08 I cycled to Lanaken.

On Saturday 2019-11-09 to Bastogne.

And on Sunday 2019-11-10 to Luxembourg, from where I took the train home to Switzerland.

Just to give an idea, it is absolutely possible to use a bicycle as a means of transport on a business trip, but it has to make sense by not consuming more than one or two working days plus weekends. So it is basically necessary to go relatively long distances of 150 to 200 km on a full day and not to spend more time than necessary on breaks. And most of the time it is the right choice to use the bigger highways, at least the biggest that are not forbidden to use, because beautiful, quiet scenic routes are usually longer and would be too time consuming. It is not a vacation, it remains a business trip. Then, on the other hand, this is not such a bad idea, because it really gives some time for thinking about some of the more interesting talks while cycling.

# Company „Skillsmatter“ stops operations

The company Skillsmatter in London has been put „under administration“ and has basically stopped its operations. The web site seems to suggest that everything is still ok, but that is not the case, and I have heard so from several sources. The owner Wendy Devolder writes on Twitter and on Linkedin. Or here are some more news from cbronline or from theregister. The administrator is Resolve. They had set a deadline of 2019-11-05 for potential buyers, and nothing indicates that such a buyer could be found.

There are some hopes expressed that either 10’000 people will donate 250 GBP each or that someone buys the company and keeps it afloat. Realistically, this is probably not going to happen.

Now it is hard to obtain further reliable information. Have the employees already been laid off? Have all conferences been cancelled, for example Clojure Exchange (ClojureX) and Scala Exchange (ScalaX)?

The websites mention nothing about it, but simply the fact that there is nothing mentioned indicates that the employees, who could update the site, are gone and that the conferences will probably not take place. Otherwise I would expect an update on the site mentioning that it is taking place in spite of the situation. In case of Clojure Exchange I have been informed by other participants that Clojure Exchange has been canceled and that there will probably be a „community conference“ instead. Being a speaker, I volunteered to perform my talk on this community conference instead.

In case of Scala Exchange there was a strange story. A keynote speaker, John de Goes, was „uninvited“ because of „inclusiveness“. As a result, he decided to create a competing conference, Functional Scala, at exactly the same time as Scala Exchange and also in London. Some speakers have reportedly decided to speak at Functional Scala instead of Scala Exchange and speakers were encouraged to do so. In the end this might come out as a good thing, because Functional Scala will probably take place and might be an option for those who have already booked their visit to Scala Exchange.

So what does all of this mean? If we are heading for bankruptcy of Skillsmatter and if the conferences (Clojure X for sure, Scala X probably) are canceled, we as speakers or simply visitors are entitled to a refund for our ticket or, as speakers, for our non refundable travel expenses to the extent that Skillsmatter would have covered them. But realistically there will not be enough money left for this. A company can go bankrupt and still have funds that are hard to access, but in practice banks will help out if these funds can be documented. So in reality bankruptcy usually means that there are many debts and little money already. Now the salaries of the employees get the highest priority. When they have been paid, other open payments can be covered, according to the rules that apply in the country. Possibly the price for the ticket that has already been paid is simply lost. Possibly travel expenses are lost if they cannot be redirected to another event.

If you would like to donate 250 GBP and 10’000 people do so too, the company could continue. I do not think that this is going to happen.

I will keep you informed if I learn more about the issue that is interesting to potential conference visitors and speakers of events organized by skillsmatter.

Update 2019-11-12: I got in contact with the administrators. They did not want to confirm or deny that the conferences scheduled in December would take place. They just do not know, but it seems to be depending on finding a buyer. If a magical buyer appears and decides to reactivate the events, they might take place. Meanwhile all web pages of skillsmatter show a text that the company is „under administration“, so I guess each day it is getting less likely that there will be anything a buyer can reactivate. I know for sure that at least some employees have already been asked to leave.

Now the good news: The replacements for Scala Exchange and Clojure Exchange are already in place, meaning a conference about the same programming language at the same date and also in London. So if you have booked your hotel and your trip to London already, you might want to check them out:

Update 2019-12-09:
Scala Exchange is not going to happen. See web page.
And since so much time has passed, it is becoming unlikely that a buyer turns up, so the company will be gone.

Update 2020-02-12:
The company found a buyer and will start working again. (see comment)

# Checked Exceptions in Java

In Java it is possible to declare a method with a „throws“-clause. For certain exceptions, that are not extending „RuntimeException“ or „Error“, this is actually required.

What looked like a good idea 25 years ago has proven to be a dead end. I do not know of any other major programming language that opts for declaring exceptions in this way. Slightly newer frameworks extend all their exceptions from RuntimeException, thus avoiding the need to declare them. Even in relatively early Java there was a weird way of working with exceptions in EJB, when it was required to write an interface and an implementation for the EJB. But it was strongly discouraged to let the implementation implement the interface, because it threw different exceptions. It was not the only weird thing about early EJB, of course. But without checked exceptions it would at least have been possible to let the implementation implement its interface.

We are now able to use Java 13, and lambdas have been available since Java 8. With the introduction of lambdas the declared exceptions became especially painful, and for this reason even Oracle has created unchecked twins deriving from RuntimeException for some essential exceptions, especially for IOException (UncheckedIOException).

We should face it: The throws clause has turned out to be a mistake and we should avoid this mistake by just using exceptions that do not have to be declared, at least in our APIs. It is not the only mistake, see Criticism of Java. Some of my other favorites are the lack of operator overloading for numeric types, the weird concept of Serializable and the lack of natively immutable collections and the lack of a convenient way to write some collections as code. But these issues are being worked on and we will eventually see some progress.

# How to recover the Borrow Bit

In a similar way as the carry bit for addition, it is possible to recover the borrow bit for subtraction, just based on the highest bits of the three numbers that we deal with during the operation.

With this program, a subtraction operation of an 8-bit CPU can be simulated exhaustively:

    #!/usr/bin/perl

    my ($x, $y, $bi);
    my %tab = ();
    # try all combinations of incoming borrow bit and 8-bit operands
    for ($bi = 0; $bi <= 1; $bi++) {
        for ($x = 0; $x < 256; $x++) {
            for ($y = 0; $y < 256; $y++) {
                my $zz = $x - $y - $bi;
                my $b = $zz < 0 ? 1 : 0;       # outgoing borrow bit
                my $c = 1 - $b;
                my $z = ($zz + 256) & 0xff;    # visible 8-bit result
                my $xs = $x >> 7;              # highest bits
                my $ys = $y >> 7;
                my $zs = $z >> 7;
                my $key = "$xs:$ys:$zs";
                $tab{$key} //= $b;
                my $bb = $tab{$key};
                # report if the highest bits do not determine the borrow bit
                if ($bb != $b) {
                    print "b=$b bb=$bb c=$c xs=$xs ys=$ys zs=$zs x=$x y=$y z=$z zz=$zz bi=$bi\n";
                }
            }
        }
    }

    # compare the observed borrow bit with the formula based on the highest bits
    for my $key (sort keys %tab) {
        $key =~ m/(\d+):(\d+):(\d+)/;
        my ($xs, $ys, $zs) = map { $_ + 0 } ($1, $2, $3);
        my $b = $tab{$key};
        my $c = 1 - $b;
        my $bb = $xs & $ys & $zs | !$xs & ($ys | $zs);
        print "b=$b bb=$bb c=$c xs=$xs ys=$ys zs=$zs\n";
    }


This gives an idea of what is happening. In real life a 64-bit CPU is probably used, but the concepts work the same way with longer or shorter CPU words.

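So we subtract two unsigned 64-bit integers $x$ and $y$ and an incoming borrow bit $i \in \{0, 1\}$ to a result

$$z \equiv x - y - i \pmod{2^{64}}$$

with

$$0 \le z < 2^{64}$$

using the typical „long long“ of C. We assume that

$$x = 2^{63} x_h + x_l$$

where

$$x_h \in \{0, 1\}$$

and

$$0 \le x_l < 2^{63}.$$

In the same way we assume $y = 2^{63} y_h + y_l$ and $z = 2^{63} z_h + z_l$ with the same kind of conditions for $x_h$, $y_h$, $z_h$ or $x_l$, $y_l$, $z_l$, respectively.

Now we have

$$-2^{63} \le x_l - y_l - i < 2^{63}$$

and we can see that

$$x_l - y_l - i = z_l - 2^{63} u$$

for some

$$u \in \{0, 1\}.$$

And we have

$$x - y - i = z - 2^{64} b$$

where

$$b \in \{0, 1\}$$

is the borrow bit.
When combining we get

$$2^{63} (x_h - y_h - u) + z_l = 2^{63} z_h + z_l - 2^{64} b.$$

When looking just at the highest visible bit and the borrow bit, this boils down to

$$x_h - y_h - u = z_h - 2b.$$

This leaves us with eight cases to observe for the combination of $x_h$, $y_h$ and $u$: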

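| $x_h$ | $y_h$ | $u$ | $z_h$ | $b$ |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 | 1 |
| 0 | 1 | 0 | 1 | 1 |
| 0 | 1 | 1 | 0 | 1 |
| 1 | 0 | 0 | 1 | 0 |
| 1 | 0 | 1 | 0 | 0 |
| 1 | 1 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 1 |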

Or we can check all eight cases and find that we always have

$$b = (x_h \wedge y_h \wedge z_h) \vee (\neg x_h \wedge (y_h \vee z_h))$$

So the result does not depend on $u$ anymore, allowing to calculate the borrow bit by temporarily casting $x$, $y$ and $z$ to (signed long long) and using their sign.
We can express this as „use $y_h \wedge z_h$ if $x_h = 1$ and use $y_h \vee z_h$ if $x_h = 0$“.

The incoming borrow bit $i$ does not change this, it just allows for $x_l - y_l - i \ge -2^{63}$, which is sufficient for making the previous calculations work.
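The recovery of the borrow bit from the highest bits can also be tried out for 64-bit words in Python, where integers are unbounded, so that the true result is available for comparison. This is a small sketch and not the C library mentioned below; the name `sbb64` is made up for this illustration.

```python
MASK64 = (1 << 64) - 1


def sbb64(x: int, y: int, borrow_in: int = 0):
    """Subtract with borrow on simulated unsigned 64-bit words.

    Returns the visible 64-bit result and the outgoing borrow bit, where the
    borrow bit is recovered only from the highest bits of x, y and the result.
    """
    z = (x - y - borrow_in) & MASK64
    xh, yh, zh = x >> 63, y >> 63, z >> 63
    # borrow if all three highest bits are set, or if x_h is 0 and y_h or z_h is set
    borrow_out = (xh & yh & zh) | ((1 - xh) & (yh | zh))
    return z, borrow_out


print(sbb64(0, 1))             # (18446744073709551615, 1)
print(sbb64(1 << 63, 1))       # (9223372036854775807, 0)
print(sbb64(MASK64, MASK64, 1))  # (18446744073709551615, 1)
```

In C the same expression would use the sign of the values cast to signed long long instead of the shift by 63.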

The basic operations add, adc, sub, sbb, mul, xdiv (div is not available) have been implemented in this library for C. Feel free to use it according to the license (GPL). Addition and subtraction could be implemented in a similar way in Java, with the weirdness of declaring signed longs and using them as unsigned. For multiplication and division, native code would be needed, because Java lacks 128bit-integers. So the C-implementation is cleaner.