Hidden CPUs

How many CPUs does your computer have?

If we go way back, we find that computers already had ancillary CPUs a long time ago. The floppy disk drive of the C64 had a CPU very similar to the one in the computer itself, but very little memory, and it was hard, though not impossible, to make use of it. I never really tried. PC keyboards had CPUs as well; reportedly a Z80 or 8080 or something like that was built into them. I never bothered to find out.

Now this concept is not at all new, but was already used 35 years ago. So the question is whether our computers still have such hidden CPUs. This seems to be the case, and it is easy to search for „hidden CPUs“ or „secret CPUs“. It would be extremely strange to expect anything different. They do not provide compute power for us, but just run and manage hardware that appears to be just hardware from the point of view of the main CPU, which is the one we can program. So why not just consider this as hardware, ignore the „secret“ or „hidden“ CPUs and see them as an implementation detail of the hardware? That is a very legitimate approach and, to be honest, what we do most of the time.

The issue is more delicate now, because these hidden CPUs can access the internet, even when the computer is turned off or seems to be offline. There are tools to analyze the network traffic and to detect this. But we should start to become aware of this invisible world that is potentially as dangerous as visible malware. And this applies to all kinds of devices, especially cell phones, tablets, routers, TV-sets and all „things“ that have their own CPU power and network access…



When to use Scala and Ruby

There are many interesting languages that have their sweet spots and of course a larger set of languages than just two should be considered for new projects.

But Ruby and Scala are both very interesting languages that did not just pick up and sell concepts that were already known, but brought them to a new level and to new beauty. Interestingly, both were started by a single person and finally became community projects.

There are some differences to observe.

Ruby is mostly a dynamic language, which means that it is easier and more natural to change the program at runtime. This is not necessarily a bad thing, and different Lisp variants including today’s Clojure have successfully used and perfected this kind of capability for many decades. Consequently more things happen at runtime. In particular, dynamic typing is used, which means that types are attached to values and checked only at runtime.

Scala is mostly a static language, which means that all program structures have to be created at compile time. But this has been brought to perfection in the sense that a lot of things that are typically available only in dynamic languages, can be done. The type system is static and it is in this sense more consistent and more rigorous than the type system of Java, where we sometimes encounter areas that cannot reasonably be covered by Generics and fall back to the old flavor of untyped collections. This does not happen too often, but the static typing of Scala goes further.

In general this gives more flexibility to Ruby and makes it somewhat harder to achieve similar things in a static way in Scala. But the compile-time type system of course helps to match things, to find a certain portion of errors and even to make the program more self-explanatory without relying on comments. It is hard to support Scala properly in IDEs, but the most common IDEs have achieved this to a very useful level. The benefit of static typing should not be overvalued, because there are enough errors that cannot be detected by just using common types. It is always possible to define more specific types which include tight constraints and thus check for certain errors rigorously at compile time, but the built-in types and the types from common libraries are too convenient and the effort for this is too high, so it does not seem to be the usual practice.

In any case it is a recommended practice to achieve good coverage of non-trivial functionality with automated tests. They implicitly cover the type errors that the compiler detects in Scala, but of course only to the level of the test coverage. Ruby has less overhead to compile and run: we just write the program and run it, while Scala needs a somewhat time-intensive compile step. If tests are included, this does not make much of a difference, because whether we just run the tests or precede them with a compile job is a minor difference.

An interesting feature of Ruby is called „monkey patching“. This means that it is possible to change methods of an existing class or even of a single object. This can be extremely powerful, but it should be used with care, because it changes the behavior of the class in the whole program and can break libraries. Usually this is not such a bad thing, because it is mostly used for adding new methods, not for changing existing ones. Then it causes problems only when two conflicting monkey patches occur in different libraries. But for big programs with many libraries there is some risk in this area. Scala tries to achieve the same by using „implicit conversions“. A conversion rule is implicitly in scope, and when a method is called on an object whose type does not define it, the adequate conversion is applied before the method call. This works at compile time. Most of the time it is effectively quite similar to monkey patching, but it is a bit harder to tame, because writing and providing implicit conversions is more work and harder to understand than writing monkey patches. On the other hand, Scala avoids the risks of Ruby’s monkey patching.

An increasingly important issue is making use of multiple CPU cores. Scala, especially in combination with Akka, is very strong on this. It supports a reasonably powerful and tamable programming model for using multiple threads. The C or Java SE way is very powerful, but it is quite difficult to avoid shooting oneself in the foot, and even worse, there is a high likelihood that such errors show up in production, in times of heavy load, while all testing seemed to go well. This is the way to go in some cases, but it requires a lot of care, a lot of thinking and a team of skillful developers. There are more developers who think they belong to this group than are actually able to do this well. Of course Scala already filters out some less skilled developers, but still I think its approach with Akka is more sound.
Ruby on the other hand has very little support for multithreading and cannot as easily make use of multiple cores by using threads. While the language itself does support the creation of threads, for many years the major implementation did not actually run multiple threads at the same time. This propagated into the libraries, so this will probably never become a strength of Ruby. The way to go is to actually start multiple processes. This is not so bad, because the overhead of processes in Ruby is much less than in JVM languages. Still this is an important area and Scala wins this point.

Concerning web GUIs, Ruby has Rails, which is a really powerful and well established way to do this. Scala provides Play, which in a way transfers a lot of concepts from Rails and similar frameworks to Scala. It is OK to use it, but Rails is much more mature and more mainstream. So I would give this point to Ruby. Rails includes Active Record, about which I do have doubts, but this is really not a necessary component of a pure web GUI, but more a backend functionality…

So in the end I would recommend using Scala and Akka for the solution if it is anticipated that a high throughput will be needed. For smaller solutions I would favor Ruby, because it is a bit faster and easier to get it done.

For larger applications a multi-tier architecture could be a reasonable choice, which opens up combinations. The backend can be done with Scala. If server-side rendering is chosen, Ruby and Rails with REST calls to the backend can be used. The alternative is a single page application written in JavaScript or some language compiling to JavaScript, again with REST calls to the backend.


Carry Bit, Overflow Bit and Signed Integers

It has already been explained how the carry bit works for addition. A comment then asked how it works for negative numbers.

The point is that the calculation of the carry bit does not depend on the sign. The nature of the carry bit is that it is meant to be used for the less significant parts of the addition. So assume we add two numbers x and y that have k and l words, respectively. We set n=\max(k,l) and make sure that x and y are both n words long by providing the necessary number of 0-words in the most significant positions (for a negative signed number these padding words need to have all bits set to 1 instead, so that the value is preserved). Now the addition is performed as described by starting with a carry bit of 0 and adding with carry x[0]+y[0], then x[1]+y[1] and so on up to x[n-1]+y[n-1], where x[0] is the least significant word and x[n-1] the most significant word. Each addition includes the carry bit from the previous addition. Up to this point, it does not make any difference whether the numbers are signed or not.

Now for the last addition, we need to consider whether our result still fits in n words or whether we need one more word. In the case of unsigned numbers we just look at the last carry bit. If it is 1, we just add one more word in the most significant position with the value of 1, otherwise we are already done with n words.
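The unsigned case described so far can be sketched in Java as follows. This is only an illustration under assumptions: the class and method names are made up, words are taken to be 32 bits wide and stored in an `int[]` with the least significant word first, and the carry is obtained by doing each step in a `long`.

```java
import java.util.Arrays;

class MultiWordAdd {
    /** Adds two unsigned numbers given as 32-bit words, least significant word first. */
    static int[] addUnsigned(int[] x, int[] y) {
        int n = Math.max(x.length, y.length);
        // Pad both operands with 0-words in the most significant positions.
        int[] a = Arrays.copyOf(x, n);
        int[] b = Arrays.copyOf(y, n);
        int[] sum = new int[n];
        long carry = 0;
        for (int i = 0; i < n; i++) {
            // Mask to 32 bits so that each word is treated as unsigned.
            long t = (a[i] & 0xFFFFFFFFL) + (b[i] & 0xFFFFFFFFL) + carry;
            sum[i] = (int) t;   // low 32 bits of this step
            carry = t >>> 32;   // carry bit for the next step, 0 or 1
        }
        if (carry == 0) {
            return sum;         // result fits in n words
        }
        // Last carry is 1: add one more word with value 1 in the most significant position.
        int[] extended = Arrays.copyOf(sum, n + 1);
        extended[n] = 1;
        return extended;
    }

    public static void main(String[] args) {
        int[] x = {0xFFFFFFFF, 0xFFFFFFFF}; // 2^64 - 1 as two words
        int[] y = {1};
        System.out.println(Arrays.toString(addUnsigned(x, y))); // prints [0, 0, 1]
    }
}
```

Real CPUs provide the carry as a flag of the add-with-carry instruction, so the widening to `long` is just a way to make the flag visible in a language that does not expose it.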

In the case of signed integers, we should investigate what can possibly happen. The input for the last step is two signed words and possibly a carry bit from the previous addition. Assuming we have m-bit words, we are adding numbers between -2^{m-1} and 2^{m-1}-1 plus an optional carry bit c. If the numbers have different signs, an overflow cannot occur and we can be sure that the final result fits in at most n words.

If both are non-negative, the most significant bits of x[n-1] and y[n-1] are both 0. An overflow happens if and only if the sum x[n-1]+y[n-1]+c \ge 2^{m-1}, which means that the result „looks negative“, although both summands were non-negative. In this case another word with value 0 has to be provided for the most significant position n to express that the result is \ge 0 while keeping the bits already calculated, which are correct. It cannot happen that real non-zero bits go into this new most significant word. Consequently the carry bit can never become 1 in this last addition step.

If both are negative, the most significant bits of x[n-1] and y[n-1] are both 1. An overflow happens if and only if the sum of the signed words x[n-1]+y[n-1]+c \lt -2^{m-1}, which means that the result „looks positive or 0“, although both summands were negative. In this case another word with value 2^m-1 or -1, depending on the viewpoint, has to be prepended as new most significant word. In this case of two negative summands the carry bit is always 1.

Now typical microprocessors provide an overflow flag (called „O“ or more often „V“) to deal with this. So the final addition can be left as it is in n words if the overflow bit is 0. If it is 1, we have to signal an overflow or we can just provide one more word. Its value depends on the carry flag: it is 0 for C=0 and has all bits set to 1 (2^m-1 or -1, depending on the viewpoint) for C=1.

The overflow flag can be calculated by o=(\mathrm{signbit}(x[n-1]) = \mathrm{signbit}(y[n-1])) \land (\mathrm{signbit}((x[n-1]+y[n-1]+c) \bmod 2^m) \ne \mathrm{signbit}(x[n-1])).
There are other ways, but they lead to the same results with approximately the same or more effort.
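As a rough sketch of this rule for 8-bit words, the following Java code computes result, carry flag and overflow flag of one add-with-carry step and derives the value of the additional word. The class and method names are made up for illustration, and the example 100 + 100 is not taken from the text above.

```java
class SignedAddFlags {
    /**
     * One 8-bit add-with-carry step on the most significant words.
     * Returns {result byte (0..255), carry flag, overflow flag}.
     */
    static int[] addWithCarry(int x, int y, int carryIn) {
        int ux = x & 0xFF;          // unsigned view of the 8-bit words
        int uy = y & 0xFF;
        int total = ux + uy + carryIn;
        int result = total & 0xFF;  // (x + y + c) mod 2^8
        int carry = total >>> 8;    // 0 or 1
        int signX = ux >>> 7;
        int signY = uy >>> 7;
        int signR = result >>> 7;
        // Overflow: both summands have the same sign and the result has the other one.
        int overflow = (signX == signY && signR != signX) ? 1 : 0;
        return new int[] {result, carry, overflow};
    }

    /** Additional most significant word; only meaningful if the overflow flag is 1. */
    static int extensionWord(int carry) {
        return carry == 0 ? 0x00 : 0xFF; // 0xFF is -1 in the signed view
    }

    public static void main(String[] args) {
        int[][] examples = {{65, -1}, {-64, -65}, {-1, -1}, {100, 100}};
        for (int[] e : examples) {
            int[] r = addWithCarry(e[0], e[1], 0);
            System.out.printf("%d + (%d): result=%d carry=%d overflow=%d%n",
                    e[0], e[1], (byte) r[0], r[1], r[2]);
        }
    }
}
```

Passing 1 as `carryIn` allows trying the cases that include a carry from a less significant word.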

The following table shows the possible combinations and examples for 8-Bit arithmetic and n=1:

x<0 or x≥0 | y<0 or y≥0 | (x+y)%2^8 <0 or ≥0 | Overflow Bit | Carry Bit | additional word needed | value additional word | Examples (8bit)
x≥0        | y<0        | ≥0                 | 0            | 0 or 1    | no                     | -                     | 65 + (-1)
x≥0        | y<0        | <0                 | 0            | 0 or 1    | no                     | -                     | 7 + (-8)
x<0        | y≥0        | ≥0                 | 0            | 0 or 1    | no                     | -                     | -9 + 12; -1 + 127
x<0        | y≥0        | <0                 | 0            | 0 or 1    | no                     | -                     | -128 + 127; -1 + 0
x<0        | y<0        | ≥0                 | 1            | 1         | yes                    | -1                    | -64 + (-65)
x<0        | y<0        | <0                 | 0            | 1         | no                     | -                     | -1 + (-1); -1 + (-127); -64 + (-64)

If you like, you can try out examples that include the carry bit and see that the concepts still work out as described.



Source Code of Apple-iOS leaked

It seems that parts of the source code of Apple’s iOS 9 have leaked via GitHub. They might have been removed from there by the time you are reading this, but they will probably be passed around on the internet anyway.

Some sources say that this is a risk to security. It might be, but in the end cryptography specialists tend to consider the availability of the source code an advantage for security, because it can be analyzed by everyone, and vulnerabilities can be found, published and of course more easily corrected if the source is available to everyone. Hiding the source code is some kind of „security by obfuscation“, which is not really a strong mechanism. Security should be based on verifiably secure mechanisms, as successfully applied by Linux and other open source operating systems. But this advantage might not fully apply if the sources are just passed around in somewhat closed circles and are not easily available to the general public.

This does not make iOS open source, because the licenses that Apple imposes on their software are still valid and to my understanding they do not make this part of the system open source, which means much more than just being able to read the source code of a certain version that might already be outdated. Please observe that if the source code that you might find on GitHub is really coming from Apple, their original license applies and not the one mentioned on GitHub.

To put jailbreaking somewhere near security breaches is wrong, because this is an action done by the owner of the device with his or her own device at his or her own risk. Everyone should have the right to do so and there should be nothing wrong with making it easier. I know, we are not living in a perfect world…

So please relax. If Apple has done a good job, there will not be too bad exploits and if they are still doing a good job, they will quickly fix any exploits that show up. And if you like to have an open source system, you should still consider using something else.



Java Properties Files and UTF-8

Java uses a nice pragmatic file format for simple configuration tasks and for internationalization of applications. It is called Java properties file or simply „.properties file“. It contains simple key value pairs. For most configuration tasks this is useful and easy to read and edit. Nested configurations can be expressed by simply using dots („.“) as part of the key. This was already introduced in Java 1.0. For internationalization there is a simple way to create properties files with almost the same name, but a language code just before the .properties-suffix. The concept is called „resource bundle“. Whenever a language specific string is needed, the program just knows a unique key and performs a lookup.

The unpleasant part of this is that these files are, in the style of the 1990s, encoded in ISO-8859-1, which covers only a few languages in western, central and northern Europe. For other languages, as a workaround, a \u followed by the 4-digit hex code can be used to express a UTF-16 code unit, but this is not in any way readable or easy to edit. Usually we want to use UTF-8 or in some cases real UTF-16, without this \u-hack.

A way to deal with this is using the native2ascii-converter, which can convert UTF-8 or UTF-16 to the format of properties files. By using some .uproperties-files, which are UTF-8, and converting them to .properties-files using native2ascii as part of the build process, this can be addressed. It is still a hack, but properly done it should not hurt too much, apart from the work it takes to get this working. I would strongly recommend making sure that the converted and unconverted files never get mixed up. This is extremely important, because a mix-up is not easily detected in the case of UTF-8 with typical central European content, but it creates the ugly errors that we are used to seeing, like „sch�ner Zeichensalat“ instead of „schöner Zeichensalat“. We only discover it when the files are already quite messed up, because at least in German the umlaut characters are only a small fraction of the text, but still annoying if messed up. So I would recommend another suffix to make this clear.
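To make the conversion step more concrete, here is a minimal Java sketch of what such a converter does to a single string. It is not the native2ascii tool itself, and the class and method names are made up: every character outside 7-bit ASCII is replaced by a backslash, the letter u and the four hex digits of its UTF-16 code unit.

```java
class UnicodeEscaper {
    /** Escapes every non-ASCII UTF-16 code unit in the style used by properties files. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 128) {
                sb.append(c);                                 // plain ASCII stays as it is
            } else {
                sb.append(String.format("\\u%04x", (int) c)); // backslash, u, 4 hex digits
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The o-umlaut is written as an escape in the source to be independent of the file encoding.
        System.out.println(escape("greeting=sch\u00f6ner Zeichensalat"));
        // prints greeting=sch\u00f6ner Zeichensalat
    }
}
```

In a build, the same transformation would be applied to whole .uproperties-files read as UTF-8 and written as .properties-files.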

The bad thing is that most JVM-languages have been kind of „lazy“ (which is a good thing, usually) and have used some of Java’s infrastructure for this, and thus inherited the problem from Java.

Another way to deal with this is to use XML-files, which are actually by default in UTF-8 and which can be configured to be UTF-16. With some work on development or search of existing implementations there should be ways to do the internationalization this way.

Typically some process needs to be added, because translators are often non-IT-people who use some tool that displays the texts in the original languages and accepts the translation. For good translations, the translator should actually use the software to see the context, but this is another topic for the future. Possibly there needs to be some conversion from the data provided by the translator into XML, uproperties, .properties or whatever is used. These should be automated by scripts or even by the build process and merge new translations properly with existing ones.

Anyway, Java 9 will be helpful in this issue. Finally, properties files that are used as resource bundles for internationalization can be UTF-8 in Java 9.
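For properties that are loaded directly rather than as resource bundles, the encoding can also be chosen explicitly by passing a `Reader` to `Properties.load`. The following sketch assumes the content is available as UTF-8 bytes; the class and method names are only for illustration, and a real program would read from a file or the classpath instead of a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class Utf8PropertiesDemo {
    /** Loads properties from UTF-8 encoded content instead of the ISO-8859-1 default. */
    static Properties loadUtf8(byte[] utf8Content) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Content), StandardCharsets.UTF_8)) {
            props.load(reader); // the Reader determines the encoding
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The o-umlaut is written as an escape in the source to be independent of the file encoding.
        byte[] content = "greeting=sch\u00f6ner Zeichensalat\n".getBytes(StandardCharsets.UTF_8);
        Properties props = loadUtf8(content);
        System.out.println("sch\u00f6ner Zeichensalat".equals(props.getProperty("greeting"))); // prints true
    }
}
```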




Almost every non trivial application uses in some way a database.

For many years this has usually been Oracle, DB2 or MS-SQL-Server, depending mostly on the habits and on the religious orientation of the organization that developed or ran the application. These days all three are available for Linux and MS-Windows. DB2 is also available for z/OS. The „home-platforms“ of these three are probably Linux, z/OS and MS-Windows, respectively (2018).

We saw Teradata as an alternative to DB2 and Oracle for data warehouses. They run on huge amounts of data, but are really invisible to most of us. Maybe the data warehouse is the old „big data“, before the invention of the term.

We saw a big hype about NoSQL databases, and some interesting DB products from this group have successfully established themselves.

We saw MySQL (and its fork MariaDB) mostly for database-installations that had somewhat lower requirements on the DB-product in terms of features or in terms of performance. Actually Wikipedia runs on MySQL or MariaDB and that is quite a big installation with heavy user load, but it is mostly about reading.

PostgreSQL was often positioned „somewhere between Oracle and MySQL“.

PostgreSQL 10 just came out. The most important new features were replication on a per table basis, better partitioning of large tables and better support for clustering.

I have worked with all of the database technologies listed here and have even given trainings for MongoDB, Oracle and PostgreSQL.

So where is PostgreSQL positioned really in this landscape?

It is a good database product for a large and growing class of applications. I find it slightly more pleasant to work with than the other four SQL databases mentioned here, because the SQL-implementation and its extensions are powerful, clean and behave more or less as expected. Some minor positive points are the default usage of the ISO date format and the distinction between NULL and empty string. On the other hand, most stuff that works in Oracle at SQL level can easily be transferred to PostgreSQL. The psql-shell works like typical Linux shells in terms of command line editing and history. So a lot of minor details are just pleasant or as they should be.

Comparing to the three groups of contenders:


NoSQL databases kind of leave the mainstream of transactional relational SQL-databases and either provide us with some interesting special features or promise us performance gains or support for huge database sizes. The price for this is that we lose an extremely mature, clever and powerful query language, which SQL is. I would go for a NoSQL product if its additional feature cannot reasonably be duplicated in PostgreSQL or other SQL-DBs and if it is really useful for the job. I would also go for a NoSQL-DB-product if the required data sizes and performance cannot reasonably be achieved using an SQL-product like PostgreSQL with good tuning of hardware, OS, database and application logic, but can actually be achieved with the NoSQL-product. These applications exist and it is important to pick the right NoSQL-DB for the project. It should be observed that PostgreSQL has a lot of features beyond what normal SQL-databases have and looking into this area might be useful… A typical strength of some NoSQL-databases (like Cassandra and MongoDB) is that powerful replication is kind of trivial to set up, while it is a really big story for typical transactional SQL databases… This is due to the transactional feature, which adds complexity, difficulty and a performance penalty to some kinds of replication…


I do not hold it against MySQL that it belongs to Oracle, because MariaDB is an independent fork outside of Oracle and can be used instead of MySQL.
I do think that MySQL does not quite reach the level of PostgreSQL in terms of features and cleanness. And we can get PostgreSQL for the same price as MySQL or MariaDB. So why not go for the better product? Even if MariaDB perfectly fits today, the application will grow and at some point it will prove useful to be based on PostgreSQL. I came across the issue of nested transactions some years ago. They were easily supported by PostgreSQL, but not at all by MariaDB. Issues like that are more likely to come up this way than the other way around.

Oracle, DB2, MS-SQL-Server

Especially Oracle makes many long term loyal customers run away due to their pricing and licensing practices. While it is extremely hard to change the database of a non trivial database based application, at least new applications in many organizations are discouraged from using Oracle, unless they can make a point why they really need it. MS-SQL-Server might absorb some of these, especially since it is now available on Linux servers. But what Oracle does now might very well be the policy of Microsoft or IBM in a few years, so it makes perfect sense to have a serious look at PostgreSQL. A reasonably well tuned PostgreSQL will work pretty much as well as a reasonably well tuned Oracle, DB2 or MS-SQL-Server. Features that are missing now are being added with new releases. Some interesting features make it just a bit more pleasant to use than for example Oracle. It just feels more modern and more Linux-like.

Btw. there were some more contenders in the space of commercial transactional SQL-databases like Adabas D, Sybase and Informix. SAP bought the database product Adabas D in 1997 and the whole Sybase company in 2010, in two more or less unsuccessful attempts to have their own database and not have to use their competitors’ products as database, but they seem to have some success in using HANA now. Informix has been bought by IBM and is still offered as an alternative to DB2. I would say that these products have lost their relevance.


So I do recommend seriously considering PostgreSQL as a DB product. It is currently my favorite in this space, but there is no universal tool that fits everything.

Some random aspects to keep in mind when moving from Oracle to PostgreSQL are mentioned here…

Types CLOB and BLOB do not exist. They can mostly be replaced by types TEXT and BYTEA, but it is not exactly the same. The type TEXT, which is a somewhat unlimited variable length string can easily be used for columns where we would try to use VARCHAR2 in Oracle, which gives us the advantage that we do not have to worry about defining a maximum length or exceeding the 4k limit that Oracle imposes on VARCHAR2.

Empty Strings are not the same as NULL in PostgreSQL, they are in Oracle.

PostgreSQL has a boolean type. Please use it and get rid of the workaround using CHAR, VARCHAR2 or NUMBER as replacement.

Oracle only had one kind of transaction isolation that was really well supported and I think this is still the way to go. It is an excellent choice and is very close to „repeatable read“, while PostgreSQL uses by default „read committed“, but it can be brought to use „repeatable read“. Please keep this in mind to avoid very unpleasant surprises and use the transaction isolation level appropriately.

The structuring of PostgreSQL consists of DB-instances, usually only one on a virtual or physical server, which somewhat resemble what a database is in Oracle. Within a DB-instance, it is possible to define a number of databases without much pain. This was totally not the case with Oracle in earlier years and it was best practice to rely on schemas, but now we can easily afford to set up more virtual servers each running Oracle (or PostgreSQL), if the licensing does not prohibit it in the case of Oracle. And since Oracle 12 there is the concept of the virtual database, which splits an Oracle database into sub databases that behave somewhat like separate databases without the overhead of DB instances. It seems to be quite equivalent to what PostgreSQL does, apart from the naming and many details about how to set it up and how to use it. Schema and user are more separate concepts in PostgreSQL. A schema can be defined totally independently of users, but there is a way to define schema names that match the user names to support this way of working. So we can pretty much do what we want, but the details of how to work it out are quite different.

Each database has its own programming language to write triggers, stored procedures and the like. They seem to be somewhat similar between different DB-products (we are talking about MS-SQL-Server, Oracle, PostgreSQL and DB2), but different enough that we need to rewrite triggers and stored procedures from scratch. This is not as painful as it used to be, since the approach of accessing DB tables for read access only via views and for write access only via stored procedures seems to have lost some popularity. If a lot of the business logic has been written in PL/SQL, the pain of migrating to another DB product is really enormous, while business logic in Java, Scala, C, C++, Perl, Ruby, C# or Clojure can be ported more easily to a different OS and a different DB. But it is by no means free.

One remark for development: Some teams like to use in-memory databases for development and then trust that deployment on PostgreSQL or Oracle or whatever will more or less work. I strongly recommend not to follow this route. It is totally not trivial to support one more DB product or usually a second DB product, and it is quite easy to set up a virtual OS with the DB product that is being used and with test data. PostgreSQL, Oracle, MS-SQL-Server, MongoDB and whatever you like can be configured to use more memory and perform pretty much like these in-memory DBs, if we set them up for development and are willing to risk data loss. This is no problem, because the image can be trivially copied from the master image when needed. Yes, a really good network and SSDs of sufficient size, speed and quality are needed for working efficiently like this, and it is possible and worthwhile to have that.

I can give training about PostgreSQL and MongoDB and about SQL in different dialects. Find contact information here.

And please: comments, corrections and additional information are always welcome…



The magic trailing space

When comparing strings, of course spaces count as well and they should count. To ignore them, we can normalize strings. Typical white space normalization includes the following (Perl regular expressions):

  • s/[ \t]+/ /g replace any sequence of tabs and spaces used to separate content by one space.
  • s/\r\n/\n/g replace carriage return + linefeed by linefeed only.
  • s/\s+$// remove trailing whitespace.
  • s/^\s+// remove leading whitespace.

It is often useful to do something like this when comparing strings that originally come from outside sources and are not normalized, when only „the content“ counts. There can be more sophisticated rules to deal with no-break-space, with control characters, with trailing spaces at the end of each line or only at the end of the whole thing, or to replace multiple empty lines by just one empty line. The general idea is to think about the right normalization.
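The four substitutions listed above can be sketched in Java like this. The class and method names are made up for illustration, and the order of the steps is one reasonable choice, not the only one.

```java
class WhitespaceNormalizer {
    /** Applies a basic white space normalization to a string. */
    static String normalize(String s) {
        return s.replaceAll("\r\n", "\n")   // carriage return + linefeed -> linefeed
                .replaceAll("[ \t]+", " ")  // any run of tabs and spaces -> one space
                .replaceAll("\\s+$", "")    // remove trailing whitespace
                .replaceAll("^\\s+", "");   // remove leading whitespace
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize("  hello \t world \r\n") + "]"); // prints [hello world]
    }
}
```

Two strings from outside sources would then be compared as `normalize(a).equals(normalize(b))`.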

In some cases, like long numbers, spaces or other symbols are used to group digits. These should also be removed. Sometimes more specific rules apply, like for phone numbers, web sites, email addresses etc. that need to be done specifically for this type, hopefully using an adequate library.
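For grouped digits, a minimal sketch could look as follows. It assumes that whitespace, no-break-spaces and hyphens are the only grouping symbols; the class and method names and the sample number are made up, and real validation of card numbers or IBANs should be left to an adequate library.

```java
class DigitGroupStripper {
    /** Removes whitespace, no-break-space (U+00A0) and hyphens used to group digits. */
    static String stripGrouping(String input) {
        return input.replaceAll("[\\s\\xA0-]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripGrouping(" 1234 5678-9012 3456 ")); // prints 1234567890123456
    }
}
```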

More often than not we see that web sites do not do this properly. Quite often information has to be entered and it is not normalized prior to further processing. So credit card numbers or IBAN numbers are rejected because of spaces, or anything is rejected because of trailing spaces, of course with an error message that does not give us a hint about what the problem was.

For serious applications there needs to be a serious processing step for data coming from outside anyway, for security reasons. Even though SQL injection should not work due to sound usage of SQL placeholders, it is a good practice to check the data anyway and reject it early and with a meaningful message. Should I trust the security of a site that cannot deal with spaces in a credit card number enough to give them my card number? I am not sure.

It is about time that UI developers get into the habit of doing the proper processing, normalization and checks for user input. Beware that any security relevant checks need to be done on the server or on the server as well.


Scala Exchange 2017

I visited Scala Exchange („#ScalaX“) in London on 2017-12-14 and 2017-12-15. It was great, better than 2015 in my opinion. In 2016 I missed Scala Exchange in favor of Clojure Exchange.

This time there were really many talks about category theory and of course its application to Scala. Spark, Big Data and Slick were less heavily covered this time. Lightbend (formerly Typesafe), the company behind Scala, did show some presence, but less than in other years. But 800 attendees are a statement by themselves and some talks about category theory were really great.

While I have always had a hard time accepting why we need this „Über-Mathematics“ like category theory for such a finite task as programming, I start seeing its point and usefulness. While functors and categories provide a meta layer that is actually accessible in Scala there are actually quite rich theories that can even be useful when constrained to a less infinite universe. This helps understanding things in Java. I will leave details to another post. Or forget about it until we have the next Scala conference.

So the talks that I visited were:

  • Keynote: The Maths Behind Types [Bartosz Milewski]
  • Free Monad or Tagless Final? How Not to Commit to a Monad Too Early [Adam Warski]
  • A Pragmatic Introduction to Category Theory [Daniela Sfregola]
  • Keynote: Architectural patterns in Building Modular Domain Models [Debasish Ghosh]
  • Automatic Parallelisation and Batching of Scala Code [James Belsey and Gjeta Gjyshinca]
  • The Path to Generic Endpoints Using Shapeless [Maria-Livia Chiorean]
  • Lightning talk – Optic Algebras: Beyond Immutable Data Structures [Jesus Lopez Gonzalez]
  • Lightning Talk – Exploring Phantom Types: Compile-Time Checking of Resource Patterns [Joey Capper]
  • Lightning Talk – Leave Jala Behind: Better Exception Handling in Just 15 Mins [Netta Doron]
  • Keynote: The Magic Behind Spark [Holden Karau]
  • A Practical Introduction to Reactive Streams with Monix [Jacek Kunicki]
  • Building Scalable, Back Pressured Services with Akka [Christopher Batey]
  • Deep Learning data pipeline with TensorFlow, Apache Beam and Scio [Vincent Van Steenbergen]
  • Serialization Protocols in Scala: a Shootout [Christian Uhl]
  • Don’t Call Me Frontend Framework! A Quick Ride on Akka.Js [Andrea Peruffo]
  • Keynote: Composing Programs [Rúnar Bjarnason]

Collection Initialization in Java

There is this so-called „double brace“ pattern for initializing collections. We will see later on whether it should be a pattern or an anti-pattern…

The idea is that we should consider the whole initialization of a collection as one big operation. In other languages we write something like
[element1 element2 element3]
[element1, element2, element3]
for array-like collections and
{key1 val1, key2 val2, key3 val3}
{key1 => val1, key2 => val2, key3 => val3}.
Java could not do it so well until Java 9, but actually there was a way to construct sets and lists:
Arrays.asList(element1, element2, element3);
new HashSet<>(Arrays.asList(element1, element2, element3));.
Do not ask about immutability (or unmodifiability), which has not been solved very well in the standard Java library so far, unless you are willing to take a look at Guava, which we will do in another article… Let us stick with Java’s own facilities for today.
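
As a small illustration of the two constructions above, here is a sketch with made-up element values. The class name AsListDemo and its helper methods are only assumptions for this example. It shows that the list returned by Arrays.asList(...) has a fixed size, while the copy in a HashSet is an ordinary mutable set:

```java
import java.util.*;

public class AsListDemo {
    // Builds a fixed-size list backed by the given array of elements.
    static List<String> fixedSizeList() {
        return Arrays.asList("abc", "def", "uvw");
    }

    // Returns true if the list refuses to grow.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("xyz");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> l = fixedSizeList();
        l.set(0, "ABC");                        // replacing an element is allowed
        System.out.println("l=" + l);
        System.out.println("rejectsAdd=" + rejectsAdd(l));

        // Copying into a HashSet gives an ordinary, fully mutable set.
        Set<String> s = new HashSet<>(fixedSizeList());
        s.add("xyz");
        System.out.println("s.size()=" + s.size());
    }
}
```

So Arrays.asList(...) is neither a normal growable list nor an immutable one, which is one reason why the immutability question is left aside here.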

So the double brace pattern would be something like this:

import java.util.*;

public class D {
    public static void main(String[] args) {
        // each {{ ... }} creates an anonymous subclass with an instance initializer
        List<String> l = new ArrayList<String>() {{
                add("abc");
                add("def");
                add("uvw");
            }};
        System.out.println("l=" + l);

        Set<String> s = new HashSet<String>() {{
                add("1A2");
                add("2B707");
                add("3DD");
            }};
        System.out.println("s=" + s);

        Map<String, String> m = new HashMap<String, String>() {{
                put("k1", "v1");
                put("k2", "v2");
                put("k3", "v3");
            }};
        System.out.println("m=" + m);
    }
}

What does this do?

First of all, an opening brace after new XXX() creates an anonymous class extending XXX, and this brace opens the body of the extended class. What is well known to many is that a class can have a static {....} section, which is executed exactly once for each class. The same works without the static keyword, which gives a non-static initializer section. This is of course executed once for each instance of the class, so in this case it runs after the constructor of the base class and serves as a kind of replacement for the constructor. To make it look cooler the two pairs of braces are placed together.
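
The order in which these sections run can be made visible with a small sketch. The names InitOrder, Base, Derived and LOG are invented for this illustration only:

```java
import java.util.*;

public class InitOrder {
    // Records the order in which the different sections run.
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        Base() { LOG.add("Base constructor"); }
    }

    static class Derived extends Base {
        static { LOG.add("static initializer"); }     // once per class
        { LOG.add("instance initializer"); }           // once per instance
        Derived() { LOG.add("Derived constructor"); }
    }

    public static void main(String[] args) {
        new Derived();
        new Derived();
        LOG.forEach(System.out::println);
    }
}
```

The static initializer shows up only once, before the first instance is created. For each instance the base class constructor runs first, then the instance initializer, then the body of the constructor of the derived class. An anonymous class cannot declare a constructor of its own, so the instance initializer is the only place for such code.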

It is not so magic, but it creates a lot of overhead by creating anonymous classes with no real additional functionality just for the sake of an initialization. It is even worse, because these anonymous inner classes are not static, so when they are created in an instance method they can refer to their surrounding instance. They do not make use of this, but they may still carry a reference to the surrounding instance, which can be a very serious problem for serialization, if that is used, and for garbage collection. So please consider the double brace initialization an anti-pattern. Others have blogged about this too…
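
That an extra class is really created can be checked with a few lines. This is only a sketch. The class name DoubleBraceClassDemo and its two helper methods are assumptions made for the example, and it only demonstrates the additional class, not the reference to the surrounding instance:

```java
import java.util.*;

public class DoubleBraceClassDemo {
    // Built with the double brace pattern: an anonymous subclass of ArrayList.
    static List<String> doubleBrace() {
        return new ArrayList<String>() {{
            add("abc");
            add("def");
        }};
    }

    // Built the plain way: a real ArrayList.
    static List<String> plain() {
        List<String> l = new ArrayList<>();
        l.add("abc");
        l.add("def");
        return l;
    }

    public static void main(String[] args) {
        List<String> a = doubleBrace();
        List<String> b = plain();
        System.out.println(a.equals(b));                       // same elements
        System.out.println(a.getClass() == ArrayList.class);   // but a different class
        System.out.println(a.getClass().isAnonymousClass());
        System.out.println(a.getClass().getSuperclass().getName());
    }
}
```

Both lists contain the same elements and compare as equal, but the first one is an instance of an anonymous subclass of ArrayList, and every place in the source code where the pattern is used adds one more such class.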

There are more legitimate ways to group the initialization together. You can put the initialization into a static method and call that. Or you could group it with single braces, just to indicate the grouping. This is a bit unusual, but at least correct:

import java.util.*;

public class E {
    public static void main(String[] args) {
        List<String> l = new ArrayList<String>();
        {
            l.add("abc");
            l.add("def");
            l.add("uvw");
        }
        System.out.println("l=" + l);

        Set<String> s = new HashSet<String>();
        {
            s.add("1A2");
            s.add("2B707");
            s.add("3DD");
        }
        System.out.println("s=" + s);

        Map<String, String> m = new HashMap<String, String>();
        {
            m.put("k1", "v1");
            m.put("k2", "v2");
            m.put("k3", "v3");
        }
        System.out.println("m=" + m);
    }
}
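
The other option mentioned above, a static method that does the initialization, could look roughly like this. The names StaticInitDemo and createMap are made up for this sketch:

```java
import java.util.*;

public class StaticInitDemo {
    // Hypothetical helper: the whole initialization of the map lives in one place.
    static Map<String, String> createMap() {
        Map<String, String> m = new HashMap<>();
        m.put("k1", "v1");
        m.put("k2", "v2");
        m.put("k3", "v3");
        return m;
    }

    public static void main(String[] args) {
        Map<String, String> m = createMap();
        System.out.println("m=" + m);
    }
}
```

The result is a plain HashMap without any additional class, and the method name documents what the grouped statements are for.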

While the first two can somehow be written using Arrays.asList(...), since Java 9 there are nicer ways of writing all three using List.of("abc", "def", "uvw");, Set.of("1A2", "2B707", "3DD"); and Map.of("k1", "v1", "k2", "v2", "k3", "v3");, which is recommended over any other way because there are some additional runtime and compile time checks and because these are efficient immutable collections. This has been blogged about too.

The aspect of immutability, which we should consider today, is not very well covered by the Java collections (apart from the new internal ones behind the new factory methods). Wrapping in Collections.unmodifiableXXX(...) is a bit of overhead in terms of code, memory and CPU usage, and it does not give a guarantee that the wrapped collection is not being modified elsewhere.
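
Here is a short sketch of the Java 9 factory methods and of some of their runtime checks. The class FactoryMethodsDemo and the helper thrownBy are invented for this example, and Java 9 or newer is assumed:

```java
import java.util.*;

public class FactoryMethodsDemo {
    // Runs the action and reports the simple name of the exception it throws, or "none".
    static String thrownBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        List<String> l = List.of("abc", "def", "uvw");
        Set<String> s = Set.of("1A2", "2B707", "3DD");
        Map<String, String> m = Map.of("k1", "v1", "k2", "v2", "k3", "v3");
        System.out.println("l=" + l);
        System.out.println("s.size()=" + s.size() + " m.get(k2)=" + m.get("k2"));

        // Immutable: any modification is rejected.
        System.out.println(thrownBy(() -> l.add("xyz")));
        // Runtime checks: no null elements, no duplicate set elements or map keys.
        System.out.println(thrownBy(() -> List.of("abc", null)));
        System.out.println(thrownBy(() -> Set.of("abc", "abc")));
        System.out.println(thrownBy(() -> Map.of("k1", "v1", "k1", "v2")));
    }
}
```

Modifying such a collection fails with an UnsupportedOperationException, a null element with a NullPointerException, and a duplicate set element or map key with an IllegalArgumentException.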
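
The missing guarantee is easy to demonstrate, because the wrapper is only a read-only view of the original collection. In this sketch the class name UnmodifiableViewDemo and its methods are assumptions for illustration:

```java
import java.util.*;

public class UnmodifiableViewDemo {
    // Returns the size seen through the read-only view after the backing list has grown.
    static int sizeSeenThroughViewAfterAdd() {
        List<String> backing = new ArrayList<>(Arrays.asList("abc", "def"));
        List<String> view = Collections.unmodifiableList(backing);
        backing.add("uvw");          // modifies the backing list directly
        return view.size();          // the view reflects the change
    }

    // Returns true if the view itself rejects modification.
    static boolean viewRejectsAdd() {
        List<String> view = Collections.unmodifiableList(new ArrayList<>(Arrays.asList("abc")));
        try {
            view.add("xyz");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("viewRejectsAdd=" + viewRejectsAdd());
        System.out.println("sizeSeenThroughView=" + sizeSeenThroughViewAfterAdd());
    }
}
```

The view refuses modifications made through it, but whoever still holds the backing list can change what the view shows.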


VoIP and Landline Telephony

Some may have noticed, some not, but landline telephony is actually being shut down in the next few years, if it has not happened already. This is being done in Germany and in Switzerland, and I assume other countries will follow or even do it earlier. In some countries and in some age groups the landline telephone does not exist any more. Younger people have only a cell phone and use flat rates for mobile telephony or VoIP services on the cell phone to call. And asynchronous communication mechanisms like email and messaging are now more popular than actually talking on the phone. So the technology that is relevant for phone companies now is internet and mobile telephony, and it is a logical step to stop supporting what has become an expensive niche technology.

It looked like phone companies wanted customers to actually move their infrastructure to VoIP. That means the black phone with the dialing wheel from the 1950s would no longer work and customers would have to buy new devices, which would eventually allow them to make calls like before, just using keys instead of the dialing wheel. Or it would even be necessary to buy a computer, a tablet or a smart phone to do telephony at all. It seems that this approach was too ambitious, because there is a large group of customers who are unable or unwilling to move in this direction, or simply unwilling to invest a lot of time in changing their habits and learning how to use VoIP and a lot of money in buying devices that they do not actually want.

So the challenge now is to provide adapters that support all historical phone technology and map it to VoIP without forcing the customer to get used to a new device or a new method of using it. There are some impacts that can probably not be avoided. The adapter needs electricity, while the phone got its own electricity from the landline and even worked during a power outage. The adapter can be small, but it will need some space. And there will be ways in which making a call can fail that did not exist before. More components are involved and all of them can fail. As a fallback for emergency calls when electricity has failed we will have to rely on cell phones. Hopefully their batteries are charged, but people get used to that. And really almost everyone has a cell phone, even in poorer countries. Or at least a neighbor with a cell phone.

If this approach succeeds, that will be quite impressive. But it is probably the only reasonable way to do it. And supporting only one technology, which is the internet, is cost efficient. So the question of who should pay for the adapters has to be answered in each country where this transition is being made.

Btw., I think that television is also a technology that will disappear. While in the old days half a dozen TV stations were on the air, in some countries financed by fees, by advertising or by taxes, we then got alternative access via cable, satellite dishes and now the internet. So the local fee-financed TV stations are becoming less relevant, because we can watch content from all over the world. So instead of imposing the fees on everybody who dares to live in the country (like in Germany or Switzerland), it is time either to abolish the TV fees, to cut them way down, or to constrain them to those who actually register as users of the national TV stations. The national TV stations could make their content available on the internet only to those who pay and generate revenue like that. And of course compete with others all over the world who can do the same, if they just manage to provide content in a language that is comfortably understood. As long as the internet is open and we can view content from other countries without censorship, this is great progress compared to national TV, even if that disappears due to the lack of funding and the lack of efficiency.
