Testing and Micro Components

It is an interesting trend to create micro components, microservices and use building blocks that are very easy to understand. As always, a new approach will not eliminate all problems. The problem that we are solving has an inherent complexity that we cannot beat. But usually we create a lot of complexity that would not be necessary to solve the problem and this can be avoided. Any approach that really helps us here is welcome. If and how micro components and micro services can help us, is another issue, I will just assume that they are in use.

Think of constructing a building. Now this is a very complex task, but we simplified it by building Lego pieces instead. They are simple, easy to understand, well tested and we just have to compose them to get the building. The building does not become trivial buy that, but maybe it becomes easier to create it.

In the end we want to sell a building. The fact that it is constructed of ISO and TÜV and CE and whatever certified lego blocks is not really relevant for the users of the building. They actually want a building, not lego blocks. Maybe they do not even care how it has been made, as long as it is good. They should not care, but usually they do at least a bit. We do not live in an ideal world and I would care too.

Now the building has to fit into the city. It has a function to play. It is itself a lego block for building the city. But for the moment we are actually only doing one building, not a whole city. So we need to provide tests that it works with its environment and provides the functionality that we expect. There is an API and an API-contract. We need to test this API in the way it will be used in production. That is on Linux, if the productive servers run Linux, not on MS-Windows. But Linux is anyway better for developement than MS-Windows, unless we are working on MS-specific technologies, where the servers will run MS-Windows anyway. We have to test with the database product that is used in production, for example PostgreSQL or Oracle. Tests with In-Memory-DBs or products like sqlite are interesting and maybe helpful for finding bugs by having tests that run presumably faster or with less overhead, but there are ways to provide the real DB to each developer and the overhead is not much, once this has been prepared. And a local Oracle instance is not much slower than an in-memory-DB, but allows us to inspect and change data with SQL, which can be very useful. And if we use application servers or middleware, it is mandatory to test with the same middleware as in production, no matter how much compatibility between the implementations of different vendors has been promised. All of these issues are in theory not existent, but I have seen them in practice and that is what counts. Yes, I love theory. 🙂

The other thing is that the fact that the lego pieces have been tested does not guarantee that the whole building will work. The composition creates something special, that is more than the sum of its pieces. That is what we want to achieve. So we need to test it.

So it is good to have automated tests that run against the service using the API that is used in production, possibly REST or SOAP, for example. And to run the test against an instance of the application that runs on the same kind of server against the same kind of database and on the same kind of middleware as in production. At least continuous integration should work like that. The smaller our components are, the more important it becomes to test them in conjunction. Problems will occur there, even though the components by themselves seem to work perfectly.

Now there is demand for running more local tests, unit tests, as they are commonly called, to actually test lego blocks or small groups of lego blocks. It is good to locate problems, because the end-to-end-tests just provide the information that there is a bug, but it is hard to locate. And without thorough testing of the sub components it is possible that we overlook edge cases on the overall tests that will anyway hit us in production, because only the more local tests can explore the boundaries that are visible to them, but obfuscated when accessed indirectly. Also we do not want to be happy with two bugs that cancel each other, because they will not always work together in our favor. So in the end of the day it is important to have automated tests at all levels. And to make them an important issue.

I have seen many projects, where Unit-Tests eroded. They were once done, but did not work any more and were impossible or hard to fix, of course lacking the time to do so. Or the software was hiding access points that were needed for testing. Just two examples for this, for our Java friends: methods can be package private and test classes in the same package to allow testing. And application servers need to actually allow certain remote access, usually via REST or SOAP these days. And we need remote debugging access. This is all easy to achieve, but can be made hard by company policies, even on development workstations and continuous integration servers.

Share Button


Eine größere Software muss man sicher strukturieren, sonst baut man sich ein Monster.

Nun kann man versuchen, Komponenten zu definieren, die so klein wie nur möglich sind. Die Komplexität innerhalb der Komponenten wird dadurch reduziert und überschaubar. Aber man bekommt ein Problem, weil die Menge der Komponenten dabei zu groß wird und man eine sehr hohe Komplexität bekommt, diese vielen Komponenten miteinander zu verknüpfen.

Man sieht es bei Büchern, bei denen die Gliederung gut gelungen ist, dass es eine Hierarchie von Gliederungen gibt und in jeder Ebene findet man etwa 2-10 Untereinträge zu den Einträgen aus der nächst höheren Ebene, nicht aber 50 Kapitel, die nicht in irgendeiner Form gruppiert sind.

Solche Ebenen wie „Teil“, „Kapitel“, „Abschnitt“… hat man in der Software-Architektur auch zur Verfügung, wobei es von der Technologie abhängt, welche Hierarchieebenen für dieses Gliederung und Strukturierung zur Verfügung stehen, z.B. Methode – Klasse – Package – Library – Applikation und man sollte sie mit Bedacht einsetzen. Was gehört zusammen und was nicht?

Es gibt aber noch einen anderen Aspekt. Oft verbaut man sich durch zu feingranulare Aufteilung Wege.

Ein Beispiel: Es soll eine Multiplikation von Matrizen mit kmoplexen Zahlen als Elementen implementiert werden. Nun ist es sehr elegant, die Matrizenmulitplikation einmal zu implementieren und dabei einen abstrakten numerischen Typ zu verwenden. Für jeden dieser numerischen Typen implementiert man die Grundoperationen.

Dies kann aber in Bezug auf die Perforamnce und auch in Bezug auf die Rechengenauigkeit beim Arbeiten mit Fließkommazahlen zu Problemen führen. Es lassen sich viel performantere Algorithmen für diese Matrizenmultiplikation finden, wenn man sie speziell für komplexe Zahlen schreibt und auch auf die Realteile und Imagniärteile der Matrizenelemente zugrifen kann. Besonders tückisch sind aber auch die Rundungsfehler. Um Rundungsfehler zu verringern muss man auf die Kalkulationen Zugriff haben und deren Reihenfolge und Assoziierung steuern können.

Hier ein Beispiel:


Berechnet man nun (a+b)+(c+d), erhält man 7.0, aber mit (a+c)+(b+d) erhält man 0.0.
In irb (ruby) sieht es etwa so aus:

$ irb
irb(main):001:0> a=3.0
=> 3.0
irb(main):002:0> b=4.0
=> 4.0
irb(main):003:0> c=5e30
=> 5.0e+30
irb(main):004:0> d=-5e30
=> -5.0e+30
irb(main):005:0> (a+b)+(c+d)
=> 7.0
irb(main):006:0> (a+c)+(b+d)
=> 0.0

Der Fehler kann sich nun natürlich noch beliebig fortpflanzen.

Das ändert aber nichts daran, dass ein auf Polymorphie basierender Ansatz in den allermeisten Fällen der richtige Weg ist, solange man nicht auf die entsprechende Optimierung angewiesn ist.

Es bleibt aber dabei, dass bei einer Aufteilung in Komponenten die richtige Granularität gewählt werden sollte, also keine Microkomponenten, aber auch nicht wahlloses Zusammenfügen von Dingen, die nicht zusammengehören, nur um die richtige Größe der Komponenten zu erreichen.

Share Button