Object Creation: Builder vs. Constructor vs. Setter

Whenever we create objects, we have to provide at least one construction pattern.

Depending on the language, three approaches are commonly available: setters, constructors and builders.

Traditionally in OO it was mandatory to write setters and getters. In C++ or Java they really have names like getXyz or setXyz, but in Ruby, C# or Scala they can be written in such a way that they behave as if the attribute were public and could be assigned and read directly, while the setters and getters are called behind the scenes. Java has no such property syntax, but Hibernate can be configured to go either via the getters and setters (property access) or directly via the fields (field access). This can be useful to put some conversion into the getters and setters and to bypass it for the DB access.

Why do we use these getters and setters at all? They were introduced to give us the flexibility to change the internal implementation without changing the API, because getters and setters can become more complex later. This can be useful for DB-specific conversions in Hibernate, but apart from that, in 25 years of OO-ish development this flexibility has hardly been used. Most of the time the set of attributes changes and the set of setters and getters changes simultaneously. So one might ask why we go the extra mile of adding getters and setters by default, when we could just make the attributes public and save some time. I am not asking, because that would decrease the life expectancy. But moving on demand from plainly accessible attributes to getters and setters would be just a refactoring, like changing the set of attributes.

Now which of these are preferred and why?

First of all, we do need to read all attributes in some way, otherwise they are just a waste of space. Just forget for the moment about programming low-level APIs where bits have to be counted and dummy attributes have to be added to move the useful ones to the right position. But very often it turns out that we do not need to change the attributes during the lifetime of the object. Now Ruby has a nice feature of setting everything up and then calling freeze, which makes the object, but not its sub-objects, immutable. I think it would be worth considering adding something like this to Scala and Java, for example. Clojure has something like this, actually, but with a slightly different flavor.

There is some advantage in knowing that the state of objects does not change. It is easier to reason about code. It really helps with thread safety and reentrancy. And it even helps when passing around a reference to an object that still „lives“ somewhere else, for example when a sub-object comes out of a getter. In functional programming this is mandatory for all internal APIs; in other areas it is just something that makes life easier where mutation is not actually needed.

So the setters go away where they are not needed, and we end up constructing objects with a constructor that takes all attributes, or with variants that use reasonable default values. This makes sense when the attributes are few and there is no risk of mixing them up. The builder pattern helps by naming the attributes in languages that do not allow named parameters for the constructor out of the box. So it is useful in Java, but obsolete in many other languages. If attributes are final or constant or whatever it is called, the need for setters and getters is technically even smaller, because nothing can go wrong: the attribute can only be read. But in Java it is best practice to write getters and we should comply.
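As an illustration of the builder approach described above, here is a minimal sketch in Java. The class Person and its attributes firstName, lastName and age are hypothetical names chosen only for this example; the builder methods are named after the attributes, without a „with“ prefix.

```java
import java.util.Objects;

// Hypothetical example class; Person and its attributes are illustrative only.
final class Person {
    private final String firstName;
    private final String lastName;
    private final int age;

    // Private constructor: objects are only created through the builder.
    private Person(Builder b) {
        this.firstName = Objects.requireNonNull(b.firstName, "firstName");
        this.lastName = Objects.requireNonNull(b.lastName, "lastName");
        this.age = b.age;
    }

    // Only getters, no setters: the object is immutable after construction.
    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }

    public static Builder builder() { return new Builder(); }

    static final class Builder {
        private String firstName;
        private String lastName;
        private int age; // defaults to 0 if not set

        // Each method names the attribute it sets, which replaces named parameters.
        public Builder firstName(String value) { this.firstName = value; return this; }
        public Builder lastName(String value) { this.lastName = value; return this; }
        public Builder age(int value) { this.age = value; return this; }

        public Person build() { return new Person(this); }
    }

    public static void main(String[] args) {
        Person p = Person.builder().firstName("Ada").lastName("Lovelace").age(36).build();
        System.out.println(p.getFirstName() + " " + p.getLastName() + ", " + p.getAge());
    }
}
```

The call site names every attribute, so two String arguments cannot be swapped silently, which is the main point of the pattern in a language without named parameters.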

Now the issue arises that there is only a private or public multi-argument constructor and a lot of getters or whatever is used for reading the attribute in the specific language. And then a framework needs to create objects from XML, JSON or whatever automatically. And these frameworks tend to need at least a no-argument-constructor, often a public no-argument-constructor, that should be only for framework use. And the attributes have to be made non-final. If the framework is smart, it bypasses getters and setters and the no-arg-constructor is already enough.

Some frameworks actually require the setters. There we go, back to the old-school world. We can impose a convention that the setters and the no-arg-constructor are there ONLY for the purposes of the framework and should not be used otherwise. Maybe that is a good approach, it is somewhat cleaner. But the box is opened and mistakes with such a convention will happen, so the question arises why we need to deal with each attribute at least seven times: the attribute itself, its getter, its setter, the multi-arg-constructor, the attribute of the builder, the with-method of the builder and the build-method.

Some things are easier, when moving to a new language and dumping all the garbage-traditions in that step.

But good developers can write good software in any reasonably good language that reasonably suits the purpose.

MapStruct

In the Java sphere we often develop the same data class several times. Each layer has its own variant and they are named almost the same, with some prefix or suffix or just the package name to distinguish. The set of attributes is the same (or almost the same), they have setters and getters. Or maybe only getters.

Nobody wants to write business logic two or three or four times, no matter how much support we have for copying code between the layers. And there OO is gone. We have to use anemic data objects, which Martin Fowler clearly described as an antipattern some years ago.

Since we now adopt new paradigms every couple of years, we no longer care and no longer know. OO was 25 years ago. Now we do FP and Microservices. And new frameworks. And many layers.

So, where does this come from?

First of all, the database access layer is Hibernate. I do not know why, because I think that plain JDBC would be easier, but Hibernate is already there and cannot be removed by arguments. Now Hibernate came with the promise that we can just use plain objects (POJOs) for our data and mirror database tables to classes and columns to attributes. Some XML stuff had to be written and everything worked. Only writing the XML was such a pain that people immediately jumped to the annotations alternative once it was there. It was better and still is. But now the POJOs are obviously cluttered with Hibernate or JPA annotations. So they have to stay in the database layer. Actually there is a much stronger argument for this. Objects contain other objects and collections of objects. And possibly everything, if we go deeper recursively. Accessing the database should be a reasonably fast operation, so some attributes are loaded lazily. That means they are only really loaded when we need them. Which can go terribly wrong, because by then the transaction is no longer around and it is too late.

Also we have our idea what data classes have to look like, so there are some layers where we want no-argument-constructors and setters and getters, some layers where we want final attributes and constructors with all attributes and only getters and again others where we prefer to use a builder. And yes, each layer has its rules that need to be followed.

So, we keep them in the DB layer and map them to almost identical service layer objects without annotations. Then we work with these, write our business logic. Procedural programming mostly. Because the objects cannot have business logic. So we have classes with methods that are kind of behaving like static methods, but are non-static, because the framework wants it like that.

And then again, we build more and more layers, because each concern needs to be dealt with in its own layer. And each layer requires its own set of data objects, possibly with its own annotations for REST or SOAP or JSON or XML or whatever.

So, how do we move data between layers? At each layer boundary the data needs to be copied to the sister objects in the new layer. Now this is kind of stupid, programming something like

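class HouseL1 {
    private final int a;
    private final String b;

    // A constructor has no return type; declared with "void" this would be an
    // ordinary method and the final fields could not be assigned in it.
    public HouseL1(HouseL2 l2) {
        this.a = l2.getA();
        this.b = l2.getB();
        // ....
    }
}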

or with builder or with setters and getters, it is a lot of ugly work. And even worse, all sub-objects and collections of sub-objects have to be mapped. And their sub objects. And we possibly have to stop somewhere.

So we would like to avoid doing all this tedious stuff.

What can we do?

Is Java really the right language? Of course, we are writing enterprise software and we need type safety. Not real type safety like Scala, but a little bit of it feels good.

Reconsider our whole architecture and simplify it. Maybe it is possible to get rid of some layers and write much simpler software that does the same thing, only much faster and with fewer bugs. Ok, I’m only kidding here. We are talking enterprise software here. And yes, sometimes the layers do have real purposes and make sense.

Try to use the same data objects in all layers anyway? It has been tried. It works, but only in very simple settings.

Create the source code for the mapping. You can write a script for this or find one or find a tool or whatever. Parse the data classes and create the source code for the transformation methods. Or just write the Hibernate classes and generate all other layers from them, each with its preferred setup in terms of mutability, construction and annotations.

But we are in Java, so why not use reflection and figure out at runtime how to map it. Find a library to do it for you or write your own. Performance? No problem. We use enterprise servers, of course.
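To make the reflection option a bit more concrete, here is a minimal sketch of a runtime mapper. It assumes both classes follow the JavaBean convention with getXyz/setXyz methods and a no-argument constructor; the names ReflectiveMapper, HouseEntity and HouseDto are hypothetical, and nested objects, collections and isXyz getters are deliberately ignored.

```java
import java.lang.reflect.Method;

// Hypothetical data class of one layer (JavaBean style).
class HouseEntity {
    private int a;
    private String b;
    public int getA() { return a; }
    public void setA(int a) { this.a = a; }
    public String getB() { return b; }
    public void setB(String b) { this.b = b; }
}

// Hypothetical sister class of the next layer with the same attributes.
class HouseDto {
    private int a;
    private String b;
    public int getA() { return a; }
    public void setA(int a) { this.a = a; }
    public String getB() { return b; }
    public void setB(String b) { this.b = b; }
}

final class ReflectiveMapper {
    // Copies every property that has a public getter on source
    // and a setter with the same name and type on target.
    static <T> T map(Object source, T target) {
        for (Method getter : source.getClass().getMethods()) {
            String name = getter.getName();
            if (!name.startsWith("get") || name.equals("getClass")
                    || getter.getParameterCount() != 0) {
                continue;
            }
            String setterName = "set" + name.substring(3);
            try {
                Method setter = target.getClass().getMethod(setterName, getter.getReturnType());
                setter.invoke(target, getter.invoke(source));
            } catch (NoSuchMethodException e) {
                // The target has no matching setter: skip this attribute.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot copy via " + name, e);
            }
        }
        return target;
    }

    public static void main(String[] args) {
        HouseEntity entity = new HouseEntity();
        entity.setA(3);
        entity.setB("red");
        HouseDto dto = map(entity, new HouseDto());
        System.out.println(dto.getA() + " " + dto.getB());
    }
}
```

Mapping errors only show up at runtime with this approach, and every copy pays for the method lookups, which is part of the case for generating the mapping code at compile time instead.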

So, in the end it is a good possibility to have a Java-tool that creates the source code for the transformation as part of the compile process.

MapStruct does exactly that. We write an interface for our transformations. With some annotations, non-obvious mapping behavior can be specified. Keep that list small and let the mappings be recognized automatically wherever possible. Then an extension to the Maven compiler plugin is added, which invokes MapStruct to create an implementation of this interface at compile time and, of course, compiles that implementation. And voilà, it works. Even for classes with builders, provided the builder uses method names that are identical to the corresponding attribute names, without a „with“ prefix. So deal with it and name the methods of the builder like that.

And yes, we should get rid of setters where we do not really need them. And we should not write constructors with 20 parameters, because the parameters will get mixed up in a language like Java, which does not yet have named parameters. And we do not want to couple layers, so constructors that take the sister class from another layer are not a good idea either, if we have more than two layers or so. So there we go with a builder.

At the end of the day, we can write good software with any reasonably good language and framework. But it is worth investigating how to do certain things. And it is really worth asking why we are doing this, at times when it is possible to make such choices.

Homeoffice

A lot of IT guys have to work in home office now or are at least encouraged to do so.

This is nothing new, because some companies have been working entirely like this for years. Their people live anywhere in the world and meet maybe once a year for a company gathering.

There is some difference, though. These „remote only“ companies can choose their employees, their working area and their technologies in such a way, that it fits this model.

Some people feel more comfortable with going to work and going home, hopefully not with a very long commute, and having these two areas of life clearly separated. This is gone for the moment. But we IT people are lucky, because most of us can continue working with low risk of being infected with COVID-19.

Another important aspect is that in-person meetings are often better than just talking on the phone. Of course, once the working process is established, smaller issues can easily and efficiently be handled by phone. But for larger issues that may be more complex or controversial, or that require not only words but also a whiteboard or something like that, it is usually better to meet in person. At least if it is not too far to travel.

Also it is very easy to ask colleagues who sit nearby a question, it is nicer to drink coffee together…

So apart from the advantage of saving travel time the home office has its disadvantages.

But now we are learning to work like that, and maybe that will benefit us, because the possibility to work from home a couple of times a month may be useful even in the future.

Some interesting technical aspects are worth noting:

There are different ways to do the work. Some companies have the policy „bring your own device“. The work is mostly done on this device and probably VPN is already in place. So working at home just works immediately.

Companies that use company-owned laptops are often set up quickly as well, because they just need to set up VPN and maybe rules about home network security.

But even for companies that use desktop computers, there are ways to deal with this. One approach is to just leave that computer running and redirect the display. This is built into the X Window System of Linux and Unix. It was already there 30 years ago. In conjunction with Cygwin this is also possible for MS-Windows, but it is somewhat limited, difficult with non-Cygwin applications and somewhat hard to set up. But other technologies like VNC, RDP and probably some others exist. It just requires enough bandwidth and gives a little bit of a delay, but it is absolutely possible to work with this. It may be useful to consider moving some things to the local computer, for example the IDE, by just accessing the disk remotely, if that is possible with the company policies and the licenses for the IDE. Also Confluence, JIRA and similar web applications might be accessed with the local browser instead of the browser on the remote computer.

Now for meetings it is good to find a good tool. Maybe use different tools simultaneously. We need to talk with good voice quality. In a group. It is better to have video. We want screen sharing. And there are even tools that do something like shared white boards. A lot can be done. Sometimes it is good to use different software simultaneously, because one has good voice and video, the other one good screen sharing and collaboration features. It is possible to use the mobile phone for one part and the computer for the other part. Good earphones can be helpful, so you can talk with your hands free. Unfortunately they are sold out in many online stores.

ScalaUA 2020

I like visiting the ScalaUA conference in Kiev every year in March or April. I did so in 2017, 2018 and 2019.

For 2020 it was difficult to hold a regular conference. So there were two obvious options: it could have been cancelled or it could have been postponed. That is what most other conferences did. ScalaUA, being an innovative tech conference, opted for a third route. The conference was held, but totally online. Everybody agrees that it is worth travelling to the location and meeting in person. But since that was not an option, it was an innovative and reasonable approach to run the conference online.

So the question is, how did this work?

There was a bit of a fight with the tools. We used Zoom, which was probably a good choice, because it has rich features. The schedule was more or less as it would have been normally, so the 15 minute break between talks was enough to get some coffee or whatever, because no lines and no distances had to be dealt with. The talks were delivered in such a way that the speaker was seen full screen when no slides were shown, but usually the slides took up the whole screen and the speaker was shown in a small window in the corner, which could be moved around on the screen. It was possible for spectators to ask questions and to put themselves on video as well. But for better support a Slack channel was provided for each talk, plus a general one and a few that I did not use. They are kept around for a week after the event, so questions can still be dealt with.

One talk was given by two people. One of them wrote on a transparent board that was between the camera and him, which was very impressive. Of course the image was mirrored, so that he could write from left to right and we could read from left to right. His presentation partner showed code on her computer. This talk actually did something that is not possible in the usual situation.

What was important: Starting the Zoom channel for the talk about five minutes before, because it sometimes took some time to start.

What is more challenging: If the talk is not really very interesting, it is a bit harder not to get distracted. Sitting in the audience, you have to listen.

You can see the agenda. I did not speak myself this year.

Scala 3 (formerly Dotty) took up a lot of space, because many things have to be redone in Scala 3 or at least can be redone in a much more elegant way.

Apple giving up on information technology

Apple has reinvented itself radically many times and has done so while things were still going well. This is part of the company's success story. And the CEO Tim Cook apparently plans to continue with this strategy. Major reinventions were:

  • Moving from the Apple II to Macintosh
  • Dropping Mac OS 9 and replacing it with the totally new OS X
  • Shifting the priority from computers to the iPod
  • Moving from iPod to iPhone and iPad

There were more, but these transformations could all be explained by early anticipation of an end or a major shrinking of the existing business. For example iPods were nice music players, but then each phone contained a music player and every person had a phone in the pocket anyway, so the usefulness of an iPod was tending to zero. Apple fans are very loyal, so they kept buying Apple products even when they were becoming obsolete, but that can only work for a short time. So they made an iPhone, which was theoretically a phone, but the phone functionality did not really work. I assume that they have fixed this with more recent iPhones.

But Tim Cook is moving on. What are the strengths of Apple? Design, marketing and sales. Especially including the fans in working for free for the company to enhance sales. This is an essential part of the success story. We are already used to the fact, that useful features are radically removed, like the DVD player in laptops, when DVDs were still important. Or the standard earphone plug from phones. Apple fans feel flattered by this, because they are so advanced by losing something first that later on actually becomes obsolete.

So the next step is coming now: Apple will move on to fashion. They will design clothes, shoes and jewelry. Then they can concentrate on their real strengths, reinvent themselves once more and remain in business for another long future. Computers, tablets and phones will be phased out.

Skillsmatter working again

The original Skillsmatter company has gone into administration and will be closed down, once all claims have been settled.

Now investors have bought assets of the old company, apparently including the name and the brand, and are now operating under the same name and web domain skillmatter.com. This is a new company with new management and a new location. Probably some employees of the old Skillsmatter are working in the new company.

They plan to continue activities of the old Skillsmatter, for example conferences.

This has some implications. If you have bought a ticket for a conference from Skillsmatter and this conference did not take place, the new Skillsmatter is not obliged to give a refund. That obligation lies with the old Skillsmatter, or more precisely with its administrators Resolvegroup (see FAQ #13).

The other option is to hope that they will reschedule the events, which they promise to do. They promise to honor tickets from old Skillsmatter, even though they do not have to.

Please, make a choice between the two options. If you get money back from Resolve, you cannot use the same ticket to go to a conference of the new Skillsmatter. It would be abusive and I assume that they will check.

Concerning the two Skillsmatter conferences that I liked to visit, Scala Exchange and Clojure Exchange, alternatives have been created: Functional Scala and ReClojure. Functional Scala is there to stay, at least for the next couple of years. So it will compete with a potentially revived Scala Exchange. For 2020 I have opted for Functional Scala and already bought the ticket. It was also said that ReClojure intends to continue as well, but I consider it possible that they will merge efforts with Clojure Exchange.

Pagination of Database Query Results

This article is highly inspired by the blog post We need tool support for keyset pagination.
Please consider reading the original first and then my interpretation and additional thoughts about this idea.

We have a typical database-backed web application. It can be a rich client. It can be a NoSQL database. Whatever, but for the moment a transactional SQL database and a web application will be used as the most common and best understood example.

We construct a query whose result set is so large that we do not want to transfer, render and display it all at once. Think of Google, where you might have millions of hits and only see the first 10 or 100 or so. Now you can navigate through the result set, usually by going to the next pages successively.

Nice web applications tell you exactly how many records there are. They then calculate how to split this up into a large number of pages and allow you to go to an arbitrary page, or at least to some of the pages (last, next 10 pages) immediately.

There are several problems with this. First of all: the set you are basing this on changes with time. This could be fixed by creating some temporary snapshot in memory, in the database or in some kind of db-cache and keeping that around for some time. This is expensive and usually not worth the price, so we live with some misbehavior.

The second problem is the count. Count actually needs to do a full scan of the result set, there is no decent shortcut. Now it would usually be good enough to estimate the size of the result set, and that is exactly what Google does. Sometimes the estimate is ridiculously bad, but we have gotten used to it. That is much better than using the real count, because it speeds up the application by a great factor and improves the user experience overall. Of course it is a bit more challenging to program an estimate, and it is only worth it for big result sets. We could still jump to any of the next 10 pages by just increasing the size of the current query to 11 pages and showing the required segment. For the last page, since we sort the result by something, we could just get the first N results sorted backwards and then wrap that in a query that sorts them forward. So dropping the count can help performance without too much loss in functionality.

Now the next problem is, that having navigated through a lot of pages, the usual way of doing a query and then constraining it with some mechanism like „rownum“, „row_number“, „limit/offset“ to the subset we want to display always requires obtaining the resultset from the beginning to our page. Usually that does not matter, because we navigate through the application manually and that means not too many pages from the beginning. But it can become an issue, if the application is heavily used.

Now we sort by something. Usually not the id, but some business logic relevant keys or a combination. They do not have to be unique, but we can always add the ID (or more generally the primary key) as the last sort key to make it unique. Then the next page can be found by just adding a condition based on our sort key. Assuming we want to show N records. The query might be

SELECT * FROM TABX 
  WHERE cond1 AND cond2 
  ORDER BY A, B, C ASC 
  LIMIT n;

Now the last record of the result has A=a1, B=b1, C=c1.
Then the next page can be obtained by

SELECT * FROM TABX 
  WHERE cond1 AND cond2 
   AND (A>a1 OR A=a1 AND B>b1 OR A=a1 AND B=b1 AND C>c1) 
  ORDER BY A, B, C ASC 
  LIMIT n;

To skip three pages and get the fourth of the following pages is done like this:

SELECT * FROM TABX 
  WHERE cond1 AND cond2 
   AND (A>a1 OR A=a1 AND B>b1 OR A=a1 AND B=b1 AND C>c1) 
  ORDER BY A, B, C ASC
  LIMIT n OFFSET 3*n;

We do use (or simulate) OFFSET here, but in a more intelligent way, because we do not force it to work harder than necessary by rebasing our query on what we already know.
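The condition in the queries above follows a regular pattern, so it can be generated for any list of ascending sort keys. The following is a minimal sketch that builds it with placeholders for a prepared statement; the class and method names (KeysetPagination, condition, bindValues) are hypothetical, and the column names are assumed to be trusted identifiers, not user input.

```java
import java.util.ArrayList;
import java.util.List;

final class KeysetPagination {
    // For columns (A, B, C) this builds
    // (A > ? OR (A = ? AND B > ?) OR (A = ? AND B = ? AND C > ?))
    static String condition(List<String> sortColumns) {
        List<String> alternatives = new ArrayList<>();
        for (int i = 0; i < sortColumns.size(); i++) {
            List<String> terms = new ArrayList<>();
            for (int j = 0; j < i; j++) {
                terms.add(sortColumns.get(j) + " = ?");   // equal on all earlier keys
            }
            terms.add(sortColumns.get(i) + " > ?");       // strictly greater on key i
            String joined = String.join(" AND ", terms);
            alternatives.add(i == 0 ? joined : "(" + joined + ")");
        }
        return "(" + String.join(" OR ", alternatives) + ")";
    }

    // Values of the last row of the current page, repeated in the order
    // in which the placeholders occur: a1, a1, b1, a1, b1, c1
    static List<Object> bindValues(List<?> lastRow) {
        List<Object> values = new ArrayList<>();
        for (int i = 0; i < lastRow.size(); i++) {
            for (int j = 0; j <= i; j++) {
                values.add(lastRow.get(j));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> cols = List.of("A", "B", "C");
        String sql = "SELECT * FROM TABX WHERE cond1 AND cond2 AND " + condition(cols)
                + " ORDER BY " + String.join(", ", cols) + " LIMIT ?";
        System.out.println(sql);
        System.out.println(bindValues(List.of("a1", "b1", "c1")));
    }
}
```

The explicit parentheses around each alternative do not change the meaning, since AND binds more tightly than OR in SQL, but they make the generated text easier to read. The page size for LIMIT would be bound after the values returned by bindValues.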

We get a bit more consistency, because new records inserted at the beginning will not be seen, but at least they will not disturb the succession of our result set. Maybe that is worth more than the performance gain, because we get it with reasonable effort. Caches, snapshots or even keeping the result in „session memory“ are not really serious approaches to this issue; sooner or later they will just blow up our application in production, at the worst possible moment.

Perl Scripts for editing

Even though we do have IDEs with quite powerful refactoring mechanisms for many languages, it is still sometimes useful to have another automated editing mechanism.

Why would that be the case?

Some examples:

In some cases there was an SQL script to create the database tables, which had been written first. From that classes and even CRUD operations in something like JDBC or DBI can be created. Even though most Java-projects that I have seen recently prefer using Hibernate (or more generally JPA2), there are some benefits in doing just plain old JDBC. This requires some discipline in writing the SQL in a uniform way and it was sometimes even necessary to have „magical comments“ that were ignored by the DB, but used by the script. This can save a lot of work and errors.

In some cases there were large numbers of HTML files, and a similar kind of change had to be applied to each of them. Using a script to parse the HTML files and to apply the changes can save a lot of work and provide consistency that is often badly needed. And if the whole thing happens in the context of an agile project, then the stakeholders might want to see the outcome and come up with further ideas. No big deal, just change the transformation scripts and check it out again. I recommend thinking in terms of larger transformation steps. Then I would retain the original files and apply the script for the step until the outcome is ok. Then this can be committed to git and the next step can be worked on. If the intermediate results are not useful for anybody else, just wait with the push or, better, work on a branch.

Sometimes refactorings have to be done that are not easily supported by the IDE. For example a whole bunch of classes need to be moved to different packages according to some new naming rule. Just find all the classes with their old package names, then move them to the new directory structure and rename imports and package declarations to the new structure. This can easily be done with a script. Always remember to be careful if reflection is involved; it will most likely break all refactorings, no matter whether they are done by IDE or by script.

Also it is often useful to use scripts to analyze code and to find occurrences of certain patterns.

These things can be done with scripts in Ruby, Python, Perl or Raku. All of these are valid options. Ruby and Raku are somewhat more sophisticated languages than the other two. And Perl and Raku have the most sophisticated Regex capabilities. I would assume that Raku is the best choice, if you start from scratch, it even supports grammars out of the box, which might be a way to address such issues. Perl has it as add on libraries. Of course it is also useful to work with what you know. But sometimes it is worth learning the right tool instead of just using the golden hammer to put in screws.

So in my case the tool for this is currently Perl. It does the job, is available more or less everywhere and I know it well enough. It might be worth moving to Raku in the future.

Some things that often work for simple scripts are the following:

Try to start with normalizing the input files; this might be the first transformation step. Remove trailing spaces, replace tabs with spaces, maybe normalize indentation, and replace line endings by LF only, without CR. In HTML replace HTML entities for Unicode characters by the appropriate Unicode characters, convert everything to UTF-8 etc.
If you have data like phone numbers and dates in the file that are meaningful for your further steps, bring them to a standard format. Of course always depending on your local circumstances, but this is usually the right way to go. This might be partially done with an external tool like xmllint.
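The whitespace and line-ending part of such a normalization step can be sketched in a few lines. The examples here are written in Java to stay consistent with the other code in this text, although Perl would do the same with very similar regular expressions. The class name Normalizer and the tab width of four spaces are assumptions made for this sketch.

```java
final class Normalizer {
    // Normalizes line endings to LF, expands tabs and strips trailing spaces.
    static String normalize(String text) {
        return text
                .replace("\r\n", "\n")          // CRLF -> LF
                .replace("\r", "\n")            // remaining lone CR -> LF
                .replace("\t", "    ")          // tab -> 4 spaces (assumed width)
                .replaceAll("(?m)[ ]+$", "");   // trailing spaces at the end of each line
    }

    public static void main(String[] args) {
        String messy = "first line \t\r\nsecond line   ";
        System.out.println("[" + normalize(messy) + "]");
    }
}
```

The flag (?m) makes $ match at the end of every line and not only at the end of the whole input, so one replaceAll call cleans all lines.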

For the next steps we need to consider the issue that we have multiple lines. There are regex variants that do not stop at line boundaries. But if we read the content line by line, it can still be a bit more work. Sometimes it is easier to just replace line feeds by some marker string that does not otherwise occur in your files (check it before!) and then apply the usual regex to this longer line.

What we often need is a regex that finds the shortest and not the longest match. If we write something like /a.*b/, this will look for a sequence that starts with an „a“ and ends with the last „b“ that can be found. Often we want to end with the first „b“. This can be achieved by /a.*?b/.
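The difference between the longest and the shortest match is easy to try out. This small sketch uses java.util.regex, whose syntax for these patterns is the same as in Perl; the class name ShortestMatch and the helper firstMatch are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShortestMatch {
    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xaybzzbq";
        System.out.println(firstMatch("a.*b", input));   // greedy: aybzzb
        System.out.println(firstMatch("a.*?b", input));  // reluctant: ayb
        // By default "." does not match a line feed; the flag (?s) changes that.
        System.out.println(firstMatch("a.*?b", "a\nb") == null);          // true
        System.out.println("a\nb".equals(firstMatch("(?s)a.*?b", "a\nb"))); // true
    }
}
```

The last two lines relate to the multi-line issue mentioned before: with (?s) the dot also crosses line boundaries, which can be an alternative to replacing line feeds by a marker string.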

Another pattern that is often useful for such scripts is to work with a state machine. If a certain pattern is discovered, we react to it depending on the state and possibly change the state. So we can apply changes to something only if it occurs in a certain context.
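A minimal sketch of such a state machine, working line by line: a replacement is applied only between two marker lines. The markers BEGIN and END, the strings "old" and "new" and the class name ContextualEdit are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ContextualEdit {
    private enum State { OUTSIDE, INSIDE }

    // Replaces "old" by "new" only on lines between a BEGIN and an END marker line.
    static List<String> rewrite(List<String> lines) {
        State state = State.OUTSIDE;
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            switch (state) {
                case OUTSIDE:
                    if (line.trim().equals("BEGIN")) {
                        state = State.INSIDE;       // pattern found: change the state
                    }
                    out.add(line);                  // outside the block nothing is changed
                    break;
                case INSIDE:
                    if (line.trim().equals("END")) {
                        state = State.OUTSIDE;
                        out.add(line);
                    } else {
                        out.add(line.replace("old", "new"));
                    }
                    break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of("old", "BEGIN", "old value", "END", "old");
        System.out.println(rewrite(input));
    }
}
```

Only the line "old value" is changed here, because it is the only one that is read while the machine is in the INSIDE state.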

JSON instead of Java Serialization: The solution?

We are starting to recognize that Serialization is not such a good idea.

It is cool and can really work on a wide range of objects, even including complex and cyclic reference graphs. And it was essential for some older Java frameworks like EJB and RMI, which allowed remote access to Java objects and classes.

But it is no longer the future, Oracle will soon deprecate and later remove it. And it will happen this time, even though they really keep stuff around for a long time due to compatibility requirements.

Just to recap: it opens up security issues, it introduces hidden behavior and makes it harder to reason about code, it creates tight coupling between remote components and it can result in bugs that only occur at runtime and cannot be discovered at compile time. In short, it is not resilient.

So we need something else. Obvious candidates are XML, YAML and JSON. XML is of course an option and is powerful enough to do many things, but it is often a bit too clumsy and requires too much boilerplate, so we try to move away from it. YAML and JSON kind of do the same thing, but it seems that JSON is winning the race: we all need to know JSON, and many of us tend to skip YAML.

So why not use JSON? It is easy, it has good libraries and we can even find databases that work with JSON.

What JSON can express very well are scalars, lists and maps and combinations of these. This is quite exactly what we have in Perl, JavaScript or Clojure as basic building blocks. These languages support object oriented programming, but for simple stuff we go with these basic building blocks. And objects can be modelled as (hash-)maps, with the attribute names as keys. Actually JSON is valid JavaScript code.
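To illustrate how directly scalars, lists and maps translate into JSON, here is a deliberately tiny writer using only the JDK. It is a sketch and not a replacement for a real library: the name MiniJson is hypothetical, only backslashes and double quotes are escaped (control characters are not), and special number values like NaN are not handled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class MiniJson {
    // Writes null, strings, numbers, booleans, lists and maps as JSON text.
    static String toJson(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            return quote((String) value);
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        if (value instanceof List) {
            return ((List<?>) value).stream()
                    .map(MiniJson::toJson)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        if (value instanceof Map) {
            // An object is just a map with the attribute names as keys.
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> quote(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    // Simplified escaping: backslash and double quote only.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Map<String, Object> object = new LinkedHashMap<>();
        object.put("l", 1L);
        object.put("s", "abc");
        object.put("b", true);
        object.put("list", List.of(1, 2, 3));
        System.out.println(toJson(object));
    }
}
```

Nothing in the output says which Java class the data came from, which is exactly the loss of type information and the loose coupling discussed in this article.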

We do have to change our thinking when moving from Java Serialization to JSON. JSON does not store any serializable object but just data. Maybe that is enough and that is what we actually want. It totally works in heterogeneous environments, where we are using different programming languages or different implementations.

There are good libraries. I have tried two, Jackson and GSON, which both work well; recently I have mostly used Jackson. It helps to think the way we would in Clojure, JavaScript or Perl when working without objects. So we lose type information, which can be considered good or bad, but if we can live with that, we avoid the tight coupling. JavaBeans are expressed in exactly the same way as a HashMap with the attribute names as keys. We can provide the top level class when deserializing, but at the child levels the library cannot figure out the classes if that would require runtime information.

Example Code

I have tried this out. The full example code can be found on GitHub.

A class that contains all kinds of stuff. Not prepared for really putting in nulls, but it is just experimental code…


package net.itsky.jackson;

import java.util.List;
import java.util.Map;
import java.util.Set;

public class TestObject {
    private Long l;
    private String s;
    private Boolean b;
    private Set set;
    private List list;
    private Map map;

    public TestObject(Long l, String s, Boolean b, Set set, List list, Map map) {
        this.l = l;
        this.s = s;
        this.b = b;
        this.set = set;
        this.list = list;
        this.map = map;
    }

    public TestObject() {
        // only for framework purposes
    }

    public Long getL() {
        return l;
    }

    public String getS() {
        return s;
    }

    public Boolean getB() {
        return b;
    }

    public Set getSet() {
        return set;
    }

    public List getList() {
        return list;
    }

    public Map getMap() {
        return map;
    }

    @Override
    public String toString() {
        return getClass().getSimpleName() + "("
                + "l=" + l + " (" + l.getClass() + ") "
                + " s=\"" + s + "\" (" + s.getClass() + ") "
                + " b=" + b + " (" + b.getClass() + ") "
                + " set=" + set  + " (" + set.getClass() + ") "
                + " list=" + list + " (" + list.getClass() + ") "
                + " map=" + map + " (" + map.getClass() + "))";
    }
}

And this is used for running everything. To play around more, it should probably be moved to tests.

package net.itsky.jackson;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectWriter;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.ImmutableSet;

import java.io.StringReader;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class App {

    public static void main(String[] args) {
        try {
            Set s1 = ImmutableSet.of(1, 2, 3);
            Set s2 = ImmutableSet.of(1, 2, 3);
            Map m1 = ImmutableMap.of("A", "abc", "B", 3L, "C", s1);
            List l1 = ImmutableList.of("i", "e", "a", "o", "u");
            TestObject t1 = new TestObject(30303L, "uv", true, s2, l1, m1);
            Map m2 = ImmutableMap.of("r", "r101", "s", 202, "t", t1);
            List l2 = ImmutableList.of("ä", "ö", "ü", "å", "ø");
            Set s3 = ImmutableSet.of("x", "y", "z");
            TestObject t2 = new TestObject(40404L, "ijk", false, s3, l2, m2);
            ObjectMapper mapper = new ObjectMapper();
            ObjectWriter writer = mapper.writerWithDefaultPrettyPrinter();
            System.out.println("t2=" + t2);
            String json = writer.writeValueAsString(t2);
            System.out.println("json=" + json);
            StringReader stringReader = new StringReader(json);
            TestObject t3 = mapper.readValue(stringReader, TestObject.class);
            System.out.println("t3=" + t3);
        } catch (Exception ex) {
            RuntimeException rex;
            if (ex instanceof RuntimeException) {
                rex = (RuntimeException) ex;
            } else {
                rex = new RuntimeException(ex);
            }
            throw rex;
        }
    }
}

And here is the output:

t2=TestObject(l=40404 (class java.lang.Long)  s="ijk" (class java.lang.String)
  b=false (class java.lang.Boolean)  
set=[x, y, z] (class com.google.common.collect.RegularImmutableSet)
  list=[ä, ö, ü, å, ø] (class com.google.common.collect.RegularImmutableList)
  map={r=r101, s=202, 
t=TestObject(l=30303 (class java.lang.Long)
  s="uv" (class java.lang.String)  b=true (class java.lang.Boolean)
  set=[1, 2, 3] (class com.google.common.collect.RegularImmutableSet)
  list=[i, e, a, o, u] (class com.google.common.collect.RegularImmutableList)
  map={A=abc, B=3, C=[1, 2, 3]} (class com.google.common.collect.RegularImmutableMap))}
 (class com.google.common.collect.RegularImmutableMap))
json={
  "l" : 40404,
  "s" : "ijk",
  "b" : false,
  "set" : [ "x", "y", "z" ],
  "list" : [ "ä", "ö", "ü", "å", "ø" ],
  "map" : {
    "r" : "r101",
    "s" : 202,
    "t" : {
      "l" : 30303,
      "s" : "uv",
      "b" : true,
      "set" : [ 1, 2, 3 ],
      "list" : [ "i", "e", "a", "o", "u" ],
      "map" : {
        "A" : "abc",
        "B" : 3,
        "C" : [ 1, 2, 3 ]
      }
    }
  }
}
t3=TestObject(l=40404 (class java.lang.Long)  s="ijk" (class java.lang.String)  
b=false (class java.lang.Boolean)  
set=[x, y, z] (class java.util.HashSet)  
list=[ä, ö, ü, å, ø] (class java.util.ArrayList)  
map={r=r101, s=202, 
t={l=30303, s=uv, b=true, 
set=[1, 2, 3], 
list=[i, e, a, o, u], 
map={A=abc, B=3, C=[1, 2, 3]}}}
  (class java.util.LinkedHashMap))

Process finished with exit code 0

So the immediate object and its immediate attributes were deserialized properly to what we provided. But everything inside went to maps, lists and scalars.

The intermediate JSON does not carry the type information at all, so this is the best that can be done.
Often this is exactly what we want. If not, we need to find something else or see if we can tweak JSON to carry type information.
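
One way to carry type information is to put a type name into the data itself. The following is only a sketch in plain Java, working on the map level: the key `@type`, the class `Point` and the method `revive` are assumptions made up for this illustration. Jackson offers its own mechanism for this (for example the `@JsonTypeInfo` annotation), which is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TypedMaps {

    /** Hypothetical key under which a type name travels with the data. */
    static final String TYPE_KEY = "@type";

    /** Small value class standing in for a bean like TestObject. */
    static final class Point {
        final long x;
        final long y;

        Point(long x, long y) {
            this.x = x;
            this.y = y;
        }

        /** The map form that would be written out as JSON. */
        Map<String, Object> toMap() {
            Map<String, Object> map = new LinkedHashMap<>();
            map.put(TYPE_KEY, "point");
            map.put("x", x);
            map.put("y", y);
            return map;
        }

        static Point fromMap(Map<String, ?> map) {
            // JSON numbers may come back as Integer or Long, so go via Number.
            return new Point(((Number) map.get("x")).longValue(),
                             ((Number) map.get("y")).longValue());
        }

        @Override
        public String toString() {
            return "Point(" + x + ", " + y + ")";
        }
    }

    /** Turns a map back into an object if it names a known type; otherwise it stays a map. */
    static Object revive(Map<String, ?> map) {
        if ("point".equals(map.get(TYPE_KEY))) {
            return Point.fromMap(map);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Object> wire = new Point(3, 4).toMap();
        System.out.println(wire + " -> " + revive(wire));
    }
}
```

The price is that sender and receiver have to agree on the type names, which brings back some of the coupling that plain JSON avoids. Using a logical name like `point` instead of a Java class name at least keeps the format usable from other languages.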

It will be interesting to explore other serialization protocols…



Project-local Libraries

Many projects have these „project-local“ or „company-local“ libraries, that can be used optionally or are even strongly imposed on developers. They may be called something like

  • core
  • toolkit
  • toolbox
  • base
  • platform
  • utils
  • lib
  • common
  • framework
  • baselib
  • sdk
  • tools

or they may even have a good, meaningful name.

Generally there is nothing wrong with such libraries, if they are used correctly.

Some things should be observed, though: Such libraries are there to be useful. If that is not the case, it is better not to use them and to deprecate and discard them. Quite a lot of projects suffer from the fact that inferior local libraries are imposed on them and have to be used. Striving for „consistency“ is not bad either, but it should be kept to a level at which it is useful and should not become a primary goal in itself.

So when the need for some functionality arises, an existing library might cover it well enough. If it is not too hard to integrate, there is little need to write a „local library functionality“ for this. If no good library can be found or reasonably be integrated, it is a good idea to write what is needed oneself. Such a library should have slightly higher quality standards and very good automated testing, because it is more universal and its actual and future usage cannot be anticipated as easily as with „business code“. There are places where it is good to impose a local library, for example to make sure that certain fields are validated in the same way across the software or even across the organization. For data like phone numbers, dates, email addresses etc. that are commonly used in our world, it is probably possible to find good libraries that do this much better than any local library written with reasonable team effort could.

Now the world moves and we might discover better replacements for parts of the local library. It is a good idea to move on to these better replacements.

Things that can go into a local library are small pieces of business logic that need to be used everywhere. Think of an insurance company. Customers have customer numbers. They have to be entered into web application forms, they have to be checked for correctness, they have to be formatted etc. This could go into a library, and we could be sure that everyone in the larger project uses the same rules for what is a valid customer number and how to format and parse it. It is a bit too hard to implement this again and again, because tiny errors sneak in, for example into the code that checks validity with some checksum. But it is too simple to justify a service that is called just to check formal validity. A service is the right place to check whether a formally correct customer number is actually a real customer's number, maybe with constraints on who can see this for which customer. And of course to generate new customer numbers when needed.
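
A sketch of what such a small library class could look like follows. Everything specific in it is an assumption made up for illustration: the nine-digit format, the grouping with dashes, and the use of the Luhn algorithm as the checksum. A real insurance company will have its own rules.

```java
/** Hypothetical shared rules for customer numbers: 8 payload digits plus 1 check digit. */
final class CustomerNumbers {

    private CustomerNumbers() {
        // static helpers only
    }

    /** Formal validity only; whether the customer really exists is a question for a service. */
    static boolean isValid(String digits) {
        if (digits == null || !digits.matches("[0-9]{9}")) {
            return false;
        }
        return checkDigit(digits.substring(0, 8)) == digits.charAt(8) - '0';
    }

    /** Luhn check digit for the given payload digits (assumed checksum, for illustration). */
    static int checkDigit(String payload) {
        int sum = 0;
        boolean doubleIt = true; // the rightmost payload digit is doubled first
        for (int i = payload.length() - 1; i >= 0; i--) {
            int d = payload.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9;
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return (10 - sum % 10) % 10;
    }

    /** Accepts what users typically type (spaces, dots, dashes) and returns the bare digits. */
    static String parse(String input) {
        String digits = input.replaceAll("[\\s.-]", "");
        if (!isValid(digits)) {
            throw new IllegalArgumentException("invalid customer number: " + input);
        }
        return digits;
    }

    /** Formats a valid number in groups of three, e.g. for display in a form. */
    static String format(String digits) {
        if (!isValid(digits)) {
            throw new IllegalArgumentException("invalid customer number: " + digits);
        }
        return digits.substring(0, 3) + "-" + digits.substring(3, 6) + "-" + digits.substring(6);
    }

    public static void main(String[] args) {
        String number = parse("123 456-782");
        System.out.println(number + " formats as " + format(number));
    }
}
```

The checksum loop is exactly the kind of code where tiny errors sneak in when it is written several times, for example by doubling the wrong digits. Having it in one well-tested place is the argument for the library.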

It is always good to ask twice whether certain functionality should rather sit in a service that can be called or in a library, of course depending also on the local architecture preferences. In cases where the functionality is useful for others in the organization, but where the exact same behavior is less important, it can also be an option to just copy the code to different teams. But this should be used with care, because it means that improvements do not easily flow back to benefit other users of the functionality.

Generally, my observation is that organizations rarely have good „local libraries“. The good libraries are found on the internet, or they only exist within a team. And the bad company libraries are often forced on those who cannot find ways to opt out. Good team libraries that are not known outside of the team can sometimes be a loss for the organization, because things are done twice or, even worse, are done differently where it would be good to have the same behavior.
