JSON instead of Java Serialization: The solution?

We are starting to recognize that Java Serialization is not such a good idea.

It is cool and can really work on a wide range of objects, even including complex and cyclic reference graphs. And it was essential for some older Java frameworks like EJB and RMI, which allowed remote access to Java objects and classes.
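To illustrate the point about cyclic reference graphs, here is a minimal sketch using only the JDK. The class names `CyclicSerializationDemo` and `Node` are made up for this illustration. Two nodes point at each other, and the cycle is still intact after a round trip through Java Serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CyclicSerializationDemo {

    /** Hypothetical node type, only used for this illustration. */
    public static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String name;
        public Node next;

        public Node(String name) {
            this.name = name;
        }
    }

    /** Serializes the node to a byte array and reads it back as a deep copy. */
    public static Node roundTrip(Node node) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(node);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a; // cycle: a -> b -> a

        Node copy = roundTrip(a);
        System.out.println(copy.next.name);         // b
        System.out.println(copy.next.next == copy); // true, the cycle survived
    }
}
```

A plain JSON document has no way to express such a back reference, which is one of the things we give up when moving away from Java Serialization.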

But it is no longer the future: Oracle will soon deprecate it and later remove it. And this time it will really happen, even though they usually keep things around for a long time for compatibility reasons.

Just to recap: it raises security concerns, it introduces hidden behavior that makes code harder to reason about, it creates tight coupling between remote components, and it can result in bugs that only occur at runtime and cannot be discovered at compile time. In short, it is not resilient.

So we need something else. Obvious candidates are XML, YAML and JSON. XML is of course an option and is powerful enough to do many things, but it is often a bit too clumsy and needs too much boilerplate, so we try to move away from it. YAML and JSON do roughly the same thing, but JSON seems to be winning the race: we all need to know JSON, and many of us tend to skip YAML.

So why not use JSON? It is easy, it has good libraries and we can even find databases that work with JSON.

What JSON can express very well are scalars, lists and maps, and combinations of these. This is pretty much what we have in Perl, JavaScript or Clojure as basic building blocks. These languages support object-oriented programming, but for simple stuff we go with these basic building blocks. Objects can be modelled as (hash) maps, with the attribute names as keys. Actually JSON is valid JavaScript code.

We do have to change our thinking when moving from Java Serialization to JSON. JSON does not store arbitrary serializable objects, just data. Maybe that is enough and is what we actually want. It works well in heterogeneous environments, where we use different programming languages or different implementations.

There are good libraries. I have tried two, Jackson and GSON, which both work well; recently I have mostly used Jackson. It helps to think the way we would in Clojure, JavaScript or Perl without objects. So we lose type information, which can be considered good or bad, but if we can live with it, we avoid the tight coupling. JavaBeans are expressed exactly like a HashMap with the attribute names as keys. We can provide the top-level class when deserializing, but at the child levels the library cannot figure out the class if that depends on runtime information.
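The statement that a JavaBean looks exactly like a map with the attribute names as keys can be sketched without any JSON library. The following is only an illustration of the idea, not how Jackson works internally; `BeanAsMap`, `Point` and `toMap` are hypothetical names. It reads all getters via reflection and shows that the result cannot be distinguished from a plain map, which is why the class name is gone once the data has become JSON.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class BeanAsMap {

    /** Hypothetical bean, only used for this illustration. */
    public static class Point {
        private final int x;
        private final int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        public int getX() {
            return x;
        }

        public int getY() {
            return y;
        }
    }

    /** Collects the values of all public getters, keyed by attribute name. */
    public static Map<String, Object> toMap(Object bean) {
        Map<String, Object> result = new TreeMap<>();
        try {
            for (Method method : bean.getClass().getMethods()) {
                String name = method.getName();
                boolean isGetter = name.startsWith("get") && name.length() > 3
                        && method.getParameterCount() == 0 && !name.equals("getClass");
                if (isGetter) {
                    // getX -> x
                    String attribute = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                    result.put(attribute, method.invoke(bean));
                }
            }
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException(ex);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> fromBean = toMap(new Point(1, 2));
        Map<String, Object> plainMap = new TreeMap<>(Map.of("x", 1, "y", 2));
        System.out.println(fromBean);                  // {x=1, y=2}
        System.out.println(fromBean.equals(plainMap)); // true
    }
}
```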

Example Code

Here is what I tried out. The full example code is on GitHub.

A class that contains all kinds of stuff. Not prepared for really putting in nulls, but it is just experimental code…


package net.itsky.jackson;

import java.util.List;
import java.util.Map;
import java.util.Set;

public class TestObject {
    private Long l;
    private String s;
    private Boolean b;
    private Set set;
    private List list;
    private Map map;

    public TestObject(Long l, String s, Boolean b, Set set, List list, Map map) {
        this.l = l;
        this.s = s;
        this.b = b;
        this.set = set;
        this.list = list;
        this.map = map;
    }

    public TestObject() {
        // only for framework purposes
    }

    public Long getL() {
        return l;
    }

    public String getS() {
        return s;
    }

    public Boolean getB() {
        return b;
    }

    public Set getSet() {
        return set;
    }

    public List getList() {
        return list;
    }

    public Map getMap() {
        return map;
    }

    @Override
    public String toString() {
        return getClass().getSimpleName() + "("
                + "l=" + l + " (" + l.getClass() + ") "
                + " s=\"" + s + "\" (" + s.getClass() + ") "
                + " b=" + b + " (" + b.getClass() + ") "
                + " set=" + set  + " (" + set.getClass() + ") "
                + " list=" + list + " (" + list.getClass() + ") "
                + " map=" + map + " (" + map.getClass() + "))";
    }
}

And this is used for running everything. To play around more, it should probably be moved to tests..

package net.itsky.jackson;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectWriter;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.ImmutableSet;

import java.io.StringReader;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class App {

    public static void main(String[] args) {
        try {
            Set s1 = ImmutableSet.of(1, 2, 3);
            Set s2 = ImmutableSet.of(1, 2, 3);
            Map m1 = ImmutableMap.of("A", "abc", "B", 3L, "C", s1);
            List l1 = ImmutableList.of("i", "e", "a", "o", "u");
            TestObject t1 = new TestObject(30303L, "uv", true, s2, l1, m1);
            Map m2 = ImmutableMap.of("r", "r101", "s", 202, "t", t1);
            List l2 = ImmutableList.of("ä", "ö", "ü", "å", "ø");
            Set s3 = ImmutableSet.of("x", "y", "z");
            TestObject t2 = new TestObject(40404L, "ijk", false, s3, l2, m2);
            ObjectMapper mapper = new ObjectMapper();
            ObjectWriter writer = mapper.writerWithDefaultPrettyPrinter();
            System.out.println("t2=" + t2);
            String json = writer.writeValueAsString(t2);
            System.out.println("json=" + json);
            StringReader stringReader = new StringReader(json);
            TestObject t3 = mapper.readValue(stringReader, TestObject.class);
            System.out.println("t3=" + t3);
        } catch (Exception ex) {
            RuntimeException rex;
            if (ex instanceof RuntimeException) {
                rex = (RuntimeException) ex;
            } else {
                rex = new RuntimeException(ex);
            }
            throw rex;
        }
    }
}

And here is the output:

t2=TestObject(l=40404 (class java.lang.Long)  s="ijk" (class java.lang.String)
  b=false (class java.lang.Boolean)  
set=[x, y, z] (class com.google.common.collect.RegularImmutableSet)
  list=[ä, ö, ü, å, ø] (class com.google.common.collect.RegularImmutableList)
  map={r=r101, s=202, 
t=TestObject(l=30303 (class java.lang.Long)
  s="uv" (class java.lang.String)  b=true (class java.lang.Boolean)
  set=[1, 2, 3] (class com.google.common.collect.RegularImmutableSet)
  list=[i, e, a, o, u] (class com.google.common.collect.RegularImmutableList)
  map={A=abc, B=3, C=[1, 2, 3]} (class com.google.common.collect.RegularImmutableMap))}
 (class com.google.common.collect.RegularImmutableMap))
json={
  "l" : 40404,
  "s" : "ijk",
  "b" : false,
  "set" : [ "x", "y", "z" ],
  "list" : [ "ä", "ö", "ü", "å", "ø" ],
  "map" : {
    "r" : "r101",
    "s" : 202,
    "t" : {
      "l" : 30303,
      "s" : "uv",
      "b" : true,
      "set" : [ 1, 2, 3 ],
      "list" : [ "i", "e", "a", "o", "u" ],
      "map" : {
        "A" : "abc",
        "B" : 3,
        "C" : [ 1, 2, 3 ]
      }
    }
  }
}
t3=TestObject(l=40404 (class java.lang.Long)  s="ijk" (class java.lang.String)  
b=false (class java.lang.Boolean)  
set=[x, y, z] (class java.util.HashSet)  
list=[ä, ö, ü, å, ø] (class java.util.ArrayList)  
map={r=r101, s=202, 
t={l=30303, s=uv, b=true, 
set=[1, 2, 3], 
list=[i, e, a, o, u], 
map={A=abc, B=3, C=[1, 2, 3]}}}
  (class java.util.LinkedHashMap))

Process finished with exit code 0

So the immediate object and its immediate attributes were deserialized properly to what we provided. But everything inside went to maps, lists and scalars.

The intermediate JSON does not carry the type information at all, so this is the best that can be done.
Often this is exactly what we want. If not, we need to find something else or see if we can tweak JSON to carry type information.

It will be interesting to explore other serialization protocols…

Devoxx UA and Devoxx BE 2019

In 2019 I visited Devoxx UA in Kiev and Devoxx BE in Antwerp.
Traveling was actually a little story by itself, so for now we can just assume that I magically was at the locations of DevoxxUA and DevoxxBE.

In Kiev I attended the following talks:

On Wednesday I attended the following talks in Antwerp:

On Thursday I attended the following talks in Antwerp:

On Friday I attended the following talks in Antwerp:

That’s it…
As always, a lot of these topics deserve an article in this blog. And a lot of video recordings from the conference are worth viewing.

Can hashCodes impose a security risk?

This may come as a surprise, but attackers can assume that software is written in one of the common languages and uses its standard library, which calculates the hashcode of a string in a predictable way. For that reason it is possible to create a large number of distinct strings that all have the same hashcode. If the software relies on hashmaps using these strings as keys, lookups will regularly take linear time instead of almost constant time. This might slow down the system to such an extent that it can be used for a denial of service attack.

The question is what we can do about this. First of all it is necessary to understand where more or less unknown users can enter data into the system, for example the registration of new users or an upload of information by users that are already registered. What would happen in case of such an attack?

At the end of the day we do want to allow legitimate usage of the system. Of course it is possible to detect and stop abusive usage, but such detectors tend to be inaccurate and produce both „false positives“ and „false negatives“. This is something that a regular security team can understand and address. We need to remember that maybe even the firewalls themselves can be targeted by such an attack. So it is up to their developers to harden them against it, which I hope they do.

From the developer point of view, we should look at it from another angle. There could be legitimate data that is hard to distinguish from abusive data, so we could just make our application robust enough to handle this routinely. We need to understand which areas of our software are vulnerable to this kind of attack: where do we have external data that needs to be hashed? There we can create a hashcode h as h(x)=f(\mathrm{md5}(g(x))) or h(x)=f(\mathrm{sha1}(g(x))), where g prepends some „secret“ that is created at startup of the software to the string, then the cryptographic hash is applied, and in the end f reduces the md5, sha1 or sha256 hash to an integer. Since hash maps only need to remain valid during the runtime of the software, it is possible to change the „secret“ at every startup, thus making it reasonably hard for attackers to create entries that result in the same hashcode, even if they know the workings of the software but not the „secret“. A possible way could be to have a special variant of hash map that uses strings as keys, but uses its own implementation of hashcode instead of String’s .hashCode() method. This would allow creating a random secret at construction time.
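A minimal sketch of this idea in Java could look as follows. It is an assumption-laden illustration, not an established solution: the names `SecretHashDemo`, `HardenedKey` and `secretHash` are made up, SHA-256 is picked as the cryptographic hash, g prepends a random secret generated once per process, and f takes the first four bytes of the digest as an int. Instead of a special hash map, a key wrapper with its own hashCode() is used with the ordinary HashMap.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

public class SecretHashDemo {

    /** The secret, created once at startup and never exposed. */
    private static final byte[] SECRET = newSecret();

    private static byte[] newSecret() {
        byte[] secret = new byte[16];
        new SecureRandom().nextBytes(secret);
        return secret;
    }

    /** h(x) = f(sha256(g(x))): g prepends the secret, f keeps the first four bytes. */
    public static int secretHash(byte[] secret, String x) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            sha256.update(secret);
            byte[] d = sha256.digest(x.getBytes(StandardCharsets.UTF_8));
            return ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                    | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }

    /** String key whose hashCode depends on the per-process secret. */
    public static final class HardenedKey {
        private final String value;
        private final int hash;

        public HardenedKey(String value) {
            this.value = value;
            this.hash = secretHash(SECRET, value);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof HardenedKey && ((HardenedKey) other).value.equals(value);
        }
    }

    public static void main(String[] args) {
        Map<HardenedKey, String> users = new HashMap<>();
        users.put(new HardenedKey("alice"), "registered");
        System.out.println(users.get(new HardenedKey("alice"))); // registered
    }
}
```

Calculating a cryptographic hash for every key costs more than String’s own hashCode(), so a wrapper like this would probably only be used for maps that are filled with data from outside.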

I have only just become aware of the weakness of predictable hashcodes and I do not know any established answers to this question, so here you can read what I came up with to address this issue. I think that it might be sufficient to have a simple hashcode function that just uses some secret as an input. Just prepending a secret to the string and then calculating the ordinary .hashCode() will not help: it makes the hashcode unpredictable, but the same pairs of equal-length strings will still collide. So it is necessary to have a hashcode h(x, s) with x the input string and s the secret such that for each x, y, s with x \ne y \wedge h(x, s)=h(y, s) there exists a t with h(x, t) \ne h(y, t), so the colliding pairs really depend on the choice of the secret and cannot be predicted without knowing the secret.
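Why prepending a secret does not help with the ordinary .hashCode() can be checked with a few lines of Java. The class name `PrefixCollisionDemo` is made up for this sketch. "Aa" and "BB" are a well-known colliding pair, and since String.hashCode() is a polynomial in 31, any common prefix contributes the same amount to both strings as long as they have the same length.

```java
public class PrefixCollisionDemo {

    /** True if x and y still collide after the same prefix has been prepended to both. */
    public static boolean collidesWithPrefix(String prefix, String x, String y) {
        return (prefix + x).hashCode() == (prefix + y).hashCode();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112 and 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112

        // hashCode(prefix + x) = hashCode(prefix) * 31^length(x) + hashCode(x),
        // so for strings of equal length the prefix cancels out.
        for (String prefix : new String[] {"", "secret-1", "another secret"}) {
            System.out.println(collidesWithPrefix(prefix, "Aa", "BB")); // true
        }

        // Colliding pairs can be concatenated to build arbitrarily many collisions.
        System.out.println(collidesWithPrefix("secret-1", "AaAa", "BBBB")); // true
    }
}
```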

What do you think about this issue and how it can be addressed from the developer side? Please let me know in the comments section.

www.it-sky-consulting.com now https only

I have converted my company site www.it-sky-consulting.com to always use https.

This is something all sites should do in the next few months.

Weird blackmailing via email from „Hacker“

I got a few emails that looked like this (see at the bottom).

I replaced all references to myself with xxxx. The source of the email indicates that a mail server „nmail.brlp.in“ was used for this.

The fact that the email seems to come from my own mail address is no proof that this guy hacked into my system. With more low-level email software it is quite easy to set header fields to any valid value, and this includes the From part of the email.

So, if you get such emails, here is what you can do: report it to the police. This person or organization is criminal and is stealing money from people who do not understand well enough what is happening here. Maybe they can track down the criminal through international cooperation, maybe not. I uploaded one of these emails to the Swiss federal police, who have a form for such uploads. They gave me some polite advice, basically asking me not to pay.

And that is important: PLEASE DO NOT PAY. The „person“ or „script“ is just pretending to have access to my system. What he claims to have observed is not even true, and the headers of the email give him away as using some mail server and changing the From line.
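To see which servers a mail claims to have passed through, the „Received:“ lines of the source can be listed. Here is a small sketch using only the JDK; `ReceivedHops` and `hops` are made-up names and the sample lines are taken from the headers shown further down. Each mail server puts its own line on top, so the list reads from the latest hop to the earliest. The lines added before the mail reached a server we trust could of course be forged as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReceivedHops {

    // Matches the host that follows "Received: from" at the start of a header line.
    private static final Pattern RECEIVED_FROM =
            Pattern.compile("^Received: from (\\S+)", Pattern.MULTILINE);

    /** Returns the "from" hosts in the order they appear, that is latest hop first. */
    public static List<String> hops(String rawHeaders) {
        List<String> result = new ArrayList<>();
        Matcher matcher = RECEIVED_FROM.matcher(rawHeaders);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String headers = String.join("\n",
                "Received: from nmail.brlp.in (nmail.brlp.in [1.6.36.80])",
                "by xxxxxxxx.xxxxxxxx.com (Postfix) with ESMTPS id DDCCD63C255E",
                "Received: from [216.subnet110-136-205.speedy.telkom.net.id] (unknown [110.136.205.216])",
                "by nmail.brlp.in (Postfix) with ESMTPSA id D2C1345242C8",
                "From: xxxxx@xxxxx.com");
        // Prints the two hosts, one per line; the From line is ignored.
        hops(headers).forEach(System.out::println);
    }
}
```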

I included the whole text, so it is possible to search for it.

Hi, this account is hacked! Modify the password right away!
You might not know anything about me and you obviously are probably wondering why you are receiving this letter, right?
I’mhacker who openedyour emailand OSa few months ago.
Do not waste your time and try out to talk to me or find me, it is definitely hopeless, because I directed you a letter from YOUR own hacked account.
I’ve created special program on the adult videos (porn) website and suppose you spent time on this site to have a good time (you know what I want to say).
During you have been taking a look at videos, your internet browser began to act like a RDP (Remote Control) with a keylogger which gave me the ability to access your monitor and web camera.
Consequently, my softwareaquiredall information.
You wrote passwords on the sites you visited, and I intercepted all of them.
Surely, you’ll be able to modify them, or have already modified them.
Even so it does not matter, my malware renews needed data every time.
What did I do?
I compiled a backup of your system. Of all files and contacts.
I got a dual-screen video recording. The 1 screen presents the clip you had been watching (you have a very good preferences, ha-ha…), and the second screen presents the recording from your own web camera.
What actually do you have to do?
Great, in my view, 1000 USD is a inexpensive amount of money for this little riddle. You will make your payment by bitcoins (in case you don’t understand this, go searching “how to buy bitcoin” in Google).
My bitcoin wallet address:
1ChU6CTsKhRgz761eaEraDRKYRKp6HWtrA
(It is cAsE sensitive, so copy and paste it).
Important:
You have 48 hours in order to make the payment. (I put an exclusive pixel to this message, and at the moment I know that you’ve read this email).
To monitorthe reading of a letterand the actionswithin it, I usea Facebook pixel. Thanks to them. (Everything thatcan be usedfor the authorities may also helpus.)

If I do not get bitcoins, I’ll undoubtedly transfer your recording to each of your contacts, such as family members, co-workers, etc?

The source of the email looked like this (shortened a bit):

Return-Path:
Received: from xxxxxxxx.xxxxxxxx.com ([xx.xx.xx.xx]) by mx-ha.gmx.net
(mxgmx017 [212.227.15.9]) with ESMTPS (Nemesis) id 1MeSc2-1hZOnl0zR6-00aZJW
for ; Tue, 05 Mar 2019 14:49:21 +0100
X-Greylist: delayed 440 seconds by postgrey-1.34 at dd29014; Tue, 05 Mar 2019 14:49:18 CET
X-policyd-weight: using cached result; rate: -6.1
Received: from nmail.brlp.in (nmail.brlp.in [1.6.36.80])
by xxxxxxxx.xxxxxxxx.com (Postfix) with ESMTPS id DDCCD63C255E
for ; Tue, 5 Mar 2019 14:49:18 +0100 (CET)
Received: from localhost (localhost [127.0.0.1])
by nmail.brlp.in (Postfix) with ESMTP id D49CD45242ED
for ; Tue, 5 Mar 2019 19:11:55 +0530 (IST)
Received: from nmail.brlp.in ([127.0.0.1])
by localhost (nmail.brlp.in [127.0.0.1]) (amavisd-new, port 10032)
with ESMTP id yaoBiyeSpTXg for ;
Tue, 5 Mar 2019 19:11:55 +0530 (IST)
Received: from localhost (localhost [127.0.0.1])
by nmail.brlp.in (Postfix) with ESMTP id 11F0F452430F
for ; Tue, 5 Mar 2019 19:11:55 +0530 (IST)
X-Virus-Scanned: amavisd-new at brlp.in
Received: from nmail.brlp.in ([127.0.0.1])
by localhost (nmail.brlp.in [127.0.0.1]) (amavisd-new, port 10026)
with ESMTP id ZRHfjiakcy7Q for ;
Tue, 5 Mar 2019 19:11:54 +0530 (IST)
Received: from [216.subnet110-136-205.speedy.telkom.net.id] (unknown [110.136.205.216])
by nmail.brlp.in (Postfix) with ESMTPSA id D2C1345242C8
for ; Tue, 5 Mar 2019 19:11:53 +0530 (IST)
Subject: xxxxxxxxxx
To: xxxxx@xxxxx.com
List-Subscribe:
X-aid: 6812375433
Date: Tue, 5 Mar 2019 14:41:53 +0100
X-Complaints-To: abuse@mailer.brlp.in
Organization: Rprgtkvvr
Message-ID:
List-ID:
Content-Transfer-Encoding: base64
Content-Type: text/plain; charset=UTF-8
From:
Envelope-To:
X-GMX-Antispam: 0 (Mail was not recognized as spam); Detail=V3;
X-Spam-Flag: NO
X-UI-Filterresults: notjunk:1;V03:K0:QH4Z6L3Srwk=:mzSkXH/rOihoavgPXEhMTWJI56
cKYIahCC4FgRRlHBaVws8990Br6YfEZzEIxbqryIMgtwJsN7FDjKIus+cj7uG9Tga9YXqgqay
E1J7ynKQeIqbcWraD91IZITqhvS/rlWR5NE+dn4j3hJbRoQGWunKSSuznhZQgvlS/bF8dBEUu

02qiW7Uezzr0BqlJ2burWZXtbmbMXXqpEvxECr+g2cXwFmSC8eXuutHrX1LMg

SGksIHRoaXMgYWNjb3VudCBpcyBoYWNrZWQhIE1vZGlmeSB0aGUgcGFzc3dvcmQgcmlnaHQgYXdh
eSENCllvdSBtaWdodCBub3Qga25vdyBhbnl0aGluZyBhYm91dCBtZSBhbmQgeW91IG9idmlvdXNs

Cg==

Encryption of Disks

Today we should encrypt disks in many situations.

I recommend at least encrypting disks of portable computers that contain the home directory and portable USB disks. They can easily get stolen or lost and it is better if the thief does not have easy access to the content. We should even consider encrypting swap partitions.

There are many ways to do this on different operating systems, and I actually only know how to do it for Linux. A possible approach for MS-Windows is to run it in a virtual box inside Linux and just profit from the Linux-based encryption. That is what I do, but I do not use my MS-Windows very much. About Apple computers I have no knowledge, so please go to the site of somebody else for disk encryption on them. I know that there is an option available for this, but I do not know how to use it or how good it is.

I prefer to rely on open source solutions for security-related issues, because it is harder (but not impossible) to put malicious components into the software and easier to find and fix them. Serious security specialists tend to make the general point that it is better to rely on good and well-maintained open source software for security than on closed software, of which we do not know the wanted and unwanted backdoors and vulnerabilities.

The way it works in Linux is that we encrypt a disk partition. It can then be accessed after providing a password. It is possible to provide several alternative passwords. The tools to use are dm-crypt (a kernel module), LUKS, cryptsetup and cryptmount.

It can be done like this (example session) for an external drive that appears as /dev/sdc. Please be extremely careful not to erase any data that you still need or hide data behind a password that you do not know…


# check the partitions
$ fdisk -l /dev/sdc

# encrypt the partition and provide a password:
$ cryptsetup luksFormat /dev/sdc1

# access the partition
$ cryptsetup luksOpen /dev/sdc1 encrypted-external-drive-1

# format it with whatever file system you want to use
$ mkfs.ext4 /dev/mapper/encrypted-external-drive-1
# or
$ mkfs.btrfs /dev/mapper/encrypted-external-drive-1
# or whatever you prefer..

Now each time the disk is mounted, the password needs to be provided.

Which file system is best might be worth writing about in the future; it is not covered in this article.

Hidden CPUs

How many CPUs does your computer have?

If we go way back, we will discover that ancillary CPUs were already present in our computers some time ago. The floppy disk drive of the C64 had a CPU very similar to the one in the computer itself, but very little memory, and it was hard, though not impossible, to make use of it. I never really tried. PC keyboards had CPUs as well; it was said that a Z80 or 8080 or something like that was built into them. I never bothered to find out.

So this concept is not at all new; it was already in use 35 years ago. The question is whether our computers still have such hidden CPUs. This seems to be the case, and it is easy to search for „hidden CPUs“ or „secret CPUs“. It would be extremely strange to expect anything different. They do not provide compute power for us, but just run and manage hardware that appears to be plain hardware from the point of view of the main CPU that we can program. So why not just consider this as hardware, ignore the „secret“ or „hidden“ CPUs and see them as an implementation detail of the hardware? That is a very legitimate approach and, to be honest, what we do most of the time.

The issue is more delicate now, because these hidden CPUs can access the internet, even when the computer is turned off or seems to be offline. There are tools to analyze the network traffic and to detect this. But we should start to become aware of this invisible world that is potentially as dangerous as visible malware. And this applies to all kinds of devices, especially cell phones, tablets, routers, TV-sets and all „things“ that have their own CPU power and network access…

Source Code of Apple-iOS leaked

It seems that parts of the source code of Apple’s iOS 9 have leaked via GitHub. They might have been removed from there by the time you are reading this, but they will probably be passed around on the internet anyway.

Some sources say that this is a risk to security. It might be, but in the end cryptography specialists tend to consider the availability of the source code an advantage for security, because it can be analyzed by everyone, and vulnerabilities can be found, published and of course more easily corrected if the source is available to everyone. Hiding the source code is a kind of „security by obfuscation“, which is not really a strong mechanism. Security should instead be based on verifiably secure mechanisms, as successfully applied by Linux and other open source operating systems. But this might not be fully true if the sources are just passed around in somewhat closed circles and are not easily available to the general public.

This does not make iOS open source, because the licenses that Apple imposes on their software are still valid and to my understanding they do not make this part of the system open source, which means much more than just being able to read the source code of a certain version that might already be outdated. Please observe that if the source code that you might find on GitHub really comes from Apple, their original license applies and not the one mentioned on GitHub.

Putting jailbreaking anywhere near security breaches is wrong, because it is an action performed by the owners of a device on their own device at their own risk. Everyone should have the right to do this, and there should be nothing wrong with making it easier. I know, we are not living in a perfect world…

So please relax. If Apple has done a good job, there will not be too bad exploits and if they are still doing a good job, they will quickly fix any exploits that show up. And if you like to have an open source system, you should still consider using something else.

The magic trailing space

When comparing strings, spaces of course count as well, and they should. To ignore them, we can normalize the strings. Typical whitespace normalization includes the following (Perl regular expressions):

  • s/[ \t]+/ /g replaces any sequence of tabs and spaces used to separate content by one space.
  • s/\r\n/\n/g replaces carriage return + linefeed by linefeed only.
  • s/\s+$// removes trailing whitespace.
  • s/^\s+// removes leading whitespace.

It is often useful to do something like this when comparing strings that originally come from outside sources and are not normalized, when only „the content“ counts. There can be more sophisticated rules to deal with no-break spaces, with control characters, with trailing spaces at the end of each line or only at the end of the whole text, or to replace multiple empty lines by just one empty line. The general idea is to think about the right normalization.
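The four basic rules listed above translate quite directly to Java. This is only a sketch with a made-up class name `WhitespaceNormalizer`; it uses \z and \A to anchor at the very end and the very beginning of the whole string, and it does not cover the more sophisticated rules such as no-break spaces.

```java
public class WhitespaceNormalizer {

    /** Applies the four basic rules in the order in which they are listed. */
    public static String normalize(String input) {
        return input
                .replaceAll("[ \\t]+", " ")  // runs of tabs and spaces become one space
                .replace("\r\n", "\n")       // carriage return + linefeed becomes linefeed
                .replaceAll("\\s+\\z", "")   // remove trailing whitespace
                .replaceAll("\\A\\s+", "");  // remove leading whitespace
    }

    public static void main(String[] args) {
        String raw = "  foo \t bar\r\nbaz  \r\n";
        System.out.println(normalize(raw).equals("foo bar\nbaz")); // true
    }
}
```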

In some cases, like long numbers, spaces or other symbols are used to group digits. These should also be removed. Sometimes more specific rules apply, as for phone numbers, web sites, email addresses etc., and these need to be handled specifically for the type, hopefully using an adequate library.
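For a credit card number this could be sketched as follows. The class name `GroupedNumbers`, the accepted grouping symbols (whitespace and hyphens) and the length range of 12 to 19 digits are assumptions made for this illustration; a real application would rather use an adequate library and also verify the check digit.

```java
public class GroupedNumbers {

    /**
     * Removes whitespace and hyphens used for grouping and returns the bare digits.
     * Rejects the input with a meaningful message if anything else is left over.
     */
    public static String normalizeCardNumber(String input) {
        String digits = input.replaceAll("[\\s-]", "");
        if (!digits.matches("\\d{12,19}")) {
            throw new IllegalArgumentException(
                    "Card number must consist of 12 to 19 digits, "
                            + "optionally grouped by spaces or hyphens");
        }
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(normalizeCardNumber("4111 1111 1111 1111 ")); // 4111111111111111
        System.out.println(normalizeCardNumber("4111-1111-1111-1111"));  // 4111111111111111
    }
}
```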

More often than not we see that web sites do not do this properly. Quite often information has to be entered and is not normalized prior to further processing. So credit card numbers or IBANs are rejected because of spaces, or anything at all is rejected because of trailing spaces, of course with an error message that gives us no hint about what the problem was.

For a serious application there needs to be a serious processing step for data coming from outside anyway, for security reasons. Even though SQL injection should not work thanks to sound usage of SQL placeholders, it is good practice to check the data anyway and reject it early and with a meaningful message. Should I trust the security of a site that cannot deal with spaces in a credit card number enough to give them my card number? I am not sure.

It is about time that UI developers get into the habit of doing proper processing, normalization and checks for user input. Be aware that any security-relevant checks need to be done on the server, either exclusively or in addition to the client.

WPA2 compromised

The WPA2 protocol has been compromised. The so-called KRACK attack allows reading encrypted content.
It has always been a good idea to use encrypted communication on top of WPA2 for sensitive data, like https, ssh or a VPN. This practice has been recommended in this blog before, which in turn was inspired by what Bruce Schneier wrote about it.

Anyway, we should certainly start thinking of WLANs with WPA2 encryption as a useful transport mechanism, but not as a very secure mechanism for encrypting data. At least from now on we should use other encrypted protocols on top of WPA2 where appropriate, or use wired networks for internal communication that we do not want to encrypt additionally.
