XML

In the late 1990s there was real hype around XML. Tons of standards evolved, and acquiring sound knowledge of it was a big deal.

It has been something of a success, because it is still around and very common almost 20 years later.

I would say that the idea of having a human-readable and human-editable text format has mostly failed. Trivial XML can be edited manually without too much risk of breaking it, but for such cases simpler formats like JSON or even Java properties files would be sufficient and easier to deal with, unless that turns out to be the 1001st slightly different format that needs to be learned. XML is different each time anyway, because it depends on the schema, so the same problem exists on that side, but of course the general idea is well known.

For complex XML, manual reading and editing becomes a nightmare. It is just so much harder for humans to read than any reasonably common programming language of our time. It is text, but so involved that it feels half binary. And who knows, maybe we can also edit binary files with a hex editor. Real magicians, or in this case people with too much time, can do so and keep the binary file correct and uncorrupted, at least for some binary formats. And they can do the same with XML… But it is actually better to have a tool or a script to create and change non-trivial XML configuration files.
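As a rough sketch of what such a script could look like, here is a small Python example using the standard library's `xml.etree.ElementTree`. The configuration document, its element names (`config`, `server`, `port`) and the helper `set_port` are made up for illustration and do not refer to any real product.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; element and attribute names are invented.
CONFIG = """<config>
  <server name="alpha"><port>8080</port></server>
  <server name="beta"><port>8081</port></server>
</config>"""


def set_port(xml_text: str, server_name: str, port: int) -> str:
    """Return the XML text with the port of the named server replaced."""
    root = ET.fromstring(xml_text)
    for server in root.findall("server"):
        if server.get("name") == server_name:
            server.find("port").text = str(port)
    # Serializing through the library keeps the document well-formed.
    return ET.tostring(root, encoding="unicode")


updated = set_port(CONFIG, "beta", 9090)
print(updated)
```

Because the change goes through a parser and a serializer, the result stays well-formed, which is exactly what is hard to guarantee when editing by hand.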

Where XML is strong is in data exchange between systems. This is mostly transfer in space between different systems, but it can also be transfer in time, that is, storing information to be retrieved later. It gives us a format that allows for some "type safety", that is very versatile and that comes with a lot of tool and script support. Even here we have to acknowledge some drawbacks. Maintaining an XML interface involves some human work: maintaining the schema files and adapting the software. It costs some CPU overhead on the sending side and even more on the receiving side for creating and parsing XML. The libraries have been optimized, but they still take a little bit of time. And on the network side we transmit a multiple of the actual amount of data if it is densely packed with tags.
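To get a feeling for the size overhead, the following Python sketch serializes the same invented records once as XML and once as compact JSON and compares both with the length of the bare values. The records and the element names are assumptions chosen for illustration, and the ratios depend heavily on how long the tag names are relative to the values.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample data: 100 small records with short values.
records = [{"id": i, "name": f"item{i}", "price": 9.99} for i in range(100)]


def to_xml(rows):
    """Serialize the rows with one element per field."""
    root = ET.Element("items")
    for row in rows:
        item = ET.SubElement(root, "item")
        for key, value in row.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(records)
json_text = json.dumps(records, separators=(",", ":"))

# Length of the values alone, without any markup around them.
raw_size = sum(len(str(v)) for row in records for v in row.values())

print("XML characters: ", len(xml_text))
print("JSON characters:", len(json_text))
print("raw value chars:", raw_size)
```

With short values like these, the tags dominate the document. Compression on the transport layer usually reduces the difference considerably, at the cost of more CPU time.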

But it is a format that is well understood, that works on pretty much any platform and over the network, and that usually allows us to support different versions of the same interface simultaneously. For debugging it is good to have a format that is at least human-readable, even if not very pleasant. Ideally the schema is defined in a way that is self-documenting.

I wonder why approaches like the one in WML have not become more common. WML had a customized compressed format that was friendlier to low-bandwidth cell phones.

XML is good for many purposes, but as always it is good to know other tools, like JSON, and to decide when it is a case for XML and when it is not.

Some positive side effects of XML are that it helped other standards become more mainstream. UTF-8 was the default encoding for XML from the beginning, and it is now a common standard encoding for any text. And with XML Schema it became common to encode dates within XML in the ISO format (ISO 8601), which helped this format become generally known and commonly used for cases where one date format should work independently of the origin of the reader.
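The ISO date format is easy to produce and read in most languages. As a small sketch in Python, the following writes a date and a timestamp into an XML fragment and parses them back. The element names `order`, `orderDate` and `shippedAt` are hypothetical, and no schema validation takes place here.

```python
from datetime import date, datetime, timezone
import xml.etree.ElementTree as ET

# Build a small document with ISO-formatted date values.
order = ET.Element("order")
ET.SubElement(order, "orderDate").text = date(2017, 3, 5).isoformat()
ET.SubElement(order, "shippedAt").text = datetime(
    2017, 3, 6, 14, 30, tzinfo=timezone.utc
).isoformat()
xml_text = ET.tostring(order, encoding="unicode")

# Read the values back; the reader needs no locale-specific knowledge.
parsed = ET.fromstring(xml_text)
order_date = date.fromisoformat(parsed.findtext("orderDate"))
shipped_at = datetime.fromisoformat(parsed.findtext("shippedAt"))

print(xml_text)
print(order_date, shipped_at)
```

The year-month-day order is unambiguous, unlike day-first or month-first notations that depend on where the reader comes from.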
