Virtual machines

We all know that Java uses a „virtual machine“ that is it simulates a non-existing hardware which is the same independent of the real hardware, thus helping to achieve the well known platform independence of Java. Btw. this is not about virtualization like VMWare, VirtualBox, Qemu, Xen, Docker and similar tools, but about byte code interpreters like the Java-VM.

We tend to believe that this is the major innovation of Java, but actually the concept of virtual machines is very old. Lisp, UCSD-Pascal, Eumel/Elan, the Perl programming language and many other systems have used this concept long before Java. The Java guys have been good in selling this and it was possible to get this really to the mainstream when Java came out. The Java guys deserve the credit for bringing this in the right time and bringing it to the main stream.

Earlier implementations where kind of cool, but the virtual machine technology and the hardware were to slow, so that they were not really attractive, at least not for high performance applications, which are now actually a domain of Java and other JVM languages. Some suggest that Java or other efficient JVM languages like Scala would run even faster than C++. While it may be true to show this in examples, and the hotspot optimization gives some theoretical evidence how optimization that takes place during run time can be better than static optimization at compile time, I do not generally trust this. I doubt that well written C-code for an application that is adequate for both C and Java will be outperformed by Java. But we have to take two more aspects into account, which tend to be considered kind of unlimited for many such comparisons to make them possible at all.

The JVM has two weaknesses in terms of performance. The start-up time is relatively long. This is addressed in those comparisons, because the claim to be fast is only maintained for long running server applications, where start-up time is not relevant. The hotspot optimization requires anyway a long running application in order to show its advantages. Another aspect that is very relevant is that Java uses a lot of memory. I do not really know why, because more high level languages like Perl or Ruby get along with less memory, but experience shows that this is true. So if we have a budget X to buy hardware and then put software written in C on it, we can just afford to buy more CPUs because we save on the memory or we can make use of the memory that the JVM would otherwise just use up to make our application faster. When we view the achievable performance with a given hardware budget, I am quite sure that well written C outperforms well written Java.

The other aspect is in favor of Java. We have implicitly assumed until now that the budget for development is unlimited. In practice that is not the case. While we fight with interesting, but time consuming low level issues in C, we already get work done in Java. A useful application in Java is usually finished faster than in C, again if it is in a domain that can reasonably be addressed with either of the two languages and if we do not get lost in the framework world. So if the Java application is good enough in terms of performance, which it often is, even for very performance critical applications, then we might be better off using Java instead of C to get the job done faster and to have time for optimization, documentation, testing, unit testing.. Yes, I am in a perfect world now, but we should always aim for that. You could argue that the same argument is valid in terms of using a more high-level language than Java, like Ruby, Perl, Perl 6, Clojure, Scala, F#,… I’ll leave this argument to other articles in the future and in the past.

What Java has really been good at is bringing the VM technology to a level that allows real world high performance server application and bringing it to the main stream.
That is already a great achievement. Interestingly there have never been serious and successful efforts to actually build the JavaVM as hardware CPU and put that as a co-processor into common PCs or servers. It would have been an issue with the upgrade to Java8, because that was an incompatible change, but other than that the JavaVM remained pretty stable. As we see the hotspot optimization is now so good that the urge for such a hardware is not so strong.

Now the JVM has been built around the Java language, which was quite legitimate, because that was the only goal in the beginning. It is even started using the command line tool java (or sometimes javaw on MS-Windows 32/64 systems). The success of Java made the JVM wide spread and efficient, so it became attractive to run other languages on it. There are more than 100 languages on the JVM. Most of them are not very relevant. A couple of them are part of the Java world, because they are or used to be specific micro languages closely related to java to achieve certain goals in the JEE-world, like the now almost obsolete JSP, JavaFX, .

Relevant languages are Scala, Clojure, JRuby, Groovy and JavaScript. I am not sure about Jython, Ceylon and Kotlin. There are interesting ideas coming up here and there like running Haskell under the name Frege on the JVM. And I would love to see a language that just adds operator overloading and provides some preprocessor to achieve this by translating for example „(+)“ in infix syntax to „.add(..)“ mainstream, to allow seriously using numeric types in Java.

Now Perl 6 started its development around 2000. They were at that time assuming that the JVM is not a good target for a dynamic language to achieve good performance. So they started developing Parrot as their own VM. The goal was to share Parrot between many dynamic languages like Ruby, Python, Scheme and Perl 6, which would have allowed inter-language inter-operation to be more easily achievable and using libraries from one of these languages in one of the others. I would not have been trivial, because I am quite sure that we would have come across issues that each language has another set of basic types, so strings and numbers would have to be converted to the strings and numbers of the library language when calling, but it would have been interesting.

In the end parrot was a very interesting project, theoretically very sound and it looked like for example the Ruby guys went for it even faster than the the Perl guys, resulting in an implementation called cardinal. But the relevant Perl 6 implementation, rakudo, eventually went for their own VM, Moar. Ruby also did itself a new better VM- Many other language, including Ruby and JavaScript also went for the JVM, at least as one implementation variant. Eventually the JVM proved to be successful even in this area. The argument to start parrot in the first place was that the JVM is not good for dynamic languages. I believe that this was true around 2000. But the JVM has vastly improved since then, even resulting in Java being a serious alternative to C for many high performance server applications. And it has been improved for dynamic languages, mostly by adding the „invoke_dynamic“-feature, that also proved to be useful for implementing Java 8 lambdas. The experience in transforming and executing dynamic languages to the JVM has grown. So in the end parrot has become kind of obsolete and seems to be maintained, but hardly used for any mainstream projects. In the end we have Perl 6 now and Parrot was an important stepping stone on this path, even if it becomes obsolete. The question of interoperability between different scripting languages remains interesting…

Schreibe einen Kommentar

Antworten abbrechen