Water pressure CPU

All kinds of technologies are being investigated as possible successors of our decades-old silicon chip technology:

  • quantum computing
  • using light instead of electricity
  • using other semiconductors than silicon

But it is funny that the most obvious approach has not been investigated until recently.
Now it has been leaked that companies in east Asia are following this approach. The CPUs and all the wires in their computers are just very precise pipes that carry water. Water pressure is used to transmit information, and they have discovered nano-structures that do for water pressure what transistors do for electricity, but are much faster and need less space. Since one water molecule weighs only about 2.9915\cdot10^{-26} kg, it is possible to transport a great deal of information with very small amounts of water. Even better, the molecule has so many ways to carry vibrations, which they learned about from homeopathy and homeopathic dilutions, which still carry the vibration of the original substance without containing any molecules of it.

In addition, the issue of CPU heating has totally disappeared. Obviously water cooling is inherently included in the construction.


Usability: ATMs

It is interesting how difficult it can be to use a simple device such as an ATM („bankomat“ in most languages).

Sometimes it is just annoying, sometimes it is really hard to get the machine working properly at all…

So what are the typical use cases? I would say that 90% of the time the intention is to withdraw some money from the main account attached to the card being used. There are some interesting special tasks like looking up the balance or changing the PIN code of the card. And maybe banks are creative and provide even more functionality. Users of the ATM are mostly locals who are customers of the bank that runs the ATM, because banks usually provide the best conditions for their own customers. But there are also foreigners, who do not understand the local language and who want to use a card from somewhere else, which, amazingly, sometimes works.

The first question is in which language to start. We usually agree with our bank on some language, or the bank imposes its local language on us and we hopefully understand it. So why does the ATM ask for the language as its first question instead of finding it out from the card? There should be a button to change the language, but the machine should be smart enough to figure out our preferred language: the card should provide a preferred language and possibly some alternatives as a fallback. Also, at least for the most common operations, a picture could help users understand the text of the button.

The reality is that we find ATMs that work, for example, in Ukrainian and won’t easily change the language. There is not too much danger in trying several options, but it is annoying for the person who wants to use the ATM and for the ones waiting in line.

A step that some ATMs provide is to show what the ATM should look like and ask us to confirm that the machine has not obviously been manipulated. This may be justified for security reasons, and it is not too annoying if we are already in the right language and have not yet entered the PIN. Once the PIN has been entered, it is already too late. Maybe even once the card has been inserted.

Now the PIN code. Usually PIN codes have four digits, but Swiss cards allow six digits, which is a bit better and still reasonable to memorize, because for at least twenty years it has been possible to change the PIN to some memorizable favorite (not „000000“, please). I have once seen a ticket vending machine that could only handle four-digit PIN codes, but I have never seen an ATM that did not accept six digits. And I have seen ATMs that immediately assume that the number is complete with the sixth digit and thus give no chance to correct the last digit. Even worse, the language choice was attached to the „confirm“ button, which therefore could not be used with six digits.

The next step should be to actually withdraw money, with the obvious button, in the right language or at least a secondary language choice, and as the largest button or with a picture. In reality it is the third of seven buttons or so, labeled only with text in a language we do not know. And with each wrong choice the PIN has to be entered again. How convenient for the people in the line, should one of them ever want to use a card that is not his own… Yes, Ukrainians are very honest, but the same applies to any country and to any kind of people who happen to be standing in line.

Now the amount. It is a business decision whether the maximum amount we can withdraw is 3 EUR or 3000 EUR or something in between. It should be the minimum of what this ATM allows and what our bank allows for this card at this moment. And yes, it would be kind of cool if we could find out about the limit of the ATM before spending too much time with it, especially if each transaction costs about as much as the maximum that the ATM will give us. A good way to enter the amount is just to type it on the keypad. It is obvious and should just work, of course with some way to confirm that this is the amount we mean, but the „green“ key of the keypad should just work.

It is really easy to think of good usability for ATMs, but it has not always been done. And it should not be too hard to program them properly.

Writing good software for ticket vending machines is much harder, because there are far more options and it is usually hard to match the travel plan of the customer with the tariff system of the transport network.


Daylight Saving

In many countries of Europe we have to readjust our watches and clocks today, unless they do it automatically.

It is interesting that dealing with this has always been a great challenge for software engineers, and a very high two-digit percentage of software that in some way or other deals with time does not handle it correctly. Operating systems and standard software do a very good job on this, but specialized software that is written today very often does not properly handle the switch. It usually does not matter much, because it just results in effects like having rare phone calls charged for an hour too much, to give an example. And really critical software is properly tested for this, I hope.

But the fraction of software developers who are able to deal with this properly is well below 100%, and even the fraction who are at least able to admit that they do not know and are prepared to listen to somebody who does is below 100%. And daylight saving is really a very minor, invisible side issue for most software projects. They have a task to perform and usually they do that task well enough… So we will continue to develop software that does not really handle daylight saving properly.
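
As a small illustration of the kind of error meant here, a sketch in Scala using java.time (available since Java 8); the zone and the 2018 transition date are just one example:

    import java.time.{Duration, ZoneId, ZonedDateTime}

    object DstDemo extends App {
      // Central Europe switched to DST on 2018-03-25: 02:00 jumped to 03:00.
      val zone   = ZoneId.of("Europe/Zurich")
      val before = ZonedDateTime.of(2018, 3, 25, 1, 30, 0, 0, zone)
      val after  = ZonedDateTime.of(2018, 3, 25, 3, 30, 0, 0, zone)

      // Naive wall-clock subtraction suggests two hours; only one hour
      // really elapsed. A billing system subtracting wall-clock times
      // would charge an hour too much.
      println(Duration.between(before, after)) // prints PT1H
    }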

One more reason to stop this changing of clocks twice a year, especially since the energy savings that were once mentioned as its advantage do not seem to be significant.


Hidden CPUs

How many CPUs does your computer have?

If we go way back, we will discover that ancillary CPUs were in our computers a long time ago. The floppy disk drive of the C64 had a CPU very similar to the one in the computer itself, but with very little memory, and it was hard, though not impossible, to make use of it. I never really tried. PC keyboards had CPUs too; it was said that a Z80 or 8080 or something like that was built into them. I never bothered to find out.

So this concept is not at all new, but was already used 35 years ago. The question is whether our computers still have such hidden CPUs. This seems to be the case, and it is easy to search for „hidden CPUs“ or „secret CPUs“. It would be extremely strange to expect anything different. They do not provide computing power for us; they just run and manage hardware that appears to be plain hardware from the point of view of the main CPU that we can program. So why not just consider this as hardware and ignore the „secret“ or „hidden“ CPUs, seeing them as an implementation detail of the hardware? That is a very legitimate approach and, to be honest, what we do most of the time.

The issue is more delicate now, because these hidden CPUs can access the internet, even when the computer is turned off or seems to be offline. There are tools to analyze the network traffic and to detect this. But we should start to become aware of this invisible world, which is potentially as dangerous as visible malware. And this applies to all kinds of devices, especially cell phones, tablets, routers, TV sets and all „things“ that have their own CPU power and network access…


When to use Scala and Ruby

There are many interesting languages that have their sweet spots and of course a larger set of languages than just two should be considered for new projects.

But Ruby and Scala are both very interesting languages that did not just pick up and sell concepts that were already known, but brought them to a new level and to new beauty. Interestingly, both were started by a single person and eventually became community projects.

There are some differences to observe.

Ruby is mostly a dynamic language, which means that it is easier and more natural to change the program at runtime. This is not necessarily a bad thing, and different Lisp variants including today’s Clojure have successfully used and perfected this kind of capability for many decades. Consequently more things happen at runtime; in particular, dynamic typing is used, which means that types only exist at runtime.

Scala is mostly a static language, which means that all program structures have to be created at compile time. But this has been brought to perfection in the sense that a lot of things that are typically available only in dynamic languages can be done. The type system is static, and it is in this sense more consistent and more rigorous than the type system of Java, where we sometimes encounter areas that cannot reasonably be covered by generics and fall back to the old flavor of untyped collections. This does not happen too often, but the static typing of Scala goes further.

In general this gives more flexibility to Ruby and makes it somewhat harder to tame the ways of doing similar things statically in Scala. But the compile-time type system of course helps to match things, to find a certain portion of errors, and even to make the program more self-explanatory without relying on comments. It is hard to support Scala properly in IDEs, but the most common IDEs have achieved this to a very useful level.

This should not be overvalued, because there are enough errors that cannot be detected by just using common types. It is possible to define more specific types which include tight constraints and thus perform really tight checking of certain errors at compile time, but the built-in types and the types from common libraries are too convenient and the time effort for this is too high, so it does not seem to be the usual practice; a sketch of the idea follows below. In any case it is a recommended practice to achieve a good test coverage of non-trivial functionality with automated tests. These implicitly cover the type errors that the compiler detects in Scala, but of course only to the level of the test coverage.

Ruby has less overhead to compile and run. We just write the program and run it, while we need a somewhat time-intensive compile step for Scala. If tests are included, it does not make so much of a difference, because running the tests or preceding them with a compile job is a minor difference.
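
Here is a minimal sketch of such a constraining type; Percentage and everything around it is made up for illustration:

    // A wrapper type that cannot be confused with a plain Int anywhere
    // in the program (compile-time check) and that enforces its value
    // range at construction time (runtime check).
    final case class Percentage(value: Int) {
      require(value >= 0 && value <= 100, s"not a percentage: $value")
    }

    object TypedDemo extends App {
      def discount(p: Percentage): Unit = println(s"discount of ${p.value}%")

      discount(Percentage(30)) // fine
      // discount(30)          // does not compile: Int is not a Percentage
    }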

An interesting feature of Ruby is called „monkey patching“. This means that it is possible to change methods of an existing class or even of a single object. It can be extremely powerful, but it should be used with care, because it changes the behavior of the class in the whole program and can break libraries. Usually this is not such a bad thing, because it is not used for changing existing methods, but for adding new ones. So it causes problems only when two conflicting monkey patches occur in different libraries, but for big programs with many libraries there is some risk in this area. Scala achieves something similar with „implicit conversions“: a conversion rule is implicitly in scope, and when a method is called on an object whose type does not have it, the adequate conversion is applied prior to the method call. This works at compile time. Most of the time it is effectively quite similar to monkey patching, but it is a bit harder to tame, because writing and providing implicit conversions is more work and harder to understand than writing monkey patches. On the other hand, Scala avoids the risks of Ruby’s monkey patching.
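
A minimal sketch of the Scala side; the shout method and its behavior are invented for the example:

    object StringExtensions {
      // An implicit class is the idiomatic shorthand for an implicit
      // conversion that adds methods to an existing type.
      implicit class RichShout(val s: String) extends AnyVal {
        def shout: String = s.toUpperCase + "!"
      }
    }

    object ImplicitDemo extends App {
      import StringExtensions._
      // String has no shout method; the compiler inserts the conversion.
      println("hello".shout) // prints HELLO!
    }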

An increasingly important issue is making use of multiple CPU cores. Scala, and especially Scala in combination with Akka, is very strong in this area. It supports a reasonably powerful and tamable programming model for using multiple threads. The C or Java SE way is very powerful, but it is quite difficult to avoid shooting oneself in the foot, and even worse, there is a high likelihood that such errors show up only in production, in times of heavy load, while all testing seemed to go well. Raw threads are the way to go in some cases, but they require a lot of care, a lot of thinking and a team of skillful developers. There are more developers who think they belong to this group than are actually able to do this well. Of course Scala already filters out some less skilled developers, but still I think its approach with Akka is more sound.

Ruby, on the other hand, has very little support for multithreading and cannot as easily make use of multiple cores by using threads. While the language itself does support the creation of threads, for many years the major implementation had very little support for this, in the sense that multiple threads were never actually running at the same time. This propagated into the libraries, so this will probably never become a strength of Ruby. The way to go is to actually start multiple processes. This is not so bad, because the overhead of processes in Ruby is much lower than in JVM languages. Still, this is an important area and Scala wins this point.
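
As a small taste of the Scala side, a sketch using plain standard-library Futures (Akka’s actor model goes well beyond this); the numbers are arbitrary:

    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._

    object ParallelSum extends App {
      // Split the work into chunks; each Future may be scheduled on its
      // own thread of the global pool, so the chunks can run on
      // multiple cores.
      val chunks   = (1 to 1000000).grouped(100000).toSeq
      val partials = chunks.map(chunk => Future(chunk.map(_.toLong).sum))

      val total = Future.sequence(partials).map(_.sum)
      println(Await.result(total, 10.seconds)) // prints 500000500000
    }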

Concerning web GUIs, Ruby has Rails, which is really a powerful and well-established way to do this. Scala provides Play, which is in a way a lot of concepts from Rails and similar frameworks transferred to Scala. It is ok to use it, but Rails is much more mature and more mainstream. So I would give this point to Ruby. Rails includes Active Record, about which I do have doubts, but that is really not a necessary component of a pure web GUI, rather a backend functionality…

So in the end I would recommend using Scala and Akka for the solution if it is anticipated that a high throughput will be needed. For smaller solutions I would favor Ruby, because it is a bit faster and easier to get things done with it.

For larger applications a multi-tier architecture could be a reasonable choice, which opens up combinations. The backend can be done with Scala. If server-side rendering is chosen, Ruby and Rails with REST calls to the backend can be used. Or a single-page application written in JavaScript or some language compiling to JavaScript, again with REST calls to the backend.


Carry Bit, Overflow Bit and Signed Integers

It has already been explained how the carry bit works for addition. Now a comment raised the question of how it works for negative numbers.

The point is that the calculation of the carry bit does not depend on the sign in any way. The nature of the carry bit is that it is meant to be used for the less significant parts of the addition. So assume we add two numbers x and y that have k and l words, respectively. We set n=\max(k,l) and make sure that x and y are both n words long by providing the necessary number of 0-words in the most significant positions. Now the addition is performed as described, by starting with a carry bit of 0 and adding with carry x[0]+y[0], then x[1]+y[1] and so on up to x[n-1]+y[n-1], assuming that x[0] is the least significant word and x[n-1] the most significant word, respectively. Each addition includes the carry bit from the previous addition. Up to this point it does not make any difference whether the numbers are signed or not.
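
A sketch of this word-by-word addition with carry in Scala, using 32-bit words; the representation as least-significant-first arrays is just a choice for the example:

    object MultiWordAdd extends App {
      // x and y are numbers given as arrays of 32-bit words, least
      // significant word first. Returns the n-word sum and the final
      // carry bit.
      def add(x: Array[Int], y: Array[Int]): (Array[Int], Int) = {
        val n      = math.max(x.length, y.length)
        val result = new Array[Int](n)
        var carry  = 0L
        for (i <- 0 until n) {
          // Missing words count as 0; mask to treat words as unsigned.
          val xi  = if (i < x.length) x(i) & 0xffffffffL else 0L
          val yi  = if (i < y.length) y(i) & 0xffffffffL else 0L
          val sum = xi + yi + carry
          result(i) = sum.toInt // low 32 bits
          carry = sum >>> 32    // carry bit for the next word
        }
        (result, carry.toInt)
      }

      val (words, carryOut) = add(Array(0xffffffff), Array(1))
      println((words.toList, carryOut)) // (List(0),1): the sum overflows one word
    }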

Now for the last addition we need to consider the question of whether our result still fits in n words or if we need one more word. In the case of unsigned numbers we just look at the last carry bit. If it is 1, we add one more word in the most significant position with the value of 1; otherwise we are already done with n words.

In the case of signed integers, we should investigate what can possibly happen. The input for the last step is two signed words and possibly a carry bit from the previous addition. Assuming we have m-bit words, we are adding numbers between -2^{m-1} and 2^{m-1}-1 plus an optional carry bit c. If the numbers have different signs, an overflow cannot occur and we can be sure that the final result fits in at most n words.

If both are non-negative, the most significant bits of x[n-1] and y[n-1] are both 0. An overflow happens if and only if the sum x[n-1]+y[n-1]+c \ge 2^{m-1}, which means that the result „looks negative“, although both summands were non-negative. In this case another word with value 0 has to be provided in the most significant position n to express that the result is \ge 0 while keeping the already correctly calculated lower words. It cannot happen that real non-zero bits go into this new most significant word; consequently the carry bit can never become 1 in this last addition step.

If both are negative, the most significant bits of x[n-1] and y[n-1] are both 1. An overflow happens if and only if (x[n-1]+y[n-1]+c) \bmod 2^m \lt 2^{m-1}, which means that the result „looks positive or 0“, although both summands were negative. In this case another word with value 2^m-1 or -1, depending on the viewpoint, has to be prepended as the new most significant word. In this case of two negative summands the carry bit is always 1.

Typical microprocessors provide an overflow flag (called „O“ or more often „V“) to deal with this. So the final addition can be left as it is in n words if the overflow flag is 0. If it is 1, we have to signal an overflow or we can just provide one more word. Depending on the carry flag, that word is 0 for C=0 or all bits 1 (2^m-1 or -1, depending on the viewpoint) for C=1.

The overflow flag can be calculated as o := (\mathrm{signbit}(x) = \mathrm{signbit}(y)) \land (\mathrm{signbit}((x+y) \bmod 2^m) \ne \mathrm{signbit}(x)).
There are other ways, but they lead to the same results with approximately the same or more effort.

The following table shows the possible combinations, with examples for 8-bit arithmetic and n=1:

| x   | y   | (x+y) mod 2^8 | Overflow Bit | Carry Bit | additional word needed | value of additional word | Examples (8 bit)             |
|-----|-----|---------------|--------------|-----------|------------------------|--------------------------|------------------------------|
| x≥0 | y≥0 | ≥0            | 0            | 0         | no                     | –                        | 0+0, 63+64                   |
| x≥0 | y≥0 | <0            | 1            | 0         | yes                    | 0                        | 64+64, 127+127               |
| x≥0 | y<0 | ≥0            | 0            | 0 or 1    | no                     | –                        | 65+(-1), 127+(-127)          |
| x≥0 | y<0 | <0            | 0            | 0 or 1    | no                     | –                        | 7+(-8), 127+(-128), 0+(-128) |
| x<0 | y≥0 | ≥0            | 0            | 0 or 1    | no                     | –                        | -9+12, -1+127, -127+127      |
| x<0 | y≥0 | <0            | 0            | 0 or 1    | no                     | –                        | -128+127, -128+0, -1+0       |
| x<0 | y<0 | ≥0            | 1            | 1         | yes                    | -1                       | -64+(-65), -128+(-128)       |
| x<0 | y<0 | <0            | 0            | 1         | no                     | –                        | -1+(-1), -1+(-127), -64+(-64)|

If you like, you can try out examples that include the carry bit and see that the concepts still work out as described.
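
For trying this out, here is a small sketch in Scala that computes the sum, carry flag and overflow flag for one 8-bit word addition with an incoming carry, mirroring the formulas above:

    object AddWithFlags extends App {
      // Adds the 8-bit words x and y (given as Ints in 0..255 or as
      // signed values -128..127) plus an incoming carry, and returns
      // the 8-bit result together with the carry and overflow flags.
      def add8(x: Int, y: Int, cIn: Int): (Int, Int, Int) = {
        val raw   = (x & 0xff) + (y & 0xff) + cIn
        val sum   = raw & 0xff
        val carry = (raw >> 8) & 1
        // overflow: the summands have the same sign bit, but the
        // result's sign bit differs from it
        val overflow =
          if (((x ^ y) & 0x80) == 0 && ((x ^ sum) & 0x80) != 0) 1 else 0
        (sum, carry, overflow)
      }

      println(add8(64, 64, 0)) // (128,0,1): 64+64 overflows, "looks negative"
      println(add8(-1, -1, 0)) // (254,1,0): -1+(-1) = -2, carry set, no overflow
    }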


Source Code of Apple-iOS leaked

It seems that parts of the source code of Apple’s iOS 9 have leaked via GitHub. They might have been removed from there by the time you read this, but they will probably be passed around the internet anyway.

Some sources say that this is a risk to security. It might be, but in the end cryptography specialists tend to consider the availability of the source code an advantage for security, because it can be analyzed by everyone, vulnerabilities can be found and published, and of course they can be corrected more easily when the source is available to everyone. Hiding the source code is a kind of „security by obscurity“, which is not really a strong mechanism; security should be based on verifiably secure mechanisms, as successfully applied by Linux and other open source operating systems. But this advantage might not fully apply if the sources are just passed around in somewhat closed circles and are not easily available to the general public.

This does not make iOS open source, because the licenses that Apple imposes on their software are still valid, and to my understanding they do not make this part of the system open source, which would mean much more than just being able to read the source code of a certain version that might already be outdated. Please observe that if the source code you might find on GitHub is really coming from Apple, their original license applies, not the one mentioned on GitHub.

Putting jailbreaking somewhere near security breaches is wrong, because it is an action done by the owner of the device with his or her own device at his or her own risk. It should be everyone’s right to do so, and there should be nothing wrong with making it easier. I know, we are not living in a perfect world…

So please relax. If Apple has done a good job, there will not be too many bad exploits, and if they are still doing a good job, they will quickly fix any exploits that show up. And if you like to have an open source system, you should still consider using something else.


Java Properties Files and UTF-8

Java uses a nice pragmatic file format for simple configuration tasks and for internationalization of applications. It is called the Java properties file or simply „.properties file“. It contains simple key-value pairs. For most configuration tasks this is useful and easy to read and edit. Nested configurations can be expressed by simply using dots („.“) as part of the key. This was already introduced in Java 1.0. For internationalization there is a simple way to create properties files with almost the same name, but with a language code just before the .properties-suffix. The concept is called a „resource bundle“. Whenever a language-specific string is needed, the program just knows a unique key and performs a lookup.

The unpleasant part of this is that these files are, in the style of the 1990s, encoded in ISO-8859-1, which covers only a few languages of western, central and northern Europe. For other languages, as a workaround, a \u followed by the four-digit hex code can be used to express a UTF-16 code unit, but this is not in any way readable or easy to edit. Usually we want to use UTF-8, or in some cases real UTF-16, without this \u-hack.

A way to deal with this is the native2ascii converter, which can convert UTF-8 or UTF-16 to the format of properties files. By using some .uproperties files, which are UTF-8, and converting them to .properties files with native2ascii as part of the build process, this can be addressed. It is still a hack, but properly done it should not hurt too much, apart from the work it takes to get it working. I would strongly recommend making sure the converted and unconverted files never get mixed up. This is extremely important, because a mix-up is not easily detected in the case of UTF-8 with typical central European content, but it creates the ugly errors that we are used to seeing, like „sch�ner Zeichensalat“ instead of „schöner Zeichensalat“. And we only discover it when the files are already quite messed up, because at least in German the umlaut characters are only a small fraction of the text, though still annoying if messed up. So I would recommend another suffix to make this clear.
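
By the way, for plain configuration files (not resource bundles) the conversion step can already be avoided in pre-Java-9 code, because Properties.load has accepted a Reader with an explicit encoding since Java 6. A sketch in Scala; the file name and key are made up:

    import java.io.{FileInputStream, InputStreamReader}
    import java.nio.charset.StandardCharsets
    import java.util.Properties

    object Utf8Props extends App {
      val props = new Properties()
      // Supplying our own Reader bypasses the ISO-8859-1 default that
      // applies when loading from a raw InputStream.
      val in = new InputStreamReader(
        new FileInputStream("messages.uproperties"), StandardCharsets.UTF_8)
      try props.load(in) finally in.close()

      println(props.getProperty("greeting"))
    }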

The bad thing is that most JVM languages have been kind of „lazy“ (which is usually a good thing) and have used Java’s infrastructure for this, thus inheriting the problem from Java.

Another way to deal with this is to use XML files, which are by default in UTF-8 and which can be configured to be UTF-16. With some development work or a search for existing implementations, there should be ways to do the internationalization like this.

Typically some process needs to be added, because translators are often non-IT people who use some tool that displays the texts in the original language and accepts the translations. For good translations the translator should actually use the software to see the context, but this is another topic for the future. Possibly there needs to be some conversion from the data provided by the translator into XML, .uproperties, .properties or whatever is used. This should be automated by scripts or even by the build process, and new translations should be merged properly with existing ones.

Anyway, Java 9 will be helpful with this issue. Finally, properties files that are used as resource bundles for internationalization can be UTF-8.


PostgreSQL

Almost every non-trivial application uses a database in some way.

For many years this meant Oracle, DB2 or MS SQL Server, depending mostly on the habits and on the religious orientation of the organization that developed or ran the application. These days all three are available for Linux and MS Windows; DB2 is also available on z/OS. The „home platforms“ of these three are probably Linux, z/OS and MS Windows, respectively (2018).

We saw Teradata as an alternative to DB2 and Oracle for data warehouses. These run on huge amounts of data, but are really invisible to most of us. Maybe the data warehouse is the old „big data“, before the term was invented.

We saw a big hype about NoSQL databases, and some interesting DB products from this group could successfully establish themselves.

We saw MySQL (and its fork MariaDB) mostly for database installations that had somewhat lower requirements on the DB product in terms of features or performance. Actually, Wikipedia runs on MySQL or MariaDB, and that is quite a big installation with heavy user load, but it is mostly about reading.

PostgreSQL was often positioned „somewhere between Oracle and MySQL“.

PostgreSQL 10 just came out. The most important new features were replication on a per-table basis, better partitioning of large tables and better support for clustering.

I have worked with all of the database technologies listed here and have even given trainings on MongoDB, Oracle and PostgreSQL.

So where is PostgreSQL positioned really in this landscape?

It is a good database product for a large and growing class of applications. I find it slightly more pleasant to work with than the other four SQL databases mentioned here, because the SQL implementation and its extensions are powerful, clean and behave more or less as expected. Some minor positive points are the default usage of the ISO date format and the distinction between NULL and the empty string; on the other hand, most stuff that works in Oracle at the SQL level can easily be transferred to PostgreSQL. The psql shell works like typical Linux shells in terms of command line editing and history. So a lot of minor details are just pleasant or as they should be.

Comparing to the three groups of contenders:

NoSQL

NoSQL databases leave the mainstream of transactional relational SQL databases and provide us either with some interesting special features or with the promise of performance gains or support for huge database sizes. The price for this is that we lose an extremely mature, clever and powerful query language, which SQL is. I would go for a NoSQL product if the additional feature of this NoSQL DB product cannot reasonably be duplicated in PostgreSQL or other SQL DBs and if it is really useful for the job. I would also go for a NoSQL DB product if the required data sizes and performance cannot reasonably be achieved using an SQL product like PostgreSQL with good tuning of hardware, OS, database and application logic, but can actually be achieved with the NoSQL product. These applications exist, and it is important to pick the right NoSQL DB for the project. It should be observed that PostgreSQL has a lot of features beyond what normal SQL databases have, and looking into this area might be useful… A typical strength of some NoSQL databases (like Cassandra and MongoDB) is that a powerful replication is almost trivial to set up, while it is a really big story for typical transactional SQL databases… This is due to the transactional feature, which adds complexity, difficulty and a performance penalty to some kinds of replication…

MariaDB/MySQL

I do not count MySQL as belonging to Oracle here, because MariaDB is an independent fork outside of Oracle and can be used instead of MySQL. I do think that MySQL does not quite have the level of PostgreSQL in terms of features and cleanness. So we can get PostgreSQL for the same price as MySQL or MariaDB. Why not go for the better product? Even if MariaDB fits perfectly today, the application will grow, and at some point it will prove useful to be based on PostgreSQL. I came across the issue of nested transactions some years ago: they were easily supported by PostgreSQL, but not at all by MariaDB. Issues like that are more likely to come up in this direction than the other way around.

Oracle, DB2, MS-SQL-Server

Especially Oracle makes many long-term loyal customers run away due to their pricing and licensing practices. While it is extremely hard to change the database of a non-trivial database-based application, at least new applications in many organizations are discouraged from using Oracle, unless they can make a point of why they really need it. MS SQL Server might absorb some of these, especially since it is now available on Linux servers. But what Oracle does now might very well be the policy of Microsoft or IBM in a few years, so it makes perfect sense to have a serious look at PostgreSQL. A reasonably well-tuned PostgreSQL will work pretty much as well as a reasonably well-tuned Oracle, DB2 or MS SQL Server. Features that are missing now are being added with new releases. Some interesting features make it just a bit more pleasant to use than, for example, Oracle. It just feels more modern and more Linux-like.

By the way, there were some more contenders in the space of commercial transactional SQL databases, like Adabas D, Sybase and Informix. SAP bought the database product Adabas D in 1997 and the whole Sybase company in 2010, in two more or less unsuccessful attempts to have their own database and not have to use a competitor’s product, but they seem to have some success with HANA now. Informix has been bought by IBM and is still offered as an alternative to DB2. I would say that these products have lost their relevance.

PostgreSQL

So I do recommend seriously considering PostgreSQL as a DB product. It is currently my favorite in this space, but there is no universal tool that fits everything.

Some random aspects to keep in mind when moving from Oracle to PostgreSQL are mentioned here…

The types CLOB and BLOB do not exist. They can mostly be replaced by the types TEXT and BYTEA, but it is not exactly the same. The type TEXT, which is an essentially unlimited variable-length string, can easily be used for columns where we would try to use VARCHAR2 in Oracle, which gives us the advantage that we do not have to worry about defining a maximum length or exceeding the 4k limit that Oracle imposes on VARCHAR2.

Empty strings are not the same as NULL in PostgreSQL; they are in Oracle.

PostgreSQL has a boolean type. Please use it and get rid of the workarounds using CHAR, VARCHAR2 or NUMBER as a replacement.

Oracle only had one kind of transaction isolation that was really well supported, and I think this is still the way to go there. It is an excellent choice and very close to „repeatable read“, while PostgreSQL uses „read committed“ by default, but can be brought to use „repeatable read“. Please keep this in mind and set the transaction isolation level appropriately to avoid very unpleasant surprises.
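
For JDBC-based applications this is one call per connection; a sketch in Scala, assuming the PostgreSQL JDBC driver is on the classpath and with URL and credentials as placeholders:

    import java.sql.{Connection, DriverManager}

    object IsolationDemo extends App {
      // URL, user and password are placeholders for a real configuration.
      val conn: Connection = DriverManager.getConnection(
        "jdbc:postgresql://localhost:5432/mydb", "myuser", "secret")

      conn.setAutoCommit(false)
      // PostgreSQL defaults to READ COMMITTED; request REPEATABLE READ
      // explicitly to get behavior closer to Oracle's snapshot isolation.
      conn.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ)

      // ... queries and updates within the transaction ...

      conn.commit()
      conn.close()
    }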

The structuring of PostgreSQL consists of DB instances, usually only one on a virtual or physical server, which somewhat resembles what a database is in Oracle. Within a DB instance it is possible to define a number of databases without much pain. This was totally not the case with Oracle in earlier years, when it was best practice to rely on schemas, but now we can easily afford to run more virtual servers, each running Oracle (or PostgreSQL), if the licensing does not prohibit it in the case of Oracle. And since Oracle 12 there is the concept of the pluggable database, which splits an Oracle database into sub-databases that behave somewhat like separate databases without the overhead of DB instances. It seems to be quite equivalent to what PostgreSQL does, apart from the naming and many details about how to set it up and how to use it. Schema and user are more separate concepts in PostgreSQL: a schema can be defined totally independently of users, but there is a way to define schema names that match the user names to support that way of working. So we can do pretty much what we want, but the details of how to work it out are quite different.

Each database has its own programming language for writing triggers, stored procedures and the like. They seem to be somewhat similar between the different DB products (we are talking about MS SQL Server, Oracle, PostgreSQL and DB2), but different enough that triggers and stored procedures need to be rewritten from scratch. This is not as painful as it used to be, since the approach of accessing DB tables for read access only via views and for write access only via stored procedures seems to have lost some popularity. Having written a lot of the business logic in PL/SQL, the pain of migrating to another DB product is really enormous, while business logic in Java, Scala, C, C++, Perl, Ruby, C# or Clojure can be ported more easily to a different OS and a different DB. But it is in no way free.

One remark for development: some teams like to use in-memory databases for development and then trust that deployment on PostgreSQL or Oracle or whatever will more or less work. I strongly recommend not to follow this route. It is not at all trivial to support one more DB product, or usually a second DB product, and it is quite easy to set up a virtual OS with the DB product that is actually being used and with test data. PostgreSQL, Oracle, MS SQL Server, MongoDB and whatever you like can be configured to use more memory and perform pretty much like these in-memory DBs, if we set them up for development and are willing to risk data loss. This is no problem, because the image can be trivially copied from the master image when needed. Yes, a really good network and SSDs of sufficient size, speed and quality are needed to work efficiently like this, but it is possible and worthwhile to have that.

I can give training about PostgreSQL and MongoDB and about SQL in different dialects. Find contact information here.

And please: comments, corrections and additional information are always welcome…


The magic trailing space

When comparing strings, of course spaces count as well, and they should count. To ignore them, we can normalize the strings. Typical whitespace normalization includes the following (Perl regular expressions; a Scala equivalent is sketched after the list):

  • s/[ \t]+/ /g — replace any sequence of tabs and spaces used to separate content by one space.
  • s/\r\n/\n/g — replace carriage return + linefeed by linefeed only.
  • s/\s+$// — remove trailing whitespace.
  • s/^\s+// — remove leading whitespace.
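
A minimal sketch of these steps in Scala, using the equivalent Java regular expressions:

    object Normalize extends App {
      // Applies the four normalization steps listed above.
      def normalize(s: String): String =
        s.replaceAll("\r\n", "\n")   // CRLF -> LF
          .replaceAll("[ \t]+", " ") // collapse runs of tabs and spaces
          .replaceAll("\\s+$", "")   // remove trailing whitespace
          .replaceAll("^\\s+", "")   // remove leading whitespace

      println(normalize("  hello \t world \r\n")) // prints "hello world"
    }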

It is often useful to do something like this when comparing strings that originally come from outside sources and are not normalized, where only „the content“ counts. There can be more sophisticated rules to deal with no-break spaces, with control characters, with trailing spaces at the end of each line or only at the end of the whole thing, or with replacing multiple empty lines by just one empty line. The general idea is to think about the right normalization.

In some cases, like long numbers, spaces or other symbols are used to group digits. These should also be removed. Sometimes more specific rules apply, for example for phone numbers, web sites or email addresses, which need to be handled specifically for the type in question, hopefully using an adequate library.

More often than not, we see that web sites do not do this properly. Quite often a piece of information has to be entered and it is not normalized prior to further processing. So credit card numbers or IBAN numbers are rejected because of grouping spaces, or any input because of a trailing space, of course with an error message that gives no hint about what the problem was.

For serious applications there needs to be a serious processing step for data coming from outside anyway, for security reasons. Even though SQL injection should not work due to sound usage of SQL placeholders, it is good practice to check the data anyway and to reject it early and with a meaningful message. Should I trust a site that cannot deal with spaces in a credit card number with the security of my card number? I am not sure.

It is about time that UI developers get into the habit of doing proper processing, normalization and checks of user input. Beware that any security-relevant checks need to be done on the server, or at least on the server as well.
