Microsoft is buying Github

It seems that Microsoft is buying Github for about 7.5 billion USD worth in their own stock.

Is this a good thing or a bad thing? Probably no reason to celebrate, but Microsoft under Nadella seems to run a totally different strategy than Microsoft under Steve Ballmer. Selling licenses of MS-Windows and MS-Office is probably still a good business, if done efficiently, but it cannot be the future of a company of the size of Microsoft. The dominating operating system is now Linux, mostly due to cell phones and tablets running Android, network devices and of course servers. The religious rejection of open source, as it was propagated by Steve Ballmer, is no longer visible. Some MS-Projects like C# and F# have in large parts or completely become open source and on the MS-Azure-platform virtual servers with Linux and PostgreSQL can be chosen, with the limitation that there seems to be a need to have an MS-Windows-box with dedicated software to manage the Azure cloud services. If we ask Richard Stallman this is probably all tactics to achieve bad goals by harming the free software movement. This can be true or not.

But we no longer have to automatically assume that this acquisition is bad. Most likely Github will continue to operate as it did before. Nadella will not shut down Github, while Ballmer might have done so, maybe. He would not have bought Github anyway for so much money. The real asset of Github as well as of LinkedIn and Skype for Microsoft might be the large collection of high quality identity data. Today Google and Facebook have identity data for about two thirds of the adult world population by my estimation. We see it in action when we find a „login with Google“ or „login with Facebook“ option for a web site. Yes, „login with Skype“, „login with Github“ and „login with LinkedIn“ would also be possible or even exist. This is the spot were these acquisitions might or might not be seriously negative. But increasing the pool of identity data makes a lot of sense for the new strategy of the company and I would assume that it was one of the major reasons for this acquisition of Github as well as Skype and LinkedIn before. Is it worth so much money? If the answer is quite obvious, we will probably know in the future.

Links

Links are in English or in German…

Share Button

The little obstacles of interoperability

Deutsch

A lot of things in today’s IT landscape have been unified and interoperability is much better than 20 years ago.

Some examples:

  • Networking: Today networking is TCP/IP. Even the physical cables with RJ45/Ethernet and the wireless networks have been standardized and all kinds of devices can use the same networks. In the old days there were tons of mutually incompatible proprietary network technologies like BitNet (IBM), NetBios (MS), DecNet (DEC), IPX (Novell),….
  • Character Sets: Today we have Unicode and a few standard encodings. At least for Web and Emails we have ways to provide meta information about the actual encoding. This area is not totally free of problems, but on a very good way. In earlier days we had to deal with different EBCDIC-encodings or with character sets that only fully supported English langauge and a few languages using the same alphabet without additional letters like „ä“, „ö“, „ü“, „ø“, „ñ“,… So there we have encountered a great progress.
  • Numbers: For floating point numbers and integers a relatively small set of standardized numerical types has been established, that are used more or less everywhere and behave almost in the same way. The issue of integer overflow remains problematic, though.
  • Software: In the old days software was written for a specific machine, one CPU architecture and one OS. Today we have common platforms on different kind of hardware: Linux runs on almost any physical and virtual hardware from mobile phones and routers to super computers with practically the same kernel. Java, Ruby, Perl, Scala and some other programming languages are available on a range of platforms and provide some kind of abstract platform. And the web is often a good way to develop applications once for a large range of devices.
  • File system: At least we now have a common understanding what a file system looks like. There are some OS specific specialties, but the general idea is still the same. It is possible to share file systems between different operating systems by using technologies like Samba.
  • GNU-Tools: The GNU-Tools (bash, ls, cp, mv,……..) have become the standard on Linux boxes and they are way superior to the traditional Unix tools of the same role and name, that we can still find in Solaris, for example. You can (and should) install them on any Unix and even via cygwin on MS-Windows.

Interoperability is today for many of us interoperability between Linux (and possibly some other rare Posix/Unix-like systems) and Win32/Win64 (MS-Windows).

Experienced Linux users are used to having the forward slash („/“) as file path separator. The backslash is used for escaping special characters. In the MS-Windows-world we often see the backslash („\“) as file path separator. That is enforced in the CMD-window, because it does not pass through the forward slash. My experience with low level Win32/Win64 libraries shows that they understand both variants equally well. Anyway do Ruby, Perl, Java, Cygwin and others support the forward slash. So there is hardly ever any need to check the OS and use backslash or forward slash depending on that, other than for cmd/bat-scripts, but who wants them for more then five lines? I strongly recommend just using the forward slash also under MS-Windows when writing programs in Java, Perl, Scala, Ruby, … It makes life also easier, because the backslash often has to be doubled and it is sometimes hard to keep track of how many backslashes need to be written and to read it.

The line ending is a bit tricky. Linux and Unix use just a „Linefeed“ („\n“=Ctrl-J). For MS-DOS and MS-Windows the combination „Carriage-Return+Linefeed“ („\r\n“=Ctrl-M Ctrl-J) has become the default. Most of today’s programs do not care and understand both variants more or less equally well. Only Notepad does get confused with Linefeed-only files, but notepad is a bad choice anyway. Better Editors (gvim, emacs, ultraedit, scite, …) exist. On the other hand we get problems with the MS-Windows-line-termination in the case of executable scripts. Usually the contain a first line like „#/usr/bin/ruby“. The OS uses that as hint on how to execute them, in this case by calling /usr/bin/ruby. If the line ends with Ctrl-M Ctrl-J, then the OS tries to find a program „/usr/bin/ruby^M“ (^M = Ctrl-M = „\r“), which of course does not exist and we just get an obscure error message.

It is easy to do the conversion:

$ perl -i~ -p -e ’s/\r//g;‘ script

Or the other direction:

$ perl -i~ -p -e ’s/\n/\r\n/g;‘ textfile

For those who use subversion there are ways to enforce a certain way of line endings. Even git supports this.

Share Button

Encryption of Disks

Today we should use encryption of disks for many situations.

I recommend at least encrypting disks of portable computers that contain the home directory and portable USB disks. They can easily get stolen or lost and it is better if the thief does not have easy access to the content. We should even consider encrypting swap partitions.

There are many ways to do this on different operating systems and actually I only know how to do it for Linux. A possible approach for Windows is to run MS-Windows in a virtual box inside Linux and just profit from the Linux-based encryption. That is what I do, but I do not use my MS-Windows very much. About Apple computers I have no knowledge, please go to the site of somebody else for encryption of disks for them. I know that there is an option available for this, but I do not know how to use it and how good it is.

I prefer to rely on open source solutions for security related issues, because it is harder (but not impossible) to put in malicious components into the software and it is easier to find and to fix them. This is a general point that serious security specialists tend to make that it is better to rely on good and well maintained open source software for security than on closed software of which we do not know the wanted and unwanted backdoors and vulnerabilities.

The way it works in Linux is that we encrypt a disk partition. It can then be accessed after providing a password. It is possible to provide several alternative passwords. The tools to use are dm-crypt, a kernel module, LUKS, cryptsetup, and cryptmount.

It can be done like this (example session) for an external drive that appears as /dev/sdc. Please be extremely careful not to erase any data that you still need or hide data behind a password that you do not know…


# check the partitions
$ fdisk -l /dev/sdc

# encrypt the partition and provide a password:
$ cryptsetup luksFormat /dev/sdc1

# access the partition
$ cryptsetup luksOpen /dev/sdc1 encrypted-external-drive-1

# format it with whatever file system you want to use
$ mkfs.ext4 /dev/mapper/sdc1 encrypted-external-drive-1
# or
$ mkfs.btrfs /dev/mapper/sdc1 encrypted-external-drive-1
# or whatever you prefer..

Now each time the disk is mounted, the password needs to be provided.

The issue which file system is best might be worth writing about in the future, it is not in this article.

Links

Share Button

Daylight Saving

In many countries of Europe we have to readjust our watches and clocks today, unless they do it automatically.

It is interesting, that dealing with this has always been a great challenge for software engineers and a very high two digit percentage of software that in some way or other deals with time, does not handle it correctly. Operating systems and standard software are doing a very good job on this, but specialized software that is written today very often does not properly handle the switch. It usually does not matter, because it just results in effects like having rare phone calls charged for an hour too much, to give an example. And really critical software is properly tested for this, I hope. But software developers who are able to deal with this properly are less much than 100%, and even those who are at least able to accept that they do not know and prepared to listen to somebody who knows it, are less than 100%. And daylight saving is really a very very minor invisible side issue for most software projects. They have a task to perform and usually they do that task well enough… So we will continue to develop software that is not really properly handling daylight saving.

One more reason to stop this changing of clocks twice a year, especially since the saving of energy, that was once mentioned as advantage, does not seem to be significant.

Share Button

Source Code of Apple-iOS leaked

It seems that the parts of the source code of Apple’s iOS 9 have leaked via github. They might have been removed from there, while you are reading this, but probably they will be passed around in the internet anyway.

Some sources say that this is a risk to security. It might be, but in the end cryptography specialists tend to consider the availability of the source code as an advantage for security, because it can be analyzed by everyone, vulnerabilities can be found and published and of course more easily be corrected if the source is available to everyone. Hiding the source code is some kind of „security by obfuscation“, which is not really a strong mechanism and it should be based on verifiable secure mechanisms, as successfully applied by Linux and other open source operating systems. But this might not be fully true, if the sources are just passed around in somewhat closed circles and not easily available to the general public.

This does not make iOS open source, because the licenses that Apple imposes on their software are still valid and to my understanding they do not make this part of the system open source, which means much more than just being able to read the source code of a certain version that might already be outdated. Please observe that if the source code that you might find on github is really coming from Apple, their original license and not the one mentioned in github applies.

To put Jail breaking somewhere near security breaches is wrong, because this is an action done by the owner of the device with his or her own device at own risk. This should be everyone’s right to do so and there should be nothing wrong with making it easier. I know, we are not living in a perfect world…

So please relax. If Apple has done a good job, there will not be too bad exploits and if they are still doing a good job, they will quickly fix any exploits that show up. And if you like to have an open source system, you should still consider using something else.

Links

Share Button

Microsoft stops Windows Mobile

It seems that Microsoft is ending the development of Windows Mobile. After having tried with some effort to get into the mobile operating system business, it seems that the market share is now less than 1%, with Android being >85% and iOS close to 15% according to IDC. Because the market is so large, it would be possible to run a profitable business even with a low market share, but this is probably hard for a big company and it seems more attractive to concentrate on other areas. It is good to have some good mobile apps available for the platform and that is an area where Android and iOS shine, while doing a third app for Windows phone is a bit unusual. A funny detail is that a retired guy, who was once the founder of Microsoft and who is still associated with that company by some people uses an Android phone for himself. But he is retired, so that is no longer too important.

The story is a bit weird, though… In 2010 Stephen Elop became the boss of Nokia. At this time Nokia had a market share of around 50% in mobile phones covering a wide range from tiny „non-smart“ phone to high end smart phones. They were mostly using Symbian as OS, but the transition to Maemo and MeeGo, like Android Linux variants, was on a good way and it would have been worth seeing where this might go. At this time it was already quite clear that MS Windows phone/Windows mobile was a failure. All efforts concerning Linux-based systems were stopped and Symbian was announced as being a dead end and the strategy was to move to MS-Windows phone only. Most likely this was done because Stephen Elop had more loyalty to Microsoft than to the company that he was running. To my knowledge this never became a case for the courts, but one might assume some criminal energy behind this. And some stupidity of the stock holders, who selected this person as CEO. Some time later the mobile phone branch of Nokia went down and was bought for very little money by Microsoft. After that acquisition it was further downsized and will probably go to zero soon, because Microsoft does not have interest to develop new hardware.

As it seems, there are mobile phones with the brand „Nokia“ again. HMD, a company in Finland designs them, pays to the company Nokia some money to use their brand. And of course they use Android.

It seems that the issue of having to write a mobile app twice because of Android and iOS has become a bit less pressing. Besided mobile web applications that can pretty much behave as rich clients through modern JavaScript frameworks, there are „fake apps“ that are actually pretty much browsers without a visible URL bar and programmed to only surf their home site. And even native apps are now increasingly being developed in Swift for iOS and Kotlin for Android, which seem to be quite similar, at least at a conceptional level.

Share Button

Shell Scripts

Shell scripts can be useful for writing small stuff like combining a few commands to pipes or doing a bit of „back ticking“. Even simple loops and if-conditions are possible. And if we want, it is almost a full programming language. A bit hard to tame, maybe, but quite a lot of stuff is possible. Those who like to know more about it may look into startup scripts of typical java software. Often a .bat and a .sh file are provided, where the right jvm is found, the classpath and the execution path and maybe some other environment are put together. In the end the .sh-file is quite a long and unreadable horror story and the .bat file is even much worse, because the cmd-language is just a lot more primitive and less capable and requires even worse hacks.

There are ways to make shell scripts more readable, which by themselves are truly admirable, but I think that route is wrong. We can learn all the Shell functionalities and understand bit by bit even more complex shell scripts, but I think for non trivial shell scripts it is time to switch to real programming languages instead. Scripting languages, of course, for example Perl, Ruby, Python or Lua. We may still execute „shell commands“, that are actually programs in /bin, /usr/bin or /usr/local/bin where they are powerful and more concise than writing purely in that programming language. But a magic for putting together a classpath is much cleaner in the Perl programming language than in pure bash (or worse cmd/bat).

This is of course another example of the Golden Hammer anti pattern. We should balance our tool box. Not add specific tools for making any minor task a bit easier on the expense of supporting one more tool, but keep a broad range of tools that in conjunction are very powerful. For example I would retire awk and sed and use either Perl or Ruby instead. We only have to keep them around because a lot of system tools that are just there still rely on them, but for a team I would deprecate awk and sed for new scripts or even for enhancing existing scripts. Bash would be ok only for small scripts, you can invent a line number or a maximum complexity, but for very short scripts I think bash is a legitimate tool.

Switch to Perl, Perl6, Ruby or… when you encounter any of the following:

  • The scripts is getting kind of long (>= 100 lines)
  • You find yourself modularizing it with functions
  • You find yourself using non trivial perl, ruby, sed or awk within the script, for example regex-stuff
  • The script need interaction
  • The scripts needs arrays, numbers or other types
  • More than one or two trivial if-statements or loop-statements are needed
  • Database access is done by the script (SQL or NoSQL)
  • String encoding becomes relevant
  • Quoting levels become an issue

This post was inspired by a similar post on the Isoblog by Kris. And the Shell Style Guide of Google is quite good especially in limiting the area where shell scripts are acceptable.

Share Button

Bring your own Device

This issue is quite controversial and it applies to laptops, tablets and smart phones.
Usually the „bringing“ is not really an issue, you can have anything in your bags and connect it via the mobile phone network as long as it does not absorb the working time.
But usually this implies a bit more.
There are some advantages in having company emails and calendar on a smart phone. This is convenient and useful. But there are some security concerns that should be taken serious. How is the calendar and the emails accessed? How confidential are the emails? Do they pass through servers that we do not trust? What happens, if a phone gets lost?
This is an area, where security concerns are often not taken too serious, because it is cool for top manager to have such devices. And they can just override any worries and concerns, if they like.
This can be compensated by being more restrictive in other areas. 😉
Anyway, the questions should be answered. In addition, the personal preferences for a certain type of phone are very strong. So the phone provided by the company might not be the one that the employee prefers, so there is a big desire to use the own phone or one that is similar to the own phone, which depends on the question of who pays the bills, how much of private telephony is allowed on the company phone and if there are work related calls to abusive times.
Generally the desirable path is to accept this and to find ways to make this secure.

The other issue is about the computer we work with. For some kind of jobs it is clear that the computer of the company is used, for example when selling railroad tickets or working in the post office or in a bank serving customers.

It shows that more creative people and more IT-oriented people like to have more control on the computer they work with.
We like to have hardware that is powerful enough to do the job. We like to be able to install software that helps us do our job. We like to use the OS and the software that we are skilled with. Sometimes it is already useful to be able to install this on the company computer or in a virtual computer within the company computer. Does the company allow this? It should, with some reasonable guidelines.

Some companies allow their employees to use their own laptops instead. They might give some money to pay for this and expect a certain level of equipment for that. Or just allow the employees to buy a laptop with their own money and use it instead of the company computer. They will do so and happily spend the money, even though it is wrong and the company should pay it. But the pain of spending some of the own money is for many people less than the pain of having to use crappy company equipment.

This rises the question of the network drive Q:, the outlook, MS-Word, MS-Excel,…
Actually this is not so much an issue, at least for the group we are talking here. Or becoming less of an issue.
Drive Q: can quite well be accessed from Linux, if the company policies allow it. But actually modern working patterns do not need this any more.
We can use a Wiki, like MediaWiki or Confluence for documentation. This is actually a bit better in many cases and I would see a trend in this direction, at least for IT-oriented teams.
Office-Formats and Email are more and more providing Web-Applications that can be used to work with them on Linux, for example. And MS-Office is already available for Linux, at least for Android, which is a Linux Variant. It might or might not come for Desktop Linux. LibreOffice is most of the time a useful replacement. Maybe better, maybe almost as good, depending on perspective… And there is always the possibility to have a virtual computer running MS-Windows for the absolutely mandatory MS-Windows-programs, if they actually exist. Such an image could be provided and maintained by the company instead of a company computer.

It is better to let the people work. To allow them to use useful tools. To pay them for bringing their own laptop or to allow them to install what they want on the company laptop. I have seen people who quit their job because of issues like this. The whole expensive MS-Windows-oriented universe that has been built in companies for a lot of money proves to be obsolete in some areas. A Wiki, a source code repository, … these things can be accessed over the internet using ssh or https. They can be hosted by third parties, if we trust the third party. Or they can be hosted by the company itself. Some companies work with distributed teams…

It is of course important to figure out a good security policy that allows working with „own“ devices and still provide a sufficient level of security. Maybe we just have to get used to other ways of working and to learn how to solve the problems that they bring us. In the end of the day we will see which companies are more successful. It depends on many factors, but the ability to provide a innovative and powerful IT and to have good people working there and actually getting stuff done is often an important factor.

Share Button

tmp-directories

On all computers we have some concept of a tmp-directory. Typically it is /tmp on Linux- and Unix-systems and something like C:/TEMP plus some subdirectory in each users home directory on MS-Windows.

In terms of software development this tends to be some dark area. Programs like to create some files there, store some stuff there and then maybe remove it, maybe not. And we do not know for sure, when we can delete these files and we actually do not want to care. Linux and Unix-Systems sometimes clear their tmp-directories on reboot, while providing an additional /var/tmp-directory, that survives reboot. Sometimes the tmp-directory is deducted from shared memory, so it is kind of a RAM-disk, but usually stored in the swap partition (or swap file) of our OS. Now this cleanup on reboot does not help too much, when we want to keep our system running for a long time.

These days most computers are somehow dedicated. Either they are virtual computers that run exactly one server application or a set of closely related server applications. Or it is a mobile phone, tablet or desktop computer that is typically used by only one person. But still we should not forget that the system should allow being used by several applications and by several users. So sharing the same tmp-directory for everyone can cause some conflicts. The Unix- and Linux-family has a way of setting file permissions for the tmp-directory itself and for its entries that stop users from reading, changing or deleting each others files, but still there is some concurrency about using the namespace of this one directory, which is usually quite elegantly bypassed by each software by using smart naming or by having the OS create unique names. But I would not consider it ideal. On the other hand, sometimes we might actually want to use the tmp-directory to share something between users or between processes, where this one tmp-directory might come in handy.

The approach of having a separate tmp-directory in each home directory and in a sub directory of each server application’s installation is tempting, because it separates name spaces, allows to disallow reading the directory entries by others and does not mix totally unrelated stuff in one directory. There is a drawback to this. We usually have different storage technologies. Some are optimized for reading, maybe even avoiding redundancy, because the system can be reinstalled. Some use sporadic writing, some are strictly read-only. And some use a lot of reading and writing. Some data is transient, some can be easily restored and some data needs to be stored redundantly to be safe. Depending on that we should aim to put it on Flash disks, or on a different RAID setup of hard disks. This is getting harder with virtualization, but eventually we can get to the point where virtual computers have disks of different characteristics, that are mapped to the appropriate hardware.

So there is no real good answer to this question, but I think that a tmp-directory that is separate from the home directory, but specific to each user, would be the best approach. Will this change? Probably not so easily. But maybe in some distant future.

Share Button

Using non-ASCII-characters

Some of us still remember the times when it was recommended to avoid „special characters“ when writing on the computer. Some keyboards did not contain „Umlaut“-characters in Germany and we fell back to the ugly, but generally understandable way of replacing the German special characters like this: ä->ae, ö->oe, ü->ue, ß->sz or ß->ss. This was due to the lack of decent keyboards, decent entry methods, but also due to transmission methods that stripped the upper bit. It did happen in emails that they where „enhanced“ like this: ä->d, ö->v, ü->|,… So we had to know our ways and sometimes use ae oe ue ss. Similar issues applied to other languages like the Scandinavian languages, Spanish, French, Italian, Hungarian, Croatian, Slovenian, Slovak, Serbian, the Baltic languages, Esperanto,… in short to all languages that could regularly be written with the Latin alphabet but required some additional letters to be written properly.

When we wrote on paper, the requirement to write properly was more obvious, while email and other electronic communication via the internet of those could be explained as being something like short wave radio. It worked globally, but with some degradation of quality compared to other media of the time. So for example with TeX it was possible to write the German special letters (and others in a similar way) like this: ä->\“a, ö->\“o, ü->\“u, ß->\ss and later even like this ä->“a, ö->“o, ü->“u, ß->“s, which some people, including myself, even used for email and other electronic communication when the proper way was not possible. I wrote Emacs-Lisp-Software that could put my Emacs in a mode where the Umlaut keys actually produced these combination when being typed and I even figured out how to tweak an xterm window for that for the sake of using IRC where Umlaut letters did not at all work and quick online typing was the way to go, where the Umlaut-characters where automatically typed because I used 10-finger system typing words, not characters.

On the other hand TeX could be configured to process Umlaut characters properly (more or less, up to the issue of hyphenation) and I wrote Macros to do this and provided them to CTAN, the repository for TeX-related software in the internet around 1994 or so. Later a better and more generic solution become part of standard TeX and superseded this, which was a good thing. So TeX guys could type ä ö ü ß and I strongly recommended (and still recommend) to actually do so. It works, it is more user friendly and in the end the computer should adapt to the humans not vice versa.

The web could process Umlaut characters (and everything) from day one. The transfer was not an issue, it could handle binary data like images, so no stripping of high bit or so was happening and the umlaut characters just went through. For people having problems to find them on the keyboard, transcriptions like this were created: ä->ä ö->ö ü->ü ß->ß. I used them not knowing that they where not actually needed, but I relied on a Perl script to do the conversion so it was possible to actually type them properly.

Now some languages like Russian, Chinese, Greek, Japanese, Georgian, Thai, Korean use a totally different alphabet, so they had to solve this earlier, but others might know better how it was done in the early days. Probably it helped develop the technology. Even harder are languages like Arabic, Hebrew, and Farsi, that are written right to left. It is still ugly when editing a Wikipedia page and the switching between left-to-right and right-to-left occurs correctly, but magically and unexpected.

While ISO-8859-x seemed to solve the issue for most European languages and ISO-8859-1 became the de-facto standard in many areas, this was eventually a dead end, because only Unicode provided a way for hosting all live languages, which is what we eventually wanted, because even in quite closed environments excluding some combinations of languages in the same document will at some point of time prove to be a mistake. This has its issues. The most painful is that files and badly transmitted content do not have a clear information about the encoding attached to them. The same applies to Strings in some programming languages. We need to know from the context what it is. And now UTF-8 is becoming the predominant encoding, but in many areas ISO-8859-x or the weird cp1252 still prevail and when we get the encoding or the implicit conversions wrong, the Umlaut characters or whatever we have gets messed up. I would recommend to work carefully and keep IT-systems, configurations and programs well maintained and documented and to move to UTF-8 whenever possible. Falling back to ae oe ue for content is sixties- and seventies-technology.

Now I still do see an issue with names that are both technical and human readable like file names and names of attachments or variable names and function names in programming languages. While these do very often allow Umlaut characters, I would still prefer the ugly transcriptions in this area, because a mismatch of encoding becomes more annoying and ugly to handle than if it is just concerning text, where we as human readers are a bit more tolerant than computers, so tolerant that we would even read the ugly replacements of the old days.

But for content, we should insist on writing it with our alphabet. And move away from the ISO-8869-x encodings to UTF-8.

Links (this Blog):

Share Button