Orthodox Christmas 2019/2020

Orthodox Christmas 2019/2020 in Ukraine and probably some other countries is on 2020-01-07.


God Jul! — Feliĉan Kristnaskon! — ميلاد مجيد — Natale hilare! — Hyvää Joulua! — Срећан Божић! — Prettige Kerstdagen! — クリスマスおめでとう ; メリークリスマス — З Рiздвом Христовим! — Buon Natale! — Joyeux Noël! — С Рождеством! — Frohe Weihnachten! — ¡Feliz Navidad! — Crăciun fericit! — Merry Christmas! — καλά Χριστούγεννα! — God Jul!

This text was generated with a C# program (using Mono on Linux):

using System;
using System.Collections.Generic;
using static System.Collections.Generic.KeyValuePair;
using System.Linq;

class OrthodoxChristmas20192020 {
    private static string[] arr = new string[] {
        "Prettige Kerstdagen!",
        "God Jul!",
        "Crăciun fericit!",
        "クリスマスおめでとう ; メリークリスマス",
        "God Jul!",
        "Feliĉan Kristnaskon!",
        "Hyvää Joulua!",
        "ميلاد مجيد",
        "Срећан Божић!",
        "καλά Χριστούγεννα!",
        "З Рiздвом Христовим!",
        "Natale hilare!",
        "Buon Natale!",
        "Joyeux Noël!",
        "Frohe Weihnachten!",
        "С Рождеством!",
        "Merry Christmas!",
        "¡Feliz Navidad!"
    };

    public static void Main() {
        Random rnd = new Random();

        var shuffled = from item in arr.Select(s => new KeyValuePair<int, string>(rnd.Next(), s)) orderby item.Key select item.Value;
        int count = 0;
        foreach (string s in shuffled) {
            if (count++ > 0) {
                Console.Write(" — ");
            }
            Console.Write(s);
        }
        Console.WriteLine();
    }
}

Share Button

How to replace svn:keywords?

In the old days we used svn, cvs, rcs or other systems for source code management, that allowed enabling something like svn:keywords. This resulted in certain strings in the source code being replaced by strings containing some version information.

More often than we might think these were useful. The question „what version are we running?“ is often answered, but surprisingly often not correctly.

Now putting the version information into a comment or even better into a string that might even be logged or that might at least be extracted by using something like

strings xyz |egrep '\$Id.*\$'

allows to find out.

Now we are using git instead of svn, or at least we should be using git or plan our migration to git. There are other tools like Mercurial, that are probably just as good as git, but git is most common and every developer knows it or has to learn it anyway to stay in business.

Now git is not supporting these svn:keywords or at least not as easily, because it relies sha-checksums, which does not allow for changing file contents. There are some tricks like pre-checking and post-checkout scripts that might solve such issues, but this is kind of difficult to tame, due to the distributed characters of git including a local repo on each developers machine.

So it is better to accept that the time of this svn:keywords-stuff is over and look for something new. As an example we will consider the world of Java and JVM languages. Most use a Jenkins server to compile the software.

To create a release, even a temporary release or a release just for testing, the right way is to first label the head of the branch we are working on, then check out based on this label, compile that and upload it to the artifactory, if it is successful. Maybe rename the label or and another label. If not, maybe delete the label, depending on the processes.

Now the jar-files contain a META-INF-directory and a MANIFEST.MF. This should be the right place to put version information during such a build. More or less this can provide the same benefit as the svn:keywords, but it works with git and needs only be done in one place.

Details about how to do it will can be found out when needed.

I assume that the same approach can also be accomplished for other environments. We can even find ways that the software logs its version by changing a string in a source code file during the build process.

Share Button

Happy New Year 2020

Un an nou fericit! — Onnellista uutta vuotta! — Feliĉan novan jaron! — Καλή Χρονια! — ¡Feliz año nuevo! — С новым годом! — FELIX SIT ANNUS NOVUS — Godt nytt år! — Щасливого нового року! — Frohes neues Jahr! — Felice anno nuovo! — Bonne année! — Gott nytt år! — Срећна нова година! — عام سعيد — Gullukkig niuw jaar! — Happy new year!

This is generated with a Java 13 program using Lambdas and secure random numbers:

import java.security.SecureRandom;
import java.util.List;
import java.util.stream.Stream;
import java.util.stream.Collectors;

public class HappyNewYearJava8 {

    private static final class Element implements Comparable<Element> {
        Element(Long sortKey, String text) {
            this.sortKey = sortKey;
            this.text = text;
        }

        private Long sortKey;
        private String text;

        public String getText() {
            return text;
        }

        public int compareTo(Element e) {
            return this.sortKey.compareTo(e.sortKey);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        List<String> list = Stream.of("Frohes neues Jahr!",
                                      "Happy new year!",
                                      "Gott nytt år!",
                                      "¡Feliz año nuevo!",
                                      "Bonne année!",
                                      "FELIX SIT ANNUS NOVUS",
                                      "С новым годом!",
                                      "عام سعيد",
                                      "Felice anno nuovo!",
                                      "Godt nytt år!",
                                      "Gullukkig niuw jaar!",
                                      "Feliĉan novan jaron!",
                                      "Onnellista uutta vuotta!",
                                      "Срећна нова година!",
                                      "Un an nou fericit!",
                                      "Щасливого нового року!",
                                      "Καλή Χρονια!")
            .map(s->new Element(random.nextLong(), s))
            .sorted()
            .map(Element::getText)
            .collect(Collectors.toList());

        System.out.println(String.join(" — ", list));
    }
}

Share Button

Christmas 2019

Joyeux Noël! — ميلاد مجيد — Crăciun fericit! — God Jul! — God Jul! — Natale hilare! — С Рождеством! — З Рiздвом Христовим! — Prettige Kerstdagen! — Hyvää Joulua! — クリスマスおめでとう ; メリークリスマス — καλά Χριστούγεννα! — Buon Natale! — Срећан Божић! — Frohe Weihnachten! — ¡Feliz Navidad! — Feliĉan Kristnaskon! — Merry Christmas!

This time the greetings were generated with a C program:

#include <stdio.h>
#include <stdint.h>
#include <openssl/rand.h>

#define N 18
static const uint32_t n = N;

int main(int argc, char **argv) {
  char greetings[N][60] = {
    "С Рождеством!",
    "Hyvää Joulua!",
    "καλά Χριστούγεννα!",
    "Buon Natale!",
    "Prettige Kerstdagen!",
    "З Рiздвом Христовим!",
    "Merry Christmas!",
    "Срећан Божић!",
    "God Jul!",
    "¡Feliz Navidad!",
    "ميلاد مجيد",
    "クリスマスおめでとう ; メリークリスマス",
    "Natale hilare!",
    "Joyeux Noël!",
    "God Jul!",
    "Frohe Weihnachten!",
    "Crăciun fericit!",
    "Feliĉan Kristnaskon!" };
  int32_t i, j;
  uint32_t x;
  uint32_t idx[N];
  int rtc;
  uint64_t r = 0;
  for (i = n-1; i >= 0; i--) {
    idx[i] = i;
  }
  RAND_bytes((char *) &r, sizeof(r));
  for (i = n-1; i > 0; i--) {
    j = r % i;
    r = r / i;
    x = idx[i];
    idx[i] = idx[j];
    idx[j] = x;
  }
  for (i = 0; i < n; i++) {
    if (i > 0) {
      printf(" — ");
    }
    printf("%s", greetings[idx[i]]);
  }
  printf("\n");
}

Share Button

Poor mans profiling with LOGs

We have professional profiling tools and we should use them.. They give really useful and extensive information.

So why bother about doing „poor man’s profiling“?

About ten years ago, running profilers was kind of constrained to very small examples and computers with really huge memory. We have this huge memory now on every development machine. But we can only run such tests on development and test systems.

Sometimes it is possible to copy data from the productive system to the test system and really simulate what is happening on the productive system. Often this is quite far away from what is really happening on the productive system and we often only know about a problem when it has already occurred.

What we usually do have are log files. Each entry has a time stamp and can somehow be connected to a line of code where it is created. We always know the class or the file or something like that from where the log statement was issued and since there is only a limited number of log statements, they can often be recognized. Sometimes it is a good idea to actually help finding where it was logged. For example could we artificially put the source code line number or some uniq number or a short random string into the source code for each log statement. Then we can find several pieces of information in the log statement:

  • The timestamp (please use ISO-format!!! And at least milliseconds)
  • The source code location (line number, class name, random string, relative position in file, meaningful description like entering/leaving methods etc.)
  • The thread and/or process
  • The payload information of the log

Now log statements analyzed by thread. The first step is to look from which pairs of locations in the source code we observe successive log statements from the same thread, ignoring all logging from other threads and processes. We can also estimate the time that was spent for going from the first to the second location. Aggregating this by basically adding up thes timestamp deltas for each such pair over all threads and the whole log file will already indicate some hotspots where time is spent. Sometimes it is even interesting to look at sequences of three log statements (from the same thread). This was useful, when DB-commits, i.e. one such pair like „begin commit“ to „commit done“ used around 50% of the time, if not more. This was not interesting without knowing the statement that was committed, which was found out by looking at the previous log entry of the same thread just before the commit.

Such analysis can be done with scripts for example in Perl, Ruby or Python. I use Perl for this kind of task. Maybe log analysis tools provide this out of the box or with a little bit of tweaking. A concept about what entries in log files look like that make recognizing the location of the program that created the log entry, are very helpful. Sometimes we actually want to see, with which kins of data the problem occurred, which means looking also into the payload.

Share Button

50 years of internet

According to some the RFC1, which was published on 1969-04-07, was the start of the internet. Many RFCs have since this time been published and they describe the standards of the internet.

The early internet did contain functionality to communicate between computers, but it did of course not include http, html and the WWW, which were introduced much later in the early nineties.

Share Button

Orthodox Christmas 2018/2019

Orthodox Christmas in some countries, for example in Ukraine is on 7th of January.
So to all readers, who have Christmas on 7th of January:


С Рождеством! — Hyvää Joulua! — καλά Χριστούγεννα! — Buon Natale! — Prettige Kerstdagen! — З Рiздвом Христовим! — Merry Christmas! — Срећан Божић! — God Jul! — ¡Feliz Navidad! — ميلاد مجيد — クリスマスおめでとう ; メリークリスマス — Natale hilare! — Joyeux Noël! — God Jul! — Frohe Weihnachten! — Crăciun fericit! — Feliĉan Kristnaskon!

I created the message and the random ordering using Perl and the Schwartzian transform:

#!/usr/bin/perl
use Math::Random::Secure qw(irand);
my @list = ( "Prettige Kerstdagen!",
          "God Jul!",
          "Crăciun fericit!",
          "クリスマスおめでとう ; メリークリスマス",
          "God Jul!",
          "Feliĉan Kristnaskon!",
          "Hyvää Joulua!",
          "ميلاد مجيد",
          "Срећан Божић!",
          "καλά Χριστούγεννα!",
          "З Рiздвом Христовим!",
          "Natale hilare!",
          "Buon Natale!",
          "Joyeux Noël!",
          "Frohe Weihnachten!",
          "С Рождеством!",
          "Merry Christmas!",
          "¡Feliz Navidad!" );
my @shuffled = map{ $_->[0] }
               sort {$a->[1] <=> $b->[1]}
               map { [ $_, irand() ] }
               @list;
print join(" — ", @shuffled);

Share Button

2019 — Happy New Year

Gott nytt år! — Godt nytt år! — Felice anno nuovo! — Καλή Χρονια! — Щасливого нового року! — Срећна нова година! — С новым годом! — Feliĉan novan jaron! — Bonne année! — FELIX SIT ANNUS NOVUS — Gullukkig niuw jaar! — Un an nou fericit! — Frohes neues Jahr! — Happy new year! — ¡Feliz año nuevo! — Onnellista uutta vuotta! — عام سعيد

This was created by a Java-program:

import java.util.Random;
import java.util.List;
import java.util.Arrays;
import java.util.Collections;

public class HappyNewYear {

    public static void main(String[] args) {
        List list = Arrays.asList("Frohes neues Jahr!",
                                          "Happy new year!",
                                          "Gott nytt år!", 
                                          "¡Feliz año nuevo!",
                                          "Bonne année!", 
                                          "FELIX SIT ANNUS NOVUS", 
                                          "С новым годом!",
                                          "عام سعيد",
                                          "Felice anno nuovo!",
                                          "Godt nytt år!", 
                                          "Gullukkig niuw jaar!", 
                                          "Feliĉan novan jaron!",
                                          "Onnellista uutta vuotta!",
                                          "Срећна нова година!",
                                          "Un an nou fericit!",
                                          "Щасливого нового року!", 
                                          "Καλή Χρονια!");
        Collections.shuffle(list);
        System.out.println(String.join(" — ", list));
    }
}

Share Button

Christmas 2018


Feliĉan Kristnaskon! — Frohe Weihnachten! — God Jul! — Merry Christmas! — Joyeux Noël! — クリスマスおめでとう ; メリークリスマス — Срећан Божић! — Buon Natale! — Hyvää Joulua! — З Рiздвом Христовим! — ميلاد مجيد — С Рождеством! — Crăciun fericit! — ¡Feliz Navidad! — καλά Χριστούγεννα! — Natale hilare! — God Jul! — Prettige Kerstdagen!

As I said, I am learning some Python, so let’s use it. I created the message above with this program:

#!/usr/bin/python3
import random
arr = [
    "Frohe Weihnachten!",
    "Merry Christmas!",
    "God Jul!",
    "¡Feliz Navidad!",
    "Joyeux Noël!",
    "Natale hilare!",
    "С Рождеством!",
    "ميلاد مجيد",
    "Buon Natale!",
    "God Jul!",
    "Prettige Kerstdagen!",
    "Feliĉan Kristnaskon!",
    "Hyvää Joulua!",
    "クリスマスおめでとう ; メリークリスマス",
    "Срећан Божић!",
    "Crăciun fericit!",
    "З Рiздвом Христовим!",
    "καλά Χριστούγεννα!"
]
random.shuffle(arr)
print(" — ".join(arr))
print("\n")

Share Button

Loops with unknown nesting depth

We often encounter nested loops, like

for (i = 0; i < n; i++) {
    for (j = 0; j < m; j++) {
        doSomething(i, j);
    }
}

This can be nested to a few more levels without too much pain, as long as we observe that the number of iterations for each level need to be multiplied to get the number of iterations for the whole thing and that total numbers of iterations beyond a few billions (10^9, German: Milliarden, Russian Миллиарди) become unreasonable no matter how fast the doSomethings(...) is. Just looking at this example program

public class Modular {
    public static void main(String[] args) {
        long n = Long.parseLong(args[0]);
        long t = System.currentTimeMillis();
        long m = Long.parseLong(args[1]);
        System.out.println("n=" + n + " t=" + t + " m=" + m);
        long prod = 1;
        long sum  = 0;
        for (long i = 0; i < n; i++) {
            long j = i % m;
            sum += j;
            sum %= m;
            prod *= (j*j+1) % m;
            prod %= m;
        }
        System.out.println("sum=" + sum + " prod=" + prod + " dt=" + (System.currentTimeMillis() - t));
    }
}

which measures it net run time and runs 0 msec for 1000 iterations and almost three minutes for 10 billions (10^{10}):

> java Modular 1000 1001 # 1'000
--> sum=1 prod=442 dt=0
> java Modular 10000 1001 # 10'000
--> sum=55 prod=520 dt=1
> java Modular 100000 1001 # 100'000
--> sum=45 prod=299 dt=7
> java Modular 1000000 1001 # 1'000'000
--> sum=0 prod=806 dt=36
> java Modular 10000000 1001 # 10'000'000
--> sum=45 prod=299 dt=344
> java Modular 100000000 1001 # 100'000'000
--> sum=946 prod=949 dt=3314
> java Modular 1000000000 1001 # 1'000'000'000
--> sum=1 prod=442 dt=34439
> java Modular 10000000000 1001 # 10'000'000'000
--> sum=55 prod=520 dt=332346

As soon as we do I/O, network access, database access or simply a bit more serious calculation, this becomes of course easily unbearably slow. But today it is cool to deal with big data and to at least call what we are doing big data, even though conventional processing on a laptop can do it in a few seconds or minutes... And there are of course ways to process way more iterations than this, but it becomes worth thinking about the system architecture, the hardware, parallel processing and of course algorithms and software stacks. But here we are in the "normal world", which can be a "normal subuniverse" of something really big, so running on one CPU and using a normal language like Perl, Java, Ruby, Scala, Clojure, F# or C.

Now sometimes we encounter situations where we want to nest loops, but the depth is unknown, something like

for (i_0 = 0; i_0 < n_0; i_0++) {
  for (i_1 = 0; i_1 < n_1; i_1++) {
    \cdots
      for (i_m = 0; i_m < n_m; i_m++) {
        dosomething(i_0, i_1,\ldots, i_m);
      }
    \cdots
  }
}

Now our friends from the functional world help us to understand what a loop is, because in some of these more functional languages the classical C-Style loop is either missing or at least not recommended as the everyday tool. Instead we view the set of values we iterate about as a collection and iterate through every element of the collection. This can be a bad thing, because instantiating such big collections can be a show stopper, but we don't. Out of the many features of collections we just pick the iterability, which can very well be accomplished by lazy collections. In Java we have the Iterable, Iterator, Spliterator and the Stream interfaces to express such potentially lazy collections that are just used for iterating.

So we could think of a library that provides us with support for ordinary loops, so we could write something like this:

Iterable range = new LoopRangeExcludeUpper<>(0, n);
for (Integer i : range) {
    doSomething(i);
}

or even better, if we assume 0 as a lower limit is the default anyway:

Iterable range = new LoopRangeExcludeUpper<>(n);
for (Integer i : range) {
    doSomething(i);
}

with the ugliness of boxing and unboxing in terms of runtime overhead, memory overhead, and additional complexity for development. In Scala, Ruby or Clojure the equivalent solution would be elegant and useful and the way to go...
I would assume, that a library who does something like LoopRangeExcludeUpper in the code example should easily be available for Java, maybe even in the standard library, or in some common public maven repository...

Now the issue of loops with unknown nesting depth can easily be addressed by writing or downloading a class like NestedLoopRange, which might have a constructor of the form NestedLoopRange(int ... ni) or NestedLoopRange(List li) or something with collections that are more efficient with primitives, for example from Apache Commons. Consider using long instead of int, which will break some compatibility with Java-collections. This should not hurt too much here and it is a good thing to reconsider the 31-bit size field of Java collections as an obstacle for future development and to address how collections can grow larger than 2^{31}-1 elements, but that is just a side issue here. We broke this limit with the example iterating over 10'000'000'000 values for i already and it took only a few minutes. Of course it was just an abstract way of dealing with a lazy collection without the Java interfaces involved.

So, the code could just look like this:

Iterable range = new NestedLoopRange(n_0, n_1, \ldots, n_m);
for (Tuple t : range) {
    doSomething(t);
}

Btw, it is not too hard to write it in the classical way either:

        long[] n = new long[] { n_0, n_1, \ldots, n_m };
        int m1 = n.length;
        int m  = m1-1; // just to have the math-m matched...
        long[] t = new long[m1];
        for (int j = 0; j < m1; j++) {
            t[j] = 0L;
        }
        boolean done = false;
        for (int j = 0; j < m1; j++) {
            if (n[j] <= 0) {
                done = true;
                break;
            }
        }
        while (! done) {
            doSomething(t);
            done = true;
            for (int j = 0; j < m1; j++) {
                t[j]++;
                if (t[j] < n[j]) {
                    done = false;
                    break;
                }
                t[j] = 0;
            }
        }

I have written this kind of loop several times in my life in different languages. The first time was on C64-basic when I was still in school and the last one was written in Java and shaped into a library, where appropriate collection interfaces were implemented, which remained in the project or the organization, where it had been done, but it could easily be written again, maybe in Scala, Clojure or Ruby, if it is not already there. It might even be interesting to explore, how to write it in C in a way that can be used as easily as such a library in Java or Scala. If there is interest, please let me know in the comments section, I might come back to this issue in the future...

In C it is actually quite possible to write a generic solution. I see an API like this might work:

struct nested_iteration {
  /* implementation detail */
};

void init_nested_iteration(struct nested_iteration ni, size_t m1, long *n);
void dispose_nested_iteration(struct nested_iteration ni);
int nested_iteration_done(struct nested_iteration ni); // returns 0=false or 1=true
void nested_iteration_next(struct nested_iteration ni);

and it would be called like this:

struct nested_iteration ni;
int n[] = { n_0, n_1, \ldots, n_m };
for (init_nested_iteration(ni, m+1, n); 
     ! nested_iteration_done(ni); 
     nested_iteration_next(ni)) {
...
}

So I guess, it is doable and reasonably easy to program and to use, but of course not quite as elegant as in Java 8, Clojure or Scala.
I would like to leave this as a rough idea and maybe come back with concrete examples and implementations in the future.

Links

Share Button