Logging

Deutsch

Software often contains a logging functionality. Usually entries one or sometimes multiple lines are appended to a file, written to syslog or to stdout, from where they are redirected into a file. They are telling us something about what the software is doing. Usually we can ignore all of it, but as soon as something with „ERROR“ or worse and more visible stack traces can be found, we should investigate this. Unfortunately software is often not so good, which can be due to libraries, frameworks or our own code. Then stack traces and errors are so common that it is hard to look into or to find the ones that are really worth looking into. Or there is simply no complete process in place to watch the log files. Sometimes the error shows up much later than it actually occurred and stack traces do not really lead us to the right spot. More often than we think logging actually introduces runtime errors, that were otherwise not present. This is related to a more general concept, which is called observer effect, where logging actually changes the business logic.

It is nice that log files keep to some format. Usually they start with a time stamp in ISO-format, often to the millisecond. Please add trailing zeros to always have 3 digits after the decimal point in this case. It is preferable to use UTC, but people tend to stick to local date and time zones, including the issues that come with switching to and from daylight saving time. Usually we have several processes or threads that run simultaneously. This can result in a wild mix of logging entries. As long as even multiline entries stay together and as long as beginning and end of one multiline entry can easily be recognized, this can be dealt with. Tools like splunk or simple Perl, Ruby or Python scripts can help us to follow threads separately. We could actually have separate logs for each thread in the first place, but this is not a common practice and it might hit OS-limitations on the number of open files, if we have many threads or even thousands of actors as in Erlang or Akka. Keeping log entries together can be achieved by using an atomic write, like the write system call in Linux and other Posix systems. Another way is to queue the log entries and to have a logger thread that processes the queue.

Overall this area has become very complex and hard to tame. In the Java world there used to be log4j with a configuration file that was a simple properties file, at least in the earlier version. This was so good that other languages copied it and created some log4X. Later the config file was replaced by XML and more logging frame works were added. Of course quite a lot of them just for the purpose of abstracting from the large zoo of logging frameworks and providing a unique interface for all of them. So the result was, that there was one more to deal with.

It is a good question, how much logic for handling of log files do we really want to see in our software. Does the software have to know, into which file it should log or how to do log rotation? If a configuration determines this, but the configuration is compiled into the jar file, it does have to know… We can keep our code a bit cleaner by relying on program functionality without code, but this still keeps it as part of the software.

Log files have to please the system administrator or whoever replaced them in a pure devops shop. And in the end developers will have to be able to work with the information provided by the logs to find issues in the code or to explain what is happening, if the system administrator cannot resolve an issue by himself. Should this system administrator have to deal with a different special complex setup for the logging for each software he is running? Or should it be necessary to call for developer support to get a new version of the software with just another log setting, because the configurations are hard coded in the deployment artifacts? Interesting is also, what happens when we use PAAS, where we have application server, database etc., but the software can easily move to another server, which might result in losing the logs. Moving logs to another server or logging across the network is expensive, maybe more expensive than the rest of this infrastructure.

Is it maybe a good idea to just log to stdout, maintaining a decent format and to run the software in such a way that stdout is piped into a log manager? This can be the same for all software and there is one way to configure it. The same means not only the same for all the java programs, but actually the same for all programs in all languages that comply to a minimal standard. This could be achieved using named pipes in conjunction with any hard coded log file that the software wants to use. But this is a dangerous path unless we really know what the software is doing with its log files. Just think of what weird errors might happen if the software tries to apply log rotation to the named pipe by renaming, deleting, creating new files and so on. A common trick to stop software from logging into a place where we do not want this is to create a directory with the name of the file that the software usually uses and to write protect this directory and its parent directory for the software. Please find out how to do it in detail, depending on your environment.

What about software, that is a filter by itself, so its main functionality is to actually write useful data to stdout? Usually smaller programs and scripts work like this. Often they do not need to log and often they are well tested relyable parts of our software installation. Where are the log files of cp, ls, rm, mv, grep, sort, cat, less,…? Yes, they do tend to write to stderr, if real errors occur. Where needed, programs can turn on logging with a log file provided on the command line, which is also a quite operations friendly approach. Named pipes can help here.

And we had a good logging framework in place for many years. It was called syslog and it is still around, at least on Linux.

A last thought: We spend really a lot of effort to get well performing software, using multiple processes, threads or even clusters. And then we forget about the fact that logging might become the bottle neck.

Share Button

Meaningless Whitespace in Textfiles

We use different file formats that are more or less tolerant to certain changes. Most well known is white space in text files.

In some programming languages white space (space, newline, carriage return, form feed, tabulator, vertical tab) has no meaning, as long as any whitespace is present. Examples for this are Java, Perl, Lisp or C. Whitespace, that is somehow part of String content is always significant, but white space that is used within the program can be combination of one or more of the white space characters that are in the lower 128 positions (ISO-646, often referred to as ASCII or 7bit ASCII. It is of course recommended to have a certain coding standard, which gives some guidelines of when to use newlines, if tabs or spaces are preferred (please spaces) and how to indent. But this is just about human readability and the compiler does not really care. Line numbers are a bit meaningful in compiler and runtime error messages and stack traces, so putting everything into one line would harm beyond readability, but there is a wide range of ways that are all correct and equivalent. Btw. many teams limit lines to 80 characters, which was a valid choice 30 years ago, when some terminals were only 80 characters wide and 132 character wide terminals where just coming up. But as a hard limit it is a joke today, because not many of us would be able to work with a vt100 terminal efficiently anyway. Very long lines might be harder to read, so anything around 120 or 160 might still be a reasonable idea about line lengths…

Languages like Ruby and Scala put slightly more meaning into white space, because in most cases a semicolon can be skipped if it is followed by a newline and not just horizontal white space. And Perl (Perl 5) is for sure so hard to compile that only its own implementation can properly format or even recognize which white space is part of a literal string. Special cases like having the language in a string and parsing and then executing that should be ignored here.

Now we put this program files into a source code management system, usually Git. Some teams still use legacy systems like subversion, source safe, clear case or CVS, while there are some newer systems that are probably about as powerful as git, but I never saw them in use. Git creates an MD5 hash of each file, which implies that any minor change will result in a new version, even if it is just white space. Now this does not hurt too much, if we agree on the same formatting and on the same line ending (hopefully LF only, not CR LF, even on MS-Windows). But our tooling does not make any difference between significant changes and insignificant formatting only changes. This gets worse, if users have different IDEs, which they should have, because everyone should use the IDE or editor, with which he or she is most efficient and the formal description of the preferred formatting is not shared between editors or differs slightly.

I think that each programming language should come with a command line diff tool and a command line formatting tool, that obey a standard interface for calling and can be plugged into editors and into source code management systems like git. Then the same mechanisms work for C, Java, C#, Ruby, Python, Fortran, Clojure, Perl, F#, Scala, Lua or your favorite programming language.

I can imaging two ways of working: Either we have a standard format and possibly individual formats for each developer. During „git commit“ the file is brought into the standard format before it is shown to git. Meaning less whitespace changes disappear. During checkout the file can optionally be brought into the preferred format of the developer. And yes, there are ways to deal with deliberate formatting, that for some reason should be kept verbatim and for dealing differently with comments and of course all kinds of string literals. Remember, the formatting tool comes from the same source as the compiler and fully understands the language.

The other approach leaves the formatting up to the developer and only creates a new version, when the diff tool of the language signifies that there is a relevant change.

I think that we should strive for this approach. It is no rocket science, the kind of tools were around for many decades as diff and as formatting tools, it would just be necessary to go the extra mile and create sister diff and formatting tools for the compiler (or interpreter) and to actually integrate these into build environments, IDEs, editors and git. It would save a lot of time and leave more time for solving real problems.

Is there any programming language that actually does this already?

How to handle XML? Is XML just the new binary with a bit more bloat? Can we do a generic handling of all XML or should it depend on the Schema?

Share Button

Loops with unknown nesting depth

We often encounter nested loops, like

for (i = 0; i < n; i++) {
    for (j = 0; j < m; j++) {
        doSomething(i, j);
    }
}

This can be nested to a few more levels without too much pain, as long as we observe that the number of iterations for each level need to be multiplied to get the number of iterations for the whole thing and that total numbers of iterations beyond a few billions (10^9, German: Milliarden, Russian Миллиарди) become unreasonable no matter how fast the doSomethings(...) is. Just looking at this example program

public class Modular {
    public static void main(String[] args) {
        long n = Long.parseLong(args[0]);
        long t = System.currentTimeMillis();
        long m = Long.parseLong(args[1]);
        System.out.println("n=" + n + " t=" + t + " m=" + m);
        long prod = 1;
        long sum  = 0;
        for (long i = 0; i < n; i++) {
            long j = i % m;
            sum += j;
            sum %= m;
            prod *= (j*j+1) % m;
            prod %= m;
        }
        System.out.println("sum=" + sum + " prod=" + prod + " dt=" + (System.currentTimeMillis() - t));
    }
}

which measures it net run time and runs 0 msec for 1000 iterations and almost three minutes for 10 billions (10^{10}):

> java Modular 1000 1001 # 1'000
--> sum=1 prod=442 dt=0
> java Modular 10000 1001 # 10'000
--> sum=55 prod=520 dt=1
> java Modular 100000 1001 # 100'000
--> sum=45 prod=299 dt=7
> java Modular 1000000 1001 # 1'000'000
--> sum=0 prod=806 dt=36
> java Modular 10000000 1001 # 10'000'000
--> sum=45 prod=299 dt=344
> java Modular 100000000 1001 # 100'000'000
--> sum=946 prod=949 dt=3314
> java Modular 1000000000 1001 # 1'000'000'000
--> sum=1 prod=442 dt=34439
> java Modular 10000000000 1001 # 10'000'000'000
--> sum=55 prod=520 dt=332346

As soon as we do I/O, network access, database access or simply a bit more serious calculation, this becomes of course easily unbearably slow. But today it is cool to deal with big data and to at least call what we are doing big data, even though conventional processing on a laptop can do it in a few seconds or minutes... And there are of course ways to process way more iterations than this, but it becomes worth thinking about the system architecture, the hardware, parallel processing and of course algorithms and software stacks. But here we are in the "normal world", which can be a "normal subuniverse" of something really big, so running on one CPU and using a normal language like Perl, Java, Ruby, Scala, Clojure, F# or C.

Now sometimes we encounter situations where we want to nest loops, but the depth is unknown, something like

for (i_0 = 0; i_0 < n_0; i_0++) {
  for (i_1 = 0; i_1 < n_1; i_1++) {
    \cdots
      for (i_m = 0; i_m < n_m; i_m++) {
        dosomething(i_0, i_1,\ldots, i_m);
      }
    \cdots
  }
}

Now our friends from the functional world help us to understand what a loop is, because in some of these more functional languages the classical C-Style loop is either missing or at least not recommended as the everyday tool. Instead we view the set of values we iterate about as a collection and iterate through every element of the collection. This can be a bad thing, because instantiating such big collections can be a show stopper, but we don't. Out of the many features of collections we just pick the iterability, which can very well be accomplished by lazy collections. In Java we have the Iterable, Iterator, Spliterator and the Stream interfaces to express such potentially lazy collections that are just used for iterating.

So we could think of a library that provides us with support for ordinary loops, so we could write something like this:

Iterable range = new LoopRangeExcludeUpper<>(0, n);
for (Integer i : range) {
    doSomething(i);
}

or even better, if we assume 0 as a lower limit is the default anyway:

Iterable range = new LoopRangeExcludeUpper<>(n);
for (Integer i : range) {
    doSomething(i);
}

with the ugliness of boxing and unboxing in terms of runtime overhead, memory overhead, and additional complexity for development. In Scala, Ruby or Clojure the equivalent solution would be elegant and useful and the way to go...
I would assume, that a library who does something like LoopRangeExcludeUpper in the code example should easily be available for Java, maybe even in the standard library, or in some common public maven repository...

Now the issue of loops with unknown nesting depth can easily be addressed by writing or downloading a class like NestedLoopRange, which might have a constructor of the form NestedLoopRange(int ... ni) or NestedLoopRange(List li) or something with collections that are more efficient with primitives, for example from Apache Commons. Consider using long instead of int, which will break some compatibility with Java-collections. This should not hurt too much here and it is a good thing to reconsider the 31-bit size field of Java collections as an obstacle for future development and to address how collections can grow larger than 2^{31}-1 elements, but that is just a side issue here. We broke this limit with the example iterating over 10'000'000'000 values for i already and it took only a few minutes. Of course it was just an abstract way of dealing with a lazy collection without the Java interfaces involved.

So, the code could just look like this:

Iterable range = new NestedLoopRange(n_0, n_1, \ldots, n_m);
for (Tuple t : range) {
    doSomething(t);
}

Btw, it is not too hard to write it in the classical way either:

        long[] n = new long[] { n_0, n_1, \ldots, n_m };
        int m1 = n.length;
        int m  = m1-1; // just to have the math-m matched...
        long[] t = new long[m1];
        for (int j = 0; j < m1; j++) {
            t[j] = 0L;
        }
        boolean done = false;
        for (int j = 0; j < m1; j++) {
            if (n[j] <= 0) {
                done = true;
                break;
            }
        }
        while (! done) {
            doSomething(t);
            done = true;
            for (int j = 0; j < m1; j++) {
                t[j]++;
                if (t[j] < n[j]) {
                    done = false;
                    break;
                }
                t[j] = 0;
            }
        }

I have written this kind of loop several times in my life in different languages. The first time was on C64-basic when I was still in school and the last one was written in Java and shaped into a library, where appropriate collection interfaces were implemented, which remained in the project or the organization, where it had been done, but it could easily be written again, maybe in Scala, Clojure or Ruby, if it is not already there. It might even be interesting to explore, how to write it in C in a way that can be used as easily as such a library in Java or Scala. If there is interest, please let me know in the comments section, I might come back to this issue in the future...

In C it is actually quite possible to write a generic solution. I see an API like this might work:

struct nested_iteration {
  /* implementation detail */
};

void init_nested_iteration(struct nested_iteration ni, size_t m1, long *n);
void dispose_nested_iteration(struct nested_iteration ni);
int nested_iteration_done(struct nested_iteration ni); // returns 0=false or 1=true
void nested_iteration_next(struct nested_iteration ni);

and it would be called like this:

struct nested_iteration ni;
int n[] = { n_0, n_1, \ldots, n_m };
for (init_nested_iteration(ni, m+1, n); 
     ! nested_iteration_done(ni); 
     nested_iteration_next(ni)) {
...
}

So I guess, it is doable and reasonably easy to program and to use, but of course not quite as elegant as in Java 8, Clojure or Scala.
I would like to leave this as a rough idea and maybe come back with concrete examples and implementations in the future.

Links

Share Button

Perl 5 and Perl 6

We have now two Perls. Perl 5, which has been around for more than 20 years just as the „Perl programming language“ and Perl 6, which has been developed for more than a decade and of which now stable versions exist.

The fact, that they are both called „Perl“ is a bit misleading. They are two different and incompatible programming languages. But they share the same community. And Perl conferences are usually covering both languages.

So this rises the question about the differences or about which of the two Perls to use.

Here are some differences:

  • Perl 5 is well established and many people know it. Perl 6 has to be learned, even if it is relatively easy to learn for someone with a Perl 5 background.
  • Perl 5 runs about three times faster than Perl 6
  • Perl 6 programs are a bit shorter than Perl 5 programs
  • Perl 6 regular expressions are even better than Perl 5’s regular expressions
  • Perl 6 is more logical than Perl 5
  • Perl 6 uses by default better numerical types
  • Perl 6 makes it easier and more natural to do object oriented programming and functional programming
  • Perl 6 has come up with a useful approach for doing multithreading.
  • Perl 5 has so many cool libraries on CPAN, Perl 6 just a few.

Links:

Share Button

Swiss Perl Workshop 2017

I have attended the Swiss Perl Workshop.
We were a group of about 40 people, one track and some very interesting talks, including by Damian Conway.
I gave a regular talk and a lightning talk myself.
The content of my talk might go into another Blog post in the future.
The Perl programming language is still interesting, and of course it was covered in both variants: Perl 5 and Perl 6.
But many of the talks were about general issues like security and architecture and just exemplified by Perl.

The Video recording of talks was optional. Here are those that have been recorded and already uploaded: Youtube: Swiss Perl Workshop

Share Button

Alpine Perl Workshop

On 2016-09-02 and 2016-09-03 I was able to visit the Alpine Perl Workshop. This was a Perl conference with around 50 participants, among them core members of the Perl community. We had mostly one track, so the documented information about the talks that were given is actually quite closely correlated to the list of talks that I have actually visited.

We had quite a diverse set of talks about technical issues but also about the role of Perl programming language in projects and in general. The speeches were in English and German…

Perl 6 is now a reality. It can be used together with Perl 5, there are ways to embed them within each other and they seem to work reasonably well. This fills some of the gaps of Perl 5, since the set of modules is by far not as complete as for Perl 5.

Perl 5 has since quite a few years established a time boxed release schedule. Each year they ship a new major release. The previous two releases are supported for bugfixes. The danger that major Linux distributions remain on older releases has been banned. Python 3 has been released in 2008 and still in 2016 Python 2.7 is what is usually used and shipped with major Linux distributions. It looks like Perl 5 is there to stay, not be replaced by Perl 6, which is a quite different language that just shares the name and the community. But the recent versions are actually adopted and the incompatible changes are so little that they do not hurt too much, usually. An advantage of Perl is the CPAN repository for libraries. It is possible to test new versions against a ton of such libraries and to find out, where it might break or even providing fixes for the library.

An interesting issue is testing of software. For continuous integration we can now find servers and they will run against a configurable set of Perl versions. But using different Linux distributions or even non-Linux-systems becomes a more elaborate issue. People willing to test new versions of Perl or of libraries on exotic hardware and OS are still welcome and often they discover a weakness that might be of interest even for the mainstream platforms in the long run.

I will leave it with this. You can find more information in the web site of the conference.

And some of the talks are on youtube already.

It was fun to go there, I learned a lot and met nice people. It would be great to be able to visit a similar event again…

Share Button

Numeric types in Perl

Dealing with numeric types in Perl is not as strait-forward as in other programming languages. We can use „scalars“ out of the box, but then we get floating point numbers, more precisely what is called „double“ in most programming languages. This is kind of ok for trivial programs, but we should make a deliberate choice on what to use.

Actually the Perl programming language gives us (at least) two more choices. We can use 64-bit integers (or 32-bit on some platforms) by just adding

use integer;

somewhere in the beginning of the file. This causes Perl to work mostly with integer instead of floating point numbers, but the rules for this are not so obvious. You may read about them in the official documentation. Or find another explanation or one more.

Now we do want to control this on a more fine granular basis than the whole program. There may be legitimate programs that use both floating point and integers. This can be achieved in Perl as well. We can turn this off using:

no integer;

More likely we want to use another approach, that looks more natural and more robust most of the time. We just have to use blocks:

#!/usr/bin/perl -w                                  
                                                                                        
use strict;                                                                             
                                                                                        
my $f1 = 2_000_000_000;                                                                
my $f2 = $f1 * $f1;                                                                  
my $f3 = $f1 * $f2;                                                                  
my $f4 = $f1 * $f3;                                                                  
my $f5 = $f1 * $f4;                                                                  
                                                                                        
my @f = (1, $f1, $f2, $f3, $f4, $f5);                                              
for (my $i = 0; $i <= 5; $i++) {                                                          print($i, " ", $f[$i], "\n");                                                     }                                                                                                                                                                                 my $n2x;                                                                                {                                                                                            use integer;                                                                             my $n1 = 2_000_000_000;                                                                 my $n2 = $n1 * $n1;                                                                   my $n3 = $n1 * $n2;                                                                   my $n4 = $n1 * $n3;                                                                   my $n5 = $n1 * $n4;                                                                                                                                                            my @n = (1, $n1, $n2, $n3, $n4, $n5);                                               for (my $i = 0; $i <= 5; $i++) {                                                          print($i, " ", $n[$i], "\n");                                                     }                                                                                        $n2x = $n2;                                                                        }                                                                                                                                                                                 print "n2x=$n2x\n";                                                                                                                                                              my $g1 = 2_000_000_000;                                                                 my $g2 = $g1 * $g1;                                                                   my $g3 = $g1 * $g2;                                                                   my $g4 = $g1 * $g3;                                                                   my $g5 = $g1 * $g4;                                                                                                                                                            my @g = (1, $g1, $g2, $g3, $g4, $g5);                                               for (my $i = 0; $i <= 5; $i++) {                                                          print($i, " ", $g[$i], "\n");                                                     }                                                                                       
                                                                                 
This will output:                                                                       
                                                                                  
0 1                                                                                     
1 2000000000                                                                            
2 4000000000000000000                                                                   
3 8e+27                                                                                 
4 1.6e+37                                                                               
5 3.2e+46                                                                               
0 1                                                                                     
1 2000000000                                                                            
2 4000000000000000000                                                                   
3 -106958398427234304                                                                   
4 3799332742966018048                                                                   
5 7229403301836488704                                                                   
n2x=4000000000000000000                                                                 

So we see that the integer mode is constrained to the block. And we see that the results for 3, 4 and 5 went wrong...

So it may be a little bit tricky to do this, but we can. These integers have the same flaw as integers in many popular programming languages, because they silently overflow by taking the remainder modulo 2^{64} that lies in the interval [-2^{63}, 2^{63}-1] or modulo 2^{32} that lies in the interval [-2^{31}, 2^{31}-1]. I do not think that is really what we usually want and just hoping that our numbers remain within the safe range may go well in the 64-bit-case, but we have to be sure and explain this in a comment, when we work like this. Usually we do not want to think about this and spending a few extra bits costs less than hunting obscure bugs where everything looks so correct.

Our friend is

use bigint;

which switches to arbitrary precision integers.

#!/usr/bin/perl -w                             
                                                                                 
use strict;                                                                      
                                                                                 
my $f1 = 2_000_000_000;                                                         
my $f2 = $f1 * $f1;                                                           
my $f3 = $f1 * $f2;                                                           
my $f4 = $f1 * $f3;                                                           
my $f5 = $f1 * $f4;                                                           
                                                                                 
my @f = (1, $f1, $f2, $f3, $f4, $f5);                                       
for (my $i = 0; $i <= 5; $i++) {                                                   print($i, " ", $f[$i], "\n");                                              }                                                                                                                                                                   my $b2x;                                                                         {                                                                                     use bigint;                                                                       my $b1 = 2_000_000_000;                                                          my $b2 = $b1 * $b1;                                                            my $b3 = $b1 * $b2;                                                            my $b4 = $b1 * $b3;                                                            my $b5 = $b1 * $b4;                                                                                                                                              my @b = (1, $b1, $b2, $b3, $b4, $b5);                                        for (my $i = 0; $i <= 5; $i++) {                                                   print($i, " ", $b[$i], "\n");                                              }                                                                                 $b2x = $b2;                                                                 }                                                                                                                                                                   print "b2x=$b2x\n";                                                                                                                                                my $g1 = 2_000_000_000;                                                          my $g2 = $g1 * $g1;                                                            my $g3 = $g1 * $g2;                                                            my $g4 = $g1 * $g3;                                                            my $g5 = $g1 * $g4;                                                                                                                                              my @g = (1, $g1, $g2, $g3, $g4, $g5);                                        for (my $i = 0; $i <= 5; $i++) {                                                   print($i, " ", $g[$i], "\n");                                              }                                                                                

This gives us the output:

0 1
1 2000000000
2 4000000000000000000
3 8e+27
4 1.6e+37
5 3.2e+46
0 1
1 2000000000
2 4000000000000000000
3 8000000000000000000000000000
4 16000000000000000000000000000000000000
5 32000000000000000000000000000000000000000000000
b2x=4000000000000000000
0 1
1 2000000000
2 4000000000000000000
3 8e+27
4 1.6e+37
5 3.2e+46

So it is again constrained to the block, but it allows us to use arbitrary lengths of integers, as long as our memory is sufficient.

A less commonly used, but interesting approach is to work with rational numbers:

#!/usr/bin/perl -w                                      
                                                                                       
use strict;                                                                            
use bigrat;                                                                            
                                                                                       
my $x = 3/4;                                                                          
my $y = 4/5;                                                                          
my $z = 5/6;                                                                          
print("x=$x y=$y z=$z\n");                                                          
                                                                                       
my $sum = $x+$y+$z;                                                                
my $diff = $x - $y;                                                                 
my $prod = $x * $x * $z;                                                           
my $quot = $x / $y;                                                                 
print("sum=$sum diff=$diff prod=$prod quot=$quot\n");                              

This gives us:

x=3/4 y=4/5 z=5/6
sum=143/60 diff=-1/20 prod=15/32 quot=15/16

That is kind of cool...

There is also something like Math::BigFloat which can be used most easily by having

use bignum;

Find the documentation about "use bignum" and about Math::BigFloat...

You will find more numeric types, like Math::Decimal and Math::Complex.

While I would say that using good numeric types in Perl is not quite as easy as it should be, at least if we want to mix them, at least we have the means to use the adequate numeric types. And it is way better than in Java.

Share Button

Perl 6

Perl 6 has silently reached its first production ready release on Christmas 2015, called v6c. It will be interesting to explore what this language can do, which features it offers and how it compares to existing relevant and interesting languages like Java, C, Ruby, Perl (5), Clojure, Scala, F#, C++, Python, PHP and others in different aspects. It looks like Perl 5 is there to stay and will be continued. Perl 6 should actually be considered to be a different programming language than Perl 5, so the name is somewhat misleading, because it suggests slightly more similarity than there really is. On the other hand, it was done by the same people, good ideas concepts from Perl 5 were retained and so it does look somewhat similar.

Today I will just provide some links

* Main web page
* Documentation
* Documentation II
* Rakudo: Currently the major implementation
* Wikipedia
* Wikipedia (Russian)
* Wikipedia (Spanish)
* Wikipedia (French)
* Wikipeida (Norwegian)

Maybe I will write more about this in the future…

Share Button

Perl Training in Switzerland

Very soon we will have the opportunity to participate in advanced Perl trainings and even some trainings about presentations.

Here are the Details.

I found these trainings useful, when I visited them. They are done by Damian Conway, one of the core developers of the Perl programming language.

The courses will be held in English. Other than in previous years the location will be in FHNW in walking distance of the railroad station of Brugg or if you are more road oriented in walking distance of the point where the Swiss national highways 5 and 3 meet in Brugg. You can find it on the map.

Share Button

How to create ISO Date String

It is a more and more common task that we need to have a date or maybe date with time as String.

There are two reasonable ways to do this:
* We may want the date formatted in the users Locale, whatever that is.
* We want to use a generic date format, that is for a broader audience or for usage in data exchange formats, log files etc.

The first issue is interesting, because it is not always trivial to teach the software to get the right locale and to use it properly… The mechanisms are there and they are often used correctly, but more often this is just working fine for the locale that the software developers where asked to support.

So now the question is, how do we get the ISO-date of today in different environments.

Linux/Unix-Shell (bash, tcsh, …)

date "+%F"

TeX/LaTeX


\def\dayiso{\ifcase\day \or
01\or 02\or 03\or 04\or 05\or 06\or 07\or 08\or 09\or 10\or% 1..10
11\or 12\or 13\or 14\or 15\or 16\or 17\or 18\or 19\or 20\or% 11..20
21\or 22\or 23\or 24\or 25\or 26\or 27\or 28\or 29\or 30\or% 21..30
31\fi}
\def\monthiso{\ifcase\month \or
01\or 02\or 03\or 04\or 05\or 06\or 07\or 08\or 09\or 10\or 11\or 12\fi}
\def\dateiso{\def\today{\number\year-\monthiso-\dayiso}}
\def\todayiso{\number\year-\monthiso-\dayiso}

This can go into a file isodate.sty which can then be included by \include or \input Then using \todayiso in your TeX document will use the current date. To be more precise, it is the date when TeX or LaTeX is called to process the file. This is what I use for my paper letters.

LaTeX

(From Fritz Zaucker, see his comment below):

\usepackage{isodate} % load package
\isodate % switch to ISO format
\today % print date according to current format

Oracle


SELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD') FROM DUAL;

On Oracle Docs this function is documented.
It can be chosen as a default using ALTER SESSION for the whole session. Or in SQL-developer it can be configured. Then it is ok to just call

SELECT SYSDATE FROM DUAL;

Btw. Oracle allows to add numbers to dates. These are days. Use fractions of a day to add hours or minutes.

PostreSQL

(From Fritz Zaucker, see his comment):

select current_date;
—> 2016-01-08


select now();
—> 2016-01-08 14:37:55.701079+01

Emacs

In Emacs I like to have the current Date immediately:

(defun insert-current-date ()
"inserts the current date"
(interactive)
(insert
(let ((x (current-time-string)))
(concat (substring x 20 24)
"-"
(cdr (assoc (substring x 4 7)
cmode-month-alist))
"-"
(let ((y (substring x 8 9)))
(if (string= y " ") "0" y))
(substring x 9 10)))))
(global-set-key [S-f5] 'insert-current-date)

Pressing Shift-F5 will put the current date into the cursor position, mostly as if it had been typed.

Emacs (better Variant)

(From Thomas, see his comment below):

(defun insert-current-date ()
"Insert current date."
(interactive)
(insert (format-time-string "%Y-%m-%d")))

Perl

In the Perl programming language we can use a command line call

perl -e 'use POSIX qw/strftime/;print strftime("%F", localtime()), "\n"'

or to use it in larger programms

use POSIX qw/strftime/;
my $isodate_of_today = strftime("%F", localtime());

I am not sure, if this works on MS-Windows as well, but Linux-, Unix- and MacOS-X-users should see this working.

If someone has tried it on Windows, I will be interested to hear about it…
Maybe I will try it out myself…

Perl 5 (second suggestion)

(From Fritz Zaucker, see his comment below):

perl -e 'use DateTime; use 5.10.0; say DateTime->now->strftime(„%F“);‘

Perl 6

(From Fritz Zaucker, see his comment below):

say Date.today;

or

Date.today.say;

Ruby

This is even more elegant than Perl:

ruby -e 'puts Time.new.strftime("%F")'

will do it on the command line.
Or if you like to use it in your Ruby program, just use

d = Time.new
s = d.strftime("%F")

Btw. like in Oracle SQL it is possible add numbers to this. In case of Ruby, you are adding seconds.

It is slightly confusing that Ruby has two different types, Date and Time. Not quite as confusing as Java, but still…
Time is ok for this purpose.

C on Linux / Posix / Unix


#include
#include
#include

main(int argc, char **argv) {

char s[12];
time_t seconds_since_1970 = time(NULL);
struct tm local;
struct tm gmt;
localtime_r(&seconds_since_1970, &local);
gmtime_r(&seconds_since_1970, &gmt);
size_t l1 = strftime(s, 11, "%Y-%m-%d", &local);
printf("local:\t%s\n", s);
size_t l2 = strftime(s, 11, "%Y-%m-%d", &gmt);
printf("gmt:\t%s\n", s);
exit(0);
}

This speeks for itself..
But if you like to know: time() gets the seconds since 1970 as some kind of integer.
localtime_r or gmtime_r convert it into a structur, that has seconds, minutes etc as separate fields.
stftime formats it. Depending on your C it is also possible to use %F.

Scala


import java.util.Date
import java.text.SimpleDateFormat
...
val s : String = new SimpleDateFormat("YYYY-MM-dd").format(new Date())

This uses the ugly Java-7-libraries. We want to go to Java 8 or use Joda time and a wrapper for Scala.

Java 7


import java.util.Date
import java.text.SimpleDateFormat

...
String s = new SimpleDateFormat("YYYY-MM-dd").format(new Date());

Please observe that SimpleDateFormat is not thread safe. So do one of the following:
* initialize it each time with new
* make sure you run only single threaded, forever
* use EJB and have the format as instance variable in a stateless session bean
* protect it with synchronized
* protect it with locks
* make it a thread local variable

In Java 8 or Java 7 with Joda time this is better. And the toString()-method should have ISO8601 as default, but off course including the time part.

Summary

This is quite easy to achieve in many environments.
I could provide more, but maybe I leave this to you in the comments section.
What could be interesting:
* better ways for the ones that I have provided
* other databases
* other editors (vim, sublime, eclipse, idea,…)
* Office packages (Libreoffice and MS-Office)
* C#
* F#
* Clojure
* C on MS-Windows
* Perl and Ruby on MS-Windows
* Java 8
* Scala using better libraries than the Java-7-library for this
* Java using better libraries than the Java-7-library for this
* C++
* PHP
* Python
* Cobol
* JavaScript
* …
If you provide a reasonable solution I will make it part of the article with a reference…
See also Date Formats

Share Button