Lazy Collections, Strings or Numbers

The idea is, that we have data that is obtained or calculated to give us on demand as much of it as we request. But it is not necessarily initially present. This concept is quite common in the functional world, where we in a way hide the deprecated concept of state in such structures, by the way in a way that lets use retain the benefits that led to the desire for statelessness.

Actually the concept is quite old. We have it for I/O in Unix and hence in Linux since the 1970ies. „Everything is a file“, at least as long as we constrain ourselves to a universal subset of possible file operations. It can be keyboard input, a named or anonymous pipe, an actual file, a TCP-connection, to name the most important cases. These are „lazy“ files, behave more or less like files as far as sequential reading is concerned, but not for random access reading. The I/O-concept has been done in such a way that it takes the case into account that we want to read n bytes, but get only m < n bytes. This can happen with files when we reach their end, but then we can obtain an indication that we reached the end of the file, while it is perfectly possible that we read less then we want in one access, but eventually get \ge n bytes including subsequent reads. Since the API has been done right, but by no means ideal, it generalizes well to the different cases that exist in current OS environments.

We could consider a File as an array of bytes. There is actually a way to access it in this way by memory-mapping it, but this assumes a physically present file. Now we could assume that we think of the array as a list that is optimized for sequential access and iterating, but not for random access. Both list types actually exist in languages like Java. Actually the random access structure can be made lazy as well, within certain constraints. If the source is actually sequential, we can just assume that the data is obtained up to the point where we actually read. The information about the total length of the stream may or may not be available, it is always available somehow in the case of structures that are completely available in memory. This random access on lazy collections works fine if the reason of laziness is to actually save us from doing expensive operations to obtain data that we do not actually need or to obtain them in parallel to the computation that processes the data. But we loose another potential drawback in this case. If the data is truly sequential, we can actually process data that is way beyond our memory capacity.

So the concept transfers easily from I/O-streams to lists and even arrays, most naturally to iterables that can be iterated only once. But we can easily imagine that this also applies to Strings, which can be seen a sequence of characters. If we do not constrain us to what a String is in C or Java or Ruby, but consider String to be a more abstract concept, again possibly dropping the idea of knowing the length or having a finite length. Just think of the output of the Unix command „yes“ or „cat /dev/zero“, which is infinite, in a theoretical way, but the computer won’t last forever in real life, off course. And we always interrupt the output at some time, usually be having the consumer shut down the connection.

Even numbers can be infinite. For real numbers this can happen only after the decimal point, for p-adic numbers it happens only before the decimal point, if you like to look into that. Since we rarely program with p-adic numbers this is more or less an edge case that is not part of our daily work, unless we actually do math research. But we could have integers with so many digits that we actually obtain and process them sequentially.

Reactive programming, which is promoted by lightbend in the Reactive Manifesto relies heavily on lazy structures, in this case data streams. An important concept is the so called „backpressure“, that allows the consumer to slow down the producer, if it cannot read the data fast enough.

Back to the collections, we can observe different approaches. Java 8 has introduced streams as lazy collections and we need to transform collections into streams and after the operation a stream back into a collection, at least in many real life situations. But putting all into one structure has some drawbacks as well. But looking at it from an abstract point of view this does not matter. The java8-streams to not implement a collection interface, but they are lazy collections from a more abstract point of view.

It is interesting that this allows us to relatively easily write nested loops where the depth of the nesting is a parameter that is not known at compile time. We just need a lazy collections of n-tuples, where n is the actual depth of the nesting and the contents are according to what the loops should iterate through. In this case we might or might not know the size of the collection, possibly not fitting into a 32-bit-integer. We might be able to produce a random member of the collection. And for sure we can iterate through it and stop the iteration wherever it is, once the desired calculation has been completed.

Share Button

How to create ISO Date String

It is a more and more common task that we need to have a date or maybe date with time as String.

There are two reasonable ways to do this:
* We may want the date formatted in the users Locale, whatever that is.
* We want to use a generic date format, that is for a broader audience or for usage in data exchange formats, log files etc.

The first issue is interesting, because it is not always trivial to teach the software to get the right locale and to use it properly… The mechanisms are there and they are often used correctly, but more often this is just working fine for the locale that the software developers where asked to support.

So now the question is, how do we get the ISO-date of today in different environments.

Linux/Unix-Shell (bash, tcsh, …)

date "+%F"


\def\dayiso{\ifcase\day \or
01\or 02\or 03\or 04\or 05\or 06\or 07\or 08\or 09\or 10\or% 1..10
11\or 12\or 13\or 14\or 15\or 16\or 17\or 18\or 19\or 20\or% 11..20
21\or 22\or 23\or 24\or 25\or 26\or 27\or 28\or 29\or 30\or% 21..30
\def\monthiso{\ifcase\month \or
01\or 02\or 03\or 04\or 05\or 06\or 07\or 08\or 09\or 10\or 11\or 12\fi}

This can go into a file isodate.sty which can then be included by \include or \input Then using \todayiso in your TeX document will use the current date. To be more precise, it is the date when TeX or LaTeX is called to process the file. This is what I use for my paper letters.


(From Fritz Zaucker, see his comment below):

\usepackage{isodate} % load package
\isodate % switch to ISO format
\today % print date according to current format



On Oracle Docs this function is documented.
It can be chosen as a default using ALTER SESSION for the whole session. Or in SQL-developer it can be configured. Then it is ok to just call


Btw. Oracle allows to add numbers to dates. These are days. Use fractions of a day to add hours or minutes.


(From Fritz Zaucker, see his comment):

select current_date;
—> 2016-01-08

select now();
—> 2016-01-08 14:37:55.701079+01


In Emacs I like to have the current Date immediately:

(defun insert-current-date ()
"inserts the current date"
(let ((x (current-time-string)))
(concat (substring x 20 24)
(cdr (assoc (substring x 4 7)
(let ((y (substring x 8 9)))
(if (string= y " ") "0" y))
(substring x 9 10)))))
(global-set-key [S-f5] 'insert-current-date)

Pressing Shift-F5 will put the current date into the cursor position, mostly as if it had been typed.

Emacs (better Variant)

(From Thomas, see his comment below):

(defun insert-current-date ()
"Insert current date."
(insert (format-time-string "%Y-%m-%d")))


In Perl we can use a command line call

perl -e 'use POSIX qw/strftime/;print strftime("%F", localtime()), "\n"'

or to use it in larger programms

use POSIX qw/strftime/;
my $isodate_of_today = strftime("%F", localtime());

I am not sure, if this works on MS-Windows as well, but Linux-, Unix- and MacOS-X-users should see this working.

If someone has tried it on Windows, I will be interested to hear about it…
Maybe I will try it out myself…

Perl 5 (second suggestion)

(From Fritz Zaucker, see his comment below):

perl -e 'use DateTime; use 5.10.0; say DateTime->now->strftime(„%F“);‘

Perl 6

(From Fritz Zaucker, see his comment below):




This is even more elegant than Perl:

ruby -e 'puts"%F")'

will do it on the command line.
Or if you like to use it in your Ruby program, just use

d =
s = d.strftime("%F")

Btw. like in Oracle SQL it is possible add numbers to this. In case of Ruby, you are adding seconds.

It is slightly confusing that Ruby has two different types, Date and Time. Not quite as confusing as Java, but still…
Time is ok for this purpose.

C on Linux / Posix / Unix


main(int argc, char **argv) {

char s[12];
time_t seconds_since_1970 = time(NULL);
struct tm local;
struct tm gmt;
localtime_r(&seconds_since_1970, &local);
gmtime_r(&seconds_since_1970, &gmt);
size_t l1 = strftime(s, 11, "%Y-%m-%d", &local);
printf("local:\t%s\n", s);
size_t l2 = strftime(s, 11, "%Y-%m-%d", &gmt);
printf("gmt:\t%s\n", s);

This speeks for itself..
But if you like to know: time() gets the seconds since 1970 as some kind of integer.
localtime_r or gmtime_r convert it into a structur, that has seconds, minutes etc as separate fields.
stftime formats it. Depending on your C it is also possible to use %F.


import java.util.Date
import java.text.SimpleDateFormat
val s : String = new SimpleDateFormat("YYYY-MM-dd").format(new Date())

This uses the ugly Java-7-libraries. We want to go to Java 8 or use Joda time and a wrapper for Scala.

Java 7

import java.util.Date
import java.text.SimpleDateFormat

String s = new SimpleDateFormat("YYYY-MM-dd").format(new Date());

Please observe that SimpleDateFormat is not thread safe. So do one of the following:
* initialize it each time with new
* make sure you run only single threaded, forever
* use EJB and have the format as instance variable in a stateless session bean
* protect it with synchronized
* protect it with locks
* make it a thread local variable

In Java 8 or Java 7 with Joda time this is better. And the toString()-method should have ISO8601 as default, but off course including the time part.


This is quite easy to achieve in many environments.
I could provide more, but maybe I leave this to you in the comments section.
What could be interesting:
* better ways for the ones that I have provided
* other databases
* other editors (vim, sublime, eclipse, idea,…)
* Office packages (Libreoffice and MS-Office)
* C#
* F#
* Clojure
* C on MS-Windows
* Perl and Ruby on MS-Windows
* Java 8
* Scala using better libraries than the Java-7-library for this
* Java using better libraries than the Java-7-library for this
* C++
* Python
* Cobol
* JavaScript
* …
If you provide a reasonable solution I will make it part of the article with a reference…
See also Date Formats

Share Button

What do +, – and * with Integer do?

When using integers in C, Java or Scala, we often use what is called int.

It is presented to us as the default.

And it is extremely fast.

Ruby uses by default arbitrary length integers.

But what do +, – and * mean?

We can rebuild them, in Ruby, kind of artificially restrict the integers to what we have in other programming langauges as int:

MODULUS = 0x100000000;
LIMIT   =  0x80000000;

def normalize(x)
  r = x % MODULUS;
  if (r < -LIMIT) then     return r + MODULUS;   elsif (r >= LIMIT) 
    return r - MODULUS;
    return r;

def intPlus(x, y)

def intMinus(x, y)

def intTimes(x, y)

x = 0x7fffffff;
y = intPlus(x, x);
z = intPlus(x, x);
puts("x=#{x} y=#{y} z=#{z}");

What is the outcome?

Exactly what you get when doing 32-Bit-Ints in C, Java, Scala or C#:

x=2147483647 y=-2 z=-2

int is always calculated modulo a power of two, usually 2^{32}. That is the
x % MODULUS in normalize(). The Rest of the function is just for normalizing the result to the range -2^{31} \ldots 2^{31}-1.

So we silently get this kind of result when an overflow situation occurs, without any notice.
The overflow is not trivial to discover, but it can be done.
For addition I have described how to do it.

See also Overflow of Integer Types, an earlier post about these issues…

I gave a talk that included this content adapted for Scala at Scala Exchange 2015.

Share Button

Conversion of ASCII-graphics to PNG or JPG

Images are usually some obscure binary files. Their most common formats, PNG, SVG, JPEG and GIF are well documented and supported by many software tools. Libraries and APIs exist for accessing these formats, but also a phantastic free interactive software like Gimp. The compression rate that can reasonably be achieved when using these format is awesome, especially when picking the right format and the right settings. Tons of good examples can be found how to manipulate these image formats in C, Java, Scala, F#, Ruby, Perl or any other popular language, often by using language bindings for Image Magick.

There is another approach worth exploring. You can use a tool called convert to just convert an image from PNG, JPG or GIF to XPM. The other direction is also possible. Now XPM is a text format, which basically represents the image in ASCII graphics. It is by the way also valid C-code, so it can be included directly in C programms and used from there, when an image needs to be hard coded into a program. It is not generally recommended to use this format, because it is terribly inefficient because it uses no compression at all, but as intermediate format for exploring additional ways for manipulating images it is of interest.
An interesting option is to create the XPM-file using ERB in Ruby and then converting it to PNG or JPG.

Share Button

Find the next entry in a sequence

In Facebook, Xing, Google+,, Linkedin and other of these social media networks we are often encountered with a trivial question like this:


There are some easy patterns. Either it is some polynomial formula or some trick with the digits.
But the point is, that any such sequence can easily be fullfilled by a polynomial formula. That means we can put any value for 7 and make it work. Or any answer is correct. So what would probably be the real question is the most simple function to full-fill the given constraints. Simplicity can be measured in some way… If the solution is unique is unclear, but let us just look at the polynomial solution.

A function is needed that takes as parameter a list of key-value-pairs (or a hash map) and that yields a function such that the function of any of the key is the associated value.

Assuming a polynomial function in one variable we can make use of the chinese remainder theorem, which can be applied to univariate polynomials over a field F as well as to integral numbers. For a polynomial p(X) we have

    \[p(x) \equiv p(X) \mod X-x\]

where X is the polynomial variable and x\in F is a concrete value.

We are looking for a polynomial p(X) such that for given values x_0,\ldots x_{n-1}, y_0,\ldots,y_{n-1} \in F we have

    \[\bigwedge_{i=0}^{n-1} p(x_i) = y_i\]

or in another way

    \[\bigwedge_{i=0}^{n-1} p(X) \equiv y_i \mod X-x_i\]

which is exactly the Chinese remainder theorem.



    \[\bigwedge_{j=0}^{n-1} I_j = I \setminus \{j\}\]

We can see that for all i \in I the polynomials

    \[e_i = \prod_{j \in I_j} \frac{X-x_j}{x_i-x_j}\]

have the properties


    \[\bigwedge_{j \in I_i} e_i(x_j)=0\]


    \[\bigwedge_{i \in I}\bigwedge_{j \in J} e_i(x_j)=\delta_{i,j}\]

where \delta_{i,j} is the Kronecker symbol, which is 0 if the two indices differ and 1 if they are equal.
Or as congruence:

    \[\bigwedge_{i \in I}\bigwedge_{j \in J} e_i(X)\equiv \delta_{i,j} \mod X-x_j\]

Then we can just combine this and use

    \[p(X) =\sum_{i \in I} y_i e_i(X)\]

This can easily be written as a Ruby function

def fun_calc(pairs)
  n = pairs.size
  result = lambda do |x|
    y = 0
    n.times do |i|
      p_i = pairs[i]
      x_i = p_i[0].to_r
      y_i = p_i[1].to_r
      z = y_i
      n.times do |j|
        if (j != i)
          p_j = pairs[j]
          x_j = p_j[0]
          z *= (x - x_j) / (x_i - x_j)
      y += z

This takes a list of pairs as a parameter and returns the polynomial function als lambda.
It can be used like this:

lop = [[0, 0], [1, 1], [2, 4], [3, 9], [4, 16], [5, 25], [6, 36], [7, 64]]

f = fun_calc(lop)

20.times do |x|
  y =
  puts sprintf("%6d -> %6d", x, y)

Put this together into a ruby program and add some parsing for the list of pairs or change the program each time you use it and all these „difficult“ questions „that 99.9% fail to solve“ are not just easy, but actually soluble automatically.

This is interesting for more useful applications. I assume that there will always be situations where a function is needed that meets certain exact values a certain inputs and is an interpolation or extrapolation of this.

Please observe that there are other interesting and useful ways to approach this:

  • Use a „best“ approximation from a set of functions, for example polynomials with a given maximum degree
  • use cubic splines, which are cubic polynomials within each section between two neighboring input values such that at the input values the two adjacent functions have the same value (y_i, off course), the same first derivative and the same second derivative.

For highway and railroad construction other curves are used, because the splines are making an assumption on what is the x-axis and what is the y-axis, which does not make sense for transport facilities. They are using a curve called Clothoid.

Use Java, C, Perl, Scala, F# or the programming language of your choice to do this. You only need Closures, which are available in Java 8, F#, Scala, Perl, Ruby and any decent Lisp dialect. In Java 7 they can be done with an additional interface as anonymous inner classes. And for C it has been described in this blog how to do closures.

Share Button

Testing Java- and C-programs with Ruby and Perl

It is very important to write good unit tests for software that is non-trivial and that is relied on by other pieces of software.
Often the logic of the software can easily be covered by the native testing facilities of the programming language, like JUnit for Java or, much less well known but available, CUnit for C. When a lot of framework code is involved or third party libraries are used heavily, there is almost no other way for certain tests, because setting up the environment cannot easily be achieved elsewhere.

But we also encounter cases where writing good unit tests in the same language as the library itself becomes a pain. We procrastinate the issue of writing them and end up with way too little or no unit tests at all. A lot of software deals with doing some calculations or transformations of numbers and strings, usually a lot of numbers and a lot of strings. Now Strings are the strength of Perl and are not really implemented very well or at least not very powerful or easy to use in most other languages. Specifically the String facilities of Perl are much superior to those of Java and C. Off course our software needs to perform well, it needs to integrate into the environment and follow the global corporate software standards, so Java or C or some other programming language is the choice that should not be challenged here for the productive software. But some tests of the functionality can more easily be achieved by iterating over some input data and creating output of the input data combined with the results. This can be perl code already or something really easy to parse. Perl is really the tool that can parse almost anything, but we do not really want to be distracted by unnecessary work but get our job done. So something like generating Perl code or CSV or YAML or JSON, but please not XML if not really needed, should do. Then we can pipe the output to perl or to a perl script and this will tell us, if everything is ok. When we know our platform, it can even be done that the Java- or C-Unit-test stores the output to a file or pipe and calls the Perl script on it and fails or succeeds depending on its output.

When it comes to numeric types, Ruby is very strong. It has unlimited size integers by default, which can be casted to n-bit-integers using constructions like
it has Rational, LongDecimal (as an external gem) and Complex and is easily extendible.

Usually we can expect that corporate constraints on which tools and programming languages may be used are less restrictive when it comes to unit tests. Integrating this on a continuous integration platform is a job that needs to be addressed but it is worth the effort, if a lot of tests become easier with this approach. And doing tests in another language makes tests more credible.

Off course the general idea is applicable for other combinations. Look into Scala, F#, Clojure, JavaScript, Python and some others as well, if they seem to be more helpful than Ruby or Perl for your unit testing automation. But this does indeed raise the question if a world where corporate policies allowed Scala and Closure instead of Java, F# instead of C# and Elixir instead of Erlang and PL/I instead of Cobol would be better.

Share Button

Why I still like Ruby

Some years ago Ruby in conjunction with rails was an absolute hype. In the Rails User Group in Zürich we had meetups with 30 people every two weeks. The meetings every two weeks have been retained, but often we are just five to ten people now. Is the great time of Ruby over or is it just a temporary decline? Is Perl 6 now taking over, since it is ready?

Perl 6 is cool and could challenge Ruby and it is now getting more and more towards production readiness. This will happen next year. This is what we will say next year. But Perl 6 is or will be very cool, stay tuned… What about Perl 5? At least Allison Randall still likes Perl (5). But it has other areas of strength than Ruby, so I will leave my opinionated writing focused on Ruby.

But the hype is right now in the area of JavaScript.

There are approaches to use server side JavaScript, which are interesting, because JavaScript needs to be dealt with on the client anyway and then it is tempting to have the same language on both sides.

There are also approaches to move back from the server to the client and put more functionality into the client and make the server less relevant, at least the percentage of development effort of the client gets more and of the server gets less.

There are cool frameworks for the client development in JavaScript and in conjunction with HTML5 and these frameworks a lot can be achieved.

On the other hand we are exploring NoSQL-databases like MongoDB and MongoDB is using JavaScript with some extension library as query language.

The general hype is toward functional programming and voila, JavaScript is functional, it is the most functional of the languages that have been established for a long time and brought the functional paradigm to every developer and even to the more sophisticated web designer, everything under the radar and long time before we knew how cool functional programming is. On the other hand, Ruby (and by the way Perl and especially Perl6) allow programming according to the functional paradigm as well, if used appropriately. But in JavaScript it is slightly more natural.

Then we have more contenders. Java is not dying so soon and it is still strong on its web frameworks, JavaServerFaces and one million others, not so bad actually.

And the idea of rails or at least its name, has been taken over by the groovy guys to develop groovy on grails, which integrates nicely with a Java enterprise backend.

And the more functional languages like F#, Scala, Haskell, Erlang, Elixir (and some others for the guys that don’t find Haskell theoretical enough) are around and gaining or retaining some popularity. Will Scala support the Erlang-VM as alternative backend, like JavaScript and JVM are supported today (and dotnet-CLI in the past)? And Scala has a promising web framework with play, that does WebServices right, which is what is actually needed for the modern client side JavaScript frameworks, just as an example.

PHP was always the ugly baby, why did they implement a second Perl based on the subset of Perl they understood? Well, I am not advocating PHP at all, but there are some pretty decent PHP-applications that are working really well with high load, like Wikipedia.

The unequal twin of Ruby, Python, is doing pretty well as well on attempting to get a web framework called Django. I have not investigated it at all, but it seems to exist.

So, where is ruby and rails? Ruby is a beautiful language, that has done many things better than any of the languages mentioned here:

It is easy to learn.

It has a nice and complete and consequent concept for object orientation.

It allows all the functional programming, most of it pretty natural.

It has decent numerical types by default. Integer grow to arbitrary length, Rationals are included and Complex numbers are included. JavaScript does not even have integers, it just has one numerical type: double precision floating point.

It allows extending, even operator overloading for new numerical types.

And it has a useful and easily accessible web framework like rails and actually some contender as alternative web frameworks, like camping, sinatra and some others.

And the development effort and the amount of code needed for a certain functionality is still lower than in Java or C# or COBOL.

I do believe that the web applications that are mostly server based and are using just HTML or minimal JavaScript still have their place and will be discovered again as useful for many application areas.

To conclude, I think that Ruby did not make it to become the replacement for Java, that some saw in it and that was due according to the history of replacement of other mainstream languages that have moved to legacy in the past. But I do believe that it will play an important role for some years to come and beyond the current JavaScript hype. Off course it will be challenged again in the future and maybe something will replace the main stream and will make Ruby on Rails just another legacy area for web applications. But I don’t even have an idea what that will be. I don’t think JavaScript. Perl6, Scala, F# or Clojure do have technical potential to do so, but I do not see that happening in the near future. Maybe something new will come up? Or something old? Just remember, Ruby is about as old as Java.

We will see.

Read more in this discussion on Quora.

For those who are interested, I actually teach Ruby and it is possible to arrange trainings for two to five days, depending on the previous knowledge of the participants and the goals.

Share Button

Monads in Ruby

Monads? Easy!

A Monad over X simply is a Monoid in the Category of Endofunctors of X.

Monads? Really!

  • Monads are a concept in mathematics
  • Algebra is an area of mathematics
  • Category theory is an abstraction of algebra.
  • Monads are defined in category theory
  • Remember: in category theory we talk about infinity beyond cardinality of infinite sets…
  • The „idea“ has been transplanted into something used in programming languages…
  • Just think of stars in terms of astronomy, in terms of decoration and in terms of music or movies

Monad as Design Pattern

  • Actually Monads should be viewed as a design pattern
  • That describes the level of abstraction


  • Pure functional programmers are poor guys when it comes to state
  • State is evil
  • State is not possible or hard
  • I/O is state, actually a segment of the outer world is manipulated
  • Monads encapsulate that
  • Even if you are happy with state (poor IT theology…):
    Multilevel nil problem 🙁

Monads in Functional Programming

  • Container type
  • wrap(foo):
    • class method
    • can also be called unit
    • Btw. arrogant Haskell guys call it return
  • pass(&block)
    • instance method
    • Can also be called bind
  • Optionally more operations (mjoin, empty, +,…)

Identity Monad

class Identity
    def initialize(value)
      @value = value

def Identity.wrap(value)

class Identity
    def pass
        yield @value


  • Left-identity for pass:
    Calling pass on a newly wrapped value is the same as applying the block to that value
  • Right-identity for pass:
    Calling pass with a block that only calls wrap on its parameter results in the same as its target object
  • Nesting:
    Nesting pass blocks should be equivalent to calling them in sequence.

Does our Monad survive 1st law?

Identity.wrap(foo).pass do |value|

Is equivalent to


That is how we defined pass.

Does our Monad survive 2nd law?

bar.pass do |value|

Is equivalent to


That is how we defined pass.

Does our Monad survive 3rd law?

bar.pass do |value_a|
end.pass do |value_b|

Is equivalent to

bar.pass do |value_a|
  f(value_a).pass do |value_b|

That is how we defined pass.

Array as Monad

def Array.wrap(value)
    [ value ]

class Array
    def mjoin
        inject([]) { |combined, arr| combined + arr }

    def pass(&block)

Does it work?

Have a look yourself…

More Monad Operations


def Array::empty


Already there

Other Examples

  • Option (Some, None)

Links & Sources

This article is based on a speach given in the Ruby on Rails user group Switzerland.

Share Button

Residue Class Rounding


If you do not know what residue classes are, just read on, it will be explained to the extent needed later on.

The concept of rounding seems familiar, but let us try to grab it a little bit more systematically.

Let us assume a set M of numbers and a subset N \subseteq M of it, whose elements can be represented in the programming or software environment we are having in mind. We want to use a metric d : M \times M \rightarrow \Bbb R_+, (x, y) \mapsto r=d(x,y)>=0. Usually we have three requirements for d:

  • Identity of indiscernibles: d(x,y) = 0 \iff x = y
  • Symmetry: d(x,y)=d(y,x)
  • Triangular inquation: d(x,z) \le d(x,y) + d(y,z)

Usually the number that we are dealing with can be considered to be within the world of real or complex numbers, we can hence assume that M \subseteq \Bbb C, often even M \subseteq \Bbb R or if we are honest M \subseteq \Bbb Q. Then we are usually working with d(x,y) = |x-y|, which is kind of implicitly clear. If you do not know complex numbers, just think of real or even rational numbers, which is the most common case anyway. Off course the concepts for rounding of p-adic numbers are really interesting and beautiful, but since I do not want to explain p-adic numbers here, I will not extend on this issue.

What is our intuitive understanding of rounding?
Maybe just a map
r: M \rightarrow N, x \mapsto r(x),
that is chosen with certain constraints for each x in such a way that d(x, r(x)) is minimal.
We need the constraints to enforce uniqueness if multiple values y\in N with minimal d(x,y) exist. The classical case of rounding to integral numbers has to answer the question of how to round 0.5. If N is a subset of the real numbers, which is „usually“ the case, we have ordering. We can choose between the following constraints:

|r(x)|\ge |x| Z.B. r(0.4)=1 and r(0.5)=1 and r(-0.5)=-1
|r(x)|\le |x| Z.B. r(0.6)=0 and r(0.5)=0 and r(-0.5)=0
r(x) \ge x Z.B. r(0.4)=1 and r(0.5)=1 and r(-0.5)=0
r(x) \le x Z.B. r(0.6)=0 and r(0.5)=0 and r(-0.5)=-1
Minimize d(r(x),x), but if multiple optimal values exist for r(x), pick the one furthest away from 0, for example r(0.4)=0 and r(0.6)=1 and r(0.5)=1 and r(-0.5)=-1
Minimize d(r(x),x), but if multiple optimal values exist for r(x), pick the one closest to 0, for example r(0.4)=0 and r(0.6)=1 and r(0.5)=0 and r(-0.5)=0
Minimize d(r(x),x), but if multiple optimal values exist for r(x), pick the largest, for example r(0.4)=0 and r(0.6)=1 and r(0.5)=1 and r(-0.5)=0
Minimize d(r(x),x), but if multiple optimal values exist for r(x), pick the smallest, for example r(0.4)=0 and r(0.6)=1 and r(0.5)=0 and r(-0.5)=-1
Minimize d(r(x),x), but if multiple optimal values exist for r(x), pick the one with even last digit. Please observe that this constraint is useful in the classical case, but it cannot be generalized. For example: r(0.4)=0 and r(0.6)=1 and r(0.5)=0 and r(-0.5)=0 and (1.5)=2
This constraint does not work in the mathematical sense (or only with ugly abusive math formulas), but programmatically we can do that: We try r(x)=x and throw an exception if not x\in N.

Usually we are thinking in the decimal system (even though our computers prefer something else, but the computer should serve us and not vice versa..). So we pick a power of ten 10^n with n \in \Bbb N_0, so n \ge 0. Now we simply define
N = \{ x \in M : 10^n x \in \Bbb Z\},
which means that N contains all numbers of M with a maximum of n digits after the decimal point. This rounding works quite well with something like LongDecimal in Ruby or BigDecimal in Scala or Java, but BigDecimal offers fewer rounding modes than LongDecimal for Ruby.

Now we look at the residue class rounding. We assume such a power of ten 10^n. Then we need an integral number m \ge 2 and a set of numbers R \subseteq \{0,...,m-1\}, we call them residues. Now we define N = \{ x \in M : 10^n x \in {\Bbb Z} \wedge \bigvee_{r \in R} 10^n x \equiv r \mod{m}\}.
That means that if we fill the number with zeros to the given number of places after the decimal point, remove the decimal point and perform a division with residue of this number by m, the residue lies in R. In this case ROUND_HALF_EVEN can become difficult to implement and ambiguous, so we might need to sacrifice it. But even worse, we have to deal with the case that 0 is not in R and provide rules how to round 0. The candidates have self-explanatory names:


An important application of this is m=10 and R=\{0, 5\}. This is used in Switzerland and probably other currency regions for rounding money amounts to multiples of 5 Rappen (0.05 CHF). This has already been described in „Rounding of Money Amounts„. If we have m=10 and R=\{0,1,2,3,4,5,6,7,8,9\} we have the usual non-CHF-rounding case. Maybe there are even cases for m=10 and R=\{0,2,4,6,8\}. But the concept is more general, if you like to use it.

Share Button

Rails 4.0 Beta

Rails 4.0 Beta. 😉

Share Button