Clojure-Art

It is an interesting idea to generate colorful images using or music. In both areas Clojure seems to be quite attractive. Not having explored the music side, I did find the idea of creating images fun and inspiring. It also shows us something about the functions we are working with, if we learn to read the images right, but that will come or not, depending on the circumstances. It is useful not to be too scared of some mathematics when reading this.

Now the challenge is to create an image on a two dimensional array of points, for example 1000×1000 pixel, with x- and y-coordinates ranging from 0 to 999. Each pixel needs to be colored. While it is very interesting to explore different color models, we can for simplicity assume that we need 3 numbers each ranging from 0 to 255 for the red, green and blue channels. This is how most displays work, more or less. Now the goal is to create something that looks good. And of course is reasonable to program, otherwise we could just color one million points individually using for example GIMP, but a million is a lot.

Now we can apply any function on x and y and play around with functions like exp, log, sqrt, sin, cos, tan, sec, csc, sinh, … and of course the basic operations +, -, * and /. It turns out that in most cases we do not get interesting images, but experience will show what is promising to explore. I tried to create pictures by keeping the three channels fairly independent, but this did not work so well. It seems that it is better to keep some connection. One approach that actually works quite well is to consider the pair (x,y) as a complex number z = x+iy and to apply just one complex function on it, again exp, log, sqrt, sin, …. are good building blocks. Now these complex functions have a tendency to grow to infinity somewhere. While real functions can avoid this issue by constraining themselves just to one strait line on the plane, complex functions almost have to go to infinity somewhere. By making the square small enough or by changing the scale we can avoid this, but it imposes quite severe constraints. The Riemann Sphere allows us to map any complex number to a point on the surface of a sphere. With some scaling we can already get to RGB-space and get coordinates that are using, but not exceeding the desired range. There are more ways to visualize complex numbers, but this is a possibility worth exploring.

Another way is to just use functions that calculate a real number and to apply a \sin to it. With some shifting and scaling the values will be between 0 and 255 only and there are nor abrupt changes in color, unless the function we calculated is very steep or very chaotic. Using phase shifts by \frac{2\pi}{3} and \frac{4\pi}{3} the three color channels can be served and we get nice rainbow-waves like the following:

Clojure Art: angle + log(r)

Clojure Art: angle + log(r)

Another experiment was to just assume the HSV-model and to calculate the colors from assuming the function is the H-part. But this ended up looking like plastic and I did not like it too much.

An important issue to observe is that functions may end up in exceptions. I wrapped the functions, so that they do not stop the calculation of the image half way through, but instead provide default values in cases where an exceptions occured.

It can also be fun to explore bitwise-functions like bitxor or even functions like the p-adic exponential function, which yields totally different kind of images.

I have put some of the code from my experiments into Github and licensed it with the GPL, so you can use it as a starting point. Others have worked with this as well, for example Clojure Art on Tumblr, Clojure Art Collective on github, another „clojure art“ on github or creative computing with clojure on O’Reilly’s blog.

Enjoy it and learn some Clojure. I sometimes use this when teaching Clojure.

Share Button

DB Persistence without UPDATE and DELETE

When exploring the usage of databases for persistence, the easiest case is a database that does only SELECT. We can cache as much as we like and it is more or less the functional immutable world brought to the database. For working on fixed data and analyzing data this can sometimes be useful.

Usually our data actually changes in some way. It has been discussed in this Blog already, that it would be possible to extend the idea of immutability to the database, which would be achieved by allowing only INSERT and SELECT. Since data can correlate, an INSERT in a table that is understood as a sub-entity via a one-to-many-relationship by the application actually is mutating the containing entity. So it is necessary to look at this in terms of the actual OR-mapping of all applications that are running on that DB schema.

Life can be simple, if we actually have self contained data as with MongoDB or by having a JSON-column in PostgreSQL, for example. Then inter-table-relations are eliminated, but of course it is not even following the first normal form. This can be OK or not, but at least there are good reasons why best practices have been introduced in the relational DB world and we should be careful about that. Another approach is to avoid the concept of sub entities and only work with IDs that are foreign keys. We can query them explicitly when needed.

An interesting approach is to have two ID-columns. One is an id, that is unique in the DB-table and increasing for newly created data. One is the entity-ID. This is shared between several records referring to different generations of the same object. New of them are generated each time we change something and persist the changes and in a simple approach we just consider the newest record with that entity-ID valid. It can of course be enhanced with validFrom and validTo. Then each access to the database also includes a timestamp, usually close to current time, but kept constant across a transaction. Only records for which validFrom <= timestamp < validTo are considered, and within these the newest. The validFrom and validTo can form disjoint intervals, but it is up to the application logic if that is needed or not. It is also possible to select the entry with the highest ID among the records with a given entityID and timestamp-validTo/From-condition. Deleting records can be simulated by this as well, by allowing a way to express a "deleted" record, which means that in case we find this deleted record by our rules, we pretend not having found anything at all. But still referential integrity is possible, because the pre-deletion-data are still there. This concept of having two IDs has been inspired by a talk on that I saw during Clojure Exchange 2017: Immutable back to front.

Share Button

Lazy Collections, Strings or Numbers

The idea is, that we have data that is obtained or calculated to give us on demand as much of it as we request. But it is not necessarily initially present. This concept is quite common in the functional world, where we in a way hide the deprecated concept of state in such structures, by the way in a way that lets use retain the benefits that led to the desire for statelessness.

Actually the concept is quite old. We have it for I/O in Unix and hence in Linux since the 1970ies. „Everything is a file“, at least as long as we constrain ourselves to a universal subset of possible file operations. It can be keyboard input, a named or anonymous pipe, an actual file, a TCP-connection, to name the most important cases. These are „lazy“ files, behave more or less like files as far as sequential reading is concerned, but not for random access reading. The I/O-concept has been done in such a way that it takes the case into account that we want to read n bytes, but get only m < n bytes. This can happen with files when we reach their end, but then we can obtain an indication that we reached the end of the file, while it is perfectly possible that we read less then we want in one access, but eventually get \ge n bytes including subsequent reads. Since the API has been done right, but by no means ideal, it generalizes well to the different cases that exist in current OS environments.

We could consider a File as an array of bytes. There is actually a way to access it in this way by memory-mapping it, but this assumes a physically present file. Now we could assume that we think of the array as a list that is optimized for sequential access and iterating, but not for random access. Both list types actually exist in languages like Java. Actually the random access structure can be made lazy as well, within certain constraints. If the source is actually sequential, we can just assume that the data is obtained up to the point where we actually read. The information about the total length of the stream may or may not be available, it is always available somehow in the case of structures that are completely available in memory. This random access on lazy collections works fine if the reason of laziness is to actually save us from doing expensive operations to obtain data that we do not actually need or to obtain them in parallel to the computation that processes the data. But we loose another potential drawback in this case. If the data is truly sequential, we can actually process data that is way beyond our memory capacity.

So the concept transfers easily from I/O-streams to lists and even arrays, most naturally to iterables that can be iterated only once. But we can easily imagine that this also applies to Strings, which can be seen a sequence of characters. If we do not constrain us to what a String is in C or Java or Ruby, but consider String to be a more abstract concept, again possibly dropping the idea of knowing the length or having a finite length. Just think of the output of the Unix command „yes“ or „cat /dev/zero“, which is infinite, in a theoretical way, but the computer won’t last forever in real life, of course. And we always interrupt the output at some time, usually be having the consumer shut down the connection.

Even numbers can be infinite. For real numbers this can happen only after the decimal point, for p-adic numbers it happens only before the decimal point, if you like to look into that. Since we rarely program with p-adic numbers this is more or less an edge case that is not part of our daily work, unless we actually do math research. But we could have integers with so many digits that we actually obtain and process them sequentially.

Reactive programming, which is promoted by lightbend in the Reactive Manifesto relies heavily on lazy structures, in this case data streams. An important concept is the so called „backpressure“, that allows the consumer to slow down the producer, if it cannot read the data fast enough.

Back to the collections, we can observe different approaches. Java 8 has introduced streams as lazy collections and we need to transform collections into streams and after the operation a stream back into a collection, at least in many real life situations. But putting all into one structure has some drawbacks as well. But looking at it from an abstract point of view this does not matter. The java8-streams to not implement a collection interface, but they are lazy collections from a more abstract point of view.

It is interesting that this allows us to relatively easily write nested loops where the depth of the nesting is a parameter that is not known at compile time. We just need a lazy collections of n-tuples, where n is the actual depth of the nesting and the contents are according to what the loops should iterate through. In this case we might or might not know the size of the collection, possibly not fitting into a 32-bit-integer. We might be able to produce a random member of the collection. And for sure we can iterate through it and stop the iteration wherever it is, once the desired calculation has been completed.

Share Button

Clojure

Functional programming languages have become a bit of a hype.

But the ideas are not really so new.
The first languages beyond Assembly language that have maintained some relevance up to today were FORTRAN, COBOL and Lisp. Indirectly also Algol, because it inspired pretty much any modern mainstream programming language in some way through some intermediate ancestors. The early Algol Variants itself have disappeared .

It can be argued if the early Lisp Dialects were really functional languages, but they did support some functional patterns and made them popular at least in certain communities. I would say that popular scripting languages like the Ruby programming language, the Perl programming language, the Lua programming language and especially JavaScript brought them to the main stream.

Lisp has always remained in its niche. But the question arose on creating a new Lisp that follows more strictly the functional paradigm and is somewhat more modern, cleaner and simpler than the traditional Lisps. It was done and it is called Clojure.

So anybody who has never used any Lisp will at first be lost, because it is a jungle of parentheses „((((())))()()()(…)“ with some minor stuff in between…
Actually that is an issue, when we move from today’s common languages to Clojure. But it is not that bad. The infix-notation is familiar to us, but it has its benefits to use one simple syntax for almost everything.

An expression that consists of a function call is written like this (function-name param1 param2 parm3...). +, -, *,…. are just functions like anything else, so if we want to write 3\cdot4 + 5\cdot6 we just write (+ (* 3 4) (* 5 6)).

In the early days of calculators it was easier to build something that works with a notion called „RPN“, so there we would write 3 ENTER 4 * 5 ENTER 6 * +, which is similar to the Lisp way, but just the other way round.

It is easy to add a different number of values:
* (+) -> 0
* (+ 7) -> 7
* (+ 1 2 3 4 5 6 7) -> 28

In Clojure functions are just normal values like numbers, arrays, lists,… that can be passed around.. It is good programming practice to rely on this where it helps. And with more experience it will be helpful more often.

Immutability is king. Most of the default structures of Clojure are immutable. They can be passed around without the fear that they might change once they have been constructed. This helps for multithreading.

Clojure provides lists, arrays, sets, hashmaps, and the sorted variants of the latter. These can be written easily:
* List: (list 1 2 3) -> (1 2 3) (entries are evaluated in this case)
* List: '(1 2 3) -> (1 2 3) (entries are not evaluated in this case)
* Array: [1 2 3] (entries are evaluated in this case)
* Set: #{1 2 3} (entries are evaluated in this case)
* Map: {1 2, 3 4} (entries are evaluated in this case. key-value-Pairs are usually grouped with a comma, which is whitespace for Clojure)

All of these are immutable. So methods that change collections, always create a copy that contains the changes. The copy can be done lazily or share data with the original.

Actually I can teach Clojure in course of two to five days duration depending on the experience of the participants and the goals they want to achieve.

There is much more to write about Clojure…

Share Button