Indexing of Arrays and Lists

We index arrays with integers, and the same holds for lists, at least those that allow random access. The sizes of collections are integers as well.
This allows for 2^{31}-1=2147483647 entries in Java and typical JVM languages, because int is a signed 32-bit type. We could actually squeeze in one more entry by using the indices 0..2147483647, but then the size could no longer be expressed as a signed integer. This is built quite deeply into the language, so it is not easy to break out of it. And 2’000’000’000 entries are a lot and take a lot of time to process; at least they were a lot in the first few years of Java. An unsigned variant of integers, which Java lacks, would allow for around 4’000’000’000 entries when indexing by an uint32, but that would not really solve the problem. C, by contrast, uses 64-bit integers (size_t) for indexing arrays on 64-bit systems.

It turns out that we would like to be able to index arrays with long instead of int. But changing Java's arrays so that they could be indexed by long would break a lot of compatibility. I think this is practically impossible, because Java promises very good backward compatibility, and this reliability of both the language and the JVM has been a major advantage. A second kind of array, indexed by long, could be added instead. But that would add even more complexity to APIs like reflection, which have to deal with all parameter cases and where it already hurts that primitives are not objects and arrays are only half-objects. So it will be interesting to see what happens in this area in the future.

For practical use the situation is a bit easier. We can already be quite happy with a second set of collections, let us call them BigCollections, whose sizes can only be expressed with a long and which are indexed by long where applicable. It is not too hard to program a BigList by internally using an array of arrays (or an array of arrays of arrays) and doing some arithmetic to calculate the internal indices from the long (int64) index given in the API. We can actually gain some performance when resizing happens, because this structure, if done well, allows more efficient resizing. On this basis all kinds of big collections can be built.
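As a minimal sketch of the index arithmetic such a structure could use, here is a hypothetical long-indexed array of longs built on a two-level array of arrays, with inner arrays of 2^20 elements (all names and the chunk size are illustrative, not a real library API):

```java
// Sketch of long-based indexing over an array of arrays (illustrative only).
public class BigArray {
    private static final int INNER_BITS = 20;             // inner arrays of 2^20 elements
    private static final int INNER_SIZE = 1 << INNER_BITS;
    private static final int INNER_MASK = INNER_SIZE - 1;

    private final long[][] data;
    private final long size;

    public BigArray(long size) {
        this.size = size;
        int outer = (int) ((size + INNER_SIZE - 1) >>> INNER_BITS);  // ceiling division
        data = new long[outer][];
        for (int i = 0; i < outer; i++) {
            long remaining = size - ((long) i << INNER_BITS);
            data[i] = new long[(int) Math.min(INNER_SIZE, remaining)];
        }
    }

    public long get(long index) {
        return data[(int) (index >>> INNER_BITS)][(int) (index & INNER_MASK)];
    }

    public void set(long index, long value) {
        data[(int) (index >>> INNER_BITS)][(int) (index & INNER_MASK)] = value;
    }

    public long size() {
        return size;
    }
}
```

The outer index is the upper bits of the long index, the inner index the lower bits; growing the structure only requires reallocating the small outer array and appending inner arrays, which is where the resizing advantage mentioned above comes from.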


Tip: how to make Thunderbird use ISO Date Format

After an upgrade, my Thunderbird started to use a weird US date format and forgot my setting for this. I did not find any setting for it either. But the following seems to work fine:

Create a shell script

#!/bin/bash

# LC_TIME controls date and time formatting; the sv_SE locale uses the ISO 8601 format.
export LC_TIME=sv_SE.utf8

exec /usr/bin/thunderbird

and make sure that this script gets called when Thunderbird is started, instead of /usr/bin/thunderbird directly.


Clojure Exchange 2018

I visited Clojure Exchange 2018 in London.
Since there was only one track and I attended all talks, it is easy to just refer to the schedule.

Interesting topics that came up multiple times in different flavors were immutability, stories of building real-life applications, music, Java and Clojure and the transition between them, ClojureScript, Emacs and CIDER…

I did a lightning talk myself about Some Thoughts about Immutability and its Limits.

Links

  • Clojure
  • Clojure Exchange 2018
  • Clojure Exchange 2016

Devoxx Kiev 2018

Towards the end of 2018 the number of conferences is kind of high. A great highlight is Devoxx BE in Antwerp. But it now has five partner conferences in London, Paris, Krakow, Morocco and Kiev. So I decided to have a look at the one in Kiev.

How was it in comparison to the one in Belgium? What was better in Kiev: the food was way better, the drinks on the first evening (whisky and long drinks vs. Belgian beer) might be considered better, and more people were engaged to help the organizers…
What was better in Belgium: there were still a few more talks. While the location in Kiev was really great, the rooms in Belgium were much better suited for providing a projection visible to everybody and for doing a video recording that did not disturb the audience.
The quality of the talks was mostly great in both locations. In Kiev they gamified the event a bit more…

Generally there was a wide range of topics and the talks were sorted into the following thematic groups:

  • Methodology & Culture
  • JVM Languages
  • Server Side
  • Architecture & Security
  • Mobile & IoT
  • Machine Learning & AI
  • Big Data & Data Mining
  • Cloud, Containers & Infrastructure
  • Modern Web & UX

See the schedule for the distribution…

I attended on Friday:

I attended on Saturday:

A lot to learn.


Devoxx Antwerp 2018

In 2018 I am visiting a few conferences. A great highlight is Devoxx BE in Antwerp, which I had the privilege of visiting in 2012, 2013, 2014, 2015, 2016 and 2017.

As it should be, it is not just the same every year, but content and speakers change a bit from year to year.

Some topics that got a lot of attention were functional programming, artificial intelligence, Big Data, machine learning, clouds, JVMs and Kotlin.

There was less about other JVM languages (apart from Kotlin): Scala, Clojure, Groovy and Ceylon were covered little or not at all, and Android used to be more present in other years. I would say that Ceylon has become irrelevant, probably because Kotlin was too similar, came out at the same time and won. Groovy has its niche, Clojure has its niche, and Scala and Kotlin have become mature and are now the two mainstream alternatives to Java, though themselves much smaller than Java. This was reflected in the conference, taking into account that Scala has its own large conferences, like Scala Days, Scala Exchange, Scala World and many more.

Some side issues that might worry some of us did come up occasionally. Was it bad that IBM bought Red Hat? They paid around 34’000’000’000 USD, which is more than 2’500’000 USD per employee. There are probably no other assets in terms of buildings, patents, hardware or whatever that would justify this price, so IBM will probably have an interest in keeping a large number of these employees and not scaring them away with too much "IBM culture". We will see, but there is no reason to get worried immediately. Oracle wants money for running their JVM in production after more than six months. This can be avoided by always switching to the newest version or by relying on the JDKs offered by alternative sources like Amazon, Red Hat…

Microsoft was a sponsor and had a booth. Their topic was not MS-Windows and MS-Office and MS-SQL-Server, but Azure, which can be used with Linux and Java and PostgreSQL, for example. The company did change a bit since the days of Steve Ballmer and we will see if this is an excursion or a continuous direction.

And James Gosling was there at the opening, as a surprise.

Generally there was a wide range of topics and the talks were sorted into the following thematic groups:

  • Methodology & Culture
  • Java Language
  • Programming languages
  • Architecture & Security
  • Big Data & Machine Learning
  • Mind the Geek
  • Server Side Java
  • Modern Web & UX
  • Cloud, Containers & Infrastructure
  • Mobile & IoT

See the schedule for the distribution…

I attended on Wednesday:

I attended on Thursday:

I attended on Friday:

It was a great conference. A lot of new ideas.


Logging


Software often contains logging functionality. Usually entries of one or sometimes multiple lines are appended to a file, written to syslog or written to stdout, from where they are redirected into a file. They tell us something about what the software is doing. Usually we can ignore all of it, but as soon as something like "ERROR" or, worse and more visible, stack traces can be found, we should investigate. Unfortunately software is often not so good in this respect, which can be due to libraries, frameworks or our own code. Then stack traces and errors are so common that it is hard to find the ones that are really worth looking into. Or there is simply no complete process in place to watch the log files. Sometimes the error shows up much later than it actually occurred, and stack traces do not really lead us to the right spot. More often than we think, logging actually introduces runtime errors that were otherwise not present. This is related to a more general concept, the observer effect, where logging actually changes the behavior of the business logic.

It is nice when log files adhere to some format. Usually they start with a timestamp in ISO format, often to the millisecond. Please add trailing zeros to always have three digits after the decimal point in this case. It is preferable to use UTC, but people tend to stick to local dates and time zones, including the issues that come with switching to and from daylight saving time. Usually we have several processes or threads that run simultaneously. This can result in a wild mix of log entries. As long as even multiline entries stay together and the beginning and end of a multiline entry can easily be recognized, this can be dealt with. Tools like Splunk or simple Perl, Ruby or Python scripts can help us to follow threads separately. We could actually have separate logs for each thread in the first place, but this is not a common practice, and it might hit OS limitations on the number of open files if we have many threads or even thousands of actors as in Erlang or Akka. Keeping log entries together can be achieved by using an atomic write, like the write system call on Linux and other POSIX systems. Another way is to queue the log entries and to have a logger thread that processes the queue.
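The queue-based variant can be sketched as follows: producers enqueue complete (possibly multiline) entries into a BlockingQueue, and a single logger thread writes them out, so that entries from different threads never interleave. The class and the sentinel value are made up for illustration:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: a single logger thread drains the queue, so each complete
// (possibly multiline) entry is written without interleaving.
public class QueuedLogger {
    private static final String POISON = "\u0000STOP";   // sentinel to shut down the worker
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    public QueuedLogger(java.io.PrintStream out) {
        worker = new Thread(() -> {
            try {
                for (String entry = queue.take(); !entry.equals(POISON); entry = queue.take()) {
                    out.println(entry);   // one write per complete entry
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
    }

    public void log(String entry) {
        queue.add(entry);
    }

    public void close() throws InterruptedException {
        queue.add(POISON);
        worker.join();
    }
}
```

Since the queue preserves order per producer, entries from one thread also stay in their original order.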

Overall this area has become very complex and hard to tame. In the Java world there used to be Log4j with a configuration file that, at least in the earlier versions, was a simple properties file. It was so good that other languages copied it and created some log4X. Later the config file was replaced by XML, and more logging frameworks were added, quite a few of them just for the purpose of abstracting over the large zoo of logging frameworks and providing a uniform interface for all of them. The result, of course, was one more framework to deal with.

It is a good question how much logic for handling log files we really want to see in our software. Does the software have to know into which file it should log or how to do log rotation? If a configuration determines this, but the configuration is compiled into the jar file, it does have to know… We can keep our code a bit cleaner by relying on program functionality without code, but this still keeps it part of the software.

Log files have to please the system administrator, or whoever replaced him in a pure DevOps shop. And in the end developers have to be able to work with the information provided by the logs to find issues in the code, or to explain what is happening if the system administrator cannot resolve an issue by himself. Should this system administrator have to deal with a different, special, complex logging setup for each piece of software he is running? Or should it be necessary to call for developer support to get a new version of the software with just another log setting, because the configurations are hard-coded in the deployment artifacts? It is also interesting what happens when we use PaaS, where we have application servers, databases etc., but the software can easily move to another server, which might result in losing the logs. Moving logs to another server or logging across the network is expensive, maybe more expensive than the rest of this infrastructure.

Is it maybe a good idea to just log to stdout, maintaining a decent format, and to run the software in such a way that stdout is piped into a log manager? This can be the same for all software, and there is only one way to configure it. The same means not only the same for all Java programs, but actually the same for all programs in all languages that comply with a minimal standard. This could be achieved using named pipes in conjunction with whatever hard-coded log file the software wants to use. But this is a dangerous path unless we really know what the software is doing with its log files. Just think of what weird errors might happen if the software tries to apply log rotation to the named pipe by renaming it, deleting it, creating new files and so on. A common trick to stop software from logging into a place where we do not want it is to create a directory with the name of the file that the software usually uses and to write-protect this directory and its parent directory against the software. Please find out how to do this in detail, depending on your environment.

What about software that is a filter by itself, so that its main functionality is to write useful data to stdout? Usually smaller programs and scripts work like this. Often they do not need to log, and often they are well-tested, reliable parts of our software installation. Where are the log files of cp, ls, rm, mv, grep, sort, cat, less, …? Yes, they do write to stderr if real errors occur. Where needed, programs can turn on logging with a log file provided on the command line, which is also quite an operations-friendly approach. Named pipes can help here.

And we have had a good logging framework in place for many years. It is called syslog, and it is still around, at least on Linux.

A last thought: we spend a lot of effort to get well-performing software, using multiple processes, threads or even clusters. And then we forget that logging might become the bottleneck.


Some thoughts about String equality

Of course strings today are in some way Unicode. In this article we assume code points as the building blocks of strings. In the Java world that means that for typical European languages using the Latin, Greek or Cyrillic alphabets, including their extensions, one code point is comprised of one Java character. But when moving to Asian languages, a code point can also consist of two Java characters, and there are strings that are illegal from a Unicode perspective, because they contain characters that should be combined in a way that cannot be combined properly. So here we assume that strings consist of sequences of bytes or two-byte characters or whatever encoding properly expresses a sequence of code points. There are many interesting issues when dealing with some Asian languages that we will not cover here today.

Now there are a lot of possibilities to create strings that look the same but are actually different. We are not talking about "0" and "O" or "1", "l" and "I", which might look similar in some fonts but should not, because we actually depend on their distinctness, even on their visual distinctness. Unfortunately we have the bad habit of using traditional typewriter fonts, which make it hard to distinguish these, for source code, where it would be so crucial. But for today we just assume that we always look hard enough to solve this issue.

The classical example of what looks the same is whitespace. We have the ordinary space " " and the no-break space " ", which are meant to look exactly the same but to expose slightly different behavior. There are tons of possibilities to create exactly the same look with different combinations of whitespace. But this is a special case, because whitespace often carries little semantic information and we want to disregard it to some extent when comparing strings. Typical examples are stripping leading and trailing whitespace of the string or of the lines contained within it, and replacing tabulators with the equivalent number of spaces. Or even replacing any amount of adjacent whitespace within a line by a single space. Again, handling of different whitespace code points might require different rules, so it is good to be careful not to put in too much ad-hoc logic, and better to rely on a library, so that at least exactly the same rules are applied in equivalent situations.
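Two of these rules can be sketched in Java as follows. Note that \s in Java regular expressions only matches Unicode whitespace such as the no-break space when the UNICODE_CHARACTER_CLASS flag (inline: (?U)) is enabled, which is exactly the kind of subtlety best left to one shared, well-tested piece of code; the class and its tab width of 8 are illustrative choices:

```java
// Sketch of simple whitespace canonicalization rules (illustrative).
public class WhitespaceRules {
    // Collapse any run of Unicode whitespace (including no-break space, thanks
    // to the (?U) flag) to a single space and trim the ends.
    public static String collapse(String s) {
        return s.replaceAll("(?U)\\s+", " ").trim();
    }

    // Replace tabulators by a fixed number of spaces (here: 8).
    public static String expandTabs(String s) {
        return s.replace("\t", "        ");
    }
}
```

Without (?U), the no-break space would survive collapse() untouched, silently breaking string comparisons.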

Another example that we actually might know is that certain characters look the same or almost the same in the Cyrillic, Greek and Latin alphabets. I try to give an idea of the meaning of the Greek and Cyrillic characters, but they depend on the language, the dialect and even the word, the word form or the actual occurrence of the letter in the word…

Latin | Cyrillic | Greek | meaning of Cyrillic letter | meaning of Greek letter
A     | А        | Α     | like Latin                 | like Latin
B     | В        | Β     | like Latin V               | Beta (like V in new Greek)
C     | С        |       | like Latin S               |
E     | Е        | Ε     | like Latin                 | Epsilon (like Latin E)
      | Г        | Γ     | like Latin G               | Gamma (like Latin G)
H     | Н        | Η     | like Latin N               | Eta (like Latin I in new Greek)
J     | Ј        |       | Serbian Ј, like German J   |
K     | К        | Κ     | like Latin                 | Kappa (like Latin K)
M     | М        | Μ     | like Latin                 | Mu (like Latin M)
N     |          | Ν     |                            | Nu (like Latin N)
O     | О        | Ο     | like Latin                 | Omikron (like Latin O)
P     | Р        | Ρ     | like Latin R               | Rho (like Latin R)
      | П        | Π     | like Latin P               | Pi (like Latin P)
T     | Т        | Τ     | like Latin                 | Tau (like Latin T)
      | Ф        | Φ     | like Latin F               | Phi (like Latin F)
X     | Х        | Χ     | like German CH             | Chi (like German CH)
Y     | У        | Υ     | like Latin U               | Upsilon (like Latin U)
Z     |          | Ζ     |                            | Zeta (like German Z)
I     | І        | Ι     | Ukrainian I                | Iota (like Latin I)

In this case we usually want the characters to look the same or at least very similar, because that is how to correctly display them, but we do want them to be different when comparing strings.

While these examples are kind of obvious, there is another one that we tend to ignore, but that will eventually catch us. There are so-called combining characters, which should actually be named "combining code points", but here we go. We can put them after a letter and they will combine with it to form a letter with diacritical marks. A typical example is the letter "U" combined with the combining diaeresis (two dots, " ̈") to form an "Ü", which looks the same as the "Ü" that consists of a single code point. It is meant to look the same, and it also has the same meaning, at least for most purposes. What we see is the glyph. We see the difference when we prefix each code point with a minus or a space: the two-code-point "Ü" becomes "-U-̈" or " U ̈", while the single-code-point "Ü" becomes "-Ü" or " Ü", as we would expect.

While this way of expressing a glyph with two code points is not very well known and thus not very common, we can actually see it today in Wikipedia articles. In some languages, where the pronunciation is ambiguous, it can be clarified by putting an accent mark on one vowel. The article Кириллица, for example, puts an accent mark on the term at its beginning: "Кири́ллица". Since accent marks are unfortunately not used in normal Cyrillic writing, it comes in handy that the combining accent also works with Cyrillic letters. With minus signs between the code points it looks like this: "К-и-р-и-́-л-л-и-ц-а", or with spaces like this: "К и р и ́ л л и ц а". So strings that we encounter in our programs will contain these combining characters in the future. While we could prohibit them, it is better to embrace this, and it is actually not too hard if we use decent libraries. Java has the Normalizer class in its standard library, which can convert to one or the other convention of expressing such glyphs, allowing comparison in the way that we actually mean.
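With the built-in java.text.Normalizer this can be sketched in a few lines: the composed and the decomposed spelling of "Ü" differ as raw strings but agree once both are brought to a common normalization form (here NFC, the composed form; the helper class name is made up):

```java
import java.text.Normalizer;

// Sketch: comparing strings up to Unicode normalization.
// NFC is the composed form; NFD would work equally well as the common form.
public class NormalizedCompare {
    public static boolean equalsNormalized(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }
}
```

For example, "\u00DC" (Ü as one code point) and "U\u0308" (U followed by the combining diaeresis) are unequal as plain strings, of different lengths even, but equal under this comparison.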

Unfortunately issues like semantic lengths of strings or semantic positions become even harder than they already were after moving from characters to code points. And we can be sure that Unicode has still more to offer to complicate things if we dig deeper. The typical answer on most websites discussing these issues is something like: "The length of strings and positions within strings are surprisingly irrelevant to most programs."

At the end of the day, jobs that were trivial in the past are now becoming a big deal, and we need to learn to think of comparison, length, position, regular expressions, sorting and all kinds of string functionality with bytes, characters, code points and glyphs in mind.

What can our current libraries already do for us, what are we missing in them, considering different programming languages, databases, text files and network transmission?


Program Functionality without Code

Depending on the programming language and the frameworks we use, it is possible to have program functionality that does not happen in actual code that we write in that language. It seems weird, but it is actually something that we have been doing for decades, and it has been sold to us as being extremely powerful and useful, which it sometimes actually is. Aspect-oriented programming is mostly based on this idea…

Typical examples are things that we want to be taken care of, but that we do not want to actually write ourselves…

Some of these look really great, and who wants to deal with memory management today? Unless we do real-time programming or special security code, where information may not stay in memory longer than the actual processing, this is just what we do all the time, successfully and without too much pain.

While some of these look really great and have become more or less commonplace, there is also some danger in very powerful implicit functionality, like transaction management. While it looks tempting to delegate transaction management to a framework, because it is annoying and not really well understood by most application developers, some danger comes with it. This is even worse if it is used in conjunction with something like JPA or Hibernate… Assume we have a framework that wraps methods marked with an annotation like "@Transactional", meaning that the method call should be wrapped into a transaction (Java-like pseudo-code):

@Transactional
public X myMethod(Y y) {
    X result = do_something(y);
    return result;
}

being roughly equivalent to

public X myMethod(Y y) {
    TransactionContext ctx = getTransactionContext();
    try {
        ctx.beginTransaction();
        X result = do_something(y);
        ctx.commit();
        return result;
    } catch (Exception ex) {
        ctx.rollback();
        throw ex;
    }
}

Yes, it is more elegant to just annotate it.
But now we program something like this:

@Transactional
public Function myMethod(Y y) {
    ...
}

where we actually enclose something into the function and return it. Now when calling the function later, we might get an error, because it encloses state from the time when the transaction was still open, while the transaction has been committed by the time the function is actually called. So in frameworks that force the usage of such annotated transaction handling, such beautiful functional-style programming patterns may not work and need to be avoided, or at least constrained to the cases that do still work. This can be a reasonable price to pay, but it is important to understand the constraints that come with this implicit functionality.
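The problem can be demonstrated without any real framework. In the following simulation the "transaction" is just a flag that the wrapper method closes before the returned function ever runs, so the closure fails exactly as described above (all class and method names are made up for illustration):

```java
import java.util.function.Supplier;

// Simulation of a closure escaping its transaction (illustrative only).
public class EscapingClosure {
    static class TransactionContext {
        private boolean open = false;
        void begin()  { open = true; }
        void commit() { open = false; }
        String read() {
            if (!open) {
                throw new IllegalStateException("transaction already committed");
            }
            return "data";
        }
    }

    // Stands in for the @Transactional wrapper: begin, build result, commit.
    static Supplier<String> transactionalMethod(TransactionContext ctx) {
        ctx.begin();
        Supplier<String> result = () -> ctx.read();  // encloses ctx, but runs later
        ctx.commit();                                 // transaction is closed here
        return result;
    }
}
```

Calling get() on the returned Supplier throws, because by then the enclosed context is no longer open, the essence of why lazy values and transaction-wrapping annotations mix badly.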

Another interesting area that comes with a lot of implicit functionality is authorization. Assume we have a company that sells services or products, and key account managers who use the software we have written. Now, for whatever reasons, they should only be able to see the data of their own customers, and possibly the data of customers for whose key account manager they are the deputy. Or if they are the boss of some key account managers, maybe they can see all of their data…

Now a function

List<Customer> listCustomers() {
    ...
}

gives different results depending on who is using it. This introduces an implicit, invisible parameter. And however smart the user of this software is, he only sees what he is supposed to see, unless the software has vulnerabilities, which it probably has.

So whenever we read such code that we have not written ourselves, or not written yesterday, there may be surprises about what it does. It is an interesting question how to test this with good coverage of all constellations of implicit parameters. Anyway, we have to get used to it and embrace it; it is an integral part of our software ecosystem. But it is also important to use these powerful mechanisms only where they are really so helpful that they are worth the loss in clarity and explicitness.

While annotations can at least be found in place, there are also other ways. Typically XML files can be used to configure such things. Or it can be done programmatically in a totally different place in the software, by setting up some hooks, for example. Without good documentation or good information flow within the team, this may be hard to find.


Intervals

Intervals are subsets of a universe, that are defined by upper and lower boundaries. Typically we think about real numbers, but any totally ordered universe allows the definition of intervals.

Intervals are defined by lower and upper boundaries, which can be a limiting number or unlimited, typically written as \infty for the upper bound and -\infty for the lower bound. The boundaries can be included or excluded. So the following combinations exist for a universe X:

  • unlimited: (-\infty, \infty) = X
  • half open, lower unlimited: (-\infty, a] = \{ x \in X : x \le a\}
  • open, lower unlimited: (-\infty, a) = \{ x \in X : x < a\}
  • half open, upper unlimited: [b, \infty) = \{ x \in X : x \ge b\}
  • open, upper unlimited: (b, \infty) = \{ x \in X : x > b\}
  • open: (c, d) = \{ x \in X : c < x < d\}
  • half open: [c, d) = \{ x \in X : c \le x < d\}
  • half open: (c, d] = \{ x \in X : c < x \le d\}
  • closed: [c, d] = \{ x \in X : c \le x \le d\}
  • the empty set \emptyset, which it is sometimes useful to consider as an interval as well

The words „open“ and „closed“ refer to our usual topology of real numbers, but they do not necessarily retain their topological meaning when we extend the concept to our typical data types. a, b, c and d in the notation above do not have to be members of X, as long as the comparison is defined between them and all members of X. So we could for example meaningfully define for X=\mathbb{Q} the interval (\sqrt{2}, \pi) = \{ x\in\mathbb{Q} : \sqrt{2} < x < \pi\}.

As soon as we do not imply X=\mathbb{R} we always have to make this clear… And \mathbb{R} is kind of hard to really work with in software on computers with physically limited memory and CPU power.

Intervals have some relevance in software systems.

We sometimes have business logic that actually relies on them, and instead of programming around it somehow, it is clearer and cleaner to actually work with intervals. For example, a public transport scheduling system deals with certain time intervals in which different schedules apply than during the rest of the day. Or a system that records downtimes of servers and services, which are quite naturally expressed as intervals of some date-time datatype. It is usually healthy to consider all the cases mentioned above, rather than to ignore the question because a boundary has probability zero of actually occurring, or to use ugly interval limits like 22:59:59.999.

The other case is interval arithmetic. This means, we do floating point calculations by taking into account that we have an inaccuracy. So instead of numbers we have intervals I. When we add two intervals, we get I+J=\{x+y : x\in I \wedge y\in J\}. In the same way we can multiply and subtract and even divide, as long as we can stay clear of zero in the denominator. Or more generally we can define f(I_1, I_2, \ldots, I_n)=\{f(x_1,x_2,\ldots,x_n) : x_1\in I_1 \wedge x_2 \in I_2 \wedge\ldots\wedge x_n\in I_n\}.
It does of course require some mathematical thinking to understand whether the result is an interval again, or at least something we can deal with reasonably. Actually we are usually happy with replacing the result by an interval that is possibly a superset of the exact result, ideally the minimal superset that can be expressed with our boundary type.
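For closed intervals [lo, hi] with finite double boundaries, the basic operations can be sketched as below. A serious interval arithmetic library would additionally round the lower bound towards -\infty and the upper bound towards +\infty after each operation to preserve the enclosure property under floating point rounding; that outward rounding is omitted here, and the class is purely illustrative:

```java
// Sketch of interval arithmetic on closed intervals [lo, hi] (no outward rounding).
public class Interval {
    final double lo;
    final double hi;

    Interval(double lo, double hi) {
        if (lo > hi) throw new IllegalArgumentException("lo must not exceed hi");
        this.lo = lo;
        this.hi = hi;
    }

    // [a,b] + [c,d] = [a+c, b+d]
    Interval add(Interval o) {
        return new Interval(lo + o.lo, hi + o.hi);
    }

    // [a,b] - [c,d] = [a-d, b-c]
    Interval subtract(Interval o) {
        return new Interval(lo - o.hi, hi - o.lo);
    }

    // For multiplication all four boundary products are candidates
    // for the new minimum and maximum, because signs may flip.
    Interval multiply(Interval o) {
        double a = lo * o.lo, b = lo * o.hi, c = hi * o.lo, d = hi * o.hi;
        return new Interval(Math.min(Math.min(a, b), Math.min(c, d)),
                            Math.max(Math.max(a, b), Math.max(c, d)));
    }
}
```

Division works the same way as multiplication with the reciprocal interval, as long as the denominator interval does not contain zero.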

At this point we will probably discover a desire to expand the concept of intervals in a meaningful way to complex numbers. We can do this by working with open disks like \{ z \in \mathbb{C} :  |z-z_0| < d\} or closed disks like \{ z \in \mathbb{C} :  |z-z_0| \le d\}. Or with rectangles based on two intervals I and J like \{ z = x+iy \in \mathbb{C} : x \in I \wedge y \in J \}.

These two areas are quite interesting and sometimes useful. Libraries have been written for both of them.

Often we discover that intervals alone are not quite enough. We would like to do set operations with intervals, that is:

  • union A \cup B
  • intersection A \cap B
  • set difference A \setminus B

While the intersection works just fine, as long as we include the empty set \emptyset as an interval, unions and differences lead us out of the realm of intervals. It turns out that interval-unions, i.e. sets that can be expressed as a union of a finite number of intervals, are a useful generalization, and actually what we want to work with rather than plain intervals. In this case we can drop the empty set as an interval and just express it as the union of zero intervals.
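The core idea behind a canonical form for such interval-unions can be sketched for closed double intervals: sort the member intervals by lower boundary and merge overlapping ones, so that equal unions always end up with the identical minimal representation. Intervals are represented as bare double[2] pairs here purely for brevity:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch: canonical form of a union of closed intervals [lo, hi] over double.
// Sorting and merging overlapping intervals yields a minimal representation
// that can be compared element by element.
public class IntervalUnion {
    public static List<double[]> normalize(List<double[]> intervals) {
        List<double[]> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.comparingDouble(i -> i[0]));
        List<double[]> result = new ArrayList<>();
        for (double[] iv : sorted) {
            double[] last = result.isEmpty() ? null : result.get(result.size() - 1);
            if (last != null && iv[0] <= last[1]) {
                last[1] = Math.max(last[1], iv[1]);   // overlap: extend the last interval
            } else {
                result.add(new double[] { iv[0], iv[1] });
            }
        }
        return result;
    }
}
```

A full implementation over arbitrary Comparable types with open and unlimited boundaries follows the same sort-and-merge pattern, only the overlap test becomes more involved.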

There are some questions coming up that are interesting to deal with:

  • normalization: can we normalize interval-unions to some canonical form that allows safe and reliable comparison for equality?
  • adjacency: is our universe X actually discrete, so that we can express all limited boundaries as closed boundaries?
  • interval lengths: do we have a meaningful and useful way to measure the length of an interval or the total length of an interval-union, as long as they are limited? Or even for unlimited intervals?
  • collection interfaces: do we want to implement a Set interface in languages that have sets and an understanding of sets that would fit intervals?
  • implementation: how can we implement this ourselves?
  • implementation: can we find useful existing implementations?

Having once written a Java library in a project to support interval-unions on arbitrary Comparable types, and having heard a talk about an interval library in Scala that ended up using interval-unions in a pretty equivalent way, it might be interesting to write in the future about how to do this, or about what can be found in different languages to support us. For interval arithmetic, some work on extensions and libraries for C and Fortran was already being done while I was a student. So this is pretty old stuff, and interesting mostly for the concepts, even if we are not going to move to Fortran because of this.

If there is interest I will write more about actual implementations and issues to address when using or writing them.


LF and CR LF in git and svn

Typically we observe that line endings of text files turn out to be "linefeed" (LF or "\n" or 0x0a or Ctrl-J) on Linux and Unix-like systems (including MacOSX), while they are "carriage return and linefeed" (CR LF or "\r\n" or 0x0d 0x0a or Ctrl-M Ctrl-J) on MS-Windows. See the little obstacles of interoperability.

This can become annoying, and there is little reason for it. Most tools for editing, compiling and working with these files understand both variants. It does become annoying when diffs are created between files with different endings, and even more so when scripts turn out to have the "CR LF" ending and the script interpreter given in the first line is not found, because the system tries to find one that has the otherwise invisible "CR" in its file name. It also becomes kind of messy when multiple CR characters are present. CR characters are annoying even on MS-Windows itself as soon as we use Cygwin. Since in most cases the target system is a Linux system anyway and we just waste space with the unnecessary CR character, it is in most cases actually a good idea to agree on not having these CR characters at all in certain kinds of files.

The easiest way is to just set up git or subversion to change files from CR LF to LF on commit. So the repository only contains „clean“ files, at least from that moment onward.

This is accomplished in subversion by applying the following command on the files that we want to be kept on „LF“ only:

svn propset svn:eol-style LF

and then committing.

Git has a way to achieve this using the .gitattributes configuration file. So the .gitattributes file can contain something like this:

* text=auto eol=lf

Remark

Of course I recommend using git instead of svn for new projects and considering a migration of existing projects from svn to git. But for this particular aspect svn provides a slightly more powerful and more intuitive tool than git.
