The magic trailing space

When comparing strings, spaces count as well, and they should. To ignore them, we can normalize the strings first. Typical whitespace normalization includes the following (written as Perl substitutions):

  • s/[ \t]+/ /g replaces any sequence of tabs and spaces used to separate content with a single space.
  • s/\r\n/\n/g replaces carriage return + linefeed with linefeed only.
  • s/\s+$// removes trailing whitespace.
  • s/^\s+// removes leading whitespace.
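
The four rules above can be sketched as one small function. This is only an illustration in Python rather than Perl; the function name normalize_whitespace is made up for this example, and the rules are applied in the order listed.

```python
import re

def normalize_whitespace(text: str) -> str:
    """Apply the four normalization rules from the list, in order."""
    text = re.sub(r"[ \t]+", " ", text)  # runs of tabs and spaces -> one space
    text = text.replace("\r\n", "\n")    # CR+LF -> LF
    text = re.sub(r"\s+\Z", "", text)    # whitespace at the end of the whole string
    text = re.sub(r"\A\s+", "", text)    # whitespace at the start of the whole string
    return text

# Two strings that differ only in whitespace compare equal after normalization.
a = "  foo \t bar\r\nbaz  \n"
b = "foo bar\nbaz"
print(normalize_whitespace(a) == normalize_whitespace(b))
```

Note that the anchors \A and \Z refer to the whole string, so trailing spaces at the end of inner lines are collapsed to one space but not removed.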

It is often useful to do something along these lines when comparing strings that come from outside sources and are not normalized, where only "the content" counts. More sophisticated rules are possible: handling no-break spaces and control characters, removing trailing spaces at the end of each line or only at the end of the whole text, or collapsing multiple empty lines into one. The general idea is to think about which normalization is right for the data at hand.

In some cases, such as long numbers, spaces or other symbols are used to group digits. These should also be removed. Sometimes more specific rules apply, for example for phone numbers, web sites, or email addresses. These need type-specific handling, ideally with an adequate library.
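
As a rough sketch of removing digit grouping, the following Python helpers strip separators from card numbers and IBANs. The function names are hypothetical, and the choice of spaces and hyphens as the only grouping symbols is an assumption; real input may need more rules.

```python
import re

# Assumption: users group digits with whitespace or hyphens only.
_GROUPING = re.compile(r"[\s\-]+")

def normalize_card_number(raw: str) -> str:
    """Remove grouping whitespace and hyphens from a card number."""
    return _GROUPING.sub("", raw)

def normalize_iban(raw: str) -> str:
    """Remove all whitespace from an IBAN and use upper-case letters."""
    return re.sub(r"\s+", "", raw).upper()

print(normalize_card_number(" 4111 1111-1111 1111 "))
print(normalize_iban("de89 3704 0044 0532 0130 00 "))
```

This only removes the grouping. Whether the result is a valid card number or IBAN is a separate check.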

More often than not, web sites do not do this properly. Quite often information has to be entered and is not normalized before further processing. As a result, credit card numbers or IBANs are rejected because of grouping spaces, or any input is rejected because of trailing spaces, naturally with an error message that gives no hint about what the problem was.

A serious application needs a serious processing step for data coming from outside anyway, for security reasons. Even though SQL injection should not work when SQL placeholders are used properly, it is good practice to check the data anyway and reject it early with a meaningful message. Should I trust the security of a site that cannot deal with spaces in a credit card number enough to give it my card number? I am not sure.
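
Checking early and rejecting with a meaningful message could look roughly like this in Python. The function name parse_card_number and the accepted length range of 12 to 19 digits are assumptions made for this sketch; a real application would use the rules of its payment provider.

```python
import re

def parse_card_number(raw: str) -> str:
    """Normalize a card number, then validate it with specific error messages."""
    digits = re.sub(r"[\s\-]+", "", raw)  # normalize first: drop spaces and hyphens
    if not digits:
        raise ValueError("Card number is empty.")
    if not re.fullmatch(r"[0-9]+", digits):
        # Tell the user exactly which characters were not accepted.
        bad = sorted(set(re.sub(r"[0-9]", "", digits)))
        raise ValueError(
            "Card number contains unexpected characters: " + " ".join(bad)
        )
    if not 12 <= len(digits) <= 19:  # assumed length range for this sketch
        raise ValueError(
            f"Card number has {len(digits)} digits; expected 12 to 19."
        )
    return digits

try:
    parse_card_number("4111 1111 1111 111x")
except ValueError as err:
    print(err)
```

The point is the order: normalize, then validate, and name the actual problem in the message instead of rejecting the input with a generic error.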

It is about time that UI developers got into the habit of properly processing, normalizing, and checking user input. Beware that any security-relevant checks need to be done on the server, either exclusively or in addition to the client.
