Automatic editing

For changing file contents, we often use editors.

But sometimes it is a good idea to do this from a script and do a change to many files at once or do the same kind of changes often.

The classic approach to this is using sed, which is made exactly for this purpose.

But most scripting languages, like for example Perl can do similar work to sed. Since Perl was developed with the idea of replacing awk and sed and some shell scripting and being a complete programming language, it claims to do the same better than sed. Ruby, Python, Raku and some others are also possible to use.

Basically this can be done in typical bash style work, so the bash script uses specific tools to perform small tasks and together an end result is achieved.

I am quite familiar with Perl, so for this kind of replace-in-file I am using Perl, just in a way that would with some minor changes also work with the other four languages mentioned or probably some others. Of course for more advanced stuff there might be areas where it for example Perl or Raku work better.

One thing needs to be thought about:

These „scripted search and replace“ operations primarily work like pipes, so with something like

perl -p -e 's/ä/ae/g;s/ö/oe/g;' < infile > outfile

or the equivalent in sed, Python, Raku or Ruby. In the end it is probably desired that the outfile just replaces the infile, but

perl -p -e 's/ä/ae/g;s/ö/oe/g;' < infile > infile

won’t work.

So an universally applicable approach is to just mv the modified file to the original afterwards, like

perl -p -e 's/ä/ae/g;s/ö/oe/g;' < infile > outfile \
&& mv outfile infile

Sometimes it is desirable to retain the original file to be able to check if the transformation did what it was meant to do and to be able to go back…

mv infile infile~
perl -p -e 's/ä/ae/g;s/ö/oe/g;' < infile~ > infile

And then the files ending with ~ need to be cleaned up when everything is done.

Since this is the most common way to work, there is a shortcut (at least for Perl):

perl -p -i~ -e 's/ä/ae/g;s/ö/oe/g;' infile

If no backup is needed, this can be done as

perl -p -i -e 's/ä/ae/g;s/ö/oe/g;' infile

For these simple examples, using sed, ruby, perl, python or raku is more a question of what syntax we know by heart. And of course, Perl, Python and SED are most likely installed on almost any Linux server…

Sometimes we want to do more complex things.. So depend on a context or on a state and do different substitutions depending on that. Or aggregate information in variables and insert them at some point.. For more complex things I really prefer real programming languages like Perl or Ruby, but a lot can be done by using just bash, sed and awk, if you know them really well.

A funny thing is that there is a simple editor ed, which can actually be used from the command line to apply editing functions to a file.

I remember that in some project there were a lot of template files that where used to create emails for example. It was quite a good idea to change them with perl scripts (and retain the original). Then I could show the outcome to the stakeholders and change the scripts until it was ok. This works for a couple of hundred files without problems, but of course it works better if they are kept clean with no unnecessary differences sneaking in that cause automatic editing to fail or create different results on some of the files. Testing remains important, of course.

Better learn to do at least some basic automated editing than doing tedious manual editing of a couple of hundred files, possibly having to do it more than once.

Share Button

Schreibe einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht. Erforderliche Felder sind mit * markiert

*