Sed – Renaming with Linux [Part 2]

Some time ago I wrote an article about renaming multiple files [1]. Recently I needed to rename several subtitle files for Star Trek Enterprise, because I use a self-made mplayer wrapper script that loads subtitles automatically. For that to work, the subtitle filename, usually a SubRip Text (.srt) file from tvsubtitles, has to contain the season and episode in a form like S01E01 or 1x24.

The two relevant lines in my mplayer-vdpau wrapper script, which I use in My Media System (MMS), look like this:

.
.
# Automatic subtitle selection for Seasons and Episodes a la S01E01 or 1x01
SUBTITLE=$( ls $FILENAME_PATH/*$( echo $FILENAME | sed 's/^\(.*\)\([0-9][XxEe]\)\([0-9]\{2,2\}\)\(.*\)$/\3/' )*.srt )
.
.
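
To see what the sed expression in that line hands to the glob, here is a minimal sketch that runs it on two made-up video filenames. The function name extract_episode and both filenames are assumptions for illustration only; they are not part of the wrapper script.

```shell
# Hypothetical helper: apply the wrapper's sed expression to one filename
# and print only the third group (the two-digit episode number).
extract_episode() {
    printf '%s\n' "$1" |
        sed 's/^\(.*\)\([0-9][XxEe]\)\([0-9]\{2,2\}\)\(.*\)$/\3/'
}

# Made-up example filenames in the two supported naming schemes.
episode_a=$(extract_episode "Star Trek Enterprise - S04E03 - Home.avi")
episode_b=$(extract_episode "Enterprise 1x24.avi")

echo "$episode_a"
echo "$episode_b"
```

Only the episode digits survive the substitution, so the surrounding ls effectively looks for an .srt file in $FILENAME_PATH whose name contains those two digits.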

Unfortunately, a few .srt files had only a four-digit sequence, in which the first two digits represent the season and the last two the episode number.
What I needed was not this:

Star Trek Enterprise - 0403 - Home.DVD.en.srt

But this one:

Star Trek Enterprise - S04E03 - Home.DVD.en.srt

I managed to do this with the following sed line and a loop:

IFS="" ; for I in $(ls *.avi) ; do echo $I |  ( sed 's/\(^.*\)\([0-9]\{2\}\)\([0-9]\{2\}\)\(.*$\)/mv "\1\2\3\4" "\1S\2E\3\4"/') ; done > rename.sh

Now you should check the content of rename.sh, and if all looks good, you can just launch it like this:

sh rename.sh
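
If you want to try the whole procedure without touching real files, the following is a minimal sketch that plays it through in a throwaway directory. It is a variant of the line above, not a copy: it globs *.srt to match the subtitle names shown earlier, quotes the variable instead of changing IFS, and puts the anchors outside the groups. The second filename is made up for the test.

```shell
# Work in a temporary directory so nothing real gets renamed.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two empty sample files; the second name is purely hypothetical.
touch "Star Trek Enterprise - 0403 - Home.DVD.en.srt" \
      "Star Trek Enterprise - 0404 - Example.DVD.en.srt"

# Generate one mv command per subtitle file and collect them in rename.sh.
for f in *.srt; do
    printf '%s\n' "$f" |
        sed 's/^\(.*\)\([0-9]\{2\}\)\([0-9]\{2\}\)\(.*\)$/mv "\1\2\3\4" "\1S\2E\3\4"/'
done > rename.sh

# Review the generated commands first, then run them.
cat rename.sh
sh rename.sh
ls
```

Writing the commands to a file first keeps the review step from the article: nothing is renamed until rename.sh has been looked at and started by hand.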

Explanation
The elegant thing I rediscovered here is the placeholders in the sed line. Sed is a very old Unix tool; it is actually a predecessor of the even mightier awk.
A typical sed substitution looks like this: s/regexp/replacement/.

A simple example shows how it works. Suppose we have the text sequence "A B D" and the result should look like "A B C D", so the letter "C" has to be inserted after "B".

echo A B D | sed 's/\(^.*\)\([B]\)\(.*\)$/\1\2 C\3/'
  • echo A B D
    prints the text line A B D to standard out
  • sed 's/...'
    initiates a substitution (replacement).
  • (.*)([B])(.*)
    This is the first interesting part. The parentheses create our three placeholders, to which we will refer later. "[B]" is the letter to match (the pattern we are looking for). The dot and the asterisk in combination behave, simply speaking, like the asterisk ("*") in bash: they match anything.
  • \(^.*\)\([B]\)\(.*\)$
    All parentheses "(" and ")" must be escaped with a backslash (\); otherwise sed, which uses basic regular expressions by default, treats them as literal characters instead of grouping. The circumflex (^) represents the beginning of a line and the dollar sign ($) stands for the end of a line.
  • s/\(\)\(\)\(\)/\1\2\3/
    Here we have three pairs of parentheses on the left and, on the right, the numbers with leading backslashes. These numbers refer to the parenthesis pairs on the left side, in order. You can mix them and write anything between them.
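
As a small sketch of "mixing" the placeholders, the first command below reverses the three groups and puts dashes between them. The second one repeats the insertion example using the -E option, which GNU and BSD sed accept for extended regular expressions; with it the parentheses need no backslashes. The variable names are only for illustration.

```shell
# Reorder the three groups and write other text between them.
swapped=$(echo "A B D" | sed 's/\(A\) \(B\) \(D\)/\3-\2-\1/')
echo "$swapped"

# Same insertion as above, but with extended regular expressions (-E),
# so the grouping parentheses are written without backslashes.
inserted=$(echo "A B D" | sed -E 's/^(.*)(B)(.*)$/\1\2 C\3/')
echo "$inserted"
```

The backslash in front of the numbers on the right side is still needed in both variants, since that is what marks them as references to the groups.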

That’s it.

References
[1] – Because there are so many special characters, here’s a table of their pronunciations.
[2] – Here you can look up the meaning of the brackets and braces which weren’t described here.

One thought on “Sed – Renaming with Linux [Part 2]”

  1. bob

    By the way, just a little comment, instead of redefining the IFS, you could use a ‘while read’ loop.

    That would look like:
    ls *.avi|while read I;do echo $I …

    Anyways, that’s a nice regexp :)
