[Israel.pm] Removing meta characters (^[[1m and ^[[0m) from a file

Oron Peled oron at actcom.co.il
Sun Jun 27 12:52:27 PDT 2010


First, a short answer to Guy -- 'col -b' won't help, because what it removes
are the backspace sequences (that's what the 'b' stands for) used in
Unix/Linux manuals (cat pages, FWIW). Those sequences served as a neutral
format that programs such as more/less later translate into
terminal-specific escape sequences.
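
To make the difference concrete, here is a minimal sketch in Python (not
Perl; the function name is made up for illustration) of what removing
backspace overstrikes amounts to. It assumes the usual cat-page convention:
bold is a letter overstruck with itself, underline is an underscore
overstruck with the letter.

```python
import re

def strip_overstrikes(text: str) -> str:
    """Drop each 'character + backspace' pair, keeping what was typed last."""
    return re.sub(r".\x08", "", text)

bold_b = "B\x08B"                 # bold: the letter overstruck with itself
underlined_u = "_\x08U"           # underline: underscore, backspace, letter
ansi_bold = "\x1b[1mbold\x1b[0m"  # ANSI escape sequences, no backspaces

print(strip_overstrikes(bold_b + underlined_u))    # BU
print(strip_overstrikes(ansi_bold) == ansi_bold)   # True: ANSI is untouched
```

The second line of output is the point: a backspace-oriented filter has
nothing to grab onto in ^[[1m and ^[[0m.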

On Sunday, 27 June 2010 12:08:09 Erez David wrote:
> s/\e[\[01]m//g  does't do the job. since the first [ is not a real character
> it is a meta character...

Not exactly. The sequences presented are part of the ANSI standard escape
sequences used to highlight text (bold/underline/etc.) on terminals.
So we are not talking about a special character, but about special *strings*.

All these sequences have a common form. The simplest format is:

    ESC [ <number> m

The ESC character (ASCII decimal 27, as mentioned by someone else here) is
commonly written as ^[ (control+left bracket), because pressing
control+left bracket produces exactly this ASCII code.

BTW: if the escape key on your keyboard is broken, you can use control+left
     bracket as a substitute, because it really is the same character.

The result of this is that when writing the sequence as text, it is often
presented as: ^[[2m

But note that the first bracket is part of "control+bracket", which simply
means the escape character, while the second bracket is a real '['
character that is part of the sequence (so the example above is exactly
4 characters long: ESC, '[', '2', 'm').
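
A quick way to check this, sketched in Python rather than Perl. The only
assumption is the traditional terminal behaviour where the control key
keeps just the low five bits of the key's code.

```python
ESC = "\x1b"        # the escape character, written ^[ in text
seq = ESC + "[2m"   # the sequence usually shown as ^[[2m

print(ord(ESC))                      # 27
print(ord("[") & 0x1F)               # 27: control+[ yields the same code
print(len(seq))                      # 4
print([hex(ord(c)) for c in seq])    # ['0x1b', '0x5b', '0x32', '0x6d']
```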

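Now you should see the problem with your regex -- an unescaped '[' in a
regex opens a character class, so it is a special character for the regex
engine. In \e[\[01]m the first '[' starts the class [\[01], so the pattern
matches ESC, then a single character out of '[', '0' or '1', then 'm'.
To match a literal bracket it must be written as \[.
A correct regex should be:

But this also has an error because it's greedy. Let's fix it: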
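
To illustrate the greediness problem, here is a minimal sketch in Python
(the regex syntax for these patterns is the same as in Perl, with \x1b
standing in for \e). The three patterns are assumptions chosen for
illustration, not necessarily the exact ones meant in this message.

```python
import re

text = "\x1b[1mbold\x1b[0m and more"

# Greedy: .* runs to the LAST 'm' on the line, swallowing real text.
GREEDY = re.compile(r"\x1b\[.*m")
# Non-greedy: .*? stops at the first 'm' after each ESC [.
LAZY = re.compile(r"\x1b\[.*?m")
# Tighter alternative: only digits are allowed between '[' and 'm'.
DIGITS = re.compile(r"\x1b\[\d+m")

print(repr(GREEDY.sub("", text)))   # 'ore'
print(repr(LAZY.sub("", text)))     # 'bold and more'
print(repr(DIGITS.sub("", text)))   # 'bold and more'
```

The greedy version deletes everything from the first escape up to the 'm'
in "more", which is why only 'ore' is left.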

This covers all the simple cases. However, ANSI allows for more complex
sequences that specify two numbers (e.g. two colors), for background
and foreground. E.g.:

So let's try to generalize:

Hopefully, this will cover all cases (not tested).
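
One possible generalization, sketched in Python rather than Perl. The
names SGR and strip_sgr are made up for this example, and the sample string
is invented. It assumes the sequences to strip are the text-attribute ones
ending in 'm', with any number of parameters separated by ';', and it
allows empty parameters because ESC [ m on its own is a reset.

```python
import re

# ESC [ , then zero or more numeric parameters separated by ';', then 'm'.
SGR = re.compile(r"\x1b\[\d*(?:;\d*)*m")

def strip_sgr(text: str) -> str:
    """Remove text-attribute escape sequences, leaving everything else."""
    return SGR.sub("", text)

sample = "\x1b[1;32mgreen\x1b[0m, \x1b[31;42mred on green\x1b[m, plain"
print(strip_sgr(sample))   # green, red on green, plain
```

Sequences that end in another letter (cursor movement, screen clearing
and so on) are deliberately left alone by this pattern.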

Have fun.

