[Israel.pm] Removing meta characters (^[[1m and ^[[0m) from a file

Mikhael Goikhman migo at homemail.com
Sun Jun 27 13:44:37 PDT 2010


I feel that this answer is not complete; here are some small corrections.

On 27 Jun 2010 22:52:27 +0300, Oron Peled wrote:
> 
> The sequences presented are part of ANSI standard of escape
> sequences used to highlight text (bold/underline/etc) on terminals.
> So we don't talk about a special character, but a special *strings*.
> 
> All these sequences have a common form. The simplest format is:
> 
>  <ESCAPE>[<numeric_code>m

Please note that <numeric_code> here is optional. If it is omitted, it is
the same as 0, i.e. reset all color attributes.

Also it is possible to specify many attributes at once:

    <ESC>[4;32;41m(underlined green on red)<ESC>[m...
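A minimal sketch of building such a sequence in Perl. The helper name sgr()
is made up for illustration; it just joins the attribute codes with ";":

```perl
use strict;
use warnings;

# Hypothetical helper: build an escape sequence from a list of attribute codes.
# With no arguments it yields "<ESC>[m", which is the same as "<ESC>[0m".
sub sgr {
    my @codes = @_;
    return "\e[" . join(';', @codes) . 'm';
}

# 4 = underline, 32 = green foreground, 41 = red background
my $styled = sgr(4, 32, 41) . 'underlined green on red' . sgr();
print $styled, "\n";
```

On a terminal that understands these sequences, the printed text should
appear underlined, green on red, with the attributes reset afterwards.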

> The ASCII code of ESC (decimal 27 as mentioned by someone else here) is
> commonly written as ^[ (control+left bracket) because this is actually
> the ASCII number of this character.

Not exactly. It's written as "^[" because the ASCII code of "[" is 91. :)

The caret notation means "clear bit 6, i.e. subtract 64".
"^@" is ascii code 0, "^A" is 1, ..., "^Z" is 26, "^[" is 27.
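The arithmetic can be checked with a few lines of Perl. The helper name
caret_to_code() is made up for this sketch, and it assumes the character
after the caret is in the range "@".."_":

```perl
use strict;
use warnings;

# Hypothetical helper: map the character after the caret to its control code.
# Assumes the character is in the range "@".."_" (ASCII 64..95).
sub caret_to_code {
    my ($char) = @_;
    return ord($char) - 64;
}

printf "^%s is %d\n", $_, caret_to_code($_) for '@', 'A', 'Z', '[';
```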

> A correct regex should be:
>     s/\e\[\d+m//
> But this also has an error because it's greedy. Let's fix it:
>     s/\e\[\d+?m//

Please note that these two regular expressions are equivalent: they match
exactly the same text. It does not matter whether \d+ is greedy or not
here, since it sits between non-digit characters.

And as I wrote, "\d*" is more correct here than "\d+" or "\d+?".
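A small sketch to compare the three variants. The helper strip_with() and
the sample strings are made up for illustration, and /g is added so that
every sequence in the string is removed:

```perl
use strict;
use warnings;

my $greedy = qr/\e\[\d+m/;    # one or more digits, greedy
my $lazy   = qr/\e\[\d+?m/;   # one or more digits, non-greedy
my $star   = qr/\e\[\d*m/;    # zero or more digits

# Hypothetical helper: remove every match of $re from a copy of $text.
sub strip_with {
    my ($re, $text) = @_;
    $text =~ s/$re//g;
    return $text;
}

for my $text ("\e[1mbold\e[0m", "\e[32mgreen\e[m") {
    # The greedy and non-greedy forms always give the same result.
    print "greedy and lazy agree\n"
        if strip_with($greedy, $text) eq strip_with($lazy, $text);
    # Only the \d* form also removes the empty "<ESC>[m".
    print "\\d* leaves no escape behind\n"
        if strip_with($star, $text) !~ /\e/;
}
```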

> This covers all the simple cases. However, ANSI allows for more complex
> sequences that specify two numbers (e.g: two colors) for background
> and foreground. E.g:
>      ^[[32;45m
> So let's try to generalize:
>      s/\e\[\d+?(;\d+?)?m//
> 
> Hopefully, this will cover all cases (not tested).

It covers neither the empty case nor the case of more than two attributes. So:

    s/\e\[\d*(;\d*)*m//g
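For example, a minimal sketch that wraps this substitution in a function.
The name strip_sgr() and the sample line are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution above.
sub strip_sgr {
    my ($text) = @_;
    $text =~ s/\e\[\d*(;\d*)*m//g;
    return $text;
}

# One-code, multi-code and empty sequences in a single line.
my $line = "\e[1mbold\e[0m, \e[4;32;41mcolored\e[m, plain";
print strip_sgr($line), "\n";   # prints "bold, colored, plain"
```

To clean a whole file, the same substitution should also work as a filter
(the file names here are placeholders):

    perl -pe 's/\e\[\d*(;\d*)*m//g' input.txt > output.txt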

There are more terminal sequences, not just for color, for example:

    http://www.termsys.demon.co.uk/vtansi.htm

Regards,
Mikhael.

-- 
perl -e 'print+chr(64+hex)for+split//,d9b815c07f9b8d1e'

