<div dir="ltr">Thanks Oron, it works!<div><br><div class="gmail_quote">On Sun, Jun 27, 2010 at 10:52 PM, Oron Peled <span dir="ltr"><<a href="mailto:oron@actcom.co.il">oron@actcom.co.il</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Hi,<br>
<br>
First a short answer to Guy -- 'col -b' won't help because it removes the<br>
meta sequences of backspaces (that's what the 'b' stands for) used in<br>
Unix/Linux manuals (cat pages FWIW). These sequences were used as a<br>
neutral format that is later translated to terminal specific escape<br>
sequences by programs such as more/less etc.<br>
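<br>
To make the difference concrete, here is a minimal sketch in Python (chosen<br>
only for illustration; the sample string and the helper name are made up).<br>
It approximates what 'col -b' does by dropping every character that is<br>
immediately followed by a backspace, and shows that ANSI escape sequences<br>
are left untouched:<br>

```python
import re

def strip_overstrike(text: str) -> str:
    """Drop every character that is immediately erased by a backspace."""
    return re.sub(r".\x08", "", text)

# nroff-style bold "N" (N, backspace, N) and underlined "a" (_, backspace, a),
# followed by a word wrapped in ANSI bold/reset sequences.
sample = "N\x08N_\x08a \x1b[1mbold\x1b[0m"
cleaned = strip_overstrike(sample)
# The overstrike pairs are gone, but the ESC [ 1 m and ESC [ 0 m remain.
```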
<div class="im"><br>
On Sunday, 27 June 2010 12:08:09 Erez David wrote:<br>
> s/\e[\[01]m//g doesn't do the job, since the first [ is not a real character;<br>
> it is a meta character...<br>
<br>
</div>Not exactly. The sequences presented are part of the ANSI standard for escape<br>
sequences used to highlight text (bold/underline/etc.) on terminals.<br>
So we are not talking about a special character, but about special *strings*.<br>
<br>
All these sequences have a common form. The simplest format is:<br>
<br>
&lt;ESCAPE&gt;[&lt;numeric_code&gt;m<br>
<br>
The ESC character (ASCII decimal 27, as mentioned by someone else here) is<br>
commonly written as ^[ (control+left bracket) because pressing that key<br>
combination produces exactly this ASCII code.<br>
<br>
BTW: if the escape key on your keyboard is broken, you can use control+left<br>
bracket as a substitute because it really is the same character.<br>
<br>
The result of this is that when writing the sequence as text, it is often<br>
presented as: ^[[2m<br>
<br>
But note that the first bracket is part of "control+bracket" which simply<br>
means the escape character, and the second bracket is the real '['<br>
character which is part of the sequence (the length of the example above<br>
is exactly 4 characters)<br>
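<br>
A small sketch in Python (used here only for illustration; the variable<br>
names are made up) that checks both claims: control+left bracket maps to<br>
ESC, and the sequence shown as ^[[2m is four characters long:<br>

```python
ESC = "\x1b"            # the escape character, displayed as ^[
sequence = ESC + "[2m"  # displayed as ^[[2m

# Control+key keeps only the low five bits of the key's ASCII code,
# so Control+[ (code 91) yields code 27, which is ESC.
ctrl_left_bracket = chr(ord("[") & 0x1F)

length = len(sequence)
codes = [ord(ch) for ch in sequence]  # ESC, '[', '2', 'm'
```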
<br>
Now you should see the problem with your regex -- the '[' in a regex<br>
opens a character class, so it is a special character for the regex and<br>
must be escaped. A correct regex would be:<br>
s/\e\[\d+m//g<br>
The greedy \d+ is safe here, since it can only consume digits and so stops<br>
at the 'm'. A non-greedy version works equally well:<br>
s/\e\[\d+?m//g<br>
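<br>
As a rough check, the same two patterns can be compared in Python (an<br>
assumption made for illustration: Python's re module has no \e, so ESC is<br>
written as \x1b, and re.sub replaces every match, like the /g flag). The<br>
sample text is made up:<br>

```python
import re

text = "\x1b[1mNAME\x1b[0m ls - list directory contents"

# Pattern from the earlier message: after ESC, the unescaped '[' opens a
# character class matching ONE of '[', '0' or '1', so "ESC [ 1 m" never matches.
unescaped = re.compile(r"\x1b[\[01]m")

# Escaping the bracket makes it a literal '[' followed by one or more digits.
escaped = re.compile(r"\x1b\[\d+m")

unchanged = unescaped.sub("", text)  # nothing is removed
stripped = escaped.sub("", text)     # both sequences are removed
```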
<br>
This covers all the simple cases. However, ANSI allows for more complex<br>
sequences that specify two numbers (e.g., two colors, for foreground and<br>
background). For example:<br>
^[[32;45m<br>
So let's try to generalize:<br>
s/\e\[\d+?(;\d+?)?m//g<br>
<br>
Hopefully, this will cover all cases (not tested).<br>
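<br>
For anyone who wants to test it, here is a hedged sketch in Python (again<br>
with \x1b standing in for \e; the sample text and variable names are made<br>
up). It compares the generalized pattern above with a looser one that is<br>
my own assumption rather than part of the original suggestion: it accepts<br>
any run of digits and semicolons before the final 'm':<br>

```python
import re

# Up to two parameters, as in the generalized substitution above.
two_params = re.compile(r"\x1b\[\d+?(;\d+?)?m")

# Assumption: accept any run of digits and semicolons before the final 'm',
# which also covers an empty list (ESC [ m) and three or more parameters.
any_params = re.compile(r"\x1b\[[0-9;]*m")

text = (
    "\x1b[32;45mgreen on magenta\x1b[m and "
    "\x1b[1;4;31mbold underlined red\x1b[0m"
)

partly_cleaned = two_params.sub("", text)  # misses ESC[m and ESC[1;4;31m
fully_cleaned = any_params.sub("", text)   # removes all four sequences
```

The looser pattern deliberately handles only sequences that end in 'm'<br>
(the highlighting ones discussed here), not other kinds of escape sequences.<br>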
<br>
Have fun.<br>
<div><div></div><div class="h5"><br>
><br>
> On Sun, Jun 27, 2010 at 12:02 PM, Shlomi Fish <<a href="mailto:shlomif@iglu.org.il">shlomif@iglu.org.il</a>> wrote:<br>
><br>
> > On Sunday 27 Jun 2010 11:27:02 Erez David wrote:<br>
> > > Hi,<br>
> > ><br>
> > > I am reading a file which has some meta characters in it.<br>
> > > These meta characters are: ^[[1m and ^[[0m which are used to bold some<br>
> > > text out.<br>
> > ><br>
> > > I am looking for the best way to remove these meta characters from the<br>
> > > file before I parse it. (Whether to remove them by regex or any other way...)<br>
> > ><br>
> ><br>
> > You can use a regex. Untested:<br>
> ><br>
> > s/\e[\[01]m//g<br>
> ><br>
> > Regards,<br>
> ><br>
> > Shlomi Fish<br>
> ><br>
> > > Thanks<br>
> > ><br>
> > > Erez<br>
> ><br>
><br>
</div></div></blockquote></div><br></div></div>