[Israel.pm] A simpler regex required

Yuval Yaari yuval at windax.com
Tue Aug 14 22:50:32 PDT 2007


Peter Gordon wrote:
> Hi. 
>
> Let's suppose that I have the following lines in an HTML file. 
> I want to substitute the spaces in the date part with non-breaking spaces (&nbsp;)
>
> <td  style="text-align: left" bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>
> <td  style="text-align: left" bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>
>
> I came up with this line - but somehow it isn't aesthetic.
>
> s!(<td.*?>)(.*?)(</td>)!my $t1 = $1 ;my $t2 = $2 ; my $t3 = $3 ; $t2 =~ s/\s/&nbsp;/g ; "$t1$t2$t3" ;!egs ;
>
> Is there a nicer/cleaner way to write it?
>   
Using look-ahead and look-behind, I guess.

Oh, yeah, except that variable-length look-behind won't work in Perl 5.8.8... :-(
I do like this trick in blead, though:

s{<td.*?> \K (.+?) (?=</td>)}
  {(my $text = $1) =~ s/\s/&nbsp;/g; $text}egx;

\K is described in perlre as "Keep the stuff left of the \K, don't 
include it in $&".
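
For anyone who wants to try this, here is a minimal sketch, not a definitive implementation. It assumes Perl 5.10 or later for \K (the first stable release that has it, as far as I know). The sub names nbsp_with_keep, nbsp_portable and cell_body are made up for illustration. The second sub is a fallback that captures the opening tag and puts it back, so it should also run on 5.8.8.

```perl
use strict;
use warnings;

# Sample row taken from the original question.
my $row = '<td  style="text-align: left" bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>';

# Needs \K (Perl 5.10+). Everything matched before \K is kept as-is,
# so only the cell body is handed to the replacement expression.
sub nbsp_with_keep {
    my ($html) = @_;
    $html =~ s{<td.*?> \K (.+?) (?=</td>)}
              {(my $text = $1) =~ s/\s/&nbsp;/g; $text}egsx;
    return $html;
}

# Fallback without \K: capture the opening tag and emit it again.
# The closing tag is only looked at, never consumed.
sub nbsp_portable {
    my ($html) = @_;
    $html =~ s{(<td.*?>)(.*?)(?=</td>)}
              {my ($open, $text) = ($1, $2); $text =~ s/\s/&nbsp;/g; "$open$text"}egs;
    return $html;
}

# Shows what \K does to $&: the opening tag is matched but not included.
sub cell_body {
    my ($html) = @_;
    return $html =~ m{<td.*?>\K.+?(?=</td>)}s ? $& : undef;
}

print nbsp_with_keep($row), "\n";
print nbsp_portable($row),  "\n";
print cell_body($row),      "\n";
```

Both subs leave the spaces inside the <td ...> attributes alone, because the opening tag is either left of \K or copied back unchanged. The usual caveat applies: a regex like this is fine for simple, regular rows like the ones above, not for arbitrary HTML.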

HTH (guess not :-)),

  ~Y


