[Israel.pm] A simpler regex required

Peter Gordon peter at pg-consultants.com
Thu Aug 16 01:09:17 PDT 2007


How about the following - at least it looks symmetric.

$str =~ s{<td.*?>\K(.+?)\L</td>}{htmlify()}egx;

\L would perform a function similar to \K, only forward-looking.
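
For comparison, here is a minimal sketch of what the proposal above would be shorthand for, using only syntax that already exists (\K needs perl 5.9.5 or later, as noted in the quoted message). The sample string and the choice to pass $1 as an argument to htmlify() are assumptions made for illustration, not something from the thread. Note also that \L already has a meaning in Perl (lowercase everything up to \E), so a real forward-looking counterpart to \K would presumably need a different letter.

```perl
use strict;
use warnings;

# Hypothetical sample input, made up for this illustration.
my $str = '<tr><td class="a">one two</td><td>three  four</td></tr>';

# Return a copy of the text with each whitespace character
# replaced by a non-breaking space entity.
sub htmlify {
    my ($text) = @_;          # copy $1 before running another regex
    $text =~ s/\s/&nbsp;/g;
    return $text;
}

# \K keeps the opening <td ...> out of the replaced span;
# (?=</td>) is a look-ahead, so the closing tag is not consumed either.
$str =~ s{<td.*?>\K(.+?)(?=</td>)}{htmlify($1)}eg;

print "$str\n";
```

Only the text between the tags is rewritten; the whitespace inside the opening tag (before class="a") is left alone because it sits before the \K.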

On Thu, 2007-08-16 at 01:13 +0300, Yuval Yaari wrote:
> Peter Gordon wrote:
> > Wouldn't it be cute and much more intuitive if we could write this? 
> >
> > s!<td.*?>(.*?)</td>!$1 =~ s/\s/&nbsp;/g!eg
> >   
> 
> Yes. It wouldn't.
> 
> These variables are read-only for very good reasons, y'know :-)
> Also notice the regex you just wrote doesn't do what you originally 
> asked for.
> Assuming you wouldn't get errors for modifying a read-only variable, 
> you're completely deleting <td> and </td>.
> 
> I hope you don't mean you want the current behaviour that we all know 
> and love to "match-but-do-not-replace-anything-that's-not-grouped" :-)
> 
> So basically we:
>  - *Have* to tell the engine what we want to "match-but-not-replace", as 
> opposed to the current "match-and-replace"
>  - *Really* want $1 to be read-only (think of your debugging experience 
> when some function somewhere modifies $1 :))
> 
> The cleanest solution would probably be (works starting from perl 5.9.5):
> 
> $str =~ s{<td.*?>   # variable-length look-behind :)
>           \K        # tells Perl to "keep" that <td>
>           (.+?)     # text that we want to s///
>           (?=</td>) # look-ahead; won't be replaced
>      }{htmlify()}egx;
> 
> sub htmlify {local $_=$1; s/\s/&nbsp;/g; $_}
> 
> 
> I think the substitution should have occurred in a subroutine anyway, 
> even if $1 was writable.
> Maybe it would be cool if look-ahead had a backslash-thingie notation. 
> Erm...
> 
> And you might like this one; I personally hate it:
> 
> my $text;
> $str =~ s{<td.*?> \K
>           (.+?)
>           (?{ $text = $^N; $text =~ s/\s/&nbsp;/g; })
>           (?=</td>)
>      }{$text}egx;
> 
> If you have any better ideas of how it could/should look (with the 
> constraints I've mentioned, or a way to break them without ruining every 
> Perl programmer's life by breaking the current behaviour) please share 
> them :-)
> 
> HTH,
> 
>   ~Y
> _______________________________________________
> Perl mailing list
> Perl at perl.org.il
> http://perl.org.il/mailman/listinfo/perl



