[Israel.pm] regexp

Shlomi Fish shlomif at iglu.org.il
Sun Jun 25 03:25:11 PDT 2006


On Sunday 25 June 2006 11:58, Issac Goldstand wrote:
> >> "aaa<asd>='asd'/6>bbb<asd>='asd'/3>ccc<asd>='asd'/5>ddd###"
>
> [snip]
>
> > Using simple regexps to parse HTML (which seems similar to your problem)
> > is a very old Perl request, and often appears in #perl on Freenode.
>
> It's not valid HTML.  Look carefully at the "closing tag".  So HTML
> parsers probably won't help.  If it were, it'd be enough to clean up the
> trailing ### (or whatever other EOL marker) and run it through
> HTML::Parser asking just for the body text.

I know it's not valid HTML. However, my point was that this was a case
*similar* to (but not exactly the same as) parsing HTML, where a simple
one-off regex would probably have been a relatively Evil thing to do.

Regards,

	Shlomi Fish

---------------------------------------------------------------------
Shlomi Fish      shlomif at iglu.org.il
Homepage:        http://www.shlomifish.org/

95% of the programmers consider 95% of the code they did not write, in the
bottom 5%.


