[Israel.pm] HTML Tables Parsing with Perl

Yuval Yaari yuval at windax.com
Thu May 27 02:04:55 PDT 2004


Hi,

I know this sounds simple, and of course I didn't try to reinvent the 
wheel: I used HTML::TableContentParser, which doesn't really work well 
for me :)

Basically, I need to extract all the data from a <TD> ...
But if there's a table inside that <TD>, HTML::TableContentParser fails.

Basically, I need:
<TD>                 <---- From here
    text
    b/w
    <TABLE>
        <TR>
          <TD>text</TD>
        </TR>
    </TABLE>
</TD>                 <---- All the way to here, excluding the </TD>...

So as you see, I can't be 100% sure that there won't be any <TABLE>s 
inside that <TD> (though I do want them...).
There may also be <TD>'s before/after the specific <TD> I'm looking for, 
so I wasn't able to write a regex.

Any modules, scripts, regexes (???) would be highly appreciated.
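
For what it's worth, one way around the nesting problem is to stop trying to match the whole cell with a single regex and instead count how many <TD> elements are open at any point. The following is only a rough sketch in core Perl, not a tested solution: the helper name outer_td_contents is made up for illustration, and the simple tokenizer assumes reasonably well-formed markup (no ">" inside attribute values, no comments or scripts containing tags). A real parser such as HTML::Parser could drive the same counter more robustly.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): returns the inner HTML of
# every outermost <TD>, keeping any nested tables intact.
sub outer_td_contents {
    my ($html) = @_;
    my @cells;         # finished outermost cells, in document order
    my $depth = 0;     # number of <td> elements currently open
    my $buf   = '';    # content of the outermost cell being collected

    # Naive tokenizer: split the input into tags and runs of text.
    while ($html =~ m{(<[^>]*>|[^<]+)}g) {
        my $tok = $1;
        if ($tok =~ m{^<td\b}i) {
            $buf .= $tok if $depth > 0;    # nested <td>: keep it verbatim
            $depth++;
        }
        elsif ($tok =~ m{^</td\s*>}i) {
            next if $depth == 0;           # stray </td>: ignore it
            $depth--;
            if ($depth == 0) {             # outermost cell just closed
                push @cells, $buf;
                $buf = '';
            }
            else {
                $buf .= $tok;              # nested </td>: keep it verbatim
            }
        }
        elsif ($depth > 0) {
            $buf .= $tok;                  # anything else inside a cell
        }
    }
    return @cells;
}

my $html = '<table><tr><td>a</td>'
         . '<td>text <table><tr><td>inner</td></tr></table> end</td>'
         . '<td>c</td></tr></table>';
my @cells = outer_td_contents($html);
print "$cells[1]\n";
# prints: text <table><tr><td>inner</td></tr></table> end
```

The caller still has to decide which cell it wants (here simply by index), but the cells before and after it no longer get in the way, and the closing </TD> of the wanted cell is excluded from the result.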

  --Yuval



