[Israel.pm] HTML Tables Parsing with Perl
Yuval Yaari
yuval at windax.com
Thu May 27 02:04:55 PDT 2004
Hi,
I know this sounds simple, and of course I didn't try to re-invent the
wheel: I used HTML::TableContentParser, but it doesn't really work well
for me :)
I need to extract everything inside one particular <TD>, but if there's
a table nested inside that <TD>, HTML::TableContentParser fails.
What I need is:
<TD> <---- From here
text
b/w
<TABLE>
<TR>
<TD>text</TD>
</TR>
</TABLE>
</TD> <---- All the way to here, excluding the </TD>...
As you can see, I can't assume there won't be any <TABLE>s inside that
<TD>, and when there are, I want them included in the output.
There may also be other <TD>s before and after the specific <TD> I'm
looking for, so I wasn't able to write a regex that stops at the
matching </TD>.
Any pointers to modules, scripts, or regexes (???) would be highly
appreciated.
--Yuval