[Perl] Extract a Paragraph from a large file

Shlomo Yona shlomo at cslx.haifa.ac.il
Sun Jun 30 21:57:59 PDT 2002


On Mon, 1 Jul 2002, Ariel E. Y. Brosh wrote:

> Many times we fetch the formatted HTML from the web in order to parse and
> "steal" the data; reverse engineer from report to data.
> 
> HTML::TableExtract might do the job for you. I wrote myself (and never
> uploaded) an extremely memory hungry alternative that parses all tables in
> a page into a tree; do you think such a tool would be useful for people?
> (Very memory hungry, as I said). I used it to "steal" weather information
> from several sites. (Today I fetch weather from METAR, if it helps
> anybody)
> 

I'd be curious to see your table parser.
I tried HTML::TableExtract several times and found that it has a weird
interpretation of the HTML, and that the resulting parse is usually not correct.


-- 
Shlomo Yona
shlomo at cs.haifa.ac.il
http://cs.haifa.ac.il/~shlomo/



