[Israel.pm] Site Crawler

Shlomo Yona shlomo at cs.haifa.ac.il
Sun Nov 14 10:05:56 PST 2004


On Sun, 14 Nov 2004, Guy Malachi wrote:

> The URLs in the list all potentially link to my site, so I just want
> to find out whether or not they do (and the specific page the link is
> located on). I don't want to download an entire site if that's not
> necessary; I want to crawl each site and stop crawling once I find my
> link. Using wget would be overkill, since I would end up downloading
> the entire site.

Alternatively, you can use WWW::Mechanize, or implement a simple
spider yourself using LWP::UserAgent. See the example in
HTML::LinkExtor's perldoc for a skeleton.
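
Something along these lines should work with WWW::Mechanize (an
untested sketch; the host names are placeholders, and you would loop
over each URL from your list as $start_url):

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use URI;

# Placeholders -- substitute your own site and a URL from your list.
my $target_host = 'www.mysite.example';           # site you want links TO
my $start_url   = 'http://www.somesite.example/'; # site to crawl

my $mech = WWW::Mechanize->new( autocheck => 0 );

my $crawl_host = URI->new($start_url)->host;
my %seen;
my @queue = ($start_url);

while ( my $url = shift @queue ) {
    next if $seen{$url}++;

    $mech->get($url);
    next unless $mech->success && $mech->is_html;

    for my $link ( $mech->links ) {
        my $abs = $link->url_abs or next;
        next unless $abs->scheme =~ /^https?$/;  # skip mailto:, javascript:, ...
        $abs->fragment(undef);                   # treat page#foo as page

        # Stop as soon as we find a link pointing at our site.
        if ( $abs->host eq $target_host ) {
            print "Found link to $abs on page $url\n";
            exit;
        }

        # Only follow links that stay on the site being crawled.
        push @queue, $abs->as_string if $abs->host eq $crawl_host;
    }
}

print "No link to $target_host found on $crawl_host\n";

WWW::Mechanize does the link extraction for you; with plain
LWP::UserAgent you would feed each fetched page through
HTML::LinkExtor instead, as in its perldoc example.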

-- 
Shlomo Yona
shlomo at cs.haifa.ac.il
http://cs.haifa.ac.il/~shlomo/


