[Israel.pm] Site Crawler
Shlomo Yona
shlomo at cs.haifa.ac.il
Sun Nov 14 10:05:56 PST 2004
On Sun, 14 Nov 2004, Guy Malachi wrote:
> The urls in the list all potentially link to my site so I just want to
> find whether or not they do (and the specific page the link is located
> on). I don't want to download the entire site if it's not necessary, I
> want to crawl the site and once I find my link stop crawling.
> So using wget would be an overkill since I would be downloading the
> entire site.
Alternatively, you can use WWW::Mechanize, or implement a
simple spider yourself using LWP::UserAgent. See the example
in HTML::LinkExtor's perldoc for a skeleton.
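
A rough sketch of such a spider, in case it helps. It is untested
against your URLs and makes a few assumptions: the helper names
extract_links and find_linking_page are made up for this example,
"my site" is identified by host name only, the crawl stays on the
starting URL's host, and it gives up after a fixed number of pages.
It stops as soon as it finds a page linking to your host.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;
use URI;

# Return all <a href> targets in $html as absolute, canonical URI
# objects (fragments removed), resolved against $base.
sub extract_links {
    my ($html, $base) = @_;
    my @links;
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        return unless $tag eq 'a' && defined $attr{href};
        my $uri = URI->new_abs($attr{href}, $base);
        $uri->fragment(undef);
        push @links, $uri->canonical;
    });
    $parser->parse($html);
    $parser->eof;
    return @links;
}

# Breadth-first crawl starting at $start, staying on its host.
# Returns the URL of the first page found that links to $my_host,
# or nothing if no such page turns up within $max_pages fetches.
sub find_linking_page {
    my ($start, $my_host, $max_pages) = @_;
    my $ua        = LWP::UserAgent->new(timeout => 10);
    my $start_uri = URI->new($start)->canonical;
    my $site_host = $start_uri->host;
    my @queue     = ("$start_uri");
    my %seen      = ("$start_uri" => 1);
    my $fetched   = 0;

    while (@queue && $fetched < $max_pages) {
        my $url = shift @queue;
        my $res = $ua->get($url);
        $fetched++;
        next unless $res->is_success && $res->content_type eq 'text/html';

        for my $link (extract_links($res->decoded_content // '', $res->base)) {
            # Skip mailto:, javascript: and other non-HTTP links.
            next unless ($link->scheme // '') =~ /^https?$/;
            my $host = $link->host // '';
            # Found a link to my site: stop crawling right here.
            return $url if lc($host) eq lc($my_host);
            # Otherwise queue unseen pages on the same site.
            push @queue, "$link"
                if lc($host) eq lc($site_host) && !$seen{"$link"}++;
        }
    }
    return;
}
```

Usage would be something like
my $page = find_linking_page('http://example.com/', 'my.site.example', 50);
where both the URL and the host name are placeholders. For anything
beyond a quick check you may want LWP::RobotUA instead of
LWP::UserAgent, so that robots.txt is honoured.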
--
Shlomo Yona
shlomo at cs.haifa.ac.il
http://cs.haifa.ac.il/~shlomo/