[Israel.pm] Site Crawler

Shlomo Yona shlomo at cs.haifa.ac.il
Sun Nov 14 09:17:47 PST 2004


On Sun, 14 Nov 2004, Guy Malachi wrote:

> Hey,
> Anybody have any tips on how I can create a site crawler that will
> extract all the links on a remote site and see if the site links to my
> site?

You can use wget to mirror the site locally, then use
File::Find to walk the downloaded files and HTML::LinkExtor
to extract the links from each one. After that you just need
to compare them against the links of your site(s).
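A minimal sketch of the File::Find + HTML::LinkExtor step; the
directory name './mirror' and the qr{my-site\.example} pattern are
placeholders you would replace with your own mirror directory and
site pattern:

```perl
use strict;
use warnings;
use File::Find;
use HTML::LinkExtor;

# Walk $dir and return every link (href, src, ...) whose URL
# matches $target, a regex for *your* site. $dir is assumed to
# hold a mirror made with something like:  wget -r -np http://remote-site/
sub links_to_site {
    my ($dir, $target) = @_;
    my %found;
    find(sub {
        # File::Find chdirs into each directory; $_ is the bare filename.
        return unless -f && /\.html?$/i;
        my $parser = HTML::LinkExtor->new(sub {
            my ($tag, %attrs) = @_;
            # %attrs holds only link attributes; keep matching URLs.
            $found{$_} = 1 for grep { /$target/ } values %attrs;
        });
        $parser->parse_file($_);
    }, $dir);
    return sort keys %found;
}

# Hypothetical usage once the mirror exists:
if (-d './mirror') {
    print "$_\n" for links_to_site('./mirror', qr{\bmy-site\.example\b});
}
```

This doesn't follow links at run time (wget has already done the
recursive fetching), so it runs the same way on Windows as anywhere
else Perl and wget are available.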

> Basically I have a list of urls that I want to check for each url if
> somewhere on the site (extracting all links and following onsite links
> recursively) there is a link to my site.
>
> Oh yea, it must run on Windows.
>

-- 
Shlomo Yona
shlomo at cs.haifa.ac.il
http://cs.haifa.ac.il/~shlomo/
