[Israel.pm] Scraping data via Perl from ASP websites
idokan at gmail.com
Thu Aug 28 04:16:47 PDT 2008
Live HTTP Headers is a great tool; however, every open tab (and plugin)
affects it. So you might have Gmail or another auto-refreshing page open
in a tab, and its requests will show up alongside yours.
Personally I prefer Wireshark, because I can filter properly what
belongs to what.
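
Another way to take the browser out of the picture is to let the Perl
script itself report what it requested. The following is only a rough
sketch, assuming the script uses LWP::UserAgent and that the extra request
comes from an HTTP redirect, which is just one possible cause. The helper
name request_chain is made up for this example, and the URL is whatever
you pass on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: walk back through the responses that led to the
# final one (via HTTP::Response's previous() link) and return them
# oldest first as "status URL" strings.
sub request_chain {
    my ($response) = @_;
    my @chain;
    for (my $r = $response; $r; $r = $r->previous) {
        unshift @chain, sprintf '%s %s', $r->code, $r->request->uri;
    }
    return @chain;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $ua       = LWP::UserAgent->new;
    my $response = $ua->get($ARGV[0]);
    print "$_\n" for request_chain($response);
}
```

If the list shows a hop to the unrelated site, the redirect came from the
server. If it does not, the request is more likely triggered by something
the browser does on its own (JavaScript, frames, or another tab).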
I hope this helps a bit.
On Thu, Aug 28, 2008 at 2:08 PM, Yossi Klein <kleinyossi at yahoo.com> wrote:
> I have had much success scraping data from Perl websites using the HTTP modules (HTTP::Request, HTTP::Response, etc.). However, I now have a need to use Perl to scrape data off of ASP sites and am not having much success.
> Without going into too much detail, I've always used the Live HTTP Headers add-on for Firefox to see the HTTP requests and responses, and I use that information to help me emulate the same in a Perl program. This method doesn't work for ASP sites. I see my original request and the response from the website. But after that I see my browser making a request to a completely unrelated site, and I can't figure out how that request gets initiated. If anyone can help me figure this out or knows of a tool (preferably in Perl and preferably one with which they've had success) that can help me, it would be greatly appreciated.
> (I know that this doesn't sound like a Perl question, but I've taken on this assignment and will only have time to do it if I have a Perl solution. The request for a tool, even if it's not Perl, is a favor to the person who asked me to do this, so that at least he has a head start for whoever takes over for me if I can't do it in Perl.)