[Israel.pm] LWP and the Jewish/Hebrew question
meir at guttman.co.il
Mon Jun 28 02:21:37 PDT 2010
Hey Hebrew processing folks!
The LWP (libwww-perl) package seems to be very unfriendly, or even downright
hostile, to Unicode, RTL text, or just Hebrew.
There are many examples. For one: the HTML::TreeBuilder ->dump and
->as_HTML methods totally garble Hebrew UTF-8 text.
But let's start with a very simple cookie_jar case, which I first
posted <http://firstname.lastname@example.org/msg06768.html> on the
libwww at perl.org mailing list (with no results!). The following code, run
against the Google site and suggested by the LWP originator and maintainer
Gisle Aas, worked:
use HTTP::Cookies;
use LWP::UserAgent;
my $jar = HTTP::Cookies->new(file => "lwp.jar", autosave => 1);
my $ua = LWP::UserAgent->new(cookie_jar => $jar);
And indeed, it fills the 'lwp.jar' file with a few entries.
But just changing the URL to http://www.magna.isa.gov.il/ produced a cookie
jar file with just a comment header line and no entries at all.
As you can see for yourself, the MAGNA site does send this header line:
Set-Cookie: ASP.NET_SessionId=ddkkv245c14tol45bgk35m45; path=/; HttpOnly
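One possible explanation, offered as an assumption rather than a confirmed
diagnosis: the Set-Cookie line from the MAGNA site carries no expires
attribute, so HTTP::Cookies treats it as a session cookie marked for discard,
and save() skips such cookies unless the jar was created with
ignore_discard => 1. The sketch below tries to reproduce this offline with a
hand-built response instead of a live request. The host www.example.com and
the helper saved_jar_text are hypothetical names used only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

# The Set-Cookie value quoted from the MAGNA site; note there is no "expires".
my $header = 'ASP.NET_SessionId=ddkkv245c14tol45bgk35m45; path=/; HttpOnly';

# Hypothetical helper: feed one Set-Cookie header into a fresh jar,
# save the jar to a temporary file, and return the file's contents.
sub saved_jar_text {
    my ($set_cookie, %jar_opts) = @_;

    # Hand-built response; www.example.com is a placeholder host.
    my $res = HTTP::Response->new(200, 'OK');
    $res->request(HTTP::Request->new(GET => 'http://www.example.com/'));
    $res->header('Set-Cookie' => $set_cookie);

    my $jar = HTTP::Cookies->new(%jar_opts);
    $jar->extract_cookies($res);

    my ($fh, $file) = tempfile(UNLINK => 1);
    close $fh;
    $jar->save($file);

    open my $in, '<', $file or die "cannot read $file: $!";
    local $/;                 # slurp the whole file
    my $text = <$in>;
    close $in;
    return $text;
}

print "default jar:\n",        saved_jar_text($header), "\n";
print "ignore_discard => 1:\n", saved_jar_text($header, ignore_discard => 1), "\n";
```

If this assumption holds, the first file should contain only the comment
header line, while the second should also list the ASP.NET_SessionId cookie.
In the original script that would mean passing ignore_discard => 1 to
HTTP::Cookies->new alongside file and autosave.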
So, in the spirit of last week's events, why is the Israeli site