[Israel.pm] Unicode un-handling

Shmuel Fomberg semuelf at 012.net.il
Tue Apr 8 14:29:20 PDT 2008


Hi All.

I'm writing my site in Hebrew/Unicode.
Actually, as far as I can see, the processing of the pages is not done 
in Unicode. The Template Toolkit loads the files without the :utf8 
modifier and processes them as if they were plain ASCII.
The data that comes from the DB is probably not marked as Unicode either, 
so all the fields are passed to the template as byte sequences.

And it all works. Somehow.

But then I tried to enter data that is marked as utf8:
my $check_encoding_mark = decode("utf8", pack "H*", 'c3a4e284a2c2ae');
as one of the fields in the template.

Suddenly, all the Hebrew turned into something that looks like:
×?×?×¥ ×?×?×? ×?×?×ª× ×ª×§×?×ª
It should be "press here to disconnect", in Hebrew.

My guess is that when I added utf8-marked data, Perl tried to convert 
the old data from (latin-1?) to utf8.
Is that correct?
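
To make the guess concrete, here is a minimal sketch using only the core
Encode module. It assumes the unmarked fields are raw UTF-8 bytes; the
single Hebrew letter and the variable names are just for illustration,
not taken from my site.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hebrew letter lamed (U+05DC) as raw UTF-8 bytes, the way an unmarked
# DB field or template file would hold it (assumption).
my $bytes = pack "H*", 'd79c';

# The utf8-marked test field from above.
my $marked = decode("utf8", pack "H*", 'c3a4e284a2c2ae');

# On concatenation Perl upgrades $bytes, treating each byte as one
# Latin-1 character (U+00D7 and U+009C).
my $joined = $bytes . $marked;

# Encoding the joined string for output encodes those two characters
# again, so the original two bytes become four.
my $out = encode("utf8", $joined);

printf "before: %s\n", unpack("H*", $bytes);              # d79c
printf "after:  %s\n", unpack("H*", substr($out, 0, 4));  # c397c29c
```

The bytes c3 97 are the UTF-8 form of the multiplication sign, which
would explain why every Hebrew letter now starts with that character.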

I think that I should mark everything as utf8. I use:
CGI::Application
CGI::Application::Plugin::AnyTemplate - Template Toolkit
Class::DBI

Can anyone help me convince these modules to grok utf8?
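
What I mean by "mark everything as utf8" is roughly the following,
sketched with the core Encode module only. The helper decode_fields and
the hash standing in for a Class::DBI row are hypothetical; how to hook
this into the modules above is exactly the part I'm unsure of.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of any module listed above): turn a
# hash of UTF-8 byte strings into a hash of character strings.
sub decode_fields {
    my ($row) = @_;
    my %decoded;
    for my $key (keys %$row) {
        $decoded{$key} = decode("UTF-8", $row->{$key});
    }
    return \%decoded;
}

# Stand-in for a row coming from the DB: three Hebrew letters as raw
# UTF-8 bytes (assumption about what the DB layer returns).
my $row = { label => pack("H*", 'd79cd797d7a5') };

my $vars   = decode_fields($row);
my $marked = decode("UTF-8", pack "H*", 'c3a4e284a2c2ae');

# Both sides are character strings now, so mixing them is safe.
my $page = $vars->{label} . $marked;

# Encode exactly once, on the way out.
my $out = encode("UTF-8", $page);
print unpack("H*", $out), "\n";   # d79cd797d7a5c3a4e284a2c2ae
```

With every input decoded at the boundary, the output bytes are the same
as the input bytes, with no second round of encoding.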

Shmuel.


