[Israel.pm] Detecting html form charset

sawyer x xsawyerx at gmail.com
Fri Apr 4 05:33:09 PDT 2008


iconv has a recognition mechanism that usually works.
There's Text::Iconv on CPAN.

The unpack check quoted below also seems like a good way to do it.
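
For reference, here is a minimal standalone sketch of that hidden-field check using only the core Encode module. The helper names (encoding_from_marker, decode_field) are made up for illustration. It assumes the hidden input holds the three characters ä™®, that the e499ae marker means Windows-1252 (those bytes are ä™® in that code page), and that an unrecognised marker should fall back to UTF-8. None of this has been run against a real browser.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hex dumps of the marker "ä™®" as it would arrive in each encoding.
# Both hex values are taken from the quoted snippet.
my %MARKER_TO_ENCODING = (
    'c3a4e284a2c2ae' => 'UTF-8',
    'e499ae'         => 'cp1252',
);

# Hypothetical helper: map the raw bytes of the hidden field to an
# Encode encoding name, or undef if the marker is not recognised.
sub encoding_from_marker {
    my ($marker_bytes) = @_;
    return undef unless defined $marker_bytes;
    my $hex = unpack 'H*', $marker_bytes;
    return $MARKER_TO_ENCODING{$hex};
}

# Hypothetical helper: turn one raw form value into a Perl character
# string. Falling back to UTF-8 for unknown markers is an assumption.
sub decode_field {
    my ($raw_value, $marker_bytes) = @_;
    my $encoding = encoding_from_marker($marker_bytes) || 'UTF-8';
    return decode($encoding, $raw_value);
}
```

Each form value would then go through decode_field() together with the raw bytes of the hidden field, so the rest of the program only ever sees decoded character strings.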

On Fri, Apr 4, 2008 at 3:21 PM, Shmuel Fomberg <semuelf at 012.net.il> wrote:
>
> ik wrote:
>
> > I have an HTML form, and while my page is set to UTF-8, I had a
> > problem where someone submitted non-UTF-8 text, causing the data to
> > be lost completely.
> > Is there a way to know which charset each form field is in?
>
> I didn't understand whether the non-UTF-8 text you are talking about
> was sent by the user, or whether you have strings to display whose
> encoding you do not know.
>
> If you have strings that you want to display but don't know their
> encoding, good luck with that.
>
> If you are talking about users submitting forms in other encodings, I
> took the idea from this page:
> http://dev.mysql.com/tech-resources/articles/4.1/unicode.html
> which is to add a hidden input with special characters and see what
> the user submits. I wrote the following code snippet (untested):
>
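> use Encode qw(decode);
>
> # Returns a closure that converts a submitted form value, chosen by
> # the hex dump of the hidden "charset_check" field.
> sub check_encoding {
>     my $self = shift;
>     my $unicode_check = $self->query->param("charset_check");
>     $unicode_check = '' unless defined $unicode_check;
>     my $check_hexed = unpack "H*", $unicode_check;
>     if ($check_hexed eq 'c3a4e284a2c2ae') {
>         # The marker arrived as UTF-8, so leave the value untouched.
>         return sub { return $_[0] };
>     } elsif ($check_hexed eq 'e499ae') {
>         # The marker arrived as Windows-1252 (bytes e4 99 ae).
>         #$question = "Windows-1252 " . $question;
>         return sub { return decode("cp1252", $_[0]) };
>     } else {
>         #$question = "unknown($check_hexed) " . $question;
>         warn "Do not know this encoding: $check_hexed";
>         return sub { return $_[0] };
>     }
> }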
>
> Shmuel.
>
>
>
> _______________________________________________
> Perl mailing list
> Perl at perl.org.il
> http://perl.org.il/mailman/listinfo/perl
>


