[Israel.pm] about utf8

Gaal Yahas gaal at forum2.org
Sun Jan 18 09:37:35 PST 2009


Are you sure? €, U+20AC, is represented in UTF-8 as 0xE2, 0x82, 0xAC.
For the middle byte, (0x82 & 0xC0) == 0x80, not 0xC0, yet another byte
follows it.
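
To make the counterexample concrete, here is a minimal Python sketch of
how the high bits of each byte can be read. The function names
utf8_byte_kind and utf8_sequence_length are made up for illustration,
and the code does no validation (it will not reject overlong or
otherwise illegal sequences). The point is that the first byte announces
the length, and every following byte has the form 10xxxxxx, so testing
(byte & 0xC0) == 0xC0 only identifies the first byte of a multi-byte
character.

```python
def utf8_byte_kind(byte):
    """Classify a single byte by its high bits (no validation)."""
    if (byte & 0x80) == 0x00:
        return "single"        # 0xxxxxxx: ASCII, a complete character
    if (byte & 0xC0) == 0x80:
        return "continuation"  # 10xxxxxx: part of a multi-byte character
    return "lead"              # 11xxxxxx: first byte of a multi-byte character


def utf8_sequence_length(lead):
    """Return the total byte count announced by a first byte."""
    if (lead & 0x80) == 0x00:
        return 1               # 0xxxxxxx
    if (lead & 0xE0) == 0xC0:
        return 2               # 110xxxxx
    if (lead & 0xF0) == 0xE0:
        return 3               # 1110xxxx
    if (lead & 0xF8) == 0xF0:
        return 4               # 11110xxx
    raise ValueError("not a valid first byte: 0x%02X" % lead)


# The euro sign from the message above: 0xE2, 0x82, 0xAC.
euro = "\u20ac".encode("utf-8")
masked = [b & 0xC0 for b in euro]
kinds = [utf8_byte_kind(b) for b in euro]
length = utf8_sequence_length(euro[0])
```

Here only the first byte gives 0xC0 under the mask; the other two give
0x80, including the middle one, which is not the last byte.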

On Sun, Jan 18, 2009 at 7:26 PM, Shmuel Fomberg <semuelf at 012.net.il> wrote:
> Hi.
>
> I've been reading a bit about utf8, and I learned that when reading a
> utf8 character, for each byte I need to check:
> (byte & 0xC0 ) == 0xC0
> means that there is another byte for this character. Otherwise, it's the
>  last byte of the character.
>
> Shmuel.



-- 
Gaal Yahas <gaal at forum2.org>
http://gaal.livejournal.com/


