[Israel.pm] unicode characters in your code

Mikhael Goikhman migo at homemail.com
Sat Mar 13 10:34:01 PST 2004

On 13 Mar 2004 09:09:20 +0200, Shlomo Yona wrote:
> I'm using Perl 5.8.0 and am facing a problem with unicode
> characters.
> I cannot seem to explicitly use a unicode character in the
> code (say, in a regular expression pattern). I can, however,
> use the \x{...} notation to represent the unicode character.
> Is there a way to explicitly use the unicode character in
> the code?

It is not clear from your question what you mean by "unicode character".

Do you mean that you want $str = "binary_data"; to be interpreted by Perl
as a utf8 string? By default Perl treats the source code itself as bytes,
not utf8, so such a literal is not decoded into characters. You should
either put "use utf8;" at the top of a utf8-encoded source file, use the
\x{...} notation, or read the unicode data from stdin/file using utf8
encoding.
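
A minimal sketch of the two ways to get a unicode character into a string
or pattern: the \x{...} notation, and a literal character under the
"use utf8;" pragma. It assumes the script file itself is saved as utf8;
the Hebrew word and the variable names are only an illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # tell perl that this source file is encoded as utf8

my $literal = "שלום";                            # characters typed directly
my $escaped = "\x{5e9}\x{5dc}\x{5d5}\x{5dd}";    # the same word via \x{...}

# Both forms produce the same 4-character string.
print "same string\n" if $literal eq $escaped;
print "length: ", length($literal), "\n";        # counts characters, not bytes

# A literal character also works inside a regular expression.
print "regex matches\n" if $escaped =~ /שלום/;
```

Without the "use utf8;" line the literal would be seen as a sequence of
bytes, and the comparison with the \x{...} form would fail.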

Starting with 5.8.1, the "perl -C" option makes your stdin/stdout and
other streams be considered utf8. Read "man perlrun" in the recent perl
versions. Inside a script, binmode(STDOUT, ":utf8") or the "open" pragma
does the same for individual streams. Note that "use utf8;" does not
affect streams; it only declares that the source file is utf8.
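
A short sketch of marking streams as utf8 from inside the script, which is
roughly what "perl -CS" arranges for the three standard streams. The
in-memory filehandle and the variable names are only there for
illustration, to show what the :utf8 layer does to the data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Treat the standard streams as utf8.
binmode($_, ":utf8") for \*STDIN, \*STDOUT, \*STDERR;

my $shin = "\x{5e9}";    # HEBREW LETTER SHIN, one character

# Write the character through a :utf8 layer into an in-memory file
# to see how it is encoded on output.
my $buf = '';
open(my $out, '>:utf8', \$buf) or die "cannot open in-memory file: $!";
print $out $shin;
close($out);

print "characters in string: ", length($shin), "\n";
print "bytes written:        ", length($buf), "\n";
```

The string holds a single character, while the stream receives its
two-byte utf8 encoding.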

BTW, perl 5.8.0 has some known bugs with unicode locale that are fixed in
5.8.1. It is possible this does not affect you. Still, I think you should
upgrade your perl if you want to use advanced unicode features.


perl -e 'print+chr(64+hex)for+split//,d9b815c07f9b8d1e'
