[Israel.pm] \w for utf8

Pinkhas Nisanov pinkhas at nisanov.com
Mon Aug 20 06:00:16 PDT 2007


binmode($fh, ":utf8") solved the problem.
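
For the archive, here is a minimal sketch of the difference, not the original script. It assumes the data arrives as raw UTF-8 bytes; the names ($bytes, @raw, @decoded, @from_fh) are made up for illustration, and an in-memory filehandle stands in for the real file.

```perl
use strict;
use warnings;
use Encode qw(decode);

binmode(STDOUT, ':encoding(UTF-8)');

# Raw UTF-8 bytes for "N-Größe", as they would come from a file
# opened without any layer (hypothetical sample data).
my $bytes = "N-Gr\xC3\xB6\xC3\x9Fe";

# Undecoded byte string: \w only matches [a-zA-Z0-9_] here,
# so the match breaks at every byte of the two non-ASCII letters.
my @raw = ( $bytes =~ /(\w+)/g );
print join( " ++ ", @raw ), "\n";        # N ++ Gr ++ e

# Option 1: decode explicitly with Encode::decode.
my $chars   = decode( 'UTF-8', $bytes );
my @decoded = ( $chars =~ /(\w+)/g );
print join( " ++ ", @decoded ), "\n";    # N ++ Größe

# Option 2: let the filehandle decode on read via binmode.
# The in-memory handle is an assumption standing in for a real file.
open( my $fh, '<', \$bytes ) or die "cannot open in-memory handle: $!";
binmode( $fh, ':utf8' );
my $line = <$fh>;
close($fh);
my @from_fh = ( $line =~ /(\w+)/g );
print join( " ++ ", @from_fh ), "\n";    # N ++ Größe
```

Both options should give the same character string; the filehandle layer is just more convenient when everything is read from one handle. ':encoding(UTF-8)' can be used instead of ':utf8' if invalid input should be checked as well.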

Thanks!
Pinkhas Nisanov


On 8/20/07, Yuval Kogman <nothingmuch at woobling.org> wrote:
> use utf8;
>
> Will tell perl that the current file is encoded in utf8, so all
> string literals in it will be decoded as that (as opposed to latin1).
>
> Since your string is likely coming from elsewhere, look into
> binmode($fh, ":utf8") and open($fh, "<:utf8", $file), and also
> Encode::decode.
>
> These are the common methods to get a string to be marked as unicode
> in memory, at which point the regex engine treats \w+ as really all
> alphanumerical characters, not only [a-zA-Z0-9_].
>
> There is a tutorial by Juerd somewhere, it's supposed to be pretty
> good. Try Google, perhaps.
>
> On Mon, Aug 20, 2007 at 15:39:58 +0300, Pinkhas Nisanov wrote:
> > Hi,
> >
> > I need to match a string that may include UTF-8 characters,
> > e.g.:
> >
> >   my $str_utf8 = 'N-Größe';
> >   my @res = ( $str_utf8 =~ /(\w+)/g );
> >   print join( " ++ ", @res ), "\n";
> >
> >
> > it prints:
> >
> >  N ++ Gr ++ e
> >
> > but I need:
> >
> > N ++ Größe
> >
> >
> > thanks
> > Pinkhas Nisanov
> > _______________________________________________
> > Perl mailing list
> > Perl at perl.org.il
> > http://perl.org.il/mailman/listinfo/perl
>
> --
>   Yuval Kogman <nothingmuch at woobling.org>
> http://nothingmuch.woobling.org  0xEBD27418

