[Israel.pm] The qr// operator
Eli Billauer
eli at billauer.co.il
Tue Dec 30 13:33:30 PST 2008
Hello all,
After quite a few years of using Perl, I suddenly discovered how complex
regular expressions can be built from small, named pieces using the qr//
operator. It does appear in the perlop manpage, but somehow I never
realized how useful it is.
Have a look at the code below, which was taken from
http://www.cs.cmu.edu/~cache/email/
Note that $atext holds the *compiled* regular expression, so it can
be interpolated into another regular expression, possibly to create yet
another compiled regular expression, such as $dot_atom_text below. The
variable doesn't hold a matching result. It is a regexp object which
carries the meaning of the regular expression, flags included, and it
turns into an equivalent pattern when interpolated.
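
To see this in isolation, here is a minimal sketch. The names $label, $dotted_name and is_dotted_name are made up for illustration and do not come from the code at the bottom; the point is only that a qr// value is an object which can be dropped into a larger pattern and keeps its own flags there.

```perl
use strict;
use warnings;

# Hypothetical building blocks, named only for this illustration.
my $label       = qr/[a-z0-9]+/i;            # one piece, case-insensitive
my $dotted_name = qr/$label(?:\.$label)*/;   # pieces joined by single dots

# qr// yields a Regexp object, not a match result.
print ref($label), "\n";                     # prints "Regexp"

# The /i flag given to $label still applies after interpolation,
# even though the enclosing patterns carry no /i of their own.
sub is_dotted_name {
    my ($text) = @_;
    return $text =~ /^$dotted_name\z/ ? 1 : 0;
}

print is_dotted_name('Perl.Org.IL'), "\n";   # prints 1
print is_dotted_name('perl..org'),   "\n";   # prints 0 (empty piece between the dots)
```
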
Now just skip to the bottom lines, and see how readable the final
expression is.
Remember all these patterns for "what I call a whitespace" or "this,
this or that" which are repeated ten times in a long one-liner? I really
wonder what they were good for.
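
A small sketch of that contrast, under my own assumptions: the "key = value" line format and the names $inline, $ws, $word, $named and parse_pair are invented for this example. Both patterns accept the same lines, but the second one spells out the idea of whitespace only once.

```perl
use strict;
use warnings;

# One-liner style: "[ \t]*" is written out four times.
my $inline = qr/^[ \t]*(\w+)[ \t]*=[ \t]*(\w+)[ \t]*\z/;

# Factored style: each idea gets a name and is written once.
my $ws    = qr/[ \t]*/;     # what I call whitespace
my $word  = qr/(\w+)/;      # a captured word
my $named = qr/^$ws$word$ws=$ws$word$ws\z/;

# Returns (key, value) on success and an empty list otherwise.
# Each interpolated $word contributes its own capture group.
sub parse_pair {
    my ($line) = @_;
    return $line =~ $named ? ($1, $2) : ();
}

my @pair = parse_pair("  name =  eli ");
print "@pair\n";            # prints "name eli"
```

If the definition of whitespace ever changes, only $ws has to be touched in the factored version.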
Which makes me wonder: Why aren't all nontrivial regular expressions
written like this?
Eli
sub is_valid_email ($) {
    my ($addr) = @_;

    # Building blocks for the local part and the domain name.
    # The original class listed \+ twice and left out \}, which is
    # one of the atext characters in RFC 2822.
    my $atext = qr/[A-Za-z0-9\!\#\$\%\&\'\*\+\-\/\=\?\^\_\`\{\|\}\~]/;
    my $dot_atom_text = qr/$atext+(\.$atext+)*/;

    # Building blocks for a quoted local part.
    my $no_ws_ctl_char = qr/[\x01-\x08\x0b\x0c\x0e-\x1f\x7f]/;
    my $qtext_char = qr/([\x21\x23-\x5b\x5d-\x7e]|$no_ws_ctl_char)/;
    my $text = qr/[\x01-\x09\x0b\x0c\x0e-\x7f]/;
    my $qtext = qr/($qtext_char|\\$text)*/;
    my $quoted_string = qr/"$qtext"/;

    # Building blocks for a bracketed domain literal.
    my $quotedpair = qr/\\$text/;
    my $dtext = qr/[\x21-\x5a\x5e-\x7e\x01-\x08\x0b\x0c\x0e-\x1f\x7f]/;
    my $dcontent = qr/($dtext|$quotedpair)/;
    my $domain_literal = qr/\[(${dcontent})*\]/;

    # The final expression: local part, "@", domain.
    if ( $addr =~ /^($dot_atom_text|$quoted_string)\@($dot_atom_text|$domain_literal)$/ ) {
        return 1;
    } else {
        return 0;
    }
}
--
Web: http://www.billauer.co.il