[Israel.pm] segmentation fault in regex
Yossi.Itzkovich at ecitele.com
Sun Mar 12 05:35:05 PST 2006
Before you all try to invent a broken wheel, here is the original text
from perlfaq6:
How do I use a regular expression to strip C style comments
from a file?
While this actually can be done, it's much harder than you'd
think. For example, this one-liner
perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c
will work in many but not all cases. You see, it's too
simple-minded for certain kinds of C programs, in
particular, those with what appear to be comments in quoted
strings. For that, you'd need something like this, created
by Jeffrey Friedl and later modified by Fred Curtis.
$/ = undef;
$_ = <>;
s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#$2#gs;
print;
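To see why the one-liner is called too simple-minded, here is a minimal sketch that runs both substitutions on the same input. The sample C line is made up for illustration, and the `no warnings` block is my addition: `$2` is undefined whenever the comment branch matches, which would otherwise warn under `use warnings`.

```perl
use strict;
use warnings;

# Hypothetical sample input: a string literal containing comment-like text,
# followed by a real comment.
my $src = qq{char *s = "/* not a comment */"; /* real comment */\n};

# The substitution from the simple one-liner: it also empties the string.
(my $naive = $src) =~ s{/\*.*?\*/}{}gs;

# The Friedl/Curtis substitution as quoted above.
my $careful = $src;
{
    # $2 is undef each time the comment alternative matches.
    no warnings 'uninitialized';
    $careful =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#$2#gs;
}

print "naive:   $naive";
print "careful: $careful";
```

The naive version leaves `char *s = "";` while the longer regex keeps the string literal intact and drops only the real comment.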
This could, of course, be more legibly written with the "/x"
modifier, adding whitespace and comments. Here it is
expanded, courtesy of Fred Curtis.
s{
   /\*         ##  Start of /* ... */ comment
   [^*]*\*+    ##  Non-* followed by 1-or-more *'s
   (
     [^/*][^*]*\*+
   )*          ##  0-or-more things which don't start with /
               ##    but do end with '*'
   /           ##  End of /* ... */ comment
 |             ##  OR various things which aren't comments:
   (
     "           ##  Start of " ... " string
     (
       \\.       ##  Escaped char
     |           ##    OR
       [^"\\]    ##  Non "\
     )*
     "           ##  End of " ... " string
   |             ##  OR
     '           ##  Start of ' ... ' string
     (
       \\.       ##  Escaped char
     |           ##    OR
       [^'\\]    ##  Non '\
     )*
     '           ##  End of ' ... ' string
   |             ##  OR
     .           ##  Any other char
     [^/"'\\]*   ##  Chars which don't start a comment, string
                 ##    or escape
   )
 }{$2}gxs;
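One way to package the expanded pattern, as a sketch only: compile it once with qr// and substitute with /e so that an undefined `$2` turns into an empty string instead of a warning. The names `$re` and `strip_c_comments` are mine, not from perlfaq6, and the sample input is invented.

```perl
use strict;
use warnings;

# Same alternatives as the expanded pattern above, compiled once.
my $re = qr{
    /\*                      # start of a block comment
    [^*]*\*+
    ( [^/*][^*]*\*+ )*       # group 1: rest of the comment body
    /                        # end of the block comment
  |
    (                        # group 2: text that must be kept
        " ( \\. | [^"\\] )* "    # double-quoted string
      | ' ( \\. | [^'\\] )* '    # single-quoted string
      | . [^/"'\\]*              # anything else, up to the next special char
    )
}xs;

# Hypothetical helper: returns a copy of the text with block comments removed.
sub strip_c_comments {
    my ($text) = @_;
    # /e evaluates the replacement as code, so a matched comment
    # (group 2 undefined) is replaced by an empty string.
    $text =~ s{$re}{ defined $2 ? $2 : '' }ge;
    return $text;
}

print strip_c_comments(qq{int x = 1; /* one */ char c = '"'; /* two */\n});
```

Note that the whitespace around a removed comment is kept, so two spaces remain where `/* one */` used to be.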
A slight modification also removes C++ comments:
s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#$2#gs;
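A quick check of the C++ variant, again only a sketch on made-up input: the `//` inside the string literal and the lone `/` of the division should survive, while both kinds of comments go away. The /e modifier with `defined` is my change to keep `use warnings` quiet; the pattern itself is the line above.

```perl
use strict;
use warnings;

# Hypothetical sample: "//" inside a string, a line comment, a division
# and a block comment.
my $src = <<'END_C';
const char *url = "http://example.org"; // trailing comment
int n = 10 / 2; /* block */
END_C

# Pattern copied from the line above; only the replacement side differs.
(my $out = $src) =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ''#gse;

print $out;
```

The newline after a `//` comment is not part of the match, so line numbers in the stripped output stay the same as in the input.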
My code was taken from the last line above.
Yossi