[Israel.pm] approximate match (agrep like)

Yossi.Itzkovich at ecitele.com
Tue Apr 5 23:53:55 PDT 2005

Thanks,  that's the one.

Here are a few lines from the documentation:

"String::Approx lets you match and substitute strings approximately. With
this you can emulate errors: typing errorrs, speling errors, closely
related vocabularies (colour color), genetic mutations (GAG ACT),
abbreviations (McScot, MacScot).

NOTE: String::Approx has been designed to work with strings, not with text.
In other words, when you want to compare things like text or source code,
consisting of words or tokens and phrases and sentences, or expressions and
statements, you should probably use some other tool than String::Approx,
like for example the standard UNIX diff(1) tool, or the Algorithm::Diff
module from CPAN, or if you just want the Levenshtein edit distance
(explained below), the Text::Levenshtein module from CPAN. See also
Text::WagnerFischer and Text::PhraseDistance.

The measure of approximateness is the Levenshtein edit distance. It is the
total number of "edits": insertions,

        word world

deletions,

        monkey money

and substitutions

        sun fun

required to transform a string to another string. For example, to transform
"lead" into "gold", you need three edits:

        lead gead goad gold

The edit distance of "lead" and "gold" is therefore three, or 75%."
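
For anyone who wants to check the numbers quoted above, here is a minimal sketch of the Levenshtein edit distance in plain Python. It is not String::Approx's own implementation, just the textbook dynamic-programming version. The function name edit_distance is made up for this illustration, and the percentage assumes the distance is taken relative to the length of the first string (4 characters for "lead").

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # prev[j] is the distance between the prefix of a handled so far
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning i characters of a into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or keep a match
            ))
        prev = curr
    return prev[-1]


# The word pairs from the quoted documentation.
for a, b in [("word", "world"), ("monkey", "money"), ("sun", "fun"), ("lead", "gold")]:
    d = edit_distance(a, b)
    # Assumption: the percentage is relative to the length of the first string.
    print(f"{a} {b}: {d} edit(s), {100 * d / len(a):.0f}%")
```

The sketch keeps only two rows of the table at a time, so memory grows with the length of the second string rather than with the product of both lengths.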

Thanks again


> String::Approx ?

