Universal Levenshtein Automata for a Generalization of the Levenshtein Distance

TitleUniversal Levenshtein Automata for a Generalization of the Levenshtein Distance
Publication TypeJournal Article
Year of Publication2009
AuthorsMitankin P., Mihov S., Schulz K.
JournalAnnuaire de l’Université de Sofia “St. Kliment Ohridski”. Faculté de Mathématiques et Informatique
Volume99
Abstract

The need to efficiently find approximate matches for a given input string in a large background dictionary arises in many areas of computer science. In earlier work we introduced the concept of a universal Levenshtein automaton for a distance bound $n$. Given two arbitrary strings $v$ and $w$, we may use a sequence of bitstrings $\chi(v,w)$ obtained from $v$ and $w$ in a trivial way as input for the automaton. The automaton is deterministic. The sequence $\chi(v,w)$ is accepted iff the Levenshtein distance between $v$ and $w$ does not exceed $n$. We showed how universal Levenshtein automata can be used to efficiently select approximate matches in large dictionaries. In this paper we consider variants of the Levenshtein distance were substitutions may be blocked for specific symbol pairs. The concept of an universal Levenshtein automaton is extended to cope with this larger class of similarity measures.

AttachmentSize
PDF icon 99-005-023.pdf1.6 MB