stemmer
<information science, human language> A program or algorithm which
determines the morphological root of a given inflected (or, sometimes, derived)
word form -- generally a written word form.
A stemmer for English, for example, should identify the string "cats" (and
possibly "catlike", "catty" etc.) as based on the root "cat", and "stemmer",
"stemming", "stemmed" as based on "stem".
English stemmers are fairly simple to write (with only occasional problems, such
as "dries" being the third-person singular present form of the verb "dry", which
plain suffix stripping does not recover, or "axes" being the plural of both "ax"
and "axis"); but stemmers become harder to design as the morphology,
orthography, and character encoding of the target language become more complex.
For example, an Italian stemmer is more complex than an English one (because of
more possible verb inflections), a Russian one is more complex still (more
possible noun declensions), a Hebrew one is even more complex (a hairy writing
system), and so on.
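As a rough illustration of the suffix-stripping approach for English, here is a
minimal sketch in Python. The function name `stem` and its rule set are
hypothetical, chosen only to cover the examples in this entry; it is not a real
stemming algorithm such as Porter's, and it will mis-stem many ordinary words.

```python
def stem(word: str) -> str:
    """Naive English suffix stripper (illustrative rules only, not a real algorithm)."""
    w = word.lower()

    # "-ies" usually replaces a final "y": "dries" -> "dry".
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"

    # Strip a few common suffixes, keeping at least three letters of stem.
    for suffix in ("ing", "ed", "er"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            base = w[: -len(suffix)]
            # Undo consonant doubling: "stemm" -> "stem".
            # Letters that often double in a root ("ll", "ss") and vowels are left alone.
            if len(base) >= 2 and base[-1] == base[-2] and base[-1] not in "aeiouls":
                base = base[:-1]
            return base

    # Plain plural "-s", but not words ending in "-ss".
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]

    return w


if __name__ == "__main__":
    for form in ("cats", "stemmer", "stemming", "stemmed", "dries", "axes"):
        print(form, "->", stem(form))
```

Under these assumed rules "cats" maps to "cat", and "stemmer", "stemming" and
"stemmed" all map to "stem". "dries" needs its own rule to reach "dry", and
"axes" comes out as "axe": a rule-only stemmer returns a single answer and has
no way to propose "axis" as the alternative root.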
Stemmers are common elements in query systems, since a user who runs a query on
"daffodils" probably cares about documents that contain the word "daffodil"
(without the s).
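A minimal sketch of how a query system might use a stemmer, assuming a toy
in-memory index. The names `singular`, `build_index` and `search`, and the
sample documents, are hypothetical; the stemming rule only strips a final "s",
which is about the crudest rule that still makes the "daffodils" example work.

```python
from collections import defaultdict


def singular(word):
    """Hypothetical plural-stripping rule: drop a final "s" (but not "ss")."""
    w = word.lower()
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]
    return w


def build_index(docs):
    """Map each stemmed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            # Strip surrounding punctuation before stemming.
            index[singular(token.strip('.,;:!?"()'))].add(doc_id)
    return index


def search(index, query):
    """Stem the query with the same rule used for the documents, then look it up."""
    return index.get(singular(query), set())


# Made-up sample documents for illustration.
docs = {
    1: "A daffodil blooms in spring.",
    2: "Daffodils line the path.",
    3: "Tulips are red.",
}
index = build_index(docs)
print(search(index, "daffodils"))
```

Because documents and queries pass through the same rule, a query on
"daffodils" finds documents 1 and 2 here, whichever form each document uses.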
(This dictionary has a rudimentary stemmer which, as of April 1997, handles
only conversion of plurals to singulars.)
(1997-04-09)