HUMAN-COMPUTER INTERACTION
SECOND EDITION
Dix, Finlay, Abowd and Beale
Stemming is the reduction of words to their root form, for example removing the 'ing' from 'stemming' to get 'stem'. Even this example shows that it is not simple: the root of 'stemming' is 'stem', not 'stemm'!
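As a rough illustration of the problem, here is a minimal Python sketch. The function names `strip_ing` and `strip_ing_undouble` are hypothetical, and the consonant-undoubling rule is an assumption made for illustration, not a rule taken from any particular published algorithm.

```python
def strip_ing(word: str) -> str:
    """Naively remove a trailing 'ing' (illustrative only)."""
    if word.endswith("ing") and len(word) > 3:
        return word[:-3]
    return word


def strip_ing_undouble(word: str) -> str:
    """Remove 'ing', then collapse a doubled final consonant.

    The undoubling step is an assumed repair rule, not part of any
    specific stemming algorithm.
    """
    stem = strip_ing(word)
    if (
        stem != word
        and len(stem) >= 2
        and stem[-1] == stem[-2]
        and stem[-1] not in "aeiou"
    ):
        return stem[:-1]
    return stem


print(strip_ing("stemming"))           # stemm
print(strip_ing_undouble("stemming"))  # stem
```

Even the repaired rule is not safe in general: it would turn 'falling' into 'fal', which is why practical stemmers tend to rely on larger, carefully ordered rule sets.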
Stemming is used in many free text information retrieval systems to improve the matching algorithms, so that a query word can also match its variant forms, such as plurals.
The full text search of Human-Computer Interaction uses the most rudimentary stemming: it removes 's' from the ends of words. A small number of 'stop words' are left unstemmed, which prevents it from stemming words such as 'fuss', but that is all.
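The scheme just described might look something like the following Python sketch. It is not the site's actual code: the names `UNSTEMMED`, `stem_s` and `matches` are hypothetical, the contents of the exception list are invented for the example, and lower-casing the words is an added assumption.

```python
# Hypothetical exception list: the real list used by the site is not
# given in the text, apart from 'fuss' as an example.
UNSTEMMED = {"fuss", "this", "is", "as"}


def stem_s(word: str) -> str:
    """Strip one trailing 's' unless the word is in the exception list."""
    w = word.lower()  # assumption: matching is case-insensitive
    if w in UNSTEMMED or not w.endswith("s") or len(w) < 2:
        return w
    return w[:-1]


def matches(query: str, text: str) -> bool:
    """True if any stemmed query word equals a stemmed word of the text."""
    query_stems = {stem_s(t) for t in query.split()}
    text_stems = {stem_s(t) for t in text.split()}
    return bool(query_stems & text_stems)


print(stem_s("words"))  # word
print(stem_s("fuss"))   # fuss
print(matches("menu", "pull-down menus and dialogs"))  # True
```

With stemming applied to both the query and the text, a search for 'menu' finds a passage that only mentions 'menus', which is the improvement in matching that stemming is meant to provide.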
If you are interested in more complex forms of stemming, see the pages for the Lancaster (Paice/Husk) stemming algorithm, which include links to implementations of that algorithm as well as more general stemming resources and links to other algorithms.
If you know of any other public domain stemming software, or of any links to stemming on the web, do let us know.
feedback to feedback@hcibook.com
this page is at: http://www.hcibook.com/hcibook/glossary/stemming.html