| Category: ||E-Material |
| Subject: ||WMTrans Language Processing Tools Available |
| From: ||Sandra Wendland |
| Email: ||sandra.wendland_(on)_canoo.com |
| Date received: ||28 Nov 2002 |
Editorial Department: Software, Information Retrieval, Natural Language
Processing, Language Learning
FOR IMMEDIATE RELEASE
WMTrans Language Processing Tools Available
German Word Analysis and Generation for more than Two Million Words
Basel, Switzerland, November 25, 2002. Canoo Engineering AG today announced
the release of its Word Manager Transducer (WMTrans) product range.
Available through the Web at http://www.canoo.com/wmtrans, the
morphology analysis software, developed by Canoo, offers intelligent
processing for information retrieval and language processing.
Typical use cases include intelligent search, text indexing, text
language learning, hyperlink generation, spell checking, grammar checking,
and machine translation.
WMTrans is based on Canoo's German Morphological Dictionary, containing more
than 200'000 entries and generating over two million fully categorized
forms, including information on word formation, all types of
irregularities and spelling variants.
The WMTrans product range includes the following software components:
WMTrans Lemmatizer
The Lemmatizer analyses German words and finds their base form and category.
An analysis of ging, for example, returns the infinitive verb, gehen, the
corresponding base form listed in a dictionary.
query -> ging
result -> gehen (Cat V)
WMTrans Unknown Word Lemmatizer
In German, complex new words can be formed easily, either by compounding or
by adding prefixes and suffixes. Examples of German compounds are words like
Umsatzwarnungen, skandalgeschüttelten, or abbausicheres. Though
many compounds have a low frequency and are not listed in dictionaries.
The Unknown Word Lemmatizer recognizes non-lexicalized words such as these
by applying word formation rules. This is a powerful advantage in a
generative language such as German.
The Unknown Word Lemmatizer includes the Lemmatizer and therefore knows
the entire dictionary and the word formation rules. Typical usage is as
follows: A first call to the Lemmatizer determines whether or not a word
form is included in the Morphological Dictionary. If the Lemmatizer does not
find the word, it is passed on to the Unknown Word Lemmatizer for
processing. The Unknown Word Lemmatizer analyses the word's structure and
associates one or more word formation rules with the corresponding base
forms in the lexicon. The output is the base form of a word and its
category. As a result, a word such as Umsatzwarnungen is analyzed
successfully, even though the base form Umsatzwarnung is not listed in the
dictionary.
query -> umsatzwarnungen
result -> umsatzwarnung (Cat N)
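The two-stage lookup described above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the WMTrans API: the four-entry dictionary, the single noun compounding rule, and the names lemmatize, lemmatize_unknown, and analyze are all hypothetical.

```python
# Toy sketch of the two-stage lookup. The dictionary, the single
# compounding rule, and all function names are hypothetical; this is
# not the WMTrans API.

# Tiny stand-in for the morphological dictionary:
# word form -> (base form, category)
DICTIONARY = {
    "ging": ("gehen", "V"),
    "umsatz": ("umsatz", "N"),
    "warnung": ("warnung", "N"),
    "warnungen": ("warnung", "N"),
}


def lemmatize(word):
    """Stage 1: plain dictionary lookup. Returns None for unknown forms."""
    return DICTIONARY.get(word.lower())


def lemmatize_unknown(word):
    """Stage 2: try to split the word into a known modifier and a known head.

    Only one illustrative rule is applied (compounding): the head carries
    the inflection and the category, so the compound's base form is the
    modifier followed by the head's base form.
    """
    word = word.lower()
    for i in range(1, len(word)):
        modifier, head = word[:i], word[i:]
        head_entry = DICTIONARY.get(head)
        if modifier in DICTIONARY and head_entry is not None:
            head_base, category = head_entry
            return (modifier + head_base, category)
    return None


def analyze(word):
    """Try the dictionary first, then fall back to the word formation rule."""
    return lemmatize(word) or lemmatize_unknown(word)
```

In this sketch, analyze("ging") is answered by the dictionary alone, while analyze("Umsatzwarnungen") falls through to the second stage, which splits the word into umsatz and warnungen and returns the base form umsatzwarnung with category N.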
WMTrans Inflection Analyzer
The Inflection Analyzer determines the base form and category of a word, as
well as providing additional grammatical and orthographical information.
The Recognizer detects if a character string is a valid German word.
The Generator returns all inflected word forms and spelling variants for a
given base form.
WMTrans Inflection Analyzer/Generator
The Inflection Analyzer/Generator determines the base form and category of a
word and computes all possible inflected forms and spelling variants for a
given base form.
WMTrans Word Formation Analyzer/Generator
The Word Formation Analyzer/Generator determines the components of a word
from which it has been derived or composed and finds all possible word
composites and derivations in which a given word is involved.
Benefits of WMTrans Products
Canoo's language tools offer the following unique benefits:
Effective use of technology: WMTrans products are finite state machines,
which are highly efficient in memory consumption and processing speed.
Excellent dictionary quality: the dictionary has been hand-compiled by a
team of highly qualified linguists, using a dedicated authoring tool,
which offers superior support during data entry and ensures high data
quality.
Complete set of word formation rules: this comprehensive dictionary
knowledge is used, for example, by the Unknown Word Lemmatizer to produce
accurate analyses of non-lexicalized entries.
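As a rough illustration of why finite state machines suit dictionary lookup, the following Python sketch builds a deterministic acceptor from a word list. It is a hypothetical toy, not WMTrans code: the function names and the nested-dict representation are assumptions, and a production transducer would also emit base forms and categories rather than only accept or reject.

```python
# Hypothetical illustration only: a trie used as a deterministic
# finite-state acceptor. Not WMTrans code.

def build_acceptor(words):
    """Build transitions as nested dicts; the key None marks a final state."""
    start = {}
    for word in words:
        state = start
        for ch in word:
            # Reuse the existing transition if present, so shared
            # prefixes are stored only once.
            state = state.setdefault(ch, {})
        state[None] = True
    return start


def accepts(start, word):
    """Follow one transition per character; accept only in a final state."""
    state = start
    for ch in word:
        state = state.get(ch)
        if state is None:
            return False
    return None in state
```

Lookup follows one transition per character, so its cost depends on the length of the word rather than on the number of entries, and words with a common prefix share states.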
WMTrans products are available for several platforms:
Platform (API)                   Product
Windows, Linux, Solaris (Java)   WMTrans Lemmatizer
                                 Unknown Word Lemmatizer
Linux (Java, C++)                WMTrans Lemmatizer
Download Trial Versions
Download free evaluation licenses at:
Browse through the product descriptions, test the APIs and find out how
WMTrans shared libraries can be integrated into your application.
Canoo Online Services are based on WMTrans products and provide examples of
possible applications. These services are available at: http://www.canoo.net
Founded in 1999, Canoo (http://www.canoo.com) delivers
solutions for business applications on the Internet, Intranet, and Extranet.
Canoo is based in Basel, Switzerland.
Canoo Engineering AG
Phone: +41 (61) 228 94 44
Voucher copy requested
Canoo Engineering AG
Tel. +41 61 228 94 66
Fax +41 61 228 94 49