MT Roadmap Workshop at TMI2002
Report on the MT Roadmap workshop organised in conjunction with the 9th
International Conference on Theoretical and Methodological Issues in Machine
Translation (TMI2002, March 13-17, Keihanna, Japan).
The workshop was the fourth in a series of
ELSNET workshops aimed at the
creation of a broadly supported roadmap for human language technologies.
For more information on earlier and future workshops visit
The URL of the workshop is
The workshop was organised by Steven Krauwer and Laurie Gerber, and it
was attended by some 30 participants.
The aim of the workshop was to identify major challenges for MT. An
individual researcher or developer may regard the solution of a single
detailed problem in a dissertation or a prototype as a major challenge,
so participants were given an indicative measure of the size of a real
challenge: they were asked to imagine that they were in charge of a major
R&D programme, and that they had 100 million USD to spend on one single
problem over a period of up to 5 years.
The challenges were grouped into three classes, based on the perspectives
one could adopt:
- research challenges
- provider challenges
- user challenges
It turned out that one important heading was missing: human resources,
especially cross-training of linguists and computer scientists.
Some participants objected that under the present economic conditions
even fantasizing about such amounts of R&D money was hard. Others said
that massive funding for a single programme to solve a single problem
would not make sense because there is no such thing as a big problem,
and that it would be much more effective to fund a large number of
independent small-scale projects.
Under the heading research problems the following candidate
challenges were collected:
- Developing a formal theory of translation
- Developing a semantic theory
- Eliminating the knowledge acquisition bottleneck
- Using translation memories (bitexts) and machine translation
together in a product
- Creating permanent shared language repositories (sharing), including
huge, word-aligned multitexts
- Robust speech recognition (based on speech and other signals such as
gestures, facial expressions, etc.) to meaning.
- Moving towards a theory of crosslingual communication aids for
situation-dependent solutions
The following challenges from the user perspective were collected:
- Language plug-ins for mobile phones (for transactions rather than
full-fledged interpretation)
- Help with the hard part of foreign languages.
- Large MT evaluation from user perspective.
- Standard control menu language (for cross-language communication by
means of small menu-driven devices)
- Crosslingual sign-reading eyeglasses (foreign language signs or
messages are read by a small camera, and the translation is projected in
the user's glasses)
- Learning from user feedback (via post-editing tools), and predicting
user needs, constructing user models
- Web search and translation with CLIR.
- Automatic stenography (TV, conferences)
- Language plug-in for cellphones, but as a service
- Ways to feed language books into an MT system
- Using TM (bitexts) & MT together in a product
- Coverage of minority languages.
- Massively annotated multitext.
- Exploiting markup.
When trying to look back and identify achievements that in hindsight
would count as overcoming challenges, the following items were proposed:
- Free on-line web translation (however poor the quality)
- Preservation of markup
- Commercial speech-to-speech translation
There was general agreement that it would be an interesting follow-up
exercise to try to integrate the points listed above into the timeline
contained in Bernsen's roadmap document based on the first roadmap
workshop in November 2000.