| Category: ||E-Conf |
| Subject: ||Colloquium on Audio Information Retrieval |
| From: ||Roeland Ordelman |
| Email: ||ordelman_(on)_cs.utwente.nl |
| Date received: ||29 Sep 2003 |
| Start date: ||10 Oct 2003 |
---COLLOQUIUM ON AUDIO INFORMATION RETRIEVAL---
Friday, October 10, 2003
TKI, University of Twente
9:45 welcome and coffee
10:15 - 11:00 AUDIO INDEXING AS FIRST STEP IN A SPOKEN DOCUMENT IR
SYSTEM - Jean-Pierre Martens
11:00 - 11:45 INFORMATION ACCESS FROM SPOKEN LANGUAGE - Steve Renals
11:45 - 12:00 Discussion
See also the abstracts below. The colloquium takes place in room BB-5
of the BB-building at the University of Twente. Participants are also
welcome to attend the Ph.D. defense of Roeland Ordelman, on his thesis
"Dutch Speech Recognition in Multimedia Information Retrieval", which
follows the colloquium at 13:00 in room BB-2 of the BB-building. It is
also possible to have lunch there.
1. AUDIO INDEXING AS FIRST STEP IN A SPOKEN DOCUMENT INFORMATION
RETRIEVAL SYSTEM
Jean-Pierre Martens
Electronics and Information Systems dept.
University Gent (Belgium)
The retrieval of information from an audio stream is a challenging
task requiring audio segmentation, speech recognition and
summarization, and document information retrieval technologies. The
ATRANOS project currently running in Flanders tries to develop such
technologies for Dutch.
In this colloquium, I will briefly introduce the aims of the ATRANOS
project, and then discuss in more detail the techniques that have been
developed for the segmentation of audio streams into so-called
homogeneous parts. Such a segmentation involves the separation of
speech from non-speech parts, the labeling of these parts
(e.g. prepared versus spontaneous speech), the localisation of speaker
changes and the labeling (also called clustering) of speaker turns.
Although some of the described indexing tasks can be performed after
or during the speech recognition process, it is common to perform
them before the recognition. I will briefly indicate how the indexing
information can be used to facilitate the task of the speech
recognizer, and I will demonstrate the system that was developed in the
ATRANOS project.
2. INFORMATION ACCESS FROM SPOKEN LANGUAGE
Steve Renals
Centre for Speech Technology Research
University of Edinburgh
The processing and recognition of unconstrained speech produced for
human ears, from unpredictable environments, raises many difficult
problems. However, it is possible to access information from such
spoken audio in an effective manner. In this talk I will discuss ways
of addressing tasks such as the browsing and search of television and
radio broadcasts, speech summarization and the access of information
from multi-party meetings.
Some key questions that will be discussed in this colloquium include:
- are automatic speech transcriptions with word error rates of 25% (or
worse) useful for information access?
- how far can we go with simple methods (e.g. hidden Markov models)
based on word-level statistics?
- how can speech information beyond the word level (e.g. prosody,
interaction structure) be used for information access?
Some examples using television and radio news broadcasts, voicemail
and meeting recordings will be included.
Train & bus: Take the Intercity train to Enschede and get off at
Hengelo station. Take bus line 3, which stops at the bus stop
"UT-viaduct". Go to the main entrance of the University campus; the
BB-building is the first building on the left.
Car: From the motorway A1 take the A35, direction Enschede and then
exit 26 "Enschede-West/Universiteit". Keep following
"Enschede-West/Universiteit". At the traffic lights turn left and
follow the signs "Universiteit". Go to the main entrance of the
University campus; the BB-building is the first building on the left.
See also: http://www.utwente.nl/en
Contact for more information
Roeland Ordelman: ordelman_(on)_cs.utwente.nl
Franciska de Jong: fdejong_(on)_cs.utwente.nl