Colloquium on Audio Information Retrieval
Friday, October 10, 2003
TKI, University of Twente

Programme:

 9:45          Welcome and coffee
10:15 - 11:00  AUDIO INDEXING AS FIRST STEP IN A SPOKEN DOCUMENT IR SYSTEM - Jean-Pierre Martens
11:00 - 11:45  INFORMATION ACCESS FROM SPOKEN LANGUAGE - Steve Renals
11:45 - 12:00  Discussion

See also the abstracts below.

The colloquium takes place in the BB-building (room BB-5) of the University of Twente. Participants are also welcome to attend the Ph.D. defense of Roeland Ordelman, on his thesis "Dutch Speech Recognition in Multimedia Information Retrieval", which follows the colloquium at 13:00 in room BB-2 of the BB-building. It is also possible to have lunch there.

------------------------------------------

Abstracts:

1. AUDIO INDEXING AS FIRST STEP IN A SPOKEN DOCUMENT INFORMATION RETRIEVAL SYSTEM

Jean-Pierre Martens
Electronics and Information Systems dept.
University Gent (Belgium)

The retrieval of information from an audio stream is a challenging task requiring audio segmentation, speech recognition and summarization, and document information retrieval technologies. The ATRANOS project currently running in Flanders tries to develop such technologies for Dutch. In this colloquium, I will briefly introduce the aims of the ATRANOS project, and then discuss in more detail the techniques that have been developed for the segmentation of audio streams into so-called homogeneous parts. Such a segmentation involves the separation of speech from non-speech parts, the labeling of these parts (e.g. prepared versus spontaneous speech), the localisation of speaker changes and the labeling (also called clustering) of speaker turns. Although some of the described indexing tasks can be performed after or during the speech recognition process, it is common to perform them before the recognition.
I will briefly indicate how the indexing information can be used to facilitate the task of the speech recognizer, and I will demonstrate the system that was developed in the context of the ATRANOS project.

------------------------------------------

2. INFORMATION ACCESS FROM SPOKEN LANGUAGE

Steve Renals
Centre for Speech Technology Research
University of Edinburgh

ABSTRACT

The processing and recognition of unconstrained speech produced for human ears, from unpredictable environments, raises many difficult problems. However, it is possible to access information from such spoken audio in an effective manner. In this talk I will discuss ways of addressing tasks such as the browsing and search of television and radio broadcasts, speech summarization and the access of information from multi-party meetings. Some key questions that will be discussed in this colloquium include:

- are automatic speech transcriptions with word error rates of 25% (or worse) useful for information access?
- how far can we go with simple methods (e.g. hidden Markov models) based on word-level statistics?
- how can speech information beyond the word level (e.g. prosody, interaction structure) be used for information access?

Some examples using television and radio news broadcasts, voicemail and meeting recordings will be included.

------------------------------------------------------

Route description

Train & bus: Take the Intercity train to Enschede and get off at the Hengelo station. Take bus line 3. This bus stops at the bus stop "UT-viaduct". Go to the main entrance of the University campus; the BB-building is the first building on the left.

Car: From the motorway A1 take the A35, direction Enschede, and then take exit 26 "Enschede-West/Universiteit". Keep following "Enschede-West/Universiteit". At the traffic lights turn left and follow the signs "Universiteit". Go to the main entrance of the University campus; the BB-building is the first building on the left.
See also: http://www.utwente.nl/en/services/route/

Contact for more information:
Roeland Ordelman: ordelman_(on)_cs.utwente.nl
Franciska de Jong: fdejong_(on)_cs.utwente.nl

