``This article appeared in ELSNews 4.3 (May 1995), and is re-printed by permission from the Editor. ELSNews is the newsletter of ELSNET, the European Network in Language and Speech. Information about ELSNET is available from the Coordinator, at firstname.lastname@example.org.''
ELRA Now Receiving Members
Meeting to Discuss Integration of Speech and NL Held in UK: Russell Collingham, Laboratory for Natural Language Engineering, Univ. of Durham
FRANCIL: Language Engineering, en français: Joseph Mariani, LIMSI-CNRS, Orsay
The Dutch National R&D Programme `Language and Speech Technology': Lou Boves, University of Nijmegen and Alice Dijkstra, The Netherlands Organization of Scientific Research (NWO)
Update on HCM Spoken Dialogue and Discourse Project
SDD Project Workshop Held in Dublin
Workshop on Discourse and Dialogue Prosody Held in Stuttgart
Minutes of the March ELSNET Executive Board Meeting Summarised
Information Technology & Music CD Available
Second LE Convention to be Held in London
European Studies in Phonetics and Speech Communication
Training and Mobility Grants Available under FP4
What is ELSNET?
When the Editor asked me to contribute an article on how ``industrials'' can exploit the resources offered by ELSNET, and how ELSNET might become more useful to us, I was amused by the Victorian connotations of my categorisation. Today's high-tech reality is very different, but I shall use the word ``industrial'' as a useful shorthand.
Let me start by saying that I think that ELSNET already does a good job. It stands out from the general run of amorphous umbrella organisations chiefly through its two flagships, namely its newsletter ELSNews, and the annual summer schools. ELSNews provides a regular update on what is going on in the worlds of speech and language. It is edited well, is just the right length, and provides information in an easily assimilated fashion which nicely complements other media such as electronic news lists, which are much less organised. BT has sent people to the last two ELSNET summer schools, and we will be there again in Edinburgh this July. I greatly enjoyed the last summer school in Utrecht, and would strongly recommend that fellow ``industrials'' attend. In fact the attendance list reveals that there was quite a healthy representation from industry in Utrecht. The summer schools generally feature good lecturers. Some of the courses are introductions to topics which one has perhaps not covered before; others are in-depth surveys of the state of the art, which can be very helpful in choosing which approach to adopt back at the lab. There is a good helping of practical work, sometimes including source code or algorithms, which can be very useful in getting new ideas from the lecture room to practical application. I found the Utrecht summer school very friendly, with many opportunities for the exchange of ideas, which are often restricted at conferences. There was no rigid division between students and staff: many of the students had considerable experience themselves, and lecturers were quite often to be seen attending each other's lectures, which says something about the quality of what was on offer. In summary, I think that the summer schools are ELSNET's best contribution to building the bridges which Steven Krauwer talked about in ELSNews 4.2, and also to building bridges across language and national boundaries.
How could ELSNET do better? I don't have space in this article to distinguish between ELSNET's role and that of other related organisations, EC-funded or otherwise. I shall simply give a few suggestions from an industrial perspective of what might help us.
The opinions expressed in this article are Peter Wyard's, and not necessarily the opinions of British Telecommunications.
ELRA has the goal of promoting the creation, verification, and distribution of language resources in Europe. The Association will also assist users and developers of European language resources, as well as government agencies and other interested parties, to exploit language resources for a wide variety of uses; it will oversee the distribution of language resources via CD-ROM and other means; and it will promote standards for such resources. Eventually, ELRA will serve as the European repository for all EU-funded language resources.
Three weeks after its founding, the Association submitted an application to the European Commission for funding under the FP4 call that closed on March 15. The contents of this application are still confidential, but reports indicate that a substantial sum is being requested for a period of four years. The results of the call will not be known until this summer, but if the application is successful, the first funds will be available in December 1995. The plan is that, after the first four years, the Association will be self-funding.
The date of the first General Assembly meeting has been set for Monday, September 25. At this meeting, the members of ELRA will elect an Executive Board. In the meantime, an interim steering committee, headed by Antonio Zampolli and Brian Oakley and comprising representatives of the LRE-funded RELATOR project and the three MLAP projects --- PAROLE, POINTER and SPEECHDAT --- is continuing to hammer out the details of the actual operation of the Association as a distributor of language resources. The interim steering committee last met in Brussels on March 27 to discuss issues related to membership and staff recruitment, and to adopt a formal business plan for the Association.
ELRA currently has 24 members, though this number is increasing rapidly as the result of an on-going membership campaign. Membership is open to any organisation, public or private, for an annual fee of 1,000 ECU. At present, however, voting rights are available only to organisations in the EU or European Economic Area. Organisations based elsewhere may participate in ELRA as Subscribers. As a result of the discussion at the March 27 meeting, it is expected that full membership will soon be open to institutions in Switzerland and Eastern Europe.
All ELRA members will receive regular information about language resources, many of which will be licensed and purchased directly from the Association. In addition, members will be able to contribute (through annual General Assembly meetings, committees and working parties) to the direction that the organisation takes over the next few years.
All ELRA members will be classified according to their chief interests (spoken, written, or terminological resources), and will belong to one or more corresponding ``colleges''. The purpose of the colleges is to prevent the Association's 12-member Executive Board from becoming dominated by any one particular interest group. Six places on the Executive Board are elected from among the members by the whole membership of the Association, and six places will be elected as representatives of the colleges, with two places available to each college. Only members associated with a particular college are allowed to nominate or vote for the representatives of that college. The Board will appoint a full-time Chief Executive, who may participate in all Board meetings in a non-voting capacity.
A job description for the Chief Executive has been drawn up and circulated widely within the European language engineering community. The Chief Executive will be responsible for managing the day-to-day activities of the Association, and it is expected that his or her appointment will begin in July.
Another of the actions taken by the interim steering committee at the March 27 meeting was to establish a number of working parties. One of these is concerned with the creation of a catalogue of European Language Resources. A second has been given the remit of drawing up a report for the Commission, suggesting where the priorities for development lie. It is planned that both of these reports will be available in time for the General Assembly meeting in September.
A working party on Validation has also been established, partly as an attempt to resolve a debate between those on the interim steering committee who believe that the Association must validate everything before it puts it on the market, and those who believe that resources should be made available as inexpensively as possible, and who do not mind whether the resources have been validated. The Validation working party will create a publicly-available manual which will describe the validation standard of the resources distributed by ELRA. It is hoped that producers of resources will conform to the standards set out in the manual before offering their resources to ELRA for distribution.
Clearly, in order to get started, ELRA needs resources to distribute. A large-scale resource collection effort is underway by the RELATOR project and the three MLAP projects. Although there continues to be discussion among the members of the interim steering committee about ELRA's pricing policy, it seems likely that resources will be priced on the basis of market forces, though the members of ELRA will, of course, be able to purchase resources at a discounted price. It is also likely that material which is intended to be used for research purposes only will be sold as cheaply as possible.
A meeting will take place later this month between representatives of ELRA and of the US-based Linguistic Data Consortium (LDC). Proposals will be discussed regarding a possible exchange agreement between the two organisations, as well as other means of cooperation.
The Institute of Acoustics Speech Group held a one-day meeting, on March 23, 1995, at the University of Durham. The purpose of the meeting was to explore ways in which research into speech recognition technology and natural language processing can be integrated to produce spoken language understanding systems. The US ARPA Spoken Language Systems (SLS) programme encourages the integration of SR and NLP systems in the ATIS domain and the CSR programme has had one or two word recognition systems that use NL techniques. The traditional approach is for an NL system to post-process the n-best output of an SR system. How successful is this approach, and what are its limitations? What is the future of speech recognition, especially with respect to the handling of more natural, spontaneous, speech? Do NL systems have anything to offer, perhaps through providing summaries instead of verbatim recognition?
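The n-best post-processing approach mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any system presented at the meeting: the function names, the score weighting and the toy "NL system" (a check for known place names) are all assumptions made for the example.

```python
# Hypothetical toy knowledge source: place names known to the NL component.
KNOWN_CITIES = {"boston", "denver", "atlanta"}


def toy_nl_score(text):
    """Illustrative stand-in for an NL system: reward hypotheses naming a known city."""
    return 2.0 if any(word in KNOWN_CITIES for word in text.split()) else 0.0


def rescore_nbest(hypotheses, nl_score, weight=1.0):
    """Pick the hypothesis with the best combined recogniser + NL score.

    hypotheses: non-empty list of (text, recogniser_log_score) pairs.
    nl_score:   callable mapping text to a score (higher is better).
    weight:     assumed interpolation weight for the NL score.
    """
    if not hypotheses:
        raise ValueError("n-best list is empty")
    best_text, _ = max(hypotheses, key=lambda h: h[1] + weight * nl_score(h[0]))
    return best_text


# The recogniser slightly prefers the first hypothesis; the NL score overturns it.
nbest = [("show flights to boss ton", -10.0), ("show flights to boston", -10.5)]
print(rescore_nbest(nbest, toy_nl_score))
```

One limitation is visible even in this sketch: the NL component can only choose among hypotheses the recogniser has already produced, so a correct word sequence missing from the n-best list cannot be recovered.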
The meeting was attended by 29 people from UK academic and industrial sites. The workshop suffered from having two speakers withdraw at the last minute. It would have been very interesting to have heard other examples of this integration of the two technologies. Having said that, there was a good mix of industrial and academic representatives.
The meeting was opened by Russell Collingham (Univ of Durham) followed by an introduction to the IOA Speech Group by Briony Williams (Univ of Edinburgh). The first talk was given by Steve Young (Cambridge Univ) on the state of the art in automatic speech recognition. He reviewed the structure of current continuous speech recognition systems and summarised the main developments over the last five years. He briefly described the ARPA Resource Management, ATIS and Wall Street Journal CSR Evaluations. It was demonstrated that if the trend in performance improvements is maintained, useable large vocabulary speech recognition systems will be available very soon. He pointed out that the challenge to integrate them with comparable natural language understanding capability is therefore urgent.
Robert Gaizauskas (Univ of Sheffield) then spoke on the state of the art in natural language understanding (NLU) systems from the perspective of building real systems designed to work on real language in the real world. He gave an overview of several areas of NLU and presented results for various systems. He then went on to describe in detail the ARPA MUC-5 and MUC-6 message understanding conferences. The first two talks therefore established the capabilities and performance of existing technology in the speech and text processing fields. The next two talks focused on the topic of speech recognition, and in particular, the way in which search space may be constrained, using both word-level knowledge (n-grams) and semantic-level knowledge, in order to produce the actual words that were spoken.
George Demetriou (Univ of Leeds) presented a talk on semantics for speech by looking at the different approaches that have been undertaken and their various problems. A taxonomy classifying the various approaches in six main categories (semantic grammars, semantic networks, case-frames, unification-based, statistical-modelled and connectionist) was presented. Details were then given of the on-going research at Leeds concerning the clustering of related words (or word meanings) from on-line dictionaries, and similar sets of conceptual (or semantic) tags from large text corpora, to provide semantic constraints for the disambiguation of word-candidate lattices output by speech recognisers.
Valtcho Valtchev (Cambridge Univ) gave a talk on using n-grams for large vocabulary speech recognition. He described the n-gram approach to structuring the probabilistic dependencies for words in natural language. A practical development framework was presented, which allows for the easy creation and manipulation of complex language models (LMs) from a very large text corpus (230 million words). He also discussed the incorporation of tri-gram and four-gram LMs into the HTK large vocabulary continuous speech recognition system which achieved state of the art performance in the 1994 ARPA evaluation. The final talk demonstrated the interconnection of speech and text processing technology in an actual application. This talk, given by Gavin Churcher (Univ of Leeds), was on the development of a corpus-based grammar model for use with a commercial continuous speech recognition package. He described the results of experiments with a commercial ``off-the-shelf'' continuous speech recognition system, applied to the (apparently) restricted domain of Air Traffic Control for light aircraft.
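For readers unfamiliar with the n-gram approach described in Valtchev's talk, the following Python sketch shows the basic idea for trigrams: the probability of a word is estimated from counts of the two words preceding it. It is a minimal illustration only and makes no claim about the HTK implementation; the function names, the padding symbols and the add-k smoothing are assumptions chosen for brevity.

```python
from collections import Counter


def train_trigram_counts(sentences):
    """Count trigrams and their two-word histories over tokenised sentences."""
    trigrams, histories, vocab = Counter(), Counter(), set()
    for words in sentences:
        # Pad so that the first real word also has a two-word history.
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        vocab.update(padded)
        for i in range(2, len(padded)):
            history = (padded[i - 2], padded[i - 1])
            trigrams[history + (padded[i],)] += 1
            histories[history] += 1
    return trigrams, histories, vocab


def trigram_prob(word, history, trigrams, histories, vocab, k=1.0):
    """Add-k smoothed estimate of P(word | history); history is a pair of words."""
    history = tuple(history)
    numerator = trigrams[history + (word,)] + k
    denominator = histories[history] + k * len(vocab)
    return numerator / denominator


corpus = [["the", "cat", "sat"], ["the", "cat", "ran"]]
trigrams, histories, vocab = train_trigram_counts(corpus)
print(trigram_prob("sat", ("the", "cat"), trigrams, histories, vocab))
```

A model trained on a corpus of hundreds of millions of words needs far more careful smoothing and storage than this, which is where a dedicated development framework of the kind described in the talk becomes important.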
This seems to be a young area of research in the UK. In the US, the integration of speech and text processing systems is encouraged by the ARPA ATIS competition, in which there has not been a UK entrant to date. The UK is well placed for the future in this respect, having leading groups in both of the ARPA speech recognition and natural language understanding competitions --- CSR (Cambridge) and MUC (Sheffield, Durham); and also having one of the leading speech-to-speech translation systems developed by SRI International.
Following the outcome of the most recent summit of the heads of state and government officials of the French-speaking nations (Ile Maurice, October 1993), the Agency of the French-speaking Communities for Higher Education and Research (Agence francophone pour l'enseignement supérieur et la recherche, Aupelf-Uref) has launched a four-year programme, with a tentative budget of 4 million ECU, to intensify its activities in the field of Language Engineering (LE). At the summit, it was felt that automatic language processing is a rapidly-expanding field covering various R&D topics including language analysis and generation, and speech recognition, understanding and synthesis. The field has applications in the areas of document processing and information retrieval, human-machine communication, authoring aids and automatic translation. Each of these areas presents a number of challenges --- industrial and economic, as well as scientific and technological. Moreover, language engineering research, with its added cultural dimension, raises important challenges that scientists in many other areas of research do not have to consider: it is always preferable to master the language which one uses to present one's research results --- even more so when the topic of research is that language. Clearly, the scientists in the French-speaking communities should be in a good position to deal with these challenges for the French language.
With these issues in mind, FRANCIL (French-Speaking Community Language Engineering Network) was set up and the first General Assembly meeting of the network took place last Summer on June 21, 1994.
These goals will be implemented by several means: through a specific programme which promotes exchanges between research teams and laboratories; through the production and distribution of scientific and technical information; through the organisation of scientific workshops; and through the allocation of grants to young post-doctoral researchers and students who are at the stage of writing their thesis. During last Summer's General Assembly meeting, it was recommended that FRANCIL focus upon four major research topics:
One of these actions, called Concerted Research Actions, is a four-year programme which will use the `evaluation paradigm' for Language Engineering research based on French linguistic resources. This programme has two strands: the first one, on Written Language Processing, will focus on four topics: information retrieval; bilingual text alignment; automatic terminological database construction; and text understanding. The second strand, Spoken Language Processing, focuses on three topics: voice dictation, spoken dialogue and speech synthesis. Each topic involves three kinds of actors: the organiser, the linguistic resources providers, and the participants, who will test their system on the same data with the same commonly-agreed protocol.
A second set of FFR actions particularly encourages the integration of universities from the South (i.e., in Tunisia, Egypt and Morocco) by providing aid for scientific infrastructure and grants, so that these universities can develop a consistent research policy and become involved in projects that have the support of an international scientific committee.
Higher education is also a priority for the FFR, and the Foundation has plans to set up a Doctoral Regional School in `Language Engineering' for the French-speaking communities. In conjunction with this, the Aupelf-Uref has provided funding to organise a Summer School on spoken language processing this year in Marseille.
FRANCIL currently has nearly 50 members from research labs in Belgium, Canada, Egypt, France, Morocco, Switzerland and Tunisia. The network is headed by a six-member Executive Board including: C. Boitet (IMAG, Grenoble), C. Delcourt (BELTEXT, Liège), Y. Hlal (Mohammadia School Eng., Rabat), J. Mariani (LIMSI-CNRS, Orsay), Y. Normandin (CRIM, Montréal), and E. Wehrli (Univ. Genève). Scientific coordination is provided by Joseph Mariani and Françoise Néel (both at LIMSI-CNRS, Orsay). Research teams interested in joining the FRANCIL network should submit an application containing details about their laboratory or organisation and its research topics, training activities and recent publications. It is important to give the name of a correspondent who would represent the laboratory within the network. All applications will be considered by the Executive Board.
The Netherlands Organization of Scientific Research (NWO) has provided Dfl 5,000,000 for a five-year research programme aimed at the development of advanced telephone-based information systems. The programme is co-funded by Philips and Royal Dutch PTT (KPN), and will be carried out by Nijmegen University (KUN), the Institute for Perception Research (IPO), the University of Amsterdam (UvA), and Groningen University (RUG), in close collaboration with Philips Corporate Research and KPN Research.
One prominent feature of the programme is its attempt to achieve scientific as well as practical goals at the same time. The practical goal is to build a working demonstrator of an interactive spoken language system that can give travel advice to individuals using public transport in the Netherlands. A number of increasingly powerful demonstrators are planned, starting with a query system which can handle questions about the schedule of Dutch Railways, the Metro in Amsterdam and Rotterdam and the so-called Interliner buses (express bus services between medium-sized cities lacking convenient railway access). The first versions of the demonstration system will make use of the technology made available for research purposes by Philips. Interaction between user and machine will be via a telephone network, using large vocabulary speaker-independent continuous speech recognition technology, combined with natural language processing using a probabilistic partial parsing approach. Dialogue management in the initial versions of the demonstrator will be modeled after a form-filling paradigm, where the caller has considerable freedom in determining the order in which the fields are completed. Dialogue management is mainly goal directed, allowing clarification sub-dialogues for a number of problematic cases (e.g., the place names Baarn and Maarn or Rheden and Rhenen, which differ only in a single phonetic feature in one consonant).
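The form-filling paradigm with clarification sub-dialogues can be illustrated with a short Python sketch. It is a toy illustration under stated assumptions, not the Philips technology or the planned demonstrator: the field names, the prompts, the returned action tuples and the `confirmed` flag are all hypothetical; only the confusable place-name pairs are taken from the text above.

```python
# Place-name pairs from the article that are easily confused over the telephone.
CONFUSABLE = {"Baarn": "Maarn", "Maarn": "Baarn", "Rheden": "Rhenen", "Rhenen": "Rheden"}

# Hypothetical form fields and system prompts for a travel query.
PROMPTS = {
    "origin": "Where are you travelling from?",
    "destination": "Where do you want to go?",
    "time": "At what time do you want to travel?",
}


def update_form(form, field, value, confirmed=False):
    """Fill one field of the form and return the next system action.

    The caller may supply fields in any order. The action is a tuple
    (kind, field, prompt), where kind is 'clarify', 'ask' or 'done'.
    """
    if field not in PROMPTS:
        raise ValueError(f"unknown field: {field}")
    # Clarification sub-dialogue: do not accept a confusable place name unconfirmed.
    if field != "time" and value in CONFUSABLE and not confirmed:
        return ("clarify", field, f"Did you say {value} or {CONFUSABLE[value]}?")
    form[field] = value
    # Goal-directed behaviour: ask for the first field that is still empty.
    for name, prompt in PROMPTS.items():
        if name not in form:
            return ("ask", name, prompt)
    return ("done", None, "Looking up your connection.")


form = {}
print(update_form(form, "destination", "Utrecht"))
print(update_form(form, "origin", "Baarn"))
print(update_form(form, "origin", "Baarn", confirmed=True))
print(update_form(form, "time", "09:00"))
```

Because the system only ever asks for whatever is still missing, the caller keeps the freedom to volunteer fields in any order, which is the property the form-filling paradigm is meant to provide.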
From a scientific point of view, original contributions are envisaged in robust speech recognition over the telephone, natural language processing and dialogue management in information-seeking dialogues. The first laboratory demonstrators of the complete system will be used in in-house experiments to collect corpora of human-machine conversations and to study the human factors aspects of this kind of interaction. In the area of speech recognition, the focus will be on signal processing techniques to remove channel characteristics, on the one hand, and on explicit modeling of pronunciation variation, on the other. With respect to the latter goal, emphasis will be placed on the type of speech that is characteristic of goal-directed information-seeking dialogues, where we expect pronunciation accuracy somewhere in between read speech and completely free conversational speech. As for NLP aspects of the system, three approaches will be compared, viz. the AI-type approach presently implemented in the Philips system, corpus-based parsing, and parsing using a conventional wide coverage grammar. Both the corpus-based and the wide coverage parser will be adapted to the type of language use prevailing in information-seeking dialogues in the public transport domain. Formal comparisons are planned of the three approaches, taking into account not only error rate, but also computational complexity. On the level of dialogue control, we will investigate how the communication with the user can be made more effective and user-friendly. In doing so, the functionality of dialogue control must be enlarged to allow simple forms of collaborative problem-solving dialogues. Finally, the question of how best to present complex travel schedules in spoken form will be investigated (or, alternatively, how spoken output can be combined with other output media).
Future versions of the demonstrator will have a completely modular architecture that will allow one to ``mix and match'' advanced automatic speech recognition with one's favourite natural language processing and dialogue management strategies. Future systems should have more powerful dialogue management capabilities, allowing collaborative problem-solving in addition to mixed-initiative form-filling. The domain in which the system can carry on conversations will be enlarged by including regional and local bus, tram and taxi transport facilities, and information on fares will also be included.
This programme has obvious relations with the MLAP projects MAIS and RAILTEL. (Please refer to the March 1995 issue of ELSNews for information about MAIS and RAILTEL).
Harry Bunt (Tilburg University) presented a paper entitled, Semantic and Pragmatic Interpretation in the DELTA Dialogue Project, which describes some recent ideas concerning the semantic and pragmatic interpretation of utterances in human-computer natural language information dialogues. This work, done as part of the ESPRIT DELTA project, is based on the view that language understanding systems are best designed with a `cascaded' organization where several levels of meaning representation are distinguished, at which different knowledge sources are consulted and different degrees of specificity in meaning are targeted.
Perhaps the most salient feature of the DELTA approach is that it considers both semantic and pragmatic interpretation to be heavily context-based, and does not take the resolution of ambiguity and vagueness as a goal in itself. Ambiguity is taken to be a virtue of natural language which should be exploited, not fought, and which should be eliminated only to the extent required by the processing context.
The paper is concerned with three issues and their relationship: (1) the pragmatic organization of dialogues and the communicative functions of utterances; (2) establishing and representing the semantic content of utterances in a dialogue; (3) the formal representation of dialogue context.
Ole Ravnholt (post-doc, Tilburg University) presented a paper entitled, Information Packaging in DELTA, in which he explains that `information packaging' is that part of the structure of sentences that tells the hearer (or reader) how the content conveyed is to be understood, i.e., how the speaker intends him to update his understanding of the ongoing discourse.
In English, and apparently also in the other Germanic languages, the most reliable cues (and those that may override other cues) used to mark information packaging are prosodic. The Delta dialogue system uses typed input, and therefore, like human readers, it does not have access to such cues. Besides prosody, a variety of `fancy' syntactic structures are used to mark special assignments of information structure, but in the syntactically unmarked cases, access to previous context is necessary in order to narrow down the set of syntactically possible assignments by excluding the pragmatically impossible ones.
In the Delta system, the information that is extracted from an input sentence and added to the discourse model contains (among other things) an HPSG parse of the sentence, information about its communicative function and the reactive pressures it deals with, and about the attitudes (beliefs and intentions) that it conveys.
Laila Dybkjær (Roskilde University) then presented work done jointly with Niels Ole Bernsen, Hans Dybkjær and Dimitris Papazachariou in a paper called, On the Use of Context in Building Spoken Language Dialogue Systems for Large Tasks. Context is of crucial importance to language understanding in general and plays a central role in spoken language dialogue systems design. Dybkjær et al. take the approach of viewing context as denoting a collection of aspects or contextual elements each of which may be defined and analysed with respect to its specific contribution to dialogue understanding. Massive exploitation of context is essential in spoken language dialogue systems design for large tasks because the feasibility of such systems demands a high degree of control of the user-system dialogue. The paper discusses in detail how knowledge about contextual elements is used in system-directed dialogue design to achieve an optimal trade-off between technological feasibility and user acceptability and to enable controlled steps in the direction of mixed-initiative dialogue.
Dimitris Papazachariou (post-doc, Roskilde University) presented work done jointly with Niels Ole Bernsen, Laila Dybkjær and Hans Dybkjær in a paper entitled, Identification of Speaker Actions in Mixed Initiative Dialogue. One of the aims in building spoken language dialogue systems is to allow the user to take the initiative when it is necessary. Such user initiative is sometimes needed in order to maintain the ``natural'' structure of the dialogue --- i.e., to allow the user to be more accurate, to check for alternatives that will help him/her to successfully complete the task, as well as to point out mistakes.
The paper reports on the collection of a small spoken dialogue corpus and the analysis of speech acts in the corpus. The task domain of the dialogues is that of informed flight reservation. Users phoned a simulated system in order to book flight tickets and to ask for the information they needed to make relevant decisions.
User utterances have been categorised with respect to the specific actions that the user is performing with them. This categorisation is justified by reference to Speech Act Theory. Finally the paper proposes a possible implementation of this work which will allow systems to handle a certain amount of mixed initiative.
Ernst Buchberger (post-doc, University of Stuttgart) presented a paper called, On the Use of Prosodic Cues in Discourse Processing, which reports how prosody (and prosodic cues) may help in the analysis and generation of discourse.
As background, the paper describes some of the approaches for the handling of local discourse structure (dealing with focus and the given/new-distinction), and also proposals dealing with global discourse structure (in particular, the use of prosodic cues in the segmentation of discourse). Buchberger's approach goes a step further by looking not only at the segmentation of discourse itself, but also at whether and how the relations of these discourse segments to one another might be signalled prosodically. These relations are sketched in the formal framework of Kamp's Discourse Representation Theory (DRT) and Asher's extension of DRT.
Connections are shown between the treatment of global and local discourse structure in DRT, and the benefits of this approach for analysis and for generation are pointed out.
Dimitris Galanis (Univ. of Patras) presented work done jointly with Vassilos Darsinos and George Kokkinakis in a paper entitled, Analysis and Generation of Intonation Patterns for Synthetic Speech Dialogues. This paper presents an analysis of F0 patterns used in a dialogue context, and a set of rules which might be used for improving the naturalness of a synthesized voice in a text-to-speech (TTS) system. The analysis is based on 60 context-dependent utterances collected from a spontaneous information exchange dialogue between two male speakers. The lexical components carrying focus information were determined through a manual focus assignment procedure. For the description and representation of the F0 contour characteristics, the utterances were segmented into three parts according to the position of the focus accent: pre-focal, focal and post-focal. The F0 contour within the three utterance domains was represented in terms of four F0-levels: FOCUS, HI, MID and LOW, thus enabling the definition of a focusing context and revealing the relative prominence relations among the accented syllables as well. Syntactic and contextual information was used for modeling different intonation patterns. A set of rules for the F0 contour generation was established and implemented in an existing TTS system.
The implementation of this method was based on the assumption that focus knowledge sources are available to the system and that they are used in combination with the already extracted linguistic information in order to determine the F0 pattern. Evaluation results have shown that the F0 curves can be regenerated quite accurately in comparison to the original contours. In addition, preliminary listening tests have shown that listeners judge the synthetic output favourably with respect to the placement and perception of focus.
Georg Niklfeld (University of Vienna) presented work done jointly with Hannes Pirker and Harald Trost in a paper entitled, Using two-level morphology as a generator-synthesizer interface in concept-to-speech generation. The paper describes an integrated morphological and phonological component based on two-level-morphology for an experimental German concept-to-speech (CTS) system. This system consists of a unification-based linguistic generator and a demi-syllable-based synthesizer. The aim of the component described in the paper is to mediate between the semantic/syntactic description produced by the linguistic generator and the phonetic/acoustic speech synthesizer. The component deals with morphotactics, morphophonology, syllabification and stress, and bridges the gap between the feature-based descriptions produced by the generator on the one hand and the synthesizer on the other. In particular, it uses morphological and phonological knowledge to produce a phonological description that conveys segmental and prosodic information which can then be used to drive the speech-synthesis component.
In contrast to TTS systems, CTS systems provide linguistic as well as pragmatic information about the message to be uttered. This is because CTS systems are generally used in restricted domains with a small vocabulary, and thus rely on a small, but information-rich lexicon. This information can be exploited to improve the quality of the synthesized utterances.
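For readers unfamiliar with two-level morphology, the basic idea is that each lexical symbol is paired with a surface symbol, and rules constrain which pairs are allowed in which contexts. The Python sketch below illustrates this with one textbook example, German word-final devoicing. It is not taken from the paper: the names `align` and `surface`, the rough phonemic spellings, and the restriction to word-final position are assumptions made for this example, and a real two-level system would compile its rules into finite-state transducers rather than apply a single rule procedurally.

```python
from typing import List, Tuple

# Lexical voiced obstruents and their devoiced surface counterparts.
DEVOICE = {"b": "p", "d": "t", "g": "k"}
BOUNDARY = "+"  # morpheme boundary on the lexical level
NULL = "0"      # "realised as nothing" on the surface level


def align(lexical: str) -> List[Tuple[str, str]]:
    """Pair every lexical symbol with its surface symbol.

    Two illustrative correspondences are used:
      +:0  a morpheme boundary is never pronounced
      d:t  (likewise b:p, g:k) only at the very end of the word
    Every other symbol is paired with itself.
    """
    pairs = []
    last = len(lexical) - 1
    for i, symbol in enumerate(lexical):
        if symbol == BOUNDARY:
            pairs.append((symbol, NULL))
        elif symbol in DEVOICE and i == last:
            pairs.append((symbol, DEVOICE[symbol]))
        else:
            pairs.append((symbol, symbol))
    return pairs


def surface(lexical: str) -> str:
    """Read the surface string off the aligned pairs, dropping nulls."""
    return "".join(s for _, s in align(lexical) if s != NULL)


print(surface("hund"))    # hunt
print(surface("hund+e"))  # hunde
```

The same lexical stem thus yields a devoiced final consonant in the singular and a voiced one in the plural, which is the kind of alternation a generator-synthesizer interface has to resolve before the synthesizer is driven.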
Paul McKevitt (Sheffield University) presented a paper entitled, From Chinese Rooms to Irish Rooms, which points out that lexicons are typically structured as sequences of natural language words whose content is defined using flat symbolic descriptions in natural language, rather than encoding pictures for words as we do in our heads. There is now a move towards integrated systems in Artificial Intelligence (AI), and this will create an urgent need for lexical research on integrated lexicons. Lexicons must move towards a situation where natural language words are also defined in terms of spatial and visual structures. These spatial and visual structures will solve what have been two of the most prominent problems in the field of Natural Language Processing (NLP) for years: (1) Where are symbolic semantic primitive meanings in computer programs grounded? and (2) Why do some words in dictionaries have circular definitions so that words end up defining each other? Integrated lexicons are expected to solve both these problems and hence help solve Searle's ``Chinese Room'' problem and move towards the ``Irish Rooms'' of people like James Joyce.
On February 14-15, 1995, a workshop on discourse and dialogue prosody was held at Stuttgart, Germany. It was one of the activities within ELSNET's HCM project, Scientific Cooperation in the European Network in Language and Speech, aiming at a Scientific Network which contributes to the foundations of the next generation of spoken language systems. The project comprises 18 partner institutions, one of which is the Institute of Natural Language Processing (IMS) at Stuttgart University, engaged in work on spoken dialogue and discourse. The workshop was organized by the Chair in Experimental Phonetics at IMS, Grzegorz Dogil, and the local organization was handled by Ernst Buchberger.
The first day of the workshop was devoted to tutorial talks by invited speakers, starting with Julia Hirschberg from AT&T. She presented an overview of discourse prosody, drawing on her own research in the field and related work. She presented evidence for a correlation between prosodic features and discourse features, but stressed the need for more research, especially empirical studies of discourse and dialogue and cross-linguistic studies.
Gösta Bruce (Lund) spoke about dialogue prosody, based on work performed in the Prosodiag (Prosodic segmentation and structuring of dialogue) project. He mentioned differences in prosodic signalling between spontaneous speech and read speech.
Daniel Hirst's (Aix-en-Provence) talk was entitled, Levels of Representation and Levels of Analysis for Intonation. It featured a multilingual parametric approach realized in a number of bi- and multilateral research projects. He presented the INTSINT transcription system of intonation which permits absolute and relative notations, and a method for the representation of microprosody. Grzegorz Dogil (Stuttgart) spoke about prosodic cues to ``alternative'' semantics in DRT, showing the influence of prosodic marking on the interpretation of sentences represented in Discourse Representation Theory.
After lunch, which was taken in small groups at local restaurants, giving the possibility for less formal, more in-depth communication between research groups, the afternoon session featured talks by Anton Batliner and Elmar Nöth, Wolfgang Hess, and Dieter Huber. Anton Batliner and Elmar Nöth (Munich/Erlangen) spoke about the use of prosody in speech recognition systems, drawing on examples from the EVAR, SUNDIAL and VERBMOBIL projects. They claimed that in spite of a lot of research in the field of prosody, acoustic/phonetic decoding is still many years ahead of prosodic decoding.
Wolfgang Hess (Bonn) gave an overview of a number of methods for realizing prosody in speech synthesis systems. He made the interesting observation that users tend to prefer systems incorporating prosody, even if that means a decrease in intelligibility.
As the last speaker of the day, Dieter Huber (Mainz) presented Prosodic Transfer in Spoken Language Interpretation. Starting by stating the functions of prosody, he presented experiences drawn from work in Japan showing a number of differences in prosody between the languages involved (Japanese, English), making clear the need for rules of prosodic transfer.
The morning of the second day saw a panel discussion about various aspects of hardware and software integration of prosody in dialogue and discourse systems. A number of topics were discussed, among them:
Finally, I would like to say I enjoyed two days of intensive talks and discussions and I appreciated the widespread response to the Call for Participation. While many of the more than 40 participants came from Germany, in total we had participants from seven European countries (and one speaker from the US), making the workshop a European event which has contributed to fostering information dissemination on a multi-national level and strengthening the European research community in this area.
I would personally like to thank the initiator of the workshop, Grzegorz Dogil, our secretary, Sabine Schmid, and all colleagues at the institute for their help in organizational matters. The institute gratefully acknowledges financial support for part of the speakers' costs from ELSNET funds.
ELSNET's role for industry: Five industrial representatives attended the March meeting to discuss ELSNET's past and future role with respect to the European speech and language industry. A summary of this discussion will be sent upon request [email@example.com].
Continuation of ELSNET/Internal review: This year ELSNET will apply for continuation after 1995. A draft proposal is expected in early July. All ELSNET nodes will shortly be invited to express their view on their future role in ELSNET. In preparation for submitting a continuation proposal, as well as in preparation for ELSNET's external review by the Commission (late 1995), an internal review was held. The results of this review have been summarised in writing. A copy of this summary will be sent upon request [firstname.lastname@example.org].
Legal Status: A legal entity ELSNET will be created as soon as possible. The following two-step procedure will be followed: a foundation will be formally created by notarial act; by-laws will be worked out as soon as the articles are approved by the EB (expected in July). The draft articles will be communicated to all ELSNET nodes.
Resources: Now that ELRA has gotten off the ground, ELSNET's (future) role with respect to language resources was reconsidered. It was decided to investigate whether ELSNET should continue activities in the area of resources that will not be covered by ELRA, and it was emphasized that the network's basic principle of `sharing' should be taken into account.
Industrial Links and Information Dissemination: The Industrial Links TG and the Information Dissemination TG will enhance ELSNET's WWW pages structure by increasing the collection of pages, and by adding appropriate hooks and associated services, including a ``first port of call'' service to locate the right contact within ELSNET. This service will be monitored in order to determine whether there is a need for an extended brokerage service. ELSNET will subsidise the organisation of introductory NLP sessions aimed at industrial participants attending EUROSPEECH `95 in Madrid. The degree of interest of companies in student placement will be investigated. A more extensive list of Eastern European speech and language companies will be compiled [email@example.com].
Research: The ELSNET Research TG is preparing a proposal in the area of ``evaluation procedures'' within the LE programme ``Telematics Applications'', aiming at the development and implementation of an evaluation infrastructure on a European scale.
Training: A proposal for the 1996 ELSNET Summer School will be available soon. Since ELSNET goes East has some funds for the organisation of a Summer School, the possibility of having a joint Summer School in Eastern Europe is being investigated. The proposed subject is ``Dialogue Systems'' [firstname.lastname@example.org].
Training and industry: The Training TG and the Industrial Links TG will look into the possibility of setting up a modular course taught at a variety of sites across Europe, aimed at industrial participation; they will investigate the possibility of getting funds under TMR to implement this initiative [email@example.com, firstname.lastname@example.org].
ELSNET and ELSNET goes East: The EB approved the following five Eastern European representatives to take part in ELSNET's Task Groups: Klara Vicsi (Hungary) for Training, Eva Hajicova (Czech Republic) for Resources, Leonid Iomdin (Russia) for Information Dissemination, Ryszard Gubrynowicz (Poland) for Research, and Gabor Proszeky (Hungary) for Industry. ELSNET goes East will organise a small exchange programme; Eastern Europeans will be given the opportunity to visit or to work in Western Europe [email@example.com].
The disk contains freeware and demo programs for a number of different musical purposes. It also presents many sound examples, some classical, some new, which demonstrate characteristics of musical sounds. For example, you can listen to tones played on different instruments and see the spectrum simultaneously. You can try programs for composition, analysis, synthesis and performance.
In order to use the CD, you will need: an Apple Macintosh with at least 8 MB of RAM; a 13-inch or larger colour display with 256 colours (8 bits); a CD ROM drive; and system software 7.0.1 or later.
The book will present the views of 31 eminent researchers in the field, who all understood the importance of bringing their excitement and vision to a broad audience of students and colleagues. With the help of 27 ``country-editors'', an unprecedented overview will be given of studies in Phonetics and Speech Communication in most European countries, from Ireland to Russia. For each country a short overview is presented of historical developments in research, its present status, the educational system, and advice for exchange students. This is followed by a description of institutes in each country, including history and local characteristics, research interests and programme of studies. The book will conclude with an overview of the elements of studies, as an ordered list of subjects with a short introduction and a list of key texts for each of 14 main sections.
This 500-page book will be an invaluable source of information for students and staff members, not only in Europe but all over the world. The book will be available from August 1995 and will be for sale at the Phonetics Congress in Stockholm and at Eurospeech 95 in Madrid for the price of only 15 ECU (including mailing).
June 23-28, 1995: Fifth Toulouse International Workshop on Time, Space and Movement: Meaning and Knowledge in the Sensible World. Chateau de Bonas, Gascony, France. Deadline for paper submissions: 10/2/95. For information, contact: TSM95, c/o Mario Borillo, IRIT, Université Paul Sabatier, 118 route de Narbonne, F-31062 Toulouse Cedex, France, Tel: +33 61 55 60 91, Fax: +33 61 55 83 25, Email: firstname.lastname@example.org.
June 29-30, 1995: Workshop on Intelligent Computer Communication (ICC 95). Cluj-Napoca, Romania. Deadline for paper submissions: 6/3/95. For information, contact: Ioan Alfred Letia, Department of Computer Science, Technical University of Cluj-Napoca, Baritiu 26, RO-3400 Cluj-Napoca, Romania, Tel: +40 64 194 684, Fax: +40 64 192 055, Email: email@example.com.
July 5-7, 1995: The Sixth International Conference on Theoretical and Methodological Issues in Machine Translation (TMI95). University of Leuven, Belgium. For information, contact: Centre for Computational Linguistics, Maria-Theresiastraat 21, B-3000, Leuven, Belgium, Fax: +32 16 32 50 98, e-mail: firstname.lastname@example.org.
July 10-21, 1995: Third European Summer School on Language and Speech Communication. For information, contact: Dawn Griesbach, Centre for Cognitive Science, 2 Buccleuch Place, Edinburgh EH8 9LW, UK, Email: email@example.com.
July 11-13, 1995: International Association for Machine Translation's MT Summit V. For information, contact: P. van den Daele, Fax: +32 2 512 1076.
August 13-19, 1995: Thirteenth International Congress of Phonetic Sciences (ICPhS 95). Stockholm, Sweden. Deadline for submission of abstracts: 1/11/94. For information, contact: ICPhS 95, c/o Congrex, PO Box 5619, S-114 86 Stockholm, Sweden, Fax: +46 8 612 6292, Email: firstname.lastname@example.org.
August 14-25, 1995: Seventh European Summer School in Logic, Language and Information (ESSLLI95). Barcelona, Spain. For information, contact: ESSLLI95, GILCUB, Avda. Vallvidrera 25, 08017 Barcelona, Spain, Tel: +34 3 203 3597, Fax: +34 3 205 4656, Email: email@example.com.
August 28-September 8, 1995: Speechreading by Man and Machine: Models, Systems and Applications: A NATO Advanced Study Institute. Chateau de Bonas, France. For information, contact Dr. David G. Stork, Ricoh California Research Center, 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025-7022, USA, Fax: +1 415 854 8740. Deadline for registration is May 23, 1995. Registration forms can be obtained from ftp://ftp.crc.ricoh.com/asi/register.ps.Z, or via the World Wide Web. An ASCII version of the form may be requested from, and submitted via, firstname.lastname@example.org.
September 18-21, 1995: EUROSPEECH 95. Madrid, Spain. For information, contact: EUROSPEECH 95 Secretariat, CONGRHISA, C/. Serrano, 240, E-28016 Madrid, Spain, Tel: +34 1 457 61 12, Fax: +34 1 457 01 73.
September 18-23, 1995: International Conference on Cognitive Processes in Spoken and Written Communication: Theories and Applications. Crimea, Ukraine. For information, contact: Dr. Dikareva Svetlana, Simferopol State University, Dept of Russian Language, Yaltinskaya 4, Simferopol, Crimea, Ukraine 333036, Tel: +7 0652 23 03 84, Fax: +7 0652 23 21 69, Email: email@example.com.
September 28-30, 1995: First Conference on Formal Approaches to South Slavic Languages. Plovdiv, Bulgaria. Deadline for submission of abstracts: 28/2/95. For information, contact Prof. Yordan Pencev/Maria Stambolieva, Institute for Bulgarian, Bulgarian Academy of Sciences, 52 Shipchenski proxod, bl 17, BG-1113 Sofia, Bulgaria, Email: firstname.lastname@example.org, email@example.com.
NL Utrecht University (coordinator)
A ARIAI/Univ. Vienna
B University of Antwerp
B University of Leuven
BU Bulgarian Acad. of Sciences, Sofia
CH IDSIA, Lugano
CH ISSCO, Geneva
CZ Charles University, Prague
D Univ. des Saarlandes, Saarbrücken
D Univ. Hamburg
D Univ. Kiel
D Univ. of Stuttgart
D Ruhr-Univ. Bochum
D Univ. Erlangen
DK Ctr for Sprogteknologie, Copenhagen
DK Speech Technology Ctr, Aalborg Univ.
DK Ctr for Cognitive Science, Roskilde Univ.
E Univ. Politecnica de Catalonia/Univ. Autonoma de Barcelona
E Univ. Politecnica de Madrid
E Univ. Politecnica de Valencia
F LIMSI-CNRS, Orsay
F Univ. Paul-Sabatier/IRIT, Toulouse
F Inst. de la Comm. Parlee, Grenoble
F Lab. Parole et Langage/CNRS, Aix-en-Provence
F CRIN/INRIA Lorraine, Nancy
GR ILSP/NCSR ``Demokritos'', Athens
GR Wire Communications Lab., Patras
H Hungarian Acad. of Sciences, Budapest
I Ist. di Linguistica Computazionale, Pisa
I IRST, Trento
I Fondazione Ugo Bordoni, Rome
IRL University College Dublin
N University of Trondheim
NL Stichting Spraaktechnologie, Utrecht
NL Inst. for Perception Research, Eindhoven
NL Leyden Univ.
NL Catholic Univ. of Nijmegen
NL Univ. of Amsterdam
NL Inst. for Language Technology & AI (ITK), Tilburg
P INESC/ILTEC/Univ. Nova de Lisboa
RO Research Inst. for Informatics, Bucharest
S KTH, Stockholm
S Univ. of Linkoping
UK Defence Research Agency, Malvern
UK UMIST, Univ. of Manchester
UK Univ. of Cambridge
UK Univ. College London/School of Oriental and African Studies (SOAS)
UK University of Edinburgh
UK Univ. of Essex
UK Univ. of Dundee
UK Univ. of Leeds
UK Univ. of Sheffield
UK Univ. of Sussex
UK Univ. of York
B Lernout & Hauspie Speech Products
D Aspect GmbH
D CAP debis
D Daimler-Benz AG
D Electronic Publishing Partners GmbH
D Grundig Professional Electronics GmbH
D IBM Deutschland
D Langenscheidt
D Novotech GmbH
D PC-Plus Computing
D Philips Research Laboratories
D Siemens AG
DK Jydsk-Telefon
E ENA Telecomunicaciones
E Telefonica I & D
F ACSYS
F Aerospatiale
F CAP Gemini Innovation
F GSI-ERLI
F LINGA s.a.r.l.
F Memodata
F Systran SA
F TGID
F VECSYS Speech Processing
F Rank Xerox Research Center
GR Knowledge A.E.
H Morphologic
I CSELT
I Database Informatica
I Sogei
I Syntax Sistemi Software
I Tecnopolis CSATA Novus Ortus
NL PTT Research Laboratories
P Uninova CRIA
S Infovox AB
UK Aldus Europe, Ltd.
UK ALPNET UK, Ltd
UK BICC plc
UK BT Laboratories
UK Cambridge Algorithmica Ltd.
UK Canon Research Centre Europe Ltd.
UK Ensigma Ltd.
UK Hewlett-Packard Labs
UK Logica Cambridge Ltd.
UK Sharp Laboratories
UK Vocalis Ltd.
Comments and suggestions for new WWW pages are very welcome. In particular, each ELSNET site coordinator is encouraged to send details of his or her site's home page so that a hyperlink might be set up to it from the ELSNET home page.