ELSNET-list archive

Category:   E-CFP
Subject:   ACL-2004 Workshop: Discourse Annotation
From:   Donna Byron
Email:   dbyron_(on)_cis.ohio-state.edu
Date received:   22 Jan 2004
Deadline:   22 Mar 2004
Start date:   25 Jul 2004

Discourse Annotation
A Workshop in conjunction with ACL'04 in Barcelona, Spain
----------------------------
Workshop date: July 25-26, 2004
Full paper submissions due: March 22, 2004
Workshop website: http://www.cllt.osu.edu/dbyron/acl04
----------------------------

WORKSHOP OVERVIEW:

Advances in language technology draw on a combination of annotated empirical data and linguistic theory. The richer the annotation, the more that can potentially be learned and applied to unseen data. Thus the Penn TreeBank (PTB), with its part-of-speech (POS) tags and syntactic annotation, has been more useful than corpora annotated for POS-tags alone, and PropBank, in which PTB is annotated with predicate-argument relations, will be useful for more applications than the PTB alone.

Two gross features of PTB and PropBank are that they annotate sentence/clause-level features and that they were undertaken with communal agreement (albeit somewhat contentious at first). Similar, largely communal projects have been undertaken for dialogue annotation, including MATE (now NITE).

Discourse annotation (in contrast with sentence-level annotation) has taken a somewhat different course. While an early communal effort (DRI) to annotate discourse structure according to a consensus framework failed to achieve its goal, recognition remained of the value of discourse annotated corpora. The result has been that diverse grass-roots efforts have been producing individual corpora annotated for a wide variety of phenomena such as

- referring/attributive expressions and coreference;
- spatial/temporal expressions and spatial/temporal relations;
- other anaphoric and/or elliptic expressions and their discourse dependencies;
- discourse units and their relations to one another;
- information structure themes and the themes/rhemes that license them;
- discourse connectives and what they connect;
- contexts of interpretation;
- cognitive accessibility scales (e.g. animacy);
- types of speech (direct, indirect, free indirect).

Groups involved in these efforts appear to be using (or planning to use) these corpora for a range of applications that include:

- empirical testing of theoretical claims/hypotheses;
- supporting second-language acquisition of discourse-sensitive linguistic devices;
- training resolution procedures for co-referring expressions or other anaphors that can be used in annotating additional texts or in supporting technologies such as information extraction, question answering, summarization, and/or text generation;
- training discourse parsers that can be used for annotating additional texts or for reducing the amount of manual effort needed in the process;
- probabilistic sentence and text realization.

The workshop is neutral as to whether consensus annotation is possible for every type of discourse phenomenon. Its aims are rather to:

- bring a fuller range of discourse annotation activity to the attention of researchers working on discourse phenomena and their usefulness for language technologies;
- highlight tools used in the annotation process or used to display or further analyse the results of annotation;
- discuss obstacles to some (all?) forms of discourse-level annotation, such as the greater subjectivity that seems involved in making judgments related to, for example, bracketing and labelling;
- identify gaps in this work (e.g., in the range of genre being annotated);
- stimulate researchers with respect to the uses other researchers are putting their data to;
- discuss (in small groups and in feedback sessions) whether we already have, or could together create, a significantly large, reusable corpus (or set of corpora) annotated for multiple discourse and sentence-level phenomena, as a much richer basis for both assessing theories and building better tools.

With these aims in mind, we solicit papers on:

- discourse annotation projects (in any language);
- uses made of discourse annotated corpora, alone or together with other forms of annotation;
- tools for discourse annotation (e.g., for assisting manual annotation or for (semi-)automating the process) or for analysing discourse annotated data;
- tools for integrating layers of annotation (different types of word-, sentence-, and discourse-level markup);
- requirements for annotated corpora from the perspective of computational linguistics (e.g., vis-a-vis data sharing, comparison, integration/alignment, etc.);
- experiments with integrating and exploiting different layers of annotation (from word to discourse level).

In addition to being presented, the papers will be used to structure the above-mentioned small group discussions and feedback sessions.

----------------------------
Format for Submissions

Submissions are limited to original, unpublished work. Submissions must use the 2-column ACL LaTeX style or Microsoft Word style (see submission style files at http://www.acl2004.org/aclstyles/style.html). Paper submissions should consist of a full paper (up to 8 pages in length, including references, with a minimum font size of 10 point). Papers that exceed the specified length may be rejected without review. The paper should be written in English.

----------------------------
Submission Questions

Please send submission questions to the co-chairs:
bonnie_(on)_inf.ed.ac.uk
dbyron_(on)_cis.ohio-state.edu

----------------------------
Submission Procedure

Electronic submission only: send the PDF (preferred), PostScript, or MS Word form of your submission to Donna Byron (dbyron_(on)_cis.ohio-state.edu). The Subject line should be "ACL2004 WORKSHOP PAPER SUBMISSION".

N.B. If you use any special fonts, please include them with your PDF submission. Otherwise reviewers may have unnecessary problems with printing.

----------------------------
Deadlines:

Paper submission deadline: Mar 22, 2004
Notification of acceptance for papers: April 30, 2004
Camera ready papers due: May 24, 2004
Workshop date: Jul 25, 2004

----------------------------
PROGRAMME COMMITTEE

Bonnie Webber, University of Edinburgh (co-chair)
Donna Byron, Ohio State University (co-chair)
Steven Bird, Melbourne University
Liesbeth Degand, University of Louvain
Eva Hajicova, Charles University
Aravind Joshi, University of Pennsylvania
Andrew Kehler, UC San Diego
Daniel Marcu, ISI
Katja Markert, Leeds University
Malvina Nissim, Edinburgh University
Livia Polanyi, FXPAL
Frank Schilder, University of Hamburg
Andrea Setzer, Sheffield University
Wilbert Spooren, Free University of Amsterdam
Manfred Stede, University of Potsdam
Michael Strube, EML Research, Heidelberg
Martin van den Berg, FXPAL
Annie Zaenen, PARC

----------------------------
CONTACT INFORMATION:

Professor Bonnie Webber
School of Informatics
University of Edinburgh
2 Buccleuch Place
Edinburgh EH8 9LW
UK
email: bonnie_(on)_inf.ed.ac.uk
phone: +44 131 650 4190
fax: +44 131 650 4587

Professor Donna Byron
Department of Computer and Information Science
The Ohio State University
395 Dreese Laboratory
2015 Neil Avenue
Columbus, Ohio 43210
USA
email: dbyron_(on)_cis.ohio-state.edu
phone: 614-292-6370
fax: 614-292-2911

--
Dr. Donna K. Byron
Assistant Professor
OSU Computer and Information Science
Ph: 614-292-6370
Fax: 614-292-2911
Website: www.cis.ohio-state.edu/%7Edbyron

___________________________________________________________________
ELSNET's mailing list elsnet-list is intended for those who are working in the field of language and speech technology.
Send your messages to elsnet-list_(on)_elsnet.org
Visit http://www.elsnet.org/list.html to search the archives.
Use http://www.elsnet.org/subscriptions.html to (un)subscribe.
Go to http://www.elsnet.org for more information about ELSNET.

