Natural Language Processing (NLP) for Question-Answering
(A Workshop in conjunction with EACL2003)

Co-chairs:
Maarten de Rijke, University of Amsterdam
Bonnie Webber, University of Edinburgh

INTRODUCTION

Modern, corpus-based "open-domain" question answering (QA) aims to move from retrieving relevant documents to providing targeted information, supported by documentary evidence. Specifically, QA systems are given a large document set and a set of questions. For each question, the system returns an answer consisting of one or more text strings, each supported by the document in which it was found.

Most modern QA systems use the following general pipelined approach: The system first attempts to classify a question according to the type of its answer. It then uses Information Retrieval (IR) technology, with the question as a query, to retrieve a small portion of the document collection. It then analyses the returned documents to detect entities of the appropriate type. The final answer justification step uses a range of techniques and resources, including a mixture of location-based scoring methods, lexical information, and/or shallow inferencing.

Could more intensive use of NLP, or using it in better ways (either on its own or combined with techniques from Machine Learning or automated inference), help to improve performance in open-domain QA, or to broaden the range of questions that can be addressed effectively, with a better understanding of what constitutes an answer?

For example, one of the biggest challenges in TREC-style QA is overcoming the surface string mismatch between the question formulation and the string containing its answer. For some question/answer pairs, deep reasoning is needed to relate the two, and many QA systems attack this challenge with rich linguistic resources such as parsers, WordNet, on-line dictionaries and gazetteers, and ontologies.
But even for languages such as English, existing linguistic resources have been found to be incomplete, error-prone and/or in constant need of updates. And for other languages, such resources may be sparse or missing altogether. This has led to the idea of allowing the data, instead of the methods, to do most of the work, avoiding complex linguistic processing and hence the need for rich linguistic resources. Recent studies suggest, for example, that approximately 500 Gb of web data suffices to perform TREC-style QA using redundancy-based approaches. Does this mean that NLP is now out of the picture?

The aim of the workshop is to enable participants to hear about, discuss and assess where NLP can make a contribution to current and future QA.

* Is the role of NLP restricted to domains where the amount of available data is insufficient for redundancy-based approaches to work -- where one needs to answer a question based on 1 or 100 or 1000 documents rather than millions?
* Are there kinds of questions for which NLP is needed, such as subjective questions, temporal questions, why questions, questions involving assessment, information fusion, etc.?
* Should NLP be seen as a single module within QA, or do different parts of QA benefit from the output of different NL processes (e.g., NP chunking, computing dependency relations, resolving anaphors, deriving (quasi-)logical forms, etc.)?
* Can NLP be made to exploit the semi-structured nature of more and more web-based documents in QA?
* Can QA systems use NLP to exploit the emergence of the semantic web?
* Can QA take anything from previous work on Natural Language interfaces to databases? Is there potential synergy between the two?
* In evaluating system performance, can NLP provide new methods of answer assessment, to help move QA evaluation beyond time-consuming manual assessment?

CALL FOR PAPERS

We welcome short papers (4-6 pages) describing and, ideally, also assessing NLP techniques applied to any aspect of QA.
We also welcome short notes (2-4 pages) or papers (4-6 pages) describing systems that can be demonstrated at the workshop.

WORKSHOP FORMAT AND INTENDED AUDIENCE

This will be a one-day workshop, with a series of paper presentations in the morning, and with demonstrations and further presentations in the afternoon. The intended audience is academic and industrial researchers working in the area of QA and students looking for research topics in QA.

SUBMISSION FORMAT

Submissions should follow the same format as for the EACL conference itself (http://www.elsnet.org/workshops/format.html). All documents (submissions and final copy) must be in either PS, PDF, or DOC format. (We have a strong preference for PS or PDF.) However, there is no need to obtain paper numbers, and submissions do not need to be anonymized. They should be sent electronically to mdr@science.uva.nl by the deadline shown below.

Hard copies will be accepted only if the authors explicitly make such arrangements with the co-chairs at least one week prior to the official submission date. In that case, the hard copies will still have to arrive by the submission date.

IMPORTANT DATES

Papers due: Tuesday, 7 January 2003
Notification of acceptance/rejection: Tuesday, 28 January 2003
Deadline for camera-ready copy: 13 February 2003
Workshop: 13 April 2003

REGISTRATION

Please see the main conference page (http://www.conferences.hu/EACL03/) for registration details.

PROGRAM COMMITTEE

Steve Abney, University of Michigan
Johan Bos, University of Edinburgh
Eric Brill, Microsoft Research
Sabine Buchholz, Tilburg University
Charles Clarke, U. of Waterloo
Oren Etzioni, U. of Washington
Claire Gardent, CNRS Nancy
Brigitte Grau, LIMSI
Donna Harman, NIST
Bernardo Magnini, ITC-IRST
Mark Maybury, MITRE
Dan Moldovan, U Texas at Dallas
Maarten de Rijke, U. of Amsterdam
Karen Sparck-Jones, Cambridge University
Bonnie Webber, University of Edinburgh

For further information, please contact:

Maarten de Rijke
Language and Inference Technology group
ILLC, U. of Amsterdam
Nieuwe Achtergracht 166, 1018 WV Amsterdam, NL
Tel: +31 20 525 5358
Fax: +31 20 525 2800
Email: mdr@science.uva.nl

Bonnie Webber
School of Informatics
University of Edinburgh
2 Buccleuch Place
Edinburgh UK EH8 9LW
Tel: +44 131 650 4190
Fax: +44 131 650 4587
Email: bonnie.webber@ed.ac.uk