SECOND CALL FOR PAPERS

COLING 2004 WORKSHOP ON COMPUTATIONAL APPROACHES TO ARABIC SCRIPT-BASED LANGUAGES

Geneva, Switzerland, 23-27 August 2004
Submission Deadline: 25 March 2004
http://members.cox.net/ karinem/COLING2004

WORKSHOP DESCRIPTION

Recently, there has been a surge of interest in the study of the languages of the Middle East, especially Arabic, Persian (Farsi), Pashto, Kurdish and Urdu. This sudden and urgent interest is reflected in the availability of funding for the rapid development of practical systems for processing large volumes of data in these languages. Computational applications for proper name identification, entity recognition, categorization, information retrieval, summarization, machine translation and other tasks are currently in high demand. This comes at a time when advances in formal and computational linguistics over the last fifty years are being consolidated, while work on machine learning and statistical methods has been showing great promise.

Although there exists a considerable body of work in computational linguistics specifically targeted at these Middle Eastern languages, much of the research and development has been the result of initiatives by individual research establishments or industry firms. Furthermore, the use of the Arabic script gives rise to certain issues that are common to all these languages, even though they belong to distinct language families. These languages share properties such as the absence of capitalization, right-to-left direction, lack of clear word boundaries, complex word structure, a high degree of ambiguity due to the non-representation of short vowels in the writing system, and related encoding issues.
The goal of this workshop is to provide a forum for those involved in the development of NLP systems for Arabic script languages to exchange ideas, approaches and implementations of computational systems; to discuss the common challenges faced by all practitioners; and to assess the state of the art in the field. A further aim of the workshop is to identify promising areas for future collaborative research in the development of NLP systems for Arabic script languages. Solutions designed to solve the specific problems of these languages could very well have wider applications and relevance to the rest of the NLP community.

WORKSHOP TOPICS

Authors of papers in any area of NLP in Arabic script-based languages are encouraged to submit. We encourage submissions dealing with language-specific issues, as well as discussions of challenges imposed by the use of the Arabic script. Papers dealing with various methodologies such as statistical approaches, shallow parsing and linguistically based analyses are encouraged. Submissions may address, but are not limited to, any of the following topics:

· Morphological analysis
· Syntactic ambiguity resolution
· Machine translation from and to Arabic script languages
· Sense disambiguation
· Homograph resolution
· Semantic analysis
· Entity recognition
· Information retrieval
· Classification of documents
· Text mining
· Summarization
· Speech recognition and generation
· Lexical databases
· Knowledge and domain representation
· Spelling and grammar checking tools

Proposals for formal demonstrations of advanced operational systems as well as research prototypes are welcome.

SUBMISSION REQUIREMENTS

Papers should be original, previously unpublished work and should not identify the author(s). They should be no longer than 8 pages (including figures and references) and should emphasize completed work rather than intended work.
Papers that are being submitted to other conferences must state this on the title page. Submissions are limited to one individual and one joint paper per author.

Demonstration proposals should give a short description of the system, provide its technical specifications and indicate how the demonstration illustrates new ideas and contributes to computational work on Arabic-script languages. Proposals must not exceed 4 pages.

Email submissions (ps or pdf) are preferred and should be sent to both AliFarghaly_(on)_aol.com and karinem_(on)_inxight.com. Submissions should be in English. Papers should be attached to an email giving the contact information for the author(s) and the paper's title. The hardware, software and network requirements for system demonstrations should also be indicated in the text of the email. Formatting requirements for the final version of accepted papers will be posted as soon as they become available.

Hardcopy submissions should be sent to:

Ali Farghaly
SYSTRAN Software, Inc.
9333 Genesee Ave, Pl 1
San Diego, CA 92121
USA

PROCEEDINGS AND WORKSHOP ORGANIZATION

Accepted papers and formal demonstrations will be published in a proceedings volume. For the workshop to take place, the COLING 2004 organizers require at least 20 participants to register for it. Speakers and participants are therefore asked to register via the official COLING 2004 site as soon as possible.

IMPORTANT DATES

Submissions due: March 25th, 2004
Notification date: April 25th, 2004
Deadline for camera-ready copy: May 25th, 2004

ORGANIZING COMMITTEE

This workshop is organized by:

Ali Farghaly (SYSTRAN Software, Inc.)
Karine Megerdoomian (Inxight Software and University of California San Diego)

The call for papers as well as future information on the workshop can be found at http://members.cox.net/ karinem/COLING2004

PROGRAM COMMITTEE

Jan W. Amtrup, Bowne Global Solutions
Tim Buckwalter, Linguistic Data Consortium
Miriam Butt, Konstanz University, Germany
Violetta Cavalli-Sforza, Carnegie Mellon University
Joseph Dichy, Lyon University
Abdel Kadir Fassi Fehri, Arabization Bureau, Rabat, Morocco
Andrew Freeman, University of Washington
Nizar Habash, University of Maryland, College Park
Masayo Iida, Inxight Software, Inc.
Simin Karimi, University of Arizona
Martin Kay, Stanford University
Kevin Knight, USC/Information Sciences Institute
Farhad Oroumchian, University of Wollongong in Dubai
Ahmed Rafea, The American University in Cairo
Jean Senellart, SYSTRAN Software
Bonnie Glover Stalls, University of Southern California
Rémi Zajac, SYSTRAN Software

