Subject: Second CFP: ECML/MLNET Workshop on Empirical Learning of NLP Tasks
Date: Thu, 09 Jan 1997 12:29:00 +0100 (MET)
From: Antal van den Bosch <antal@cs.unimaas.nl>
To: elsnet-list@let.ruu.nl

Please accept our apologies if you receive multiple copies of this message.

=========================================================================

               ECML'97 MLNET Familiarization Workshop
               April 26, 1997, Prague, Czech Republic

      Empirical Learning of Natural Language Processing Tasks
                             WORKSHOP

                  in cooperation with ACL SIGNLL

                   SECOND CALL FOR CONTRIBUTIONS

---------------------------------------------------------------------------
General information

The `Empirical Learning of Natural Language Processing Tasks' MLNET
Familiarization Workshop is held in conjunction with the 1997 European
Conference on Machine Learning, in Prague, April 23-26, 1997. The
programme will consist of invited keynote lectures, submitted
presentations, and moderated discussions. The time allocated to each
of these three components is roughly equal, to ensure ample time for
discussions and informal contact. 

The workshop is organised in cooperation with ACL SIGNLL.

---------------------------------------------------------------------------
Confirmed keynote events

The following keynote events have been confirmed:

 * Keynote speaker: Luc Steels (VUB, Brussels, Belgium):
   "Self-organisation in the origin and acquisition of language"

 * Key-panel: David Page (Oxford, UK),
              Luc De Raedt (Leuven, Belgium),
              Luc Dehaspe (Leuven, Belgium):
   "Inductive Logic Programming for tagging and parsing"

Information on keynote events and the workshop programme is continuously
updated on the workshop web page at http://www.cs.unimaas.nl/ecml97/.

---------------------------------------------------------------------------
Focus and motivation

It is becoming apparent that empirical learning of Natural Language
Processing (NLP) tasks can alleviate NLP's long-standing main problem,
viz. the knowledge acquisition bottleneck: for each new language,
domain, theoretical framework, and application, linguistic knowledge
bases (lexicons, rule sets, grammars) have to be built basically from
scratch. Empirical, symbolic machine learning methods such as rule
induction, top-down induction of decision trees, lazy learning, and
inductive logic programming seem to be excellently suited to
automatically learning (inducing) exactly the knowledge that is hard
to gather by hand.

We see at least three reasons why Machine Learning researchers should
become more interested in NLP as an application area.

 * Availability of Large Datasets. NLP problems provide realistically
   sized training sets for inductive algorithms. Datasets of tens or
   hundreds of thousands of instances are readily available.
   Traditional "benchmark datasets" usually contain far fewer
   instances.

 * Real-World Application. "Hand-crafting" NLP knowledge bases has
   proven to be infeasible or unaffordable for most practical
   applications. ML techniques may help in realising the enormous
   market potential for NLP applications.

 * Type of Complexity. Datasets describing language problems at
   all levels of description exhibit a complex interaction of
   regularities, sub-regularities, pockets of exceptions, idiosyncratic
   exceptions, and noise. As such, they are a perfect model for a large
   class of other poorly understood real-world problems (e.g. medical
   diagnosis) for which it is less easy to find large amounts of data.

---------------------------------------------------------------------------
Topics

The workshop addresses the following topics:

 * Suitability of different empirical Machine Learning paradigms (ILP,
   Lazy Learning, TDIDT, supervised-learning neural networks, etc.) to
   solving classes of NLP tasks (phonology, morphology, syntax,
   semantics, discourse; analysis, generation, translation; speech
   technology).  

 * Case studies of NLP tasks solved with ML techniques. 

 * Application feedback: how properties of the NLP target
   application influenced the design of the algorithm (e.g. in the
   case of sparse data or very large datasets).

 * Comparisons of empirical symbolic ML methods to connectionist,
   stochastic, and `hand-crafting' approaches on NLP tasks.

---------------------------------------------------------------------------
Second call for contributions

Submission requirements

  Contributions should take the form of a paper of at most 10 pages,
  with title, author(s), addresses (including e-mail if possible),
  and affiliation across the top of the first page. The paper should be
  received by the contact person of the organising committee,
  preferably by e-mail, before 15 February 1997.

  Accepted papers will be collected and published in the workshop
  proceedings. They will also be made publicly available via FTP.

Format specifications

  Papers may be submitted either as ASCII files or as LaTeX files. In
  the latter case, we kindly ask you to keep your LaTeX as standard as
  possible, using only standard macros and preferably including only
  .EPS (Encapsulated PostScript) figures.

Please follow these global guidelines in preparing your submission: 

 * the paper should be in single-column format; 
 * line spacing should be single (1.0);
 * the font should be 12 point, preferably Times Roman.

---------------------------------------------------------------------------
Submission address

Please send your paper, prepared according to the format specifications
above, by e-mail to Antal van den Bosch,

      antal@cs.unimaas.nl

If submission by e-mail is not possible, please send a laser-printed
hard copy of your paper to

      Antal van den Bosch 
      Department of Computer Science 
      Faculty of General Sciences 
      Universiteit Maastricht 
      PO Box 616 
      NL-6200 MD Maastricht 
      The Netherlands 
 
      phone: +31.43.3882019 
      fax: +31.43.3252392

---------------------------------------------------------------------------
Important dates

 * Contribution submissions     15 February 1997 
 * Notification of acceptance   8 March 1997 
 * Deadline camera-ready copy   1 April 1997 
 * Workshop                     26 April 1997 

---------------------------------------------------------------------------
Organising committee

  Walter Daelemans (Walter.Daelemans@kub.nl, Tilburg University)
  Ton Weijters (weijters@cs.unimaas.nl, Universiteit Maastricht)
  Antal van den Bosch (antal@cs.unimaas.nl, Universiteit Maastricht)

---------------------------------------------------------------------------
Registration

The workshops will be open to anyone. Participants who are not members
of MLnet pay a fee to cover the marginal costs of the workshop. The
fee is yet to be determined. MLnet will pay the organisational costs
for its members. 

MLnet will arrange travel bursaries for its members to take part in the
workshops. 

---------------------------------------------------------------------------
World-Wide Web

Information on the workshop is also available on the World-Wide Web at

  http://www.cs.unimaas.nl/ecml97/

--------------------------------------------------------------------------
 Antal van den Bosch antal@cs.unimaas.nl http://www.cs.unimaas.nl/~antal/
 Department of Computer Science, Universiteit Maastricht, the Netherlands
-----------------------[ spelling is a lossed art ]-----------------------
