
Form of the delivery

The data are delivered as an annotated corpus, encoded in XML according to the MATE guidelines. A few analyses of the available material are planned to follow the actual data production, so that more insight can be gained from the resource. To make sure that the resource is accessible to interested parties now, however, the current "undigested" version is already available via ELRA.

A first type of analysis will provide tabular output indicating, for each verb, all possible syntactic subcategorization types, together with an inventory, for each subcategorization type, of the semantic types found in the nominal groups that realize the grammatical functions. This will be a preliminary version of a corpus-documented syntactic and semantic dictionary of the kind that could result from a large-scale semi-automatic annotation exercise based on taxonomic resources such as SIMPLE or EuroWordNet.
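
The shape of such a table can be sketched as follows. This is only an illustration under stated assumptions: the record layout (verb, frame, function, semantic type), the frame labels, the semantic type names and the toy data are all hypothetical and are not taken from the corpus, from the MATE encoding, or from SIMPLE or EuroWordNet.

```python
from collections import defaultdict

# Hypothetical toy records, NOT taken from the corpus. Each record is:
# (verb, subcategorization frame, grammatical function, semantic type
#  of the nominal group realizing that function)
RECORDS = [
    ("mangiare", "subj+obj", "subj", "Human"),
    ("mangiare", "subj+obj", "obj", "Food"),
    ("mangiare", "subj+obj", "subj", "Animal"),
    ("mangiare", "subj", "subj", "Human"),
]


def build_table(records):
    """Group semantic types by verb, then frame, then function."""
    table = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))
    for verb, frame, function, sem_type in records:
        table[verb][frame][function].add(sem_type)
    return table


def format_rows(table):
    """Flatten the nested table into tab-separated rows, sorted for stability."""
    rows = []
    for verb in sorted(table):
        for frame in sorted(table[verb]):
            for function in sorted(table[verb][frame]):
                types = ", ".join(sorted(table[verb][frame][function]))
                rows.append(f"{verb}\t{frame}\t{function}\t{types}")
    return rows


if __name__ == "__main__":
    for row in format_rows(build_table(RECORDS)):
        print(row)
```

Each output row pairs one verb and one subcategorization type with one function and the inventory of semantic types observed for it. Sets are used so that repeated observations of the same type are listed once.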

This small collection of entries will, we hope, provide insight into the new lexical resources which could be extracted from corpora of the kind illustrated here.


Hannah Kermes
2/8/2001