Task #9: Multilevel Semantic Annotation of Catalan and Spanish
Task Organizers
Lluís Màrquez (TALP, Universitat Politècnica de Catalunya)
Maria Antònia Martí (CLiC, Universitat de Barcelona)
Mariona Taulé (CLiC, Universitat de Barcelona)
Luis Villarejo (TALP, Universitat Politècnica de Catalunya)
Web page: http://www.lsi.upc.edu/~nlp/semeval/msacs.html
Email: semeval-msacs@lsi.upc.edu
Task description
In this task, we aim to evaluate and compare automatic systems for
semantic annotation at several levels for the Catalan and Spanish
languages. The three semantic levels considered are: semantic
roles and verb disambiguation, disambiguation of all nouns, and named
entity recognition.
[1, Semantic Role Labeling, SRL] The annotation of semantic roles of
verb predicates will be in PropBank style (Palmer et al. 2005; Taulé
et al. 2005; Taulé et al. 2006), and the task setting will be similar
to that of the CoNLL-2005 shared task
(http://www.lsi.upc.edu/~srlconll/). Verb disambiguation refers to the
assignment of the proper role-set tag to the verb, which is a much
coarser-grained level than the usual sense disambiguation. This tag is
composed of the thematic structure number (as indexed in the role set
file for the verb predicate) and the lexico-semantic class, which is
used to map the numbered arguments into semantic roles.
[2, Noun Sense Disambiguation, NSD] The disambiguation of nouns will
resemble an "all-words" disambiguation task restricted to nouns. The
sense repository used for the annotation will consist of the current
versions of the Catalan and Spanish WordNets (see resources below).
[3, Named Entity Recognition, NER] The annotation of named entities
will include recognition and classification of simple entity types
(person, location, organization, etc.) and will allow entities to be
embedded in one another. We consider core "strong" entities (e.g.,
[US]_loc) and "weak" entities, which, by definition, include some
strong entities (e.g., The [president of [US]_loc]_per) (Arévalo,
Civit & Martí 2004; Arévalo et al. 2002).
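As an illustration of the embedded-entity notation used in the
examples above, the following is a minimal sketch that reads a
bracketed string into tokens and labelled spans. The function names
(parse_entities, is_weak) and the span representation are assumptions
made for this example only; they are not part of the task formats,
and "weak" is approximated here simply as "contains another entity".

```python
import re

# One alternative per piece of the bracket notation:
# an opening "[", a closing "]_label", or an ordinary word.
TOKEN_RE = re.compile(r"(?P<open>\[)|\]_(?P<label>\w+)|(?P<word>[^\s\[\]]+)")


def parse_entities(text):
    """Return (tokens, entities) for text such as 'The [president of [US]_loc]_per'.

    Each entity is a tuple (start, end, label), with token offsets and an
    exclusive end. Inner entities are closed, and hence listed, first.
    """
    tokens, entities, open_starts = [], [], []
    for match in TOKEN_RE.finditer(text):
        if match.group("open"):
            open_starts.append(len(tokens))
        elif match.group("label"):
            entities.append((open_starts.pop(), len(tokens), match.group("label")))
        else:
            tokens.append(match.group("word"))
    return tokens, entities


def is_weak(entity, entities):
    """Assumption: an entity is 'weak' if some other entity lies inside its span."""
    start, end, _ = entity
    return any(
        other != entity and start <= other[0] and other[1] <= end
        for other in entities
    )
```

With the example from the text, parse_entities yields the four tokens
"The", "president", "of", "US", an inner loc span covering "US", and
an outer per span covering "president of US"; only the outer span is
reported as weak.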
All semantic annotation tasks will be performed on exactly the same
corpora for each language. We present all the annotation levels
together as a complex global task, since we are interested in
approaches which address these problems jointly, possibly taking into
account cross-dependencies among them. However, we will also accept
systems that approach the annotation in a pipeline style, or that
address any of the particular subtasks in any of the languages (3
levels x 2 languages = 6 subtasks). See the evaluation section for
details.
More particularly, the input for training will consist of a
medium-size set of sentences (100-200 Kwords per language) with
gold-standard full syntactic annotation (including function tags) and
the semantic annotations of SRL, NSD, and NER, which are the target
knowledge to be learned. The full parse trees are provided only to
ease the learning process; participants are not required to use
them. The test corpus will be about 10 times smaller than the
training corpus and will include the full syntactic annotation without
the semantic levels, which have to be predicted. In order to make the
evaluation scenario realistic, parse trees for the test material will
be automatically generated by state-of-the-art parsers, while for
training both the gold-standard (hand-corrected) and the automatic
parse trees will be provided. Formats will be formally described
later on, but will be highly similar to those of the CoNLL-2005 shared
task (column-style presentation of levels of annotation), so that
evaluation tools and existing format-conversion scripts can be
shared.
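Since the exact formats are not yet fixed, the following is only a
rough sketch of how a CoNLL-style column file is typically read: one
token per line, whitespace-separated columns, and a blank line between
sentences. The helper names (read_sentences, column) and the idea of
addressing annotation levels by column index are assumptions for
illustration, not the official layout.

```python
def read_sentences(lines):
    """Yield sentences from CoNLL-style lines.

    Each sentence is a list of rows, and each row is the list of
    whitespace-separated column values for one token. Blank lines
    separate sentences.
    """
    sentence = []
    for line in lines:
        fields = line.split()
        if fields:
            sentence.append(fields)
        elif sentence:
            yield sentence
            sentence = []
    if sentence:  # last sentence when the input lacks a trailing blank line
        yield sentence


def column(sentence, index):
    """Return one annotation level (one column) of a sentence as a list."""
    return [row[index] for row in sentence]
```

For example, list(read_sentences(open(path, encoding="utf-8"))) would
load a whole file, and column(sentence, 0) would return its word
forms if the first column holds them.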
Evaluation
We will use standard evaluation metrics for each of the defined
subtasks (SRL, NSD, NER), presumably based on precision/recall/F1
measures, since they are basically recognition tasks. Classification
accuracy will also be calculated for verb disambiguation and NSD.
Special metrics relaxing the need for perfect matching of
arguments/entities will also be studied for the NER and SRL
subtasks.
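For reference, the standard exact-match definitions of precision,
recall, and F1 over labelled spans can be sketched as follows. This
is not the official scorer: the function name and the representation
of an item as a (start, end, label) tuple are assumptions for this
example, and no relaxed matching is implemented.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall, and F1 over two collections of items.

    An item counts as correct only if it appears identically (same
    boundaries and same label) in both gold and predicted.
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

A system that predicts two entities, one of which matches one of the
two gold entities exactly, obtains precision, recall, and F1 of 0.5
each under this definition.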
All systems will be ranked and studied according to the official
evaluation metrics in each of the six subtasks (SRL-cat, NSD-cat,
NER-cat, SRL-sp, NSD-sp, NER-sp). Additionally, global measures will
be derived as a combination of all partial evaluations to rank
systems' performance per language and for the complete global task
(language independent).
The organizers will prepare a simple baseline processor for each of
the subtasks. For any subtask in which a participating team does not
submit results, the team will be scored using the output of the
baseline processor for that subtask, so that global performance
scores can still be computed.
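The baseline fill-in described above can be sketched as follows. The
way partial scores are combined is not specified here, so the
unweighted mean used below is purely an assumption, as are the
function name and the dictionary layout.

```python
SUBTASKS = ("SRL-cat", "NSD-cat", "NER-cat", "SRL-sp", "NSD-sp", "NER-sp")


def global_scores(submitted, baseline):
    """Combine per-subtask scores into per-language and overall scores.

    submitted: scores for the subtasks a team took part in.
    baseline:  baseline scores for all six subtasks, used for any
               subtask missing from `submitted`.
    Assumption: the combination is an unweighted mean.
    """
    filled = {task: submitted.get(task, baseline[task]) for task in SUBTASKS}
    per_language = {}
    for lang in ("cat", "sp"):
        scores = [s for task, s in filled.items() if task.endswith("-" + lang)]
        per_language[lang] = sum(scores) / len(scores)
    overall = sum(filled.values()) / len(filled)
    return per_language, overall
```

Under this assumption, a team that submits only NER results still
receives per-language and overall scores, with the SRL and NSD
components taken from the baselines.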
The evaluation on the test set will be carried out by the organizers
based on the outputs submitted by participant systems.
The official evaluation software will be available to participants
from the moment the training datasets are released.
Resources provided to the participants
To ease the participation of teams with limited resources, tools, or
experience with the Spanish and Catalan languages, we will provide as
many resources and tools as possible to participants. So far, we have
in mind the following:
* The full syntactic annotation level of training and test files,
which can be very useful for feature extraction.
* Updated Catalan and Spanish WordNets, which are linked to English
WordNet 1.6 for all noun synsets, some of them enriched with glosses,
examples, collocations, etc.
* Roleset descriptions for all verbs in the training/test corpora.
* General scripts for format conversion, which are very useful to
convert CoNLL-style files into representations better suited to
automatic processing.
Development of resources
All the resources will be provided by the organizers. They are free
for research use, so participants will not need to meet any special
requirements to obtain and use them (signing a simple license
agreement for all the distributed materials will suffice). All these resources
and tools are being developed in a joint effort by several NLP
research groups and partially funded by the Spanish government under
several projects: 3LB (FIT-150500-2002-244) responsible in 2003-2004
for the syntactic annotation of 100Kw Catalan and Spanish corpora
together with noun/verb sense annotation; CESS-ECE (HUM-2004-21127-E)
which is currently extending the 3LB corpora to 500Kw including a
first annotation of semantic roles; and a probable follow-up project,
PRAXEM, which will provide extra resources to complete the SRL
annotation and to include the labeling of named entities.
By the time of the SemEval-2007 exercise we can guarantee that a
portion of the corpus, between 100Kw and 200Kw in size, will be
completely annotated with semantic information.
Contact address
Please direct all your questions regarding the SemEval-2007 task on
Multilevel Semantic Annotation of Catalan and Spanish to the following
email address: semeval-msacs@lsi.upc.edu
Task URL: http://www.lsi.upc.edu/~nlp/semeval/msacs.html
References
Arévalo, M., M. Civit and M.A. Martí (2004) MICE: a Module for
Named-Entities Recognition and Classification, in International
Journal of Corpus Linguistics, vol. 9 num. 1. John Benjamins,
Amsterdam.
Arévalo, M., X. Carreras, L. Màrquez, M.A. Martí,
L. Padró, M.J. Simón (2002) A proposal for Wide-Coverage
Spanish Named Entity Recognition, in Procesamiento del Lenguaje
Natural, revista 28. SEPLN, Alicante.
Palmer, M., D. Gildea, P. Kingsbury (2005) The Proposition Bank: An
Annotated Corpus of Semantic Roles, Computational Linguistics, 31 (1),
MIT Press, USA.
Taulé, M., J. Aparicio, J. Castellví, M.A. Martí
(2005) 'Mapping syntactic functions into semantic roles', Proceedings
of the Fourth Workshop on Treebanks and Linguistic Theories (TLT
2005). Barcelona: Universitat de Barcelona.
Taulé, M., J. Castellví, M.A. Martí, J. Aparicio
(2006) 'Fundamentos teóricos y metodológicos para el
etiquetado semántico de CESS-CAT y CESS-ESP', Procesamiento del
Lenguaje Natural, SEPLN, Zaragoza.