Task #7: Coarse-grained English all-words (Coarse AW)

Roberto Navigli and Ken Litkowski

One of the major obstacles to effective WSD is the fine granularity of the adopted computational lexicon. Specifically, WordNet, by far the most commonly used dictionary within the NLP community, encodes sense distinctions that are too subtle even for human annotators (Edmonds and Kilgarriff, 2002). Nonetheless, many annotated resources, as well as the vast majority of disambiguation systems, rely on WordNet as a sense inventory: as a result, choosing a different sense inventory would make it hard to retrain supervised systems and could even pose copyright problems. To address these issues, we propose a coarse-grained English all-words task for SemEval-2007.

Description of the task
We will tag approximately 6,000 words of three running texts (analogous to the previous Senseval all-words tasks) with coarse senses. Coarse senses will be based on a clustering of the WordNet sense inventory obtained via a mapping to the Oxford Dictionary of English (ODE), a long-established dictionary that encodes coarse sense distinctions (the Macquarie Dictionary is also being considered as an option; contacts are ongoing). We will prepare the coarse-grained sense inventory semi-automatically: starting from the automatic clustering of senses produced by Navigli (2006) with the Structural Semantic Interconnections (SSI) algorithm, we will manually validate the clustering for the words occurring in the texts. Two annotators will tag the texts with coarse senses using a dedicated web interface. A judge will adjudicate disputed cases (though we expect that, given the coarse nature of the task, such cases will be rare). For each content word we will provide participants with its lemma and part of speech. As a second stage, we plan to associate fine-grained senses with those words in the test set that have clear-cut distinctions in the WordNet inventory. This second set of annotations would allow for both a coarse-grained and a fine-grained assessment of WSD systems.
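The coarse inventory described above amounts to a many-to-one mapping from fine-grained WordNet senses to ODE-derived clusters. A minimal sketch (the sense keys and cluster IDs below are invented for illustration, not part of the actual mapping):

```python
# Hypothetical coarse-grained sense inventory: each fine-grained
# WordNet sense key maps to a cluster ID derived from an ODE entry.
# The cluster is the unit of annotation and scoring in this task.
wn_to_cluster = {
    "bank%1:14:00::": "bank.ode.1",  # financial institution
    "bank%1:14:01::": "bank.ode.1",  # the building housing such an institution
    "bank%1:17:01::": "bank.ode.2",  # sloping land beside a body of water
}

def coarse_sense(fine_sense):
    """Return the coarse cluster for a fine-grained WordNet sense."""
    return wn_to_cluster[fine_sense]
```

Because the mapping is many-to-one, two WordNet senses that human annotators rarely distinguish (the institution vs. the building) collapse into a single annotation target.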

For disambiguation purposes, participating systems can exploit knowledge of the coarse distinctions as well as of each fine-grained WordNet sense belonging to a sense cluster. Thus, supervised systems can be retrained on the usual data sets (e.g. SemCor), with a sense cluster replacing each fine-grained sense choice. Each system will provide a single coarse (and possibly fine-grained) answer for each content word in the test set. We will provide an example data set beforehand (in the tradition of the previous Senseval exercises) as well as a test set.
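The retraining step above reduces to relabeling each training instance with its cluster. A hedged sketch, reusing the same hypothetical `wn_to_cluster` mapping and invented SemCor-style examples:

```python
# Hypothetical WordNet sense -> ODE cluster mapping (invented keys).
wn_to_cluster = {
    "bank%1:14:00::": "bank.ode.1",
    "bank%1:14:01::": "bank.ode.1",
    "bank%1:17:01::": "bank.ode.2",
}

# Fine-grained training instances in the style of SemCor:
# (context, fine-grained sense label).
training_data = [
    ("deposited the check at the bank", "bank%1:14:00::"),
    ("walked along the river bank", "bank%1:17:01::"),
]

# Retraining on coarse senses: replace each fine label with its cluster;
# the contexts and the learning algorithm are otherwise unchanged.
coarse_training_data = [
    (context, wn_to_cluster[sense]) for context, sense in training_data
]
```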

Evaluation will be performed in terms of standard precision, recall and F1 scores. We will avoid words with untagged senses, i.e. the "U" cases present in the Senseval-3 all-words test set.
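For concreteness, the standard all-words scoring scheme can be sketched as follows (instance IDs and cluster labels are invented; precision is computed over attempted instances, recall over all gold instances):

```python
def score(gold, answers):
    """Standard precision/recall/F1 for an all-words task.

    `gold` maps instance IDs to gold-standard clusters; `answers` maps
    the IDs a system attempted to its chosen cluster (systems may skip
    instances, which lowers recall but not precision).
    """
    attempted = len(answers)
    correct = sum(1 for i, a in answers.items() if gold.get(i) == a)
    precision = correct / attempted if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Invented example: two gold instances, one attempted and correct.
gold = {"d001.s01.t01": "bank.ode.1", "d001.s02.t04": "bank.ode.2"}
answers = {"d001.s01.t01": "bank.ode.1"}
p, r, f = score(gold, answers)  # p = 1.0, r = 0.5, f ~ 0.667
```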

Resources required to prepare the task
The following steps will be carried out by the task organizers:

1. Obtaining the necessary resources to be provided to the participants (copyright, etc.)

2. Computation and annotation necessary to prepare the task resources (estimated at 40-50 hours, though this is difficult to estimate precisely)

3. Validating the mapping WordNet -> ODE for the content words (estimated at 40-50 hours)

4. Processing the text to be disambiguated (POS tagging, parsing, etc.)

5. Annotating the text with coarse (and possibly fine) senses (estimated at 40 hours), using professional lexicographers

Provisional schedule
- Competition: March 2007
- SemEval-2007 Workshop: June 2007

Pending issues
- Data format
- Availability of the original sense inventory from the Oxford Dictionary of English (this would allow the participants to work on both WordNet and the ODE sense distinctions)

P. Edmonds and A. Kilgarriff. Introduction to the Special Issue on Evaluating Word Sense Disambiguation Systems. Natural Language Engineering, 8(4), Cambridge University Press, 2002.

R. Navigli. Meaningful Clustering of Senses Helps Boost Word Sense Disambiguation Performance. In Proc. of COLING-ACL 2006, Sydney, Australia, July 17-21, 2006.