Task #10: English Lexical Substitution Task
Diana McCarthy and Roberto Navigli
WSD has been described as a task in need of an application. Whilst
researchers believe that it will ultimately prove useful for
applications which need some degree of semantic interpretation, the
jury is still out on this point. One problem is that WSD systems have
been tested on fine-grained inventories, rendering the task harder
than it need be for many applications (Ide and Wilks, 2006). A further
significant problem is that there is no clear choice of inventory for
any given task (other than the use of a parallel corpus for a
specific language pair for a machine translation application).
We propose a substitution task in which both annotators and systems
must find a substitute for the target word in the test sentence.
Systems will have to identify candidate synonyms for a given
word, and identify which fits best in a given context, thus requiring
some discrimination between contexts. The task could be performed
using a predefined inventory, such as WordNet, but we hope that it
will also be a useful task for unsupervised systems which define
meanings automatically from the data, e.g. (Schütze, 1998; Pantel and Lin,
2002).
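The two steps a system has to perform, proposing candidates and then choosing the one that best fits the context, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `CANDIDATES` inventory, the `COOCCURRENCE` sets and the word-overlap score are invented toy data and a toy method, not part of the task definition or of any participating system.

```python
# Toy illustration only: the inventory and co-occurrence data are invented.
CANDIDATES = {
    "bright": ["intelligent", "shiny", "vivid"],
}

# Hypothetical context words that each candidate tends to appear with.
COOCCURRENCE = {
    "intelligent": {"student", "child", "idea", "pupil"},
    "shiny": {"metal", "surface", "coin", "light"},
    "vivid": {"colour", "paint", "red", "light"},
}


def rank_substitutes(target, sentence):
    """Rank candidate substitutes for `target` by word overlap with the sentence."""
    # Lower-case the sentence and strip simple punctuation from each token.
    context = {w.strip(".,;:!?").lower() for w in sentence.split()}
    context.discard(target)
    scored = [
        (len(COOCCURRENCE.get(cand, set()) & context), cand)
        for cand in CANDIDATES.get(target, [])
    ]
    # Highest overlap first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cand for _, cand in scored]


best_for_student = rank_substitutes("bright", "She was a bright student with a good idea.")[0]
best_for_coin = rank_substitutes("bright", "The coin had a bright surface.")[0]
```

Here the same target word receives a different top substitute in each sentence, which is the discrimination between contexts that the task is meant to require. A real system would replace the toy dictionaries with an inventory such as WordNet or with automatically induced meanings.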
The annotators will be asked to use single words as substitutes
where possible, but will be told that they may provide either a phrase
if one fits perfectly, or a slightly more general word if
necessary. They will also be asked to identify where the word in the
sentence is an integral part of a phrase. We hope that this will
provide useful data for multiword evaluation; however, scores
will be reported for i) all items, ii) single words (targets and
substitutes) and iii) multiwords. Thus systems not wishing to deal with
multiwords will have scores reported for the subset of data where
subjects have not identified multiwords in the target sentences and
where the substitutes provided are single-word items.
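One possible reading of the three reporting subsets is sketched below. The `Item` record, its field names and the filtering rules are assumptions made for illustration; in particular, dropping multiword substitutes from otherwise single-word items, and treating every flagged item as belonging to the multiword subset, are guesses at details the description leaves open.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    # Hypothetical record for one annotated test sentence.
    item_id: int
    flagged_multiword: bool  # annotators marked the target as part of a phrase
    substitutes: list = field(default_factory=list)


def is_single_word(substitute):
    """True if the substitute contains no internal whitespace."""
    return len(substitute.split()) == 1


def split_subsets(items):
    """Partition items into the three subsets for which scores are reported."""
    single = []
    for it in items:
        if it.flagged_multiword:
            continue
        # Assumption: keep only the single-word substitutes of unflagged items.
        kept = [s for s in it.substitutes if is_single_word(s)]
        if kept:
            single.append(Item(it.item_id, False, kept))
    multi = [it for it in items if it.flagged_multiword]
    return {"all": list(items), "single_word": single, "multiword": multi}


items = [
    Item(1, False, ["clever", "smart"]),
    Item(2, False, ["very good"]),
    Item(3, True, ["take off"]),
    Item(4, False, ["gleaming", "full of light"]),
]
subsets = split_subsets(items)
```

Under these assumptions, a system that ignores multiwords would be scored only on `subsets["single_word"]`, while the scores over all items would still be reported alongside.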
References
Ide, N. and Wilks, Y. (2006). Making Sense About Sense. In Agirre, E. and
Edmonds, P. (Eds.), Word Sense Disambiguation: Algorithms and
Applications, Springer.
Pantel, P. and Lin, D. (2002). Discovering Word Senses from
Text. In Proceedings of ACM Conference on Knowledge Discovery and Data
Mining (KDD-02), pp. 613-619. Edmonton, Canada.
Schütze, H. (1998). Automatic Word Sense Discrimination. Computational
Linguistics, 24(1), pp. 97-123.