Task #8: Metonymy Resolution at SemEval-2007

Organizers
Katja Markert and Malvina Nissim

The phenomenon

Metonymy is a figure of speech in which one expression is used to refer to the standard referent of a related one. For example, in

(1) he was shocked by Vietnam

"Vietnam", the name of a location, refers to an event (a war) that happened there. Although metonymic readings are potentially open ended, most of them tend to follow specific patterns. Many other location names, for instance, can be used in the same fashion as Vietnam above. Thus, given a semantic class (e.g. location), one can specify several regular metonymic shifts (e.g. place-for-event) that instances of the class are likely to undergo.

Extensive annotation (4,000 instances) and analysis of naturally occurring data for the location and organisation classes (Markert and Nissim 2002a; Markert and Nissim 2006) showed that

(i) annotating metonymies in text can be done reliably (K > .80), provided that annotators are trained and follow guidelines, and (ii) metonymies which follow regular patterns are indeed the overwhelming majority, with only about 1% of all metonymies being unconventional uses.

Building on this evidence, we have therefore proposed to treat metonymy resolution as a classification task (Markert and Nissim 2002b). Specifically, it can be seen as a disambiguation task between literal readings and a fixed set of metonymic patterns for a particular semantic class. On the one hand, its similarity to word sense disambiguation should allow existing WSD systems to be adapted and retuned; on the other hand, the task offers new settings and new challenges.

First, training and testing are not necessarily done on the same word (e.g. train on 'bank' and test on 'bank'), but rather on possibly different words of the same semantic class (e.g. train on 'France' and test on 'Britain'). Second, metonymic readings rarely coincide with a shift in topic, so that approaches based on co-occurrence or on the 'one-sense-per-discourse' heuristic are unlikely to be helpful. For example, the general context of 'Germany' referring to the football team will not be very different from that of 'Germany' used to refer to the country where the football world cup is being held. Third, although rare, some occurrences do not fall into any of the prespecified patterns (senses), so that yet further methods must be developed to handle these.
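
To make this setting concrete, the following minimal sketch (in Python) shows a most-frequent-reading baseline applied at the level of a whole semantic class; the instance layout and the reading labels are illustrative assumptions, not the official data format.

    from collections import Counter

    # Hypothetical training instances for the location class:
    # (target name, four-sentence snippet, annotated reading).
    train = [
        ("France",  "... he crossed the border into France ...",       "literal"),
        ("Germany", "... the plant is located in western Germany ...", "literal"),
        ("Vietnam", "... he was shocked by Vietnam ...",               "place-for-event"),
        ("Germany", "... Germany signed the treaty last week ...",     "place-for-people"),
    ]

    # Unlike classic WSD, the names in the test data need not occur in training at all.
    test = [("Britain", "... Britain voted against the proposal ...")]

    # Baseline: assign every test instance the most frequent reading of its class.
    most_frequent_reading = Counter(label for _, _, label in train).most_common(1)[0][0]
    for name, snippet in test:
        print(name, "->", most_frequent_reading)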

The importance of resolving metonymies has been shown for a variety of NLP tasks, such as machine translation (Kamei and Wakao, 1992), question answering (Stallard, 1993), and anaphora resolution (Harabagiu, 1998; Markert and Hahn, 2002).

The task

The metonymy resolution task is a lexical sample task for English. It consists of automatically classifying preselected expressions of a particular semantic class (such as country names) as having either a literal reading, one of a set of regular metonymic readings, or an innovative (unconventional) reading. Although the task can be defined for any semantic class, we suggest using locations and organisations for SemEval-2007, with possible extensions to other classes or to full text in following years.

The training set will therefore consist of four-sentence snippets containing a country or company name annotated with one of the possible values (literal, any of the metonymic patterns, or unconventional). The test set will be provided in the same fashion. Rather than training and testing on one particular word, training and testing instances will possibly be different names belonging to the same class (e.g. location names).
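
For illustration only, such a sample could be represented roughly as follows; the field names are ours and do not reflect the official release format.

    from dataclasses import dataclass

    @dataclass
    class Sample:
        """One four-sentence snippet with its annotated reading (illustrative layout)."""
        name: str     # the preselected country or company name
        snippet: str  # two sentences before, the target sentence, and one sentence after
        reading: str  # 'literal', one of the metonymic patterns, or 'unconventional'

    # e.g. Sample(name="Vietnam",
    #             snippet="... He was shocked by Vietnam. ...",
    #             reading="place-for-event")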

Training/Testing data

As of now, we have an existing dataset of ca. 3,000 country names and ca. 1,000 company names, each presented within a four-sentence context (two sentences before and one after the sentence containing the possibly metonymic name) from the British National Corpus (for copyright issues see below). These names are annotated with a literal reading, a metonymic pattern, or an unconventional metonymic reading. The annotation is standoff XML that maps to an XML version of the British National Corpus; the corpus itself is provided in text format. The annotation has been tested for reliability over the whole corpus with very good results. The current corpus is already freely available and can be used in its entirety as training data for SemEval-2007.
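
As a rough illustration of how such standoff annotation can be consumed (the element and attribute names below are invented for this sketch and do not reproduce the actual schema of the distribution), one could proceed along these lines:

    import xml.etree.ElementTree as ET

    # Hypothetical standoff record: each annotation points into the BNC text by a
    # sentence identifier and carries the annotated reading. Tag and attribute
    # names are assumptions made for this example only.
    standoff = """
    <annotations>
      <instance name="Vietnam" bnc-sentence="A1X.123" reading="place-for-event"/>
      <instance name="BMW" bnc-sentence="B2K.45" reading="org-for-product"/>
    </annotations>
    """

    for inst in ET.fromstring(standoff).iter("instance"):
        print(inst.get("name"), inst.get("bnc-sentence"), inst.get("reading"))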

To provide unseen test data, we suggest annotating a further 800 lexical samples for each of the two classes. Metonymy annotation requires more subtle decision-making than standard word sense disambiguation, so we estimate an annotation rate of ca. 30 four-sentence samples per hour. The annotation is not computationally demanding.

Evaluation

Systems will be evaluated against a manually annotated unseen test set, using the following measures. Accuracy is defined as the percentage of correctly classified instances over the whole set. Precision, recall, and balanced F-score will then be used to assess performance with respect to each class.
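
For concreteness, these measures could be computed along the following lines (a sketch assuming gold and predicted labels are given as parallel lists):

    def evaluate(gold, predicted):
        """Accuracy over all instances, plus precision/recall/F-score per reading."""
        assert len(gold) == len(predicted)
        accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
        per_class = {}
        for label in set(gold) | set(predicted):
            tp = sum(1 for g, p in zip(gold, predicted) if g == p == label)
            fp = sum(1 for g, p in zip(gold, predicted) if p == label and g != label)
            fn = sum(1 for g, p in zip(gold, predicted) if g == label and p != label)
            precision = tp / (tp + fp) if tp + fp else 0.0
            recall = tp / (tp + fn) if tp + fn else 0.0
            f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
            per_class[label] = (precision, recall, f_score)
        return accuracy, per_class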

Copyright

For the current dataset, we have cleared copyright issues with the BNC. A copyright statement is already included in the current distribution of the annotated data (http://www.cogsci.ed.ac.uk/~malvi/mascara/mascara.2.0.zip). For future annotation, copyright issues will have to be determined with the body that provides the data to be annotated. As of now, the annotation itself is not under any copyright restrictions.

References

Sanda Harabagiu (1998). Deriving metonymic coercions from WordNet. In Proceedings of the COLING-ACL Workshop on the Usage of WordNet in Natural Language Processing Systems, pages 142-148, 1998.

S. Kamei and T. Wakao (1992). Metonymy: Reassessment, survey of acceptability and its treatment in machine translation systems. In Proceedings of ACL, pages 309-311, 1992.

Katja Markert and Udo Hahn (2002). Understanding metonymies in discourse. Artificial Intelligence, 135(1/2):145-198.

Katja Markert and Malvina Nissim (2002a). Towards a corpus annotated for metonymies: the case of location names. In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC2002), pages 1385-1392, Las Palmas, Canary Islands, 2002.

Katja Markert and Malvina Nissim (2002b). Metonymy resolution as a classification task. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, pages 204-213, Philadelphia, Penn., 6-7 July 2002.

Katja Markert and Malvina Nissim (2006). Metonymic Proper Names: A Corpus-based Account. In A. Stefanowitsch (ed.), Corpora in Cognitive Linguistics. Vol. 1: Metaphor and Metonymy, Mouton de Gruyter, 2006.

David Stallard (1993). Two kinds of metonymy. In Proceedings of ACL, pages 87-94, 1993.