SemEval-2007 Home

Task #5: Multilingual Chinese-English Lexical Sample Task

The goal of this task is to create a framework for evaluating word sense disambiguation in Chinese-English machine translation systems. We will provide 40 polysemous Chinese words: 20 nouns and 20 verbs. At least 15 instances will be provided for each sense of each word, of which around two-thirds will be used as training data and one-third as test data. The "sense tags" for the ambiguous Chinese target words are given in the form of their English translations. The translations are taken from the Chinese Semantic Dictionary (CSD) developed by the Institute of Computational Linguistics, Peking University (ICL/PKU). The texts will be extracted from the People's Daily News corpus, which has been word-segmented and POS-tagged. The semantically ambiguous target words will be manually sense-tagged with their English equivalents.
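As a rough illustration of the per-sense split described above, the following Python sketch divides instances so that about two-thirds of each sense go to training and the rest to test. The record layout (the `word` and `sense` keys), the function name `split_by_sense`, and the random shuffling are assumptions made for this example; the task description does not specify how the organizers actually partition the data.

```python
import random
from collections import defaultdict


def split_by_sense(instances, train_fraction=2 / 3, seed=0):
    """Split instances into train/test so each sense keeps roughly the same ratio.

    `instances` is a list of dicts with hypothetical keys "word" and "sense".
    """
    # Group instances by (target word, sense tag).
    by_sense = defaultdict(list)
    for inst in instances:
        by_sense[(inst["word"], inst["sense"])].append(inst)

    rng = random.Random(seed)
    train, test = [], []
    for key in sorted(by_sense):
        group = list(by_sense[key])
        rng.shuffle(group)
        # Number of training instances for this sense, rounded to the nearest integer.
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Placeholder data: one target word with two senses, 15 instances each.
instances = [
    {"id": f"{sense}-{i}", "word": "W1", "sense": sense}
    for sense in ("sense_a", "sense_b")
    for i in range(15)
]
train, test = split_by_sense(instances)
```

With 15 instances per sense, this sketch places 10 of each sense in the training set and 5 in the test set.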

The sense-tagged training data will be distributed to all participants. The test data, in which the ambiguous target words are explicitly marked, will be used to evaluate the participating systems; participants are required to assign exactly one translation to each instance. An answer key will be provided as a separate file.
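To make the evaluation setup concrete, here is a minimal Python sketch of scoring system output against an answer key. It assumes both are mappings from an instance identifier to a single English translation, and it reports plain accuracy; the identifiers, the dictionary format, and the choice of metric are all assumptions for illustration, and the official scorer and file formats may differ.

```python
def accuracy(predictions, answer_key):
    """Fraction of answer-key instances whose predicted translation matches the key.

    Both arguments map a (hypothetical) instance id to one translation string.
    Instances missing from `predictions` are counted as incorrect.
    """
    if not answer_key:
        raise ValueError("answer key is empty")
    correct = sum(
        1 for inst_id, gold in answer_key.items()
        if predictions.get(inst_id) == gold
    )
    return correct / len(answer_key)


# Placeholder ids and sense tags, not taken from the real data.
answer_key = {"i1": "sense_a", "i2": "sense_b", "i3": "sense_a", "i4": "sense_b"}
predictions = {"i1": "sense_a", "i2": "sense_b", "i3": "sense_b", "i4": "sense_b"}
score = accuracy(predictions, answer_key)
```

Because each instance receives exactly one translation, a single exact-match comparison per instance is enough in this sketch.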

Coordinators
Peng Jin (jandp@pku.edu.cn)
Yunfang Wu (wuyf@pku.edu.cn)
Shiwen Yu (yusw@pku.edu.cn)

 

Website hosted by the Department of Computer Science at Swarthmore College.