Organizers

Sameer Pradhan, Martha Palmer, and Edward Loper

 

Subtask 1: Coarse-grained English Lexical Sample WSD

This subtask provides lexical-sample-style training and test data for 35 nouns and 65 verbs in the WSJ Penn Treebank II as well as the Brown corpus. For each target item, the data includes OntoNotes sense tags, along with the sense inventory for these lemmas. OntoNotes senses are groupings of WordNet (WN) senses that are more coarse-grained than traditional WN entries and that have achieved, on average, 90% inter-annotator agreement (ITA).

This data has been made available to Eneko Agirre and Aitor Soroa for use in the Word Sense Induction task, and also to German Rigau and Montse Cuadros for their evaluation of lexical resources. As described above, the OntoNotes senses have links to WN senses.

Subtask 2: Coarse-grained English Lexical Sample SRL

For the same lemmas (but not necessarily exactly the same training and testing instances), we will also supply:

This will support a second subtask for SRL, in both PropBank style and VerbNet style. We propose that the SRL subtask have two evaluation tracks.

Subtask 3: English fine-grained All-Words

We have supplied a 5,000-word chunk of WSJ text in which all of the verbs and the head words of the verb arguments have WordNet 2.1 sense tags. This data is for testing purposes only: it has no associated training annotation and no PropBank or VerbNet labels. Participants may of course use SemCor and the previous Senseval data as training data if they choose. We have coordinated with Roberto Navigli and Ken Litkowski so that their Coarse-grained All-Words task annotates the same data. Since there is no training data for this task, there is no Closed track.

The data tagged with OntoNotes sense tags was prepared under DARPA GALE funding at the University of Colorado (verbs) and ISI (nouns). The VerbNet labels were attached to the PropBank data at the University of Colorado using a semi-automatic process that involved a hand-correction step. This work was funded by AQUAINT and is described in a NAACL-07 paper (Yi, Loper, Palmer).

The thematic role labels will be evaluated using Precision and Recall against the Gold Standard test data, in the same way that PropBank labels were evaluated at CoNLL. The sense tags will be evaluated using Precision and Recall, in the same way that Senseval English sense tags have been evaluated.
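As a rough illustration of the sense-tag scoring, the sketch below computes Precision and Recall under the usual Senseval convention: Precision is the number of correct answers divided by the number of instances the system attempted, and Recall is the number of correct answers divided by the total number of gold instances. This is not the official scorer. The function name, the instance identifiers, and the sense labels are all made up for the example, and it assumes a single answer per instance.

```python
from typing import Dict, Optional, Tuple


def score_sense_tags(
    gold: Dict[str, str],
    predicted: Dict[str, Optional[str]],
) -> Tuple[float, float]:
    """Return (precision, recall) for single-answer sense tagging.

    Hypothetical helper, not the official scorer. An instance counts as
    attempted only if the system supplied a non-None answer for it.
    """
    attempted = 0
    correct = 0
    for instance_id, gold_sense in gold.items():
        guess = predicted.get(instance_id)
        if guess is None:
            continue  # the system abstained on this instance
        attempted += 1
        if guess == gold_sense:
            correct += 1

    precision = correct / attempted if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Invented instance IDs and sense labels, for illustration only.
gold = {"inst1": "1", "inst2": "2", "inst3": "1", "inst4": "3"}
predicted = {"inst1": "1", "inst2": "1", "inst3": "1"}  # inst4 not attempted

precision, recall = score_sense_tags(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy run the system answers three of the four instances and gets two right, so Precision is 2/3 and Recall is 2/4. A system that answers every instance has equal Precision and Recall.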

The LDC has agreed to let SemEval distribute the raw WSJ text for the data.

 

Deadline:

We have extended the deadline for submitting results to midnight on April 4th.