This task consists of lexical-sample style training and testing data for 35
nouns and 65 verbs in the WSJ Penn Treebank II as well as the Brown corpus.
This data will include, for each target item: OntoNotes sense tags (these are
groupings of WordNet senses that are more coarse-grained than traditional WN
entries, and which have achieved on average 90% inter-tagger agreement, or
ITA), as well as the sense inventory for these lemmas.
This data
has been made available to Eneko Agirre and Aitor Soroa for use in the Word
Sense Induction task, and also to German Rigau and Montse Cuadros for their
evaluation of lexical resources. As described above, the OntoNotes senses have
links to WN senses.
For the
same lemmas (but not necessarily exactly the same training and testing
instances), we will also supply:
This
will support a second subtask for SRL, in both PropBank style and VerbNet
style. We propose that the SRL subtask have two evaluation tracks.
We have supplied a 5000-word chunk of WSJ where all of the verbs and the head
words of the verb arguments have WordNet 2.1 sense tags. This is for testing
purposes only, and has no training annotation, PropBank labels, or VerbNet
labels associated with it. Participants can of course use Semcor and the
previous Senseval data as training data if they choose to. We have coordinated
with Roberto Navigli and Ken Litkowski so that their Coarse-grained All-Words
task annotates the same data. Since there is no training data for this task,
there is no Closed track.
The data
tagged with OntoNotes sense tags was prepared under DARPA GALE funding at the
University of Colorado (verbs) and ISI (nouns). The VerbNet labels were
attached to the PropBank data at the University of Colorado using a
semi-automatic process that involved a hand correction step. This was funded by
AQUAINT, and is described in a NAACL-07 paper (Yi, Loper, Palmer).
The thematic role labels will be evaluated using Precision and Recall against
the Gold Standard test data, the same way PropBank was evaluated at CoNLL. The
sense tags will be evaluated using Precision and Recall, the same way Senseval
English sense tags have been evaluated.
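
To make the Precision and Recall measures concrete, here is a minimal sketch,
not the official scorer for this task. It assumes, purely for illustration,
that gold and system annotations are represented as sets of hashable items:
(instance id, argument span, role label) triples for the role labels, or
(instance id, sense tag) pairs for the sense tags. The function name, the data
layout, and the toy values are all hypothetical.

```python
def precision_recall(gold, predicted):
    """Return (precision, recall) of predicted items against gold items.

    Both arguments are iterables of hashable items. An item counts as
    correct only if it matches a gold item exactly.
    """
    gold = set(gold)
    predicted = set(predicted)
    correct = len(gold & predicted)
    # Precision: share of the system's answers that are correct.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of the gold answers that the system recovered.
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: (instance id, argument span, role label).
gold = {
    ("i1", (0, 1), "ARG0"),
    ("i1", (3, 5), "ARG1"),
    ("i2", (0, 0), "ARG0"),
    ("i2", (2, 4), "ARG2"),
}
pred = {
    ("i1", (0, 1), "ARG0"),
    ("i1", (3, 5), "ARG2"),  # right span, wrong label: counted as an error
    ("i2", (0, 0), "ARG0"),
}

p, r = precision_recall(gold, pred)
print(f"precision={p:.3f} recall={r:.3f}")
```

Under this representation, a system that leaves some instances unanswered
loses Recall but not Precision, since unanswered instances never enter the
predicted set.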
LDC agreed
to let Semeval distribute the WSJ raw text for the data.
We have extended the deadline for submitting results to midnight on April 4th.