Task #6: Word-Sense Disambiguation of Prepositions
Organizer
Ken Litkowski
Description
One of the current major research topics in computational linguistics
is semantic role labeling. This topic has been the subject of a
previous Senseval task and a Conference on Natural Language Learning
(CoNLL) task. A special issue of Computational Linguistics on semantic
role labeling has also been announced. The behavior of prepositions
has likewise attracted considerable research, with two recent ACL
workshops and a planned special issue of Computational Linguistics
on prepositions.
Historically, however, prepositions have received little attention,
relegated to minor roles with little variation in treatment in such
resources as the Penn Treebank. Similarly, even
within the lexicographic community (and dictionaries), prepositions
are rarely accorded the full treatment of corpus analysis given to
other parts of speech, particularly verbs. To the extent that they
have been included in computational treatment, they have been closely
tied to verbs, as indicators of internal arguments. Although often
viewed as "mere" function words, prepositions have a range of
polysemy comparable to that of other parts of speech. Fortunately,
because prepositions form a generally closed class, their number is
relatively small. Prepositions are the bearers of much semantic
information, so
the development of techniques for their disambiguation would be of
great benefit to the computational community, particularly if they
could be dealt with in a concentrated effort by the community.
The publicly available Preposition Project has been developed to
provide a comprehensive treatment of preposition behavior. As part of
the project, Oxford University Press has made its preposition sense
inventory publicly available. This sense inventory has been used in
tagging large numbers of preposition instances from FrameNet, with a
professional lexicographer performing this task. As a result, large
numbers of instances are available for more than 50 prepositions,
ranging from 100 instances to over 4000 (for the preposition "of").
In addition,
these tagged instances have been prepared in the format used in
previous Sensevals, so they are immediately available for use in the
task.
The task will be carried out in the same manner as previous Senseval
lexical sample tasks, following the same methodology for evaluation
(including the use of the same evaluation scripts, with sense tagging
available for both fine-grained and coarse-grained
disambiguation). All the necessary resources are already available to
potential participants.
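As a rough illustration of the difference between fine-grained and
coarse-grained scoring, the sketch below computes precision and recall
for a set of tagged instances. It is not the official Senseval
evaluation script: the instance identifiers, the sense labels, the
coarse_map grouping, and the score function are all hypothetical, and
the sketch assumes one answer per instance (it ignores weighted or
multiple answers).

```python
from typing import Dict, Optional, Tuple


def score(
    gold: Dict[str, str],
    predicted: Dict[str, str],
    coarse_map: Optional[Dict[str, str]] = None,
) -> Tuple[float, float]:
    """Return (precision, recall) for single-answer sense tagging.

    Precision is correct / attempted; recall is correct / all gold
    instances. If coarse_map is given, each sense is first mapped to
    its coarse group, so confusing two senses in the same group still
    counts as correct.
    """

    def norm(sense: str) -> str:
        # Senses absent from the map are their own coarse group.
        return coarse_map.get(sense, sense) if coarse_map else sense

    attempted = 0
    correct = 0
    for inst_id, gold_sense in gold.items():
        guess = predicted.get(inst_id)
        if guess is None:
            continue  # unanswered instances lower recall only
        attempted += 1
        if norm(guess) == norm(gold_sense):
            correct += 1

    precision = correct / attempted if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical data: four instances of one preposition, one left unanswered.
gold = {"i1": "s1", "i2": "s1a", "i3": "s2", "i4": "s1"}
predicted = {"i1": "s1", "i2": "s1b", "i3": "s1"}
coarse_map = {"s1a": "s1", "s1b": "s1"}  # assumed sense grouping

fine = score(gold, predicted)
coarse = score(gold, predicted, coarse_map)
print("fine-grained   P=%.3f R=%.3f" % fine)
print("coarse-grained P=%.3f R=%.3f" % coarse)
```

In this toy data, the answer for instance i2 is wrong under
fine-grained scoring but correct under coarse-grained scoring, because
both senses fall in the same assumed group.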
Task Details (to be finalized)
1. All prepositions currently tagged on the Preposition Project (the
34 most common English prepositions as of August 1, 2006) will be
included in the task. It is expected that this task will provide a
definitive characterization of the closed class of prepositions that
will then be publicly available to all members of the computational
linguistics community.
2. Participants can use other data that is available from the
Preposition Project. Each instance includes an identifying number from
the FrameNet project, so all information from the FrameNet tagging is
available. This includes a syntactic characterization of the sentence
elements and FrameNet frames and frame elements.
Individuals interested in participating in the task are encouraged to
discuss their concerns with the task organizer, Ken Litkowski.