TempEval Final Training Data
February 26th, 2007
This document describes the TempEval data, how they were
created, and the validation and scoring scripts that are bundled with
the data. If needed, updates to this document will be posted on
the TempEval website and on
the TempEval Google group and mailing list (see the TempEval website
for instructions on joining the mailing list). This document does not
replace the task description on the SemEval and TempEval websites,
but complements it.
Data Description
The TempEval annotation language is a simplified version
of TimeML. The
TimeML specifications,
annotation guidelines and
document type
definition (all for TimeML version 1.2.1) are included here for
easy reference. For TempEval, we use the following five tags (a
schematic example follows the list):
- <TempEval>
- The document root.
- <s>
- The sentence tag. All sentence tags in the TempEval data were
created automatically using the Alembic natural language processing
tools. A sentence tag can contain TIMEX3 tags and
EVENT tags, but no TLINK tags.
- <TIMEX3>
- Tags the time expressions in the text. It is identical to the
TIMEX3 tag in TimeML. See the
TimeML specifications
and guidelines for further
details on this tag and its attributes. Each document has one special
TIMEX3 tag, the Document Creation Time, which is interpreted as an
interval that spans the whole day.
- <EVENT>
- Tags the events in the text. The TempEval EVENT merges the
information from two TimeML tags: EVENT and MAKEINSTANCE. TimeML
needed both tags to refer to the two instances of the single event in
sentences like "He taught on Wednesday and Friday". This complication
was not necessary for the TempEval data. Both tags and their
attributes are described in the TimeML specifications and
guidelines. For TempEval task C, one extra attribute is added:
mainevent, with possible values YES and NO.
- <TLINK>
- A simplified version of the TimeML TLINK tag. The relation
types in TimeML form a fine-grained set based on James Allen's
interval logic (James Allen, "Maintaining Knowledge about Temporal
Intervals." Communications of the ACM 26(11), 832-843, November
1983). For TempEval, we use only three basic relations plus three
disjunctions over them: BEFORE, OVERLAP and AFTER, and
BEFORE-OR-OVERLAP, OVERLAP-OR-AFTER and VAGUE. Here, OVERLAP refers
to two events (or an event and a time interval) that have a non-empty
overlap; VAGUE is used for cases where no particular relation can be
established.
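To make the format concrete, here is a constructed fragment (a
sketch, not taken from the corpus; attribute names follow the TimeML
specifications, but the bundled document type definition is
authoritative for the exact attribute inventory, and all values shown
are illustrative):

<TempEval>
<TIMEX3 tid="t0" type="DATE" value="1998-01-08" functionInDocument="CREATION_TIME">1998-01-08</TIMEX3>
<s>The company <EVENT eid="e1" class="OCCURRENCE" stem="announce" tense="PAST">announced</EVENT> the merger <TIMEX3 tid="t1" type="DATE" value="1998-01-07">yesterday</TIMEX3>.</s>
<TLINK lid="l1" relType="BEFORE" eventID="e1" relatedToTime="t0"/>
</TempEval>

The TLINK here asserts that event e1 lies BEFORE the Document Creation
Time t0, the kind of link targeted by task B; a task A link would
instead relate e1 to the sentence-internal t1.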
The training data contain all TLINKs required by tasks A, B and C. In
addition, the training data contain all event and timex information,
including, for task C, markers indicating the main events of each
sentence. Recall that tasks A and B are constrained to linking events
from the event target list, which consists of those events that occur
20 times or more in the corpus. A complete list of stems, ordered by
frequency, is included in the docs directory (only stems occurring
more than once are added to the list).
The data directory has two subdirectories: one with the data for
tasks A and B (162 documents) and one with the data for task C (163
documents). The discrepancy is due to one document whose Document
Creation Time lies in the future, which makes task B rather hard to
do; that document was removed from the task A and B training set.
Test Data
The test data are distributed separately from the training data. The
format of the test data is identical to that of the training data,
but the content differs in two ways:
- the test data comprise a different set of documents
- all TLINK relation types in the test documents are set to UNKNOWN
(see the example below)
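For example, a link that appears in the training data as (attribute
values again illustrative):

<TLINK lid="l1" relType="BEFORE" eventID="e1" relatedToTime="t0"/>

would appear in a test document as:

<TLINK lid="l1" relType="UNKNOWN" eventID="e1" relatedToTime="t0"/>

Participating systems are expected to replace UNKNOWN with one of the
six TempEval relation types listed above.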
Annotation Procedure
The EVENT and TIMEX3 annotations were taken from TimeBank
(http://timeml.org/site/timebank/timebank.html). The annotation
procedure for TLINKs included dual annotation by seven annotators
using a web-based annotation interface (see the screen shot page for
more details). After this phase, two experienced annotators reviewed
all cases where the two annotators disagreed on the relation type.
For task C, there was an extra annotation phase in which the main
events were selected. Annotation guidelines for main event annotation
are included in this distribution.
Validation
Included with the training data are a Perl validation script and
a Document Type Definition for TempEval annotation. All files in the
training set have been validated. To validate TempEval files using
the DTD, open a terminal window (Linux/Unix/Mac OS X) or a command
prompt (Windows) and type the following:
% perl validate.pl ../data/taskAB
% perl validate.pl ../data/taskC
This writes validation errors and warnings to the standard
output. Lines marked INFO-300 can generally be ignored; they report
reference counts. On Unix/Linux systems, these lines can be filtered
out with:
% perl validate.pl ../data/taskAB | grep -v INFO-300
% perl validate.pl ../data/taskC | grep -v INFO-300
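On Windows, where grep is usually not available, findstr can be used
to the same effect (note the backslashes in the paths):
% perl validate.pl ..\data\taskAB | findstr /V INFO-300
% perl validate.pl ..\data\taskC | findstr /V INFO-300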
The script requires the Perl modules XML::Checker and XML::RegExp,
both available from CPAN (http://www.cpan.org).
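If these modules are not yet installed, one common way to fetch them
is the CPAN shell (administrator privileges may be needed, and on
Windows the install command should be wrapped in double quotes rather
than single quotes):
% perl -MCPAN -e 'install XML::Checker'
% perl -MCPAN -e 'install XML::RegExp'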
Evaluation
Also included with the training data is a Perl scoring script. It
measures precision and recall using a strict and a relaxed scoring
scheme. See the evaluation document in the docs directory for more
details.
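As a rough summary of the two schemes (the evaluation document
remains authoritative): under strict scoring a response counts as
correct only if it matches the annotated relation type exactly, while
relaxed scoring also gives partial credit when the response is
compatible with the annotation, for example a response of BEFORE
against an annotated BEFORE-OR-OVERLAP. The exact partial credit
weights are given in the evaluation document.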