Online TPP

CL Research, 2006


Please note that the online version of The Preposition Project (TPP) is intended as a quick reference to the data available in the project. The data is under active development and is not yet complete. Currently (12/9/06), semantic relation or role names are available for 558 senses of prepositions and prepositional phrases, covering entries alphabetically through the word off and the major prepositions thereafter (on, over, round, through, to, toward(s), and with). There are approximately 847 senses of 374 prepositions and prepositional phrases in total. Online TPP will be updated monthly.

1. Introduction

Online TPP is based on data generated in TPP (the link to the project provides an in-depth discussion of how the data is being developed). TPP combines data from two major sources: the Oxford Dictionary of English (Oxford University Press, 2003) and sentence instances in the FrameNet project for individual prepositions. It includes all the data from ODE that appears in the printed text (primarily definitions and usage examples), plus a considerable range of additional data describing the behavior of each preposition sense. The intent is that such data will be useful in text processing applications that can use information about preposition behavior in understanding the semantic relations of textual units (the objects of the prepositions).
The display routines for Online TPP are adapted from various CGI scripts provided by James McCracken of Oxford University Press. These scripts were developed as a demonstration project used to display the full contents of ODE. Online TPP does not employ the full range of capabilities used on Online ODE, such as full disambiguation of all content words in definitions and the display of noun and domain hierarchies.

2. What is special about the Online TPP data?

In addition to all the components which appear in the printed text of the Oxford Dictionary of English, the Online TPP data has a number of special features which distinguish it from typical dictionary databases. These features make it particularly suitable for computational applications - both as an electronic dictionary and as a database for natural language processing applications.
Some aspects of these features are described below:
Data structure and lexical objects — Fundamental to all the functionality of Online ODE is the fact that all the data is structured as a series of discrete lexical objects. These function as small packets of data which exist independently of each other, each containing all relevant information about meaning, morphology, lexical function, semantic category, etc. Crucially, each lexical object corresponds to a sense rather than to a dictionary entry. Hence every sense may be queried, extracted, or manipulated as an independent packet of data without any dependencies on the entry in which it appears. Although this can seem counterintuitive to human readers used to treating the entry as the basic object in a dictionary, from a computational point of view it allows a much more detailed and exhaustive specification of the way the language functions on a sense-by-sense basis.
Lexical and phrasal morphology — Every lexical object in Online TPP provides a complete and discrete specification of all the lexical forms relevant to its sense. This includes not only the morphology of the word forms themselves, but also structured data about their syntactic roles, variant spellings, British and US spellings, alternative forms, and the correspondence relationships between them (the source data also includes phonetics). This is true not only for single-word lexemes but also for multi-word phrases. Online TPP thus provides a facility for robust and positive lookup of real-world lexical forms, including permutation of phrases.
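As a rough illustration of the two features above, the following Python sketch stores each sense as a self-contained record and indexes every listed form to the senses that carry it. The class name, field names, identifiers, and sample data are all hypothetical; they are not taken from the TPP or ODE data files, and the sketch does not attempt the phrase permutation mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LexicalObject:
    """One sense, held independently of the entry it appears in (assumed layout)."""
    sense_id: str      # illustrative identifier, not a real TPP key
    headword: str
    definition: str
    forms: list = field(default_factory=list)  # variant spellings, alternative forms


def normalize(form):
    """Lowercase and collapse whitespace so lookups tolerate casing and spacing."""
    return " ".join(form.lower().split())


def build_form_index(objects):
    """Map every normalized form to the list of senses that carry it."""
    index = defaultdict(list)
    for obj in objects:
        for form in {normalize(f) for f in [obj.headword, *obj.forms]}:
            index[form].append(obj)
    return index


# Invented sample data, for illustration only.
SENSES = [
    LexicalObject("toward-1", "toward", "in the direction of", ["towards"]),
    LexicalObject("toward-2", "toward", "getting closer to achieving (a goal)", ["towards"]),
    LexicalObject("on-top-of-1", "on top of", "on the highest point or uppermost surface of"),
]

INDEX = build_form_index(SENSES)
```

Looking up INDEX[normalize("Towards")] returns both senses of toward directly, without first retrieving an entry and then walking its senses.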

3. Searches

The top bar includes a quick search box that accepts either a one-word preposition or a prepositional phrase as the query term. To run the search, click the Go button or press the Return key. The top bar also includes a link to the main site for The Preposition Project, a link to a feedback page (at CL Research) for making comments or asking questions, and a link to this help page.
Note that if a simple search returns no results, Online TPP may report that no match was found and suggest ways to revise the search term. Alternatively, it may display a set of alternatives in a table with four columns (the matched lemma, the part of speech of the matched lemma, the definition for the first matching sense in the entry, and the headword of the entry), in which case clicking on the matched lemma will take you to the desired entry.
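The search outcomes described above can be sketched in Python as follows. This is only a guess at the general shape of the logic: the function name, the dictionary layout, the substring matching rule, and the sample entries are all assumptions, and the actual CGI scripts may work quite differently.

```python
def search(query, entries):
    """Return ("entry", headword), ("alternatives", rows), or ("no_match", []).

    `entries` maps a headword to a list of (lemma, part_of_speech, definition)
    tuples, one per sense. This layout is an assumption made for the sketch.
    """
    term = " ".join(query.lower().split())
    if term in entries:
        return ("entry", term)

    rows = []
    for headword, senses in entries.items():
        seen = set()
        for lemma, pos, definition in senses:
            # Substring matching stands in for whatever matching the site uses.
            if term and term in lemma.lower() and lemma not in seen:
                seen.add(lemma)
                # Four columns: matched lemma, its part of speech, definition
                # of the first matching sense, and headword of the entry.
                rows.append((lemma, pos, definition, headword))
    return ("alternatives", rows) if rows else ("no_match", [])


# Invented sample data, for illustration only (phrases carry no part of speech).
ENTRIES = {
    "top": [
        ("on top of", None, "on the highest point of"),
        ("on top of", None, "in addition to"),
    ],
    "over": [("over", "preposition", "extending directly upwards from")],
}
```

With this sample data, search("over", ENTRIES) goes straight to the entry, search("top of", ENTRIES) yields a single alternatives row for "on top of", and search("xyzzy", ENTRIES) reports no match.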

4. Entry definition display

The entry display includes a top bar with « and » buttons linking to the previous and next entries in the dictionary. The definitions for an entry appear next, in the left column of a two-column display. For entries that have a preposition part of speech in ODE, this part of speech is shown. For phrases, no part of speech is shown. If an entry contains more than one core sense, the core senses are numbered. Subsenses are displayed as bullets underneath their core sense.
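The numbering and bullet rules above can be expressed as a small Python sketch. The function name, the input layout, and the exact text formatting are assumptions made for illustration; the real display is produced as HTML by the CGI scripts.

```python
def render_definitions(core_senses):
    """Return display lines for the definitions column of one entry.

    `core_senses` is a list of (definition, [subsense definitions]) pairs,
    a layout assumed for this sketch.
    """
    lines = []
    number_them = len(core_senses) > 1  # a single core sense is left unnumbered
    for i, (definition, subsenses) in enumerate(core_senses, start=1):
        prefix = f"{i}. " if number_them else ""
        lines.append(prefix + definition)
        # Subsenses appear as bullets underneath their core sense.
        lines.extend(f"  * {sub}" for sub in subsenses)
    return lines
```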

5. Entry properties display options

At the bottom of each entry are controls for changing the information displayed in the right column of the entry display. This column displays one of the twelve properties associated with each lexical object (i.e., sense or subsense). Clicking on a control displays the selected property for all senses and subsenses. If a given sense or subsense does not have the selected property, its place in the right column is left empty; if no sense in the entry has it, the whole column is empty. The following properties are provided, with the primary source of the property in parentheses (see note above for the proportion of entries and senses for which these properties are available):
  1. labels mode shows classification or domain labels (relatively few preposition senses have such domain-specific labels) (ODE)
  2. word forms mode shows word forms and inflectional morphology (for simple prepositions, this will be only the preposition itself, while for phrases, some variant forms may be listed) (ODE)
  3. semantic relations mode shows the semantic relation or semantic role name that has been assigned by the lexicographer (TPP)
  4. complement properties mode shows properties of the object of the preposition or phrasal preposition (TPP)
  5. attachment properties mode shows properties of the linguistic entity to which the prepositional phrase (i.e., the preposition plus the complement) is attached, usually a noun, a verb, or an adjective (TPP)
  6. Quirk syntax mode shows the syntactic positions in which a prepositional phrase headed by the preposition or phrasal preposition may appear: noun postmodifier (1); adverbial adjunct (2a), subjunct (2b), disjunct (2c), or conjunct (2d); and/or verb (3a) or adjective (3b) complement, as described in paragraph 9.1, p. 657 of Quirk et al. (TPP)
  7. Quirk paragraph mode shows the paragraph(s) in Quirk et al. where an in-depth discussion of the sense may be found, with an asterisk (*) indicating that the sense is not discussed (TPP)
  8. Frame::Element mode shows FrameNet (FN) Frame::FrameElement pairs in which prepositional phrases headed by the given preposition appear in the set of instance sentences tagged by the lexicographer (note that instance sets are available in FN only for major, single-word prepositions) (FN, TPP)
  9. other prepositions (short) mode shows other prepositions that have a highly similar sense, as judged by the lexicographer (TPP)
  10. other prepositions (long) mode shows other prepositions that have been found in the FN sentences expressing the same Frame::Element pair (TPP)
  11. sense relations mode shows the relation of the subsenses to the core sense, usually either specific (a more narrow sense) or extension (a broader sense)
  12. comments mode shows any notes the lexicographer may have made in analyzing the sense
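
For readers who want to work with these properties programmatically, the Python sketch below shows one way the twelve modes and the Quirk syntax codes listed above might be represented, including the empty cell for a sense that lacks the selected property. The dictionary-per-sense layout, the function name, and the sample values are assumptions; only the mode names and the code meanings come from the list above.

```python
# Codes as listed under the Quirk syntax mode above.
QUIRK_SYNTAX = {
    "1": "noun postmodifier",
    "2a": "adverbial adjunct",
    "2b": "adverbial subjunct",
    "2c": "adverbial disjunct",
    "2d": "adverbial conjunct",
    "3a": "verb complement",
    "3b": "adjective complement",
}

PROPERTY_MODES = [
    "labels", "word forms", "semantic relations", "complement properties",
    "attachment properties", "Quirk syntax", "Quirk paragraph", "Frame::Element",
    "other prepositions (short)", "other prepositions (long)",
    "sense relations", "comments",
]


def right_column(senses, mode):
    """Return one display string per sense; "" where the property is missing."""
    if mode not in PROPERTY_MODES:
        raise ValueError(f"unknown property mode: {mode!r}")
    cells = []
    for sense in senses:
        value = sense.get(mode, "")
        if mode == "Quirk syntax" and value:
            # Assumes codes are stored as a list of strings such as ["1", "2a"].
            value = "; ".join(QUIRK_SYNTAX.get(code, code) for code in value)
        cells.append(value)
    return cells


# Invented sample senses, for illustration only.
SAMPLE = [
    {"Quirk syntax": ["1", "2a"], "comments": "example note"},
    {"comments": "another note"},
]
```

Calling right_column(SAMPLE, "Quirk syntax") gives a decoded cell for the first sense and an empty cell for the second, which has no Quirk syntax value.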

Ken Litkowski
CL Research
2006