=================================================================
 SemEval-2007: 4th International Workshop on Semantic Evaluations
 *Information for Task Participants*
 Contributors to this document:
 Eneko Agirre, Lluís Màrquez, Richard Wicentowski
=================================================================

This document provides useful information for task participants at SemEval-2007. Some of the information is necessarily partial and incomplete at the moment. This document is available at the SemEval-2007 official website.

Updates to the document:
  Jan 18 2007: * added section on registration IDs
  Feb 13 2007: * revised the sections on registration and on the general rules about team-system definition and paper authorship

Contact Information
==========================================================
SemEval-2007 website:    http://nlp.cs.swarthmore.edu/semeval/
SemEval-2007 organizers: semeval@cs.swarthmore.edu

Task participants are responsible for
==========================================================
- submitting the system results
- writing a paper describing the system
- making a presentation/poster at the SemEval-2007 workshop

General Scheduling
==========================================================
This is the general schedule. Note that some tasks might have specific constraints. We will be working under a very tight schedule. The hard deadlines for task participants are marked below:
  • Trial datasets: January 3, 2007
  • Evaluation start: February 26, 2007
  • Evaluation end: April 1, 2007, 24:00 (GMT-7) HARD DEADLINE
  • Task coordinators send evaluation: April 10, 2007
  • Description papers due: April 17, 2007 HARD DEADLINE
  • Paper reviews due: April 27, 2007
  • Camera-ready papers: May 6, 2007 HARD DEADLINE
  • Workshop (in conjunction with ACL-07 in Prague): June 23-24, 2007

Task descriptions
==========================================================
Descriptions of the 19 accepted tasks have been available at the SemEval-2007 website since October 11.
Registration
==========================================================
Registration is required before downloading task training/test datasets.

When registering, teams need to choose carefully how they wish to be identified. Note that both your team name and the name(s) of your system(s) will be used on the website, in the proceedings, in the result tables, and in your system paper title. For this reason, participants must follow the guidelines below. (Additional information can be found in the section "General rules about team-system definition and paper authorship".)

- Each team should be identified by a descriptive 2-5 letter abbreviation, preferably taken from the affiliation (e.g. UBC for the University of the Basque Country).

- Sites may register more than one team, where each team signs up for a number of tasks. Team members need not be disjoint, but two teams with exactly the same members may not register separately. Such teams should use an abbreviation-label identifier, choosing the label with care: options include the first letters of the members' names (e.g. UBC-ALMB, UBC-AC) or simply a number (e.g. UBC1, UBC2).

- To ease browsing the proceedings and finding citations, teams must put their team name in the title of their system paper (e.g. "UBC-ALMB: WSD using ..."). Please take into account the "General rules about team-system definition and paper authorship" in these guidelines, and try to match team identifiers and papers.

Please try to follow these guidelines. If you have trouble with them, please contact the SemEval coordinators.
General rules about team-system definition and paper authorship
================================================================
The organizers would like to avoid multiple submissions of very similar systems from the same teams, as well as multiple papers from the same team describing one basic system applied to similar tasks. We would also like to make the proceedings easier to browse, making the mapping from system identifier to paper title clearer. We therefore set the following general rules:

1. On registration, each participating team will provide a unique team ID and a list of its members.
2. The team will then specify the tasks it is participating in, adding a short description of its systems (e.g. "supervised system based on SVMs and kernels"). The webpage will then return unique "team-task" keywords (e.g. UBC-ALMB-11 for task 11).

3. Each team may submit only a single system for each task. The system will be identified by the "team-task" keyword. If multiple results are uploaded, all but the last will be automatically discarded.

4. Each team will get one paper in the proceedings.

Exceptions to rule 3 are allowed when one team prepares two substantially different systems for the same task (or the same family of tasks). Participants should contact the task organizers and the SemEval organizers, who may then grant authorization. In any case, all systems will have to be described in a single (possibly longer) paper.

Exceptions to rule 4 are allowed when one team participates in two different kinds of tasks. Again, participants should contact the task organizers and the SemEval organizers, who may then grant authorization.

Downloading the Datasets
==========================================================
Trial data

By January 3, 2007, we expect task organizers to provide a trial dataset for each task. This dataset can be fairly small but should be as complete as possible. The goal of the trial datasets is to allow participating teams to start working on their systems. Ideally, the trial datasets should be complete enough that participants can work in exactly the same framework as during the evaluation period, except for the amount of data. To that end, trial datasets should be accompanied by detailed documentation on the data format, evaluation measures and software, accompanying resources, baseline systems, etc. We have recommended that organizers provide participants with scripts to ease data management, feature generation, etc.
Final training/test datasets

Complete datasets (training and test), full documentation, and evaluation software will be available by February 26. The SemEval-2007 website will centralize the uploading/downloading of all datasets.

Evaluation Period
==========================================================
The evaluation period will comprise the 5 weeks from February 26 to April 1. During this period, participants can download training/test data for any given task at any time, with the following restrictions:

* Results for a given task must be submitted no later than 21 days after downloading the training data for that particular task.
* Results for a given task must be submitted no later than 7 days after downloading the test data for that particular task.

Time constraints will be checked automatically by the downloading application. Before the test period expires, participants will upload the answer files output by their systems, again through the application on the SemEval-2007 website. Most tasks will abide by these time constraints, but there are a few exceptional cases with different evaluation schedules, which will be announced by the task organizers.

Results analysis and paper preparation
==========================================================
Task organizers will be provided with the answer files submitted by the participants in their tasks (by April 2). Task organizers evaluate the submissions and return the results to participants (by April 10). Each participant and each task organizer will then write a paper describing their system/task (due April 17). Papers are reviewed by the SemEval-2007 program committee (reviews due April 27), then revised by the authors, and final camera-ready papers are submitted by May 6. This procedure has internal dependencies.
In particular, task organizers should contact participants as needed, at any time during April 2-16, to obtain additional information about their participation in the task (basic architecture of the system, knowledge sources, features, etc.) so that they can include this information in the task description paper. In the other direction, task organizers should provide participants with any information about the task that is useful for producing a better system description paper; in particular, they may consider informing participants about baseline results, results obtained by other participants, etc.

Coherence between the task description paper and the participants' system description papers is very welcome. A good practice here is to enforce a common formatting style for the presentation of results across all participants. Task organizers might also release guidelines requiring participants to include in their papers specific information relevant to the task analysis (e.g., knowledge sources and features used, learning algorithms, training/testing times, error analysis, etc.).

All paper submissions must comply with the ACL style files, available at the conference website. Paper submission will be online; the website with the submission form will be announced in advance.

At the Workshop
==========================================================
Task participants are invited to present a description of their systems at the workshop, which will take place in conjunction with ACL-2007 in Prague on June 23-24. When we decide on the workshop details, we will announce the type of presentations to be given (oral presentations, posters, etc.).

Acknowledgements
==========================================================
Thanks to Rada Mihalcea and Phil Edmonds for their continuous advice and for providing us with the very useful handbook for Senseval organizers.