Goal and purpose:
Text (and language in general) has ABOUTNESS; it has meaning, or semantic content. We as (computational) linguists are highly adept at dissecting text on a number of different levels: we can perform grammatical analysis of the words in the text, we can detect animacy and salience, we can do syntactic analysis and build parse trees of partial and whole sentences, and we can even identify and track topics throughout the text. However, we are comparatively inept when it comes to identifying the semantic content, or meaning, of the text. Or, to put matters in more concise terms, even though there are theories and methods that claim to accomplish this, there is a striking lack of consensus regarding the acquisition, representation, and practical utility of semantic content.
The aim of this workshop is not only to provide a forum for researchers to present and discuss theories and methods for semantic content acquisition and representation, but also to discuss a common evaluation methodology whereby different approaches can be adequately compared. As a first step in this direction, participants will be encouraged to apply their methods, or relate their theories, to a specific test corpus that will be available in several of the Nordic languages and English. Participants will be expected to demonstrate what kind of results their methods can yield. In this workshop, the relevance of an approach to meaning is judged only by what it can tell us about real language data. The overall purpose of this workshop is thus to put theories and models into action.
Questions of interest:
Is there a place in linguistic theory for a situation- and speaker-independent semantic model beyond syntactic models?
What are the borders, if any, between morphosyntax, lexicon and pragmatics on the one hand and semantic models on the other?
Are explicit semantic models necessary, useful or desirable? (Or should they be incidental to morphosyntactic and lexical analysis on the one hand and pragmatic discourse analysis on the other?)
We encourage submissions in the following areas:
Discussions of foundational theoretical issues concerning meaning and representation in general.
Methods for supervised, unsupervised and weakly supervised acquisition (machine learning, statistical, example- or rule-based, hybrid etc.) of semantic content.
Representational schemes for semantic content (wordnets, vectorial, logic etc.).
Evaluation of semantic content acquisition methods, and semantic content representations (test collections, evaluation metrics etc.).
Applications of semantic content representations (information retrieval, dialogue systems, tools for language learning etc.).
Online submission is now open at http://www.easychair.org/SCAR2007/. Submissions should not exceed 8 pages, and should use the ACL style files available at http://ufal.mff.cuni.cz/acl2007/styles/. Reviewing will be blind, so papers should not include the authors' names and affiliations, and self-references should be avoided.
Proceedings are now available at: ftp://ftp.sics.se/pub/SICS-reports/Reports/SICS-T--2007-06--SE.pdf.
Submission deadline: April 2
Notification of acceptance: April 26
Final papers due: May 7
Workshop: May 24
Location:
NoDaLiDa 2007, Tartu, Estonia.
The workshop will take place at the Faculty of Mathematics and Computer Science (J. Liivi 2), room 202.
Organizers:
Magnus Sahlgren, SICS (mange@sics.se)
Ola Knutsson, KTH (knutsson@csc.kth.se)
Peter Bruza, Queensland University of Technology, Australia
Gregory Grefenstette, CEA LIST, France
Jussi Karlgren, SICS, Sweden
Alessandro Lenci, University of Pisa, Italy
Hinrich Schütze, University of Stuttgart, Germany
Dominic Widdows, MAYA Design, USA
Fabrizio Sebastiani, Consiglio Nazionale delle Ricerche, Italy
Program:
09:45-10:00 Introduction by the organizers
10:00-10:30 Octavian Popescu and Bernardo Magnini: Sense Discriminative Patterns for Word Sense Disambiguation
10:30-11:00 Coffee break
11:00-11:30 Henrik Oxhammar: Evaluating Feature Selection Techniques on Semantic Likeness
11:30-12:00 Jaakko Väyrynen, Timo Honkela and Lasse Lindqvist: Towards Explicit Semantic Features using Thresholded Independent Component Analysis
12:00-12:30 Discussion on statistical methods for semantic content acquisition (led by the organizers)
12:30-14:00 Lunch
14:00-14:15 Demo: Ontological-Semantic Internet Search (Christian Hempelmann)
14:15-14:30 Demo: Infomat - Visualizing and Exploring Vector Space Models (Magnus Rosell)
14:30-15:00 Anne Tamm: Representing achievements from Estonian transitive sentences
15:00-15:30 Summary of results and conclusions (led by the organizers)