LLAVES
Unlocking topicality in text - foreground and background information in written language
One-year assessment project funded by the Information Society Technologies Program
under its Future and Emerging Technologies Open scheme.
LLAVES investigates characteristics of clauses in written text with the
objective of distinguishing different types of clause.
Hypotheses
- Clauses: Clauses have different information-bearing roles.
- Mechanisms: The role of a clause is conveyed from author to reader by its
  surface form or position.
- Generality: Clausal roles are not language specific, but are heavily
  specific to text ecology and domain.
- How? Different sorts of clause can be coaxed apart using language- and
  domain-specific mechanisms.
- Use: These mechanisms can be applied in the development of practical tools.
Criteria for success
The overall success criterion of the project is to find that any or all of
the following apply:
- there are systematic syntactic differences between foreground and
  background clauses, as determined by kappa and Mann-Whitney U tests;
- indexing material separately for foreground and background clauses gives
  better results than state-of-the-art indexing, as determined by TREC
  evaluation;
- there are systematic likenesses between foreground and background clauses
  across languages in translated texts, according to the kappa test; and
- summarizing texts by selecting foreground vs. background clauses gives a
  basis for multi-document summaries, as judged by human assessors' quality
  ratings of the summaries, again evaluated with the kappa test.
All items are quantitative; the first two items have built-in thresholds
for hypothesis testing.
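As an illustration of the two statistics named in the criteria above, the following is a minimal sketch in plain Python. It is not the project's own evaluation code: the function names, the "fg"/"bg" labels, and all data values are hypothetical, and the sketch computes only Cohen's kappa for two annotators and the Mann-Whitney U statistic, leaving the comparison against a significance threshold aside.

```python
from collections import Counter
from itertools import product


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1.0 - expected)


def mann_whitney_u(sample_x, sample_y):
    """U statistic for sample_x: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x, y in product(sample_x, sample_y))


# Hypothetical data: two annotators marking eight clauses as
# foreground ("fg") or background ("bg").
annotator_1 = ["fg", "fg", "bg", "bg", "fg", "bg", "fg", "bg"]
annotator_2 = ["fg", "fg", "bg", "fg", "fg", "bg", "bg", "bg"]
kappa = cohen_kappa(annotator_1, annotator_2)

# Hypothetical values of some syntactic feature (for example clause
# length in tokens) for foreground and background clauses.
fg_feature = [2, 3, 3, 4]
bg_feature = [4, 5, 6, 6]
u_fg = mann_whitney_u(fg_feature, bg_feature)
u_bg = mann_whitney_u(bg_feature, fg_feature)

print(kappa)       # 0.5
print(u_fg, u_bg)  # 0.5 15.5
```

The two U values always sum to the product of the sample sizes (here 4 x 4 = 16); a value close to either extreme suggests that the feature separates the two clause types.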
During the course of the project we found good reason to concentrate on
retrieval evaluation rather than evaluating summarization competence.
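The idea of indexing foreground and background clauses separately can be sketched as a toy inverted index with one posting table per clause role. Everything here is an assumption made for illustration: the function names, the "fg"/"bg" role labels, the example documents, and in particular the role weights and the term-frequency scoring, none of which are taken from the project. The sketch also assumes each clause has already been assigned a role by some earlier step.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: count}, separately per clause role.

    docs: {doc_id: [(role, clause_text), ...]} with role "fg" or "bg".
    """
    index = {"fg": defaultdict(dict), "bg": defaultdict(dict)}
    for doc_id, clauses in docs.items():
        for role, text in clauses:
            for term in text.lower().split():
                postings = index[role][term]
                postings[doc_id] = postings.get(doc_id, 0) + 1
    return index


def search(index, query, weights=None):
    """Rank documents by role-weighted term frequency (an assumed scoring)."""
    weights = weights or {"fg": 2.0, "bg": 1.0}  # hypothetical weights
    scores = defaultdict(float)
    for term in query.lower().split():
        for role, weight in weights.items():
            for doc_id, count in index[role].get(term, {}).items():
                scores[doc_id] += weight * count
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical documents, each a list of (role, clause) pairs.
docs = {
    "d1": [("fg", "the parser fails on long sentences"),
           ("bg", "as noted in earlier work on retrieval")],
    "d2": [("fg", "retrieval improves with clause weighting"),
           ("bg", "the parser was run overnight")],
}
index = build_index(docs)

print(search(index, "parser"))     # [('d1', 2.0), ('d2', 1.0)]
print(search(index, "retrieval"))  # [('d2', 2.0), ('d1', 1.0)]
```

With equal weights for both roles the sketch reduces to ordinary term-frequency ranking over the whole text, which gives a simple baseline to compare against.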
Project Reports
Periodic Progress Report, December 2000
Final Report, March 2001
Project Presentation Slides (long! 24 slides!)
Project Presentation Slides (short! 3 slides!)
Technology Implementation Plan, February - March 2000
Project partners
- SICS, Jussi Karlgren
- Conexor Oy, Timo Järvinen and Pasi Tapanainen
Project plan