Project: Attitude in text

Kristofer Franzén
Jussi Karlgren
Magnus Sahlgren
Gunnar Eriksson

This project aims to study attitudinal expressions in human language, specifically in written text. Attitudinal expressions are expressions such as "I like hot dogs" or "Bertram has an excellent V-bottom and handles head seas well." Such expressions typically, though not always, contain an explicit or easily inferred agent who holds the attitude, a clearly evaluative term, and something that is spoken about and is the target of the attitude.

The project aims to identify and analyze attitudinal expressions based on how they typically occur in text.

To help us, we have a technique for distributional analysis of linguistic data, developed in-house, that has proven very useful for modeling simpler semantic relations: in previously published and well-received experimental results, distributional analysis has allowed us to find synonyms and semantically related words in textual data. In this project we intend to generalize the earlier model for distributional analysis by combining it with more sophisticated linguistic modeling. Joining linguistic knowledge seamlessly with statistical language-processing mechanisms has long been a challenge.
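As a rough illustration of what distributional analysis means in this setting, the following minimal sketch represents each word by the words that occur near it and treats words with similar neighbourhoods as related. It uses plain co-occurrence counts and cosine similarity, which is an assumption made for illustration and not the project's own technique. The toy corpus, the function names and the window size are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest(vectors, target, k=3):
    """Return the k words whose context vectors are most similar to the target's."""
    target_vec = vectors.get(target, Counter())
    scores = [(cosine(target_vec, vec), word)
              for word, vec in vectors.items() if word != target]
    ranked = sorted(scores, key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in ranked[:k]]

# Toy corpus, invented for illustration only.
corpus = [
    "i like warm sausage".split(),
    "i love warm sausage".split(),
    "i like cold beer".split(),
    "i love cold beer".split(),
    "the boat takes head seas well".split(),
]
vectors = cooccurrence_vectors(corpus)
print(nearest(vectors, "like", k=1))
```

In this toy corpus "like" and "love" occur in identical contexts, so each is the other's nearest neighbour. Real data would need far larger collections and some weighting of the raw counts.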

Attitude in text


This project, funded by the Swedish Research Council, will study attitudinal expression in human language, specifically text and written discourse. Attitudinal expressions are expressions such as "I like sauerkraut." or "Even physicians have come to dislike the traditional fee-for-service model." This project will aim to identify and analyze attitudinal expressions by their textual characteristics.

Publications

Karlgren, Jussi and Eriksson, Gunnar and Täckström, Oscar and Sahlgren, Magnus (2010) Between Bags and Trees - Constructional Patterns in Text Used for Attitude Identification. In: ECIR 2010, 32nd European Conference on Information Retrieval, 2010-03-28 -- 2010-03-31, Milton Keynes, Great Britain. (In Press)

Sahlgren, Magnus and Karlgren, Jussi (2009) Terminology mining in social media. In: The 18th ACM Conference on Information and Knowledge Management (CIKM 2009), 2-6 Nov 2009, Hong Kong.

Karlgren, Jussi and Eriksson, Gunnar and Täckström, Oscar (2008) SICS at NTCIR-7 MOAT: constructions represented in parallel with lexical items. In: The 7th NTCIR Workshop (2007/2008) - Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Cross-Lingual Information Access, 16-19 December 2008, Tokyo, Japan.

Karlgren, Jussi (2008) Changing the subject; one way of measuring trust in information. In: Workshop on Novel Methodologies for Evaluation in Information Retrieval, 30 March 2008, Glasgow, Scotland.

Karlgren, Jussi and Dalianis, Hercules and Jongejan, Bart (2008) Experiments to investigate the connection between case distribution and topical relevance of search terms. In: 8th international conference on Language Resources and Evaluation, LREC'08, 27-30 May 2008, Marrakech, Morocco.

Sahlgren, Magnus and Karlgren, Jussi (2008) Buzz monitoring in word space. In: European Conference on Intelligence and Security Informatics (EuroISI 2008), 3-5 December 2008, Esbjerg, Denmark.

Karlgren, Jussi and Eriksson, Gunnar (2007) Authors, Genre, and Linguistic Convention. In: SIGIR Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection, 27 August 2007, Amsterdam.

Sahlgren, Magnus and Karlgren, Jussi and Eriksson, Gunnar (2007) SICS: Valence annotation based on seeds in word space. In: Fourth International Workshop on Semantic Evaluations (SemEval-2007), June 2007, Prague, Czech Republic.

Uzuner, Ozlem and Argamon, Shlomo and Karlgren, Jussi (2006) Stylistics for text retrieval in practice. ACM SIGIR Forum, 40 (2).

Karlgren, Jussi (2006) New Text - New Conversations in the Media Landscape. ERCIM News, 66.

Research Goals and Work Plan

Research Efforts

  1. formulate an adequate and explanatorily satisfactory framework for the understanding and analysis of textual topical elements beyond the level of abstraction afforded by today’s syntactic models;
  2. formulate a theoretically sound and computationally functional framework for the processing and analysis of contextual and distributional information in terms of said textual elements;
  3. study the expression of attitude and perspective in terms of distributional data of attitudinal expression generally and said textual elements specifically;
  4. evaluate project results quantitatively using valid target categories.

Work Plan

  1. Set up a test collection of text, specified by genre. Many newsprint collections, for example, include opinion pieces, editorials, reviews, columns, and letters to the editor.
  2. Manually annotate attitudinal expressions for some segment of the collection.
  3. Preprocess the text using available syntactic processors, e.g. the FDG toolkit from Connexor, which provides dependency analyses of clauses.
  4. Compute base distributional statistics for a seed set of pre-selected terms which are prototypically attitudinal: “good”, “bad”, “like”, “hate” etc.
  5. Find distributionally similar terms to the seed set.
  6. Investigate the contextual data for the given seed set and generalize from the found patterns using data from syntactic analysis of the texts: define context in terms of dependency relations rather than adjacency.
  7. Recompute distributional statistics using the more refined notion of context.
  8. Find distributionally similar terms to the seed set again, using the recomputed statistics.
  9. Compare the two sets of distributionally similar terms.
  10. Evaluate the sets using non-parametric statistical hypothesis testing.
  11. Evaluate the sets by human assessor scoring.
  12. Return to investigating the contextual data to obtain more refined definitions of context.
  13. Deliver distributional patterns for attitudinal expressions expressed on an appropriate level of abstraction.
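
The comparison and evaluation steps (9 to 11) could be sketched as follows. This is only an illustration under stated assumptions. The work plan does not name a specific test, so the Jaccard overlap and the exact two-sided sign test are choices made here. The term lists and assessor scores are invented, and the function names are hypothetical.

```python
from math import comb

def jaccard(a, b):
    """Overlap between two term sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def sign_test(scores_a, scores_b):
    """Exact two-sided sign test on paired scores; ties are dropped.

    Returns (wins for a, wins for b, p-value)."""
    wins_a = sum(x > y for x, y in zip(scores_a, scores_b))
    wins_b = sum(x < y for x, y in zip(scores_a, scores_b))
    n = wins_a + wins_b
    if n == 0:
        return wins_a, wins_b, 1.0
    k = min(wins_a, wins_b)
    # Probability of k or fewer wins under a fair coin, doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)

# Invented neighbour sets for one seed term under the two context definitions.
adjacency_terms = ["great", "nice", "the", "awful"]
dependency_terms = ["great", "nice", "awful", "terrible"]
print(jaccard(adjacency_terms, dependency_terms))

# Invented mean assessor scores per seed term (one pair per seed).
adjacency_scores = [2, 3, 1, 2, 2, 3]
dependency_scores = [3, 4, 2, 2, 3, 4]
print(sign_test(adjacency_scores, dependency_scores))
```

The sign test makes no assumption about the distribution of the scores, which suits ordinal assessor judgments, but it ignores the size of each difference. A rank-based test would be an alternative where magnitudes matter.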