3rd ESAIR Workshop, October 30, 2010
Jaap Kamps Jussi Karlgren Ralf Schenkel
University of Amsterdam SICS, Stockholm MPI, Saarland University

The 3rd ESAIR workshop was held on October 30, 2010 in Toronto in conjunction with the 19th CIKM conference. The topic of the workshop was semantic annotations, by which we mean linguistic annotations (such as named entities, semantic classes or roles, etc.) as well as user annotations (such as microformats, RDF, tags, etc.) and other related information added to textual or other objects. With the advent of several new and robust analysis tools for text, speech, and video, more structured data sources on the web thanks to modern web languages, and large numbers of services that support user annotation and tagging, we can expect new application areas for this new, deeper, and enhanced analysis of information. The CIKM conference keynotes by Jamie Callan ("Search engine support for software applications"), Gregory Grefenstette ("Use of semantics in real life applications"), and Sue Dumais ("Temporal dynamics and information retrieval") touched upon this issue, as did several presentations and posters at the conference. We can expect the launch of new tools and applications to exploit semantic cues profitably in the near future.

The aim of this workshop was not to discuss the craft of semantic annotation itself, but rather the applications of semantic annotation to information access tasks on various levels of abstraction, such as ad-hoc retrieval, classification, browsing, text mining, summarization, question answering, etc. The program included two keynote talks, seventeen poster presentations, three break-out discussion groups, and a final discussion session, followed by a Saturday night social event in a nearby Toronto establishment.

The previous two editions of ESAIR, organised by Hugo Zaragoza and Omar Alonso, provided a fruitful basis for discussion as a point of departure.

The first two ESAIR workshops were exploratory workshops to discuss the research space around the topic. A selection of results from them is published in the journal Information Processing and Management (IPM), Volume 46, Issue 4 (July 2010). This third edition of the workshop discussed various aspects of applying semantic annotations in practice.

Proceedings

The proceedings of this workshop are available from the ACM Digital Library and are indexed in the DBLP Computer Science Bibliography.

Challenge questions

The call for participation in the workshop included a set of challenge questions to focus and guide the discussion. These questions were addressed in one break-out group each.

Application/Use Case
What use cases make obvious the need for semantic annotation of information? What tasks cannot be solved by document retrieval using the traditional bag-of-words? What are the prerequisites of successful application? (slides from the discussion)
Annotation and Aggregation
What types of annotation are available? Are there crucial differences between author-, software-, user-, and machine-generated annotations? Named entities and temporal expressions on the one hand, and sentiment and hedging on the other, are examples of analyses beyond topic that have moved to profitable application. What is the family likeness between the various procedures we might call "annotation"? What is holding back the widespread use of these annotations? Are there other types of annotations that are within our grasp? (report from the discussion)
Searcher/Query
With shallow 2.4-word navigational queries, there may be little benefit in semantic annotations. What expressive power is hidden in the semantic annotation? What is keeping searchers from exploring these powerful search requests? (slides from the discussion)

Keynote addresses

Liz Liddy
Questions to be Asked & Answered as to NLP's Role in Improving Semantic Annotation (Slides)

In the realm of Information Retrieval, why is Semantic Annotation needed? What has changed? Is it the users, the sources, the genres, the technologies, the applications, the queries? If there are differences, why and how can Semantic Annotation help? And more specifically, how and what is Natural Language Processing (NLP) contributing? This talk will share some practical use cases of where and how NLP-based semantic annotation has demonstrated its utility (or a solid promise of utility) in some information access tasks, by virtue of the depth and richness of annotation possible with today's more powerful NLP technologies. What is showing promise is the ability to understand how to utilize the higher levels of language processing to do Semantic Annotation. This is largely through the introduction of the Pragmatic level of language processing: the functional perspective which provides the extra understanding that comes from the study of language in actual use. Pragmatics is concerned with the aspects of language which require context to be understood; basically, with how situational context is lexicalized and grammaticalized. In Pragmatics, the goal is to recognize the extra meaning that humans read into utterances, which other levels of language processing have not recognized as being encoded in them. Semantic Annotation can then go the next step and amplify current annotations with this additional contextual and intentional knowledge. Examples will be shown of what is being done with NLP today that couldn't be done, or simply wasn't being done, in earlier days of IR. In applications of keenest interest today, there is an increased relative emphasis on dialogue, interaction, real-time, social, and exploratory search, where understanding the user's intent or plan in their query is key.
Applications in exploratory search, eDiscovery, sentiment recognition, collaborative search, along with very, very large scale medical insurance consumer applications will be considered in terms of how and what Semantic Annotations can improve by adding more advanced levels of Natural Language Processing.

Maarten Marx
The Surplus Value of Semantic Annotations (Slides)

We compare the costs of semantic annotation of textual documents to its benefits for information processing tasks. Semantic annotation can improve the performance of retrieval tasks and facilitates an improved search experience through faceted search, focused retrieval, better document summaries, and result grouping. Applications which summarize large collections of text or explain real world phenomena based on textual evidence may receive even more benefit from semantic annotations. Semantic annotation creates surplus value if the annotated data can be used beyond any foreseen application, in particular by third parties linking your data by means of your semantic markup to other data with similar markup. We present a list of properties of the annotated data which optimize this surplus value. They are derived from the principle which states that annotation should facilitate the reuse of data in a mashup without information being lost or distorted. For the Dutch House of Parliament we annotated the parliamentary proceedings based on this principle. Concrete examples from this data collection will illustrate the surplus value enhancing properties.

Posters

Brandeis Marshall
Modeling Betweenness for Question Answering
Walter Tichy, Sven Koerner, and Mathias Landhäußer
Creating Software Models with Semantic Annotation
Antonio Badia
Is Formalizing Events Necessary for Full Exploitation?
Feza Baskaya, Jaana Kekäläinen and Kalervo Järvelin
A Tool for Ontology-editing and Ontology-based Information Exploration
Pham Huy Anh and Takashi Yukawa
Cross language information retrieval based on concept base and Language Grid
Blaz Fortuna, Dunja Mladenic, and Marko Grobelnik
Application of Semantic Annotations to Predicting Users' Demographics
Hany Azzam and Thomas Roelleke
A Semantic Query Rating Scheme
Nikolaos Lagos, Stefania Castellani and Aaron Kaplan
Semantic Annotations for Digital Investigations
Alan Said, Ernesto W. De Luca, and Jérôme Kunegis
Exploiting Hierarchical Tags for Context-awareness
Monica Marrero, Julián Urbano, Jorge Morato and Sonia Sánchez-Cuadrado
On the Definition of Patterns for Semantic Annotation
Arjen de Vries, Wouter Alink, and Roberto Cornacchia
Search by Strategy
Vicente Palacios, Juan Lloréns, Sonia Sánchez-Cuadrado and Monica Marrero
Tagging for Improved Semantic Interpretation of XML Documents
Fredric Gey, Noriko Kando, and Ray Larson
The Crucial Role of Semantic Discovery and Markup in Geo-temporal Search
Shawn Bowers, Huiping Cao, Mark Schildhauer, Matt Jones, Ben Leinfelder, and Margaret O'Brien
A Semantic Annotation Framework for Retrieving and Analyzing Observational Datasets
Sumithra Velupillai
Semantic Annotations in Clinical Documentation -- Exploring Potentials for Future Information Retrieval
Karen Shiells, Omar Alonso, and Ho John Lee
Generating Document Summaries from User Annotations
Paolo Ferragina and Ugo Scaiella
TAGME: On-the-fly Annotation of Short Text Fragments by Wikipedia Entities

Program committee