Information Access in a Multilingual World

Workshop to be held in conjunction with the SIGIR Conference on July 23, 2009, in cooperation with the Japanese Info-plosion project.

This workshop is intended to collate experiences and plans for the real-world application of multilingual technology to information access.

The application of multilingual search, summarisation, filtering, monitoring, and other technologies is only now starting to become real.

Multilinguality can mean different things: from the identification of potentially relevant information in languages not understood by system users, to the provision of information in several languages to users fluent in them.

 

Call for papers, presentations, participation

 

Background

Since 2006, research and development in this area has progressed substantially, with the release of the JRC-Acquis Multilingual Parallel Corpus (22 European languages), operational multilingual news summarization sites, a multilingual patent corpus, research into multilingual interactive image retrieval, and a new evaluation campaign for Indian subcontinent languages (FIRE). Large national initiatives in Europe (QUAERO in France and THESEUS in Germany) and Japan (InfoPlosion) have been launched to promote the transition of information access research from laboratories into industrial practice. To be incorporated into such initiatives, multilingual research efforts must prove their usefulness for industrial deployment. While benchmarking efforts can prove the usefulness of some technical component, the influence of a technology on the take-up of an application is only part of the picture.

What would constitute realistic success criteria for multilingual information access systems? This question poses a significant challenge for research projects. While the information retrieval field is, to a large extent, defined by its concern for evaluation, evaluation schemes risk being seen as irrelevant by system providers if the data they investigate are not of realistic scale and if the use cases and scenarios the systems are tested for do not appear valid. How can a research project prove the validity of its approaches in the face of data access challenges and interface design issues?

This workshop will explore realistic success predictors for multilingual information access systems, including the following questions:

  • What user communities do we aim to aid with the systems we are proposing?
  • What use cases and usage scenarios do we envision?
  • What sort of contextual factors should be taken into account in design and evaluation?
  • What sort of outcome variables can be studied to establish usefulness of a system?

We invite participants from research projects to discuss their use cases and envisioned application scenarios, and from practical industrial projects to discuss their experiences in deploying technology; we expect to hear reports of discussions with user and stakeholder organisations and to see designs for habitable interaction with data in multiple languages.

Organisation

Chairs

Fredric Gey, Noriko Kando, Jussi Karlgren.

Programme committee

 

Martin Braschler, Zürich
Aitao Chen, Yahoo! Research
Kuang-hua Chen, National Taiwan University
Ruihua Chen, Microsoft Research
Nicola Ferro, Padua
Atsushi Fujii, Tsukuba University
Julio Gonzalo, Madrid
Gareth Jones, Dublin
Kazuaki Kishida, Keio University
Sadao Kurohashi, Kyoto University
Kazuko Kuriyama, Shirayuri College
Gina-Anne Levow, Chicago
Chin-Yew Lin, Microsoft Research
Thomas Mandl, Hildesheim
James Mayfield, Johns Hopkins University, USA
Mandar Mitra, Indian Statistical Institute
Tatsunori Mori, Yokohama National University
Isabelle Moulinier, Thomson Legal, USA
Paul McNamee, Johns Hopkins University, USA
Douglas Oard, University of Maryland, USA
Maarten de Rijke, Amsterdam
Miguel Ruiz, University of North Texas, USA
Yohei Seki, Toyohashi University of Technology
Benjamin Tsou, City University of Hong Kong
Takehito Utsuro, Tsukuba University
Christa Womser-Hacker, Hildesheim
Kam-fai Wong, Chinese University of Hong Kong
Masaharu Yoshioka, Hokkaido University

 

Tentative program

The workshop will feature one invited keynote talk, several brief project presentations, and, most importantly, ample time for discussion and debate.

Keynote talk by Ralf Steinberger, Joint Research Centre of the European Commission, presenting the Joint Research Centre's multilingual media monitoring and analysis applications, including NewsExplorer (http://press.jrc.it/overview.html)

Accepted papers

Fredric Gey:
Romanization -- An Untapped Resource for Out-of-Vocabulary Machine Translation for CLIR

John I. Tait:
What’s wrong with Cross-Lingual IR?

David Nettleton, Mari-Carmen Marcos, Bartolomé Mesa:
User Study of the Assignment of Objective and Subjective Type Tags to Images in Internet, considering Native and non Native English Language Taggers

Elena Filatova:
Multilingual Wikipedia, Summarization, and Information Trustworthiness

Michael Yoshitaka Erlewine:
Ubiquity: Designing a Multilingual Natural Language Interface

Masaharu Yoshioka:
NSContrast: An Exploratory News Article Analysis System that Characterizes the Differences between News Sites

Elena Montiel-Ponsoda, Mauricio Espinoza, Guadalupe Aguado de Cea:
Multilingual Ontologies for Information Access

Jiangping Chen, Miguel Ruiz:
Towards an Integrative Approach to Cross-Language Information Access for Digital Libraries

Wei Che (Darren) Huang, Andrew Trotman, Shlomo Geva:
A Virtual Evaluation Track for Cross Language Link Discovery

Kashif Riaz:
Urdu is not Hindi for Information Access

Hideki Isozaki, Tsutomu Hirao, Katsuhito Sudoh, Jun Suzuki, Akinori Fujino, Hajime Tsukada, Masaaki Nagata:
A Patient Support System based on Crosslingual IR and Semi-supervised Learning
 

Documentation

In addition to the workshop notes and a summary report in the SIGIR Forum, the contributions to the workshop will be published as a technical report.

The results of the workshop will also be reported in a white paper delineating success criteria for profitable deployment of multilingual technology. All workshop participants will be invited to contribute to and co-author the white paper.

Dates

June 15
Deadline for submission (extended, to conform with other SIGIR workshops)
June 22
Notification of acceptance
June 30
Deadline for final revised version of paper
July 23
Workshop day!

Contact

clir2009@sics.se