Botond Pakucs
Björn Gambäck
Preben Hansen
botte@sics.se
gamback@sics.se
preben@sics.se
Information and Language Engineering
Swedish Institute of Computer Science
Box 1263, S-164 29 Kista, Sweden

Computational Linguistics
University of Helsinki
P.O. Box 4, SF-00014 Helsinki, Finland
Information retrieval from other media is often analogous to text-based retrieval; however, accessing documents in, e.g., audio or video formats raises additional problems, in particular with respect to document segmentation, choice of indexing features, and robustness. We review these difficulties, together with some previous attempts to overcome them, and then describe a flexible, modular IR system designed specifically with these issues in mind.