| Human-System Interaction Human-Computer Interaction and Language Engineering | |||||
has also pursued several application projects in which adaptive and multimodal interaction has been applied, including on-route route guidance, information retrieval in intranet applications, spoken interaction for interactive TV applications, community information for the unemployed and many others. This double focus on core research and application projects allows the group to empirically validate its results from a practical perspective. Often, issues that arise from this applied perspective become central research issues for the group and develop into research projects of their own. Application projects also serve as a useful route for research dissemination.
State of the Art

The research area of adaptive and multimodal interaction is a subfield of Intelligent Interfaces. This research field was originally thought of as a subfield of artificial intelligence, and concentrated to a large extent on producing interfaces that "acted human", in particular by carrying out a conversation in natural language. Some early advancements in the area were the work on goal-oriented natural language dialogue by Allen and Perrault, and the very early work on modelling of user preferences by Rich, both in the late seventies and early eighties. Despite these early successful results, the research area stagnated during the eighties, meeting the same difficulties as artificial intelligence in general: to further research in this direction, it would be necessary to achieve general common sense in computer systems.

To counter this problem, the Intelligent Interface community has taken a different route from artificial intelligence. The target goal is no longer to produce human-like interfaces, but to produce interfaces that can collaborate with humans. This has caused a shift in focus from generic interfaces to application-specific interfaces, and from the development of generic user interface management systems to the development of smaller components which can be utilized for specific purposes, such as user or dialogue modelling components. Recent research also puts great emphasis on evaluating the interfaces for proof that they actually provide the sought improvements in user interaction compared to traditional interfaces.

In summary, Intelligent Interfaces can no longer be seen as a subfield of artificial intelligence, although techniques from that area are still being utilized in Intelligent Interface research. The central issue for the field now is the development of mechanisms for user and context adaptation of content and presentation; this is why we have chosen to describe our research as dealing with adaptive and multimodal interfaces, rather than with Intelligent Interfaces in general.

This novel research direction for Intelligent Interfaces has proven very successful, and commercial applications are now starting to appear. One success area is intelligent help and tutoring, where system intelligence is used to model the learning strategy or misconceptions of a user, so that the system can adapt its advice to the user's needs. One example of this is the recent launch of the MS Office Assistant shipped with MS Office '97. Another successful area of application has been adaptive information filtering based on user preferences, two examples being the Firefly and Pointcast services available over the Internet. Finally, this approach has produced the first successful applications of natural language dialogue, utilizing service-specific dialogue models to limit the necessary language capabilities of the system.
Research Issues

The research area of multimodal and adaptive interfaces consists of three focal issues:

· How a system can maintain the information needed about the individual user and the usage context: what information can be retrieved, and how should this information affect the system's knowledge and dialogue goals.
· What are the desired effects of this information: how should the user and context model affect information selection and presentation, and which adaptations are not desirable.
· How should the user-interaction model be constructed to empower users to inspect and control the intelligent mechanisms of the system.
Technical Description

The focus of the HUMLE group is split into two parts: basic research and technical development. In basic research, we focus separately on issues of user adaptivity and issues of multimodal interaction. In technical development, we focus on two areas of development, in which adaptivity and multimodality are combined to provide advanced user interactivity. These areas are collaborative interface agents and information retrieval. Finally, the group maintains a level of competence in basic technology for natural language interaction.
Research on User Adaptivity

Research in user adaptivity concerns both the technical development of techniques and algorithms for user adaptations, so-called user modelling issues, and research on design issues concerning how systems should adapt to fit different users. Concerning user modelling techniques, the HUMLE group has focused in particular on dynamic aspects of user adaptations: how systems can adapt to user characteristics that change (such as the user's current task and usage situation) and how systems can automatically acquire better models of users. In this research, we explore the possibilities of using machine-learning techniques.

Concerning the design of user-adaptive interfaces, we do research on how to allow for user adaptations, and how to allow users to inspect and control these adaptations. A focal issue here is which user characteristics should influence how the system is to adapt. Here, we have chosen to focus more on stable characteristics of users, such as cognitive abilities and physical impairments.
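The distinction drawn above between changing and stable user characteristics can be illustrated with a small sketch. This is not the HUMLE group's implementation: the class `UserModel`, its fields, the default weight of 0.5 and the learning rate are all hypothetical choices made for illustration. Dynamic interests are updated from observed feedback with a simple exponential update, while stable characteristics are set once and drive presentation.

```python
from dataclasses import dataclass, field


@dataclass
class UserModel:
    """Hypothetical user model: stable traits plus dynamically learned interests."""
    stable: dict = field(default_factory=dict)     # e.g. {"low_vision": True}
    interests: dict = field(default_factory=dict)  # topic -> weight in [0, 1]
    learning_rate: float = 0.3                     # assumed value, not from the source

    def observe(self, topic: str, relevant: bool) -> None:
        """Move the weight for `topic` toward 1 (relevant) or 0 (not relevant)."""
        old = self.interests.get(topic, 0.5)       # unseen topics start neutral
        target = 1.0 if relevant else 0.0
        self.interests[topic] = old + self.learning_rate * (target - old)

    def presentation(self) -> dict:
        """Derive presentation settings from the stable characteristics."""
        return {"font_scale": 1.5 if self.stable.get("low_vision") else 1.0}

    def rank(self, topics):
        """Order topics by current interest weight, highest first."""
        return sorted(topics, key=lambda t: self.interests.get(t, 0.5), reverse=True)
```

Because the model is a plain data object, its contents can be shown to the user and edited directly, which is one simple way to support the inspection and control discussed above.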
Basic Research on Multimodal Interaction

Within the area of multimodal interaction, the HUMLE group has concentrated on two issues: combined modalities in interaction (using several modalities to transfer a piece of information, either as input or output) and flexible modalities in interaction (transferring the same information in alternative modalities). The aim is to design and realize effective dialogue models that allow both kinds of multimodality. A special issue concerns how to extend such models to function in open and loosely coupled systems of collaborating services. Most research challenges for multimodal interaction models arise when speech or text input is combined with other modes of interaction. In conjunction with this focus on linguistic interaction, the group maintains a certain competence in the area of basic technology for natural language interaction; see description below.
Development of Collaborative Interface Agents

Adaptive and multimodal interactions are particularly useful tools to realize so-called collaborative interface agents. These constitute a relatively novel user interaction paradigm that is complementary to the direct manipulation paradigm. The collaborative interface agent is seen as a "helper agent" which executes alongside a main application, and can monitor the user's actions, carry out a dialogue with the user and carry out actions within the application. The idea in principle is that the user and the agent have the same view of the application: the user can monitor what the agent does, and the agent can monitor what the user does (see figure).
Interface agents are introduced to resolve several problematic issues for adaptive and multimodal interfaces:
· They can be used to provide a "dialogue partner": a visualization of language-based interaction and a possibility to carry out a dialogue about the interpretation of a user's requests.
· They can carry out a dialogue with the user to make his or her needs and preferences clear, and then carry out services on the user's behalf (so-called "indirect management").
· They provide a means for users to inspect and control how the system adapts to the user.
· They can be used to provide a consistent interaction model that is independent of media and modality.
· They can be used to provide a visualization of the distribution of competence in a multiagent multiservice system and allow the user to inspect and control the flow of information between different system agents.
· Finally, they can be used to visualize a "user representative" agent in a multiagent system.
Our aim is to develop a base technology for collaborative interface agents in loosely coupled distributed systems based on agent technology, and investigate how this technology can be used to realize the different functionalities of interface agents described above.
Development of Information Retrieval Systems

User-adaptive interfaces constitute a particularly promising approach to address applications that exhibit information overflow. The HUMLE group pursues research in the development of both a general model for information extraction services, and the development of specific components that are necessary to realize such services. Some examples of such components are:

· Tools for the visualization and restructuring of large sets of information.
· Novel types of metadata. In particular, we investigate the formation and usage of metadata which deal with the genre or usage of information, rather than classical information retrieval indexes.
· Tools for authoring and indexing information, in particular indexes that can be used for user-adaptive presentation.
· Agent-based base technology for information extraction in open and distributed service environments.
Basic NL Technology

Previously, the HUMLE group has pursued research in basic natural language (NL) technology. This research was very successful and has produced a number of NL tools for Swedish that are useful by themselves or integrated in NL systems. As a result of the Svensk project, many of these tools were integrated into a common NL platform, the Gate platform, together with other tools and systems for Swedish NL processing, including the commercial system SweCG. Although the group does not plan to pursue further research in NL technology, we aim to maintain a high level of competence in this area so that these results can be utilized in projects that require high-end NL processing.
Demonstration Projects

The PUSH project

PUSH investigated the usage of adaptive techniques for selecting and presenting information from a vast textual information source. The project developed an adaptive hypermedia information system that provided help on a large software development method. PUSH addressed the problems of information overflow with 1) a heavily domain-based design of the information structure, with dynamically created follow-up questions and rhetorically typed links between explanation units, and 2) adaptive information presentation based on plan-recognition techniques. The PUSH system was developed as a WWW-based intranet application, and utilized CGI scripts and Java Applets to realize the adaptivity. (See http://www.sics.se/humle/projects/push.html.)
The Olga project

This project was a collaboration project with CID (Center for IT-Design) and the language technology department at the Royal Institute of Technology. The project demonstrated a collaborative interface agent capable of spoken dialogue in Swedish, integrated with a direct manipulation interface. (See http://www.nada.kth.se/cid/interaktionsformer.html.)
The EdInfo project

This ongoing project aims to develop an infrastructure for user-adaptive information editor services. A major difficulty in producing user-adaptive systems lies in structuring the information so as to allow adaptations. This problem is most apparent in domains where information is rapidly changing or highly unstructured. In this project, we address this issue by providing support for information editors. Editors are given tools that allow them to select and structure information for adaptive presentation, and to retrieve and review feedback information from users on how the information was utilized. This feedback can be used to improve both the selection of information to distribute, and the structuring of the selected information. (See http://www.sics.se/~kia/papers/kia_asa_annika.html.)
| Human-System Interaction Distributed Collaborative Environments | ||||||
By gathering the experiences gained in the activities and relating them to industrial needs we aim to inform the IT industry of the potential offered by the market for systems to visualize and interact with online services. This will hopefully result in better informed product trends and strategies for a rapidly growing market. Finally, the development of a common framework and set of mechanisms to support this framework will allow the development of open protocols. Experiences from the World Wide Web suggest that this will significantly encourage the growth of online electronic landscapes.
State of the Art

Virtual reality is currently seen as offering the promise of a paradigm shift in information and communication interfaces in general. The very liveliness of research and development of shared virtual environments, and the fact that they are of interest for a large variety of reasons, has meant that existing environments vary enormously. This variability is related to the application, to the capabilities of available technology and of course to the social context.

A number of commercial organizations offer access to shared virtual environments either on a dialup basis or via the Internet: AlphaWorld, Worlds Away, The Palace and Habitat being some of the most well known. Of special interest is Sony Corporation's Community Place, as SICS was involved in its design and it addresses to some extent a number of current research issues of interest to us. While the basic VRML (Virtual Reality Modelling Language) standard for distributing models of virtual environments over the Internet does not provide explicit support for simultaneously shareable worlds, it is anticipated that future developments of the VRML standard will.

In current systems, connections between virtual environments are typically by means of "portals" or "gateways" and travel between environments is a form of "teleporting." For example, VRML (and VRML-derived shared environments) supports standard WWW-style links which provide access to other environments. It is also common for shared virtual environments to provide maps or overviews of the environments they contain.

Several existing shared virtual environments support the integration of video and audio with 3D graphical worlds. Of special importance, and unfortunately often neglected, is the provision of participant-to-participant audio or "conference audio." For any kind of socially oriented virtual environment this is probably the single most important functionality.

An area that is underdeveloped in current systems is what can be called social computing, i.e., how to interact with other users, how one is represented in virtual space, and support for and understanding of the social rules of work and play in inhabited virtual spaces. Existing online VR-based communities offer an extremely poor range of professional, social and community-oriented activities and should be seen more as demonstration novelties than serious efforts to settle cyberspace.

In terms of hardware there is an increased availability of relatively high-powered and cheap 3D graphics accelerators, mainly for the Pentium class Windows 95/NT platforms. There is also an increasing market for very high-end and very expensive top-of-the-line machines, where Silicon Graphics is still leading the technology of integrated graphics workstations.

Finally, there is a trend away from the "archetypical" interface technologies of early VR, that is the head-mounted display and the finger/hand tracking glove, towards less encumbered ways of handling the visualization and interaction sensing. Possibly the two most interesting technologies are cave-style surround display setups on the visualization side and artificial-vision based video-tracking systems on the interaction-sensing side.
Research Issues and Technical Description

User Participation and Interaction

The aim is to enable single participants as well as groups of people to interact with the electronic environment in unencumbered ways. In the case of the individual user, it is a matter of how one interacts with the environment through the interface, controls one's representation, navigates or communicates with remote participants. Group interaction adds to the complexity by requiring that several persons should be able to perform such tasks jointly, for instance to change viewpoint in a virtual world. Social interaction is of great importance and must be supported since it allows the participants to make use of the social behaviour of ordinary life when interacting with other people in electronic space.

To allow and support a very broad range of media for interaction, the flexibility and versatility of the software is a major objective. This means moving towards seamlessly supporting highly advanced projection systems, like caves, virtual workbenches, etc., as well as, for instance, textual interfaces, voice commands and gesture recognition. Both the development of general interfaces and metaphors and the specific support for different hardware devices are of interest here. Tools such as wands, navigation devices and smart objects (e.g., chairs that seat you when you click) can all be the enabling device that makes a world work as a usable tool. Within large-scale participation the issue of awareness and participant-participant contact is fundamental to a sense of participation.
Synthesising and Mixing Realities

This activity deals with techniques for synthesising real and virtual spaces. Recent developments in virtual reality, augmented reality and telepresence can be seen as addressing different but related aspects of constructing new kinds of social and interactive spaces. This work needs to be taken further in order to provide a more transparent boundary between the physical and the electronic environments. We also need to develop techniques for creating engaging, stimulating and aesthetically pleasing content, for example, innovative artistic works and engaging settings for work, consumption, leisure, artistic performances and exhibitions. Real-world devices, such as robots and cameras, can retrieve real-world information which can then be blended in with computer-generated information to create mixed realities. Such augmented-reality techniques can provide the base for applications that offer the flexibility of manipulating real-world information using VR methods.

Scalable Systems and Infrastructures

The focus here is on developing software, especially network architectures, that is capable of supporting scalable real-time communication between hundreds or thousands of simultaneous participants in shared virtual spaces while still allowing individuals to actively communicate with one another (not just to receive broadcast information). These architectures might integrate efficient underlying network mechanisms such as multicast with spatial mechanisms for structuring and partitioning virtual worlds.

We seek to support and enable a rapidly growing number of interacting participants and agents. This demands research efforts into fundamental areas such as scalable network and distribution mechanisms, agent technology and so on. Specifically this means utilizing research technology, such as multicast, on stable consumer platforms.

As the number of users and the amount of information increase, so does the need to avoid cognitive overloading. Cognitive scaling means that a user's ability to access information and communicate with other people should not decrease as the electronic space grows in terms of user population and information content. This might be achieved by developing appropriate abstractions and representations of groups of participants and information objects. For example, from a distance one may see a common representation of a whole mass of people that provides summary information as to their presence. Such representations must be potentially dynamic, adaptive and mobile; they must also be developed for a range of media (e.g., audio and graphical aggregations).
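The idea of seeing a whole mass of people as one common representation from a distance can be sketched roughly as follows. This is only an illustration under assumed parameters: the function `summarize`, the `near` radius and the grid `cell` size are hypothetical and do not describe the DIVE system. Participants close to the viewer stay individual, while distant ones are bucketed by spatial cell and reported as crowds with a head count and centroid.

```python
import math
from collections import defaultdict


def summarize(viewer, participants, near=10.0, cell=20.0):
    """Return (individuals, crowds) as seen from `viewer`, an (x, y) position.

    participants: list of (name, x, y) tuples. Those within `near` of the
    viewer are listed by name; the rest are grouped into square grid cells
    of side `cell`, each reported as one crowd summary.
    """
    individuals = []
    buckets = defaultdict(list)
    vx, vy = viewer
    for name, x, y in participants:
        if math.hypot(x - vx, y - vy) <= near:
            individuals.append(name)
        else:
            key = (math.floor(x / cell), math.floor(y / cell))
            buckets[key].append((x, y))
    crowds = [
        {
            "count": len(pts),
            "centroid": (sum(p[0] for p in pts) / len(pts),
                         sum(p[1] for p in pts) / len(pts)),
        }
        for _, pts in sorted(buckets.items())
    ]
    return individuals, crowds
```

The same spatial cells could in principle also serve as the unit for partitioning network traffic, for instance one multicast group per cell, although that mapping is an assumption here and not something the text above specifies.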
Individual Representation

In multiuser electronic spaces where users can meet and interact, the issue of user representation is of significant importance. A wide variety of design issues will influence the design of avatars: the need to represent identity, location, activity, capabilities, degree of presence, availability and many other factors. The importance of these factors and the techniques for representing them may vary according to the role of the participant, the nature of the activity and the available infrastructure.
Embodiments

Users within collaborative virtual environments may have different levels of computing power at their disposal and may be interested in different degrees of presence and interaction abilities. Hence, each user should be provided with a range of embodiments, from fully animated realistic people with limbs to more simplistic representations constructed from a few cubes, and with a mechanism for choosing and switching between them. A related problem observed with desktop use of collaborative virtual environments arises when users are distracted by events in the real world but continue to be embodied in the virtual one. Some representation is needed to convey an "absent" state. In addition, an alert function should be provided so that, when a "sleeping" body is selected by another user, the owner of the body is requested to rejoin the virtual world.
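The requirements above (switchable embodiments, an "absent" state, and an alert when a sleeping body is selected) amount to a small state machine. The following is a minimal sketch under assumed names; `Embodiment`, `Presence` and the form labels are hypothetical and are not taken from the DIVE system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Presence(Enum):
    ACTIVE = "active"
    ABSENT = "absent"  # owner is distracted; the body is shown as "sleeping"


@dataclass
class Embodiment:
    owner: str
    form: str = "blocks"                       # e.g. "blocks" or "animated"
    presence: Presence = Presence.ACTIVE
    alerts: list = field(default_factory=list)

    def switch_form(self, form: str) -> None:
        """Let the user change between available embodiments."""
        self.form = form

    def set_absent(self) -> None:
        """Mark the body as sleeping while the owner is away."""
        self.presence = Presence.ABSENT

    def select(self, by: str) -> bool:
        """Called when another user selects this body.

        Returns True if an alert was queued for the absent owner.
        """
        if self.presence is Presence.ABSENT:
            self.alerts.append(f"{by} asks {self.owner} to rejoin")
            return True
        return False

    def rejoin(self) -> None:
        """Owner returns: the body becomes active and pending alerts are cleared."""
        self.presence = Presence.ACTIVE
        self.alerts.clear()
```

How the absent state is rendered (for example a dimmed or slumped body) and how the alert reaches the owner are left open here, since the text does not prescribe them.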
Socially Aware Agents

Socially inhabited spaces may contain intelligent agents, perhaps in the role of advisers and helpers. Such agents must be appropriately embodied and be given some form of social awareness that allows them to decide how and when to interact with human participants and also with each other.

It is useful to combine human embodiments with agent functionality. For example, semi-autonomous embodiments might automatically deal with the issues of viewpoint navigation and even the generation of facial expressions (especially where the end user has access to only limited equipment). This will give the environment a greater sense of life. Alternatively, an uninhabited or partially inhabited embodiment may act as a personal agent during its owner's absence (i.e., a "cyberspace" answering machine).

The objective here is to extend the agent abstraction framework in the DIVE system. Typically a virtual environment is populated with specializations of some basic agent class. A virtual conference scenario may include "creators" with specialized rules for creating appropriate "presenters" and an "assistant" guiding participants to interesting groups. This approach may be extended to support user-oriented applications in general. The implementation phase would preferably make use of a concurrent constraint programming language, like Oz, with support for concurrency, reactivity and real-time control, since these are important features when considering autonomous agents. Other suitable languages are Java and, to some extent, Tcl.
Consumer Platform Development

Given the rapidly increasing performance of consumer-priced hardware, broadening the user base of DIVE is an important step. Moreover, increasing the accessibility is a crucial issue. In practice, this means providing compatibility and support for both hardware and software: industry standard PCs, sound cards and graphics cards, as well as widespread software tools like Netscape and Real Audio and the rapidly evolving phone-over-the-Internet area.
Furthermore, the increased use of VR in television applications (both at the production and viewer ends) together with the development of Internet-based TV offers increased possibilities for the use of our technology.
The SICS Grotto

The SICS Grotto is a facility to enable distributed meetings and demonstrations. The Grotto is equipped with three large-screen projectors and screens, four-channel audio equipment and several input devices. The screens can be configured to show screen output from several computers and video equipment. Additionally, an SGI Onyx with MCO can be coupled to all three displays to give a wide-angle presentation for VR applications. The Grotto is meant to be an electronic meeting place for casual as well as formal meetings. The primary intended use is distributed VR based on the DIVE system and computer-based video-conferencing over high bandwidth networks (ATM) and the Internet.
Demonstration Projects

This section gives example applications that illustrate the goals of the research presented here.
Example 1: Social Mass Participation

The focus here is an environment for staging events for large audiences where the goal of interaction is either social or for edification. Such events could take the form of performances or exhibitions to large virtual audiences. The aim of a performance is to create a shared experience in which participants are able to actively participate in some way. Typical events might include theatre, music, dance, games/sports, talks and lectures. In an exhibition there is less of an emphasis on temporal group participation. Here participants wander through some series of installations and exhibits. Although the event represents a large and extended shared experience, in contrast to a performance, there may be more of an individual feel to each exhibition experience.
Example 2: The Marketplace

This thematic space would involve developing new forms of interactive, multiparticipant marketplaces and the support of those services which crucially depend upon producers and consumers interacting in a common environment. Typical areas include teleshopping, telebanking and counselling. Issues to be considered here include the representation of goods and services, the embodiment of producers and consumers, navigation between retail spaces, and the augmenting of real marketplaces with virtual spaces.
Example 3: Dataspace Navigation

Here the focus is on visualizing, sharing and interacting with dataspaces in three dimensions. These data could take the form of scientific test data, medical models, or models for military strategy. The goal of interaction with such dataspaces is to enable novel techniques for education, cooperation and insight where the data are the shared medium. Developments in agent technology can be applied here to aid in searching large spaces.
Example 4: The Factory

As companies and the size of projects grow there is a greater need for systems that aid in the understanding, consolidation and sharing of the efforts from workforces that may span geographical and temporal zones. As more and more work is done with the aid of computers, the need for geographical localization should go down. Techniques described in this research plan can be applied to systems designed for sharing data to create a virtual factory. The concept of the "virtual factory" embodies the idea of production using tools that allow the visualization of the process for the disparate or large commercial workforce. The production could encompass everything from software to automobiles by sharing code and CAD/CAM models as well as remote access to factory tools.

Example 5: Telemedicine

Long-distance consultation by medical professionals is an area that is ready for application and further research and development within the general area of distance communication. This type of medical consultation, and general communication within a medical setting, lends itself to the sharing of critical information. The goal is to implement a system that is comparable to conferencing via physical presence. Telemedical conferencing makes it possible to have the primary caregivers involved in a simultaneous conference. Currently it is very rare that all the specialists, as well as the primary caregiver and the patient can be present in the same room at the same time. Enabling this kind of presence can, in addition to making the process more efficient, also increase accuracy and competence. Accuracy and care can be improved by having access to the initial caregivers and thus to previous diagnoses and treatments, all of which may not be readily available through immediate patient records. Efficiency and cost savings come from reducing the amount of travel different caregivers might have to make in order to discuss a patient's situation. A potential result of this is increased awareness of a patient's situation. This situation can also work to preserve an existing doctor-patient relationship.

A proposed system for this would be based on work of the DCE group at SICS in cooperation with medical professionals. A system prototype would include a means of videoconferencing, teleinvestigation (means of performing and accessing diagnostic procedures and results) as well as a means to share documents and records. A solution to this would follow a model where the user sits in a workstation environment built into a desk setting and where most functions, such as document manipulation, communication, and general input/output are controlled directly by simple touch gestures and speech. This work on "natural interfaces," videoconferencing and virtual reality complements current work being done.

Example 6: Distributed CAD/VR

A distributed VR system, such as Dive, can enhance the CAD process by allowing virtual conferencing with the integration of CAD models between geographically separate engineers. Furthermore, since CAD is in essence a three-dimensional activity, it lends itself to a natural integration with virtual environments in general. Using Dive, we believe that we can provide significant support for many CAD activities, which often are carried out over large distances, with spread-out workgroups exchanging and modifying complex CAD models. With Dive, a common virtual space is provided, in which CAD models and applications can be presented in varying detail and level of interaction. Collaborative tools, such as a revision control system, can be interfaced through the virtual world. Animated simulations and other types of analyses can be presented in the common space, for discussion, exchange and enhancement thus bridging the distance in time and space between workers that tends to hamper conventional CAD activities today.

[Figure: Virtual conferencing.]