SICS, Box 1263, S-164 28 Kista, Sweden E-mail: espinoza@sics.se, kia@sics.se
Keywords: adaptive hypermedia, WWW, user modeling, multi-modality, information filtering
Java, HotJava, and Sun are trademarks of Sun Microsystems, Inc. All other product names mentioned herein are the trademarks of their respective owners.
To make the interface interactive, we stretch the limits of HTML and the WWW. We generate an answer page consisting of graphs and text which the user is allowed to manipulate. Users can navigate in the information space by clicking in the graphs or by posing questions via menus. They can manipulate the answer generated by the system by closing or opening parts of the text. They can also pose follow-up questions on 'hot-words' in the text.
The choice of what information is made available is based on the user's information-seeking task, which we infer from their interaction with the system. The user can also actively change the assumed task, and thereby control the adaptive behavior of the system.
We stress that it is the combination of the multi-modal interface with adaptive information filtering that meets the individual user's needs. Rather than abstracting the adaptive behavior as an interface agent, we present the adaptivity in domain-dependent terms integrated with the whole interface.
We realize the system through a knowledge representation implemented in SICStus Prolog Objects. The Prolog program runs as a background process and serves the remote Netscape clients. The interface is realized using dynamically generated HTML pages, and graphs which are generated on the Netscape client by a transferred Java applet.
We have evaluated the interface in our prototype with thirteen users as a part of bootstrapping the adaptivity. Their reactions were positive. They did not have any problems with distinguishing between different kinds of links available in our web-pages. They found the outlined adaptive scenario realistic.
Information overflow can be tackled through adapting the information to a particular user or a group of users. We have studied the information overflow problem in one particular domain, the documentation of an object-oriented software development method, and designed an interactive, adaptive hyper-media system which utilizes WWW as its interface.
Until now, the potential of the WWW for interactivity has been very limited: the user can choose to follow or not follow links to other pages of information. We claim that it is important that the adaptive system is integrated with a highly interactive interface, for two reasons. First, with an interactive system it will be easier for the user to correct any mistakes made by the adaptive system based on erroneous assumptions about the user. Second, any modeling of the user can only be based on the user's interaction with the system. If those interactions are limited to "clicking" on links in the hyper-media system, very little can be known about the user.
Our target domain requires that the interactive interface be multi-modal, with static text, generated text and generated graphics as output, and with both direct manipulation and free-form queries as input. These demands have forced us to design new ways of interacting with a WWW page that stretch the original hypertext metaphor. (By multi-modal we mean that the system generates, in this case, both graphs and text from a knowledge representation of the target domain. It also accepts input both via direct manipulation and as queries.)
When realizing the system, our goal has been to create a modular solution. We separate the user model from the information in the database, and we also separate the actual generation of the HTML code from our database. This can be compared with the approach of Kay and Kummerfeld (1994), which mixes the HTML-formatted information in the database with the user model.
The separation of user model and information in the database is necessary since the information in the database changes over time. In our target domain, a number of authors work with recurrent releases of the information database. It would be impossible to require that they mingle the target domain information with user modeling control sequences.
The second separation is necessary to ensure that we can update our interface as the page viewers and HTML standards are changed and enhanced. For example, we have recently re-implemented our presentation of the information to fit the new Netscape viewer, which includes a Java virtual machine interface. This could be done without changing anything in our database of the software method.
We describe our approach to interactivity and adaptivity in section 3. In section 4 we describe the system architecture. The system is named PUSH (Plan- and User Sensitive Help). In order to make our reasoning concrete we start by describing the interface through an example in section 2.
The already available on-line manual consists of more than 500 documents of 5 to 20 pages of text and pictures. Our task in the PUSH project was to reorganize that manual into an adaptive and highly interactive multi-modal hypertext system. We have conducted several studies on how the users understand the method and how they search for information about it (Bladh and Höök 1995, Höök et al. 1996a), and the design of the adaptive hyper-media system presented here is based on those studies.
In figure 1 we see a screen dump of our interface. It describes one process, 'iom,' in the SDP method. The answer page is divided into three frames (frames are subparts of the Navigator application window that can be scrolled and resized independently of each other and that each contain a web page):
Figure 1. A screen-dump from the PUSH system.
Our system is interactive on several levels. It is interactive at the interface level, allowing the user to manipulate the output from the system. It is also interactive in terms of allowing the user to control the adaptivity. In order to realize the interactivity at both these levels, we have a dynamic interaction between the users' actions and the information system behind the scenes. There are no ready-made HTML pages that the user can download; instead, we create the pages on the fly in response to the users' actions and history of actions.
Apart from the goal of making the output from the system interactive, our second design goal is to utilize the hypertext metaphor and the de facto standard interaction with the WWW. It will be easier to learn our interface if it does not deviate too much from the prevailing web style of interaction. This goal conflicts with the interactivity goal since the WWW offers few possibilities for interaction. Still, we want to rely on the basic metaphor of pages and links as a means for moving between pages. The basic structure of our prototype is therefore that each object and process in the target domain is presented in one answer page of its own. This page contains all the relevant information about the object or process, even if some of the information is hidden from the user's immediate view.
The limitation of the potential nodes in the hyper-space to only one node per process or object also serves another purpose, namely to help our users with the "lost-in-hyper-space" problem. By limiting the nodes to be the whole description of a process or object, we make the hyper-space substantially smaller. An alternative would have been to divide the information into small, stand-alone units presented in one page each. This would have meant thousands of potential pages in this domain, so clearly that is infeasible given the goal that users will be trying to learn the structure of the whole method, not only tiny pieces of information about certain aspects.
The problem with our approach is that each page on a process or object might, if fully expanded, contain a lot of information. It is therefore crucial to structure the information within the page, and have means for navigation within a page.
Keeping our two design goals in mind, interactive but still web-like, let us start by describing the interactivity at the interface level.
The presentation in the graphs meets the needs of users who are not so knowledgeable in SDP. They need to see how the objects are related to one another.
The graphs in our interface are not static pictures which are ready-made in our database. We generate the graphs from the object-oriented database that lies in the background. This means that whenever our database is updated, e.g. when an object is added or a relation changed, the graphs will immediately reflect the change.
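The idea of deriving the graphs from the database on every request, rather than storing pictures, can be illustrated with a minimal sketch. It is not the PUSH implementation (which uses SICStus Prolog Objects and a Java applet); the dictionary layout, the function name `graph_edges`, and the relation and object names are assumptions made for illustration only.

```python
def graph_edges(database):
    """Derive (source, relation, target) edges from the current database state.

    `database` is a hypothetical mapping: object name -> {relation: [targets]}.
    """
    edges = []
    for name, relations in database.items():
        for relation, targets in relations.items():
            for target in targets:
                edges.append((name, relation, target))
    return sorted(edges)


# Hypothetical content: 'iom' is the process shown in figure 1,
# the relation and target names are invented.
database = {"iom": {"input": ["requirements"]}}
before = graph_edges(database)

# An update to the database is reflected the next time the graph is derived.
database["iom"]["output"] = ["object model"]
after = graph_edges(database)
```

Because the edge list is recomputed from the database each time, there is no stored picture that can become stale.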
An 'information entity' is a stand-alone piece of information about one particular aspect of an object or a process. It is not necessary to read one information entity before another - there are no references between the entities. Our studies of the domain, and studies of similar domains (Svenberg 1995), show that this is not an impossible requirement on technical documentation: it is often written as a set of small stand-alone pieces of information.
In the underlying knowledge representation, the information is organized as objects (one for each process or object in SDP) with attributes. The attributes are the object's relations to other objects and also the information entities which describe the object. So, each information entity constitutes an attribute of the object. When we construct the textual presentation, some information entities will be ready-made texts in the database, while other texts are generated from the information about the object. (We use an object-oriented terminology to describe our database. In other adaptive hyper-media systems, objects are named frames and their attributes are slots; see Brusilovsky (1996).)
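As a rough illustration of this representation, the sketch below models one domain object whose attributes are relations and information entities, where an entity is either a ready-made text or is generated from a relation. The class name, field names, generated sentence and example content are all assumptions; the actual system stores this in SICStus Prolog Objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomainObject:
    """One process or object in the target domain (hypothetical layout)."""
    name: str
    # Relations to other objects, e.g. {"input objects": ["..."]}.
    relations: Dict[str, List[str]] = field(default_factory=dict)
    # Information entities: a ready-made text, or None when the text
    # should be generated from the relation with the same name.
    entities: Dict[str, Optional[str]] = field(default_factory=dict)

    def entity_text(self, entity_name: str) -> str:
        ready_made = self.entities[entity_name]
        if ready_made is not None:
            return ready_made
        # No stored text: generate a sentence from the object's relations.
        targets = self.relations.get(entity_name, [])
        return f"The {entity_name} of {self.name} are: {', '.join(targets)}."


# Hypothetical content for the process 'iom' shown in figure 1.
iom = DomainObject(
    name="iom",
    relations={"input objects": ["requirements specification"]},
    entities={
        "summary": "A ready-made summary text would be stored here.",
        "input objects": None,
    },
)
print(iom.entity_text("summary"))
print(iom.entity_text("input objects"))
```

Keeping relations and entities as plain attributes of the same object is what lets one relation appear both in a graph and as generated text.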
In the interface, an information entity can be viewed as either a piece of text, or a part of the graphs. Sometimes an information entity is doubled so that we display, for example, input objects of a process in the graphs, and also as a piece of text.
The user may be dissatisfied with the information provided in the answer page, and is therefore allowed to manipulate the textual parts of the page. They can close or open the information entities by clicking in the guide frame next to the textual frame. Clicking on the name of an information entity in the guide frame inserts its text into the page. Another click on the same name closes that part of the text. This technique is known as stretch-text (see also Brusilovsky (1996)). By opening and closing parts of the text, the user can create an answer page that is better fitted to their needs.
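The open/close behavior described here can be sketched as a set of currently open entities that is toggled by each click in the guide frame. This is only an illustrative model under assumed names (`toggle`, `render_text_frame`, and the entity names); it says nothing about how the PUSH Page Generator stores this state.

```python
def toggle(open_entities, name):
    """Return a new set with `name` opened if it was closed, and vice versa."""
    return set(open_entities) ^ {name}


def render_text_frame(entity_order, texts, open_entities):
    """Return the text frame: the texts of all open entities, in page order."""
    return "\n\n".join(
        texts[name] for name in entity_order if name in open_entities
    )


# Hypothetical page with two information entities.
order = ["summary", "input objects"]
texts = {
    "summary": "Summary text.",
    "input objects": "Text about input objects.",
}

open_now = {"summary"}                        # initial selection made by the system
open_now = toggle(open_now, "input objects")  # first click opens the entity
open_now = toggle(open_now, "summary")        # a click on an open entity closes it
print(render_text_frame(order, texts, open_now))
```

Rendering always follows the fixed page order, so opening an entity inserts its text at its usual place rather than at the end of the page.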
The hot-words and their associated follow-up questions allow users to increase their knowledge of SDP. Users who are already knowledgeable in SDP do not have to read irrelevant information about these basic concepts.
Allowing the user to pose questions is crucial if we want to meet the needs of experienced users. They do not want to spend time navigating to a particular piece of information, but instead just 'jump' to it.
This has caused us to try to find aspects of the users that can be used as a basis for information filtering. We found that the user's information-seeking task was a good tool for determining which information entities would be most relevant to the user in a specific situation (Höök 1995, Höök et al. 1995). We constructed a hierarchy of information-seeking tasks as a result of a task analysis of users' behavior in their daily work situation. Examples of tasks are: 'project planning', 'learning the structure of SDP', 'working in an activity within a process', etc.
So how can we know which information-seeking task the user is performing in any particular situation? We combine a user-controlled and a self-adaptive approach (Höök et al. 1995, Höök et al. 1996a). (Self-adaptive systems are those in which the whole adaptive process is carried out by the system alone: the system initiates, proposes, decides, and executes the adaptive behavior (Kuhme et al. 1992).) According to Oppermann (1994) this middle route is to be preferred, since users must have control over the adaptivity but will not spend much time controlling it continuously. The adaptive hyper-media tool should optimally be a tool to aid the users in their daily work, not an obstacle that they have to actively adapt in order to make it work smoothly.
We allow the users to set which task they are working with initially, and then we use plan inference (i.e. inferring the users' underlying goal from their actions in the system) to update their assumed current task continuously (Waern 1994). The user can at any time change the inferred task to some other task, although we limit the set of potential tasks to those which will actually change the explanation in the current situation. (In figure 1 we see under the heading 'task' which information-seeking task the system assumes that the user is performing. The task is marked in bold, which means that the user can click on it and alter it.)
Given that we know the information-seeking task, we utilize a set of simple rules that connect a question plus a task with the most relevant information entities. Examples of such rules can be found in figure 2.
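A minimal sketch of such rules, and of limiting the selectable tasks to those that would change the explanation, might look as follows. The task names are taken from the text; the question type, the entity names, the rule contents and the function names are invented for illustration and are not the rules of figure 2.

```python
# Hypothetical rule table: (question, task) -> most relevant information entities.
RULES = {
    ("describe process", "learning the structure of SDP"):
        ["summary", "input objects", "output objects"],
    ("describe process", "project planning"):
        ["summary", "activities"],
    ("describe process", "working in an activity within a process"):
        ["summary", "activities"],
}
DEFAULT_ENTITIES = ["summary"]  # assumed fallback when no rule matches

TASKS = [
    "project planning",
    "learning the structure of SDP",
    "working in an activity within a process",
]


def select_entities(question, task):
    """Look up the information entities to open for a question and a task."""
    return RULES.get((question, task), DEFAULT_ENTITIES)


def alternative_tasks(question, current_task, all_tasks):
    """Tasks the user may switch to: only those that change the explanation."""
    current = select_entities(question, current_task)
    return [t for t in all_tasks if select_entities(question, t) != current]


print(select_entities("describe process", "project planning"))
print(alternative_tasks("describe process", "project planning", TASKS))
```

In this sketch a task that yields the same selection as the current one is not offered, since switching to it would leave the answer page unchanged.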
By allowing the manipulation of the answer page, and in particular by the follow-up questions on concepts crucial to the understanding of SDP, we cater for the difference in knowledge held by the users. With the two ways of navigation in the information space, menus and graphs, and by presenting some information both in graph and textual form we attempt to cater for differences in spatial cognition (Höök et al. 1996b).
In the PUSH system, pages are created in two distinct ways. One is for presenting the results of a new query to POP (the database part of the PUSH system) and the other is for filtering or modifying the currently displayed page. As a query is made to POP, certain data that is tailored to the current user is retrieved. This data is in the form of plain text containing tags signifying hot-words that lead to further queries. The data is channeled to the Page Generator, a CGI program, via a socket. The Page Generator parses the information and builds the finished page by incorporating HTML code into the textual data to construct interface tools such as clickable buttons and menus. When finished, the complete page is piped to the Netscape browser and displayed. The page is also cached to disk. It contains hidden formatting instructions that make it possible to alter the appearance of the page without accessing the database.
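The step in which tagged plain text is turned into a finished page can be sketched as below. The paper does not give the tag syntax or the URL scheme, so the `{hot:...}` marker, the `/cgi-bin/pagegen` address and the `hotword` parameter are assumptions, and a plain link stands in for the clickable buttons and menus of the real interface.

```python
import html
import re
from urllib.parse import quote

# Assumed marker for hot-words in the plain text delivered by the database.
HOTWORD = re.compile(r"\{hot:([^}]+)\}")


def build_page(plain_text, query_url="/cgi-bin/pagegen"):
    """Escape the text and turn each hot-word marker into a follow-up link."""
    # Splitting on a pattern with one capture group yields alternating
    # pieces: ordinary text at even indices, hot-words at odd indices.
    parts = HOTWORD.split(plain_text)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            out.append(html.escape(part))
        else:
            link = f"{query_url}?hotword={quote(part)}"
            out.append(f'<a href="{link}">{html.escape(part)}</a>')
    return "<p>" + "".join(out) + "</p>"


print(build_page("An {hot:object model} describes <objects>."))
```

Escaping the ordinary text separately from the hot-words keeps characters such as `<` in the database text from being interpreted as markup.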
Each query starts the Page Generator CGI, which sends the query parameters to the POP Prolog program. Since each query changes the state of the POP program to allow different follow-up questions depending on the current context, and since the Page Generator is a CGI program that lives only until the current query answer has been presented to the user, a scheme for saving the current state is needed. For example, the Page Generator must have a way of knowing which of perhaps several different POP Prolog processes to talk to, so the socket name is saved in each presented HTML page as a hidden input field. An example of the hidden formatting instructions is the insertion of the menu of potential follow-up queries associated with a hot-word when the user clicks on the hot-word button. This will not invoke a call to the Prolog process, but can be handled by the Page Generator.
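Saving the socket name in a hidden input field, and reading it back when the next request arrives, can be sketched as follows. The field name `pop_socket`, the example socket path and both function names are assumptions; only the use of a hidden input field comes from the text.

```python
import html
from urllib.parse import parse_qs


def hidden_state_field(socket_name):
    """Return a hidden HTML input field that carries the socket name."""
    value = html.escape(socket_name, quote=True)
    return f'<input type="hidden" name="pop_socket" value="{value}">'


def socket_from_query(query_string):
    """Recover the socket name from submitted form data, or None if absent."""
    values = parse_qs(query_string).get("pop_socket")
    return values[0] if values else None


# The field is written into every generated page ...
print(hidden_state_field("/tmp/pop_4711"))
# ... and comes back with the next form submission.
print(socket_from_query("question=iom&pop_socket=%2Ftmp%2Fpop_4711"))
```

Since the short-lived CGI program keeps no memory between requests, the page itself carries the information needed to reach the right background process.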
Several enhancements will be feasible shortly. One of the advantages of using a widely spread web browser such as the Netscape Navigator, as well as the up-and-coming Java language, is that as soon as advances are made to these products they can be taken advantage of in the PUSH system.
The graph data is currently transferred from the POP program to the Java applet by means of a file. Improvements to look forward to are the Java socket capabilities in Netscape, enhancements to the HTML standard, and, for example, HTML parsing and presentation capabilities in a Java text area. This will allow us to completely remove the Page Generator component and instead have one, more complex, Java applet communicate with the POP Prolog server and present its information. For example, we hope to replace the current, somewhat awkward way of posing follow-up queries on hot-words with pop-up menus.
Another great improvement to the system will be to allow authorized users to update the information database in POP by viewing and editing and finally uploading the content from a remote location.
At the time of the study of the interface, we only had the graph and text frames. The guide frame had not yet been implemented. Instead, we showed our design with the guide frame as a picture to the subjects after their session with our system, and they all expressed a feeling that this would improve the interface.
Finally, the subjects were asked to answer a set of follow-up questions on their impressions of the interface.
A question related to the possible actions at the interface was whether the subjects could make sense of how the information space was organized. To the question "Was the information space understandable (did you get lost at any point)?", 7 subjects claimed to have no problems, 3 asked for go-back functions, and 2 were irritated that when they did what they felt was going back to a previous page, the system had closed some information entities (this will not be the case in the adaptive system). Finally, one subject had problems with the navigation, but liked the possibility to have control over the textual parts and wanted those to be the basis for navigation rather than the graphs.
Concerning the graphs in the interface, the users were not confused by the fact that they were used both for navigation (implicitly posing queries) and for presenting information about the target domain. Some complained that the graphs were designed to be quite small. In general, the impression was that the subjects would have liked to decide how big the graph and text frames should be. For some subjects, and at some times, the graphs were more important, while for others the text was the main source of information.
So, in summary, it seems that our subjects made use of the possible actions at the interface and did not have serious problems with getting lost or with understanding how to interact with the system. As this study was used to bootstrap the adaptivity, we asked the subjects whether our "tasks" were realistic and whether the kind of adaptivity envisioned would be feasible. All seemed to find the scenarios realistic. Many added that the project status also determined which information would be most relevant. If they were just about to start a process, they would require information related to the learning task, while towards the end of a process, they would require very specific information on how to document the results of the process in the objects and the relations between the objects, etc.
Adding the project status to our system, and using it as an additional source of information for the adaptive behavior, is under consideration and will be part of our future work.
We have also shown that it is possible to realize our interactive solution by using the Netscape Navigator and Java applets to communicate with a server-side database which generates the information needed.
Finally, we have made an initial evaluation of the interface which showed that it was comprehensible and fulfilled our demands on being interactive, yet web-like and intuitive. Next, we aim to perform a study on the adaptive behavior of the system.
Brusilovsky, Peter (1996) Methods and Techniques of Adaptive Hypermedia, Journal of User Modeling and User-Adapted Interaction, special issue on Adaptive Hypermedia, UMUAI 6, in press.
HTML3 Specification, http://www.w3.org/hypertext/WWW/MarkUp/html3/CoverPage.html
Höök, Kristina (1995) Adaptation to the User's Task, SICS Research Report, SICS, Sweden.
Höök, Kristina, Karlgren, Jussi and Waern, Annika (1995) A Glass Box Approach to Intelligent Help, IMMI-1 (First workshop on Intelligent Multi-Modal Interaction), Edinburgh, U.K.
Höök, Kristina, Karlgren, Jussi, Waern, Annika, Dahlbäck, Nils, Jansson, Carl-Gustaf, Karlgren, Klas, and Lemaire, Benoit (1996a). A Glass Box Approach to Adaptive Hypermedia, Journal of User Modeling and User-Adapted Interaction, special issue on Adaptive Hypermedia, UMUAI 6, in press.
Höök, Kristina, Sjölinder, Marie and Dahlbäck, Nils (1996b) Spatial Cognition and Hypermedia Navigation, SICS Research Report, SICS, Sweden.
Java, the language, the Java home-page: http://java.sun.com/.
Kay, Judy (1994) Lies, damned lies, and stereotypes: pragmatic approximations of users, Fourth International Conference on User Modeling, Hyannis, Massachusetts, the MITRE corporation.
Kay, Judy, and Kummerfeld, R.J. (1994) An Individual Course for the C Programming Language, Proceedings of the Second International WWW Conference '94: Mosaic and the Web.
Kobsa, A., Muller, D. and Nill, A. (1994) KN-AHS: An Adaptive Hypertext Client of the User Modeling System BGP-MS, Fourth Int. Conference on UM, Hyannis, MA, 1994.
Kuhme, T., Dieterich, H., Malinowski, U., and Schneider-Hufschmidt, M. (1992) Approaches to Adaptivity in User Interface Technology: Survey and Taxonomy. In: C. Unger and J. A. Larson (eds.): Proceedings of the IFIP TC2/WG2.7 Working Conference on Engineering for Human-Computer Interaction, Elsevier, North-Holland, 1992.
Maes, Patti (1994) Agents that reduce work and information overload, Communications of the ACM, July 1994, vol. 37, no. 7.
Nielsen, Jakob (1995) Interface Design for Sun's WWW Site, invited talk at the Interact'95 conference in Lillehammer.
Oppermann, Reinhard (1994) Adaptively supported adaptability, International Journal of Human-Computer Studies 40:455-472.
Rice, James, Farquhar, Adam, Piernot, Philippe, and Gruber, Thomas (1995) Lessons Learned Using the Web as an Application Interface, Knowledge Systems Laboratory, KSL-95-69, September 1995, http://WWW-ksl.stanford.edu/.
SICStus Prolog User's Manual (Release #3). Swedish Institute of Computer Science, Box 1263, S-164 28 Kista, Sweden, ISBN 91-630-3648-7.
Svenberg, Stefan (1995) Structure-Driven Derivation of Inter-Lingual Functor-Argument Trees for Multi-Lingual Generation, Licentiate thesis 498, Department of Computer and Information Sciences, Linköping University, Sweden.
Waern, A. (1994) Cooperative Enrichment and Reactive Plan Inference - applying plan inference outside Natural Language Dialog, SIG meeting at Fourth Int. Conference on UM, Hyannis, 1994.
Fredrik's main interests lie in the design of interfaces, both in their implementation and in understanding how to meet users' needs.
Kristina Höök is a licentiate doctorate and researcher at SICS.
Her main interests lie in the design of adaptive interfaces, the design of explanations, and Human-Computer Interaction in general. A special interest is the role of individual differences, such as spatial ability, and their effect on the design of interfaces.