Lab-studies vs Ecological validity; what is the point of evaluation studies?; what is scientific and what is not?; what is HCI and what is general emotion theory?

 

1. Lab-studies versus ecological validity

In HCI, we are turning more and more towards what we could name "ecological validity". That is, systems need to be brainstormed, developed and tested in settings that are very close to the "real" settings they will be used in. We talk about studies "in the wild" and issues like "staged lived experiences", that is, how we can set up situations with half-working systems that resemble the real situations in which they will be used.

The theoretical foundations of this movement are drawn from phenomenology (see e.g. Paul Dourish's book "Where the Action Is", which makes use of the later Wittgenstein, Husserl, Heidegger and so on), from ethnographic work, from grounded theory, from situated cognition (see Suchman, "Plans and Situated Actions"), from activity theory (see Nardi, "Context and Consciousness: Activity Theory and Human-Computer Interaction"), from participatory design (or the so-called Scandinavian school of design), and so on. Common to all of these is an emphasis on moving beyond cognitivism and instead trying to see the whole context in which a system is used: the other tools people make use of, their daily practices, their culture, etc. The belief is that by understanding more of the whole complexity and being more "empathic" with the end users, actually placing ourselves as designers in their shoes so to speak, we will be able to produce better systems.

If we bring people into the lab, we know for sure that a number of unwanted consequences arise. They will feel inclined to be nice to the person who designed the system being tested or who set up the study. We are social beings! It feels bad to be unkind to other people. A range of studies on this phenomenon can be found in, for example, “The Media Equation” by Reeves and Nass, where people are nice even to computers when asked to evaluate their performance. We therefore get quite unexpected differences between what people say in questionnaires or interviews and how they actually behave with the system (see for example our studies of Agneta & Frida – if you do not have it I can send it to you; Fiorella de Rosis has had similar results in her studies).

In my research life I have done both: controlled lab studies trying to isolate a single variable, and more open-ended, interpretative studies. I have done a bunch of studies with statistically significant results on this or that special aspect (to do with elderly users, to do with privacy issues, to do with spatial ability and navigation, etc.), and these have not been nearly as useful to the design process as the more ethnographically oriented studies where we provide an interpretation of what is really going on, the whole story so to speak. Those are no less painstakingly difficult to do, no less scientifically challenging, no less put to the test from a scientific point of view. But the results they come up with are much closer to a real description of what is really going on in our beautiful, complex, social, emotional, culturally-influenced world of people, practices and tools.

2. The problem of evaluating emotional reactions to systems

In our field, affective interactive systems, we have additional problems compared to the more traditional HCI problems of developing tools. We are trying to design for some kind of user experience. We cannot guarantee that users will have a particular experience, because the experience is obviously constructed by the users themselves, in their minds, but we can design more or less good interaction that at least forms the basis for certain kinds of experiences. Here I am working from the following quote:

“Rather than experience as something to be poured into passive users, we argue that users actively and individually construct meaningful human experiences around technology.”
(Sengers et al., 2004)

(A very good book to read on this subject is McCarthy & Wright “Technology as Experience”, especially the first 5 chapters).

But we can do a more or less good job in the design. The problem lies in how we evaluate our systems so that we can provide the designer of the system with the right level of feedback. What will help the designer figure out where the weaknesses in the system are, and what is the best way of altering it to actually reach the design goals? As the experience has elements of unconscious processes (both cognitive and bodily), we need to find some way of tapping into those. In addition, we also need to compare those unconscious emotional reactions to the more conscious reflections that users make. Both are relevant in the design process! I will not buy a system that does not fit with my attitudes and beliefs about the system. And a system that fails to reach its design goals is not a good system! Thus, both kinds of evaluations are crucial in order for the designer to actually produce what is intended.

Thus, in my view, the main purpose of doing user studies from an HCI perspective is to give feedback to the designers of a system vis-à-vis their design goals for the system, so that they can improve it.

3. Design principles and not only “one study per system”?

Another reason to do a set of studies on affective interactive applications is of course the search for general design principles and problems, or “i-patterns” (following the terminology of Prof Jonas Löwgren), that can generate many good affective interaction systems within one class of applications or application domain. The goal is also to find recurring design problems as well as design principles that will (often) lead to good applications. I-patterns are interaction patterns that seem to work well for a certain class of applications and that others can pick up and use in their design process. These "i-patterns" are like middle-range theories of what works in interaction design.

4. Being scientific?

I would claim that the way we do research in a more phenomenological, ethnographically oriented tradition is no less scientific. In fact, I would claim that we learn more about what people are really experiencing. We are not abstracting away from all the details, all the complicated issues to do with several people, several artefacts, cultural practices, etc. down to one or two measurable variables... but this is a looong debate that we could have for ages. More important is probably to make clear what the method is there for. Is it to help designers create better systems? Or is it there for researchers to write a bunch of papers with "proven" results, creating a method that is not really useful to people outside academia? I would claim that the proof you get is not valid and interesting unless you can show that it will indeed help designers create better systems. This is where the real test is.

BACK to Kristina Höök's home page