Lab-studies vs. ecological validity; what is the point of evaluation
studies; what is scientific and what is not; what is
HCI and what is general emotion theory?
1. Lab-studies versus ecological validity
In HCI, we are turning more and more
towards what we could call "ecological validity". That is, systems
need to be brainstormed, developed and tested in settings that are very close
to the "real" settings they will be used in. We talk about studies
"in the wild" and issues like "staged lived experiences",
that is, how we can set up situations with half-working systems that resemble
the real situations in which they will be used.
The theoretical foundations of this movement
are drawn from phenomenology (see e.g. Paul Dourish's book "Where the Action
Is", which makes use of the later Wittgenstein, Husserl,
Heidegger and so on), from ethnographic work, from grounded theory, from
situated cognition (see Suchman, "Plans and
Situated Actions"), from activity theory (see Nardi,
"Context and Consciousness: Activity Theory and Human-Computer
Interaction"), from participatory design (the so-called Scandinavian school of
design), and so on. What all of these have in common is an emphasis on moving beyond cognitivism and instead trying to see the whole context in
which a system is used: the other tools people make use of, their daily
practices, their culture, etc. The belief is that by understanding more of the whole
complexity and being more “empathic” with the end users, actually placing
ourselves as designers in their shoes so to speak, we will be able to produce
better systems.
If we bring people into the lab, we know
for sure that a number of unwanted consequences arise. They will feel inclined
to be nice to the person who designed the system being tested or who set up the
study. We are social beings! It feels bad to be unkind to other people. A range
of studies on this phenomenon can be found in, for example, “The Media
Equation” by Reeves and Nass, where people are nice even
to computers when asked to evaluate their performance. We therefore get quite
unexpected differences between what people say in questionnaires or interviews
and how they actually behave with the system (see for example our studies of Agneta & Frida; if you do
not have it I can send it to you. Fiorella de Rosis has had similar results in
her studies).
In my research life I have done both:
controlled lab studies trying to isolate a single variable, and more
open-ended, interpretative studies. I have done a bunch of studies with
statistically significant results on this or that specific aspect (to do with
elderly users, to do with privacy issues, to do with spatial ability and
navigation, etc.) and these have not been nearly as useful to the design
process as the more ethnographically oriented studies where we provide an
interpretation of what is really going on - the whole story so to speak. Those
are no less painstakingly difficult to do, no less scientifically challenging, and no less put to the test from a scientific point of view. But
the results they come up with are much closer to a real description of what is
really going on in our beautiful, complex, social, emotional,
culturally-influenced world of people, practices and tools.
2. The problem of evaluating emotional reactions to systems
In our field, affective interactive
systems, we have additional problems compared to the more traditional HCI problems
of developing tools. We are trying to design for some kind of user
experience. We cannot guarantee that the users will have a given experience,
because it is obviously constructed by themselves, in their minds, but we can
design more or less good interaction that at least forms the basis for certain
kinds of experiences. Here I am working from the following quote:
“Rather than experience as something to be
poured into passive users, we argue that users actively and individually
construct meaningful human experiences around technology.”
(Sengers et al., 2004)
(A very good book to read on this subject
is McCarthy & Wright “Technology as Experience”, especially the first 5
chapters).
But we can do a more or less good job in
the design. The problem lies in how we evaluate our systems so that we can
provide the designer of the system with the right level of feedback. What will
help the designer to figure out where the weaknesses in the system are and what
the best way of altering it is to actually reach the design goals? As the
experience has elements of unconscious processes (both cognitive and bodily) we
need to find some way of tapping into those. In addition, we also need to
compare those unconscious emotional reactions to the more conscious reflections
that users make. Both are relevant in the design process! I will not buy a
system that does not fit with my attitudes and beliefs about the system. And a
system that fails to reach its design goals is not a good system! Thus, both
kinds of evaluations are crucial in order for the designer to actually produce
what is intended.
Thus, in my view, the main purpose of doing
user studies from an HCI perspective is to give feedback to the
designer of the system vis-à-vis their design goal for the system so that they
can improve it.
3. Design principles and not only “one study per system”?
Another reason to do a set of studies on
affective interactive applications is of course the search for general design
principles and problems, or “i-patterns” (following
the terminology of Prof Jonas Löwgren), that will be able to generate many good
affective interaction systems within one class or domain of
applications. The goal is also to find recurring design problems as well as
design principles that will (often) lead to good applications. I-patterns are
interaction patterns that seem to work well for a certain class of applications
and that others can pick up and use in their design process. These “i-patterns” are like middle-range theories of what
works in interaction design.
4. Being scientific?
I would claim that the way we do research
in a more phenomenological, ethnographically oriented tradition is no less
scientific. In fact, I would claim that we learn more about what people are really
experiencing. We are not abstracting away from all the details, all the
complicated issues to do with several people, several artefacts, cultural
practices, etc. down to one or two measurable variables... but this is a long debate that we could have for ages. More important
is probably to make clear what the method is there for. Is it to help designers
create better systems? Or is it there for researchers to write a bunch of
papers with "proven" results, creating a method that is not really
useful to people outside academia? I would claim that the proof you get is not
valid and interesting unless you can show that it will indeed help designers
create better systems. This is where the real test is.