Visual Memory and Attention in Natural Tasks
Mary M. Hayhoe
Department of Brain and Cognitive Science
University of Rochester, U.S.A.
Monday, 9 December 2002, 4 p.m. c.t., Lecture Hall 5
Traditionally, visual processing has been thought of as parallel and
high capacity, whereas cognitive mechanisms are sequential and
capacity limited. How the interface occurs is a matter of
considerable debate, and recent work showing profound attentional
effects in early visual areas indicates that it is not really possible to
consider vision and cognition separately. The tension is reflected in
the body of work on "change blindness", which shows that observers
are extremely insensitive to changes in the visual scene made during
an eye movement, film cut, or similar masking stimulus. This work
implies that visual representations may be extremely limited, a
finding that conflicts with the implicit assumption that vision
somehow provides a complete representation of the visual scene. It
seems likely that resolution of these issues requires a consideration
of the functional context of the observer. This emerges as a natural
organizing principle when one considers ordinary behavior. In an
experiment where subjects picked up objects in a virtual environment
with haptic feedback and moved them across the field, we occasionally
made substantial changes in the size of the object during the
movement. Subtle differences in instruction that define the time
during the task when object size is relevant lead to substantial
differences in awareness of these size changes, even though the
object was the focus of attention throughout the trial. This suggests
that the role of attention needs to be more tightly defined. We
postulate that the underlying determinant of what visual information
is represented is exactly which tasks, or visual computations, the
observer is engaged in from moment to moment (for example: determine
object size, select object, fixate object, program reach and grasp to
pick up object, select location for putdown, etc.). That is, the task
micro-structure determines both what is attended and what is
remembered. The dynamic and task-specific nature of visual
representations was also demonstrated in another experiment using a
virtual driving environment. Briefly presented stop signs do not
attract gaze unless they are both relevant to the task and in a
likely location, and observers deploy quite different gaze patterns
depending on task goals. This suggests that the information acquired
from natural environments is under the control of learnt behavioral
programs that determine when an active search for specific
information is executed.