Universität Bielefeld - Sonderforschungsbereich 360

Picture Descriptions: Comparing Verbal and Visual 'Areas of Interest'

Jana Holsánová
Lund University, Cognitive Science
Lund, Sweden

Chafe (1994, 1996) suggests that spoken language has properties similar to those of vision: it proceeds in brief spurts and has a focus and a periphery. The hypothesis is that similar principles govern the processing of visual and verbal information (Just & Carpenter, 1976). The comparison between the way in which we perceive pictures visually and the way we describe them verbally forms the common thread of the following three studies.

My first analysis (Holsánová, 1997) was aimed at studying a) how spoken picture descriptions are structured, b) how speakers connect sequential steps in their descriptions and mark the relations between them for the listener and c) how speakers linearize the information gained from the pictorial source and adapt it to a verbal medium. A complex picture from a Swedish children's book was shown to 12 subjects, and each of them was then asked to describe it from memory. Their spoken descriptions were recorded, transcribed and segmented into intonation units and centers of interest (Chafe, 1994).

To sum up, speakers focused their attention on different aspects of the picture and proceeded in brief spurts, 'jumping' back and forth between these aspects. The most dominant 'areas of interest' were identified. Besides the descriptive foci (dealing with the picture content), many non-descriptive foci could be identified. Speakers expressed their attitudes, expert knowledge and predictions about interpretations. Their presentations included traces of recall, metatextual comments and questions about the procedures in the ongoing personal interaction. I therefore distinguished between interactive, introspective, evaluative, explicative, and anticipatory foci. In categorization, changes in the level of abstraction contributed to the structuring of information. The overall tendency was to introduce both the whole picture description and the individual centers of interest with a global summary and to go into details later on. Speakers connected sequential steps in their descriptions mostly with the help of pauses, hesitations and discourse markers, but also by changing loudness or voice quality, by accelerating, or by using stressed spatial expressions. With respect to how the picture was transformed into language, two different styles of description were identified and analysed in detail. Perceiving space was dominant in the more static technical style, while perceiving time was dominant in the more dynamic narrative style.

In a pilot study (Holsánová, Hedberg & Nilsson, forthcoming), four subjects were asked to describe the same complex picture from memory. Their eye movements were registered with an SMI iView system both during inspection of the picture and during their subsequent verbal description.

In summary, we could identify two general patterns in the eye movement data: an initial general survey followed by an examination phase consisting of several detailed examinations (as suggested by Buswell, 1935:142). Similar patterns, with an initial orientation in the picture followed by detailed examinations, were also identified in the spoken language descriptions, although differently distributed. Another similarity between visual and verbal data concerned the re-examination of picture elements. Extending Yarbus' (1967) claims about observers' repeated returns to certain picture elements, we could distinguish patterns of comparing activities. Furthermore, unusual, incomprehensible or strange parts of the picture attracted the viewer's attention and were in focus both in the eye movement data and in the verbal data. Finally, when describing the picture from memory, our subjects directed their gaze at certain locations on the white board in front of them. These locations corresponded to the actual locations of the picture elements. We suggested that a form of mental scanning (Kosslyn, 1980) was used as an aid in recalling the picture.

In my third study, four participants examined the same complex picture and described it both simultaneously during viewing and consecutively, 15 minutes after viewing. The aims of the study are a) to compare semantic and sequential patterns in eye movement data and in spoken language data within each subject and b) to find out in more detail whether all subjects use similar scanning strategies and descriptive strategies (e.g. the general survey followed by detailed examinations identified in the two preceding studies, the comparing activities uncovered in the pilot study, the creation of classes of objects with similar traits, etc.). In my presentation, I will demonstrate the method of comparing both data samples by using time-logged protocols of visual and verbal 'areas of interest'.
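
One way such a comparison of time-logged protocols could be set up is sketched below. This is only an illustration under stated assumptions, not the procedure used in the study: the Event record, the align function, the area labels ("tree", "cat") and all time stamps are invented for the example. For each verbal focus, the sketch reports how long the gaze rested on the same area while it was being mentioned, and how long before the mention the most recent fixation on that area began.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One entry in a time-logged protocol (hypothetical format)."""
    start: float  # seconds from recording onset
    end: float    # seconds from recording onset
    area: str     # label of the 'area of interest'


def overlap(a: Event, b: Event) -> float:
    """Length in seconds of the time span shared by two events."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def align(visual: List[Event],
          verbal: List[Event]) -> List[Tuple[str, float, Optional[float]]]:
    """Pair each verbal focus with the fixations on the same area.

    Returns one (area, shared, lag) tuple per verbal focus, where
    shared is the gaze time on that area during the mention and
    lag is the delay between the onset of the most recent earlier
    fixation on that area and the onset of the mention (None if the
    area was not fixated before it was mentioned).
    """
    rows = []
    for v in verbal:
        same_area = [f for f in visual if f.area == v.area]
        shared = sum(overlap(f, v) for f in same_area)
        earlier = [f.start for f in same_area if f.start <= v.start]
        lag = v.start - max(earlier) if earlier else None
        rows.append((v.area, shared, lag))
    return rows


# Invented sample data: fixations and verbal foci on two picture areas.
visual = [Event(0.0, 1.5, "tree"), Event(1.5, 3.0, "cat"), Event(3.0, 4.0, "tree")]
verbal = [Event(1.0, 2.5, "tree"), Event(2.5, 4.0, "cat")]

rows = align(visual, verbal)
for area, shared, lag in rows:
    print(area, shared, lag)
```

With real data, the two lists would be filled from the eye tracker log and from the segmented transcript, and the area labels would have to be coded consistently in both protocols before any such alignment is meaningful.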

References

Buswell, G. T. (1935):
How people look at pictures. A study of the psychology of perception in art. Chicago: The University of Chicago Press.
Chafe, W. (1994):
Discourse, Consciousness, and Time. The Flow and Displacement of Conscious Experience in Speaking and Writing. Chicago and London: The University of Chicago Press.
Chafe, W. (1996):
How consciousness shapes language. In: Pragmatics and Cognition 4:1, 1996, 35-54.
Holsánová, J. (1997):
Bildbeskrivning ur ett kommunikativt och kognitivt perspektiv. LUCS Minor 6. Lund University.
Holsánová, J., Hedberg, B., & Nilsson, N. (forthcoming):
Visual and Verbal Focus Patterns when Describing Pictures. In: Becker, W., Deubel, H. and Mergner, T. (eds.), Current Oculomotor Research: Physiological and Psychological Aspects.
Just, M. A. & Carpenter, P. A. (1976):
Eye fixations and cognitive processes. In: Cognitive Psychology 8, 1976, 441-480.
Kosslyn, S. (1980):
Image and Mind. Cambridge, Mass. and London, England: Harvard University Press.
Yarbus, A. L. (1967):
Eye Movements and Vision. New York: Plenum Press.

Anke Weinberger, 1998-11-09