Assessing Reliability on Annotations (2):
Statistical Results for the DEIKON Scheme

Andy Lücking and Jens Stegmann
Abstract
This is the second part of a two-report mini-series focussing on issues in the evaluation of annotations. In this empirically-oriented report we lay out the documentation of the annotation scheme used in the DEIKON project, discuss the results obtained in the corresponding reliability study and conclude with some suggestions regarding forthcoming versions of the scheme. Relevant statistical background, theoretical considerations in reliability statistics and an evaluation of some pertinent approaches are given in the first, more theoretically-oriented report [Stegmann and Lücking, 2005]. The following points are dealt with in detail here: we describe the setting that was used to elicit the empirical data. The annotation scheme under scrutiny is documented and exemplified. Aspects of our theoretical work in linguistics are mentioned en passant. Then we present, discuss, and interpret the actual results obtained for our scheme. We find a high degree of correlation on the exact placement of time-stretched entities (word and gesture phase boundaries), mildly good results pertaining to agreement concerning time-related categories that appeal to structural configurations (e.g. the position of a gesture with respect to the parts of accompanying speech), but rather weak agreement with respect to the determination of gesture function. Therefore, the results for time-based type-i data look more promising than those obtained for the more theoretically-framed type-ii categories. However, the type-i results must not be compared with the type-ii ones on superficial grounds, since the statistics are of a different kind (correlation vs. agreement, i.e. not chance-adjusted vs. chance-adjusted) and, hence, the two sets of results have to be interpreted in different terms. Finally, we discuss some issues in the future make-up of the annotation scheme with a focus on its dialogue parts. Our suggestions amount to a shift towards a more theory-oriented annotation.
PDF (~1372 k)
Anke Weinberger, 2006-05-29, 2006-05-30