This volume contains a selection of papers presented at MMSYM 2015, the 3rd European and 7th Nordic Symposium on Multimodal Communication, held in Dublin, Ireland, on the 17th and 18th September 2015. MMSYM aims to provide a multidisciplinary forum, bringing together researchers from different disciplines working on multimodality in human communication and human-machine interaction. Originating in the Nordic countries, this third edition of the symposium at European level has continued to attract an international audience.
MMSYM 2015 attracted researchers whose work spans several domains, linked by the topic of multimodality. The papers are listed in alphabetical order in the table of contents, but below we briefly describe them in terms of their thematic commonalities, which range from the analysis of gestures to the analysis of filled pauses and other interactive phenomena observed in multimodal communication. Multimodality is observed not only in communication between humans (speech communication, visual communication), but also in communication between human and machine; first encounter dialogues, for instance, are analysed in multimodal corpora of human dialogues or in experimental human-machine dialogue systems. In this volume, multimodality is seen in a wide variety of domains, including the multimodal perception of attitudes in video blogs, multimodal perception in infants with and without risk of autism, multimodality in language learners, and even multimodal aspects related to turn-taking in contemporary dance improvisation.
There are papers addressing development and learning. Lozano et al report an ongoing meta-analysis of multimodal perception in infants with and without genetic risk for autism, which they posit will shed light on the acquisition of multimodal perceptual integration during development. Two papers address gesture in adult second language learners, with Levantinou and Navaretta investigating the effects of beat and iconic gestures in aiding comprehension and recall of a second language, while Wessel-Tolvig demonstrates how the acquisition of target language gestural patterns in advanced Danish learners of Italian gives evidence of learners’ shift to target language semantic representations.
Allwood and Ahlsén address the contribution of gesture and speech to the construction of meaning, proposing a framework which extends the notion of meaning potential from symbols to iconic and indexical gestures, and presenting multimodal combinations of symbols, icons, and indices in face-to-face communication. Madzlan et al investigate multimodal perception of attitude in their study on video blogs, in which they present a novel annotation scheme for attitudes and report on experiments validating their annotation scheme and investigating how the different modalities jointly and separately contribute to perception of attitudes.
Several papers investigate first encounter dialogues, reporting analyses of multimodal corpora of human dialogues or experimental human-machine dialogue systems. In their respective papers, Navaretta and Paggio both address the interplay of multimodal elements of first encounter dialogues in the Danish-language NOMCO corpus. Paggio reports on an analysis of temporal alignment between head movements and associated speech segments, while Navaretta investigates fillers, filled pauses, and co-occurring gestures in terms of their function, and contrasts her findings for Danish with previous work on other languages. Jokinen investigates automatic detection of co-speech gesturing in first encounter conversations, focussing on a combined top-down and bottom-up paradigm that brings together human annotation and automatic analysis of video data, and discussing the applications of such technology to automatic dialogue systems. Ólafsson et al focus on the very first stages of interaction with strangers, outlining their Explicit Announcement of Presence (EAP) model, and reporting on a qualitative study of video recordings of humans approaching strangers to ask for directions, the design of a multimodal virtual agent incorporating this functionality, and a pilot user study of the system in the context of aiding second language acquisition in Icelandic.
A different kind of turn-taking, that found in contemporary dance improvisation, is investigated in Evola et al’s contribution, which describes the collection and annotation of recordings of improvised dance performed by experts, a micro-analysis of turn-taking between performers based on bodily movements and gaze, and a macro-analysis comparing the data with analogous data from non-performers.
Anastasiou et al and Cummins and Byrne treat the establishment of awareness and co-presence in communication in their respective contributions. Anastasiou et al present a Wizard-of-Oz study on awareness signals, where a smart object such as a lamp is used to demonstrate the potential for communication with a distant colleague before verbal or written communication across a network begins. Cummins and Byrne investigate the establishment of co-presence, proposing that the technical requirement for this across a network relies on the establishment of zero-mean lag in communication. They discuss different ways of thinking about this problem and outline possible routes to this goal.
Brueck analyses multimodal representation of shared cultural knowledge pertaining to spatial orientation and conceptualisation in Kreol Seselwa, a French creole spoken in the Seychelles. In the study, she investigates the contribution of voice, gesture, and cultural factors including geography to speakers’ use of frames of reference in spatial reference.
The range of work reflected in the papers presented here demonstrates the depth and breadth of current research into multimodality and reflects the high level of interest from several disciplines in questions of how best to analyse the full range of signals and cues present in various types of interaction. We hope that this collection will excite further interest in the field, help maintain the momentum of multimodal studies, and contribute to the continuing success of the MMSYM symposia.
E. Gilmartin, L. Cerrato, N. Campbell