Article | Proceedings of the 4th European and 7th Nordic Symposium on Multimodal Communication (MMSYM 2016), Copenhagen, 29-30 September 2016 | Towards classification of head movements in audiovisual recordings of read news

Title:
Towards classification of head movements in audiovisual recordings of read news
Author:
Johan Frid: Humanities Laboratory, Lund University, Sweden
Gilbert Ambrazaitis: Linguistics and Phonetics, Centre for Languages and Literature, Lund University, Sweden
Malin Svensson-Lundmark: Linguistics and Phonetics, Centre for Languages and Literature, Lund University, Sweden
David House: Department of Speech, Music and Hearing, KTH, Sweden
Year:
2017
Conference:
Proceedings of the 4th European and 7th Nordic Symposium on Multimodal Communication (MMSYM 2016), Copenhagen, 29-30 September 2016
Issue:
141
Article no.:
002
Pages:
4-9
No. of pages:
6
Publication type:
Abstract and Fulltext
Published:
2017-09-21
ISBN:
978-91-7685-423-5
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Publisher:
Linköping University Electronic Press, Linköpings universitet



In this paper we develop a system for detecting word-related head movements in audiovisual recordings of read news. Our material consists of Swedish television news broadcasts and comprises audiovisual recordings of five news readers (two female, three male). The corpus was manually labelled for head movement using a simple annotation scheme: a binary decision about the absence or presence of a movement in relation to each word. We use OpenCV for frontal face detection and, based on the detection results, calculate velocity and acceleration features. We then train a machine learning system to predict the absence or presence of head movement and achieve an accuracy of 0.892, which is better than the baseline. The system may thus be helpful for head movement labelling.
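The feature step described in the abstract can be illustrated with a minimal sketch. It assumes that face detection has already produced one face-box centre per video frame, and it approximates velocity and acceleration by simple finite differences. The abstract does not state how the features or the baseline were actually computed, so the function names (`kinematics`, `word_features`, `majority_baseline_accuracy`), the choice of per-word peak values, and the majority-class baseline are all hypothetical illustrations rather than the authors' method.

```python
from math import hypot


def kinematics(centres, fps):
    """Return per-frame speed and acceleration magnitude.

    centres: list of (x, y) face-box centres, one per frame (assumed input).
    fps: video frame rate; differences are scaled to units per second.
    """
    # Velocity vectors between consecutive frames.
    vel = [((x1 - x0) * fps, (y1 - y0) * fps)
           for (x0, y0), (x1, y1) in zip(centres, centres[1:])]
    # Acceleration vectors between consecutive velocity vectors.
    acc = [((vx1 - vx0) * fps, (vy1 - vy0) * fps)
           for (vx0, vy0), (vx1, vy1) in zip(vel, vel[1:])]
    speed = [hypot(vx, vy) for vx, vy in vel]
    acc_mag = [hypot(ax, ay) for ax, ay in acc]
    return speed, acc_mag


def word_features(centres, fps, start_s, end_s):
    """Peak speed and peak acceleration inside one word interval (hypothetical features)."""
    first = int(round(start_s * fps))
    last = int(round(end_s * fps))
    speed, acc_mag = kinematics(centres[first:last + 1], fps)
    return {
        "max_speed": max(speed, default=0.0),
        "max_acc": max(acc_mag, default=0.0),
    }


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent binary label (assumed baseline)."""
    positives = sum(labels)
    return max(positives, len(labels) - positives) / len(labels)


if __name__ == "__main__":
    # Toy track: the face is still, jumps once, then is still again.
    track = [(0, 0), (0, 0), (3, 4), (3, 4)]
    print(word_features(track, fps=10.0, start_s=0.0, end_s=0.3))
    print(majority_baseline_accuracy([0, 0, 0, 1]))
```

Per-word feature dictionaries of this kind, paired with the manual binary labels, could then be passed to any standard classifier. Comparing the classifier's accuracy against a trivial baseline such as the one above shows how much the movement features actually contribute.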

