Article | Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden | Málrómur: A Manually Verified Corpus of Recorded Icelandic Speech | Linköping University Electronic Press Conference Proceedings

Title:
Málrómur: A Manually Verified Corpus of Recorded Icelandic Speech
Author:
Steinþór Steingrímsson: The Árni Magnússon Institute for Icelandic Studies, Iceland Jón Guðnason: Reykjavik University, Iceland Sigrún Helgadóttir: The Árni Magnússon Institute for Icelandic Studies, Iceland Eiríkur Rögnvaldsson: Department of Icelandic, University of Iceland, Iceland
Download:
Full text (pdf)
Year:
2017
Conference:
Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden
Issue:
131
Article no.:
029
Pages:
237-240
No. of pages:
4
Publication type:
Abstract and Fulltext
Published:
2017-05-08
ISBN:
978-91-7685-601-7
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



This paper describes the Málrómur corpus, an open, manually verified corpus of Icelandic speech. The recordings were collected in 2011–2012 by Reykjavik University and the Icelandic Center for Language Technology in cooperation with Google. In total, 152 hours of speech were recorded from 563 participants. Evaluators subsequently listened to every segment to determine whether it contains the utterance the participant was supposed to read, and nothing else. Of the 127,286 recorded segments, 108,568 were approved and 18,718 were deemed unsatisfactory.



