Article | Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16 | Analysis of phonetic transcriptions for Danish automatic speech recognition

Title:
Analysis of phonetic transcriptions for Danish automatic speech recognition
Author:
Andreas Søeborg Kirkedal: Department of International Business Communication, CBS, Frederiksberg, Denmark
Download:
Full text (pdf)
Year:
2013
Conference:
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16
Issue:
085
Article no.:
029
Pages:
321-330
No. of pages:
10
Publication type:
Abstract and Fulltext
Published:
2013-05-17
ISBN:
978-91-7519-589-6
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press; Linköpings universitet



Automatic speech recognition (ASR) relies on three resources: audio, orthographic transcriptions, and a pronunciation dictionary. The dictionary, or lexicon, maps orthographic words to sequences of phones or phonemes that represent the pronunciation of the corresponding word. The quality of a speech recognition system depends heavily on the dictionary and the transcriptions it contains. This paper presents an analysis of the phonetic/phonemic features that are salient for current Danish ASR systems. This preliminary study consists of a series of experiments using an ASR system trained on the DK-PAROLE corpus. The analysis indicates that transcribing features such as stress or vowel duration has a negative impact on performance. The best performance is obtained with coarse phonetic annotation, which improves word error rate by 1% and sentence error rate by 3.8%.
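
As a rough illustration of the two ideas in the abstract, the sketch below coarsens a transcription by stripping stress and length markers and then computes word error rate and sentence error rate. It is not the paper's pipeline. The marker set assumes SAMPA-style symbols (`"` and `%` for stress, `:` for length), the example transcription is made up rather than taken from DK-PAROLE, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumed SAMPA-style markers: '"' primary stress, '%' secondary stress, ':' length.
FINE_MARKERS = '"%:'


def coarsen(phones: Sequence[str]) -> List[str]:
    """Strip stress and length markers; drop tokens that become empty."""
    table = str.maketrans("", "", FINE_MARKERS)
    stripped = (p.translate(table) for p in phones)
    return [p for p in stripped if p]


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance over tokens (substitution, insertion, deletion cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def error_rates(pairs: Sequence[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (WER, SER) for (reference, hypothesis) sentence pairs."""
    word_errors = 0
    ref_words = 0
    bad_sentences = 0
    for ref, hyp in pairs:
        r, h = ref.split(), hyp.split()
        d = edit_distance(r, h)
        word_errors += d
        ref_words += len(r)
        bad_sentences += int(d > 0)  # a sentence counts as wrong if any word differs
    return word_errors / ref_words, bad_sentences / len(pairs)


# Made-up fine-grained entry; coarsening removes the stress token and the length mark.
fine = ['"', "t", "E:", "l", "@"]
coarse = coarsen(fine)

# Toy scoring run on two reference/hypothesis pairs.
wer, ser = error_rates([
    ("det er godt", "det er godt"),
    ("jeg taler dansk", "jeg tale dansk"),
])
```

Here WER is the total word-level edit distance divided by the number of reference words, and SER is the share of sentences containing at least one error, which is why a small change in WER can correspond to a larger change in SER.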

Keywords: Automatic speech recognition; phonetics; phonology; speech; phonetic transcription

