Article | Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania | Automatic word stress annotation of Russian unrestricted text | Linköping University Electronic Press Conference Proceedings

Title:
Automatic word stress annotation of Russian unrestricted text
Author:
Robert Reynolds: HSL Faculty, UiT The Arctic University of Norway, Tromsø, Norway
Francis Tyers: HSL Faculty, UiT The Arctic University of Norway, Tromsø, Norway
Year:
2015
Conference:
Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania
Issue:
109
Article no.:
022
Pages:
173-180
No. of pages:
8
Publication type:
Abstract and Fulltext
Published:
2015-05-06
ISBN:
978-91-7519-098-3
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



We evaluate the effectiveness of finite-state tools we developed for automatically annotating word stress in Russian unrestricted text. This task is relevant for computer-assisted language learning and text-to-speech. To our knowledge, this is the first study to empirically evaluate the results of this task. Given an adequate lexicon with specified stress, the primary obstacle to correct stress placement is disambiguating homographic wordforms. The baseline performance on this task is 90.07% (known words only, no morphosyntactic disambiguation). Using a constraint grammar to disambiguate homographs, we achieve 93.21% accuracy with minimal errors. For applications with a higher tolerance for errors, we achieve 96.15% accuracy by incorporating frequency-based guessing and a simple algorithm for guessing the stress position in unknown words. These results highlight the need for morphosyntactic disambiguation in the word stress placement task for Russian, and set a standard for future research on this task.
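The trade-off described in the abstract, between annotating only when the stress is certain and guessing among homographic readings, can be illustrated with a minimal sketch. This is not the authors' finite-state or constraint grammar implementation: the toy lexicon, the frequency counts, and the function names `annotate` and `annotate_text` are all hypothetical, chosen only to show why homographs are the main obstacle once a stressed lexicon is available.

```python
from typing import Dict, List, Optional

# Toy lexicon (hypothetical): each unstressed wordform maps to its possible
# stressed readings. The counts are made-up frequencies for illustration only.
# Stress is marked with the combining acute accent U+0301 after the vowel.
LEXICON: Dict[str, Dict[str, int]] = {
    "замок": {"за\u0301мок": 7, "замо\u0301к": 3},  # 'castle' vs. 'lock'
    "вода": {"вода\u0301": 10},                      # unambiguous
}


def annotate(word: str, guess: bool = False) -> Optional[str]:
    """Return a stressed form of `word`, or None if no decision is made.

    With guess=False, a form is returned only when every reading in the
    lexicon agrees on the stress position (few errors, lower coverage).
    With guess=True, homographs are resolved by picking the reading with
    the highest (assumed) frequency, trading some errors for coverage.
    Lookup is exact-match; case folding is left out of this sketch.
    """
    readings = LEXICON.get(word)
    if readings is None:
        return None  # unknown word: this sketch does not guess stress here
    if len(readings) == 1:
        return next(iter(readings))
    if guess:
        return max(readings, key=lambda form: readings[form])
    return None  # ambiguous homograph left unannotated


def annotate_text(tokens: List[str], guess: bool = False) -> List[str]:
    """Annotate each token, leaving it unchanged when no decision is made."""
    return [annotate(token, guess) or token for token in tokens]
```

In the paper's setting, a constraint grammar removes readings that are impossible in context before any choice is made; the sketch above skips that step and shows only the conservative and the frequency-based fallback behaviours.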



