Article | NEAL Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland | Tagging a Norwegian Dialect Corpus Linköping University Electronic Press Conference Proceedings

Title:
Tagging a Norwegian Dialect Corpus
Author:
Andre Kåsen: Department of Informatics, University of Oslo, Norway
Kristin Hagen: The Text Laboratory, University of Oslo, Norway
Anders Nøklestad: The Text Laboratory, University of Oslo, Norway
Joel Priestley: The Text Laboratory, University of Oslo, Norway
Year:
2019
Conference:
NEAL Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland
Issue:
167
Article no.:
040
Pages:
350-355
No. of pages:
6
Publication type:
Abstract and Fulltext
Published:
2019-10-02
ISBN:
978-91-7929-995-8
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



This paper describes an evaluation of five data-driven part-of-speech (PoS) taggers for spoken Norwegian. Each tagger relies on a different machine learning mechanism: decision trees, hidden Markov models (HMMs), conditional random fields (CRFs), long short-term memory networks (LSTMs), and convolutional neural networks (CNNs). We discuss some of the challenges posed by tagging spoken, as opposed to written, language, and in particular the wide range of dialects found in the recordings of the LIA (Language Infrastructure made Accessible) project. The results show that the taggers based on conditional random fields or neural networks perform much better than the rest, with the LSTM tagger achieving the highest score.

Keywords: part-of-speech tagging, spoken language, dialects, Norwegian

