Article | Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16 | Statistical syntactic parsing for Latvian

Title:
Statistical syntactic parsing for Latvian
Author:
Lauma Pretkalnina: Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia
Laura Rituma: Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia
Year:
2013
Conference:
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16
Issue:
085
Article no.:
025
Pages:
279-289
No. of pages:
11
Publication type:
Abstract and Fulltext
Published:
2013-05-17
ISBN:
978-91-7519-589-6
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



Syntactic parsing is an important technique in natural language processing, yet Latvian still lacks an efficient general-coverage syntactic parser. This paper reports on the first experiments in statistical syntactic parsing for Latvian, a highly inflective Indo-European language with relatively free word order. We induced a statistical parser from a small, non-balanced Latvian Treebank using the MaltParser toolkit and measured the unlabeled attachment score (UAS). As MaltParser is based on the dependency grammar approach, we also developed a converter from the hybrid dependency-based annotation model used in the Latvian Treebank to a pure dependency annotation model. We obtained a promising 74.63% UAS in 10-fold cross-validation using only ~2500 sentences. The best results were achieved with the non-projective stack parsing algorithm with the lazy arc-adding strategy, but comparably good results can be achieved with projective parsing algorithms combined with appropriate projectivization preprocessing.
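
For readers unfamiliar with the terms used in the abstract, the sketch below shows one common way to compute an unlabeled attachment score and to test whether a dependency tree is projective. It is an illustration only and not the authors' evaluation code: the function names, the encoding of a sentence as a list of head indices, and the use of 0 for the artificial root are all assumptions made for this example.

```python
from typing import Sequence


def unlabeled_attachment_score(gold_heads: Sequence[Sequence[int]],
                               pred_heads: Sequence[Sequence[int]]) -> float:
    """Percentage of tokens whose predicted head equals the gold head.

    Each sentence is a list of head indices; heads[i] is the head of
    token i + 1, and 0 denotes the artificial root (assumed encoding).
    """
    correct = 0
    total = 0
    for gold, pred in zip(gold_heads, pred_heads):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold, pred))
        total += len(gold)
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct / total


def is_projective(heads: Sequence[int]) -> bool:
    """True if no two dependency arcs cross (root placed at position 0)."""
    # Represent every arc as an (left, right) pair of token positions.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for a_lo, a_hi in arcs:
        for b_lo, b_hi in arcs:
            # Two arcs cross when exactly one endpoint of the second
            # lies strictly inside the span of the first.
            if a_lo < b_lo < a_hi < b_hi:
                return False
    return True


if __name__ == "__main__":
    gold = [[2, 0, 2], [0, 1]]
    pred = [[2, 0, 1], [0, 1]]
    print(unlabeled_attachment_score(gold, pred))
    print(is_projective([2, 0, 2]), is_projective([3, 4, 0, 3]))
```

In a 10-fold cross-validation setting such as the one the abstract describes, a score of this kind would typically be computed on each held-out fold and then aggregated; how the paper aggregates the folds is not stated on this page.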

Keywords: Latvian; treebank; dependency parsing; statistical parsing; MaltParser



