Keywords: Estonian; Proficiency Classification; CEFR; Morphological Features; Machine Learning
Proceedings of the third workshop on NLP for computer-assisted language learning at SLTC 2014, Uppsala University