Title:
The Role of Diacritics in Increasing the Difficulty of Arabic Lexical Recognition Tests
Author:
Osama Hamed: Language Technology Lab, University of Duisburg-Essen, Germany
Torsten Zesch: Language Technology Lab, University of Duisburg-Essen, Germany
Download:
Full text (pdf)
Year:
2018
Conference:
Proceedings of the 7th Workshop on NLP for Computer Assisted Language Learning (NLP4CALL 2018) at SLTC, Stockholm, 7th November 2018
Issue:
152
Article no.:
003
Pages:
23-31
No. of pages:
9
Publication type:
Abstract and Fulltext
Published:
2018-11-02
ISBN:
978-91-7685-173-9
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



Lexical recognition tests (LRTs) are widely used to assess learners’ vocabulary size. We investigate the role that diacritics play in increasing the difficulty of an Arabic lexical recognition test. We implement an NLP pipeline to reliably estimate the frequency of diacritized word forms. We then conduct a user study comparing Arabic LRTs in three settings: one without diacritics, and two diacritized using the most frequent and the least frequent diacritized form of each word, respectively. We find that using infrequent diacritized forms is the more effective way to increase the difficulty of Arabic LRTs.
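
The idea of selecting the most frequent and least frequent diacritized form of a word can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which this page does not describe in detail. It assumes an already diacritized, tokenized corpus, and the function names and the toy token list are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumption: "diacritics" here means the Arabic marks U+064B..U+0652
# (tanwin, short vowels, shadda, sukun) plus superscript alef U+0670.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(word):
    """Return the word with all diacritic marks removed."""
    return DIACRITICS.sub("", word)


def diacritized_form_counts(tokens):
    """Map each undiacritized word to a Counter of its diacritized forms."""
    counts = defaultdict(Counter)
    for token in tokens:
        counts[strip_diacritics(token)][token] += 1
    return counts


def most_and_least_frequent(counts, bare_word):
    """Return (most frequent, least frequent) diacritized form of bare_word."""
    forms = counts.get(bare_word)
    if not forms:
        raise KeyError(bare_word)
    ranked = forms.most_common()  # sorted by descending count
    return ranked[0][0], ranked[-1][0]


# Toy corpus (hypothetical): two diacritized readings of the letters k-t-b.
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"  # "he wrote"
KUTUB = "\u0643\u064F\u062A\u064F\u0628"         # "books"
BARE = "\u0643\u062A\u0628"                      # undiacritized form

counts = diacritized_form_counts([KATABA, KATABA, KUTUB, KATABA])
frequent, infrequent = most_and_least_frequent(counts, BARE)
```

In this toy corpus, `frequent` is the form counted three times and `infrequent` the form counted once. A real corpus would additionally need tokenization, normalization, and a policy for ties and for partially diacritized tokens.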



Keywords: Lexical Recognition Tests, Arabic LRTs, Vocabulary Size, Diacritics, Frequency Counts, Test Difficulty/Generation
