Article | Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16 | Using Finite State Transducers for Making Efficient Reading Comprehension Dictionaries

Title:
Using Finite State Transducers for Making Efficient Reading Comprehension Dictionaries
Author:
Ryan Johnson: University of Tromsø, Norway
Lene Antonsen: University of Tromsø, Norway
Trond Trosterud: University of Tromsø, Norway
Year:
2013
Conference:
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16
Issue:
085
Article no.:
010
Pages:
59-71
No. of pages:
13
Publication type:
Abstract and Fulltext
Published:
2013-05-17
ISBN:
978-91-7519-589-6
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press; Linköpings universitet



This article presents a novel way of combining finite-state transducers (FSTs) with electronic dictionaries, thereby creating efficient reading comprehension dictionaries. We compare a North Saami-Norwegian and a South Saami-Norwegian dictionary, both enriched with an FST, with existing, available dictionaries containing pre-generated paradigms, and show the advantages of our approach. Being more flexible, the FSTs may also adjust the dictionary to different contexts. The finite-state transducer analyses the word to be looked up, and the dictionary itself conducts the actual lookup. The FST part is crucial for morphology-rich languages, where as little as 10% of the wordforms in running text are actually lemma forms. If a compound or derived word, or a word with an enclitic particle, is not found in the dictionary, the FST will give the stems and derivation affixes of the wordform, and each of the stems will be given a separate translation. In this way, the coverage of the FST-dictionary will be far larger than that of an ordinary dictionary of the same size.
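
The division of labour described in the abstract (the FST analyses the wordform, the dictionary does the lookup, and unlisted compounds fall back to stem-by-stem translation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: a plain lookup table stands in for a real FST, English wordforms stand in for Saami ones, and the analysis format (stems joined by "#", tags appended with "+") as well as the names analyse, split_analysis and look_up are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy data: a lookup table stands in for a real FST, and
# English wordforms stand in for Saami ones. The analysis format
# (stems joined by "#", tags appended with "+") is assumed for this sketch.
TOY_ANALYSES: Dict[str, List[str]] = {
    "houses": ["house+N+Pl"],
    "bookshelves": ["book#shelf+N+Pl"],
}

# Toy lemma dictionary with Norwegian translations. "bookshelf" is left out
# on purpose so that the compound fallback is exercised.
TOY_DICTIONARY: Dict[str, str] = {
    "house": "hus",
    "book": "bok",
    "shelf": "hylle",
}


def analyse(wordform: str) -> List[str]:
    """Return all analyses of a wordform (empty list if unknown)."""
    return TOY_ANALYSES.get(wordform, [])


def split_analysis(analysis: str) -> Tuple[List[str], List[str]]:
    """Split an analysis string into its stems and its grammatical tags."""
    lemma_part, *tags = analysis.split("+")
    return lemma_part.split("#"), tags


def look_up(wordform: str) -> List[dict]:
    """Analyse the wordform, then look the result up in the dictionary.

    If the whole lemma is listed, its entry is returned. Otherwise each
    stem is translated separately (None marks a stem that is not listed).
    """
    entries = []
    for analysis in analyse(wordform):
        stems, tags = split_analysis(analysis)
        lemma = "".join(stems)
        translations: List[Tuple[str, Optional[str]]]
        if lemma in TOY_DICTIONARY:
            translations = [(lemma, TOY_DICTIONARY[lemma])]
        else:
            translations = [(stem, TOY_DICTIONARY.get(stem)) for stem in stems]
        entries.append({"lemma": lemma, "tags": tags, "translations": translations})
    return entries


if __name__ == "__main__":
    for word in ("houses", "bookshelves", "unknownword"):
        print(word, look_up(word))
```

In this sketch the inflected form "houses" is resolved through its lemma, while the unlisted compound "bookshelves" is still given one translation per stem, which is the sense in which coverage can exceed the number of dictionary entries.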

Keywords: Lexicography; Computational Morphology; Orthographic Variation; Finite-state Transducers; Electronic Dictionaries



