Article | Proceedings of the workshop on lexical semantic resources for NLP at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 19 | Enriching a wordnet from a thesaurus

Title:
Enriching a wordnet from a thesaurus
Author:
Sanni Nimb: Society for Danish Language and Literature, Denmark; Bolette S. Pedersen: University of Copenhagen, Denmark; Anna Braasch: University of Copenhagen, Denmark; Nicolai Sørensen: Society for Danish Language and Literature, Denmark; Thomas Troelsgård: Society for Danish Language and Literature, Denmark
Year:
2013
Conference:
Proceedings of the workshop on lexical semantic resources for NLP at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 19
Issue:
088
Article no.:
005
Pages:
36-50
No. of pages:
15
Publication type:
Abstract and Fulltext
Published:
2013-05-17
ISBN:
978-91-7519-586-5
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press; Linköpings universitet



Wordnets are traditionally built around synonym sets, with the vertical hyponymy relation as the central structuring principle. The hyponymy relation, however, does not necessarily group concepts into synsets that are particularly close from a thematic or functional point of view, a phenomenon sometimes referred to as “ISA overload” or, when considered from a thematic viewpoint, the “tennis problem”. In this paper we present two experiments. The first concerns a method for remedying these problems by transferring thematic information from a thesaurus to a wordnet (from the Danish Thesaurus to DanNet). In this way we can automatically subdivide co-hyponyms thematically as well as relate synsets thematically across parts of speech. Since the thesaurus is not yet complete, the paper describes work in progress; nevertheless, with an error rate below 5% for the most coarse-grained transferred themes, the experiment appears very promising. The second experiment concerns the extension of DanNet via the Danish Thesaurus: the thematic organisation of the thesaurus into near-synonyms is further applied as a very precise method for automatically extending the lexical coverage of DanNet.

Keywords: Wordnet; “tennis problem”; ISA overload; thesaurus; thematic information
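
The thematic subdivision of co-hyponyms mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data, the names `HYPERNYM`, `THESAURUS_THEME` and `subdivide_cohyponyms`, and the "unthemed" fallback label are all hypothetical, and the toy English entries are invented rather than taken from DanNet or the Danish Thesaurus. The sketch only assumes that each wordnet entry has a hypernym and that some entries can be looked up in a thesaurus that assigns them a coarse-grained theme.

```python
from collections import defaultdict

# Hypothetical toy wordnet: entry -> hypernym (invented data, not DanNet).
HYPERNYM = {
    "racket": "equipment",
    "ball": "equipment",
    "net": "equipment",
    "saddle": "equipment",
    "stirrup": "equipment",
    "compass": "equipment",
}

# Hypothetical toy thesaurus: entry -> coarse-grained theme
# (invented data, not the Danish Thesaurus).
THESAURUS_THEME = {
    "racket": "tennis",
    "ball": "tennis",
    "net": "tennis",
    "saddle": "horse riding",
    "stirrup": "horse riding",
}


def subdivide_cohyponyms(hypernym, theme_of):
    """Group the hyponyms of each hypernym by the theme found in the thesaurus.

    Entries without a thesaurus theme are collected under "unthemed".
    """
    groups = defaultdict(lambda: defaultdict(list))
    for entry, parent in hypernym.items():
        theme = theme_of.get(entry, "unthemed")
        groups[parent][theme].append(entry)
    # Convert to plain dicts with sorted members for stable output.
    return {
        parent: {theme: sorted(members) for theme, members in themes.items()}
        for parent, themes in groups.items()
    }


result = subdivide_cohyponyms(HYPERNYM, THESAURUS_THEME)
for theme, members in sorted(result["equipment"].items()):
    print(theme, members)
```

In this toy setting the six co-hyponyms of "equipment" fall into a tennis group, a horse riding group, and one entry with no theme, which is the kind of grouping that the hyponymy relation alone does not provide.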




Responsible for this page: Peter Berkesand
Last updated: 2017-02-21