Proceedings of the workshop on lexical semantic resources for NLP at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 19
Lars Borin: University of Gothenburg, Sweden
Ruth Vatvedt Fjeld: University of Oslo, Norway
Markus Forsberg: University of Gothenburg, Sweden
Sanni Nimb: Association for Danish Language and Literature, Denmark
Pierre Nugues: Lund University, Sweden
Bolette Sandford Pedersen: University of Copenhagen, Denmark
Linköping Electronic Conference Proceedings
NEALT Proceedings Series
Linköping University Electronic Press, Linköpings universitet

High-quality lexical semantic resources with sufficiently large vocabularies remain a serious bottleneck, not only in purely rule-based NLP applications but also in supervised corpus-based approaches. The oldest widely known lexical semantic resource, Princeton WordNet (PWN), has been around for over two decades. While PWN and the numerous wordnet projects for other languages that it has inspired adhere fairly closely to the traditional dictionary in their conception and organization, there are also lexical-semantic resources that attempt a closer integration of lexical information and corpus data. Such resources can be seen either as extremely richly exemplified lexicons or as extremely deeply annotated corpora, depending on your outlook. Berkeley FrameNet, VerbNet, PropBank and several others can be mentioned in this connection. A recent trend, in the wake of the increased awareness of the importance of standardization and interoperability of language resources, is the development towards large-scale integration of lexical resources (variously referred to as “lexical cores”, “lexical macroresources”, “lexical resource networks”, and the like) both within and across languages, the ultimate expression of which is at the moment the linked open data in linguistics movement.

For largely extraneous reasons, English-language resources tend to receive most attention in the LT literature, but there is an increasing number of lexical semantic resources under development for many other languages, including Nordic, Baltic and other languages of the NEALT area.

In parallel with this development of new lexical semantic resources, much effort is put into exploring how such resources and formal ontologies can be made to work together in knowledge-based systems. The workshop – a follow-up to the successful Nodalida 2009 workshop, where the focus was on wordnets – was intended to bring together researchers involved in building and integrating lexical semantic resources for NLP as well as researchers who are more theoretically interested in investigating the interplay between lexical semantics, lexicography, terminology and formal ontologies.

We invited papers presenting original research relating to lexical semantic resources for NLP on topics such as:

  • representation of lexical-semantic knowledge for computational use
  • the interplay between formal ontologies and lexical resources
  • corpus-based approaches to lexical semantic resources
  • terminology and lexical semantics: concept-based vs. lexical semantic approaches
  • monolingual vs. multilingual approaches to lexical-semantic resources and ontologies
  • word-space models for building and expanding ontologies
  • domain-specific classification: taxonomy and ontology – computational aspects
  • quality assessment of lexical-semantic resources: criteria, methods
  • computational use of lexical-semantic resources (information retrieval, semantic tagging of corpora, MT, etc.)
  • traditional lexicography and NLP lexicons: re-use and differences
  • cognitive aspects: computational lexical models as opposed to the “mental lexicon”

Out of the six submissions received, four were accepted for presentation at the workshop and inclusion in this proceedings volume after a thorough review procedure and subsequent revision by the authors of the papers. Each submission was reviewed by three (anonymous) members of the program committee.

The invited speaker at the workshop, Graeme Hirst (University of Toronto), presented some of his recent work on lexical semantic resources for NLP under the title “Ontologies versus lexical semantics”.

The workshop organizers:
WS website: http://spraakbanken.gu.se/eng/nodalida-lexsem-ws-2013

Acknowledgements: Financial support for the organization of the workshop has come in part from the Swedish Research Council (the project Swedish FrameNet++, contract no. 2010-6013), and in part from the University of Gothenburg, through its support of the Centre for Language Technology: http://www.clt.gu.se

Graeme Hirst
Ontologies versus lexical semantics

Linnéa Bäckström, Lars Borin, Markus Forsberg, Benjamin Lyngfelt, Julia Prentice, Emma Sköldberg
Automatic identification of construction candidates for a Swedish constructicon

Rune Lain Knudsen, Ruth Vatvedt Fjeld
LBK2013: A balanced, annotated national corpus for Norwegian Bokmål

Hamps Lilliehöök, Magnus Merkel
Clustering word senses from semantic mirroring data

Sanni Nimb, Bolette S. Pedersen, Anna Braasch, Nicolai Sørensen, Thomas Troelsgård
Enriching a wordnet from a thesaurus

Lars Borin, Ruth Vatvedt Fjeld, Markus Forsberg, Sanni Nimb, Pierre Nugues, Bolette Sandford Pedersen
