The tradition of biennial Nordic conferences in Computational Linguistics and related disciplines dates back to 1977, well before our professional organization, the Northern European Association for Language Technology (NEALT; http://omilia.uio.no/nealt/), was formally established. With a sense of tradition as well as pride, this volume comprises the proceedings of the 19th Nordic Conference on Computational Linguistics (NODALIDA 2013), held on the campus of the University of Oslo, Norway, between May 22 and May 24, 2013. On the first day of NODALIDA 2013, four topical workshops are held, each with its own set of organizers and programme committee; these workshops have compiled their own proceedings volumes, which are published in the same series and included on the media distributed at the conference.

NODALIDA addresses all aspects of speech recognition and synthesis, natural language processing, and computational linguistics - including work in closely related neighbouring disciplines (such as, for example, linguistics or psychology) that is sufficiently formalized or applied to bear relevance to speech and language technologies. Following the pattern of previous years, the Programme Committee invited paper submissions in four distinct tracks:

  • regular papers on substantial, original, and unpublished research, including empirical evaluation results, where appropriate;
  • student papers on completed or ongoing work, where at least the first author is a Master- or PhD-level student;
  • short papers on smaller, focused contributions, work in progress, negative results, surveys, or opinion pieces; and
  • demonstration papers summarizing a software system or language resource, to be accompanied by a live demonstration at the conference.

The conference received 60 submissions from all over Europe (and one each from Mexico and the US), of which 38 are collected in this volume and will be presented at the conference: 13 regular, 6 student, 12 short, and 7 demonstration papers. All submissions were reviewed by at least three experts in the field (two for demonstration papers), and the final selection was made by the Programme Committee. We are indebted to everyone who contributed to the reviewing and selection process. The conference programme is complemented by three invited keynotes by distinguished researchers from Denmark, Germany, and the US, as well as by a special session on High-Performance Computing for Natural Language Processing.

NODALIDA 2013 is made possible by the joint work of many dedicated individuals, in particular the Programme and Organizing Committees; we warmly acknowledge their enthusiasm and community spirit. From the Organizing Committee, Kristin Hagen deserves a special note of gratitude, as the untiring ‘heart and soul’ of the conference logistics. We are grateful to the Department of Linguistics and Scandinavian Studies and the Department of Informatics at the University of Oslo for generously making available infrastructure and staff time. The conference is financially supported by organizations listed on the back cover, who thus make an important contribution to keeping participation fees at quite reasonable levels (by Norwegian standards).

With just about two more weeks to go, we expect some 150 participants at the conference and very much look forward to welcoming our colleagues and peers to Oslo.

Stephan Oepen (Programme Chair); Janne Bondi Johannessen (Organizing Chair)

Invited Keynotes

Ron Kaplan
The Conversational User Interface

Caroline Sporleder
Detecting and Processing Figurative Language in Discourse

Anders Søgaard
6,909 Reasons to Mess Up Your Data

Special Session on HPC for NLP

Gudmund Høst
The Nordic e-Infrastructure Collaboration: Opportunities for Synergy Without Borders

Stephan Oepen
Tidying up the Basement: A Tale of Large-Scale Parsing on National eInfrastructure

Jörg Tiedemann
Experiences in Building the Let’s MT! Portal on Amazon EC2

Regular Papers

Eckhard Bick
Using Constraint Grammar for Chunking

Johan Falkenjack, Katarina Heimann Mühlenbock, Arne Jönsson
Features indicating readability in Swedish text

Katri Haverinen, Veronika Laippala, Samuel Kohonen, Anna Missilä, Jenna Nyblom, Stina Ojala, Timo Viljanen, Tapio Salakoski, Filip Ginter
Towards a Dependency-based PropBank of General Finnish

Ryan Johnson, Lene Antonsen, Trond Trosterud
Using Finite State Transducers for Making Efficient Reading Comprehension Dictionaries

Jurgita Kapočiūtė-Dzikienė, Anders Nøklestad, Janne Bondi Johannessen, Algis Krupavičius
Exploring Features for Named Entity Recognition in Lithuanian Text Corpus

Hrafn Loftsson
Tagging the Past: Experiments using the Saga Corpus

Hrafn Loftsson, Robert Östling
Tagging a Morphologically Complex Language Using an Averaged Perceptron Tagger: The Case of Icelandic

Magnus Merkel, Jody Foo, Lars Ahrenberg
IPhraxtor - A linguistically informed system for extraction of term candidates

Constanza Navarretta, Patrizia Paggio
Classifying Multimodal Turn Management in Danish Dyadic First Encounters

Bolette S. Pedersen, Lars Borin, Markus Forsberg, Neeme Kahusk, Krister Lindén, Jyrki Niemi, Niklas Nisbeth, Lars Nygaard, Heili Orav, Eiríkur Rögnvaldsson, Mitchel Seaton, Kadri Vider, Kaarlo Voionmaa
Nordic and Baltic wordnets aligned and compared through “WordTies”

Eva Pettersson, Beáta Megyesi, Joakim Nivre
Normalisation of Historical Text Using Context-Sensitive Weighted Levenshtein Distance and Compound Splitting

Teemu Ruokolainen, Miikka Silfverberg
Modeling OOV Words With Letter N-Grams in Statistical Taggers: Preliminary Work in Biomedical Entity Recognition

Lars Borin, Inguna Skadina, Andrejs Vasiljevs, Krister Lindén, Gyri Losnegaard, Sussi Olsen, Bolette S. Pedersen, Roberts Rozis, Koenraad De Smedt
Baltic and Nordic Parts of the European Linguistic Infrastructure

Student Papers

Liesbeth Augustinus, Peter Dirix
The IPP effect in Afrikaans: a corpus analysis

Christopher Horn, Alisa Zhila, Alexander Gelbukh, Roman Kern, Elisabeth Lex
Using Factual Density to Measure Informativeness of Web Documents

Tapio Luostarinen, Oskar Kohonen
Using Topic Models in Content-Based News Recommender Systems

Bernd Opitz, Cäcilia Zirn
Bootstrapping an Unsupervised Approach for Classifying Agreement and Disagreement

Pēteris Paikens, Laura Rituma, Lauma Pretkalnina
Morphological analysis with limited resources: Latvian example

Lauma Pretkalnina, Laura Rituma
Statistical syntactic parsing for Latvian

Short Papers

Filip Ginter, Jenna Nyblom, Veronika Laippala, Samuel Kohonen, Katri Haverinen, Simo Vihjanen, Tapio Salakoski
Building a Large Automatically Parsed Corpus of Finnish

Lars Hellan, Tore Bruland
Constructing a Multilingual Database of Verb Valence

Jussi Karlgren
New Measures to Investigate Term Typology by Distributional Data

Andreas Søeborg Kirkedal
Analysis of phonetic transcriptions for Danish automatic speech recognition

Samuel Läubli, Mark Fishel, Martin Volk, Manuela Weibel
Combining Statistical Machine Translation and Translation Memories with Domain Adaptation

Sjur N. Moshagen, Tommi A. Pirinen, Trond Trosterud
Building an open-source development infrastructure for language technology projects

Gailius Raškinis, Asta Kazlauskienė
From speech corpus to intonation corpus: clustering phrase pitch contours of Lithuanian

Jonathon Read, Rebecca Dridan, Stephan Oepen
Simple and Accountable Segmentation of Marked-up Text

Sara Stymne, Jörg Tiedemann, Christian Hardmeier, Joakim Nivre
Statistical Machine Translation with Readability Constraints

Hideyuki Tanushi, Hercules Dalianis, Martin Duneld, Maria Kvist, Maria Skeppstedt, Sumithra Velupillai
Negation Scope Delimitation in Clinical Text Using Three Approaches: NegEx, PyConTextNLP and SynNeg

Marcus Uneson
Tone restoration in transcribed Kammu: Decision-list word sense disambiguation for an unwritten language

Nynke van der Vliet, Gosse Bouma, Gisela Redeker
The automatic identification of discourse units in Dutch text

Demonstration Papers

Liesbeth Augustinus, Vincent Vandeghinste, Ineke Schuurman, Frank Van Eynde
Example-Based Treebank Querying with GrETEL - now also for Spoken Dutch

Malin Ahlberg, Lars Borin, Markus Forsberg, Martin Hammarstedt, Leif-Jöran Olsson, Olof Olsson, Johan Roxendal, Jonatan Uppström
Korp and Karp - a bestiary of language resources: the research infrastructure of Språkbanken

Lars Hellan, Tore Bruland, Elias Aamot, Mads H. Sandøy
A Grammar Sparrer for Norwegian

Mans Hulden, Miikka Silfverberg, Jerid Francom
Finite state applications with Javascript

Emanuele Lapponi, Erik Velldal, Nikolay A. Vazov, Stephan Oepen
HPC-ready Language Analysis for Human Beings

Paul Meurer, Helge Dyvik, Victoria Rosén, Koenraad De Smedt, Gunn Inger Lyse, Gyri Smørdal Losnegaard, Martha Thunes
The INESS Treebanking Infrastructure

Per Erik Solberg
Building gold-standard treebanks for Norwegian

