Selected papers from the CLARIN Annual Conference 2016, Aix-en-Provence, 26–28 October 2016, CLARIN Common Language Resources and Technology Infrastructure
Lars Borin: University of Gothenburg, Sweden
This volume contains a selection of papers presented at the CLARIN Annual Conference 2016, which was held in Aix-en-Provence, France, on 26–28 October 2016.

This was the fifth edition of the conference. It started in 2012 as an internal event, where members of the national CLARIN consortia came together to share their experiences of and thoughts on the development of the CLARIN ERIC infrastructure.

In 2014, it was felt that the time was ripe to change the format of the conference into an event with an open call for contributions, in order to also include the humanities and social-science research communities – the intended users of the infrastructure – in the exchange of ideas and experiences on the CLARIN infrastructure. This exchange covers the infrastructure’s design, construction and operation, the data and services that it contains or should contain, its actual use by researchers, its relation to other infrastructures and projects, and the CLARIN Knowledge Sharing Infrastructure.

In response to the 2016 call for papers we received 34 anonymous extended abstracts, each of which was reviewed anonymously by at least three members of the program committee (PC). As always, the PC consisted of the members of the CLARIN ERIC National Coordinators’ Forum, i.e., one member from each participating country or NGO. In order to avoid conflicts of interest, no PC member reviewed submissions from their own country. The reviewing process resulted in 25 submissions being accepted for presentation at the conference, 14 as oral presentations and 11 as posters.

In addition to the submitted presentations, the conference featured two invited speakers. The keynote on the first day was presented by Professor Ian Gregory from Lancaster University, under the title “Texts, language and geography: Understanding literature using geographical text analysis”, and on the second day, Professor Sally Wyatt, Maastricht University, talked about “Why technologies are not neutral, and why it matters for linguists”.

As a new feature, the CLARIN 2016 call for papers included a call for submissions to a thematic session, focusing on “Language resources and historical sources”. The general area of interest for the thematic session was stated in the call for papers as CLARIN-related research in the historical sciences, understood in a wide sense to encompass fields such as History, “History of ...”/“... history” (e.g., History of science, Rhetorical history), as well as the various historically oriented subfields of linguistics (e.g., Historical linguistics, Historical pragmatics, etc.), and philology. We invited submissions on two separate but overlapping aspects that we construed this theme to encompass:

  1. The historical aspect in a narrower sense: Processing historical language stages in the form of text or speech, with the concomitant issues of digitization, non-standardized language, etc.
  2. The diachronic aspect: Discovering, characterizing and tracking change through time, both linguistic changes and changes in the world as reflected in the content of text.

Two of the oral presentations and several posters addressed this theme. Ian Gregory’s keynote speech and the two oral presentations were organized into a thematic session scheduled at the very beginning of the conference program.

The conference was video recorded; see the YouTube playlist: https://www.youtube.com/playlist?list=PLlKmS5dTMgw2pP-uvhKNVSgOuuZjvmLwy

Following the conference, authors of the accepted papers were invited to submit full versions of their papers to be considered for the conference proceedings volume, although this time the submissions were not anonymous. Again the papers were reviewed (anonymously) by two to four PC members, at least one of whom had not reviewed the original abstract submitted for the conference. We received 14 full-length submissions, of which 10 were accepted for this volume. Most of these address core CLARIN issues dealing with the construction, maintenance and use of the European infrastructure coordinated in the framework of the CLARIN ERIC, such as search engine design, resource discovery, metadata quality, researcher training in infrastructure use, and design of specific tools and resources. There is one “pure” research paper in this volume – by Hinrichs, Erdmann and Joseph – but many of the contributions refer to research conducted using the CLARIN infrastructure. In two cases – the papers by Beißwenger et al. and by MacWhinney – the focus is on resource-building with specific research questions or a specific research field in mind, where the research and infrastructure-building activities feed into each other and actually become hard to disentangle.

I would like to thank the reviewers for the dedicated effort they put into evaluating the submissions, and also Peter Berkesand at Linköping University Electronic Press, who (as usual) has ensured that the digital publication of this volume went smoothly and painlessly.

Lars Borin
University of Gothenburg
Program committee chair

Michael Beißwenger, Thierry Chanier, Tomaž Erjavec, Darja Fišer, Axel Herold, Nikola Ljubešić, Harald Lüngen, Céline Poudat, Egon Stemle, Angelika Storrer, Ciara Wigham
Closing a Gap in the Language Resources Landscape: Groundwork and Best Practices from Projects on Computer-mediated Communication in four European Countries

Matthijs Brouwer, Hennie Brugman, Marc Kemps-Snijders
MTAS: A Solr/Lucene based Multi Tier Annotation Search solution

Erhard Hinrichs, Alex Erdmann, Brian Joseph
What’s in A Name? The Case of Albanisch-Albanesisch and Broader Implications

Danijel Koržinek, Krzysztof Marasek, Łukasz Brocki, Krzysztof Wołk
Polish Read Speech Corpus for Speech Tools and Services

Vesna Lušicky, Tanja Wissik
Discovering Resources in the VLO: A Pilot Study with Students of Translation Studies

Brian MacWhinney
TalkBank and CLARIN

Davor Ostojic, Go Sugimoto, Matej Ďurčo
The Curation Module and Statistical Analysis on VLO Metadata Quality

Jean-Marie Pierrel, Christophe Parisse, Jérôme Blanchard, Etienne Petitjean, Frédéric Pierre
ORTOLANG: a French Infrastructure for Open Resources and TOols for LANGuage

Thomas Schmidt, Hanna Hedeland, Daniel Jettka
Conversion and Annotation Web Services for Spoken Language Data in CLARIN

Tanja Wissik, Claudia Resch
Researcher Hands-On Training in the Digital Humanities: The ACDH Tool Gallery as an Austrian Case Study

