Title:
The Anselm Corpus: Methods and perspectives of a parallel aligned corpus
Author:
Stefanie Dipper: Department of Linguistics, Ruhr University Bochum, Germany
Simone Schultz-Balluff: German Department, Ruhr University Bochum, Germany
Year:
2013
Conference:
Proceedings of the Workshop on Computational Historical Linguistics at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 18
Issue:
087
Article no.:
003
Pages:
27-42
No. of pages:
16
Publication type:
Abstract and Fulltext
Published:
2013-05-17
ISBN:
978-91-7519-587-2
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press; Linköpings universitet



This paper presents ongoing work in the Anselm project at Ruhr University Bochum, which deals with a parallel corpus of historical language data. We first present our corpus, which consists of about 50 versions of the medieval text Interrogatio Sancti Anselmi de Passione Domini (‘Questions by Saint Anselm about the Lord’s Passion’), written in different dialects from Early New High German, Middle Low German, and Middle Dutch. The versions were transcribed diplomatically and are currently being normalized and annotated with lemma and part of speech. In addition, the versions are being aligned at different levels of granularity (paragraph, sentence, phrase, word). We describe two use cases that profit from the annotations: one from historical lexical semantics, the other from historical syntax. Finally, we sketch further application scenarios from the historico-cultural domain of Digital Humanities.

Keywords: Parallel corpus; Early New High German; lexical semantics; extraposition
