Article | Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language | Ambiguity in Semantically Related Word Substitutions: an investigation in historical Bible translations

Title:
Ambiguity in Semantically Related Word Substitutions: an investigation in historical Bible translations
Author:
Maria Moritz: Institute of Computer Science, University of Goettingen, Germany
Marco Büchler: Institute of Computer Science, University of Goettingen, Germany
Year:
2017
Conference:
Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language
Issue:
133
Article no.:
005
Pages:
18-23
No. of pages:
6
Publication type:
Abstract and Fulltext
Published:
2017-05-10
ISBN:
978-91-7685-503-4
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



Text reuse is a common way to transfer historical texts. It refers to the repetition of text in a new context and ranges from near-verbatim (literal) and paraphrasal reuse to completely non-literal reuse (e.g., allusions or translations). To improve the detection of reuse in historical texts, we need to better understand its characteristics. In this work, we investigate the relationship between paraphrasal reuse and word senses. Specifically, we investigate the conjecture that words with ambiguous word senses are less prone to replacement in paraphrasal text reuse. Our corpus comprises three historical English Bibles, one of which has previously been annotated with word senses. We perform an automated word-sense disambiguation based on supervised learning. By investigating this conjecture, we aim to understand whether unambiguous words are preferred for word replacements when text is reused and, consequently, whether they could serve as a discriminating feature for reuse detection.
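The conjecture can be made concrete with a small sketch. The code below is not the authors' pipeline: the sense inventory, the aligned word pairs, and the function name `replacement_rates` are all hypothetical and use invented toy values. It only illustrates one way to compare how often ambiguous and unambiguous words are replaced between two aligned texts.

```python
from collections import defaultdict

# Hypothetical sense inventory: lemma -> number of senses.
# Toy values for illustration only; they are not taken from the paper.
SENSE_COUNTS = {"lord": 1, "spirit": 3, "light": 4, "shepherd": 1}

# Hypothetical word alignments between two translations:
# (word in the source text, word in the reusing text). Invented examples.
ALIGNED_PAIRS = [
    ("lord", "lord"),
    ("shepherd", "pastor"),
    ("spirit", "spirit"),
    ("light", "light"),
    ("shepherd", "herdsman"),
    ("spirit", "ghost"),
]


def replacement_rates(pairs, sense_counts):
    """Return the share of replaced words, split by source-word ambiguity."""
    totals = defaultdict(int)
    replaced = defaultdict(int)
    for source, target in pairs:
        senses = sense_counts.get(source)
        if senses is None:
            continue  # no sense information for this word, so skip it
        # A word counts as ambiguous here if it has more than one sense.
        group = "ambiguous" if senses > 1 else "unambiguous"
        totals[group] += 1
        if source != target:
            replaced[group] += 1
    return {group: replaced[group] / totals[group] for group in totals}


rates = replacement_rates(ALIGNED_PAIRS, SENSE_COUNTS)
print(rates)
```

On real data, the sense counts would come from the word-sense annotation and the pairs from an alignment of the Bible translations. A higher rate for the unambiguous group would be in line with the conjecture, while the toy numbers here carry no evidential weight.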



