Article | Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies, 22 May, Gothenburg Sweden | Communicative efficiency and syntactic predictability: A cross-linguistic study based on the Universal Dependencies corpora

Title:
Communicative efficiency and syntactic predictability: A cross-linguistic study based on the Universal Dependencies corpora
Author:
Natalia Levshina: Leipzig University, Leipzig, Germany
Year:
2017
Conference:
Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies, 22 May, Gothenburg Sweden
Issue:
135
Article no.:
009
Pages:
72-78
No. of pages:
7
Publication type:
Abstract and Fulltext
Published:
2017-05-29
ISBN:
978-91-7685-501-0
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



There is ample evidence that human communication is organized efficiently: more predictable information is usually encoded by shorter linguistic forms, and less predictable information by longer forms. The present study, which is based on the Universal Dependencies corpora, investigates whether word length can be predicted from average syntactic information content, defined as the average information content of a word given its counterpart in a dyadic syntactic relationship. The effect of this variable is tested on data from nine typologically diverse languages while controlling for other well-known parameters: word frequency and average word predictability based on the preceding and following words. Poisson generalized linear models and conditional random forests show that words with higher average syntactic informativity are usually longer in most languages, although this effect often appears in interaction with average information content based on the neighbouring words. The results demonstrate that syntactic predictability should be considered as a separate factor in future work on communicative efficiency.
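
As an illustration of the central variable, the sketch below computes an average information content for each word given its counterpart in a dependency pair. The abstract does not give the exact formula, so this assumes the usual surprisal-based definition: the mean of -log2 P(word | counterpart) over all occurrences of the word. The function name, the toy pairs, and the choice to ignore relation labels and direction are hypothetical simplifications, not the author's procedure.

```python
import math
from collections import Counter, defaultdict


def average_syntactic_information(pairs):
    """Mean surprisal (in bits) of each word given its syntactic counterpart.

    `pairs` is an iterable of (word, counterpart) tuples, one per dependency
    relation token. Assumed definition: for each word w, average
    -log2 P(w | c) over all of its occurrences, with P estimated from counts.
    """
    pairs = list(pairs)
    pair_counts = Counter(pairs)
    counterpart_counts = Counter(c for _, c in pairs)

    surprisal_sums = defaultdict(float)
    occurrences = Counter()
    for (word, counterpart), n in pair_counts.items():
        p = n / counterpart_counts[counterpart]  # P(word | counterpart)
        surprisal_sums[word] += n * -math.log2(p)
        occurrences[word] += n

    return {w: surprisal_sums[w] / occurrences[w] for w in surprisal_sums}


# Hypothetical toy data: (word, counterpart) pairs from dependency relations.
toy_pairs = [
    ("dog", "barks"), ("dog", "barks"), ("cat", "barks"),
    ("dog", "the"), ("cat", "the"),
    ("zebra", "stripes"),
]
scores = average_syntactic_information(toy_pairs)
```

In this toy data "cat" is less predictable from its counterparts than "dog" and so receives a higher score, while "zebra", which is the only word seen with its counterpart, scores zero bits. In the study such scores would enter a model of word length alongside frequency and neighbour-based predictability.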


