Article | Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16 | New Measures to Investigate Term Typology by Distributional Data

Title:
New Measures to Investigate Term Typology by Distributional Data
Author:
Jussi Karlgren: Kungliga Tekniska Högskolan, Stockholm, Sweden and Gavagai, Stockholm
Year:
2013
Conference:
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16
Issue:
085
Article no.:
028
Pages:
311-319
No. of pages:
9
Publication type:
Abstract and Fulltext
Published:
2013-05-17
ISBN:
978-91-7519-589-6
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press; Linköpings universitet



This report describes a series of exploratory experiments to establish whether terms of different semantic type can be distinguished in useful ways in a semantic space constructed from distributional data. The hypotheses explored in this paper are that some words are more variant in their distribution than others; that the varying semantic character of words is reflected in their distribution; that this distributional difference is encoded in current distributional models; but that the information is not accessible through the methods typically used when applying them. This paper proposes some new measures to explore variation that is encoded in distributional models but not usually put to use in understanding the character of the words represented in them. The exploratory findings indicate that some of the proposed measures vary widely across words of various types.

Keywords: Term typology; distributional semantics

References:

Hisamitsu, T., Niwa, Y., and Tsujii, J.-i. (2000). A method of measuring term representativeness: baseline method using co-occurrence distribution. In Proceedings of the 18th conference on Computational linguistics, pages 320–326, Morristown, NJ, USA. Association for Computational Linguistics.


Justeson, J. S. and Katz, S. M. (1995). Technical terminology: some linguistic properties and an algorithm for identification in text. Natural Language Engineering, 1:9–27.


Kanerva, P., Kristofersson, J., and Holst, A. (2000). Random indexing of text samples for latent semantic analysis. In Proceedings of the 22nd Annual Conference of the Cognitive Science Society, CogSci’00, page 1036. Erlbaum.


Katz, S. (1996). Distribution of content words and phrases in text and language modelling. Natural Language Engineering, 2(1):15–60.


Robertson, S. and Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval, 3:333–389.


Sahlgren, M. (2006). The Word-Space Model: Using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. PhD Dissertation, Department of Linguistics, Stockholm University.


Sahlgren, M., Holst, A., and Kanerva, P. (2008). Permutations as a means to encode order in word space. In Proceedings of the 30th Annual Conference of the Cognitive Science Society, CogSci’08, pages 1300–1305, Washington D.C., USA.


Smadja, F. (1993). Retrieving collocations from text: Xtract. Computational Linguistics, 19:143–177.


Spärck Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28:11–21.


Swadesh, M. (1971). The origin and diversification of language. Aldine, Chicago. Edited by Joel Sherzer post mortem.



Responsible for this page: Peter Berkesand
Last updated: 2017-02-21