Article | Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden | Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations
Göm menyn

Title:
Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations
Author:
Johannes Bjerva: Center for Language and Cognition Groningen, University of Groningen, The Netherlands Robert √Ė stling: Department of Linguistics, Stockholm University, Sweden
Download:
Full text (pdf)
Year:
2017
Conference:
Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden
Issue:
131
Article no.:
024
Pages:
211-215
No. of pages:
5
Publication type:
Abstract and Fulltext
Published:
2017-05-08
ISBN:
978-91-7685-601-7
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet


Export in BibTex, RIS or text

Assessing the semantic similarity between sentences in different languages is challenging. We approach this problem by leveraging multilingual distributional word representations, where similar words in different languages are close to each other. The availability of parallel data allows us to train such representations on a large amount of languages. This allows us to leverage semantic similarity data for languages for which no such data exists. We train and evaluate on five language pairs, including English, Spanish, and Arabic. We are able to train wellperforming systems for several language pairs, without any labelled data for that language pair.

Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden

Author:
Johannes Bjerva, Robert √Ė stling
Title:
Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations
References:

Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Rada Mihalcea, German Rigau, and Janyce Wiebe. 2016. Semeval-2016 task 1: Semantic textual similarity, monolingual and cross-lingual evaluation. In Proceedings of SemEval, pages 497‚Äď511.


Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. 2013. Polyglot: Distributed word representations for multilingual nlp. CoNLL-2013.


Hanan Aldarmaki and Mona Diab. 2016. GWU NLP at SemEval-2016 Shared Task 1: Matrix factorization for crosslingual STS. In Proceedings of SemEval 2016, pages 663‚Äď667.


Islam Beltagy, Stephen Roller, Pengxiang Cheng, Katrin Erk, and Raymond J Mooney. 2016. Representing meaning with a combination of logical and distributional models. Computational Linguistics. John R Firth. 1957. A synopsis of linguistic theory, 1930-1955. Blackwell.


Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Aistats, volume 9, pages 249‚Äď256.


Jiang Guo, Wanxiang Che, David Yarowsky, Haifeng Wang, and Ting Liu. 2016. A representation learning framework for multi-source transfer parsing. In Proc. of AAAI.


Diederik Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.


Omer Levy and Yoav Goldberg. 2014. Dependencybased word embeddings. In ACL, pages 302‚Äď308.


Chi-kiu Lo, Cyril Goutte, and Michel Simard. 2016. Cnrc at semeval-2016 task 1: Experiments in crosslingual semantic textual similarity. Proceedings of SemEval, pages 668‚Äď673.


Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.


Robert √Ėstling and J√∂rg Tiedemann. 2016. Efficient word alignment with markov chain monte carlo. The Prague Bulletin of Mathematical Linguistics, 106(1):125‚Äď146.


Nitish Srivastava, Geoffrey E Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014.


Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929‚Äď1958.

Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden

Author:
Johannes Bjerva, Robert √Ė stling
Title:
Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations
Note: the following are taken directly from CrossRef
Citations:
No citations available at the moment


Responsible for this page: Peter Berkesand
Last updated: 2017-02-21