Article | Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden | Twitter Topic Modeling by Tweet Aggregation

Title:
Twitter Topic Modeling by Tweet Aggregation
Author:
Asbjørn Ottesen Steinskog: Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
Jonas Foyn Therkelsen: Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
Björn Gambäck: Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
Year:
2017
Conference:
Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden
Issue:
131
Article no.:
010
Pages:
77-86
No. of pages:
10
Publication type:
Abstract and Fulltext
Published:
2017-05-08
ISBN:
978-91-7685-601-7
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



Conventional topic modeling schemes, such as Latent Dirichlet Allocation, are known to perform inadequately when applied to tweets, because such short documents are sparse. To alleviate this problem, we apply several pooling techniques that aggregate similar tweets into individual documents, and specifically study the aggregation of tweets sharing authors or hashtags. The results show that this aggregation significantly increases topic coherence.
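
The pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pool_by_author` and `pool_by_hashtag`, the `(author, text)` tweet representation, and the sample tweets are all hypothetical. It also assumes that a tweet with several hashtags is added to every matching pool and that tweets without hashtags are left out of the hashtag pools; the paper may handle these cases differently.

```python
import re
from collections import defaultdict

# Simple hashtag pattern: '#' followed by word characters (an assumption).
HASHTAG_RE = re.compile(r"#(\w+)")


def pool_by_author(tweets):
    """Join all tweets by the same author into one pseudo-document."""
    pools = defaultdict(list)
    for author, text in tweets:
        pools[author].append(text)
    return {author: " ".join(texts) for author, texts in pools.items()}


def pool_by_hashtag(tweets):
    """Join all tweets sharing a hashtag (case-insensitive) into one pseudo-document."""
    pools = defaultdict(list)
    for _author, text in tweets:
        # Use a set so a hashtag repeated within one tweet is counted once.
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        for tag in tags:
            pools[tag].append(text)
    return {tag: " ".join(texts) for tag, texts in pools.items()}


# Hypothetical sample data: (author, tweet text) pairs.
tweets = [
    ("alice", "Great match tonight #football"),
    ("bob", "New phone arrived #tech"),
    ("alice", "Transfer rumours again #Football #news"),
]

author_docs = pool_by_author(tweets)
hashtag_docs = pool_by_hashtag(tweets)
```

Each value in `author_docs` or `hashtag_docs` is a longer pseudo-document that could be passed to a topic model such as Latent Dirichlet Allocation in place of the individual tweets.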



