Acquiring Sentiment from Twitter using Supervised Learning and Lexicon-based Techniques

Authors

  • Jitrlada ROJRATANAVIJIT, School of Applied Statistics, National Institute of Development Administration, Bangkok 10240
  • Preecha VICHITTHAMAROS, School of Applied Statistics, National Institute of Development Administration, Bangkok 10240
  • Sukanya PHONGSUPHAP, Faculty of Information and Communication Technology, Mahidol University, Bangkok 10400

DOI:

https://doi.org/10.48048/wjst.2018.2731

Keywords:

Twitter, sentiment analysis, social media content, opinion mining, social media mining

Abstract

The emergence of Twitter in Thailand has given millions of users a platform to express and share their opinions about products, services, and other subjects. Twitter is therefore considered a rich source of information for companies, which can extract and analyze sentiment from Tweets to understand their customers and to monitor public opinion on their brands, products, and services quickly and effectively. However, sentiment analysis of Thai Tweets faces language-related challenges, such as the differences between the Thai and English writing systems, short messages, slang, and variation in word usage. This paper focuses on Tweet classification and on solving data sparsity issues. We propose a mixed method that combines supervised learning and lexicon-based techniques to filter Thai opinions and then classify them as positive, negative, or neutral. The method applies a number of pre-processing steps before the text is fed to the classifier. Experimental results showed that the proposed method overcame limitations reported in previous studies and was very effective in most cases. The average accuracy was 84.80 %, with 82.42 % precision, 83.88 % recall, and an 82.97 % F-measure.
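To make the lexicon-based step and the four reported measures concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes messages are already tokenized, uses tiny hypothetical English lexicons in place of the paper's Thai resources, omits the supervised component entirely, and assumes macro-averaging over the three classes, which the abstract does not specify.

```python
from collections import Counter

# Hypothetical toy lexicons for illustration only; the paper's Thai
# lexicon is not reproduced here.
POSITIVE = {"good", "love", "fast"}
NEGATIVE = {"bad", "slow", "hate"}
LABELS = ("positive", "negative", "neutral")


def lexicon_label(tokens):
    """Label a pre-tokenized message by counting lexicon hits."""
    # Each positive word adds 1, each negative word subtracts 1.
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def evaluate(gold, predicted):
    """Return accuracy and macro-averaged precision, recall, F-measure."""
    pairs = Counter(zip(gold, predicted))
    accuracy = sum(pairs[(label, label)] for label in LABELS) / len(gold)
    precisions, recalls, f_measures = [], [], []
    for label in LABELS:
        true_pos = pairs[(label, label)]
        n_predicted = sum(pairs[(g, label)] for g in LABELS)
        n_gold = sum(pairs[(label, p)] for p in LABELS)
        precision = true_pos / n_predicted if n_predicted else 0.0
        recall = true_pos / n_gold if n_gold else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f_measure)
    n = len(LABELS)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f_measures) / n


if __name__ == "__main__":
    # Hypothetical labeled messages, already split into tokens.
    messages = [
        (["service", "good", "fast"], "positive"),
        (["delivery", "slow"], "negative"),
        (["shop", "opens", "today"], "neutral"),
        (["love", "it", "but", "slow"], "positive"),
    ]
    gold = [label for _, label in messages]
    predicted = [lexicon_label(tokens) for tokens, _ in messages]
    print(evaluate(gold, predicted))
```

The last toy message shows a typical weakness of pure word counting: one positive and one negative hit cancel out, so the message is labeled neutral. Cases like this are one reason to pair a lexicon with a trained classifier.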

Author Biography

Jitrlada ROJRATANAVIJIT, School of Applied Statistics, National Institute of Development Administration, Bangkok 10240

I have worked at the Metropolitan Electricity Authority (MEA), in the District Affairs Department, where my position is Data Processing Officer, Level 10. I am now a Ph.D. student at NIDA, Thailand.

Education

M.Sc., Computer Science, Mahidol University, Bangkok, Thailand, 2002.

B.Eng., Computer Engineering, Kasetsart University, Bangkok, Thailand, 1995. 

Published

2016-12-02

How to Cite

ROJRATANAVIJIT, J., VICHITTHAMAROS, P., & PHONGSUPHAP, S. (2016). Acquiring Sentiment from Twitter using Supervised Learning and Lexicon-based Techniques. Walailak Journal of Science and Technology (WJST), 15(1), 63–80. https://doi.org/10.48048/wjst.2018.2731

Section

Research Article