Improving sentiment analysis on PeduliLindungi comments: a comparative study with CNN-Word2Vec and integrated negation handling

(1) * Herlina Jayadianti (Universitas Pembangunan Nasional Veteran Yogyakarta, Indonesia)
(2) Berliana Andra Arianti (Universitas Pembangunan Nasional Veteran Yogyakarta, Indonesia)
(3) Nur Heri Cahyana (Universitas Pembangunan Nasional Veteran Yogyakarta, Indonesia)
(4) Shoffan Saifullah (Universitas Pembangunan Nasional Veteran Yogyakarta; AGH University of Krakow, Poland)
(5) Rafał Dreżewski (AGH University of Krakow, Poland)
*corresponding author

Abstract


This study investigates sentiment analysis of Google Play reviews of the PeduliLindungi application. It integrates negation handling into text preprocessing and compares two methods: CNN-Word2Vec CBOW and CNN-Word2Vec SkipGram. The experiments use a dataset of 13,567 comments. Adding negation handling improves accuracy for both methods, and CNN-Word2Vec SkipGram performs best, reaching 76.2% accuracy. The findings highlight the importance of negation handling in sentiment analysis and offer practical guidance for refining preprocessing pipelines, particularly for mobile application reviews.
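As an illustration of what negation handling in preprocessing can look like, the sketch below merges a negation word with the token that follows it, so that a phrase such as "tidak bagus" (not good) becomes the single token "tidak_bagus" before word embeddings are trained. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the rule used, and the negation word list, the merging strategy, and the function names `tokenize`, `handle_negation`, and `preprocess` are all hypothetical.

```python
import re

# Assumed list of Indonesian negation cues (formal and informal).
# The study's actual list is not given in the abstract.
NEGATION_WORDS = {"tidak", "tak", "bukan", "belum", "jangan", "kurang", "gak", "nggak", "ga"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def handle_negation(tokens):
    """Merge each negation word with the token that follows it.

    Assumption: a simple one-token negation scope. Other schemes
    (longer scopes, antonym replacement) are equally plausible.
    """
    merged = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in NEGATION_WORDS and i + 1 < len(tokens):
            # Join the negation cue and the next word into one token.
            merged.append(token + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(token)
            i += 1
    return merged


def preprocess(text):
    """Tokenize a review and apply the negation merge."""
    return handle_negation(tokenize(text))
```

For example, `preprocess("Aplikasi ini tidak bagus")` yields `["aplikasi", "ini", "tidak_bagus"]`. Merging keeps the negated phrase as a distinct vocabulary item, so an embedding model can learn a separate vector for it instead of treating "bagus" the same way in positive and negated contexts.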

Keywords


sentiment analysis; word2vec; negation handling; opinion mining; cbow; skipgram; text classification

   

DOI

https://doi.org/10.31763/sitech.v4i2.1184
      




Copyright (c) 2023 Herlina Jayadianti, Berliana Andra Arianti, Nur Heri Cahyana, Shoffan Saifullah

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
Science in Information Technology Letters
ISSN 2722-4139
Published by Association for Scientific Computing Electrical and Engineering (ASCEE)
W : http://pubs2.ascee.org/index.php/sitech
E : sitech@ascee.org, andri@ascee.org, andri.pranolo.id@ieee.org
