Performance analysis of Naive Bayes in text classification of Islamophobia issues

(1) * Faiz Mohammad Ridho (Universitas Negeri Malang, Malang, Indonesia)
(2) Aji Prasetya Wibawa (Universitas Negeri Malang, Malang, Indonesia)
(3) Fachrul Kurniawan (State University of Maulana Malik Ibrahim Malang, Malang, Indonesia)
(4) Badrudin Badrudin (State University of Maulana Malik Ibrahim Malang, Malang, Indonesia)
(5) Anusua Ghosh (University of South Australia, Australia)
*corresponding author

Abstract


In the aftermath of the 2013 Woolwich attack, hate crimes against the Muslim community surged both offline and on social media platforms, prompting concern about the spread of Islamophobia. To evaluate and quantify Islamophobic sentiment in online spaces, this study applied sentiment analysis, a method for deriving insights from textual data. Two classification models were compared: Bernoulli Naive Bayes and Multinomial Naive Bayes. Bernoulli Naive Bayes, which models binary features (whether a term is present or absent), was used for binary sentiment analysis, while Multinomial Naive Bayes, which suits data in which terms can occur multiple times, was applied for a more comprehensive analysis. The research covered nine test-train data scenarios, ranging from a 10:90 test-train data ratio to a 20:80 ratio. Both models reached a maximum accuracy of 68% in their respective best scenarios, which raises questions about the potential and limitations of sentiment analysis and Naive Bayes models in the complex task of identifying and quantifying Islamophobic content on social media.
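
The difference between the two models named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: the toy corpus, the function names (`train`, `predict`, `accuracy`), the add-one smoothing, and the tail-based split are all assumptions made for illustration. It shows only that the Bernoulli variant scores word presence or absence, while the Multinomial variant scores word frequencies.

```python
import math
from collections import Counter

# Toy corpus invented for illustration; it is NOT the study's dataset.
DOCS = [
    ("hate hate them all", "negative"),
    ("peace and respect for all", "positive"),
    ("they should leave leave now", "negative"),
    ("respect respect our neighbours", "positive"),
    ("attack them hate", "negative"),
    ("support peace now", "positive"),
]


def train(docs, binary):
    """Collect per-class statistics for a Naive Bayes model.

    binary=True  -> Bernoulli: a word counts once per document.
    binary=False -> Multinomial: a word counts by its frequency.
    """
    vocab = {w for text, _ in docs for w in text.split()}
    n_docs = Counter(label for _, label in docs)
    counts = {label: Counter() for label in n_docs}
    for text, label in docs:
        words = text.split()
        counts[label].update(set(words) if binary else words)
    return {"vocab": vocab, "n_docs": n_docs, "counts": counts, "binary": binary}


def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    vocab, counts = model["vocab"], model["counts"]
    tokens = Counter(w for w in text.split() if w in vocab)
    total_docs = sum(model["n_docs"].values())
    best_label, best_score = None, -math.inf
    for label in sorted(counts):
        n_c = model["n_docs"][label]
        score = math.log(n_c / total_docs)  # class prior
        if model["binary"]:
            # Bernoulli: every vocabulary word contributes, present or absent.
            for w in vocab:
                p = (counts[label][w] + 1) / (n_c + 2)
                score += math.log(p if w in tokens else 1 - p)
        else:
            # Multinomial: only observed words contribute, weighted by count.
            denom = sum(counts[label].values()) + len(vocab)
            for w, k in tokens.items():
                score += k * math.log((counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(docs, test_fraction, binary):
    """Hold out the last `test_fraction` of docs and report accuracy."""
    n_test = max(1, int(len(docs) * test_fraction))
    train_docs, test_docs = docs[:-n_test], docs[-n_test:]
    model = train(train_docs, binary)
    hits = sum(predict(model, text) == label for text, label in test_docs)
    return hits / len(test_docs)
```

Calling `accuracy(DOCS, fraction, binary)` for several values of `fraction` gives a rough analogue of comparing test-train scenarios, although a corpus this small says nothing about real performance.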

Keywords


Naive Bayes; Text classification; Islamophobia

   

DOI

https://doi.org/10.31763/sitech.v3i1.1211
      



Copyright (c) 2022 Faiz Mohammad Ridho, Aji Prasetya Wibawa, Fachrul Kurniawan, Badrudin Badrudin, Anusua Ghosh

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
Science in Information Technology Letters
ISSN 2722-4139
Published by Association for Scientific Computing Electrical and Engineering (ASCEE)
W : http://pubs2.ascee.org/index.php/sitech
E : sitech@ascee.org, andri@ascee.org, andri.pranolo.id@ieee.org
