Performance analysis of Naive Bayes in text classification of Islamophobia issues

Faiz Mohammad Ridho; Aji Prasetya Wibawa; Fachrul Kurniawan; Badrudin Badrudin; Anusua Ghosh

doi:10.31763/sitech.v3i1.1211



⁽¹⁾* Faiz Mohammad Ridho
(Universitas Negeri Malang, Malang, Indonesia)

⁽²⁾ Aji Prasetya Wibawa
(Universitas Negeri Malang, Malang, Indonesia)

⁽³⁾ Fachrul Kurniawan
(State University of Maulana Malik Ibrahim Malang, Malang, Indonesia)

⁽⁴⁾ Badrudin Badrudin
(State University of Maulana Malik Ibrahim Malang, Malang, Indonesia)

⁽⁵⁾ Anusua Ghosh
(University of South Australia, Australia)

* corresponding author

Abstract

In the aftermath of the 2013 Woolwich attack, a disturbing surge in hate crimes against the Muslim community emerged both offline and on social media platforms, prompting concerns about the widespread issue of Islamophobia. To systematically evaluate and quantify the presence of Islamophobic sentiment in online spaces, this study employed sentiment analysis, a robust method for deriving insights from textual data. Two classification models, Bernoulli Naive Bayes and Multinomial Naive Bayes, were selected to conduct a thorough analysis. Bernoulli Naive Bayes, specialized in handling binary data, was used for binary sentiment analysis, while Multinomial Naive Bayes, well-suited for data with multiple occurrences, was applied for more comprehensive analysis. The research encompassed nine meticulously designed test-train data scenarios, ranging from a 10:90 test-train data ratio to a 20:80 ratio. Surprisingly, both models exhibited a maximum accuracy rate of 68% in their respective optimal scenarios, raising intriguing questions about the potential and limitations of sentiment analysis and Naive Bayes models in the complex task of identifying and quantifying Islamophobic content on social media.
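The contrast the abstract draws between the two models (word presence for Bernoulli, word counts for Multinomial) and the idea of sweeping over test-train ratios can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the function names `train_nb`, `predict_nb`, and `accuracy_for_split`, the add-one smoothing, the unshuffled split, the toy documents and labels, and the nine test fractions from 0.1 to 0.9. It is not the authors' pipeline or dataset and does not reproduce the reported accuracy.

```python
import math
from collections import Counter


def train_nb(docs, labels, binary):
    """Fit Naive Bayes with add-one smoothing (hypothetical helper).

    binary=True  -> Bernoulli variant (word present or absent per document)
    binary=False -> Multinomial variant (word occurrence counts)
    """
    tokenized = [d.split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    model = {"vocab": vocab, "binary": binary, "prior": {}, "cond": {}}
    for c in sorted(set(labels)):
        class_docs = [t for t, y in zip(tokenized, labels) if y == c]
        model["prior"][c] = math.log(len(class_docs) / len(docs))
        if binary:
            # P(word present | class), estimated from document frequency
            df = Counter(w for toks in class_docs for w in set(toks))
            n = len(class_docs)
            model["cond"][c] = {w: (df[w] + 1) / (n + 2) for w in vocab}
        else:
            # P(word | class), estimated from total token counts
            tf = Counter(w for toks in class_docs for w in toks)
            total = sum(tf.values())
            model["cond"][c] = {w: (tf[w] + 1) / (total + len(vocab)) for w in vocab}
    return model


def predict_nb(model, doc):
    """Return the class with the highest log-posterior score."""
    counts = Counter(doc.split())
    scores = {}
    for c, prior in model["prior"].items():
        score = prior
        for w in model["vocab"]:
            p = model["cond"][c][w]
            if model["binary"]:
                # Bernoulli also scores vocabulary words that are absent
                score += math.log(p) if counts[w] > 0 else math.log(1 - p)
            else:
                # Multinomial weights each word by how often it occurs
                score += counts[w] * math.log(p)
        scores[c] = score
    return max(scores, key=scores.get)


def accuracy_for_split(docs, labels, test_fraction, binary):
    """Hold out the first test_fraction of the data (no shuffling) and score it."""
    n_test = max(1, round(len(docs) * test_fraction))
    model = train_nb(docs[n_test:], labels[n_test:], binary)
    hits = sum(predict_nb(model, d) == y
               for d, y in zip(docs[:n_test], labels[:n_test]))
    return hits / n_test


# Made-up toy corpus, purely illustrative; not data from the study.
TOY_DOCS = [
    "insult threat insult", "welcome community welcome",
    "threat insult", "community welcome",
    "insult insult rude", "friendly welcome",
    "rude threat", "community friendly friendly",
    "rude insult threat", "welcome friendly community",
]
TOY_LABELS = ["negative", "positive"] * 5

for i in range(1, 10):
    frac = i / 10
    acc_b = accuracy_for_split(TOY_DOCS, TOY_LABELS, frac, binary=True)
    acc_m = accuracy_for_split(TOY_DOCS, TOY_LABELS, frac, binary=False)
    print(f"test {i * 10}% / train {100 - i * 10}%: "
          f"Bernoulli={acc_b:.2f} Multinomial={acc_m:.2f}")
```

In practice one would more likely reach for a library implementation and a shuffled, stratified split; the hand-written version is used here only to keep the difference between the two likelihood models visible.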

Keywords

Naive Bayes; Text classification; Islamophobia





This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
Science in Information Technology Letters
ISSN 2722-4139
Published by Association for Scientific Computing Electrical and Engineering (ASCEE)
W : http://pubs2.ascee.org/index.php/sitech
E : sitech@ascee.org, andri@ascee.org, andri.pranolo.id@ieee.org
