Application of natural language processing and fuzzy logic to disinformation detection

doi:https://doi.org/10.31861/bmj2024.01.03

Melnyk Halyna ¹ , Melnyk Vasyl ² , Vikovan Valentyn ³

¹ Department of Applied Mathematics and Information Technologies, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, 58000, Ukraine

² Department of Mathematical Modeling, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, 58000, Ukraine

³ Chernivtsi National University named after Yuriy Fedkovych, Chernivtsi, 58002, Ukraine

Keywords: fuzzy logic, TF-IDF, natural language processing, n-grams

Abstract

In the modern information environment, automatic detection of disinformation is a pressing task that requires new approaches to text data analysis. This article presents a model that combines natural language processing (NLP) methods, namely TF-IDF and n-gram analysis, with fuzzy logic to identify disinformation texts more accurately. TF-IDF (term frequency-inverse document frequency) quantifies the importance of terms in the context of a document, and n-gram analysis detects lexical patterns that often accompany disinformation.
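As a minimal sketch of the feature-extraction step described above, the following Python code computes TF-IDF weights over word unigrams and bigrams using only the standard library. The function names, the whitespace tokenizer, the unsmoothed natural-log IDF, and the three toy documents are illustrative assumptions; the abstract does not specify which TF-IDF variant or n-gram range the authors used.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return lowercase word n-grams of length n_min..n_max (hypothetical helper)."""
    tokens = text.lower().split()  # assumption: plain whitespace tokenization
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tf_idf(docs, n_min=1, n_max=2):
    """Return one {n-gram: weight} dict per document.

    tf(t, d) = count of t in d / number of n-grams in d
    idf(t)   = ln(N / df(t)), where df(t) counts the documents containing t
    """
    tokenized = [word_ngrams(d, n_min, n_max) for d in docs]
    n_docs = len(docs)

    # Document frequency: each n-gram is counted once per document.
    df = Counter()
    for grams in tokenized:
        df.update(set(grams))

    weights = []
    for grams in tokenized:
        counts = Counter(grams)
        total = len(grams) or 1  # guard against empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Toy corpus invented for illustration only.
docs = [
    "shocking secret cure doctors hide",
    "health ministry publishes vaccination report",
    "shocking secret report leaked",
]
weights = tf_idf(docs)

# Print the three highest-weighted n-grams of the first document.
for term, w in sorted(weights[0].items(), key=lambda kv: -kv[1])[:3]:
    print(f"{term!r}: {w:.3f}")
```

In this variant an n-gram that occurs in every document receives a weight of zero, while an n-gram confined to a single document receives the largest IDF factor, which is why recurring sensational phrases can stand out as lexical indicators.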

However, classical NLP approaches, including TF-IDF and n-gram models, are limited by a high rate of false positive classifications. To overcome this problem, the article proposes integrating fuzzy logic rules that model uncertainty and gradations of truth. Specifically, fuzzy logic makes it possible to take into account multiple factors, including source reliability, lexical content indicators, and the emotional tone of the text, using a membership function for each factor. The initial estimate of the probability of disinformation is calculated through the composition of membership functions and fuzzy rules of the “If... then...” type, which yields a fuzzy solution that reflects the degree to which the text matches the disinformation criteria.
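The rule-based step described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' model: the piecewise-linear membership functions, the four rules, the use of min for "and", and the weighted-average (zero-order Sugeno-style) aggregation are all hypothetical choices, since the abstract does not give the actual membership shapes, rule base, or inference scheme. All three inputs are assumed to be normalized to [0, 1].

```python
def low(x):
    """Membership in 'low': 1 at x = 0, falling linearly to 0 at x = 0.5."""
    return max(0.0, min(1.0, (0.5 - x) / 0.5))


def medium(x):
    """Membership in 'medium': triangle peaking at x = 0.5, zero at 0 and 1."""
    return max(0.0, 1.0 - abs(x - 0.5) / 0.5)


def high(x):
    """Membership in 'high': 0 up to x = 0.5, rising linearly to 1 at x = 1."""
    return max(0.0, min(1.0, (x - 0.5) / 0.5))


def disinformation_degree(reliability, lexical, emotion):
    """Combine three factors in [0, 1] into a disinformation degree in [0, 1].

    Each rule is (firing strength, consequent level); 'and' is modeled by min.
    The rule base below is hypothetical and chosen only for illustration.
    """
    rules = [
        # IF reliability is low AND lexical score is high THEN degree is high
        (min(low(reliability), high(lexical)), 1.0),
        # IF emotion is high AND lexical score is high THEN degree is high
        (min(high(emotion), high(lexical)), 1.0),
        # IF reliability is high AND lexical score is low THEN degree is low
        (min(high(reliability), low(lexical)), 0.0),
        # IF lexical score is medium THEN degree is medium
        (medium(lexical), 0.5),
    ]
    total = sum(strength for strength, _ in rules)
    if total == 0.0:
        return 0.5  # no rule fires: fall back to a neutral degree (assumption)
    # Weighted average of consequent levels, weighted by firing strength.
    return sum(strength * level for strength, level in rules) / total


suspicious = disinformation_degree(reliability=0.1, lexical=0.9, emotion=0.8)
credible = disinformation_degree(reliability=0.9, lexical=0.1, emotion=0.2)
print(round(suspicious, 4), round(credible, 4))
```

Because the output is a degree rather than a hard label, a downstream threshold can be tuned to trade false positives against missed detections, which is the kind of flexibility the abstract attributes to the fuzzy approach.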

Experimental results show that the proposed fuzzy logic approach reduces the number of false positives and increases overall accuracy compared to baseline models such as the support vector machine (SVM) and hybrid rule-based systems. Comparative analysis shows the advantages of the fuzzy logic model under incomplete or contradictory information, which is typical of disinformation detection tasks. The proposed model opens up new opportunities for developing text analysis tools that respond adaptively to different levels of uncertainty in linguistic content.

Cite

ACS Style: Melnyk, H.; Melnyk, V.; Vikovan, V. Application of natural language processing and fuzzy logic to disinformation detection. Bukovinian Mathematical Journal. 2024, 12. https://doi.org/10.31861/bmj2024.01.03
AMA Style: Melnyk H, Melnyk V, Vikovan V. Application of natural language processing and fuzzy logic to disinformation detection. Bukovinian Mathematical Journal. 2024; 12(1). https://doi.org/10.31861/bmj2024.01.03
Chicago/Turabian Style: Halyna Melnyk, Vasyl Melnyk, Valentyn Vikovan. 2024. "Application of natural language processing and fuzzy logic to disinformation detection". Bukovinian Mathematical Journal. 12 no. 1. https://doi.org/10.31861/bmj2024.01.03
