Abstract
Fake news in digital media undermines information credibility, and the problem is especially hard in linguistically diverse regions such as the Arab world. This study addresses fake news detection in Arabic media using natural language processing (NLP) techniques and the AFND dataset, which contains over 600,000 news articles labeled "credible," "not credible," or "undecided." The system is built on AraBERT, a transformer-based model pre-trained specifically on Arabic text. A key contribution is a preprocessing pipeline tailored to the complexities of Arabic, combining stopword removal, stemming, normalization, and diacritic handling. The proposed model achieved an accuracy of 92.3% and a macro-averaged F1-score of 72%, outperforming traditional machine learning methods and remaining competitive with state-of-the-art solutions in the field. The findings show the value of deep learning models that capture contextual relationships in text, which traditional approaches miss. Training was limited by computational constraints, so the results suggest significant room for improvement with better hardware and additional fine-tuning. The work contributes a scalable and reliable framework for Arabic fake news detection that responds to the growing need for accurate information in digital media. Future work aims to improve performance through multimodal analysis, domain-specific pretraining, and real-time system deployment.
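The preprocessing steps named in the abstract can be illustrated with a minimal sketch. This is not the thesis's pipeline: the letter mappings, the four-word stopword list, and the function names `normalize` and `preprocess` are assumptions chosen for illustration, and stemming is left out because it needs a dedicated Arabic stemmer.

```python
import re

# Arabic diacritics (tashkeel), code points U+064B through U+0652.
DIACRITICS = re.compile(r"[\u064B-\u0652]")
# Tatweel (kashida), the elongation character used for justification.
TATWEEL = "\u0640"

# Assumed normalization map: unify common letter variants.
LETTER_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda -> bare alef
    "\u0649": "\u064A",  # alef maqsura -> yeh
})


def normalize(text: str) -> str:
    """Strip diacritics and tatweel, then unify letter variants."""
    text = DIACRITICS.sub("", text)
    text = text.replace(TATWEEL, "")
    return text.translate(LETTER_MAP)


# Tiny illustrative stopword list (an assumption, not the thesis's list).
# Stopwords are normalized too, so they match normalized tokens.
STOPWORDS = {normalize(w) for w in ("في", "من", "على", "إلى")}


def preprocess(text: str) -> list[str]:
    """Normalize, split on whitespace, and drop stopwords."""
    tokens = normalize(text).split()
    return [t for t in tokens if t not in STOPWORDS]
```

Normalizing the stopword list with the same function keeps the lookup consistent: a stopword such as "إلى" changes form under normalization, so comparing raw stopwords against normalized tokens would silently miss it.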
Library of Congress Subject Headings
Natural language processing (Computer science); Fake news--Arab countries; Machine learning; Data sets; Support vector machines
Publication Date
2025
Document Type
Thesis
Student Type
Graduate
Degree Name
Professional Studies (MS)
Department, Program, or Center
Graduate Programs & Research
Advisor
Sanjay Modak
Advisor/Committee Member
Ehsan Warriach
Recommended Citation
Bahri, Hussain Ali, "Fake News Detection in Arabic Media Using Machine Learning and the AFND Dataset" (2025). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/12180
Campus
RIT Dubai
Plan Codes
PROFST-MS