Study of SMS Spam Detection Using Machine Learning Based Algorithms

Authors

  • Ravi H Gedam, Research Scholar, Department of Computer Science and Engineering, Amity School of Engineering and Technology, Amity University Chhattisgarh, Raipur, India

DOI:

https://doi.org/10.47392/IRJAEM.2025.0054

Keywords:

SMS Spam Detection, Machine Learning, Classification Models, Text Processing, Data Analysis

Abstract

SMS spam detection is an important text classification task, as unsolicited messages continue to pose security risks and inconvenience users. This study examines the effectiveness of machine learning algorithms, particularly the Naive Bayes classifier, in identifying and filtering spam messages. The primary objective is to classify SMS messages as spam or ham (legitimate) by analysing the occurrence of words and patterns within the text. The proposed approach includes a pre-processing stage of tokenization, stop-word removal, and stemming, followed by feature extraction using techniques such as Term Frequency-Inverse Document Frequency (TF-IDF). The Naive Bayes algorithm is then trained on a labelled dataset to learn the probability distributions of words in spam and ham messages. Its performance is compared with that of other machine learning models, such as Support Vector Machines (SVM), Decision Trees, and Random Forest, to assess their efficiency in spam detection. The experimental analysis indicates that the Naive Bayes classifier achieves high accuracy at low computational cost, which the study attributes to its probabilistic nature. Precision, recall, F1-score, and overall classification accuracy are evaluated to determine the best-performing algorithm. The results suggest that machine learning-based approaches significantly improve SMS spam detection by reducing false positives and improving message filtering. Future work aims to integrate deep learning techniques and real-time detection mechanisms to further improve accuracy and adaptability in dynamic environments.
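
The pipeline described in the abstract (pre-processing, TF-IDF weighting, Naive Bayes training, and precision/recall/F1 evaluation) can be illustrated with a minimal sketch. It is not the study's implementation: it uses only the Python standard library, and the stop-word list, the suffix-stripping stemmer, the smoothed IDF formula, the toy messages, and every function name are assumptions made for illustration. The comparison with SVM, Decision Trees, and Random Forest is not covered.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption, not the study's list).
STOP_WORDS = {"a", "an", "the", "to", "is", "are", "you", "your",
              "at", "for", "and", "of", "in", "on", "we", "i"}


def preprocess(text):
    """Lowercase, tokenize, drop stop words, strip a few common suffixes."""
    stems = []
    for tok in re.findall(r"[a-z0-9]+", text.lower()):
        if tok in STOP_WORDS:
            continue
        # Crude stand-in for a real stemmer such as Porter's.
        for suffix in ("ing", "ed", "s"):
            if tok.endswith(suffix) and len(tok) > len(suffix) + 2:
                tok = tok[:-len(suffix)]
                break
        stems.append(tok)
    return stems


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    """TF-IDF weights for one tokenized message; unseen terms are ignored."""
    tf = Counter(t for t in doc if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}


class NaiveBayes:
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""

    def fit(self, vectors, labels, vocab_size, alpha=1.0):
        self.classes = sorted(set(labels))
        totals = {c: defaultdict(float) for c in self.classes}
        for vec, label in zip(vectors, labels):
            for term, weight in vec.items():
                totals[label][term] += weight
        self.log_prior = {c: math.log(labels.count(c) / len(labels))
                          for c in self.classes}
        self.log_like, self.log_unseen = {}, {}
        for c in self.classes:
            denom = sum(totals[c].values()) + alpha * vocab_size
            self.log_like[c] = {t: math.log((w + alpha) / denom)
                                for t, w in totals[c].items()}
            self.log_unseen[c] = math.log(alpha / denom)
        return self

    def predict(self, vec):
        def score(c):
            return self.log_prior[c] + sum(
                w * self.log_like[c].get(t, self.log_unseen[c])
                for t, w in vec.items())
        return max(self.classes, key=score)


def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Precision, recall, and F1 for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy messages, not taken from the study's dataset.
train = [("Win a free prize now", "spam"),
         ("Free cash prize, claim now", "spam"),
         ("Claim your free ticket", "spam"),
         ("Are we meeting for lunch today", "ham"),
         ("See you at the office tomorrow", "ham"),
         ("Lunch at noon sounds good", "ham")]
docs = [preprocess(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
model = NaiveBayes().fit([tfidf(d, idf) for d in docs], labels,
                         vocab_size=len(idf))


def classify(text):
    """Run the full pipeline on one raw message."""
    return model.predict(tfidf(preprocess(text), idf))
```

Calling classify on a new message runs the same pre-processing and weighting used in training, and precision_recall_f1 can then be applied to the predicted and true labels of a held-out set. A practical system would more likely rely on an established library and a real stemmer; the hand-rolled version is used here only to keep each step visible.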

Published

2025-02-25