Using J48 Tree Partitioning for scalable SVM in Spam Detection


  •  Mohammad-Hossein Nadimi-Shahraki    
  •  Zahra S. Torabi    
  •  Akbar Nabiollahi    

Abstract

The support vector machine (SVM) is a powerful, state-of-the-art machine learning algorithm with strong regularization properties. Regularization governs how well a model generalizes to new data, which makes SVM well suited to spam detection. Although experimental results show that SVM usually outperforms other algorithms, its efficiency decreases as the number of spam features increases. In this paper, a scalable SVM for spam detection is proposed that uses a J48 tree. In the proposed method, the dataset is first partitioned using a J48 tree, and feature selection is then applied to each partition in parallel. Subsequently, the selected features are used in the training phase of the SVM. The proposed method is evaluated on several benchmark datasets, and the results are compared with those of other algorithms such as SVM and GA-SVM. The experimental results show that the proposed method scales as the number of features increases and achieves higher accuracy than SVM and GA-SVM.
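The three stages named in the abstract (tree-based partitioning, per-partition feature selection in parallel, SVM training on the selected features) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a single information-gain split stands in for the full J48 tree, the feature score is a simple difference of class means, the per-partition selections are merged by taking their union, and the SVM is a linear one fitted by stochastic subgradient descent on the regularized hinge loss. The function names, parameters, and toy data are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def entropy(labels):
    """Shannon entropy (bits) of a list of +1/-1 labels."""
    if not labels:
        return 0.0
    p = sum(1 for v in labels if v == 1) / len(labels)
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def info_gain(X, y, j, t):
    """Entropy reduction from testing feature j against threshold t."""
    left = [yi for xi, yi in zip(X, y) if xi[j] <= t]
    right = [yi for xi, yi in zip(X, y) if xi[j] > t]
    n = len(y)
    return entropy(y) - len(left) / n * entropy(left) - len(right) / n * entropy(right)

def partition(X, y):
    """Assumed one-level stand-in for the J48 tree: split rows on the best single test."""
    candidates = [(j, row[j]) for row in X for j in range(len(row))]
    j, t = max(candidates, key=lambda c: info_gain(X, y, c[0], c[1]))
    parts = ([], [])
    for xi, yi in zip(X, y):
        parts[xi[j] > t].append((xi, yi))  # False -> left part, True -> right part
    return [p for p in parts if p]

def select_features(part, k):
    """Top-k features by absolute difference of class means (an assumed filter score)."""
    pos = [x for x, label in part if label == 1]
    neg = [x for x, label in part if label == -1]
    if not pos or not neg:
        return set()  # a pure partition gives no evidence for ranking features
    d = len(part[0][0])
    score = [abs(sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg))
             for j in range(d)]
    ranked = sorted(range(d), key=lambda j: score[j], reverse=True)
    return {j for j in ranked[:k] if score[j] > 0}

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=300, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # shrink step from the L2 penalty
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def predict(w, b, x):
    """Return +1 (spam) or -1 (ham) for a row restricted to the selected features."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def fit_pipeline(X, y, k=2):
    """Partition, select features per partition in parallel, then train one SVM."""
    parts = partition(X, y)
    with ThreadPoolExecutor() as pool:  # one selection task per partition
        chosen = list(pool.map(lambda p: select_features(p, k), parts))
    # Assumption: merge by union; fall back to all features if nothing was selected.
    selected = sorted(set().union(*chosen)) or list(range(len(X[0])))
    X_sel = [[row[j] for j in selected] for row in X]
    w, b = train_linear_svm(X_sel, y)
    return selected, w, b

# Hypothetical toy data: columns 0 and 1 carry signal, columns 2 and 3 are noise.
X = [[10, 0, 1, 0], [0, 10, 0, 1], [5, 5, 1, 1], [10, 10, 0, 0],
     [0, 0, 1, 0], [1, 1, 0, 1], [2, 0, 1, 1], [0, 6, 0, 0]]
y = [1, 1, 1, 1, -1, -1, -1, -1]  # +1 = spam, -1 = ham
selected, w, b = fit_pipeline(X, y)
print(selected)  # [0, 1]
```

On this toy data the split leaves one mixed partition and one pure partition, and only the mixed one contributes features, so the SVM is trained on two of the four columns. Threads are used here only to show the per-partition structure; for CPU-bound selection on real data, a process pool or a distributed framework would be the more likely choice.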



This work is licensed under a Creative Commons Attribution 4.0 License.
  • ISSN(Print): 1913-8989
  • ISSN(Online): 1913-8997
  • Started: 2008
  • Frequency: semiannual
