
An Effective algorithm for Spam Filtering and Cluster Formation

Kavitha Guda

Affiliations
Associate Professor, Department of Computer Science and Engineering.
DOI: 10.22362/ijcert/2016/v3/i12/4321


Abstract
The K-means clustering algorithm is one of the most widely used partitioning algorithms for grouping elements in spatiotemporal data. It is fast and simple, and it can work with large datasets. It also has pitfalls: it needs many iterations because cluster details are not known at the initial stage, and it can detect only spherical clusters. Here we propose a Hybrid K-means clustering algorithm that works mainly by splitting the dataset and reducing the number of iterations. It inherits some features from two revised K-means algorithms. Splitting a large dataset makes it easier to handle, and reducing the number of iterations makes cluster formation easier; in this way the efficiency of the traditional K-means clustering algorithm is increased. Furthermore, we also propose the Naïve Bayes algorithm for email spam filtering on the SPAMBASE dataset.
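
For orientation, the baseline that the abstract sets out to improve can be sketched as follows. This is a minimal illustration of the traditional K-means loop (assignment step, then centroid update, repeated until the assignments stop changing), not the proposed Hybrid K-means algorithm, whose details the abstract does not give. The function name `kmeans` and the parameters `max_iter` and `seed` are assumptions made for this sketch. The returned iteration count is the quantity the abstract says the hybrid approach aims to reduce.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Traditional K-means sketch; returns (centroids, labels, iterations)."""
    rng = random.Random(seed)
    # Assumption for this sketch: initial centroids are k random data points.
    centroids = rng.sample(points, k)
    labels = [-1] * len(points)
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            # Assignments are stable, so the algorithm has converged.
            return centroids, labels, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels, max_iter


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centroids, labels, iterations = kmeans(data, k=2)
    print("labels:", labels)
    print("iterations:", iterations)
```

Because the initial centroids are chosen without any knowledge of the cluster structure, the number of passes depends on the starting points, which is the pitfall the abstract describes.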


Citation
Kavitha Guda, “An Effective algorithm for Spam Filtering and Cluster Formation”, International Journal Of Computer Engineering In Research Trends, 3(12):659-666, December-2016.


Keywords : Data Mining, KDD, E-Mail, Spam, Naïve Bayes Algorithm, Spam Filter, K-Means Algorithm, Hybrid K-means Algorithm, SPAMBASE dataset.



DOI Link : 10.22362/ijcert/2016/v3/i12/4321

