     Research Journal of Applied Sciences, Engineering and Technology


Distance Based Hybrid Approach for Cluster Analysis Using Variants of K-means and Evolutionary Algorithm

O.A. Mohamed Jafar and R. Sivakumar
Department of Computer Science, A.V.V.M. Sri Pushpam College (Autonomous), Poondi, Thanjavur, Tamil Nadu, India
Research Journal of Applied Sciences, Engineering and Technology  2014  11:1355-1362
http://dx.doi.org/10.19026/rjaset.8.1107  |  © The Author(s) 2014
Received: June 14, 2014  |  Accepted: July 09, 2014  |  Published: September 20, 2014

Abstract

Clustering is the process of grouping similar objects into a specified number of clusters. K-means and K-medoids are the most popular partitional clustering techniques for large data sets. However, they are sensitive to the random selection of initial centroids and tend to fall into locally optimal solutions. The K-means++ algorithm has a better convergence rate than the other algorithms. A distance metric is used to measure the dissimilarity between objects, and the Euclidean distance is the metric most commonly used by researchers. In recent years, evolutionary algorithms have been applied as global optimization techniques for solving clustering problems. In this study, we present a hybrid clustering algorithm that combines K-means++ with Particle Swarm Optimization (K++_PSO) and is based on different distance metrics, such as City Block and Chebyshev. The algorithms are tested on four popular benchmark data sets from the UCI Machine Learning Repository and on an artificial data set. The clustering results are evaluated through the fitness function values, and the proposed algorithm is compared with other algorithms. We find that the K++_PSO algorithm with the Chebyshev distance metric produces better clustering results than the other approaches.
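
As a rough illustration of the ingredients named in the abstract, the sketch below implements the Euclidean, City Block, and Chebyshev distances, a K-means++-style seeding step, and a simple fitness value. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the fitness (sum of each point's distance to its nearest centre) is assumed because the abstract does not define it, plugging a non-Euclidean metric into the seeding step is likewise an assumption, and the PSO refinement stage is omitted.

```python
import random


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def city_block(a, b):
    """City Block (Manhattan) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Chebyshev distance: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def kmeanspp_seed(points, k, dist, rng):
    """K-means++-style seeding with a pluggable distance (assumption).

    The first centre is drawn uniformly; each later centre is drawn with
    probability proportional to the squared distance to its nearest
    already-chosen centre. Requires at least k distinct points.
    """
    centres = [rng.choice(points)]
    while len(centres) < k:
        weights = [min(dist(p, c) for c in centres) ** 2 for p in points]
        centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


def fitness(points, centres, dist):
    """Assumed fitness: total distance of points to their nearest centre."""
    return sum(min(dist(p, c) for c in centres) for p in points)


if __name__ == "__main__":
    # Toy data (made up for illustration): two well-separated groups.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    rng = random.Random(0)
    for name, metric in [("euclidean", euclidean),
                         ("city_block", city_block),
                         ("chebyshev", chebyshev)]:
        seeds = kmeanspp_seed(data, 2, metric, rng)
        print(name, seeds, fitness(data, seeds, metric))
```

In a hybrid of the kind the abstract describes, seeds such as these would presumably initialise the swarm before PSO refines the centroids. Passing the metric as a parameter keeps the seeding and fitness code unchanged when the distance is switched.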

Keywords:

Cluster analysis, distance metrics, evolutionary algorithms, K-means, K-means++, K-medoids, particle swarm optimization


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved