     Research Journal of Applied Sciences, Engineering and Technology


Fuzzy Discretization based Classification of Medical Data

M. Shanmugapriya¹, H. Khanna Nehemiah¹, R.S. Bhuvaneswaran¹, Kannan Arputharaj² and J. Dhalia Sweetlin¹
¹Ramanujan Computing Centre
²Department of Information Science and Technology, Anna University, Chennai-600025, India
Research Journal of Applied Sciences, Engineering and Technology  2017  8:291-298
http://dx.doi.org/10.19026/rjaset.14.4953  |  © The Author(s) 2017
Received: December 22, 2016  |  Accepted: April 11, 2017  |  Published: August 15, 2017

Abstract

Discretization is a commonly used data preprocessing technique for improving the efficiency of knowledge extraction from clinical data. Clinical data generally contain numeric attributes with continuous values. Data discretization simplifies the original data by transforming continuous attribute values into a finite set of intervals. Although discretization can handle continuous attributes in clinical data, it is not always an appropriate technique: attribute values may be vague or imprecise and may have multiple distributions across different classes, which makes mining clinical data challenging. Hence, fuzzy discretization is needed to pre-process clinical data before mining. The aim of this study is to derive fuzzy discretization from crisp-interval discretization using a geometric approach for constructing fuzzy sets, in which the overlapping region between fuzzy sets is represented as a geometric area. The study comprises three steps. First, non-overlapping fuzzy sets are constructed using the intervals generated by crisp-interval discretization. Second, the area of overlap between the fuzzy sets is computed using the geometric approach and an average area of overlap is estimated. Third, the fuzzy sets are redesigned based on the estimated average area of overlap. Fuzzy discretizations with three, five and seven intervals have been examined using the Pima Indian Diabetes (PID) dataset and the Bupa Liver Disorder (BLD) dataset taken from the University of California Irvine machine learning repository. The variation in performance of the crisp and fuzzy discretization methods is measured using six classification approaches, namely a tree-based approach, a probabilistic induction-based approach, a rule-based approach, a network learning approach, a kernel-based approach and a distance-based approach, together with a rule-based fuzzy inference system. The results show that the classification accuracy remains stable, with less deviation, across different classifiers and varying numbers of intervals.
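The three steps can be illustrated with a small Python sketch. The abstract does not give the formulas, so this is only one plausible reading and not the authors' implementation. It assumes trapezoidal fuzzy sets, an initial overlapping design in which each set slopes across 20% of its own interval width, and hypothetical cut points; the names `Trapezoid`, `build_sets`, `overlap_area` and `average_overlap` are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Trapezoid:
    """Trapezoidal fuzzy set: feet at a and d, full membership on [b, c]."""
    a: float
    b: float
    c: float
    d: float

    def membership(self, x: float) -> float:
        if self.b <= x <= self.c:
            return 1.0
        if self.a < x < self.b:
            return (x - self.a) / (self.b - self.a)
        if self.c < x < self.d:
            return (self.d - x) / (self.d - self.c)
        return 0.0


def build_sets(cuts: Sequence[float], half_widths: Sequence[float]) -> List[Trapezoid]:
    """One set per interval; set i slopes +/- half_widths[i] around its interior cuts.

    The two outermost edges stay vertical so the sets never leave the domain.
    Each half-width should not exceed half of its interval width (so b <= c).
    """
    sets = []
    last = len(cuts) - 2
    for i, (lo, hi) in enumerate(zip(cuts, cuts[1:])):
        w = half_widths[i]
        a, b = (lo, lo) if i == 0 else (lo - w, lo + w)
        c, d = (hi, hi) if i == last else (hi - w, hi + w)
        sets.append(Trapezoid(a, b, c, d))
    return sets


def overlap_area(left: Trapezoid, right: Trapezoid) -> float:
    """Area of the triangle lying under both the falling edge of `left`
    and the rising edge of `right` (assumes left.c <= right.b)."""
    base = left.d - right.a
    if base <= 0:
        return 0.0  # the sets only touch, or are disjoint
    # Height at which the two sloping edges cross.
    height = base / ((right.b - right.a) + (left.d - left.c))
    return 0.5 * base * height


def average_overlap(sets: Sequence[Trapezoid]) -> float:
    areas = [overlap_area(l, r) for l, r in zip(sets, sets[1:])]
    return sum(areas) / len(areas)


# Hypothetical cut points from a crisp three-interval discretization.
cuts = [0.0, 10.0, 30.0, 70.0]
k = len(cuts) - 1

# Step 1: non-overlapping sets taken directly from the crisp intervals.
crisp = build_sets(cuts, [0.0] * k)

# Step 2: assumed initial overlapping design, then the average overlap area.
initial = build_sets(cuts, [0.2 * (hi - lo) for lo, hi in zip(cuts, cuts[1:])])
avg = average_overlap(initial)

# Step 3: redesign so every adjacent pair overlaps by the average area.
# With symmetric slopes of half-width w the edges cross at height 0.5,
# so the overlap triangle has area 0.5 * (2w) * 0.5 = 0.5 * w.
final = build_sets(cuts, [2.0 * avg] * k)

print(avg, [overlap_area(l, r) for l, r in zip(final, final[1:])])
```

In this sketch a value that falls exactly on an interior cut point belongs to both neighbouring redesigned sets with membership 0.5, whereas crisp discretization would assign it to a single interval.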

Keywords:

Classification, fuzzy discretization, fuzzy set, interval discretization, membership function, overlapping area


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved