     Research Journal of Applied Sciences, Engineering and Technology


The Optical Character Recognition for Cursive Script Using HMM: A Review

1,3Saeeda Naz, 1Arif I. Umar, 1Syed H. Shirazi, 2Muhammad M. Ajmal and 2Salahuddin
1Department of Information Technology, Hazara University, Mansehra, Pakistan
2COMSATS Institute of Information Technology
3GGPGC No.1, Department of Higher Education, KPK, Abbottabad, Pakistan
Research Journal of Applied Sciences, Engineering and Technology  2014  19:2016-2025
http://dx.doi.org/10.19026/rjaset.8.1193  |  © The Author(s) 2014
Received: December 09, 2013  |  Accepted: June 08, 2014  |  Published: November 20, 2014

Abstract

Automatic character recognition has a wide variety of applications, such as automatic postal mail sorting, number plate recognition, automatic form reading and text entry on PDAs. Automatic character recognition of cursive scripts is a complex process that faces issues not found in other scripts. Many solutions have been proposed in the literature to address the complexities of cursive script character recognition. This paper presents a comprehensive literature review of Optical Character Recognition (OCR) based on the Hidden Markov Model (HMM) for off-line and on-line character recognition of the Urdu, Arabic and Persian languages. We survey almost all significant approaches proposed and conclude with future directions for OCR of cursive languages.

Keywords:

Character, hidden Markov model, ligature, optical character recognition


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Copyright

© The Author(s) 2014

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved