     Research Journal of Applied Sciences, Engineering and Technology


Arabic Audio News Retrieval System Using Dependent Speaker Mode, Mel Frequency Cepstral Coefficient and Dynamic Time Warping Techniques

1Hasan Muaidi, 2Ayat Al-Ahmad, 1Thaer Khdoor, 1Shihadeh Alqrainy and 1Mahmud Alkoffash
1Prince Abdullah Bin Ghazi Faculty of Information Technology, Al-Balqa
Research Journal of Applied Sciences, Engineering and Technology  2014  24:5082-5097
http://dx.doi.org/10.19026/rjaset.7.903  |  © The Author(s) 2014
Received: May 31, 2013  |  Accepted: April 09, 2014  |  Published: June 25, 2014

Abstract

Audio data has recently become one of the prevalent sources of information, especially after the exponential growth in the use of the Internet, digital library systems and digital mobile devices. The massive amount of audio data now available motivates the development of custom audio retrieval tools to facilitate audio retrieval tasks. The most familiar audio retrieval systems are based on searching by keyword, title or author. This study examines the feasibility of using Mel Frequency Cepstral Coefficients (MFCCs) to extract features and Dynamic Time Warping (DTW) to compare the test patterns for Arabic audio news. The study proposes and implements an architecture for a content-based audio retrieval system dedicated to Arabic audio news. The proposed architecture (ARANEWS) utilizes automatic speech recognition in isolated Arabic keyword speech mode, a template-based automatic speech recognition approach, MFCCs and DTW. ARANEWS presents a style of retrieval system that is based on modeling signal waves and measuring the similarity between features extracted from spoken queries and spoken keywords. One of the major components of the ARANEWS system is the feature database (ARANEWSDB). ARANEWSDB stores the features (MFCCs) extracted from the spoken keywords that are prepared to retrieve Arabic audio news. ARANEWS supports Query by Humming (QBH) and Query by Example (QBE) instead of query by text.
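
The template-matching idea described in the abstract can be illustrated with a minimal sketch. It is not the ARANEWS implementation: the function names `dtw_distance` and `best_match`, the keyword labels and the two-dimensional toy frames are all assumptions made for illustration, and real MFCC frames would typically have a dozen or more coefficients per frame. The sketch uses the classic quadratic-time DTW recurrence with Euclidean distance between frames.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    a, b: lists of equal-dimension numeric tuples (stand-ins for MFCC frames).
    Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def best_match(query, templates):
    """Return the label of the stored template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical feature store: label -> sequence of toy 2-D frames.
templates = {
    "keyword_a": [(0, 0), (1, 1), (2, 2)],
    "keyword_b": [(5, 5), (4, 4), (3, 3)],
}

# A time-stretched utterance of "keyword_a": same frames, spoken more slowly.
query = [(0, 0), (0, 0), (1, 1), (2, 2), (2, 2)]

print(best_match(query, templates))
```

Because DTW lets one frame align with several frames of the other sequence, the slower query still matches its template at zero cost, which is the property that makes the technique suitable for comparing spoken keywords of different durations.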

Keywords:

Arabic information retrieval, audio news retrieval system, dynamic time warping, frequency cepstral coefficient,


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved