Arabic Audio News Retrieval System Using Dependent Speaker Mode, Mel Frequency Cepstral Coefficient and Dynamic Time Warping Techniques

Hasan Muaidi; Ayat Al-Ahmad; Thaer Khdoor; Shihadeh Alqrainy; Mahmud Alkoffash

doi:10.19026/rjaset.7.903

Research Journal of Applied Sciences, Engineering and Technology

Research Article | OPEN ACCESS

¹Hasan Muaidi, ²Ayat Al-Ahmad, ¹Thaer Khdoor, ¹Shihadeh Alqrainy and ¹Mahmud Alkoffash

¹Prince Abdullah Bin Ghazi Faculty of Information Technology, Al-Balqa

Research Journal of Applied Sciences, Engineering and Technology 2014 24:5082-5097

http://dx.doi.org/10.19026/rjaset.7.903 | © The Author(s) 2014

Received: May 31, 2013 | Accepted: April 09, 2014 | Published: June 25, 2014

Abstract

Audio data has increasingly become one of the prevalent sources of information, especially with the exponential growth in the use of the Internet, digital library systems and digital mobile devices. The massive amount of audio data now available motivates the development of custom audio retrieval tools to facilitate audio retrieval tasks. The most familiar audio retrieval systems are based on searching by keyword, title or author. This study examines the feasibility of using Mel Frequency Cepstral Coefficients (MFCCs) to extract features and Dynamic Time Warping (DTW) to compare test patterns for Arabic audio news. The study proposes and implements an architecture for a content-based audio retrieval system dedicated to Arabic audio news. The proposed architecture (ARANEWS) uses automatic speech recognition in isolated Arabic keyword speech mode, following a template-based automatic speech recognition approach with MFCCs and DTW. ARANEWS is a retrieval system based on modeling signal waves and measuring the similarity between features extracted from spoken queries and from spoken keywords. One of the major components of ARANEWS is its feature database (ARANEWSDB), which stores the features (MFCCs) extracted from the spoken keywords that are prepared to retrieve Arabic audio news. ARANEWS supports Query by Humming (QBH) and Query by Example (QBE) instead of query by text.
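
The template-matching idea in the abstract, comparing a spoken query's feature sequence against stored keyword templates with DTW, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dtw_distance` and `retrieve`, the dictionary standing in for ARANEWSDB, and the two-dimensional toy frames (used in place of real MFCC vectors) are all assumptions made for illustration, and the local cost is assumed to be Euclidean distance between frames.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of equal-dimension feature frames."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two frames (an assumption).
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def retrieve(query, feature_db):
    """Rank stored keyword templates by DTW cost to the query, closest first."""
    return sorted((dtw_distance(query, tpl), name) for name, tpl in feature_db.items())


# Hypothetical feature database: keyword name -> sequence of toy feature frames.
feature_db = {
    "keyword_a": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    "keyword_b": [(5.0, 5.0), (6.0, 6.0), (7.0, 7.0)],
}

# A time-stretched utterance of keyword_a: the first frame is repeated.
query = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

ranking = retrieve(query, feature_db)
best_cost, best_name = ranking[0]
print(best_name, best_cost)
```

Because DTW allows one frame to align with several frames of the other sequence, the stretched query still matches `keyword_a` at zero cost, which is the property that makes it suitable for comparing utterances spoken at different speeds. In a real system the frames would be MFCC vectors computed from framed, windowed audio.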

Keywords:

Arabic information retrieval, audio news retrieval system, dynamic time warping, frequency cepstral coefficient,

Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Copyright

© The Author(s) 2014

ISSN (Online): 2040-7467
ISSN (Print): 2040-7459
