
     Research Journal of Applied Sciences, Engineering and Technology


Real Time Talking System for Virtual Human based on ProPhone

Itimad Raheem Ali, Ghazali Sulong and Hoshang Kolivand
MaGIC-X (Media and Games Innovation Centre of Excellence), UTM-IRDA Digital Media Centre, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia
Research Journal of Applied Sciences, Engineering and Technology, 2016, 13(8): 611-616
http://dx.doi.org/10.19026/rjaset.13.3046  |  © The Author(s) 2016
Received: May 25, 2015  |  Accepted: July 26, 2015  |  Published: October 15, 2016

Abstract

Lip-syncing is the process of matching speech with the lip movements of a virtual character. Creating a virtual talking character is a challenging task because the system must control all articulatory movements and keep them synchronized with the speech signal. This study presents a virtual talking character system aimed at speeding up and simplifying the visual talking process compared with previous techniques based on the blend-shapes approach. The system constructs the lip-syncing from a set of visemes for a reduced phoneme set produced by a new method named Prophone, which relies on the probability of each phoneme appearing in English sentences. The contribution of this study is a real-time automatic talking system for English based on the concatenation of visemes, together with an evaluation of the results against the phoneme-to-viseme table produced by Prophone.
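
The paper publishes no code, but the core idea the abstract describes, reducing the English phoneme set by how probable each phoneme is and mapping the surviving phonemes to concatenated visemes, can be sketched in a few lines. The Python sketch below is purely illustrative: the probability values, the threshold, the phoneme-to-viseme table, and the function names (reduce_phoneme_set, phonemes_to_visemes) are assumptions for illustration, not the authors' actual Prophone data.

    # Illustrative sketch of probability-driven phoneme reduction and
    # phoneme-to-viseme concatenation. All tables and values below are
    # assumed for demonstration, not taken from the Prophone paper.

    # Assumed relative frequencies of a few English phonemes.
    PHONEME_PROBABILITY = {
        "ah": 0.09, "iy": 0.07, "n": 0.07, "t": 0.07,
        "s": 0.05, "m": 0.04, "p": 0.03, "b": 0.02,
        "f": 0.02, "v": 0.02, "uw": 0.02, "zh": 0.001,
    }

    # Assumed phoneme-to-viseme table for the reduced set.
    PHONEME_TO_VISEME = {
        "p": "BMP", "b": "BMP", "m": "BMP",   # bilabials share one lip shape
        "f": "FV",  "v": "FV",                # labiodentals
        "t": "TD",  "n": "TD",  "s": "TD",    # tongue-tip consonants
        "iy": "EE", "ah": "AH", "uw": "OO",   # vowels
    }

    def reduce_phoneme_set(probabilities, min_probability=0.01):
        """Keep only phonemes frequent enough to justify their own viseme;
        rare phonemes fall back to a neutral mouth shape."""
        return {ph for ph, p in probabilities.items() if p >= min_probability}

    def phonemes_to_visemes(phonemes, reduced_set):
        """Map a phoneme sequence to a concatenated viseme sequence,
        merging consecutive duplicates so the face holds one shape
        instead of re-targeting it."""
        visemes = []
        for ph in phonemes:
            vis = PHONEME_TO_VISEME.get(ph, "REST") if ph in reduced_set else "REST"
            if not visemes or visemes[-1] != vis:
                visemes.append(vis)
        return visemes

    if __name__ == "__main__":
        reduced = reduce_phoneme_set(PHONEME_PROBABILITY)
        # Phonemes for the word "meat": m iy t
        print(phonemes_to_visemes(["m", "iy", "t"], reduced))  # ['BMP', 'EE', 'TD']

Merging consecutive duplicate visemes reflects a common practice in viseme concatenation: the character holds a single lip shape rather than re-targeting the same pose, which helps keep real-time playback smooth.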

Keywords:

Phoneme, prophone, real-time talking, virtual character, visemes



Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Copyright

© The Author(s) 2016

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459