Multimodal human emotion/expression recognition

1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition

https://doi.org/10.1109/AFGR.1998.670976

Abstract

Recognizing human facial expression and emotion by computer is an interesting and challenging problem. Many have investigated emotional content in speech alone, or recognition of human facial expressions solely from images. However, relatively little has been done in combining these two modalities for recognizing human emotions. De Silva et al. [4] studied human subjects' ability to recognize emotions from viewing video clips of facial expressions and listening to the corresponding emotional speech stimuli. They found that humans recognize some emotions better from audio information, and other emotions better from video. They also proposed an algorithm to integrate both kinds of input to mimic the human recognition process. While attempting to implement the algorithm, we encountered difficulties which led us to a different approach. We found these two modalities to be complementary. By using both, we show it is possible to achieve higher recognition rates than with either modality alone.

References (17)

  1. C. C. Chiu, Y. L. Chang, and Y. J. Lai, "The Analysis and Recognition of Human Vocal Emotions," in Proc. International Computer Symposium, NCTU, Hsinchu, Taiwan, R.O.C., December 12-15, 1994.
  2. R. Cowie and E. Douglas-Cowie, "Automatic Statistical Analysis of the Signal and Prosodic Signs of Emotion in Speech," in Proc. International Conf. on Spoken Language Processing, Philadelphia, PA, USA, pp. 1989-1992, October 3-6, 1996.
  3. F. Dellaert, T. Polzin, and A. Waibel, "Recognizing Emotion in Speech," in Proc. International Conf. on Spoken Language Processing, Philadelphia, PA, USA, pp. 1970-1973, October 3-6, 1996.
  4. L. C. De Silva, T. Miyasato, and R. Nakatsu, "Facial Emotion Recognition Using Multimodal Information," in Proc. IEEE Int. Conf. on Information, Communications and Signal Processing ICICS'97, Singapore, pp. 397-401, Sept. 1997.
  5. P. Ekman, ed., Emotion in the Human Face, Cambridge: Cambridge University Press, 1982.
  6. P. Ekman, "Strong Evidence for Universals in Facial Expressions: A Reply to Russell's Mistaken Critique," Psychological Bulletin, vol. 115, no. 2, pp. 268-287, 1994.
  7. I. A. Essa and A. P. Pentland, "Coding, Analysis, Interpretation, and Recognition of Facial Expressions," IEEE Trans. PAMI, vol. 19, no. 7, pp. 757-763, July 1997.
  8. H. Fujisaki, "Prosody, Models, and Spontaneous Speech," in Computing Prosody, Y. Sagisaka, N. Campbell, and N. Higuchi, Eds., Springer-Verlag, New York, 1997.
  9. T. Johnstone, "Emotional Speech Elicited Using Computer Games," in Proc. International Conf. on Spoken Language Processing, Philadelphia, PA, USA, pp. 1985-1988, October 3-6, 1996.
  10. K. Mase, "Recognition of Facial Expression from Optical Flow," IEICE Trans., vol. E74, no. 10, pp. 3474-3483, October 1991.
  11. I. R. Murray and J. L. Arnott, "Toward the Simulation of Emotion in Synthetic Speech: A Review of the Literature of Human Vocal Emotion," Journal of the Acoustical Society of America, vol. 93, no. 2, pp. 1097-1108, February 1993.
  12. T. Otsuka and J. Ohya, "Recognizing Multiple Persons' Facial Expressions Using HMM Based on Automatic Extraction of Significant Frames from Image Sequences," in Proc. Int. Conf. on Image Processing ICIP-97, Santa Barbara, CA, USA, pp. 546-549, Oct. 26-29, 1997.
  13. M. Rosenblum, Y. Yacoob, and L. S. Davis, "Human Expression Recognition from Motion Using a Radial Basis Function Network Architecture," IEEE Trans. Neural Networks, vol. 7, no. 5, pp. 1121-1138, September 1996.
  14. J. Sato and S. Morishima, "Emotion Modeling in Speech Production Using Emotion Space," in Proc. IEEE Int. Workshop on Robot and Human Communication, Tsukuba, Japan, pp. 472-477, Nov. 1996.
  15. K. R. Scherer, "Adding the Affective Dimension: A New Look in Speech Analysis and Synthesis," in Proc. International Conf. on Spoken Language Processing, Philadelphia, PA, USA, page no. not available, October 3-6, 1996.
  16. T. Sakaguchi, "Facial Feature Extraction Based on the Wavelet Transform for Dynamic Expression Recognition," submitted to IEEE Trans. PAMI.
  17. N. Ueki, S. Morishima, H. Yamada, and H. Harashima, "Expression Analysis/Synthesis System Based on Emotion Space Constructed by Multilayered Neural Network," Systems and Computers in Japan, vol. 25, no. 13, pp. 95-103, Nov. 1994.