
Multimodal Signals: Cognitive and Algorithmic Issues

2009, Lecture Notes in Computer Science

https://doi.org/10.1007/978-3-642-00525-1

Abstract

... Gerard Chollet, Anna Esposito, Annie Gentes, Patrick Horain, Walid Karam, Zhenbo Li, Catherine Pelachaud, Patrick Perrot, Dijana Petrovska-Delacrétaz, Dianle Zhou, and Leila Zouari: Speech through the Ear, the Eye, the Mouth and the Hand ...

References (142)

  1. J. Bigun, G. Borgefors and G. Chollet (Eds) Audio- and Video-based Biometric Person Authentication, Proceedings of 1st International Conference AVBPA, Springer Verlag, LNCS, Vol. 1206, ISBN 3-540-62660-3, 1997
  2. S.Z. Li, Z. Sun, T. Tan, G. Chollet, S. Pankanti and D. Zhang (Eds) Advances in Biometric Person Authentication, Proceedings of International Workshop on Biometric Recognition Systems, IWBRS2005, Beijing, China, Springer Verlag, LNCS 3781, ISBN 3-540-29431-7, 2005
  3. C. Pelachaud, J-C. Martin, E. André, G. Chollet, K. Karpouzis and D. Pele Intelligent Virtual Agents, Proceedings of the 7th International Working Conference, IVA-2007, Paris, France, Springer Verlag, LNAI 4722, ISBN 3-540-74996-9, 2007
  4. G. Chollet Automatic Speech Recognition: Overview, research directions, and perspectives, in Fundamentals of Speech Synthesis and Speech Recognition, E. Keller (ed), John Wiley and Sons, 1994
  5. G. Chollet, J. Cernocky, G. Gravier, J. Hennebert, D. Petrovska-Delacretaz, and F. Yvon Toward Fully Automatic Speech Processing Techniques for Interactive Voice Servers, in Speech Processing, Recognition and Artificial Neural Networks, G. Chollet, M-G. Di Benedetto, A. Esposito, M. Marinaro (eds), Springer Verlag, 1999
  6. D. Genoud and G. Chollet Voice Transformations: Some tools for the imposture of Speaker Verification systems, in Advances in Phonetics, A. Braun (ed.), Franz Steiner Verlag, Stuttgart, 1999
  7. G. Chollet and D. Petrovska-Delacretaz Searching through a speech memory for efficient coding, recognition and synthesis, in Festschrift for Prof. Dr J-P. Koester, A. Braun and H. Masthoff (eds.), Franz Steiner Verlag, Stuttgart, pp 453-464, 2002
  8. G. Chollet, K. McTait and D. Petrovska Data Driven Approaches to Speech and Language Processing, in Advances in Non-Linear Speech Processing and Applications, G. Chollet et al. (eds.), LNCS, Vol. 3445, 164-198, Springer Verlag, 2005
  9. W. Karam, C. Mokbel, H. Greige, G. Aversano, G. Chollet and C. Pelachaud An Audio-Visual Imposture Scenario by Talking Face Animation, in Advances in Non-Linear Speech Processing and Applications, G. Chollet et al. (eds.), LNCS, Vol. 3445, 365-369, Springer Verlag, 2005
  10. M. Tomokiyo, G. Chollet and S. Hollard Studies of Emotional Expressions In Oral Dialogues: An Extension of UNL, in Universal Network Language: Advances in Theory and Applications, J. Cardenosa et al. (eds.), Research in Computing Science, Vol. 12, pp. 286-299, 2005
  11. B. Abboud, H. Bredin, G. Aversano and G. Chollet Audio-Visual Identity Verification: an Introductory Overview, in Progress in Non-Linear Speech Processing, Y. Stylianou (ed), LNCS-4391, ISSN 0302-9743, Springer Verlag, pp. 118-134, 2007
  12. P. Perrot, G. Aversano and G. Chollet Voice Disguise and Automatic Detection, Review and Perspectives, in Progress in Non-Linear Speech Processing, Y. Stylianou (ed), LNCS-4391, ISSN 0302-9743, Springer Verlag, 2007
  13. D. Petrovska-Delacretaz, A. El-Hannani and G. Chollet Automatic Speaker Verification, state of the art and current Issues, in Progress in Non-Linear Speech Processing, Y. Stylianou (ed), LNCS-4391, ISSN 0302-9743, Springer Verlag, 2007
  14. G. Chollet, R. Landais, H. Bredin, T. Hueber, C. Mokbel, P. Perrot and L. Zouari Some Experiments in Audio-Visual Speech Processing, in Non-Linear Speech Processing, M. Chetouani (ed), Springer Verlag, 32 pages, 2007
  15. Apolloni B., Aversano G., Esposito A.: Pre-processing and Classification of Emotional Features in Speech Sentences. In Proc. of Inter. Workshop on Speech and Computer, 49-52, Y. Kosarev (Ed), the Russian Foundation for Basic Research (2000)
  16. Butterworth B. L., Beattie G. W.: Gestures and Silence as Indicator of Planning in Speech. In Campbell R. N., Smith P. T. (Eds.), Recent Advances in the Psychology of Language, Plenum Press, New York, 347-360 (1978)
  17. Bryll R., Quek F., Esposito A.: Automatic Hand Hold Detection in Natural Conversation. In Proc. of IEEE Workshop on Cues in Communication, Hawaii, December 9 (2001)
  18. Chen L., Liu Y., Harper M.P., Shriberg E.: Multimodal Model Integration for Sentence Unit Detection. Proc of ICMI 04, October 13-15 2004, State College, Pennsylvania, USA (2004)
  19. Chollet G., Abboud B., Bredin H., Aversano G.: Audio-Visual Identity Verification: An Introductory Overview. In Y. Stylianou et al. (Eds): Progress in Nonlinear Speech Processing, LNCS, 4392, Springer-Verlag (2007)
  20. Chollet G., Perrot P. , Aversano G.: Voice Disguise and Automatic Detection, Review and Perspectives. In Y. Stylianou et al. (Eds): Progress in Nonlinear Speech Processing, LNCS, 4392, Springer-Verlag (2007)
  21. Esposito A., Marinaro M.: What Pauses Can Tell Us About Speech and Gesture Partnership. In A. Esposito, M. Bratanic, E. Keller, M. Marinaro (Eds.), Fundamentals of Verbal and Nonverbal Communication and the Biometrical Issue, NATO Publishing Series, IOS press, 18, 45-57 (2007)
  22. Esposito A.: The Amount of Information on Emotional States Conveyed by the Verbal and Nonverbal Channels: Some Perceptual Data. In Y. Stylianou et al. (Eds): Progress in Nonlinear Speech Processing, LNCS, 4392, 245-264, Springer-Verlag (2007)
  23. Esposito A., Bourbakis N. G.: The Role of Timing in Speech Perception and Speech Production Processes and its Effects on Language Impaired Individuals. In Proceedings of 6th International IEEE Symposium BioInformatics and BioEngineering, 348-356, IEEE Computer Society (2006)
  24. Esposito A.: Children's Organization of Discourse Structure through Pausing Means. In M. Faundez et al. (eds), Nonlinear Analyses and Algorithms for Speech Processing, LNCS 3817, 108-115, Springer Verlag (2006)
  25. Ezzat T., Geiger G., Poggio T.: Trainable Videorealistic Speech Animation. In Proc. of SIGGRAPH, San Antonio, Texas, 388-397 (2002)
  26. Fu S., Gutierrez-Osuna R., Esposito A., Kakumanu P., Garcia O.N.: Audio/Visual Mapping with Cross-Modal Hidden Markov Models. IEEE Transactions on Multimedia, 7(2), 243-252 (2005)
  27. Gutierrez-Osuna R., Kakumanu P., Esposito A., Garcia O.N., Bojorquez A., Castello J., Rudomin I.: Speech-driven Facial Animation with Realistic Dynamics. IEEE Transactions on Multimedia, 7(1), 33-42 (2005)
  28. Hartmann B., Mancini M., Pelachaud C.: Implementing Expressive Gesture Synthesis for Embodied Conversational Agents. Gesture Workshop, LNAI, Springer, May (2005)
  29. Heylen D., Ghijsen M., Nijholt A., op den Akker R.: Facial Signs of Affect During Tutoring Sessions. In J. Tao, T. Tan, R.W. Picard (eds.), Lecture Notes in Computer Science 3784, Springer-Verlag, Berlin, 24-31 (2005)
  30. Horain P., Marques Soares J., Rai P. K., Bideau A.: Virtually Enhancing the Perception of User Actions. In Proceedings of the 15th International Conference on Artificial Reality and Telexistence (ICAT05) December 5-8, New Zealand, 245-246 (2005)
  31. Jovanovic N., op den Akker R., Nijholt A.: A Corpus for Studying Addressing Behavior in Multi-Party Dialogues. L. Dybkjaer and W. Minker (Eds), Proceedings 6th SIGdial Workshop on Discourse and Dialogue, Lisbon, Portugal, 2-3 September, 107-116 (2005)
  32. Kähler K., Haber J., Seidel H.: Geometry-based Muscle Modelling for Facial Animation. In Proc of Inter. Conf. on Graphics Interface, 27-36 (2001)
  33. Kakumanu P., Esposito A., Gutierrez-Osuna R., Garcia O. N.: Comparing Different Acoustic Data-Encoding for Speech Driven Facial Animation. Speech Communication, 48(6), 598-615 (2006)
  34. Kendon A: Gesture: Visible Action as Utterance, Cambridge Press (2004)
  35. Kipp M.: From Human Gesture to Synthetic Action. Proc. Workshop on Multimodal Communication and Context in Embodied Agents, Montreal, 9-14 (2001)
  36. Mancini M., Bresin R., Pelachaud C.: From Acoustic Cues to Expressive ECAs. Gesture Workshop, LNAI, Springer, May (2005)
  37. Martin J.-C., Pelachaud C., Abrilian S., Devillers L., Lamolle M., Mancini M.: Levels of Representation in the Annotation of Emotion for the Specification of Expressivity in ECAs. IVA05 International Working Conference on Intelligent Virtual Agents, Kos, Greece, September (2005)
  38. McNeill D.: Hand and Mind: What Gestures Reveal about Thought. Chicago: University of Chicago Press (1992)
  39. McNeill, D. Gesture and thought. Chicago: University of Chicago Press (2005)
  40. Munhall K.G., Jones J.A., Callan D.E., Kuratate T., Vatikiotis-Bateson E.: Visual Prosody and Speech Intelligibility. Psychological Science, 15(2), 133-137 (2004)
  41. Ochs M., Niewiadomski R., Pelachaud C., Sadek D.: Intelligent Expressions of Emotions, 1st International Conference on Affective Computing and Intelligent Interaction ACII, China, October (2005)
  42. Pelachaud C., Peters C., Mancini M., Bevacqua E., Poggi I.: A Model of Attention and Interest Using Gaze Behavior. IVA05 International Working Conference on Intelligent Virtual Agents, September, Greece (2005)
  43. Quek F., McNeill D., Bryll R., Kirbas C., Arslan H., McCullough K.E., Furuyama N., Ansari R.: Gesture, Speech, and Gaze for Discourse Segmentation. Proc. IEEE International Symposium on Computer Vision and Pattern Recognition, 2, 247-254 (2000)
  44. Stephen Karungaru, Minoru Fukumi and Norio Akamatsu, Automatic Human Faces Morphing using Genetic Algorithms based Control Points Selection, International Journal of Innovative Computing, pp. 1-6, April 2007
  45. Li Y. and Wen Y.: A study on face morphing algorithms, pp. 3. [venue and date unknown]
  46. Zanella V. and Fuentes O. : An approach to automatic morphing of face images in frontal view, Lecture notes in computer science, pp. 2-6. Springer, Berlin 2004
  47. Wolberg G.: Recent advances in image morphing, Department of Computer Science, New York, pp. 1-8.
  48. M. Abe, S. Nakamura, K. Shikano, H. Kuwabara: Voice conversion through vector quantization, Proc. ICASSP 88, New York, 1988
  49. O. Cappé, Y. Stylianou, E. Moulines: Statistical methods for voice quality transformation, Proc. of EUROSPEECH 95, Madrid, 1995
  50. Chollet G., Cernocky J., Constantinescu A., Deligne S., Bimbot F.: Toward ALISP: a proposal for Automatic Language Independent Speech Processing, Computational Models of Speech Processing, NATO ASI Series, 1997
  51. Kain A., Macon M. W.: Spectral voice conversion for text to speech synthesis, Proc. ICASSP 98, New York, 1998
  52. Perrot P., Aversano G., Blouet R., Charbit M., Chollet G.: Voice forgery using ALISP, Proc. ICASSP 2005, Philadelphia
  53. Sündermann D., Höge H., Bonafonte A., Ney H., Black A. and Narayanan S.: Text-independent voice conversion based on unit selection, in Proc. ICASSP 2006, Toulouse
  54. Stylianou Y., Cappé O.: A system for voice conversion based on probabilistic classification and a harmonic plus noise model, Proc. ICASSP 1998, Seattle.
  55. Valbret H., Moulines E., Tubach J.P. : Voice transformation using TDPSOLA technique, Proc. ICASSP 1992, San Francisco.
  56. Terzopoulos, D., Waters, K.: Analysis and synthesis of facial image sequences using physical and anatomical models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(6), 569-579 (1993)
  57. Cootes, T., Edwards, G., Taylor, C.: Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 681-685 (2001)
  58. Xiao, J., Baker, S., Matthews, I., Kanade, T.: Real-time combined 2D+3D active appearance models. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 535-542 (2004)
  59. Ahlberg, J.: Candide-3 - an updated parameterized face. Technical Report LiTH ISY R 2326, Linköping University, Sweden (2001)
  60. Dornaika, F., Ahlberg, J.: Fast and Reliable Active Appearance Model Search for 3D Face Tracking. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34:1838-1853 (2004)
  61. Ahlberg, J.: Real-Time Facial Feature Tracking Using an Active Model With Fast Image Warping. In International Workshop on Very Low Bitrates Video, pp.39-43 (2001)
  62. Matthews, I., Baker, S.: Active appearance models revisited. International Journal of Computer Vision, 60(2):135-164 (2004)
  63. Gross, R., Matthews, I., Baker, S.: Generic vs. person specific active appearance models, Image and Vision Computing, 23(11), 1080-1093 (2005)
  64. Wiskott, L., Fellous, J.M., Krüger, N., von der Malsburg, C.: Face Recognition by Elastic Bunch Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(7), 775-779 (1997)
  65. Littlewort, G., Bartlett, M.S., Fasel, I., Susskind, J., Movellan, J.: Dynamics of Facial Expression Extracted Automatically from Video. In IEEE Conference on Computer Vision and Pattern Recognition, Workshop on Face Processing in Video (2004)
  66. K. Pullen, C. Bregler. Motion capture assisted animation: Texturing and synthesis. In Proceedings of SIGGRAPH 2002, pp. 501-508, 2002.
  67. Alla Safonova, Jessica K. Hodgins, Nancy S. Pollard: Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces. ACM Trans. Graph. 23(3), pp. 514-521, 2004.
  68. Raquel Urtasun. Motion models for robust 3D human body tracking. PhD thesis, EPFL, 2006.
  69. Yee Whye Teh, Sam T. Roweis, Automatic alignment of local representations. Advances in Neural Information Processing Systems (NIPS), vol. 15, Vancouver, Canada, pp. 841-848, 2002
  70. Neil D. Lawrence, Gaussian process latent variable models for visualisation of high dimensional data. Advances in Neural Information Processing Systems (NIPS), vol. 16, Vancouver, Canada, pp. 329-336, 2003.
  71. Neil D. Lawrence, Andrew J. Moore: Hierarchical Gaussian process latent variable models. ICML pp. 481-488, 2007.
  72. Jack M. Wang, David J. Fleet, Aaron Hertzmann, Gaussian process dynamical models. Advances in Neural Information Processing Systems (NIPS), vol. 18, Vancouver, Canada, pp. 1441-1448, 2005.
  73. Raquel Urtasun, David J. Fleet, Aaron Hertzmann, Pascal Fua, Priors for people tracking from small training sets, in: Proceedings of the International Conference On Computer Vision (ICCV'05), vol. 1, Beijing, China, October, pp. 403-410, 2005.
  74. Raquel Urtasun, David J. Fleet, Pascal Fua, 3D people tracking with Gaussian process dynamical models, in: Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR'06), vol. 1, New York, NY, June, pp. 238-245, 2006.
  75. Kooksang Moon, Vladimir I. Pavlović, Impact of dynamics on subspace embedding and tracking of sequences, in: Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR'06), vol. 1, New York, NY, June, pp. 198-205, 2006.
  F01. Yang, M.H., Kriegman, D., Ahuja, N.: Detecting Faces in Images: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(1), 34-58 (2002)
  76. Viola, P., Jones, M.: Rapid Object Detection Using a Boosted Cascade of Simple Features. In: IEEE Computer Vision and Pattern Recognition, Volume 1, p. 511 (2001)
  F03. Face Detection using OpenCV. http://opencvlibrary.sourceforge.net/FaceDetection
  F04. Terzopoulos, D., Waters, K.: Analysis and synthesis of facial image sequences using physical and anatomical models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(6), 569-579 (1993)
  77. Wiskott, L., Fellous, J.M., Krüger, N., von der Malsburg, C.: Face Recognition by Elastic Bunch Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(7), 775-779 (1997)
  78. Cootes, T., Edwards, G., Taylor, C.: Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 681-685 (2001)
  F07. Matthews, I., Baker, S.: Active appearance models revisited. International Journal of Computer Vision, 60(2):135-164 (2004)
  79. Xiao, J., Baker, S., Matthews, I., Kanade, T.: Real-time combined 2D+3D active appearance models. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 535-542 (2004)
  80. Ahlberg, J.: Candide-3 - an updated parameterized face. Technical Report LiTH ISY R 2326, Linköping University, Sweden (2001)
  82. Dornaika, F., Ahlberg, J.: Fast and Reliable Active Appearance Model Search for 3D Face Tracking. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34:1838-1853 (2004)
  83. Ahlberg, J.: Real-Time Facial Feature Tracking Using an Active Model With Fast Image Warping. In International Workshop on Very Low Bitrates Video, pp. 39-43 (2001)
  84. Moeslund T., Hilton A., Krüger V.: A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 104(2-3), 90-126 (2006)
  MC02. Poppe, R.: Vision-based human motion analysis: an overview. Computer Vision and Image Understanding, 108(1-2), pp. 4-18 (2007)
  MC03. Sminchisescu, C.: 3D Human Motion Analysis in Monocular Video. Techniques and Challenges. In: IEEE International Conference on Video and Signal Based Surveillance, p. 76 (2006)
  85. Lu S., Huang G., Samaras D., Metaxas D.: Model-based integration of visual cues for hand tracking. In: IEEE Workshop on Motion and Video Computing, Orlando, Florida, pp. 118-124 (2002)
  MC05. Sminchisescu, C., Triggs, B.: Estimating articulated human motion with covariance scaled sampling. International Journal of Robotic Research, 22(6), 371-392 (2003)
  86. Agarwal A., Triggs B.: Recovering 3D human pose from monocular images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(1), pp. 44-58 (2006)
  MC07. Horain, P., Bomb, M.: 3D Model Based Gesture Acquisition Using a Single Camera. In: IEEE Workshop on Applications of Computer Vision, pp. 158-162 (2002)
  MC08. Marques Soares J., Horain P., Bideau A., Nguyen M.H.: Acquisition 3D du geste par vision monoscopique en temps réel et téléprésence [3D gesture acquisition by real-time monoscopic vision and telepresence]. In: Acquisition du geste humain par vision artificielle et applications, pp. 23-27 (2004)
  G01. Pullen, K., Bregler, C.: Motion capture assisted animation: Texturing and synthesis. In: SIGGRAPH 2002, pp. 501-508 (2002)
  87. Safonova, A., Hodgins, J.K., Pollard, N.S.: Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces. ACM Trans. Graph. 23(3), pp. 514-521 (2004)
  88. Urtasun, R.: Motion models for robust 3D human body tracking. PhD thesis, EPFL (2006)
  89. Wang, J.M.: Gaussian process dynamical models for human motion. Master thesis, University of Toronto (2005)
  G05. Teh, Y.W., Roweis, S.T.: Automatic alignment of local representations. Advances in Neural Information Processing Systems (NIPS), vol. 15, Vancouver, Canada, pp. 841-848 (2002)
  90. Lawrence, N.D.: Gaussian process latent variable models for visualisation of high dimensional data. Advances in Neural Information Processing Systems (NIPS), vol. 16, Vancouver, Canada, pp. 329-336 (2003)
  91. Urtasun, R., Fleet, D.J., Hertzmann, A., Fua, P.: Priors for people tracking from small training sets. In: International Conference On Computer Vision, vol. 1, Beijing, China, pp. 403-410 (2005)
  G10. Urtasun, R., Fleet, D.J., Fua, P.: 3D people tracking with Gaussian process dynamical models. In: Conference on Computer Vision and Pattern Recognition, vol. 1, New York, pp. 238-245 (2006)
  G11. Moon, K., Pavlovic, V.I.: Impact of dynamics on subspace embedding and tracking of sequences. In: Conference on Computer Vision and Pattern Recognition, vol. 1, New York, pp. 198-205 (2006)
  D. Genoud, G. Chollet, "Voice transformations: some tools for the imposture of speaker verification systems," Advances in Phonetics, A. Braun (ed.), Franz Steiner Verlag, Stuttgart, 1999
  92. A. Kain, M. W. Macon, "Spectral voice conversion for text to speech synthesis," Proc. ICASSP 98, New York, 1998.
  93. B. Yegnanarayana, K. Sharat Reddy, and S. P. Kishore, "Source and system features for speaker recognition using AANN models," Proceedings of ICASSP, 2001.
  94. A. Kain and M. Macon, "Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction," Proceedings of ICASSP, 2001.
  M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," in Proc. of the ICASSP, 1988.
  95. L. M. Arslan, "Speaker Transformation Algorithm using Segmental Codebooks (STASC)," Speech Comm., 28: 1999.
  96. A. Kain, "High Resolution Voice Transformation," Ph.D. dissertation, OGI, Portland, USA, 2001.
  97. H. Ye and S. Young, "Perceptually weighted linear transformation for voice conversion," Eurospeech 2003.
  98. E. Turajlic, D. Rentzos, S. Vaseghi, and C.-H. Ho, "Evaluation of methods for parametric formant transformation in voice conversion," in Proc. of the ICASSP, 2003.
  D. Sündermann, H. Ney, and H. Höge, "VTLN-Based Cross-Language Voice Conversion," in Proc. of the ASRU'03, 2003.
  99. A. Mouchtaris, J. Spiegel, and P. Mueller, "Non-parallel training for voice conversion by maximum likelihood constrained adaptation," in Proc. of the ICASSP'04, 2004.
  E. Bocchieri, V. Digalakis, A. Corduneanu, C. Boulis, "Correlation Modeling of MLLR Transform Biases for Rapid HMM Adaptation to New Speakers," Proceedings of ICASSP, 1999.
  100. O. Cappé, Y. Stylianou, E. Moulines, "Statistical methods for voice quality transformation," Proc. of EUROSPEECH 95, Madrid, 1995
  101. Y. Stylianou and O. Cappé, "A system for voice conversion based on probabilistic classification and a harmonic plus noise model," In Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pages I-281-I-284, Seattle, May 1998.
  D. Sündermann, H. Höge, A. Bonafonte, H. Ney, A. Black and S. Narayanan, "Text-independent voice conversion based on unit selection," in Proc. ICASSP 2006, Toulouse.
  P. Perrot, G. Aversano, R. Blouet, M. Charbit, G. Chollet, "Voice forgery using ALISP: Indexation in a client memory," in Proc. of the ICASSP'05, 2005.
  102. H. Ye and S. J. Young, "Voice conversion for unknown speakers," in Proc. of the ICSLP'04, 2004.
  103. G. Chollet, J. Cernocky, A. Constantinescu, S. Deligne, F. Bimbot, "Toward ALISP: a proposal for Automatic Language Independent Speech Processing," Computational Models of Speech Processing, NATO ASI Series, 1997.
  104. G. Chollet, R. Landais, T. Hueber, H. Bredin, C. Mokbel, P. Perrot, L. Zouari, "Some experiments in Audio-Visual Speech Processing," Advances in Non-linear Speech Processing, Springer, 2007.
  105. Asmaa El Hannani, Dijana Petrovska-Delacrétaz, Benoît Fauve, Aurélien Mayoue, John Mason, Jean-François Bonastre, Gérard Chollet, "Text Independent Speaker Verification," in Biometric Evaluation Systems and Evaluation Framework, chap. 7.
  106. Stephen Karungaru, Minoru Fukumi and Norio Akamatsu, "Automatic Human Faces Morphing using Genetic Algorithms based Control Points Selection," International Journal of Innovative Computing, pp. 1-6.
  107. Yu-Li and Yi-Wen, "A study on face morphing algorithms", pp. 3.
  108. Vittorio Zanella and Olac Fuentes, "An approach to automatic morphing of face images in frontal view", pp. 2-6.
  109. George Wolberg, "Recent advances in image morphing", department of computer science, New York, pp. 1-8.
  110. In Kyu Park, Hui Zhang, Vladimir Vezhnevets, "Image Based 3D Face Modelling System," EURASIP Journal on Applied Signal Processing 2005:13, 2072-2090.
  Kopp, S., Wachsmuth, I.: Synthesizing Multimodal Utterances for Conversational Agents. The Journal Computer Animation and Virtual Worlds, 15(1), 39-52 (2004)
  111. David Traum, "Talking to Virtual Humans: Dialogue Models and Methodologies for Embodied Conversational Agents," In I. Wachsmuth and G. Knoblich (Eds.), Modeling Communication with Robots and Virtual Humans, pp. 296-309, 2008.
  112. Gebhard, P., Klesen, M.: Using Real Objects to Communicate with Virtual Characters. In: The 5th International Working Conference on Intelligent Virtual Agents, Kos, Greece, September 12-14, pp. 99-110 (2005)
  113. Thórisson, K.R., List, T., Pennock, C., DiPirro, J. (2005). Whiteboards: Scheduling Blackboards for Semantic Routing of Messages and Streams. In K.R. Thórisson, H. Vilhjalmsson, S. Marsella (eds.), AAAI-05 Workshop on Modular Construction of Human-Like Intelligence, Pittsburgh, Pennsylvania, July 10, 8-15. Menlo Park, CA: American Association for Artificial Intelligence.
  114. Cheyer, A., Martin, D.: The Open Agent Architecture. Journal of Autonomous Agents and Multi-Agent Systems, 4(1), 143-148, March 2001.
  Niewiadomski, R., Pelachaud, C.: Model of Facial Expressions Management for an Embodied Conversational Agent. In: 2nd International Conference on Affective Computing and Intelligent Interaction ACII, Lisbon, September 2007.
  El-Nasr, M.S., Yen, J., Loerger, T.: FLAME - Fuzzy Logic Adaptive Model of Emotions. International Journal of Autonomous Agents and Multi-Agent Systems, 3(3), 1-39 (2003)
  115. Arjan Egges, Sumedha Kshirsagar, Nadia Magnenat-Thalmann: Generic personality and emotion simulation for conversational agents. Journal of Visualization and Computer Animation 15(1): 1-13 (2004)
  116. Pandzic, I.S., Forcheimer, R. (Eds): MPEG-4 Facial Animation - The Standard, Implementations and Applications. John Wiley & Sons (2002)
  Cassell, J., Bickmore, T., Billinghurst, M., Campbell, L., Chang, K., Vilhjálmsson, H., Yan, H.: Embodiment in Conversational Interfaces: REA. In: CHI'99, Pittsburgh, PA, pp. 520-527 (1999)
  117. Cassell, J., Vilhjálmsson, H., Bickmore, T.: BEAT: the Behavior Expression Animation Toolkit. In: Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH (2001)
  118. Kopp, S., Jung, B., Lessmann, N., Wachsmuth, I. (2003). Max - A Multimodal Assistant in Virtual Reality Construction. KI Künstliche Intelligenz 4/03: 11-17.
  119. Kopp, S., Krenn, B., Marsella, S., Marshall, A.N., Pelachaud, C., Pirker, H., Thórisson, K.R., Vilhjálmsson, H.H.: Towards a Common Framework for Multimodal Generation: The Behavior Markup Language. In: Intelligent Virtual Agents IVA, Marina del Rey, USA, pp. 205-217 (2006)
  Vilhjálmsson, H., Cantelmo, N., Cassell, J., Chafai, N.E., Kipp, M., Kopp, S., Mancini, M., Marsella, S., Marshall, A.N., Pelachaud, C., Ruttkay, Z., Thórisson, K.R., van Welbergen, H., van der Werf, R.: The Behavior Markup Language: Recent Developments and Challenges. In: 7th International Conference on Intelligent Virtual Agents, IVA'07, Paris, September 2007
  Thiebaux, M., Marshall, A., Marsella, S., Kallmann, M.: SmartBody: Behavior Realization for Embodied Conversational Agents. In: Seventh International Joint Conference on Autonomous Agents and Multi-Agent Systems, AAMAS'08, Portugal, May 2008
  Duy Bui, T.: Creating Emotions And Facial Expressions For Embodied Agents. PhD thesis, University of Twente, Department of Computer Science, Enschede (2004)
  Tsapatsoulis, N., Raouzaiou, A., Kollias, S., Cowie, R., Douglas-Cowie, E.: Emotion Recognition and Synthesis based on MPEG-4 FAPs. In: I.S. Pandzic, R. Forcheimer (Eds), MPEG-4 Facial Animation - The Standard, Implementations and Applications, John Wiley & Sons (2002)
  Albrecht, I., Schroeder, M., Haber, J., Seidel, H.-P.: Mixed feelings - Expression of non-basic emotions in a muscle-based talking head. Virtual Reality (Special Issue "Language, Speech and Gesture for VR"), August 2005
  Becker, C., Wachsmuth, I.: Modeling Primary and Secondary Emotions for a Believable Communication Agent. In: International Workshop on Emotion and Computing, in conj. with the 29th Annual German Conference on Artificial Intelligence (KI2006), Bremen, Germany, pp. 31-34 (2006)
  120. Thiebaux, M., Marshall, A., Marsella, S., Kallmann, M.: SmartBody: Behavior Realization for Embodied Conversational Agents. In: Seventh International Joint Conference on Autonomous Agents and Multi-Agent Systems, AAMAS'08, Portugal, May 2008
  121. Abboud B., Bredin H., Aversano G., Chollet G.: Audio-Visual Identity Verification: An Introductory Overview. In Y. Stylianou et al. (Eds): Progress in Nonlinear Speech Processing, LNCS, 4392, Springer (2007)
  122. Esposito A., Marinaro M.: What Pauses Can Tell Us About Speech and Gesture Partnership. In A. Esposito, et al. (Eds.), Fundamentals of Verbal and Nonverbal Communication and the Biometrical Issue, NATO Publishing Series, IOS press, 18, 45-57 (2007)
  123. Esposito A.: The Amount of Information on Emotional States Conveyed by the Verbal and Nonverbal Channels: Some Perceptual Data. In Y. Stylianou et al. (Eds): Progress in Nonlinear Speech Processing, LNCS, 4392, 245-264, Springer (2007)
  124. Esposito A., Bourbakis N. G.: The Role of Timing in Speech Perception and Speech Production Processes and its Effects on Language Impaired Individuals. In Proceedings of 6th International IEEE Symposium BioInformatics and BioEngineering, 348-356, IEEE Computer Society (2006)
  125. Esposito A.: Children's Organization of Discourse Structure through Pausing Means. In M. Faundez et al. (eds), Nonlinear Analyses and Algorithms for Speech Processing, LNCS 3817, 108-115, Springer (2006)
  126. Gutierrez-Osuna R., Kakumanu P., Esposito A., Garcia O.N., Bojorquez A., Castello J., Rudomin I.: Speech-Driven Facial Animation with Realistic Dynamics. IEEE Trans. on Multimedia, 7(1), 33-42 (2005)
  127. Hartmann B., Mancini M., Pelachaud C.: Implementing Expressive Gesture Synthesis for Embodied Conversational Agents. Gesture Workshop, LNAI 3881, 188-199, Springer (2005)
  128. Heylen D., Ghijsen M., Nijholt A., op den Akker R.: Facial Signs of Affect During Tutoring Sessions. In J. Tao et al. (Eds.), LNCS 3784, 24-31, Springer (2005)
  129. Horain P., Marques Soares J., Rai P. K., Bideau A.: Virtually Enhancing the Perception of User Actions. In Proceedings of the 15th ICAT05, December 5-8, New Zealand, 245-246 (2005)
  130. Kakumanu P., Esposito A., Gutierrez-Osuna R., Garcia O. N.: Comparing Different Acoustic Data-Encoding for Speech Driven Facial Animation. Speech Communication, 48(6), 598-615 (2006)
  131. Kendon A: Gesture: Visible Action as Utterance. Cambridge Press (2004)
  132. Kopp S., Wachsmuth I.: Synthesizing Multimodal Utterances for Conversational Agents. The Journal Computer Animation and Virtual Worlds, 15(1), 39-52 (2004)
  133. Kopp, S., Jung, B., Lessmann, N., Wachsmuth, I.: Max - A Multimodal Assistant in Virtual Reality Construction. KI Künstliche Intelligenz 4(3), 11-17 (2003)
  134. McNeill, D.: Gesture and Thought. Chicago: University of Chicago Press (2005)
  Niewiadomski R., Pelachaud C.: Model of Facial Expressions Management for an Embodied Conversational Agent. Proceedings of 2nd International Conference ACII07, Lisbon (2007)
  135. Ochs M., Niewiadomski R., Pelachaud C., Sadek D.: Intelligent Expressions of Emotions. Proceedings of 1st International Conference ACII05, China, October (2005)
  136. Pelachaud C., Martin J-C., André E., Chollet G., Karpouzis K., Pele D. (Eds): Intelligent Virtual Agents. LNAI 4722, Springer (2007)
  137. Perrot P., Aversano G., Chollet G.: Voice Disguise and Automatic Detection, Review and Perspectives. In Y. Stylianou et al. (Eds): Progress in Nonlinear Speech Processing, LNCS 4392, Springer (2007)
  138. Petrovska-Delacretaz D., El-Hannani A., Chollet G.: Automatic Speaker Verification: State of the Art and Current Issues. In Y. Stylianou et al. (Eds): Progress in Nonlinear Speech Processing, LNCS 4392, Springer (2007)
  139. Poppe, R.: Vision-Based Human Motion Analysis: An Overview. Computer Vision and Image Understanding, 108(1-2), 4-18 (2007)
  140. Traum D.: Talking to Virtual Humans: Dialogue Models and Methodologies for Embodied Conversational Agents. In I. Wachsmuth and G. Knoblich (Eds.): Modelling Communication with Robots and Virtual Humans, 296-309 (2008)
  141. Thiebaux M., Marshall A., Marsella S., Kallmann M.: SmartBody: Behavior Realization for Embodied Conversational Agents. Proceedings of the 7th International Conference AAMAS08, Portugal (2008)
Acknowledgment

The authors thank Jean, David, Aude, Aurélie Barbier and Christophe Guilmart. The paper has been partially supported by COST 2102: "Cross Modal Analysis of Verbal and Nonverbal Communication" (www.cost2102.eu).