ISCA - International Speech Communication Association

ISCApad #205

Wednesday, July 08, 2015 by Chris Wellekens

6-27 (2015-04-29) PhD Position, IRISA lab, University of Rennes 1 at Lannion, Côtes d’Armor, France

PhD Position

IRISA lab, University of Rennes 1 at Lannion, Côtes d’Armor

Expression Team

Subject: multimodal detection of abnormality in discourse: using voice and facial expressions

Application: URGENT

Please send a CV and reference letters by e-mail to all the following contacts: Arnaud Delhay, Pierre-François Marteau and Damien Lolive, BEFORE the 4th of May 2015.

The thesis will be co-funded by the DGA (French Ministry of Defence). The candidate must be a national of a European Union member state or of Switzerland. S/he must hold a Master's degree (or equivalent) in computer science.

The candidate is expected to conduct cutting-edge applied research in one or more of the following domains: signal processing, statistical machine learning, and speech and gesture recognition. S/he should have excellent computer programming skills (e.g. C/C++, Python, Perl) and, preferably, knowledge of machine learning, signal processing or human-computer interaction.

Duration: 3 years

Date: October 2015 – September 2018

This PhD, proposed by the EXPRESSION team at IRISA, will address the detection of abnormality in the facial movements and speech signals of a person under stress. By abnormality, we mean the presence of elements foreign to a normal situation in a given context. The study will focus in particular on the joint use of facial and vocal expression parameters to detect abnormal variations of expressivity in speech, related not only to emotion but also to social interactions and psychological signals. Such abnormal signals can appear in situations of extreme stress, for example for pilots or vehicle drivers. The study could also find applications in the medical field, e.g. the detection of abnormal behaviors due to mental disabilities such as autism.

We aim to develop a system capable of detecting abnormal behaviors by analyzing recordings of real situations. The thesis will explore several issues, including the following:

Collection, segmentation and annotation of multimodal data;

Identification of descriptors enabling the description of abnormality;

Development of dedicated machine learning approaches for abnormality detection;

Development of a decision system.

Keywords: Speech, facial expressivity, gesture analysis, heterogeneous information, machine learning, classification


