ISCA - International Speech Communication Association

ISCApad #291

Thursday, September 08, 2022 by Chris Wellekens

6-35 (2022-06-23) Post-Doctoral/PhD position at Telecom-Paris

Post-Doctoral/PhD position at Telecom-Paris on Deep learning approaches for social computing

*Place of work* Telecom Paris, Palaiseau (Paris outskirts)

*Starting date* From September 2022 (but can start later)

*Context*

The PhD student/post-doctoral fellow will take part in the REVITALISE project (viRtual bEhaVioral skIlls TrAining for pubLIc SpEaking), funded by ANR. The research activity will bring together the research topics of Prof. Chloé Clavel [Clavel] of the S2a [SSA] team at Telecom-Paris (social computing [SocComp]), Dr. Mathieu Chollet [Chollet] from the University of Glasgow (multimodal systems for social skills training), and Dr. Beatrice Biancardi [Biancardi] from CESI Engineering School, Nanterre (social behaviour modelling).

*Candidate profile*

As a minimum requirement, the successful candidate should have:

• A master's degree in one or more of the following areas: human-agent interaction, deep learning, computational linguistics, affective computing, reinforcement learning, natural language processing, speech processing

• Excellent programming skills (preferably in Python)

• Excellent command of English

*How to apply*

The application should be formatted as **a single PDF file** and should include:

• A complete and detailed curriculum vitae

• A cover letter

• The contact details of two referees

For the post-doctoral fellow position, additional documents are required:

• The defense and PhD reports

• The contact details of two referees

The PDF file should be sent to the three supervisors, Chloé Clavel, Beatrice Biancardi and Mathieu Chollet: chloe.clavel@telecom-paris.fr, bbiancardi@cesi.fr, mathieu.chollet@glasgow.ac.uk

Multimodal attention models for assessing and providing feedback on users’ public speaking ability

*Keywords* human-machine interaction, attention models, recurrent neural networks, social computing, natural language processing, speech processing, non-verbal behavior processing, multimodality, soft skills, public speaking

*Supervision* Chloé Clavel, Mathieu Chollet, Beatrice Biancardi

*Description* Oral communication skills are essential in many situations and have been identified as core skills of the 21st century. Technological innovations have enabled social skills training applications which hold great training potential: speakers’ behaviors can be automatically measured, and machine learning models can be trained to predict public speaking performance from these measurements and subsequently generate personalized feedback to the trainees.

The REVITALISE project proposes to study explainable machine learning models for the automatic assessment of public speaking and for automatic feedback production to public speaking trainees. In particular, the recruited candidate will address the following points:

- identify relevant datasets for public speaking training and prepare them for model training

- propose and implement multimodal machine learning models for public speaking assessment and compare them to existing approaches in terms of predictive performance

- integrate the public speaking assessment models into a public speaking training interface to produce feedback, and evaluate the usefulness and acceptability of the produced feedback in a user study

The results of the project will help to advance the state of the art in social signal processing, and will further our understanding of the performance/explainability trade-off of these models.

The compared models will include traditional machine learning models proposed in previous work [Wortwein] and sequential neural approaches (recurrent networks) that integrate attention models, as a continuation of the work done in [Hemamou_a], [Hemamou_b] and [Ben-Youssef]. The feedback production interface will extend a system developed in previous work [Chollet21].

Selected references of the team:

[Hemamou_a] L. Hemamou, G. Felhi, V. Vandenbussche, J.-C. Martin, C. Clavel. HireNet: a Hierarchical Attention Model for the Automatic Analysis of Asynchronous Video Job Interviews. In AAAI 2019.

[Hemamou_b] Leo Hemamou, Arthur Guillon, Jean-Claude Martin, Chloe Clavel. Multimodal Hierarchical Attention Neural Network: Looking for Candidates Behaviour which Impact Recruiter’s Decision. IEEE Transactions on Affective Computing, Sept. 2021.

[Ben-Youssef] Atef Ben-Youssef, Chloé Clavel, Slim Essid, Miriam Bilac, Marine Chamoux, and Angelica Lim. UE-HRI: a new dataset for the study of user engagement in spontaneous human-robot interactions. In Proceedings of the 19th ACM International Conference on Multimodal Interaction, pages 464–472. ACM, 2017.

[Wortwein] Torsten Wörtwein, Mathieu Chollet, Boris Schauerte, Louis-Philippe Morency, Rainer Stiefelhagen, and Stefan Scherer. 2015. Multimodal Public Speaking Performance Assessment. In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction (ICMI '15). Association for Computing Machinery, New York, NY, USA, 43–50.

[Chollet21] Chollet, M., Marsella, S., & Scherer, S. (2021). Training public speaking with virtual social interactions: effectiveness of real-time feedback and delayed feedback. Journal on Multimodal User Interfaces, 1-13.

Other references:

[TPT] https://www.telecom-paristech.fr/eng/

[IMTA] https://www.imt-atlantique.fr/fr

[SocComp] https://www.tsi.telecom-paristech.fr/recherche/themes-de-recherche/analyse-automatique-des-donnees-sociales-social-computing/

[SSA] http://www.tsi.telecom-paristech.fr/ssa/#

[PACCE] https://www.ls2n.fr/equipe/pacce/

[Clavel] https://clavel.wp.imt.fr/publications/

[Chollet] https://matchollet.github.io/

[Biancardi] https://sites.google.com/view/beatricebiancardi

-Rasipuram, Sowmya, and Dinesh Babu Jayagopi. 'Automatic multimodal assessment of soft skills in social interactions: a review.' Multimedia Tools and Applications (2020): 1-24.

-Sharma, Rahul, Tanaya Guha, and Gaurav Sharma. 'Multichannel attention network for analyzing visual behavior in public speaking.' 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018.

-Acharyya, R., Das, S., Chattoraj, A., & Tanveer, M. I. (2020, April). FairyTED: A Fair Rating Predictor for TED Talk Data. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 01, pp. 338-345).
