ISCA - International Speech
Communication Association


ISCApad #323

Friday, May 09, 2025 by Chris Wellekens

6 Jobs

6-1

(2024-11-05) Research engineer in artificial intelligence for teaching, Université Grenoble-Alpes, France

As part of the EFELIA MIAI project, the research laboratories and IUT departments of UGA are developing training activities in artificial intelligence. They are therefore looking for an AI research engineer who can contribute to developing teaching resources and practices for the institute's programmes, and to the research activities of the Laboratoire d'Informatique de Grenoble on LLMs (Large Language Models), in particular within the ANR project Pantagruel (https://pantagruel.imag.fr/).

The full job description is available on the UGA website:

https://emploi.univ-grenoble-alpes.fr/offres/ingenieur-de-recherche-en-intelligence-artificielle-f-h--1504906.kjsp?RH=1135797159702996

*To apply*

Follow the link above and click 'Je postule' ('Apply').

*Deadline*

The position remains open until filled.

*Salary*

From €2,289 gross per month, depending on experience.

*For any further information about the position*, contact
François PORTET, Professor - francois.portet@imag.fr
Didier SCHWAB, Professor - didier.schwab@imag.fr

6-2

(2024-11-06) Internship offer, BEA, Le Bourget, Ile-de-France, France

Subject: Internship offer "Overlapping speech in aircraft cockpits: annotation and acoustic tests"

Location: Audio-CVR Laboratory, BEA, 10 rue de Paris, 93350 Le Bourget

Trips of several consecutive days within metropolitan France are to be expected (costs covered by BEA)
Period: 4 to 6 months, ending no earlier than June 2025
Compensation: statutory internship stipend, reimbursement of transport costs

Context of the internship

As part of investigations into civil and military aviation accidents and incidents, the technical department of BEA (for civil aviation) and the RESEDA laboratory (for military aviation) are responsible for recovering the data held in flight recorders, commonly known to the general public as "black boxes".

Within the ANR / AID research project BLeRIOT (Bea Lisic Reseda Irit investigation on aerOnautic speech Transcription), BEA and RESEDA are responsible for supplying and producing overlapping-speech data to investigate new automatic transcription methods suited to cockpit voice recorders and meeting the needs created by regulations that significantly increase the recording duration (from 2 h to 25 h). These annotated data will be used by project partners to build automatic transcription models, which will later be evaluated in a scientific setting.

The work will be carried out in the technical department of BEA over 4 to 6 months, with trips of several days within metropolitan France for the acoustic measurement campaigns, and in collaboration with the academic partners, namely the Laboratoire d'Informatique Signal et Image de la Côte d'Opale (LISIC) and the Institut de Recherche en Informatique de Toulouse (IRIT). The intern will join the BEA audio analysis laboratory team and will have the opportunity to discover the techniques for exploiting and analysing audio data used in support of civil aviation safety investigations.

Work to be carried out during the internship

During this internship, the intern will be expected to:

Build a corpus of several hours of speech from CVRs (Cockpit Voice Recorders)

Establish an annotation convention that takes the researchers' needs into account

Correct automatic transcriptions of voice recordings

Annotate speaker turns between the airline pilot and the captain

Transcribe multi-speaker speech

Run an acoustic test campaign in the cockpits of state aircraft

Write an acoustic test protocol with the state stakeholders and the manufacturer supplying the recording equipment

Build an audio speech database (general speech and aeronautical speech)

Make multi-track recordings of the acoustic result of playing speech through mannequins in aircraft cockpits during simulated flight

Download the CVR recordings

Document the two databases produced

Candidate profile

M1/M2 level or equivalent in sound engineering, acoustics, or speech

Good understanding of English required

Knowledge of aeronautical vocabulary would be a plus

As the project is Defence-related:

Candidates will be asked to sign a confidentiality agreement, and rules must be strictly followed to guarantee data confidentiality

Candidates who are not nationals of a Member State of the European Union, of a State belonging to the European Economic Area, or of the Swiss Confederation will require prior authorisation from the AID (Agence de l'Innovation de Défense), which will review such applications case by case within eight weeks

Tools used

Annotation: Samplitude

Automatic transcription: Whisper (OpenAI)

Audio recording and mannequin playback system

CVR download software

Bibliography

- Puigt, M., Bigot, B., Devulder, H., Introducing the « Cockpit Party Problem » : blind source separation enhances aircraft cockpit speech transcription, J. Audio Eng. Soc., 2024. https://hal.science/hal-04666683v1

- BEA, Ce qu’il faut savoir sur les enregistreurs de vol, 2009.

- Bigot, B., Bredin, H., Delmaire, G., Guerin, H., Menez, C., Pinquier, J., Puigt, M., Roussel, G., BLeRIOT Transcription et Investigation du Bea, du Lisic, de Reseda et de l’Irit sur la transcription de parole aéronautique, projet de recherche ANR/AID, 2024.

Contact and internship supervision

Lionel Feugère – Audio-CVR Laboratory

Specialist investigator and researcher, PhD in acoustics

Email : lionel.feugere@bea.aero

Tel: +33 1 49 92 74 07

Application

Send a CV and a cover letter to lionel.feugere@bea.aero

Applications will be reviewed on a rolling basis.

6-3

(2024-11-06) PhD and postdoc vacancy in multimodal search, The University of Utrecht, The Netherlands

We are looking for a PhD candidate and a postdoctoral researcher for multimodal processing of cultural digital archives at the Interaction Division of Utrecht University, the Netherlands. The deadline for applications is 13 November.

Job description

Are you passionate about developing cutting-edge AI techniques to enhance interaction and communication across multiple modalities, such as text, pictures, audio, and video? Join the large-scale HAICu NWA-ORC project to help unlock the potential of cultural digital archives through multimodal use, providing richer context and a more comprehensive analysis of current complex issues in society. If this fits your expertise and interests, the Interaction Division of Utrecht University is seeking you!

Your job

We are looking for a PhD and a postdoctoral researcher to work within the multi-partner HAICu NWA-ORC project. This vacancy is for the Postdoc position, the PhD position is being advertised simultaneously:

PhD Position on Multimedia Analysis in the HAICu Project. There are two research topics tackled in parallel for this project (see description below). Based on the applications, the topics will be assigned at PhD or Postdoc level. Both researchers will collaborate within the project.

This project is implemented by an ambitious consortium including many universities, knowledge institutions, archives, foundations, cultural institutions and business partners in the Netherlands. It aims to use improved access to digital heritage to tutor the Digital Citizen in the use of big data. It brings together AI researchers and Digital Humanities scholars to address the inadequacy of the data-mining tools currently available, aiming to derive information from the continuous stream of data about the present and the past. This will help citizens and other regular users, heritage curators and journalists who are interested in tapping heritage collections, as well as civic organizations and authorities interested in improving civic participation.

There are two research topics. You can indicate in your motivation letter whether you prefer one or the other.

Research topic 1 targets visual and multimodal feature learning for news ecosystems, analysing the complex multidimensional feature space of visual information to support data-driven journalism. This includes experiments for accountability, transparency, inclusiveness, and misinformation. The key technology is multimodal deep learning, and its extensions for these additional targets.

Research topic 2 targets audio and multimodal feature learning beyond words, such as intonation, tone, stress and rhythm, in relation to conveying emotion or messages, to support data-driven journalism. We will research audio features (e.g. for speech and music) and their relation to effective message conveying in news collections with audio and video, and innovate multimodal search by integrated feature learning in both visual and audio at the same time.

Research will include testing, validation and evaluation on large scale and interoperable collections, in cooperation with the societal partners in the project, including the Netherlands Institute for Sound and Vision, the National Archive, and the National Library of the Netherlands. The research will take place in collaboration with the HAICu fieldlab ‘Deep Journalism’, which develops functionality for searching for items about a similar topic from different archives and with various modalities to support news journalists.

The Interaction Division is part of the department of Information and Computing Sciences. It develops novel techniques to research technology-mediated communication and interaction between people, and communication and interaction between systems and people (users). The technologies for interaction make use of various modalities, in particular visual, auditory, and haptic modes, as well as combinations of these. Three of the chairs in the division are collaborating in this project: the Multimedia group (Professor Remco Veltkamp), the Music Information Computing group (Professor Anja Volk), and the Social and Affective Computing group (Professor Albert Salah).

Postdoc position:

https://www.uu.nl/en/organisation/working-at-utrecht-university/jobs/postdoc-position-on-multimedia-analysis-in-the-haicu-project

PhD position:

https://www.uu.nl/en/organisation/working-at-utrecht-university/jobs/phd-position-on-multimedia-analysis-in-the-haicu-project

6-4

(2024-11-11) Two funded PhD positions at INRIA, France.

Inria is opening two funded PhD positions:

* multilingual speech synthesis for regional languages:
https://jobs.inria.fr/public/classic/fr/offres/2024-08319 (deadline:
6 December)
* voice biomarkers for emergency calls:
https://jobs.inria.fr/public/classic/fr/offres/2024-08317 (CIFRE with the
Paris-based startup ECHO, which has access to a dataset, unique in
Europe, of hundreds of thousands of emergency calls; deadline: 4
December)

Candidates are invited to apply online as soon as possible.
Applications will be evaluated on a rolling basis.

6-5

(2024-11-10) 6-month internship, Transcription and Alignment of Theatrical Speech through Prosodic Analysis, Université Grenoble-Alpes, France

Transcription and Alignment of Theatrical Speech through Prosodic Analysis

Context:

Automatic speech transcription in theatrical settings poses major challenges. The richness of theatrical language, the diversity of accents and registers, the marked prosodic variations, and the acoustic characteristics of in-venue recordings (reverberation, stage noise) make this task particularly complex. This internship aims to explore an automatic transcription and alignment system specifically adapted to theatre recordings. It will rely on unannotated audio corpora and will use the original scripts of the plays to guide linguistic and prosodic modelling.

Objectives:

The main objective of this internship is to design and adapt an automatic speech recognition (ASR) system for theatre corpora, integrating techniques for aligning with the original script based on prosodic information. The first step will be to use unannotated audio corpora from recordings of theatre plays to train or fine-tune existing models, using self-supervised learning approaches such as Wav2Vec or HuBERT. The script of each play will be used to enrich the language modelling and contextualise the transcription. Particular attention will be paid to the prosodic variations specific to theatrical performance (intonation, pauses, rhythm), which will be used to align the resulting transcription with the text of the play and to detect any divergences due to improvisations or omissions.

To go further, multimodal approaches may be explored. For example, using visual signals such as the actors' lip movements or facial expressions could improve transcription accuracy, particularly in acoustically complex environments. Finally, stylistic adaptation techniques will be implemented to better handle variations in register, whether classical, contemporary or poetic language.

Supervision and motivation:

This internship is open to students enrolled in an M2 programme in computer science and artificial intelligence.

It will be supervised by Rémi Ronfard, Inria research director, scientific lead of the ANIMA team of the LJK laboratory and the Inria centre at Université Grenoble Alpes, and head of the ITHEA exploratory action (computing for theatre); and by Benjamin Lecouteux, professor at Université Grenoble Alpes, member of the GETALP team of the Laboratoire d'Informatique de Grenoble (LIG), and associate researcher in the ITHEA exploratory action.

The ANIMA team specialises in computer graphics and computer vision. Over several years it has built a corpus of video recordings of theatre plays, indexed and analysed with computer vision algorithms (actor detection, tracking and recognition) and available online at http://kinoai.inria.fr for researchers in theatre studies.

The GETALP team specialises in speech and natural language processing. It is particularly interested in theatrical speech, which is embodied, expressive and multimodal.

This M2 internship is part of a long-term collaboration between our two teams on understanding, analysing and disseminating theatre stagings. As a first step, we aim to build a corpus of theatre texts aligned with video recordings of their stagings, which will be made available to the community of cognitive science researchers interested in theatrical communication. A first study (Martinez 2023) showed that off-the-shelf speech recognition methods were insufficient to create such a corpus and that more specific approaches needed to be developed. That is the purpose of this internship.

The internship will take place on the premises of Inria's ITHEA exploratory action in Grenoble (MINATEC). If successful, it may be followed by a PhD on the same topic, subject to obtaining a research grant.

References:

Max Bain, Jaesung Huh, Tengda Han, Andrew Zisserman. WhisperX: Time-Accurate Speech Transcription of Long-Form Audio. INTERSPEECH 2023.

Adela Barbulescu, Rémi Ronfard, Gérard Bailly. Characterization of Audiovisual Dramatic Attitudes. Interspeech 2016 - 17th Annual Conference of the International Speech Communication Association, Sep 2016.

Chow and Brown. A Musical Approach to Speech Melody. Frontiers in Psychology, Section : Cognition, Volume 9, Article 247, March 2018.

Katsalis, A. et al. (2023). NLP-Theatre: Employing Speech Recognition Technologies for Improving Accessibility and Augmenting the Theatrical Experience. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2022. Lecture Notes in Networks and Systems, vol 543. Springer, Cham.

Emma Martinez. Conception d’un système de reconnaissance de la parole pour le théâtre. Master's thesis in Sciences du Langage, Univ. Grenoble Alpes. Supervised by Benjamin Lecouteux and Rémi Ronfard. September 2023.

Gabriele Sofia, « Mémoire phonique « incarnée » du théâtre. Prolégomènes d’une approche cognitive », Revue Sciences/Lettres [En ligne], 5 | 2017.

Benjamin Lecouteux
Full Professor in Computer Science
UGA / LIG / GETALP team
Phone:    (+33)7 64 54 24 85

6-6

(2024-11-11) Internship on semi-automatic annotation of conversations in audiovisual documents, LISN, Orsay, France

Please find below the internship offer from LISN (Orsay) on semi-automatic annotation of conversations in audiovisual documents.

The internship may be followed by a PhD (ANR funding planned).

Description:

Most human interactions occur through spoken conversations. Although this mode of interaction seems natural and easy for humans, it remains a challenge for spoken language processing models, as conversational speech raises critical issues. First, non-verbal information can be essential to understanding a message. For example, a smiling face and a joyful voice can help detect irony or humor in a message. Second, visual grounding between participants is often needed during a conversation to integrate posture and body gesture as well as references to the surrounding world. For example, a speaker can talk about an object on a table and refer to it as "this object" while pointing at it with her hand. Finally, semantic grounding between the participants of a conversation, to establish mutual knowledge, is essential for communicating with each other.

In this context, the MINERAL project aims to train a multimodal conversation representation model for communicative acts and to study communicative structures of audiovisual conversation.

As part of this project, we are offering a 5- to 6-month internship focused on semi-automatic annotation of conversations in audio-visual documents. The intern's first task will be to extend the existing annotation ontology for dialog acts, currently available for audio documents (through the Switchboard corpus for example), to incorporate the visual modality. In a second step, the intern will develop an automatic process for transferring annotations to new audiovisual datasets (such as meeting videos and TV series or movies) using transfer or few-shot learning approaches.

Practicalities:

Starting between February and April 2025, the internship will last 5 or 6 months, will be paid about 500 euros per month, and will take place at LISN (Orsay) within the LIPS team. The internship may be followed by a funded PhD, based on performance and interest in continuing research in this area.

Required Qualifications:

● Master's degree (M2) in Computer Science or related field.
● Experience with deep learning frameworks such as Keras or PyTorch.
● Knowledge of image processing would be an advantage.

To apply, please send your CV, a cover letter and your M1 and M2 transcripts (if available) by email to Camille Guinaudeau camille.guinaudeau@universite-paris-saclay.fr and Sahar Ghannay sahar.ghannay@universite-paris-saclay.fr

References:

[Albanie, 2018] Samuel Albanie, Arsha Nagrani, Andrea Vedaldi, and Andrew Zisserman. Emotion Recognition in Speech using Cross-Modal Transfer in the Wild. In Proceedings of the 26th ACM international conference on Multimedia. 2018
[Zhang, 2021] Sheng Zhang, Min Chen, Jincai Chen, Yuan-Fang Li, Yiling Wu, Minglei Li, Chuanbo Zhu. Combining cross-modal knowledge transfer and semi-supervised learning for speech emotion recognition. Knowledge-Based Systems. 2021.

[Fang, 2012] Alex C. Fang, Jing Cao, Harry Bunt and Xiaoyue Liu. The annotation of the Switchboard corpus with the new ISO standard for dialogue act analysis. Workshop on Interoperable Semantic Annotation. 2012.

6-7

(2024-11-13) 6-month internship, Université d'Avignon, France

Internship: 6 months, 'Extraction of semantic information from transcriptions of children's oral story summaries'

Université d'Avignon, LIA

** General information

Duration: 6 months
Start: from January 2025, April 2025 at the latest
Location: Université d’Avignon – LIA – Campus
Stipend: according to the statutory scale
Prospects: 3-year PhD programme

**Context:
This internship is part of the ANR project Chica-AI (2024-2028), which aims to design a software environment able to automatically analyse children's oral summaries in order to assess their understanding of a text after a reading task.
The PISA 2019 study shows that 20% of French 15-year-old students have severe reading difficulties, and that socio-economic gaps widen disparities in level. The project aims to reduce the reading difficulties of cycle 3 children by proposing machine-learning-based methods that provide personalised support for the pupil and informative feedback for teachers.
Reading comprehension is a fundamental issue, and it can be trained through activities such as text summarisation. To automatically analyse a child's understanding of the text, the child's oral production of the summary must be assessed. This requires extracting a set of semantic information from the summary, and also providing a set of relevant, differentiated indicators for both pupils and teachers.
To reach these goals, several modules will be developed: a speech recognition module adapted to children's voices; a spoken language understanding module to extract semantic information; and a module that matches this semantic information against an assessment grid based on psycho-cognitive criteria, to judge the quality of the summaries produced.

**Intern's work:
The goal of the internship is to develop a first system for extracting semantic information from transcriptions of oral summaries. First, the intern will explore different natural language processing (NLP) methods for detecting and extracting named entities (such as places or characters) using models such as CamemBERT and FlauBERT. Second, these methods will be adapted to detect relevant events in the summary (different actions). The intern will become familiar with advanced NLP and machine learning techniques (neural architectures and large language models, LLMs). One difficulty to overcome will be linking the semantic information extracted from the summary with the information sought for its assessment.

**Applications:
The candidate must be in the second year of a master's (M2) in computer science, with knowledge of artificial intelligence. Knowledge of natural language processing will be appreciated. The candidate must show an interest in teamwork and interdisciplinary work.

Applications (CV, cover letter, and Bac+4 transcript) should be sent to nathalie.camelin@univ-avignon.fr before 15/12/2024.

**Contact: nathalie.camelin@univ-avignon.fr

6-8

(2024-11-15) Two fully funded PhD positions, INRIA, France

Inria, the French national institute for research in digital science and
technology, is opening two fully-funded PhD positions:

* multilingual TTS for regional languages:
https://jobs.inria.fr/public/classic/en/offres/2024-08319 (deadline: Dec 6)
* voice biomarkers for medical emergency calls:
https://jobs.inria.fr/public/classic/fr/offres/2024-08317 (deadline: Dec 4)

The latter is joint with ECHO, a Paris-based startup which has access to
a unique dataset of several hundred thousand real emergency calls.

Candidates shall apply online at their earliest convenience.
Applications will be assessed on a rolling basis.

6-9

(2024-12-06) Two internship offers in the research department of the Institut National de l'Audiovisuel (INA), Paris, France

Two internship offers in the research department of the Institut National de l'Audiovisuel, on speech analysis (signal or transcribed speech) with a strong digital humanities and machine learning component.

Topic 1: Automatic description of racist and sexist stereotypes in audiovisual content

https://www.ina.fr/hub-p/public/2024-12/stage_recherche_ina_2025_racisme_sexisme.pdf

Topic 2: Voice activity detection in audiovisual corpora using self-supervised representations

https://www.ina.fr/hub-p/public/2024-12/stage_recherche_ina_2025_vad.pdf

6-10

(2024-12-10) 12 positions for doctoral researchers: PSST! - Privacy for Smart Speech Technology

PSST! - Privacy for Smart Speech Technology

Call for applicants - PhD students (12 positions)

“Privacy for Smart Speech Technology” (PSST) is a joint doctoral training programme and Horizon Europe Marie Skłodowska-Curie Action, the European Union’s flagship funding programme for doctoral training. We are a consortium of 7 European universities and 11 industrial partners searching for 12 PhD students to work on the protection and evaluation of privacy for smart speech technology. PSST is a unique opportunity, as it is the largest international project focusing on privacy in speech technology and because the importance of privacy has only recently gained wider appreciation.

This is no ordinary PhD programme.

The structured PSST doctoral training programme combines training in cutting-edge research, transferable skills and career-enhancing skills with exposure to multiple sectors and disciplines.

Join us and put your expertise in deep learning / machine learning, speech processing, information privacy and security, and user studies into practice and gain your PhD degree from TWO leading European Universities (listed below)!

See more information and PhD topics at https://psst-doctoralnetwork.eu/

We are looking for 12 PhD candidates who hold a master's degree. We value diversity and plan to hire 12 fellows with a balanced background and skillset, and an excellent academic track record. We especially encourage applications from members of under-represented groups.

10.12.2024 Call opens

26.1.2025 Application deadline

28.2.2025 Shortlisted candidates informed

17.-18.3.2025 Recruitment event in Finland for shortlisted candidates

May 2025 Notification of acceptance

August 2025 Planned start of employment

PSST follows a double-degree model whereby, during their 45-month employment, each PhD student will work in collaboration with two universities towards PhD degrees from both institutions! Each PhD student will also spend 6 months on secondment to one of our Associate Partners, all leading European SMEs, large industrials or regulatory bodies active in speech privacy: CNIL (France), ELDA (France), ki:elements (Germany), Loihde (Finland), Naver (France), Omilia (Greece), Orange (France), Vocapia (France), VoiceInteraction (Portugal), Voice INTER connect (Germany), and VoiceMod (Spain).

Applications should include:

- Curriculum Vitae (including countries of residence in the past 36 months).

- Academic transcripts for completed courses and degrees.

- Motivation letter explaining why you want to pursue a PhD degree and why you believe you are an outstanding candidate to pursue your PhD researching PSST topics.

- Reference letter from Master’s thesis supervisor/advisor or similar.

- (Optional) Preferences for 1-3 research topics (see webpage) and universities.

Requirements

- A master's degree in electrical engineering, computer science or related area (degree must be completed before employment can start).

- Mobility: The fellow must not have resided or carried out their main activity (work, studies, etc.) in the country of the first recruiting organisation for more than 12 months in the 36 months immediately before their recruitment date.

- Fluent written and verbal communication skills in English are required, knowledge of the local language is an advantage.

- Candidates cannot hold a doctoral degree.

Desirable skills

- Knowledge and skills in deep learning, programming, speech processing, user studies, privacy.

- Ability to work independently and a critical mindset.

- Pro-activeness and eagerness to participate in network-wide training events, international mobility, and public dissemination activities. 

Submit your application at https://www.aalto.fi/en/open-positions/doctoral-researchers-12-positions-privacy-for-smart-speech-technology-psst

PhD students receive a regular salary and social benefits according to national regulations, and if applicable, also family leave, long-term leave, and special needs allowances. The gross salaries we offer, including both a living allowance and a mobility allowance, are

3500 €/month Aalto University (Espoo, Finland)

3261 €/month EURECOM (Sophia Antipolis, France) [1]

2680 €/month INESC-ID (Lisbon, Portugal) [2]

3261 €/month INRIA (Nancy or Saclay, France) [1]

Salary group TV-L E13 Ruhr University Bochum (Germany) [3]

Salary scale P Radboud University Nijmegen (Netherlands) [4]

Salary group TV-L E13 Technical University of Berlin (Germany) [3]

[1] https://www.horizon-europe.gouv.fr/sites/default/files/2022-02/horizon-europe---dn-pf---french-salary-explained-5762.pdf

[2] includes: base salary + food allowance + holiday allowance

[3] https://oeffentlicher-dienst.info/c/t/rechner/tv-l/allg?id=tv-l-2024&g=E_13&s=1

[4] https://www.ru.nl/sites/default/files/2024-09/Overview%20salary%20scales%201%20sept%202024.pdf

For queries, contact info@psst-doctoralnetwork.eu.

Marie Skłodowska-Curie Actions, Doctoral Networks (MSCA-DN), 101168193 – PSST.

6-11

(2024-12-13) Doctoral training program

“Privacy for Smart Speech Technology” (PSST) is a joint doctoral
training programme and Horizon Europe Marie Skłodowska-Curie Action, the
European Union’s flagship funding programme for doctoral training. We
are a consortium of 7 European universities and 11 industrial partners
searching for 12 PhD students to work on the protection and evaluation
of privacy for smart speech technology. PSST is a unique opportunity, as
it is the largest international project focusing on privacy in speech
technology and because the importance of privacy has only recently
gained wider appreciation.

Join us and put your expertise in deep learning / machine learning,
speech processing, information privacy and security, and user studies
into practice and gain your PhD degree from TWO leading European
Universities!

See more information and PhD topics at https://psst-doctoralnetwork.eu/

Application deadline: January 26, 2025 - apply now!

On behalf of the PSST team

Nicholas Evans
EURECOM

6-12

(2024-12-13) Internship, IRCAM/CNRS/EURECOM

Generation of audio-visual deepfakes with a multimodal diffusion model

Dates: 01/03/2025 to 31/08/2025

Laboratory: STMS Lab (IRCAM / CNRS / Sorbonne Université) and EURECOM

Location: IRCAM – Analyse et Synthèse des Sons (Paris) or EURECOM (Sophia Antipolis)

Supervisors: Nicolas Obin (Ircam), Jean-Luc Dugelay (EURECOM), Alexandre Libourel (EURECOM)

Contact: nicolas.obin@ircam.fr, Jean-Luc.Dugelay@eurecom.fr, Alexandre.Libourel@eurecom.fr

Context: This internship is part of the DeTOX project (combating deepfake videos of French public figures), funded by ASTRID/ANR and carried out in collaboration with EURECOM. Recent challenges have shown that it is extremely difficult to build universal detectors of hyper-faked videos, such as the deepfakes used to counterfeit a person's identity. When detectors are exposed to videos generated by a new algorithm, that is, one unseen during training, performance is still extremely limited. On the video side, the algorithms examine frames one by one, without taking into account how facial dynamics evolve over time. On the voice side, the voice is generated independently of the video; in particular, audio-video synchronization between the voice and lip movements is not taken into account. This is a major weakness of deepfake video generation algorithms. The DeTOX project aims to implement and train deepfake detection algorithms personalized to individuals for whom many real and falsified audio-video sequences are available and/or can be produced. Building on basic audio and video technology components taken from the state of the art, the project will focus on taking into account the temporal evolution of audio-visual signals and their coherence, for both generation and detection. We thus aim to demonstrate that, by using audio and video simultaneously and by focusing on a specific person during training and detection, it is possible to design detectors that remain effective even against generators not yet catalogued. Such tools will make it possible to monitor the web and detect possible deepfake videos of important French public figures (President of the Republic, journalists, Chief of the Defence Staff, etc.) as soon as they are published.

Objectives: Audio-visual deepfake generation currently relies on assembling separately generated audio deepfakes, visual deepfakes, and lip resynchronization. Each modality has reference generators in the state of the art: for example, LIA [1, 2] or DeepFaceLab for images, RVC [3] for audio, and Wav2lip and Diff2lip [4] for audio-visual lip synchronization.

The goal of this internship is to implement, train, and evaluate a multimodal diffusion model for audio-visual deepfake generation, built on existing generators and optimized for a targeted public figure.

The expected contributions are:

- Implementing a diffusion-based post-net operating on asynchronous data streams which, starting from an assembly of separate generators, homogenizes the rendering of a generated audio-visual deepfake and optimizes its realism.

- Specializing the generation by conditioning it on the identity of a public figure, for example through person-conditioned adversarial training.

- Generating a database of audio-visual deepfakes of one or more French public figures.

- Setting up objective and subjective evaluation protocols to assess the quality and realism of the generated deepfakes.

The internship will rely mainly on the expertise of the Analyse et Synthèse des Sons team in speech signal processing and generative modeling with neural networks, in close collaboration with EURECOM for multimodal generation. In addition, the intern will be able to build on the existing implementations of audio, visual, and lip-synchronization generators already produced within the DeTOX project.

Expected skills:

● Proficiency in machine learning, in particular neural-network and multimodal learning.

● Proficiency in digital signal processing (audio, image).

● Good command of Python programming, of the TensorFlow and PyTorch environments, and of distributed computing on GPUs.

● Autonomy, teamwork, communication, productivity, rigor, and methodology.

Compensation: internship stipend in accordance with current legislation, plus social benefits

Application deadline: 20 January 2025

References:

[1] Wang, Yaohui, Di Yang, Francois Bremond, and Antitza Dantcheva. 'LIA: Latent Image Animator.' IEEE Transactions on Pattern Analysis and Machine Intelligence (2024).

[2] Wang, Y., Yang, D., Bremond, F. and Dantcheva, A. Latent image animator: Learning to animate images via latent space navigation. In International Conference on Learning Representations (ICLR), 2022.

[3] Retrieval-based Voice Conversion. Available online: https://github.com/RVCProject/Retrieval-based-Voice-ConversionWebUI/blob/main/docs/en/README.en.md

[4] Mukhopadhyay, S. et al. Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 5292-5302. 2024.

6-13

(2024-12-14) M2 Internship: Using Speech-Based AI to Study Communicative Development, @ LIS/CNRS, Marseille ( Luminy campus), France

M2 Internship: Using Speech-Based AI to Study Communicative Development

Requirement: M1 in computer science

Large Language Models, such as ChatGPT, have shown impressive abilities in text-based tasks. Beyond practical applications, they have also sparked scientific discussions about the nature of human language and cognitive development, including debates around Chomsky’s theories on the emergence of syntax. 1 However, these models have limitations in advancing our understanding of how children acquire language. First, they rely on vast amounts of text data for training. Children do not acquire language through exposure to written text; their language learning is grounded in speech, an inherently multimodal signal that combines linguistic and paralinguistic information such as prosody. These features are understood to play a critical role in shaping children’s communicative development. 2 Second, children are not passive learners; they actively engage in (proto-)conversational exchanges with caregivers. Through interactions, they influence their linguistic environment, creating a dynamic feedback loop that is vital for learning. 3

Recent advances in speech language modeling provide a scientific infrastructure for the study of how multimodality and interaction shape early language development. Models like Moshi 4 represent a significant step forward by processing speech directly, without first converting it into text. This approach allows an effective integration of both linguistic and paralinguistic cues. Moshi also models interactive speech communication, enabling it to listen and respond simultaneously—just as humans do. This project aims to use such speech-based models to study children’s communicative development in unprecedented ways, addressing questions about how early conversational dynamics, prosody, and meaning interact to support language acquisition and use. Beyond its scientific contributions, this work has significant societal implications. In education, it can guide the development of more engaging, low-latency e-tutoring systems. In health, it can improve the accuracy of tools for early detection of communicative disorders, such as autism, through analysis of markers like turn-taking dynamics and prosody.

The internship will focus on the dialogue Generative Spoken Language Model (dGSLM), 5 a direct precursor to Moshi. dGSLM is well-suited for an M2 internship due to its relative simplicity, while still being capable of producing significant scientific results. The main components of dGSLM include (see Figure, extracted from the original paper):

● Encoder: HuBERT, a self-supervised speech model that encodes linguistic and paralinguistic features from raw audio

● Decoder: HiFi-GAN, a vocoder for generating realistic audio.

● Model Architecture: Duplex transformer, which supports bidirectional processing of conversational dynamics.

We will fine-tune dGSLM on around 150 hours of child-adult conversations from a new corpus, which includes data from 303 children aged 4 to 9 years. This fine-tuning will adapt the model to study child-directed communication. In particular, we will explore how prosody influences turn-taking dynamics, employing methods analogous to those we use to study children’s behavior in the lab.

Practicalities

The internship is paid about 600 euros per month for a duration of 5 to 6 months. It will take place in Marseille within the TALEP research group at LIS/CNRS on the Luminy campus. The intern will collaborate with other interns from this project, as well as PhD students and researchers from the research group.

How to apply: send as soon as possible a short application letter, transcripts, and CV to abdellah.fourtassi@gmail.com

● Application deadline: December 20th, 2024

● Expected start: February 2025 6

1 Piantadosi, S. T. (2023). Modern language models refute Chomsky’s approach to language. From fieldwork to linguistic theory: A tribute to Dan Everett, 353-414.

2 Christophe, A., Millotte, S., Bernal, S., & Lidz, J. (2008). Bootstrapping lexical and syntactic acquisition. Language and speech, 51(1-2), 61-75.

3 Murray, L., & Trevarthen, C. (1986). The infant's role in mother–infant communications. Journal of child language, 13(1), 15-29.

4 Défossez, A., Mazaré, L., Orsini, M., Royer, A., Pérez, P., Jégou, H., ... & Zeghidour, N. (2024). Moshi: a speech-text foundation model for real-time dialogue. arXiv preprint arXiv:2410.00037.

5 Nguyen, T. A., Kharitonov, E., Copet, J., Adi, Y., Hsu, W. N., Elkahky, A., ... & Dupoux, E. (2023). Generative spoken dialogue language modeling. Transactions of the Association for Computational Linguistics, 11, 250-266.

6 Ekstedt, E., & Skantze, G. (2022). How much does prosody help turn-taking? investigations using voice activity projection models. arXiv preprint arXiv:2209.05161.

6-14

(2024-12-18) Internships at IRIT (SAMoVA team), Toulouse, France

The SAMoVA team at IRIT in Toulouse is offering several internships (M1, M2, final-year engineering projects) in 2025 on the following topics (non-exhaustive list):

All details (topics, contacts) are available in the 'Jobs' section of the team's website:
https://www.irit.fr/SAMOVA/site/jobs/

6-15

(2025-02-04) Jobs in Nancy, France

Four associate professor (maître de conférences, MCF) positions and two full professor (professeur, PR) positions in computer science are open at Université de Lorraine, with a research affiliation at LORIA (www.loria.fr). Candidates must contact the heads of the laboratory's teams and the teaching departments.

- 2 PR positions at École des Mines de Nancy and at IUT Charlemagne (Nancy). For research, open to recruitment in all LORIA teams. For teaching, one profile in robotics and CPS at École des Mines and one profile for the MMI department at IUT Charlemagne.

- 2 MCF positions open, for research, to recruitment in all teams of departments D1 "Algorithmics, computation, image and geometry", D2 "Formal methods" and D3 "Networks, systems and services" at LORIA.

For teaching: 1 assignment to the Faculté des Sciences et Technologie (Nancy) with an open profile (programming, algorithmics, discrete mathematics, web, networks, software engineering, databases); 1 assignment to Telecom Nancy with a profile in connected systems and software engineering (connected systems, distributed systems, software engineering, systems programming, software development, cybersecurity, cloud).

- 2 MCF positions open, for research, to recruitment in all teams of departments D3 "Networks, systems and services", D4 "Natural language and knowledge processing" and D5 "Complex systems, artificial intelligence and robotics" at LORIA.

For teaching: 1 assignment to IDMC (Nancy) with a profile for the MIAGE program (computer science, databases, information systems, distributed information systems, big data, cloud, BI); 1 assignment to IUT de Metz with a profile for the application development track (application development, systems programming).

More information at https://www.loria.fr/fr/emplois/

6-16

(2025-02-04) Several 3-year PhD positions @LIA, Avignon, France

Several fully funded three-year PhD positions are open in LIA's Speech and Language Group at Avignon University.

One position is in Computational Linguistics, with a specialization in language translation. This position requires knowledge of several languages (at least English and French, at C1 level) and will take place in the context of FR Agorantic. The other PhD position is in the field of human-robot interaction.

Full descriptions available at:

- 'L’écart en traduction : Compréhension, gestion et traitement des écarts linguistiques et culturelles par l’intelligence artificielle' - 'The translation gap: Understanding, managing and processing linguistic and cultural gaps using artificial intelligence'
Supervision: Fabrice Lefèvre and Laurent Lombard

https://adum.fr/as/ed/voirproposition.pl?site=avignon&matricule_prop=60863

- 'Agentic LLM for pro-active multimodal human-robot interaction'
Supervision: Fabrice Lefèvre

https://adum.fr/as/ed/voirproposition.pl?site=avignon&matricule_prop=60862

Applications for both positions close on May 26.

6-17

(2025-02-10) Post-doc and PhD at the Medical University of Vienna, Austria

Positions Announcement

The Speech and Hearing Science Lab (SHS Lab) at the Medical University of Vienna and the Signal Processing and Speech Communication Lab (SPSC Lab) at Graz University of Technology are jointly seeking candidates for:

· 1 PhD Candidate

· 1 Postdoctoral Researcher

Both positions focus on speech processing in digital health and are expected to start on April 1st, 2025. The selected candidates will work on voice conversion for speech pathologies, with applications in (i) disease progression modeling, including treatment effect prediction, and (ii) enhancing pathological speech using a speaking aid.

Key Research Areas & Methodologies

· Speech processing

· Voice conversion

· Speech analysis & synthesis

· Speaker embeddings

· Representation learning

· Deep neural networks

· Natural language processing (NLP)

· Artificial intelligence & machine learning (AI/ML)

Qualifications

PhD Candidate

· M.Sc. degree in a relevant field (Electrical Engineering, Computer Science, Information & Computer Engineering, Electrical Engineering & Audio Engineering).

· Experience in speech processing, preferably in voice conversion.

Postdoctoral Researcher

· PhD degree in a relevant field (related to speech processing).

· Research publications in voice conversion.

General Requirements (All Candidates)

· Independent, self-motivated work ethic.

· Strong teamwork skills.

· Excellent communication abilities.

· Fluency in English (C1 level); German is an asset.

· Willingness and eligibility to work in Graz and Vienna.

· Willingness and eligibility to travel internationally, including to the USA and Asia for conferences.

About the Institutions

The SHS Lab focuses on engineering sciences for communication disorders, integrating speech signal processing, medical data science, AI/ML, medical imaging, and biomarkers. The lab is affiliated with the Department of Otorhinolaryngology and the Division of Phoniatrics-Logopedics.

MedUni Vienna is one of Europe’s leading medical universities, affiliated with Vienna General Hospital, the largest hospital in Europe. It actively advances AI and machine learning research through its newly founded “Comprehensive Centre for AI in Medicine” and is expanding with major new facilities, including the Centre for Translational Medicine (2025) and the Eric Kandel Centre for Precision Medicine (2026).

The SPSC Lab conducts research and teaching in speech processing, audio engineering, signal processing, computational intelligence, and circuits & systems modeling. It has played a key role in organizing INTERSPEECH 2019 and is leading the development of the Graduate School of Speech Language and AI Technologies within the Unite! University Alliance.

TU Graz is Austria’s oldest technical university, known for its high-impact research, student innovation, and vibrant startup ecosystem. It provides an inspiring work environment with excellent infrastructure and university support.

Diversity & Inclusion

Austrian universities are committed to increasing female representation, particularly in scientific and leadership roles. Qualified female candidates are strongly encouraged to apply. In case of equal qualifications, preference will be given to female applicants.

Compensation & Benefits

· PhD Candidate: €37,577.40 (annual gross, 75% position).

· Postdoctoral Researcher: €49,899.15 (annual gross, 75% position).

How to Apply

Send your application (including CV, motivation letter, transcript of records, at least two references, MSc/PhD thesis, and relevant publications) to:

📧 philipp.aichinger@meduniwien.ac.at and hagmueller@tugraz.at

📅 Applications will be accepted until the positions are filled.

6-18

(2025-02-10) Post doc and Research engineers positions at University of Marburg, Germany

For my new research group “AI – Multimodal Modelling and Learning” at the University of Marburg (in collaboration with hessian.AI, the Hessian Center for Artificial Intelligence), I am seeking candidates for the following positions:

1 Postdoc (max. 4+2 years)

2 Research Software Engineer (initially 2.5 years)

Position 1: Postdoc

(official job advertisement here: https://stellenangebote.uni-marburg.de/jobposting/283d40458a3570725bf80921a88ec09a44400883)

The position is offered for a period of 4 years (with the option of a 2-year extension upon successful evaluation), provided that no previous qualification periods have to be taken into account. The earliest starting date is April 1, 2025. The position is full-time, with salary and benefits commensurate with a public service position in the state of Hesse, Germany (TV-H E 13, 100 %).

Your tasks:

- Research and development of novel AI methods in the topic areas listed below (see „Your qualification“)

- Publication of research results in high-ranked international venues (A*/A/Q1)

- Acquisition of third-party funds for research projects (both in contributing and independent roles)

- Co-supervision of students and PhD students

- Teaching (lectures and/or seminars)

- Optional: Setting up your own junior research group

Your qualification:

- Completed university degree (Diploma, Master‘s or equivalent) in Computer Science

- Very good doctorate or evidence of being in the final stages of doctoral completion

- Demonstrated expertise in one or more of the following areas: computer vision, machine learning, multimodal computing, information retrieval, human-centered AI, semantic web, visual analytics

- Optional expertise in one of the following application domains: social media and disinformation, technology-enhanced learning, learning analytics, cognitive science, medical informatics, digital humanities

- Publications in internationally renowned computer science venues in at least one of the above-mentioned areas

- Excellent programming skills in common programming languages (Python, Java, etc.), experience with machine learning libraries

- Experience in supervising student theses and collaboration in joint publications

Position 2: Research Software Engineer (with Master or PhD degree)

This position has a focus on research software engineering and is offered for a period of 2.5 years (until September 30, 2027), subject to approval of funds. It is a full-time position with salary and benefits commensurate with a public service position in the state of Hesse, Germany (TV-H E 13, 100 %).

The position is part of the project „SportVid: A Portal for Supporting Search, Analysis and Evaluation of Videos in Sports and Training Science“, funded by the German Research Foundation (DFG) within the program „Scientific Library Services and Information Systems“ (LIS).

The project focuses on developing innovative solutions for the analysis of and search in training and sports videos. The project is a collaboration with the German Sport University Cologne (DSHS) and the Central Library of Sports Sciences (ZBS).

Your tasks:

- Implementation of state-of-the-art AI methods for video analysis (e.g. shot boundary detection, camera settings and movement detection), including researching and reviewing current scientific literature

- Implementation of state-of-the-art AI methods for sports video analysis (e.g. pose detection, recognition of sport-specific actions, sports field registration)

- Implementation of current methods for search and retrieval of training and sports videos

- Integration of developed software components into the web-based video analysis platform SportVid

- Development of infrastructure, frontend and backend functionalities using modern web frameworks

- Preparation of / collaboration on scientific publications

The position, subject to approval of funds, is temporary according to § 2, 2 WissZeitVG.

Your qualification:

- Completed university degree (Master‘s or equivalent) in a relevant field such as computer science, mathematics, or a comparable degree in a related discipline

- Strong programming skills, excellent knowledge of one or more modern programming languages (particularly Python, JavaScript), modern web technologies and databases

- Strong knowledge of machine learning methods (particularly deep learning), ideally in computer vision, alternatively natural language processing or information retrieval

- Experience with deep learning frameworks (PyTorch, TensorFlow)

- Experience with web application development

- Excellent command of written and spoken English

What we offer (both positions)

- Outstanding career development opportunities, e.g. towards becoming a research software engineer, mentoring and support for planning your professional career, support with grant applications

- An excellent and dynamically evolving research environment in the Department of Mathematics and Computer Science, including three newly established AI professorships

- Connection to hessian.AI (Hessian Center for Artificial Intelligence) with exceptional collaboration opportunities and high-performance computing resources for training large-scale AI models

- An excellent international and national research network (including connections to various institutes of the Leibniz Association and Fraunhofer Society)

- Funding for conference participation

- Hessian public transport ticket (Landesticket Hessen)

How to apply:

For position 1, please apply here: https://stellenangebote.uni-marburg.de/en/jobposting/283d40458a3570725bf80921a88ec09a444008830/apply

For position 2: Official application process is not available yet; if you would like to indicate your interest in the position, please send your CV to Prof. Ralph Ewerth (address below).

Contact

If you have any questions, please write to:

Prof. Dr. Ralph Ewerth

rewerth@informatik.uni-marburg.de

6-19

(2025-02-12) PhD position in neurocognition of language, Université de Lille (France) & Radboud University/Donders Institute (The Netherlands)

We invite applications for a PhD position in neurocognition of language. The aim of the project is to better understand the interplay between language comprehension and production by studying neurocognitive mechanisms in typical and neurological adult populations. The PhD student will be co-supervised by Anahita Basirat (Lille, France) and Vitória Piai (Nijmegen, Netherlands). The planned start date is 1 October 2025. The application deadline is 28 March 2025.

We are looking for a highly motivated and talented candidate with demonstrable experience in research. The successful candidate will be based in Lille with long-duration stays in Nijmegen and will carry out research as a member of two groups:
- Language group of SCALab at the University of Lille
- Language Function and Dysfunction group at the Donders Centre for Cognition
The studies will be conducted in French and potentially Dutch. Access to an EEG system and the opportunity to acquire expertise in state-of-the-art electrophysiological techniques are provided.

The criteria for selection include:
- a Master's degree in a relevant field, such as (neuro)psychology, cognitive neuroscience, cognitive science
- a keen interest in language, patient research, and electrophysiology
- experience in the field of psychology of language or speech science as well as electrophysiology techniques is desirable, but not mandatory
- very good proficiency in written and spoken French as well as English
Applications from excellent candidates with a profile that does not fully meet all criteria will also be considered.

Applications should include:
- a CV
- a cover letter (1 page)
- a summary of previous work (1 page)
- a copy of the Master’s diploma (if available) and transcripts
- names of two referees

Applications should be sent in a single PDF file

6-20

(2025-02-15) Full-time Postdoctoral position in Linguistics/Speech Therapy at Silesian University of Technology, Poland

Full-time Postdoctoral position in Linguistics/Speech Therapy at Silesian University of Technology, Poland

We offer a full-time post-doc research position in the 'Longitudinal investigation of sibilant articulation development in children: a statistical modeling approach based on instrumental evidence and data mining methods' project funded by National Science Center, Poland. The project aims to develop a statistical model describing the nature and rate of change in improving sibilant articulation based on parameters determined from speech audio and video recordings of preschool children's faces. The project is led by Principal Investigator Zuzanna Miodońska, PhD.

The employment at the Silesian University of Technology, Poland, will last 12 months and may be renewed up to a maximum of 36 months. The planned monthly salary is around 8800 PLN gross.

The post-doc is expected to execute the following tasks:

1. Participation in developing the articulation study protocol and measurement station; participation in developing language material from the perspective of phonetic analysis and data mining methods.

2. Participation in preparing and conducting multimodal data registration in a group of preschool children, segmentation, and description of data.

3. Development of guidelines for acoustic and phonetic data analysis, development and execution of speech signal analysis protocols, description of articulation patterns occurring in the study group.

4. Participation in designing and verifying models describing the collected data, interpretation and description of results.

5. Preparation of reports and publications.

Requirements:

1) a doctoral degree in the discipline of linguistics or related, obtained in the year of employment in the project or in the period of 7 years before 1 January of the year of employment in the project. This period may be extended by the time spent on long-term (over 90 days) documented sickness or rehabilitation periods or by the number of months spent on leave related to the care and upbringing of children.

2) experience and knowledge in the field of research on speech development and articulation in children, language acquisition, phonetics and phonology of the Polish language; experience in the field of recording, describing and analyzing articulation data and speech signal is welcome, as well as previous contact with statistical modeling methods in linguistics.

3) scientific experience in conducting research in the discipline of linguistics or biomedical engineering or a related discipline, confirmed by co-authorship/authorship of peer-reviewed publications and presentations at scientific conferences;

4) knowledge of English in speech and writing, allowing for the preparation of scientific publications,

5) in the case of foreigners, fluent knowledge of Polish in speech and writing,

6) due to the duties in the project, necessary experience and predispositions to work with children, as well as meeting the legal requirements for working with children.

Application submission:

Please send an email to zuzanna.miodonska@polsl.pl with the following documents:

1. Curriculum vitae detailing scientific achievements, in particular a list of publications, research projects, professional experience, and other information relevant to the project. In the CV, please include a statement about your level of English and Polish and the clause 'I consent to the processing of my personal data by the Silesian University of Technology to conduct recruitment for the position I have applied for.'

2. Cover letter.

3. Opinion about the applicant prepared by the head of the research team, the supervisor of the doctoral thesis, or the head of the department/faculty/institute where the applicant works or worked.

4. Copies of the three most important publications (co-)authored by the person applying for employment (in the case of multi-authored works, a description of the applicant's contribution should be included).

If you have any questions concerning the project or employment, you are welcome to contact us by email at zuzanna.miodonska@polsl.pl.

6-21

(2025-02-20) Associate/Assistant Professor position @ Radboud University in Nijmegen, The Netherlands

A truly interesting job opportunity:

At Radboud University in Nijmegen, NL, we will be hiring an

Associate/Assistant Professor: Language and Speech Technology.

Applications welcome until 16 March!

See for more information https://lnkd.in/eJGVmAXH

6-22

(2025-02-21) Academic positions at LS2N, Nantes Université, France

The LS2N (Laboratoire des Sciences du Numérique de Nantes) at Nantes Université is opening several MCF (associate professor) positions and one PR (full professor) position in 2025. The detailed profiles for these positions are available on the Nantes Université website (https://www.univ-nantes.fr/universite/recrutement/campagne-synchronisee-ec-2024-recrutement-de-60-enseignants-chercheurs) or directly on the Odyssée platform (https://odyssee.enseignementsup-recherche.gouv.fr).

Among these positions, the following allow joining the TALN (Natural Language Processing) team:

- MCF 27 - Polytech Nantes

https://odyssee.enseignementsup-recherche.gouv.fr/procedures/recrutement-ec/offres-poste/fiche-offre-poste/250986

- MCF 27 art29-BOE - IUT Nantes - Computer Science Department

https://odyssee.enseignementsup-recherche.gouv.fr/procedures/recrutement-ec/offres-poste/fiche-offre-poste/251115

- PR 27 - Faculté des Sciences et Techniques - Computer Science Department

https://odyssee.enseignementsup-recherche.gouv.fr/procedures/recrutement-ec/offres-poste/fiche-offre-poste/250942

The contacts for each position and the teaching profiles are available on each job description page.

Do not hesitate to contact me for any information about joining the TALN team.

More information about the TALN team is available here: http://taln.ls2n.fr

6-23

(2025-02-23) Associate professor (maître de conférences) position in Artificial Intelligence for the Humanities and Social Sciences, Sorbonne, Paris, France

The position requires a high level of scientific excellence in generative and analytical Artificial Intelligence for the humanities and social sciences, and recognized skills in building and using large language models (LLMs). Several application areas in the humanities and social sciences are favored, such as knowledge engineering and modeling, and automatic speech and language processing. The interest in applications of Artificial Intelligence to the humanities and social sciences is one of the distinctive features of computer science teaching at the Faculty of Letters of Sorbonne Université. The candidate will teach computer science and Artificial Intelligence in the bachelor's programs (language sciences with a computer science and Artificial Intelligence option) and master's programs (Langue et Informatique) of the department of Informatique, Mathématiques et Linguistique appliquées, as well as Pix (digital skills) for students of the Faculty of Letters.

Research
The candidate will join the Computational Linguistics team of the Sens Texte Informatique Histoire laboratory (STIH, EA 4509), which conducts disciplinary and multidisciplinary research in Artificial Intelligence for the humanities and social sciences with other teams of the laboratory and with other Sorbonne Université laboratories. In this context, the candidate must demonstrate an excellent command of approaches for modeling, analyzing, and generating content in the humanities and social sciences, notably linguistic and interactional content, including the architectures of large language models (LLMs) and of knowledge-based RAG (Retrieval-Augmented Generation). The candidate's research must have taken into account the cultural, ecological (frugal AI), and normative aspects of Artificial Intelligence research. The candidate must present a research program that fits within these themes and issues.

Research: claude.montacie@sorbonne-universite.fr
Teaching: laurence.devillers@sorbonne-universite.fr
UFR: maria-victoria.eyharabide@sorbonne-universite.fr

6-24

(2025-02-25) Post-doc @ 52-Herz, France

The young French start-up 52-Herz, which is collaborating with INRIA and IFREMER to develop an underwater communication device for divers, is recruiting a post-doc to work on processing the distortion of divers' speech underwater. The company has embedded computing power available for this task and is working on denoising as well as on recovering plosion effects.

Here is the INRIA post-doc offer:

https://jobs.inria.fr/public/classic/fr/offres/2025-08624

Note that applications close on March 31.

6-25

(2025-03-01) Funded PhD proposal (ANR FRENCHMELO), Aix-Marseille, France

Funded PhD proposal (ANR FRENCHMELO)

Contact: Amandine Michelas [amandine.michelas@univ-amu.fr] and Sophie Dufour [sophie.dufour@univ-amu.fr]

Location: Laboratoire Parole et Langage (LPL, CNRS and Aix-Marseille Université)

Applications accepted until April 30, 2025 (send a CV).

Title: Bilingualism: an asset for stress processing?

Proposal: It is well known that native French speakers have difficulty discriminating two words that differ in stress position (such as the Spanish words bebé "baby" and bébe "he/she drinks"). The aim of this thesis is to better understand these difficulties through the lens of bilingualism. In particular, we will examine how acquiring a language with lexical stress, either as a first language (e.g. Spanish-French bilinguals) or as a second language (e.g. French-Spanish bilinguals), affects listeners' ability to process stress differences. From a societal perspective, this thesis will help clarify the difficulties native French speakers face when learning foreign languages and will thus have implications for language teaching.

6-26

(2025-03-02) Post-doc position, University of Geneva, Switzerland

Postdoc Position
We invite applications for a post-doc position at the Faculty of Psychology and Educational Science to
work on an SNSF research project on language production in typical young and older adults and in
patients suffering from aphasia following stroke, with behavioural and neuroimaging approaches.
Qualifications requested:
- PhD in the field(s) of cognitive psychology of language, and/or neuropsychology of language, and/or
neuroscience of language
- Solid experience with neuroimaging techniques (EEG/MEG)
- Good programming skills (Python, Matlab, R)
Starting date: September 2025 or later
Applicants should send (i) a cover letter with statement of research experience, interests, and career
plan, (ii) a CV with the names of two possible referees, and (iii) a ~300-word description of previous
research and publications by April 13th, 2025, to:
Marina Laganaro, marina.laganaro@unige.ch

6-27

(2025-03-04) Associate professor (maître de conférences) position at LaBRI, Bordeaux, France

The Laboratoire Bordelais de Recherche en Informatique (LaBRI) is opening an associate professor (MCF) position in the Traitement et Analyse de Données (TAD) team of the Image et Son (I&S) department.

For further information, contact Jean-Luc Rouas, www.labri.fr/~rouas

6-28

(2025-03-05) Associate professor (MCF) position in psychology of language and neurocognition, Université de Lille, France

An associate professor (MCF) position in psychology of language and neurocognition is open at the Université de Lille:

https://www.univ-lille.fr/fileadmin/user_upload/Universite/travailler_a_l_universite/R.H_postes_EEC/Enseignants_Chercheurs/Fiches_de_postes_Synchro_2025/16_MCF_252149.pdf
ODYSSEE no.: 252149

The successful candidate will join the Langage team of the SCALab laboratory, UMR 9193 (https://scalab.univ-lille.fr/laboratoire/equipes-de-recherche/langage/).

6-29

(2025-03-10) Post-doc Researcher @ University of Marburg, Germany

For my new research group “AI – Multimodal Modelling and Learning” at the University of Marburg (in collaboration with hessian.AI, the Hessian Center for Artificial Intelligence), I am offering a position (4+2 years) for a

Postdoctoral Researcher

(official job advertisement here: https://stellenangebote.uni-marburg.de/jobposting/283d40458a3570725bf80921a88ec09a44400883)

The position is offered for a period of 4 years (with the option of a 2-year extension upon successful evaluation), provided no prior qualification periods need to be taken into account. The earliest starting date is April 1, 2025. The position is full-time, with salary and benefits commensurate with a public service position in the state of Hesse, Germany (TV-H E 13, 100 %).

Your tasks:
- Research and development of novel AI methods in the topic areas listed below (see "Your qualification")
- Publication of research results in high-ranked international venues (A*/A/Q1) 
- Acquisition of third-party funds for research projects (both in contributing and independent roles)
- Co-supervision of students and PhD students 
- Teaching (lectures and/or seminars)
- Optional: Setting up your own junior research group

Your qualification:
- Completed university degree (Diploma, Master's or equivalent) in Computer Science
- Very good doctorate or evidence of being in the final stages of doctoral completion
- Demonstrated expertise in one or more of the following areas: computer vision, machine learning, multimodal computing, information retrieval, human-centered AI, semantic web, visual analytics
- Optional expertise in one of the following application domains: social media and disinformation, technology-enhanced learning, learning analytics, cognitive science, medical informatics, digital humanities
- Publications in internationally renowned computer science venues in at least one of the above-mentioned areas
- Excellent programming skills in common programming languages (Python, Java, etc.), experience with machine learning libraries 
- Experience in supervising student theses and collaboration in joint publications

What we offer
- Outstanding career development opportunities, e.g. towards becoming a research software engineer, mentoring and support for planning your professional career, support with grant applications
- An excellent and dynamically evolving research environment in the Department of Mathematics and Computer Science, including three newly established AI professorships
- Connection to hessian.AI (Hessian Center for Artificial Intelligence) with exceptional collaboration opportunities and high-performance computing resources for training large-scale AI models
- An excellent international and national research network (including connections to various institutes of the Leibniz Association and Fraunhofer Society)
- Funding for conference participation
- Hessian public transport ticket (Landesticket Hessen)
                
How to apply: 
Please apply here: https://stellenangebote.uni-marburg.de/en/jobposting/283d40458a3570725bf80921a88ec09a444008830/apply

Contact
If you have any questions, please write to:
Prof. Dr. Ralph Ewerth
rewerth@informatik.uni-marburg.de
Please *do not send applications via e-mail*.

6-30

(2025-03-10) Research Scientist/Postdoc at the School of Computer Science at Carnegie Mellon University, Pittsburgh, PA, USA

Research Scientist/Postdoc at the School of Computer Science at Carnegie Mellon University:

We are looking for a highly motivated and talented research scientist/postdoc candidate in multimodal human behavior modeling in real-world contexts and applications. We are looking for candidates with strong ML and CV expertise who are excited to expand their experience to topics related to AI for Healthcare.

The ideal candidate must have a PhD in Computer Science or a related field and a strong track record of publications in top-ranked ML/CV venues.

Location: Carnegie Mellon University.
Work type: Full time.
Anticipated Start Date: Now.
Position Duration: 1-2 years. Initial contract is for one year. Second year contract is based on performance.
Application: If interested, please submit a single PDF file titled FirstNameLastName.pdf, including:
 1- A brief letter of application, describing your qualifications and relevant experience to the
 position of interest (with expected date of availability),
 2- A detailed CV including a list of publications and two recent representative publications,
 3- Three reference letters (sent separately by the referees).

Please visit the job details page for more information and submit the one single PDF file with all requested information (Points 1-3):
https://cmu.wd5.myworkdayjobs.com/en-US/CMU/job/ROB---HAMMAL---Postdoctoral-Fellow_2022833

Thank you
Zakia Hammal

6-31

(2025-03-31) PhD position at INRIA, France

Inria, the French national institute for research in digital science and
technology, is opening a fully-funded PhD position on differential
diagnosis of heart attack from speech:
https://jobs.inria.fr/public/classic/en/offres/2025-08716 (deadline:
April 30).

Candidates shall apply online at their earliest convenience.
Applications will be assessed on a rolling basis.

6-32

(2025-03-31) 3 PhD positions @ EURECOM, Sophia Antipolis, France

3 PhD positions in speech deepfake detection and automatic speaker verification at EURECOM

The Audio Security and Privacy Group at EURECOM, France has openings for 3 PhD candidates in speech deepfake detection and automatic speaker verification (ASV). If you have a Master's degree, an excellent academic track record, strong proficiency in English, and expertise in computer science, machine learning, artificial intelligence, data science, speech processing, deepfake/spoofing detection, text-to-speech synthesis or voice conversion, and are keen on international collaboration, please consider applying.

Topics include, but are not limited to:
- Integrated solutions to spoofing robust ASV
- Trojan back-door attacks against ASV systems
- Watermarking of synthetic speech and converted voice
- Source tracing for spoofing/deepfake attacks
- Adversarial attacks
- Audio-visual deepfake detection

For these particular PhD positions, applications may undergo administrative security checks in compliance with French law and regulations. Restrictions on certain nationalities may apply.

In the first instance, please send your CV by email to Nicholas Evans (evans@eurecom.fr) with the subject line 'PhD opportunities'.

Learn about EURECOM by visiting our website https://www.eurecom.fr and about other job opportunities at https://www.eurecom.fr/en/eurecom/job-opportunities/job-opportunities.

6-33

(2025-04-01) PhD position, Leiden University, Netherlands

We have opened a PhD position in French linguistics at Leiden (LUCL), starting in September 2025, to work on the phonology/phonetics of French-based creoles. Together with Jenny Doetjes, we are looking for French-speaking candidates with a master's degree in linguistics, ideally in phonology/phonetics, who would also be able to teach in the French department (language, linguistics). Do you think this might interest your students? The application deadline is March 31 (so very soon), but we will request an extension, so there should be a little more time. The announcement is available here:
https://www.universiteitleiden.nl/vacatures/2025-nl/q1/15526phd-candidate-in-french-linguistic

6-34

(2025-04-02) PhD position at the Université de Bretagne Occidentale, Brest, France

Call for applications for a doctoral contract (3 years, starting October 1, 2025)

Location

Université de Bretagne Occidentale (Brest)

Laboratoire de Traitement de l'Information Médicale (LaTIM - UMR 1101)
Doctoral school (ED) Sciences de la Vie et de la Santé

Title (provisional)
Detection of prosodic and lexical parameters predictive of synchronization during interactions between therapist and child patient (ASD) in the context of Exchange and Development Therapy (Thérapie d'Échange et de Développement, TED).

Keywords: lexicon - phonology - synchronization - autism spectrum disorder (ASD)

Context

Autism spectrum disorders (ASD) affect one child in 100 worldwide (Zeidan et al., 2022). These neurodevelopmental conditions are characterized by some degree of difficulty in social interaction and communication. The heterogeneity of ASD therefore calls for personalized and adaptable therapeutic strategies. Advances in the understanding of ASD have highlighted the importance of early intervention, which is essential for improving the long-term social and communication skills of those affected. Despite the variety of early interventions on offer, clinicians' main goal is synchronization with interlocutors (for example, through eye contact), and the search for factors that predict this synchronization is a major challenge in early care (Lord et al., 2022).

Exchange and Development Therapy (Thérapie d'Échange et de Développement, TED) was developed at the CHU de Tours in the 1980s (Barthélémy, Hameury & Lelord, 1995) to rehabilitate the functions underpinned by the brain systems of social communication (attention to others, intention, imitation, etc.). This rehabilitative therapy takes place in playful sessions adapted to the child's developmental profile and is particularly indicated before the age of four, the period of maximal brain plasticity. The main goal of these sessions is to elicit synchronizations between the patients and the therapists (eye contact, imitation, and adjusted gestures). A longitudinal study of children with ASD for whom TED was the main component of the therapeutic plan showed improved exchange and communication abilities in the context of severe autism associated with developmental delay (Blanc et al., 2013).

The changes induced by TED (behavior, development, functioning) are regularly measured using the Behavior Summarized Evaluation scale (standardized behavioral and psychological assessments), completed during individual TED sessions and also by the child's caregivers in group educational settings. These multiple assessments give a better understanding of the child through observation and by identifying their interests and preferences, which in turn makes it possible to choose the most engaging games and activities for individual TED sessions and thus maximize opportunities for synchronization. In the caregivers' experience, the intensity of their synchronizations with the children is a key indicator of future progress in social skills.

Objective of the thesis

The aim of the thesis is to contribute to a fine-grained characterization of synchronizations at the specifically linguistic level, in particular in their prosodic and lexical components. The corpus will be built from the database used by the ANR TEDIA project and will consist of 100 ten-minute excerpts of TED interactions (CHU de Tours). The main challenge will be to identify linguistic parameters, primarily prosodic and lexical, that are precursors of synchronization. To this end, processing the corpus will require: (i) categorizing prosodic events; (ii) transcribing the exchanges orthographically, including verbal speech markers; (iii) associating linguistic parameters with the synchronizations already identified and potentially identifying new synchronizations of a linguistic nature.

Expected candidate profile

1) Master's degree (Master 2) in language sciences or psychology, or a speech-language therapy diploma

2) Knowledge of language sciences (phonology, lexicon)

3) Knowledge of neurodevelopmental disorders

Application

Interested candidates are invited to send their application (CV + cover letter) to Gwenolé Quellec (supervisor, gwenole.quellec@univ-brest.fr), Laura Machart (co-supervisor, laura.machart@univ-brest.fr) and Thomas Bertin (co-supervisor, thomas.bertin@univ-brest.fr) before May 20, 2025.

6-35

(2025-04-02) Engineer @ Intelligent Systems and Robotics at Sorbonne University (Paris)

The Institute for Intelligent Systems and Robotics at Sorbonne University (Paris) is looking for a highly motivated and ambitious engineer or postdoctoral researcher to conduct research in machine learning for human-robot collaboration.

Context and objectives

This position focuses on developing machine learning techniques to enhance human awareness in human-robot interaction by integrating situation assessment and action planning.
The successful candidate will contribute to cutting-edge research at the intersection of robotics, artificial intelligence, and human interaction, with an emphasis on designing and evaluating robotic systems that facilitate seamless collaboration with humans.

The position is an 18-month contract, with a possible extension depending on performance and circumstances.
The position is open at both the engineer and post-doctoral levels for candidates with a strong background in machine learning, human-machine interaction, or robotics.

Responsibilities:
    • Develop advanced situation assessment techniques using machine learning to accurately represent user preferences, behaviors, and characteristics based on multimodal data to efficiently plan actions.
    • Collaborate with interdisciplinary teams, including computer scientists, experts from the humanities, and designers, to ensure the usability and effectiveness of the developed techniques.
    • Publish research findings in top-tier conferences and journals in the field of Human-Machine Interaction and Machine Learning (mainly at the post-doc level)


Requirements   
The successful candidate should have: 
    • Experience in human-machine interaction
    • Good knowledge of Machine Learning Techniques
    • Good knowledge of experimental design and statistics 
    • Excellent publication record
    • Strong skills in Python 
    • Willingness to work in multi-disciplinary and international teams
    • Good communication skills

Application 
Interested candidates should submit the following by email in a single PDF file to: mohamed.chetouani[@]sorbonne-universite.fr with the subject: Application ML for Human-Robot Collaboration

    • Curriculum vitae with 2 references (recommendation letters are also welcome) 
    • One-page summary of research background and interests 
    • At least three papers (either published, accepted for publication, or pre-prints) demonstrating expertise in one or more of the areas mentioned above 
    • Doctoral dissertation abstract and the expected date of graduation, for applications at the post-doc level (for those who are currently pursuing a Ph.D.)

Application deadline: April 21, 2025.

6-36

(2025-04-10) PhD position @ Laboratoire Informatique d’Avignon (LIA) or Laboratoire des Sciences du Numérique de Nantes (LS2N), France


We are offering a PhD position on the topic: 'Automatic Extraction and Structuring of Cultural Events: An Efficient and Frugal Approach to Connect ICC Stakeholders and Their Audiences.'

Location: Laboratoire Informatique d’Avignon (LIA) or Laboratoire des Sciences du Numérique de Nantes (LS2N)
Start Date: Fall 2025
Funding: 3 years (subject to approval by PEPR ICCARE, ~€1700-1900/month net)
Supervisors: Vincent Labatut vincent.labatut@univ-avignon.fr (LIA), Richard Dufour richard.dufour@univ-nantes.fr (LS2N)
Industrial Partner: Thomas Chenevier thomas@ideactiv.com, ideactiv (https://www.ideactiv.com)

Objective
Develop an automatic system to extract and structure cultural event information from the websites of sector stakeholders (festivals, museums, performance venues, etc.). The approach should be efficient and frugal, minimizing errors and hallucinations from language models.

Research Areas
- Information Extraction & NLP (Named Entity Recognition, Disambiguation, etc.)
- Analysis of Unstructured Web Content
- Machine Learning Models

Candidate Profile
Master’s degree or engineering diploma in computer science (or equivalent), with experience in NLP, machine learning, and/or software engineering. Strong English proficiency, autonomy, and teamwork skills are essential.

Application Process
Send your CV, academic transcripts, recommendation letters, and a motivation letter (specific to the topic) to the supervisors listed above.

More information and detailed document here: https://uncloud.univ-nantes.fr/index.php/s/WagAfz8f3MSy6Zj

6-37

(2025-04-10) 3-year PhD position @LORIA's MultiSpeech Team (Université de Lorraine) and LIA's Speech and Language Group (Avignon University), France

In the context of the ENACT AI cluster, a funded three-year PhD position will open with LORIA's MultiSpeech Team (Université de Lorraine) and LIA's Speech and Language Group (Avignon University):

'Social-behavior-aware chatbot for a communication skills coaching of medical students'
Supervision: Irina Illina, Patrice Gallet and Fabrice Lefèvre

Full description and application form available at https://doctorat.univ-lorraine.fr/fr/les-ecoles-doctorales/iaem/offres-de-these/enact-chatbot-axe-sur-le-comportement-social-pour-le

Applications should include (in PDF):
- cover letter (1 page max)
- resume
- summary of previous work (1 page)
- Master’s diploma (if available) and transcripts for all years
- two referees to contact

Applications close on April 24.

6-38

(2025-04-13) PhD position @Laboratoire Informatique d’Avignon (LIA) and/or Institut de Recherche en Informatique de Toulouse (IRIT), France

We are offering a PhD position in computer science on the topic "Neural-approach-based analyses of degraded speech in the context of speech disorders, with a view to its restoration", co-supervised by Corinne Fredouille (LIA) and Julie Mauclair (IRIT).

Locations: Laboratoire Informatique d’Avignon (LIA) and/or Institut de Recherche en Informatique de Toulouse (IRIT)
Start: October 1, 2025

More details and application here: https://adum.fr/as/ed/voirproposition.pl?site=adumR&matricule_prop=64829

6-39

(2025-04-23) PhD position at Trinity College, Dublin, Ireland

Background:

It is known that many modalities (e.g. articulation, mouth movements, eye gaze, head nods, back channels and gestures) play a role in communication success in speech-based interaction. Lombard speech is one way we modify the sound of our voices to ensure we communicate clearly with others in noisy environments, but recent research has shown that Lombard speech in reality is multimodal. This PhD student will focus on approaches to tracking a conversation that incorporate knowledge from the visual modality of speech to support understanding in noisy conditions. By gaining deeper insights into how multimodal cues are exploited in noisy conditions, the research will look at approaches to architectures that approach the level of flexibility humans have in adapting to changing communication challenges. This PhD research is part of a larger SFI Frontiers project SpeechSpace focussed on multimodal speech and led by Prof. Harte. The funding available will cover fees and stipend, but also equipment and conference travel associated with the research.

Person Requirements:

• Primary or Master’s degree in Electronic Engineering or closely related discipline, with an interest in engaging in multidisciplinary research
• Must meet TCD University requirements at https://www.tcd.ie/study/apply/admission-requirements/postgraduate/
• Strong skills in coding, machine learning, deep learning (basics is fine), and signal processing with the willingness and motivation to learn new skills and packages
• Prior experience with speech-based interaction desirable
• Excellent communication skills, both spoken and written, and fluency in English

The position:

• Prof Harte’s research group brings together a diverse and friendly group of people who are all interested in pursuing research in speech-related topics and like to share ideas and learn from one another. As such, this is a fully in-person position for someone who wants an in-person PhD experience and would like to contribute to our group both technically and socially.

To apply for the position, please complete the following application form:

https://forms.gle/YAif8DfV7eSVRdxM8

Informal inquiries only to Prof Naomi Harte at nharte@tcd.ie

6-40

(2025-04-25) Post-doc position @INRIA Paris, France

We are inviting applications for a postdoctoral research position (F/M) at Inria Paris, within the ALMAnaCH team, as part of a large international project on adaptive personality in conversational agents.

The project investigates how interpersonal factors shape the verbal and nonverbal display of personality in both humans and embodied conversational agents. The position involves collecting and analyzing a corpus of multiparty dialogues, building machine learning models, and implementing these models in interactive agents.

Key responsibilities:

Conduct a literature review and maintain state-of-the-art knowledge

Lead the collection, annotation, and dissemination of a multimodal corpus

Analyze the influence of psychological, linguistic, and interpersonal factors on the expression of personality

Develop machine learning models and implement them in embodied agents

Write scientific publications and project reports

Mentor younger scholars (PhD students, master’s students, engineers)

Required qualifications:

PhD in computational linguistics, computer science, cognitive science, or a related field

Strong background in dialogue systems, conversational agents, or multimodal interaction

Experience with machine learning methods

Proven publication record in top venues

Ability to work in a team and manage junior researchers

High proficiency in written and spoken English (French is a plus)

Job details:

Contract: Fixed-term (1 year), renewable

Location: Inria Paris (France)

Start date: Ideally June 1, 2025

Application deadline: May 31, 2025

Application platform: Apply via Inria website or by email at sophie.etling@inria.fr

To apply, please send:

- A cover letter describing your relevant experience

- A CV

- Names and contact information for 3 referees (Please note: we will not accept any letters of recommendation sent by the candidate. You must send only the contact details for your referees.)

We warmly encourage applications from candidates of all backgrounds, especially those with interdisciplinary experience.

6-41

(2025-09-24) Workshop on Advances in AMR and Semantic Analysis, SIR@IWCS2025 - Düsseldorf - September 24 2025

=== Workshop SIR ===

Workshop on Advances in AMR and Semantic Analysis
SIR@IWCS2025 - Düsseldorf - September 24 2025

=================================

	https://team.inria.fr/semagramme/first-workshop-on-semantics-for-interdisciplinary-research/

	https://openreview.net/group?id=inria.fr/INRIA/S%C3%A9magramme/2025/SIR01

=================================

In recent years, Natural Language Processing (NLP) has increasingly intersected with the humanities and social sciences, offering new methodologies for analyzing textual data, interpreting meaning, and modeling language-based phenomena. The potential for multi-disciplinary research using NLP methods is particularly great in computational semantics (CS), as its ability to process and represent meaning opens up innovative pathways for researchers in history, philosophy, literary studies, political science, etc. This workshop aims to explore how semantic models and tools can be leveraged to tackle traditional and emerging questions in the Humanities in a broader sense (Social Sciences, Law, Economics, Management, Literature, Languages, Art, …).

A major theme of SIR is the role of semantics in NLP applied to the humanities (both statistical and symbolic approaches).

=== Topics to Explore ===
    • CS and the humanities: issues, tools and applications.
    • Quantitative and qualitative approaches as a breakthrough in the Humanities
    • NLP transforming humanities issues
    • Contributions and limitations for understanding meaning
    • Links between formal semantics and neural models
    • Ambiguity, polyphony and interpretation in the Humanities
    • Ethics and bias in semantic modeling
    • Interdisciplinary dialogue between AI, NLP and Humanities
 
=== Dates ===
    • Deadline : July 14th (anywhere on earth)
    • Notification : August 25th (anywhere on earth)
    • Camera Ready : September 10th (anywhere on earth)
    • Workshop : September 24th (anywhere on earth)
 
=== Submission Information ===
Papers should describe original research and must not exceed 4 pages (with an extra page in the camera ready version for accepted papers). Papers should be submitted no later than 14 July 2025 (anywhere on earth).
 
Accepted papers will be published in the conference proceedings in the ACL Anthology. For inclusion in the proceedings, at least one author must register for the conference and present the paper in person.
 
Submissions should be fully anonymous to ensure double-blind reviewing.
 
=== Submission ===
https://openreview.net/group?id=inria.fr/INRIA/S%C3%A9magramme/2025/SIR01
 
=== Style Files ===
The workshop follows the IWCS 2025 template; see the workshop web page.
 
=== Organizers ===
Maxime Amblard, Université de Lorraine
Ellen Breitholtz, Gothenburg University

=== Contact ===
maxime.amblard@univ-lorraine.fr and ellen.breitholtz@ling.gu.se

6-42

(2025-04-30) PhD position (CNRS) , Bordeaux Computer Science Research Laboratory (LaBRI), France

Hello everyone,

Together with Nicolas Audibert, we have obtained funding from the CNRS Mission for Interdisciplinarity (mission pour l’interdisciplinarité) for a PhD thesis on the speech of care assistants in nursing homes (EHPAD).

We welcome applications from candidates with either a computer science/automatic speech processing profile or a linguistics/language sciences profile.

The application deadline is May 16.

The details of the offer and the application procedure are available on the CNRS portal: https://emploi.cnrs.fr/Offres/Doctorant/UMR5800-JEAROU-002/Default.aspx

Do not hesitate to contact me for any further information.

Best regards,

CNRS Researcher
Bordeaux Computer Science Research Laboratory (LaBRI)
351 Cours de la libération - 33405 Talence Cedex - France
T. +33 (0) 5 40 00 35 28
www.labri.fr/~rouas


© Copyright 2026 - ISCA International Speech Communication Association - All right reserved.
