ISCA - International Speech
Communication Association

ISCApad #302

Friday, August 11, 2023 by Chris Wellekens

6-13 (2023-03-09) Doctoral position : Acoustic to Articulatory Inversion by using dynamic MRI images @LORIA, Nancy, France

Loria (Lorraine Research Laboratory in Computer Science and its Applications) is a research unit common to CNRS, the Université de Lorraine and INRIA. Loria gathers 450 scientists, and its mission is mainly fundamental and applied research in computer science. Within Loria, the MultiSpeech team focuses on automatic speech processing, audiovisual speech and speech production. IADI is a research unit common to Inserm and the Université de Lorraine whose specialty is developing techniques and methods to improve imaging of moving organs via the acquisition of MR images.

This PhD project, funded by LUE (Lorraine Université d'Excellence), associates the MultiSpeech team and the IADI laboratory.

The expected start date is 1 September 2023, or as soon as possible thereafter.

Supervisors

Yves Laprie, email yves.laprie@loria.fr

Pierre-André Vuissoz, email pa.vuissoz@chru-nancy.fr

The project

Articulatory synthesis mimics the speech production process by first generating the shape of the vocal tract from the sequence of phonemes to be pronounced, then the acoustic signal by solving the aeroacoustic equations. Compared with other approaches to speech synthesis, which offer a very high level of quality, its main interest is control over the whole production process, beyond the acoustic signal alone.

The objective of this PhD is to succeed in the inverse transformation, called acoustic to articulatory inversion, in order to recover the geometric shape of the vocal tract from the acoustic signal. A simple voice recording will allow the dynamics of the different articulators to be followed during the production of the sentence.

Beyond its interest as a scientific challenge, acoustic to articulatory inversion has many potential applications. On its own, it can be used as a diagnostic tool to evaluate articulatory gestures in an educational or medical context.

Description of work

The objective is the inversion of the acoustic signal to recover the temporal evolution of the medio-sagittal slice. Indeed, dynamic MRI provides very good quality two-dimensional images in the medio-sagittal plane at 50 Hz, and the speech signal acquired with an optical microphone can be very efficiently deconstructed with the algorithms developed in the MultiSpeech team (examples available on https://artspeech.loria.fr/resources/). We plan to use corpora already acquired or in the process of being acquired. These corpora represent a very large volume of data (several hundreds of thousands of images), and an approach for tracking the contours of articulators in MRI images, which gives very good results, has been developed to process them. The automatically tracked contours can therefore be used to train the inversion. The goal is to perform the inversion using an LSTM (long short-term memory) approach on data from a small number of speakers for which sufficient data exists. This approach will have to be adapted to the nature of the data and be able to identify the contribution of each articulator. In itself, successful inversion to recover the shape of the vocal tract in the medio-sagittal plane will be a remarkable success, since current results only cover a very small part of the vocal tract (a few points on the front part of the vocal tract). However, it is important to be able to transpose this result to any subject, which raises the question of speaker adaptation, the second objective of the PhD.
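To make the 50 Hz image rate concrete, the following is a minimal sketch of how audio samples could be paired with MRI images so that each tracked contour has a matching slice of the speech signal for training. It is an illustration only, not the teams' actual pipeline: the 16 kHz audio sampling rate and the function name audio_windows_per_image are assumptions that do not appear in the posting.

```python
from typing import List, Tuple

MRI_IMAGE_RATE_HZ = 50           # image rate stated in the posting
ASSUMED_AUDIO_RATE_HZ = 16_000   # assumption: not given in the posting


def audio_windows_per_image(
    num_audio_samples: int,
    audio_rate_hz: int = ASSUMED_AUDIO_RATE_HZ,
    image_rate_hz: int = MRI_IMAGE_RATE_HZ,
) -> List[Tuple[int, int]]:
    """Return one (start, end) audio sample range per MRI image.

    Each image spans 1 / image_rate_hz seconds (20 ms at 50 Hz), so it is
    paired with audio_rate_hz // image_rate_hz consecutive audio samples.
    Trailing samples that do not fill a whole image are dropped.
    """
    if audio_rate_hz % image_rate_hz != 0:
        raise ValueError("audio rate must be a multiple of the image rate")
    samples_per_image = audio_rate_hz // image_rate_hz
    num_images = num_audio_samples // samples_per_image
    return [
        (i * samples_per_image, (i + 1) * samples_per_image)
        for i in range(num_images)
    ]


# One second of audio at the assumed 16 kHz rate covers 50 images.
windows = audio_windows_per_image(16_000)
print(len(windows), windows[0], windows[-1])
```

Under these assumptions, each image is paired with 320 audio samples. A sequence model such as an LSTM would then take acoustic features computed on these windows as input and the tracked contour of the corresponding image as target.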

What we offer

A position funded by LUE (Lorraine Université d'Excellence) at a leading technical university that generates knowledge and skills for a sustainable future.
A highly complementary scientific environment across the two teams: MRI and anatomy in the IADI laboratory, and deep learning and speech processing in the MultiSpeech team of Loria.
Engaged and ambitious colleagues along with a creative, international and dynamic working environment.
At Loria, there are lively research groups in a number of areas, for example natural language processing, deep learning, computer graphics and robotics. At the moment, there are about 150 PhD students at Loria and IADI.
Work in the very center of Europe, in close proximity to nature.
Help with relocating and settling in France and at Université de Lorraine.

Application

Your application including all attachments must be in English and submitted electronically by clicking APPLY NOW below.

Please include:

Letter of motivation (max. one page)
Your motivation for applying for the specific PhD project
Curriculum vitae including information about your education, experience, language skills and other skills relevant for the position
Publication list (if possible)
Reference letters (if available)

The deadline for applications is April 15, 2023, 23:59 GMT+2.

Log into Inria's recruitment system (https://jobs.inria.fr/public/classic/en/offres/2023-05790) to apply for this position.
