ISCA Services

ISCA - International Speech
Communication Association


ISCApad #308

Saturday, February 10, 2024 by Chris Wellekens

6-5 (2023-09-10) PhD position, MIAI, Université de Grenoble, France

Job Offer: PhD position on self-supervised models for transcribing the spontaneous speech of 3- to 6-year-old children in French

Starting date: between October 1st and December 1st, 2023 (flexible)

Application deadline: from now until the position is filled

Interviews: from September, or later if the position is still open

Salary: ~2000€ gross/month (social security included)

Mission: research-oriented (teaching possible but not mandatory)

Place of work: Laboratoire d'Informatique de Grenoble, CNRS, Grenoble, France

Keywords: deep learning, natural language processing, speech recognition for children's voices, documentation of language development

Description

As part of the Artificial Intelligence & Language Chair at the Multidisciplinary Institute in Artificial Intelligence (MIAI; https://miai.univ-grenoble-alpes.fr/research/chairs/perception-interaction/artificialintelligence-language-850480.kjsp?RH=6499588038450843), we offer a PhD thesis topic devoted to the enriched automatic transcription of the spontaneous speech of 3- to 6-year-old children using an architecture based on self-supervised models [1]. These methods have emerged as one of the most successful approaches in artificial intelligence (AI), as they make it possible to exploit the very large amount of existing unlabeled data and thereby achieve significantly higher performance in many domains.

As part of the DyLNet project (Language dynamics, linguistic learning, and sociability at preschool: benefits of wireless proximity sensors in collecting big data; https://dylnet.univ-grenoble-alpes.fr/), coordinated by A. Nardy, children's speech was collected in a socially mixed preschool over a period of two and a half years [2]. Each year, around 100 children wore a box fitted with microphones that continuously recorded their speech. These boxes were worn for one week a month. We thus collected ~30,000 hours of recordings, 815 of which were transcribed and annotated by linguists. This unprecedentedly large corpus of children's spoken French will make it possible to meet the technical challenges associated with automatic speech processing. While continuous and unsupervised collection methods are now available, another challenge is the automatic transcription of children's voices, made difficult by their acoustic characteristics. The aim of the thesis is to design a transcription system for researchers as well as child development professionals (teachers, speech therapists, etc.).
The objectives of the thesis are therefore to:
- review the state-of-the-art models and the performance achieved by automatic transcription tools for children's voices
- implement processes to exploit the mass of audio data collected and the associated metadata (sociodemographic information on participants, contexts of enunciation, interlocutors, etc.)
- design and develop a system for transcribing children's speech using self-supervised tools, as proposed by SpeechBrain [3]. The best system obtained will be made available to the language acquisition research community and child development professionals.
- set up a system evaluation protocol based on transcribed data
- propose tools for automating some of the linguistic analyses to enrich the obtained transcriptions and document oral language development in 3- to 6-year-old children

Skills:
- Master's degree in Computer Science, Artificial Intelligence or Data Science
- Mastery of Python programming and deep learning frameworks
- Experience in automatic natural language processing will be highly appreciated
- Excellent communication skills in French or, failing that, in English

Scientific environment:
The PhD position will be co-supervised by Benjamin Lecouteux, Solange Rossato (LIG, Univ. Grenoble Alpes) and Aurélie Nardy (Lidilem, Univ. Grenoble Alpes). The recruited person will be part of the GETALP team of the LIG laboratory (https://lig-getalp.imag.fr/), which has extensive expertise and experience in the field of Natural Language Processing. The GETALP team offers a stimulating, multinational working environment and provides the resources needed to complete the thesis in terms of equipment and scientific exchanges. Regular meetings with the three supervisors will take place throughout the thesis.

Instructions for applying

Applications must contain: CV + letter/message of motivation + master's grade transcripts. Applicants should also be ready to provide letter(s) of recommendation. Applications should be addressed to Benjamin Lecouteux (benjamin.lecouteux@univ-grenoble-alpes.fr), Solange Rossato (solange.rossato@univ-grenoble-alpes.fr) and Aurélie Nardy (aurelie.nardy@univ-grenoble-alpes.fr).

[1] Evain, S., Nguyen, H., Le, H., Boito, M. Z., Mdhaffar, S., Alisamir, S., ... & Besacier, L. (2021). LeBenchmark: A reproducible framework for assessing self-supervised representation learning from speech. https://doi.org/10.48550/arXiv.2104.11462

[2] Nardy, A., Bouchet, H., Rousset, I., Liégeois, L., Buson, L., Dugua, C., & Chevrot, J.-P. (2021). Variation sociolinguistique et réseau social : constitution et traitement d’un corpus de données orales massives. Corpus, 22 [online]. https://doi.org/10.4000/corpus.5561

[3] Ravanelli, M., Parcollet, T., Plantinga, P., Rouhe, A., Cornell, S., Lugosch, L., ... & Bengio, Y. (2021). SpeechBrain: A general-purpose speech toolkit. https://arxiv.org/abs/2106.0462
