ISCA - International Speech
Communication Association



ISCApad #282

Thursday, December 09, 2021 by Chris Wellekens

6 Jobs
6-1(2021-07-01) Internship at Naver Labs, Grenoble, France

https://europe.naverlabs.com/job/unsupervised-speech-to-text-translation-using-adapter-modules/

Unsupervised Speech-to-Text Translation using Adapter Modules - Internship

Description

Adapter layers have recently proven to be flexible and lightweight mechanisms for multilingual translation models. In this internship we plan to explore their use for speech-to-text translation, as a way of leveraging monolingual data to translate from/to new languages in an unsupervised way.
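As a rough illustration of the kind of adapter layer the internship topic refers to (a minimal sketch with hypothetical sizes, not NAVER's actual architecture), a bottleneck adapter adds a small trainable residual module inside a frozen backbone model:

```python
import numpy as np

def adapter_forward(x, W_down, W_up):
    """Bottleneck adapter: x + up(relu(down(x))).
    Only the small down/up projections are trained;
    the backbone translation model stays frozen."""
    h = np.maximum(x @ W_down, 0.0)  # down-projection + ReLU
    return x + h @ W_up              # up-projection + residual connection

rng = np.random.default_rng(0)
d_model, bottleneck = 512, 64        # hypothetical layer sizes
x = rng.standard_normal((10, d_model))             # (time, features)
W_down = 0.01 * rng.standard_normal((d_model, bottleneck))
W_up = 0.01 * rng.standard_normal((bottleneck, d_model))
y = adapter_forward(x, W_down, W_up)
print(y.shape)  # (10, 512)
```

In a multilingual setting, one such adapter per language (pair) can be inserted into a shared encoder/decoder, so that only the small adapters need training when a new language is added.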

Required skills

- PhD or research master student in NLP, speech or machine learning, with an interest in language technologies
- Familiarity with modern machine learning as applied to NLP, evidenced by publications in the domain
- Familiarity with deep learning frameworks and Python

References

Application instructions

Please note that applicants must be registered students at a university or other academic institution and that this establishment will need to sign an 'Internship Convention' with NAVER LABS Europe before the student is accepted.

You can apply for this position online. Don't forget to upload your CV and cover letter before you submit. Incomplete applications will not be accepted.

About NAVER LABS

NAVER LABS is a world-class team of self-motivated and highly engaged researchers, engineers and interface designers collaborating to create next-generation ambient intelligence technology and services, rich with the organic understanding they have of users, their contexts and situations.

Since 2013, LABS has led NAVER's innovation in technology through products such as the AI-based translation app 'Papago', the omni-tasking web browser 'Whale', the virtual AI assistant 'WAVE', the in-vehicle infotainment system 'AWAY', and M1, the 3D indoor mapping robot.

The team in Europe is multidisciplinary and highly multicultural, specializing in artificial intelligence, machine learning, computer vision, natural language processing, UX and ethnography. We collaborate with many partners in the European scientific community on R&D projects.

NAVER LABS Europe is located in Grenoble, in the south east of France. Grenoble is renowned for its exceptional natural environment and its scientific ecosystem, with 21,000 jobs in public and private research. It is home to one of the four French national institutes in AI, MIAI (Multidisciplinary Institute in AI). It has a large student community (over 62,000 students) and is a lively and cosmopolitan place, offering a host of leisure opportunities. Grenoble is close to both the Swiss and Italian borders and is the ideal place for skiing, hiking, climbing, hang gliding and all types of mountain sports.

Back  Top

6-2(2021-07-02) PhD position at LIG, Grenoble, France

Context:

The ANR PROPICTO project aims to develop a line of research in augmentative and alternative communication, focusing on the automatic transcription of speech into pictograms. PROPICTO both addresses strong needs in the disability field and raises numerous research challenges in natural language processing. PROPICTO is intended to be multidisciplinary, cooperating with linguists and with the disability community. The goal of the project is to propose a system capable of transcribing speech directly as a sequence of pictograms.

The thesis will be co-supervised by Benjamin Lecouteux and Maximin Coavoux.


Topic:

The main objective of this thesis is to develop an automatic syntactic parsing module to be integrated into the speech-to-pictogram processing chain implemented in the PROPICTO project. Parsing spontaneous speech raises many problems for NLP (disfluencies, overlaps, sentence segmentation). Moreover, the vast majority of work on automatic parsing focuses on datasets derived from written text.

As a first step, we propose to evaluate state-of-the-art parsing methods on the existing speech treebanks for French, in particular using pretrained language models such as FlauBERT (Le et al. 2019). As a second step, we propose to pursue two lines of research:
-   End-to-end parsing: in an applied setting, part of the parsing errors stem from speech recognition errors (error propagation). We propose (i) to study whether adding information from the audio signal can reduce error propagation, and (ii) to study the feasibility of an end-to-end approach that jointly predicts the transcription of the audio signal and its syntactic analysis.
-   Incremental parsing: current state-of-the-art parsers are not incremental; they need access to the whole sentence before starting the analysis (bidirectional pretrained language models). In the 'online' application setting of PROPICTO, it is worth considering parsing algorithms that can start the analysis as the input sentence arrives, in the manner of some transition-based parsing systems. This rules out bidirectional models (FlauBERT) and will require developing strategies to guarantee the robustness of the parser.

Candidate profile:

-   Master's degree with a strong natural language processing or computational linguistics component
-   Experience in programming and machine learning for NLP
-   Good knowledge of French

Practical details:

-   Expected start of the thesis between September and November 2021
-   Full-time doctoral contract at LIG (Getalp team) for 3 years (salary: min. €1,768 gross monthly, more in case of teaching duties)
-   Application deadline: 29 June
-   To apply, the application file must include: CV, cover letter, master's transcripts. Shortlisted candidates will also be asked to provide their master's thesis (if available).

Contacts (for any questions or to apply):
maximin.coavoux@univ-grenoble-alpes.fr and
benjamin.lecouteux@univ-grenoble-alpes.fr


6-3(2021-07-02) PhD position at LIG Grenoble, France
PhD topic within the French-Swiss ANR project Propicto (https://propicto.unige.ch),
supervised by Benjamin Lecouteux, Didier Schwab and Emmanuelle Esperança-Rodier

 

Automatic translation of speech into pictograms.

 

PROPICTO aims to develop a line of research in augmentative and alternative communication, focusing on the automatic transcription of speech into pictograms. PROPICTO both addresses strong needs in the disability field and raises numerous research challenges in natural language processing. PROPICTO is intended to be multidisciplinary, cooperating with linguists and with the disability community. The goal of the project is to propose a system capable of transcribing speech directly as a sequence of pictograms.

 

This thesis will focus on translating speech into sets of pictograms.
One of the scientific challenges of this thesis is to compensate for the limited amount of pictogram examples and of speech/pictogram corpora.
The approaches used will initially draw on massively multilingual speech translation, where other languages can help translate a language for which data are scarce.
Language simplification aspects will also be addressed in this topic, supported by another thesis on syntactic parsing of speech.
In parallel with this thesis, corpora will be collected in various institutions to obtain speech/pictogram pairs and to meet real-world needs.
The evaluation of the methods will also be an important dimension of this thesis and may draw, for example, on evaluation methods from machine translation.

 

Candidate profile:
- Solid experience in programming and machine learning for NLP, in particular deep learning
- Master's degree with a natural language processing or computational linguistics component
- Good knowledge of French

 

 

Practical details:

- Start of the thesis between September and November 2021
- Full-time doctoral contract at LIG (Getalp team) for 3 years (salary: min. €1,768 gross monthly)

 

Scientific environment:

The thesis will be carried out in the Getalp team of the LIG laboratory (https://lig-getalp.imag.fr/). The recruited person will be hosted within the team, which offers a stimulating, multinational and pleasant working environment.

The resources needed to carry out the PhD will be provided, both for travel in France and abroad and for equipment (personal computer, access to the LIG GPU servers, CNRS Jean Zay computing grid).

 

 

How to apply?

Candidates must hold a Master's degree in computer science or natural language processing (or be about to obtain one). They must have a good knowledge of machine learning methods and ideally experience in corpus collection and management. They must also have a good knowledge of French. Experience in speech processing or machine translation (neural or not) and/or familiarity with the disability community would be a plus.

Applications are expected until 1 July 2021. They must contain: CV + cover letter/message + master's transcripts + recommendation letter(s); and be sent to Benjamin Lecouteux (benjamin.lecouteux@univ-grenoble-alpes.fr), Didier Schwab (Didier.Schwab@univ-grenoble-alpes.fr) and Emmanuelle Esperança-Rodier (Emmanuelle.Esperanca-Rodier@univ-grenoble-alpes.fr).

 

 

References:

 

Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier. LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech.

Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, et al. FlauBERT: Unsupervised Language Model Pre-training for French. LREC, 2020, Marseille, France. ⟨hal-02890258⟩

Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, et al. Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation. COLING 2020 (long paper), Dec 2020, Virtual, Spain. ⟨hal-02991564⟩

Didier Schwab, Pauline Trial, Céline Vaschalde, Loïc Vial, Benjamin Lecouteux. Apporter des connaissances sémantiques à un jeu de pictogrammes destiné à des personnes en situation de handicap : Un ensemble de liens entre Wordnet et Arasaac, Arasaac-WN. TALN 2019, 2019, Toulouse, France. ⟨hal-02127258⟩


6-4(2021-07-04) Post-doctoral research position - L3i - La Rochelle France

-- Post-doctoral research position - L3i - La Rochelle France

---------------------------------------------------------------------------------------------------------------------------

Title : Emotion detection by semantic analysis of the text in comics speech balloons

 

The L3i laboratory has one open post-doc position in computer science, in the specific field of natural language processing in the context of digitised documents.

 

Duration: 12 months (an extension of 12 months will be possible)

Position available from: As soon as possible, 2021

Salary: approximately €2,100 / month (net)

Place: L3i lab, University of La Rochelle, France

Specialty: Computer Science/ Document Analysis/ Natural Language Processing

Contact: Jean-Christophe BURIE (jcburie [at] univ-lr.fr) / Antoine Doucet (antoine.doucet [at] univ-lr.fr)

 

Position Description

The L3i is a research lab of the University of La Rochelle. La Rochelle is a city in the south-west of France on the Atlantic coast and one of the most attractive and dynamic cities in France. The L3i has worked for several years on document analysis and has developed well-known expertise in 'bande dessinée', manga and comics analysis, indexing and understanding.

The post-doc's work will take place in the context of SAiL (Sequential Art Image Laboratory), a joint laboratory involving L3i and a private company. The objective is to create innovative tools to index and interact with digitised comics. The work will be done in a team of 10 researchers and engineers.

The team has developed different methods to extract and recognise the text of the speech balloons. The specific task of the recruited researcher will be to use Natural Language Processing strategies to analyse the text in order to identify emotions expressed by a character (reacting to the utterance of another speaking character) or caused by it (talking to another character). The datasets will be collections of comics in French and English.

 

Qualifications

Candidates must have a completed PhD and a research experience in natural language processing. Some knowledge and experience in deep learning is also recommended.

 

General Qualifications

- Good programming skills mastering at least one programming language like Python, Java, C/C++

- Good teamwork skills

- Good writing skills and proficiency in written and spoken English or French

 

Applications

Candidates should send a CV and a motivation letter to jcburie [at] univ-lr.fr and antoine.doucet [at] univ-lr.fr.

 


6-5(2021-07-13) PhD Position at CNRS

 

 
 
Modeling of gestures and speech during interactions

Application Deadline : 23 August 2021


 

 

General information

Reference: UMR5267-FABHIR-001
Workplace: MONTPELLIER
Date of publication: Monday, July 12, 2021
Scientific Responsible name: Slim Ouni
Type of Contract: PhD Student contract / Thesis offer
Contract Period: 36 months
Start date of the thesis: 1 October 2021
Proportion of work: Full time
Remuneration: €2,135.00 gross monthly

Description of the thesis topic

One of the main objectives of social robotics research is to design and develop robots that can engage in social environments in a way that is appealing and familiar to humans. However, interaction is often difficult because users do not understand the robot's internal states, intentions, actions, and expectations. Thus, to facilitate successful interaction, social robots should provide communicative functionality that is both natural and intuitive. Given the design of humanoid robots, they are typically expected to exhibit human-like communicative behaviors, using speech and non-verbal expressions just as humans do. Gestures help convey information that speech alone cannot provide, such as referential, spatial or iconic information [HAB11]. Moreover, providing multiple modalities helps to dissolve ambiguity typical of unimodal communication and, as a consequence, to increase robustness of communication. In multimodal communication, gestures can make interaction with robots more effective. In fact, gestures and speech interact. They are linked in language production and perception, with their interaction contributing to an effective communication [WMK14]. In oral-based communication, human listeners have been shown to be highly attentive to information conveyed via such non-verbal behaviors to better understand the acoustic message [GM99].

This topic can be addressed in the field of robotics where few approaches incorporate both speech and gesture analysis and synthesis [GBK06, SL03], but also in the field of developing virtual conversational agents (talking avatars), where the challenge of generating speech and co-verbal gesture has already been tackled in various ways [NBM09, KW04, KBW08].

For virtual agents, most existing systems simplify the gesture-augmented communication by using lexicons of words and present the non-verbal behaviors in the form of pre-produced gestures [NBM09]. For humanoid robots the existing models of gesture synthesis mainly focus on the technical aspects of generating robotic motion that fulfills some communicative function, but they do not combine generated gestures with speech or just pre-recorded gestures that are not generated on-line but simply replayed during human-robot interaction.
Missions

The goal of this thesis is to develop a gesture model for a credible communicative robot behavior during speech. The generation of gestures will be studied when the robot is a speaker and when it is a listener. In the context of this thesis, the robot will be replaced by an embodied virtual agent. This allows the outcome of this work to be applied in both the virtual and real world. It is possible to test the results of this work on a real robot by transferring the virtual agent behavior to the robot, when possible, but it is not an end in itself.

In this thesis, two main topics will be addressed: (1) the prediction of communication-related gesture realization and timing from speech, and (2) the generation of the appropriate gestures during speech synthesis. When the virtual agent is listening to a human interlocutor, the head movement is an important communicative gesture that may give the impression that the virtual agent understands what is said to it and that may make the interaction with the agent more effective. One challenge is to extract from speech, both acoustic and linguistic cues [KA04], to characterize the pronounced utterance and to predict the right gesture to generate (head posture, facial expressions and eye gaze [KCD14]). Synchronizing the gestures with the interlocutor speech is critical. In fact, any desynchronization may induce an ambiguity in the understanding of the reaction of the virtual agent. The gesture timing correlated with speech will be studied. In this work, generating the appropriate gesture during speech synthesis, mainly head posture, facial expressions and eye gaze, will be addressed.

To achieve these goals, motion capture data during uttered speech will be acquired synchronously with the acoustic signal. Different contexts will be considered to achieve the collection of sufficiently rich data. This data will be used to identify suitable features to be integrated within the framework of machine learning techniques. As the data is multimodal (acoustic, visual, gestures), each component will be used efficiently in collecting complementary data. The speech signal will be used in the context of a speech-recognition system to extract the linguistic information, and acoustic features help to extract non-linguistic information, such as F0. The correlation between gestures and the speech signal will also be studied. The aim of the different analyses is to contribute to the understanding of the mechanism of oral communication combined with gestures and to develop a model that can predict the generation of gestures in the contexts of speaking and listening.

References

[GBK06] Gorostiza J, Barber R, Khamis A, Malfaz M, Pacheco R, Rivas R, Corrales A, Delgado E, Salichs M (2006) Multimodal human-robot interaction framework for a personal robot. In: RO-MAN 06: Proc of the 15th IEEE international symposium on robot and human interactive communication
[GM99] Goldin-Meadow S (1999) The role of gesture in communication and thinking. Trends Cogn Sci 3:419–429
[HAB11] Hostetter AB (2011) When do gestures communicate? A meta- analysis. Psychol Bull 137(2):297–315
[NBM09] Niewiadomski R, Bevacqua E, Mancini M, Pelachaud C (2009) Greta: an interactive expressive ECA system. In: Proceedings of 8th int conf on autonomous agents and multiagent systems (AA- MAS2009), pp 1399–1400
[KA04] Kendon, Adam, 2004. Gesture – Visible Action as Utterance. Cambridge University Press.
[KBW08] Kopp S, Bergmann K, Wachsmuth I (2008) Multimodal commu- nication from multimodal thinking—towards an integrated model of speech and gesture production. Semant Comput 2(1):115–136
[KCD14] Kim, Jeesun, Cvejic, Erin, Davis, Christopher, Tracking eyebrows and head gestures associated with spoken prosody. Speech Communication (57), 2014.
[KW04] Kopp S, Wachsmuth I (2004) Synthesizing multimodal utter- ances for conversational agents. Comput Animat Virtual Worlds 15(1):39–52
[SL03] Sidner C, Lee C, Lesh N (2003) The role of dialog in human robot interaction. In: International workshop on language understanding and agents for real world interaction
[WMK14] Petra Wagner, Zofia Malisz, Stefan Kopp, Gesture and speech in interaction: An overview, Speech Communication, Volume 57, 2014, Pages 209-232.

Work Context

Funded by the MITI (CNRS), the project GEPACI (gestures and speech in interactional contexts) is led by the UMR5267 Praxiling and UMR7503 LORIA laboratories. Consequently, the successful candidate will work at LORIA in Nancy. In addition, research stays in Montpellier will be organized.

Constraints and risks

No specific risk.

Additional Information

Funding: PRIME80 MITI.



6-6(2021-07-16) Research engineer in mobile computing, Université Grenoble Alpes, France

Call for applications

Research engineer in mobile computing
Université Grenoble Alpes

The Grenoble Informatics Laboratory (LIG) is recruiting a motivated and proactive person for a 12-month research engineer contract (renewable once) in mobile computing. The recruited person will contribute to the THERADIA project (https://www.theradia.fr/), which is developing a virtual assistant to support patients with cognitive disorders during cognitive remediation sessions at home.

Collection of data from interactions with the Theradia agent piloted by a human (Wizard of Oz).
Topic: Development of a mobile system for the collection, management and annotation of human interaction data
The work consists in continuing the development of an online annotation tool for audiovisual recordings of human interactions, in order to add a number of desired features, e.g. audiovisual capture of the annotators, timing and automatic checking of annotations, a dynamic graphical interface, etc. This work will be carried out in collaboration with the company that designed the tool (ViaDialog - Paris), as well as the EMC team of the Université de Lyon 2.
The annotation tool will then be used by a population of annotators who will have to be recruited, trained on the tool with tutorials, and monitored during data annotation, notably by means of scripts that automatically check critical aspects of the annotation.
The project partners will use these annotations to automate the various technological components of the virtual assistant, a system currently played by a human operator (Wizard of Oz) through an application distributed over two machines. Once automated, the system will be deployed on a mobile platform whose integration and proper operation will have to be ensured, in particular for the continuous and parallel collection of multimodal interaction data from the application's users, in full compliance with the GDPR.
Finally, a last task consists in promoting the work to the scientific community, notably by participating in the writing of scientific articles presenting the work, and by developing a web interface to facilitate access to the collected data and the management of user licences submitted via the interface.

Start of contract: as soon as possible
Contract duration: 12 months (renewable once)
Salary: depending on experience (up to €3,444 gross / month)

Scientific environment:
The recruited person will be hosted within the GETALP team (Study Group on Machine Translation and Automated Processing of Languages and Speech) of the LIG, which offers a dynamic, multinational and stimulating environment for conducting high-level multidisciplinary research. The resources needed to carry out the work will be provided, both for travel in France and abroad and for equipment (personal computer, access to the LIG servers).

Profile sought:
We are looking for a person with a Master's or engineering degree in mobile computing, with excellent skills in web programming (Java, Python), web frameworks (Angular 10, Flask/FastAPI), and databases (SQL). This person must have a natural curiosity for science, be able to work autonomously, be proactive, report regularly on the progress of the work, propose solutions when problems arise, and above all enjoy working with diverse partners (industry / academia). Participation in the writing of scientific articles is also expected.
How to apply?
Applications are reviewed as they arrive and the position will remain open until filled. They should be sent to Fabien Ringeval (Fabien.Ringeval@imag.fr) and François Portet (Francois.Portet@imag.fr). The application file must contain:
- A detailed curriculum vitae showing the skills expected for the position
- A cover letter expressing your interest and the suitability of your profile
- Contact information and recommendation letters from two referees
- At least two examples of work demonstrating your technical skills
- Master's or engineering degree


6-7(2021-07-30) 5 PhD fellowships in Machine Learning and Information Retrieval , University of Copenhagen
**** 5 PhD fellowships in Machine Learning and Information Retrieval **** 
**** Department of Computer Science, University of Copenhagen **** 
 
 
The Machine Learning Section of the Department of Computer Science at the Faculty of Science at the University of Copenhagen (DIKU) is offering five fully-funded PhD Fellowships in Machine Learning and Information Retrieval, commencing 1 January 2022 or as soon as possible thereafter.
 
Deadline to apply: August 15, 2021
 
 
* Our group and research, and what do we offer:
--------------------------
 
The fellows will join the Machine Learning Section at DIKU. The Machine Learning section is among the leading research environments in Artificial Intelligence and Web & Information Retrieval in Europe (in the top 5 for 2020, according to csrankings.org), with a strong presence at top-tier conferences, continuous collaboration in international & national research networks, and solid synergies with big tech, small tech, and industry. The Machine Learning section consists of a vibrant selection of approximately 65 talented researchers (40 of whom are PhD and postdoctoral fellows) from around the world with a diverse set of backgrounds and a common incessant scientific curiosity and openness to innovation.
 
 
* The fellows will conduct research, having as starting point the following broad research areas:
--------------------------
 
- a fully-funded PhD in machine learning evaluation;
- a fully-funded PhD in bias and interpretability for machine learning;
- a fully-funded PhD in overparameterization and generalizability in deep neural architectures;
- a fully-funded PhD in applied machine learning and/or information retrieval with focus on human-centered computing aspects;
- a fully-funded PhD in web & information retrieval.
 
 
* Who are we looking for?
--------------------------
 
We are looking for candidates with a MSc degree in a subject relevant for the research area. The successful candidate is expected to have strong grades in Machine Learning and/or Information Retrieval. For one of the PhDs, the candidate is expected to also have strong grades in Human-Centered Computing. The candidate should have a preliminary research record as witnessed by a master thesis or publications in the area.
 
For more information, please have a look at: https://employment.ku.dk/phd/?show=154480
 

Maria Maistro, PhD
Tenure-track Assistant Professor
Department of Computer Science
University of Copenhagen
Universitetsparken 5, 2100 Copenhagen, Denmark

6-8(2021-08-04) Several Open Positions at KUIS AI Center, Koc University, Istanbul, Turkey

Several Open Positions at KUIS AI Center

Koc University, Istanbul, Turkey

https://ai.ku.edu.tr/

 

Koç University & İş Bank Artificial Intelligence Center (KUIS AI) was established in March 2020 with a generous donation from İş Bank. With its 15 core and 20 affiliated faculty members from engineering, medicine, science and other fields, and over 100 graduate students and research staff, it targets to be a leading research institution in artificial intelligence research, education, and industrial collaboration. Research areas in the center are computer vision, computational biology and medicine, human-computer interaction, machine learning, multimedia signal processing, natural language processing, robotics, and systems and AI.  Located in Istanbul, Turkey, Koç University is a non-profit, research-intensive, selective admissions university that provides a world-class education in English. It offers top-quality undergraduate and graduate programs in Engineering, Social Sciences, Humanities, Business and Medicine to the best students from Turkey and abroad. Koç University has been ranked 1st in Turkey by the Times Higher Education World University Rankings 2021 and the QS World University Rankings 2021 and is among the top 250 universities worldwide for Engineering (THE Subject Rankings 2021).

 

There are currently several open positions at KUIS AI, which are listed below:

 

  1. Research Faculty Positions (2 positions)

 

  • Responsibilities: Conducting independent research, advising graduate students, collaborating with the AI faculty members, supporting industrial projects, acquiring research funding, publishing research articles in high impact journals/conferences

  • Eligibility: PhD degree from a reputable university, research experience in AI/ML/DL, strong publication record, post-doctoral research experience

  • Key position benefits

  • 1-year contract with possibility of 2-years extension

  • Starting salary is 15K TL/month (net): can be higher depending on the qualifications of the candidate

  • Financial and logistic support for accommodation within defined limits

  • Monthly meal card covering 2 meals per day in the cafeteria

  • Health insurance coverage for the researcher

  • Full travel support for attending top-tier conferences

  • A high-end laptop computer, access to our state of the art GPU cluster, and additional cloud support as needed.  

How to apply: send your CV, Research Statement, and names of two references to ai-admissions@ku.edu.tr. For enquiries please contact ai-admissions@ku.edu.tr.

 


2) Open Post-Doc Positions (3 positions)


  • Responsibilities: Working on a specific research project under the supervision of an AI faculty member, supervising day-to-day activities of graduate students,  acquiring research funding, publishing research articles in high impact journals/conferences

  • Position Details: we seek fellows in the research areas of computer vision, computational biology and medicine, human-computer interaction, machine learning, multimedia signal processing, natural language processing, robotics, and systems and AI.

  • Eligibility: PhD degree from a reputable university, research experience in AI/ML/DL, strong publication record

  • Key position benefits

  • 1-year contract with the possibility of a 1-year extension

  • Starting salary is 10K TL/month (net) and can be higher depending on the qualifications of the candidate

  • Financial and logistic support for accommodation within defined limits

  • Monthly meal card covering 2 meals per day in the cafeteria

  • Health insurance coverage for the researcher

  • Full travel support for attending top-tier conferences

  • A high-end laptop computer, access to our state-of-the-art GPU cluster, and additional cloud support as needed.

How to apply: send your CV, Research Statement, and names of two references to ai-admissions@ku.edu.tr. For enquiries please contact the individual faculty member of your interest (https://ai.ku.edu.tr/positions/)

 

 

3) Open Research Engineer Positions (2 positions)


  • Responsibilities: Working under the supervision of AI faculty members to support industrial/academic projects in data science and AI; providing technical and software development support for the computational infrastructure of the AI Center; interacting with industrial partners to understand their needs (good interpersonal skills required)

  • Eligibility: B.S/M.S. degree from a reputable university, strong computational skills in AI/ML/DL

  • Key position benefits

  • 1-year contract with the possibility of a 2-year extension

  • Starting salary is 10K TL/month (net), but may be higher depending on the qualifications of the candidate

  • Opportunity for applied research with industry partners

  • Financial and logistic support for accommodation within defined limits

  • Monthly meal card covering 2 meals per day in the cafeteria

  • Health insurance coverage for the researcher

  • A high-end laptop computer, access to our state-of-the-art GPU cluster, and additional cloud support as needed.

How to apply: send your CV and the names of two references to ai-admissions@ku.edu.tr. For enquiries please contact ai-admissions@ku.edu.tr.

Back  Top

6-9(2021-08-20) JUNIOR PROFESSOR IN NATURAL LANGUAGE PROCESSING AND MULTIMEDIA INTERACTION , Katholieke Universiteit Leuven, Belgium

JUNIOR PROFESSOR IN NATURAL LANGUAGE PROCESSING AND MULTIMEDIA INTERACTION 

In the Science, Engineering and Technology Group of KU Leuven (Belgium), Faculty of Engineering Science, Department of Computer Science, there is a full-time tenure-track academic vacancy in the area of natural language processing and multimedia interaction. We seek applications from internationally oriented candidates with an outstanding research track record and excellent didactic skills. The successful candidate will perform research in the Human-Computer Interaction research unit. He or she holds a PhD in Computer Science (or a relevant equivalent degree) with focus on natural language processing and multimedia interaction, and has excellent knowledge of the fundamental principles, algorithms and methods of machine learning. 

 

The tenure track of a junior professor lasts 5 years. After this period and subject to a positive evaluation of the tenure track, he or she will be permanently appointed as an associate professor.

 

More info on the vacancy and instructions on how to apply can be found at: https://www.kuleuven.be/personeel/jobsite/jobs/60022759?hl=en&lang=en

You can apply for this professorship until October 15, 2021.

Back  Top

6-10(2021-08-24) Assistant/Associate Professor position in Machine Learning (tenure) at Telecom Paris France
Telecom Paris, a founding member of Institut Polytechnique de Paris, a member of Institut Mines-Telecom (IMT), and one of the top French Engineering schools is opening an Assistant/Associate Professor position in Machine Learning (tenure).
The position is within the Machine Learning, Statistics & Signal Processing research group (S2A) and is open to a wide variety of research topics around the team's expertise, which covers both theoretical and methodological work in Machine Learning, at the interface of computational/mathematical statistics, stochastic modelling, time-series analysis, signal processing and optimization. Nevertheless, expertise in at least one of the following subjects is desired:

- trustworthy machine learning (reliable, robust, fair, explainable)
- online learning, reinforcement learning
- structured prediction / multi-task
- large scale learning, frugal learning
- time-series, spatio-temporal data
 
More information is given at:

Additional information
In the context of the Institut Polytechnique de Paris, the activities in Data Science and AI of the S2A team benefit from the Hi!Paris center (https://www.hi-paris.fr), offering seminars, workshops and funding through calls for projects.

The position
• Permanent position
• 19 place Marguerite Perey - 91120 Palaiseau - France


Application
Application must be performed through one of the websites

(French) : https://institutminestelecom.recruitee.com/o/maitre-de-conferences-en-apprentissage-statistique-fh-a-telecom-paris-cdi
(English) : https://institutminestelecom.recruitee.com/l/en/o/maitre-de-conferences-en-apprentissage-statistique-fh-a-telecom-paris-cdi

Important dates
• September 24, 2021: application deadline
• October 25-26 and November 4-5: interviews (possibly by videoconference)
• Winter 2021/22: starting date

Contact :
Stephan Clémençon (Head of the S2A group)
stephan.clemencon@telecom-paris.fr
Florence d'Alché (Holder of the DSAIDIS Chair)
florence.dalche@telecom-paris.fr
For more info on being an Associate Professor at Telecom Paris, contact rh@telecom-paris.fr

Other websites:
Image, Data, Signal Department: web link 
LTCI lab: web link
S2A team: web link
Télécom Paris: web link
Back  Top


6-12(2021-08-24) Research Associate at the University of Edinburgh , UK
A 5-year fixed-term post for a Research Associate at the University of Edinburgh who will work within an international team to design, implement, and test computational implementations of models of speech articulation planning. The post-holder will contribute to a European Research Council-funded project 'Planning the Articulation of Spoken Utterances' (PI: Alice Turk). The project's goal is to understand the representations and processes involved in planning speech articulation. The post-holder should have excellent programming skills and an interest in speech articulation or non-speech motor control.

More information and instructions for how to apply can be found here:  https://elxw.fa.em3.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX_1001/job/1884/?utm_medium=jobshare
 
 
Back  Top

6-13(2021-09-06) Postdoc positions in affective computing, Grenoble Alps University, France

Call for postdoc applications in affective computing (Grenoble Alps University)

Summary: The Grenoble Alps University has an open position for a highly motivated postdoc researcher. The successful candidate will work on the multi-disciplinary research project THERADIA, which aims at creating an adaptive virtual assistant that accompanies patients suffering from cognitive disorders during the completion of cognitive remediation exercises at home. The person recruited will have the exciting opportunity to develop new machine learning techniques for the robust detection of affective behaviours from audiovisual data collected in the wild. Models will be embodied in the virtual agent to monitor and adapt the interaction with the patient, and the whole system will be further tested in a clinical trial to demonstrate the effectiveness of the agent in accompanying patients suffering from cognitive conditions during digital therapies.

Duration: 2 years

Salary: according to experience (up to 4142€ / month)

Scientific environment: The person recruited will be hosted within the GETALP team of the Laboratoire d’Informatique de Grenoble (LIG), which offers a dynamic, international, and stimulating framework for conducting high-level multi-disciplinary research.

The GETALP team is housed in a modern building (IMAG) located on a 175-hectare landscaped campus that was ranked as the eighth most beautiful campus in Europe by Times Higher Education magazine in 2018.

Requirements: The ideal candidate must hold a PhD degree and have a strong background in machine learning, affective computing, or cognitive science/neuroscience. The successful candidate should have:

• Good knowledge of machine learning techniques

• Good knowledge of speech and image processing

• Good knowledge of experimental design and statistics

• Strong programming skills in Python

• Excellent publication record

• Willingness to work in multi-disciplinary and international teams

• Good communication skills

Application: Applications are expected to be received on an ongoing basis and the position will be open until filled. Applications should be sent to Fabien Ringeval (fabien.ringeval@imag.fr) and François Portet (francois.portet@imag.fr). The application file should contain:

• Curriculum vitae

• Recommendation letter

• One-page summary of research background and interests

• At least three publications demonstrating expertise in the aforementioned areas

• Pre-defence reports and defence minutes; or summary of the thesis with date of defence for those currently in doctoral studies

Back  Top

6-14(2021-09-09) 2 PhD positions at UNIv. Delft, The Netherlands

Job description 

One of the most pressing issues holding robots back from taking on more tasks and reaching widespread deployment in society is their limited ability to understand human communication and take situation-appropriate actions. This PhD position is dedicated to addressing this gap by developing the underlying data-driven models that enable a robot to engage with humans in a socially aware manner. 
In order to support long-term interaction, robots need rich models of interaction that include their social context and allow them to act appropriately when faced with uncertainty. Humans use a range of verbal and non-verbal cues to regulate their interactions and are able to adapt how they communicate and cooperate to fit the needs and preferences of their interaction partners. Personalization and adaptation is needed to enable robots to assist people with varying backgrounds and abilities. 
The successful applicant will develop machine-learning-based and knowledge-modelling techniques applied to multi-sensor data (video, audio etc.) of human behaviour that address both the complexity of the multi-modal nature of interactions and uncertainty regarding the sensor data and social situation. The applicant will design and run the experiments to evaluate the created hybrid-AI models through human-robot interaction. 

 

Topics of interest: 
1) long-term human-robot interaction 
2) affective computing 
3) NLP & argument mining 
4) planning & learning for HRI 
5) assistive robotics 

 

One PhD candidate will be supervised by Dr. Frank Broz and Prof. Mark Neerincx, focusing on planning and learning for long-term HRI. The other PhD candidate, focused more on the NLP aspects of long-term human-robot interaction, will carry out his/her PhD studies in the context of the Designing Intelligence Lab (www.di-lab.space), an interdisciplinary lab between Computer Science and Industrial Design Engineering. This candidate will be supervised by Dr. Catharine Oertel and Prof. Catholijn Jonker and will collaborate with design colleagues in the context of the DI lab. 

 

To strengthen the social robotics strand of the Interactive Intelligence Group at TU Delft, these two positions are currently available, with potential for collaboration. If you're interested in either vacancy, please have a look at: 

 

 

 

Requirements 
MSc in Computer Science or related field 
At least 3 years of programming experience in Python (Java or C++ is a plus) 
Motivation to meet deadlines 
Affinity to design and social science research 
Interest in collaborating with colleagues from Industrial Design (DI lab) 
Willingness to teach and guide students (DI lab) 
The ability to work in a team, take initiative, be results oriented and systematic 

 

 

Conditions of employment 
TU Delft offers PhD candidates a 4-year contract, with an official go/no-go progress assessment after one year. Salary and benefits are in accordance with the Collective Labour Agreement for Dutch Universities, increasing from €2,395 per month in the first year to €3,061 in the fourth year. As a PhD candidate you will be enrolled in the TU Delft Graduate School. The TU Delft Graduate School provides an inspiring research environment with an excellent team of supervisors, academic staff and a mentor. The Doctoral Education Programme is aimed at developing your transferable, discipline-related and research skills. 
The TU Delft offers a customisable compensation package, discounts on health insurance and sport memberships, and a monthly work costs contribution. Flexible work schedules can be arranged. For international applicants we offer the Coming to Delft Service and Partner Career Advice to assist you with your relocation. 

 

TU Delft (Delft University of Technology) 
Delft University of Technology is built on strong foundations. As creators of the world-famous Dutch waterworks and pioneers in biotech, TU Delft is a top international university combining science, engineering and design. It delivers world-class results in education, research and innovation to address challenges in the areas of energy, climate, mobility, health and digital society. For generations, our engineers have proven to be entrepreneurial problem-solvers, both in business and in a social context. At TU Delft we embrace diversity and aim to be as inclusive as possible (see our Code of Conduct). Together, we imagine, invent and create solutions using technology to have a positive impact on a global scale. 
Challenge. Change. Impact! 

 

Faculty Electrical Engineering, Mathematics and Computer Science 
The Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS) brings together three disciplines - electrical engineering, mathematics and computer science. Combined, they reinforce each other and are the driving force behind the technology we use in our daily lives. Technology such as the electricity grid, which our faculty is helping to make future-proof. We are also working on a world in which humans and computers reinforce each other. We are mapping out disease processes using single cell data, and using mathematics to simulate gigantic ash plumes after a volcanic eruption. There is plenty of room here for ground-breaking research. We educate innovative engineers and have excellent labs and facilities that underline our strong international position. In total, more than 1,100 employees and 4,000 students work and study in this innovative environment. 
Click here to go to the website of the Faculty of Electrical Engineering, Mathematics and Computer Science. 

 

Additional information 
The application procedure is ongoing until the position is filled, so interested candidates are encouraged to apply as soon as possible and before September 20th, 2021. Note that candidates who apply after the deadline may still be considered, but applications received before the deadline will be given priority. For further info on the content of the position, please contact Dr. Catharine Oertel: C.R.M.M.Oertel@tudelft.nl or Dr. Frank Broz: F.Broz@tudelft.nl 
 
Application procedure 
Interested applicants should include an up-to-date curriculum vitae, letter of motivation and the names and contact information (telephone number and email address) of two references. 
The letter of application should summarise (i) why the applicant wants to do a PhD, (ii) why the project is of interest to the applicant, (iii) evidence of suitability for the job, and (iv) what the applicant hopes to gain from the position. 

 

Please apply before?September 20, 2021?via the application website. 

 

  • A pre-employment screening can be part of the selection procedure. 
  • You can apply online. We will not process applications sent by email and/or post. 
  • Acquisition in response to this vacancy is not appreciated.
Back  Top

6-15(2021-09-12) Research assistant at INRIA, Lille, France


 Inria is opening a fixed-term research assistant position on private machine learning for
speech processing as part of the French national collaborative project ANR DEEP-PRIVACY.
The successful candidate will be part of the Magnet team in Lille, which focuses on
privacy-friendly, decentralized and/or graph-based machine learning. He/she will work in
close collaboration with the Multispeech team in Nancy, which focuses on automatic speech
recognition and speech synthesis.

The goal of DEEP-PRIVACY is to learn automatic speech recognition (ASR) systems from
speech representations that hide speaker identity. Additional requirements such as
decentralized learning, gender fairness, or data efficiency may be considered.

Depending on her/his profile, the candidate will address the following research questions:
- how to design privacy attacks on ASR and design counter-measures;
- how to learn private representations of speech from adversarial training with many
attackers;
- how to design fair and private speech representations;
- how to adapt such methods in the decentralized setting (e.g. federated or fully
decentralized learning);
- how to formally define and measure the trade-off between accuracy, fairness, or privacy
at the global and individual levels.

Application deadline: applications will be assessed on a rolling basis; please apply as
soon as possible.

Starting date: November 1, 2021 or later
Duration: 1.5 years (renewable)
Location: Lille, France
Salary: from €2,050 to €2,130 net/month, according to experience

For more details and to apply:
https://jobs.inria.fr/public/classic/en/offres/2021-04039 (for MSc graduates)
https://jobs.inria.fr/public/classic/en/offres/2021-04034 (for PhD graduates)

Back  Top

6-16(2021-09-15) Senior-level research scientist at Facebook
Facebook Reality Labs Research is seeking a senior-level research scientist in affective and behavioral computing to help craft the human-centered AR computing platform of the future. The role will involve:
 
• Developing and executing a cutting-edge research program with interdisciplinary collaborators aimed at developing closed-loop optimal assistance for human activities that draw heavily from 'internal' affective/emotive/cognitive contextual states;
• Developing tasks, data-collection strategies, modeling approaches, and evaluation criteria to deliver on research program objectives, with a particular focus on approaches that leverage wearable devices with multimodal biosensors;
• Working collaboratively with other research scientists to develop novel solutions and models in service of contextualized AI for augmented reality; and
• Mentoring MS/PhD interns and postdocs and collaborating closely with cross-organizational collaborators and external academic groups to advance our research objectives.
 
The complete job description can be accessed here: https://www.facebook.com/careers/v2/jobs/283289239909563/
Back  Top

6-17(2021-09-17)1 Post-doc and 1 PhD positions at Laboratoire de Phonétique et Phonologie* (LPP ? CNRS/Sorbonne Nouvelle), Paris, France

In the context of the project ChaSpeePro run in collaboration with the University of Geneva, the Hopitaux Universitaires de Genève and the Idiap Institute, the Laboratoire de Phonétique et Phonologie* (LPP ? CNRS/Sorbonne Nouvelle) in Paris is offering:

-       a 3-year (+1) doctoral position starting from December 2021 (https://bit.ly/2XeQNf4)

-       a post-doctoral position for 2 years with a possible extension of 2 more years. Ideally the position would start in December 2021, but a later starting date might be considered (https://bit.ly/2VFQWr5).

 

The general goal of the project is to better characterize the processes linked to the encoding of invariant speech units into articulated speech, and their disorders (in dysarthria and apraxia of speech). This project follows the MoSpeeDi project described here: https://www.unige.ch/fapse/mospeedi/.

Within the scope of this project, articulatory and acoustic data will be collected at the LPP to investigate questions related to:

  1. The temporal organization and coordination of speech units at different levels (e.g., gesture, syllable, word, phrases)
  2. Stability and flexibility of the speech production system (e.g., speaker-specific strategies and adaptation to different speech task demands)
  3. Articulatory vs. acoustic manifestations of spatio-temporal reductions in speech

 

Within this Swiss-French collaborative project, the candidates will join the team under the supervision of Cécile Fougeron and in close collaboration with Anne Hermes and Leonardo Lancia.

The Laboratoire de Phonétique et Phonologie (CNRS/Sorbonne Nouvelle, https://lpp.in2p3.fr/) is a research and teaching unit in experimental phonetics and phonology. It is located in the 5th arrondissement in Paris. The lab offers a diverse and fair working environment in a small and dynamic lab.

 

More information on the positions and online application procedure can be found @:

-        https://bit.ly/2XeQNf4 for the doctoral position

-       https://bit.ly/2VFQWr5 for the post-doc position

Back  Top

6-18(2021-09-20) OPEN-RANK FACULTY POSITIONS IN COMPUTER SCIENCE (incl. Spoken Language Processing) AT THE UNIVERSITY OF TEXAS AT EL PASO (UTEP), TX, USA

 

The University of Texas at El Paso is seeking a colleague to join the Computer Science faculty in the broad area of Spoken Language Processing.

OPEN-RANK FACULTY POSITIONS IN COMPUTER SCIENCE AT THE UNIVERSITY OF TEXAS AT EL PASO (UTEP)

 

The Department of Computer Science at UTEP invites applications for two open-rank faculty positions starting fall 2022 with preference for the areas of Spoken Language Processing, Machine Learning, Computer Systems, or Software Engineering. To view the full ad and apply, please visit https://utep.interviewexchange.com/jobofferdetails.jsp?JOBID=136383 or https://www.utep.edu/employment.

 

Informal inquiries can be addressed to Nigel Ward, nigel@utep.edu .

Back  Top

6-19(2021-09-23) Young researchers in NLP at L3i, La Rochelle, France

Cross-lingual and cross-domain terminology alignment


Interested in joining a young NLP group of 10+ people located in a historical town by the Atlantic Ocean? The beach is a 10-minute walk from the lab. We have open positions in the context of 2 ongoing Horizon 2020 projects, Embeddia and NewsEye, as well as related projects. In 2020-2021, we have among others published long papers in CORE A* and A conferences ACL, JCDL, CoNLL, ICDAR, COLING, ICADL, etc.  

Location: L3i laboratory, La Rochelle, France

Duration: 2 years (1+1), with possible further extension

Net salary range: 2100€-2300€ monthly

Context: H2020 Embeddia project and regional project Termitrad

Start: 1 January 2022

 


Keywords: terminology alignment, cross-lingual word embeddings, named-entity recognition and linking, deep/machine learning, statistical NLP, (text) mining.


Applications are invited for a postdoctoral researcher position around the topic of the Termitrad project: keyword and terminology alignment 1) across languages and 2) across domains. In short, the overall objective of the project is to improve the relevance of the keywords describing research papers (and, time allowing, the quality of abstracts). On the one hand (cross-lingual alignment), we will rely on a corpus of journal articles with both French and English keywords and abstracts, both as written by authors and in versions curated by experts. On the other hand (cross-domain alignment), we will work with use cases provided by researchers from different fields using different terms to describe similar concepts.

 

The project team will consist of senior staff, 2 post-doctoral researchers and 2-3 PhD students, one of whom is jointly supervised at the Jožef Stefan Institute in Ljubljana, coordinator of H2020 Embeddia. In this context, you will first be in charge of surveying the state of the art of existing related approaches, tools and resources, then conduct further research and experiments, as well as participate in the supervision of PhD students.


Who we are looking for:

-       PhD in statistical NLP, IR, or ML, ideally with further postdoctoral experience

-       proven record of high-level publications in one or more of those fields

-       fluency in written and spoken English (French language skills are irrelevant)

 

Applications including a CV and a one-page research statement discussing how the candidate's background fits the requirements and topic are to be sent by email to antoine.doucet@univ-lr.fr, strictly with the subject 'Embeddia/Termitrad postdoc application'.

Application deadline: 13 October 2021.

Back  Top

6-20(2021-09-24) Research Assistant at Technische Universität Berlin , Berlin, Germany

Technische Universität Berlin offers an open position:

Research Assistant - salary grade E13 TV-L Berliner Hochschulen

under the reserve that funds are granted; part-time employment may be possible

scientific collaboration in the BMBF project “Emonymous”, possibility of extension

The majority of systems and services that are provided by computer science, electrical engineering and information technology are ultimately oriented towards the needs of their human users. To successfully build such systems and services it is essential to investigate and understand users and their behavior when interacting with technology. From this, design principles for human-machine interfaces or classification systems can be derived and requirements for the underlying technologies can be defined.

The Quality and Usability Lab is part of TU Berlin’s Faculty IV and deals with the design and evaluation of human-machine interaction, in which aspects of human perception, technical systems and the design of interaction are the subject of our research. We focus on self-determined work in an interdisciplinary and international team; for this we offer open and flexible working conditions that promote scientific and personal exchange and are a prerequisite for excellent results.

Fakultät IV - Elektrotechnik und Informatik - Quality and Usability Lab Reference number: IV-549/21 (starting at the earliest possible / for 2 years / closing date for applications 15/10/21)

Working field: Central is the creation and empirical research of the use of speech and language technology, which includes aspects of signal processing, machine learning, artificial intelligence and natural language processing. Specific tasks in the “Emonymous” funding project are aimed, for example, at researching and implementing anonymization in speech, while at the same time preserving emotions, speaker characteristics and intelligibility. Acoustic and linguistic (textual) content are often analyzed in multimodal models and have to be evaluated using perceptual listening tests (laboratory or crowd), e.g. in order to control speech synthesizer training. If desired and suitable, the publication and presentation of project and research results in scientific journals, at conferences and workshops can be pursued.

The specific tasks include:

Conception, construction and evaluation of speech processing systems as well as systems for speaker characterization and transformation (e.g. ASR, emotion, intelligibility, synthesis)

Measurement, planning and optimization of quality and user experience of anonymization (Quality of Experience, User Experience)

Planning and execution of user studies (laboratory, large scale crowds)

Project communication and reporting

Publication and presentation of project and research results in scientific journals, at conferences, and in workshops as well as standardization meetings of ITU-T


Professionally experienced employees from our team support you with self-motivated familiarization within the areas of responsibility.

PhD thesis preparation is possible.

Requirements:

Successfully completed scientific university degree (Master, Diplom or equivalent) in electrical engineering, computer engineering/science, informatics, media informatics, media technology, information systems (or an equivalent technical background)

Ability to work independently in a team and good self-organization

Very good programming knowledge in Python, Matlab or similar

Good knowledge of machine learning, AI and / or NLP

Good knowledge of digital signal processing, statistics, empirical data analysis

Interest in carrying out experiments with human participants to determine quality and user experience

Language skills: fluent English in writing and speaking, confident communication in German

Desire to work in an agile and lively international and interdisciplinary environment


Please send your application with the reference number and the usual documents (one file max. 5 MB) only via email to bewerbung@qu.tu-berlin.de.

By submitting your application via email you consent to having your data electronically processed and saved. Please note that we cannot guarantee the protection of your personal data when submitted as an unprotected file.

Please find our data protection notice acc. DSGVO (General Data Protection Regulation) at the TU staff department homepage: https://www.abt2-t.tuberlin.de/menue/themen_a_z/datenschutzerklaerung/ or quick access 214041.

To ensure equal opportunities between women and men, applications by women with the required qualifications are explicitly desired. Qualified individuals with disabilities will be favored. The TU Berlin values the diversity of its members and is committed to the goals of equal opportunities.

Technische Universität Berlin - Der Präsident -, Fakultät IV, Quality and Usability Lab, Prof. Dr.-Ing. Möller, Sekr. TEL 18,Ernst-Reuter-Platz 7, 10587 Berlin

The vacancy is also available on the internet at

https://www.personalabteilung.tu-berlin.de/menue/jobs/

Back  Top

6-21(2021-09-27) PhD at KTH, Stockholm, Sweden

The School of Electrical Engineering and Computer Science (EECS) at the KTH Royal Institute of Technology announces a Ph.D position in Multimodal Machine Learning for Human-Robot Interaction at the division of Speech, Music and Hearing (TMH).

 

ABOUT KTH

KTH Royal Institute of Technology in Stockholm has grown to become one of Europe's leading technical and engineering universities, as well as a key center of intellectual talent and innovation. We are Sweden's largest technical research and learning institution and home to students, researchers and faculty from around the world. Our research and education cover a wide area including the natural sciences and all branches of engineering, as well as architecture, industrial management, urban planning, history and philosophy.

 

PROJECT DESCRIPTION

In this project the student will design, develop, and evaluate a telepresence platform specifically developed for collecting multimodal data for subsequent automation. Unsupervised and supervised multimodal machine learning models, alongside multimodal fusion techniques, will be explored to evaluate the quality of the telepresence platform. Mixed reality technologies will also be explored for the creation of the telepresence platform, as they offer exciting opportunities for data collection (and consequent learning of multimodal models of social behaviour). Users' mixed reality headsets allow for extracting multiple modalities such as real-time head pose, eye-gaze information, pupil dilation, and high-framerate point-of-view (PoV) video data.

 

This position is partially funded by a project on 'Using Neuroimaging Data for Exploring Conversational Engagement in Human-Robot Interaction'. This project will leverage a multidisciplinary research collaboration with Julia Uddén from the Linguistics department at Stockholm University, where we will aim to study social robotics by observing the underlying neural processes of people who are observing, interacting with, and controlling robots. Understanding these neural processes and how they integrate with other modalities will help us contribute to the research areas of Human-Robot Interaction, Artificial Intelligence, Psycho- and Neurolinguistics, and the Neurobiology of Language.

 

The starting date is open for discussion, although we would like the successful candidate to start at the beginning of 2022.

 

QUALIFICATIONS

The candidate must have a degree in Computer Science or related fields. Documented written and spoken English and programming skills are required. Experience with robotics, human-computer interaction, mixed-reality, neuroscience or machine learning is important.

 

HOW TO APPLY

The application should include:

1. Curriculum vitae.

2. Transcripts from University/College.

3. Brief description of why the applicant wishes to become a doctoral student.

 

The application documents must be uploaded using KTH's recruitment system. More information here:

https://www.kth.se/en/om/work-at-kth/lediga-jobb/what:job/jobID:427059

 

The application deadline is ** October 15, 2021 **

 

André Pereira

Researcher

KTH Royal Institute of Technology

School of Electrical Engineering and Computer Science

Division of Speech, Music and Hearing (TMH)

Back  Top

6-22(2021-09-28) Ingénieur de recherche en informatique, LORIA, Nancy, France

Research engineer in computer science (M/F): development of a multimodal platform.
At LORIA in Nancy.

Details of the offer and the application procedure are available on the CNRS website:

 
Back  Top

6-23(2021-09-26) Fully funded PhD at the Science Foundation Ireland Center for Research Training in Advanced Networks for Sustainable Societies, Ireland

 

 The Science Foundation Ireland Centre for Research Training in Advanced Networks for Sustainable Societies (https://www.advance-crt.ie/), Munster Technological University (www.mtu.ie) and GREYC UMR CNRS 6072 - Groupe de Recherche en Informatique, Image, et Instrumentation de Caen, University of Caen Normandy, France (https://www.greyc.fr/en/home/) invite applications for a fully funded dual Degree PhD (Cotutelle) position in Text Generation for Mental Health. The successful PhD candidate will be registered at both universities (Munster Technological University, Ireland and University of Caen Normandy, France).

The SFI Centre for Research Training in Advanced Networks for Sustainable Societies focuses on enabling technologies for future hyper-networks, including concepts such as network virtualization, dependable communications, Internet of Things, data driven network management and applications in sustainable and independent living. This centre will train the next generation of researchers who will seek solutions to the technical and societal challenges of global hyper-connectivity.

Vision

'… train the next generation of doctoral graduates at the interface of technologies and social sciences, graduates who can stimulate socially-responsible and inclusive creativity and innovation in the fields of advanced communications. Our aim is to ensure that the next generations of communications technology are developed with human and societal benefit as a priority objective.'

The GREYC lab, France, conducts research in the field of digital science. It has 7 research groups with faculty members from ENSICAEN, UNICAEN and CNRS, PhD students, and administrative and technical staff. Studies focus on fundamental and methodological aspects (modelling, new concepts) as well as practical achievements: the development of applications and software platforms, and the design and production of electronic devices.

Funding Notes & Eligibility Criteria

The ADVANCE CRT-GREYC postgraduate programme offers excellent students a fully funded dual PhD, with a tax-free stipend (in Ireland) of approx. €18,500 per year for up to four years, with EU tuition fees, research and equipment costs, and all training-related costs covered.


Eligibility criteria & application process

We invite applications from individuals who hold at least a 2.1 honours undergraduate degree in relevant disciplines such as Computer Science, Computer Engineering, Electrical and Electronic Engineering or related disciplines with strong programming skills. 

Application Process

Interested candidates can send an application with the following documents directly to Dr Mohammed Hasanuzzaman (mohammed.hasanuzzaman@mtu.ie) and Prof. Gaël Dias (gael.dias@unicaen.fr).

  • Full transcripts and certificates for primary and highest degree(s); GPA/degree class.

  • A detailed CV (maximum 3 pages)

  • Evidence of English language proficiency for non-native speakers based on IELTS (or similar) score of 6.5 (or similar local requirements)

  • A max. 2 page statement of motivation including description of:

  • Why you wish to undertake this doctoral research and why you believe you are qualified for this ADVANCE-GREYC dual Degree PhD and research topic (with reference to your future career plans)


The deadline for applications is the 9th of October 2021.

Diversity

To help address gender under-representation in science, applications from female applicants are strongly encouraged, as are those from international students and other under-represented groups. This reflects each supervisory institution's commitment to providing a diverse and open environment for students and faculty.

 

Dr. Mohammed Hasanuzzaman, Lecturer, Munster Technological University 
Funded Investigator, ADAPT Centre- A World-Leading SFI Research Centre
Member, Lero- SFI Research Centre for Software
Dept. of CS 
             
Munster Technological University
Bishopstown Campus
Cork, Ireland
e: mohammed.hasanuzzaman@adaptcentre.ie
https://mohammedhasanuzzaman.github.io/
Back  Top

6-24(2021-10-20) PhD position, Orléans/Grenoble, France

Location: Orléans/Grenoble, France
Contacts: Emmanuel Schang (emmanuel.schang@univ-orleans.fr), Benjamin Lecouteux (benjamin.lecouteux@univ-grenoble-alpes.fr)


We are looking for a candidate for a PhD in Linguistics on the topic of automatic speech processing.
The thesis will be carried out at the Laboratoire Ligérien de Linguistique (LLL, UMR 7270), with the possibility of being hosted at LIG-GETALP (Grenoble). Funding is provided by the ANR project CREAM (Machine-Assisted Documentation of CREole lAnguages,
https://sites.google.com/view/creamproject/home).

Keywords: creole languages, speech processing, keyword spotting, bilingual alignment.


Objectives
The CREAM project aims to provide linguists working on creole languages with innovative tools for collecting and processing spoken data in languages with few resources.
In the particular context of diglossia that often characterizes the creole-speaking world, the corpus transcription step is frequently experienced as a difficulty by field linguists. One consequence is the lack of available corpora.

The goal of this project is to open the way to innovative methods for language documentation and resource creation for creole languages. Using state-of-the-art machine learning technologies, we seek to change how language documentation is carried out, in terms of both building language resources and processing spoken corpora.

The focus will be on two tasks in particular:
- Query-by-example: searching for similar segments in creole-language corpora,
- Automated bilingual alignment between speech segments in a creole language and a closely related language (French, English or Portuguese, depending on the creole).

Depending on progress, the research may extend to other NLP tasks:

- automatic speech recognition (studying transfer learning between lexifier languages and creole languages)

- machine translation ...
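Query-by-example search of the kind listed above is classically approached with dynamic time warping (DTW) over acoustic feature sequences. The sketch below is a toy illustration under simplifying assumptions (1-D features and whole-utterance ranking, whereas real systems compare subsequences of multidimensional embeddings, cf. Kamper et al. 2019); every name here is illustrative, not from the project:

```python
# Toy query-by-example search: rank candidate utterances by DTW distance
# to a spoken query, represented here as a 1-D feature sequence.

def dtw(a, b):
    """Classic dynamic-time-warping distance between two feature sequences."""
    INF = float("inf")
    # D[i][j] = cost of the best alignment of a[:i] with b[:j]
    D = [[INF] * (len(b) + 1) for _ in range(len(a) + 1)]
    D[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[len(a)][len(b)]

def query_by_example(query, utterances):
    """Return the index of the utterance closest to the query under DTW."""
    return min(range(len(utterances)), key=lambda i: dtw(query, utterances[i]))

# The query matches the second candidate even though it is time-stretched.
best = query_by_example([1.0, 2.0, 3.0],
                        [[9.0, 9.0, 9.0], [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]])
print(best)  # 1
```

Because DTW tolerates variable speaking rate, the stretched utterance still matches the query, which is exactly the property exploited in low-resource spoken-term search.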


Selected bibliography
- G. Adda, et al. (2016). Breaking the unwritten language barrier: the BULB project. In SLTU-2016, 5th Workshop on Spoken Language Technologies for Under-resourced Languages, Yogyakarta, Indonesia, May 2016.
- A. Baevski, et al. (2020). wav2vec 2.0: A framework for self-supervised learning of speech representations. arXiv preprint arXiv:2006.11477.
- D. Blachon, et al. (2016). Parallel Speech Collection for Under-resourced Language Studies Using the Lig-Aikuma Mobile Device App. In Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Yogyakarta, Indonesia, May 2016.
- P. Godard et al. (2018). Unsupervised Word Segmentation from Speech with Attention. In Interspeech 2018, Hyderabad, India, September 2018.
- Y.-A. Chung, et al. (2016). Audio word2vec: Unsupervised learning of audio segment representations using sequence-to-sequence autoencoder. Interspeech 2016, pp. 765-769.
- H. Kamper (2019). Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models. ICASSP 2019, IEEE, pp. 6535-6539.
- H. Kamper, Anastassiou, A. and Livescu, K. (2019). Semantic query-by-example speech search using visual grounding. ICASSP 2019, IEEE, pp. 7120-7124.
- S. Schneider, et al. (2019). Wav2vec: Unsupervised Pretraining for Speech Recognition. Interspeech 2019, Graz, Austria.
- V. Velupillai (2015). Pidgins, creoles and mixed languages. John Benjamins Publishing Company.

Candidate profile
Candidates will hold a master's degree in linguistics or computer science and will show a clear interest in automatic speech processing and so-called 'rare' languages. Autonomy in Python programming is essential, as are basics in machine learning.

Application: candidates should send a cover letter and a detailed CV. Additional documents may be requested if the candidate is shortlisted for an interview.


Supervision
Emmanuel SCHANG (HDR in Linguistics)
Benjamin LECOUTEUX (PhD in Computer Science)


Applications should be sent to Emmanuel Schang (emmanuel.schang@univ-orleans.fr) and Benjamin Lecouteux (benjamin.lecouteux@univ-grenoble-alpes.fr).


Schedule:

Application deadline: 5 November 2021

Interview dates will be communicated to candidates shortlisted on the basis of their application.

 

Back  Top

6-25(2021-10-21) Three fully-funded PhD positions on early prosodic development at Utrecht University The Netherlands

Three fully-funded PhD positions on early prosodic development at Utrecht University

 

These positions are part of a new Dutch Research Council project ‘SoundStart’ on early prosodic development led by Professor Aoju Chen at Utrecht University. Children show knowledge of prosody (i.e., melody and rhythm) of their native language already at birth and thrive on it in early language and communication development. However, how children learn prosody so early is still unknown. Taking an interdisciplinary and crosslinguistic approach, SoundStart aims to discern the role of innate mechanisms, uncover learning mechanisms in the auditory modality underlying development spanning prenatal and after-birth periods, and shed light on the role of visual cues to prosody (i.e., speech-accompanying gestures) in after-birth periods. PhD projects 1 and 2 will be concerned with the first two goals in the learning of prosodic phrasing (i.e., grouping sequences of sounds into meaningful units in speech streams) and prosodic form-meaning mappings (i.e. associating prosodic patterns with communicative functions), PhD project 3 with the third goal in both areas of prosodic development. We are seeking to hire highly motivated, driven and talented MA graduates to take on the PhD projects, starting 1 February 2022. A later starting date is negotiable.

 

In this research programme you will work within an interdisciplinary team, closely collaborating with researchers from linguistics, neuroscience, psycholinguistics, psychology and neonatology at Utrecht University and the University Medical Centre Utrecht. You will receive assistance from student assistants in data collection (in the Netherlands and abroad) and/or data processing and have the opportunity to develop your academic teaching skills during your project.

 

Do you want to play a key role in this new exciting research programme on early prosodic development? Read more and apply! The application deadline is 8 November, 2021.

 

https://www.uu.nl/en/organisation/working-at-utrecht-university/jobs/three-fully-funded-phd-positions-on-early-prosodic-development-at-utrecht-university-10-fte

 

For more information about these positions, please contact Professor Aoju Chen at aoju.chen@uu.nl.

 

----

Prof. dr. Aoju Chen

Chair of Language Development in Relation to Socialisation and Identity

Head of the English Language and Culture Division 

Leader of VICI research group SoundStart

Department of Languages, Literature and Communication & Utrecht Institute of Linguistics - OTS

Utrecht University 

The Netherlands

 

Back  Top

6-26(2021-10-25) Ingénieur de développement, INRIA, Nancy, France

The Multispeech team at Inria Nancy is recruiting a permanent engineer whose main mission will be to lead and contribute to the development of an evolving multimodal virtual assistant, built from open-source software components and/or components developed by the team, for research and industrial transfer purposes.

 

Job description: see pages 9-10 of https://www.inria.fr/sites/default/files/2021-10/Fiches_poste_ingenieurs_permanents.pdf

 

Online application: https://www.inria.fr/fr/concours-externes

Deadline: 31/10/2021.

 

 

Back  Top

6-27(2021-10-26) Master internships at TALEP, Marseille, France
The TALEP team (Marseille, France) offers several M2 master internships in NLP, starting in February 2022:
* Joint speech segmentation and syntactic analysis
* Syntactic analysis of speech without transcription
* Matching contextual and definitional embeddings for a sense-aware reading assistant
* Using deep learning to study children's multimodal behavior in face-to-face conversation
* Using interpretability methods to explain Vision-Language models for medical applications
* Impact of language evolution in historical texts on NLP models
* Faithfulness and accuracy of generation in text-generation systems
* ...
Full list and details: https://www.lis-lab.fr/offre-de-stage/
 
Apply now to discover a stimulating scientific environment in a vibrant, sunny city :-)

 
--
Carlos RAMISCH
http://pageperso.lis-lab.fr/carlos.ramisch
Assistant professor at LIS/TALEP and Aix Marseille University, France
Back  Top

6-28(2021-11-12) Internships at IRIT, Toulouse, France

The SAMoVA team at IRIT in Toulouse offers several internships in 2022:

- Analysis of IoT signals (audio and accelerometer) from the Sallis Médical collar for modelling pharyngolaryngeal efficiency
- Discriminative sequence training in end-to-end automatic speech recognition
- Automatic speech processing and AI for the differential diagnosis of neurodegenerative diseases
- Self- and semi-supervised adaptation of neural speaker diarization
- Self-supervised audio representation learning
- ...

Full details (topics, contacts) are available in the 'Jobs' section of the team website, under the 'Internships' tab:
https://www.irit.fr/SAMOVA/site/jobs/

Back  Top

6-29(2021-11-13) Master internship at IRISA, Lannion, France
EXPRESSION Team IRISA LANNION - Proposal for an internship for a Research Master in Computer Science
 

Title: Joint training framework for text-to-speech and voice conversion

 

Text-to-speech and voice conversion are two distinct speech generation techniques. Text-to-speech (TTS) is a process that generates speech from a sequence of graphemes or phonemes. Voice conversion is the conversion of speech from a source voice to a target voice. These processes find applications in domains such as Computer-Assisted Language Learning.

However, these two processes share some building blocks, particularly the vocoder, which generates speech from acoustic features or a spectrogram. The quality of both technologies has been significantly improved thanks to the availability of massive databases, powerful computing hardware, and the deep learning paradigm. On the other hand, rendering or controlling expressiveness, and more generally taking suprasegmental information into account, remains a major challenge for both technologies.

This internship aims to set up a common framework for both technologies: a joint deep learning framework to generate speech (in a target voice) from either speech (in a source voice) or text.
 
 
It will be supervised by members of the EXPRESSION team (IRISA): Aghilas Sini, Pierre Alain, Damien Lolive, and Arnaud Delhay-Lorrain.
 
Please send your application (CV + cover letter) before 10/01/2022 to aghilas.sini@irisa.fr, palain@irisa.fr, damien.lolive@irisa.fr, arnaud.delhay@irisa.fr
 
Start date: 01/02/2022 (flexible)
======================
Back  Top

6-30(2021-11-15) Stage en reconnaissance automatique de la parole chez Zaion, France

====== Internship offer in automatic speech recognition ========

 
We are offering an internship at Zaion (Master's level, M2) on developing Automatic Speech Recognition solutions for new languages, adapted to the customer-relations context.

Please forward this offer to any students you know who are looking for such an opportunity.

Description and contacts:
 
Back  Top

6-31(2021-11-26) Four research internships at SteelSeries France R&D team

The SteelSeries France R&D team (former Nahimic R&D team) is glad to open 4 research internship positions for 2022.

The selected candidates will be working on one of the following topics (more details below):
 
- Audio media classification
- Audio source classification
- Audio source separation
- Real-time speech restoration
 
Please reply/apply to nathan.souviraa-labastie@steelseries.com.

Audio media classification Master internship, Lille (France), 2022

Advisors — Pierre Biret, R&D Engineer, pierre.biret@steelseries.com — Nathan Souviraà-Labastie, R&D Engineer, PhD, nathan.souviraa-labastie@steelseries.com

Company description

SteelSeries is a leader in gaming peripherals focused on quality, innovation and functionality, and the fastest growing major gaming headset brand globally. Founded in 2001, SteelSeries improves performance through first-to-market innovations and technologies that enable gamers to play harder, train longer, and rise to the challenge. SteelSeries is a pioneer supporter of competitive gaming tournaments and eSports and connects gamers to each other, fostering a sense of community and purpose. Nahimic joined the SteelSeries family in 2020 to bolster its reputation for industry-leading gaming audio performance across both hardware and software. Nahimic is the leading 3D gaming audio software editor, with more than 150 person-years of research and development in the gaming industry. Their team gathers a rare combination of world-class audio processing engineers and software talents based across France, Singapore and Taiwan. They are the worldwide leader in PC gaming audio software, embedded in millions of gaming devices, from gaming headsets to the most powerful gaming PCs by brands such as MSI, Dell, Asus, Gigabyte, etc. Their technology offers the most precise and immersive sound for gamers, allowing them to be more efficient in any game and to enjoy a more immersive experience. We wish to meet passionate people full of energy and motivation, ready to take on great challenges to elevate everyone's audio experience. We are currently looking for an AUDIO SIGNAL PROCESSING / MACHINE LEARNING RESEARCH INTERN to join the R&D team of SteelSeries' Software & Services Business Unit in our French office (former Nahimic R&D team).

Subject

The target of the internship is to build a model able to classify audio streams into multiple media classes (class descriptions upon request). The audio classification problem will be addressed using supervised machine learning. Hence, the first step of the internship will be to collect data and build a balanced corpus for such an audio classification problem. Fortunately, massive audio content for most potential classes is available within the company, and this task should not be an important burden. Once the corpus is built, the intern will have to either tune the parameters of an already existing internal model or develop a more adapted model from the state of the art [1] [2] [4] that still satisfies the 'real-time' constraint. A more advanced step of the internship would be to define a more precise media-type classification, for instance with sub-types within a same category. Once the relevant classes have been identified, the intern will have to incorporate such changes into the classification algorithm and framework. As the intern will receive support to turn the model into an in-product real-time prototype, this internship is a rare opportunity to bring research to product in a short time frame.
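As a toy illustration of the supervised pipeline sketched above (corpus building, feature extraction, training, classification), here is a minimal nearest-centroid classifier over two crude audio descriptors. It is only a sketch: the internship targets neural models and real media classes, and every name and class below is invented for illustration.

```python
import math
import random

def features(clip):
    """Two crude descriptors: RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in clip) / len(clip))
    zcr = sum(1 for a, b in zip(clip, clip[1:]) if a * b < 0) / (len(clip) - 1)
    return (rms, zcr)

def train_centroids(labelled_clips):
    """Average the feature vectors per class (nearest-centroid training)."""
    sums = {}
    for label, clip in labelled_clips:
        rms, zcr = features(clip)
        s = sums.setdefault(label, [0.0, 0.0, 0])
        s[0] += rms; s[1] += zcr; s[2] += 1
    return {lab: (s[0] / s[2], s[1] / s[2]) for lab, s in sums.items()}

def classify(clip, centroids):
    """Assign the class whose centroid is closest in feature space."""
    f = features(clip)
    return min(centroids,
               key=lambda lab: (f[0] - centroids[lab][0]) ** 2
                             + (f[1] - centroids[lab][1]) ** 2)

# Toy balanced 'corpus': tonal clips (low zero-crossing rate) vs noise-like clips.
random.seed(0)
def tone():
    return [math.sin(2 * math.pi * 5 * i / 400) for i in range(400)]
def noise():
    return [random.uniform(-1.0, 1.0) for _ in range(400)]

corpus = [("tone", tone()) for _ in range(5)] + [("noise", noise()) for _ in range(5)]
centroids = train_centroids(corpus)
print(classify(tone(), centroids), classify(noise(), centroids))  # tone noise
```

The same train/classify split carries over to a neural model: only the feature extractor and the decision function change.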

Skills

Who are we looking for? Preparing an engineering degree or a master's degree, you preferably have knowledge in the development and implementation of advanced algorithms for digital audio signal processing. Machine learning skills are a plus. Whereas not mandatory, notions in the following various fields would be appreciated: Audio, acoustics and psychoacoustics - Audio effects in general: compression, equalization, etc. - Machine learning and artificial neural networks. - Statistics, probabilistic approaches, optimization. - Programming languages: Matlab, Python, Pytorch, Keras, Tensorflow. - Voice recognition, voice command. - Computer programming and development: Max/MSP, C/C++/C#. - Audio editing software: Audacity, Adobe Audition, etc. - Scientific publications and patent applications. - Fluent in English and French. - Demonstrate intellectual curiosity.

Other offers https://nahimic.welcomekit.co/
https://www.welcometothejungle.co/companies/nahimic/jobs

References
[1] DCase Challenge Low-Complexity Acoustic Scene Classification with Multiple Devices. url: http://dcase.community/challenge2021/task-acoustic-scene-classification-results-a.
[2] B. Kim et al. Domain Generalization on Efficient Acoustic Scene Classification using Residual Normalization. 2021. arXiv : 2111.06531 [cs.SD].
[3] Nahimic on MSI. url : https://fr.msi.com/page/nahimic.
[4] sharathadavanne. seld-dcase2021. https://github.com/sharathadavanne/seld-dcase2021. 2021


Audio source classification Master internship, Lille (France), 2022

 Advisors — Nathan Souviraà-Labastie, R&D Engineer, PhD, nathan.souviraa-labastie@steelseries.com — Pierre Biret, R&D Engineer, pierre.biret@steelseries.com

Company description: identical to the first internship offer above.

Subject

The target of the internship is to build a model able to classify audio sources; by audio sources, we mean the sources present inside a given media type such as music, movies or video games, e.g., instruments in the case of music. The audio classification problem will be addressed using supervised machine learning. The intern will not start from scratch, as data and classification code from other projects can be reused with minor adaptation (description upon request). Once the corpus is reshaped for classification, the intern will have to either tune the parameters of an already existing internal model or develop a more adapted model from the state of the art [1] [3] [5] that still satisfies a strong real-time constraint.

Multi-task approach

A more advanced step of the internship would be to explore multi-task models. The two target tasks would be 1/ the classification task previously addressed by the intern, and 2/ the audio source separation task on the same data type. This is a very challenging machine learning problem, especially because the tasks are heterogeneous (classification, regression, signal estimation), contrary to homogeneous multi-task classification, where one classifier can address several classification tasks. Moreover, only a few studies target heterogeneous audio multi-task learning (exhaustive list to the advisors' knowledge: [2, 6, 7, 4]). Potential advantages of the multi-task approach are performance improvements on the main task and computational cost reduction in products, as several tasks are achieved at the same time. Previous internal bibliographic work and network architectures could be used as a starting point for this approach.
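Hard parameter sharing, one common way to set up such a heterogeneous multi-task model, can be pictured as a shared trunk feeding one head per task. The forward pass below is a toy sketch: weights are random, there is no training, and all layer sizes and names are invented for illustration, not taken from the internal model.

```python
import random

random.seed(1)

def linear(x, W, b):
    """Dense layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, x) for x in v]

def rand_mat(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Shared trunk (8 -> 4) plus two task-specific heads:
# a 3-class source classifier and a 1-D separation-style regressor.
W_trunk, b_trunk = rand_mat(4, 8), [0.0] * 4
W_cls, b_cls = rand_mat(3, 4), [0.0] * 3
W_reg, b_reg = rand_mat(1, 4), [0.0]

def forward(frame):
    h = relu(linear(frame, W_trunk, b_trunk))  # shared representation, computed once
    class_logits = linear(h, W_cls, b_cls)     # task 1: classification head
    mask = linear(h, W_reg, b_reg)             # task 2: regression head
    return class_logits, mask

logits, mask = forward([0.1] * 8)
print(len(logits), len(mask))  # 3 1
```

The cost saving comes from computing the trunk once per frame and running only the small heads per task; training would combine one loss per head, which is where the heterogeneity makes the problem hard.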



Skills

Who are we looking for? Preparing an engineering degree or a master's degree, you preferably have knowledge in the development and implementation of advanced algorithms for digital audio signal processing. Machine learning skills are a plus. Whereas not mandatory, notions in the following various fields would be appreciated: Audio, acoustics and psychoacoustics - Machine learning and artificial neural networks. - Audio effects in general: compression, equalization, etc. - Statistics, probabilistic approaches, optimization. - Programming languages: Matlab, Python, Pytorch, Keras, Tensorflow. - Sound spatialization effects: binaural synthesis, ambisonics, artificial reverberation. - Voice recognition, voice command. - Voice processing effects: noise reduction, echo cancellation, array processing. - Computer programming and development: Max/MSP, C/C++/C#. - Audio editing software: Audacity, Adobe Audition, etc. - Scientific publications and patent applications. - Fluent in English and French. - Demonstrate intellectual curiosity.

Other offers

https://nahimic.welcomekit.co/ https://www.welcometothejungle.co/companies/nahimic/jobs

References

[1] DCase Challenge Low-Complexity Acoustic Scene Classification with Multiple Devices. url: http://dcase.community/challenge2021/task-acoustic-scene-classification-results-a.
[2] P. Georgiev et al. « Low-resource Multi-task Audio Sensing for Mobile and Embedded Devices via Shared Deep Neural Network Representations ». In : Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1.3 (sept. 2017), 50 :1-50 :19.
[3] B. Kim et al. Domain Generalization on Efficient Acoustic Scene Classification using Residual Normalization. 2021. arXiv : 2111.06531 [cs.SD].
[4] H. Phan et al. « On multitask loss function for audio event detection and localization ». In : arXiv preprint arXiv :2009.05527 (2020).
[5] sharathadavanne. seld-dcase2021. https://github.com/sharathadavanne/seld-dcase2021. 2021.
[6] G. P. Stéphane Dupont Thierry Dutoit. « Multi-task learning for speech recognition : an overview ». In : 24th European Symposium on Artificial Neural Networks 1 (2016).
[7] D. Stoller, S. Ewert et S. Dixon. « Jointly Detecting and Separating Singing Voice : A Multi-Task Approach ». en. In : arXiv :1804.01650 [cs, eess] (avr. 2018). arXiv : 1804.01650.


Audio source separation Master internship, Lille (France), 2022

Advisors — Nathan Souviraà-Labastie, R&D Engineer, PhD, nathan.souviraa-labastie@steelseries.com — Damien Granger, R&D Engineer, damien.granger@steelseries.com

Company description: identical to the first internship offer above.

Approaches and topics for the internship

Audio source separation consists in extracting the different sound sources present in an audio signal, in particular by estimating their frequency distributions and/or spatial positions. Many applications are possible, from karaoke generation to speech denoising. In 2020, our separation approaches [11, 1] matched the state of the art [12, 13] on a music separation task, and many avenues of improvement remain open in terms of implementation (details below). The selected candidate will work on one or several of the following topics according to her/his aspirations, skills and bibliographic findings. Candidates can also propose their own topic. She/he will also have the chance to work on our substantial internal datasets.

New core algorithm. Machine learning is a fast-changing research domain, and an algorithm can go from state of the art to obsolete in less than a year (see for instance the recent advances in music source separation [9, 3]). The task would be to try recent powerful neural network approaches, such as architectures or unit types that have proved beneficial in other research fields. For instance, the encoder and decoder of [15] show a huge benefit compared to traditional audio codecs. Research domains outside audio (such as computer vision) may also serve as sources of inspiration: the approaches in [14, 6] have shown promising results on other tasks, and previous internal work [1] managed to bring those benefits to audio source separation. Conversely, approaches like [10, 5] were tested without benefit for the separation tasks that we target. Overall, the targeted benefits of a new approach can be of two kinds: either improved audio separation performance, or reduced computational cost (mainly CPU/GPU load and RAM usage).
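To make the separation setting concrete, here is a minimal, self-contained sketch of time-frequency soft masking, the classic building block behind many of the cited systems. This is illustrative only (not code from the internship's models): it uses toy magnitude vectors in place of real STFT frames, and the per-source magnitude estimates would in practice come from a neural network.

```python
def soft_masks(source_mags, eps=1e-8):
    # One mask per source: each time-frequency bin's mask is that source's
    # share of the total estimated magnitude in the bin (a Wiener-like
    # soft mask). `source_mags` is a list of per-source magnitude vectors.
    totals = [sum(bins) + eps for bins in zip(*source_mags)]
    return [[m / t for m, t in zip(src, totals)] for src in source_mags]

def apply_mask(mix_mag, mask):
    # Element-wise masking of the mixture magnitude spectrum: the source
    # estimate keeps whatever fraction of each bin its mask assigns.
    return [x * w for x, w in zip(mix_mag, mask)]

# Toy example: two "sources" over two frequency bins.
vocals, drums = [3.0, 0.5], [1.0, 1.5]
masks = soft_masks([vocals, drums])
vocal_estimate = apply_mask([4.0, 2.0], masks[0])  # mixture is the sum here
```

Because the masks of all sources sum to one in every bin, the separated estimates always add back up to the mixture, which is one reason masking remains a robust default in separation pipelines.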
Extension to multi-source. Another challenging problem is estimating all the different sources with a single network, either by selecting which source to output (as in [7]) or by outputting all sources at the same time. In the case of music, most state-of-the-art approaches [12] have historically addressed the backing-track problem (i.e., karaoke for instruments) as a one-instrument-versus-the-rest problem, hence using a specific network for each instrument when multiple instruments are present in the mix.

Pruning. Besides testing new architectures or unit types, pruning could be a simple and effective way to reduce computational costs. The original pruning principle is to remove the least influential neural units in order to avoid overfitting; here we would mainly be interested in reducing the total number of units and parameters. The theoretical and domain-agnostic literature [16, 4, 8], as well as the audio-specific literature [2], will be explored. As the selected candidate would work on our most advanced model, this topic is an opportunity to have a direct impact on the company within a short time frame.
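As a baseline for the pruning topic, the simplest variant from the cited literature is unstructured magnitude pruning: zero out the weights with the smallest absolute values. The sketch below is a toy illustration on a flat weight list, not the structured unit-level pruning the internship would ultimately target.

```python
def magnitude_prune(weights, sparsity):
    # Unstructured magnitude pruning: zero out the fraction `sparsity`
    # of weights with the smallest absolute value, leaving the rest intact.
    n_prune = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned

# Toy example: prune 40% of five weights -> the two smallest are zeroed.
w = [0.9, -0.05, 0.4, 0.01, -0.7]
pruned = magnitude_prune(w, 0.4)
```

In practice the pruned weights are stored as a binary mask applied during inference; actual memory and CPU/GPU savings then depend on whether the sparsity pattern is structured enough for the runtime to exploit.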

Skills

Who are we looking for? You are preparing an engineering degree or a master's degree, and preferably have knowledge of the design and implementation of advanced algorithms for digital audio signal processing. Machine learning skills are a plus. While not mandatory, notions in the following fields would be appreciated:
- Audio, acoustics and psychoacoustics.
- Machine learning and artificial neural networks.
- Audio effects in general: compression, equalization, etc.
- Statistics, probabilistic approaches, optimization.
- Programming languages: Matlab, Python, PyTorch, Keras, TensorFlow.
- Sound spatialization effects: binaural synthesis, ambisonics, artificial reverberation.
- Voice recognition, voice command.
- Voice processing effects: noise reduction, echo cancellation, array processing.
- Computer programming and development: Max/MSP, C/C++/C#.
- Audio editing software: Audacity, Adobe Audition, etc.
- Scientific publications and patent applications.
- Fluency in English and French.
- Demonstrated intellectual curiosity.

Other offers
https://nahimic.welcomekit.co/ https://www.welcometothejungle.co/companies/nahimic/jobs

Internship position at Telecom-Paris on Deep learning approaches for social computing


*Place of work* Telecom Paris, Palaiseau (Paris outskirts)

 

*Starting date* From February 2021 (but can start later)

 

*Duration* 4-6 months

 

*Context*

The intern will take part in the REVITALISE project, funded by the ANR.

The research activity of the internship will bring together the research topics of Prof. Chloé Clavel [Clavel] of the S2a team [SSA] at Telecom-Paris (social computing [SocComp]), Dr. Mathieu Chollet [Chollet] from the University of Glasgow (multimodal systems for social skills training), and Dr. Beatrice Biancardi [Biancardi] (social behaviour modelling) from the CESI Engineering School, Nanterre.

 

*Candidate profile*

As a minimum requirement, the successful candidate should have:

• A master's degree in one or more of the following areas: human-agent interaction, deep learning, computational linguistics, affective computing, reinforcement learning, natural language processing, speech processing

• Excellent programming skills (preferably in Python)

• Excellent command of English

• The desire to do an academic thesis at Telecom-Paris after the internship

 

*How to apply*

The application should be formatted as **a single pdf file** and should include:

• A complete and detailed curriculum vitae

• A cover letter

• The contact details of two referees

The pdf file should be sent to the three supervisors, Chloé Clavel, Beatrice Biancardi and Mathieu Chollet: chloe.clavel@telecom-paris.fr, bbiancardi@cesi.fr, mathieu.chollet@glasgow.ac.uk

 


Multimodal attention models for assessing and providing feedback on users’ public speaking ability

 

*Keywords* human-machine interaction, attention models, recurrent neural networks, Social Computing, natural language processing, speech processing, non-verbal behavior processing, multimodality, soft skills, public speaking

 

*Supervision* Chloé Clavel, Mathieu Chollet, Beatrice Biancardi

 

*Description* Oral communication skills are essential in many situations and have been identified as core skills of the 21st century. Technological innovations have enabled social skills training applications which hold great training potential: speakers’ behaviors can be automatically measured, and machine learning models can be trained to predict public speaking performance from these measurements and subsequently generate personalized feedback to the trainees.

The REVITALISE project proposes to study explainable machine learning models for the automatic assessment of public speaking and for automatic feedback production to public speaking trainees. In particular, the recruited intern will address the following points:

-   identify relevant datasets for training public speaking and prepare them for model training

-   propose and implement multimodal machine learning models for public speaking assessment and compare them to existing approaches in terms of predictive performance.

-   integrate the public speaking assessment models into a public speaking training interface to produce feedback, and evaluate the usefulness and acceptability of the produced feedback in a user study

The results of the project will help to advance the state of the art in social signal processing, and will further our understanding of the performance/explainability trade-off of these models.

 

The compared models will include traditional machine learning models proposed in previous work [Wortwein] and sequential neural approaches (recurrent networks) that integrate attention models, as a continuation of the work done in [Hemamou], [Ben-Youssef]. The feedback production interface will extend a system developed in previous work [Chollet21].
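The attention mechanism in such models can be summarised in a few lines: each time step's hidden state (e.g. from a recurrent encoder over behavioural features) is scored against a learned query vector, the scores are normalised with a softmax, and the sequence is pooled as a weighted average. The sketch below is a minimal dependency-free illustration of that idea, not code from [Hemamou]; in practice the hidden states, query and scoring function are all learned jointly.

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    # Dot-product attention over a sequence: score each time step's hidden
    # state against the query, normalise the scores, and return the
    # weighted average of the states plus the weights themselves.
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights

# Toy example: the query strongly matches the first time step.
states = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = attention_pool(states, query=[10.0, 0.0])
```

A useful by-product for the project's explainability goal is that the attention weights indicate which moments of the talk drove the prediction, which is exactly what a feedback interface can surface to the trainee.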

 

Selected references of the team:

[Hemamou] L. Hemamou, G. Felhi, V. Vandenbussche, J.-C. Martin, C. Clavel. HireNet: A Hierarchical Attention Model for the Automatic Analysis of Asynchronous Video Job Interviews. In AAAI 2019.

[Ben-Youssef] Atef Ben-Youssef, Chloé Clavel, Slim Essid, Miriam Bilac, Marine Chamoux, and Angelica Lim. UE-HRI: a new dataset for the study of user engagement in spontaneous human-robot interactions. In Proceedings of the 19th ACM International Conference on Multimodal Interaction, pages 464–472. ACM, 2017.

[Wortwein] Torsten Wörtwein, Mathieu Chollet, Boris Schauerte, Louis-Philippe Morency, Rainer Stiefelhagen, and Stefan Scherer. 2015. Multimodal Public Speaking Performance Assessment. In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction (ICMI '15). Association for Computing Machinery, New York, NY, USA, 43–50.

[Chollet21] Chollet, M., Marsella, S., & Scherer, S. (2021). Training public speaking with virtual social interactions: effectiveness of real-time feedback and delayed feedback. Journal on Multimodal User Interfaces, 1-13.

 

Other references:

[TPT] https://www.telecom-paristech.fr/eng/ 

[IMTA] https://www.imt-atlantique.fr/fr

[SocComp] https://www.tsi.telecom-paristech.fr/recherche/themes-de-recherche/analyse-automatique-des-donnees-sociales-social-computing/

[SSA] http://www.tsi.telecom-paristech.fr/ssa/#

[PACCE] https://www.ls2n.fr/equipe/pacce/

[Clavel] https://clavel.wp.imt.fr/publications/

[Chollet] https://matchollet.github.io/

[Biancardi] https://sites.google.com/view/beatricebiancardi

-Rasipuram, Sowmya, and Dinesh Babu Jayagopi. 'Automatic multimodal assessment of soft skills in social interactions: a review.' Multimedia Tools and Applications (2020): 1-24.

-Sharma, Rahul, Tanaya Guha, and Gaurav Sharma. 'Multichannel attention network for analyzing visual behavior in public speaking.' 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018.

-Acharyya, R., Das, S., Chattoraj, A., & Tanveer, M. I. (2020, April). FairyTED: A Fair Rating Predictor for TED Talk Data. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 01, pp. 338-345).



