ISCA - International Speech
Communication Association



ISCApad #244

Friday, October 12, 2018 by Chris Wellekens

6 Jobs
6-1 (2018-04-09) ATER positions in Natural Language and Speech Processing, Sorbonne Université, Paris, France

ATER (temporary teaching and research assistant) positions in Natural Language and Speech Processing are available at the Faculty of Arts and Humanities (faculté des lettres) of Sorbonne Université. The application link is http://concours.univ-paris4.fr/PostesAter?entiteBean=posteCandidatureCourant

The eligibility conditions are available at http://lettres.sorbonne-universite.fr/ater.

Best regards,

Claude Montacié
claude.montacie@sorbonne-universite.fr


6-2(2018-04-11) A three-year doctoral position at the University Sorbonne Nouvelle, Paris, France

Dear colleagues,
Please find below the description of a three-year doctoral position at the University Sorbonne Nouvelle, to be filled in the last term of 2018.

The Laboratory of Phonetics and Phonology (http://lpp.in2p3.fr/), Paris, France, offers a funded three-year position for a PhD candidate working on the acoustic-phonetic markers of inter- and intra-speaker variability, with particular attention to the standardization of procedures.

We would be most grateful if you could also pass this information on to anyone who may be interested in this offer.


Cédric Gendrot and Cécile Fougeron

 

Description of the position:

 

Doctoral contract offered by the Laboratoire de Phonétique et Phonologie: "Phonetic and acoustic markers of inter- and intra-individual variability"

 

The Laboratoire de Phonétique et Phonologie offers a three-year doctoral contract funded by the ANR, starting at the beginning of the 2018 academic year.

The aim of the proposed doctoral project is to analyse the phonetic and acoustic markers of inter- and intra-speaker variability. Particular attention will be paid to standardizing the proposed analysis methods so that they can be transferred to related application domains, including automatic speech processing.

 

The work will take into account voice and speech characteristics that are closely tied to the context of voice comparison. Because speech variation is multifactorial, it is essential to establish standards for objective measurements, for which recent methodologies from experimental phonetics can provide a guarantee.

Of particular interest are acoustic markers that reflect individual physiological properties, as well as articulatory habits, which convey social identity.

  

The doctoral student will carry out the research at the LPP (Laboratoire de Phonétique et de Phonologie), a joint research unit of the CNRS and Université Paris 3 Sorbonne Paris Cité. For the laboratory's work on this topic, see http://lpp.in2p3.fr

The selected candidate will be supervised by Cédric Gendrot and Cécile Fougeron, respectively lecturer-researcher (enseignant-chercheur) at Université Sorbonne Nouvelle and research director at the CNRS. He or she will be attached to doctoral school ED268 of Université Sorbonne Nouvelle.

The doctoral student will benefit from the resources of the laboratory, of doctoral school ED268, and of the interdisciplinary research environment of the Laboratoire d'Excellence EFL. He or she will be able to attend weekly research seminars in phonetics and phonology at the LPP and in other research teams, lectures given by internationally renowned visiting professors, training courses, conferences and summer schools.

 

• Requirements

- a good command of French.

- successful completion of a first personal research project.

- no nationality requirement applies.

- very good knowledge of acoustic-phonetic data processing.

- knowledge of computer science and statistical analysis would be a plus.

 

• Documents to include in the application

1.         a CV

2.         a cover letter

3.         the Master 2 (second-year master's) thesis in phonetics

4.         the names of two referees (with their email addresses)

 

Application deadline: 30 June 2018

 

 

Complete applications must be sent by email no later than 30 June 2018 to Cédric Gendrot (cgendrot@univ-paris3.fr) and Cécile Fougeron (cecile.fougeron@univ-paris3.fr).

 

  • Shortlisting based on the application, followed by interviews of the shortlisted candidates

Shortlisted candidates will be interviewed between 2 and 6 July 2018, on site or by videoconference.

Contact for further information:

Cédric Gendrot: cgendrot@univ-paris3.fr

Cécile Fougeron: cecile.fougeron@univ-paris3.fr

 

 


6-3 (2018-04-12) Post-doc in forensic science, LNE, Trappes, France

POST-DOC, 18 months - Voice comparison in forensic science: defining a methodology and a reference framework for laboratory certification

Location: Trappes (78), France. Laboratoire national de métrologie et d'essais (LNE)
REF: ML/VOX/DE

CONTEXT:
The ANR project VoxCrim (2017-2021) aims to establish, on objective scientific grounds, the conditions under which voice comparison can be used in forensic science. It has two main objectives: a) to set up an ISO 17025-type accreditation methodology for police laboratories, and b) to establish standards for objective measurements. The project will make it easier for police services to handle voice comparisons and will strengthen the admissibility of the evidence in court.
The post-doctoral topic is part of the VoxCrim sub-project "Accreditation, certification, standardization and metrology".
This sub-project builds on existing material available from the Association Française de Normalisation (AFNOR) and the Comité Français d'Accréditation (COFRAC).
The work consists first in assessing the existing material and the adaptations needed for the accreditation of laboratories performing voice comparisons, and in developing the corresponding metrology protocols. By the end of the project, the sub-project aims to deliver the complete definition of a practical accreditation and certification solution for voice comparison.

MISSIONS:
The work is organized into three tasks:
-        Report on existing material. This task consists in surveying existing work to identify the standards and directives that must be complied with, updated, or drawn upon as far as possible. The work will include a European and international dimension (for example the work of NIST-OSAC) and will rely mainly on ISO standards 17025, 17043 and 13528 to set up the ecosystem needed to validate voice comparison methods. Since these standards mainly concern physical measurement, the post-doctoral researcher will also study ISO 15189, which sets out requirements for laboratories where samples are taken from humans.
-        Specification of intra- and inter-laboratory metrology protocols adapted to voice comparison, and more specifically to forensic science.
-        The post-doctoral researcher will check that the identified protocols are consistent with the sets of voice comparison operating conditions developed by the other project members.
In addition to support from the teams Evaluation of Information Processing Systems and Mathematics-Statistics, the post-doctoral researcher will receive training:
-        At the start of the contract, a one-day training course on inter-laboratory comparison and accreditation methods, given by the LNE to members of the VoxCrim consortium.
-        Short practical placements at the SDPTS (Sous-Direction de la Police Scientifique et Technique, Ecully) and/or the IRCGN (Institut de Recherche Criminalistique de la Gendarmerie Nationale) to understand the issues involved in forensic voice comparison.
-        Participation in the VoxCrim study days organized by the consortium members at the SDPTS.
Publications (and presentations, where applicable) in international conferences and journals are expected of the post-doctoral researcher.

DURATION:
18 months, preferably starting in September 2018.

PROFILE:
You hold a PhD in computer science or language sciences, with a specialization in automatic speech processing.
You have knowledge of evaluation methodology and voice biometrics.
Knowledge of standardization would be a real asset.

To apply, please send your CV to recrut@lne.fr, quoting the reference ML/VOX/DE.


6-4 (2018-04-14) 2 PhD positions, IRIT Toulouse France

Two PhD positions are still available at IRIT Toulouse France starting
ideally in Sept. 2018.

Position 1: Deep learning approaches to assess head and neck cancer
voice intelligibility

Position 2: Clinical relevance of the intelligibility measures

These positions are in the framework of the TAPAS European Project.

For official information and applications, see
https://www.tapas-etn-eu.org/positions

You may obtain further information from Julie Mauclair (phone: +33 5 61
55 60 55, julie.mauclair@irit.fr) and Thomas Pellegrini (phone: +33 5 61
55 68 86, thomas.pellegrini@irit.fr)


6-5 (2018-04-16) Post-doc position at INRIA Nancy France
Postdoctoral Position (12 months)
 
Natural language processing: automatic speech recognition system using deep neural networks without out-of-vocabulary words
 
_______________________________________

- Location: INRIA Nancy Grand Est research center, France

 

- Research theme: PERCEPTION, COGNITION, INTERACTION

 

- Project-team: Multispeech

 

- Scientific Context:

 

More and more audio and video content appears on the Internet each day. About 300 hours of multimedia are uploaded per minute. In these multimedia sources, audio data represents a very important part. If these documents are not transcribed, automatic content retrieval is difficult or impossible. The classical approach for spoken content retrieval from audio documents is automatic speech recognition followed by text retrieval.

 

An automatic speech recognition (ASR) system uses a lexicon containing the most frequent words of the language, and only the words of the lexicon can be recognized by the system. New proper names (PNs) appear constantly, requiring dynamic updates of the lexicons used by the ASR. These PNs evolve over time and no vocabulary will ever contain all existing PNs. When a person searches for a document, proper names are used in the query. If these PNs have not been recognized, the document cannot be found. These missing PNs can be very important for the understanding of the document.

 

In this study, we will focus on the problem of proper names in automatic recognition systems. The problem is how to model relevant proper names for the audio document we want to transcribe.

 

- Missions:

 

We assume that an audio document to be transcribed contains missing proper names, i.e. proper names that are pronounced in the audio document but are not in the lexicon of the automatic speech recognition system; these proper names cannot be recognized (out-of-vocabulary proper names, OOV PNs). The purpose of this work is to design a methodology to find and model a list of relevant OOV PNs that correspond to an audio document.

 

Assuming that we have an approximate transcription of the audio document and a huge text corpus extracted from the Internet, several methodologies could be studied:

  • From the approximate OOV pronunciation in the transcription, generate the possible writings of the word (phoneme to character conversion) and search this word in the text corpus.

  • A deep neural network can be designed to predict OOV proper names and their pronunciations with the training objective to maximize the retrieval of relevant OOV proper names.
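The first approach in the list above can be illustrated with a minimal Python sketch. The phoneme-to-grapheme table, the phoneme symbols and the tiny corpus vocabulary are hypothetical toy values chosen for illustration, not resources of the project; a real system would presumably learn the phoneme-to-character conversion from data rather than enumerate spellings from a hand-written table.

```python
from itertools import product

# Hypothetical toy table: each phoneme maps to a few possible spellings.
PHONEME_TO_GRAPHEMES = {
    "f": ["f", "ph"],
    "o": ["o", "au", "eau"],
    "R": ["r", "rr", "hr"],
}

def candidate_spellings(phonemes):
    """Enumerate every spelling obtained by choosing one grapheme per phoneme."""
    options = [PHONEME_TO_GRAPHEMES[p] for p in phonemes]
    return {"".join(choice) for choice in product(*options)}

def find_in_corpus(phonemes, corpus_vocabulary):
    """Keep only the candidate spellings that are attested in the text corpus."""
    vocab = {w.lower() for w in corpus_vocabulary}
    return sorted(candidate_spellings(phonemes) & vocab)

# Toy corpus vocabulary (assumed); only "Fohr" matches a generated spelling.
corpus_vocabulary = ["Fohr", "four", "Illina", "Nancy"]
matches = find_in_corpus(["f", "o", "R"], corpus_vocabulary)
print(matches)
```

Exhaustive enumeration grows exponentially with the length of the phoneme sequence, so this only conveys the idea of generating written forms and filtering them against the corpus.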

 

The proposed approaches will be validated using the ASR developed in our team.

 

Keywords: deep neural networks, automatic speech recognition, lexicon, out-of-vocabulary words.

 

- Bibliography

[Mikolov2013] Mikolov, T., Chen, K., Corrado, G. and Dean, J. "Efficient estimation of word representations in vector space", Workshop at ICLR, 2013.

[Deng2013] Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y. and Acero, A. "Recent advances in deep learning for speech research at Microsoft", Proceedings of ICASSP, 2013.

[Sheikh2016] Sheikh, I., Illina, I., Fohr, D., Linarès, G. "Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition", Interspeech, 2016.

[Li2017] Li, J., Ye, G., Zhao, R., Droppo, J., Gong, Y. "Acoustic-to-Word Model without OOV", ASRU, 2017.

 

 

- Skills and profile: PhD in computer science, background in statistics and natural language processing, experience with deep learning tools (keras, kaldi, etc.) and programming skills (Perl, Python).

- Additional information:

 

Supervision and contact: Irina Illina, LORIA/INRIA (illina@loria.fr), Dominique Fohr INRIA/LORIA (dominique.fohr@loria.fr) https://members.loria.fr/IIllina/, https://members.loria.fr/DFohr/

 

Additional links : Ecole Doctorale IAEM Lorraine

 

Deadline to apply: May 20th

Selection results: end of June

 

Duration: 12 months.

Starting date: between Nov. 1st 2018 and Jan. 1st 2019
Salary: about 2,115 euros net, medical insurance included

 

The candidates must have defended their PhD later than Sept. 1st 2016 and before the end of 2018. 

The candidates are required to provide the following documents in a single pdf or ZIP file: 

  • CV including a description of your research activities (2 pages max) and a short description of what you consider to be your best contributions and why (1 page max and 3 contributions max); the contributions could be theoretical or  practical. Web links to the contributions should be provided. Include also a brief description of your scientific and career projects, and your scientific positioning regarding the proposed subject.

  • The report(s) from your PhD external reviewer(s), if applicable.

  • If you haven't defended yet, the list of expected members of your PhD committee (if known) and the expected date of defence.

In addition, at least one recommendation letter from the PhD advisor should be sent directly by their author(s) to the prospective postdoc advisor.

 

Help and benefits:

 

  • Possibility of free French courses

  • Help for finding housing

  • Help for the resident card procedure and for husband/wife visa


6-6(2018-04-16) PhD grant, INRIA Nancy France
 
Natural language processing: adding new words to a speech recognition system using Deep Neural Networks
 
 
- Location: INRIA/LORIA Nancy Grand Est research center France
- Project-team: Multispeech
- Scientific Context:

Voice is seen as the next big field for computer interaction. The research company Gartner reckons that by 2018, 30% of all interactions with devices will be voice-based: people can speak up to four times faster than they can type, and the technology behind voice interaction is improving all the time.

As of October 2017, Amazon Echo is present in about 4% of American households. Voice assistants are proliferating in smartphones too: Apple's Siri handles over 2 billion commands a week, and 20% of Google searches on Android-powered handsets in America are done by voice input.

The proper nouns (PNs) play a particular role: they are often important to understand a message and can vary enormously. For example, a voice assistant should know the names of all your friends; a search engine should know the names of all famous people and places, names of museums, etc.

An automatic speech recognition (ASR) system uses a lexicon containing the most frequent words of the language, and only the words of the lexicon can be recognized by the system. It is impossible to add all possible proper names because there are millions of proper names and new ones appear every day. A competitive solution is to dynamically add new PNs into the ASR system. The idea is to add only relevant proper names: for instance, if we want to transcribe a video document about football results, we should add the names of famous football players and not politicians.

In this study, we will focus on the problem of proper names in automatic recognition systems. The problem is to find relevant proper names for the audio document we want to transcribe. To select the relevant proper names, we propose to use an artificial neural network.

- Missions:

We assume that an audio document to be transcribed contains missing proper names, i.e. proper names that are pronounced in the audio document but are not in the lexicon of the automatic speech recognition system; these proper names cannot be recognized (out-of-vocabulary proper names, OOV PNs).

The goal of this PhD thesis is to find a list of relevant OOV PNs that correspond to an audio document and to integrate them in the speech recognition system. We will use a deep neural network (DNN) to find relevant OOV PNs. The input of the DNN will be the approximate transcription of the audio document and the output will be the list of relevant OOV PNs with their probabilities. The retrieved proper names will be added to the lexicon and a new recognition of the audio document will be performed.
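As an illustration of the input/output behaviour described above (approximate transcription in, ranked OOV proper names with probabilities out), here is a minimal Python sketch. It is not the network of the project: a single dot-product layer followed by a softmax stands in for the deep network, and the word vectors, candidate names and their vectors are hypothetical, untrained toy values.

```python
import math

# Hypothetical 2-d word embeddings (toy values, not trained).
WORD_VECTORS = {
    "goal": [1.0, 0.0],
    "match": [0.9, 0.1],
    "vote": [0.0, 1.0],
}
# Hypothetical embeddings of candidate OOV proper names.
PN_VECTORS = {
    "FootballerX": [1.0, 0.0],
    "PoliticianY": [0.0, 1.0],
}

def document_vector(words):
    """Average the vectors of the known words (a bag-of-words input layer)."""
    known = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not known:
        return [0.0, 0.0]
    return [sum(v[i] for v in known) / len(known) for i in range(2)]

def rank_proper_names(words):
    """Score each candidate by dot product, then normalize with a softmax."""
    doc = document_vector(words)
    scores = {pn: sum(a * b for a, b in zip(doc, vec))
              for pn, vec in PN_VECTORS.items()}
    z = sum(math.exp(s) for s in scores.values())
    probs = {pn: math.exp(s) / z for pn, s in scores.items()}
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

# Approximate transcription of a (toy) football document.
ranking = rank_proper_names("the match ended with a late goal".split())
print(ranking)
```

In this toy setting the football-related name is ranked first for a football-related transcription; the top-ranked names would then be the ones added to the lexicon before the second recognition pass.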

During the thesis, the student will investigate methodologies based on deep neural networks [Deng2013]. The candidate will study different DNN structures and different representations of documents [Mikolov2013]. The student will validate the proposed approaches using the automatic transcription system for radio broadcasts developed in our team.

- Bibliography:

 

[Mikolov2013] Mikolov, T., Chen, K., Corrado, G. and Dean, J. "Efficient estimation of word representations in vector space", Workshop at ICLR, 2013.

 

[Deng2013] Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y. and Acero, A. "Recent advances in deep learning for speech research at Microsoft", Proceedings of ICASSP, 2013.

 

[Sheikh2016] Sheikh, I., Illina, I., Fohr, D., Linarès, G. "Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition", Interspeech, 2016.

- Skills and profile: Master in computer science, background in statistics and natural language processing, experience with deep learning tools (keras, kaldi, etc.) and programming skills (Perl, Python).

- Additional information:

 

Supervision and contact: Irina Illina, LORIA/INRIA (illina@loria.fr), Dominique Fohr INRIA/LORIA (dominique.fohr@loria.fr) https://members.loria.fr/IIllina/, https://members.loria.fr/DFohr/

Additional links: Ecole Doctorale IAEM Lorraine

 

Duration: 3 years

Starting date: between Oct. 1st 2018 and Jan. 1st 2019

Deadline to apply : May 1st 2018

 

The candidates are required to provide the following documents in a single pdf or ZIP file: 

  • CV

  • A cover/motivation letter describing their interest in the topic 

  • Degree certificates and transcripts for Bachelor and Master (or the last 5 years)

  • Master thesis (or equivalent) if it is already completed, or a description of the work in progress, otherwise

  • The publications (or web links) of the candidate, if any (it is not expected that they have any)

In addition, one recommendation letter from the person who supervises or supervised the Master thesis (or research project or internship) should be sent directly by its author to the prospective PhD advisor.


6-7(2018-04-17) PhD at LORIA Nancy France

Impact LUE Open Language and Knowledge for Citizens - OLKi
Application for a PhD grant 2018 co-supervised by the Crem and the Loria
"Online hate speech against migrants"

 

Deadline to apply : May 1st 2018

 

According to the 2017 International Migration Report, the number of international migrants worldwide has continued to grow rapidly in recent years, reaching 258 million in 2017, up from 220
million in 2010 and 173 million in 2000. In 2017, 64 per cent of all international migrants worldwide,
equal to 165 million international migrants, lived in high-income countries; 78 million of them were
residing in Europe. Since 2000, Germany and France figure among the countries hosting the largest
number of international migrants. A key reason for the difficulty of EU leaders to take a decisive and
coherent approach to the refugee crisis has been the high levels of public anxiety about immigration
and asylum across Europe. Indeed, across the EU, attitudes towards asylum and immigration have
hardened in recent years because of (Berri et al., 2015): (i) the increase in the number and visibility of
migrants in recent years, (ii) the economic crisis and austerity policies enacted since the 2008 Global
Financial Crisis, (iii) the role of the mass media in influencing public and elite political attitudes towards
asylum and migration. Refugees and migrants tend to be framed negatively as a problem, potentially
nourishing.

Indeed, the BRICkS (Building Respect on the Internet by Combating Hate Speech) EU project [1]
has revealed a significant increase in the use of hate speech towards immigrants and minorities, who
are often blamed for current economic and social problems. The participatory web and
social media seem to accelerate this tendency, accentuated by the rapid online spread of fake news,
which often reinforces online violence towards migrants. Based on existing research, Carla Schieb and
Mike Preuss (2016) highlight that hate speech deepens prejudice and stereotypes in a society (Citron &
Norton, 2011). It also has a detrimental effect on mental health and emotional well-being of targeted
groups, especially on targeted individuals (Festl & Quandt, 2013) and is a source of harm in general for
those under attack (Waldron, 2012), when culminating in violent acts incited by hateful speech. Such
violent hate crimes may erupt in the aftermath of certain key events, e.g. anti-Muslim hate crimes in
response to the 9/11 terrorist attacks (King & Sutton, 2013).

Hate speech and fake news are not, of course, just problems of our times. Hate speech has always
been part of antisocial behavior such as bullying or stalking (Delgado & Stefancic, 2014); "trapped",
emotional, unverified and/or biased contents have always existed (Dauphin, 2002; Froissart, 2002, 2004;
Lebre, 2014) and need to be understood on an anthropological level as reflections of people's fears,
anxieties or fantasies. They reveal what Marc Angenot calls a certain "state of society" (Angenot, 1978;
1989; 2006). Indeed, according to this author, analysis of situated specific discourses sheds light on some
of the topoi (common premises and patterns) that characterize public doxa. This "gnoseological"
perspective reveals the ways the visions of the "world" can be systematically schematized on linguistic
materials at a certain moment.

Within this context and problematic, the PhD project jointly proposed by the Crem and the Loria
aims to analyse hate speech towards migrants in social media and more particularly on Twitter.
It seeks to provide answers to the following questions:
• What are the representations of migrants as they emerge in hate speech on Twitter?
• What themes are they associated with?
• What can the latter tell us about the "state" of our society, in the sense previously given to this
term by Marc Angenot?

Secondary questions will also be addressed so as to refine the main results:
• What is the origin of these messages? (individual accounts, political party accounts, bots, etc.)
• What is the circulation of these messages? (reactions, retweets, interactions, etc.)
• Can we measure the emotional dimension of these messages? Based on which indicators?
• Can a scale be established to measure the intensity of hate in speech?

[1] http://www.bricks-project.eu/wp/about-the-project/

More and more audio, video and text content appears on the Internet each day. About 300 hours of multimedia are
uploaded per minute. In these multimedia sources, manual content retrieval is difficult or impossible.
The classical approach for spoken content retrieval from multimedia documents is automatic text
retrieval. Automatic text classification is one of the widely used technologies for the above purposes.
In text classification, text documents are usually represented in some so-called vector space and then
assigned to predefined classes through supervised machine learning. Each document is represented as a
numerical vector, which is computed from the words of the document. How to numerically represent
the terms in an appropriate way is a basic problem in text classification tasks and directly affects the
classification accuracy. Sometimes, in text classification, the classes cannot be defined in advance. In
this case, unsupervised machine learning is used and the challenge consists in finding underlying
structures from unlabeled data. We will use methodologies to perform one of the important tasks of text
classification: automatic hate speech detection.
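
The vector-space representation and supervised class assignment described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: word-count vectors and a nearest-centroid classifier stand in for the word embeddings and neural networks considered in the project, and the training sentences and class names are hand-written placeholders, not project data.

```python
from collections import Counter
import math

def vectorize(text, vocabulary):
    """Represent a document as word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocabulary]

def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def train_centroids(labelled_docs, vocabulary):
    """Average the vectors of each class (a nearest-centroid classifier)."""
    sums, sizes = {}, Counter()
    for text, label in labelled_docs:
        vec = vectorize(text, vocabulary)
        acc = sums.setdefault(label, [0.0] * len(vocabulary))
        for i, x in enumerate(vec):
            acc[i] += x
        sizes[label] += 1
    return {lab: [x / sizes[lab] for x in acc] for lab, acc in sums.items()}

def classify(text, centroids, vocabulary):
    """Assign the class whose centroid is most similar to the document vector."""
    vec = vectorize(text, vocabulary)
    return max(centroids, key=lambda lab: cosine(vec, centroids[lab]))

# Toy, hand-written training data with placeholder class names.
train = [
    ("welcome friends welcome", "neutral"),
    ("friends help neighbours", "neutral"),
    ("go away outsiders", "hostile"),
    ("outsiders go home", "hostile"),
]
vocabulary = sorted({w for text, _ in train for w in text.split()})
centroids = train_centroids(train, vocabulary)
label = classify("outsiders should go", centroids, vocabulary)
print(label)
```

Words absent from the training vocabulary (here "should") are simply ignored by the count vectors, which is one reason the choice of numerical representation directly affects classification accuracy.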

Developments in neural networks (Mikolov et al., 2013a) led to a renewed interest in the field of
distributional semantics, more specifically in learning word embeddings (representation of words in a
continuous space). Computational efficiency was one big factor which popularized word embeddings.
The word embeddings capture syntactic as well as semantic properties of the words (Mikolov et al.,
2013b). As a result, they outperformed several other word vector representations on different tasks
(Baroni et al., 2014).

Our methodology for hate speech classification will build on recent approaches to text
classification with neural networks and word embeddings. In this context, fully connected feed forward
networks (Iyyer et al., 2015; Nam et al., 2014), Convolutional Neural Networks (CNN) (Kim, 2014;
Johnson and Zhang, 2015) and also Recurrent/Recursive Neural Networks (RNN) (Dong et al., 2014)
have been applied. On the one hand, the approaches based on CNN and RNN capture rich compositional
information, and have outperformed the state-of-the-art results in text classification; on the other hand
they are computationally intensive and require careful hyperparameter selection and/or regularization
(Dai and Le, 2015).

This thesis aims at proposing concepts, analyses and software components (Hate Speech Domain
Specific Analysis and related software tools in connection with migrants in social media) to bridge the
gap between conceptual requirements and multi-source information from social media. Automatic hate
speech detection software will be tested on the modeling of various hate speech phenomena, and
their domain relevance will be assessed with both partners.
The language of the analysed messages will be primarily French, although links with other languages
(including messages written in English) may appear throughout the analysis.
This PhD project complies with the Impact OLKi (Open Language and Knowledge for Citizens)
framework because:
• It is centred on language.
• It aims to implement new methods to study and extract knowledge from linguistic data
(indicators, scales of measurement).
• It opens perspectives to produce technical solutions (applications, etc.) for citizens and digital
platforms, to better control the potential negative use of language data.
Scientific challenges:
• to study and extract knowledge from linguistic data that concern hate speech towards migrants in
social media;
• to better understand hate speech as a social phenomenon, based on the data extracted and analysed;
• to propose and assess new methods based on Deep Learning for automatic detection of documents
containing hate speech. This will make it possible to set up an online hate speech management protocol.

Keywords: hate speech, migrants, social media, natural language processing.
Doctoral school: Computer Science (IAEM)
Principal supervisor: Irina Illina, Assistant Professor in Computer Science, irina.illina@loria.fr
Co-supervisors:
Angeliki Monnier (Crem), Professor Information-Communication, angeliki.monnier@univ-lorraine.fr
Dominique Fohr (Loria), Research scientist CNRS, dominique.fohr@loria.fr

References
Angenot M (1978) Fonctions narratives et maximes idéologiques. Orbis Litterarum 33: 95-100.
Angenot M (1989) 1889 : un état du discours social. Montréal : Préambule.
Angenot M (2006) Théorie du discours social. Notions de topographie des discours et de coupures cognitives,
COnTEXTES. https://contextes.revues.org/51.
Baroni, M., Dinu, G., and Kruszewski, G. (2014). "Don't count, predict! A systematic comparison of context-counting
vs. context-predicting semantic vectors". In Proceedings of the 52nd Annual Meeting of the
Association for Computational Linguistics, Volume 1, pages 238-247.
Berri M, Garcia-Blanco I, Moore K (2015), Press coverage of the Refugee and Migrant Crisis in the EU: A Content
Analysis of five European Countries, Report prepared for the United Nations High Commission for Refugees,
Cardiff School of Journalism, Media and Cultural Studies.
Chouliaraki L, Georgiou M and Zaborowski R (2017), The European "migration crisis" and the media: A
cross-European press content analysis. The London School of Economics and Political Science, London, UK.
Citron, D. K., Norton, H. L. (2011), "Intermediaries and hate speech: Fostering digital citizenship for our
information age", Boston University Law Review, 91, 1435.
Dai, A. M. and Le, Q. V. (2015). "Semi-supervised sequence learning". In Cortes, C., Lawrence, N. D., Lee, D.
D., Sugiyama, M., and Garnett, R., editors, Advances in Neural Information Processing Systems 28, pages
3061-3069. Curran Associates, Inc.
Dauphin F (2002), Rumeurs électroniques : synergie entre technologie et archaïsme. Sociétés 76 : 71-87.
Delgado R., Stefancic J. (2014), "Hate speech in cyberspace", Wake Forest Law Review, 49.
Dong, L., Wei, F., Tan, C., Tang, D., Zhou, M., and Xu, K. (2014). "Adaptive recursive neural network for
target-dependent Twitter sentiment classification". In Proceedings of the 52nd Annual Meeting of the Association for
Computational Linguistics, ACL, Baltimore, MD, USA, Volume 2: pages 49-54.
Festl R., Quandt T (2013), Social relations and cyberbullying: The influence of individual and structural attributes
on victimization and perpetration via the internet, Human Communication Research, 39(1), 101-126.
Froissart P (2002) Les images rumorales, une nouvelle imagerie populaire sur Internet. Bry-Sur-Marne : INA.
Froissart P (2004) Des images rumorales en captivité : émergence d'une nouvelle catégorie de rumeur sur les sites
de référence sur Internet. Protée 32(3) : 47-55.
Johnson, R. and Zhang, T. (2015). "Effective use of word order for text categorization with convolutional neural
networks". In Proceedings of the 2015 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, pages 103-112.
Iyyer, M., Manjunatha, V., Boyd-Graber, J., and Daumé, H. (2015). "Deep unordered composition rivals syntactic
methods for text classification". In Proceedings of the 53rd Annual Meeting of the Association for
Computational Linguistics, volume 1, pages 1681-1691.
Kim, Y. (2014). "Convolutional neural networks for sentence classification". In Proceedings of the Conference on
Empirical Methods in Natural Language Processing (EMNLP), pages 1746-1751.
King R. D., Sutton G. M. (2013). High times for hate crimes: Explaining the temporal clustering of hate-motivated
offending. Criminology, 51 (4), 871-894.
Lebre J (2014) Des idées partout : à propos du partage des hoaxes entre droite et extrême droite. Lignes 45: 153-
162.
Mikolov, T., Yih, W.-t., and Zweig, G. (2013a). “Linguistic regularities in continuous space word representations”.
In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, pages 746-751.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013b). “Distributed representations of words
and phrases and their compositionality”. In Advances in Neural Information Processing Systems, 26, pages
3111-3119. Curran Associates, Inc.
Nam, J., Kim, J., Loza Mencía, E., Gurevych, I., and Fürnkranz, J. (2014). “Large-scale multi-label text
classification - revisiting neural networks”. In Proceedings of the European Conference on Machine Learning
and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD-14), Part 2, volume 8725,
pages 437-452.
Schieb C, Preuss M (2016), Governing Hate Speech by Means of Counter Speech on Facebook, 66th ICA Annual
Conference, Fukuoka, Japan.
United Nations (2018), International Migration Report 2017. Highlights, New York, Department of Economic
and Social Affairs.
Waldron J. (2012), The harm in hate speech, Harvard University Press.


6-8(2018-04-17) PhD grant at Loria, Nancy, France

Thesis title: Expressive speech synthesis based on deep learning

Location: INRIA Nancy Grand Est research center, LORIA Laboratory, Nancy, France
Research theme: Perception, Cognition, Interaction
Project-team: MULTISPEECH (https://team.inria.fr/multispeech/)
Scientific context: Over the last decades, text-to-speech synthesis (TTS) has reached good quality and intelligibility, and is now commonly used in information delivery services, for example in call center automation, navigation systems, and voice assistants. In the past, the main goal when developing TTS systems was to achieve high intelligibility. The speech style was then typically a “reading style”, which resulted from the style of the speech data used to develop TTS systems (reading of a large set of sentences). Although a reading style is acceptable for occasional interactions, TTS systems would benefit from more variability and expressivity in the generated synthetic speech, for example for lengthy interactions between machines and humans, or for entertainment applications. This is the goal of recent or emerging research on expressive speech synthesis. In contrast to neutral speech, which is typically read speech that conveys no particular emotion, expressive speech can be defined as speech carrying an emotion, speech spoken as in spontaneous conversation, or speech with emphasis placed on some words.
Missions (objectives, approach, etc.): Deep learning approaches lead to good speech synthesis quality; however, the main scientific and technological barrier remains the need for a speech corpus that matches the speaker and the target style conditions, here expressive speech. This thesis aims at investigating approaches to overcome this barrier. More precisely, the objective is to propose and investigate approaches allowing expressive speech synthesis for a given speaker voice, using both the neutral speech data of that speaker, or the corresponding neutral speech model, and expressive speech data from other speakers. This will avoid lengthy and costly recording of specific ad hoc expressive speech corpora (e.g., emotional speech data from the target voice speaker). Recall that three main steps are involved in parametric speech synthesis: the generation of sequences of basic units (phonemes, pauses, etc.) from the source text; the generation of prosody parameters (durations of sounds, pitch values, etc.); and finally the generation of acoustic parameters, which leads to the synthetic speech signal. All the levels are involved in expressive speech synthesis: alteration of pronunciations and presence of pauses, modification of prosody correlates, and modification of the spectral characteristics. The thesis will essentially focus on the last two points, i.e., a correct prediction of prosody and spectral characteristics to produce expressive speech through deep learning-based approaches. Some aspects to be investigated include the combined use of only the neutral speech data of the target voice speaker and expressive speech of other speakers in the training process, or in an adaptation process, as well as data augmentation processes. The baseline experiments will rely on neutral speech corpora and expressive speech corpora previously collected for speech synthesis in the Multispeech team.
Further experiments will consider using other expressive speech data, possibly extracted from audiobooks.
Skills and profile:
  • Master in automatic language processing or in computer science
  • Background in statistics and in deep learning
  • Experience with deep learning tools
  • Good computer skills (preferably in Python)
  • Experience in speech synthesis is a plus
 
Bibliography:
[Sch01] M. Schröder. Emotional speech synthesis: A review. Proc. EUROSPEECH, 2001.
[Sch09] M. Schröder. Expressive speech synthesis: Past, present, and possible futures. Affective information processing, pp. 111–126, 2009.
[ICHY03] A. Iida, N. Campbell, F. Higuchi and M. Yasumura. A corpus-based speech synthesis system with emotion. Speech Communication, vol. 40, n. 1, pp. 161–187, 2003.
[PBE+06] J.F. Pitrelli, R. Bakis, E.M. Eide, R. Fernandez, W. Hamza and M.A. Picheny. The IBM expressive text-to-speech synthesis system for American English. IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, n. 4, pp. 1099–1108, 2006.
[JZSC05] D. Jiang, W. Zhang, L. Shen and L. Cai. Prosody analysis and modeling for emotional speech synthesis. Proc. ICASSP, 2005.
[WSV+15] Z. Wu, P. Swietojanski, C. Veaux, S. Renals, S. King. A study of speaker adaptation for DNN-based speech synthesis. Proc. INTERSPEECH, pp. 879–883, 2015.
Additional information:
Supervision and contact: Denis Jouvet (denis.jouvet@loria.fr; https://members.loria.fr/DJouvet/), Vincent Colotte (Vincent.colotte@loria.fr; https://members.loria.fr/VColotte/)
Additional link: Ecole Doctorale IAEM Lorraine (http://iaem.univ-lorraine.fr/)
Duration: 3 years
Starting date: autumn 2018
Deadline to apply: May 1st, 2018
The candidates are required to provide the following documents in a single pdf or ZIP file:
  • CV
  • A cover/motivation letter describing their interest in the topic
  • Degree certificates and transcripts for Bachelor and Master (or the last 5 years)
  • Master thesis (or equivalent) if it is already completed; otherwise, a description of the work in progress
  • The publications (or web links) of the candidate, if any (it is not expected that they have any)
In addition, one recommendation letter from the person who supervises or supervised the Master thesis (or research project or internship) should be sent directly by its author to the prospective PhD advisor.


6-9(2018-04-19) PhD at Le Mans University, France

Title of the PhD thesis: Automatic speech processing in meetings using a microphone array

 

Key words: Reverberant environment – Array & Beamforming – Signal processing – Deep learning – Transcription and speaker recognition

 

 

Supervision : Silvio Montrésor (LAUM), Anthony Larcher (LIUM), Jean-Hugh Thomas (LAUM)

 

Funding: LMAC (Scientific bets of Le Mans Acoustique)

 

Beginning : September 2018

 

Contact : jean-hugh.thomas@univ-lemans.fr

 

Aim of the PhD thesis

The subject is supported by two laboratories of Le Mans Université: the acoustics lab (LAUM) and the computer science lab (LIUM). The aim is to enhance automatic speech processing in meetings (transcription and speaker recognition) by using a microphone-array recording device and the associated audio signal processing.

 

 

Subject of the PhD thesis

The work consists of implementing a hands-free system that can localise the speakers in a room, separate the signals emitted by these speakers, and enhance the speech signal and its processing.

           

The thesis will address the following issues:

 

-       Define an array geometry adapted to distant sound recording with few microphones.

 

-       Propose processing that takes advantage of the acoustic data provided by the array and selects the parts of the audio signals (reflection orders) most relevant for enhancing the performance of the LIUM automatic speech recognition system. The processing should take into account the confined environment (meeting room). It will also use source separation algorithms to identify the different speakers during the meeting.

 

-       Propose new developments to the usual feature extraction methods so that the features extracted from the signal are more relevant for the neural network.

 

-       Propose a learning strategy for the neural network to enhance the transcription performance.

 

Some references

[1] J. H. L. Hansen, T. Hasan, Speaker recognition by machines and humans, IEEE Signal Processing Magazine, 74, 2015.

 

[2] L. Deng, G. Hinton, B. Kingsbury, New types of deep neural network learning for speech recognition and related applications: An overview, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 8599-8603).

 

[3] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, B. Kingsbury, Deep neural networks for acoustic modelling in speech recognition, IEEE Signal Processing Magazine, 82, 2012.

[4] P. Bell, M. J. F. Gales, T. Hain, J. Kilgour, P. Lanchantin, X. Liu, A. McParland, S. Renals, O. Saz, M. Wester, et al., The MGB challenge: Evaluating multi-genre broadcast media recognition, Proc. of ASRU, Arizona, USA, 2015.

 

[5] T. B. Spalt, Background noise reduction in wind tunnels using adaptive noise cancellation and cepstral echo removal techniques for microphone array applications, Master of Science in Mechanical Engineering, Hampton, Virginia, USA, 2010.

 

[6] D. Blacodon, J. Bulté, Reverberation cancellation in a closed test section of a wind tunnel using a multi-microphone cepstral method, Journal of Sound and Vibration 333, 2669-2687 (2014).

 

[7] Q.-G. Liu, B. Champagne, P. Kabal, A microphone array processing technique for speech enhancement in a reverberant space, Speech Communication 18 (1996) 317-334.

 

[8] S. Doclo, Multi-microphone noise reduction and de-reverberation techniques for speech applications, Thesis, Leuven (Belgium), 2003.

 

[9] Y. Liu, N. Nower, S. Morita, M. Unoki, Speech enhancement of instantaneous amplitude and phase for applications in noisy reverberant environments, Speech Communication 84 (2016) 1-14.

 

[10] Feng, X., Zhang, Y., & Glass, J. (2014, May). Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1759-1763). IEEE.

[11] Kinoshita, K., Delcroix, M., Yoshioka, T., Nakatani, T., Sehr, A., Kellermann, W., & Maas, R. (2013, October). The reverb challenge: A common evaluation framework for dereverberation and recognition of reverberant speech. In 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (pp. 1-4). IEEE.

 

[12] Xiong X., Watanabe S., Erdogan H., Lu L., Hershey J., Seltzer M. L., Chen G., Zhang Y., Mandel M., Yu D., Deep Beamforming Networks for Multi-Channel Speech Recognition, Proceedings of ICASSP 2016, pp 5745-5749.


6-10(2018-04-15) PhD Project Australia-France


PhD Project – Call for Applications
Situated Learning for Collaboration across Language Barriers

People working in development are often deployed to remote locations where they work alongside locals who speak an unwritten minority language. Outsiders and locals share know-how and pick up phrases in each other’s languages. They are performing a type of situated learning of language and culture. This situation is found across the world, in developing countries, border zones, and in indigenous communities. This project will develop computational tools to help people work together across language barriers. The research will be evaluated in terms of the quality of the social interaction, the mutual acquisition of language and culture, the effectiveness of cross-lingual collaboration, and the quantity of translated speech data collected. The ultimate goal is to contribute to the grand task of documenting the world’s languages. The project will involve working between France and Australia, and will include fieldwork with a remote indigenous community. We’re looking for outstanding and highly motivated candidates to work on a PhD on this subject. Competencies in two or more of the following areas are mandatory:

• machine learning for natural language processing;

• speech processing for interactive systems;

• participatory design;

• mobile software development;

• documenting and describing unwritten languages.

The project will build on previous work in the following areas: mobile platforms for collecting spoken language data [6, 7]; respeaking as a technique for improving the value of recordings made ‘in the wild’ and an alternative to traditional transcription practices [12, 13]; machine learning of structure in phrase-aligned bilingual speech recordings [2, 3, 4, 8, 9, 10, 11]; participatory design of mobile technologies for working with minority languages [5]; managing multilingual databases of text, speech and images [1]. Some recent indicative PhD theses include: Computer Supported Collaborative Language Documentation (Florian Hanke, 2017); Automatic Understanding of Unwritten Languages (Oliver Adams, 2018); Collecter, Transcrire, Analyser : quand la Machine Assiste le Linguiste dans son Travail de Terrain (Elodie Gauthier, 2018); Enriching Endangered Language Resources using Translations (Antonios Anastasopoulos, in prep); Digital Tool Deployment for Language Documentation (Mat Bettinson, in prep); Bayesian and Neural Modeling for Multi Level and Crosslingual Alignment (Pierre Godard, in prep).
Details of the position. Funding includes remission of university fees, a stipend of approximately €17,500 per year, and a travel allowance. The position starts in Fall 2018 (i.e., from September) and lasts for three years. The research will be supervised by Steven Bird (Charles Darwin University, Australia) and Laurent Besacier (Univ. Grenoble Alpes, France). Acceptance will be subject to approval by both host institutions (Grenoble and Darwin). Given the cross-cultural nature of the project, the successful candidate will have demonstrated substantial experience of cross-cultural living.


Apply. To apply, please contact laurent.besacier@univ-grenoble-alpes.fr and steven.bird@cdu.edu.au, including a cover letter, curriculum vitae, academic transcripts and a reference letter from your MSc thesis advisor.


Institutions. The University of Grenoble offers an excellent research environment with ample compute hardware to solve hard speech and natural language processing problems, as well as remarkable surroundings to explore over the weekends. Charles Darwin University is a research-intensive university attracting students from over 50 countries. CDU is situated in Australia’s tropical north, in the midst of one of the world’s hot-spots for linguistic diversity and language endangerment. Darwin is a youthful, multicultural, cosmopolitan city in a territory that is steeped in Aboriginal tradition and culture and which enjoys a close interaction with the peoples of Southeast Asia.


References
[1] Steven Abney and Steven Bird. The Human Language Project: building a universal corpus of the world’s languages. In Proceedings of the 48th Meeting of the Association for Computational Linguistics, pages 88–97. ACL, 2010.
[2] Oliver Adams, Graham Neubig, Trevor Cohn, and Steven Bird. Learning a translation model from word lattices. In Interspeech 2016, pages 2518–22, 2016.
[3] Antonios Anastasopoulos, Sameer Bansal, David Chiang, Sharon Goldwater, and Adam Lopez. Spoken term discovery for language documentation using translations. In Proceedings of the Workshop on Speech-Centric NLP, pages 53–58, 2017.
[4] Antonios Anastasopoulos and David Chiang. A case study on using speech-to-translation alignments for language documentation. In Proc. Workshop on Use of Computational Methods in Study of Endangered Languages, pages 170–178, 2017.
[5] Steven Bird. Designing mobile applications for endangered languages. In Kenneth Rehg and Lyle Campbell, editors, Oxford Handbook of Endangered Languages. Oxford University Press, 2018.
[6] Steven Bird, Florian R. Hanke, Oliver Adams, and Haejoong Lee. Aikuma: A mobile app for collaborative language documentation. In Proceedings of the Workshop on the Use of Computational Methods in the Study of Endangered Languages. ACL, 2014.
[7] David Blachon, Elodie Gauthier, Laurent Besacier, Guy-Noël Kouarata, Martine Adda-Decker, and Annie Rialland. Parallel speech collection for under-resourced language studies using the Lig-Aikuma mobile device app. In Proceedings of the Fifth Workshop on Spoken Language Technologies for Under-resourced languages, volume 81, pages 61–66, 2016.
[8] V. H. Do, N. F. Chen, B. P. Lim, and M. A. Hasegawa-Johnson. Multitask learning for phone recognition of underresourced languages using mismatched transcription. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26:501–514, 2018.
[9] Ewan Dunbar, Xuan Nga Cao, Juan Benjumea, Julien Karadayi, Mathieu Bernard, Laurent Besacier, Xavier Anguera, and Emmanuel Dupoux. The zero resource speech challenge 2017. In Automatic Speech Recognition and Understanding (ASRU), 2017 IEEE Workshop on. IEEE.
[10] Long Duong, Antonios Anastasopoulos, David Chiang, Steven Bird, and Trevor Cohn. An attentional model for speech translation without transcription. In Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 949–959, 2016.
[11] Pierre Godard, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Laurent Besacier, Helene Bonneau-Maynard, Guy-Noël Kouarata, Kevin Löser, Annie Rialland, and François Yvon. Preliminary experiments on unsupervised word discovery in Mboshi. In Interspeech 2016, 2016.
[12] Mark Liberman, Jiahong Yuan, Andreas Stolcke, Wen Wang, and Vikramjit Mitra. Using multiple versions of speech input in phone recognition. In ICASSP, pages 7591–95. IEEE, 2013.
[13] Anthony C. Woodbury. Defining documentary linguistics. Language Documentation and Description, 1:35–51, 2003.


6-11(2018-04-19) Joint PhD, Rennes/Dublin

Funded joint PhD between Univ Rennes and DIT, Dublin. The subject is about  « Deep neural natural language style transfer ».


6-12(2018-05-14) Assistant linguist (French)


 

Job Title:

Assistant Linguist [French]

Linguistic Field(s):

Phonetics, Phonology, Morphology, Semantics, Syntax, Lexicography, NLP

Location:

Paris, France

Job description:

The role of the Assistant Linguist is to annotate and review linguistic data in French. The Assistant Linguist will also contribute to a number of other tasks to improve natural language processing. The tasks include:

  • Providing phonetic/phonemic transcription of lexicon entries

  • Analyzing acoustic data to evaluate speech synthesis

  • Annotating and reviewing linguistic data

  • Labeling text for disambiguation, expansion, and text normalization

  • Annotating lexicon entries according to guidelines

  • Evaluating current system outputs

  • Deriving NLP data for new and on-going projects

  • Working independently with confidence and little oversight

Minimum Requirements:

  • Native speaker of French and fluent in English

  • Extensive knowledge of phonetic/phonemic transcriptions

  • Familiarity with TTS tools and techniques

  • Experience in annotation work

  • Knowledge of phonetics, phonology, semantics, syntax, morphology or lexicography

  • Excellent oral and written communication skills

  • Attention to detail and good organizational skills

Desired Skills:

  • Degree in Linguistics or Computational Linguistics or Speech processing

  • Ability to quickly grasp technical concepts; learn in-house tools

  • Keen interest in technology and computer-literate

  • Listening Skills

  • Fast and Accurate Keyboard Typing Skills

  • Familiarity with Transcription Software

  • Editing, Grammar Check and Proofing Skills

  • Research Skills

 

CV + motivation letter in English: maroussia.houimli@adeccooutsourcing.fr


6-13(2018-05-20) Postdoc position in social robotics, Uppsala University, Sweden

Postdoc position in social robotics

Uppsala Social Robotics Lab

Department of Information Technology

Uppsala University, Sweden

 

Uppsala University is an international research university focused on the development of science and education. Our most important assets are all the individuals who with their curiosity and their dedication make Uppsala University one of Sweden’s most exciting work places. Uppsala University has 45,000 students, 6,800 employees and a turnover of SEK 6,300 million. With approximately 275 employees, including 110 senior faculty and 120 PhD students, and more than 4000 students enrolled annually, the Department of Information Technology (http://www.it.uu.se/first?lang=en) is one of Uppsala University’s largest departments.

 

The Uppsala Social Robotics Lab (http://hri.research.it.uu.se/) led by Dr. Ginevra Castellano aims to design and develop robots that learn to interact socially with humans and bring benefits to the society we live in, for example in application areas such as education and assistive technology.

 

We are receiving expressions of interest for an upcoming two-year postdoctoral researcher position in social robotics, specifically on the topic of social learning for co-adaptive social human-robot interactions.

 

The postdoctoral researcher will have the opportunity to work in one or more projects on personalised and co-adaptive human-robot interaction, funded by the Swedish Research Council and the Swedish Foundation for Strategic Research, in collaboration with KTH Stockholm and the University of Gothenburg.

 

The researcher will be part of the Uppsala Social Robotics Lab at the Division of Visual Information and Interaction of the Department of Information Technology.

The Uppsala Social Robotics Lab’s focus is on natural interaction with social artefacts such as robots and embodied virtual agents. This domain concerns bringing together multidisciplinary expertise to address new challenges in the area of social robotics, including mutual human-robot co-adaptation, multimodal multiparty natural interaction with social robots, multimodal human affect and social behavior recognition, multimodal expression generation, robot learning from users, behavior personalization, effects of embodiment (physical robot versus embodied virtual agent) and other fundamental aspects of human-robot interaction (HRI). State of the art robots are used, including the Pepper, Nao and Furhat robotic platforms. The Lab is involved in a number of different national and EU-funded projects in collaborations with international partners.

 

How to send expressions of interest:

To express their interest, candidates should submit a CV, a 1-page research statement and a cover letter (indicating the name of referees and the earliest possible start date) to Ginevra Castellano (ginevra.castellano@it.uu.se) by the 31st of May.

 

Requirements:

Qualifications: The candidates must have a PhD degree in human-robot interaction or related areas relevant to the postdoc topic. Good programming skills and ability to conduct user studies are required. The position is highly interdisciplinary and requires an understanding of and/or interest in psychology and social sciences. Experience in machine learning for human-robot interaction is appreciated.


6-14(2018-05-20) Post Doctoral Position (12 months), INRIA, Nancy, France
 
Post Doctoral Position (12 months)

Natural language processing: automatic speech recognition system using deep neural networks without out-of-vocabulary words

_______________________________________

- Location: INRIA Nancy Grand Est research center, France

 

- Research theme: PERCEPTION, COGNITION, INTERACTION

 

- Project-team: Multispeech

 

Deadline to apply: June 6th


- Scientific Context:

 

More and more audio and video content appears on the Internet each day: about 300 hours of multimedia are uploaded per minute. Audio data represents a very important part of these multimedia sources. If these documents are not transcribed, automatic content retrieval is difficult or impossible. The classical approach for spoken content retrieval from audio documents is automatic speech recognition followed by text retrieval.

 

An automatic speech recognition (ASR) system uses a lexicon containing the most frequent words of the language, and only the words of the lexicon can be recognized by the system. New Proper Names (PNs) appear constantly, requiring dynamic updates of the lexicons used by the ASR. These PNs evolve over time and no vocabulary will ever contain all existing PNs. When a person searches for a document, proper names are used in the query. If these PNs have not been recognized, the document cannot be found. These missing PNs can be very important for the understanding of the document.

 

In this study, we will focus on the problem of proper names in automatic recognition systems. The problem is how to model relevant proper names for the audio document we want to transcribe.

 

- Missions:

 

We assume that an audio document to be transcribed contains missing proper names, i.e. proper names that are pronounced in the audio document but that are not in the lexicon of the automatic speech recognition system; these proper names cannot be recognized (out-of-vocabulary proper names, OOV PNs). The purpose of this work is to design a methodology to find and model a list of relevant OOV PNs that correspond to an audio document.

 

Assuming that we have an approximate transcription of the audio document and a huge text corpus extracted from the Internet, several methodologies could be studied:

  • From the approximate OOV pronunciation in the transcription, generate the possible writings of the word (phoneme to character conversion) and search this word in the text corpus.

  • A deep neural network can be designed to predict OOV proper names and their pronunciations with the training objective to maximize the retrieval of relevant OOV proper names.
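
As an illustration of the first methodology above, here is a minimal sketch, not the team's method. It assumes a hand-written toy phoneme-to-grapheme table (the table, the function names and the sample corpus are all hypothetical), enumerates the possible spellings of an approximately recognized pronunciation, and looks them up in a tokenized text corpus. A real system would rather learn the conversion from data and rank candidates.

```python
from itertools import product

# Hypothetical toy phoneme-to-grapheme table (an assumption for illustration):
# each phoneme maps to the spellings it may take.
PHONEME_TO_GRAPHEMES = {
    "k": ["c", "k", "qu"],
    "a": ["a"],
    "r": ["r", "rr"],
    "o": ["o", "au", "eau"],
    "n": ["n", "nn"],
}


def candidate_spellings(phonemes):
    """Enumerate every spelling obtained by choosing one grapheme per phoneme."""
    # Phonemes missing from the table are written as themselves.
    options = [PHONEME_TO_GRAPHEMES.get(p, [p]) for p in phonemes]
    return {"".join(choice) for choice in product(*options)}


def find_in_corpus(phonemes, corpus_tokens):
    """Return the lowercased corpus words that match a candidate spelling."""
    candidates = candidate_spellings(phonemes)
    return sorted({tok.lower() for tok in corpus_tokens if tok.lower() in candidates})


# Hypothetical corpus and OOV pronunciation.
corpus = "President Macron met Chancellor Merkel in Paris".split()
print(find_in_corpus(["m", "a", "k", "r", "o", "n"], corpus))
```

Exhaustive enumeration grows exponentially with the length of the pronunciation, so this only conveys the idea on short words.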

 

The proposed approaches will be validated using the ASR developed in our team.

 

Keywords: deep neural networks, automatic speech recognition, lexicon, out-of-vocabulary words.

 

- Bibliography

[Mikolov2013] Mikolov, T., Chen, K., Corrado, G. and Dean, J. “Efficient estimation of word representations in vector space”, Workshop at ICLR, 2013.

[Deng2013] Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y. and Acero A. “Recent advances in deep learning for speech research at Microsoft”, Proceedings of ICASSP, 2013.

[Sheikh2016] Sheikh, I., Illina, I., Fohr, D., Linarès, G. “Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition”. Interspeech, 2016.

[Li2017] J. Li, G. Ye, R. Zhao, J. Droppo, Y. Gong, “Acoustic-to-Word Model without OOV”, ASRU, 2017.

 

 

- Skills and profile: PhD in computer science, background in statistics and natural language processing, experience with deep learning tools (Keras, Kaldi, etc.) and programming skills (Perl, Python).

- Additional information:

 

Supervision and contact: Irina Illina, LORIA/INRIA (illina@loria.fr), Dominique Fohr INRIA/LORIA (dominique.fohr@loria.fr) https://members.loria.fr/IIllina/, https://members.loria.fr/DFohr/

 

Additional links : Ecole Doctorale IAEM Lorraine

 

Deadline to apply: June 6th

Selection results: end of June

 

Duration: 12 months.

Starting date: between Nov. 1st 2018 and Jan. 1st 2019
Salary: about 2,115 euros net, medical insurance included

 

The candidates must have defended their PhD later than Sept. 1st 2016 and before the end of 2018. 

The candidates are required to provide the following documents in a single pdf or ZIP file: 

  • CV including a description of your research activities (2 pages max) and a short description of what you consider to be your best contributions and why (1 page max and 3 contributions max); the contributions could be theoretical or  practical. Web links to the contributions should be provided. Include also a brief description of your scientific and career projects, and your scientific positioning regarding the proposed subject.

  • The report(s) from your PhD external reviewer(s), if applicable.

  • If you haven't defended yet, the list of expected members of your PhD committee (if known) and the expected date of defence.

In addition, at least one recommendation letter from the PhD advisor should be sent directly by their author(s) to the prospective postdoc advisor.

 

Help and benefits:

 

  • Possibility of free French courses

  • Help for finding housing

  • Help for the resident card procedure and for husband/wife visa


6-15(2018-05-23) Postdoc and PhD positions at Saarland University, Germany

Postdoc and PhD positions at Saarland University
http://www.sfb1102.uni-saarland.de/

The CRC Information Density and Linguistic Encoding (SFB 1102) at Saarland University
invites applications for a range of PhD and post-doctoral positions available for its
second funding phase (7/2018-6/2022).

The CRC includes 16 research projects drawing upon computational linguistics,
psycholinguistics, sociolinguistics, diachronic linguistics, phonetics, discourse
linguistics, contrastive linguistics and translatology. We are seeking to recruit 7
Postdocs and 15 PhD students.

For the phonetics community, projects C1 and C4 will be most relevant, but you may want
to have a look at the other projects too.

Details on the projects and positions as well as instructions for applications are
available at

http://www.sfb1102.uni-saarland.de/?page_id=57

Application deadline: June 20, 2018
Starting date: flexible


6-16(2018-05-24) Permanent Web Developer position at ELDA, Paris, France

The European Language resources Distribution Agency (ELDA), a company specialised in Human Language Technologies within an international context, is currently seeking to fill an immediate vacancy for a permanent Web Developer position.

Under the supervision of the technical department manager, the responsibilities of the Web Developer consist in designing and developing web applications and software tools for linguistic data management.
Some of these software developments are carried out within the framework of European research and development projects and are published as free software.
Depending on the profile, the Web Developer could also participate in the maintenance and upgrading of the current linguistic data processing toolchains, while being hands-on whenever required by the language resource production and management team.

Profile:

  • Bachelor of Science (BAC + 3 / BAC + 4) in Computer Science or a related field
  • Proficiency in Python (at least 3 years of experience)
  • Hands-on experience in Django
  • Hands-on knowledge of a distributed version control system (Git preferred)
  • Knowledge of SQL and of RDBMS (PostgreSQL preferred)
  • Basic knowledge of JavaScript and CSS
  • Basic knowledge of Linux shell scripting
  • Practice of free software
  • Experience in natural language processing is a strong plus
  • Proficiency in French and English, with writing and documentation skills in both languages
  • Curious, dynamic and communicative, flexible to work on different tasks in parallel
  • Ability to work independently and as part of a multidisciplinary team
  • Citizenship (or residency papers) of a European Union country


Applications will be considered until the position is filled. The position is based in Paris.

Salary: Commensurate with qualifications and experience.
Benefits: complementary medical insurance; meal vouchers.

Applicants should email a cover letter addressing the points listed above together with a curriculum vitae to:

ELDA
9, rue des Cordelières
75013 Paris
FRANCE
Mail : job@elda.org

ELDA is acting as the distribution agency of the European Language Resources Association (ELRA). ELRA was established in February 1995, with the support of the European Commission, to promote the development and exploitation of Language Resources (LRs). Language Resources include all data necessary for language engineering, such as monolingual and multilingual lexica, text corpora, speech databases and terminology. The role of this non-profit membership Association is to promote the production of LRs, to collect and to validate them and, foremost, make them available to users. The association also gathers information on market needs and trends.

For further information about ELDA/ELRA, visit:
http://www.elra.info


6-17(2018-05-26) Junior PhD, R&D in Computer Science: Speech Synthesis / Artificial Intelligence

Junior PhD, R&D in Computer Science: Speech Synthesis / Artificial Intelligence

RD2 Conseil is a recruitment firm specialising in finding young PhD holders to meet the R&D needs of innovative SMEs and private companies that wish to acquire advanced scientific skills and genuine human resources for innovation.

We are currently recruiting a Junior PhD holder (M/F; first permanent contract, CDI) in Computer Science, specialised in speech processing, in particular using Artificial Intelligence techniques.

Our client is a startup, based in Paris and founded in 2014, that develops and manufactures an innovative technological product in the children's toy sector. The company has developed an interactive technological object that stimulates children's imagination without imposing images on them, by means of a "story factory" (« fabrique à histoires ») that lets the child select the parameters of the story that will be told: main character, setting, key objects. The product is already sold through major retailers (Fnac, Nature et Découverte, Oxybulle…).

Our client now aims to strengthen its Research & Development activities through technological developments focused on:
- speech synthesis and recognition, to allow better personalisation of the stories;
- the development of an artificial intelligence, through the automatic linking of themes and ideas.

To this end, the company is seeking to recruit a Junior PhD holder (first CDI mandatory) in Computer Science (M/F) with strong skills in Speech Processing and Artificial Intelligence.

The candidate will work on speech synthesis and voice-based human-machine interface (HMI) issues. As a first step, the company wants to be able to insert the child's name directly into the stories told by the object. This requires a text-to-speech synthesis tool of sufficient quality that i) the child's name is pronounced correctly, ii) in a voice very similar to the narrator's and iii) with a range of intonations matching the context of the story. The startup's managers have also identified the need to offer natural interactions with the object so that it can be used fully autonomously. This is leading the company to study the possibility of controlling it directly by voice.

We are looking for a very autonomous, versatile candidate, able to put forward proposals, be creative, and take initiative and responsibility. Finally, our client will pay close attention to the candidate's sociability: in a small team with a particularly friendly atmosphere, it is essential that the candidate be sociable, dynamic and pleasant, and enjoy teamwork and interacting with others.

Location: Paris
Expected salary: depending on profile
If you think you are this person, hold a doctorate and have never been hired on a permanent contract (CDI) before (a mandatory constraint to meet the CIR criteria), please send us your CV and cover letter by email to jesuisunjeunedocteur@rd2conseil.com, quoting reference LNI.


6-18(2018-05-29) PhD Project France-Australia

PhD Project – Call for Applications Situated Learning for Collaboration across Language Barriers

People working in development are often deployed to remote locations where they work alongside locals who speak an unwritten minority language. Outsiders and locals share knowhow and pick up phrases in each other’s languages. They are performing a type of situated learning of language and culture. This situation is found across the world, in developing countries, border zones, and in indigenous communities. This project will develop computational tools to help people work together across language barriers. The research will be evaluated in terms of the quality of the social interaction, the mutual acquisition of language and culture, the effectiveness of cross-lingual collaboration, and the quantity of translated speech data collected. The ultimate goal is to contribute to the grand task of documenting the world’s languages. The project will involve working between France and Australia, and will include fieldwork with a remote indigenous community. We’re looking for outstanding and highly motivated candidates to work on a PhD on this subject. Competencies in two or more of the following areas are mandatory:

• machine learning for natural language processing;

• speech processing for interactive systems;

• participatory design;

• mobile software development;

• documenting and describing unwritten languages.

The project will build on previous work in the following areas: mobile platforms for collecting spoken language data [6, 7]; respeaking as a technique for improving the value of recordings made ‘in the wild’ and an alternative to traditional transcription practices [12, 13]; machine learning of structure in phrase-aligned bilingual speech recordings [2, 3, 4, 8, 9, 10, 11]; participatory design of mobile technologies for working with minority languages [5]; managing multilingual databases of text, speech and images [1]. Some recent indicative PhD theses include: Computer Supported Collaborative Language Documentation (Florian Hanke, 2017); Automatic Understanding of Unwritten Languages (Oliver Adams, 2018); Collecter, Transcrire, Analyser : quand la Machine Assiste le Linguiste dans son Travail de Terrain (Elodie Gauthier, 2018); Enriching Endangered Language Resources using Translations (Antonios Anastasopoulos, in prep); Digital Tool Deployment for Language Documentation (Mat Bettinson, in prep); Bayesian and Neural Modeling for Multi Level and Crosslingual Alignment (Pierre Godard, in prep).
Details of the position. Funding includes remission of university fees, a stipend of approximately €17,500 per year, and a travel allowance. The position starts in Fall 2018 (i.e. from September) and lasts for three years. The research will be supervised by Steven Bird (Charles Darwin University, Australia) and Laurent Besacier (Univ. Grenoble Alpes, France). Acceptance will be subject to approval by both host institutions (Grenoble and Darwin). Given the cross-cultural nature of the project, the successful candidate will have demonstrated substantial experience of cross-cultural living.


Apply. To apply, please contact laurent.besacier@univ-grenoble-alpes.fr and steven.bird@cdu.edu.au, including a cover letter, curriculum vitae, academic transcripts and a reference letter from your MSc thesis advisor.


Institutions. The University of Grenoble offers an excellent research environment with ample compute hardware to solve hard speech and natural language processing problems, as well as remarkable surroundings to explore over the weekends. Charles Darwin University is a research-intensive university attracting students from over 50 countries. CDU is situated in Australia’s tropical north, in the midst of one of the world’s hot-spots for linguistic diversity and language endangerment. Darwin is a youthful, multicultural, cosmopolitan city in a territory that is steeped in Aboriginal tradition and culture and which enjoys a close interaction with the peoples of Southeast Asia.


References
[1] Steven Abney and Steven Bird. The Human Language Project: building a universal corpus of the world’s languages. In Proceedings of the 48th Meeting of the Association for Computational Linguistics, pages 88–97. ACL, 2010.

[2] Oliver Adams, Graham Neubig, Trevor Cohn, and Steven Bird. Learning a translation model from word lattices. In Interspeech 2016, pages 2518–22, 2016.

[3] Antonios Anastasopoulos, Sameer Bansal, David Chiang, Sharon Goldwater, and Adam Lopez. Spoken term discovery for language documentation using translations. In Proceedings of the Workshop on Speech-Centric NLP, pages 53–58, 2017.

[4] Antonios Anastasopoulos and David Chiang. A case study on using speech-to-translation alignments for language documentation. In Proc. Workshop on Use of Computational Methods in Study of Endangered Languages, pages 170–178, 2017.

[5] Steven Bird. Designing mobile applications for endangered languages. In Kenneth Rehg and Lyle Campbell, editors, Oxford Handbook of Endangered Languages. Oxford University Press, 2018.

[6] Steven Bird, Florian R. Hanke, Oliver Adams, and Haejoong Lee. Aikuma: A mobile app for collaborative language documentation. In Proceedings of the Workshop on the Use of Computational Methods in the Study of Endangered Languages. ACL, 2014.

[7] David Blachon, Elodie Gauthier, Laurent Besacier, Guy-Noël Kouarata, Martine Adda-Decker, and Annie Rialland. Parallel speech collection for under-resourced language studies using the Lig-Aikuma mobile device app. In Proceedings of the Fifth Workshop on Spoken Language Technologies for Under-resourced languages, volume 81, pages 61–66, 2016.

[8] V. H. Do, N. F. Chen, B. P. Lim, and M. A. Hasegawa-Johnson. Multitask learning for phone recognition of underresourced languages using mismatched transcription. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26:501–514, 2018.

[9] Ewan Dunbar, Xuan Nga Cao, Juan Benjumea, Julien Karadayi, Mathieu Bernard, Laurent Besacier, Xavier Anguera, and Emmanuel Dupoux. The zero resource speech challenge 2017. In Automatic Speech Recognition and Understanding (ASRU), 2017 IEEE Workshop on. IEEE.

[10] Long Duong, Antonios Anastasopoulos, David Chiang, Steven Bird, and Trevor Cohn. An attentional model for speech translation without transcription. In Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 949–959, 2016.

[11] Pierre Godard, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Laurent Besacier, Helene Bonneau-Maynard, Guy-Noël Kouarata, Kevin Löser, Annie Rialland, and François Yvon. Preliminary experiments on unsupervised word discovery in Mboshi. In Interspeech 2016, 2016.

[12] Mark Liberman, Jiahong Yuan, Andreas Stolcke, Wen Wang, and Vikramjit Mitra. Using multiple versions of speech input in phone recognition. In ICASSP, pages 7591–95. IEEE, 2013.

[13] Anthony C. Woodbury. Defining documentary linguistics. Language Documentation and Description, 1:35–51, 2003.


6-19(2018-06-02) Research engineer in computer science, statistics and scientific computing, IN2P3
The Laboratoire de Phonétique et Phonologie (http://lpp.in2p3.fr) is opening a permanent position for a research engineer in computer science, statistics and scientific computing.
This is a CNRS external competition (concours externe), details of which are available at this address:
  • Application period: 4 June to 3 July 2018
  • Competition no. 42
Feel free to email angelique.amelot@univ-paris3.fr if you would like more information about this position.

 

6-20(2018-06-04) Two funded PhD, University of Glasgow, Great Britain
The University of Glasgow welcomes applications for two funded PhDs in the area of Human-Robot Interaction.
 
Application deadline: 31 July (for both positions)
 
Position 1: Human-Robot Interaction for Oilfield Drilling Applications
Eligibility: UK/EU students only
Start date: 1 October 2018
 
This project will investigate how such human-robot collaborative tasks can be carried out, concentrating on the communication aspects: how the robot communicates its intentions to the human, and how the human can query and interact with the robot’s plan. The research will be driven by oilfield drilling applications, which involve control of complex equipment in a dynamic environment, with an increasing level of automation. Close coordination between the human crew and the automation system is often required, as is building trust between the human and the machine so that the crew understand why the machine acts the way it does and is confident it has taken all available information into account. The project is an EPSRC iCASE award with Schlumberger Gould Research and it is expected that the student will spend some time working with the company in Cambridge.
 
The student should have excellent experience, enthusiasm and skills in the areas of natural language or multimodal interaction and/or automated planning and reasoning. Applicants must hold a good Bachelor’s or Master’s degree in a relevant discipline.
 
 
Position 2: Natural Language Generation for Social Robotics
Eligibility: UK/EU students, or international students who can cover remaining fees from other sources
Start date: 1 January 2019 (or earlier)
 
In this PhD project, the student will investigate how advanced techniques drawn from natural language generation (NLG) can be combined with practical social robotics applications. The success of the integration will be evaluated through a combination of subjective user evaluations of the social robots as well as technical evaluations of the flexibility and robustness of the underlying systems. In addition to the scientific results of the PhD, an additional goal is to produce a reusable, open-source component for NLG in the context of social robotics, to allow other researchers in this area to benefit from the results of the research.
 
The PhD student should have excellent experience, enthusiasm and skills in the areas of natural language processing, computational linguistics, multimodal interaction, and/or human-robot interaction. Applicants must hold a good Bachelor’s or Master’s degree in a relevant discipline.
 
 
For more information about both of these positions, please contact Dr Mary Ellen Foster MaryEllen.Foster@glasgow.ac.uk
 

6-21(2018-06-05) Post doc position in Speech Processing, Aalto University, Finland

Aalto University (School of Electrical Engineering, Department of Signal Processing and Acoustics) invites applications for
 
Post doc position in Speech Processing
 
The Department of Signal Processing and Acoustics is part of the School of Electrical Engineering. The department consists of four main research areas. The speech communication technology research group (led by Prof. Paavo Alku) works on interdisciplinary topics aiming at describing, explaining and reproducing communication by speech. The main topics of our research are: voice source analysis and parameterization, statistical parametric speech synthesis, speech quality and intelligibility improvement, robust feature extraction in speech and speaker recognition, and occupational voice care.
We are currently looking for a postdoc to join our research team to work on the team’s research themes. We are particularly interested in candidates with research interest in paralinguistic speech processing, particularly speech-based biomarking of human health, voice conversion or speech synthesis. 
Postdoc: 3 years. Starting date: Autumn 2018 (flexible) 
In Helsinki you will join the innovative international computational data analysis and ICT community. Among European cities, Helsinki is special in being clean, safe, Scandinavian, and close to nature, in short, having a high standard of living. English is spoken everywhere. See, e.g., http://www.visitfinland.com/
Requirements: The position requires a doctoral degree in speech and language technology, computer science, signal processing or another relevant area, the skills to do excellent research in a group, and outstanding research experience in any of the research themes mentioned above. The candidate is expected to perform high-quality research and assist in supervising PhD students.
 
How to apply: If you are interested in this opportunity, apply by submitting the following documents in English and in electronic form (PDF format only) by July 31, 2018. Send your application, CV, a transcript of academic records and references directly by email to Professor Paavo Alku. Please use the subject line “Aalto post-doc recruitment, 2018”.
 
Additional information: Paavo Alku, paavo.alku@aalto.fi


6-22(2018-06-08) Analytic Linguistic Project Manager (French), Paris, France

Analytic Linguistic Project Manager (French)


  Job title:  Analytical Linguistic Project Manager    

Linguistic Field(s):  Morphology, Semantics, Syntax, Lexicography, NLP, Phonetics, Phonology

Location:  Paris, France

Hours: 9H – 17H 

Rem: 3790*12

Job description:  The role of the Analytic Linguistic Project Manager is to consult with Natural Language Understanding Researchers on creating guidelines and setting standards for a variety of NLP projects as well as to manage the work of a team of junior linguists to achieve high quality data output.    This includes: 

● Reviewing and annotating linguistic data 

● Developing phonetic/phonemic transcription rules 

● Analyzing acoustic data to evaluate speech synthesis 

● Deriving NLP data for new and on-going projects 

● Training, managing, and overseeing the work of a team of junior linguists 

● Creating guidelines for semantic, syntactic and morphological projects 

● Consulting with researchers and engineers on the development of linguistic databases 

● Identifying and assigning required tasks for a project 

● Tracking and reporting the team's progress 

● Monitoring and controlling quality of the data annotated by the team 

● Providing linguistic/operational guidance and support to the team   

Job requirements: 

● Native speaker of French and fluent in English 

● Master's degree or higher in Linguistics or Computational Linguistics with experience in semantics, syntax, morphology, lexicography, phonetics, or phonology 

● Ability to quickly grasp technical concepts; should have an interest in natural language processing 

● Excellent oral and written communication skills 

● Good organizational skills 

● Previous project management and people management experience 

● Knowledge of a programming language or previous experience working in a Linux environment    

CV + Motivation letter in English: Maroussia.houimli@adeccooutsourcing.fr


6-23(2018-06-11) Speech Processing Engineer position, Voxygen, Pleumeur-Bodou, France

Want to ride the wave of the Voice First Revolution? Join the Voxygen adventure!

Speech Processing Engineer position

VOXYGEN, a technology SME that publishes a speech synthesis system recognised for its expressiveness, is strengthening its team to meet the needs of a rapidly expanding market in customer relations, transport, robotics…

We are opening a Speech Processing Engineer position to strengthen the Operations / Customer Success team.

Main responsibilities:
- Definition and maintenance of voice creation tools, internal documentation
- Definition of the acoustic parameters of the synthesis system in a multilingual context
- Maintenance of existing voices
- Contribution to R&D work on controlling expressiveness in speech synthesis
- Qualifying business opportunities, liaising with the sales team, pre-sales support
- Management of customer voice creation projects
Profile sought:
- Speech processing engineer
- Proficiency in programming: Python, C, C++
- Knowledge of machine learning
- Knowledge of or interest in linguistics
- Good level of professional English, written and spoken
Qualities:
- Ability to adapt within a multidisciplinary team
- Autonomous, well organised in a multitasking role
- Dynamic
- Good interpersonal skills: you enjoy teamwork and customer relations.
Place of work: Côte de Granit Rose, an exceptional living environment for lovers of the sea and nature! Permanent contract (CDI) based in Pleumeur-Bodou (Lannion, 22); start ASAP (possibility of a position in Rennes after training in Pleumeur-Bodou)
Salary: depending on experience.
Please send your application (CV + motivation) to jobs@voxygen.fr


6-24(2018-06-17) Postdoc / engineer - Computer Science Research Laboratory (LaBRI) and SANPSY (Sleep - Addiction - Neuropsychiatry), Bordeaux

Position : Postdoc / engineer - 12 months, Bordeaux
       Starting date : 01/10/2018
      
------------

Profile : Speech processing, machine learning, artificial intelligence

------------

Location : primary : Computer Science Research Laboratory (LaBRI)
       secondary : SANPSY (Sleep - Addiction - Neuropsychiatry)

------------

Supervisors : Jean-Luc Rouas - LaBRI : jean-luc.rouas@labri.fr - main contact
          Jean-Philippe Domenger - LaBRI
          Pierre Philip - SANPSY

------------

Project : 'IS-OSA' (Innovative digital solution for personalised treatment of sleep apnea syndrome) funded by the Nouvelle Aquitaine Region

------------

Project summary :

Sleep deprivation has a strong impact on physical and mental health, leading to multiple consequences: increased heart failure rates, cognitive and behavioral troubles... In addition to clinical interviews, it is possible to measure the fatigue state using several cues: eye movement, EEG data, behavioral expression data (e.g. body movements). However, thanks to recent advances in speech processing, it is now feasible to characterise fatigue states using only speech-related cues. This technique has the main advantage that it does not require any specific or invasive apparatus and could thus be carried out in diverse environments, outside the clinical context.

The project aims at following patients suffering from sleep apnea syndrome by using the data collected during interviews with a virtual doctor. This data will complement other data collection sources such as measurements from CPAP devices.
The aim of this work is to focus on vocal cues characterising excessive daytime sleepiness in order to determine which vocal biomarkers of these troubles could be integrated into the clinical measurements carried out during interviews with the virtual doctors developed at SANPSY.

-------------

Work plan:

- Carry out Audio and Video recordings of patients on-site at the hospital in Bordeaux
- Define the vocal parameters that describe the troubles induced by excessive daytime sleepiness, in close collaboration with the SANPSY Lab.
- Study these parameters and use them in an automatic classification framework for excessive daytime sleepiness, using sleepiness measurements proposed by the clinical staff as ground truth.
- Implement the classification system in the virtual doctor framework developed at SANPSY and carry out clinical trials to validate the approach.

-------------

Required skills:

- Training (PhD or Master internship) in automatic speech processing; contributions in analysis/features for speech signals are expected
- Machine learning and Artificial Intelligence: good knowledge of standard techniques (such as GMM/HMM/LDA) and knowledge of or keen interest in deep learning methods
- Good programming skills in Python, C/C++
- Interest in clinical research / collaboration with clinical staff (flexible hours)
- Good command of professional English

-------------

Salary : according to diploma and experience (examples : Master+3y = €2137 gross/month, PhD+3y = €2511 gross/month)

-------------

How to apply : Send your CV + cover letter + referees' names + reports (internships, thesis, ...) or publications by email to jean-luc.rouas@labri.fr


6-25(2018-06-25) Associate Linguist [French], Paris France

Job Title:

Associate Linguist [French]

Linguistic Field(s):

Phonetics, Phonology, Morphology, Semantics, Syntax, Lexicography, NLP

Location:

Paris 8, France

Contract:

Short term contract 1 year -  renewable contract

Job description:

The role of the Associate Linguist is to annotate and review linguistic data in French.  The Associate Linguist will also contribute to a number of other tasks to improve natural language processing. The tasks include:

  • Providing phonetic/phonemic transcription of lexicon entries
  • Analyzing acoustic data to evaluate speech synthesis
  • Annotating and reviewing linguistic data
  • Labeling text for disambiguation, expansion, and text normalization
  • Annotating lexicon entries according to guidelines
  • Evaluating current system outputs
  • Deriving NLP data for new and on-going projects
  • Working independently with confidence and little oversight

Minimum Requirements:

  • Native speaker of French and fluent in English
  • Extensive knowledge of phonetic/phonemic transcriptions
  • Familiarity with TTS tools and techniques
  • Experience in annotation work
  • Knowledge of phonetics, phonology, semantics, syntax, morphology or lexicography
  • Excellent oral and written communication skills
  • Attention to detail and good organizational skills

Desired Skills:

  • Degree in Linguistics or Computational Linguistics or Speech processing
  • Ability to quickly grasp technical concepts; learn in-house tools
  • Keen interest in technology and computer-literate
  • Listening Skills
  • Fast and Accurate Keyboard Typing Skills
  • Familiarity with Transcription Software
  • Editing, Grammar Check and Proofing Skills
  • Research Skills

Salary : €2730

CV + motivation letter in English: maroussia.houimli@adeccooutsourcing.fr

 

Best regards,

Maroussia HOUIMLI

Recruitment Manager

Corporate Reception & Events and Marketing-Sales

  

T 06.24.61.08.43

E maroussia.houimli@adeccooutsourcing.fr

 


6-26(2018-07-08) 2-year post-doc in ASR for low-resource languages, Delft University, The Netherlands

Job opening: 2-year post-doc in ASR for low-resource languages

We are looking for a highly motivated post-doctoral researcher in the area of automatic
speech recognition (ASR) for low-resource languages, as part of the newly started
"Human-inspired automatic speech recognition" lab of Dr. Odette Scharenborg at the
Technical University of Delft, The Netherlands.

This project concerns building ASR systems for low-resource languages using linguistic
knowledge. The project aims to investigate different learning and training strategies
(e.g., semi- vs. unsupervised learning, multi-task learning) and architectures of deep
neural networks (DNNs) for the task of low-resource ASR. An important focus of the
project is on the role of linguistic information and multi-linguality in building ASR
systems for low-resource languages. A second important aspect of the project is opening
the DNN "black box" by investigating the speech representations in the hidden layers of
the DNNs using visualization techniques, and subsequently using this information to
improve the ASR systems.

We are looking for a highly motivated individual with a strong background in:
-        Deep neural networks
-        Automatic speech recognition

Who preferably has knowledge of one or more of the following topics:
-        Visualization of DNNs
-        Different DNN architectures and training techniques
-        Semi-/unsupervised learning
-        Speech acoustics

Who has/is:
-        A PhD in Computer Science, Electrical Engineering, Computational Linguistics,
Artificial Intelligence or a related field
-        A strong analytical mind
-        Excellent verbal and written communicative skills in English
-        At least 2 published papers in high-impact journals or conferences as
first author
-        A strong team-worker

We offer a 2-year post-doctoral position in the Multimedia Computing Group, Department of
Intelligent Systems, Faculty of Electrical Engineering, Mathematics, and Computer
Science, Delft University of Technology, the Netherlands.

For inquiries, please contact Dr. Odette Scharenborg (o.e.scharenborg@tudelft.nl).
Applications should be sent to o.e.scharenborg@tudelft.nl before August 13, 2018, and
should include:
-        CV
-        Motivation letter
-        List of publications
-        Names and addresses of three referees.

The estimated starting date is October 1, 2018 or as soon as possible after that.
Interviews will likely be held in the week of August 20-24, 2018.

Dr. Odette Scharenborg
Associate Professor and Delft Technology Fellow
Multimedia Computing Group, Faculty of Electrical Engineering, Mathematics, and Computer
Science
Delft University of Technology
The Netherlands


6-27(2018-07-09) Two academic positions at NTNU, Norwegian University of Science and Technology, Trondheim, Norway

Professor/Associate Professor in Statistical Machine Learning for Speech Technology at NTNU, Norwegian University of Science and Technology, Trondheim, Norway

Faculty position targeting machine learning for pattern recognition, with particular emphasis on applications in speech and language technology at the Signal Processing Group, NTNU. Application deadline is Aug. 31, 2018. See https://www.jobbnorge.no/en/available-jobs/job/154954/professor-associate-professor-in-statistical-machine-learning-for-speech-technology-ie-138-2018 for further information.

 

 

Professor/Associate Professor in Statistical Machine Learning for Signal Processing at NTNU, Norwegian University of Science and Technology, Trondheim, Norway

Faculty position targeting machine learning for analysis, classification, prediction and data mining of (large amounts of) sensor data, typically measurements in time and/or space at the Signal Processing Group, NTNU. Application deadline is Aug. 31, 2018. See https://www.jobbnorge.no/en/available-jobs/job/154952/professor-associate-professor-in-statistical-machine-learning-for-signal-processing-ie-137-2018  for further information.

 


6-28(2018-07-19) Fixed-term engineer position (CDD) at IRISA, Rennes, Brittany, France

The Expression team at IRISA is recruiting an engineer on a 24-month fixed-term contract (CDD) to deploy a speech synthesis system on mobile devices.

- Job description attached and here: https://www-expression.irisa.fr/files/2018/07/fiche_de_poste.pdf

- More details on the Expression team: http://www-expression.irisa.fr/fr/

- More details on IRISA: http://www.irisa.fr/

- Application deadline: 10 September 2018

 


6-29(2018-07-20) Post Doc au Laboratoire national de métrologie et d'essais, Trappes (78), France

POST DOC 18 mois – Transparence et explicabilité de la comparaison de voix dans le domaine criminalistique

Localisation : Laboratoire national de métrologie et d'essais, Trappes (78)

REF : ML/VOX/DE

CONTEXT:

The ANR project VoxCrim (2017-2021) aims to establish on a scientific basis the extent to which voice comparison can be carried out in forensics. It has two main objectives: a) to set up a methodology that ensures the effectiveness and competence of laboratories performing voice comparisons, and b) to establish standards for objective measurements.
The tools and methodologies used in voice comparison need to be evaluated, and they need to be used within an explainable and transparent framework. The work carried out in the project will thus make voice comparisons easier for police services to handle and will strengthen the admissibility of the evidence in court.
The Laboratoire national de métrologie et d'essais (LNE, the French national metrology and testing laboratory) brings to the project its expertise in metrology, standardisation, accreditation and inter-laboratory comparison, with the aim of building a practical methodological solution that makes the voice comparison process transparent and explainable.

MISSIONS:

The work is organised into three tasks:
-        Specification of a validation protocol for voice comparison methods, more specifically in forensics. Building on existing standards and reference methodologies, the post-doctoral researcher will identify the needs and possibilities for setting up a reference protocol.
-        The post-doctoral researcher will check that the identified protocol fits the voice comparison metrics identified by the researchers of the computer science and phonetics laboratories involved in the project. He or she will also make sure the protocol is compatible with the working methods of the scientific centres of the Police and the Gendarmerie, which are members of the project.
-        He or she will help organise an inter-laboratory comparison based on this protocol.

The post-doctoral researcher will be supported in this work by several LNE teams (evaluation of information processing systems, mathematics and statistics, and metrology) and will interact regularly with the other laboratories and scientific centres that are members of the project.
Publications (and presentations, where applicable) in international conferences and journals are expected from the post-doctoral researcher.

Bibliography: Bonastre, J. F., Kahn, J., Rossato, S., & Ajili, M. (2015). Forensic speaker recognition: Mirages and reality. In Speech Production and Perception: Speaker-Specific Behavior. hal-01473992.

DURATION:

18 months, starting in January 2019.

PROFILE:

You hold a PhD in computer science or in language sciences, with a specialisation in automatic speech processing.
You have knowledge of evaluation methodology and voice biometrics.

To apply, please send your CV and cover letter to recrut@lne.fr, quoting the reference: ML/VOX/DE

====================================================
Agnes Delaborde, PhD
Research engineer in AI and robotics evaluation
Direction des essais - DE536
agnes.delaborde@lne.fr
Tel.: +33 (0)1 30 69 11 50 - Mobile: +33 (0)6 26 72 69 80



Laboratoire national de métrologie et d'essais
29 avenue Roger Hennequin 78197 Trappes Cedex - lne.fr

Top

6-30(2018-08-03) Doctoral thesis at IJLRA (Sorbonne Université) Paris France

Doctoral thesis

Navigation aid for the visually impaired:
Virtual Reality acoustic simulations for interior navigation preparation

 

Laboratories                     IJLRA (Institut Jean le Rond d'Alembert, UMR 7190 CNRS - Sorbonne Université) and IRCAM (Institut de Recherche et Coordination Acoustique/Musique, UMR 9912 STMS IRCAM - CNRS - Sorbonne Université)

Doctoral school               École Doctorale Sciences Mécaniques, Acoustique, Électronique et Robotique (SMAER): ED 391

Discipline                           Acoustics (Virtual Reality, Audio, Interaction, Disability Assistance)

Co-supervision                 Brian KATZ (DR-CNRS, IJLRA) and Markus NOISTERNIG (CR, IRCAM)

Keywords                           Virtual reality, 3D audio, spatial sound, spatial cognition, room acoustics, visual impairments, navigation aid

 

Research context            This thesis project is placed in the context of the ANR 2018-2021 project RASPUTIN (Room Acoustic Simulations for Perceptually Realistic Uses in Real-Time Immersive and Navigation Experiences). In the domains of sound synthesis and virtual reality (VR), much effort has been placed on the quality and realism of sound source renderings, from text-to-speech to musical instruments to engine noise for use in driving and flight simulators. The same degree of effort cannot be seen with regard to the spatial aspects of sound synthesis and virtual reality, particularly with respect to the acoustics of the surrounding environment. Room acoustic simulation algorithms have for decades been improving in their ability to predict acoustic measurement metrics like reverberation time from geometrical acoustic models, at the cost of ever higher computational requirements. However, it is only recently that the perceptual quality of these simulations has been explored beyond their musical applications. In real-time systems, where sound source, listener, and room architecture can vary in unpredicted ways, investigation of the perceptual quality or realism has been hindered by necessary simplifications to algorithms. This project aims to improve real-time simulation quality towards perceptual realism.

The capability of a real-time acoustic simulation to provide meaningful information to a visually impaired user through a virtual reality exploration is the focus of the project. As a preparatory tool prior to visiting a public building or museum, the virtual exploration will improve the user's knowledge of the space and navigation confidence during their on-site visit, as compared to traditional methods such as tactile maps.

The thesis work entails participating in the creation and evaluation of a training system application for visually impaired individuals. Tasks involve the development of an experimental prototype in collaboration with project partners with a simplified user interface for the construction of virtual environments to explore. Working in conjunction with a selected user group panel who will remain engaged in the project for the duration, several test cases of interest will be identified for integration into the prototype and subsequent evaluations. The prototype will be developed by the thesis student in collaboration with Novelab (audio gaming) and IRCAM/STMS-CNRS (developers of the audio rendering engine). Design and evaluation will be carried out in collaboration with the Centre de Psychiatrie et Neurosciences and StreetLab/Institut de la Vision. The ability to communicate in French would be beneficial, but is not mandatory at the start of the project.

Evaluations will involve different experimental protocols in order to assess the accuracy of the mental representation of the learned environments. To assess how well metric relations are preserved, participants will carry out experimental spatial memory tests as well as on-site navigation tasks.

 

Candidate profile:           We are looking for dynamic, creative, and motivated candidates with scientific curiosity, strong problem-solving skills, the ability to work both independently and in a team environment, and the desire to push their knowledge limits and areas of confidence to new domains. The candidate should have a Master's degree in Computer Science, Acoustics, Architectural Acoustics, Multimodal Interfaces, or Audio Signal Processing. A strong interest in spatial audio, room acoustics, and working with the visually impaired is necessary. It is not expected that a candidate will already have all the skills necessary for this multidisciplinary subject, so a willingness and ability to rapidly step into new domains, including spatial cognition and psychoacoustics, will be appreciated.

 

Domain                              Virtual reality, Audio, Interaction, Disability Assistance

 

Dates                                  Preferred starting date from 1-Nov-2018 to 20-Dec-2018, and no later than March-2019.

 

Application                        Interested candidates should send a CV, a transcript of their Master's degree courses, a cover letter (limit 2 pages) detailing their motivations for pursuing a PhD in general and the project described above in particular, and contact information for 2 references whom the selection committee can contact. Incomplete applications will not be processed.

 

Application deadline       Complete application files should be submitted to brian.katz@sorbonne-universite.fr and markus.noisternig@ircam.fr before 1-Oct-2018.

Top

6-31(2018-08-13) Post-docs at Idiap, Martigny, Switzerland

Dear Colleagues,

We currently have openings for two or three post-doctoral researchers in speech and
language processing at Idiap Research Institute:

 http://www.idiap.ch/education-and-jobs/job-10251

All the positions involve the theory and application of deep learning.  Whilst a
significant research element is envisaged, there are also applications involving
collaborations with local enterprises.

Idiap is located in French-speaking Switzerland, although the lab hosts many
nationalities and functions in English.  All positions offer quite generous salaries.
More information is available on the institute's web site, http://www.idiap.ch/en

Several similar positions at PhD, post-doc and senior level are also available at the
institute in general.

 http://www.idiap.ch/en/join-us/job-opportunities

Sincerely,
--
Phil Garner
http://www.idiap.ch/~pgarner

Top

6-32(2018-08-17) 1 PhD position in affective computing at the Grenoble Alps University, France

1 PhD position in affective computing at the Grenoble Alps University
ATOS France and the Grenoble Informatics Laboratory (LIG) invite applications for a fully funded PhD position on 'Weakly-supervised learning of human affective behaviors from multimodal interactions with a chatbot'. The PhD will be co-supervised by Jean-Philippe Vigne (ATOS), Béatrice Bouchot (ATOS), Pr. Laurent Besacier (LIG) and Dr. Fabien Ringeval (LIG).


Thesis description
==================
The thesis targets three main objectives: 
1) the development of a weakly-supervised learning methodology for the semi-automatic annotation of affective information from speech and text produced by humans while interacting with a chatbot
2) the development of a module that performs a robust fusion of the inputs' representations (speech + text) in order to infer attributes of affect in varying noisy conditions
3) an evaluation of the system's robustness in different contexts of interaction with the chatbot.

Recent advances in deep learning have shown promising results in many applications of affective computing [Picard-95], where one of the most dominant tasks consists in quantifying attributes of human emotion, such as arousal, valence, or dominance [Russel-80], time-continuously from signals recorded by sensors [Wöllmer-08]. Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) [Gers-99] have successfully been employed to model long-range contextual dependencies between attributes of affect and speech data [Eyben-10, Eyben-12, Ringeval-15], and convolutional neural networks (CNNs) have shown promising results for learning useful information from the raw signals when combined with LSTM-RNN in the so-called "end-to-end" framework [Trigeorgis-16]. Recently, semi-supervised [Schmitt-16, Ghosh-16] and unsupervised [Cummins-18] methods of representation learning have shown the value of exploiting resources from other domains to deal with data scarcity. This issue is of paramount importance for methods based on deep learning, as they need as many examples as possible to generalise well on expressions of affect produced "in-the-wild" [Ringeval-18].

In this thesis, weakly-supervised methods based on deep learning will be exploited to perform semi-automatic annotation of human affective behaviour from speech and text, either typed on a keyboard or automatically retrieved from speech by an ASR system. Context-aware novelty detection [Marki-15] based on deep LSTM auto-encoders will be used to detect novel affective content, and semi-supervised learning methods [Zhang-18] will be employed to enrich the model while following a curriculum learning strategy [Lotfian-18]. The data used to build and evaluate the system will include the data collected during the project as well as existing publicly available emotion datasets, covering people of various cultures, languages, ages, and education levels, and featuring different environments and contexts of interaction. Data automatically retrieved from social platforms like YouTube channels will be considered for automatically enriching the model in a "virtuous circle" fashion.

The envisioned starting date is December 2018.


Requirements
============
We are looking for one candidate with a strong focus on deep learning for affective computing with the following profile:
+ Master's degree with background in Machine Learning, Speech Processing, Affective Computing
+ Excellent programming skills (Python, Java, C/C++), knowledge of Keras/TensorFlow/Torch would be ideal
+ Ability to work independently and be self-motivated         
+ Excellent communication skills in English           
 

Applying
========
To apply, please email your application to: fabien.ringeval@imag.fr, laurent.besacier@imag.fr, jean-philippe.vigne@atos.net and beatrice.bouchot@atos.net.
 
The application should consist of a single pdf file including:                  
+ a curriculum vitae showing academic records with tracks related to the themes of the thesis               
+ transcript of marks according to M1-M2 profile or last 3 years of engineering school      
+ statement letter expressing your interest in the position and your profile relevance             
+ contact and recommendation letter of at least one university referent                   
 
Incomplete applications will not be processed. Potential candidates will be invited for an interview with the supervisors.
 

Conditions of employment     
========================
You will be hired on a fixed-term contract (3-year CIFRE contract) at ATOS, a global leader in digital transformation.
 

Working at Grenoble (ATOS/LIG)       
==================================
You will be integrated in two teams with academic and industrial profiles: the GETALP team of the LIG, recognised for its research activities in the fields of speech and language processing, and the Cognitive Intelligence team at ATOS, which specialises in Artificial Intelligence (AI) for the development of chatbots.

ATOS is a leader in digital services with pro forma annual revenue of circa €12 billion and circa 100,000 employees in 73 countries, serving a global client base. The ATOS R&D team has a very active innovation spirit backed by a culture of Intellectual Property. Together these have led to numerous disruptive developments, including more than 1,500 patents. ATOS Grenoble (1000 collaborators) is focusing on AI, working with a variety of clients to implement solutions where they create value. ATOS leadership in Cloud technology, Cybersecurity and High-performance computing, along with our partnerships with major AI companies (e.g., Google), help us provide clients with the resources, expertise and support they need.

The LIG is one of the largest computer science laboratories in France. It is structured as a Joint Research Center (Unité Mixte de Recherche) founded by the CNRS, the Grenoble Institute of Technology (Grenoble INP), INRIA Grenoble Rhône-Alpes, and the Grenoble Alps University (UGA), which was ranked France's number one university in eleven disciplines, including Computer Science & Engineering, in the 2017 Shanghai Academic Subject Rankings of World Universities. The LIG hosts 17 research teams and three teams providing administrative and technical support, for a total of 450 collaborators, including 205 permanent researchers, 143 PhD students, and 35 support staff (2016 figures).

The city of Grenoble is located on a plateau at the foot of the French Alps and is advertised as the "Capital of the Alps" due to its immediate proximity to the mountains. The IMAG building hosting the LIG is located on a landscaped campus of 175 hectares, which straddles Saint-Martin-d'Hères and Gières and welcomes around 40,000 students and researchers working in various research institutions. Thanks to this campus, UGA was ranked the eighth most beautiful university in Europe by Times Higher Education magazine in 2018. Overall, Grenoble is the largest research center in France after Paris, with 22,800 researchers.
  

References
==========

[Cummins-18]         Nicholas Cummins, Shahin Amiriparian, Gerhard Hagerer, Anton Batliner, Stefan Steidl, and Björn Schuller, An image-based deep spectrum feature representation for the recognition of emotional speech, in Proceedings of ACM MM 2017, pp. 478-484, October 2017, ACM.
[Eyben-10]              Florian Eyben, Martin Wöllmer, Alex Graves, Björn Schuller, Ellen Douglas-Cowie and Roddy Cowie, On-line emotion recognition in a 3-D activation-valence-time continuum using acoustic and linguistic cues, Journal on Multimodal User Interfaces 3(1-2):7-19, March 2010, Springer Nature.
[Eyben-12]              Florian Eyben, Martin Wöllmer, and Björn Schuller, A multi-task approach to continuous five-dimensional affect sensing in natural speech, ACM Transactions on Interactive Intelligent Systems (TiiS) - Special Issue on Affective Interaction in Natural Environments 2(1):6, March 2012, ACM.
[Gers-99]                  Felix A. Gers, Jürgen Schmidhuber, and Fred Cummins, Learning to forget: Continual prediction with LSTM, in Proceedings of ICANN 1999, pp. 850-855, ENNS.
[Ghosh-16]              Sayan Ghosh, Eugene Laksana, Louis-Philippe Morency and Stefan Scherer, Representation learning for speech emotion recognition, in Proceedings of Interspeech 2016, pp. 3603-3607, September 2016, ISCA.
[Lotfian-18]             Reza Lotfian and Carlos Busso, Curriculum learning for speech emotion recognition from crowdsourced labels, arXiv:1805.10339, May 2018.
[Marki-15]                Erik Sayan Ghosh, Eugene Laksana, Louis-Philippe Morency and Stefan Scherer, Representation learning for speech emotion recognition, in Proceedings of Interspeech 2016, pp. 3603-3607, September 2016, ISCA.
[Picard-95]               Rosalind W. Picard, Affective Computing, MIT Press.
[Ringeval-15]          Fabien Ringeval, Florian Eyben, Eleni Kroupi, Anil Yuce, Jean-Philippe Thiran, Touradj Ebrahimi, Denis Lalanne, and Björn Schuller, Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data, Pattern Recognition Letters, 66:25-30, November 2015, Elsevier.
[Ringeval-18]          Fabien Ringeval, Björn Schuller, Michel Valstar, Roddy Cowie, Heysem Kaya, Maximilian Schmitt, Shahin Amiriparian, Nicholas Cummins, Denis Lalanne, Adrien Michaud, Elvan Ciftçi, Hüseyin Güleç, Albert Ali Salah, and Maja Pantic, AVEC 2018 Workshop and challenge: Bipolar disorder and cross-cultural affect recognition, in Proceedings of AVEC'18, ACM MM, October 2018, ACM.
[Russel-80]               James A. Russell, A circumplex model of affect, Journal of Personality and Social Psychology, 39(6):1161-1178, December 1980, APA.
[Schmitt-16]             Maximilian Schmitt, Fabien Ringeval, and Björn Schuller, At the border of acoustics and linguistics: Bag-of-Audio-Words for the recognition of emotions in speech, in Proceedings of Interspeech 2016, pp. 495-499, San Francisco (CA), USA, September 2016, ISCA.
[Trigeorgis-16]        George Trigeorgis, Fabien Ringeval, Raymond Brueckner, Erik Marchi, Mihalis Nicolaou, Björn Schuller and Stefanos Zafeiriou, Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network, in Proceedings of ICASSP 2016, pp. 5200-5204, Shanghai, China, April 2016, IEEE.
[Wöllmer-08]           Martin Wöllmer, Florian Eyben, Stephan Reiter, Björn Schuller, Cate Cox, Ellen Douglas-Cowie, and Roddy Cowie, Abandoning emotion classes - towards continuous emotion recognition with modelling of long-range dependencies, in Proceedings of Interspeech 2008, Brisbane, Australia, pp. 597-600, ISCA.
[Zhang-18]              Zixing Zhang, Jin Han, Jun Deng, Xinzhou Xu, Fabien Ringeval, and Björn Schuller, Leveraging unlabelled data for emotion recognition with enhanced collaborative semi-supervised learning, IEEE Access, 6, April 2018, IEEE.

 
Top

6-33(2018-08-20) PhD Position to work with laryngeal high-speed videos of pathological speakers at the MUV, Vienna, Austria.

Subject: PhD Position to work with laryngeal high-speed videos of pathological speakers at the MUV, Vienna, Austria.

Job description:

 

The Medical University of Vienna (MUV), Austria, seeks to fill a PhD student position within the project "Objective differentiation of dysphonic voice quality types". The candidate must hold a master's degree, preferably in (one of) the fields of sound engineering, acoustical engineering, audio signal processing, or similar. The work will be conducted at the Division of Phoniatrics-Logopedics within the Department of Otorhinolaryngology of the MUV.

The workgroup hosting the project is interested in the assessment of voice parameters relevant to the medical diagnosis and clinical care of voice disorders. A focus is given to functional assessment of voice, especially to the objective description of voice quality. The levels of description include kinematics of voice production, voice acoustics, and auditory perception of voice. Clinical studies are conducted with a laryngeal high-speed camera that records vocal fold vibration at 4000 frames per second. Microphone signals of the voice are recorded in parallel. Vibratory patterns of the vocal folds are analysed visually and computationally via modelling. Trajectories of vocal fold edges, spatial arrangements thereof, and glottal area waveforms are analysed. Regarding acoustics, analysis of audio recordings involves the implementation, testing, and training of specialized synthesizers for pathological voices. On the level of auditory perception, listening experiments are conducted, especially experiments involving discrimination tasks.

Mandatory skills of the candidate are MATLAB programming, speech signal processing, psychoacoustics, good knowledge of English, good communication skills, and excellent analytical thinking. Optional skills of the candidate are knowledge of German, experience in a health care profession, image and video processing, Python, PureData, object-oriented programming, software engineering, version control (Subversion, Git, or similar), SQL, and XML.

The project duration is 4-5 years. The Austrian Science Fund (FWF) budgets a gross salary of 2,112.40 euros per month for doctoral candidates. Application documents can be submitted to philipp.aichinger@meduniwien.ac.at by October 31st, 2018. Interviews are planned for November 2018. The project is planned to start in December 2018.

Information regarding the beautiful city of Vienna can be found at https://www.meduniwien.ac.at/web/en/international-affairs/living-in-vienna/.

Top

6-34(2018-08-27) Post-doc (18 months): Transparency and explainability of voice comparison in forensics, Laboratoire national de métrologie et d'essais, Trappes, France


Top

6-35(2018-08-31) Post Doctoral Position (12 months), Natural Language Processing, INRIA-Loria, Nancy, France

Post Doctoral Position (12 months), Natural Language Processing: "Online hate speech against migrants"

Keywords: hate speech, migrants, social media, natural language processing.

Supervisors: Irina Illina and Dominique Fohr. The applicant will also collaborate with the CREM Laboratory.

Start: end of 2018 - beginning of 2019

Location: INRIA-Loria, Nancy, France

Duration: 1 year

To apply:  send the following documents to illina@loria.fr and  dominique.fohr@loria.fr  as soon as possible and no later than September 25th, 2018:

  • CV

  • motivation letter

  • PhD thesis if already completed, or a description of the work in progress otherwise

  • a copy of your publications

    • a recommendation letter from the supervisor of your PhD thesis, and up to two other recommendation letters.

    The ideal applicant should have:

    • A PhD in NLP

    • A solid background in statistical machine learning.

    • Strong publications.

    • Solid programming skills to conduct experiments.

    • Good level in English.

       

       

      Context:

      According to the 2017 International Migration Report, the number of international migrants worldwide has continued to grow rapidly in recent years, reaching 258 million in 2017, up from 220 million in 2010 and 173 million in 2000. In 2017, 64 per cent of all international migrants worldwide, equal to 165 million international migrants, lived in high-income countries; 78 million of them were residing in Europe. A key reason for the difficulty of EU leaders to take a decisive and coherent approach to the refugee crisis has been the high levels of public anxiety about immigration and asylum across Europe. Indeed, across the EU, attitudes towards asylum and immigration have hardened in recent years because of: (i) the increase in the number and visibility of migrants in recent years, (ii) the economic crisis and austerity policies enacted since the 2008 Global Financial Crisis, (iii) the role of the mass media in influencing public and elite political attitudes towards asylum and migration. Refugees and migrants tend to be framed negatively as a problem, potentially nourishing.

      The BRICkS (Building Respect on the Internet by Combating Hate Speech) EU project has revealed a significant increase in the use of hate speech towards immigrants and minorities, who are often blamed for current economic and social problems. The participatory web and social media seem to accelerate this tendency, accentuated by the rapid online spread of fake news, which often fuels online violence towards migrants.

      More and more audio, video, and text content appears on the Internet each day. About 300 hours of multimedia are uploaded per minute. For these multimedia sources, manual content retrieval is difficult or impossible. The classical approach to spoken content retrieval from multimedia documents is automatic text retrieval. Automatic text classification is one of the most widely used technologies for this purpose. In text classification, text documents are usually represented in a so-called vector space and then assigned to predefined classes through supervised machine learning. Each document is represented as a numerical vector, which is computed from the words of the document. How to represent the terms numerically in an appropriate way is a basic problem in text classification tasks and directly affects the classification accuracy. We will use these methodologies to perform one of the important tasks of text classification: automatic hate speech detection.

      Our methodology for hate speech classification will build on recent approaches to text classification with neural networks and word embeddings. In this context, fully connected feed-forward networks (Iyyer et al., 2015; Nam et al., 2014), Convolutional Neural Networks (CNN) (Kim, 2014; Johnson and Zhang, 2015) and also Recurrent/Recursive Neural Networks (RNN) (Dong et al., 2014) have been applied.
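The vector-space idea described above can be illustrated with a minimal sketch. It is not the method the project will use: it assumes plain term-frequency vectors and a nearest-centroid classifier with cosine similarity, and the toy training sentences, the labels, and all function names are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def vectorize(text, vocabulary):
    """Map a document to a term-frequency vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def train_centroids(labelled_docs, vocabulary):
    """Supervised step: average the vectors of each class into a centroid."""
    sums, counts = {}, Counter()
    for text, label in labelled_docs:
        acc = sums.setdefault(label, [0.0] * len(vocabulary))
        for i, x in enumerate(vectorize(text, vocabulary)):
            acc[i] += x
        counts[label] += 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


def classify(text, centroids, vocabulary):
    """Assign the class whose centroid is most similar to the document."""
    vec = vectorize(text, vocabulary)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy corpus with two neutral classes, for illustration only.
TRAIN = [
    ("the weather is sunny today", "weather"),
    ("rain and wind are expected tomorrow", "weather"),
    ("the team won the football match", "sport"),
    ("a great goal in the final match", "sport"),
]
VOCABULARY = sorted({word for text, _ in TRAIN for word in text.lower().split()})
CENTROIDS = train_centroids(TRAIN, VOCABULARY)

if __name__ == "__main__":
    print(classify("sunny weather tomorrow", CENTROIDS, VOCABULARY))
```

In practice the raw counts would typically be replaced by weighted or learned representations such as word embeddings, which is the representation question the paragraph above raises.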


      Objectives:

      Within this context, the post-doc position aims to analyze hate speech towards migrants in social media, and more particularly on Twitter. It also aims at proposing concepts and software components (Hate Speech Domain Specific Analysis and related software tools in connection with migrants in social media) to bridge the gap between conceptual requirements and multi-source information from social media. Automatic hate speech detection software will be tested on modelling various hate speech phenomena and assessing their domain relevance.

      The language of the analysed messages will be primarily French, although links with other languages (including messages written in English) may appear throughout the analysis.

      • References

        Dai, A. M. and Le, Q. V. (2015). "Semi-supervised sequence learning". In Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R., editors, Advances in Neural Information Processing Systems 28, pages 3061-3069. Curran Associates, Inc.

        Delgado, R. and Stefancic, J. (2014). "Hate speech in cyberspace". Wake Forest Law Review, 49.

        Dong, L., Wei, F., Tan, C., Tang, D., Zhou, M., and Xu, K. (2014). "Adaptive recursive neural network for target-dependent Twitter sentiment classification". In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL, Baltimore, MD, USA, Volume 2, pages 49-54.

        Johnson, R. and Zhang, T. (2015). "Effective use of word order for text categorization with convolutional neural networks". In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 103-112.

        Iyyer, M., Manjunatha, V., Boyd-Graber, J., and Daumé, H. (2015). "Deep unordered composition rivals syntactic methods for text classification". In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 1681-1691.

        Kim, Y. (2014). "Convolutional neural networks for sentence classification". In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1746-1751.

        King, R. D. and Sutton, G. M. (2013). "High times for hate crimes: Explaining the temporal clustering of hate-motivated offending". Criminology, 51(4), 871-894.

        Mikolov, T., Yih, W.-t., and Zweig, G. (2013a). "Linguistic regularities in continuous space word representations". In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 746-751.

        Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013b). "Distributed representations of words and phrases and their compositionality". In Advances in Neural Information Processing Systems 26, pages 3111-3119. Curran Associates, Inc.

        Nam, J., Kim, J., Loza Mencía, E., Gurevych, I., and Fürnkranz, J. (2014). "Large-scale multi-label text classification - revisiting neural networks". In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD-14), Part 2, volume 8725, pages 437-452.

        Schieb, C. and Preuss, M. (2016). "Governing hate speech by means of counter speech on Facebook". 66th ICA Annual Conference, Fukuoka, Japan.

Top

6-36(2018-09-02) Fixed-term contract (CDD), IRISA, Rennes, France

The Expression team at IRISA is opening a 24-month fixed-term contract (CDD) on the mobile deployment of a speech synthesis system.

Keywords: speech synthesis, artificial intelligence, machine learning, conversational agents.

Details:

- Job offer: attached, and here: https://www-expression.irisa.fr/files/2018/09/fiche_de_poste.pdf

- Expression team: http://www-expression.irisa.fr/fr/

- IRISA: http://www.irisa.fr/

- Master's-level degree (Bac+5), or two-year post-secondary degree (Bac+2) with Android/iOS skills

- Application deadline: October 5, 2018

Top

6-37(2018-09-03) PhD position and Post-doc position: Privacy-respecting dialog systems, Saarland University, Germany

PhD position and Post-doc position: Privacy-respecting dialog systems
=============================================
(Computational Linguistics, Computer Science or similar)

Conversational interfaces based on deep learning are becoming more and
more ubiquitous. However, the massive amounts of stored speech and
text data needed for training state-of-the-art models raise
serious privacy concerns for users. Each spoken message may
potentially reveal information about the user's personality, may
contain critical information (credit card numbers, passwords, etc.),
and may convey sensitive information (ethnicity, age, health status,
etc.). Voice recordings may even be used maliciously to build
synthesized voices that impersonate users.

The Spoken Language Systems group at Saarland University is seeking
new ways to provide dialog technology that is 'private by design',
through means such as privacy-preserving machine learning. To this end,
we are anticipating the availability of a PhD position and a Post-Doc
position starting at the beginning of 2019.

Ideal candidates for either position would have:

  1. A good understanding of not just NLP, but of dialog phenomena in
     particular. Here, an understanding of how privacy-relevant
     information may arise as a result of dialog behavior (rather than
     as part of a single utterance) is desirable.

  2. Excellent knowledge of machine learning, experience with
     weakly-supervised methods a plus.

  3. Knowledge of and experience with scientific evaluation
     methodologies.

  4. Excellent programming skills, experience with RESTful APIs a plus.

  5. Experience with architecting large, heterogeneous, modular and
     distributed systems.

Salaries: The PhD position will be 75% of full time on the German E13
scale (TV-L). The Post-Doc position will be 100% of the full time on
the German E13 scale (TV-L). The appointments will be for three years
with possible extensions subject to follow-up funding.

About the department: The department of Language Science and
Technology is one of the leading departments in the speech and
language area in Europe. The flagship project at the moment is the CRC
on Information Density and Linguistic Encoding. Furthermore, the
department is involved in the cluster of excellence Multimodal
Computing and Interaction. It also runs a significant number of
European and nationally funded projects. In total it has seven faculty
and around 50 postdoctoral researchers and PhD students.

How to apply:

Please send us:
* a letter of motivation,
* your CV,
* your transcripts,
* a list of publications,
* and the names and contact information of at least two references,

as a single PDF or a link to a PDF if the file size is more than 3 MB.

Please apply by October 10th, 2018.

Contact: Applications and any further inquiries regarding the project
should be directed to:

* Thomas.Kleinbauer@lsv.uni-saarland.de
* Dietrich.Klakow@lsv.uni-saarland.de

Top

6-38(2018-09-08) Adjunct instructor (enseignant vacataire), Université de Franche-Comté, Besançon, France
The Department of Language Sciences & French as a Foreign Language (Sciences du Langage & FLE) at Université de Franche-Comté (Besançon site, SLHS) is looking for an adjunct instructor to teach the tutorial sessions (TD) of "Phonetics and multimodality of speech", in the second year of the bachelor's degree in Language Sciences, in semester 1: 12 hours per tutorial group (2, possibly 3, groups), in the Friday 8-9 am, 10-11 am and/or 1-2 pm slots.

Information:
- Description: This course addresses the emergence of speech from the ontogenetic and phylogenetic points of view. It explores the anatomical and physiological changes that make speech possible. It also covers the different modalities (vocal, verbal, gestural) that humans use to communicate in everyday life.
- Learning objectives:
  - Understand the importance of speech in human communication, and the other modalities that complement it
  - Become familiar with the questions the human sciences raise about the origin of language
  - Gain an introduction, from a phonetic point of view, to the unity and diversity of languages
  - Become familiar with the mechanisms of phonation and deepen articulatory knowledge of the French phonetic system
  - Strengthen phonetic transcription skills and gain an introduction to the acoustic analysis of speech

The candidate must:
- either be a student under twenty-eight years of age on September 1 of the academic year and enrolled for a postgraduate (third-cycle) degree, or
- provide evidence of a main professional activity of 900 hours over a 12-month period (decree of September 16, 2004).

Contact: Sophie Mariani-Rousset (smariani@univ-fcomte.fr)
Top

6-39(2018-09-13) Senior Research and Development Engineer (m/f), ELDA, Paris

The European Language resources Distribution Agency (ELDA), a company specialized in Human Language Technologies within an international context, acting as the distribution agency of the European Language Resources Association (ELRA), is currently seeking to fill an immediate vacancy for a Senior Research and Development Engineer position.

Senior Research and Development Engineer (m/f)

Under the supervision of the CEO, the responsibilities of the Senior R&D Engineer include designing, developing, documenting, deploying and maintaining tools, software components or web applications for language resource production and management, as well as carrying out quality control and assessment of language resources.
He/she will be in charge of managing the current language resource production workflows and coordinating ELDA's participation in R&D projects, while also being hands-on whenever required by the language resource production and management team. He/she will liaise with external partners at all phases of the projects (submission to calls for proposals, building and management of project teams) within the framework of international, publicly or privately funded research and development projects.

This offers excellent opportunities for creative and motivated candidates wishing to participate actively in the Language Engineering field.

Profile:
•    PhD in Computer Science, Electrical Engineering, Natural Language Processing, or equivalent
•    Experience in Natural Language Processing (speech processing, data mining, machine translation, etc.)
•    Experience in managing a multi-disciplinary team
•    Proficiency in classic shell scripting in a Linux environment (POSIX tools, Bash, awk)
•    Proficiency in Python
•    Hands-on experience in Django
•    Good knowledge of JavaScript and CSS
•    Knowledge of a distributed version control system (Git, Mercurial)
•    Knowledge of SQL and of RDBMS (PostgreSQL preferred)
•    Knowledge of XML and of standard APIs (DOM, SAX)
•    Good knowledge of basic Computer Science algorithms
•    Familiarity with open source and free software
•    Knowledge of a statically typed functional programming language (OCaml preferred) is a strong plus
•    Proficiency in French and English, with strong writing and documentation skills in both languages
•    Dynamic and communicative, flexible to work on different tasks in parallel
•    Ability to work independently and as part of a multidisciplinary team
•    Citizenship (or residency papers) of a European Union country

Salary: Commensurate with qualifications and experience.

Applicants should email a cover letter addressing the points listed above together with a curriculum vitae to:

ELDA
9, rue des Cordelières
75013 Paris
FRANCE
Fax : 01 43 13 33 30
Mail : job@elda.org

ELDA is acting as the distribution agency of the European Language Resources Association (ELRA). ELRA was established in February 1995, with the support of the European Commission, to promote the development and exploitation of Language Resources (LRs). Language Resources include all data necessary for language engineering, such as monolingual and multilingual lexica, text corpora, speech databases and terminology. The role of this non-profit membership Association is to promote the production of LRs, to collect and to validate them and, foremost, make them available to users. The association also gathers information on market needs and trends.

For further information about ELDA/ELRA, visit:
http://www.elda.org

Top

6-40(2018-09-22) Doctoral student (Speech Technology, Cognitive Science), Tampere University of Technology, Finland

Doctoral student (Speech Technology, Cognitive Science),  Tampere University of Technology, Finland


We are inviting applications for the position of Doctoral Student in the areas of speech technology and cognitive science at Tampere University of Technology (TUT), Laboratory of Signal Processing. The successful candidate will become a member of a newly formed research group named Speech and Cognition, led by Assistant Professor Okko Räsänen. In addition to research work, the candidate will commit to the pursuit of a doctoral degree in science (technology) at TUT.

The job will consist of the following duties:
• Research work on a mutually agreed doctoral research topic
• Completion of mandatory studies for a D.Sc. (tech.) degree
• Participation in the Doctoral Program of Computing and Electrical Engineering at TUT
• Assisting in teaching and in other activities of the research group
 
The broad scope of the position is related to the study of language acquisition and processing by humans and artificial computational systems. Potential topics include:
1. development of computational models of unsupervised and multimodal language learning and speech perception
2. development of algorithms and tools for analyzing acoustic and linguistic patterns in large-scale naturalistic audio recordings.
 
More precise goals of the thesis project will be planned together with the candidate. The work in the position will be closely integrated with several ongoing Academy of Finland research projects and their international collaboration networks.

The current contract will be made for a fixed-term period until 31.8.2021 with a view to extension (with an initial probationary period of 6 months). Target completion time for a doctoral degree is 4 years. The commencement date will be as soon as possible, as mutually agreed.

The salary will be based on both the job demands and the employee's personal performance level in accordance with the University Salary System. According to the criteria applied to teaching and research staff, the position of a Doctoral Student is placed on job demands levels 2–4. A typical starting salary for a Doctoral Student at the beginning of their studies is 2330–2450 EUR.

Exceptional Master’s students of TUT who are close to graduation can also be considered for the position. In this case, the candidate is first employed as a Research Assistant to carry out a master’s thesis project (6 months) on the topic and, upon a successful thesis project, has the possibility to continue to doctoral studies. Salary during the master’s thesis project will correspond to job demands level 1.

Requirements: The successful candidate must hold a master’s degree, or be close to graduation, in a discipline related to the job, for example Computer Science, Signal Processing, Mathematics, Artificial Intelligence or Machine Learning. Candidates from Linguistics, Psychology, Neurosciences, or another field related to language or developmental research will also be considered, provided that the candidate has a sufficiently strong technical background.

Basic programming skills and experience with MATLAB, Python, or comparable programming languages are required. Good written and spoken English skills, capability for team work, and an open mindset towards cross-disciplinary research are also essential. Skills in statistical analysis and previous research experience are counted as an advantage.

The successful candidate must either already be a PhD student at Tampere University of Technology or apply for post-graduate studies at the university. More information on the admission process and requirements: http://www.tut.fi/en/admissions/doctoral-studies-p...

For more information, please contact: Assistant Professor Okko Räsänen, email: okko.rasanen@tut.fi

How to apply: Applications must be submitted via the TUT online application form at https://tut.rekrytointi.com/paikat/?o=A_A&jid=28 .
Closing date for applications is 30 September 2018 (24.00 EEST / 21.00 UTC). The most promising candidates will be interviewed in person or in a teleconference. The interviews will take place during the last week of the application period and the first week of October, and therefore it is advisable to submit an application as soon as possible.

The following documents should be attached to the application in .pdf format:
• motivation letter
• CV, including contact details of possible referees
• copy of Master’s degree diploma (if applicable) and a transcript of completed studies with course grades
• copies of certificates related to the applicant's language proficiency
About the Speech and Cognition Group: The research of the Speech and Cognition group covers speech communication from both language technology and cognitive science points of view. Central research questions are related to how humans learn to understand and produce speech in interaction with their environment, how speech perception and production operate and how they are related to other cognitive capabilities such as memory and learning, and how similar language and cognitive skills could be implemented in man-made computational systems. The primary research method of the group is computational modeling of these phenomena, applying machine learning and signal processing to spoken language and other sensory data available to language-learning children. The group also works on various technological applications related to, e.g., spoken language health technology and automatic analysis of large-scale speech recordings. Research of the group is conducted in collaboration with researchers across areas such as speech technology, linguistics, psychology, acoustics, neuroscience, and clinical medicine.

About the research environment: Finland is among the most stable, free and safe countries in the world, based on prominent ratings by various agencies. It is also ranked as one of the top countries as far as social progress is concerned. Tampere is counted among the major academic hubs in the Nordic countries and offers a dynamic living environment. Tampere region is one of the most rapidly growing urban areas in Finland and home to a vibrant knowledge-intensive entrepreneurial community. The city is an industrial powerhouse that enjoys a rich cultural scene and a reputation as a centre of Finland’s information society.

Read more about Finland and Tampere:
• https://www.visitfinland.com/about-finland/
• https://finland.fi/
• https://tem.fi/documents/1410877/2888440/SIS_MIN_E...
• https://visittampere.fi/en/

 The new Tampere University and higher education community begin their operations on 1 January 2019. Tampere University of Technology, the University of Tampere and Tampere University of Applied Sciences are building a unique environment for multidisciplinary, inspirational and high-impact research and education and a hub of expertise in technology, health and society. Read more: https://www.tampere3.fi/en
 

Top

6-41(2018-09-17) Associate Linguist [French]

Job title:

Associate Linguist [French]

Linguistic fields:

Phonetics, Phonology, Morphology, Semantics, Syntax, Lexicography, NLP

Location:

Paris, France

Job description:

As an Associate Linguist, you will annotate and review linguistic data in French. The Associate Linguist will also contribute to a number of natural language processing tasks, including:

  • Phonetic/phonemic transcription of lexical entries
  • Analysis of acoustic data to evaluate speech synthesis
  • Annotation and review of linguistic data
  • Text labeling, disambiguation, expansion, and normalization of data
  • Annotation of lexical entries in compliance with the reference codes
  • Evaluation of system outputs
  • Derivation of NLP data
  • Ability to work independently and accurately

Required skills:

  • Native speaker of French, with full command of English
  • Knowledge of phonetic and phonological transcription
  • Familiarity with speech synthesis and speech recognition techniques and tools
  • Annotation experience
  • Knowledge of phonetics, phonology, semantics, syntax, morphology, and lexicography
  • Excellent oral and written communication skills
  • Attention to detail and organizational skills

Desired skills:

  • Degree in theoretical and computational linguistics and NLP
  • Ability to quickly grasp technical concepts and in-house tools
  • Keen interest in technology, and computer skills
  • Listening skills for spoken data
  • Fast and accurate typing skills
  • Familiarity with transcription software
  • Editing, grammar-checking, and proofreading skills
  • Research skills

 

CV + cover letter in English: maroussia.houimli@adeccooutsourcing.fr

EUR 2,730 gross per month + 50% of the Navigo pass + complementary health insurance

Maroussia HOUIMLI
Recruitment Manager
Corporate Reception & Events and Marketing-Sales
T 06.24.61.08.43
E maroussia.houimli@adeccooutsourcing.fr
21 Boulevard Voltaire, 75011 - Paris
www.adecco.fr/outsourcing/

Top

6-42(2018-09-18) 3-year Postdoctoral Researcher in Multilingual Speech Processing, IRISA, Rennes, France

3-year Postdoctoral Researcher in Multilingual Speech Processing


CONTEXT

The Expression research team focuses on expressiveness in human-centered data. In this context, the team is very active in the field of speech processing, especially text-to-speech (TTS). This activity is reflected in regular publications in top international conferences and journals, with contributions on topics such as machine learning (including deep learning), natural language processing, and speech processing. Team Expression takes part in multiple collaborative projects. The current position is part of one of these, a large European H2020 project focusing on the social integration of migrants in Europe.

Team’s website: https://www-expression.irisa.fr/

PROFILE

Main tasks:

1. Design multilingual TTS models (acoustic models, grapheme-to-phoneme, prosody, text normalization…)

2. Take part in porting the team’s TTS system to embedded environments

3. Develop spoken language skill assessment methods

Secondary tasks:

1. Collect speech data

2. Define use cases with the project partners

Environment: The successful candidate will join a team of other researchers and engineers working on the same topics.

Required qualification: PhD in computer science or signal processing

Skills:

● Statistical machine learning and deep learning

● Speech processing and/or natural language processing

● Strong object-oriented programming skills

● Android and/or iOS programming is a strong plus

CONTRACT

Duration: 36 months, full time

Salary: competitive, depending on experience.

Starting date: November 1st, 2018.

APPLICATION & CONTACTS

Send a cover letter, a resume, and references by email to:

● Arnaud Delhay, arnaud.delhay@irisa.fr ;

● Gwénolé Lecorvé, gwenole.lecorve@irisa.fr ;

● Damien Lolive, damien.lolive@irisa.fr.

Application deadline: October 15th, 2018.

Applications will be processed on a daily basis.

Top

6-43(2018-09-20) Inria is seeking a Technical Project Manager for a new European (H2020 ICT) collaborative project COMPRISE, INRIA, Nancy, France

Inria is seeking a Technical Project Manager for a new European (H2020 ICT) collaborative
project called COMPRISE.

COMPRISE is a 3-year Research and Innovation Action (RIA) aiming at new cost-effective,
multilingual, privacy-driven voice interaction technology. This will be achieved through
research advances in privacy-driven machine/deep learning, personalized training,
automatic data labeling, and tighter integration of speech and dialog processing with
machine translation. The technology will be based on existing software toolkits (Kaldi
speech-to-text, Platon dialog processing, Tilde text-to-speech), as well as new software
resulting from these research efforts. The consortium includes academic and industrial
partners in France (Inria, Netfective Technology), Germany (Ascora, Saarland University),
Latvia (Tilde), and Spain (Rooter).

The successful candidate will be part of the Multispeech team at Inria Nancy (France). As
the Technical Project Manager of H2020 COMPRISE, he/she will be responsible for coordinating
the consortium in daily collaboration with the project lead. This includes orchestrating
scientific and technical collaborations as well as reporting, disseminating, and
communicating the results. He/she will also lead Inria's software development and
demonstration tasks.

Besides the management of COMPRISE, the successful candidate will devote half of his/her
time to other activities relevant to Inria. Depending on his/her expertise and wishes,
these may include: management of R&D projects in other fields of computer science,
involvement in software and technology development and demonstration tasks, building of
industry relationships, participation in the setup of academic-industry collaborations,
support with drafting and proofreading new project proposals, etc.

Ideal profile:
- MSc or PhD in speech and language processing, machine learning, or a related field
- at least 5 years' experience after MSc/PhD, ideally in the private sector
- excellent software engineering, project management, and communication skills

Application deadline: October 12, 2018

Starting date: December 1, 2018 or January 1, 2019
Duration: 3 years (renewable)
Location: Nancy, France
Salary: from 2,300 to 3,700 EUR net/month, according to experience

For more details and to apply:
https://jobs.inria.fr/public/classic/en/offres/2018-01033

Top

6-44(2018-09-20) PhD grant at Universidad Politécnica de Madrid, Spain

PhD grant at Universidad Politécnica de Madrid


BIOMARKERS FOR THE DIAGNOSIS AND EVALUATION OF PARKINSON'S DISEASE BASED ON SPEECH AND OCULOGRAPHIC MULTIMODAL STUDIES
Laboratories:  Bioengineering and Optoelectronics Research Group
http://www.byo.ics.upm.es 
Doctoral school:  ETSI Telecomunicación, Universidad Politécnica de Madrid 
Discipline:   Neuroscience, machine learning, digital signal processing 
Supervision:  Juan Ignacio Godino Llorente 
Keywords:  Parkinson’s disease, speech, oculographic signals, multimodal evaluation, early detection 
Research context:   
This thesis project is placed in the context of the BIOMARKERS FOR THE DIAGNOSIS AND EVALUATION OF PARKINSON'S DISEASE BASED ON SPEECH AND OCULOGRAPHIC MULTIMODAL STUDIES (DPI2017-83405-R), financed by the Spanish Ministry of Economy and Competitiveness. 
Summary of the project:
Parkinson's disease is a chronic degenerative disorder that affects the dopamine production centers in the basal ganglia and manifests mainly as dysfunctions in motor systems. The disease affects 2% of the population over 60 years of age, and its prevalence is likely to increase due to the aging trend of the world population. In addition to affecting the quality of life of patients and those around them, the disease carries a loss of productivity and high costs for health systems, so early diagnosis and treatment are vital to alleviate these negative effects. However, to date, there are no early and noninvasive markers of the disease. The literature has identified that voice and oculographic signals are affected even in pre-symptomatic stages, but this has not been exploited to design robust diagnosis and screening systems. This project therefore aims to employ voice and oculographic signals as biomarkers for the design of automatic detection and screening systems based on digital signal processing techniques. To do this, a phonetic-articulatory analysis of speech will be performed together with an analysis of eye movements (saccades, fixations, smooth pursuit...). The project objectives are relevant to the challenge 'health, demographic change and wellbeing', aiming to alleviate the cost associated with the disease for the European healthcare system.
Candidate profile:  
We are looking for dynamic, creative, and motivated candidates with scientific curiosity, strong problem-solving skills, the ability to work both independently and in a team environment, and the desire to push their knowledge limits and areas of confidence to new domains. The candidate should have a Master's degree in Bioengineering, Computer Science, Acoustics, Electronic Engineering, Multimodal Interfaces, or Signal Processing, and experience in signal processing, machine learning, and information retrieval from complex data. A strong interest in bioengineering and multi-disciplinary applications is necessary. The candidate is not expected to already have all the necessary skills, but should show a willingness and ability to rapidly step into new domains.
Summary of conditions:
• Full-time work (37.5 h/week)
• Contract duration: 4 years
• Life insurance
• Estimated incorporation date: beginning of 2019
• Specific conditions of the call
Application: 
Interested candidates should send a CV, transcript of Master’s degree courses, a cover letter (limit 2 pages) detailing their motivations for pursuing a PhD in general and specifically the project described above, and contact information for 2 references that the selection committee can contact. 
Application deadline:       
Complete candidature files should be submitted to ignacio.godino@upm.es before October 10th, 2018.
See also http://www.byo.ics.upm.es/BYO/noticias/phd-position

Top

6-45(2018-08-20) PhD student opportunity at LTCI, Telecom ParisTech, Paris, France

A PhD student opportunity is now available at LTCI, Telecom ParisTech,
Paris, France (https://www.telecom-paristech.fr/eng,
https://ltci.telecom-paristech.fr/about-the-lab/?lang=en)

Framework:
**********
Groups are a fascinating interdisciplinary phenomenon. They can be
defined as bounded and structured entities that emerge from the
purposive, interdependent actions of individuals. One of the current
open challenges in automated group analysis is to provide
computational models of higher-level concepts called emergent states,
that is, states emerging as results of affective, behavioral and
cognitive interactions among the members of a group. Cohesion is one
of these states. It is a dynamic process that is reflected in the
tendency for a group to stick together and remain united in the pursuit of
its instrumental objectives and/or for the satisfaction of members'
affective needs. Cohesion is considered a highly valued group
property serving crucial roles for group effectiveness and
performance. Scholars have proposed theoretical models of cohesion having
from one to five dimensions.

Among these dimensions, the task and social ones have been the most
investigated. The task dimension concerns the extent to which group
members are united to achieve the group's goals and objectives; the
social dimension refers to the social relationships within the group
(e.g. the extent to which group members like each other). The thesis
will focus on the development of a computational model of cohesion
among humans, able to integrate its task and social dimensions and
also accounting for their relationship and their development over time.
This work will be conducted in the framework of the ANR JCJC French
national project GRACE (Groups' Analysis for automated Cohesion
Estimation).


Tasks:
******
- State-of-the-art on cohesion to identify which are its most suitable
and frequent multimodal behavioral descriptors. State-of-the-art will
span several research fields, including sociology, psychology, and
computer science
- Computation of multimodal behavioral descriptors of cohesion
- Designing and performing experiments to collect a multimodal data
set on cohesion
- Designing, implementing, and evaluating a computational model of
cohesion

Profile:
********
The ideal candidate should have a strong academic background in one or
more of the following fields: Computer Science, AI, Machine Learning,
Human-Computer Interaction, Information Technology, Affective
Computing, Social Signal Processing, or closely related fields. In
addition to a passion for science and programming, the candidate should
be open to extending their thinking to issues linked to Human-Computer
Interaction. Moreover, the ideal candidate should have:
- Interest in multidisciplinary research at the interface between
computer science and sociology/psychology
- Excellent academic track record
- Good command of English (written and spoken)
- Strong programming skills (C++/Python)
- Very good communication skills, commitment, independent working
style as well as initiative and team spirit

Offer:
******
Starting date: between winter 2018 and spring 2019. This is flexible
and can be negotiated with the supervisor within the above-mentioned
time frame.

Application deadline: the evaluation of candidates starts
immediately and will continue until the position is filled.
To apply, please email giovanna.varni@telecom-paristech.fr a
single PDF file including:
- A cover letter stating your research interests and how they relate
to the research topic of the thesis
- A detailed CV
- Transcripts of records for your MSc
- A list of at least 2 referees
- Recommendation letters

You are encouraged to contact Prof. Giovanna Varni for more information.
Please quote "PhD position" in the email subject, both when asking for
information and when applying.

6-46(2018-10-05) 18-month postdoc, Laboratoire national de métrologie et d'essais, Trappes (78), France

 

18-MONTH POSTDOC: Transparency and explainability of voice comparison in the forensic domain

Location: Laboratoire national de métrologie et d'essais, Trappes (78)

REF: ML/VOX/DE

 

CONTEXT:

The ANR project VoxCrim (2017-2021) aims to establish, on an objective scientific basis, the conditions under which voice comparison can be carried out in the forensic domain. It has two main objectives: a) to set up a methodology ensuring the effectiveness and competence of laboratories that perform voice comparisons, and b) to establish standards for objective measurements.

The tools and methodologies used in voice comparison need to be evaluated, and they must be used within an explainable and transparent framework. The work carried out in the project will thus make voice comparisons easier to handle within police services and will strengthen the admissibility of the evidence in court.

The Laboratoire national de métrologie et d'essais (LNE) brings to the project its expertise in metrology, standardization, accreditation and interlaboratory comparison, with the aim of building a practical methodological solution that makes the voice comparison process transparent and explainable.

 

TASKS:

The work is organized into three tasks:

- Specification of the validation protocol for voice comparison methods, more specifically in the forensic domain. Building on existing reference standards and methodologies, the postdoctoral researcher will identify the needs and possibilities for setting up a reference protocol.

- The postdoctoral researcher will check that the identified protocol is consistent with the voice comparison metrics identified by the researchers of the computer science and phonetics laboratories associated with the project. He or she will also ensure that the protocol is compatible with the working methods of the scientific centers of the Police and the Gendarmerie, which are members of the project.

- He or she will help organize an interlaboratory comparison based on this protocol.

The postdoctoral researcher will be supported in this work by several LNE teams (evaluation of information processing systems, mathematics and statistics, and metrology), and will interact regularly with the other laboratories and scientific centers that are members of the project.

Publications (and presentations, where applicable) in international conferences and journals are expected from the postdoctoral researcher.

 

Bibliography: Bonastre, J. F., Kahn, J., Rossato, S., & Ajili, M. (2015). Forensic speaker recognition: Mirages and reality. In Speech Production and Perception: Speaker-Specific Behavior. hal-01473992.

 

DURATION:

18 months, starting in January 2019.

 

PROFILE:

You hold a PhD in computer science or language sciences, with a specialization in automatic speech processing.

You have knowledge of evaluation methodology and voice biometrics.

To apply, please send your CV and cover letter to recrut@lne.fr, quoting the reference ML/VOX/DE.

 


6-47(2018-10-09) 2 permanent positions at the European Language resources Distribution Agency (ELDA), Paris, France

The European Language resources Distribution Agency (ELDA), a company specialised in Human Language Technologies within an international context, is currently seeking to fill 2 permanent positions with immediate start:

 

 

  • Web Developer position (m/f): Under the supervision of the technical department manager, the responsibilities of the Web Developer consist in designing and developing web applications and software tools for linguistic data management.
    Some of these software developments are carried out within the framework of European research and development projects and are published as free software.
    Depending on the profile, the Web Developer could also participate in the maintenance and upgrading of the current linguistic data processing toolchains, while being hands-on whenever required by the language resource production and management team.

  • Research and Development Engineer (m/f): Under the supervision of the CEO, the responsibilities of the R&D Engineer include designing, developing, documenting, deploying and maintaining tools, software components or applications for language resource production and management. He/she will be in charge of managing the current language resource production workflows and co-ordinating ELDA's participation in R&D projects, while also being hands-on whenever required by the language resource production and management team. He/she will liaise with external partners at all phases of the projects (submission to calls for proposals, building and management of project teams) within the framework of international, publicly- or privately-funded research and development projects.

Both positions are based in Paris.

Please check the profile details for each open position here: http://www.elra.info/en/opportunities/

Contact: job@elda.org




