ISCApad #257
Tuesday, November 12, 2019, by Chris Wellekens
6-1 | (2019-05-17) 2 PhDs in Trinity College Dublin, Ireland
Two PhD positions at Trinity College Dublin, Ireland, to start in September 2019. Both come with a stipend of €18,000 per year, along with full student fees, for a 4-year period. Please contact me at nharte@tcd.ie if interested in either post.
Human Speech? How do I know it’s Real? 20 years ago, the major focus in developing speech synthesis systems was testing the intelligibility of the output speech. More recently, attention has switched to assessing not only intelligibility, but also naturalness, pleasantness, pauses, stress, intonation, emotion and listening effort. The intelligibility of systems is now so high that synthetic voices are becoming more human-like. This is good news for generating realistic synthetic speech for applications such as voice reconstruction or gaming. In tandem, research in the area of speaker verification, or voice-based biometrics, has started to pay closer attention to the issue of spoofing, where systems are attacked with reconstructed speech. Now, with improvements in speech synthesis, another realistic form of spoofing is the use of synthetic speech generated by modelling the target user. So how can you tell when speech is real, and when it is fake? This is the focus of this PhD project, and it goes to the very core of the nature of human speech.
Remote and Automatic Monitoring of Bird Populations The objective of this PhD project is to define next-generation approaches to the remote monitoring of populations of birds of conservation concern, by leveraging recent developments in speech and language processing technologies. The PhD will develop appropriate approaches to acoustic data collection in the wild, to ensure that acoustic surveys yield accurate bird population data, and will investigate the audio signal analysis steps necessary to extract useful information from these long recordings. Approaches will involve the use of signal processing and deep learning. The research will be conducted in collaboration with the Dept of Zoology at TCD.
-- Naomi Harte, Associate Professor, School of Engineering, Trinity College Dublin
6-2 | (2019-05-17) Post-doc/PhD position: Pattern mining for neural network debugging, application to speech recognition, INRIA, Rennes, France
Post-doc/PhD position: Pattern mining for neural network debugging, with application to speech recognition
Advisors: Elisa Fromont & Alexandre Termier, IRISA/INRIA RBA - Lacodam team (Rennes)
Irina Illina & Emmanuel Vincent, LORIA/INRIA - Multispeech team (Nancy)
firstname.lastname@inria.fr
Location: INRIA RBA, team Lacodam (Rennes)
Keywords: discriminative pattern mining, neural networks analysis, explainability of blackbox models, speech recognition.
Context:
Understanding the inner workings of deep neural networks (DNNs) has attracted a lot of attention in recent years [1, 2], and most problems have been detected and analyzed using visualization techniques [3, 4]. Those techniques help to understand what an individual neuron or a layer of neurons is computing. We would like to go beyond this by focusing on groups of neurons that are commonly highly activated when a network is making wrong predictions on a set of examples. In the same line as [1], where the authors theoretically link how a training example affects the predictions for a test example using so-called "influence functions", we would like to design a tool to "debug" neural networks by identifying, using symbolic data mining methods, (connected) parts of the neural network architecture associated with erroneous or uncertain outputs.
In the context of speech recognition, this is especially important. A speech recognition system contains two main parts: an acoustic model and a language model. Nowadays both models are trained with deep neural network (DNN) based algorithms and use very large learning corpora, which requires setting a large number of DNN hyperparameters. There is much work on tuning these hyperparameters automatically. However, this induces a huge computational cost, and does not empower the human designers. It would be much more efficient to provide human designers with understandable clues about the reasons for the bad performance of the system, in order to benefit from their creativity to quickly reach more promising regions of the hyperparameter search space.
Description of the position:
This position is funded in the context of the HyAIAI "Hybrid Approaches for Interpretable AI" INRIA project lab (https://www.inria.fr/en/research/researchteams/inria-project-labs). With this position, we would like to go beyond the current common visualization techniques that help to understand what an individual neuron or a layer of neurons is computing, by focusing on groups of neurons that are commonly highly activated when a network is making wrong predictions on a set of examples. Tools such as activation maximization [8] can be used to identify such neurons. We propose to use discriminative pattern mining, and, to begin with, the DiffNorm algorithm [6] in conjunction with LCM [7], to identify the discriminative activation patterns among the identified neurons.
The data will be provided by the MULTISPEECH team and will consist of two deep architectures, representative of acoustic and language models [9, 10], together with the training data from which the model parameters ultimately derive. We will also extend our results by performing experiments with supervised and unsupervised learning to compare the features learned by these networks and to perform qualitative comparisons of the solutions learned by various deep architectures. Identifying "faulty" groups of neurons could lead to the decomposition of the DL network into "blocks" encompassing several layers. "Faulty" blocks may be the first to be modified in the search for a better design.
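To make the target analysis concrete, here is a deliberately naive sketch (a brute-force stand-in, not the DiffNorm [6] or LCM [7] algorithms the project would actually use): binarize the neuron activations, split the examples into erroneous and correct predictions, and rank groups of neurons by how much more often they co-activate on errors.

import itertools
import numpy as np

def discriminative_neuron_groups(activations, errors, threshold=0.5,
                                 max_size=2, min_support=0.2):
    # activations: (n_examples, n_neurons); errors: boolean, True on wrong predictions
    active = activations > threshold               # binarize: a neuron 'fires' or not
    err, ok = active[errors], active[~errors]
    results = []
    for size in range(1, max_size + 1):
        for group in itertools.combinations(range(active.shape[1]), size):
            cols = list(group)
            supp_err = err[:, cols].all(axis=1).mean()  # support on erroneous examples
            supp_ok = ok[:, cols].all(axis=1).mean()    # support on correct examples
            if supp_err >= min_support:
                results.append((group, supp_err - supp_ok))  # discriminativeness score
    return sorted(results, key=lambda r: -r[1])

Exhaustive enumeration as above is intractable for real networks, which is precisely why efficient closed-itemset miners such as LCM, combined with the DiffNorm scoring, are needed.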
The recruited person will benefit from the expertise of the LACODAM team in pattern mining and deep learning (https://team.inria.fr/lacodam/) and from the expertise of the MULTISPEECH team (https://team.inria.fr/multispeech/) in speech analysis, language processing and deep learning. We would ideally like to recruit a post-doc for 1 year (with possibly one additional year), with the following preferred skills:
- Some knowledge of (or interest in) speech recognition
- Knowledge of pattern mining (discriminative pattern mining is a plus)
- Knowledge of machine learning in general and deep learning in particular
- Good programming skills in Python (for Keras and/or TensorFlow)
- Very good English (understanding and writing)
However, good PhD applications will also be considered and, in this case, the position will last 3 years. The position will be funded by INRIA (https://www.inria.fr/en/). See the INRIA web site for the post-doc and PhD wages.
The candidates should send a CV, 2 names of referees and a cover letter to the four researchers (firstname.lastname@inria.fr) mentioned above. Please indicate if you are applying for the post-doc or the PhD position. The selected candidates will be interviewed in June for an expected start in September 2019.
Bibliography:
[1] Pang Wei Koh, Percy Liang: Understanding Black-box Predictions via Influence Functions. ICML 2017: pp 1885-1894 (best paper).
[2] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals: Understanding deep learning requires rethinking generalization. ICLR 2017.
[3] Anh Mai Nguyen, Jason Yosinski, Jeff Clune: Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. CVPR 2015: pp 427-436.
[4] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus: Intriguing properties of neural networks. ICLR 2014.
[5] Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi: Deep Text Classification Can be Fooled. IJCAI 2018: pp 4208-4215.
[6] Kailash Budhathoki and Jilles Vreeken. The difference and the norm: characterising similarities and differences between databases. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 206-223. Springer, 2015.
[7] Takeaki Uno, Tatsuya Asai, Yuzo Uchida, and Hiroki Arimura. LCM: An efficient algorithm for enumerating frequent closed item sets. In FIMI, volume 90. Citeseer, 2003.
[8] Dumitru Erhan, Yoshua Bengio, Aaron Courville, and Pascal Vincent. Visualizing higher-layer features of a deep network. University of Montreal, 1341(3):1, 2009.
[9] G. Saon, H.-K. J. Kuo, S. Rennie, M. Picheny: The IBM 2015 English conversational telephone speech recognition system, Proc. Interspeech, pp. 3140-3144, 2015.
[10] W. Xiong, L. Wu, F. Alleva, J. Droppo, X. Huang, A. Stolcke : The Microsoft 2017 Conversational Speech Recognition System, IEEE ICASSP, 2018.
6-3 | (2019-05-18) Tenure-track assistant professor at the Faculty of Medicine at the University of Toronto, Canada
The Department of Speech-Language Pathology, in the Faculty of Medicine at the University of Toronto, invites applications for a full-time tenure-stream appointment in the field of pediatric language disorders. The appointment will be at the rank of Assistant Professor and will commence on January 1, 2020 or shortly thereafter.
Research excellence should be demonstrated by high-quality peer-reviewed publications, peer-reviewed funding, the submitted research statement, presentations at significant conferences, awards and accolades, and strong endorsements from referees of high standing.

Evidence of excellence in teaching will be provided through teaching accomplishments and awards, the teaching dossier, including a detailed teaching statement, sample course syllabi, and teaching evaluations submitted as part of the application, as well as strong letters of reference.

Salary will be commensurate with qualifications and experience. This is an exceptional opportunity to join the Department of Speech-Language Pathology at the University of Toronto, one of the most highly ranked research universities in North America. The department is housed in the Rehabilitation Sciences Building, which provides excellent teaching and research facilities. The University of Toronto offers unique opportunities for collaborative and interdisciplinary research, encourages innovative scholarship, and provides the prospect of teaching a diverse student population.

Applicants must also arrange to have three letters of reference sent directly by the referee via email (on letterhead and signed) to search.slp@utoronto.ca by the closing date. If you have questions about this position, please contact search.slp@utoronto.ca. All application materials must be submitted online. All application materials, including reference letters, must be received by July 18, 2019.
6-4 | (2019-05-19) PhD position at LPNC, Grenoble, France
SUBJECT TITLE: Bio-Bayes Predictions -- Coupling Biological and Bayesian Predictive
6-5 | (2019-05-20) PhD position in cognitive robotics, Université de Grenoble, France
Workplace: Grenoble. Publication date: 1 May 2019. Scientific supervisors: Gérard Bailly (DR CNRS, GIPSA-Lab, U. Grenoble Alpes) and Pascal Huguet (DR CNRS, LAPSCO, U. Clermont Auvergne). Contract type: fixed-term doctoral contract. Contract duration: 36 months. Thesis start date: 1 October 2019. Working time: full time. Salary: €2,135.00 gross per month. Thesis topic: Impact of the presence of attentive and benevolent robots on human behaviour and cognitive functioning.

Context
Humanoid social robotics aims to create robots capable of interacting with human partners for cooperative purposes, in areas as varied as assistance to the elderly, skill learning by children, or cobotics in Industry 4.0. The objective is often to substitute a robot for a human or animal partner in tasks where the endurance, speed or flexibility of the social attributions of these social avatars are beneficial. While many face-to-face interaction studies show the impact of the verbal [1] and co-verbal [2] behaviours of social robots on those of their human partners, few studies have compared the effects of robotic vs. human presence in the monitoring of tasks in which the robot or its human counterpart is not directly engaged. Notably, two recent studies conducted at LAPSCO [3] [4] successfully replicated, under robotic presence, the generally beneficial influence of human presence on attention control in tasks that require suppressing a cognitive automatism detrimental to the target activity (e.g. [5]).

Topic and work plan
We propose to explore the impact of the behaviour of a humanoid robot endowed with extended verbal and co-verbal communication capabilities [6] on the performance (in terms of cognitive control) of subjects involved in tasks in which the robot is either directly involved or simply present in the interaction environment. We will vary the degree to which the robot intervenes in the main task, in order to identify the signals likely to optimize the benefits attached to humanoid social robots (and, in the long run, their acceptability) as well as the more detrimental signals to be eliminated.

Within this thesis, we will explore the impact of the robot's behaviour (first teleoperated by a human pilot [7], then driven by models learned from these behavioural data [8]) on the observable behaviour of the subjects (e.g., cognitive performance, verbal and co-verbal signals, notably gaze), on their physiological signals (i.e., skin conductance, respiratory and cardiac rhythms) and on their brain activity (i.e., study of the error-related negativity by EEG).

Scientific and technological outcomes
The expected outcomes of this thesis are multiple: first, models predicting the links between observable behaviours, physiological variables and underlying brain activity, as a function of the observed performance and of the robot's active and reactive behaviours; second, a set of behaviour strategies for attentive and benevolent robots, adapted to the psychological profiles of the subjects and to task-specific performance indicators; and finally, protocols for the automatic assessment of these psychological profiles by interactive social robots, which could be deployed in applications in health, education or human resources assessment.

References
[1] K.-Y. Chin, Z.-W. Hong, and Y.-L. Chen, “Impact of using an educational robot-based learning system on students’ motivation in elementary education,” IEEE Transactions on learning technologies, vol. 7, no. 4, pp. 333–345, 2014.
[2] S. Andrist, X. Z. Tan, M. Gleicher, and B. Mutlu, “Conversational gaze aversion for humanlike robots,” presented at the Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction, 2014, pp. 25–32.
[3] N. Spatola et al., “Improved Cognitive Control in Presence of Anthropomorphized Robots,” International Journal of Social Robotics, pp. 1–14, 2019.
[4] N. Spatola et al., “Not as bad as it seems: When the presence of a threatening humanoid robot improves human performance,” Science Robotics, vol. 3, no. 21, eaat5843, 2018.
[5] D. Sharma, R. Booth, R. Brown, and P. Huguet, “Exploring the temporal dynamics of social facilitation in the Stroop task,” Psychonomic bulletin & review, vol. 17, no. 1, pp. 52–58, 2010.
[6] Parmiggiani, Alberto, Elisei, Frédéric, Maggiali, Marco, Randazzo, Marco, Bailly, Gérard, and Metta, Giorgio, “Design and validation of a talking face for the iCub,” International Journal of Humanoid Robotics, vol. 12, no. 3, 20 pages, 2015.
[7] R. Cambuzat, F. Elisei, G. Bailly, O. Simonin, and A. Spalanzani, “Immersive teleoperation of the eye gaze of social robots,” in Int. Symposium on Robotics (ISR), Munich, Germany, 2018, pp. 232–239.
[8] Nguyen, Duc-Canh, Bailly, Gérard, and Elisei, Frédéric, "Learning off-line vs. on-line models of interactive multimodal behaviors with Recurrent Neural Networks," Pattern Recognition Letters (PRL), pp. 29-36, 2017.

Within the Speech & Cognition department of GIPSA-Lab (Grenoble, France), the Cognitive Robotics, Interactive Systems & Speech Processing (CRISSP) team develops models of multimodal behaviours (speech, gestures, gaze, etc.) for humanoid robots interacting with human partners. It relies on the technical facilities of GIPSA-Lab, in particular the MICAL robotic platform, which hosts the iCub robot NINA and the associated development tools. This thesis is part of the team's project to develop robots engaged in goal-directed tasks requiring fine control of engagement. It is funded by the CNRS 80th-anniversary project "Humanoid Social Robotics and Cognition" (RSHC), involving researchers from three laboratories: LAPSCO and LIMOS in Clermont-Ferrand and GIPSA-Lab in Grenoble. The thesis will be attached to the EEATS doctoral school in Grenoble. Travel and three short stays in Clermont-Ferrand are to be expected, in order to collaborate with the LAPSCO and LIMOS researchers at UCA.

The candidate must hold an engineering degree and/or a master's degree in Cognitive Science or Interactive Robotics. The position requires solid skills in experimentation, programming and statistical analysis. An initial background in neuroscience is a plus. The candidate must have good oral and written communication skills (French and English required) in order to present at conferences and write journal articles. We are looking for a young researcher who is curious, autonomous, committed to the project, and strongly motivated to develop skills in the synthesis and evaluation of behaviours in the field of human-robot interaction. The candidate must also be able to work in a team on multidisciplinary projects.

Applications should include a detailed CV; at least two references (people who may be contacted); a one-page cover letter; a one-page summary of the master's thesis; and transcripts of grades from the Master 1 and 2 years or from engineering school.
The deadline for applications is 15 July 2019.
6-6 | (2019-05-28) Researcher position, fixed-term contract of 12 months, IRCAM, Paris
Researcher position (W/M) at IRCAM
Fixed-term contract of 12 months

INTRODUCTION TO IRCAM: IRCAM is a leading non-profit organization associated with the Centre Pompidou, dedicated to music production, scientific research and education in sound and music technologies. In the joint research unit UMR 9912 STMS (Science and Technology for Music and Sound), supported by IRCAM, Sorbonne University and the CNRS, specialized research teams conduct research and development in the areas of acoustics, audio signal processing, human-computer interaction, computer music and musicology. IRCAM is located in the centre of Paris near the Centre Pompidou, at 1, Place Igor Stravinsky, 75004 Paris.

POSITION DESCRIPTION: IRCAM is looking for a researcher for the development of music content analysis technologies (such as instrument recognition, audio quality, auto-tagging) in the framework of a technology transfer to the Universal Music Group. The hired researcher will also collaborate with the development team and participate in the project activities (evaluation of technologies, meetings, specifications, reports).

REQUIRED EXPERIENCE AND SKILLS:
- Very strong skills in audio signal processing (spectral analysis, audio-feature extraction, parameter estimation); the candidate should preferably hold a PhD in this field.
- Very strong skills in machine learning (SVM, ConvNet); the candidate should preferably hold a PhD in this field.
- Good skills in distributed computing.
- Strong skills in Matlab and Python programming; skills in C/C++ programming.
- Good knowledge of UNIX environments (GNU/Linux or macOS).
- High productivity, methodical work, excellent programming style.

SALARY: According to background and experience.

Deadline for application: June 17th, 2019.

Please send an application letter with the reference 201906UMGRES together with your resume and any suitable information addressing the above issues, preferably by email, to: mignot at ircam dot fr with cc to vinet at ircam dot fr and roebel at ircam dot fr.
6-7 | (2019-05-30) Funded PhD position at Avignon University, France
PhD title: Decentralized collaborative learning for speech recognition in a context of privacy protection

PhD project
The candidate will contribute to the DEEP-PRIVACY project funded by the French national research agency (ANR). DEEP-PRIVACY aims at developing private-by-design acoustic models for use in automatic speech recognition systems. Indeed, more and more devices, mobile or not, invest our daily lives. These devices use voice technology to interact with human users, who can access IT services through natural interaction (Apple Siri, Amazon Alexa, Google Home...). With the stated goal of improving the quality of their services, which includes automatic speech recognition, companies that offer such solutions usually collect private audio data from these devices. This constitutes a real risk for privacy.

The PhD proposal is to study and propose approaches for distributed and decentralized learning of neural acoustic models in which user data remain on the user's device. A first objective consists in experimenting and comparing different approaches from the literature on the particular case of acoustic models in speech processing tasks. We will focus on machine learning algorithms where data are collected locally on every peer and are not transmitted to a central server; communications are restricted to models or updates of weights computed locally on the device. The cases of centralized federated learning approaches [Konečný et al. 2016, Leroy et al. 2018] and purely decentralized approaches [Bellet et al. 2017] will be studied. Comparisons will include several trade-offs, involving for instance model size and communication costs.

Beyond the implementation and experimentation of federated learning and distributed collaborative learning approaches for automatic speech recognition, a study on the nature of the information conveyed (paralinguistics, phonetics, lexicon) during these exchanges will also be conducted, to assess the level of privacy protection provided by the proposed approaches. More precisely, we will study the possibility of recognizing, at a level that will have to be quantified if necessary, the speaker and the phonemes, or even the words uttered, by analyzing the values of the exchanged updates. We will also investigate how to locally adapt the acoustic models to the voice and speech characteristics of the user in order to obtain personalized models; the amount of adaptation data and the adaptation time will be studied. In general, the thesis will keep a critical eye on the advantages and disadvantages of collaborative machine learning in a real application framework [Bhowmick et al. 2018; Hitaj et al. 2018]. In view of the results of these experiments and analyses, the purpose of the thesis will be to propose effective solutions to overcome the disadvantages and to propose an effective framework linking privacy protection and improvement of acoustic modeling for speech recognition in a distributed deployment context.
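As a minimal sketch of one round of the centralized federated learning scheme cited above [Konečný et al. 2016; Leroy et al. 2018] (assuming PyTorch; local_update is a hypothetical function that runs a few epochs of training on one user's private data, on-device):

import copy
import torch

def federated_round(global_model, user_loaders, local_update):
    # Each device trains a copy of the global model on its own local speech
    # data; only the resulting weight tensors are sent back, never raw audio.
    local_states = []
    for loader in user_loaders:
        local_model = copy.deepcopy(global_model)
        local_update(local_model, loader)        # local SGD on private data
        local_states.append(local_model.state_dict())
    averaged = {
        key: torch.stack([s[key].float() for s in local_states]).mean(dim=0)
        for key in local_states[0]
    }
    global_model.load_state_dict(averaged)       # server-side averaging step
    return global_model

The thesis would study, among other things, what such exchanged weight updates still reveal about the speaker, which this sketch does not address.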
Technical skills required:
• Master's degree in machine learning or in computer science
• Background in deep learning and in statistics
• Experience with deep learning tools is a plus (preferably PyTorch)
• Good programming skills (preferably in Python)
• Experience in speech and/or speaker recognition is a plus

General information
This PhD thesis fits within the scope of a collaborative project (project DEEP-PRIVACY, funded by the French National Research Agency) involving the LIA (Laboratoire Informatique d'Avignon), the MAGNET team of Inria Lille - Nord Europe, the LIUM (Laboratoire d'Informatique de l'Université du Mans) and the MULTISPEECH team of Inria Nancy - Grand-Est. This PhD position is in collaboration with Avignon University, and will be co-supervised by Yannick Estève (https://cv.archives-ouvertes.fr/yannick-esteve) and Marc Tommasi (http://researchers.lille.inria.fr/tommasi). The work location will mainly be in Avignon (LIA, Avignon University), with some stays in Lille (CRIStAL / INRIA Lille Nord Europe). Knowledge of the French language is not required.

Additional information: three-year work contract, with a monthly net salary of approximately 1685€/month, and financial support for international research training and conference participation, plus a contribution to the research costs.
Duration: 3 years
Starting date: September or October 2019
Contacts: Yannick ESTÈVE yannick.esteve@univ-avignon.fr ; Marc TOMMASI marc.tommasi@inria.fr
Doctoral School: Sciences et Agrosciences (Avignon University)

References
[Bellet et al. 2017] Bellet, A., Guerraoui, R., Taziki, M., & Tommasi, M. (2017). Personalized and private peer-to-peer machine learning. arXiv preprint arXiv:1705.08435.
[Bhowmick et al. 2018] A. Bhowmick, J. Duchi, J. Freudiger, G. Kapoor, and R. Rogers. Protection Against Reconstruction and Its Applications in Private Federated Learning. arXiv preprint https://arxiv.org/abs/1812.00984
[Hitaj et al. 2018] B. Hitaj, G. Ateniese, F. Perez-Cruz. Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning. arXiv preprint https://arxiv.org/abs/1702.07464
[Konečný et al. 2016] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, & D. Bacon (2016). Federated learning: Strategies for improving communication efficiency. arXiv preprint https://arxiv.org/abs/1610.05492
[Leroy et al. 2018] D. Leroy, A. Coucke, T. Lavril, T. Gisselbrecht, J. Dureau. Federated Learning for Keyword Spotting. arXiv preprint https://arxiv.org/abs/1810.05512
6-8 | (2019-06-01) PhD position in NLP at LORIA, Nancy, France
Automatic classification of hate speech posted on the Internet using deep learning
Supervisors: Irina Illina, MdC, HDR, Dominique Fohr, CR CNRS
Team: Multispeech, LORIA-INRIA, France
Contact: illina@loria.fr, dominique.fohr@loria.fr
Duration of PhD thesis: 3 years
Deadline to apply: June 30th, 2019
Required skills: background in statistics, natural language processing and computer programming skills (Perl, Python). Candidates should email a detailed CV with diplomas.
Keywords: hate speech, social media, natural language processing.
The rapid development of the Internet and social networks has brought great benefits to women and men in their daily lives. Unfortunately, the dark side of these benefits has led to an increase in hate speech and terrorism, among the most common and powerful threats on a global scale. Hate speech is a type of offensive communication mechanism that expresses an ideology of hatred, often using stereotypes. Hate speech can target different societal characteristics such as gender, religion, race, disability, etc., and it is the subject of different national and international legal frameworks. Hate speech is a type of terrorism and often follows a terrorist incident or event.
Social networks are incredibly popular today. Nowadays, Twitter, LinkedIn, Facebook and YouTube are used as standard tools for communicating ideas, beliefs and feelings. Only a small percentage of people use parts of these networks for unhealthy activities such as hate speech and terrorism, but the impact of this low percentage of users is extremely damaging. For years, social media companies such as Twitter, Facebook and YouTube have invested hundreds of millions of dollars each year in detecting, classifying and moderating hate. But these efforts are mainly based on manually reviewing the content to identify and remove offensive material, which is extremely expensive.
This thesis aims at designing automatic and evolving methods for the classification of hate speech in the field of social media. Despite the studies already published on this subject, the results show that the task remains very difficult. We will use semantic content analysis methodologies from natural language processing (NLP) and methodologies based on deep neural networks (DNN), the current revolution in the field of artificial intelligence. During this thesis, we will develop a research protocol to classify hate speech in text as hateful, aggressive, insulting, ironic, neutral, etc. This type of problem falls within the context of multi-label classification.
In addition, the problem of word obfuscation in hate messages will need to be addressed. People who want to post hate speech on the Internet know that they risk being censored by rudimentary automatic moderation systems, so they try to disguise their words by altering their spelling.
Among the crucial points of this thesis are the choice of the DNN architecture and a relevant representation of the data, i.e., the text of the Internet message. The designed system will be validated on real social network streams.
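As a minimal sketch of the multi-label set-up described above (hypothetical label set, vocabulary size and architecture in Keras, not the model to be developed in the thesis): one sigmoid output per label, so that a message can be, for example, both hateful and ironic.

import tensorflow as tf

NUM_WORDS = 20000   # vocabulary size (illustrative value)
LABELS = ['hateful', 'aggressive', 'insulting', 'ironic', 'neutral']

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(NUM_WORDS, 128),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(len(LABELS), activation='sigmoid'),  # one score per label
])
# Binary cross-entropy treats each label independently (multi-label),
# unlike softmax with categorical cross-entropy (single-label).
model.compile(optimizer='adam', loss='binary_crossentropy')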
Skills
- Strong background in mathematics, machine learning (DNN), statistics
- Strong experience with natural language processing is welcome
- Excellent English writing and speaking skills are required in any case
References:
T. Gröndahl, L. Pajola, M. Juuti, M. Conti, N. Asokan (2018). "All You Need is 'Love'": Evading Hate-speech Detection. arXiv preprint arXiv:1808.09115.
Wiegand, M., Klakow, D. (2008). Optimizing Language Models for Polarity Classification. In Proceedings of ECIR, pp. 612-616.
Wiegand, M., Ruppenhofer, J. (2015). Opinion Holder and Target Extraction based on the Induction of Verbal Categories. In Proceedings of CoNLL, pp. 215-225.
Wiegand, M., Ruppenhofer, J., Schmidt, A., Greenberg, C. (2018). Inducing a Lexicon of Abusive Words - A Feature-Based Approach. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Wiegand, M., Wolf, M., Ruppenhofer, J. (2017) Negation Modeling for German Polarity Classification. In Proceedings of GSCL.
Zhang Z., Luo L. (2018). Hate speech detection: a solved problem? The Challenging Case of Long Tail on Twitter. arxiv.org/pdf/1803.03662
6-9 | (2019-06-07) PhD grant at ISIR and STMS, Paris, France
Multimodal modelling of expressivity and alignment for human-machine interaction
Thesis director: Catherine Pelachaud (ISIR)
Co-supervisor: Nicolas Obin (STMS)

Context
This thesis takes place in a context that is particularly rich in the development of communication interfaces between humans and machines. For instance, the emergence and democratization of personal assistants (smartphones, home assistants, chatbots) are making interaction with machines a daily reality for more and more individuals. This practice tends to grow and to generalize to a large number of human uses and practices: from reception agents (today, a few Pepper robots, more for demonstration than for real use) to remote consultation or agents embedded in autonomous vehicles. The expansion of uses calls for an extension of the interaction modalities and an improvement of the quality of interaction with the machine. Today, voice is the preferred modality of interaction, and interaction scenarios remain very limited (information requests, question answering, no real sustained interaction). The main limitations are, on the one hand, low expressivity: the agent's behaviour is still often monomodal (voice only, as with the Alexa or Google Home assistants) and remains very monotonous, which greatly limits the acceptability, duration and quality of the interaction; and, on the other hand, the agent's behaviour is little or not at all adapted to the interlocutor, which reduces the human's engagement in the interaction. In human-human interaction, alignment phenomena (e.g., tone of voice, speed of body movement) are cues of mutual understanding and engagement in the interaction (Pickering and Garrod, 1999; Castellano et al., 2012). Engagement is marked by nonverbal social behaviors at specific moments of the interaction: these can be feedback signals (to indicate being in phase with the interactant), a form of imitation (for example, a smile calls for another smile, the tone of voice takes up elements of the interactant's), or signals synchronized with those of the interactant (turn-taking management). This thesis aims to model the agent's behaviour as a function of the user's, so that the agent can display its attentional engagement in order to maintain the interaction and make its messages more comprehensible. The adaptation of the agent's behaviour will take place at different behavioural levels (prosodic, lexical, behavioural, imitation, turn-taking...). Human-machine interaction, with strong application potential in many domains, is an example of the necessary interdisciplinarity between digital humanities, robotics and artificial intelligence.

Objective
The objective of the thesis is to better understand and model the mechanisms that govern multimodal (voice and gesture) interaction between a human and a machine, in order to remove technological obstacles and to design a conversational agent able to adapt naturally and coherently to a human interactant. To this end, the agent should be:
1) Expressive (Léon, 1993): capable of varied and coherent expression, in order to maintain the interlocutor's attention, emphasize important points, improve the quality of the interaction and extend its duration (beyond one or two speaking turns);
2) Aligned with the interlocutor's multimodal behaviour (Pickering and Garrod, 1999; Castellano et al., 2012; Clavel et al., 2016): that is, capable of adapting its behaviour to the interlocutor's, in order to reinforce the latter's engagement in the interaction.

As a first step, the thesis will design a unified neural architecture for the generative modelling of the agent's multimodal behaviour. The agent's prosodic (Obin, 2011; Obin, 2015) and gestural (Pelachaud, 2009) expressivity will be modelled with the recurrent neural architectures now commonly used for voice and gesture (Bahdanau et al., 2014; Wang, 2017; Robinson & Obin, 2019). The thesis will focus on two essential aspects of modelling the agent's behaviour: developing architectures structured over several time scales, to improve the modelling of prosodic and gestural variability at the sentence and discourse levels (Le Moine & Obin, 2019), and learning a coherent multimodal behaviour through shared multimodal attention mechanisms applied to the synchronicity of the generated prosodic and gestural profiles (He, 2018). As a second step, the thesis will tackle the alignment of the agent's behaviour with the human's. It will notably build on interactive and imitation learning to coherently adapt the agent's multimodal behaviour to the human (Weber, 2018; Mancini, 2019), using available dialogue databases (such as NoXi (collected at ISIR and annotated in terms of engagement), IEMOCAP (USC, Carlos Busso), and Gest-IS (Edinburgh University, Katya Saint-Amard)) to learn the relation between the interlocutors' prosodic and behavioural profiles, as well as their adaptation during the interaction.

The thesis will be co-supervised by Catherine Pelachaud, of the PIRoS team at ISIR, specialized in human-machine interaction and conversational agents, and by Nicolas Obin, of the Sound Analysis and Synthesis (AS) team at STMS, specialized in generative modelling of speech signals. The PhD student will also benefit from the knowledge, know-how and tools available at STMS and ISIR (for example: the ircamTTS speech synthesizer developed at STMS, the GRETA platform developed at ISIR) and from the computing facilities at STMS (computing servers, GPUs).

Bibliography
(Bevacqua et al., 2012) Elisabetta Bevacqua, Etienne de Sevin, Sylwia Julia Hyniewska, Catherine Pelachaud, A listener model: Introducing personality traits, Journal on Multimodal User Interfaces, special issue Interacting ECAs, Elisabeth André, Marc Cavazza and Catherine Pelachaud (Guest Editors), July 2012, 6(1-2), pp 27-38. (Castellano et al., 2012) G. Castellano, M. Mancini, C. Peters, P. W. McOwan. Expressive copying behavior for social agents: a perceptual analysis. IEEE Trans Syst, Man Cybern, Part A: Syst Hum 42(3), 2012. (Clavel et al., 2016) Chloé Clavel, Angelo Cafaro, Sabrina Campano, and Catherine Pelachaud, Fostering user engagement in face-to-face human-agent interactions, in A. Esposito and L.
Jain (Eds), Toward Robotic Socially Believable Behaving Systems - Volume I: Modeling Social Signals, Springer Series on Intelligent Systems Reference Library (ISRL), 2016 (Glas and Pelachaud, 2015) N. Glas, C. Pelachaud, Definitions of Engagement in Human-Agent Interaction, workshop ENHANCE, in International Conference on Affective Computing and Intelligent Interaction (ACII), 2015. (Hall et al., 2005) L. Hall, S. Woods, R. Aylett, L. Newall, A. Paiva. Achieving empathic engagement through affective interaction with synthetic characters. Affective computing and intelligent interaction, 2005. (He, 2018) Xiaodong He, Deep Attention Mechanism for Multimodal Intelligence: Perception, Reasoning, & Expression across Language & Vision, Microsoft Research, AI NEXTCon, 2018. (Le Moine & Obin, 2019) Clément Lemoine, Modélisation neuronale de l’expressivité pour la transformation de la voix, stage de Master, 2019. (Léon, 1993) P. Léon. Précis de phonostylistique : Parole et expressivité. Paris:Nathan, 1993. (Obin, 2011) N. Obin. MeLos: Analysis and Modelling of Speech Prosody and Speaking Style, PhD. Thesis, Ircam-Upmc, 2011. (Obin, 2015) N. Obin, C. Veaux, P. Lanchantin. Exploiting Alternatives for Text-To-Speech Synthesis: From Machine to Human, in Speech Prosody in Speech Synthesis: Modeling and generation of prosody for high quality and flexible speech synthesis. Chapter 3: Control of Prosody in Speech Synthesis, p.189-202, Springer Verlag, February, 2015. (Ochs et al., 2008) M. Ochs, C. Pelachaud, D. Sadek, An Empathic Virtual Dialog Agent to Improve Human-Machine Interaction, Seventh International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Estoril Portugal, May 2008. (Paiva et al., 2017) A. Paiva, I. Leite, H. Boukricha, Hana I. Wachsmuth 'Empathy in Virtual Agents and Robots: A Survey.', ACM Trans. Interact. Intell. Syst. (2017), 7 (3):11:1-11:40. (Pelachaud, 2009) C. Pelachaud, Studies on Gesture Expressivity for a Virtual Agent, Speech Communication, special issue in honor of Björn Granstrom and Rolf Carlson, 51 (2009) 630-639. (Poggi, 2007) I. Poggi. Mind, hands, face and body: a goal and belief view of multimodal communication. Weidler, Berlin, 2007. (Robinson & Obin, 2019) C. Robinson, N. Obin, A. Roebel. Sequence-to-sequence modelling of F0 for speech emotion conversion, in IEEE International Conference on Audio, Signal, and Speech Processing (ICASSP), 2019. (Sadoughi et al., 2017) Najmeh Sadoughi, Yang Liu, and Carlos Busso, 'Meaningful head movements driven by emotional synthetic speech,' Speech Communication, vol. 95, pp. 87-99, December 2017. (Sidner and Dzikovska, 2002) C. L. Sidner, M. Dzikovska. Human-robot interaction: engagement between humans and robots for hosting activities. In: IEEE int conf on multimodal interfaces, 2002. (Wang, 2017) Xin Wang, Shinji Takaki, Junichi Yamagishi. An RNN-Based Quantized F0 Model with Multi-Tier Feedback Links for Text-to-Speech Synthesis, Interspeech, 2017 (Wang, 2018) Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Re, Ye Jia, Rif A. Saurous. « Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis », 2018. (Weber, 2018) K. Weber, H. Ritschel, I. Aslan, F. Lingenfelser, E. André, How to Shape the Humor of a Robot - Social Behavior Adaptation Based on Reinforcement Learning, ACM International Conference on Multimodal Interaction, 2018. (Mancini, 2019) M. Mancini, B. Biancardi, S. Dermouche, P. Lerner, C. 
Pelachaud, Managing Agent’s Impression Based on User’s Engagement Detection, Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, 2019.
6-10 | (2019-06-14) PhD position: Privacy preserving and personalized transformations for speech recognition, INRIA Nancy and Univ. Le Mans, France
Thesis title: Privacy preserving and personalized transformations for speech recognition

This PhD thesis fits within the scope of a collaborative project (funded by the French National Research Agency) involving several French teams, among which the MULTISPEECH team of Inria Nancy - Grand-Est and the LIUM (Laboratoire d'Informatique de l'Université du Mans). This PhD position is a collaboration between the Multispeech team of the LORIA laboratory (Nancy) and Le Mans University. The thesis will be co-supervised by Denis Jouvet (https://members.loria.fr/DJouvet/) and Anthony Larcher (https://lium.univ-lemans.fr/team/anthony-larcher/). The selected candidate is expected to spend time in both teams over the course of the PhD.

Scientific Context
Over the last decade, great progress has been made in automatic speech recognition [Saon et al., 2017; Xiong et al., 2017]. This is due to the maturity of machine learning techniques (e.g., advanced forms of deep learning), to the availability of very large datasets, and to the increase in computational power. Consequently, the use of speech recognition is now spreading in many applications, such as virtual assistants (for instance Apple's Siri, Google Now, Microsoft's Cortana, or Amazon's Alexa), which collect, process and store personal speech data in centralized servers, raising serious concerns regarding the privacy of the data of their users. Embedded speech recognition frameworks have recently been introduced to address privacy issues during the recognition phase: in this case, a (pretrained) speech recognition model is shipped to the user's device so that the processing can be done locally without the user sharing its data. However, speech recognition technology still has limited performance in adverse conditions (e.g., noisy environments, reverberated speech, strong accents, etc.), and thus there is a need for performance improvement. This can only be achieved by using large speech corpora that are representative of the actual users and of the various usage conditions. There is therefore a strong need to share speech data for improved training that is beneficial to all users, while preserving the privacy of the users, which means at least keeping the speaker identity and voice characteristics private. (Note that, when sharing data, users may also want not to share data conveying private information at the linguistic level, e.g., phone numbers or person names. Such privacy aspects also need to be taken into account, but they are outside the scope of this thesis.)

Missions (objectives, approach, etc.)
Within this context, the objective of the proposed thesis is twofold. First, it aims at finding a privacy preserving transform of the speech data; second, it will investigate the use of additional personalized transforms, applied on the user's terminal, to increase speech recognition performance. In the proposed approach, the device of each user will not share its raw speech data, but a privacy preserving transformation of the user's speech data. In such an approach, some private computations will be handled locally, while some cross-user computations may be carried out on a server using the transformed speech data, which protects the speaker identity and some of his/her features (gender, sentiment, emotions...).
More specifically, this will rely on representation learning to separate the features of the user data that can expose private information from generic ones useful for the task of interest, i.e., here, the recognition of the linguistic content. We will build upon ideas of Generative Adversarial Networks (GANs) for proposing such a privacy preserving transform. For a few years now, GANs have become more and more used in deep learning. They typically rely on both a generative network and a discriminative network, where the generator aims to output samples that the discriminator cannot distinguish from the true samples [Goodfellow et al., 2014; Creswell et al., 2018]. They have also been used as autoencoders [Makhzani et al., 2015], which are made of three main blocks: encoder, generator and discriminator. In our case, the discriminators shall focus on discriminating between speakers and/or between voice-related classes (defined according to gender, emotions, etc.). The training objective will be to maximize the speech recognition performance (using the privacy preserving transformed signal) while minimizing the available speaker or voice-related information measured by the discriminator (see the sketch at the end of this announcement).

As devices are getting more and more personal, opportunities arise to make speech recognition more personalized. This includes two aspects: adapting the model parameters to the speaker (and to the device), and introducing personalized transforms to help hide the speaker's voice identity. Both aspects will be investigated. Voice conversion approaches provide examples of transforms aiming at modifying the voice of a speaker so that it sounds like the voice of another target speaker [e.g., Chen et al., 2014; Mohammadi & Kain, 2014]. Similar approaches can thus be applied to map speaker-specific features to those of a standard (or average) speaker, which would help conceal the speaker's identity. To take benefit of the increased personal usage of terminals, speaker- and environment-specific adaptation will be investigated to improve speech recognition performance. Collaborative learning mixing speech and speaker recognition has been shown to benefit both tasks [Liu et al. 2018; Garimella et al. 2015] and provides a way to combine both kinds of information in a single framework. This approach will be compared to the adaptation of deep neural network based models [e.g., Abdel-Hamid & Jiang, 2013], to best handle different amounts of adaptation data.

Skills and profile:
- Master in machine learning or in computer science
- Background in statistics and in deep learning
- Experience with deep learning tools is a plus
- Good computer skills (preferably in Python)
- Experience in speech and/or speaker recognition is a plus

Bibliography:
[Abdel-Hamid & Jiang, 2013] Abdel-Hamid, O., & Jiang, H. Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code. In ICASSP-2013, IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 7942-7946, 2013.
[Chen et al., 2014] Chen, L. H., Ling, Z. H., Liu, L. J., & Dai, L. R. Voice conversion using deep neural networks with layer-wise generative training. TASLP-2014, IEEE/ACM Transactions on Audio, Speech and Language Processing, 22(12), pp. 1859-1872, 2014.
[Creswell et al., 2018] Creswell, A., White, T., Dumoulin, V., Arulkumaran, K., Sengupta, B., and Bharath, A. A. Generative adversarial networks: An overview. IEEE Signal Processing Magazine 35, 1, 53-65, 2018.
[Garimella et al. 2015] Garimella, S., Mandal, A., Strom, N., Hoffmeister, B., Matsoukas, S., & Parthasarathi, S. H. K. Robust i-vector based adaptation of DNN acoustic model for speech recognition. In INTERSPEECH, 2015.
[Goodfellow et al., 2014] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 2672-2680, 2014.
[Liu et al. 2018] Y. Liu, L. He, J. Liu, and M. Johnson. Speaker embedding extraction with phonetic information. In INTERSPEECH, pp. 2247-2251, 2018.
[Makhzani et al., 2015] Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., and Frey, B. Adversarial autoencoders. arXiv preprint arXiv:1511.05644, 2015.
[Mohammadi & Kain, 2014] Mohammadi, S. H., & Kain, A. Voice conversion using deep neural networks with speaker-independent pre-training. In SLT-2014, Spoken Language Technology Workshop, pp. 19-23, 2014.
[Saon et al., 2017] G. Saon, G. Kurata, T. Sercu, K. Audhkhasi, S. Thomas, D. Dimitriadis, X. Cui, B. Ramabhadran, M. Picheny, L.-L. Lim, B. Roomi, and P. Hall. English conversational telephone speech recognition by humans and machines. Technical report, arXiv:1703.02136, 2017.
[Xiong et al., 2017] W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, and G. Zweig. Achieving human parity in conversational speech recognition. Technical report, arXiv:1610.05256, 2017.

Additional information:
Supervision and contact: Denis Jouvet (denis.jouvet@loria.fr; https://members.loria.fr/DJouvet/); Anthony Larcher (anthony.larcher@univ-lemans.fr; https://lium.univ-lemans.fr/team/anthony-larcher/)
Additional link: Ecole Doctorale IAEM Lorraine (http://iaem.univ-lorraine.fr/)
Duration: 3 years
Starting date: autumn 2019

The candidates are required to provide the following documents in a single PDF or ZIP file:
- CV
- A cover/motivation letter describing their interest in the topic
- Degree certificates and transcripts for Bachelor and Master (or the last 5 years)
- Master thesis (or equivalent) if it is already completed, or a description of the work in progress, otherwise
- The publications (or web links) of the candidate, if any (it is not expected that they have any)
In addition, one recommendation letter from the person who supervises(d) the Master thesis (or research project or internship) should be sent directly by its author to the prospective PhD advisor.
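As announced above, here is a minimal sketch of the adversarial training objective described in the Missions paragraph (assuming PyTorch; encoder, asr_head and spk_head are hypothetical placeholder modules, not the models of the project):

import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; flips the gradient sign in the backward
    # pass, so the encoder learns to hide what the speaker head needs.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def privacy_loss(encoder, asr_head, spk_head, speech, phones, speaker, lam=1.0):
    z = encoder(speech)                              # privacy preserving representation
    asr_loss = F.cross_entropy(asr_head(z), phones)  # keep the linguistic content
    spk_logits = spk_head(GradReverse.apply(z, lam))
    spk_loss = F.cross_entropy(spk_logits, speaker)  # penalize speaker leakage
    # Minimizing the sum trains recognition while maximizing speaker confusion.
    return asr_loss + spk_loss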
6-11 | (2019-06-16) PhD position: Hybrid Bayesian and deep neural modeling for weakly supervised learning of sensory-motor speech representations, University of Grenoble-Alpes, France
Open fully-funded PhD position: "Hybrid Bayesian and deep neural modeling for weakly supervised learning of sensory-motor speech representations"
The Deep-COSMO project, part of the new AI institute in Grenoble, is welcoming applications for a 3-year, fully funded PhD scholarship starting October 1st, 2019 at GIPSA-lab (Grenoble, France).
TOPIC: Representation learning, speech production and perception, Bayesian cognitive models, generative neural networks
RESEARCH FIELD: Computer Science, Cognitive Science, Machine Learning, Artificial Intelligence, Speech Processing
SUPERVISION: J. Diard (LPNC); T. Hueber, L. Girin, J.-L. Schwartz (GIPSA-Lab)
IDEX PROJECT TITLE: Multidisciplinary Institute for Artificial Intelligence - Speech chair (P. Perrier)
SCIENTIFIC DEPARTMENT (LABORATORY'S NAME): GIPSA-lab
DOCTORAL SCHOOL: MSTII (maths and computer science) or EEATS (signal processing) or EDISCE (cognitive science), depending on the candidate's profile and career plan
TYPE OF CONTRACT: 3-year doctoral contract
JOB STATUS: Full time
HOURS PER WEEK: 35
SALARY: between 1770 € and 2100 € gross per month (depending on complementary activity or not)
OFFER STARTING DATE: October 1st, 2019

SUBJECT DESCRIPTION
General objective
How can a child learn to speak from hearing sounds, without any motor instruction provided by his/her environment? The general objective of this PhD project is to develop a computational agent able to learn speech representations from raw speech data in a weakly supervised configuration. This agent will involve an articulatory model of the human vocal tract, an articulatory-to-acoustic synthesis system, and a learning architecture combining deep learning algorithms and developmental principles inspired from cognitive sciences. This PhD will be part of the "Speech communication" chair in the Multidisciplinary Institute for Artificial Intelligence in Grenoble (MIAI).

Method
This work will capitalize on two bricks of research recently developed in Grenoble. First, a Bayesian computational model of speech communication called COSMO (Communicating about Objects using SensoriMotor Operations) (Moulin-Frier et al., 2012, 2015; Laurent et al., 2017; Barnaud et al., 2019) was jointly developed by GIPSA and LPNC. This model associates speech production and speech perception models in a single architecture. The random variables in COSMO represent the signals and the sensori-motor processes involved in the speech production/perception loop. COSMO learns their probability distributions from speech examples provided by the environment, and is then able to perceive and produce speech sounds associated with speech categories. So far, COSMO has mostly been tested on synthetic data; one of the main challenges is now to confront COSMO with real-world data.
Second, we will also capitalize on a set of computational models for automatic processing and learning of sensory-motor distributions in speech developed at GIPSA. This comprises a set of transfer-learning algorithms (Hueber et al., 2015; Girin et al., 2017) aiming at adapting acoustic-articulatory knowledge from one speaker to another, using a limited amount of data, possibly incomplete and noisy, together with a set of deep neural networks able to process raw articulatory data (Hueber et al., 2016; Tatulli & Hueber, 2017).
The first step will consist in designing, implementing and testing a "deep" version of COSMO, in which some of the probability distributions are implemented by generative neural models (e.g., VAE, GAN); see the sketch at the end of this announcement. This choice is motivated by the ability of such techniques to deal with raw, noisy and complex data, as well as their flexibility in terms of transfer learning. The second stage will consist in reformulating the speech communication agent entirely in an end-to-end neural architecture.

Outputs
The system will be tested in terms of both the efficiency of the learning process (hence the ability to generate realistic speech sequences after convergence) and the coherence of the motor strategies discovered by the computational agent, despite the fact that no motor data will be provided for learning. The outputs are (1) theoretical: better understanding the cognitive processes at hand in speech development and speech communication; (2) technical: integrating knowledge about speech production and cognitive processes in a machine learning architecture; and (3) technological: proposing a new generation of autonomous speech technologies.

ELIGIBILITY CRITERIA
Applicants must have:
- A Master's degree (or be about to earn one) or a university degree equivalent to a European Master's (5-year duration) in Computer Science, Cognitive Science, Signal Processing or Applied Mathematics.
- Solid skills in machine learning or probabilistic modeling, plus general knowledge in natural language processing and/or speech processing (an affinity for cognitive sciences and speech sciences is welcome).
- Very good programming skills (mostly in Python).
- Good oral and written communication in English.
- Ability to work autonomously and in collaboration with supervisors and other team members.

SELECTION PROCESS
Applicants should send their CV, an application letter in English, and a copy of their last diploma to: Jean-Luc.Schwartz@gipsa-lab.fr, Thomas.Hueber@gipsa-lab.fr. Letters of recommendation are welcome. Contacts before preparing a complete application are welcome too. Applications will be evaluated as they are received: the position is open until it is filled, with a deadline on July 10th, 2019.
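As a toy illustration of the first step above, replacing one probability distribution of COSMO by a generative neural model (a sketch under assumptions, in PyTorch; dimensions are illustrative and this is unrelated to any existing Deep-COSMO code):

import torch
import torch.nn as nn

class SpeechVAE(nn.Module):
    # Toy VAE over acoustic feature frames: the decoder plays the role of a
    # learned generative distribution p(acoustics | latent), one candidate
    # replacement for a hand-crafted distribution in a COSMO-like model.
    def __init__(self, n_features=40, n_latent=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.to_mu = nn.Linear(64, n_latent)
        self.to_logvar = nn.Linear(64, n_latent)
        self.decoder = nn.Sequential(nn.Linear(n_latent, 64), nn.ReLU(),
                                     nn.Linear(64, n_features))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        recon = self.decoder(z)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # prior term
        return recon, kl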
6-12 | (2019-06-16) PhD thesis proposal: Incremental sequence-to-sequence mapping for speech generation using deep neural networks, GIPSA-Lab, Grenoble, France
PhD thesis proposal: Incremental sequence-to-sequence mapping for speech generation using deep neural networks
June 17, 2019

1 Context and objectives
In recent years, deep neural networks have been widely used to address sequence-to-sequence (S2S) learning. S2S models can solve many tasks where source and target sequences have different lengths (e.g., speech recognition, machine translation, speech translation, text-to-speech synthesis, etc.). Recurrent, convolutional and transformer architectures, coupled with attention models, have shown their ability to capture and model complex temporal dependencies between a source and a target sequence of multidimensional discrete and/or continuous data. Importantly, end-to-end training alleviates the need to previously extract handcrafted features from the data, by learning hierarchical representations directly from raw data (e.g., character string, video, speech waveform, etc.).

The most common models are composed of an encoder that reads the full input sequence (i.e., from its beginning to its end) before the decoder produces the corresponding output sequence. This implies a latency equal to the length of the input sequence. In particular, for a text-to-speech (TTS) system, the speech waveform is usually synthesized from a complete text utterance (e.g., a sequence of words with explicit begin/end-of-utterance markers). Such an approach cannot be used in a truly interactive scenario, in particular by a speech-handicapped person to communicate 'orally'. Indeed, the interlocutor has to wait for the complete utterance to be typed before being able to listen to the synthetic voice, hence limiting the dynamics and naturalness of the interaction.

The goal of this project is to develop a general methodology for incremental sequence-to-sequence mapping, with application to interactive speech technologies. It will require the development of end-to-end classification and regression neural models able to deliver chunks of output data on-the-fly, from only a partial observation of the input data. The goal is to learn an efficient policy that leads to an optimal trade-off in the decoding process. Possible strategies to decode the output data as soon as possible include: (i) predicting online 'the future' of the output sequence from 'the past and present' of the input sequence, with an acceptable tolerance to possible errors, or (ii) learning automatically from the data an optimal 'waiting policy' that prevents the model from outputting data when the uncertainty is too high (a toy example of such a policy is sketched at the end of this announcement). The developed methodology will be applied to address two speech processing problems: (i) incremental text-to-speech synthesis, in which speech is synthesized while the user is typing the text (possibly with a variable latency), and (ii) incremental speech enhancement/inpainting, in which portions of the speech signal are unintelligible because of sudden noise or speech production disorders, and must be replaced on-the-fly with reconstructed portions.

2 Work plan
The proposed work plan is the following:
- Bibliographic work on S2S neural models in the context of speech recognition, speech synthesis and machine translation, as well as their incremental (low-latency) variants.
- Investigating new architectures, losses, and training strategies toward incremental S2S models.
implementing and evaluating the proposed techniques in the context of end-to-end neural TTS systems (the baseline system may be a neural TTS trained with past information/left-context only); and implementing and evaluating the proposed techniques in the context of speech enhancement/inpainting, first on simulated noisy speech and then on pathological speech. 3 Requirements We are looking for an outstanding and highly motivated PhD candidate to work on this subject. The following requirements are mandatory: Engineering degree and/or a Master's degree in Computer Science, Signal Processing or Applied Mathematics. Solid skills in Machine Learning. General knowledge in natural language processing and/or speech processing. Excellent programming skills (mostly in Python and deep learning frameworks). Good oral and written communication in English. Ability to work autonomously and in collaboration with supervisors and other team members. 4 Work context Grenoble Alpes Univ. offers computing facilities, as well as remarkable surroundings to explore over the weekends. The PhD project will be funded by the Grenoble Artificial Intelligence Institute (MIAI). The PhD candidate will work both at GIPSA-lab (CRISSP team) and LIG-lab (GETALP team). The duration of the PhD is 3 years. The salary is between 1770 and 2100 euros gross per month (depending on complementary activity or not). 5 How to apply? Applications should include a detailed CV; a copy of the last diploma; at least two references (people likely to be contacted); a cover letter of one page; a one-page summary of the Master thesis; the two last transcripts of notes (Master or engineering school). Applications should be sent to thomas.hueber@gipsa-lab.fr, laurent.girin@gipsa-lab.fr and laurent.besacier@imag.fr. Applications will be evaluated as they are received: the position is open until it is filled, with a deadline of July 10th, 2019.
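Purely as an illustration, here is a toy sketch of the 'waiting policy' strategy referenced above: the decoder emits an output chunk only when its confidence exceeds a threshold, otherwise it waits for more input. The predict() stub stands in for a real incremental S2S model; all names here are illustrative assumptions.

import random

def predict(prefix):
    # Stub: a real model would return an output chunk and a confidence
    # estimate (e.g. a posterior probability) given the input prefix.
    return 'out_%d' % len(prefix), random.random()

def incremental_decode(input_stream, threshold=0.7):
    prefix, outputs = [], []
    for symbol in input_stream:       # symbols arrive one by one (e.g. typed characters)
        prefix.append(symbol)
        out, conf = predict(prefix)
        if conf >= threshold:         # confident enough: emit now, at low latency
            outputs.append(out)
        # otherwise: wait for more input, trading latency for accuracy
    return outputs

print(incremental_decode(list('hello world')))

Learning the threshold (or a richer emit/wait policy) from data, rather than fixing it by hand, is precisely the kind of question the thesis would address.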
| ||||||||||
6-13 | (2019-06-20) Post-doc position, CNRS and Univ. Aix-Marseille, Aix-en-Provence, France
| ||||||||||
6-14 | (2019-06-21) Ingénieur d'études, LIG, Univ.de Grenoble-Alpes, France RECRUTEMENT D?UN INGÉNIEUR D?ÉTUDES EN TRAITEMENT AUTOMATIQUE DES LANGUES NATURELLES ET EN DÉVELOPPEMENT D?UNE INTERFACE IHM-WEB - Des compétences en C/C++ seraient un plus - Une expérience en traitement automatique de la parole est requise ainsi qu?un bon niveau de français. - Une experience en METEOR, firepad, Node.JS, mongoDB, firebase serait un plus. Ce poste nécessite des capacités de travail en équipe et en autonomie.
| ||||||||||
6-15 | (2019-06-21) Post doc at LIUM, Univ. du Mans, Le Mans, France Post-doc position open
| ||||||||||
6-16 | (2019-06-22) Head of AI (M/F), R&D team manager, Zaion, Paris, France
ZAION is a fast-growing, innovative company specialised in conversational bot technology: callbots and chatbots integrating Artificial Intelligence. ZAION has developed a solution built on more than 20 years of experience in Customer Relations. This technologically disruptive solution has been very well received internationally, and we already have 18 active clients (GENERALI, MNH, APRIL, CROUS, EUROP ASSISTANCE, PRO BTP ...). We are currently among the only companies in the world to offer a solution of this type entirely focused on performance. Joining us means taking part in an exciting adventure within an ambitious team that intends to become the reference on the conversational bot market. As part of its development, ZAION is looking for its Head of AI (M/F). As manager of the R&D team, your role is strategic for the development and expansion of the company. You will develop a solution for detecting emotions in conversations. We want to extend the cognitive capabilities of our callbots so that they can detect the emotions of their interlocutors (joy, stress, anger, sadness ...) and adapt their responses accordingly. Your main tasks: - Take part in creating ZAION's R&D unit and, upon arrival, lead your first project on emotion recognition in the voice. - Build, adapt and evolve our voice emotion detection services. - Analyse large databases of conversations to extract the emotionally relevant ones. - Build a database of conversations labelled with emotional tags. - Train and evaluate machine learning models for emotion classification. - Deploy your models in production. - Continuously improve the voice emotion detection system. Required qualifications and prior experience: - You have at least 5 years of experience as a Data Scientist / Machine Learning engineer applied to audio, and a taste for team management. - You hold an engineering or Master's degree in computer science, or a PhD in computer science or mathematics, with solid skills in signal processing (preferably audio). - Solid theoretical background in machine learning and the relevant mathematical fields (clustering, classification, matrix factorisation, Bayesian inference, deep learning ...). - Experience deploying machine learning models in a production environment would be a plus. - You master one or more of the following: Python, Machine Learning / Deep Learning frameworks (PyTorch, TensorFlow, Scikit-learn, Keras) and Javascript. - You master audio signal processing techniques. - Proven experience in labelling large databases (preferably audio) is essential. - Your personality: a leader, autonomous and passionate about your work, you know how to lead a team in project mode. - You speak English fluently. Please send your application to: alegentil@zaion.ai
| ||||||||||
6-17 | (2019-06-23) 3 open roles at Speechmatics, Cambridge, UK 1. SPEECH RECOGNITION INTERN Location: Cambridge, UK Contact: careers@speechmatics.com
“As an intern at Speechmatics I have worked on projects that use real machine learning to deliver real value to people across the world. There are few places where the machine learning being used is at the bleeding edge of the field, but Speechmatics is one of them. The company has an amazing culture that allows you to grow as a programmer and as a person. If you want to be a part of a fast-growing machine learning company where you, personally, will make a difference then Speechmatics could well be the place for you!”
Background Speech technology is one of the most popular discussion items at the moment, yet speech interaction is limited to “Alexa, turn on the light”, or “Siri, where is the nearest coffee shop?” We are taking speech technology to the next level, using our expertise in machine learning and speech-to-text technology to enable our customers to use conversational speech recognition. Our solutions power subtitling on TV, content discovery for videos and compliance solutions in banks, improve the efficiency of meetings, and serve many other use-cases. Our mission is to improve human communication with a global speech engine that works, and to put speech back at the heart of communication. At Speechmatics you’ll be working with some of the smartest minds in the industry, working on cutting-edge projects and deploying the latest machine learning techniques to disrupt the market, providing customers with the best speech technology available, all whilst immersed in a progressive and great company culture. You can enjoy benefits including share options, healthcare, life assurance, Bike Doctor, massages, regular BBQs, Brew Dogs in the fridge, no red tape, a top-end laptop and much more. We’re building a company that truly strives to be world-leading and we’re looking for people who wholeheartedly believe they can be additive to our culture, bring new ideas to the table and get stuff done. If that’s you, carry on reading. The Opportunity The Speechmatics Engineering team develops and maintains speech-oriented products and services that will be used by businesses worldwide and is responsible for the complete product development cycle for these products. In this internship, you’ll help to support fundamental speech and language processing research to improve our performance and language coverage, as well as help to build products and features to delight our users. Because you will be joining a rapidly expanding team, you will need to be a team player who thrives in a fast-paced environment, with a focus on investigating ideas and rapidly moving research developments into products. We strongly encourage versatility and knowledge transfer within and across teams. You will be expected to learn fast and feel emboldened to ask for support as you need it. Prior experience of speech recognition is desirable, although Speechmatics has a team of speech recognition engineers who will collaborate and share any specialised knowledge required. If you are enthusiastic about speech recognition and machine learning in general, with the drive to deliver the best possible technology solutions, then we want to hear from you! Our internships are not time-constrained to specific dates – we can work out mutually agreeable start and end dates as part of the application process. Key Responsibilities
Requirements Essential
Desirable
Salary Competitive salary (dependent on experience), flexible working and some awesome benefits & perks. Interested? Get in touch! Send your CV and covering letter to careers@speechmatics.com.
2. SPEECH RECOGNITION ENGINEER Location: Cambridge, UK Contact: careers@speechmatics.com
'As a Speech Recognition Engineer at Speechmatics, I work on solving a multitude of problems related to improving the accuracy and delivering new features for a global automatic speech recognition engine. As a member of the speech team, I work across every aspect of speech and implement the latest research in acoustic and language modelling. The team is supportive and also rich in terms of skills and backgrounds. Speechmatics offer progressive and rewarding opportunities in one of the best speech technology companies in the world.'
Background Speech technology is one of the most popular discussion items at the moment, yet speech interaction is limited to “Alexa, turn on the light”, or “Siri, where is the nearest coffee shop?” We are taking speech technology to the next level, using our expertise in machine learning and speech-to-text technology to enable our customers to use conversational speech recognition. Our solutions power subtitling on TV, content discovery for videos and compliance solutions in banks, improve the efficiency of meetings, and serve many other use-cases. Our mission is to improve human communication with a global speech engine that works, and to put speech back at the heart of communication. At Speechmatics you’ll be working with some of the smartest minds in the industry, working on cutting-edge projects and deploying the latest machine learning techniques to disrupt the market, providing customers with the best speech technology available, all whilst immersed in a progressive and great company culture. You can enjoy benefits including share options, healthcare, life assurance, Bike Doctor, massages, regular BBQs, Brew Dogs in the fridge, no red tape, a top-end laptop and much more. We’re building a company that truly strives to be world-leading and we’re looking for people who wholeheartedly believe they can be additive to our culture, bring new ideas to the table and get stuff done. If that’s you, carry on reading. The Opportunity We are looking for a talented speech engineer to help us build the best speech technology for anybody, anywhere, in any language. You will be part of a team that is working on our core ASR capabilities to improve our speed and accuracy and develop novel features that we can support in all languages. Your work will feed into our ground-breaking framework to support the building of ASR models in every language pack published by the company. You will be responsible for keeping our system the most accurate and useful commercial speech recognition system available. As you will be joining a small team, you will need to be a team player who thrives in a fast-paced environment, with a focus on rapidly moving research developments into products. Bringing skills into the team is as important as a can-do attitude. We strongly encourage versatility and knowledge transfer within the team, so that we can share efficiently what needs to be done to meet our commitments to the rest of the company. Key Responsibilities
Requirements Essential
Desirable
Salary Competitive salary (dependent on experience), flexible working and some awesome benefits & perks. Interested? Get in touch! Send your CV and covering letter to careers@speechmatics.com.
3. SENIOR SPEECH RECOGNITION ENGINEER Location: Cambridge, UK Contact: careers@speechmatics.com
'As a Speech Recognition Engineer at Speechmatics, I work on solving a multitude of problems related to improving the accuracy and delivering new features for a global Automatic Speech Recognition engine. As a member of the speech team, I work across every aspect of speech and implement the latest research in acoustic and language modelling. The team is supportive and also rich in terms of skills and backgrounds. Speechmatics offer progressive and rewarding opportunities in one of the best speech technology companies in the world.'
Background Speech technology is one of the most popular discussion items at the moment, yet speech interaction is limited to “Alexa, turn on the light”, or “Siri, where is the nearest coffee shop?” We are taking speech technology to the next level, using our expertise in machine learning and speech-to-text technology to enable our customers to use conversational speech recognition. Our solutions power subtitling on TV, content discovery for videos and compliance solutions in banks, improve the efficiency of meetings, and serve many other use-cases. Our mission is to improve human communication with a global speech engine that works, and to put speech back at the heart of communication. At Speechmatics you’ll be working with some of the smartest minds in the industry, working on cutting-edge projects and deploying the latest machine learning techniques to disrupt the market, providing customers with the best speech technology available, all whilst immersed in a progressive and great company culture. You can enjoy benefits including share options, healthcare, life assurance, Bike Doctor, massages, regular BBQs, Brew Dogs in the fridge, no red tape, a top-end laptop and much more. We’re building a company that truly strives to be world-leading and we’re looking for people who wholeheartedly believe they can be additive to our culture, bring new ideas to the table and get stuff done. If that’s you, carry on reading. The Opportunity We are looking for a talented speech engineer to help us build the best speech technology for anybody, anywhere, in any language. You will be part of a team that is working on our core ASR capabilities to improve our speed and accuracy and develop novel features that we can support in all languages. Your work will feed into our ground-breaking framework to support the building of ASR models in every language pack published by the company. You will be responsible for keeping our system the most accurate and useful commercial speech recognition system available. As you will be joining a small team, you will need to be a team player who thrives in a fast-paced environment, with a focus on rapidly moving research developments into products. Bringing skills into the team is as important as a can-do attitude. We strongly encourage versatility and knowledge transfer within the team, so that we can share efficiently what needs to be done to meet our commitments to the rest of the company.
Key Responsibilities
Requirements Essential
Desirable
Salary Competitive salary (dependent on experience), flexible working and some awesome benefits & perks. Interested? Get in touch! Send your CV and covering letter to careers@speechmatics.com.
More about Speechmatics’ culture Live for the wow | Build authentic relationships | Be the adventure Innovation is what we do. We build, we iterate, we develop the next thing that delivers that wow moment. We see value in building long-term, authentic relationships that last and are based on trust and honesty. With our customers, our colleagues, our leaders, our suppliers or within our local community. Our journey should be fun and exciting. We will celebrate our successes and learn from our mistakes together along the way. We embrace learning and change to grow naturally and organically as a company and individuals. We trust, we’re honest, kind and respectful.
| ||||||||||
6-18 | (2019-07-11) Three-year Early Stage Researcher PhD positions Applications are invited for a three-year Early Stage Researcher PhD position in speech technology for pathological speech. Description The thesis focuses on studying the link between the internal representations of Deep Neural Networks (DNNs) and the subjective representation of speech intelligibility. We propose to explore the saliency detection capabilities of DNNs when used in a regression task for predicting speech intelligibility scores as given by human experts. By saliency, we mean retrieving which frequency bands are important and used by a DNN to make its predictions. The final expectation is to identify regions of interest in the speech signal, both in time and frequency, that characterise the level of speech impairment. The experiments will be based on various samples of speech produced by 150 people (100 patients and 50 healthy controls). This database was recorded within the INCA C2SI project and contains speech from patients treated for cancer of the oral cavity or pharynx. It also contains various metadata such as the location of the tumor, the impairment in terms of severity and intelligibility as assessed by human experts, and self-evaluation questionnaires on the patient's quality of life. Various tasks were recorded, such as a sustained vowel, read speech, nonsense words, prosodic exercises, picture description, etc. There will also be the possibility to extend the work to another corpus composed of voices of patients suffering from Parkinson's disease. At first, the PhD student will build on the various analyses and descriptions produced during the C2SI project, trying to correlate the impact of the tumor with communication ability. Those results will help characterise the human representation of the impact of the disease. Then, a DNN model will be trained to fit the data, taking care of the data sparsity. The last part of the work will be to explore the internal representation of the DNN, investigating which parts of the signal help the model make a decision on the impact of the disease; this is the final goal of the thesis: studying the learned representation embedded in the model the student will propose. This work is funded by the TAPAS project (https://www.tapas-etn-eu.org), which is a Horizon 2020 Marie Skłodowska-Curie Actions Initial Training Network European Training Network (MSCA-ITN-ETN) project that aims to transform the well-being of people across Europe with debilitating speech pathologies (e.g., due to stroke, Parkinson's, etc.). These groups face communication problems that can lead to social exclusion. They are now being further marginalised by a new wave of speech technology that is increasingly woven into everyday life but which is not robust to atypical speech. The supervision of the PhD will take place at the IRIT laboratory within the SAMoVA team in Toulouse. SAMoVA does research in the domain of analysis, modeling and structuring of audiovisual content. The application areas are diverse: speech processing, language identification, speaker verification, and speech and music indexing. The researchers' expertise covers novel machine learning and audio processing technologies and is now focused on deep learning methods, leading to several publications in international conferences.
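As a purely illustrative aid for applicants, the sketch below shows one standard way to probe the saliency of a regression DNN: the gradient of the predicted intelligibility score with respect to the input time-frequency representation, averaged over time to rank frequency bands. The network, shapes and data here are placeholder assumptions, not the thesis' actual model.

import torch
import torch.nn as nn

# Toy score regressor over a (mel bands x frames) input; a real model
# would be trained on the C2SI recordings and expert scores.
model = nn.Sequential(nn.Flatten(), nn.Linear(40 * 100, 64),
                      nn.ReLU(), nn.Linear(64, 1))

spec = torch.randn(1, 40, 100, requires_grad=True)  # (batch, mel bands, frames)
score = model(spec).sum()                           # scalar predicted score
score.backward()                                    # d score / d input

saliency = spec.grad.abs().squeeze(0)   # (40, 100) importance map in time-frequency
band_importance = saliency.mean(dim=1)  # average over time: one value per band
print(band_importance.argsort(descending=True)[:5])  # most influential bands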
Eligibility Criteria Early Stage Researchers (ESRs) shall, at the time of recruitment by the host organization, be in the first four years (full-time equivalent research experience) of their research careers. - The ESR may be a national of a Member State, of an Associated Country or of any Third Country. - The ESR must not have resided or carried out her/his main activity (work, studies, etc.) in the country of her/his host organization for more than 12 months in the 3 years immediately prior to her/his recruitment. - Holds a Master's degree or equivalent, which formally entitles the holder to embark on a doctorate. - Does not hold a PhD degree. Duration of recruitment: 36 months Contact: Julie Mauclair (mauclair@irit.fr)
| ||||||||||
6-19 | (2019-07-17) Chief Technical Officer (CTO) at ELDA Chief Technical Officer (CTO) Under the supervision of the CEO, the Chief Technical Officer (CTO) is responsible for planning and supervising the technical development of tools, software components and applications for language resource production and management. ELDA is acting as the distribution agency of the European Language Resources Association (ELRA). ELRA was established in February 1995, with the support of the European Commission, to promote the development and exploitation of Language Resources (LRs). Language Resources include all data necessary for language engineering, such as monolingual and multilingual lexica, text corpora, speech databases and terminology. The role of this non-profit membership Association is to promote the production of LRs, to collect and to validate them and, foremost, to make them available to users. The association also gathers information on market needs and trends.
| ||||||||||
6-20 | (2019-07-19) Two Post-doctoral positions at Le Mans University, France 2 Post-doctoral positions at Le Mans University on deep learning approaches for speech processing **************************************** * Context * ****************************************
-- Anthony Larcher Maître de Conférences, HDR / Associate Professor
| ||||||||||
6-21 | (2019-07-20) Three-year Early Stage Researcher PhD positions, IRIT, Toulouse, France Applications are invited for a three-year Early Stage Researcher PhD position in speech technology for pathological speech. Description The thesis focuses on studying the link between the internal representations of Deep Neural Networks (DNNs) and the subjective representation of speech intelligibility. We propose to explore the saliency detection capabilities of DNNs when used in a regression task for predicting speech intelligibility scores as given by human experts. By saliency, we mean retrieving which frequency bands are important and used by a DNN to make its predictions. The final expectation is to identify regions of interest in the speech signal, both in time and frequency, that characterise the level of speech impairment. The experiments will be based on various samples of speech produced by 150 people (100 patients and 50 healthy controls). This database was recorded within the INCA C2SI project and contains speech from patients treated for cancer of the oral cavity or pharynx. It also contains various metadata such as the location of the tumor, the impairment in terms of severity and intelligibility as assessed by human experts, and self-evaluation questionnaires on the patient's quality of life. Various tasks were recorded, such as a sustained vowel, read speech, nonsense words, prosodic exercises, picture description, etc. There will also be the possibility to extend the work to another corpus composed of voices of patients suffering from Parkinson's disease. At first, the PhD student will build on the various analyses and descriptions produced during the C2SI project, trying to correlate the impact of the tumor with communication ability. Those results will help characterise the human representation of the impact of the disease. Then, a DNN model will be trained to fit the data, taking care of the data sparsity. The last part of the work will be to explore the internal representation of the DNN, investigating which parts of the signal help the model make a decision on the impact of the disease; this is the final goal of the thesis: studying the learned representation embedded in the model the student will propose. This work is funded by the TAPAS project (https://www.tapas-etn-eu.org), which is a Horizon 2020 Marie Skłodowska-Curie Actions Initial Training Network European Training Network (MSCA-ITN-ETN) project that aims to transform the well-being of people across Europe with debilitating speech pathologies (e.g., due to stroke, Parkinson's, etc.). These groups face communication problems that can lead to social exclusion. They are now being further marginalised by a new wave of speech technology that is increasingly woven into everyday life but which is not robust to atypical speech. The supervision of the PhD will take place at the IRIT laboratory within the SAMoVA team in Toulouse. SAMoVA does research in the domain of analysis, modeling and structuring of audiovisual content. The application areas are diverse: speech processing, language identification, speaker verification, and speech and music indexing. The researchers' expertise covers novel machine learning and audio processing technologies and is now focused on deep learning methods, leading to several publications in international conferences.
Eligibility Criteria Early Stage Researchers (ESRs) shall, at the time of recruitment by the host organization, be in the first four years (full-time equivalent research experience) of their research careers. - The ESR may be a national of a Member State, of an Associated Country or of any Third Country. - The ESR must not have resided or carried out her/his main activity (work, studies, etc.) in the country of her/his host organization for more than 12 months in the 3 years immediately prior to her/his recruitment. - Holds a Master's degree or equivalent, which formally entitles the holder to embark on a doctorate. - Does not hold a PhD degree. Duration of recruitment: 36 months. Contact: Julie Mauclair (mauclair@irit.fr)
| ||||||||||
6-22 | (2019-07-23) PhD position at LORIA-INRIA, Nancy, France Automatic classification using deep learning of hate speech posted on the Internet
Supervisors: Irina Illina, MdC, HDR, Dominique Fohr, CR CNRS
Team: Multispeech, LORIA-INRIA, France
Contact: illina@loria.fr, dominique.fohr@loria.fr
Duration of PhD Thesis : 3 years
Deadline to apply : August 15th, 2019
Required skills: background in statistics and natural language processing, and programming skills (Perl, Python). Candidates should email a detailed CV with a copy of their diploma.
Keywords: hate speech, social media, natural language processing.
The rapid development of the Internet and social networks has brought great benefits to women and men in their daily lives. Unfortunately, the dark side of these benefits has led to an increase in hate speech and terrorism as the most common and powerful threats on a global scale. Hate speech is a type of offensive communication mechanism that expresses an ideology of hatred often using stereotypes. Hate speech can target different societal characteristics such as gender, religion, race, disability, etc. Hate speech is the subject of different national and international legal frameworks. Hate speech is a type of terrorism and often follows a terrorist incident or event.
Social networks are incredibly popular today. Nowadays, Twitter, LinkedIn, Facebook and YouTube are used as standard tools for communicating ideas, beliefs and feelings. Only a small percentage of people use these networks for harmful activities such as hate speech and terrorism, but the impact of this small percentage of users is extremely damaging. For years, social media companies such as Twitter, Facebook and YouTube have invested hundreds of millions of dollars each year in the task of detecting, classifying and moderating hate. But these efforts are mainly based on manually reviewing content to identify and remove offensive material, which is extremely expensive.
This thesis aims at designing automatic and evolving methods for the classification of hate speech in social media. Despite the studies already published on this subject, the results show that the task remains very difficult. We will use semantic content analysis methodologies from natural language processing (NLP) and methodologies based on deep neural networks (DNNs), which are driving the current revolution in the field of artificial intelligence. During this thesis, we will develop a research protocol to classify the text of hate speech as hateful, aggressive, insulting, ironic, neutral, etc. This type of problem falls within the context of multi-label classification (a minimal sketch is given at the end of this description).
In addition, the problem of word obfuscation in hate messages will need to be addressed. People who want to post hate speech on the Internet know that they risk being censored by rudimentary automatic moderation systems, so they try to disguise their words by altering their spelling.
Among the crucial points of this thesis are the choice of the DNN architecture and the relevant representation of the data, i.e. the text of the Internet message. The designed system will be validated on real social network streams.
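Purely as an illustration of the multi-label setting mentioned above, the sketch below uses one sigmoid output per label, trained with binary cross-entropy, so that a single message can be simultaneously, e.g., hateful and ironic. The encoder and all dimensions are placeholder assumptions.

import torch
import torch.nn as nn

LABELS = ['hateful', 'aggressive', 'insulting', 'ironic', 'neutral']

class MultiLabelClassifier(nn.Module):
    def __init__(self, vocab_size=10000, emb=64):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, emb)  # crude bag-of-words text encoder
        self.head = nn.Linear(emb, len(LABELS))      # one logit per label

    def forward(self, token_ids):
        return self.head(self.emb(token_ids))

model = MultiLabelClassifier()
loss_fn = nn.BCEWithLogitsLoss()            # independent binary loss per label
tokens = torch.randint(0, 10000, (8, 20))   # a toy batch of 8 tokenized messages
targets = torch.randint(0, 2, (8, len(LABELS))).float()
loss_fn(model(tokens), targets).backward()
preds = torch.sigmoid(model(tokens)) > 0.5  # several labels may fire at once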
Skills
Strong background in mathematics, machine learning (DNN), statistics
The following profiles are welcome: strong experience with natural language processing.
Excellent English writing and speaking skills are required in any case.
References:
T. Gröndahl, L. Pajola, M. Juuti, M. Conti, N. Asokan (2018). All You Need is 'Love': Evading Hate-speech Detection. arXiv preprint arXiv:1808.09115.
Wiegand, M., Klakow, D. (2008). Optimizing Language Models for Polarity Classification. In Proceedings of ECIR, pp. 612-616.
Wiegand, M., Ruppenhofer, J. (2015). Opinion Holder and Target Extraction based on the Induction of Verbal Categories. In Proceedings of CoNLL, pp. 215-225.
Wiegand, M., Ruppenhofer, J., Schmidt, A., Greenberg, C. (2018). Inducing a Lexicon of Abusive Words: A Feature-Based Approach. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Wiegand, M., Wolf, M., Ruppenhofer, J. (2017) Negation Modeling for German Polarity Classification. In Proceedings of GSCL.
Zhang Z., Luo L. (2018). Hate speech detection: a solved problem? The Challenging Case of Long Tail on Twitter. arxiv.org/pdf/1803.03662
| ||||||||||
6-23 | (2019-07-29) PhD position, Vrije Universiteit Brussel, Belgium PhD position in Agent-Based Modeling of Cognitively Plausible Emergent Behavior
In the context of seed funding for AI research in Flanders, prof. Bart de Boer is looking for a PhD student for the origins of language group of the AI-lab of the Vrije Universiteit Brussel.
PhD position offered We offer a four-year PhD position funded by a scholarship with a yearly bench fee. The PhD work will consist of building an agent-based simulation in which we can investigate the emergence of behavior in a cognitively realistic setting. This means that the agents are not fully rational, that they show behavior similar to that of humans, and that the interests of agents are not necessarily always aligned. The modeling will primarily focus on the emergence of speech, but the simulation should be general enough that it can easily be adapted to other areas, such as traffic or economic interactions.
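For candidates unfamiliar with the genre, here is a toy agent-based sketch in the spirit of the work described above: a minimal naming game in which a shared convention emerges from pairwise interactions. Every detail is an illustrative assumption, not the project's actual model.

import random

class Agent:
    def __init__(self):
        self.words = set()

    def speak(self):
        # Use a known word, or invent one if the vocabulary is empty.
        return random.choice(sorted(self.words)) if self.words else 'w%d' % random.randint(0, 9999)

    def listen(self, word):
        if word in self.words:
            self.words = {word}   # success: collapse onto the shared word
            return True
        self.words.add(word)      # failure: remember the new word
        return False

agents = [Agent() for _ in range(20)]
for _ in range(5000):
    speaker, hearer = random.sample(agents, 2)
    word = speaker.speak()
    if hearer.listen(word):
        speaker.words = {word}

print(len({w for a in agents for w in a.words}))  # typically converges towards 1

The project would replace such fully rational, interchangeable agents with cognitively constrained ones whose interests need not be aligned.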
What we are looking for We are looking for an enthusiastic student with a degree in artificial intelligence, cognitive science, linguistics or equivalent and who has experience programming agent-based or cognitive models, preferably in Python or C++. Knowledge of speech and speech processing is a bonus. The starting date is negotiable, but preferably no later than September 2019.
How to apply Send a recent CV, detailing your academic record and your programming experience, as well as a letter of motivation to prof. Bart de Boer. At this stage we ask you not to send copies of your diplomas or letters of reference; these we will request directly if we decide to further pursue your application. If you have any questions, please email prof. Bart de Boer.
Links Context: https://ai.vub.ac.be/node/1687 Email Bart de Boer: bart@ai.vub.ac.be
| ||||||||||
6-24 | (2019-07-29) Visiting postdoc at Vrije Universiteit Brussel, Belgium Visiting postdoc in Cognitively Plausible Emergent Behavior
In the context of seed funding for AI research in Flanders, prof. Bart de Boer is looking for a short-term (three to six months) visiting postdoc for the origins of language group of the AI-lab of the Vrije Universiteit Brussel.
Position offered We offer a three- to six-month visiting postdoc position funded by a scholarship and with a bench fee. The work should consist of agent-based simulation, or of experiments to investigate the emergence of behavior in a cognitively realistic setting. This means that, in a computer simulation, the agents are not fully rational, that they show behavior similar to that of humans, and that the interests of agents are not necessarily always aligned. Experiments should focus on factors that are typical for human settings but that are generally idealized away, such as altruism, conflicts of interest and other 'non-rational' behaviors. We are most interested in modeling the emergence of speech, but we welcome applications proposing other areas, such as traffic or economic interactions.
What we are looking for We are looking for an enthusiastic postdoc with a track record in artificial intelligence, cognitive science, linguistics or equivalent and who has either experience programming agent-based or cognitive models, or who has experience with the interaction between computer models and experiments. The starting date is negotiable, but preferably no later than September 2019.
How to apply Send a recent CV, detailing your academic record and your programming experience, as well as a letter of motivation to prof. Bart de Boer. Be sure to include a short (1-page) outline of your proposed project in the letter of motivation, as well as a brief schedule. At this stage we ask you not to send copies of your diplomas or letters of reference; these we will request directly if we decide to further pursue your application. If you have any questions, please email prof. Bart de Boer.
Links Context: https://ai.vub.ac.be/node/1687 Email Bart de Boer: bart@ai.vub.ac.be
| ||||||||||
6-25 | (2019-08-02) Research engineer or Post-doc, at Eurecom, Inria, LIA, France EURECOM (Nice, France), Inria (Nancy, France) and LIA (Avignon, France) are opening a
| ||||||||||
6-26 | (2019-08-02) Ph.D. position in Softbank robotics and Telecom-Paris, France Ph.D. position in Softbank robotics and Telecom-Paris
| ||||||||||
6-27 | (2019-08-03) Speech scientist at ETS Research Speech scientist at ETS Research:
https://etscareers.pereless.com/index.cfm?fuseaction=83080.viewjobdetail&CID=83080&JID=290092
| ||||||||||
6-28 | (2019-08-12) Several positions in Forensic Speech Science or Forensic Data Science: Aston University, Birmingham, UK Positions in Forensic Speech Science or Forensic Data Science:
| ||||||||||
6-29 | (2019-08-14) Postdoc at KTH, Stockholm, Sweden We are looking for a postdoc to conduct research in a multidisciplinary expedition project funded by Wallenberg AI, Autonomous Systems and Software Program (WASP), Sweden?s largest individual research program, addressing compelling research topics that promise disruptive innovations in AI, autonomous systems and software for several years to come.
The project combines Formal Methods and Human-Robot Interaction with the goal of moving from conventional correct-by-design control with simple, static human models towards the synthesis of correct-by-design and socially acceptable controllers that consider complex human models based on empirical data. Two demonstrators, an autonomous driving scenario and a mobile robot navigation scenario in crowded social spaces, are planned to showcase the advances made in the project.
The focus of this position is on the development of data-driven models of human behavior that can be integrated with formal methods-based systems to better reflect real-world situations, as well as in the evaluation of the social acceptability of such systems.
The candidate will work under the supervision of Assistant Prof. Iolanda Leite (https://iolandaleite.com/) and in close collaboration with another postdoctoral researcher working in the field of formal synthesis.
This is a two-year position. The starting date is open for discussion, but ideally, we would like the selected candidate to start ASAP.
QUALIFICATIONS
Candidates should have completed, or be near completion of, a Doctoral degree with a strong international publication record in areas such as (but not limited to) human-robot interaction, social robotics, multimodal perception, and artificial intelligence. Familiarity with formal methods, game theory, and control theory is an advantage.
Documented written and spoken English and programming skills are required. Experience with experimental design and statistical analysis is an important asset. Applicants must be strongly motivated, be able to work independently and possess good levels of cooperative and communicative abilities.
We look for candidates who are excited about being a part of a multidisciplinary team.
HOW TO APPLY
The application should include:
1. Curriculum vitae.
2. Transcripts from University/ University College.
3. A brief description of the candidate's research interests, including previous research and future goals (max 2 pages).
4. Contact of two references. We will contact the references only for selected candidates.
The application documents should be uploaded using KTH's recruitment system:
The application deadline is ** September 13, 2019 **
-----------------
Iolanda Leite Assistant Professor KTH Royal Institute of Technology School of Electrical Engineering and Computer Science Division of Robotics, Perception and Learning (RPL) Teknikringen 33, 4th floor, room 3424, SE-100 44 Stockholm, Sweden Phone: +46-8 790 67 34 https://iolandaleite.com
| ||||||||||
6-30 | (2019-08-17) Fully funded PhD position at IDIAP, Martigny, Valais, Switzerland. There is a fully funded PhD position open at Idiap Research Institute on spiking neural networks.
| ||||||||||
6-31 | (2019-08-18) PhD positions at IRIT, Toulouse, France Applications are invited for a three-year Early Stage Researcher PhD position in speech technology for pathological speech. Description
The thesis focuses on studying the link between the internal representations of Deep Neural Networks (DNNs) and the subjective representation of speech intelligibility. We propose to explore the saliency detection capabilities of DNNs when used in a regression task for predicting speech intelligibility scores as given by human experts. By saliency, we mean retrieving which frequency bands are important and used by a DNN to make its predictions.
The final expectation is to identify regions of interest in the speech signal, both in time and frequency, that characterise the level of speech impairment.
The experiments will be based on various samples of speech produced by 150 people (100 patients and 50 healthy controls). This database was recorded within the INCA C2SI project and contains speech from patients treated for cancer of the oral cavity or pharynx. It also contains various metadata such as the location of the tumor, the impairment in terms of severity and intelligibility as assessed by human experts, and self-evaluation questionnaires on the patient's quality of life. Various tasks were recorded, such as a sustained vowel, read speech, nonsense words, prosodic exercises, picture description, etc.
There will also be the possibility to extend the work to another corpus composed of voices of patients suffering from Parkinson's disease.
At first, the PhD student will build on the various analyses and descriptions produced during the C2SI project, trying to correlate the impact of the tumor with communication ability. Those results will help characterise the human representation of the impact of the disease. Then, a DNN model will be trained to fit the data, taking care of the data sparsity. The last part of the work will be to explore the internal representation of the DNN, investigating which parts of the signal help the model make a decision on the impact of the disease; this is the final goal of the thesis: studying the learned representation embedded in the model the student will propose.
This work is funded by the TAPAS project (https://www.tapas-etn-eu.org), which is a Horizon 2020 Marie Skłodowska-Curie Actions Initial Training Network European Training Network (MSCA-ITN-ETN) project that aims to transform the well-being of people across Europe with debilitating speech pathologies (e.g., due to stroke, Parkinson's, etc.). These groups face communication problems that can lead to social exclusion. They are now being further marginalised by a new wave of speech technology that is increasingly woven into everyday life but which is not robust to atypical speech.
The supervision of the PhD will take place at the IRIT laboratory within the SAMoVA team in Toulouse. SAMoVA does research in the domain of analysis, modeling and structuring of audiovisual content. The application areas are diverse: speech processing, language identification, speaker verification, and speech and music indexing. The researchers' expertise covers novel machine learning and audio processing technologies and is now focused on deep learning methods, leading to several publications in international conferences.
Eligibility Criteria: Early Stage Researchers (ESRs) shall, at the time of recruitment by the host organization, be in the first four years (full-time equivalent research experience) of their research careers. - The ESR may be a national of a Member State, of an Associated Country or of any Third Country.
Applications can be done through the website : https://www.tapas-etn-eu.org/positions/recruitment
Contact : Julie Mauclair (mauclair@irit.fr)
| ||||||||||
6-32 | (2019-08-25) Post-doc position at INRIA Rennes, France Post-doc position: Pattern mining for Neural Networks debugging: application to speech recognition Advisors: Elisa Fromont & Alexandre Termier, IRISA/INRIA RBA - Lacodam team (Rennes)
Irina Illina & Emmanuel Vincent, LORIA/INRIA - Multispeech team (Nancy) Location: INRIA RBA, team Lacodam (Rennes) Keywords: discriminative pattern mining, neural networks analysis, explainability of black-box models, speech recognition Deadline to apply: September 30th, 2019 Context: Understanding the inner working of deep neural networks (DNN) has attracted a lot of attention in the past years [1, 2] and most problems were detected and analyzed using visualization techniques [3, 4]. Those techniques help to understand what an individual neuron or a layer of neurons is computing. We would like to go beyond this by focusing on groups of neurons which are commonly highly activated when a network is making wrong predictions on a set of examples. In the same line as [1], where the authors theoretically link how a training example affects the predictions for a test example using the so-called 'influence functions', we would like to design a tool to 'debug' neural networks by identifying, using symbolic data mining methods, (connected) parts of the neural network architecture associated with erroneous or uncertain outputs. In the context of speech recognition, this is especially important. A speech recognition system contains two main parts: an acoustic model and a language model. Nowadays models are trained with deep neural network-based algorithms (DNN) and use very large learning corpora to train an important number of DNN hyperparameters. There are many works to automatically tune these hyperparameters. However, this induces a huge computational cost, and does not empower the human designers. It would be much more efficient to provide human designers with understandable clues about the reasons for the bad performance of the system, in order to benefit from their creativity to quickly reach more promising regions of the hyperparameter search space. Description of the position: This position is funded in the context of the HyAIAI 'Hybrid Approaches for Interpretable AI' INRIA project lab (https://www.inria.fr/en/research/researchteams/inria-project-labs). With this position, we would like to go beyond the current common visualization techniques that help to understand what an individual neuron or a layer of neurons is computing, by focusing on groups of neurons that are commonly highly activated when a network is making wrong predictions on a set of examples. Tools such as activation maximization [8] can be used to identify such neurons. We propose to use discriminative pattern mining and, to begin with, the DiffNorm algorithm [6] in conjunction with the LCM one [7] to identify the discriminative activation patterns among the identified neurons. The data will be provided by the MULTISPEECH team and will consist of two deep architectures as representatives of acoustic and language models [9, 10]. Furthermore, the training data will be provided, where the model parameters ultimately derive from. We will also extend our results by performing experiments with supervised and unsupervised learning to compare the features learned by these networks and to perform qualitative comparisons of the solutions learned by various deep architectures. Identifying 'faulty' groups of neurons could lead to the decomposition of the DL network into 'blocks' encompassing several layers. 'Faulty' blocks may be the first to be modified in the search for a better design.
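To make the intended approach concrete, here is an illustrative sketch (plain Python/NumPy, not the actual DiffNorm or LCM implementations) of the basic idea: binarise neuron activations, then score small sets of neurons by how much more often they co-activate on wrongly predicted examples than on correctly predicted ones.

from itertools import combinations
import numpy as np

def support(acts, neuron_set):
    # Fraction of examples on which all neurons in the set are active.
    return acts[:, neuron_set].all(axis=1).mean()

def discriminative_sets(acts, errors, max_size=2, min_gap=0.3):
    # acts: (n_examples, n_neurons) boolean activation matrix;
    # errors: boolean vector, True where the network was wrong.
    wrong, right = acts[errors], acts[~errors]
    found = []
    for k in range(1, max_size + 1):
        for s in combinations(range(acts.shape[1]), k):
            gap = support(wrong, list(s)) - support(right, list(s))
            if gap >= min_gap:
                found.append((s, gap))
    return sorted(found, key=lambda t: -t[1])

acts = np.random.rand(200, 10) > 0.5    # toy binarised activations
errors = np.random.rand(200) > 0.8      # toy error indicator
print(discriminative_sets(acts, errors)[:3])

Dedicated algorithms such as LCM make this search tractable for real networks, where exhaustive enumeration over neuron subsets is out of the question.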
The recruited person will benefit from the expertise of the LACODAM team in pattern mining and deep learning (https://team.inria.fr/lacodam/) and of the expertise of the MULTISPEECH team (https://team.inria.fr/multispeech/) in speech analysis, language processing and deep learning. We would ideally like to recruit a 1-year (with possibly one additional year) post-doc with the preferred skills listed on the INRIA web site post-doc page. The candidates should send a CV, 2 names of referees and a cover letter to the four researchers (firstname.lastname@inria.fr) mentioned above. Please indicate if you are applying for the post-doc or the PhD position. The selected candidates will be interviewed in September for an expected start in October-November 2019. Bibliography: [1] Pang Wei Koh, Percy Liang: Understanding Black-box Predictions via Influence Functions. ICML 2017: pp 1885-1894 (best paper). [2] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals: Understanding deep learning requires rethinking generalization. ICLR 2017. [3] Anh Mai Nguyen, Jason Yosinski, Jeff Clune: Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. CVPR 2015: pp 427-436. [4] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus: Intriguing properties of neural networks. ICLR 2014. [5] Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi: Deep Text Classification Can be Fooled. IJCAI 2018: pp 4208-4215. [6] Kailash Budhathoki and Jilles Vreeken: The difference and the norm: characterising similarities and differences between databases. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 206-223. Springer, 2015. [7] Takeaki Uno, Tatsuya Asai, Yuzo Uchida, and Hiroki Arimura: LCM: An efficient algorithm for enumerating frequent closed item sets. In FIMI, volume 90. Citeseer, 2003.
| ||||||||||
6-33 | (2019-08-28) Speech technologist/linguist at Cobaltspeech. Cobalt Speech & Language (http://www.cobaltspeech.com/ ) is looking for a speech technologist/linguist to help find and create language resources for a project in French Canadian.
| ||||||||||
6-34 | (2019-09-04) PhD thesis proposal, GIPSA-Lab, Grenoble, France PhD thesis proposal Incremental sequence-to-sequence mapping for speech generation using deep neural networks September 4, 2019 1 Context and objectives In recent years, deep neural networks have been widely used to address sequence-to-sequence (S2S) learning. S2S models can solve many tasks where source and target sequences have different lengths, such as automatic speech recognition, machine translation, speech translation, text-to-speech synthesis, etc. Recurrent, convolutional and transformer architectures, coupled with attention models, have shown their ability to capture and model complex temporal dependencies between a source and a target sequence of multidimensional discrete and/or continuous data. Importantly, end-to-end training alleviates the need to previously extract handcrafted features from the data by learning hierarchical representations directly from raw data (e.g. character string, video, speech waveform, etc.). The most common models are composed of an encoder that reads the full input sequence (i.e. from its beginning to its end) before the decoder produces the corresponding output sequence. This implies a latency equal to the length of the input sequence. In particular, for a text-to-speech (TTS) system, the speech waveform is usually synthesized from a complete text utterance (e.g. a sequence of words with explicit begin/end-of-utterance markers). Such an approach cannot be used in a truly interactive scenario, in particular by a speech-handicapped person to communicate orally. Indeed, the interlocutor has to wait for the complete utterance to be typed before being able to listen to the synthetic voice, hence limiting the dynamics and naturalness of the interaction. The goal of this project is to develop a general methodology for incremental sequence-to-sequence mapping, with application to interactive speech technologies. It will require the development of end-to-end classification and regression neural models able to deliver chunks of output data on-the-fly, from only a partial observation of input data. The goal is to learn an efficient policy that leads to an optimal trade-off between (variable) latency and accuracy of the decoding process. Possible strategies to decode the output data as soon as possible include: (i) predicting online 'the future' of the output sequence from 'the past and present' of the input sequence, with an acceptable tolerance to possible errors, or (ii) learning automatically from the data an optimal 'waiting policy' that prevents the model from outputting data when the uncertainty is too high. The developed methodology will be applied to address two speech processing problems: (i) incremental text-to-speech synthesis, in which speech is synthesized while the user is typing the text (possibly with a variable latency), and (ii) incremental speech enhancement/inpainting, in which portions of the speech signal are unintelligible because of sudden noise or speech production disorders, and must be replaced on-the-fly with reconstructed portions. 2 Work plan The proposed work plan is the following: bibliographic work on S2S neural models, in the context of speech recognition, speech synthesis and machine translation, as well as their incremental (low-latency) variations; investigating new architectures, losses, and training strategies toward incremental S2S models;
implementing and evaluating the proposed techniques in the context of end-to-end neural TTS systems (the baseline system may be a neural TTS trained with past information/left-context only); and implementing and evaluating the proposed techniques in the context of speech enhancement/inpainting, first on simulated noisy speech and then on pathological speech. 3 Requirements We are looking for an outstanding and highly motivated PhD candidate to work on this subject. The following requirements are mandatory: Engineering degree and/or a Master's degree in Computer Science, Signal Processing or Applied Mathematics. Solid skills in Machine Learning. General knowledge in natural language processing and/or speech processing. Excellent programming skills (mostly in Python and deep learning frameworks). Good oral and written communication in English. Ability to work autonomously and in collaboration with supervisors and other team members. 4 Work context Grenoble Alpes Univ. offers computing facilities, as well as remarkable surroundings to explore over the weekends. The PhD project will be funded by the Grenoble Artificial Intelligence Institute (MIAI). The PhD candidate will work both at GIPSA-lab (CRISSP team) and LIG-lab (GETALP team). The duration of the PhD is 3 years. The salary is between 1770 and 2100 euros gross per month (depending on complementary activity or not). 5 How to apply? Applications should include a detailed CV; a copy of the last diploma; at least two references (people likely to be contacted); a cover letter of one page; a one-page summary of the Master thesis; the two last transcripts of notes (Master or engineering school). Applications should be sent to thomas.hueber@gipsa-lab.fr, laurent.girin@gipsa-lab.fr and laurent.besacier@imag.fr. Applications will be evaluated as they are received: the position is open until it is filled.
| ||||||||||
6-35 | (2019-09-04) Postdoc proposal, GIPSA-Lab, Grenoble, France Postdoc proposal Spontaneous Speech Recognition: Application to Situated Corpora in French. September 4, 2019 1 Postdoc Subject The goal of the project is to advance the state-of-the-art in spontaneous automatic speech recognition (ASR). Recent advances in ASR show excellent performances on tasks such as read speech ASR (Librispeech) or TV shows (MGB challenge), but what about spontaneous communicative speech? This postdoc project would leverage existing transcribed corpora in French (more than 300 hours) recorded in everyday communication (speech recordings inside a family, in a shop, during an interview, etc.). One impact of the project would be the automatization of transcription on very challenging data in order to feed linguistic and phonetic studies at scale. Research topics: end-to-end ASR models; spontaneous speech ASR; colloquial speech transcription; data augmentation for spontaneous and colloquial language modelling; transcribing situated corpora. 2 Requirements We are looking for an outstanding and highly motivated postdoc candidate to work on this subject. The following requirements are mandatory: PhD degree in natural language processing or speech processing. Excellent programming skills (mostly in Python and deep learning frameworks). Interest in pluri-disciplinary research (speech technology and speech science). Good oral and written communication in English (French is a plus but not mandatory). Ability to work autonomously and in collaboration with other team members. 3 Work context Grenoble Alpes Univ. offers computing facilities, as well as remarkable surroundings to explore over the weekends. The postdoc project will be funded by the Grenoble Artificial Intelligence Institute (MIAI). The candidate will work both at LIG-lab (GETALP team) and LIDILEM-lab. The duration of the postdoc is 18 months. 4 How to apply? Applications should include a detailed CV; a copy of the last diploma; at least two references (people likely to be contacted); a cover letter of one page; a one-page summary of the PhD thesis. Applications should be sent to laurent.besacier@imag.fr. Applications will be evaluated as they are received: the position is open until it is filled.
| ||||||||||
6-36 | (2019-09-24) VOXCRIM 2019, Ecully, France VOXCRIM 2019
TUESDAY, SEPTEMBER 24, 2019
from 9:30 am to 5:00 pm
Conferences and round table:
cross-disciplinary perspectives on voice comparison
in forensic science.
Registration before September 13
voxcrim@interieur.gouv.fr
04 72 86 85 22
Service Central de la Police
Technique et Scientifique
31 avenue Franklin Roosevelt
69130 ECULLY
| ||||||||||
6-37 | (2019-09-05) Post-doctoral position at IDIAP, Martigny, Switzerland The Social Computing Group at Idiap is seeking a creative and motivated postdoctoral researcher.
| ||||||||||
6-38 | (2019-09-09) Postdoctoral Research Fellow/Senior Research Fellow, University of Tampere, Finland Postdoctoral Research Fellow/Senior Research Fellow (speech and language technology, cognitive science) Tampere University and Tampere University of Applied Sciences create a unique environment for multidisciplinary, inspirational and high-impact research and education. Our university community has its competitive edges in technology, health and society. www.tuni.fi/en The Speech and Cognition research group (SPECOG) is part of the Computing Sciences Unit of Tampere University within the Faculty of Information Technology and Communication Sciences. SPECOG focuses on interdisciplinary research at the intersection of speech technology and cognitive sciences. We apply advanced signal processing and machine learning methods to computational modeling of human language learning and perception, and study how human-like information processing principles can be applied in autonomous next-generation artificial intelligence (AI) systems. The group also conducts research and development in speech and language technology and in medical signal processing and machine learning. SPECOG collaborates with several internationally leading research groups within and across disciplinary boundaries, including joint research with computer scientists, psychologists, brain researchers, and linguists. The group is also closely affiliated with the audio and machine vision research groups of Tampere University. More information on SPECOG: http://www.cs.tut.fi/sgn/specog/index.html Job description We are inviting applications for the position of a postdoctoral research fellow or senior research fellow in the areas of speech and language technology and cognitive science. The work will be conducted as a member of the SPECOG research group led by Asst. Prof. Okko Räsänen. We are looking for candidates who are interested in human and/or machine language processing, and who are willing to contribute to our highly cross-disciplinary research efforts in understanding language learning in humans and autonomous computational systems. Our current focus is on machine learning algorithms for unsupervised language learning from purely acoustic or audiovisual data (sometimes also known as zero-resource speech processing). However, we also consider candidates with a strong independent research agenda in complementary areas of speech and language technology. In this position, the candidate is expected to: 1) carry out world-class research on a topic related to SPECOG focus areas, 2) work in close collaboration with other members of the research group, and 3) help to advise undergraduate and/or PhD projects on the relevant topics (with flexibility according to personal interests and career aspirations). Requirements The candidate should hold a doctoral degree (e.g., PhD or D.Sc. (Tech.)) in language technology, computer science, electrical engineering, cognitive science, or another relevant area. Candidates who have already completed their doctoral research work but have not yet received their doctoral certificate may also apply. A successful candidate has strong expertise in signal processing and machine learning (e.g., deep learning), ideally from the context of speech technology. Applicants with a background in natural language processing (NLP) or cognitive science are also considered. Experience or interests in linguistics, neuroscience, or statistics are considered an advantage.
Fluent programming (Python, Matlab, R, C++ or similar) and English skills are required. Potential candidates must be capable of carrying out independent research at the highest international level. Competence must be demonstrated through several existing publications in internationally recognized peer-reviewed journals and conferences. We offer The position will be filled for a fixed-term period of two years, starting as soon as possible (but not extending the contract beyond the end of December 2021). A trial period of 6 months is applied to all new employees. The exact starting date is negotiable. We offer a competitive academic salary, typically between 3500–4000 € for a starting postdoc depending on the experience of the candidate, and 4000–4500 € for a senior research fellow with several years of existing postdoctoral research experience in academia or industry. In addition, the position comes with extensive benefits such as occupational healthcare, excellent sports facilities, flexible working hours, and several restaurants and cafés on the campus with staff discounts. Traveling costs and daily allowances related to presenting peer-reviewed work at major international conferences are also normally covered. How to apply Send the application through the online portal at https://tuni.rekrytointi.com/paikat/?o=A_A&jid=301 We will accept applications until the position has been filled, but no later than 30th of November 2019 at 23.59 (GMT+3). Note that we will start evaluating the applicants already on 1st of October 2019, and the position may be filled as soon as a suitable candidate is found. We reserve the right to recruit the candidate through other channels or to decide not to fill the position in case a suitable candidate is not found during the process. The application should contain the following documents (all in .pdf format): - A free-form letter of motivation for the position in question (max. 1 page) - Academic CV with contact information - A list of publications - A copy of the doctoral degree certificate - A letter or letters of recommendation (max. 3) Please name all the documents as surname_CV.pdf, surname_list_of_publications.pdf, etc. Only applications sent through the university application portal and containing the requested attachments in the instructed format will be considered in the recruitment process. The most promising candidates will be interviewed in person or via Skype before the final decision. For more information about the position, please contact Assistant Professor Okko Räsänen (firstname.surname@tuni.fi; no umlauts) by email. About the research environment Finland is among the most stable, free and safe countries in the world, based on prominent ratings by various agencies. It is also ranked as one of the top countries as far as social progress is concerned. Tampere is counted among the major academic hubs in the Nordic countries and offers a dynamic living environment. The Tampere region is one of the most rapidly growing urban areas in Finland and home to a vibrant knowledge-intensive entrepreneurial community. The city is an industrial powerhouse that enjoys a rich cultural scene and a reputation as a centre of Finland's information society. Despite its growth, living in Tampere is highly affordable, with two-room apartment rents starting from approx. 550 €. In addition, the excellent public transport network enables quick, easy and cheap transportation around the city of Tampere and the university campuses.
Read more about Finland and Tampere: • https://www.visitfinland.com/about-finland/ • https://finland.fi/ • http://julkaisut.valtioneuvosto.fi/bitstream/handle/10024/161193/MEAEguide_18_2018_TervetuloaSuomeen_Eng_PDFUA.pdf • https://visittampere.fi/en/
| ||||||||||
6-39 | (2019-09-09) Doctoral Researcher, University of Tampere, Finland Doctoral Researcher (speech and language technology, cognitive science) Tampere University and Tampere University of Applied Sciences create a unique environment for multidisciplinary, inspirational and high-impact research and education. Our universities' community has its competitive edges in technology, health and society. www.tuni.fi/en The Speech and Cognition research group (SPECOG) is part of the Computing Sciences Unit of Tampere University within the Faculty of Information Technology and Communication Sciences. SPECOG focuses on interdisciplinary research at the intersection of speech technology and cognitive sciences. We apply advanced signal processing and machine learning methods to computational modeling of human language learning and perception, and study how human-like information processing principles can be applied in autonomous next-generation artificial intelligence (AI) systems. The group also conducts research and development in speech and language technology and in medical signal processing and machine learning. SPECOG collaborates with several internationally leading research groups within and across disciplinary boundaries, including joint research with computer scientists, psychologists, brain researchers, and linguists. The group is also closely affiliated with the audio and machine vision research groups of Tampere University. More information on SPECOG: http://www.cs.tut.fi/sgn/specog/index.html Job description We are inviting applications for the position of a doctoral researcher (doctoral student) in the areas of speech and language technology and cognitive science. The work will be conducted as a member of the SPECOG research group led by Asst. Prof. Okko Räsänen. We are looking for candidates who are interested in human and/or machine language processing, and who are willing to contribute to our highly cross-disciplinary research efforts in understanding language learning in humans and autonomous computational systems. Our current focus is on machine learning algorithms for unsupervised language learning from purely acoustic or audiovisual data (sometimes also known as zero-resource speech processing). However, we also consider candidates with an interest in complementary areas of speech and language technology. In this position, the candidate is expected to: 1) carry out research on a mutually agreed topic, 2) complete a doctoral degree, including mandatory course studies for a D.Sc. (Tech.) degree, 3) participate in the doctoral program, and 4) be available for assisting tasks in teaching and research group activities (max. 15% of working time). Requirements The candidate should hold a master's degree in language technology, computer science, electrical engineering, mathematics, cognitive science, or another relevant technical area. Candidates who have already completed their master's studies but are graduating during 2019 may also apply. Exceptional master's students of Tampere University who are close to graduation can also be considered for the position. In this case, the candidate is first employed as a Research Assistant to carry out a master's thesis project (6 months) on the topic, with the possibility to continue to doctoral studies upon a successful thesis project. A successful candidate has experience in signal processing and/or machine learning (e.g., deep learning), ideally from the context of speech technology.
Applicants with a background in natural language processing (NLP) or cognitive science are also considered. Experience or interest in linguistics, neuroscience, or statistics is considered an advantage but not required. Good command of programming (Python, Matlab, R, C++ or similar) and English skills are required. Potential candidates must be capable of carrying out independent research work but must also be good team players. Previous research experience, such as research internships or other research projects, is considered a significant advantage. We offer The position will be filled for a fixed-term period of two years with a view to extension. A trial period of 6 months is applied to all new employees. The position starts in January 2020 or as soon as possible, with a negotiable exact starting date. The target completion time for doctoral studies is 4 years. We offer a starting salary of 2300 € for a starting doctoral researcher, with later increases based on demonstrated progress through scientific publications and acquired study credits. In addition, the position comes with extensive benefits such as occupational healthcare, excellent sports facilities, flexible working hours, and several restaurants and cafés on the campus with staff discounts. Traveling costs and daily allowances related to presenting peer-reviewed work at major international conferences are also normally covered. How to apply Send your application through the online portal at https://tuni.rekrytointi.com/paikat/?o=A_A&jid=299 We will accept applications until 15th of November 2019 at 23.59 (GMT+3). We reserve the right to recruit the candidate through other channels or to decide not to fill the position in case a suitable candidate is not found during the process. The application should contain the following documents (all in .pdf format): - A free-form letter of motivation for the position in question (max. 1 page) - Complete CV with contact information and a list of publications (if any) - A copy of the master's degree certificate - English language certificate of proficiency (for non-native and non-Finnish applicants) Please name all the documents as surname_CV.pdf, surname_list_of_publications.pdf, etc. Only applications sent through the university application portal and containing the requested attachments in the instructed format will be considered in the recruitment process. The most promising candidates will be interviewed in person or via Skype before the final decision. For more information about the position, please contact Assistant Professor Okko Räsänen (firstname.surname@tuni.fi; no umlauts) by email. About the research environment Finland is among the most stable, free and safe countries in the world, based on prominent ratings by various agencies. It is also ranked as one of the top countries as far as social progress is concerned. Tampere is counted among the major academic hubs in the Nordic countries and offers a dynamic living environment. The Tampere region is one of the most rapidly growing urban areas in Finland and home to a vibrant knowledge-intensive entrepreneurial community. The city is an industrial powerhouse that enjoys a rich cultural scene and a reputation as a centre of Finland's information society. Despite its growth, living in Tampere is highly affordable, with private-market two-room apartment rents starting from approx. 550 €.
In addition, the excellent public transport network enables quick, easy and cheap transportation around the city of Tampere and the university campuses. Read more about Finland and Tampere: • https://www.visitfinland.com/about-finland/ • https://finland.fi/ • http://julkaisut.valtioneuvosto.fi/bitstream/handle/10024/161193/MEAEguide_18_2018_TervetuloaSuomeen_Eng_PDFUA.pdf • https://visittampere.fi/en/
| ||||||||||
6-40 | (2019-09-09) Postdoc position at IRIT, Toulouse, France READYNOV project AUDIOCAP – Hearing and disability in noise: towards restoring speech intelligibility. Position type: postdoc. Research context: restoring speech intelligibility in noise for elderly people through hearing aids. Keywords: speech, noise, intelligibility. Missions: prediction of speech intelligibility in noise: - getting up to speed with a French automatic speech recognition system, - acoustic modelling in noise. Development of a tool for separating speech from noise, based on the application of time-frequency filters; this tool will be 'tuned' so as to favour speech intelligibility (a rough illustration of such filtering is sketched below). Skills: software development, signal processing, machine learning ('deep learning'). Location: IRIT – 118, route de Narbonne – 31062 TOULOUSE. Dates and duration: 12 to 18 months, starting 1 October 2019. Salary: between 1900 and 2400 € net per month, depending on experience. Documents to provide: - detailed CV, - cover letter, - one-page summary of the PhD thesis. Contact: Julien PINQUIER, pinquier@irit.fr
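As a rough Python illustration of the time-frequency filtering mentioned above, the sketch below applies a Wiener-like soft mask estimated from a noise-only lead-in segment. The 0.5 s noise-estimation heuristic and the function name are this sketch's assumptions, not part of the project description.

    import numpy as np
    from scipy.signal import stft, istft

    def denoise(x, fs, noise_dur=0.5, nperseg=512):
        # Short-time Fourier transform of the noisy signal
        f, t, X = stft(x, fs, nperseg=nperseg)
        # Assume the first noise_dur seconds contain noise only (heuristic)
        hop = nperseg // 2
        n_frames = max(int(noise_dur * fs / hop), 1)
        noise_psd = np.mean(np.abs(X[:, :n_frames]) ** 2, axis=1, keepdims=True)
        # Wiener-like soft mask: close to 1 where the local SNR is high
        snr = np.abs(X) ** 2 / (noise_psd + 1e-10)
        mask = snr / (snr + 1.0)
        _, x_hat = istft(X * mask, fs, nperseg=nperseg)
        return x_hat

In the project, such a mask would be tuned towards speech intelligibility rather than generic noise reduction.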
| ||||||||||
6-41 | (2019-09-05) R&D position at Zaion, Paris, France ZAION is a fast-growing innovative company specialised in conversational robot technology: callbots and chatbots embedding Artificial Intelligence. ZAION has developed a solution building on more than 20 years of experience in Customer Relations. This technologically disruptive solution has been very well received internationally, and we already count 18 active clients (GENERALI, MNH, APRIL, CROUS, EUROP ASSISTANCE, PRO BTP...). We are currently among the only companies in the world to offer this type of solution, entirely geared towards performance. Joining us means taking part in an exciting adventure within an ambitious team aiming to become the reference on the conversational robot market. As part of its growth, ZAION is recruiting a Data Scientist / Machine Learning engineer applied to Audio (M/F). Within the R&D team, your role is strategic for the development and expansion of the company. You will develop a solution that detects emotions in conversations. We want to extend the cognitive abilities of our callbots so that they can detect the emotions of their interlocutors (joy, stress, anger, sadness...) and adapt their answers accordingly. Your main missions: - Take part in the creation of ZAION's R&D unit and lead, upon arrival, your first project on emotion recognition in the voice - Build, adapt and evolve our voice emotion detection services - Analyse large conversation databases to extract emotionally relevant conversations - Build a database of conversations labelled with emotional tags - Train and evaluate machine learning models for emotion classification (a minimal pipeline is sketched below) - Deploy your models in production - Continuously improve the voice emotion detection system. Required qualifications and prior experience: - At least 2 years of experience as a Data Scientist / Machine Learning engineer applied to audio - Engineering school or Master's degree in computer science, or a PhD in computer science or mathematics with solid skills in signal processing (preferably audio) - Solid theoretical background in machine learning and the relevant mathematical fields (clustering, classification, matrix factorisation, Bayesian inference, deep learning...) - Experience deploying machine learning models in a production environment is a plus - Proficiency in one or more of the following: Python, machine learning / deep learning frameworks (PyTorch, TensorFlow, scikit-learn, Keras) and Javascript - Mastery of audio signal processing techniques - Proven experience in labelling large databases (preferably audio) is essential - Your personality: a leader, autonomous, passionate about your work, able to lead a team in project mode - Fluent English. Please send your application to: alegentil@zaion.ai
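As a minimal, hypothetical illustration of the emotion-classification step mentioned in the missions above (mean/std MFCC features plus an SVM; the function names are assumptions, and a production system like the one described would rather rely on deep models trained on large labelled corpora):

    import numpy as np
    import librosa
    from sklearn.svm import SVC

    def mfcc_features(path, sr=16000, n_mfcc=13):
        # Summarize each recording by the mean and std of its MFCCs
        y, _ = librosa.load(path, sr=sr)
        m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        return np.concatenate([m.mean(axis=1), m.std(axis=1)])

    def train_emotion_classifier(paths, labels):
        # labels are emotion tags, e.g. 'joy', 'stress', 'anger', 'sadness'
        X = np.stack([mfcc_features(p) for p in paths])
        return SVC(kernel='rbf', probability=True).fit(X, labels)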
| ||||||||||
6-42 | (2019-09-15) Post-doc and research engineer at INSA, Rouen, Normandy, France Post-doctoral position (1 year): Perception for interaction and social navigation. Research Engineer (1 year): Social Human-Robot Interactions. Laboratory: LITIS, INSA Rouen Normandy, France. Project: INCA (Natural Interactions with Artificial Companions). Summary: The emergence of interactive robots and connected objects has led to the appearance of symbiotic systems made up of human users, virtual agents and robots in social interactions. However, two major scientific difficulties remain unsolved: on the one hand, the recognition of human activity remains inaccurate, both at the operational level (location, mapping and identification of objects and users) and at the cognitive level (recognition and tracking of users' intentions); on the other hand, interaction involves different modalities that must be adapted according to the context, the user and the situation. The INCA project aims at developing artificial companions (interactive robots and virtual agents) with a particular focus on social interactions. Our goal is to develop new models and algorithms for intelligent companions capable of (1) perceiving and representing an environment (real, virtual or mixed) consisting of objects, robots and users; (2) interacting with users in a natural way to assess their needs, preferences, and engagement; (3) learning models of user behavior and (4) generating semantically adequate and socially appropriate responses.
Post-doctoral position in perception for interaction and social navigation (1 position) The candidate will work to ensure that a robot can recognize the physical content of the scene surrounding it, localize itself, recognize static and dynamic objects (users and other robots) and, finally, predict the movement of dynamic elements. The integration of data from different sensors should allow the mapping of an unknown environment and the estimation of the robot's position. First, VSLAM (Visual Simultaneous Localization And Mapping) techniques (Saputra 2018) will be used to map the scene. The regions (or points) of interest detected could... Profile: the candidate must have strong skills in mobile robotics and navigation techniques (VSLAM, OrbSlam, Optical Flow, stereovision...) and strong programming abilities under ROS or any other programming environment compatible with robotics. Machine learning and deep learning skills will be highly appreciated. Research Engineer in Social Human-Robot Interactions (1 position) The hired research engineer will work closely with the INCA research staff (permanent, PhD and post-doctoral members) and other project partners. This will mainly involve administering the project's Pepper robots, developing the necessary tools, integrating the algorithms developed with the AgentSlang platform (https://agentslang.github.io/) and joining the team created to participate in RoboCup 2020 in Bordeaux, @Home league. Profile: Computer Science / Robotics Engineer
Duration and remuneration: 1 year, 2480 euros/month (gross salary). Applications should be sent to: alexandre.pauchet@insa-rouen.fr
| ||||||||||
6-43 | (2019-09-20) ATER position, Paris Sorbonne, France A temporary teaching and research associate (ATER) position in Computer Science is available at the Faculty of Letters of Sorbonne University.
| ||||||||||
6-44 | (2019-09-21) Post-doc/PhD position, LORIA, Nancy, France Post-doc/PhD position Pattern mining for Neural Networks debugging: application to speech recognition
Advisors: Elisa Fromont & Alexandre Termier, IRISA/INRIA RBA, Lacodam team (Rennes)
Irina Illina & Emmanuel Vincent, LORIA/INRIA, Multispeech team (Nancy)
firstname.lastname@inria.fr
Location: INRIA RBA, team Lacodam (Rennes)
Deadline to apply: October 30th, 2019.
Starting date: December 2019 or January 2020.
Keywords: discriminative pattern mining, neural networks analysis, explainability of blackbox models, speech recognition.
Context:
Understanding the inner workings of deep neural networks (DNNs) has attracted a lot of attention in the past years [1, 2], and most problems were detected and analyzed using visualization techniques [3, 4]. Those techniques help to understand what an individual neuron or a layer of neurons is computing. We would like to go beyond this by focusing on groups of neurons which are commonly highly activated when a network is making wrong predictions on a set of examples. In the same line as [1], where the authors theoretically link how a training example affects the predictions for a test example using the so-called 'influence functions', we would like to design a tool to 'debug' neural networks by identifying, using symbolic data mining methods, (connected) parts of the neural network architecture associated with erroneous or uncertain outputs.
In the context of speech recognition, this is especially important. A speech recognition system contains two main parts: an acoustic model and a language model. Nowadays, both models are trained with deep neural network (DNN) based algorithms and use very large training corpora to fit a large number of DNN hyperparameters. There are many works on automatically tuning these hyperparameters. However, this induces a huge computational cost, and does not empower the human designers. It would be much more efficient to provide human designers with understandable clues about the reasons for the bad performance of the system, in order to benefit from their creativity to quickly reach more promising regions of the hyperparameter search space.
Description of the position:
This position is funded in the context of the HyAIAI 'Hybrid Approaches for Interpretable AI' INRIA project lab (https://www.inria.fr/en/research/researchteams/inria-project-labs). With this position, we would like to go beyond the current common visualization techniques that help to understand what an individual neuron or a layer of neurons is computing, by focusing on groups of neurons that are commonly highly activated when a network is making wrong predictions on a set of examples. Tools such as activation maximization [8] can be used to identify such neurons. We propose to use discriminative pattern mining, and, to begin with, the DiffNorm algorithm [6] in conjunction with the LCM one [7], to identify the discriminative activation patterns among the identified neurons.
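As a toy illustration of the underlying idea (not the DiffNorm [6] or LCM [7] algorithms named above), the following Python sketch scores small sets of neurons by how much more often they co-activate on misclassified examples than on correct ones. The representation (one set of 'active' neuron ids per example) and the scoring function are simplifying assumptions.

    import itertools

    def discriminative_patterns(activations, is_wrong, max_size=2, min_support=0.2):
        # activations: one frozenset of highly-activated neuron ids per example
        wrong = [a for a, w in zip(activations, is_wrong) if w]
        right = [a for a, w in zip(activations, is_wrong) if not w]

        def support(pattern, transactions):
            return sum(1 for t in transactions if pattern <= t) / max(len(transactions), 1)

        neurons = sorted(set().union(*wrong)) if wrong else []
        scored = []
        # Brute-force enumeration; real miners such as LCM prune this search space
        for size in range(1, max_size + 1):
            for combo in itertools.combinations(neurons, size):
                p = frozenset(combo)
                if support(p, wrong) >= min_support:
                    # Discriminative score: frequent on errors, rare on correct ones
                    scored.append((support(p, wrong) - support(p, right), p))
        return sorted(scored, key=lambda sp: -sp[0])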
The data will be provided by the MULTISPEECH team and will consist of two deep architectures as representatives of acoustic and language models [9, 10]. Furthermore, the training data from which the model parameters ultimately derive will be provided. We will also extend our results by performing experiments with supervised and unsupervised learning to compare the features learned by these networks and to perform qualitative comparisons of the solutions learned by various deep architectures. Identifying 'faulty' groups of neurons could lead to the decomposition of the DL network into 'blocks' encompassing several layers. 'Faulty' blocks may be the first to be modified in the search for a better design.
The recruited person will benefit from the expertise of the LACODAM team in pattern mining and deep learning (https://team.inria.fr/lacodam/) and from the expertise of the MULTISPEECH team (https://team.inria.fr/multispeech/) in speech analysis, language processing and deep learning. We would ideally like to recruit a post-doc for 1 year (with possibly one additional year), with the following preferred skills:
- Some knowledge of (or interest in) speech recognition
- Knowledgeable in pattern mining (discriminative pattern mining is a plus)
- Knowledgeable in machine learning in general and deep learning in particular
- Good programming skills in Python (for Keras and/or TensorFlow)
- Very good English (understanding and writing)
However, good PhD applications will also be considered and, in this case, the position will last 3 years. The position will be funded by INRIA (https://www.inria.fr/en/). See the INRIA web site for the post-doc and PhD wages.
The candidates should send a CV, the names of 2 referees and a cover letter to the four researchers (firstname.lastname@inria.fr) mentioned above. Please indicate if you are applying for the post-doc or the PhD position. The selected candidates will be interviewed for an expected start in December 2019 or January 2020.
Bibliography:
[1] Pang Wei Koh, Percy Liang: Understanding Black-box Predictions via Influence Functions. ICML 2017: pp 1885-1894 (best paper).
[2] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals: Understanding deep learning requires rethinking generalization. ICLR 2017.
[3] Anh Mai Nguyen, Jason Yosinski, Jeff Clune: Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. CVPR 2015: pp 427-436.
[4] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus: Intriguing properties of neural networks. ICLR 2014.
[5] Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi: Deep Text Classification Can be Fooled. IJCAI 2018: pp 4208-4215.
[6] Kailash Budhathoki and Jilles Vreeken. The difference and the norm: characterising similarities and differences between databases. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 206–223. Springer, 2015.
[7] Takeaki Uno, Tatsuya Asai, Yuzo Uchida, and Hiroki Arimura. LCM: an efficient algorithm for enumerating frequent closed item sets. In FIMI, volume 90. Citeseer, 2003.
[8] Dumitru Erhan, Yoshua Bengio, Aaron Courville, and Pascal Vincent. Visualizing higher-layer features of a deep network. University of Montreal, 1341(3):1, 2009.
[9] G. Saon, H.-K. J. Kuo, S. Rennie, M. Picheny: The IBM 2015 English conversational telephone speech recognition system, Proc. Interspeech, pp. 3140-3144, 2015.
[10] W. Xiong, L. Wu, F. Alleva, J. Droppo, X. Huang, A. Stolcke : The Microsoft 2017 Conversational Speech Recognition System, IEEE ICASSP, 2018.
| ||||||||||
6-45 | (2019-09-22) Postdoc position at Grenoble Alpes University, Grenoble, France Postdoc proposal: Spontaneous Speech Recognition. Application to Situated Corpora in French. September 4, 2019. 1 Postdoc subject: The goal of the project is to advance the state-of-the-art in spontaneous automatic speech recognition (ASR). Recent advances in ASR show excellent performances on tasks such as read speech ASR (LibriSpeech) or TV shows (MGB challenge), but what about spontaneous communicative speech? This postdoc project would leverage existing transcribed corpora in French (more than 300 hours) recorded in everyday communication (speech recordings inside a family, in a shop, during an interview, etc.). One impact of the project would be the automatization of transcription on very challenging data in order to feed linguistic and phonetic studies at scale. Research topics: end-to-end ASR models; spontaneous speech ASR; colloquial speech transcription; data augmentation for spontaneous and colloquial language modelling; transcribing situated corpora. 2 Requirements: We are looking for an outstanding and highly motivated postdoc candidate to work on this subject. The following requirements are mandatory: PhD degree in natural language processing or speech processing. Excellent programming skills (mostly in Python and deep learning frameworks). Interest in pluri-disciplinary research (speech technology and speech science). Good oral and written communication in English (French is a plus but not mandatory). Ability to work autonomously and in collaboration with other team members. 3 Work context: Grenoble Alpes Univ. offers computing facilities, as well as remarkable surroundings to explore over the weekends. The postdoc project will be funded by the Grenoble Artificial Intelligence Institute (MIAI). The candidate will work both at LIG-lab (GETALP team) and LIDILEM-lab. The duration of the postdoc is 18 months. 4 How to apply? Applications should include a detailed CV; a copy of the last diploma; at least two references (people likely to be contacted); a cover letter of one page; a one-page summary of the PhD thesis. Applications should be sent to laurent.besacier@imag.fr. Applications will be evaluated as they are received: the position is open until it is filled.
| ||||||||||
6-46 | (2019-09-22) PhD thesis proposal at Grenoble Alpes University, Grenoble, France PhD thesis proposal: Incremental sequence-to-sequence mapping for speech generation using deep neural networks. September 4, 2019. 1 Context and objectives: In recent years, deep neural networks have been widely used to address sequence-to-sequence (S2S) learning. S2S models can solve many tasks where source and target sequences have different lengths, such as speech recognition, machine translation, speech translation, text-to-speech synthesis, etc. Recurrent, convolutional and transformer architectures, coupled with attention models, have shown their ability to capture and model complex temporal dependencies between a source and a target sequence of multidimensional discrete and/or continuous data. Importantly, end-to-end training alleviates the need to previously extract handcrafted features from the data by learning hierarchical representations directly from raw data (e.g. character string, video, speech waveform, etc.). The most common models are composed of an encoder that reads the full input sequence (i.e. from its beginning to its end) before the decoder produces the corresponding output sequence. This implies a latency equal to the length of the input sequence. In particular, for a text-to-speech (TTS) system, the speech waveform is usually synthesized from a complete text utterance (e.g. a sequence of words with explicit begin/end-of-utterance markers). Such an approach cannot be used in a truly interactive scenario, in particular by a speech-handicapped person to communicate orally. Indeed, the interlocutor has to wait for the complete utterance to be typed before being able to listen to the synthetic voice, hence limiting the dynamics and naturalness of the interaction. The goal of this project is to develop a general methodology for incremental sequence-to-sequence mapping, with application to interactive speech technologies. It will require the development of end-to-end classification and regression neural models able to deliver chunks of output data on-the-fly, from only a partial observation of input data. The goal is to learn an efficient policy that leads to an optimal trade-off between latency and output quality. Possible strategies to decode the output data as soon as possible include: (i) predicting online 'the future' of the output sequence from 'the past and present' of the input sequence, with an acceptable tolerance to possible errors, or (ii) learning automatically from the data an optimal 'waiting policy' that prevents the model from outputting data when the uncertainty is too high. The developed methodology will be applied to address two speech processing problems: (i) incremental text-to-speech synthesis, in which speech is synthesized while the user is typing the text (possibly with a variable latency), and (ii) incremental speech enhancement/inpainting, in which portions of the speech signal are unintelligible because of sudden noise or speech production disorders, and must be replaced on-the-fly with reconstructed portions.
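As a minimal sketch of the second strategy, a fixed-lag ('wait-k'-style) read/write policy can be written as below. The encode_step and decode_step stubs are hypothetical placeholders for an incremental encoder and a one-step decoder, and a learned policy would replace the fixed lag k.

    def wait_k_decode(source_stream, k, encode_step, decode_step):
        # Read k source tokens before emitting anything, then alternate:
        # one output chunk per newly read input token.
        state, outputs, n_read = None, [], 0
        for token in source_stream:
            state = encode_step(state, token)       # hypothetical incremental encoder
            n_read += 1
            if n_read > k:
                outputs.append(decode_step(state))  # hypothetical one-step decoder
        for _ in range(k):                          # input exhausted: flush the tail
            outputs.append(decode_step(state))
        return outputs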
2 Work plan: The proposed working plan is the following: - Bibliographic work on S2S neural models, in the context of speech recognition, speech synthesis, and machine translation, as well as their incremental (low-latency) variations. - Investigating new architectures, losses, and training strategies toward incremental S2S models. - Implementing and evaluating the proposed techniques in the context of end-to-end neural TTS systems (the baseline system may be a neural TTS trained with past information/left-context only). - Implementing and evaluating the proposed techniques in the context of speech enhancement/inpainting, first on simulated noisy speech and then on pathological speech. 3 Requirements: We are looking for an outstanding and highly motivated PhD candidate to work on this subject. The following requirements are mandatory: Engineering degree and/or a Master's degree in Computer Science, Signal Processing or Applied Mathematics. Solid skills in Machine Learning. General knowledge in natural language processing and/or speech processing. Excellent programming skills (mostly in Python and deep learning frameworks). Good oral and written communication in English. Ability to work autonomously and in collaboration with supervisors and other team members. 4 Work context: Grenoble Alpes Univ. offers computing facilities, as well as remarkable surroundings to explore over the weekends. The PhD project will be funded by the Grenoble Artificial Intelligence Institute (MIAI). The PhD candidate will work both at GIPSA-lab (CRISSP team) and LIG-lab (GETALP team). The duration of the PhD is 3 years. The salary is between 1770 and 2100 euros gross per month (depending on complementary activity or not). 5 How to apply? Applications should include a detailed CV; a copy of the last diploma; at least two references (people likely to be contacted); a cover letter of one page; a one-page summary of the Master thesis; the last two transcripts of grades (Master or engineering school). Applications should be sent to thomas.hueber@gipsa-lab.fr, laurent.girin@gipsa-lab.fr and laurent.besacier@imag.fr. Applications will be evaluated as they are received: the position is open until it is filled.
| ||||||||||
6-47 | (2019-10-18) Journées d'étude sur la convergence, LPL, Aix en Provence, France Workshop on convergence (Journées d'étude sur la convergence), 18-19 October 2019. Organised by / contacts: Sibylle Kriegel and Sophie Herment (LPL/AMU). Programme, Friday 18 October 2019: 9:30-10:15 Welcome. 10:15-11:15 Debra Ziegeler, invited speaker (U. Sorbonne Nouvelle): The future of already in Singapore English: a matter of selective convergence. 11:15-11:45 Coffee break. 11:45-12:30 Diana Lewis (AMU, LPL): Grammaticalisation de lexème, de construction : deux cas de développement adverbial en anglais. 12:30-14:15 Lunch. 14:15-15:00 James German (AMU, LPL): Linguistic adaptation as an automatic response to socio-indexical cues. 15:00-15:45 Daniel Véronique (AMU, LPL): L'« agglutination nominale » dans les langues créoles françaises : un exemple de convergence ? 15:45-16:15 Coffee break. 16:15-17:00 Chady Shimeen-Khan (U. Paris Descartes, CEPED): Convergences et divergences à des fins discursives à travers l'usage des marqueurs discursifs chez les jeunes Mauriciens plurilingues. 17:00-17:45 Sibylle Kriegel (AMU, LPL): Créolisation et convergence : l'expression du corps comme marque du réfléchi. 17:45-18:15 Charles Zaremba: Le CLAIX, Cercle Linguistique d'Aix-en-Provence, a retrospective.
Saturday 19 October: 10:00-10:45 Akissi Béatrice Boutin (ILA-UFHB, Abidjan-Cocody): Réanalyses avec et sans convergence dans le plurilinguisme ivoirien. 10:45-11:30 Massinissa Garaoun (AMU): Convergence linguistique et cycles : le cas de la négation en arabe maghrébin et en berbère. 11:30-12:00 Coffee break. 12:00-12:45 Nicolas Tournadre (AMU, LACITO): Phénomènes de copie et de convergence dans les langues du Tibet et de l'Himalaya. 12:45-13:30 Cyril Aslanov (AMU, LPL): Convergence and secondary entropy in a macrodiachronic perspective.
| ||||||||||
6-48 | (2019-10-05) Post-doctoral position at the Laboratoire national de métrologie et d'essais (LNE), Trappes, France A post-doctoral position is open within LNE's 'Evaluation of artificial intelligence systems' activity:
https://www.lne.fr/fr/offre-emploi/post-doc-evaluation-systemes-evolutifs-locuteur-traduction
The successful candidate will join a fast-growing team specialised in the evaluation of AI systems, as well as an ambitious European project on evolving language processing systems (for translation and diarization). Characterising the performance of intelligent systems that keep improving over the course of their use, both on their own and through interaction with their human users, is a genuine challenge that this post-doc sets out to address.
| ||||||||||
6-49 | (2019-10-15) Post doc Portugal An open full-time Post-Doc employment position for 30 months in the context of our research project DyNaVoiceR which is supported by FCT (the Portuguese Foundation for Science and Technology).
The official announcement can be found:
English version: http://www.eracareers.pt/opportunities/index.aspx?task=showAnuncioOportunities&jobId=118038&idc=1e
Portuguese version:
The salary level is quite attractive: 2,128.34 euros per month (14 salaries per year).
The application deadline is October 15.
| ||||||||||
6-50 | (2019-10-10) Research engineer, Laboratoire national de métrologie et d'essais, Trappes, France
Research Engineer in Natural Language Processing (M/F). Permanent position (CDI). Location: Laboratoire national de métrologie et d'essais, Trappes. Reference: ML/ITAL/DEC. A leader in the world of measurement and references, with a strong reputation in France and internationally, LNE supports industrial innovation and positions itself as a key player for a more competitive economy and a safer society. At the crossroads of science and industry since its creation in 1901, LNE offers its expertise to all economic actors involved in product quality and safety. As the pilot of French metrology, our research is at the heart of this public service mission and one of the keys to the success of companies. We strive to meet the industrial and academic need for ever more accurate measurements, under increasingly extreme conditions or on the most emergent concepts such as autonomous vehicles, nanotechnologies or additive manufacturing. Missions: You will join a team of six research engineers (PhD level), regularly joined by post-docs, PhD students and interns, specialised in the evaluation and qualification of artificial intelligence systems. This team is historically recognised for its expertise in the evaluation of natural language processing systems, and the proposed position is meant to reinforce this expertise in a context of strong growth. In recent years, the team has diversified the application domains of its evaluation expertise, addressing topics such as medical devices, collaborative industrial robots, autonomous vehicles, etc. The team capitalises on the diverse and targeted know-how of its experts (NLP, imaging, robotics, etc.) in order to jointly provide a satisfactory answer to the question of the evaluation and certification of intelligent systems, an imperative condition for their acceptability and today a priority focus of public authorities. It is in the context of the progressive setup of an evaluation centre for intelligent systems, with a national and international scope, that the team seeks to attract the best profiles in each AI specialty. The main missions of this future centre are the development of new evaluation protocols, the qualification and certification of intelligent systems, the organisation of challenges (benchmarking campaigns), the provision of experimental resources, the development and organisation of the sector, and the definition of principles, policies, doctrines and standards to this end. As a research engineer in NLP, your primary field of intervention will be language processing (text and speech). You may also be called upon to work on other areas of information processing (for example image processing, including optical character recognition), and beyond, depending on priorities and on your own skills and affinities.
The position can evolve over the medium and long term, as it aims at training technical experts of at least national stature who will eventually lead the growth and oversight policy of their specialty, subject to the regulatory framework and to the general orientations of LNE or its clients. Initially, you will cover the following missions: - Contribution to R&D and structuring actions (60%):
Technical and commercial inventory of needs and offers, and prioritisation of the markets and technical fields to invest in. Identification and definition of the quantities to be measured, the related metrics, the evaluation protocols and the necessary test facilities (a generic example of such a metric is sketched below). Structuring of the discipline's data (NLP) within repositories and according to nomenclatures to be built. Programming and conducting tests for experimental purposes, iterative research and calibration. Building and animating a network of researchers from the public and private sectors, national and foreign, in support of these missions. Contribution to the setup and execution of national and European research projects and international collaborations. Participation in LNE planning work: investments, HR, annual budgets, multi-year perspectives. Publication and presentation of scientific results. Possible supervision of PhD students, post-docs and interns. - Contribution to commercial services in NLP (40%): General language engineering (data manipulation, statistical analysis, etc.). Handling customer needs and reformulating them within a technical and commercial offer. Organisation of the tasks needed to deliver the service, estimation of the necessary resources, negotiation. Execution of these tasks in coordination with the team. Production/writing of deliverables. Presentation of results to the customer. Profile: You hold either a PhD or an engineering degree with at least three years of professional experience, in computer science or language sciences, with a specialisation in natural language processing (NLP) and more generally in artificial intelligence. Past professional or academic experience in software development and/or testing, statistical analysis, as well as speech or image processing will be particularly appreciated. You also have a good level of English and of programming (C++ and/or Python), as well as experience with Linux. As part of your onboarding, you may be asked to take additional training (for example in artificial intelligence and cybersecurity). You are able to take the initiative, enjoying wide autonomy and a potential for creativity allowing you to fully occupy your area of responsibility with a goal of excellence. You can assert leadership through the quality and clarity of your arguments. Frequent travel in the Paris region (once a week), elsewhere in France (once or twice a month) and occasional travel worldwide (once a quarter) for services, meetings or conferences.
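As one concrete example of the evaluation metrics such a centre works with, the word error rate used for speech transcription systems can be computed with a standard edit distance. This Python sketch is a generic illustration, not an LNE protocol.

    def wer(reference, hypothesis):
        # Word error rate: edit distance between word sequences / reference length
        r, h = reference.split(), hypothesis.split()
        d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
        for i in range(len(r) + 1):
            d[i][0] = i
        for j in range(len(h) + 1):
            d[0][j] = j
        for i in range(1, len(r) + 1):
            for j in range(1, len(h) + 1):
                d[i][j] = min(d[i-1][j-1] + (r[i-1] != h[j-1]),  # substitution
                              d[i-1][j] + 1,                     # deletion
                              d[i][j-1] + 1)                     # insertion
        return d[len(r)][len(h)] / max(len(r), 1)

    # Example: wer('the cat sat', 'the cat sat down') == 1/3 (one insertion)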
| ||||||||||
6-51 | (2019-10-13) Postdoctoral Researcher, IRISA, Rennes, France Postdoctoral Researcher in Multilingual Speech Processing CONTEXT The Expression research team focuses on expressiveness in human-centered data. In this context, the team has a strong activity in the field of speech processing, especially text-to-speech (TTS). This activity is reflected in regular publications in top international conferences and journals, exposing contributions in topics like machine learning (including deep learning), natural language processing, and speech processing. Team Expression takes part in multiple collaborative projects. Among those, the current position is part of a large European H2020 project focusing on the social integration of migrants in Europe. Team's website: https://www-expression.irisa.fr/ PROFILE Main tasks: 1. Design multilingual TTS models (acoustic models, grapheme-to-phoneme, prosody, text normalization...) 2. Take part in porting the team's TTS system to embedded environments 3. Develop spoken language skill assessment methods Secondary tasks: 1. Collect speech data 2. Define use cases with the project partners Environment: The successful candidate will join a team of other researchers and engineers working on the same topics. Required qualification: PhD in computer science or signal processing Skills: • Statistical machine learning and deep learning • Speech processing and/or natural language processing • Strong object-oriented programming skills • Android and/or iOS programming is a strong plus CONTRACT Duration: 22 months, full time Salary: competitive, depending on experience. Starting date: 1st January 2020. APPLICATION & CONTACTS Send a cover letter, a resume, and references by email to: • Arnaud Delhay, arnaud.delhay@irisa.fr • Gwénolé Lecorvé, gwenole.lecorve@irisa.fr • Damien Lolive, damien.lolive@irisa.fr Application deadline: 15th November 2019. Applications will be processed on a daily basis.
| ||||||||||
6-52 | (2019-10-16) Position in Machine Learning/AI at ReadSpeaker, The Netherlands ReadSpeaker has a job opening for a Machine Learning / AI specialist working on text-to-speech research and development. The job ad can be found here:
| ||||||||||
6-53 | (2019-10-18) FULLY FUNDED FOUR-YEAR PHD STUDENTSHIPS, University of Edinburgh, Scotland FULLY FUNDED FOUR-YEAR PHD STUDENTSHIPS UKRI CENTRE FOR DOCTORAL TRAINING IN NATURAL LANGUAGE PROCESSING at the University of Edinburgh's School of Informatics and School of Philosophy, Psychology and Language Sciences. Applications are now sought for the CDT's second cohort of students, to start in September 2020 (deadlines below). The CDT in NLP offers unique, tailored doctoral training comprising both taught courses and a doctoral dissertation. Both components run concurrently over four years. Each student will take a set of courses designed to complement their existing expertise and give them an interdisciplinary perspective on NLP. They will receive full funding for four years, plus a generous allowance for travel, equipment and research costs. The CDT brings together researchers in NLP, speech, linguistics, cognitive science and design informatics from across the University of Edinburgh. Students will be supervised by a team of over 40 world-class faculty and will benefit from cutting-edge computing and experimental facilities, including a large GPU cluster and eye-tracking, speech, virtual reality and visualisation labs. The CDT involves over 20 industrial partners, including Amazon, Facebook, Huawei, Microsoft, Mozilla, Reuters, Toshiba, and the BBC. Close links also exist with the Alan Turing Institute and the Bayes Centre. A wide range of research topics fall within the remit of the CDT:
The second cohort of CDT students will start in September 2020 and applications are now open. Around 12 studentships are available, covering maintenance at the research council rate (https://www.ukri.org/skills/funding-for-research-training, currently £15,009 per year) and tuition fees. Studentships are available for UK, EU and non-EU nationals. Individuals in possession of other funding scholarships or industry funding are also welcome to apply; please provide details of your funding source in your application. Applicants should have an undergraduate or master's degree in computer science, linguistics, cognitive science, AI, or a related discipline. We particularly encourage applications from women, minority groups and members of other groups that are underrepresented in technology. Further details, including the application procedure, can be found at: https://edin.ac/cdt-in-nlp Application deadlines: 29th November 2019 (non-EU/UK) or 31st January 2020 (EU/UK). CDT in NLP Open Days
Enquiries can be made to the CDT admissions team at cdt-nlp-info@inf.ed.ac.uk
| ||||||||||
6-54 | (2019-10-19) Postdoctoral Scholar, University of Southern California, USA Open Position - Postdoctoral Scholar for Multimodal Machine Learning and Natural Language Processing
The University of Southern California's Institute for Creative Technologies (ICT) is an off-campus research facility, located on a creative business campus in the 'Silicon Beach' neighborhood of Playa Vista. We are world leaders in innovative training and education solutions, computer graphics, computer simulations, and immersive experiences for decision-making, cultural awareness, leadership and health. ICT employees are encouraged to develop themselves both professionally and personally, through workshops, invited guest talks, movie nights, social events, various sports teams, a private gym and a personal trainer. The atmosphere at ICT is informal and flexible, while encouraging initiative, personal responsibility and a high work ethic. The postdoctoral scholar will:
- Design and implement state-of-the-art NLP machine learning algorithms to automatically code dyadic motivational interviewing (MI) therapy sessions and predict behavior change in patients.
- Push the envelope on current NLP and multimodal machine learning algorithms to better understand the MI process and outcome.
- Conduct statistical analyses on verbal, nonverbal and dyadic behavioral patterns to describe their relationship with the MI process and outcome.
- Write and lead authorship of high-impact conference (ACL, EMNLP, ICMI, CVPR, ICASSP, and Interspeech) and journal papers (PAMI, TAFFC, and TASLP).
- Support and lead graduate and undergraduate students and summer interns in preprocessing and annotating multimodal MI data.
Work collaboratively with:
- Domain experts in MI research, to automatically derive meaningful insights for MI research experts.
- Computer scientists across departments at the highly accomplished and interdisciplinary USC Institute for Creative Technologies.
Minimum Experience: At least 1 year of experience working with data comprising human verbal and/or nonverbal behavior.
Minimum Field of Expertise: Directly related education in research specialization with advanced knowledge of equipment, procedures, and analysis methods.
Skills: comfortable with machine learning frameworks such as PyTorch or TensorFlow; excellent programming skills in Python or C++; analysis; assessment/evaluation; written and oral communication; organization; planning; problem identification and resolution; project management; research.
| ||||||||||
6-55 | (2019-11-03) Research engineer, IRIT, Toulouse, France Within the framework of the ALAIA joint laboratory, IRIT (SAMoVA team, https://www.irit.fr/SAMOVA/site/) is recruiting a research engineer on a fixed-term contract (CDD) to join its research team, work in the field of AI applied to foreign language learning, and collaborate with the company Archean Technologie (http://www.archean.tech/archean-labs-en.html). Position to be filled: research engineer.
| ||||||||||
6-56 | (2019-11-05) Annotator/Transcriber (M/F) at ZAION, Paris, France ZAION (https://www.zaion.ai) is a fast-growing innovative company specialised in conversational robot technology: callbots and chatbots embedding Artificial Intelligence. ZAION has developed a solution building on more than 20 years of experience in Customer Relations. This technologically disruptive solution has been very well received internationally, and we already count 12 active clients (GENERALI, MNH, APRIL, CROUS, EUROP ASSISTANCE, PRO BTP...). We are currently among the only companies in the world to offer this type of solution, entirely geared towards performance. Joining us means taking part in a great adventure within a dynamic team whose ambition is to become the reference on the conversational robot market. Within our Artificial Intelligence activity, to support its ongoing innovations in the automatic identification of sentiments and emotions in conversational telephone interactions, we are recruiting an Annotator/Transcriber (M/F):
Main missions:
| ||||||||||
6-57 | (2019-11-05) Data Scientist / Machine Learning applied to Audio (M/F) at Zaion, Paris, France ZAION is a fast-growing innovative company specialised in conversational robot technology: callbots and chatbots embedding Artificial Intelligence. ZAION has developed a solution building on more than 20 years of experience in Customer Relations. This technologically disruptive solution has been very well received internationally, and we already count 18 active clients (GENERALI, MNH, APRIL, CROUS, EUROP ASSISTANCE, PRO BTP...). We are currently among the only companies in the world to offer this type of solution, entirely geared towards performance. Joining us means taking part in an exciting adventure within an ambitious team aiming to become the reference on the conversational robot market. As part of its growth, ZAION is recruiting a Data Scientist / Machine Learning engineer applied to Audio (M/F). Within the R&D team, your role is strategic for the development and expansion of the company. You will develop a solution that detects emotions in conversations. We want to extend the cognitive abilities of our callbots so that they can detect the emotions of their interlocutors (joy, stress, anger, sadness...) and adapt their answers accordingly. Your main missions: - Take part in the creation of ZAION's R&D unit and lead, upon arrival, your first project on emotion recognition in the voice - Build, adapt and evolve our voice emotion detection services - Analyse large conversation databases to extract emotionally relevant conversations - Build a database of conversations labelled with emotional tags - Train and evaluate machine learning models for emotion classification - Deploy your models in production - Continuously improve the voice emotion detection system. Required qualifications and prior experience: - At least 2 years of experience as a Data Scientist / Machine Learning engineer applied to audio - Engineering school or Master's degree in computer science, or a PhD in computer science or mathematics with solid skills in signal processing (preferably audio) - Solid theoretical background in machine learning and the relevant mathematical fields (clustering, classification, matrix factorisation, Bayesian inference, deep learning...) - Experience deploying machine learning models in a production environment is a plus - Proficiency in one or more of the following: Python, machine learning / deep learning frameworks (PyTorch, TensorFlow, scikit-learn, Keras) and Javascript - Mastery of audio signal processing techniques - Proven experience in labelling large databases (preferably audio) is essential - Your personality: a leader, autonomous, passionate about your work, able to lead a team in project mode - Fluent English. Please send your application to: alegentil@zaion.ai
|