ISCApad #269
Thursday, November 12, 2020, by Chris Wellekens
6-1 | (2020-05-10) Researcher at GIPSA-Lab, Grenoble, France
The CRISSP team (Cognitive Robotics, Interactive Systems & Speech Processing) at GIPSA-Lab is seeking a motivated candidate to work on speech synthesis applied to embodied face-to-face interaction. The candidate should have skills in machine learning.
The work is part of the THERADIA project, funded by BPI-France and carried out in partnership with research laboratories (EMC, LIG) and industrial partners (SBT, ATOS, Pertimm).
Applications will be reviewed on a rolling basis until the position is filled.
Full details on the topics and how to apply: https://bit.ly/3cW1gy9
Contact: gerard.bailly@gipsa-lab.fr
6-2 | (2020-05-11) Tenure-track researcher at CWI, Amsterdam, The Netherlands We have an open position for a tenure-track researcher at CWI (https://www.cwi.nl/) within our Distributed & Interactive Systems (DIS) group (https://www.dis.cwi.nl/).
6-3 | (2020-05-12) Fully-funded 4-year PhD studentships for research in Speech and Language Technologies (SLT) and their Applications, UKRI Centre for Doctoral Training, Sheffield, UK
UKRI Centre for Doctoral Training (CDT) in Speech and Language Technologies (SLT) and their Applications
Department of Computer Science, Faculty of Engineering, University of Sheffield
Fully-funded 4-year PhD studentships for research in Speech and Language Technologies (SLT) and their Applications ** Applications now open for last remaining September 2020 intake places ** Deadline for applications: 31 May 2020. What makes the SLT CDT different:
The benefits:
About you: We are looking for students from a wide range of backgrounds interested in Speech and Language Technologies.
Applying: Applications are now sought for the September 2020 intake.
We operate a staged admissions process, with application deadlines throughout the year. The final deadline for applications for the remaining places is 31 May 2020.
Applications will be reviewed within 6 weeks of each deadline and short-listed applicants will be invited to interview. Interviews will be held in Sheffield. Because of the high volume of applications we receive, we may sometimes need more time to assess your application; if so, we will let you know.
See our website for full details and guidance on how to apply: slt-cdt.ac.uk For an informal discussion about your application please contact us by email at: sltcdt-enquiries@sheffield.ac.uk By replying to this email or contacting sltcdt-enquiries@sheffield.ac.uk you consent to being contacted by the University of Sheffield in relation to the CDT. You are free to withdraw your permission in writing at any time.
6-4 | (2020-05-20) University assistant position, Johannes Kepler University, Linz, Austria We are happy to announce a position as a university assistant at the *Institute of
6-5 | (2020-05-26) Fully funded PhD position in data-driven socially assistive robotics, Uppsala University, Sweden
** Fully funded PhD position in data-driven socially assistive robotics **
Uppsala Social Robotics Lab, Department of Information Technology, Uppsala University, Sweden
Uppsala University is a comprehensive research-intensive university with a strong international standing. Our mission is to pursue top-quality research and education and to interact constructively with society. Our most important assets are all the individuals whose curiosity and dedication make Uppsala University one of Sweden's most exciting workplaces. Uppsala University has 46,000 students, 7,300 employees and a turnover of SEK 7.3 billion. The Department of Information Technology holds a leading position in research as well as teaching at all levels. The department has 280 employees, including 120 faculty and 110 PhD students, organized in 30 research groups. More than 4000 students are enrolled annually. The Uppsala Social Robotics Lab (https://usr-lab.com), led by Prof. Ginevra Castellano, aims to design and develop robots that learn to interact socially with humans and bring benefits to the society we live in, for example in application areas such as education and assistive technology.
We are collecting expressions of interest for an upcoming PhD position in data-driven socially assistive robotics for medical applications within a project funded by Uppsala University’s WoMHeR (Women’s Mental Health during the Reproductive lifespan) Centre, in collaboration with the Department of Neuroscience.
The PhD project will include the development and evaluation of novel machine learning-based methods for robot-assisted diagnosis of women’s depression around childbirth via automatic analysis of multimodal user behaviour in interactive scenarios.
The student will be part of the Uppsala Social Robotics Lab at the Division of Visual Information and Interaction of the Department of Information Technology. The Uppsala Social Robotics Lab's focus is on natural interaction with social artefacts such as robots and embodied virtual agents. This domain brings together multidisciplinary expertise to address new challenges in the area of social robotics, including mutual human-robot co-adaptation, multimodal multiparty natural interaction with social robots, multimodal human affect and social behavior recognition, multimodal expression generation, robot learning from users, behavior personalization, effects of embodiment (physical robot versus embodied virtual agent) and other fundamental aspects of human-robot interaction (HRI). State-of-the-art robots are used, including the Pepper, Nao and Furhat robotic platforms.
The position is for four years. Rules governing PhD students are set out in the Higher Education Ordinance chapter 5, §§ 1-7 and in Uppsala University's rules and guidelines http://regler.uu.se/?languageId=1.
How to send expressions of interest: To express your interest, you should send to Ginevra Castellano (ginevra.castellano@it.uu.se) by the 10th of June a description of yourself, your research interests, reasons for applying for this particular PhD position and past experience (max. 3 pages), a CV, copies of relevant university degrees and transcripts, links to relevant publications and your MSc thesis (or a summary in case the thesis work is ongoing) and other relevant documents. Candidates are encouraged to provide contact information to up to 3 reference persons. We would also like to know your earliest possible date for starting.
Requirements: Qualifications: The candidates must have an MSc degree in computer science or related areas relevant to the PhD topics. Good programming skills are required and expertise in machine learning appreciated. The PhD position is highly interdisciplinary and requires an understanding and/or interest in psychology and social sciences and willingness to work in an interdisciplinary team.
Working in Sweden: Sweden is a fantastic place for living and working. Swedes are friendly and speak excellent English. The quality of life is high, with a strong emphasis on outdoor activities. The Swedish working climate emphasizes an open atmosphere, with active discussions involving both junior and senior staff. PhD students are full employees, with competitive salaries, pension provision and five weeks of paid leave per year. Spouses of employees are entitled to work permits. Healthcare is free after a small co-pay and the university subsidizes athletic costs, such as a gym membership. The parental benefits in Sweden are among the best in the world, including extensive parental leave (for both parents), paid time off to care for sick children, and affordable daycare. Upon completion of the PhD degree, students are entitled to permanent residency to find employment within Sweden.
6-6 | (2020-06-01) Funded PhD position at Université Grenoble Alpes, France
Université Grenoble Alpes is recruiting a fully funded PhD student (3 years) at
6-7 | (2020-06-10) 2 post-doc positions at UT Dallas, Texas, USA
POST-DOCTORAL POSITION #1: Center for Robust Speech Systems, Robust Speech Technologies Lab
Developing robust speech and language technologies (SLT) for naturalistic audio is among the most challenging problems in machine learning. CRSS-RSTL stands at the forefront of this effort by making available the largest (150,000 hours) publicly available naturalistic corpus in the world. The FEARLESS STEPS corpus is a collection of multi-speaker, time-synchronized, multi-channel audio from all 12 of NASA's Apollo manned missions. Deploying such an ambitious corpus requires state-of-the-art support infrastructure in which multiple technologies work in concert to provide meaningful information to researchers from the science, technology, historical-archive, and educational communities. To this end, we are seeking a post-doctoral researcher in the area of speech and language processing and machine learning. The researcher will collaboratively aid in the development of speech, natural language, and spoken dialog systems for noisy multi-channel audio streams. Overseeing the digitization of analog tapes, community outreach and engagement, and assisting in cutting-edge SLT research are also important tasks for the project. Those interested should send an email with their resume and areas of interest to John.Hansen@utdallas.edu. More information can be found on our website: CRSS-RSTLab (Robust Speech Technologies Lab) at https://crss.utdallas.edu/
POST-DOCTORAL POSITION #2: Center for Robust Speech Systems, Cochlear Implant Processing Lab
Cochlear implants are one of the most successful solutions for restoring hearing sensation via an electronic device. However, the search for better sound coding and electrical stimulation strategies could be significantly accelerated by a flexible, powerful, portable speech processor for cochlear implants that is compatible with current smartphones/tablets. We are developing CCi-MOBILE, the next generation of such a research platform: more flexible and computationally powerful than clinical research devices, it will enable the implementation and long-term evaluation of advanced signal processing algorithms in naturalistic and diverse acoustic environments. To this end, we are seeking a post-doctoral researcher in the area of cochlear implant signal processing and embedded hardware/systems design. The researcher will collaboratively aid in the development of embedded (FPGA-based) hardware (PCBs) for speech processing applications. Firmware development in Verilog and Java (Android) for implementing DSP algorithms is also an important task for the project. Those interested should send an email with their resume and areas of interest to John.Hansen@utdallas.edu. More information can be found on our website: CRSS-CILab (Cochlear Implant Processing Lab) at https://crss.utdallas.edu/CILab/
6-8 | (2020-06-13) Tenure track researcher at CWI, Amsterdam, The Netherlands Do you want to work with us at CWI in Amsterdam?
6-9 | (2020-06-18) Assistant/Postdoc level at TU Wien, Austria
Are you interested in joining a vibrant research environment in the center of Europe, as a
6-10 | (2020-06-20) Two fully-funded PhD studentships in automatic speech recognition - University of Sheffield
Topic 1: Semi-supervised Learning for Automatic Speech Recognition. Deadline: July 24, 2020.
Topic 2: Multilingual Speech Recognition. To apply: https://www.jobs.ac.uk/job/CAJ394/phd-studentship-in-multilingual-speech-recognition. Deadline: July 24, 2020.
For further information please contact Prof. Thomas Hain (t.hain@sheffield.ac.uk).
6-11 | (2020-06-22) PhD grant, Université de Toulouse, France
Subject: 'MOTRYLANG - The role of motor rhythm in language development and language disorders'
Supervisors: Corine Astésano, Jessica Tallet
Host laboratories: U.R.I. Octogone-Lordat (EA 4156), Université de Toulouse II; Laboratoire ToNIC (UMR 1214), Université Paul Sabatier - Toulouse III
Discipline: Linguistics
Doctoral school: Comportement, Langage, Education, Socialisation, Cognition (CLESCO)
Scientific description of the research project: The project aims to address a series of scientific and clinical questions regarding the place of motor activity in child language development and its rehabilitation. Typical development comes with the implementation of rhythm in speech production (prosodic accenting) and also in movement production (tapping, walking, sensorimotor synchronisation, ...). Interestingly, the tempo of linguistic and motor rhythms is similar in healthy adults (a period of around 700 ms, i.e. a rate of about 1.4 Hz). The present project aims to (1) investigate the existence of a link between motor and linguistic rhythms and their neural correlates (electroencephalography, EEG) in children with and without language disorders; (2) evaluate the impact of motor training on linguistic performance; and (3) create, computerize and test a language rehabilitation program based on the use of motor rhythm in children with language acquisition disorders. This project will have scientific repercussions in the linguistic and movement sciences as well as in the field of rehabilitation. The selected candidate will benefit from a stimulating scientific environment: (s)he will join the Interdisciplinary Research Unit Octogone-Lordat (Toulouse II: http://octogone.univ-tlse2.fr/) and will be co-supervised by Corine Astésano, linguist-phonetician specializing in prosody, and by Jessica Tallet, specialist in rhythmic motor skills and learning at the ToNIC laboratory, Toulouse NeuroImaging Center (Toulouse III: https://tonic.inserm.fr/). The research will be integrated in a work group on Language, Rhythm and Motor Skills, which encompasses PhD students, rehabilitation professionals and collaborators from other universities.
Required skills:
- Master in linguistics, human movement sciences, cognitive sciences, health sciences or equivalent
- A speech therapist's profile would be a plus
- Experience in experimental phonetics and/or linguistics, neuro-psycho-linguistics (speech disorders)
- Skills in linguistic data processing and analysis
- Skills in evaluating neurological speech disorders and in running linguistic remediation programs
- Autonomy and motivation for learning new skills (e.g., EEG)
- Good knowledge of the French language; good writing and oral skills in both French and English
Salary: 1768.55 € gross per month; 3-year contract
Calendar:
- Application deadline: 6 July 2020
- Interviews of selected candidates: 15 July 2020
- Start of contract: 1 October 2020
Applications must be sent to Corine Astésano (corine.astesano at univ-tlse2.fr) and must include:
- A detailed CV, with a list of publications if applicable
- A copy of the Master's degree grades
- A summary of the Master's dissertation and a pdf file of the Master's dissertation
- A cover letter / letter of interest and/or scientific project (1 page max)
- A letter of recommendation from a scientific reference person/supervisor
6-12 | (2020-06-30) PhD at Université Grenoble Alpes, France
Université Grenoble Alpes is recruiting a doctoral researcher with
6-13 | (2020-07-09) Speech Research Scientist at ETS R&D:
https://etscareers.pereless.com/index.cfm?fuseaction=83080.viewjobdetail&CID=83080&JID=302819
6-14 | (2020-07-16) Early Stage Researcher / PhD student in an EU Marie Sklodowska-Curie Action (H2020-MSCA-ITN), Romania We are hiring an Early Stage Researcher / PhD student in an EU Marie Sklodowska-Curie Action (H2020-MSCA-ITN) on the topic 'Designing and Engineering Multimodal Feedback to Augment the User Experience of Touch Input' under the supervision of Prof. Radu-Daniel Vatavu (http://www.eed.usv.ro/~vatavu)
6-15 | (2020-07-20) Technical project manager, INRIA, Nancy, France Inria is seeking a Technical Project Manager for a European (H2020 ICT) collaborative
6-16 | (2020-09-10) Experts in recognition and synthesis at Reykjavík University's Language and Voice Lab, Iceland
Reykjavík University's Language and Voice Lab (https://lvl.ru.is) is looking
for experts in speech recognition and in speech synthesis. At the LVL you
will be joining a research team working on exciting developments in language
technology as a part of the Icelandic Language Technology Programme
(https://arxiv.org/pdf/2003.09244.pdf).
Job Duties:
. Conduct independent research in the fields of speech processing,
machine learning, speech recognition/synthesis and human-computer interaction.
. Work with a team of other experts in carrying out the Speech
Recognition/Synthesis part of the Icelandic Language Technology Programme.
. Publish and disseminate research findings in journals and present at conferences.
. Actively take part in scientific and industrial cooperation projects.
. Assist in supervising Bachelor's/Master's students.
Skills:
. MSc/PhD degree in engineering, computer science, statistics, mathematics
or similar
. Good programming skills (e.g. C++ and Python) and knowledge of Linux
(necessary).
. Good knowledge of a deep learning library such as PyTorch or TensorFlow
(necessary).
. Good knowledge of Kaldi (preferable).
. Background in language technology (preferable).
. Good skills in writing and understanding shell scripts (preferable).
All applications must be accompanied by a good CV with information about
previous jobs, education, references, etc. Applicants may optionally attach a
cover letter explaining why they are the right person for the job.
Here is the link to apply: https://jobs.50skills.com/ru/is/5484
The application deadline is October 4th, 2020.
Applications are only accepted through RU's recruitment system.
All inquiries and applications will be treated as confidential.
Further information about the job is provided by Jón Guðnason
Associate Professor, jg@ru.is, and Ester Gústavsdóttir, Director of Human
Resources, esterg@ru.is.
The role of Reykjavik University is to create and disseminate knowledge to
enhance the competitiveness and quality of life for individuals and society,
guided by good ethics, sustainability and responsibility.
Education and research at RU are based on strong ties with industry and society.
We emphasize interdisciplinary collaboration, international relations and
entrepreneurship.
6-17 | (2020-09-17) PhD contract proposal, Sorbonne University (Jussieu), Paris, France
Doctoral contract proposal
Title: Speech rhythm and manual gestures in performative synthesis
Summary: The goal of this thesis is to develop a theoretical framework and experiments on the use of manual gestures for prosodic control via human-machine interfaces, in performative synthesis. Performative speech synthesis is a new research paradigm in human-machine interaction, in which a synthetic voice is played like an instrument, in real time, using the limbs (hands, feet). Manual control of speech rhythm is a problem that involves rhythmic units, rhythmic control points, the perceptual centers of syllables, tapping gestures, and even gestural scores inspired by autosegmental or articulatory phonologies. Rhythmic units vary with the phonology of the language under study, here French, English and Mandarin Chinese. The thesis therefore addresses the modelling of the perception-action schemas involved in rhythmic control, the modelling of temporal units, and the implementation and evaluation of a rhythm control system. The target applications are: 1. learning natural control of intonation contours by means of chironomy, for foreign language acquisition (English, French, Mandarin); 2. learning chironomic control of the intonation contours of one's native language, for voice substitution (artificial larynx).
Context: The voice is not a musical 'instrument' in the sense of an artefact set into vibration by the limbs or the breath. The vocal organs are internal, largely invisible, and controlled in a complex way by several muscle groups (respiration, phonation, articulation). Vocal control is therefore interoceptive by nature, whereas it is more kinesthetic and exteroceptive for musical instruments. The advent of digital synthesis makes it possible, for the first time, to render an unmistakably vocal sound with an external instrumental device, held at a distance from the vocal apparatus. These 'vocal instruments' are 'operated' with the hands or feet, using sensors or human-machine interfaces. This displacement poses the question of vocal control in terms quite different from those of controlling an acoustic instrument or the voice itself. Vocal instruments currently allow musical control of phonation (intonation, rhythmic sequencing, voice quality) for the singing voice. Very precise control of articulation and rhythm in speech remains problematic. The purpose of this thesis is to address the question of gestural control of prosodic rhythm and articulatory sequencing.
Objectives and expected results: This thesis is part of the line of research on vocal instruments. A vocal instrument is a real-time vocal synthesizer under gestural control. Synthesis is performed by a program that produces the samples. Gestural control uses interfaces to capture the gestures. Since articulator movements are very fast, they are difficult to control directly with manual gestures, and methodologies based on the phonological representation of prosodic rhythm must be put in place.
Rhythm is produced by gestures of the limbs, hands or feet, in place of the articulatory gestures that correspond to syllables. Neither the perception-action loops nor the velocities of the organs set in motion are the same. Controlling prosodic rhythm in performative synthesis is therefore a problem that involves defining rhythmic units, rhythmic control points, perceptual centers of syllables, tapping gestures, and even gestural scores inspired by autosegmental or articulatory phonology. Rhythmic control points must enrich the speech signal so that its temporal unfolding can be manipulated. These points must be meaningful with respect to the phonology and phonotactics of the language being played. The perception of the syllabic stream, with its perceptual centers, is therefore involved. The control gestures, by pressing or tapping, involve motor processes that are both analogous to and different from those of the articulators. Rhythmic units vary with the phonology of the languages under study, here French, English and Mandarin Chinese. The thesis thus addresses the modelling of the perception-action schemas involved in rhythmic control, the modelling of temporal units, and the implementation and evaluation of a rhythm control system.
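The rhythmic control points mentioned here build on the P-center literature cited in the references below (Morton et al. 1976; Marcus 1981; Howell 1988), which relates a syllable's perceived moment of occurrence to its amplitude envelope. As a purely illustrative aid, here is a minimal Python sketch that estimates a P-center as the energy-weighted centroid of a smoothed envelope; the function name and the centroid weighting are assumptions for illustration, not the model the thesis will adopt.

import numpy as np
from scipy.signal import hilbert

def p_center_estimate(syllable, sr, smooth_ms=20):
    # Amplitude envelope via the analytic signal.
    env = np.abs(hilbert(syllable))
    # Smooth with a short moving average (smooth_ms milliseconds).
    win = max(1, int(sr * smooth_ms / 1000))
    env = np.convolve(env, np.ones(win) / win, mode="same")
    # Energy-weighted temporal centroid, in seconds from syllable onset.
    t = np.arange(len(env)) / sr
    return float(np.sum(t * env) / np.sum(env))

Tap events captured from a controller could then be aligned to such estimated P-centers rather than to acoustic onsets, which is one way to operationalize the rhythmic control points discussed above.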
The expected results are both theoretical and practical:
• perceptual experiments will relate the different temporal units to one another;
• phonological theories of the organization of the phonatory gesture will be put to the test with a new experimental paradigm;
• a new synthesizer will be built;
• a set of methods for gestural control of synthesis, new gestures and suitable interfaces will be developed and tested in the target application tasks, namely learning natural control of intonation contours by means of chironomy for foreign language acquisition (English, French, Mandarin), and learning chironomic control of native-language intonation contours for voice substitution (artificial larynx).
Methodology: The phonological and phonetic theories of the temporal organization of the languages under study will be considered within the paradigm of performative synthesis. The study of the relations between rhythmic control points, perceptual centers, tapping gestures and phonological units involves modelling and experimentation, with subjects performing perception-action tasks. The methodology here is that of experimental psychology and experimental phonetics: corpus design, test protocols, testing, statistical analyses. A synthesizer using the new rhythmic control paradigms will be developed. The methodology here is that of audio and speech signal processing and computer science, from design through to programming. A set of methods for gestural control of prosodic rhythm and timing will thus be developed and tested in the target application tasks. These methods cover both the gestures and the control interfaces and belong to computer science, in the field of human-machine interfaces.
Prerequisites: This topic lies at the interface of speech synthesis and human-machine interfaces, prosody, perception and musical performance. It requires general knowledge of digital audio signal processing and of computer music or human-machine interfaces. Part of the work will involve software development. Knowledge of voice and speech, of phonetics and phonology, and of experimental psychology or cognitive science will be necessary. Applications with an initial background in computer science and signal processing, as well as those with an initial background in linguistics, phonetics or cognitive science, will be considered. Training will be provided in whichever areas are less familiar.
Supervision: Christophe d'Alessandro, DR CNRS, head of the LAM team, Institut Jean Le Rond d'Alembert, Sorbonne Université, christophe.dalessandro@sorbonne-universite.fr. This doctoral project is part of the ANR Gepeto project, in collaboration with LPP (Sorbonne Nouvelle) and GIPSA-Lab, Université de Grenoble. The contract starts as soon as possible (from October 2020).
References:
• Delalez, S. and d'Alessandro, C. (2017). 'Vokinesis: syllabic control points for performative singing synthesis', Proc. NIME'17, pp. 198-203.
• Xiao, X., Locqueville, G., d'Alessandro, C. and Doval, B. (2019). 'T-Voks: Controlling Singing and Speaking Synthesis with the Theremin', Proc. NIME'19, June 3-6, 2019, Porto Alegre, Brazil, pp. 110-115.
• Delalez, S. and d'Alessandro, C. (2017). 'Adjusting the Frame: Biphasic Performative Control of Speech Rhythm', Proc. INTERSPEECH 2017, Stockholm, Sweden, August 2017, pp. 864-868.
• d'Alessandro, C., Rilliard, A. and Le Beux, S. (2011). 'Chironomic stylization of intonation', J. Acoust. Soc. Am., 129(3), pp. 1594-1604.
• d'Alessandro, C., Feugère, L., Le Beux, S., Perrotin, O. and Rilliard, A. (2014). 'Drawing melodies: evaluation of chironomic singing synthesis', J. Acoust. Soc. Am., 135(6), pp. 3601-3612.
• Chow, I., Belyk, M., Tran, V. and Brown, S. (2015). 'Syllable synchronization and the P-center in Cantonese', 49, pp. 55-66.
• Fowler, C. (1979). ''Perceptual centers' in speech production and perception', Perception & Psychophysics, 25, pp. 375-388.
• Howell, P. (1988). 'Prediction of P-center location from the distribution of energy in the amplitude envelope', Perception & Psychophysics, 43, pp. 90-93.
• MacNeilage, P. F. (1998). 'The frame/content theory of evolution of speech production', Behavioral and Brain Sciences, 21, pp. 499-546.
• Marcus, S. M. (1981). 'Acoustic determinants of perceptual center (P-center)', Perception & Psychophysics, 30, pp. 247-256.
• Morton, J., Marcus, S. and Frankish, C. (1976). 'Perceptual centers (P-centers)', Psychological Review, 83(5), pp. 405-408.
• Pompino-Marschall, B. (1989). 'On the psychoacoustic nature of the P-center phenomenon', Journal of Phonetics, 17, pp. 175-192.
• Rapp-Holmgren, K. (1971). 'A study of syllable timing', STL-QPSR, 12(1), pp. 014-019.
• Repp, B. H. (2005). 'Sensorimotor synchronization: A review of the tapping literature', Psychonomic Bulletin & Review, 12(6), pp. 969-992.
• Repp, B. H. and Su, Y. H. (2013). 'Sensorimotor synchronization: A review of recent research', Psychonomic Bulletin & Review, 20, pp. 403-452.
• Wagner, P. (2008). 'The Rhythm of Language and Speech: Constraining Factors, Models, Metrics and Applications', Habilitation thesis, Rheinische Friedrich-Wilhelms-Universität Bonn.
6-18 | (2020-09-24) PhD offer: Deep models for recognition and analysis of spontaneous speech, Grenoble, France *Subject*
6-19 | (2020-10-05) Research Assistant/Associate in Spoken Language Processing, Cambridge University, UK
Research Assistant/Associate in Spoken Language Processing x 2 (Fixed Term)
Speech Research Group, Cambridge University Engineering Department, UK
6-20 | (2020-10-05) 2 positions at Radboud University, Nijmegen, The Netherlands
We have two vacancies for speech technology employees:
6-21 | (2020-10-08) Job offer: 1-year postdoc position at LSP (ENS Paris), France
Job offer: 1-year postdoc position at LSP (with a possibility of a 1-year extension)
The Laboratoire des Systèmes Perceptifs (LSP, ENS Paris / CNRS, https://lsp.dec.ens.fr/en) is offering a postdoc position for the ANR project fastACI ('Exploring phoneme representations and their adaptability using fast Auditory Classification Images'), supervised by Léo Varnet (leo.varnet@ens.psl.eu). The fastACI project aims to develop a fast and robust experimental method to visualize and characterize the auditory mechanisms involved in phoneme recognition. The fastACI method relies on a stimulus-response model combined with a reverse correlation (revcorr) experimental paradigm. This approach, which produces an 'instant picture' of a participant's listening strategy in a given context, has already yielded conclusive results, but remains very time-consuming. The main objectives of this postdoc contract will therefore be (1) to improve the efficiency of the process using advanced supervised learning techniques (e.g. sparse priors on a smooth basis) and an online adaptive protocol (e.g. Bayesian optimisation), and then (2) to use this technique to map the phonemic representations used by normal-hearing listeners. For this purpose, a large number of experiments on phoneme-in-noise categorization tasks (e.g. /aba/ vs. /ada/) will be carried out, in order to ensure the broadest possible mapping of the French phonological inventory. As a second step, the new tool will be used to explore the adaptability of speech comprehension in the case of sensorineural hearing loss and noisy backgrounds. The post-doc will be involved in all activities in line with the project, including data collection, coding and statistical analyses. The post-doc will also coordinate the work of trainees and students involved in the project, and contribute significantly to the publication of the findings.
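For readers unfamiliar with reverse correlation, the sketch below shows the basic, unregularized way an Auditory Classification Image can be estimated from phoneme-in-noise trials. It is a minimal Python illustration only; the project's actual estimator uses regularized supervised learning (e.g. sparse priors on a smooth basis), and the function name here is hypothetical.

import numpy as np

def classification_image(noises, responses):
    # noises    : array (n_trials, n_freq, n_time), spectrogram of the
    #             noise added on each trial
    # responses : array (n_trials,), listener's answer on each trial,
    #             e.g. 0 for /aba/ and 1 for /ada/
    noises = np.asarray(noises, dtype=float)
    responses = np.asarray(responses)
    # The difference of the mean noise fields between the two response
    # classes highlights the spectro-temporal regions where noise energy
    # pushed the listener toward one phoneme or the other.
    return noises[responses == 1].mean(axis=0) - noises[responses == 0].mean(axis=0)

In practice this naive estimate needs thousands of trials, which is exactly the cost the project's regularized and adaptive methods aim to reduce.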
Required profile:
- Background in psychoacoustics and/or machine learning (the candidate should preferably hold a PhD in one of these fields)
- Strong skills in statistical data processing (in particular supervised learning algorithms) and practical knowledge of psychophysics
- Basic understanding of psychoacoustics and psycholinguistics
- Good knowledge of Matlab programming (other languages such as R can also be useful)
- Strong communication skills in English (the working language in the lab) and French (interactions with participants, etc.)
Duration: 12 or 24 months
Start: early 2021
Net salary: ~ 2100 €/month
Application procedure: Applications must include a detailed CV, a motivation letter, a link to (or copy of) the PhD thesis and the PhD viva report, plus the email contacts of 2 referees. Applications are to be sent to Léo Varnet (leo.varnet@ens.psl.eu) before 08/11 (interviews should take place on 18/11 by videoconference).
6-22 | (2020-10-20) PhD grant at INRIA Nancy, France Privacy preserving and personalized transformations for speech recognition
This research position fits within the scope of a collaborative project (funded by the French National Research Agency) involving several French teams, among which is the MULTISPEECH team of Inria Nancy - Grand Est. One objective of the project is to transform speech data so as to hide certain speaker characteristics (such as voice identity, gender, emotion, ...), so that the transformed data can be shared safely while preserving speaker privacy. The shared data is to be used to train and optimize models for speech recognition. The selected candidate will collaborate with other members of the project and will participate in the project meetings.
Scientific context: Over the last decade, great progress has been made in automatic speech recognition [Saon et al., 2017; Xiong et al., 2017], owing to the maturity of machine learning techniques (e.g., advanced forms of deep learning), the availability of very large datasets, and the increase in computational power. Consequently, speech recognition is now spreading into many applications, such as virtual assistants (for instance Apple's Siri, Google Now, Microsoft's Cortana, or Amazon's Alexa), which collect, process and store personal speech data on centralized servers, raising serious concerns about the privacy of their users' data. Embedded speech recognition frameworks have recently been introduced to address privacy issues during the recognition phase: a (pretrained) speech recognition model is shipped to the user's device so that processing can be done locally, without the user sharing their data. However, speech recognition technology still has limited performance in adverse conditions (e.g., noisy environments, reverberated speech, strong accents) and thus needs improvement. This can only be achieved with large speech corpora that are representative of the actual users and of the various usage conditions. There is therefore a strong need to share speech data for improved training that benefits all users while preserving user privacy, which means at least keeping the speaker's identity and voice characteristics private [1].
Missions (objectives, approach, etc.): Within this context, the objective is twofold: first, to improve privacy-preserving transforms of speech data; second, to investigate additional personalized transforms, applied on the user's device, that increase speech recognition performance. In the proposed approach, each user's device does not share its raw speech data but a privacy-preserving transformation of it. Some private computations are handled locally, while cross-user computations may be carried out on a server using the transformed speech data, which protects the speaker's identity and some of his/her attributes (gender, sentiment, emotions, ...). More specifically, this relies on representation learning to separate the features of the user data that can expose private information from generic features useful for the task of interest, i.e., here, the recognition of the linguistic content. On this topic, recent experiments have relied on Generative Adversarial Networks (GANs) for a privacy-preserving transform [Srivastava et al., 2019] and on voice conversion approaches [Srivastava et al., 2020]. In addition, as devices are becoming more and more personal, there are opportunities to make speech recognition more personalized; recent studies have investigated approaches that take advantage of speaker information [Turan et al., 2020]. The candidate will investigate further approaches along these lines. Other topics, such as the impact and benefit of adding random noise in the transforms, will be part of the studies, as well as dealing with (hiding) some paralinguistic characteristics. Research directions and priorities will take into account new state-of-the-art results and ongoing activities in the project.
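To make the voice-conversion direction concrete, the sketch below illustrates the x-vector anonymization idea studied in [Srivastava et al., 2020]: the speaker's x-vector is replaced by a pseudo-speaker vector built from an external pool. This is a hedged toy version; the pool selection here is simply random, whereas the paper studies distance-based selection strategies, and the function name is hypothetical.

import numpy as np

def anonymize_xvector(xvec, pool, n_select=100, seed=None):
    # Build a pseudo-speaker by averaging x-vectors drawn from an
    # external public pool (random selection for simplicity only).
    pool = np.asarray(pool, dtype=float)      # (n_speakers, dim)
    xvec = np.asarray(xvec, dtype=float)      # (dim,)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(pool), size=min(n_select, len(pool)), replace=False)
    pseudo = pool[idx].mean(axis=0)
    # Rescale to the original vector's norm before resynthesis.
    return pseudo * (np.linalg.norm(xvec) / np.linalg.norm(pseudo))

Speech is then resynthesized from the pseudo x-vector together with the original linguistic features, so the content stays intelligible while the voice identity changes.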
Skills and profile:
- PhD or Master in machine learning or in computer science
- Background in statistics and in deep learning
- Experience with deep learning tools is a plus
- Good computer skills (preferably in Python)
- Experience in speech and/or speaker recognition is a plus
Bibliography: [Saon et al., 2017] G. Saon, G. Kurata, T. Sercu, K. Audhkhasi, S. Thomas, D. Dimitriadis, X. Cui, B. Ramabhadran, M. Picheny, L.-L. Lim, B. Roomi, and P. Hall: English conversational telephone speech recognition by humans and machines. Technical report, arXiv:1703.02136, 2017. [Srivastava et al., 2019] B. Srivastava, A. Bellet, M. Tommasi, and E. Vincent: Privacy preserving adversarial representation learning in ASR: reality or illusion? INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association , Sep 2019, Graz, Austria. [Srivastava et al., 2020] B. Srivastava, N. Tomashenko, X. Wang, E. Vincent, J. Yamagishi, M. Maouche, A. Bellet, and M. Tommasi: Design choices for x-vector based speaker anonymization. INTERSPEECH 2020, 21th Annual Conference of the International Speech Communication Association, Oct 2020, Shanghai, China. [Turan et al., 2020] T. Turan, E. Vincent, and D. Jouvet: Achieving multi-accent ASR via unsupervised acoustic model adaptation. INTERSPEECH 2020, 21th Annual Conference of the International Speech Communication Association, Oct 2020, Shanghai, China. [Xiong et al., 2017] W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, and G. Zweig. Achieving human parity in conversational speech recognition. Technical report, arXiv:1610.05256, 2017.
Additional information:
Supervision and contact: Denis Jouvet (denis.jouvet@inria.fr; https://members.loria.fr/DJouvet/)
Duration: 2 years
Starting date: autumn 2020
Location: Inria Nancy - Grand Est, 54600 Villers-lès-Nancy
Footnote [1]: Note that when sharing data, users may not want to share data conveying private information at the linguistic level (e.g., phone numbers, person names, ...). Such privacy aspects also need to be taken into account, but they are out of the scope of this project.
6-23 | (2020-10-21) Fully funded PhD position at Idiap, Martigny, Switzerland
There is a fully funded PhD position open at Idiap Research Institute on 'Speech
6-24 | (2020-10-23) PhD grant at Université Grenoble-Alpes, France
Within the framework of the Chair 'Bayesian Cognition and Machine Learning for Speech
6-25 | (2020-10-26) Post-doctoral position at the Beckman Institute for Advanced Science and Technology, University of Illinois, Urbana-Champaign, USA
Postdoctoral position in Mobile Sensing and Child Mental Health
Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign
Our interdisciplinary research team at the Beckman Institute for Advanced Science and Technology is developing and applying innovative tools and methods from mobile sensing, signal processing and machine learning to gain insight into the dynamic processes underlying the emergence of disturbances in child mental health. We have engineered a wearable sensing platform that captures speech, motion, and physiological signals of infants and young children in their natural environments, and we are applying data-driven machine-learning approaches and dynamic statistical modeling techniques to large-scale, naturalistic, and longitudinal data sets to characterize dynamic child-parent transactions and children's developing stress regulatory capacities, and to ultimately capture reliable biomarkers of child mental health disturbance.
We seek outstanding candidates for a postdoctoral scholar position that combines multimodal sensing and signal processing, dynamic systems modeling, and child mental health. The ideal candidate would have expertise in one or more of the following domains related to wearable sensors:
In addition to joining a highly interdisciplinary team and making contributions to high impact research on mobile sensing and child mental health, this position provides strong opportunities for professional development and mentorship by faculty team leaders, including Drs. Mark Hasegawa-Johnson, Romit Roy Choudhury, and Nancy McElwain. In collaboration with the larger team, the postdoctoral scholar will play a central role in preparing conference papers and manuscripts for publication, contributing to the preparation of future grant proposals, and assisting with further development of our mobile sensing platform for use with infants and young children.
Applicants should have a doctoral degree in computer engineering, computer science, or a field related to data analytics of wearable sensors, as well as excellent skills in programming, communication, and writing. Appointment is for at least two years, contingent on first-year performance. The position start date is negotiable.
Please send a cover letter and CV to Drs. Mark Hasegawa-Johnson (jhasegaw@illinois.edu) and Nancy McElwain (mcelwn@illinois.edu). Applications will be considered until the position is filled, with priority given to applications submitted by November 15th.
The University of Illinois is an Equal Opportunity, Affirmative Action employer. Minorities, women, veterans and individuals with disabilities are encouraged to apply. For more information, visit http://go.illinois.edu/EEO.
6-26 | (2020-10-28) TWO positions at Trinity College Dublin, Ireland
We have openings for TWO positions at Trinity College Dublin, Ireland, available from Dec 1st, 2020, for 14 months. We are seeking:
A Research Assistant (qualified to Masters level)
A Research Fellow (holds a PhD)
The Project: RoomReader is a project led by Prof. Naomi Harte in TCD and Prof. Ben Cowan in UCD, Ireland. The research explores and models online interactions, and is funded by the Science Foundation Ireland Covid-19 Rapid Response Call. The successful candidates will work with a team to drive research into multimodal cues of engagement in online teaching scenarios. The work involves a collaboration with Microsoft Research Cambridge and Microsoft Ireland.
The Research Assistant will have a psychology/linguistics/engineering background (we are flexible) and will be tasked with researching and designing a new online task to elicit speech-based interactions relevant to online teaching scenarios (think multi-party MapTask or Diapix, but different). They will also be responsible for the capture of that dataset and its subsequent editing/labelling for deployment in the project and eventual sharing with the wider research community. Annual gross salary: up to €34,930 per annum, depending on experience.
The Research Fellow needs a background, including a PhD, in deep learning and the modelling of multimodal cues in speech. Their previous experience might be in conversational analysis, multimodal speech recognition or other areas. They should have a proven track record with publications commensurate with career stage. Annual gross salary: up to €50,030, depending on experience.
The project starts on Dec 1st, and the positions can start from that date and continue for 14 months. Please email nharte@tcd.ie for a more detailed description of either role, or to discuss. I am open to a person remote-working for the remainder of 2020, but the ideal candidate will be in a position to move to Ireland for Jan 2021 and work with the team in TCD.
Sigmedia Research Group @ Trinity College Dublin, Ireland
The Signal Processing and Media Applications (aka Sigmedia) Group was founded in 1998 in Trinity College Dublin, Ireland. Originally focused on video and image processing, the group today spans research across all aspects of media: video, images, speech and audio. Prof. Naomi Harte leads the Sigmedia research endeavours in human speech communication. The group has active research in audio-visual speech recognition, evaluation of speech synthesis, multimodal cues in human conversation, and birdsong analysis. The group is interested in all aspects of human interaction, centred on speech. Much of our work is underpinned by signal processing and machine learning, but we also have researchers with backgrounds in the linguistic and psychological aspects of speech processing to keep us all grounded.
6-27 | (2020-10-30) 3 Tenure Track Professors (W2) at Saarland University, Germany
6-28 | (2020-10-30) 2 Research Assistant/Associate Posts at Cambridge University, Great Britain 2 Research Assistant/Associate Posts in Spoken Language Processing at Cambridge University (Fixed Term)
Applications are invited for two Research Assistant/Research Associate positions in the Department of Engineering, Cambridge University, to work on the EPSRC-funded project Multimodal Video Search by Examples (MVSE). The project is a collaboration between three universities, Ulster, Surrey and Cambridge, with the BBC as an industrial partner. The overall aim of the project is to enable effective and efficient multimodal video search of large archives, such as BBC TV programmes.
The research associated with these positions will focus on deriving representations for all the information contained within the video's speech signal, and on integrating them with other modalities. The forms of representation will include both voice analytics (e.g. speaker and emotion) and topic and audio content analytics (e.g. word-sequence and topic classification and tracking). The position will involve close collaboration with Surrey and Ulster Universities to integrate these with other information sources, video and the audio scene, to yield a flexible and efficient video search index.
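As a rough illustration of how per-modality analytics could feed a single search index, the sketch below late-fuses two embeddings into one query-by-example vector. The MVSE project's actual index design is not described in this announcement, so the fusion scheme, weights and names here are purely illustrative assumptions.

import numpy as np

def index_entry(voice_emb, topic_emb, w_voice=0.5, w_topic=0.5):
    # L2-normalize each modality so neither dominates the index,
    # then weight and concatenate into a single search vector.
    def l2(v):
        v = np.asarray(v, dtype=float)
        return v / np.linalg.norm(v)
    return np.concatenate([w_voice * l2(voice_emb), w_topic * l2(topic_emb)])

Query-by-example search then reduces to a nearest-neighbour lookup (e.g. cosine similarity) between a query clip's vector and the stored index vectors.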
Fixed-term: The funds for this post are available until 31 January 2024 in the first instance
Closing date: 1st December 2020
Full information can be found at: http://www.jobs.cam.ac.uk/job/27458/
6-29 | (2020-11-02) Fully-funded PhD studentships at the University of Sheffield, Great Britain Fully-funded PhD studentships in Speech and NLP at the University of Sheffield *******************************************************************************************************
UKRI Centre for Doctoral Training (CDT) in Speech and Language Technologies (SLT) and their Applications
Department of Computer Science Faculty of Engineering University of Sheffield, UK
Fully-funded 4-year PhD studentships for research in speech technologies and NLP
** Applications now open for September 2021 intake **
Deadline for applications: 31 January 2021.
Speech and Language Technologies (SLTs) are a range of Artificial Intelligence (AI) approaches which allow computer programs or electronic devices to analyse, produce, modify or respond to human texts and speech. SLTs are underpinned by a number of fundamental research fields including natural language processing (NLP / NLProc), speech processing, computational linguistics, mathematics, machine learning, physics, psychology, computer science, and acoustics. SLTs are now established as core scientific/engineering disciplines within AI and have grown into a world-wide multi-billion dollar industry.
Located in the Department of Computer Science at the University of Sheffield, a world-leading research institution in the SLT field, the UKRI Centre for Doctoral Training (CDT) in Speech and Language Technologies and their Applications is a vibrant research centre that also provides training in engineering skills, leadership, ethics, innovation, entrepreneurship, and responsibility to society.
Apply now: https://slt-cdt.ac.uk/apply/
The benefits:
About you: We are looking for students from a wide range of backgrounds interested in speech and NLP.
Applying: Applications are now sought for the September 2021 intake. The deadline is 31 January 2021.
Applications will be reviewed within 6 weeks of the deadline and short-listed applicants will be invited to interview. Interviews will be held in Sheffield or via videoconference.
See our website for full details and guidance on how to apply: https://slt-cdt.ac.uk
For an informal discussion about your application please contact us by email at: sltcdt-enquiries@sheffield.ac.uk