ISCA - International Speech
Communication Association

ISCApad #228

Saturday, June 10, 2017 by Chris Wellekens

6 Jobs

6-1

(2017-01-12) Internship INA Paris: Speech/music segmentation of multimedia documents using deep neural networks

Speech/Music Segmentation of Multimedia Documents Using Deep Neural Networks

Final-year engineering or Master 2 (second-year Master's) internship, 2016-2017

Keywords: Deep Learning, Audio Segmentation, Machine Learning, Music Information Retrieval, Open Data

Context

The mission of the Institut national de l'audiovisuel (Ina, the French national audiovisual institute) is to archive and promote France's audiovisual heritage (radio, television, and web media). To date, more than 15 million hours of television and radio documents have been preserved, of which 1.5 million hours have been digitized. Given this volume of data, a manual, systematic, and detailed description of all the archives is not technically feasible. Automatic content analysis techniques are therefore needed to make the best use of this mass of data.

Internship objectives

Speech/music segmentation (SMS) consists of segmenting an audio stream into homogeneous zones of speech and music. This step is required upstream of high-level indexing tasks such as speech recognition, speaker recognition, and track or musical genre recognition. For these reasons, the task has attracted considerable interest in the speech processing and music indexing communities.

At Ina, SMS systems serve three main use cases. First, they quickly locate areas of interest within media, streamlining the archive description process carried out manually by documentalists. Manual description of the archives is costly and is done at varying levels of detail: television news programs are described more finely than older radio collections. SMS systems can therefore also ease navigation in under-documented archive collections. The last use case is segmentation into music tracks, which consists of detecting the beginning and end of each track. This task makes it possible to measure the duration of the musical excerpts present in the archives, and thus to pay the relevant authors' rights societies when the archives are commercialized.

To date, a number of situations remain difficult for SMS systems. One is the distinction between spoken and sung voice, particularly in musical styles where the spectral properties of the sung and spoken voice are similar. Another difficulty arises when speech is superimposed on music, which happens fairly frequently in radio and television programs. A further difficulty for current systems is the granularity of the temporal segmentation, which is generally on the order of one second.

The objective of the internship is to design systems based on deep neural networks for the speech/music segmentation of audiovisual archives. The proposed methods must handle the diversity of Ina's archives (radio archives from the 1930s to the present day). One part of the internship will be devoted to analyzing existing corpora and building an annotated corpus (performer, track, genre, speaker, ...) that gives as much control as possible over all the parameters tested during evaluations. The other part will be devoted to developing deep neural network architectures for SMS, continuing ongoing work that uses convolutional neural networks. The programming language used for this internship will be Python. The intern will have access to Ina's computing resources (cluster and GPU servers).

Internship conditions

The internship will last 6 months and take place within Ina's research team. It will be based at the Bry2 site, located at 18 Avenue des frères Lumière, 94366 Bry-sur-Marne. The intern will be supervised by David Doukhan (ddoukhan@ina.fr) and Jean Carrive (jcarrive@ina.fr) and will receive a monthly stipend of 527.75 euros.

Bibliography

Jimena, R. L., Hennequin, R., & Moussallam, M. (2015). Detection and characterization of singing voice using deep neural networks.

Peeters, G. (2007). A generic system for audio indexing: Application to speech/music segmentation and music genre recognition. In Proc. DAFX (Vol. 7, pp. 205-212).

Pinto, N., Doukhan, D., DiCarlo, J. J., & Cox, D. D. (2009). A high-throughput screening approach to discovering good forms of biologically inspired visual representation. PLoS Comput Biol, 5(11), e1000579.

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.

6-2

(2017-01-15) Postdoctoral Positions in Linguistics/Sciences du Langage, LPL, CNRS, Aix-Marseille, France

Call for Postdoctoral Positions in Linguistics/Sciences du Langage

Institution: Laboratoire Parole et Langage (CNRS, Aix-Marseille Université)

Location: Aix-en-Provence, France

No. of positions: 2

Duration: 18-24 months

Application deadline: 1 March 2017

The Laboratoire Parole et Langage (CNRS, Aix-Marseille Université) invites applications for two postdoctoral positions to be supported by a grant from the A*MIDEX Foundation. The funded project explores the relationship between social variables and linguistic representation, and seeks to develop and extend an explicit cognitive model that accounts for the effects of socio-indexical cues on production and perception. The empirical basis for the project involves a combination of experimentation (production, perception, ERP), corpus analysis, and computational modeling. The selected applicants will engage with an interdisciplinary team from the Institute of Language, Communication, and the Brain which includes experts in neuroscience, psychology, computer science, and mathematics, among other areas.

A special emphasis of the project concerns issues in prosody and intonation, so an interest in, or experience with, prosody is highly desirable. In addition, the ideal applicant will have experience in one or more of the following areas:

- Sociolinguistics (quantitative)

- Machine learning and statistical modeling (esp. sound structure/phonetics)

- Design and analysis of speech corpora

- Prosody and meaning (esp. information structure)

Knowledge of French is not essential. Each postdoctoral appointment is for approximately 18-24 months depending on the starting date. Candidates must have completed all PhD requirements before the starting date. The starting date is flexible, though the position should be filled by 1 June 2017.

Applications should include (i) a cover letter that relates the applicants’ experience and interests to the project, (ii) a comprehensive CV, (iii) a PDF copy of all publications or a list of links where these can be accessed, and (iv) the names and contact information of at least two references.

Applications in French or English may be sent by email to Oriana Reid-Collins at oriana.reid-collins@lpl-aix.fr.

For further inquiries regarding the position or the project, please contact James Sneed German (principal investigator) at james.german@lpl-aix.fr.

6-3

(2017-01-20) Postdoc in Mental Health, Affective Computing and Machine Learning at CMU, Pittsburgh, PA, USA

Postdoc in Mental Health, Affective Computing and Machine Learning

Carnegie Mellon University, School of Computer Science

University of Pittsburgh, School of Medicine

*** US citizenship or a green card is required to be eligible for consideration. ***

The Multimodal Communication and Machine Learning Laboratory (MultiComp Lab) at Carnegie Mellon University is seeking creative and energetic applicants for a two-year postdoctoral position. This opportunity is part of an NIMH-funded training program based at the University of Pittsburgh School of Medicine. The position includes a competitive salary with full benefits and travel resources.

Using recent progress in machine learning and artificial intelligence, the postdoc will study patients' multimodal behaviors (verbal, visual and vocal) during semi-structured clinical interviews to identify behavioral markers of mental health disorders (e.g., depression, schizophrenia, suicidal ideation). The postdoc will work under the supervision of Dr. Louis-Philippe Morency (CMU MultiComp Lab's director), in collaboration with clinicians and researchers at the University of Pittsburgh's Western Psychiatric Institute and Clinic.

The successful applicant will have extensive research experience in automatic multimodal behavior analysis in the mental health domain, including facial and gesture analysis, acoustic signal processing, linguistic computation and multimodal machine learning.

Required

PhD in computer science or mental-health related field (at the time of hire)
Research experience in human behavior analysis, affective computing and machine learning
US citizenship or a green card is required to be eligible for consideration

Desired

Publications in top machine learning, speech processing and/or computer vision conferences and journals.
Research involving clinical patients with mental health disorders (e.g., depression, schizophrenia, suicidal ideation)
Experience mentoring graduate and undergraduate students

Job details

Preferred start date: May 1st, 2017 (negotiable)
Candidate will work under the supervision of Dr. Louis-Philippe Morency, MultiComp Lab’s director, at CMU Pittsburgh campus.
Research will be performed in collaboration with clinicians and researchers at University of Pittsburgh’s Western Psychiatric Institute and Clinic.
Competitive salary with full benefits and travel resources.

How to apply

Email applications should be sent to morency@cs.cmu.edu with the title “Postdoc application – NIMH program”
A single PDF file titled FirstNameLastName.pdf should be attached to the email, including:
- a brief cover letter (with expected date of availability),
- a CV including list of publications and email addresses of 3 references,
- two representative publications (including full citation information)

=====================

Louis-Philippe Morency

Assistant Professor

School of Computer Science

Carnegie Mellon University

Multimodal Communication and Machine Learning Laboratory

https://www.cs.cmu.edu/~morency/

6-4

(2017-01-21) Internship at INRIA Bordeaux, France

6-month internship for Master 2 students at INRIA Bordeaux

Internship topic: Speech analysis for the differential diagnosis between Parkinson's disease and multiple system atrophy

Description:

Parkinson's disease (PD) and multiple system atrophy (MSA) are neurodegenerative diseases. The latter belongs to the group of atypical parkinsonian disorders and carries a poor prognosis. In the early stages of the disease, the symptoms of PD and MSA are very similar, especially for MSA-P, in which the parkinsonian syndrome predominates. The differential diagnosis between MSA-P and PD can be very difficult in the early stages, yet a reliable early diagnosis is important for the patient because the prognoses diverge. Indeed, despite recent efforts, no valid objective marker is currently available to guide the clinician in this differential diagnosis. The need for such markers is therefore very high in the neurology community, particularly given the severity of the prognosis of MSA-P.

Speech disorders, commonly referred to as dysarthria, are an early symptom common to both diseases, though of different origin. Our approach is to use dysarthria, through digital processing of patients' voice recordings, as a vehicle for distinguishing between PD and MSA-P in the early stages of the disease.

The goal of the internship is to use established voice perturbation measures, together with techniques recently developed by Inria's GeoStat team, to carry out a preliminary experimental study of the discriminative power of these measures. The study will be conducted on existing medical databases.

The internship would lead to a PhD scholarship under the ANR grant that funds this research project. The clinical partners of the project are internationally renowned centers for PD and MSA at CHU Bordeaux and CHU Toulouse.

Internship supervisor: Dr. Khalid Daoudi, GeoStat team (khalid.daoudi@inria.fr).

Location: INRIA Bordeaux Sud-Ouest (http://www.inria.fr/bordeaux), Bordeaux, France.

Duration: 6 months

Stipend: 500 euros/month

Required skills: Good knowledge of speech/signal processing as well as C++ and Matlab programming is required. Knowledge of statistical learning (machine learning) would be a strong plus.

Applications should be sent to khalid.daoudi@inria.fr

6-5

(2017-01-22) 6-month internship at LIA Avignon, France

6-month internship at LIA Avignon

The proposed internship is part of a collaboration between the company MediaWen International and the Laboratoire d'Informatique d'Avignon (LIA). MediaWen offers online solutions for subtitling, translating, and dubbing web video. A collaborative work platform brings together the various technology components and makes it possible to speed up or automate the various processing steps.

In this context, MediaWen and the LIA wish to explore the feasibility and value of a technology component for automatic spoken language detection. Its two main novelties will be the ability to add a language easily from a small set of audio examples, and an interactive strategy in which the criterion to optimize is the operator's working time (for the same production quality).

The objective of the internship is to build this component on the ALIZE platform (developed in C++), which has already been used for several implementations of spoken language recognition systems. A solution based on the i-vector paradigm will be chosen. The selected approach will first be developed and tested using the laboratory's internal data (notably NIST data) while simulating the operator's responses. It will then be integrated into MediaWen's tools and tested on the corresponding data.

Continuing on to a PhD is possible, depending on how successful the internship is.

The intern will be supervised at the LIA by Driss MATROUF (MCF-HDR) and Jean-François BONASTRE (Professor), and will benefit from the support of MediaWen's specialists, who are fully involved in the internship.

Profile and level:

Master's degree in computer science, mathematics, or signal processing. Strong software development skills, including knowledge of C++, are required.

Motivation, scientific curiosity, and rigor are expected.

Duration:

5 to 6 months (extension possible)

Stipend:

~530 euros/month (statutory allowance for a Master's-level internship)

Contact:

driss.matrouf@univ-avignon.fr

6-6

(2017-01-27) Two R&D engineers at INRIA Nancy, France

Our team at Inria Nancy is recruiting two R&D engineers for an ambitious industrial
project on voice processing in audiovisual contents. The mission is to develop, evaluate,
and improve software prototypes based on the latest scientific advances and transfer them
to the partner software development company and the sound creation studio which initiated
the project. The resulting commercial software will be a reference in the professional
audiovisual field.

*R&D engineer A*
Start date: May 2017
Duration: 18 months
Missions:
- speaker recognition based on ALIZE
- speech enhancement by multichannel deep learning
Profile:
- MSc or PhD in computer science, signal processing, or machine learning
- operational skills in software engineering (version control, tests, software quality)
and Python 3 language (numpy, scipy, Keras)
- experience in speaker recognition or speech enhancement would be a plus
Salary: 2048 to 2509 € net/month depending on experience

*R&D engineer B*
Start date: May-June 2017
Duration: 16 months
Missions:
- speech recognition based on Kaldi
- concatenative speech synthesis
Profile:
- MSc or PhD in computer science, signal processing, or machine learning
- operational skills in software engineering (version control, tests, software quality)
and Python 3 (numpy, scipy, Keras) and Java (Java SE 8) languages
- experience in speech recognition or synthesis would be a plus
Salary: 2048 to 2509 € net/month depending on experience

*Work environment*
Nancy is one of the top cities for young engineers in France with cheap accommodation, a
vibrant cultural scene, and good connections to Paris (1.5h), Luxembourg (1.5h), Belgium,
and Germany. Inria Nancy is a 500-people research institute dedicated to computer
science. The Multispeech team (https://team.inria.fr/multispeech/) is a 30-people group
covering various fields of speech science, with a strong emphasis on machine learning and
signal processing.

*To apply*
Send a CV, a motivation letter and 1 to 3 optional recommendation letters to
emmanuel.vincent@inria.fr. Mention which position(s) you are applying for. Applications
will be assessed on a rolling basis until March 17. Please apply as soon as possible
before that date.

6-7

(2017-02-01) Associate Professor (Maître de conférences), École Centrale Marseille, France

École Centrale Marseille is opening a Maître de Conférences (associate
professor) position in Computer Science in the 2017 recruitment round. The
teaching and research profile is given below (the position is currently
being published).

Useful links:
Ecole Centrale Marseille https://www.centrale-marseille.fr/
Laboratoire d'Informatique Fondamentale: http://www.lif.univ-mrs.fr/

=================================
Job profile: Maître de Conférences in computer science at ECM

- Teaching

The successful candidate must be able to deliver engaging teaching to
generalist engineering students. He or she will join the computer science
teaching team to teach core curriculum courses (algorithms,
object-oriented modeling, data storage and processing), take part in
computer science teaching in the second- and third-year options, propose
projects and supervise groups of students throughout their work, and
contribute to continuing education and work-study programs. He or she
will play a role in running, coordinating, and developing the courses,
and will take part in the cross-disciplinary activities of École Centrale
Marseille.

Contact: Pascal Préa (pascal.prea@centrale-marseille.fr)

- Research:

The candidate will be expected, as a priority, to carry out research within
projects initiated between the Laboratoire d'Informatique Fondamentale de
Marseille (LIF) and École Centrale Marseille (ECM). These two projects
address, on the one hand, the processing of massive data, whether through
classification, optimization, or visualization models, and, on the other
hand, deep learning, representation learning, and the associated
application domains. The projects cover topics relevant in particular to
the ACRO, BDA, QARMA, and TALEP teams of the LIF.

Beyond this thematic priority, any excellent application within the scope
of the LIF is eligible.

In addition, the candidate's ability to strengthen the technological
dimension of the research carried out in these projects and to take part
in industrial partnerships is a definite plus.

Research contact: Thierry Artières (thierry.artieres@centrale-marseille.fr)

6-8

(2017-02-13) 18-month postdoctoral fixed-term contract: *Multimodal analysis of audiovisual content*

*18-MONTH POSTDOCTORAL FIXED-TERM CONTRACT (CDD)*
*Multimodal analysis of audiovisual content*

The LINKMEDIA team (IRISA & Inria Rennes) develops future technologies for
describing and accessing multimedia content through content analysis. The
team's areas of expertise are computer vision, speech and language
processing, audio content processing, information retrieval, and data
mining. In particular, the team takes part in the FUI project NexGenTV on
the analysis and enrichment of television content. Television is evolving
from the TV screen toward multi-screen applications in which viewers watch
television while browsing the web, looking for additional information, or
reacting on social networks. In this context, NexGenTV aims to provide
solutions for editing enriched multi-screen content through features such
as highlight detection, enrichment of programs with additional
information and, more generally, optimization of the user experience by
fostering interaction suited to the user's expectations. Within the
project, IRISA focuses on the analysis of audiovisual content, speech, and
social networks.

In this context, we wish to recruit a postdoctoral researcher specialized
in audiovisual content analysis to develop, study, and evaluate innovative
approaches to the analysis of people in television content. In particular,
the aim is to design multimodal (voice + face) approaches for both
detecting known people and linking videos featuring the same speaker. A
first line of work builds on the team's recent work on multimodal
representation learning with neural networks. The use of neural networks
for representing and comparing voices may also be studied. In a second
phase, the work will address how such models can be used to enrich live
content with excerpts from archived documents, combining speaker
identification and semantic relevance.

The research will be carried out in the LINKMEDIA team at IRISA (Rennes,
France), in close collaboration with the NexGenTV project partners, in
particular EURECOM.

The candidate must hold a PhD in a field close to the research topic,
preferably in one of the following areas: multimodal modeling, automatic
speech processing, speaker recognition, computer vision. The candidate is
also expected to strengthen the team's expertise in neural network
learning applied to multimedia content analysis.

To apply, please send a CV together with a cover letter.

Employer: Centre National de la Recherche Scientifique
Place of work: IRISA, Rennes
Contract: 18-month fixed-term contract (CDD), starting as soon as possible from March 2017
Salary: 2,815 euros gross per month
Contact: Guillaume Gravier (prenom.nom@irisa.fr)

6-9

(2017-02-20) Several positions of Research Engineers at Audio Analytic Labs, Cambridge, UK

Audio Analytic Labs, the research division of Audio Analytic Ltd, has several Research
Engineer positions currently open in the field of Automatic Sound Recognition.

These could be of interest to your PhD students or Post-Docs finishing their contracts in
your teams and looking to follow up with an industrial position.

The complete job specification is copied below.

We are also open to answering questions from people interested in our company but not yet
available for employment.

More generally, we are open to finding concrete and mutually beneficial ways to
collaborate with academic partners on research projects, either through joint projects
supported by specific funding, or via secondments and internships.

For more information about our company, please visit the company's website at
http://www.audioanalytic.com/ , or feel free to contact me directly.

I would be very grateful if you could propagate the attached job offer to your
institutions' career services, or if you could forward it directly to people who you
think may be directly interested in applying.

Hoping this will be useful, and of interest to your alumni.

Many thanks, and best regards,

- Sacha K.

Director of AALabs

AudioAnalytic Ltd.

INDUSTRIAL POSITION OPEN:

*Full Time Audio Analytics Research Engineer*

Location: Cambridge, Cambridgeshire, United Kingdom

Full-time, immediate start.

Audio Analytic Ltd. is leading the world of acoustically connected things. Our unique
software is used by smart home companies the world over to make devices aware of sounds
around them. If a smoke alarm goes off or a glass panel is broken by intruders while
no-one is at home, our software will immediately recognise the sound and tell the device
to alert the home owner and the smart home so they can both take appropriate protective
action. We give smart home owners sound peace of mind. More information is available on:

http://www.audioanalytic.com

We are looking for people who thrive as part of a dedicated and innovative team, love
tough challenges, and are passionate about audio/sound, DSP and Machine Learning.

Responsibilities

As part of our R&D team, you will contribute to researching and evaluating new algorithms
to push the limits of our unique sound recognition system. Responsibilities include
developing new algorithms in house, identifying and reporting on state of the art
methods, and evaluating both types of solutions on large scale field data sets.

Technical Skills

Must have either a Master's degree with 2 years of industrial experience or a PhD, in one of
the following topics: Digital Signal Processing of Audio Signals, Machine Learning
applied to Audio Signals, Automatic Speech/Speaker Recognition, Music Information
Retrieval, Acoustic Events Detection, Statistical Speech Synthesis, Thematic Indexing of
Audio tracks (e.g., Speaker Diarization, Acoustic Segmentation of Video Documents etc.).

Experience as a post-doc research engineer, either academic or industrial, will be a
significant plus.

Required:

Demonstrable skills in Digital Signal Processing and/or Machine Learning applied to
Audio Signals.

Demonstrable experience dealing with at least one type of Machine Learning algorithm
(e.g., Deep Neural Networks, Hidden Markov Models, Support Vector Machines, Decision
Trees etc.) applied to the processing of Audio Signals.

Scripting and algorithm prototyping: Python, bash.

Programming: C/C++ coding and code optimisation. CUDA/GPU programming a plus.

Development under Linux/Unix mandatory, Windows optional.

Desirable:

Hardware design knowledge a plus but not a requirement.

Demonstrable interest in porting DSP/Machine Learning algorithms to either embedded
platforms or high performance computing platforms a plus but not a requirement.

General Skills

Ability to deliver on research and evaluation methodology.

Good communication skills.

Excellent problem-solving skills.

Track record of academic publications a plus but not a requirement.

Enjoy working as a member of a team and using their own initiative.

Self-confident and highly motivated.

Ability to deal confidently with a variety of people at all levels.

Able to manage own workload and meet deadlines.

Good organisational skills.

Good standard of written and spoken English.

Remuneration

This is a great opportunity to join a successful company with a huge potential for
growth. The successful candidate will be compensated with an attractive package
appropriate to qualifications and experience, to include a competitive salary and stock
options.

How to Apply

To apply for this vacancy, please send a covering letter and copy of a recent CV to
jobs@audioanalytic.com, with reference AA-RES-ENG-2016 in the email title.

Please note that it is company policy not to accept job applications from recruitment
consultants.

6-10

(2017-02-21) Acting Assistant Professor, Department of Linguistics, University of Washington, WA, USA

Acting Assistant Professor, Department of Linguistics, University of Washington, Washington, USA, associated with the professional MS program and Ph.D. track in Computational Linguistics.

For additional details and to apply, please go to: http://ap.washington.edu/ahr/academic-jobs/position/aa22332/

Application deadline: May 31, 2017, open until filled. Priority will be given to applications received before March 1, 2017.

6-11

(2017-02-21) Several positions at Fluent.ai in Montreal, Canada

Fluent.ai is looking for both permanent full-time employees and interns. Please see this page: http://www.fluent.ai/careers/#toggle-id-3. Let me know if you have any questions, and I will be happy to answer them.

About Fluent.ai

Fluent.ai is a startup based in Montreal, Canada. We are working on new deep learning and related techniques to enable acoustic-only speech recognition. By associating speech to intent without requiring a speech-to-text translation, Fluent.ai opens a wide variety of new applications and provides higher accuracy and more robust performance compared to existing methods. We are looking to expand our technology and research teams and are inviting applications for various permanent and internship-based roles. Joining Fluent.ai provides you an opportunity to be an early team member leading work on an exciting, disruptive technology poised for rapid growth. The technology has already been validated by many academic experts as well as industrial customers in diverse sectors. Now we are looking for the right people to share our vision and hustle to achieve execution excellence in select sectors. You will be joining a diverse, dedicated, smart and fun team. We work hard, we don't always agree, but we always laugh out loud and we always move forward together.

What we offer: We offer a great working environment and a competitive mix of salary and options. We are keen to interact with talented people and will get back to the selected candidates quickly. We are an equal opportunity employer and value diversity at our company. We do not discriminate based on origin, religion, gender, age, sexual orientation, or disability.

Vikrant Tomar

Fluent.ai

6-12

(2017-02-22) Language modeling scientist at Siri team at Apple

Title: Language Modeling Scientist – Siri Speech team at Apple

Job Summary

Play a part in the next revolution in human-computer interaction. Contribute to a product that is redefining mobile computing. Create groundbreaking technology for large scale systems, spoken language, big data, and artificial intelligence. And work with the people who created the intelligent assistant that helps millions of people get things done — just by asking. Join the Siri Speech team at Apple.

The Siri team is looking for exceptionally skilled and creative Scientists and Engineers eager to get involved in hands-on work improving the Siri experience.

Key Qualifications

Experience building, testing, and tuning language models for ASR
Ability to implement experiments using scripting languages (Python, Perl, bash) and tools written in C/C++
Experience working with standard speech recognition toolkits (such as HTK, Attila, Kaldi, SRILM, OpenFST or equivalent proprietary systems)
Large scale data analysis experience using distributed clusters (e.g. MapReduce, Spark)

Description

The speech team is seeking a research scientist to participate in the language modeling effort for Siri. In order to estimate language model probabilities, you will make use of very large amounts of training text drawn from diverse sources. You will be part of a group that has responsibility for the entire domain of language modeling in multiple languages including, among other things, text processing, data selection, language model adaptation, neural network modeling, improving language model training infrastructure, experimenting with new types of language models etc.

Education

PhD or Masters in Computer Science or related field

3+ years of experience in language modeling for ASR

Apply online at jobs.apple.com

Search for: “Language Modeling Scientist”

6-13

(2017-02-28) Postdoctoral Researcher (Speech/Audio Processing), University of Eastern Finland, Joensuu Campus, Finland

*****
Postdoctoral Researcher (Speech/Audio Processing)

The University of Eastern Finland, UEF, is one of the largest multidisciplinary universities in Finland. We offer education in nearly one hundred major subjects, and are home to approximately 15,000 students and 2,800 members of staff. We operate on three campuses in Joensuu, Kuopio and Savonlinna. In international rankings, we are ranked among the leading 300 universities in the world.

The Faculty of Science and Forestry operates on the Kuopio and Joensuu campuses of the University of Eastern Finland. The mission of the faculty is to carry out internationally recognised scientific research and to offer research-education in the fields of natural sciences and forest sciences. The faculty invests in all of the strategic research areas of the university. The faculty's environments for research and learning are international, modern and multidisciplinary. The faculty has approximately 3,800 Bachelor's and Master's degree students and some 490 postgraduate students. The number of staff amounts to 560. http://www.uef.fi/en/lumet/etusivu

We are now inviting applications for

a Postdoctoral Researcher (Speech/Audio Processing), School of Computing, Joensuu Campus

The Machine Learning research group of the School of Computing at the University of Eastern Finland (http://www.uef.fi/en/web/cs) is looking for a highly motivated researcher to work in the group.

The current research topics in the group include speaker and language recognition, voice conversion, spoofing and countermeasures for speaker recognition, robust feature extraction, and analysis of environmental sounds. Prior experience in these topics is a plus, though we invite candidates widely from general speech/audio/language processing, machine learning or signal processing background. We expect the new Postdoctoral Researcher to bring in complementary skills and expertise.

The recruited Postdoctoral Researcher will take a major role in advancing research in one of the above-listed (or closely related) topics. He or she will also have a significant role in the supervision of students and certain administrative duties, and he or she will work closely with Associate Professor Kinnunen and the other members of the group. The position is strongly research-focused.

The School of Computing, located in Joensuu Science Park, provides modern research facilities with access to high-performance computing services. Our research group hosted the Odyssey 2014 conference (http://cs.uef.fi/odyssey2014/), is a partner in the ongoing H2020 funded OCTAVE project (https://www.octave-project.eu/) focused on voice biometrics, is a co-founder of the Automatic Speaker Verification and Countermeasures (ASVspoof) challenge series (http://www.spoofingchallenge.org/) and has hosted international summer schools. We actively take part in international benchmarking and other collaboration activities. We follow a multidisciplinary research perspective that aims at understanding the speech signal, as well as applying the acquired knowledge to new application areas.

A person to be appointed as a postdoctoral researcher shall hold a suitable doctoral degree awarded less than five years ago. The doctoral degree should be in spoken language technology, electrical engineering, computer science, machine learning or a closely related field. He/she should be comfortable with Unix/Linux, Matlab/Octave and/or Python, and with processing large datasets, and should have strong hands-on experience and a creative, out-of-the-box problem-solving attitude.

The position will be filled from May 1, 2017 until December 31, 2018. The continuation of the position will be agreed separately.

The positions of postdoctoral researcher shall always be filled for a fixed term (UEF University Regulations 31 §).

The salary of the position is determined in accordance with the salary system of Finnish universities and is based on level 5 of the job requirement level chart for teaching and research staff (€2,865.30/month). In addition to the job requirement component, the salary includes a personal performance component, which may be a maximum of 46.3% of the job requirement component.

For further information on the position, please contact: Associate Professor Tomi Kinnunen, email: tkinnu@cs.uef.fi, tel. +358 50 442 2647. For further information on the application procedure, please contact: Executive Head of Administration Arja Hirvonen, tel. +358 44 716 3422, email: arja.hirvonen@uef.fi.

A probationary period is applied to all new members of the staff.

The electronic application should contain the following appendices:

- a cover letter indicating the position to be applied for and a free-worded motivation letter
- a résumé or CV
- a list of publications
- copies of the applicant's academic degree certificates/diplomas, and copies of certificates/diplomas relating to the applicant's language proficiency, if not indicated in the academic degree certificates/diplomas
- the names and contact information of at least two referees

The application needs to be submitted no later than March 24, 2017 (by 24:00 EET) by using the electronic application form:

Apply for the job

The job ad and the application form can also be found at http://www.uef.fi/en/uef/en-open-positions (search for the position 'Postdoctoral Researcher (Speech/Audio Processing)').

6-14

(2017-02-28) Maître de conférences (MCF, associate professor) in computer science for the humanities, Sorbonne, Paris, France

A Maître de conférences (MCF, associate professor) position in computer science for the humanities, in particular in natural language and/or speech processing, is open at Université Paris-Sorbonne (www.paris-sorbonne.fr/IMG/pdf/27-7_mcf_766.pdf). The successful candidate will teach computer science in the various bachelor's and master's programmes of the Department of Computer Science, Mathematics and Applied Linguistics, within the UFR (faculty) of Sociology and Computer Science for the Humanities. He or she is expected to join one or more research themes of the computational linguistics team (www.stih.paris-sorbonne.fr/): (1) semantics, knowledge and corpora; (2) paralinguistics, cognition and physiology.

Contact: Claude.Montacie@Paris-Sorbonne.fr

6-15

(2017-03-14) PhD and postdoc positions at INRIA Nancy, France

Our team has several openings for PhD students and postdocs in the following deep
learning based fields:
- speech enhancement
- speech recognition
- environmental sound analysis

For details and to apply, see:
https://team.inria.fr/multispeech/category/job-offers/

Application deadline:
- April 15 for postdoctoral positions
- April 30 for PhD positions

--
Emmanuel Vincent
Multispeech Project-Team
Inria Nancy - Grand Est
615 rue du Jardin Botanique, 54600 Villers-lès-Nancy, France
Phone: +33 3 8359 3083 - Fax: +33 3 8327 8319
Web: http://members.loria.fr/evincent/

6-16

(2017-03-18) Fully-funded PhD Positions in Automatic Emotion Recognition at SUNY, Albany, NY, USA

Fully-funded PhD Positions in Automatic Emotion Recognition at SUNY

Application deadline: 22 March 2017 (**see below for more information**)

We have several PhD research assistantship positions available at the State University of New York, Albany. We are seeking highly creative and motivated applicants with a keen interest in doing research in human-centered technology, affective computing, and automatic emotion recognition using machine learning and multimodal signal processing techniques.

Requirements:
- A bachelor's degree in a relevant field (Electrical and Computer Engineering, Computer Science, Statistics, or related)
- Solid background in computer programming
- Proficiency in spoken and written English
- (Preferred) Knowledge of the following technologies: MATLAB, Python, Java, Perl, C++, Unity
- (Preferred) Previous coursework and/or practical experience in machine learning
- (Preferred) Solid background in mathematics and/or statistics

Interest in one of the following areas:
- Human-Centered and Affective Computing, Computational Human Behavior Analysis
- Machine Learning, Statistics, Applied Mathematics
- Speech Processing, Computer Vision

We expect:
- Keen interest in top level conference and journal publications
- Self-organized, team worker, with good communication skills

We offer:
- You will work at one of the leading U.S. Universities and have the opportunity to work towards your PhD in a group of excellent scientists
- Tuition, stipend, and fringe benefits
- You will get financial support to attend and present at top level international conferences
- Visas will be fully funded for international students

To apply, please send an email to Prof. Yelin Kim (yelinkim@albany.edu) including a CV and a research statement (max. 2 pages) by March 22, 2017. We have rolling admissions policies, so please apply as early as possible. Please give your email the subject “SUNY PhD Research Assistantship in Automatic Emotion Recognition.”

Please feel free to forward and share this with potentially interested candidates or with people who might know suitable candidates.

6-17

(2017-03-20) PhD position at IRISA Rennes, France

The Expression team of IRISA is recruiting a PhD candidate in computer science on the subject 'Universal speech synthesis through embeddings of massive heterogeneous data'. This work focuses on the following domains:

- Text-to-speech

- Deep learning

- High-dimensional indexing.

Details are given here: http://www.irisa.fr/en/offres-theses/universal-speech-synthesis-through-embeddings-massive-heterogeneous-data .

Application deadline: Monday, 3 April 2017.

Application process: send the following documents to gwenole.lecorve@irisa.fr, damien.lolive@irisa.fr and laurent.amsaleg@irisa.fr:

- CV

- Transcript of M.Sc. marks/grades

6-18

(2017-03-25) PhD position in spoken interaction systems, LIA, Avignon, France

***** PhD position in spoken interaction systems *****
at LIA/CERI, Univ. Avignon, Prof. F. Lefèvre and B. Jabaian

Improving spoken interaction with the digital world and designing new
human-machine dialogue services are key challenges for a full transition
to a digital society. Among the research activities in artificial
intelligence that deal with spoken interaction, several important
questions remain poorly explored and can be the subject of various
studies. LIA addresses many aspects of spoken interaction and, through
this thesis, seeks to deepen research on one of the following major
topics:

** Argumentative dialogue **
To make artificial systems capable of learning from data, two strong
assumptions are generally made: (1) stationarity of the system: the
machine's environment is assumed not to change over time; (2)
independence between data collection and the learning process: this
implies that the user does not change their behaviour over time, whereas
users in fact tend to adapt their behaviour to the machine's reactions.
Clearly, this behaviour does not help an artificial learning system find
the equilibrium that would let it best satisfy the user's expectations.

Current voice interfaces, based on partially observable Markov decision
processes, must evolve towards a new generation of interactive systems
able to learn dynamically from long-term interactions, while taking into
account that human behaviour is variable, humans being adaptive systems
themselves. Indeed, humans also learn from their interactions with a
system and change their behaviour over time. Such a system will be able
to discuss with the human and argue to defend its choices.

** The authoritarian dialogue agent **
Artificial intelligence is generally viewed in terms of its submission
to human desires and wishes; there are, however, situations where
artificially giving the machine an authoritarian dimension can be
relevant (mainly games and serious games, but also control
simulation...). Concrete mechanisms for developing an authoritarian
agent (with the aim of imposing its point of view on the user) will be
studied and implemented in practice to allow their full evaluation.

** Virtual reality for simulating dialogue agents **
Another line of research concerns the possibilities offered by virtual
reality for training spoken dialogue agents. The initial goal is to
provide a unified framework for developing situated dialogue systems
under real conditions of use, through virtual reality simulations of the
target environments, thus removing the need to recreate them. In the
longer term, the approach will also make it possible to develop dialogue
systems for virtual reality applications themselves. The work therefore
requires skills in both virtual reality and natural language processing.

The candidate must hold a master's degree in computer science with a
component in machine learning methods and/or language engineering. The
PhD grant will be awarded through a competition within Doctoral School
536 of the University of Avignon, with an interview of the candidate
selected by the thesis supervisors.

To apply, please send an email before 30 April 2017 to Fabrice
Lefèvre (fabrice.lefevre@univ-avignon.fr) and Bassam Jabaian
(bassam.jabaian@univ-avignon.fr) including: your CV, a cover letter
stating your position on the study proposals above, any letters of
recommendation, and your academic transcripts.

6-19

(2017-03-28) Research Scientist, Spoken and Multimodal Dialog Systems, ETS, San Francisco, CA, USA

Open Rank Research Scientist, Spoken and Multimodal Dialog Systems

ETS (Educational Testing Service) is a global not-for-profit organization whose mission is to advance quality and equity in education. With more than 3,400 global employees, we develop, administer and score more than 50 million tests annually in more than 180 countries.

Our San Francisco Research and Development division is seeking a Research Scientist for our Dialog, Multimodal, and Speech (DIAMONDS) research center. The center’s main focus is on foundational research as well as on development of new capabilities to automatically score spoken, interactive, and multimodal test responses in conversational settings in a wide range of ETS test programs, promote learning and other educational areas. This is an excellent opportunity to be part of a world-renowned research and development team and have a significant impact on existing and next generation spoken and multimodal dialog systems and their application to assessment and other areas in education.

Primary responsibilities include:

Developing and collaborating on interdisciplinary projects that aim to transfer techniques to a new context or scientific field. Successful candidates are self-motivated and self-driven, and have a strong interest in emerging conversational technology that can contribute to education in assessment and instructional settings.
Providing scientific and technical skills to conceptualize, design, obtain support for, conduct, and manage new research projects, grants, or parts of existing projects.
Generating or contributing to new or modified methods that support research on and development of spoken and multimodal dialog systems and related technologies relevant in assessment and instructional settings.
Designing and conducting scientific studies and functioning as an expert in the major facets of the projects: responding as a subject matter expert in presenting the results of acquired knowledge and experience.
Developing or assisting in developing proposals for external and internal research grants and obtaining financial support for new or continuing research activities. Preparing initial and final proposal and project budgets.
Participating in dissemination activities through the publications of research papers in peer-reviewed journals and in the ETS Research Report series, the issuing of progress and technical reports, the presentation of seminars at major conferences and at ETS, or the use of other appropriate communication vehicles, including patents, books and chapters, that impact practice in the field or at ETS.

Depending on experience this position is open to entry level candidates as well as mid-level and senior level professionals.

REQUIREMENTS FOR A JUNIOR LEVEL POSITION

A Doctorate in computer science, linguistics, cognitive psychology or a related field is required. One year of research experience is required; research experience in education is desirable. Experience can be gained through doctoral studies. Candidates should be very skilled in programming and be able to work effectively as a research team member.

REQUIREMENTS FOR A MID-LEVEL POSITION

A Doctorate in computer science, linguistics, cognitive psychology, or a related field is required. Research experience in education is desirable. Candidates should be very skilled in programming and be able to work effectively as a research team member. Three years of progressively independent substantive research in the area of computer science, linguistics, cognitive psychology, or education are required.

REQUIREMENTS FOR A SENIOR-LEVEL POSITION

A Doctorate in computer science, linguistics, cognitive psychology, or a related field is required. Research experience in education is desirable. Candidates should be very skilled in programming and be able to work effectively as a research team member. Eight years of progressively independent substantive research in the area of computer science, linguistics, cognitive psychology, or education are required.

We offer a competitive salary, comprehensive benefits and excellent opportunities for professional and personal growth. For a full list of position responsibilities and to apply please visit the following link: http://ets.pereless.com/careers/index.cfm?fuseaction=83080.viewjobdetail&CID=83080&JID=235623&BUID=2538

ETS is an Equal Opportunity Employer

6-20

(2017-04-10) 3 Funded PhD Research Studentships at CSTR, Edinburgh, Scotland, UK

Three Funded PhD Research Studentships at the Centre for Speech Technology Research,
University of Edinburgh.

Please see http://www.cstr.ed.ac.uk/opportunities for full details, eligibility
requirements, application procedure and deadlines.

1. Embedding enhancement information in the speech signal

Speech becomes harder to understand in the presence of noise and other distortions, such
as telephone channels. This is especially true for people with a hearing impairment. It
is difficult to enhance the intelligibility of a received speech+noise mixture, or of
distorted speech, even with the relatively sophisticated enhancement algorithms that
modern hearing aids are capable of running. A clever way around this problem might be for
the sender to add extra information to the original speech signal, before noise or
distortion is added. The receiver (e.g., a hearing aid) would use this to assist speech
enhancement.

Funding: Marie Sklodowska-Curie fellowship

2. Broadcast Quality End-to-end Speech Synthesis

Advances in neural networks made jointly in the fields of automatic speech recognition
and speech synthesis, amongst others, have led to a new understanding of their
capabilities as generative models. Neural networks can now directly generate synthetic
speech waveforms, without the limited quality of a vocoder. We have made separate
advances, using neural networks to discover representations of spoken and written
language that have applications in lightly-supervised text processing for almost any
language, and for adaptation of speaker identity and style. The project will combine
these techniques into a single end-to-end model for speech synthesis. This will require
new techniques to learn from both text and speech data, which may have other
applications, such as automatic speech recognition.

Funding: EPSRC Industrial CASE award (in collaboration with the BBC)

3. Automatic Extraction of Rich Metadata from Broadcast Speech (in collaboration with the
BBC)

The research studentship will be concerned with automatically learning to extract rich
metadata information from broadcast television recordings, using speech recognition and
natural language processing techniques. We will build on recent advances in
convolutional and recurrent neural networks, using architectures which learn
representations jointly, considering both acoustic and textual data. The project will
build on our current work in the rich transcription of broadcast speech using neural
network based speech recognition systems, along with neural network approaches to machine
reading and summarisation. In particular, we are interested in developing approaches to
transcribing broadcast speech in a way appropriate to the particular context. This may
include compression or distillation of the content (perhaps to fit in with the
constraints of subtitling), transforming conversational speech into a form that is
easier to read as text, or transcribing broadcast speech in a way appropriate for a
particular reading age.

Funding: EPSRC Industrial CASE award (in collaboration with the BBC)

6-21

(2017-04-11) Postdoctoral research position, Univ. Stellenbosch, South Africa

Postdoctoral research position: Multilingual code-switched speech corpus and systems

A postdoc position focussing on corpus compilation and automatic speech recognition of code-switched South African speech in several languages is available in the Digital Signal Processing Group of the Department of Electrical and Electronic Engineering at the University of Stellenbosch, South Africa.
The project will involve the compilation of a multilingual (at least 5 languages) corpus of conversational code-switched South African speech. It will also include the development of associated automatic speech recognition systems able to process this speech. Specific project objectives include the gathering of the acoustic data, developing a transcription protocol, recruiting annotators, setting up and managing the annotation process, developing a validation protocol, validating the data, developing baseline automatic speech recognition systems in the languages of the corpus, and producing new and original research into how best to automatically recognise and process this mixed-language speech.
Applicants must hold a PhD (preferably obtained within the last 5 years) in the field of Electronic/Electrical Engineering, Information Engineering, Computer Science, or other relevant discipline. Suitable candidates must also have practical and research experience with automatic speech processing systems in general and multilingual automatic speech recognition in particular, and should have an excellent background in statistical modelling, signal processing, and/or speech analysis. Applicants should also have proven prior experience in speech corpus compilation, have good programming skills and be able to use high level programming languages for developing prototype systems. Finally, candidates must have excellent English writing skills and have an explicit interest in scientific research and publication.
The position will be available for one year, with a possible extension to a second and third year, depending on progress and available funds.
Applications should include a covering letter, curriculum vitae, list of publications, research projects, conference participation and details of three contactable referees and should be sent as soon as possible to: Prof Thomas Niesler, Department of Electrical and Electronic Engineering, University of Stellenbosch, Private Bag X1, Matieland 7602. Applications can also be sent by email to: trn@sun.ac.za. The successful applicant will be subject to University policies and procedures.
Interested applicants are welcome to contact me at the above e-mail address for further information regarding the project.

6-22

(2017-04-20) Postdoc for project LaDyCa, Sorbonne, Paris

Applicants must have a PhD in linguistics as well as publications in their field of specialization. Independent research experience in one or several of the core areas of the LaDyCa project (i.e. language dynamics, linguistic typology, sociolinguistics, geolinguistics, dialectology & dialectometry) is expected. Experience in working with scholars of diverse backgrounds, e.g. linguists, sociologists, anthropologists, historians and, to some extent, mathematicians or statisticians, would be greatly appreciated. The project will be funded by the IDEX (“Initiative d'Excellence”) consortium of Sorbonne Universités, France, in partnership with Ilia State University, Tbilisi, Georgia. Apart from an efficient and fluent command of English and/or French, for collegial relations with an international team of scholars, applicants should have a good command of Georgian (written & oral skills); efficient reading skills in Russian would be an asset too. A good command of database software, and previous training or experience in computational linguistics, would also be appreciated. Strong ability in data entry and in designing linguistic databases would be an asset.
Applications should include a statement of interest (letter of motivation), giving accurate details on the applicant's skills corresponding to the aim of the LaDyCa project, and how (s)he plans to process data with computing tools and gather information on the ecological, historical and social context of linguistic diversity in the Caucasus. (S)he will also provide a CV including a list of publications, a copy of the PhD certificate, and the names and e-mail addresses of two referees. Applications should be sent as a single PDF file to the e-mail addresses below, entitled “Application_LaDyCa_PostDoc”:

Prof. Jean Léo Léonard < leonardjeanleo@gmail.com >
Prof. Claude Montacié < Claude.Montacie@paris-sorbonne.fr >
Deadline: applications must be submitted by 2 May 2017.

The position is available from July 2017 to June 2018. The duration of employment is intended to last one year.
Net salary: around 2100 euros per month.

http://www.stih.paris-sorbonne.fr/?p=1203

6-23

(2017-04-22) ATER position at Paris-Sorbonne, France

A position of ATER (temporary teaching and research associate) in natural language and
speech processing is available at Université Paris-Sorbonne. To apply, see:
http://concours.univ-paris4.fr/PostesAter?entiteBean=posteCandidatureCourant&modif=839

The eligibility conditions are available at www.paris-sorbonne.fr/ater

6-24

(2017-04-23) Associate research scientist-Speech at ETS, Princeton, New Jersey, USA

http://ets.pereless.com/careers/index.cfm?fuseaction=83080.viewjobdetail&CID=83080&JID=243925&type=&cfcend

Associate Research Scientist - Speech

Date Updated:	April 25, 2017
Location:	Princeton, NJ
Job Type:	Full-Time/Regular
Travel:	Not Specified
Position ID:	243925

Job Level:	Entry Level(less than 2 years)
Years of Experience:	Less Than 1 Year
Level of Education:	Doctoral Degree
Starting Date :	ASAP

Job Description

ETS is the world’s premier educational measurement institution and a leader in educational research. As an innovator in developing achievement and occupational tests for clients in business, education, and government, we are determined to advance educational excellence for the communities we serve.

ETS's Research & Development division has an opening for a research scientist in the NLP, Speech & DIAMONDS (Dialog, Multimodal, and Speech) research group. The projects in this research group focus on the application of NLP & Speech processing algorithms in automated scoring capabilities for assessment tasks involving constructed responses (such as essays and spoken responses) as well as on the application of spoken and multimodal dialog systems to assessment tasks. This is an excellent opportunity to be part of a world-renowned research and development team and have a significant impact on existing and next generation NLP & Speech systems and their application to assessment.

BASIC FUNCTIONS AND RESPONSIBILITIES

Take responsibility for conceptualizing, proposing, obtaining funding for, and directing small projects in the areas of speech processing, speech recognition, and automated speech scoring and/or assisting in moderate-to-major speech research projects.

Projects may include (1) research projects; (2) development projects that use scientific principles to create (a) tools to improve the efficiency or quality of the practice of test development or statistical analysis, (b) innovative item types, or (c) the scoring of responses to open-ended items; and (3) development projects that use scientific principles to create new products or product prototypes. Small research and development projects typically have minimal budgets, few or no staff other than the project director, a timeline of a year or less, and a single deliverable that is relatively narrow in scope. Major projects have substantial budgets, involve the coordination of many individuals internal and possibly external to ETS, may run across years, and may produce multiple deliverables. Moderate projects fall in between these two types.

Assist in generating or contributing to new knowledge or capability in the field of speech processing, speech recognition, and spoken language technology, and in applying that new knowledge and capability to existing and/or new ETS products and services. New knowledge may take the form of new or modified educational or psychological theories; new research methodology; new development methodology; new statistical, analytic or interpretative procedures; new test designs and item types; new approaches to scoring examinee responses; and new approaches to reporting. New capabilities include developing software to instantiate new and existing knowledge.

Document and disseminate the results of research and/or development projects through publication and presentation. Publication includes peer-reviewed journals, peer-reviewed conference proceedings, patents, books and book chapters, and other print media. Presentation may be at international, national, or regional conferences, client meetings, and ETS seminars.

Participate in setting substantive research and development goals and priorities for a group or initiative within a vice presidential area.
Actively seek input from peers on the quality of one’s work. Participate as a reviewer of others’ work.
Actively seek mentoring from more senior scientific and other R&D staff, developing a continuing mentoring relationship.
Develop proposals and budgets for small projects and/or assist in development for moderate-to-major ones.
Assist more senior scientific staff in consulting on testing program, R&D management, or other ETS management concerns.
Manage small projects, and/or assist in the management of moderate-to-major ones, by accomplishing directed tasks according to schedule and within budget.
Develop external professional relationships and work to cultivate a scientist’s identity.
Become a member and regular presenter at the annual meetings of one or more organizations substantively related to the work of ETS.

Experience and Skills

EDUCATION

A Ph.D. in Computer Science, Electrical Engineering, Natural Language Processing, Computational Linguistics, or a similar area, with major education in speech technology and particularly in speech recognition, is required.

EXPERIENCE

Evidence of independent substantive research experience in spoken language technology and/or development experience for deploying speech technology capabilities is required.
One year of independent substantive research experience in spoken language technology and/or development experience for deploying speech technology capabilities is required.
Experience may be gained through doctoral studies.
Practical expertise with automatic speech recognition systems, experience with machine learning toolkits (e.g., Weka, scikit-learn), and fluency in at least one major programming language (e.g., Java, Python) are required.
Practical experience with deep learning paradigms and/or deep-learning-based speech recognition systems (e.g., Kaldi) is highly desirable.

Our strength and success are directly linked to the quality and skills of our diverse associates. A background and/or knowledge of accessibility and accommodations for individuals with disabilities, whether through your own experiences or those of someone close to you, is highly desirable.

ETS is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, protected veteran status, or other characteristic protected by law.

6-25

(2017-05-02) PhD at IRISA, Rennes, France

The Expression team at IRISA is opening a PhD position in computer science on the topic 'characterization of language registers by sequential pattern mining', as part of the ANR project TREMoLo.

Fields: natural language processing and data mining.

Details of the offer: http://www.irisa.fr/fr/offres-theses/caracterisation-registres-langue-extraction-motifs-sequentiels

Application deadline: Friday, June 2.

Application file (* : required items):

- detailed CV*

- cover letter*

- academic transcripts (with ranking if possible)*

- contacts for references*

- research internship report(s) (if applicable).

Send to: del.battistelli@gmail.com, nicolas.bechet@irisa.fr, gwenole.lecorve@irisa.fr.

6-26

(2017-05-05) Post-doctoral positions in Multimodal Behavior Analysis: Speech, Vision and Healthcare, CMU, Pittsburgh, PA, USA

Post-doctoral positions in Multimodal Behavior Analysis: Speech, Vision and Healthcare

Carnegie Mellon University, School of Computer Science

Multiple post-doctoral positions are available in the School of Computer Science at Carnegie Mellon University. We are seeking creative and energetic applicants for two-year postdoctoral positions. The positions include a competitive salary with full benefits and travel resources.

Candidates must have a strong research track record for one or more of the following topics: (1) speech and paralinguistic processing for affect, emotion and human behavior analysis, (2) automatic recognition of facial expressions, gestures and human visual activities, (3) multimodal machine learning algorithms for text, audio and video, (4) technologies to help clinicians with mental health diagnoses and treatments.

Required

PhD in computer science or related field (at the time of hire)
International applicants welcome! No US citizenship requirement.

Desired

Publications in top machine learning, speech processing and/or computer vision conferences and journals.
Research involving clinical patients with mental health disorders (e.g., depression, schizophrenia, suicidal ideation)
Experience mentoring graduate and undergraduate students

Job details

Preferred start date: September 1st, 2017 (negotiable)
Candidate will work under the supervision of Dr. Louis-Philippe Morency, CMU MultiComp Lab’s director
Competitive salary with full benefits and travel resources.

How to apply

Email applications should be sent to morency@cs.cmu.edu with the title “Postdoc application”, preferably before June 12th, 2017. The email should include:
- a brief cover letter (with expected date of availability),
- a CV including list of publications,
- contact information of two references,
- links to three representative publications

6-27

(2017-05-10) Permanent position (CDI): engineer with a PhD in computer science or language sciences, LNE, Trappes, France

Engineer with a PhD in computer science or language sciences

Permanent contract (CDI) – TRAPPES

Reference: AP/TAI/DE

The company: WWW.LNE.FR

A leader in the world of measurement and reference standards, with a strong reputation in France and internationally, LNE supports industrial innovation and positions itself as an important player for a more competitive economy and a safer society. At the crossroads of science and industry since its founding in 1901, LNE offers its expertise to all economic actors involved in product quality and safety.

As the lead body for French metrology, our research is at the heart of our public service mission and is a fundamental factor in supporting the competitiveness of companies.

We are committed to meeting the requirements of industry and academia, for ever more accurate measurements, carried out under increasingly extreme conditions or on innovative subjects such as autonomous vehicles, nanotechnologies or additive manufacturing.

LNE in a few figures: 700 employees.

5 lines of business (measurement, testing, certification, training and R&D).

8 areas of activity (Metrology, Health, Construction, Environment, Energy, Transport, Security and Defense, Consumer goods).

55,000 m² of laboratories (including 10,000 m² in Paris and 45,000 m² in Trappes).

7 sites (2 sites in Île-de-France, 2 regional offices in Poitiers and Nîmes, 1 branch in St Etienne, 2 subsidiaries in Washington and Hong Kong).

9,000 clients.

Duties:

The PhD holder will join a team of 4 engineers holding PhDs who supervise various interns and PhD students. The team has historically specialized in the evaluation of multimedia information processing systems (speech transcription, speaker recognition, dialogue, translation…). It is now opening up to new challenges, namely the evaluation of artificial intelligence systems in general (robotics, smart grids, defense, autonomous vehicles…).

The PhD holder will be assigned the following duties:

Development of R&D in the evaluation of speech and language processing systems
- Definition of new metrics
- Corpus analysis
- Publication of scientific results
- Setting up perceptual test protocols
- Contribution to setting up and carrying out European and national research projects.

Running evaluation campaigns
- Helping participating teams use LNE's tools
- Formal checking of data
- Scoring of systems
- Organization of scientific and industrial meetings
- Writing evaluation reports

Supervision of interns and post-docs

Profile:

Holding a PhD in Computer Science or Language Sciences, you have skills in natural language processing or corpus linguistics. You are also proficient in programming (R or S, C++, Python).

You have good writing and interpersonal skills. You communicate well orally and enjoy working collaboratively with your team and with clients.

Your English is good enough for professional communication.

Travel within the Paris region 1 day per week, and worldwide once a year.

To apply: send a CV and cover letter to recrut@lne.fr – ref. AP/TAI/DE

6-28

(2017-05-10) Open Rank Research Scientist, Spoken and Multimodal Dialog Systems, ETS, San Francisco, CA, USA

Open Rank Research Scientist, Spoken and Multimodal Dialog Systems

Primary responsibilities include:

Developing and collaborating on interdisciplinary projects that aim to transfer techniques to a new context or scientific field. Successful candidates are self-motivated and self-driven, and have a strong interest in emerging conversational technology that can contribute to education in assessment and instructional settings.
Providing scientific and technical skills to conceptualize, design, obtain support for, conduct, and manage new research projects, grants, or parts of existing projects.
Generating or contributing to new or modified methods that support research on and development of spoken and multimodal dialog systems and related technologies relevant in assessment and instructional settings.
Designing and conducting scientific studies and functioning as an expert in the major facets of the projects; responding as a subject matter expert in presenting the results of acquired knowledge and experience.
Developing or assisting in developing proposals for external and internal research grants and obtaining financial support for new or continuing research activities; preparing initial and final proposal and project budgets.
Participating in dissemination activities through the publications of research papers in peer-reviewed journals and in the ETS Research Report series, the issuing of progress and technical reports, the presentation of seminars at major conferences and at ETS, or the use of other appropriate communication vehicles, including patents, books and chapters, that impact practice in the field or at ETS.

Depending on experience this position is open to entry level candidates as well as mid-level and senior level professionals.

REQUIREMENTS FOR A JUNIOR LEVEL POSITION

A Doctorate in computer science, linguistics, cognitive psychology, or a related field is required. One year of research experience is required; research experience in education is desirable. Experience can be gained through doctoral studies. Candidates should be very skilled in programming and be able to work effectively as a research team member.

REQUIREMENTS FOR A MID-LEVEL POSITION

A Doctorate in computer science, linguistics, cognitive psychology, or a related field is required. Research experience in education is desirable. Candidates should be very skilled in programming and be able to work effectively as a research team member. Three years of progressively independent substantive research in the area of computer science, linguistics, cognitive psychology, or education are required.

REQUIREMENTS FOR A SENIOR-LEVEL POSITION

A Doctorate in computer science, linguistics, cognitive psychology, or a related field is required. Research experience in education is desirable. Candidates should be very skilled in programming and be able to work effectively as a research team member. Eight years of progressively independent substantive research in the area of computer science, linguistics, cognitive psychology, or education are required.

ETS is an Equal Opportunity Employer

6-29

(2017-05-10) Research Scientist, Disney Research, Pittsburgh, PA, USA

Position: Research Scientist

Focus Area: Autonomous Agents for Multimodal Character Interaction

Disney Research

Disney Research Pittsburgh is seeking applicants for a Research
Scientist position, at either the junior or senior level, in
Autonomous Agents. The research emphasis is on architecture to support
the integration of natural language with character-based reasoning and
behavior.

As part of The Walt Disney Company, Disney Research builds upon a rich
legacy of innovation and technology leadership in the entertainment
industry that continues to this day. Disney Research was launched in
2008 offering the best attributes of academia and industry with the
goal of driving value across the company through technological
innovation. Our research covers a broad range of exciting and
challenging applications that are experienced daily by millions of
people around the world.

Our staff interacts directly with all core business areas of The Walt
Disney Company including Theme Parks and Imagineering, Consumer
Products, our Live Action and Animation Studios, and Media Networks.
We publish our research and are actively engaged with the global
research community. Our researchers collaborate closely with
co-located academic institutions.

We are seeking applicants in the following areas:

· Agent architectures for language-based character interaction.

· AI and machine learning methods for autonomous,
semantically-rich character behavior

Duties:

· Drive value for Disney through groundbreaking research and innovation

· Lead a research group with post-doctoral researchers,
interns, and external collaborators

· Publish results and patent inventions in multimodal interaction

· Participate in conferences, workshops and academic-industrial events

· Develop a strong network of business partners within the company

Required Qualifications:

· Ph.D. in Computer Science or equivalent

· Proven track record of developing autonomous, integrated
agents with real-time NL components.

· Experience with both symbolic and statistical machine
learning methods as applied to modeling semantics, action, or behavior

· Strong technical presentation skills and the ability to
clearly communicate with technical and non-technical audiences

Desired Qualifications:

· Experience in interaction design for entertainment

· Background in NLP (e.g., relationship extraction, word sense
disambiguation, narrative generation) desirable

To apply:

Please email careers@disneyresearch.com. Please use DRP-RS-NLP-2017 in
your subject line. If you're interested in the position or for any
further information, please contact Jill Lehman
(jill.lehman@disneyresearch.com).

6-30

(2017-05-23) Lead Speech Recognition Engineer, Cambridge, UK

Lead Speech Recognition Engineer

Location: Cambridge, UK

Contact: careers@speechmatics.com

Background

Speechmatics is a leader in automatic speech recognition (ASR). Using proprietary technology, we have built one of the most accurate ASR systems in the world, with a vision to power a voice-enabled economy. We are already working at a time when the global economy is actively adopting all types of speech-related technologies. In developing our technology we combine our years of experience, the latest developments in the field and our own focus on cutting-edge research to produce a world-class service.

In the office, we pride ourselves on a relaxed but productive environment whilst we stay in touch with the progress of others by attending both academic and commercial conferences and have fun together with regular outings (in the past we have been punting, go-karting, attended a cooking workshop and played bubble football...).

We are expanding rapidly and are seeking more people in the coming months to help us keep pushing the boundaries of speech recognition. This is an opportunity to join a high growth team and form a major part of its future direction.

The Opportunity

We are looking for a talented speech scientist to help us build the best speech technology for anybody, anywhere, in any language. You will be a part of a team that is working on our core ASR capabilities to improve our speed and accuracy and develop novel features so that we can support all languages. Your work will feed into ‘Auto-Auto’, our ground-breaking framework to support the building of ASR models, and hence the delivery of every language pack published by the company. You will be responsible for keeping our system the most accurate and useful commercial speech recognition available.

Because you will be joining a small team, you will need to be a team player who thrives in a fast paced environment, with a focus on rapidly moving research developments into products. Bringing skills to the team is as important as a can-do attitude. We strongly encourage versatility and knowledge transfer within the team, so we can share efficiently what needs to be done to meet our commitments to the rest of the company.

Key Responsibilities

Ensuring that our speech recognition meets or exceeds that published by others
Improving our core modelling (acoustic, pronunciation, language)
Leading the extension of our ML framework so that we can build any language

Experience

Essential

MSc, PhD or equivalent experience in the academic aspects of speech recognition
Several years of practical experience in speech recognition, covering all aspects (acoustic, pronunciation and language modelling as well as decoders/search)
Experience working with standard speech and ML toolkits, e.g. Kaldi, KenLM, TensorFlow, etc.
Solid programming skills with Python and / or C/C++
Experience using Unix/Linux for big data

Desirable

PhD degree
Experience of team leadership and line management
Experience of working in an Agile framework
Expertise in modern speech recognition, including WFSTs, lattice processing, neural net (RNN / DNN / LSTM), acoustic and language models, Viterbi decoding
Comprehensive knowledge of machine learning and statistical modelling
Experience in deep machine learning and related toolkits, e.g. Theano, Torch, etc.
Deep expertise in Python and/or C++ software development
Experience working effectively with software engineering teams or as a Software Engineer

Salary

We offer a competitive salary, bonus scheme, pension contribution matching (up to 5%) and a generous EMI share option scheme. We also have several additional benefits including holiday purchase, massages, fully stocked beer fridge, Cyclescheme, fruit boxes and many more.

The overall package will depend on your motivations and level of experience.

6-31

(2017-05-23) Software Development Engineer, Cambridge, UK

Software Development Engineer

Location: Cambridge, UK

Contact: careers@speechmatics.com

Background

Speechmatics is a leader in automatic speech recognition (ASR). Using proprietary technology, we have built one of the most accurate ASR systems in the world, with a vision to power a voice-enabled economy. We are already working at a time when the global economy is actively adopting all types of speech-related technologies. In developing our technology we combine our years of experience, the latest developments in the field and our own focus on cutting-edge research to produce a world-class service.

In the office, we pride ourselves on a relaxed but productive environment whilst we stay in touch with the field by attending both academic and commercial conferences and have fun together with regular team events (in the past we have been punting, go-karting, attended a cooking workshop and played bubble football...).

The Opportunity

You will be joining the ‘Languages’ team within Speechmatics, focussing on two key goals. We maintain and develop Auto-Auto, our ground-breaking framework to support the building of languages for use in ASR. And we use it to build new language models.

We are looking for an experienced Software Development Engineer to join us. As a member of the team, you will be working on the development, maintenance and expansion of our pipeline, and participating in building and solving the challenges of a growing language portfolio. You will have significant influence on implementing or integrating new features, drive the system architecture, and spearhead the best practices that enable a quality product.

Auto-Auto is core to our business and by working on it you will have a chance to build something that will be used in businesses and homes worldwide. Working in a rapidly growing start-up also means opportunities to contribute to other projects, depending on the candidate’s background and skills.

If you are a talented, detail-oriented engineer with a solid software development foundation and a commitment to deliver the best possible technology solutions, then we want to hear from you!

Key Responsibilities

Delivering high quality, maintainable and robust code on time, as part of a team.
Executing projects and developing against an outlined design.
Developing pragmatic solutions and building flexible systems without over-engineering.
Involvement at all stages of the software development cycle, including designing and developing new architectural systems and improvements, and QA processes.
Participation in estimation and sprint planning in an agile environment.
Participation in delivering new language models for the ASR engine.
Working closely with other technical teams and product team to deliver on the company’s technical vision.

Experience

Essential

Bachelor's Degree in Computer Science or related field.
Professional experience in software development.
Computer Science fundamentals in object-oriented design, data structures, algorithm design, problem solving, and complexity analysis.
Knowledge of professional software engineering practices & best practices for the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
Excellent Python skills.
Good Linux skills.
Experience of working within a team to deliver and run high quality systems.

Desirable

Master's degree in Computer Science or related field.
Demonstrable professional experience in software development.
Proficiency in C and C++ (ideally with strong STL and Boost experience).
Strong skills and experience in cloud-based software development, preferably AWS:
- Working with distributed and/or clustered systems.
- Building and running horizontally scaling architectures.
- Using cloud-based queueing, messaging, monitoring and storage techniques.
Experience in flow-based programming.
Familiarity with statistical models and data mining algorithms.
Analytical with a data-driven approach to making decisions and attention to detail.
Previous experience with Natural Language Processing techniques.
Comfortable collaborating with teams with very different technical skills, and non-technical teams.

Salary

6-32

(2017-05-29) PhD & Post-Doc Research positions in Speech Signal Processing and Electronic Design, Autonomous University of Zacatecas, Zacatecas, Mexico

PhD & Post-Doc Research positions in Speech Signal Processing and Electronic Design

Place: Autonomous University of Zacatecas, Zacatecas, Mexico

Duration: PhD (3 years) / Post-Doc (1 year)

Start: PhD / Post-Doc (January 10th, 2018)

Benefits:
- Financial support according to experience
- Health insurance from the Mexican Social Security Institute
- Round-trip international airfare at the beginning and end.

Position description: The Department of Signal Processing and Acoustics at the Autonomous University of Zacatecas is looking for candidates for fully funded PhD and post-doc positions in Signal Processing, Filtering Design, Embedded Systems, Speech Recognition and Synthesis. The signal processing group (led by Dr. Hamurabi Gamboa-Rosales) at the Autonomous University of Zacatecas works on algorithm design, signal processing, electronic design, machine learning, and probabilistic modeling in speech recognition and synthesis. The group also belongs to the National Laboratory in embedded systems, advanced electronic design and Microsystems. We are looking for outstanding candidates to join our research group as PhD students and post-doc researchers to work on any of our research themes, for example:
- Digital signal processing
- Optimal filtering
- FPGAs
- Microsystems
- Large-vocabulary speech recognition and text-to-speech synthesis
- ASR in noisy environments

Candidate Profile for PhD / Post-Doc: The candidate will have:
- A Master’s / PhD degree, as required by the program applied for, in digital signal processing, electronic design, speech signal processing, acoustics, machine learning, computer science, electrical engineering, psychology or a related discipline.
- Background in signal processing or electronic design (FPGAs).
- Good programming skills in Java, C/C++, Python or Matlab.

Contact: Interested applicants can contact Dr. Hamurabi Gamboa-Rosales for more information or directly email a candidacy letter including a curriculum vitae, a list of publications and a statement of research interests. Email: hamurabigr@uaz.edu.mx ; hamurabigr@hotmail.com Telephone MX: +52 (1) 492-121-6787

6-33

(2017-06-01) Call for researchers 2017-2018 at INA, Paris, France

Call for researchers 2017-2018

New research support schemes at Ina:

Associate researchers and research grants

To encourage the development of scientific work based on its collections and on the analysis tools it develops, Ina decided to create in 2017 two new schemes to support research on, and the scientific promotion of, its collections:

the granting of associate researcher status at Ina
the award of research grants

Through these schemes, Ina intends to support PhD students and researchers in carrying out original and innovative research projects that concern (or draw on) its collections, or that concern the analysis or processing of images and/or sounds and/or associated data.

The Institute offers the selected researchers privileged hosting conditions, together with various forms of material support.

These new schemes complement the Inathèque prizes created in 1997, and add a new strand to the Institute's scientific policy.

The rules of the call are available on the Inathèque website: http://www.inatheque.fr/actualites/2017/mai-2017/appel-chercheurs-2017-2018.html

6-34

(2017-06-023) Temporary contract staff at the forensic police (Police Technique et Scientifique), Ecully (Lyon), France

The audio department of the Police Technique et Scientifique (Ecully, near Lyon, France) is looking for temporary contract staff to carry out segmentation and correction of automatic alignments as part of phonetic studies.
The following profile is sought:

- an interest in linguistics or in languages

- good computer skills and familiarity with new technologies

- knowledge of the Praat software will be appreciated

The contract work can start as soon as possible and can continue until October.

For more information, please send an email with your contact details to the following address: ptsvox@gmail.com.

6-35

(2017-06-06) Post-doctoral Research Associate in Advanced Deep Neural network Architectures for ASR, Univ. of Crete, Greece

Department of Computer Science, University of Crete, Greece
Post-doctoral Research Associate in Advanced Deep Neural network Architectures for ASR (Fixed Term)

SALARY: €24000-€28000 per year
CLOSING DATE: 30 June 2017
REFERENCE: ASR1
TO APPLY: Send detailed CV, a motivation letter and 3 major publications to yannis@csd.uoc.gr

In the past few years, Deep Neural Networks (DNNs) have achieved tremendous success for many supervised machine learning tasks, including acoustic modelling for Automatic Speech Recognition (ASR). Advanced models such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory recurrent neural networks (LSTMs) have contributed to recent empirical breakthroughs. Network depth has played perhaps the most important role in these successes. However, increased depth presents challenges in the optimization of the network, and despite the efforts to overcome these challenges, some of the optimization issues remain unresolved. Advanced networks such as highway networks and (wide) residual networks seem to offer solutions to these issues.
This position represents an ideal opportunity to work in or move into advanced deep neural networks, as it will involve collaborating widely across academia and industry, and working on one of the most pressing research areas of machine learning for the development of robust ASR systems.
Based in Heraklion, Crete, the post will be with Prof. Yannis Stylianou and Dr. Vassilis Tsiaras as part of the speech processing group within the Department of Computer Science at the University of Crete. You will explore a rich set of network architectures and thoroughly examine how several different aspects affect the accuracy of ASR. The work will be performed within the framework of advanced deep neural network architectures for various signal processing tasks including 1D and 2D signals. The focus of the post will be to perform various experiments with well-known architectures, explore and suggest modifications, and process and reshape knowledge from various signal processing/classification tasks towards speech processing for the purpose of ASR. Outcomes will directly feed into improvements of in-house ASR systems working with state-of-the-art ASR tasks (e.g., CHiME4, REVERB) and of those of our industrial partners using real-life data.
The post involves travel to international conferences and project meetings with our academic and industrial partners. There will be the possibility to co-advise doctoral students and potentially other teaching opportunities.
Applicants should have a doctorate in speech signal processing for ASR, computer science, applied mathematics or a related field, and ideally a strong background in deep learning and mathematics. Knowledge of deep learning systems such as TensorFlow or Theano and ASR systems like Kaldi is an advantage. Proficiency in computer programming in C and/or Python is expected.
Informal inquiries should be directed to Prof. Yannis Stylianou by email, yannis@csd.uoc.gr
Fixed term: In the first instance, the funding supporting the post is for two years. We expect a project extension which will provide funding for a further 7-12 months for this post.
Interviews are expected to take place the week commencing 10th July 2017. Expected start date: September 2017, however earlier and later start dates will be considered.

To apply, please send detailed CV, a motivation letter and 3 major publications of yours to: yannis@csd.uoc.gr (Prof. Yannis Stylianou)

6-36

(2017-06-06) Post-doctoral Research Associate in Data Augmentation in the context of Deep Neural network ASR, Univ. of Crete, Greece

Department of Computer Science, University of Crete, Greece

Post-doctoral Research Associate in Data Augmentation in the context of Deep Neural network ASR

(Fixed Term)

SALARY: €24000-€28000 per year

CLOSING DATE: 30 June 2017

REFERENCE: ASR2

TO APPLY: Send detailed CV, a motivation letter and 3 major publications to yannis@csd.uoc.gr

In the past few years, Deep Neural Networks (DNNs) have achieved tremendous success for many supervised machine learning tasks, including acoustic modelling for Automatic Speech Recognition (ASR). Advanced models such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory recurrent neural networks (LSTMs) have contributed to recent empirical breakthroughs. However, deep learning methods are quite demanding in the amount of data needed to train an acoustic model for ASR, and as a result significant amounts of transcribed data have become available for training use. But data transcription is a quite expensive and time-consuming process. On the other hand, just adding data recorded in real-world conditions puts serious constraints on the efficient training of the acoustic models. Various works on data augmentation show that word error rate (WER) can be significantly reduced if properly augmented data are used.

This position represents an ideal opportunity to work in, or move into, the data augmentation research area in the context of advanced deep neural networks for ASR. It will involve collaborating widely across academia and industry and working on one of the most pressing research areas of machine learning for the development of robust ASR systems.

Based in Heraklion, Crete, the post will be with Prof. Yannis Stylianou and Dr. George Kafentzis as part of the speech processing group within the Department of Computer Science at the University of Crete. You will design and develop smart approaches to spoken data augmentation for multi-condition training of deep learning-based ASR systems. The work will be performed within the framework of advanced deep neural network architectures for various ASR tasks. The focus of the post will be to perform experiments with spoken data generation, to explore and suggest modifications, and to process and reshape knowledge from various areas of signal processing for the purpose of ASR. Outcomes will feed directly into improvements of our in-house ASR systems, which work on state-of-the-art ASR tasks (e.g., AURORA-4, CHiME4, REVERB), and of those of our industrial partners, which use real-life data.

The post involves travel to international conferences and project meetings with our academic and industrial partners. There will be the possibility of co-advising doctoral students and, potentially, other teaching opportunities.

Applicants should have a doctorate in speech signal processing for ASR, statistical speech synthesis and voice conversion, audio signal processing, computer science, applied mathematics or a related field, and ideally a strong background in deep learning and mathematics. Knowledge of deep learning frameworks such as TensorFlow or Theano and of ASR toolkits such as Kaldi is an advantage. Proficiency in programming in C and/or Python is expected.

Informal inquiries should be directed to Prof. Yannis Stylianou by email, yannis@csd.uoc.gr

Fixed term: In the first instance, the funding supporting the post is for two years. We expect a project extension that will fund the post for a further 7-12 months.

Interviews are expected to take place the week commencing 10th July 2017.

Expected start date: September 2017; however, earlier and later start dates will be considered.

To apply, please send a detailed CV, a motivation letter and 3 of your major publications to: yannis@csd.uoc.gr (Prof. Yannis Stylianou)
