6-1(2013-05-01) Two positions at CSTR at the University of Edinburgh Scotland UK

Marie Curie Research Fellow in Speech Synthesis and Speech Perception

'Using statistical parametric speech synthesis to investigate speech perception'

The Centre for Speech Technology Research (CSTR) 
University of Edinburgh

This is a rare opportunity to hold a prestigious individual fellowship in a world-leading research group at a top-ranked University, mentored by leading researchers in the field of speech technology. Marie Curie Experienced Research Fellowships are aimed at the most talented newly-qualified postdoctoral researchers, who have the potential to become leaders in their fields. This competitively salaried fellowship offers an outstanding young scientist the opportunity to kick-start his or her independent research career in speech technology, speech science or laboratory phonetics. 

This fellowship is part of the INSPIRE Network, and the project that the CSTR Fellow will spearhead involves developing statistical parametric speech synthesis into a toolbox that can be used to investigate issues in speech perception and understanding. There are excellent opportunities for collaborative working and joint publication with other members of the network, and generous funding for travel to visit partner sites and to attend conferences and workshops.

The successful candidate should have a PhD (or be near completion) in computer science, engineering, linguistics, mathematics, or a related discipline. He or she should have strong programming skills and experience with statistical parametric speech synthesis, as well as an appropriate level of ability and experience in machine learning. The fellowship is fixed term for 12 months (to start as soon as possible). CSTR is a successful and well-funded group, and so there are excellent prospects for further employment after the completion of the fellowship.

The Marie Curie programme places no restrictions on nationality: applicants can be of any nationality and currently resident in any country worldwide, provided they meet the eligibility requirements set out in the full job description (available online - URL below). 

Salary:  GBP 42,054 to GBP 46,731 plus mobility allowance 

Informal enquiries about this position should be made to Prof Simon King or Dr Cassie Mayo.

Apply online:

Closing date: 10 Jun 2013


An Open Position for Postdoctoral Research Associate in Speech Synthesis 

The Centre for Speech Technology Research (CSTR) 
University of Edinburgh

The post holder will contribute to our ongoing research in statistical parametric ('HMM-based') speech synthesis, working closely with Principal Investigators Dr. Junichi Yamagishi and Prof. Simon King, in addition to other CSTR researchers. The focus of this position will be to conduct research into methods for generating highly intelligible synthetic speech, for a variety of applications, in the context of three ongoing and intersecting projects in CSTR: 

The 'SALB' project concerns the generation of extremely fast, but highly intelligible, synthetic speech for blind children. This is a joint project with the Telecommunications Research Centre Vienna (FTW) in Austria, and is funded by the Austrian Federal Ministry of Science and Research. 

The 'Voice Bank' project concerns the building of synthetic speech using a very large set of recordings of amateur speakers (‘voice donors’) in order to produce personalised voices for people whose speech is disordered, due to Motor Neurone Disease. This is a joint project with the Euan MacDonald Centre for MND research, and is funded by the Medical Research Council. The main tasks will be to conduct research into automatic intelligibility assessment of disordered speech and to devise automatic methods for data selection from the large voice bank. 

The 'Simple4All' project is a large multi-site EU FP7 project led by CSTR which is developing methods for unsupervised and semi-supervised learning for speech synthesis, in order to create complete text-to-speech systems for any language or domain without relying on expensive linguistic resources, such as labelled data. The main tasks here will be to further the overall goals of the project, including contributing original research ideas. There is considerable flexibility in the research directions available within the Simple4All project and the potential for the post holder to form a network of international collaborators. 

The successful candidate should have a PhD (or be near completion) in computer science, engineering, linguistics, mathematics, or a related discipline. He or she should have strong programming skills and experience with statistical parametric speech synthesis. 

Whilst the advertised position is for 24 months (due to the particular projects that the post-holder will contribute to), CSTR is a stable, well-funded and successful group with a tradition of maintaining long-term support for ongoing lines of research and of building the careers of its research fellows. We expect to obtain further grant-funded research projects in the future. 

Informal enquiries about this position should be made to either Dr. Junichi Yamagishi or Prof. Simon King.

Apply Online:

Closing date: 10 Jun 2013


6-2(2013-05-01) Ph D or post doc at University of Karlsruhe Germany


A position is to be filled as soon as possible, as part of a project sponsored by the Deutsche Forschungsgemeinschaft (DFG), at the Department of Computer Science of the Cooperative State University in Karlsruhe, Germany, for up to 18 months at 50% employment, for a

Ph.D. Research Assistant


or Post-Doctoral Researcher

in the field of

Automatic Language Processing for Education


This job opening is in the field of automatic speech recognition, as part of a joint research project between Karlsruhe Institute of Technology (KIT), the Cooperative State University (DHBW) and the University of Education (PH), sponsored by DFG and involving speech technology for educational systems. (Working language: English or German)


Starting as soon as possible, we are seeking an experienced and motivated person to join our team of researchers from the above mentioned institutes. The ideal candidate will have knowledge in computational linguistics and algorithm design. Responsibilities include the use and improvement of research tools to update and optimize algorithms applied to diagnostics in children’s (German) writing using speech recognition and speech synthesis tools. For further details of this work, please refer to publications at SLaTE 2011, Interspeech 2011, and WOCCI 2012 by authors Berkling, Stüker, and Fay.

Joint and collaborative research between the partners will be very close, offering exposure to each research lab.




Doctoral Research Candidates may apply and are welcome for joint research with their host institution.


Experienced (post-doctoral) Research Candidates are already in possession of a doctoral degree or have at least 3 years but less than 5 years of research experience in engineering and/or hearing research.



Higher degree in speech science, linguistics, machine learning, or related field


Experience developing ASR applications - training, tuning, and optimization


Software development experience (for example: Perl, TCL, Ruby, Java, C)


Excellent communication skills in English


Willingness and ability to spend 18 months in Germany, working in a team with project partners


Knowledge of German linguistics, phonics, graphemes, morphology or willingness to learn


Strong interest in computational linguistics, morphology, phonics for German




Interest in Education and language learning


Interest in Human Computer Interaction and game mechanics


Ability to create Graphic interfaces in multi-player applications


Working with Ruby on Rails

The job will allow for interesting work within a modern and well-equipped environment in the heart of Karlsruhe. The salary level, depending on your circumstances, will be in line with the 13 TV-L tariffs. KIT and the Cooperative State University are pursuing a gender equality policy. Women are therefore particularly encouraged to apply. If equally qualified, handicapped applicants will be preferred (please submit your paperwork accordingly). Non-EU candidates need to check their ability to reside in Germany.

Interested candidates, please send your application (CV, certified copies of all relevant diplomas and transcripts, two letters of recommendation, proof of proficiency in English, and a letter of motivation covering research interests and reasons for applying), quoting the job number, to be received on or before

April 26, 2013.

Send electronic application to:

Questions about details of the job can be directed to

6-3(2013-05-01) PhD thesis at Paris Tech
Note: the complete application must be submitted on the EDITE website by 22 May at the latest.

Thesis topic: Processing of verbal content and sentiment analysis in human-agent interaction systems

Proposed by: Chloé Clavel
Thesis director: Catherine Pelachaud
Supervisor: Chloé Clavel
Research unit: UMR 5141 Laboratoire Traitement et Communication de l'Information
Domain: Département Traitement du Signal et des Images
Sector: Natural Language Processing, Human-Machine Dialogue
Theme P: Signal Image SHS
Funding: EDITE scholarship

Contact persons:

Sentiment analysis and opinion mining is a fast-growing field, driven by the mass arrival on the web of textual data containing opinions expressed by citizens (film reviews, debates in forum comments, tweets) (El-Bèze et al., 2010). Research in natural language processing is focusing on developing methods for detecting opinions in text that draw on these new resources. The diversity of the data and of the industrial applications that use these methods multiplies the scientific challenges to be met, notably taking into account the different contexts of utterance (e.g., social and political context, personality of the speaker) and defining the opinion phenomenon to be analysed according to the application context. These methods of sentiment analysis in text have also recently been extended to speech, through the analysis of automatic transcriptions produced by automatic speech recognition systems, for indexing radio broadcasts or call-centre recordings (Clavel et al., 2013), and can thus be correlated with methods for acoustic/prosodic analysis of emotions (Clavel et al., 2010).

Another fast-growing scientific field, that of embodied conversational agents (ECAs), involves virtual characters interacting with humans. ECAs can take on the role of an assistant, like the conversational agents found on sales websites (Suignard, 2010), of a tutor in Serious Games (Chollet et al., 2012), or of a partner in video games. The major scientific challenge for this field is integrating the affective component of the interaction into the ECA. This involves, on the one hand, taking into account the human's affective behaviours and social attitudes and, on the other hand, generating them in a manner

For this thesis we propose to work on the detection of opinions and sentiments in the context of multimodal interaction between a human and an embodied conversational agent (ECA), a subject that has so far received little attention from the "agent community". Indeed, on the one hand, ECAs react to emotional content that is essentially non-verbal (Schröder et al., 2011), and on the other hand, "assistant" ECAs react to informative verbal content (Suignard, 2010) without taking into account the opinions or sentiments expressed by the user. Initial studies have been carried out on recognising affect in language in the context of interaction with an agent (Osherenko et al., 2009), but these remain considered independently of the dialogue strategy.

The developments of the thesis will be integrated into the GRETA platform, which is based on the SAIBA architecture, a unified global architecture developed by the "agent community" for generating multimodal behaviours (Niewiadomski et al., 2011). Greta communicates with the human by generating in the agent a wide range of expressive verbal and non-verbal behaviours (Bevacqua et al., 2012). It can simultaneously display facial expressions, gestures, gaze and head movements. This platform was notably integrated in the SEMAINE project, with the development of a real-time human-agent interaction architecture (Schröder et al., 2011) that includes acoustic and video analysis, a dialogue management system and, on the synthesis side, the OpenMary text-to-speech system and the virtual agent of the GRETA platform. As in that project, the opinion and sentiment detection envisaged in the thesis will serve as input to the platform's multimodal interaction models. The multimodal dialogue strategy associated with these inputs relating to verbal content will have to be defined and integrated into the platform.

The thesis will address the joint development of opinion and sentiment detection methods and of human-agent dialogue strategies. The methods envisaged are hybrid methods combining statistical learning and expert rules. For the dialogue strategies, the doctoral student will be able to build on the work carried out on the DISCO dialogue engine (Rich et al., 2012) and on the engine developed in the SEMAINE project (Schröder et al., 2011). The methods developed may also draw on analyses of human-human or Wizard-of-Oz corpora (McKeown et al., 2012), and an evaluation protocol for these methods will have to be set up. In particular, to meet this objective, the thesis will have to address the following issues:
- defining the types of opinions and sentiments that are relevant as input to the dialogue engine. This will mean going beyond the classic distinction between positive and negative opinions, which is of little relevance in this context, by drawing on models from psycholinguistics (Martin and White, 2007);
- identifying the lexical, syntactic, semantic and dialogic markers of opinions and sentiments;
- taking into account the context of utterance: the rules implemented may integrate different analysis windows: the sentence, the speaking turn and the preceding speaking turns;
- taking into account the real-time constraints of the interaction: dialogue strategies will be defined according to the different analysis windows in order to offer interaction strategies at different levels of reactivity. For example, certain keywords may be used as triggers for real-time backchannels, and the planning of the agent's behaviours may be adjusted as the interaction progresses.

International outlook:
This thesis work complements the work on non-verbal interactions carried out in the European FP7 project TARDIS, whose application is Serious Games for job-interview training, and the work on social signal processing carried out in the SSPNET network of excellence. A collaboration will also be set up with Candy Sidner, professor in the Computer Science department of Worcester Polytechnic Institute, an expert in computational models of verbal and non-verbal interaction and the originator of the DISCO dialogue engine (Rich et al., 2012).

E. Bevacqua, E. de Sevin, S.J. Hyniewska, C. Pelachaud (2012), A listener model: Introducing personality traits, Journal on Multimodal User Interfaces, special issue Interacting ECAs, Elisabeth André, Marc Cavazza and Catherine Pelachaud (Guest Editors), 6:27–38, 2012.
M. Chollet, M. Ochs, C. Pelachaud (2012), Interpersonal stance recognition using non-verbal signals on several time windows, Workshop Affect, Compagnon Artificiel, Interaction, Grenoble, November 2012, pp. 19-26.
C. Clavel, G. Richard (2010), Reconnaissance acoustique des émotions, Systèmes d’interactions émotionnelles, C. Pelachaud, chapter 5, 2010.
C. Clavel, G. Adda, F. Cailliau, M. Garnier-Rizet, A. Cavet, G. Chapuis, S. Courcinous, C. Danesi, A-L. Daquo, M. Deldossi, S. Guillemin-Lanne, M. Seizou, P. Suignard (2013), Spontaneous Speech and Opinion Detection: Mining Call-centre Transcripts, Language Resources and Evaluation, April 2013.
M. El-Bèze, A. Jackiewicz, S. Hunston (2010), Opinions, sentiments et jugements d’évaluation, Revue TAL, Volume 51, Number 3.
J.R. Martin, P.R.R. White (2007), Language of Evaluation: Appraisal in English, Palgrave Macmillan, November 2007.
G. McKeown, M. Valstar, R. Cowie, M. Pantic, M. Schröder (2012), The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent, IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 5-17, Jan.-March 2012.
R. Niewiadomski, S. Hyniewska, C. Pelachaud (2011), Constraint-Based Model for Synthesis of Multimodal Sequential Expressions of Emotions, IEEE Transactions on Affective Computing, vol. 2, no. 3, pp. 134-146, July 2011.
A. Osherenko, E. André, T. Vogt (2009), Affect sensing in speech: Studying fusion of linguistic and acoustic features, International Conference on Affective Computing and Intelligent Interaction and Workshops, 2009.
C. Rich, C. L. Sidner (2012), Using Collaborative Discourse Theory to Partially Automate Dialogue Tree Authoring, IVA 2012, pp. 327-340.
M. Schröder, E. Bevacqua, R. Cowie, F. Eyben, H. Gunes, D. Heylen, M. ter Maat, G. McKeown, S. Pammi, M. Pantic, C. Pelachaud, B. Schuller, E. de Sevin, M. Valstar, and M. Wöllmer (2011), Building Autonomous Sensitive Artificial Listeners, IEEE Transactions on Affective Computing, pp. 134-146, October 2011.
P. Suignard (2010), NaviQuest : un outil pour naviguer dans une base de questions posées à un Agent Conversationnel, WACA, October 2010.


6-4(2013-05-01) Ph D Visual articulatory biofeedback for speech therapy Grenoble France

Funded PhD position. Visual articulatory biofeedback for speech therapy

The project aims to adapt and evaluate a visual articulatory biofeedback system for speech therapy. The system drives, in real time, the visible and non-visible articulators (such as the tongue) of a 3D avatar from the patient's speech. It is based on statistical models built by machine learning from acoustic and articulatory data. Data acquisition and system evaluation will be carried out in collaboration with the Lyon University Hospital (CHU de Lyon).
Keywords: speech technology, 3D avatar, machine learning, augmented reality, speech therapy.



Pierre BADIN, DR2 CNRS
Dept Parole & Cognition (ex ICP), GIPSA-lab, UMR 5216, CNRS – Grenoble University
Address: GIPSA-lab / DPC, ENSE3, Domaine universitaire, 11 rue des Mathématiques, BP 46 - 38402 Saint Martin d’Hères cedex, France
Fax: Int + 33 (0)476.57.47.10 - Tel: Int + 33 (0)476.57.48.26

6-5(2013-05-01) Open positions for Research Engineers in Speech and Language Cambridge UK

Positions description: Open positions for Research Engineers in Speech and Language Technology

The Speech Technology Group at Toshiba Cambridge Research Lab (STG-CRL), in Cambridge UK, is looking for talented individuals to lead and contribute to our ongoing research in Statistical Speech and Language Processing, in specific areas such as Speech Recognition, Statistical Spoken Dialog and Speech Synthesis.

The lab in Cambridge, in collaboration with other Toshiba groups and speech laboratories in China and in Japan, covers all aspects of speech technology at many levels, from basic and fundamental research to industrial development. We support our researchers in building their careers by giving them the freedom to publish their results and by investing in innovation and creation to address real problems in speech and language technology. STG-CRL also has strong connections with EU universities, and especially with the Cambridge University Engineering Department.

Outstanding PhD-level candidates at all levels of experience are encouraged to apply. Candidates should be highly motivated and team-oriented, and should have the ability to work independently. A strong mathematical background and excellent knowledge of statistics are required. Very good programming skills are desired. For team-leader roles in particular, researchers with a solid research track record, seniority and international research experience will be considered.

The Toshiba Cambridge Research Lab is located in the Science Park of the university city of Cambridge.

To apply send your CV and a covering letter to

Informal enquiries about the open positions should be made to Prof. Yannis Stylianou.

Closing date for applications is June 30th 2013 (or until posts filled).

6-6(2013-05-01) PhD student Learning Pronunciation Variants in a Foreign Language (full time) Radboud University Nijmegen, The Netherlands

PhD student Learning Pronunciation Variants in a Foreign Language (full time)

Faculty of Arts, Radboud University Nijmegen, The Netherlands

Vacancy number: 23.12.13

Closing date: 24 May 2013

As a PhD student in this project you will investigate the interplay between exemplars and abstract representations, which is expected to vary with processing speed and experimental task, and to evolve during learning. The student will investigate these issues with behavioural experiments investigating how native speakers of Dutch learn pronunciation variants of French words with schwa deletion.

Learning a foreign language implies learning pronunciation variants of words in that language. This includes the words' reduced pronunciation variants, which contain fewer and weaker sounds than the words' canonical variants (e.g. 'cpute' for English 'computer'), and which are highly frequent in casual conversations. The learner has to build mental representations (exemplars and possibly also abstract lexical representations) for these variants. Importantly, late learners will build representations that differ significantly from native listeners' representations, since reduction patterns in their native language will shape their interpretation of reduction patterns in the foreign language. The goal of this Vici project is to develop the first, fully specified, theory of how late learners of a foreign language build mental representations for pronunciation variants in that language.

The dissertation will consist of an introduction, at least three experimental chapters that have been submitted to high impact international journals, and a General Discussion.




What we expect from you

· You have or shortly expect to obtain a Master's degree in a field related to speech processing, such as phonetics, linguistics, psychology, or cognitive neuroscience;
· you have an excellent written and spoken command of English;
· you have demonstrable knowledge of data analysis;
· you preferably have knowledge of the phonetics / phonology of French;
· you preferably have knowledge of the phonetics / phonology of Dutch.




What we have to offer

We offer you:

- full time employment at the Faculty of Arts, Radboud University Nijmegen;
- in addition to the salary: an 8% holiday allowance and an 8.3% end-of-year bonus;
- the starting salary will amount to €2,042 gross per month on a full-time basis; the salary will increase to €2,612 gross per month on a full-time basis in the fourth year (salary scale P);
- you will be appointed for an initial period of 18 months, after which your performance will be evaluated;
- if the evaluation is positive, the contract will be extended by 2 years (on the basis of a 38-hour working week);
- you will be classified as a PhD student (promovendus) in the Dutch university job-ranking system (UFO).




Further information

- On the research group Speech Comprehension:
- On the project leader:
- Or contact Prof. dr. Mirjam Ernestus, leader of the Vici project, telephone: +31 24 3612970, E-mail:






It is Radboud University Nijmegen's policy to only accept applications by e-mail. Please send your application, including your letter of motivation, curriculum vitae and transcripts of your university grades, and stating vacancy number 23.12.13, for the attention of Mr drs. M.J.M. van Nijnatten, before 24 May 2013.



6-7(2013-05-01) PhD position with scholarship - Silent speech interface GIPSA-lab, Grenoble, France

Available PhD position with scholarship - Silent speech interface

GIPSA-lab, Grenoble, France

Incremental speech synthesis for a real-time silent speech interface


The design of a silent speech interface, i.e. a device allowing speech communication without the necessity of vocalizing the sound, has recently received considerable attention from the speech research community [1]. In the envisioned system, the speaker articulates normally but does not produce any audible sound. Application areas are in the medical field, as an aid for larynx-cancer patients, and in the telecommunication sector, in the form of a “silent telephone”, which could be used for confidential communication or in very noisy environments. In [2], we have shown that ultrasound and video imaging can be efficiently combined to capture the articulatory movements during silent speech production; the ultrasound transducer and the video camera are placed beneath the chin and in front of the lips, respectively. So far, our work has focused mainly on the estimation of the target spectrum from the visual articulatory data (using artificial neural networks, Gaussian mixture regression and hidden Markov modeling). The other challenging issue concerns the estimation of acceptable prosodic patterns (i.e. the intonation of the synthetic voice) from silent articulatory data only. To address this ill-posed problem, one solution consists of splitting the mapping process into two consecutive steps: (1) a visual speech recognition step, which estimates the most likely sequence of words given the articulatory observations, and (2) a text-to-speech (TTS) synthesis step, which generates the audio signal from the decoded word sequence. In that case, the target prosodic pattern is derived from the linguistic structure of the decoded sentence. The major drawback of this mapping method is that it cannot run in real time. Indeed, while the visual speech recognition step can be done online (i.e. words are decoded a short amount of time after they have been pronounced), standard TTS systems need to know the entire sentence to estimate the target prosody. This introduces a large delay between the (silent) articulation and the generation of the synthetic audio signal, which prevents the communication partners from having a fluent conversation. The main goal of this PhD project is to design a real-time silent speech interface, in which the delay between the articulatory gesture and the corresponding acoustic event has to be constant and as short as possible.


The goal of this PhD project is twofold:

(1) Reducing the delay between the recognition and the synthesis steps, by designing a new generation of TTS system, called an “incremental TTS system” [3]. This system should be able to synthesize the decoded words, with acceptable prosody, as soon as they are provided by the visual speech recognition system.

(2) Designing experimental paradigms in order to evaluate the system in realistic communication situations (face-to-face, remote/telephone-like interaction, human-machine interaction). The goal is to study how a silent speaker benefits from the acoustic feedback provided by the incremental TTS and how he/she adapts his/her own articulation to maximize the efficiency of the communication.


Dr. Thomas Hueber, Dr. Gérard Bailly (CNRS/GIPSA-lab)

Duration / Salary: 36 months (October 2013 - October 2016) / ~ 1400/month minimum (net salary).

Research fields: multimodal signal processing, machine learning, interactive systems, experimental design

Master’s or engineer’s degree in computer science, signal processing or applied mathematics.

Good skills in mathematics (machine learning) and programming (Matlab, C, Max/MSP). Knowledge of speech processing or computational linguistics would be appreciated.

To apply: send your CV, a transcript of your Master's grades and a cover letter to



[1] B. Denby, T. Schultz, K. Honda, T. Hueber, et al., “Silent Speech Interfaces,” Speech Communication, vol. 52, no. 4, pp. 270-287, 2010.

[2] T. Hueber, E. L. Benaroya, G. Chollet, et al., “Development of a Silent Speech Interface Driven by Ultrasound and Optical Images of the Tongue and Lips”, Speech Communication, vol. 52, no. 4, pp. 288-300, 2010.

[3] H. Buschmeier, T. Baumann, B. Dosch, D. Schlangen, S. Kopp, “Combining Incremental Language Generation and Incremental Speech Synthesis for Adaptive Information Presentation”, in Proc. of the 13th Sigdial meeting, pp. 295-303.


6-8(2013-05-10) Postdoctoral fellow at Toronto Rehabilitation Institute, University of Toronto

We are seeking a skilled postdoctoral fellow (PDF) whose expertise intersects automatic speech recognition (ASR) and human-robot interaction (HRI). The PDF will work with a team of internationally recognized researchers to create an automated speech-based dialogue system between computers and robotic systems, and individuals with dementia and other cognitive impairments. These systems will automatically adapt the vocabularies, language models, and acoustic models of the component ASR to data collected from individuals with Alzheimer’s disease. Moreover, this system will analyze the linguistic and acoustic features of a user’s voice to infer the user’s cognitive and linguistic abilities, and emotional state. These abilities and mental states will in turn be used to adapt a speech output system to be more tuned to the user.


Work will involve programming, data analysis, dissemination of results (e.g., papers and conferences), and partial supervision of graduate and undergraduate students. Some data collection may also be involved.


The successful applicant will have:

1) A doctoral degree in a relevant field of computer science, electrical engineering, biomedical engineering, or a relevant discipline;

2) Evidence of impact in research through a strong publication record in relevant venues;

3) Evidence of strong collaborative skills, including possibly supervision of junior researchers, students, or equivalent industrial experience;

4) Excellent interpersonal, written, and oral communication skills;

5) A strong technical background in machine learning, natural language processing, robotics, and human-computer interaction.


This work will be conducted at the Toronto Rehabilitation Institute, which is affiliated with the University of Toronto.


--== About the Toronto Rehabilitation Institute ==--


One of North America’s leading rehabilitation sciences centres, Toronto Rehabilitation Institute (TRI) is revolutionizing rehabilitation by helping people overcome the challenges of disabling injury, illness, or age-related health conditions to live active, healthier, more independent lives. It integrates innovative patient care, ground-breaking research and diverse education to build healthier communities and advance the role of rehabilitation in the health system. TRI, along with Toronto Western, Toronto General, and Princess Margaret Hospitals, is a member of the University Health Network and is affiliated with the University of Toronto.


If interested, please send a brief (1-2 page) statement of purpose, an up-to-date resume, and contact information for 3 references to Alex Mihailidis and Frank Rudzicz by 31 July 2013. The position will remain open until filled.




Frank Rudzicz, PhD.

   Scientist, Toronto Rehabilitation Institute;

   Assistant professor, Department of Computer Science,

         University of Toronto;

   Founder and Chief Science Officer, Thotra Incorporated



6-9(2013-05-15) Postdoctoral position within the ANR DIADEMS project, LaBRI, Bordeaux, France

Postdoctoral position offered within the ANR DIADEMS project (Description, Indexation, Accès aux Documents Ethnomusicologiques et Sonores).



- Postdoctoral topic: instrument identification / classification


Duration: 12 months

Salary: approximately €2,000/month

Desired start date: September 2013


Automatic instrument recognition and classification by instrument family are an active research area in MIR (Music Information Retrieval) [Hei09] [Kit07] [Her06] [Ess06]. The main techniques rely on statistical methods using MFCC-type audio features. Here we wish to open a new path that links speech processing and music processing, by treating a musical performance as a sentence and the instrument or instrumentalist as a speaker.


This work will be carried out alongside an ongoing PhD thesis on the characterization and identification of the singing voice. In that thesis, we proposed a method for identifying the segments that contain singing voice in polyphonic recordings (e.g. 'pop' music). The current focus is to determine which signal parameters are most relevant for characterizing different singing styles.


One of the directions we wish to pursue is to identify the instrument by tracking its vibrato, in a way similar to what is proposed for the singing voice. By emphasizing the temporal rather than the spectral dimension, we will also be able to observe the sequencing of breaths, sound attacks, and timbral changes used by the musician. This exploratory work will first require experiments on simple databases (such as [Fri97] and [Got03]) to validate our approach, before applying our algorithms to the DIADEMS project data.
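
As a rough illustration of the vibrato-tracking idea, the sketch below estimates a vibrato rate and extent from a fundamental-frequency (f0) contour. It is only a sketch under stated assumptions: the function names, the zero-crossing estimator, and the synthetic test contour are hypothetical choices made for this example, not the method of the DIADEMS project or of the thesis mentioned above, and the f0 contour is assumed to have been extracted beforehand by some pitch tracker.

```python
import math


def vibrato_features(f0_hz, frame_rate_hz):
    """Estimate vibrato rate (Hz) and extent (Hz) from an f0 contour.

    Hypothetical helper: the rate comes from counting zero crossings of
    the mean-removed contour, the extent is half the peak-to-peak
    deviation. Both are crude estimators chosen for illustration only.
    """
    if len(f0_hz) < 2:
        raise ValueError("need at least two f0 frames")
    mean_f0 = sum(f0_hz) / len(f0_hz)
    centered = [f - mean_f0 for f in f0_hz]
    # Count sign changes between consecutive frames.
    crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0)
    )
    duration_s = (len(f0_hz) - 1) / frame_rate_hz
    rate_hz = crossings / (2.0 * duration_s)  # two crossings per cycle
    extent_hz = (max(centered) - min(centered)) / 2.0
    return rate_hz, extent_hz


def synthetic_contour(base_hz, rate_hz, extent_hz, frame_rate_hz, n_frames):
    """Build a sinusoidally modulated f0 contour (assumed test signal)."""
    return [
        base_hz
        + extent_hz * math.sin(2 * math.pi * rate_hz * i / frame_rate_hz + 0.3)
        for i in range(n_frames)
    ]


# Usage: a 2-second contour at 100 frames per second.
contour = synthetic_contour(440.0, 6.0, 5.0, 100.0, 201)
rate, extent = vibrato_features(contour, 100.0)
print(rate, extent)
```

Features of this kind (rate, extent, and their evolution over time) could then be fed to a classifier; real recordings would need voicing detection and smoothing first, which this sketch leaves out.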



- References:


[Hei09] T. Heittola, A. Klapuri, T. Virtanen, 'Musical Instrument Recognition in Polyphonic Audio Using Source-Filter Model for Sound Separation', in Proc. 10th Int. Society for Music Information Retrieval Conf. (ISMIR 2009), Kobe, Japan, 2009.


[Kit07] T. Kitahara, M. Goto, K. Komatani, T. Ogata, H. G. Okuno, 'Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps', EURASIP Journal on Advances in Signal Processing, Special Issue on Music Information Retrieval based on Signal Processing, Vol. 2007, No. 51979, pp. 1-15, 2007.


[Her06] P. Herrera-Boyer, A. Klapuri, M. Davy, 'Automatic classification of pitched musical instrument sounds', in Signal Processing Methods for Music Transcription, pp. 163-200, Springer, 2006.


[Ess06] S. Essid, G. Richard, B. David, 'Instrument recognition in polyphonic music based on automatic taxonomies', IEEE Transactions on Audio, Speech & Language Processing, 14(1):68-80, 2006.


[Fri97] L. Fritts, 'Musical Instrument Samples', Univ. Iowa Electronic Music Studios, 1997–. [Online]. Available:


[Got03] M. Goto, H. Hashiguchi, T. Nishimura, R. Oka, 'RWC music database: Music genre database and musical instrument sound database', ISMIR 2003, pp. 229-230.




Description of the DIADEMS project (Partners: LaBRI, IRIT, LESC, Parisson, LIMSI, MNHN, LAM-IJLRA)


The Laboratoire d'Ethnologie et de Sociologie Comparative (LESC), which includes the Centre de Recherche en Ethnomusicologie (CREM) and the Centre d'Enseignement et de Recherche en Ethnologie Amérindienne (EREA), and the Laboratoire d'Eco-anthropologie of the Muséum National d'Histoire Naturelle (MNHN) face the need to index the sound archives they manage and to locate their contents, which is long, tedious, and costly work.


During the interdisciplinary summer school Sciences et Voix 2010 organized by the CNRS, a convergence of interests emerged among acousticians, ethnomusicologists, and computer scientists: advanced sound analysis tools developed by indexing specialists now exist that make it easier to locate, access, and index content.


The context of the project is the indexing of, and improved access to, the sound archives of the LESC: the CREM collection and the EREA ethnolinguistics collection (Maya 'sung-spoken' speech), as well as that of the MNHN (traditional African music). It continues a reflection begun in 2007 on access to research sound data: since no open-source application was available on the market, the CREM-LESC, the LAM, and the sound library (Phonothèque) of the MMSH in Aix-en-Provence studied the design of an innovative, collaborative tool that meets professional needs tied to the temporal nature of the documents while being suited to the requirements of the research sector. With financial support from the CNRS Très Grand Equipement (TGE) ADONIS and the Ministry of Culture, the Telemeta platform, developed by the company PARISSON, went online in May 2011. Basic signal processing analysis tools are already available on this platform.


However, a set of advanced and innovative tools is needed to support automatic or semi-automatic indexing of these sound data, which come from recordings that are sometimes long, very heterogeneous in content, and of varying quality. The goal of the DIADEMS project is to provide some of these tools and to integrate them into Telemeta, in response to users' needs. The scientific objectives of the partners are therefore complementary. The technology providers (IRIT, LIMSI, LaBRI, and LAM) will have to:

- Provide existing technologies such as speech detection, music detection, and speaker diarization. These tools aim to extract homogeneous segments of interest to the user. These systems will have to cope with the diversity of the collections studied in this project; their heterogeneity stems from the recording conditions, the genre and nature of the documents, and their geographical origin. These so-called state-of-the-art systems will have to be adapted to users' needs.

- Propose innovative tools for exploring the content of homogeneous segments. Work on the distinction between spoken, declaimed, and sung voice, on singing, on singing turns, and on musical similarity search is not yet mature. Genuine research remains to be done, and having musicologists and ethnomusicologists on hand is an asset.

Ethnomusicologists, ethnolinguists, acousticians specializing in the voice, and specialized archivists will play an important role in the project as future users of the indexing tools: the archivists must take ownership of the tools and contribute their experience in order to adapt these tools to their indexing needs.


Substantial exchange must take place between those who provide a tool, those who integrate it, and those who use it. Effort must go into the visualization of results, with the aim of strongly assisting indexing by making it effectively semi-automatic. For ethnomusicologists and musicologists, the objective goes beyond indexing: through back-and-forth exchanges with the technology designers, the aim is to identify the relevant information extraction tools.


Jean-Luc Rouas LABRI 351, cours de la Libération 33405 Talence cedex France (+33) 5 40 00 35 08



6-10(2013-05-16) PhD position, Avignon, France

Speaker recognition in noisy environments

In recent years, we have achieved very good speaker recognition performance, despite the presence of session variability. Session variability is taken into account at the scoring stage by using a covariance matrix that models it. This process is carried out in the i-vector space [1]. The i-vector concept has become a standard in speaker recognition.

In the latest international evaluation, NIST 2012, we faced a new difficulty: additive noise [2], that is, ambient noise. Research on reducing the impact of noise on speaker recognition systems is largely motivated by the need to apply speaker recognition technologies on portable devices or over the Internet. While the technology promises an additional level of biometric security to protect the user, the practical implementation of these systems faces many challenges. One of the most important challenges to overcome is environmental noise. Because these systems are mobile, noise sources can vary widely over time and may be unknown.

We propose to work within this framework by proposing strategies to compensate for the effect of additive noise. These strategies can act at different levels of the recognition process (at the signal level, at the acoustic model level, at the i-vector level, and at the scoring level):

    • Signal denoising

    • Effect of noise on VAD (voice activity detection)

    • Adding noise to the models

    • Integrating the statistical characteristics of the noise into the scoring phase
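
To make the notion of additive noise concrete, here is a minimal sketch of how a noise recording might be scaled and added to clean speech at a chosen signal-to-noise ratio, a common way of simulating noisy conditions when studying strategies such as those listed above. The function names (mean_power, mix_at_snr, measure_snr_db) and the toy signals are hypothetical; this is not the protocol of the NIST evaluation or of the proposed thesis.

```python
import math


def mean_power(samples):
    """Average power of a signal (mean of the squared samples)."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, target_snr_db):
    """Add noise to speech so that speech/noise power matches target_snr_db.

    Returns the noisy mixture and the scaled noise that was added.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # From SNR_dB = 10 * log10(P_speech / P_noise), solve for the noise power.
    target_noise_power = p_speech / (10 ** (target_snr_db / 10.0))
    gain = math.sqrt(target_noise_power / p_noise)
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def measure_snr_db(speech, noise):
    """Signal-to-noise ratio in dB between a speech and a noise signal."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Usage with toy signals: a sine wave as "speech" and a deterministic
# pseudo-noise sequence (both are stand-ins for real recordings).
speech = [math.sin(2 * math.pi * 5 * i / 1000) for i in range(1000)]
noise = [((i * 7919) % 101 - 50) / 50.0 for i in range(1000)]
noisy, scaled = mix_at_snr(speech, noise, 10.0)
print(measure_snr_db(speech, scaled))
```

Because the mixing is purely additive, the same routine can be reused at several SNR levels to see how a denoising front end, a VAD, or a scoring method degrades as the noise level rises.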

In a second part of the work, we propose to set up the best possible conditions for making the system as robust to noise as possible. For example, the choice of the utterance spoken by the speaker can influence system performance [3]. Should all speakers use the same utterance or, on the contrary, is each speaker best distinguished from the others by a specific set of acoustic units? In the latter case, a strategy is needed to determine the set of acoustic units that best discriminates a speaker from the other speakers. Other noise robustness strategies are to be proposed and studied in this thesis. One avenue to explore is missing-feature theory, which has been used in speech processing [4][5][6].

State-of-the-art speaker recognition systems are fundamentally based on the UBM (Universal Background Model), which is too simple a model for processing and modeling speech. In noisy conditions the task becomes more complex, so it is legitimate to reconsider whether this model is adequate for it. We propose to adapt an approach based on HMMs (or another model) to this task while taking advantage of recently proposed advances (factor analysis, i-vectors, …).

[1] Pierre-Michel Bousquet, Driss Matrouf and Jean-François Bonastre, « Intersession compensation and scoring methods in the i-vectors space for speaker recognition », Interspeech 2011, Florence.

[2] Miranti Indar Mandasari, Mitchell McLaren and David A. van Leeuwen, « The Effect of noise on modern automatic speaker recognition systems », ICASSP 2012.

[3] Anthony Larcher, Pierre-Michel Bousquet, Kong-Aik Lee, Driss Matrouf, Haizhou Li, Jean-François Bonastre, « I-vectors in the context of phonetically-constrained short utterances for speaker verification », ICASSP 2012: 4773-4776.

[4] M.P. Cooke, P.G. Green, L. Josifovski, and A. Vizinho, « Robust ASR with unreliable data and minimal assumptions », in Proc. Robust’99, 1999.

[5] M.P. Cooke, P.G. Green, L. Josifovski, and A. Vizinho, « Robust Automatic Speech Recognition with missing and unreliable acoustic data », Speech Communication, 2000.

[6] B. Raj, M.L. Seltzer, and R.M. Stern, « Reconstruction of missing features for robust speech recognition », Speech Communication, 2004.




6-11(2013-06-03) Two PhD positions, Université d'Avignon
The Brain and Language Research Institute (BLRI) labex will fund two PhD topics starting next academic year; one of them may be the topic proposed below (depending on the applications received).
Schedule: application deadline: 10 June; interviews of shortlisted candidates: 24 June
Grant: €1,684.93 gross per month (€1,368 net)
Application file: detailed CV; academic transcripts; statement of motivation and/or scientific project related to the topic
Scientific contacts: Corinne Fredouille and Christine Meunier
Administrative contact:
Description of the topic:

Title: Detection of deviant zones in pathological speech: contribution of automatic speech processing compared with human expertise

Supervisors: Corinne Fredouille, Christine Meunier

Host laboratory: Laboratoire Informatique d'Avignon (in collaboration with the Laboratoire Parole et Langage, Aix-en-Provence)

Field and doctoral school: computer science, doctoral school ED536 of the University of Avignon

Schedule:

. application deadline: 10 June

. interviews of shortlisted candidates: 24 June

Grant: €1,684.93 gross per month (€1,368 net)

Scientific description

While defining the range of variability in normal speech is a key issue for current linguistic theories, one way of approaching its limits is to try to determine its boundaries through pathological variation. As Duffy and Kent (2001) point out, « Science often takes advantages of nature’s accidents to learn the principles of a process ». On this basis, knowledge of pathological speech, grounded in an understanding of the alteration phenomena that are « observable » in the speech production of patients suffering from speech disorders, becomes a necessity.

Dysarthria is a group of speech disorders resulting from neurological impairments of speech motor control. Substantial variations occur in dysarthric speech due to a deficit in the spatio-temporal execution of speech movements, which can affect different levels of speech production (respiratory, laryngeal and supralaryngeal). The vast majority of research on dysarthric speech relies on perceptual analyses. The main reason is that a dysarthric patient is dysarthric because he or she sounds dysarthric. The best-known work at the international level is that of Darley et al. (1975). It led to the organization of dysarthria into 6 classes (extended with 2 additional classes by Duffy, 1995) on the basis of physiopathological clusters defined from the co-occurrence of the most deviant features perceived by a listening panel. The hypothesis underlying the construction of these clusters is that a set of simultaneously disturbed features, related to the neurological impairment, would reflect a particular physiopathological process.

Although this classification is still used today, notably to evaluate dysarthric speech in clinical practice, it remains controversial for two main reasons: the subjectivity of perceptual evaluation in general, and the difficulty for a human listener, even an experienced one, of distinguishing and perceptually assessing the multiple dimensions to be taken into account when evaluating dysarthric speech. Consequently, research conducted from the 1980s to the present has aimed at combining these perceptual analyses with more objective and quantitative methods, such as instrumental analyses based on acoustic or physiological measurements (see the literature review in Kay, 2012). Although instrumental analyses can rely on semi-automatic or even fully automatic processing, the in-depth acoustic analysis needed to understand the deviant phenomena that dysarthria introduces into the speech signal remains very time-consuming for a human expert. As a result, a large proportion of the studies in the literature are based either on a very small number of patients or on a single, well-targeted pathology. However, the large variability of the deviant phenomena observed in dysarthric speech, depending on the patient’s pathology, the stage of the disease, or the severity of the dysarthria, requires the analysis of a large patient population.

The aim of this thesis is to study how automatic speech processing tools could make it possible to process large populations of dysarthric patients and to focus human experts’ attention on well-identified deviant zones of the signal for further in-depth analysis. This work will rely notably on the automatic speech transcription system developed at the LIA and on its research on quality measures for transcriptions (Lecouteux, 2008 and Senay, 2011). The granularity of deviant zone detection (here potentially the word or the word sequence) will then be refined, in a second step, by tools working at lower levels, down to the phoneme (Fredouille, 2011).
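
One way to picture word-level detection of deviant zones is the sketch below, which groups consecutive words whose recognizer confidence falls under a threshold into candidate zones for expert review. Everything in it is an assumption made for illustration: the Word and Zone structures, the find_deviant_zones function, the 0.5 threshold, and the sample values are hypothetical and do not describe the LIA transcription system or its quality measures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float       # start time in seconds
    end: float         # end time in seconds
    confidence: float  # assumed recognizer confidence in [0, 1]


@dataclass
class Zone:
    start: float
    end: float
    words: List[str]


def find_deviant_zones(words: List[Word], threshold: float = 0.5) -> List[Zone]:
    """Group runs of consecutive low-confidence words into zones.

    A word is flagged when its confidence is below `threshold`
    (a hypothetical value); adjacent flagged words are merged.
    """
    zones: List[Zone] = []
    current = None
    for w in words:
        if w.confidence < threshold:
            if current is None:
                current = Zone(w.start, w.end, [w.text])
            else:
                current.end = w.end
                current.words.append(w.text)
        elif current is not None:
            # A confident word closes the zone in progress.
            zones.append(current)
            current = None
    if current is not None:
        zones.append(current)
    return zones


# Usage with made-up recognizer output.
hypothesis = [
    Word("le", 0.0, 0.2, 0.9),
    Word("petit", 0.2, 0.5, 0.3),
    Word("chat", 0.5, 0.9, 0.2),
    Word("dort", 0.9, 1.1, 0.8),
    Word("ici", 1.1, 1.5, 0.4),
]
for zone in find_deviant_zones(hypothesis):
    print(zone.start, zone.end, zone.words)
```

The time spans of the resulting zones could then be handed to finer-grained tools, for instance a phoneme-level analysis, in line with the two-step refinement described above.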

This work will attempt to answer the following questions:

Given the variability of the deviant phenomena observed in dysarthric speech and reported in the literature, which ones is an automatic detection system able to detect?

Is an automatic system able to highlight the same deviant phenomena as a human expert does in a perceptual evaluation?

Are the deviant zones detected by the automatic system relevant for phoneticians?

Is it possible to relate the detected deviations to the patient’s physiopathology (e.g. hypokinetic features for Parkinson's disease, paralytic features for ALS, …)?

Work on the LIA automatic speech transcription system should also open up perspectives on the design of a system of objective measures of the intelligibility of dysarthric patients.

This thesis will be carried out in close collaboration between the LIA (Corinne Fredouille), for its expertise in automatic speech processing systems, the LPL (Christine Meunier and Alain Ghio), for its expertise in acoustic-phonetic analyses and perceptual evaluations, and the hospitals of La Timone (Dr Danièle Robert) and Pays d’Aix (Pr. François Viallet), for their clinical expertise. It will be based on the corpus of dysarthric patients built within the ANR DesPhoAPady project (2009-2012; Fougeron, 2010). This corpus covers a wide range of patients suffering from various pathologies (Parkinson's disease, amyotrophic lateral sclerosis (ALS), cerebellar syndrome, …) and presenting different levels of dysarthria severity.


Bibliography:

J. R. Duffy, R. D. Kent, « Darley's contributions to the understanding, differential diagnosis, and scientific study of the dysarthrias », Aphasiology 15(3):275-289, 2001.

F. L. Darley, A. E. Aronson, J. R. Brown, « Motor Speech Disorders », Philadelphia: W.B. Saunders, 1975.

J. R. Duffy, « Motor speech disorders: substrates, differential diagnosis and management », Mosby-Year Book, St Louis, 1st edition, 1995.

T. S. Kay, « Spectral analysis of stop consonants in individuals with dysarthria secondary to stroke », PhD thesis, Department of Communication Sciences and Disorders, Faculty of the Louisiana State University and Agricultural and Mechanical College, USA, 2012.

B. Lecouteux, « Reconnaissance automatique de la parole guidée par des transcriptions a priori », PhD thesis, Université d'Avignon et des Pays de Vaucluse, 2008.

G. Senay, « Approches semi-automatiques pour la recherche d’information dans les documents audio », PhD thesis, Université d'Avignon et des Pays de Vaucluse, 2011.

C. Fredouille, G. Pouchoulin, « Automatic detection of abnormal zones in pathological speech », International Congress of Phonetic Sciences (ICPhS'11), Hong Kong, 17-21 August 2011.

C. Fougeron et al., « Developing an acoustic-phonetic characterization of dysarthric speech in French », LREC'10 (International Conference on Language Resources and Evaluation), Malta, May 2010.

Application file:

. detailed CV

. academic transcripts

. statement of motivation and/or scientific project related to the topic

Scientific contacts: Corinne Fredouille and Christine Meunier

Administrative contact:

6-12(2013-05-16) Internships in Natural Language Processing and Machine Translation, Dublin City University, Ireland

At the Centre for Next Generation Localisation (CNGL) in Dublin, Ireland, we have a number of internships available covering a wide range of topics in Natural Language Processing and Machine Translation based at our Dublin City University site.  The internships are available for both basic research and more applied research projects (including development-focused work).
Candidates are required to be registered as MSc or PhD students (by research) in their home universities while carrying out their internship in Dublin and need to provide written confirmation of this from their home institute. Please find the internship advertisement attached.
Details of a number of specific internships can be found at:
Closing date: 31st May 2013
For any informal enquiries please contact: CNGL Education and Outreach
Dr. Cara Greene, CNGL, DCU Phone: +353 (0)1 7006704 E-mail: cgreene (AT)
Application Procedure:
For formal applications, please download an application form from the link below and send it to cgreene (AT) by Friday 31st May 2013.

6-13(2013-05-17) Speech Technology Researcher/Developer (main focus ASR), Linguwerk GmbH, Dresden, Germany

vacant position as a

Speech Technology Researcher/Developer (main focus ASR)

in our research and development group in Dresden. We are a young
company located in Dresden dealing with speech technology products and
applicatuions, signal processing solutions and pattern recognition
techniques. Although we develop a new group of products our work is
closely related to scientific topics and we collaborate with several
german and international universities.

Job descriptions (in German) may be found in

Unfortunately our webpage is currently under construction and only
available in German at the moment. Nevertheless, English-speaking
applicants are very welcome; all our group members speak English.

Our team consists of speech technology scientists and engineers,
education science specialists, creatives, and people who love
scientific challenges.

If you are interested in joining and enriching our team, don't hesitate to
contact us and ask for further information. Also contact us for
information in English.

Note: The employer for this job will be the University in Dresden, but
you will be a member of our joint research group within our Lab.

Dr.-Ing. Rico Petrick
CEO / Geschäftsleitung
Linguwerk GmbH          Office: +49 351 6533-3807
Schnorrstraße 70        Fax:    +49 351 6533-6965
01069 Dresden           Mobile: +49 174 943 8817
Germany                 E-Mail:


6-14(2013-06-05) Specialist in speech processing, CUI/University of Geneva

The CUI/University of Geneva seeks a qualified candidate for one

           Specialist in speech processing

for a period of 6 months (class 13; 5988.-). Depending on hiring rate (100% to 50%), the contract will be extended up to 12 months.

In the context of a newly granted Swiss National Science Foundation Agora project, the CUI will collaborate with the Phonetisches Laboratorium of the Universität Zürich (UZH) on “Swiss VoiceApp - Your voice. Your identity.” The project will combine speech recognition and processing as well as linguistic knowledge to provide a smartphone application for Swiss German dialect recognition and some information on the user’s voice.

The main task of the successful candidate will be the development of a speech recognition engine based on speech corpora of dialectal variants. Knowledge of speech software, namely HTK and Praat, is highly recommended.

Applicants should possess a master’s degree or an engineering diploma in computational linguistics, or in computer science with strong knowledge of speech processing. Given the focus on Swiss German dialects, preference will be given to speakers of German.

The start date is expected to be during autumn/winter 2013; the position will remain open until filled.


Please send (1) your CV, (2) a copy of your degree(s), (3) a short letter explaining how your skills and interests fit the project, to Jean-Philippe Goldman, before July 31st, 2013, preferably electronically or by mail at the address below:

Jean-Philippe Goldman

CUI / UNI-Rondeau
CH-1227 Carouge, Switzerland


The CUI / University of Geneva is looking to hire a

           Specialist in speech processing

for a temporary position of 6 months (class 13; 5988.-). Depending on the hiring rate (100% to 50%), the contract may be extended to 12 months.

In the context of a project recently funded by the Swiss National Science Foundation, the CUI will collaborate with the Phonetisches Laboratorium of the Universität Zürich (UZH) on the project “Swiss VoiceApp - Your voice. Your identity.” The project will combine speech recognition techniques and linguistic knowledge to produce a smartphone application that identifies Swiss German dialects and gives users information about their own voice.

The main task of the selected candidate will be the development of a speech recognition system based on corpora of dialectal variants. Knowledge of speech software such as HTK is essential.

The candidate must hold an MA or an engineering diploma in computational linguistics, or in computer science with good knowledge of speech processing. Given the German-language focus of the project, preference will be given to a speaker of German. The contract can start in autumn or winter 2013.

Contact: Please send (1) your CV, (2) a copy of your degrees, (3) a letter on your skills and interests in relation to the project, to Jean-Philippe Goldman, before 31 July 2013, preferably electronically, or by mail at the following address:

Jean-Philippe Goldman

CUI / UNI-Rondeau
CH-1227 Carouge, Switzerland

6-15(2013-06-12) Specialist in speech processing- CUI/University of Geneva Switzerland

The CUI/University of Geneva seeks a qualified candidate for one
          Specialist in speech processing
for a period of 6 months (class 13; 5988.-). Depending on hiring rate (100% to 50%), the contract will be extended up to 12 months.
In the context of a newly granted Swiss National Science Foundation Agora project, the CUI will collaborate with the Phonetisches Laboratorium of the Universität Zürich (UZH) on “Swiss VoiceApp - Your voice. Your identity.” The project will combine speech recognition and processing as well as linguistic knowledge to provide a smartphone application for Swiss German dialect recognition and some information on the user’s voice.
The main task of the successful candidate will be the development of a speech recognition engine based on speech corpora of dialectal variants. Knowledge of speech software, namely HTK and Praat, is highly recommended.
Applicants should possess a master’s degree or an engineering diploma in computational linguistics, or in computer science with strong knowledge of speech processing. Given the focus on Swiss German dialects, preference will be given to speakers of German.
The start date is expected to be during autumn/winter 2013; the position will remain open until filled.
Contact: Please send (1) your CV, (2) a copy of your degree(s), (3) a short letter explaining how your skills and interests fit the project, to Jean-Philippe Goldman, before July 31st, 2013, preferably electronically or by mail at the address below:
Jean-Philippe Goldman CUI / UNI-Rondeau CH-1227 Carouge, Switzerland

6-16(2013-06-16) Faculty position at CLSP, Johns Hopkins University Baltimore

CLSP Faculty Position
May 31, 2013
The Center for Language and Speech Processing at the Johns Hopkins University seeks applicants for a tenure-track or tenured faculty member in speech and language processing. We especially welcome candidates who could strengthen our institutional drive for inter-disciplinary research. Rank will be dependent on the experience and accomplishments of the candidate.
Applicants must have a Ph.D. in a relevant discipline and will be expected to establish a strong, independent, multidisciplinary, internationally recognized research program. A commitment to quality teaching at the undergraduate and graduate levels is required. We are committed to building a diverse educational environment; women and minorities are especially encouraged to apply.
Primary appointments will be in the academic Department most appropriate for the candidate within the G.W.C. Whiting School of Engineering such as Electrical and Computer Engineering, Computer Science or Biomedical Engineering.
Applicants should apply using the online application here:
Applications will be accepted until the position is filled.

6-17(2013-06-18) Funded PhD thesis at GIPSA-lab, Grenoble

The Speech and Cognition Department of GIPSA-lab is offering a PhD grant funded by the Rhône-Alpes region under the ARC2 programme, on the topic of speech development disorders in deaf children with cochlear implants: assessment, diagnosis and avenues for remediation.

Funding is provided for 3 years, starting at the beginning of the 2013 academic year. The successful candidate will be affiliated with the LLSH doctoral school of Université Stendhal, Grenoble.


Required skills:

This topic is open to students holding a research Master's degree (Master 2 R) with a background in language sciences or speech therapy and experience in experimental phonetics or psychology. Familiarity with Matlab and standard audio signal analysis tools (such as Praat), as well as skills in statistics, is desirable but not required. Applications from health professionals working on language and/or hearing disorders (speech therapists, hearing-aid specialists, ENT physicians, etc.) are welcome. Candidates must have a good command of French (interaction with children).

Supervision: This topic will be co-supervised by Anne Vilain of the Speech and Cognition Department of GIPSA-lab and Michel Hoen of the DYCOG team (Brain Dynamics and Cognition) of the Lyon Neuroscience Research Centre (U1028/UMR5292), in collaboration with the Laboratoire de Psychologie et NeuroCognition in Grenoble.

Applications (detailed CV + cover letter) must be sent before 25 June 2013. They will be pre-selected before interview.





Topic details:

In 2010, more than 150,000 people worldwide had a cochlear implant, and half of them were children. Yet the speech difficulties of pre-lingually deaf children with cochlear implants have received little study. The few available studies reveal difficulties in producing and perceiving certain speech sounds, even after several years of use. It has been shown that speech disorders can lead to difficulties in school learning, which can have emotional and social repercussions for the child. Targeted speech therapy for children with cochlear implants is therefore an important issue for quality of life in the short and long term.

The present project aims to (i) assess the speech production difficulties that persist in pre-lingually deaf subjects after several years of cochlear implantation, and (ii) relate these difficulties to the subjects' ability to perceive phonetic contrasts, in order to (iii) propose suitable rehabilitation strategies that improve the phonological abilities of implanted children.

The studies will focus on a population of implanted children, which will be compared with a matched group of hearing children. The “Production” part will rely on the recording and acoustic analysis of corpora of guided and spontaneous speech, and the “Perception” part on identification and discrimination tests of speech sounds.

The project has strong theoretical and practical stakes. On the one hand, the practical results will make it possible to diagnose the specific problems that persist in production in implanted children, in order to guide rehabilitation practices. On the other hand, they will allow a better description of the chronological stages of the production and perception of spoken language in deaf children with implants, which will provide crucial reference points for therapists.

The theoretical contributions of this work concern the question of the link between speech production and perception, and the understanding of how an internal model for speech production is built during development. The second theoretical question addressed is the notion of a critical period for learning and the role of access to auditory information during the first years of life.

Keywords: hearing impairment - cochlear implant - perception / production relations in speech - language development disorder - quality of life of implanted children




--
Anne Vilain
Departement Parole et Cognition
Laboratoire GIPSA-lab
Universite Stendhal
BP 25
38040 Grenoble cedex 9
tel: 00 33 4 76 82 77 85
fax: 00 33 4 76 82 43 35

6-18(2013-06-19) PhD positions - Computer Science, Natural Language Processing - Marseille, France

PhD positions - Computer Science, Natural Language Processing - Marseille, France

The NLP group of the Laboratoire d'Informatique Fondamentale of the Aix Marseille University has 2 open PhD positions in Computer Science in the context of a European project (3 years).

Location: Campus de Luminy, Marseille, France
Starting : October 1st 2013
Deadline for application: July 15th 2013

The two fully funded PhD studentships will focus on 2 different aspects of the project:
1- Discourse parsing of speech.
This PhD will focus on developing discourse analysis methods adapted to a large range of conversational styles and domains, from spoken conversations to social media interaction. Automatic methods will be studied for five types of discourse analysis: discourse parsing, event and temporal structure, argumentation structure and intra-document coreference. The application domain of this research will be the automatic summarization of human-human dialogs. The target languages will be French and English.

2- Syntactic and semantic parsing with deep learning methods.
Recently, neural network approaches based on a deep learning paradigm have been successfully applied to some NLP tasks such as POS and NE tagging or dependency parsing. In this PhD we will investigate how this paradigm performs and can be adapted in the context of robust syntactic and semantic parsing of human-human conversations collected on social media platforms and telephone call centres.

The successful candidates should:
- hold a relevant degree in the field of Natural Language Processing or Machine Learning.
- have good algorithmic and programming skills

Description of the lab:
The University of Aix-Marseille (AMU) is currently one of the largest universities in France, created in 2012 from the merger of the 3 former Aix-Marseille universities (Université de Provence, Université de la Méditerranée, Université Paul Cézanne).
The LIF (Fundamental Computer Science Lab) is a joint research unit (JRU) between the Centre National de la Recherche Scientifique (CNRS) and AMU. The Natural Language Processing group of LIF aims at developing symbolic and statistical methods for the automatic processing of textual and speech data. The two main characteristics of this research group are: (1) to host both linguists working on the development of rich linguistic resources such as syntactic and semantic lexicons and computer scientists with very strong experience in numerical approaches for NLP; and (2) to work on both spoken and written languages, at the descriptive level as well as the application level, thanks to the strong expertise of some members of the group in Automatic Speech Recognition methods.

Alexis Nasr :
Frederic Bechet :
Benoit Favre :

6-19(2013-06-19) Post-doc at LIMSI-CNRS Orsay, France

Post-doc at LIMSI-CNRS for the ANR project DIADEMS

A one-year postdoctoral contract is offered in the Spoken Language Processing group of LIMSI-CNRS (Orsay), for the ANR project DIADEMS (Description, Indexation, Accès aux Documents Ethnomusicologiques et Sonores).

The DIADEMS project aims to develop an innovative platform for browsing, indexing and automatically analysing sound archives collected by ethnomusicologists and ethnolinguists. In particular, automatic tools for detecting speech, music and singing, and for structuring recordings into speech turns and singing turns, must be developed to suit the very varied nature of the archives analysed. For more information, see the project website:

Candidates should hold a PhD in speech, music or signal processing and have programming skills (development will be done in Python).






  • Expected start date: September or October 2013
  • Contract duration: 12 months

6-20(2013-06-19) 24-month fixed-term contract at the LACITO-CNRS laboratory
Job profile:

24-month fixed-term contract at the LACITO-CNRS laboratory

Level: Research engineer (Ingénieur d'études)
Contribution to building corpora of rare languages: online texts and dictionaries

CONTEXT
The HimalCo project, funded by the Agence Nationale de la (2013-2015), concerns the creation and exploitation of corpora for ten languages with an oral tradition. The corpora consist of audio resources (audio recordings), textual resources (transcriptions, annotations) and lexical data (dictionaries and word recordings):
The corpora and tools produced by the HimalCo project will eventually feed the platform of the Pangloss Collection, which itself brings together more than 70 corpora of rare languages:

MISSIONS
The person recruited will work in close collaboration with the engineer responsible for the Pangloss Collection, who also takes part in the HimalCo project. He or she will need to become autonomous quickly in carrying out the assigned tasks. The tasks for the project are varied. A non-exhaustive list:
- processing and formatting of corpora: task tracking, managing contacts with depositors, text/sound alignment, preparation and checking of metadata...
- depositing documents for long-term archiving and updating the corresponding web pages on the Pangloss Collection site
- development of online features for browsing parallel texts and
- development of tools and updating of existing tools for formatting, distributing and searching the corpora
- dialogue with the partners of the Pangloss Collection
- deployment of a task-tracking software tool (from initial contact to final deposit), if the necessary time can be found

SKILLS
- Knowledge of structuring textual data (HTML, XML, XSL) and audio data (wav)
- PHP
- Perl
- Java desirable
Good listening skills, to understand the needs and practices of linguists. Experience in studying and/or processing linguistic data would be a plus.

DURATION AND DATES
The total duration of the contract is 24 months. The planned dates are November 2013 to October 2015 inclusive. The start date can be brought forward to September or October 2013 if the person recruited so wishes. No commitment can be made regarding an extension of the contract beyond 24 months: the possibilities depend on future calls for research projects (for fixed-term contracts) and on the creation of posts (for permanent contracts).

Contact:


6-21(2013-06-21) Technical Engineer/Scientist (Project Manager) position, specialized in Speech and Multimodal technologies at ELDA

The European Language resources Distribution Agency (ELDA), a company specialized in Human Language Technologies within an international context, acting as the distribution agency of the European Language Resources Association (ELRA), is currently seeking to fill an immediate vacancy for a Technical Engineer/Scientist (Project Manager) position, specialized in Speech and Multimodal technologies.


Technical Engineer / Scientist (Project Manager) in Speech and Multimodal Technologies


Under the supervision of the CEO, the responsibilities of the Technical Engineer/Scientist include designing/specifying language resources, setting up production frameworks and platforms, carrying out quality control and assessment. He/she will be in charge of renovating the current language resources production workflows. This yields excellent opportunities for young, creative, and motivated candidates wishing to participate actively in the Language Engineering field. He/she will be in charge of conducting the activities related to language resources and Natural Language Processing technologies. The task will mostly consist in managing language resources production projects and co-ordinating ELDA’s participation in R&D projects while being also hands-on whenever required by the development team.


Profile :


  • PhD in computer science, speech, audiovisual/multimodal technologies
  • Experience and/or good knowledge in speech data collection, expertise in phonetics, transcription tools
  • Experience in speech recognition, synthesis, speaker ID and the well-used packages (e.g., HTK, Julius, ...) and the tools to produce, collect and assess quality of resources and datasets
  • Experience and/or good knowledge of the Language Technology area
  • Experience with technology transfer projects, industrial projects, collaborative projects within the European Commission or other international frameworks
  • Ability to work independently and as part of a team, in particular the ability to supervise members of a multidisciplinary team
  • Dynamic and communicative, flexible to combine and work on different tasks
  • Good knowledge of Linux and open source software
  • Proficiency in C++, PHP, Java, Django is a plus
  • Proficiency in French and English
  • Citizenship of (or residency papers for) a European Union country


Applications will be considered until the position is filled. The position is based in Paris.

Salary: Commensurate with qualifications and experience.

Applicants should email a cover letter addressing the points listed above together with a curriculum vitae to:

Khalid Choukri
55-57, rue Brillat-Savarin
75013 Paris
Fax : 01 43 13 33 30
Mail :


ELRA was established in February 1995, with the support of the European Commission, to promote the development and exploitation of Language Resources (LRs). Language Resources include all data necessary for language engineering, such as monolingual and multilingual lexica, text corpora, speech databases and terminology. The role of this non-profit membership Association is to promote the production of LRs, to collect and to validate them and, foremost, make them available to users. The association also gathers information on market needs and trends.


For further information about ELDA/ELRA, visit :   

6-22(2013-06-22) Visiting Research Engineer; Linguistics Research Labs; University of Illinois at Urbana-Champaign, Illinois, USA

Visiting Research Engineer

Linguistics Research Labs


The School of Literatures, Cultures, and Linguistics at the University of Illinois at Urbana-Champaign has an opening for a full-time (100%) Visiting Research Engineer in its linguistics research labs. The Visiting Research Engineer works directly with faculty and graduate students to identify, implement and maintain appropriate hardware and software for research within the School of Literatures, Cultures and Linguistics. Currently, the labs have facilities for high-quality audio and video capture, eye-tracking, speech aerodynamics, electropalatography, and event-related potentials: Phonetics and Phonology Lab; Second Language Acquisition Lab; Discourse, Social Interaction, and Translation Lab; Electrophysiology and Language Processing Lab. The position is renewable for an additional two years and is contingent on funding and strong annual performance reviews by the School of Literatures, Cultures, and Linguistics. The position may become regular at a later date. The target starting date is September 1, 2013. Salary is commensurate with qualifications and experience.


Responsibilities will be research-related only and will include: Training faculty and graduate student researchers in the use of hardware and software for research purposes (including occasional workshops); Holding scheduled consultations with faculty and graduate student researchers on their research projects; Oversight of data acquisition hardware; Assisting faculty and graduate student researchers in problem-solving hardware and software issues; Providing support to faculty and graduate student researchers in procedural programming languages (e.g., Python, R, Matlab); Helping standardize computing and programming procedures across labs; Database management; Digital signal processing; and Assistance with initial setup of pilot experiments.


At a minimum, qualified applicants must have an MA/MS in linguistics or a closely related field (e.g., neuroscience, psychology, speech and hearing science with a concentration in linguistics or speech-related research). The applicant should also have a solid background in a procedural programming language (e.g., Python, Matlab, and/or R) and statistical modeling. Preference will be given to candidates who have previously worked in a laboratory setting, have a demonstrated ability to work well as part of a research team, and have experience using advanced hardware for data acquisition.


To apply, create your candidate profile through the University of Illinois application login page at and upload your application materials: letter of application, CV, and names and contact information for three professional references. Referees will be contacted electronically upon submission of the application. Only electronic applications submitted through will be accepted.

To ensure full consideration, all required applicant materials must be received no later than July 22, 2013. Letters of reference must be received no later than July 29, 2013. The department highly recommends that complete applications be submitted prior to July 22, to ensure that referees have enough time to submit their letters of recommendation. 


For additional information, please contact Applicants may be interviewed before the closing date; however, no hiring decision will be made until after that date.


Illinois is an Affirmative Action /Equal Opportunity Employer and welcomes individuals with diverse backgrounds, experiences, and ideas who embrace and value diversity and inclusivity.




6-23(2013-06-27) 1-2 Assistant Professors in Speech Technology (main focus Dialogue Systems), KTH Stockholm Sweden

The Department of Speech, Music and Hearing at KTH, Stockholm, Sweden,  will hire 1-2 Assistant Professors in Speech Technology (main focus Dialogue Systems)

6-24(2013-06-29) Speech Scientist at Voicebox’s Research and Advanced Development Team, Munich, Germany

VoiceBox is an acknowledged pioneer in the voice technology and application industry. Our continued growth allows us to add to our diverse team of talented professionals.

Our opportunity is your opportunity!

Because we work with some of the most respected brands in the world, you’ll not only work to high standards but you’ll also get that “hey, I worked on that product” feeling. Even better, we’re small enough for you to make a real impact - you’ll learn and grow quickly and never have that cog-in-the-wheel feeling.

We’re glad you’re considering joining the team!


A Speech Scientist at Voicebox’s Research and Advanced Development Team is responsible for working independently on complex tasks and work packages and providing solutions to the team. Work packages are in the areas of ASR, TTS and NLU, in research as well as project-related work. Typical work packages are:

Tuning and maintaining speech applications

Design and develop new speech applications

Adapt speech resources for certain customers’ requirements

Research, Development and Implementation of new algorithms in ASR, TTS and NLU.

Software Integration of third party ASR or TTS products in VoiceBox Engine

Training and adaptation of acoustic models and language models

This position can be located in Munich, Germany, or the Seattle area, USA

Key Requirements/Skills/Experience

Strong knowledge in ASR, TTS and NLU as well as statistical learning methods

Strong plus: Experience in ASR Training-Toolkits, like HTK

Strong plus: Working experience on ASR topics, e.g. as an intern or for a PhD

Deep knowledge in digital signal processing

Deep knowledge in programming languages like: ANSI C, C++

Knowledge in scripting languages like: Perl, Python, Shell (bash, awk, sed)

Excellent communication skills, great attitude and team oriented

Good skills in English (written and spoken)

Foreign language skills a plus


To apply for this position, please send your resume to

6-25(2013-06-30) Two positions in the area human machine dialog at Saarland University, Germany

Two positions in the area human machine dialog at Saarland University

We anticipate the availability of funds for two positions in the area of
dialog modeling and dialogue system design, one position for a PhD
candidate and a second position for a postdoctoral researcher.

The aim of the research project is the development and testing of a
multimodal dialogue system for highly adaptive and flexible dialogue
management. The dialogue system will be designed to support negotiation
training games with real and virtual agents. The research will be
carried out together with a European consortium (FP7 Programme) of
high-profile research institutes and companies.

The successful candidate should have a degree in computer science,
computational linguistics or a discipline with a related background.
Excellent programming skills are required (preferably in Java and C++),
as well as strong analytical and problem-solving skills. Some experience
in math, logic and cognitive modelling is a plus. Very good oral and
written communication skills in English are also required.

The successful candidate for a postdoc position additionally should have
a strong publication record in relevant venues and strong collaborative
skills, including possibly supervision of junior researchers, students,
or equivalent industrial experience.

This work will be conducted at the Spoken Language Systems group
at Saarland University.

Saarland University

Saarland University is a European
leader in Computer Science research and teaching, and is particularly
well-known for its research in Computational Linguistics and Natural
Language Processing. In addition, the university campus hosts the
interdisciplinary MMCI Cluster of Excellence, Max Planck Institute for
Computer Science, Max Planck Institute for Software Systems and German
Research Center for Artificial Intelligence (DFKI). Students and
researchers come from many countries and the research language is

Both positions are fully funded positions with a salary in the range of
37,000 Euros to 51,000 Euros per year depending on the qualification and
professional experience of the successful candidates. The starting date is
November 1st. The PhD position is for three years. The postdoc
position is for 2 years with the possibility of extension for one more
year.
Each application should include:

* Curriculum Vitae including a list of publications (if applicable)
* Transcript of records
* Short statement of interest (not more than half a page)
* Names of two references
* Any other supporting information or documents

Applications (documents in PDF format in a single file) should be sent
no later than Monday, July 15th to:

Further inquiries regarding the project should be directed to:

6-26(2013-06-30) Post-doc position (information retrieval and language understanding) at Saarland University, Germany

Postdoc position in the area of information retrieval and language
understanding at Saarland University

We are seeking a skilled postdoctoral researcher whose expertise
intersects information retrieval (IR) and human-computer interaction
(HCI). The researcher will work in a research team to create an automated
speech-based question-answering (QA) system for various scenarios. This
research will be done in cooperation with high-profile partners in the US
and Europe.

The successful applicant will have:

1) a doctoral degree in computational linguistics, computer science,
or a related discipline;

2) a strong publication record in relevant venues;

3) excellent programming skills;

4) strong collaborative skills, including possibly supervision of junior
researchers, students, or equivalent industrial experience;

5) a strong technical background in machine learning, natural language
processing, and human-computer interaction.

This work will be conducted at the Spoken Language Systems group
at Saarland University.

Saarland University

Saarland University is a European
leader in Computer Science research and teaching, and is particularly
well-known for its research in Computational Linguistics and Natural
Language Processing. In addition, the university campus hosts the
interdisciplinary MMCI Cluster of Excellence, Max Planck Institute for
Computer Science, Max Planck Institute for Software Systems and German
Research Center for Artificial Intelligence (DFKI). Students and
researchers come from many countries and the research language is

The planned starting date is November 1st (an earlier starting date
is negotiable). The position is for 2 years with the possibility of
extension for one more year. It is fully funded, with a salary in the
range of 37,000 Euros to 51,000 Euros per year depending on the
qualifications and professional experience of the successful candidate.

Each application should include:

* Curriculum Vitae including a list of publications (if applicable)
* Transcript of records
* Short statement of interest (not more than half a page)
* Names of two references
* Any other supporting information or documents

Applications (documents in PDF format in a single file) should be sent
no later than Monday, July 15th, to:

Further inquiries regarding the project should be directed to:

6-27(2013-07-01) Postdoctoral position at TTI-Chicago

### Postdoctoral position at TTI-Chicago ###
A postdoctoral position is available at TTI-Chicago on topics at the intersection of speech processing and machine learning. The ideal candidate will have completed (or be about to complete) a PhD degree in computer science, electrical engineering, statistics, speech and language technologies, or a related field, and will have strong mathematical and experimental skills. The main duties of the postdoc will be his/her research activities in collaboration with his/her supervisor and other collaborators at TTI-Chicago and beyond; opportunities for teaching and advising may also be available if desired.
To apply, or for additional information, please contact Karen Livescu.
TTI-Chicago is a philanthropically endowed academic computer science institute with an accredited PhD program situated on the University of Chicago campus.

6-28(2013-07-01) Postdoctoral position INRIA Nancy Grand-Est (Nancy, France) - Speech Group, LORIA

INRIA Nancy Grand-Est (Nancy, France) - Speech Group, LORIA

Postdoctoral position

Accurate 3D Lip modeling and control in the context of animating a 3D talking head

Scientific Context

The lips play a significant role in audiovisual human communication. Several studies have shown the important contribution of the lips to the intelligibility of visual speech (Sumby & Pollack, 1954; Cohen & Massaro, 1990). In fact, it has been shown that human lips alone carry more than half the visual information provided by the face (Benoît, 1996). Since the beginning of the development of 3D virtual talking heads, researchers have shown interest in modeling the lips (Guiard-Marigny et al., 1996; Reveret & Benoît, 1998), as the lips increase the intelligibility of the visual message. The existing models are still considered purely parametric and numerical models and do not take into account the dynamic characteristics of speech. As audiovisual speech is highly dynamic, we consider that modeling this aspect is crucial to provide a lip model that is accurately animated and reflects the real articulatory dynamics as observed in the human vocal tract. In fact, the movement of the lips, even when subtle, can communicate relevant information to the human receiver. This is even more crucial for some populations, such as hard-of-hearing people.


The goal of this work is to develop an accurate 3D lip model that can be integrated within a talking head. A control model will also be developed. The lip model should be as accurate dynamically as possible, so the design will focus on the dynamics. For this reason, we can start from a static 3D lip mesh, using a generic 3D lip model, and then use MRI images or 3D scans to obtain a more realistic shape of the lips. To take into account the dynamic aspect of the lip deformation, we will use an articulograph (EMA) and motion capture techniques to track sensors or markers on the lips. The mesh will be adapted to this data. To control the lips, we will consider allowing a skeletal animation to be controlled by the EMA sensors or motion capture markers, using inverse kinematics techniques, which are widely used in 3D modeling. In line with conventional skeletal animation, an articulated armature rigged inside the mesh is mapped to vertex groups on the lip mesh by a weight map, which can be defined automatically from the envelope of the armature's shape and manually adjusted if required; manipulating the armature's components then deforms the surrounding mesh accordingly. The main challenge is to find the best topology of the sensors or markers on the lips in order to capture their dynamics accurately. The main outcome is to accurately model and animate the lips based on articulatory data. It is very important to have readable lips that can be lip-read by hard-of-hearing people.


C. Benoît (1996). On the Production and the Perception of Audio-Visual Speech by Man and Machine. Multimedia Communications and Video Coding, pp. 277-284.

M. M. Cohen & D. W. Massaro (1990). Synthesis of visible speech. Behavioral Research Methods and Instrumentation, 22, 260-263.

T. Guiard-Marigny, N. Tsingos, A. Adjoudani, C. Benoît, M.-P. Cani (1996). 3D Models of the Lips for Realistic Speech Animation. Computer Animation, pp. 80-89.

L. Reveret, C. Benoît (1998). A New 3D Lip Model for Analysis and Synthesis of Lip Motion in Speech Production. Proc. AVSP'98, Terrigal, Australia, Dec. 4-6, 1998.

W. H. Sumby & I. Pollack (1954). Visual contribution to speech intelligibility in noise. Journal of the Acoustical Society of America, 26, 212-215.

Q. Summerfield (1987). Some preliminaries to a comprehensive account of audio-visual speech perception. In: B. Dodd and R. Campbell (Eds.), Hearing by Eye: The Psychology of Lip-Reading, Lawrence Erlbaum, Hillsdale, NJ.


Required qualification: PhD in computer science. An appropriate candidate would have good knowledge of 3D modeling, speech processing and data analysis, as well as solid Java programming skills.

Additional Information

Application deadline: 11 June 2013

Supervision and contact: Slim Ouni

Duration: 1 year (possibly extendable)

Starting date: between Sept. 1st 2013 and Jan. 1st 2014

Salary: 2,620 euros gross monthly (about 2,135 euros net), medical insurance included.

Application Procedure

The required documents for an INRIA postdoc application are the following:

- CV, including a description of your research activities (2 pages max) and a short description of what you consider to be your best contributions and why (1 page max and 3 contributions max); the contributions could be theoretical, implementation, or industry transfers. Include also a brief description of your scientific and career projects.

- The report(s) from your PhD external reviewer(s), if applicable.

- If you haven't defended yet, the list of expected members of your PhD committee (if known) and the expected date of defense (the defense, not the manuscript submission).

- Your best publications, up to 3.

- At least one recommendation letter from your PhD advisor, and possibly up to two other letters. The recommendation letter(s) should be sent directly by their author to the prospective postdoc advisor.

All these documents should be sent before June 11th.


Established in 1967, Inria is the only public research body fully dedicated to computational sciences. Combining computer sciences with mathematics, Inria’s 3,400 researchers strive to invent the digital technologies of the future. Educated at leading international universities, they creatively integrate basic research with applied research and dedicate themselves to solving real problems, collaborating with the main players in public and private research in France and abroad and transferring the fruits of their work to innovative companies. The researchers at Inria published over 4,800 articles in 2010. They are behind over 270 active patents and 105 start-ups. The 171 project teams are distributed in eight research centers located throughout France.

6-29(2013-07-02) Postdoc at MxR Lab at the University of Southern California Institute for Creative Technologies, Ca, USA

The MxR Lab at the University of Southern California Institute for Creative Technologies, located in Playa Vista, CA, is seeking a postdoctoral researcher. Applicants should have a degree in computer science or a related field and a strong research background in HCI, virtual environments, virtual humans, data visualization, novel user interfaces, or a similar area.


The University of Southern California (USC), founded in 1880, is located in the heart of downtown L.A. and is the largest private employer in the City of Los Angeles. As an employee of USC, you will be a part of a world-class research university and a member of the 'Trojan Family,' which is comprised of the faculty, students and staff that make the university what it is.


The initial appointment will be for one year with the possibility of renewal for subsequent years. Please direct all inquiries to Evan Suma.


Applicants can apply online at: 

6-30(2013-07-03) ATER position in computer science at the UFR de Sociologie et d'Informatique pour les Sciences Humaines, Université Paris Sorbonne

An ATER (temporary teaching and research associate) position in computer science is available at the UFR de Sociologie et d'Informatique pour les Sciences Humaines of Université Paris Sorbonne.

The candidate will teach computer science in the various bachelor's and master's programmes of the department of Informatique, Mathématiques et Linguistique appliquées. His or her research should fit into one or more of the research themes of the computational linguistics team: semantics and knowledge; paralinguistics of speech and text; evaluative judgments, opinions and sentiments.

The application deadline is September 4, 2013.
Contact person:

6-31(2013-07-09) PhD position at Trinity College, Dublin, Ireland


PhD Title: Birdsong Forensics for Species Identification and Separation

Studentship: Full Scholarship, including fees (EU/Non EU) plus annual stipend of €16,000.

Start Date: Sept 2nd 2013 

PhD Supervisor: Dr. Naomi Harte, Sigmedia Group, Electronic & Electrical Engineering, Trinity College Dublin, Ireland

Collaborator: Dr. Nicola Marples, Zoology, Trinity College Dublin, Ireland.


The analysis of birdsong has increased in the speech processing community in the past 5 years. Much of the reported research has concentrated on the identification of bird species from their songs or calls. Smartphone apps have been developed that claim to automatically identify a bird species from a live recording taken by the user. A less reported topic is the analysis of birdsongs from subspecies of the same bird. Among experts, birdsong is considered a particularly effective way of comparing birds at species level. Differences in song may help uncover cryptic species. In many species, such as those living in the high canopy, catching the birds in order to obtain morphological (e.g. weight, bill length, wing length etc.) and genetic data may be time-consuming and expensive. Identifying potentially interesting populations by the detection of song differences allows any such effort to be better targeted.

Birdsong presents many unique challenges as a signal. The use of signal processing and machine learning techniques for birdsong analysis is at a very early stage within the ornithological research community. This PhD project seeks to lead the way in defining the state of the art for forensic birdsong analysis. Comparing birdsongs will push out the boundaries of feature analysis and classification techniques in signal processing. The research will develop new algorithms to systematically quantify levels of similarity in birdsong, transforming the comparison of birdsong in the natural sciences arena. The results will be of importance internationally for the study, monitoring, and conservation of bird populations.


The ideal candidate for this position will:

• Have a primary degree (first class honours) in Electronic Engineering, Electronic and Computer Engineering or a closely related discipline.

• Possess strong written and oral communication skills in English.

• Have a strong background and interest in digital signal processing (DSP).

• Be mathematically minded, and be curious about nature.

Experience in Matlab is a distinct advantage.


Interested candidates should send an email to Dr. Naomi Harte. The email MUST include the following:

• Candidate CV (max 2 pages)

• A short statement of motivation (half page)

• Scanned academic transcripts

• Name and contact details for TWO academic referees

Incomplete applications may not be considered.

About the Sigmedia Group at TCD

Dr. Naomi Harte is an expert in Human Speech Communication. Her principal areas of focus are audio visual speech processing, speaker verification for biometrics and forensics, emotion in speech, speech processing in hearing aids and speech quality.

She is a leader of the Sigmedia Group at TCD within the School of Engineering. Over the past 5 years, Sigmedia has been awarded research income of over €3 million and published 73 peer-reviewed papers. The group currently has 3 academic and 3 post-doctoral staff along with 12 research students. The work of Sigmedia is supported by research grants from Science Foundation Ireland, Enterprise Ireland, Irish Research Council, Google and DTS.


6-32(2013-07-10) Research Assistant in Computational Psycholinguistics, Univ. Maryland, USA

Research Assistant in Computational Psycholinguistics

The Department of Linguistics at the University of Maryland is looking to fill a full-time position for a post-baccalaureate researcher, starting September 1, 2013 or as soon as possible thereafter. Salary is competitive, with benefits included. This person will be involved in computational psycholinguistics research, with a focus on using techniques from automatic speech recognition to better understand human speech perception. The person will have the opportunity to develop skills in Bayesian modeling and signal processing and will be part of a vibrant language science community that numbers 200 faculty, researchers, and graduate students across 10 departments.

The position would be ideal for individuals with a BA degree who are interested in gaining significant research experience in a very active research group as preparation for a research career. Applicants must be US or Canadian citizens or permanent residents, and should have completed a BA or BS degree by the time of appointment. Previous experience in cognitive science as well as familiarity with mathematics, computer science, or signal processing is preferred. This is a 1-year initial appointment with possibility of extension.

Applicants should submit a cover letter outlining relevant background and interests, a current CV, and names and contact information for 3 potential referees. Reference letters are not needed as part of the initial application. Applicants should also send a writing sample. Applications should be submitted by email to Dr. Naomi Feldman, with 'Research Assistantship' in the subject line. Review of applications will begin immediately and will continue until the position is filled.

6-33(2013-07-12) Research Assistant in Computational Psycholinguistics, University of Maryland, USA

Research Assistant in Computational Psycholinguistics

The Department of Linguistics at the University of Maryland is looking to fill a full-time position for a post-baccalaureate researcher, starting September 1, 2013 or as soon as possible thereafter. Salary is competitive, with benefits included. This person will be involved in computational psycholinguistics research, with a focus on using techniques from automatic speech recognition to better understand human speech perception. The person will have the opportunity to develop skills in Bayesian modeling and signal processing and will be part of a vibrant language science community that numbers 200 faculty, researchers, and graduate students across 10 departments.

The position would be ideal for individuals with a BA degree who are interested in gaining significant research experience in a very active research group as preparation for a research career. Applicants must be US or Canadian citizens or permanent residents, and should have completed a BA or BS degree by the time of appointment. Previous experience in cognitive science as well as familiarity with mathematics, computer science, or signal processing is preferred. This is a 1-year initial appointment with possibility of extension.

Applicants should submit a cover letter outlining relevant background and interests, a current CV, and names and contact information for 3 potential referees. Reference letters are not needed as part of the initial application. Applicants should also send a writing sample. Applications should be submitted by email to Dr. Naomi Feldman, with 'Research Assistantship' in the subject line. Review of applications will begin immediately and will continue until the position is filled.

6-34(2013-08-15) Speech Researcher at Vocalize, São Paulo, Brazil

SPEECH RESEARCHER

A Speech Researcher at VOCALIZE will be responsible for complex tasks and for providing solutions to the team. The work will be related to the areas of Text-to-Speech, Transcription, Speech Recognition, Speech Analytics and Natural Language Processing.

Typical tasks will be:
• Research, development and implementation of new algorithms in Text-to-Speech, Transcription, Speech Recognition, Speech Analytics and Natural Language Processing;
• Creating/developing new innovative solutions/products/applications;
• Training and adaptation of acoustic models and language models.

Key Requirements/Skills/Experience
• PhD or Master's degree in Engineering, Computer Science or Computational Linguistics;
• Strong knowledge of Text-to-Speech, Transcription/Speech Recognition/Speech Analytics and Natural Language Processing, as well as statistical learning methods;
• Strong plus: experience with speech recognition training toolkits, like HTK, etc.;
• Deep knowledge of digital signal processing;
• Deep knowledge of programming languages like ANSI C and C++;
• Knowledge of scripting languages like Python and Shell (bash, awk, sed);
• Excellent communication skills, great attitude and team orientation;
• Good skills in English (written and spoken);
• Good skills in Portuguese and Spanish are a plus.

To apply for this position, please send your CV and cover letter to:


VOCALIZE is a growing and dynamic Speech and Language Technology Company located in Campinas, São Paulo, Brazil. We are looking for talented Speech Researchers to create/develop new algorithms to be added to our core technologies and to create innovative solutions/products/applications for both the Brazilian and the global market. We’re glad you’re considering joining the VOCALIZE team!

6-35(2013-08-23) PhD vacancy in project CHASING

PhD vacancy in project CHASING

Vacancy for a PhD in the project CHASING: CHAllenging Speech training In
Neurological patients by interactive Gaming

For more (up-to-date) information, see:

* Requirements
  + Degree in Computational or Applied Linguistics, Computer Science,
Artificial Intelligence, Informatics, Cognitive Science, or Education
  + Programming skills, e.g. Python, Java, html5, php, javascript,
flash, C[++]
  + An interest in e-Health, esp. gaming and speech technology for
speech training
  + A good command of the English language. Knowledge of Dutch is
considered an advantage.
  + Good communicative abilities
  + Willingness to work in an international, multidisciplinary team

* Job description
As a PhD student you will take part in the research project CHASING:
‘CHAllenging Speech training In Neurological patients through
Interactive Gaming’. The goal of this project is to investigate the
potential of e-Health applications for speech training for neurological
patients. Current results indicate the need to develop advanced,
motivating training devices that provide guidance on articulation
improvement and that allow the integrated benefits of intensified,
independent speech training. Serious games are known to have strong
motivational power. In this research project we aim to investigate to
what extent serious games and game principles motivate and support
neurological patients in speech training. To this end, an interactive,
intuitive, user adaptive game for speech training which employs advanced
speech technology and which is compatible with mobile platforms will be
developed and tested.

You will first develop dedicated speech technology, esp. ‘automatic
speech recognition’ (ASR) technology, and integrate it in the game.
Next, you will carry out experiments with neurological patients, to
study the effects of the game and how the game (incl. the speech
technology) can be improved.

You are expected to start in the fall of 2013. It is a full time
position for 4 years.
You will be part of a dynamic international and interdisciplinary team
and will work in an inspiring research environment.

* Contact persons
  + Helmer Strik
    +31 24 3616104
  + Lilian Beijer
    +31 24 3659718 / 3659140

6-36(2013-08-24) Doctoral/Post-Doctoral Position (E13) in the field of Computational Pragmatics at Bielefeld University, Germany

Doctoral/Post-Doctoral Position (E13) in the field of Computational Pragmatics, up to three years, starting autumn 2013, at the Center of Excellence 'Cognitive Interaction Technology' (CITEC) at Bielefeld University, Germany, in the project 'Computational Pragmatics -- Multimodal Intention Processing'.

We are seeking a highly motivated candidate with a degree (Master's or Ph.D.) in Linguistics/Psycholinguistics or Cognitive Science/Artificial Intelligence with experience in computational modeling of interactive systems. The successful applicant should have a theoretical or applied background in dialogue, intention recognition/pragmatics, as well as interpretation or generation of multimodal communicative behavior. Experience in conducting empirical studies and experiments would be an advantage. Given the multidisciplinary nature of the project, the ideal candidate should be prepared to acquire new knowledge in the fields of psycholinguistics/computational linguistics, cognitive science, and especially speech-accompanying gestures. Excellent command of English is required.

The Center of Excellence 'Cognitive Interaction Technology' (CITEC) at Bielefeld University, Germany, conducts interdisciplinary research into understanding the functional processes of cognitive interaction with the goal of replicating them in technical systems, including developing relevant evaluation methodologies and toolkits. Within CITEC, this interdisciplinary project involves the research groups 'Psycholinguistics' (Faculty of Linguistics & Literary Studies) and 'Sociable Agents' (Faculty of Technology). General information about the labs can be found at http://www.techfak.unibielefeld.



Applicants should submit the following documents:

• a cover letter indicating research interests, academic education and past research

• curriculum vitae including list of publications

• sample publications (if available)

• letters of recommendation (if available)

Applications from suitably qualified handicapped and severely handicapped persons are expressly encouraged.

Bielefeld University has received a number of awards for its achievements in the provision of equal opportunity and has been recognised as a family-friendly university. The University welcomes applications from women. This is particularly true with regard both to academic and technical posts as well as positions in Information Technology and trades and crafts. Applications are handled according to the provisions of the state equal opportunity statutes.

Applications in PDF format will be considered until the position has been filled. For full consideration please submit applications by 30.09.2013. Please send applications to:

Prof. Dr.-Ing. Stefan Kopp

Sociable Agents Group, Faculty of Technology


Prof. Dr. Jan De Ruiter

Psycholinguistics, Faculty of Linguistics & Literary Studies


6-37(2013-08-24) PhD Title: Birdsong Forensics for Species Identification and Separation, Trinity College Dublin, Ireland.


PhD Title: Birdsong Forensics for Species Identification and Separation

Studentship: Full Scholarship, including fees (EU/Non EU) plus annual stipend of €16,000.

Start Date: Sept 2nd, 2013

PhD Supervisor: Dr. Naomi Harte, Sigmedia Group, Electronic & Electrical Engineering, Trinity College Dublin, Ireland

Collaborator: Dr. Nicola Marples, Zoology, Trinity College Dublin, Ireland.


The analysis of birdsong has increased in the speech processing community in the past 5 years. Much of the reported research has concentrated on the identification of bird species from their songs or calls. Smartphone apps have been developed that claim to automatically identify a bird species from a live recording taken by the user. A less reported topic is the analysis of birdsongs from subspecies of the same bird. Among experts, birdsong is considered a particularly effective way of comparing birds at species level. Differences in song may help uncover cryptic species. In many species, such as those living in the high canopy, catching the birds in order to obtain morphological (e.g. weight, bill length, wing length etc.) and genetic data may be time-consuming and expensive. Identifying potentially interesting populations by the detection of song differences allows any such effort to be better targeted.

Birdsong presents many unique challenges as a signal. The use of signal processing and machine learning techniques for birdsong analysis is at a very early stage within the ornithological research community. This PhD project seeks to lead the way in defining the state of the art for forensic birdsong analysis. Comparing birdsongs will push out the boundaries of feature analysis and classification techniques in signal processing. The research will develop new algorithms to systematically quantify levels of similarity in birdsong, transforming the comparison of birdsong in the natural sciences arena. The results will be of importance internationally for the study, monitoring, and conservation of bird populations.


The ideal candidate for this position will:

• Have a primary degree (first class honours) in Electronic Engineering, Electronic and Computer Engineering or a closely related discipline.

• Possess strong written and oral communication skills in English.

• Have a strong background and interest in digital signal processing (DSP).

• Be mathematically minded, and be curious about nature.

6-38(2013-08-28) POSTDOCTORAL RESEARCHER IN SPEECH TECHNOLOGY, University of Eastern Finland



The Speech and Image Processing Unit (SIPU) research group at the School of Computing announces a position for a POSTDOCTORAL RESEARCHER IN SPEECH TECHNOLOGY.

The position is within the Academy of Finland project 'Reliable speaker recognition and modification', with a focus on text-independent speaker recognition, but also including voice conversion and anti-spoofing topics. The postdoc will focus on core research in one of these technologies together with a speech processing group consisting of the project leader, another postdoc and several PhD students. In addition to core research activities, the postdoc is expected to take part (10% to 15% of time) in practical supervision tasks of PhD/MSc students working on similar topics. There are no classroom teaching duties.

The University of Eastern Finland (UEF) is a multidisciplinary university formed in 2010 through the merger of the universities of Joensuu and Kuopio. UEF ranks among the best 100 universities in the Times Higher Education evaluation of universities less than 50 years old. The postdoc position is based at the Joensuu campus. The School of Computing is located at Joensuu Science Park, which provides modern research facilities.

The candidate should have a doctoral degree in spoken language technology, electrical engineering, computer science, pattern recognition or a closely related field. The candidate should be comfortable with Unix/Linux tools and Matlab/Octave with good skills in signal processing or pattern recognition. Previous exposure to technology benchmarks (e.g. NIST evaluations, Blizzard challenge) is a plus.

The position is filled for a period of 1 to 2 years, with preference for the 2-year post. The salary will be placed on level 5 of the Finnish university salary system. In addition, the appointee will be paid a salary component based on personal performance, which can be a maximum of 46.3 per cent of the job requirement component. Additionally, at least one conference trip per year is supported.

The application, consisting of the following documents, should be sent or delivered to the Registry Office of the University of Eastern Finland. Postal address: Itä-Suomen yliopisto, Kirjaamo, PL 111, 80101 Joensuu or Itä-Suomen yliopisto, Kirjaamo, PL 1627, 70211 Kuopio. Street address: Yliopistokatu 2 (Joensuu) and Yliopistonranta 1 E (Kuopio). The deadline for applications is September 30, 2013 (at 3.00 pm Finnish time). Applications with verification of document originality (signatures) can also be sent as scanned PDF files.

The application should include:

• A cover letter indicating the position to be applied for and a free-worded application describing the special qualities of the applicant and his or her reasons for applying to the position

• Full curriculum vitae (CV), including a list of publications (if any)

• Copies of relevant diplomas and transcripts of academic records. The diplomas should be in English or Finnish, and the grading system should be described.

• The names and contact information of at least two referees.

All enquiries related to the position should be addressed to Dr. Tomi Kinnunen, Tel. +358 50 442 2647.


6-39(2013-09-17) Postdoc Research Associate, Emerson College, Boston, MA, USA

Emerson College

Post Doctoral Research Associate

Post Doctoral Research Associate in Autism

The Facial Affective and Communicative Expression (FACE) Lab at Emerson College in Boston, Massachusetts announces the availability of a Postdoctoral Research Associate position, funded by the National Institute on Deafness and Other Communication Disorders (NIH-NIDCD). The position is for two years, renewable up to four years.


The FACE lab investigates social communication, specifically facial and vocal expressions of children with and without autism, using several methodologies. We collect and analyze acoustic measures of speech, infrared motion capture data of facial feature movement, eyetracking data on gaze behavior to social stimuli, and subjective measures of how typical individuals perceive the facial and vocal expressions of individuals with high-functioning autism. The FACE lab is affiliated with the department of Communication Sciences and Disorders, which has active research programs in the study of communication and communication disorders in several different populations.


Emerson College is located in the center of Boston, surrounded by major health care and research centers, which provide a wide range of collaborative clinical and research opportunities. It is the nation’s only four-year institution dedicated exclusively to communication and the arts. The program in Communication Sciences & Disorders is one of the oldest and most respected in the country, and is highly ranked among the most competitive graduate programs in communication disorders in the US. The department offers state-of-the-art research facilities and on-campus clinical facilities, and is easily reached by public transportation.

Campus Location

Boston Campus

Primary Duties, Responsibilities, and Tasks

The position involves interacting with children with and without autism and their families, recruiting participants and coordinating research activities in the lab, collecting and analyzing eyetracking, speech-acoustic, motion-capture, and behavioral data, and disseminating research results through conferences and journal publications. The successful candidate is also expected to develop and implement independent projects.

Required Knowledge, Skills, and Education (including hardware, software, and equipment)

Applicants should have a doctorate in speech and hearing sciences, linguistics, computer science, engineering, or a related field. Familiarity with autism, computer programming, and strong writing skills are required.

Required Prior Work Experience

Doctoral level research experience in autism or related field is also required.

Special Instructions to Applicants

Interested candidates should submit a cover letter describing their research interests and relevant experience, Curriculum Vitae, and contact information for three references to Dr. Ruth Grossman at

Diversity Statement

Emerson College values and has placed an institutional priority on multiculturalism in the campus community. Through its constantly evolving curriculum it seeks to prepare students for success in an increasingly multicultural society. The successful candidate must have the ability to work effectively with faculty, students, and staff from diverse backgrounds. Members of historically underrepresented groups are encouraged to apply.

Open Date


Open Until Filled


Job Title

English Language Learning (ELL) Faculty Member

Primary Duties, Responsibilities, and Tasks


Teach undergraduate- and graduate-level English writing, speaking, listening and reading to international students.

Assess the level of student writing and oral skills.

Work cooperatively with faculty members on pedagogies to address international student communication needs.

Participate with the faculty in on-going curriculum review and development with an emphasis on ELL skills and their alignment with student needs and faculty requirements.

Work with the Office of Admissions and the iGrad Transition program on assessing international student competencies.

Work with the Office of Internationalization and Global Engagement to help articulate intercultural program development and research goals; facilitate the internationalization of the curriculum; and build additional support programs for international students.

Work with the Lacerte Family Writing and Academic Resource Center (WARC) staff to develop strategies and programs to address international student communication needs.


Minimum Qualifications

Three to five years of experience teaching university-level ELL courses.

Expertise in ELL curricular development.

Master’s Degree and TESL/TEFL/TESOL certification and/or degree.

Fluency in a foreign language and experience living abroad.

Strong work ethic, positive attitude (flexibility and optimism), and analytical/organizational skills.

The ability to work independently as well as part of a team, multi-task, take initiative and set priorities to accomplish various instructional and operational tasks.

Emerson College is the nation’s only four-year institution dedicated exclusively to majors in communication and the arts in a liberal arts context. It is located in the theater district in the dynamic multi-cultural city of Boston in close proximity to major media outlets, arts institutions, and research centers. The college enrolls 3,662 undergraduate students and 830 graduate students from 75 countries and all 50 states.


To apply, please visit:



6-40(2013-09-20) Postdoctoral fellowship in neuroscience for speech recognition, Toronto Rehabilitation Institute, Canada

Postdoctoral fellowship in neuroscience for speech recognition


We are seeking a skilled postdoctoral fellow (PDF) whose expertise intersects automatic speech recognition (ASR) and neuroscience to develop a next-generation model of speech production.

Approximately 10% of North Americans have some sort of communication disorder. It is imperative that technology is used to mitigate difficulties these individuals have in being understood. This research involves building a model of how speech is produced physically and in the brain, and translating it directly into automatic speech recognition. Specifically, we propose to build an advanced neural network that relates words and phrases across electroencephalographic (EEG) data, acoustic data, and measurements of how the important articulators in speech (e.g., the lips and tongue) move. This model of speech production will be built from data recorded with people with cerebral palsy and healthy controls.

The PDF will work with a team of internationally recognized researchers in computer science, speech-language pathology, and neuroscience. Work will involve programming, data analysis, dissemination of results (e.g., papers and conferences), and partial supervision of graduate and undergraduate students. Some data collection will also be involved.

The successful applicant will have:

1) A doctoral degree in a relevant field of computer science, electrical engineering, biomedical engineering, neuroscience, or a related discipline;

2) Evidence of impact in research through a strong publication record in relevant venues;

3) Evidence of strong collaborative skills, including possibly supervision of junior researchers, students, or equivalent industrial experience;

4) Excellent interpersonal, written, and oral communication skills;

5) A strong technical background in machine learning, natural language processing, robotics, and human-computer interaction.

This work will be conducted at the Toronto Rehabilitation Institute and the University of Toronto.

About the Toronto Rehabilitation Institute

One of North America’s leading rehabilitation sciences centres, Toronto Rehabilitation Institute is revolutionizing rehabilitation by helping people overcome the challenges of disabling injury, illness or age related health conditions to live active, healthier, more independent lives. It integrates innovative patient care, ground-breaking research and diverse education to build healthier communities and advance the role of rehabilitation in the health system.  Toronto Rehab, along with Toronto Western, Toronto General and Princess Margaret Hospitals, is a member of the University Health Network and affiliated with the University of Toronto.

Applicants should send 1) a full CV, 2) a representative sample of their work, and 3) a 1-page statement of purpose to Frank Rudzicz at by 1 December 2013.


6-41(2013-09-20) English language technician to work in one of its Text-to-Speech projects at Voxygen, F

Voxygen SAS, a young and innovative company, is looking for an American
English language technician to work in one of its Text-to-Speech projects.

Voxygen develops speech synthesis products and services for worldwide
markets, with a particular focus on the creation of
expressive voices for industrial and entertainment purposes. For more
information on the company, please visit:

Job description:
  - Validation of sentence correctness for script creation:
    verification of orthography, grammar, readability and phonetic
  - Participate in the voice-talent casting.
  - Assist company experts with native language expertise during script
    recording sessions.
  - Revision of automatic phonetization and segmentation of recorded script
    sentences.

Job Requirements:
  - Fluent in spoken American English.
  - Thorough knowledge of the language grammar and orthography.
  - Keen ear for phonetic nuances.
  - A degree in any language-related field such as linguistics, translation,
    or language teaching will be helpful.
  - Knowledge of the target language phonetics or previous experience with
    TTS will be a definite plus.
  - Attention to detail.
  - Keen interest in language and technology.

This is a temporary position for 6 months. The job will be located in Brittany,
France (near Rennes or Lannion).

This is a great opportunity to participate in an exciting state-of-the-art
project and to collaborate with world-class experts in the field of TTS.

If this sounds interesting to you, please send us your CV to:

6-42(2013-09-20) Post-doctoral position in Multimedia Indexing, Eurecom, Sophia Antipolis, F

Post-doctoral position in Multimedia Indexing

Location: EURECOM, Multimedia Communications Department, Sophia Antipolis, France

Duration: 12 months


We have an open position for a post-doc to work on several aspects of
Multimedia Indexing, in particular Multimedia fusion and co-training. We
are looking for candidates who are highly motivated to conduct
high-quality research and to propose and evaluate innovative solutions for
the difficult problems that arise when automatically analyzing Multimedia
content. This research is conducted in partnership with other French
laboratories and companies.

Candidates should have a PhD degree (or equivalent) in Computer Science,
or a closely related area, with a good knowledge of Machine Learning
techniques and, ideally, experience in multimedia analysis. Good
programming skills are expected. A good level of written and spoken
English is mandatory.


Screening of applications will begin immediately, and the search will
continue until the position is filled. Applicants should send, to the
email address below (i) a CV, (ii) a motivation letter, (iii) contact
details for three referees, (iv) a two page statement of research
interests and motivation.

Postal address :
    Campus SophiaTech,
    450 route des Chappes,
    06410 Sophia Antipolis,

Contact : Prof. Bernard Merialdo,
Web page :
Phone number : +33 4 93 00 81 29
Fax number : +33 4 93 00 82 00

EURECOM is a French graduate school and a research center in
communication systems based in the international science park of Sophia
Antipolis, which brings together renowned universities such as Télécom
ParisTech, Aalto University (Helsinki), Politecnico di Torino,
Technische Universität München (TUM), Norwegian University of Science
and Technology (NTNU) and Vietnam National University Ho Chi Minh City
(VNU). The Principality of Monaco is a new institutional member. The
Institut Mines-Télécom is EURECOM’s founding member.

EURECOM benefits from strong interaction with industry through its
specific administrative structure, an Economic Interest Group (a kind of
consortium), which brings together international companies such as
Swisscom, SFR, Orange, ST Microelectronics, BMW Group Research &
Technology, Symantec, Monaco Telecom, SAP, IABG. EURECOM deploys its
expertise around three major fields: Networking and security, Multimedia
Communications and Mobile Communications. EURECOM is particularly active
in research in its areas of excellence while also training a large
number of doctoral candidates. Its contractual research is recognized
across Europe and contributes largely to its budget.

Thanks to its strong ties with industry, EURECOM was awarded
the “Institut Carnot” label jointly with the Institut Telecom in
2006. The Carnot Label was designed to develop and professionalize
cooperative research. It encourages the realization of research projects
in public research centers that work together with socioeconomic actors,
especially companies.

6-43(2013-09-25) Post‐Doc positions in psycholinguistics, Geneva, CH

TWO PostDoc positions
in the Psycholinguistics research group at the University of Geneva ( ,
to work on a project funded by the Swiss National Science Foundation:
- in the field of reading acquisition in children;
- in the field of language production in healthy and brain-damaged (aphasic) speakers.

Qualifications required:
- PhD in psychology, neuroscience, or a related field
- Experience in the field of psycholinguistics and/or acquisition and/or neuropsychology of language
- Experience with EEG/ERP acquisition and analysis

Starting January 2014 or later.

Applicants should submit a CV and an email with a statement of research interests by October 31 to:
 or to

6-44(2013-10-01) Post doc scholar: Computational Linguistics/ASR Univ. N.Arizona, Flagstaff, AZ, USA



 Post-Doctoral Scholar: Computational Linguistics/Automated Speech Recognition




Applications are invited for a full-time postdoctoral position in computational linguistics/automated speech recognition through the Department of English at Northern Arizona University, Flagstaff. This project will focus on extracting intonation features applicable to automated scoring systems and developing algorithms to measure the degree of nativeness of accented speech. This new position has guaranteed funding for two calendar years commencing on the date of appointment, with continued appointment upon availability of funds.

Minimum Education: 

Ph.D. in Natural Language Processing, Electrical and Computer Engineering, or Computational Linguistics by the time of appointment, with an emphasis on speech technology. 

Required Qualifications


  • Background in NLP and signal processing

  • Background in machine learning (e.g., HMM, GMM, SVM)

  • Interest or experience in speech feature extraction/analysis (e.g., pitch, LPC, MFCC, formants, jitter & shimmer)

  • Computer programming skills (e.g., C/C++, MATLAB, Python)

  • Knowledge about speech processing and implementation in automated systems

  • Expertise in the application of speech recognition systems and fluency


Preferred Qualifications

  • Knowledge of linguistics

  • Experience with collaborative interdisciplinary research

  • A record of software deliverables and publication


To Apply:


Candidates should email letter of application, CV, and names & contact information for 3 references to Dr. Okim Kang ( For the complete job announcements, visit (Job ID#600539):


Deadlines: Open until further notice; review of applications will start on November 5, 2013.


About NAU and Flagstaff:

Northern Arizona University (NAU) is a 25,000-student institution with its main campus in Flagstaff, a four-season community of about 62,000 at the base of the majestic San Francisco Peaks. NAU is an Equal Opportunity/Affirmative Action Employer, and applications from minority and women candidates are especially welcome. The position is open to non-US citizens and non-permanent residents.

6-45(2013-10-01) Tenure-track Faculty Position in Human-Computer Interaction Department of Computer Science, Virginia Tech VA, USA

Tenure-track Faculty Position in Human-Computer Interaction
Department of Computer Science, Virginia Tech

The Department of Computer Science at Virginia Tech ( invites applications for a full-time tenure-track position, at the rank of Assistant Professor, from candidates with expertise in human-computer interaction (HCI). The department is especially interested in sub-areas of HCI involving human interaction with big data or advanced technologies, such as large-scale data visualization and human-robot interaction, but candidates from all areas of HCI are encouraged to apply. Candidates should have a PhD in Computer Science or related discipline at the time of appointment; a strong record of scholarship in human-computer interaction and interdisciplinary areas; demonstrated ability to contribute to teaching at the undergraduate and graduate levels in HCI and related subjects; sensitivity to issues of diversity in the campus community; and the skills to establish and grow a multidisciplinary research group. Selected candidates are expected to travel occasionally to attend professional conferences/meetings.

VT CS faculty have been involved in HCI research since the early days of the field, and lead the interdisciplinary Center for Human-Computer Interaction (, a university-wide effort that brings together faculty with strengths in multi-sensory interactive communication, ecologies of displays and devices, social/collaborative computing, and human aspects of data/information/knowledge. Within CS, there are rich opportunities for collaboration in data mining/machine learning, parallel and distributed computing, computational biology and bioinformatics, information retrieval, software engineering, cyber security, cyber arts, and CS education. Beyond the department, HCI faculty members collaborate with researchers in design and the arts (the Institute for Creativity, Arts, and Technology), in engineering and science (the Institute for Critical Technology and Applied Science), and in education, business, and liberal arts.

The Department of Computer Science has 36 research-oriented tenure-track faculty and 11 research faculty. There are a total of 12 NSF/DOE CAREER award winners in the department. Research expenditures during FY2013 were $11.7 million; total research funding at the beginning of FY2014 was $34 million. BS, MS, and PhD degrees are offered, with an enrollment of over 550 undergraduate majors (12% women) and over 200 PhD/MS students. In 2010, CS@VT was ranked 5th in the country in recruiting quality of CS undergrads by the Wall Street Journal. The department is in the College of Engineering, whose undergraduate program was ranked 6th and graduate program was ranked 12th among public engineering schools in 2013 by US News and World Report.

Other research centers associated with the department include the new interdisciplinary Discovery Analytics Center (, which focuses on ‘big data’ problems in areas of national interest including intelligence analysis, sustainability and health informatics, and the Center for High End Computing known for its expertise in high performance computing, including energy efficient and/or heterogeneous supercomputers. Recently, we designed and acquired HokieSpeed, a CPU/GPU 200+ node machine for use by computational scientists and engineers across campus through a $2M NSF MRI grant. HokieSpeed ranked 11th in the world on the Green 500 List in November 2011. CS faculty also participate in the Ted and Karyn Hume Center for National Security and Technology (, a cyber security center.

Virginia Tech is a comprehensive research university with over 31,000 students. This hire is for the main campus in Blacksburg, consistently ranked among the country’s best places to live. Salary for suitably qualified applicants is competitive and commensurate with experience. Selected candidates must pass a criminal background check prior to employment.

Applications must be submitted online to for posting #117036. We welcome applications from women and minorities. Applicant screening will begin December 15, 2013 and continue until the position is filled. Early applications are encouraged. Inquiries should be directed to Dr. Doug Bowman, HCI Search Committee Chair. Virginia Tech is an equal opportunity/affirmative action institution.

6-46(2013-10-02) Voice recognition engineer at Sony, San Francisco Area (Foster City), CA, USA


Primary Location: United States-California-San Francisco Bay Area - Foster City


Sony PlayStation® US R&D is looking for an individual who will contribute to voice recognition systems and technologies for current and future Sony PlayStation® platforms. The Senior Software Engineer (voice recognition) for the R&D team will contribute to one or more of the following fields:

Automatic generation of pronunciation and voice recognition grammar for 10+ languages.

Language modeling (LM) for large vocabulary continuous speech recognition (LVCSR)

Automatic speech recognition (ASR) technologies that are robust to various kinds of distortions and variations, such as channel and environment distortions, emotional speech, and variations in speaking rate and speaking style, for multiple languages.

Acoustic model training and adaptation.

Keyword spotting and voice search technologies are a plus.

Speech synthesis experience is a plus.

Improve runtime voice recognition and sample voice applications on PS3 and PlayStation future platforms for many languages.


At least 5 years’ experience and solid understanding in voice recognition and digital signal processing technologies.

At least 5 years’ experience and strong skills in scripting languages and C/C++ programming.

Experience of multi-lingual speech and language processing is preferred.

Good written and oral communication skills.

Bachelor's degree in Computer Science/Electrical Engineering, related engineering discipline, or equivalent

Master's degree or PhD in Computer Science/Electrical Engineering or equivalent is preferred

Fresh yet outstanding PhD graduates are also encouraged to apply.

To apply send email to ''

Sony Computer Entertainment America (SCEA) is home to the PlayStation® family of products, including the PlayStation®3 (PS3™), PlayStation® Vita (PS Vita), PlayStation® Mobile and PlayStation®Network. Founded in 1994, SCEA has grown into a leading global computer entertainment brand and continues to redefine interactive consumer entertainment. Since the original PlayStation® first revolutionized the world of gaming, SCEA has repeatedly set the benchmark for innovation in home and portable entertainment through amazing gameplay experiences that inspire people across the world. Based in Foster City, CA, SCEA serves as headquarters for all North American operations and employs over 2,104 people in offices located in Foster City, CA, San Diego, CA, Santa Monica, CA and Bend, OR.

It is SCEA's policy to provide equal employment opportunity for all applicants and employees. SCEA does not unlawfully discriminate on the basis of race, color, religion, gender, gender identity, marital status, age, disability, veteran status, sexual orientation, national origin, or any other category protected by applicable federal and state law. SCEA also makes reasonable accommodations for disabled applicants and employees.


6-47(2013-10-08) Thesis proposal at LORIA Nancy F

As part of the ANR project ContNomina (2013-2016), we are offering a PhD thesis (36 months), funded by this project, on the following topic:
Exploiting context for proper name recognition in diachronic documents

Start: November-December 2013. Location: Nancy LORIA/INRIA and LIA Avignon

Abstract: The adaptation of speech recognition systems aims to bring the models closer to the presumed or observed conditions of use: the speaker, the acoustic environment, the domain, etc. While acoustic adaptation can be carried out in a supervised or unsupervised manner, on data collections of varying sizes, language model adaptation requires large amounts of data and is applied during the system design phase.

In this thesis, we will focus on the contextualization of systems, an operation that consists of performing a fast, unsupervised adaptation of the linguistic resources (lexicon and language model) of a speech recognizer. In particular, we will address the problem of recognizing proper names, for which good lexical coverage is very difficult to obtain even though they contribute significantly to the intelligibility of the discourse. This work has two parts, which concern respectively the modeling of contexts and the integration of these models into a multi-pass recognition process.


Contacts: Irina Illina, head of the ANR ContNomina project, INRIA-LORIA, Nancy, Parole team, tel 03 54 95 84 90, illina @ loria . fr; Dominique Fohr, tel 03 83 50 20 27, fohr @ loria . fr



6-48(2013-10-10) ATER position, Aix-Marseille, F

An ATER (temporary teaching and research assistant) position attached to the Department of Language Sciences and the Laboratoire Parole et Langage at Aix-Marseille Université has been opened as part of a rolling recruitment campaign.

A call for applications is available at: This is position P1-941, for which the application deadline is October 18 and the start date is December 1.
6-49(2013-10-10) PhD-position at CITEC, Bielefeld University, Germany
PhD-position at CITEC, Bielefeld University, Germany
====================================================

TOPIC: Embodied cognitive modeling of language processing and dialogue
----------------------------------------------------------------------
We are seeking highly qualified candidates who want to participate as a PhD student researcher in a larger scale project at Bielefeld University's Center of Excellence on Cognitive Interaction Technology (CITEC) [1,2]. The project aims to investigate how a robot can familiarize itself with novel objects and actions through exploratory manual interaction in its environment under guidance of a human tutor. The goal is to develop deep representations of objects and actions, using language to engage in dialogue with the tutor to guide the ongoing interaction and familiarization process. Language processing and dialogue shall thereby be grounded in embodied cognitive processes.
In particular, we are searching for a PhD student to work on the language aspects of this project. The task is to build a cognitively motivated representation system which interconnects different levels of abstraction and is grounded in the (already present) lower-level sensorimotor capabilities of the robot. The researcher will have to develop an intermediate type of conceptual representation which should be based on or connected to schema theories and should in particular focus on dynamic representation of action. The targeted outcome is an implemented formalism for action semantics that links lower levels of sensorimotor processing with linguistic processing of syntax and semantics, as e.g. provided by construction grammar based approaches. A second goal is to build dialogue strategies on top of this formalism, e.g. to enable clarification questions geared to reduce uncertainty in this mental simulation.
The successful candidate has a background in Cognitive Science/A.I. and Linguistics, and is in particular familiar with, or motivated to acquire, a deep understanding of cognition within holistic approaches to complex embodied cognitive systems. Knowledge of cognitive linguistic formalisms and schema theory will be helpful.
Overview
--------
The successful candidate is expected to have a strong background in at least two of the following areas:
* Cognitive Science
* Computational Linguistics
* Mental Representation
* Cognitive Robotics
Candidates with expertise in several topics will be given preference. Applicants are expected to have strong programming skills, the ability to work independently, strong interpersonal skills as well as strong English language skills. Candidates should be qualified for and willing to perform state-of-the-art research. Knowledge of German is not required.
The project will provide the chance to perform cutting-edge research in an international context. The PhD applicant will be primarily associated with two research groups participating in this project: The Sociable Agents Group [3] (Prof. Dr. Stefan Kopp) developing cognitive systems capable of multimodal communication and social interaction, and the Emergentist Semantics Group [4] (PD Dr. Katharina Rohlfing) investigating the emergence of meaning in children and learning systems. Within these groups, the applicant will receive supervision and support in developing the necessary skills to produce high-quality research necessary to earn a PhD Degree.
The salary for the position is around 3,300 euros per month (before taxes), according to scale TV-L E 13 in the German University System and can vary depending on age, marital status, tax class, etc. [5]. The salary increases with the duration of the employment.
The position is available for two years and can be prolonged for one more year. Candidates are expected to start on January 1st, 2014. Applications from suitably qualified handicapped and severely handicapped persons are expressly encouraged.
Bielefeld University has received a number of awards for its achievements in the provision of equal opportunity and has been recognised as a family friendly university. The University welcomes applications from women. This is particularly true with regard both to academic and technical posts as well as positions in Information Technology and trades and crafts. Applications are handled according to the provisions of the state equal opportunity statutes.
Applications are invited until 31.10.2013 and will be processed as they are received. Interviews will be scheduled via Skype in the first half of November.
Applicants will be informed of the outcome of the selection in the second half of November.
Applications should be sent by email to
Prof. Stefan Kopp
Sociable Agents Group, Faculty of Technology
Email:
preferably as *one* PDF document including a CV, motivation letter, publication record (if applicable), record of teaching activities (if applicable) and any other relevant information in addition to the names of two referees.
Or by mail to the following address:
Bielefeld University, CITEC
Inspiration 1
33619 Bielefeld
Germany
The application should highlight the expertise in the areas mentioned above and express the research interests of the applicant.
Location
--------
Bielefeld is one of the 20 largest cities in Germany (with close to 330,000 inhabitants) [6]. It is a lively city with a lot of cultural and entertainment opportunities. It is located in the heart of the Teutoburg Forest [7], offering many opportunities for outdoor and leisure activities.
[1] [2] [3] [4] [5]ür_den_öffentlichen_Dienst_der_Länder [6] [7]
6-50(2013-10-11) Assistant Professor tenure-track position in computational linguistics/language science at Rochester Institute of Technology


Position Title: Instructional Faculty

Faculty Rank: Assistant Professor

Faculty Type: Tenure Track

Department: English

PC# 8735 Requisition# 797BR

Anticipated Start Date: August 13, 2014



The Department of English at the Rochester Institute of Technology invites applications for an Assistant Professor of English tenure-track position to begin August 13, 2014, with specialization in computational linguistics and/or innovative technical methods in language science, with a focus on one or more areas of application. Possible areas include:


  • Cultural or social analytics

  • Speech technology

  • Human-computer communication

  • Clinical, assistive, and/or access technology

  • Cognitive modeling of linguistic processes (for example reading)

  • Games and/or social media


The applicant should demonstrate a fit with our commitment to collaborate with colleagues across the university on curricular and research initiatives in digital humanities and language science.



The successful applicant will be a teacher and a researcher with an agenda that emphasizes innovative technical methods in linguistics, for instance natural language processing, corpus-based studies, linguistic/multimodal sensors, speech technology, and/or other computational approaches. We are seeking a scholar who engages in disciplinary and interdisciplinary teamwork and has a coherent plan for grant seeking activities. The right candidate will contribute to our department’s profile in the digital humanities, in addition to furthering our interdisciplinary language science curriculum in a college of liberal arts at a technical university. Contributions that build students’ global education experiences are additionally valued.


Teaching assignments may include Introduction to Language Science, Language Technology, Introduction to Natural Language Processing, Advanced Topics in Computational Linguistics, Language & Brain, self-designed courses, and other courses in the linguistics or general education frameworks such as Dialects & Identity, Text & Code, or Evolving English Language. The course teaching load is 3/2.


We are seeking an individual who has the ability and interest to contribute to a community committed to Student Centeredness; Professional Development and Scholarship; Integrity and Ethics; Respect, Diversity and Pluralism; Innovation and Flexibility; and Teamwork and Collaboration. Select to view links to RIT’s core values, honor code, and diversity commitment.



RIT is a national leader in professional and career-oriented education. Talented, ambitious, and creative students of all cultures and backgrounds from all 50 states and more than 100 countries have chosen to attend RIT. Founded in 1829, Rochester Institute of Technology is a privately endowed, coeducational university with nine colleges and institutes emphasizing career education and experiential learning. With approximately 15,000 undergraduates and 3,000 graduate students, RIT is one of the largest private universities in the nation. RIT offers a rich array of degree programs in engineering, science, business, and the arts, and is home to the National Technical Institute for the Deaf. RIT has been honored by The Chronicle of Higher Education as one of the “Great Colleges to Work For” for four years. RIT is responsive to the needs of dual-career couples through its membership in the Upstate NY HERC.


Rochester, located on Lake Ontario, is the 79th largest city in the United States and the third largest city in New York State. The Greater Rochester region, which is home to nearly one million people, is rich in cultural and ethnic diversity, with a population comprised of approximately 16% African and Latin Americans and another 7% of international origin. It is also home to the largest deaf community per capita in the U.S. Rochester is ranked 3rd among the best metropolitan regions for “Raising a Family” by Forbes Magazine; 6th among 379 metropolitan areas as “Best Places to Live in America” by Places Rated Almanac; and 1st in Expansion Management Magazine’s ranking of metropolitan areas having the best “Quality of Life in the Nation.” It is also among Essence Magazine’s “Top 10 Cities for Black Families.”



  • Ph.D. in Linguistics (or an allied field) in hand prior to appointment date

  • Advanced graduate coursework in language science and technical methods

  • Evidence of outstanding teaching

  • Experience teaching or mentoring diverse or multi-disciplinary students

  • Evidence of publication and a coherent plan for research and grant seeking activities

  • Ability to contribute in meaningful ways to the college’s continuing commitment to cultural diversity, pluralism, and individual differences.



Apply online; search for requisition 797BR. Please submit your cover letter; CV; copy of transcripts of graduate coursework; a research statement; a teaching statement; writing sample/portfolio; Contribution to Diversity Statement; and the names, addresses, and phone numbers of three references.



Please direct questions to Dr. Cecilia Ovesdotter Alm at (585) 475-7327.



Review of applications begins on November 25, 2013.





RIT does not discriminate. RIT promotes and values diversity, pluralism and inclusion in the workplace. RIT provides equal opportunity to all qualified individuals and does not discriminate on the basis of race, color, creed, age, marital status, sex, gender, religion, sexual orientation, gender identity, gender expression, national origin, veteran status or disability in its hiring, admissions, educational programs and activities.



