ISCA Services

ISCA - International Speech
Communication Association

ISCApad #170

Monday, August 06, 2012 by Chris Wellekens

6 Jobs

6-1

(2012-02-02) Research and Development Opportunities for Next Generation Technology at Microsoft

Do you want to impact billions of people all over the world with speech technology that you create?

We are looking for PhD level scientists and senior scientists, who will work on research problems in spoken language understanding, statistical dialog modeling, natural language generation, machine learning, statistical language modeling, and acoustic modeling.

Microsoft is all-in on the Natural User Interface to bring computing to larger audiences in more applications. To drive this mission we are bringing together scientists and engineers in the areas of speech recognition, natural language understanding, dialog modeling, machine learning and synthesis to develop and deliver robust, natural and scalable solutions across a rich set of scenarios and languages.

Join the excitement to be part of the newly formed team of scientists within Microsoft and to impact the lives of billions of people all over the world. We’re talking about Bing, Windows, XBOX, Mobile, Exchange Server and Tellme, just to name a few. Microsoft is dedicated to improving everyday life using speech. And not just in a few countries - but around the world.

How to apply:

MICROSOFT CORPORATION

Attention: Recruiting,

One Microsoft Way, STE 303, Redmond WA 98052-8303

Or email resume to: Tom Swanson toswanso@microsoft.com

Please reference Speech in the subject line.

6-2

(2012-02-05) NSF-Supported Summer Research for Undergraduates

The Center for Language and Speech Processing at the Johns Hopkins University is seeking outstanding members of the current junior class to join a summer research workshop on language engineering from June 11 to August 7, 2012.

The 8-week workshop provides an intense intellectual environment. Undergraduates work closely alongside more senior researchers as part of a multi-university research team, which has been assembled for the summer to attack some problem of current interest. The teams and topics for summer 2012 are described here:

http://www.clsp.jhu.edu/internships/

We hope that this stimulating and selective experience will encourage students to pursue graduate study in human language technology, as it has been doing for many years.

The summer workshop provides:

* An opportunity to explore an exciting new area of research
* A two-week tutorial on current speech and language technology
* Mentoring by an experienced researcher
* Participation in project planning activities
* Use of a computing cluster and personal workstation
* A $5,000 stipend and $2,520 towards per diem expenses
* Private furnished accommodation for the duration of the workshop
* Travel expenses to and from the workshop venue

Applications should be received by WEDNESDAY, FEBRUARY 29, 2012, INCLUDING one letter from a faculty nominator. Apply online here:

http://www.clsp.jhu.edu/internships/

Applicants are evaluated only on relevant skills, employment experience, past academic record, and the strength of letters of recommendation. No limitation is placed on the undergraduate major. Women and minorities are encouraged to apply.

6-4

(2012-02-15) Fixed-term Maître de Conférences (associate professor) position, ESPCI ParisTech

A fixed-term Maître de Conférences position (1 year, renewable) is available at the SIGMA laboratory (SIGnaux, Modèles, Apprentissage statistique) of ESPCI ParisTech, starting April 2012.
Teaching: electronics and control engineering (first-year level).
Research: statistical machine learning applied to the design of a silent speech interface or to the prediction of the properties/activities of molecules.
Salary: €33,800 per year + overtime

Detailed job description: http://www.neurones.espci.fr

6-5

(2012-02-10) MCF (Maître de Conférences) position in Computer Science [27MCF519 Paris-Sorbonne]

The position requires dual expertise: a high level of scientific excellence in computer science and in applications of computer science to the humanities and social sciences (in particular paralinguistic processing of speech and language, computational sociology, …). Interest in applying computer science theory to the humanities and social sciences is one of the distinctive features of computer science teaching at Université Paris-Sorbonne. The candidate will teach computer science in several bachelor's (LFTI) and master's (ILGII, IILGI) programmes, and will also take part in supervising new dual-degree bachelor's programmes (Sciences and Language Sciences, …) planned within the PRES Sorbonne Universités.

The MCF will join the Department of Mathematics and Computer Science Applied to the Human Sciences (currently 6 MCF and 2 professors) of the Institut des Sciences Humaines Appliquées (ISHA) at Université Paris-Sorbonne. The department contributes to multidisciplinary bachelor's and master's programmes and is also responsible for teaching computer science to students in the arts and humanities. The candidate will be attached to one of ISHA's research teams and must present a research programme that fits the research directions of one of these teams.

Contact: Claude.Montacie@paris-sorbonne.fr
Host laboratories: UMR 8598 (GEMASS), EA 4509 (STIH)

6-6

(2012-02-15) PhDs at Tilburg Center for Cognition and Communication (TiCC) research program 'Language, Communication and Cognition' (LCC), The Netherlands

For the Tilburg Center for Cognition and Communication (TiCC) research program 'Language, Communication and Cognition' (LCC), we are looking for two new, enthusiastic and competent PhD colleagues.

If you are interested in one of these positions, you will need to identify a potential research topic related to one of the research themes of the LCC program. Current themes include:

 - Social media and interpersonal communication.

 - Professional communication (medical, business, etc.).

 - Alignment and adaptation in communication.

 - Social exclusion and other social aspects of interaction.

 - Emotion and speech.

 - Language acquisition and learning.

 - Multimodality and communication.

 - Language and speech production.

 - Visual communication (diagrams, metaphors, etc.).

 - Gesture and other forms of non-verbal behavior.

For the positions we seek candidates with a background in a relevant discipline, including Psycholinguistics, Communication & Information Sciences, Linguistics, Cognitive Science, Psychology or a related area, with experience in running experiments and analyzing data.

The PhD candidates should have a good (research) master's degree in one of the aforementioned areas, a strong interest in doing research, excellent writing skills and a good command of English. Developing and defending a research plan is part of the procedure.

Tilburg University is rated among the top Dutch employers, offering excellent terms of employment. The collective labour agreement of Tilburg University applies. The selected candidates will start with a one-year contract, concluded by an evaluation. Upon a positive outcome of the first-year evaluation, the candidate will be offered an employment contract for the remaining years. Candidates with a Research Master (MPhil) will be offered a 1+2-year contract. Master students might be offered a 1+3-year contract. It is also possible to work 80% instead of full time. The PhD candidates will be ranked in the Dutch university job ranking system (UFO) as PhD student (promovendus), with a starting salary of €2,042 gross per month in the first year, rising to €2,612 in the fourth year (full-time amounts). Each selected candidate is expected to have written a PhD thesis by the end of the contract (which may be based on articles).

Research in the Department of Communication and Information Sciences (DCI) is located in the Tilburg Center for Cognition and Communication (TiCC). TiCC consists of two research programs: Language, Communication and Cognition (LCC) and Creative Computing (CC). There is a strong emphasis on experimental research and interdisciplinary cooperation. More information about the research programs can be found at http://www.tilburguniversity.edu/research/institutes-and-research-groups/ticc/. The DCI department is responsible for a flourishing academic programme, Communication and Information Sciences (CIW), which annually attracts about 120 Bachelor students, 130 Pre-master students and 200 Master students. The department is also co-responsible for the Research Master Language and Communication. More information about the DCI department can be found at www.tilburguniversity.nl/faculties/humanities/dci/.

For more information on the positions, please contact one of the LCC program leaders, prof.dr. Emiel Krahmer (E.J.Krahmer@uvt.nl, +311346630700) or prof.dr. Marc Swerts (M.G.J.Swerts@uvt.nl, +31134662922).

Applications should include:

 - a cover letter.

 - a Curriculum Vitae.

 - a 2-page research proposal on a selected theme, plus names of potential supervisor and promotor.

 - names of two references.

The only way to apply is via the online link at the bottom of this vacancy: 'apply direct'. If you received this vacancy by other means, e.g. by e-mail, please consult the vacancy at: http://www.tilburguniversity.edu/about-tilburg-university/working-at/wp/. Applications should be submitted before the deadline of March 24, 2012. Interviews are expected to be held in April 2012. Starting dates are flexible, so applicants who expect to graduate in the summer of 2012 are also invited to apply.

6-7

(2012-02-15) Maître de Conférences (associate professor), Université Sorbonne Nouvelle, Paris

A Maître de Conférences position is open for recruitment for the start of the 2012 academic year at Université Sorbonne Nouvelle Paris 3. The description follows (more details at http://lpp.univ-paris3.fr/postes/offres.htm).

Teaching:

The person recruited for this position will be dedicated to teaching in the phonetic sciences and will introduce students to traditional as well as more modern areas within the Bachelor's (http://www.ilpga.univ-paris3.fr/documents/descriptifs-licence.pdf) and Master's (http://www.ilpga.univ-paris3.fr/masters.html) programmes in Language Sciences.
The candidate will teach in the following areas: acoustic and articulatory phonetics; perceptual/cognitive and communicative aspects of speech; teaching of phonetics; prosody; comparative phonetics of languages; clinical phonetics, etc. The candidate is also expected to be able to teach in other areas of linguistics if needed.

Department / UFR / Place(s) of work:
Université Sorbonne Nouvelle Paris 3
Institut de Linguistique et Phonétique Générales Appliquées
19, rue des Bernardins, 75005 Paris.

Research:

The candidate will be hosted by the Laboratoire de Phonétique et Phonologie (LPP, Unité Mixte de Recherche 7018), whose role is to foster mutual enrichment between the phonetic sciences and phonology, and to apply it in a wide range of areas such as learning a new language, language acquisition, remediation, language typology, etc. The candidate's research will be carried out in synergy with that of the LPP members and will thus contribute to the development of the laboratory's research themes. The candidate is also expected to take part in the administrative life of the laboratory.

Host laboratory: Laboratoire de Phonétique et Phonologie (LPP), UMR 7018 (http://lpp.univ-paris3.fr/)

Contacts: Jacqueline Vaissière; Cédric Gendrot
Telephone: 01 44 32 05 74
E-mail: jacqueline.vaissiere@univ-paris3.fr ; cedric.gendrot@univ-paris3.fr

6-8

(2012-02-15) Postdoc at University of Trento, Italy - Machine Translation/Social Computing

The Signals and Interactive Systems (SIS) Lab is looking for top candidates to fill a postdoc position.
Candidates with significant research experience in Statistical Machine Translation and at least
one of the following topics are invited to apply:

- Spoken Dialog Systems
- Machine Learning
- Social Computing

The candidate will work with SIS lab members and European partners on
an upcoming research project addressing multilingual portability of spoken
conversational systems via machine translation and social computing.

The SIS Lab's research is driven by an interdisciplinary approach to the
analysis and interpretation of diverse signals (e.g. speech, text,
biosignals, multimodal) to support human-human and human-machine
interactive systems. The SIS Lab has a state-of-the-art technology
infrastructure and collaborations with premier national and international
research centers and industry research labs.

For more information on the SIS lab's research projects,
visit the lab website: http://sisl.disi.unitn.it.

LANGUAGE

The official language of the Department and the lab is English.

SALARY

Postdoc salaries are in the range of 30K-50K/year (gross) depending on
background and experience. Research fellowships may benefit from tax exemptions.

DEADLINE:

May 1, 2012

HOW TO APPLY:

All applicants should have very good programming and math skills and be used
to working in project teams. Interested applicants should:

1) send their CV,
2) send their transcripts, and
3) have three reference letters sent to:

Prof. Dr.-Ing. Giuseppe Riccardi Email: sisl-jobs@disi.unitn.it
For more info:
Lab web page: http://sisl.disi.unitn.it/
Department : http://disi.unitn.it/
Local Information: http://international.unitn.it/welcome-services/cost-living

University of Trento
The University of Trento (www.unitn.it) is consistently ranked as a premier
Italian graduate and undergraduate university.
University of Trento is an equal opportunity employer.

6-9

(2012-02-21) Two Postdoctoral Research Associates in Speech Technology, University of Edinburgh

Two Postdoctoral Research Associates in Speech Technology
Centre for Speech Technology Research
University of Edinburgh

The School of Informatics, University of Edinburgh invites applications for two Postdoctoral Research Associates in Speech Technology supported by the EPSRC Programme Grant Natural Speech Technology (NST, http://www.natural-speech-technology.org) and the EU Integrated Project EU-Bridge (http://www.eu-bridge.eu). NST is a collaboration between the Universities of Edinburgh, Cambridge, and Sheffield, whose objective is to significantly advance the state-of-the-art in speech technology by making it more natural, approaching human levels of reliability, adaptability and conversational richness. EU-Bridge is a large scale European collaboration which aims to develop automatic transcription and translation technology to permit the development of innovative multimedia captioning and translation services of audiovisual documents between European and non-European languages. The researchers will be part of the Centre for Speech Technology Research (CSTR, http://www.cstr.ed.ac.uk).

The successful candidate should have, or be near completion of, a PhD in speech processing, computer science, linguistics, engineering, mathematics, or a related discipline. They must have a background in statistical modelling and machine learning, research experience in speech recognition and/or speech synthesis, excellent programming skills, and research publications in international journals or conferences.

Experience in acoustic modelling or language modelling for speech recognition or speech synthesis is essential. A background in one or more of the following areas is also desirable: multilingual speech recognition; subspace Gaussian mixture models; adaptation techniques for acoustic or language modelling; experience of the design, construction and evaluation of speech recognition or speech synthesis systems; distant speech recognition; deep neural networks; and familiarity with software tools including HTK, Kaldi, HTS, or Festival.

Fixed Term: 30 months
Closing date for applications: 10 April 2012

For further details and to apply: http://www.jobs.ed.ac.uk/vacancies/index.cfm?fuseaction=vacancies.detail&vacancy_ref=3015397

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

6-10

(2012-02-21) Senior Researcher in Speech Technology, University of Edinburgh

Senior Researcher in Speech Technology
Centre for Speech Technology Research
University of Edinburgh

The School of Informatics, University of Edinburgh invites applications for the post of Senior Researcher in Speech Technology on the EPSRC programme grant Natural Speech Technology (NST, http://www.natural-speech-technology.org/). NST is a collaboration between the Universities of Edinburgh, Cambridge, and Sheffield, whose objective is to significantly advance the state-of-the-art in speech technology by making it more natural, approaching human levels of reliability, adaptability and conversational richness.

You should have a PhD (or equivalent experience) in speech processing, computer science, linguistics, engineering, mathematics, or a related discipline. You must have a background in statistical modelling and machine learning, research experience in speech recognition and/or speech synthesis, excellent programming skills, and a strong publication record in international journals and conferences. In addition, experience of project development and project leadership in a research context, together with excellent communication, presentation, and organisational skills, is highly desirable.

You will be part of the Centre for Speech Technology Research (CSTR, http://www.cstr.ed.ac.uk), leading work on novel statistical modelling and machine learning for speech technology. We are interested in research on either large vocabulary conversational speech recognition, or on natural expressive speech synthesis. This will include design, implementation and evaluation of novel algorithms and models for speech recognition or speech synthesis, and testing of algorithms on 'real-world' data and tasks obtained from the NST user group. The work will involve close collaboration with other researchers across the three NST partners.

Fixed Term: 3 years
Closing date for applications: 10 April 2012

For further details and to apply: http://www.jobs.ed.ac.uk/vacancies/index.cfm?fuseaction=vacancies.detail&vacancy_ref=3015400

6-11

(2012-02-16) 4 PhD positions in spoken dialogue systems research / Charles University in Prague

4 PhD positions in spoken dialogue systems research

Applications are invited for 4 PhD fellowships in the area of
statistical spoken dialogue systems funded by the Czech Government.
The students will join the Institute of Formal and Applied
Linguistics, Charles University in Prague, Czech Republic with an
anticipated start date of October 1st, 2012.

Topic description: In recent years, it has been suggested that
statistical approaches to spoken dialogue systems offer a framework to
naturally handle the inherent uncertainty in human speech. The two
main advantages of statistical methods are increased robustness in
noisy conditions and more natural behaviour learned from data.
However, the current methods need large corpora, which effectively
prevents their use in the complex dialogue systems encountered in
real life. The successful PhD candidates will investigate and
implement statistical models and methods with the aim of increasing
the efficiency of the learning process and reducing the need for large
corpora. The research will cover the areas of spoken language
understanding, dialogue management, and natural language processing.

Skills: Candidates should hold a master's degree in a relevant area,
such as computer science, mathematics, engineering or linguistics. A
strong mathematical background, excellent programming skills (e.g.
C/C++, Java, MATLAB, and various scripting languages in a Linux
environment), aptitude for creative research and autonomy are
expected. Experience in machine learning, Bayesian methods, and
natural language processing is a plus.

The Institute of Formal and Applied Linguistics is a top-level
research group working in the area of computational linguistics and
natural language processing. During the fellowship, there will be good
opportunities to attend international conferences and workshops. The
formal applications should be submitted before June 1. Prospective
candidates are strongly encouraged to contact Dr Filip Jurcicek
(jurcicek@ufal.mff.cuni.cz) as soon as possible to obtain details
about the application process, the institute, and the research
opportunities.
Additional information is available here:
http://ufal.mff.cuni.cz/~jurcicek/jobs.

6-12

(2012-03-03) 3 Post-doctoral positions at the Bruno Kessler Foundation, Center for Information Technology, Trento, Italy

3 Post-doctoral positions available in the 'Human Language Technologies - HLT' Research Unit at the Bruno Kessler Foundation, Center for Information Technology.

Workplace description:

The Human Language Technology (HLT) unit is a multi-disciplinary research unit that addresses the automatic processing of human language for a range of tasks. In particular, the unit focuses on automatic speech recognition, machine translation and content processing.

The HLT unit has been developing state-of-the-art technology in all the main research areas it operates in. The group has consistently performed well in several international evaluations, and is currently engaged in international projects for open source software development (e.g. the Moses platform for statistical machine translation). The unit also provides technological support and high-level services to optimize its internal research activities: a shared and efficient computing environment, software tools, and the creation and management of large-scale linguistic resources.

The HLT group is part of the larger network of research labs focusing on Human Language Technologies and related domains in the Trento region, which is quickly becoming one of the areas with the highest concentration of researchers in HLT and related fields anywhere in Europe.

More information about the HLT Unit is available at http://hlt.fbk.eu

The HLT Research Unit is looking for 3 candidates to carry out research activities in the fields of Textual Inferences, Machine Translation and Speech Recognition. The research positions will be funded through the following European research projects:

MateCat: http://www.matecat.com

EU-Bridge: http://www.eu-bridge.eu

EXCITEMENT: website in progress

Open positions:

A Postdoctoral position in Textual Inferences

(Ref.Code HLT_PostDoc2012_TI)

The candidate is expected to carry out research activities in the context of the EU-funded project EXCITEMENT on multilingual semantic processing. The goal of the EXCITEMENT project is to develop generic semantic 'engines' or platforms for robust textual inference that are applicable across languages and linguistic frameworks. These inference platforms will be leveraged for unsupervised text exploration on customer interaction data. Concrete systems will be developed for English, German, and Italian. Project partners are Bar-Ilan University, DFKI Saarbrücken, University of Heidelberg, Almawave S.r.l, NICE Systems, and OMQ GmbH.

The selected candidate will join the FBK research group with the aim of advancing the state of the art on component-based textual entailment.

A Postdoctoral position in Machine Translation

(Ref.Code HLT_PostDoc2012_MT)

The candidate is expected to contribute original research results inside leading edge international projects. The aim is to advance the state of the art in the integration of statistical MT in computer assisted translation and in adaptive MT, by drawing ideas and contributions from different areas, such as machine learning, statistical language processing, high performance computing, etc.

A Postdoctoral position in Speech Recognition

(Ref.Code HLT_PostDoc2012_SR)

The candidate is expected to contribute original research results inside a leading edge international project. The aim is to advance the state-of-the-art in multilingual speech processing by improving acoustic modelling, language modelling, and adaptation to different domains, conditions and genres. The contribution will be evaluated on application scenarios that include both efficient annotation of audiovisual archives and live processing of audio streams.

Job requirements:

 - Applicants should have a PhD degree related to any of the specific research areas mentioned (computational linguistics, speech processing or related fields).

 - Experience in statistical modelling, speech processing or machine learning (preferably with approaches applied to NLP tasks).

 - Experience in distributed software development (open source).

 - Skills in experimental work and development of algorithms.

 - Ability to work and deliver in funded research projects.

 - Oral and written proficiency in English.

In adherence to FBK's policy to promote equal opportunity and gender balance, in case of equal applications, female candidates will be given preference.

Employment:

Contract type: Full time, 30-month contract (may be extended up to 6 months).

Number of positions: 3

Gross salary: from €37,500 to €41,500 per year (depending on the candidate’s experience)

Benefits: 28 vacation days per year, flexi-time, company subsidized cafeteria or meal vouchers, internal car park, welcome office support for visa formalities, accommodation, social security, etc., reductions on bank accounts, public transportation, sport, accommodation and language courses fees.

Start date: Spring 2012

Location: Povo, Trento (Italy)

Application process:

To apply online, please send your detailed CV (.pdf format), including a list of publications, a statement of research interests and contact information for at least 2 references. Please include in your CV your authorization for the handling of your personal information as per the Personal Data Protection Code, Legislative Decree no. 196/2003 of June 2003.

Applications must be sent to jobs@fbk.eu

Emails should include the reference code of the position of interest (HLT_PostDoc2012_TI, HLT_PostDoc2012_MT or HLT_PostDoc2012_SR).

Application deadline: 9 April 2012

Short-listed candidates will be contacted for an interview. Non-selected applicants will be notified of their exclusion at the end of the selection process.

Please note that FBK may contact short-listed candidates who were not selected for the current openings within a period of 6 months for any selection process for similar positions.

For transparency purposes, the name of the selected candidate, upon his/her acceptance of the position, will be published on the FBK website at the bottom of the selection notice.

6-13

(2012-03-08) Post-docs at the Speech Processing and Transmission Lab, Universidad de Chile, Santiago, Chile

The Speech Processing and Transmission Lab (LPTV, Laboratorio de Procesamiento y Transmisión de Voz) at Universidad de Chile, Santiago, Chile, is looking for post-doc researchers in the following fields:

Robust speech recognition

Robust speaker verification

Second language learning assessment

The grants are funded by Conicyt (Chilean funding Agency): http://www.conicyt.cl

Applicants are required to present a brief research proposal prepared in collaboration with the director of the LPTV. For further information, contact:

Néstor Becerra Yoma, Ph.D.

Professor

Speech Processing and Transmission Laboratory

Department of Electrical Engineering

Universidad de Chile

Av. Tupper 2007, P.O. Box 412-3

Santiago, Chile

Tel. +56 2 978 4205

Fax. +56 2 695 3881

E-mail: nbecerra@ing.uchile.cl

http://www.cec.uchile.cl/~labptvoz/

6-14

(2012-03-10) Senior Researcher/Research Associate in Statistical Dialogue Systems at Cambridge UK

Senior Researcher/Research Associate in Statistical Dialogue Systems

Applications are invited at either the Senior Research Associate or Research Associate level to work on an EU-funded project called Parlance which aims to build mobile voice-driven systems for interactive hyper-local search.

Candidates should have a PhD or comparable research experience in spoken dialogue systems and noise robust automatic speech recognition and understanding. Good programming skills are essential and familiarity with HTK would be an advantage.

Appointment at the senior level will require at least 3 years post-doctoral experience and evidence of independent standing. Salary range is from £27,578 to £46,846.

This is an exciting opportunity to join one of the leading groups in statistical speech and language processing. Cambridge provides excellent research facilities and there are extensive opportunities for collaboration, visits and attending conferences.

Contact Prof Steve Young (sjy@eng.cam.ac.uk) for further information.

Application details can be found at: http://www.jobs.cam.ac.uk/job/-14472

6-15

(2012-03-12) Postdoc position: Acoustic to articulatory mapping of fricative sounds, LORIA, Nancy, France

Postdoc position: Acoustic to articulatory mapping of fricative sounds

15 months, start between September and December 2012 at LORIA (Nancy, France).

Contact: Yves.Laprie@loria.fr

Context

This subject deals with acoustic-to-articulatory mapping [Maeda et al. 2006], i.e. the recovery of the vocal tract shape from the speech signal, possibly supplemented by images of the speaker’s face. This is one of the great challenges in automatic speech processing and has not yet received a satisfactory answer. The development of efficient algorithms would open new directions of research in second language learning, language acquisition and automatic speech recognition.

The objective is to develop inversion algorithms for fricative sounds. Numerical simulation models for fricatives now exist. Their acoustics and dynamics are better known than those of stops, so fricatives will be the first category of sounds to be inverted after vowels, for which the Speech group has already developed efficient algorithms.

The production of fricatives differs from that of vowels in two respects:

- The vocal tract is excited not by the vibration of the vocal cords at the larynx but by a noise source. This noise originates in the turbulent air flow downstream of the constriction formed by the tongue and the palate.
- Only the cavity downstream of the constriction is excited by the source.

The proposed approach is analysis-by-synthesis. This means that the signal, or the speech spectrum, is compared to a signal or a spectrum synthesized by a speech production model. This model incorporates two components: an articulatory model intended to approximate the geometry of the vocal tract, and an acoustical simulation intended to generate a spectrum or a signal from the vocal tract geometry and the noise source. The articulatory model is geometrically adapted to a speaker from MRI images and is used to build a table of pairs, each associating one articulatory vector with the corresponding acoustic image vector. During inversion, all the articulatory shapes whose acoustic parameters are close to those observed in the speech signal are recovered. Inversion is thus an advanced table lookup method, which we used successfully for vowels [Ouni & Laprie 2005] [Potard et al. 2008].
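
As an illustration only, the table lookup idea described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the lab's method: `toy_synthesize` is a hypothetical stand-in for the articulatory and acoustic models, the vectors are two-dimensional, the table is a plain regular grid, and closeness is measured with a Euclidean distance.

```python
import itertools
import math


def toy_synthesize(articulatory):
    """Hypothetical stand-in for the articulatory + acoustic models.

    Maps a 2-D articulatory vector to a 2-D acoustic vector. The mapping is
    deliberately many-to-one (it depends on a * a), so that inversion can
    return several candidate shapes for one observation.
    """
    a, b = articulatory
    return (a * a + b, a * a - b)


def build_table(grid_values):
    """Sample the articulatory space on a regular grid and store
    (articulatory vector, acoustic vector) pairs."""
    return [(art, toy_synthesize(art))
            for art in itertools.product(grid_values, repeat=2)]


def invert(table, observed, tolerance):
    """Return every articulatory vector whose acoustic image lies within
    `tolerance` (Euclidean distance) of the observed acoustic vector."""
    return [art for art, ac in table if math.dist(ac, observed) <= tolerance]


# Grid of multiples of 0.25 between -1 and 1 (assumed sampling, for the toy).
grid = [x / 4 for x in range(-4, 5)]
table = build_table(grid)

# Pretend this acoustic vector was measured from the speech signal.
observed = toy_synthesize((0.5, 0.25))
candidates = invert(table, observed, tolerance=1e-9)
print(candidates)
```

In this toy the lookup returns two articulatory candidates for the same observation, which reflects why inversion recovers all shapes close to the observed acoustics rather than a single one.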

Activities

The success of an analysis-by-synthesis method relies on the implicit assumption that synthesis can correctly approximate the speech production process of the speaker whose speech is inverted. There exist fairly realistic acoustic simulations of fricative sounds, but they strongly depend on the precision of the geometrical approximation of the vocal tract used as an input. There also exist articulatory models of the vocal tract which yield very good results for vowels. On the other hand, these models are inadequate for those consonants which often require a very accurate articulation at the front part of the vocal tract. The first part of the work will be the elaboration of articulatory models that are adapted to the production of both consonants and vowels. The validation will consist of driving the acoustic simulation from the geometry and assessing the quality of the synthetic speech signal with respect to the natural one. This work will be carried out on a few X-ray films for which the acoustic signal recorded during acquisition is of sufficiently good quality.

The second part of the work will address several aspects of the inversion strategy. Firstly, it is now accepted that spectral parameters implying a fairly marked smoothing and frequency integration have to be used, which is the case of MFCC (Mel Frequency Cepstral Coefficients) vectors. However, the most suitable spectral distance for comparing natural and synthetic spectra has to be investigated. Another solution consists in modeling the source so as to limit its impact on the computation of the spectral distance.
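As an illustration of what a smoothed spectral distance can look like, here is a minimal sketch under stated assumptions. It takes log band energies as given (a real MFCC front end would first apply a mel filterbank, which is omitted here), derives cepstral coefficients with a DCT-II, and compares two frames with a Euclidean distance. Dropping the zeroth coefficient is one simple, assumed way of making the distance insensitive to an overall gain difference; it is not the source modeling the text refers to, and the numbers are made up.

```python
import math

def dct_cepstrum(log_energies, n_coeffs):
    """Unnormalized DCT-II of a vector of log band energies."""
    n = len(log_energies)
    return [
        sum(e * math.cos(math.pi * k * (i + 0.5) / n)
            for i, e in enumerate(log_energies))
        for k in range(n_coeffs)
    ]

def cepstral_distance(c_a, c_b, skip_c0=True):
    """Euclidean distance between two cepstral vectors.
    With skip_c0=True the zeroth coefficient (overall level) is ignored."""
    start = 1 if skip_c0 else 0
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c_a[start:], c_b[start:])))

# Made-up log band energies for a "natural" frame and a "synthetic" frame
# that differs only by a constant gain (a constant offset in the log domain).
natural = [2.0, 3.5, 3.0, 1.0, 0.5, 0.2, 0.1, 0.05]
synthetic = [e + 1.2 for e in natural]

c_nat = dct_cepstrum(natural, n_coeffs=6)
c_syn = dct_cepstrum(synthetic, n_coeffs=6)

d_shape_only = cepstral_distance(c_nat, c_syn, skip_c0=True)
d_with_level = cepstral_distance(c_nat, c_syn, skip_c0=False)
```

Because a constant offset in the log domain only affects the zeroth DCT coefficient, `d_shape_only` is numerically zero here while `d_with_level` is not.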

The second point is the construction of the articulatory table, which has to be revisited for two reasons: (i) only the cavity downstream of the constriction plays an acoustic role, and (ii) the location of the noise source is an additional parameter, but it depends on the other articulatory parameters. The third point concerns how to take the vocal context into account. Indeed, the context is likely to provide important information about the vocal tract deformations before and after the fricative sound, and thus constraints for inversion.

A comprehensive software environment for acoustic-to-articulatory inversion already exists in the Speech group and can be used by the post-doctoral researcher.

References

- [Ouni & Laprie 2005] S. Ouni and Y. Laprie, Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion, Journal of the Acoustical Society of America, Vol. 118, pp. 444-460, 2005.

- [Potard et al. 2008] B. Potard, Y. Laprie and S. Ouni, Incorporation of phonetic constraints in acoustic-to-articulatory inversion, Journal of the Acoustical Society of America, Vol. 123(4), pp. 2310-2323, 2008.

- [Maeda et al. 2006] Technology inventory of audiovisual-to-articulatory inversion, http://aspi.loria.fr/Save/survey-1.pdf

Expected skills

Knowledge of speech processing and articulatory modeling, acoustics, computer science, applied mathematics


6-16

(2012-03-15) Research engineer at IRISA, Lannion, France

A research engineer position (24-month fixed-term contract) is open in the Cordial research team at IRISA in Lannion. The position, whose research profile lies in speech processing and speech signal processing, is part of the ANR Phorevox project. It is to be filled as soon as possible. The detailed profile is available at the following link:
http://www.irisa.fr/doc/emploi/emploi_cordial-12_03.pdf


6-17

(2012-03-15) Research and development engineer at the Laboratoire National de Métrologie et d’Essais (LNE)

TESTING DIRECTORATE

Environmental Testing Division

RESEARCH AND DEVELOPMENT ENGINEER

IN NATURAL LANGUAGE PROCESSING (M/F)

Ref: CL/TAL/DE

Context:

The Laboratoire National de Métrologie et d’Essais (LNE) offers services for evaluating the performance of automatic language and speech processing systems on a given task (transcription, translation, information extraction, …).

Within the EMC, Electrical Safety and Information Technologies Department, the multimedia information processing team works on the different stages that make up an evaluation. Its main missions are:

- To define relevant tasks to evaluate, according to application and/or theoretical needs,

- To determine the characteristics of the data to be used for the task under consideration,

- To establish measures that reflect how well a system performs on a given task.

Missions:

As part of research and development programmes, your mission will be to contribute to the growth of this activity, in particular through the following:

- Setting up and managing research and development projects in the multimedia field,

- Designing protocols to address evaluation issues in Natural Language Processing:

Definition of measures
Implementation of software tools
Organisation and running of results workshops
Participation in the dissemination of knowledge (website maintenance, scientific publications, …)

- Developing international partnerships in order to strengthen LNE’s position in the field.

Profile:

Engineer with a PhD in computer science, specialised in Natural Language Processing (NLP).

You have initial professional experience (3 to 5 years in addition to the PhD), during which you worked on the evaluation of automatic systems.

You are proficient in project management and comfortable with client relations and with organising and leading meetings and/or seminars.

You have solid knowledge of programming and data analysis (data mining).

Rigorous, dynamic, determined and easy to work with, you will quickly integrate into the teams and demonstrate the leadership and expertise needed to succeed in your mission.

Fluent English is essential.

Travel is to be expected (about ten trips per year of 1 to 3 days, mostly in France).

Permanent position (CDI) based in Trappes (78).

Contact:

Apply quoting reference CL/TAL/DE, for the attention of Ms Christelle LEBRAULT, by email: recrut@lne.fr


6-18

(2012-03-25) Audio Indexing Researcher W/M position at IRCAM – 3DTV project

If you have already applied for this position, please just send us a quick email telling us you are still interested and we will get back to you.

Audio Indexing Researcher W/M position at IRCAM – 3DTV project

Starting: April 2012 (as soon as possible)

Duration: 18 months

Introduction to IRCAM

IRCAM is a leading non-profit organization associated with the Centre Pompidou, dedicated to music production, R&D and education in acoustics and music. It hosts composers, researchers and students from many countries cooperating in contemporary music production and in scientific and applied research. The main topics addressed in its R&D department include acoustics, audio signal processing, computer music, interaction technologies and musicology. IRCAM is located in the centre of Paris near the Centre Pompidou, at 1, Place Igor Stravinsky 75004 Paris.

Introduction to the 3DTVS project

The goal of the 3DTVS project is to devise scalable 3DTV AV content description, indexing, search and browsing methods across open platforms, using mobile and desktop user interfaces, and to incorporate such functionalities in 3D audiovisual content archives. 3D multichannel audio analysis targets audio event detection based on fusion techniques that combine the feature analysis performed in the individual channels with source localization and separation algorithms for the detection of moving audio sources. The results will be used in 3D audio/cross-modal indexing and retrieval. Multimodal 3D audiovisual content analysis will build on the results of 3D video and audio analysis. 3DTV content description and search mechanisms will be developed to enable fast replies to semantic queries.

Role of IRCAM in the 3DTV Project

In the 3DTVs project, IRCAM is in charge of the research and development of technologies related to

- Audio event detection using multi-channel audio scenes

- Speaker diarization

- Segmentation into movie scenes from the audio signal

- Sound source separation, localization and identification

Position description

The hired researcher will be in charge of the development of technologies related to:

- Audio event detection using multi-channel audio scenes
- Speaker diarization

The researcher will also collaborate with the development team and participate in the project activities (evaluation, meetings, specifications).

Required profile

- Strong skills in audio indexing and data mining (statistical modelling, automatic feature selection algorithms, …), especially late-fusion algorithms
- Strong skills in audio signal processing (spectral analysis, audio-feature extraction, parameter estimation)
- Strong skills in Matlab programming; skills in C/C++ programming
- Good knowledge of Linux, Windows and Mac OS environments
- High productivity, methodical work, excellent programming style

Salary

According to background and experience

Applications

Please send an application letter together with your resume and any suitable information addressing the above issues preferably by email to: peeters_a_t_ircam dot fr with cc to vinet_a_t_ircam dot fr, roebel_at_ircam_dot_fr

IRCAM is recruiting a researcher (M/F) on a full-time, 18-month fixed-term contract for the 3DTVs project

Position available from April 2012

Introduction to IRCAM

IRCAM is a non-profit organization associated with the Centre National d'Art et de Culture Georges Pompidou, whose missions include research, creation and education activities around 20th-century music and its relations with science and technology. Within its R&D department, specialized teams carry out research and software development in the fields of acoustics, audio signal processing, interaction technologies, computer music and musicology. IRCAM is located in the centre of Paris near the Centre Georges Pompidou, at 1, Place Stravinsky 75004 Paris.

Introduction to the 3DTVs project

The goal of the 3DTVs project is to design scalable descriptions of 3DTV content, together with indexing, search and browsing methods across open platforms, using mobile and desktop user interfaces, and to integrate such 3D functionalities into audiovisual content archives. 3D multichannel audio analysis targets audio event detection based on fusion techniques that combine the audio analysis performed in the individual channels with source localization and separation algorithms for detecting moving audio sources. The results will be used for 3D audio and cross-modal indexing as well as for retrieval. Multimodal 3D audio/video indexing of audiovisual content will build on the results of 3D video and 3D audio indexing. Content description and search methods will be developed to enable fast responses to semantic queries.

Role of IRCAM in the 3DTVs project

In the 3DTVs project, IRCAM is in charge of the research and development of technologies related to

- Audio event detection using multi-channel audio scenes

- Speaker diarization

- Sound source separation, localization and identification

Missions

The researcher will be in charge of the development of technologies related to:

- Audio event detection using multi-channel audio scenes

The researcher will also collaborate with the development team and participate in the project activities (evaluation, meetings, specifications).

Required profile

Extensive research experience in audio indexing (statistical modelling, automatic feature selection, …); strong knowledge of late-fusion techniques
Extensive research experience in signal processing (spectral analysis, audio-feature extraction, parameter estimation)
Very good knowledge of Matlab

Knowledge of Linux, Windows and Mac OS X environments
Knowledge of C and C++
High productivity, methodical work, excellent programming style, good communication skills, rigour

Salary

According to education and professional experience

Applications

Please send a cover letter and a CV detailing your level of experience/expertise in the areas mentioned above (as well as any other relevant information) to peeters _a_t_ ircam dot fr with copy to

vinet _a_t_ ircam dot fr, roebel _at_ ircam _dot_ fr



6-19

(2012-03-27) Post Doctoral Fellow or Research Associate, Toronto, Canada

Position: Post Doctoral Fellow or Research Associate (scientific)

Site: Toronto Rehabilitation Institute, University Centre, Toronto, Canada
Start Date: Immediately
Status: 1 year

We are searching for a motivated and skilled individual to take part in signal and audio processing of biomedical data for 1 year at the Toronto Rehabilitation Institute, University Centre. He/she will work on analysis and processing of audio signals to detect disease-specific patterns. This position will offer an opportunity to work in a resourceful and multi-disciplinary environment and publish findings in scholarly journals.

KEY RESPONSIBILITIES:

Problem analysis and conceptual design of solutions
Literature review
Study and analysis of biomedical data
Developing code for biomedical signal and audio processing
Feature extraction and pattern classification
Evaluation of results and tuning of the model parameters to optimize the performance.
Documentation of procedures and outcomes

KEY REQUIREMENTS:

At least 3 years of practical experience either in graduate studies or in industry in audio or speech processing
Proficiency in speech/audio processing tools using MATLAB or standalone toolkits such as HTK
Theoretical knowledge of pattern classification algorithms such as HMMs, support vector machines, and neural networks
Good technical writing skills and sound publication record
Excellent interpersonal, organizational, and multi-tasking skills

ASSET REQUIREMENTS:

Good programming skills (Python, C, MATLAB, ...)
Experience in processing of physiological signals
Preference will be given to PhD graduates

Please quote job reference “Sleep Apnea DSP” and send résumé to:

Alshaer dot Hisham at torontorehab dot on dot ca


6-20

(2012-04-02) Research position in Spoken Language Dialogue Systems Development for Serious Games ; University of Ulm Germany

Research Position with perspective of a PhD degree in Spoken Language
Dialogue Systems Development for Serious Games

The Dialogue Systems Group (www.dialogue-systems.org) in the Faculty of
Engineering and Computer Sciences, University of Ulm is seeking a
researcher at MSc level to work in the area of Spoken Dialogue
Management for Serious Games. The research topic will fit into the
scientific context of the group (including Intelligent, Adaptive and
Proactive Spoken Language Dialogue Interaction, Semantic Analysis, and
Dialogue Modelling) but will be adapted to the expertise of the candidate.

The dialogue management system will manage the communication between
mobile nodes that are connected via a mobile ad hoc network (MANET). Due
to the mobility of the nodes and the limited range of wireless
transmissions, the underlying network topology and link quality change
frequently. In order to build an adaptive dialogue management system,
the network can provide the system with information about the available
resources (such as segmentation and link quality). In turn, the dialogue
manager can request different Quality of Service levels for individual
communications (such as low latency, delay tolerance or high
reliability).

Perspective: PhD Thesis.

Requirements: Good programming skills in C, C++, Perl, VoiceXML, Java,
JavaScript and experience with Unix/Linux are highly desirable;
expertise in speech and dialogue technologies would also be appreciated.

The appointment (0.5 TV-L) has a fixed duration of 36 months.

Candidates should send their application electronically to
wolfgang.minker@uni-ulm.de. The application should include a short
resume, the names of two referees and a transcript of records with the
results of exams relevant to the MSc Degree. A pdf-version of the MSc
Thesis may also be included.

Dialogue Systems Group
Institute of Communications Engineering
Faculty of Engineering and Computer Sciences
University of Ulm, Germany

--
Wolfgang Minker
Ulm University
Communications Engineering - Dialogue Systems
Albert-Einstein-Allee 43
D-89081 Ulm
Phone: +49 731 502 6254/-6251
Fax: +49 731 501 226254
http://dialogue-systems.org


6-21

(2012-04-04) PhD fellowship- Fondazione Bruno Kessler (FBK), Trento, Italy

A PhD fellowship is available for conducting research studies in the
field of Automatic Speech Recognition at the Human Language Technology
Research Unit (http://hlt.fbk.eu/en/openpositions/phd-ict) of
Fondazione Bruno Kessler (FBK), Trento, Italy. Research work will be
carried out at FBK as part of the PhD Program of the International
Doctorate School in Information and Communication Technologies
(http://www.ict.unitn.it) of the University of Trento, Italy.
Interested candidates need to specify in the application form that
they intend to apply for the project-specific grant offered by FBK for
the Automatic Speech Recognition (ASR) project. FBK has a long
tradition in developing automatic transcription systems for several
languages (information about the research group and ongoing activities
can be found at http://hlt.fbk.eu). The aim of the project is to
advance the existing FBK automatic transcription technology beyond
the state of the art. Possible research topics include but are
not limited to: improving acoustic modeling for large vocabulary ASR
(e.g. discriminative training algorithms, speaker adaptive acoustic
modeling, methods for fast and efficient adaptation to changing
domains, data selection methods for AM training, bootstrap methods for
under-resourced languages) and improving language modeling for large
vocabulary ASR (data selection for LM training, domain adaptation).
Details about the requirements for candidates can be found at
http://hlt.fbk.eu/en/openpositions/phd-ict.

Contact: Daniele Falavigna (falavi@fbk.eu)


6-22

(2012-04-04) Post-Doctoral Research Position, Aalto University

Post-Doctoral Research Position, Aalto University

Title: Statistical speech synthesis

Department: Department of Signal Processing and Acoustics

URL: http://spa.aalto.fi/en/

Start date: August-October 2012

Duration: 12-18 months contract

The Department of Signal Processing and Acoustics, Aalto University (Espoo, Finland), invites applications for a post-doctoral researcher position in speech technology. The position is funded by the Simple4All project (http://simple4all.org/), which is a collaboration between Aalto University, University of Edinburgh (coordinator), University of Helsinki, Universidad Politécnica de Madrid, and Universitatea Tehnica Cluj-Napoca. Simple4All is a 3-year project, funded by EC’s FP7 ICT Programme, whose general aim is to create speech synthesis technology that learns from data with little or no expert supervision and continually improves itself, simply by being used.

The work at the Department of Signal Processing and Acoustics focuses on novel vocoding technologies in statistical parametric speech synthesis. More specifically, we are interested in using speech models in statistical speech synthesis that are closer to the human speech production mechanism and are inherently able to produce many voice qualities. Applicants for the post-doctoral researcher position must have a PhD (or equivalent experience) in speech processing, digital signal processing or computer science. They must have a background in statistical speech synthesis; experience in the development of vocoders is particularly appreciated. In addition, experience of project development and project leadership in a research context, together with excellent communication, presentation, and organisational skills, is highly desirable.

To apply, please send your CV (.pdf format) including a list of publications and your contact information, a statement of research interests and contact information for at least 2 references. Applications must be sent to paavo.alku@aalto.fi using the subject line: Post-doc position in statistical speech synthesis

Application deadline: 30 June 2012


6-23

(2012-04-15) Full Time Research Programmer, Dialog Research Center, CMU Pittsburgh

Full Time Research Programmer, Dialog Research Center
Language Technologies Institute, CMU Pittsburgh

Minimum Education Level: Bachelor's Degree

The Dialog Research Center (dialrc.org) provides infrastructure support for spoken dialog systems, including distribution of data and software. DIALRC is funded by the US National Science Foundation, and is hosted at the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University's main campus in Pittsburgh PA, US. In addition to distributing open source software and dialog data, DIALRC offers live dialog platforms for researchers to evaluate their techniques with real users in live situations. The person filling the position will be responsible for developing and supporting existing dialog software, managing distributions, supporting other researchers using the system and other programming and support tasks as necessary for the center.

Primary Tasks

Work with graduate student researchers to maintain and further develop existing dialog systems based on the open-source Olympus Spoken Dialog System framework. Maintain and distribute the data corpus. General programming and research support.

Preferred Skills

Two or more years experience in research programming; some experience in supporting research software systems; experience with the CMU Olympus Spoken Dialog System; other spoken dialog systems; and/or speech recognition and speech synthesis.

Experience in Speech Processing/Natural Language Dialog: Python, Perl, C/C++

More information is available from Maxine Eskenazi (max@cs.cmu.edu) and Alan W Black (awb@cs.cmu.edu).

Or go to http://www.cmu.edu/jobs/postings/index.html and search for Job Number 9039


6-24

(2012-04-20) PhD grant: Prosodic markers at IRIT Toulouse

Modelling trajectories of prosodic and linguistic markers; application to characterising speakers' intentions in audiovisual discourse

Contact

Jérôme Farinas, jfarinas@irit.fr, SAMOVA team http://www.irit.fr/recherches/SAMOVA/

Topic description

In the field of automatic audio processing, current systems have become mature enough to extract, fairly reliably, information about the speakers present, the language used and the transcription of speech. One of the goals of current research is to use this information to structure speaker turns and, more broadly, radio and television content.

In this context, the SAMOVA team at IRIT has in recent years acquired strong expertise in speaker modelling and automatic speaker segmentation [Louradour 2007, El Khoury 2010], automatic language identification [Pellegrino 1998, Farinas 2002, Rouas 2005], speech/music/singing segmentation [Pinquier 2004, Lachambre 2009], jingle extraction [Pinquier 2004], speech transcription [Campagne ESTER 2004], detection of conversational speech zones [Projet EPAC 2010] and keyword spotting [Le Blouch 2009]. Building on this work, the team is working on structuring broadcast programmes based on speaker turns and their interactions [Bigot 2011] as well as on video [Ercolessi 2011].

Starting from a characterisation of the role of each participant (presenter, dominant speaker, ...), our objective is to study more precisely the interactions between speakers in order to distinguish what, in the message, relates to interaction management (opening, closing, introducing a guest, managing speaker turns) from exchanges of opinion. More broadly, the proposed thesis topic aims to study intention in people's audiovisual contributions.

The modelling of intentions is mainly based on the modelling of prosody, which, through intonation and rhythm, shapes the form of the discourse. This modelling will have to take into account prosody at short and long term [Farinas 2002, Rouas 2004]. Two levels of modelling will therefore be implemented in order to characterise sentence modality and the modification of word prosody. This will involve choosing appropriate prosodic parameters (F0, energy) and modelling these parameters statistically. Temporal evolution can be taken into account using stochastic models and trajectory models.
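To make the notion of prosodic parameters and trajectories more concrete, the following minimal sketch summarises an F0 contour by its mean, its range and the slope of a least-squares line fitted over voiced frames. Everything in it is an assumption made for illustration: the function name `f0_trajectory_features`, the convention that unvoiced frames are coded as 0, the 10 ms frame step and the choice of a straight line as the trajectory model. It is not the modelling proposed for the thesis.

```python
def f0_trajectory_features(f0_hz, frame_step_s=0.01):
    """Summarise an F0 contour (one value per frame, 0 = unvoiced)
    by its mean, range and linear slope over the voiced frames."""
    voiced = [(i * frame_step_s, f) for i, f in enumerate(f0_hz) if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    times = [t for t, _ in voiced]
    values = [f for _, f in voiced]
    mean_t = sum(times) / len(times)
    mean_f = sum(values) / len(values)
    # Ordinary least-squares slope of F0 against time.
    num = sum((t - mean_t) * (f - mean_f) for t, f in voiced)
    den = sum((t - mean_t) ** 2 for t in times)
    return {
        "mean_hz": mean_f,
        "range_hz": max(values) - min(values),
        "slope_hz_per_s": num / den,
    }

# Made-up rising contour with one unvoiced frame in the middle.
contour = [100.0 + 2.0 * i for i in range(10)]
contour[4] = 0.0
features = f0_trajectory_features(contour)
```

A statistical model could then be trained on such per-segment features; the sign of the slope at the end of an utterance is one commonly cited cue to sentence modality, although how well it works on broadcast speech would have to be verified.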

This study will proceed in two phases:

- First, it will focus on identifying linguistic markers (through the detection of key expressions) and prosodic markers (emphasis, sentence modality, local intonation) characteristic of certain communicative functions present in interactions between people. These indicators will make it possible to locate the zones of the document in which information about the speaker (name, status) is potentially present, and will give details about the context in which the person is speaking (interview, debate, …). This information can help both to describe the content better and to reinforce the results of speech recognition, which is particularly difficult in debates and spontaneous speech.
- Second, based on the information available about the speakers, the study will focus on analysing their intentions. For a presenter, for example, the aim will be to determine the zones that correspond to introducing guests, managing speaker turns, and opening or closing debates, whereas for a guest the aim will rather be to qualify his or her turns in order to characterise the purpose of the contribution (giving an opinion, raising an objection, ...) through, among other things, the message, tone, behaviour, mode of expression and local prosody, as well as cues from the video (overlaid text, ...).

The applications of this research concern the structuring of audiovisual content to support document archiving and information retrieval within this content. Structuring and characterising interaction zones is also of interest for building audiovisual summaries.

The candidate must hold a Master's degree with strong skills in computer science. Knowledge of signal processing and of speech recognition and prosody would be desirable.

References

[Louradour 2007] Noyaux de séquences pour la vérification du locuteur par Machines à Vecteurs de Support. PhD thesis, Université Paul Sabatier, January 2007.

[El Khoury 2010] Unsupervised Video Indexing based on Audiovisual Characterization of Persons. PhD thesis, Université de Toulouse, June 2010.

[Pellegrino 1998] Une approche phonétique en identification automatique des langues : la modélisation acoustique des systèmes vocaliques. PhD thesis, Université Paul Sabatier, December 1998.

[Farinas 2002] Une modélisation automatique du rythme pour l'identification des langues. PhD thesis, Université Paul Sabatier, November 2002.

[Rouas 2005] Caractérisation et identification automatique des langues. PhD thesis, Université Paul Sabatier, March 2005.

[Pinquier 2004] Indexation sonore : recherche de composantes primaires pour une structuration audiovisuelle. PhD thesis, Université Paul Sabatier, December 2004.

[Lachambre 2009] Caractérisation de l'environnement musical dans les documents audiovisuels. PhD thesis, Université de Toulouse, December 2009.

[Campagne ESTER 2004] G. Gravier, J.F. Bonastre, S. Galliano, E. Geoffrois, K. Mc Tait and K. Choukri. ESTER, une campagne d'évaluation des systèmes d'indexation d'émissions radiophoniques, Proc. Journées d'Etude sur la Parole, April 2004.

[Projet EPAC 2010] Yannick Estève, Thierry Bazillon, Jean-Yves Antoine, Frédéric Béchet, Jérôme Farinas. The EPAC corpus: manual and automatic annotations of conversational speech in French broadcast news (regular paper). In: Language Resources and Evaluation Conference (LREC 2010), Valletta, Malta, 19-21 May 2010, Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk (Eds.), European Language Resources Association (ELRA), pp. 1686-1689, 2011.

[Le Blouch 2009] Décodage acoustico-phonétique et applications à l'indexation audio automatique. PhD thesis, Université Paul Sabatier, June 2009.

[Bigot 2011] Benjamin Bigot, Isabelle Ferrané, Julien Pinquier, Régine André-Obrecht. Speaker Role Recognition to help Spontaneous Conversational Speech Detection (regular paper). In: International workshop on Searching Spontaneous Conversational Speech SCSS (SCSS 2010), Firenze, Italy, 25-29 October 2010, ACM, pp. 5-10, October 2010.

[Ercolessi 2011] Philippe Ercolessi, Hervé Bredin, Christine Sénac and Philippe Joly, Segmenting TV series into scenes using speaker diarization, WIAMIS, 12th International Workshop on Image Analysis for Multimedia Interactive Services, Delft, The Netherlands, 13-15 April 2011.

Keywords

Automatic Speech Processing, Phonetic Decoding, Keyword Spotting, Prosody, Acoustics, Structuring Programs, Video


6-25

(2012-04-20) Engineer at INRIA, France

Inria is looking for a recent engineering graduate to develop its audio source separation toolbox FASST (http://bass-db.gforge.inria.fr/fasst/) and to carry out research on noise-robust speech recognition.

Qualification: engineering degree or Master's degree (M2), obtained in 2011 or 2012
Duration: 2 years
Place of work: Nancy, France
Expected start date: 1 December 2012
Salary: €2,527 gross per month

Applications will be reviewed on a rolling basis. Further information and the application form are available at
http://www.inria.fr/institut/recrutement-metiers/offres/ingenieurs-jeunes-diplomes/%28view%29/details.html?nPostingTargetID=11988


6-26

(2012-05-01) PhD: Automatic continuous speech recognition: spontaneous speech, LORIA, Nancy, France

Thesis topic: Automatic continuous speech recognition: spontaneous speech

Supervisors for this topic:
- Irina Illina, Associate Professor (Maître de conférences), HDR, Université de Lorraine, office C147, tel. 03 83 59 84 90, email illina@loria.fr
- Denis Jouvet, INRIA Research Director, HDR, office C147, tel. 03 54 95 86 26, email denis.jouvet@inria.fr
Funding type: doctoral contract

Location: Inria-LORIA, Nancy

The topic is posted on the website of the IAEM doctoral school, http://www.iaem.uhp-nancy.fr/ , under the section 'propositions contrats doctoraux' (doctoral contract proposals).

Application deadline: 1 June

Context: Speech recognition is a process by which a computer transforms the acoustic signal of spoken speech into text. During this process, the recognition system uses acoustic models, language models and a pronunciation lexicon.
Spontaneous speech is defined as an utterance conceived and perceived as it is being spoken. Compared with prepared speech, spontaneous speech is characterised by:
- insertions (hesitations, repetitions, pauses, restarts, false starts);
- pronunciation variations (contraction of words or phonemes);
- difficult environments (laughter, overlapping speech);
- ungrammatical sentences.
Spontaneous speech occurs in several forms: interviews, debates, dialogues. These specific features are taken into account little or not at all in speech recognition systems.
To improve the performance of recognition systems, two open problems need to be addressed:
- on the one hand, automatically detecting these spontaneous speech events;
- on the other hand, taking them into account in the recognition system at both the acoustic and the linguistic level.
To characterise and detect spontaneous speech, [Dufour et al. 2009] propose a set of acoustic features (duration and phone rate) and linguistic features (specific morphemes, repetitions and false starts). As for taking spontaneous speech into account, several research directions have proved promising, such as latent pronunciation analysis with prior knowledge [Lin 2007], the use of dictionaries with multiple pronunciations derived from spontaneous speech, and the study of different acoustic contexts of phonemes [Dupont et al. 2005].

The aim of this thesis is to contribute elements of a solution to this problem by proposing new methods that better account for the characteristics of spontaneous pronunciation in automatic speech recognition.
The first objective of the thesis is to increase our knowledge of the variability of spontaneous speech across different types of speech (interviews, dialogues, etc.). We will focus mainly on the segmental and acoustic aspects of the problem. Prosodic aspects could also be considered.
The second objective concerns the detection and localization of these spontaneous speech phenomena and, above all, taking them into account to improve speech recognition. This will rely on enriching the models to reflect the knowledge acquired, and on implementing techniques for detecting these phenomena. The work will be carried out within the PAROLE team at LORIA using the ANTS system [Brun et al.2005]. After a literature review, the student will analyze speech corpora, develop modules for processing spontaneous speech and integrate them into our speech recognition system. The student will then evaluate the improvements on different speech corpora. Our team already has a corpus rich in spontaneous speech: the corpus of radio and television broadcasts from the ESTER and ETAPE evaluation campaigns.
Possible validation on a corpus of speech from elderly people (with a view to home assistance) would probably allow us to identify and study other spontaneous speech phenomena.
The areas covered by this topic are: automatic speech recognition, probabilistic modeling, spontaneous speech, acoustic modeling, language modeling.

References:
[Brun et al.2005] A. Brun, C. Cerisara, D. Fohr and I. Illina. ANTS : le système de transcription automatique du LORIA. Workshop ESTER, 2005.
[Dufour et al.2009] R. Dufour, V. Jousse, Y. Estève, F. Bechet and G. Linares. Spontaneous speech characterization and detection in large audio database. SpeCom, 2009.
[Dupont et al.2005] S. Dupont, C. Ris, L. Couvreur and J.-M. Boite. A study of implicit and explicit modeling of coarticulation and pronunciation variation. Interspeech, 2005.
[Lin2007] L.-S. Lin, C.-K. Lee. Pronunciation modeling for spontaneous speech recognition using latent pronunciation analysis (LPA) and prior knowledge. ICASSP, 2007.

6-27

(2012-05-13) PhD position: Characterization of the sound ambiance in ethnomusicological recordings, IRIT Toulouse France

Title: Characterization of the sound ambiance in ethnomusicological recordings

Supervisors: Régine André-Obrecht and Julien Pinquier (IRIT, SAMoVA team), obrecht@irit.fr and pinquier@irit.fr

This thesis concerns the processing of ethnomusicological data from the archives of the CNRS-Musée de l'Homme, managed by the Centre de Recherche en EthnoMusicologie (CREM) of the Laboratoire d'Ethnologie et de Sociologie Comparative (LESC). These documents are currently being digitized and catalogued (3500 hours of unpublished recordings, from 1900 to the present day, of traditional music and ethnographic field surveys from around the world, and 3500 hours of old and rare documents). This collection is of great historical importance and is unique in the world. In this application context, a set of tools for automatic audio processing (speech, music, singing, noise, etc.) needs to be developed in order to produce (semi-)automatic indexing for intelligent access to the collection of sound recordings. This work is primarily intended for (expert) researchers in ethnomusicology.

The planned study aims at a finer characterization of the Speech, Music, Singing and Noise components in order to define the generic sound environment. In addition, introducing a semi-supervised approach (taking into account available metadata or input from the user) should make it possible to characterize specific sound environments.

After becoming familiar with the various systems previously developed at IRIT for speech and music detection, the doctoral student will be responsible for adapting them to the project corpus. Analysis of the detected speech and singing-voice regions should lead to a segmentation into speech turns and singing turns, followed by clustering of these segments based on voice similarity. Since the recordings are made in natural conditions, once the speech, music and singing regions have been identified, there remain sound regions that are of interest to an ethnomusicologist, because listening to them helps clarify the sound context of the recording session, which is referred to as the 'sound ambiance' ('ambiance sonore'). The proposal is to locate these noise regions of interest and to specify a labeling scheme. To this end, two strategies are envisaged:

- a supervised mode using classical acoustic features (generic approach),

- an unsupervised mode that introduces knowledge from ethnomusicologists (specific approach) via the Telemeta platform (http://crem.telemeta.org/).

This doctorate will be funded by the ANR project DIADEMS, which will start in October 2012. It would be an advantage for the candidate to have knowledge of pattern recognition and of speech and music processing.

Deadline for replies: 15 June 2012

6-28

(2012-06-01) Two positions at Nuance Belgium

Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world. Our technologies, applications and services make the user experience more compelling by transforming the way people interact with information and how they create, share and use documents. Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, getting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched. Making each of those experiences productive and compelling is what Nuance is all about.

Speech Recognition Specialist

Merelbeke, Belgium. Permanent role. Responses to Craig.Robertson@Nuance.com

Nuance Mobile builds innovative, intelligent and intuitive touch and speech interfaces to simplify and enhance the way people interact with mobile devices, applications, and services. Nuance Mobile solutions make mobile devices and in-car systems easier to use, automate customer self-service, and optimize the access and discovery of even the most advanced mobile applications and content - regardless of technical know-how, location, environment, or physical and literacy capabilities.

As a contributing member of Nuance, you will work within a dynamic team environment to develop, support, market and sell our award-winning software applications. We offer competitive compensation packages and a challenging technical but casual work environment. Join our dynamic, entrepreneurial team that operates worldwide (Europe, US, APAC). Be a part of our fast growing track of continuing success.

For more information, please see www.nuance.com.

Nuance is an equal opportunity employer.

Responsibilities

As a Speech Recognition Specialist at Nuance, you will work with peers from other teams around the world to investigate new and best uses of speech recognition for the music and/or POI vertical domains. You will work closely with our R&D department to understand what is and is not feasible given the current limitations of the technology, and help customers and Nuance's internal integration teams to build Nuance technologies into successful products efficiently.

Representative tasks will include:

Investigate new & best usage of our speech recognition technologies for entering a POI (Point of Interest) by voice, considering the platform & technology constraints
Investigate new & best usage of our speech recognition technologies for accessing music by voice, considering the platform & technology constraints
Crunch navigation POI data from map providers to build proof of concepts and experiments
Contribute to research and technology agendas by providing input and improvement requests to our R&D department
Support customer projects integrating ASR technologies for POI and/or music

Qualifications

Bachelors or Graduate University degree in Electrical Engineering, Computer Engineering, Computer Science or equivalent / related Technical Degree
First work experience
Strong C/C++ programming skills; proven software/system problem-solving skills.
Excellent oral and written communication skills in English are a must
Good listener and communicator, who can represent Nuance professional services at the customer’s premises or in written and oral communications with customers.
Positive 'can-do' attitude, well organized, focusing on achieving results cost-effectively
Ability and willingness to travel
Ability to work independently, including at customer premises, but always as part of the embedded team.
Self learner, with sense of initiative, and perseverance to deliver high quality work.

Preferred:

• Experience with embedded hardware platforms, embedded operating systems, and embedded software development is desirable

• Experience with Python and SQLite is highly desirable

• Windows CE or Linux or QNX OS

NLP Processing Engineer

Merelbeke, Belgium. Permanent role. Responses to Craig.Robertson@Nuance.com

Qualifications

Excellent background in statistics, pattern recognition, and/or signal processing

• Expertise in natural language processing, computational linguistics, statistical language modeling, search, and/or machine translation

• Strong programming skills, ideally in Python, Java, and/or C.

• Skills related to text processing, scripting languages, regular expressions

• Excellent oral and written communications skills in English.

• Ability to carry out focused and goal-oriented research and development, ability to assume responsibility for one’s work

• Ability to work in an international team as well as independently in fast-paced environment

• Ability to creatively solve problems while leveraging existing technology with an eye for efficiency.

PhD or equivalent research experience is a strong asset

• Good knowledge of speech recognition theory, acoustics, and/or psychoacoustics

• User interface, human-machine interaction, and dialogue system development experience

• Operational knowledge of languages other than English

MSc, ideally PhD in computer science, engineering, physics, mathematics, or other technical field

Craig Robertson

Recruitment Manager EMEA

6-29

(2012-06-11) PhD Student 'Increasing Robustness of Speech Recognition' Radboud University Nijmegen NL

PhD Student 'Increasing Robustness of Speech Recognition' (1.0 FTE)

Renewed job opening!

Faculty of Arts Vacancy number: 23.02.12 Closing date: 7 July 2012

Responsibilities

As a PhD student you will participate in the FP7 Marie Curie Initial Training Network Investigating Speech Processing In Realistic Environments (INSPIRE). This network provides research opportunities for 13 PhD students and 3 postdocs. You will become a member of an international team of researchers whose aim is to gain a better understanding of how listeners recognize speech, even under non-ideal circumstances. You will contribute to urgently needed solutions that help alleviate the serious communication problems that arise, especially for older and hearing-impaired persons, when different combinations of 'adverse' conditions affect the speech processing system. You will conduct your research in the framework of a project called 'Increasing robustness of speech recognition by using multiple signal representations'. Speech processing in the human brain presumably involves competition between multiple, intermediate signal representations. The redundancy of these different representations is assumed to help improve the robustness of recognition. In some cases, however, they may lead to conflicting interpretations, resulting in intelligibility problems. The goal of this PhD project is to investigate to what extent human recognition errors with regard to speech in 'adverse' conditions can be replicated by machines that were trained on multiple, partially redundant input representations.

Work environment

The project will be carried out at the Centre for Language and Speech Technology (CLST). CLST is a research unit within the Faculty of Arts of Radboud University Nijmegen and hosts a large international group of senior researchers and PhD students who conduct research at the frontier of science and develop innovative applications.

What we expect from you

You should:
- hold a Master's degree in engineering or science;
- have a strong background in machine learning (experience with dynamic Bayesian networks would be an advantage), mathematical and/or statistical modelling, and signal processing;
- have excellent programming skills;
- be willing to spend several months at the Technical University of Denmark.
Prior exposure to courses in linguistics or speech- or hearing-related fields would be an advantage. Furthermore, you should comply with the rules set forward by the FP7 Marie Curie ITNs, i.e. you should:
- not have resided or performed your main research activity in the Netherlands for more than 12 months in the last three years;
- be willing to work in at least one other country in the INSPIRE network;
- have less than 4 years of research experience since you obtained your Master's degree, and not hold a PhD.

What we have to offer

We offer you:
- employment: 1.0 FTE;
- in addition to the salary: an 8% holiday allowance and an 8.3% end-of-year bonus;
- the starting salary is €2,042 per month on a full-time basis; the salary will increase to €2,492 per month in the third year;
- in addition to the salary, you will receive travel and training allowances on the basis of generous Marie Curie ITN provisions;
- duration of the contract: 18 months with the possibility of extension by another 18 months.

Are you interested in our excellent employment conditions (http://www.ru.nl/newstaff/working_at_radboud/conditions_of/)?

The Radboud University is an equal opportunity employer. Female researchers are strongly encouraged to apply for this vacancy.

Would you like to know more?

Further information on: Investigating Speech Processing In Realistic Environments (http://www.inspire-itn.eu/)
Dr. Bert Cranen, assistant professor, Speech science
Telephone: +31 24 3612904
E-mail: B.Cranen@let.ru.nl

Applications

Are you interested? Please include with your application:
- a CV;
- a 2-page description of your research interests explaining why the INSPIRE goals appeal to you, how the INSPIRE team may benefit from your participation, and your career perspectives as expected from INSPIRE;
- university transcripts;
- names and email addresses of two potential referees (or alternatively letters of recommendation).
It is Radboud University Nijmegen's policy to only accept applications by e-mail. Please send your application, stating vacancy number 23.02.12, to vacatures@let.ru.nl, for the attention of drs. M.J.M. van Nijnatten, before 7 July 2012. No commercial propositions please.

6-30

(2012-06-11) Speech Recognition Scientist at Sunnyvale

Job Description

Title: Speech Recognition Scientist

Location: Sunnyvale

Status: Regular, Full-time, Exempt

We are a well-funded start-up with cutting-edge speech recognition and a disruptive technology platform applicable to a variety of markets and environments for spoken dialog interaction. With an exciting mix of evolving speech-enabled products, we offer excellent opportunities for 'rock star' scientists to grow and share in our success. We offer competitive compensation, excellent benefits and an ultra-creative work environment.

We are currently seeking a talented speech recognition scientist to join our hands-on team developing our platform for spoken dialog interactions. The ideal candidate has a proven track record of optimizing speech recognition performance. This work includes creating the necessary scripts and tools to experiment with novel algorithms to optimize recognition and natural language understanding throughout all stages of a multi-modal dialog system. Among other things, you will be asked to work on statistical language modeling, as well as language model and acoustic model adaptation.

Responsibilities

Develop tools and enhance algorithms that facilitate deployment and tuning of spoken dialog systems

Analyze speech recognition performance and implement solutions to provide optimum accuracy

Use, improve and create research tools to create, update and optimize language models and speech recognition systems for multiple domains

Evaluate and develop different language modeling and rescoring approaches based on machine learning algorithms

Document language model development and adaptation process

Work with the team to design future product releases

Required Skills & Experience

Ph.D. or Master’s degree in computer science, electrical engineering, comp. linguistics, or equivalent

Speech and/or language processing background (in research and/or industry)

In-depth scripting experience with Python, Perl or similar

Ability to own and drive experimental definition, investigations and ultimately be responsible for the speech recognition performance

Passion for solving difficult problems

Strong planning and communication skills

Strong analytical and problem solving skills and ability to troubleshoot issues

Pluses:

Background in natural language processing, machine learning and/or computational linguistics

Programming experience in C/C++

Qualified candidates are encouraged to send their resume and cover letter to swright@fluentialinc.com. Fluential, LLC, is an Equal Opportunity Employer. To learn more, please visit us online at http://www.fluentialinc.com

6-31

(2012-06-11) Voice Developer (m/f) Speech Technology Automotive, Nuance Turin (Italy)

Nuance is a leading provider of speech and imaging solutions for businesses and consumers around the world. Our technologies, applications and services make the user experience more compelling by transforming the way people interact with information and how they create, share and use documents. Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, getting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched. Making each of those experiences productive and compelling is what Nuance is all about.

As a contributing member of Nuance, you will work within a dynamic team environment to develop, support, market and sell our award-winning software applications. We offer competitive compensation packages and a casual work environment. Join our dynamic, entrepreneurial team that operates worldwide (Europe, US, APAC). Be a part of our fast growing track of continuing success.

For more information, please see www.nuance.com.

Nuance is an equal opportunity employer.

For our office in Turin, Italy, we are currently looking for a full-time

Voice Developer (m/f)

Speech Technology Automotive

Key responsibilities.

· Develop voices for the range of Nuance Text-to-Speech Products.

Processing speech data for text-to-speech speech data bases
Supervising native contractors in annotation of speech databases
Creating speech databases
Building text-to-speech voices
Testing and product development of the TTS voice
Testing and quality assessment of the integrated system

Qualifications.

MSc degree in phonetics, computational linguistics or another relevant field.
First work experience
Working in Windows and Unix/Linux environment
Strong sense of precision and quality in your daily job
Fluent in English
Understanding of phonological and phonetic concepts
Ability to interpret spectrograms
Ability to write high quality documentation
Ability to work independently as well as in a team
Good problem-solving, analytical and troubleshooting skills
Basic experience with scripting languages like Perl or Python
Experience with acoustic phonetics
Experience in text-to-speech voice development
Speaking other languages
Experience using tools such as Audition or Praat

We offer.

At Nuance Communications we believe our people are our most valuable asset.

We offer competitive compensation packages and we offer you career development opportunities in a challenging technical but casual work environment.

As a Nuance team member you will work within a dynamic international team operating worldwide.

Does Nuance speak to you?

If you are interested in joining our team, please send your English CV including earliest starting date and salary expectations via our Recruiting tool https://jobs-nuance.icims.com/jobs/7804/job.

Nuance's Mobility Division builds innovative, intelligent and intuitive touch and speech interfaces to simplify and enhance the way people interact with mobile devices, applications, and services. Nuance Mobile solutions make mobile devices and in-car systems easier to use, automate customer self-service, and optimize the access and discovery of even the most advanced mobile applications and content - regardless of technical know-how, location, environment, or physical and literacy capabilities.

ILONA ALEXANDRA HOLTZ

Recruiter - Employment Specialist DACH

Human Resources

Nuance Communications Aachen GmbH

Site Ulm

Soeflingerstr. 100

D-89077 Ulm, Germany

Fon +49 731 - 379 50 1166

Fax +49 731 - 379 50 1106 (Zentrale)

Mobil +49 170 56 15 235

WWW.NUANCE.COM

The experience speaks for itself ™

Geschäftsführung/Director: Jan Anthierens

Sitz der Gesellschaft/Registered Office: Aachen

Registergericht/Court of Registration: Aachen

Reg. Nr.: HRB 11872

USt-ID/VAT: DE 813191696

Experience Nuance on the web: http://www.youtube.com/watch?v=32QbXebhiag&list=UUtmZ1Vk2yFJkOe1DYQwLgag&index=1&feature=plcp

http://www.youtube.com/watch?v=RkiYr8aw5pE&feature=related

6-32

(2012-06-11) Speech Output Designer (m/f) Speech Technology Automotive Nuance at Merelbeke Belgium

For our office in Merelbeke, Belgium, we are currently looking for a full-time

Speech Output Designer (m/f)

Speech Technology Automotive

Key responsibilities.

· Design and implement voice customizations for Nuance Text-to-Speech Products

Gathering and analyzing customer requirements for custom voices
Defining strategies for optimized speech output
Designing text corpora for recording
Developing User Dictionaries and rules for customer applications
Training and supervising native contractors in prompt tuning
Interacting with professional services teams and contributing to customer project success

Qualifications.

High School diploma or Bachelor Degree in languages, phonetics, computational linguistics or another relevant field
Some years of work experience
Working in Windows (and Linux/Unix) environment
Language skills (grammar, punctuation, spelling, phonetics, etc.)
Ability to appreciate acoustic and prosodic quality of speech
Basic understanding of Text-to-speech technology
Experience in TTS prompt tuning
Ability to work in an international team and to coordinate contractors
Fluent in English
Ability to write high quality documentation
Experience with speech output applications
Experience with acoustic phonetics
Basic experience with scripting languages like Perl or Python
Speaking some other languages

We offer.

At Nuance Communications we believe our people are our most valuable asset.

We offer competitive compensation packages and we offer you career development opportunities in a challenging technical but casual work environment.

As a Nuance team member you will work within a dynamic international team operating worldwide.

Does Nuance speak to you?

If you are interested in joining our team, please send your English CV including earliest starting date and salary expectations via our Recruiting tool https://jobs-nuance.icims.com/jobs/7801/job .

6-33

(2012-06-11) Speech Output Designer (m/f) Speech Technology Automotive at Nuance Turin Italy

For our office in Turin, Italy, we are currently looking for a full-time

Speech Output Designer (m/f)

Speech Technology Automotive

Key responsibilities.

· Design and implement voice customizations for Nuance Text-to-Speech Products

Gathering and analyzing customer requirements for custom voices
Defining strategies for optimized speech output
Designing text corpora for recording
Developing User Dictionaries and rules for customer applications
Training and supervising native contractors in prompt tuning
Interacting with professional services teams and contributing to customer project success

Qualifications.

High School diploma or Bachelor Degree in languages, phonetics, computational linguistics or another relevant field
Some years of work experience
Working in Windows (and Linux/Unix) environment
Language skills (grammar, punctuation, spelling, phonetics, etc.)
Ability to appreciate acoustic and prosodic quality of speech
Basic understanding of Text-to-speech technology
Experience in TTS prompt tuning
Ability to work in an international team and to coordinate contractors
Fluent in English
Ability to write high quality documentation
Experience with speech output applications
Experience with acoustic phonetics
Basic experience with scripting languages like Perl or Python
Speaking some other languages

We offer.

At Nuance Communications we believe our people are our most valuable asset.

We offer competitive compensation packages and we offer you career development opportunities in a challenging technical but casual work environment.

As a Nuance team member you will work within a dynamic international team operating worldwide.

Does Nuance speak to you?

If you are interested in joining our team, please send your English CV including earliest starting date and salary expectations via our Recruiting tool https://jobs-nuance.icims.com/jobs/7801/job .

6-34

(2012-06-11) Co Producer (m/f) Speech Technology Automotive at Nuance Turin Italy

For our office in Turin, Italy, we are currently looking for a full-time

Co Producer (m/f)

Speech Technology Automotive

Key responsibilities:

Produce audio recordings for Nuance Text-to-Speech Products
Assisting with (coached) Casting of Voice Talents for TTS
Assisting with the analysis of Casting-recordings
Coaching Voice Talents for TTS recordings, both in a recording studio and remotely
Supervising native-speaker linguists during recording sessions
Preparation of the scripts (in collaboration with others)
Manage the data-flow of the recorded output
Assistance with overhead (contracts, negotiations)

Qualifications:

Degree in languages or another relevant field
Some years of work experience
Experience with TTS-technology, understanding the technology.
Experience with the coaching of TTS-Voice Talents
Ability to understand on the fly what we need from a Voice Talent (VT), and to communicate this to the VT
Excellent communicative and collaborative skills
Very good knowledge of English + some other languages
Experience in a Recording-studio environment
Willing to travel
Knowledge of recording software, like Pro-Tools, Voxover, …
Linguistic knowledge
Creative, flexible
Additional languages (Spanish, Eastern European, Asian, etc.)

We offer.

At Nuance Communications we believe our people are our most valuable asset.

We offer competitive compensation packages and we offer you career development opportunities in a challenging technical but casual work environment.

As a Nuance team member you will work within a dynamic international team operating worldwide.

Does Nuance speak to you?

If you are interested in joining our team, please send your English CV including earliest starting date and salary expectations via our Recruiting tool https://jobs-nuance.icims.com/jobs/7799/job .

6-35

(2012-06-11) Tools Developer (m/f) Software Engineering C/C++ Speech Technology Automotive at Nuance Turin Italy

For our office in Turin, Italy, we are currently looking for a full-time

Tools Developer (m/f)

Software Engineering C/C++

Speech Technology Automotive

Key responsibilities:

Analyze requirements for improving the Text-to-Speech Voice Building Process
Develop methodologies, scripts and procedures to improve efficiency and quality
Develop speech analysis algorithms to be applied in building text-to-speech voices
Adapt and extend existing in-house voice building technologies in view of large-scale production
Study and experiment with machine learning and statistical approaches in order to minimize the need for manually labeled data
Design and develop software components to be used in local, networked or Internet-related tools for voice building
Make sure tools are efficient and easy to use and provide support to users
Document, test, debug and modify software components of voice building tools

Qualifications:

Master's degree in Electronic Engineering / Computer Science / Computer Engineering
2 or more years of medium to large-scale application development through the complete lifecycle
Software engineering (C/C++)
Experience with process optimization
Signal processing, voice recognition, HMM, neural networks
One or more scripting languages (PHP, Python, Perl, Awk)
Experience working with software versioning and revision control systems
Comfortable working both independently and as part of a large international team
Fluent in English
Data processing
Experience with multithreaded programming

We offer:

At Nuance Communications we believe our people are our most valuable asset.

We offer competitive compensation packages and we offer you career development opportunities in a challenging technical but casual work environment.

As a Nuance team member you will work within a dynamic international team operating worldwide.

Does Nuance speak to you?

If you are interested in joining our team, please send your English CV including earliest starting date and salary expectations via our Recruiting tool https://jobs-nuance.icims.com/jobs/7797/job .

ILONA ALEXANDRA HOLTZ

Recruiter - Employment Specialist DACH

Human Resources

Nuance Communications Aachen GmbH

Site Ulm

Soeflingerstr. 100

D-89077 Ulm, Germany

Fon +49 731 - 379 50 1166

Fax +49 731 - 379 50 1106 (Zentrale)

Mobil +49 170 56 15 235

6-36

(2012-06-11) Voice Manager (m/f) Speech Technology Automotive at Nuance Turin Italy

Voice Manager (m/f)

Speech Technology Automotive

Key responsibilities:

· Casting and coaching of Voice Talents
· Designing text corpora for recording and testing
· Processing speech data for text-to-speech speech databases
· Building and testing text-to-speech voices
· Managing all technical aspects of a TTS voice development
· Supporting our professional services teams and contributing to customer project success

Qualifications:

MSc degree in Languages / Computational linguistics / Electronic Engineering or another relevant field
Some work experience
Working in a Windows and Unix/Linux environment
Basic experience with scripting languages like Perl or Python
Understanding of phonological and phonetic concepts
Basic understanding of Text-to-speech technology
Strong sense of precision and quality in your daily job
Fluent in English
Ability to write high quality documentation
Experience with acoustic phonetics preferred
Experience in text-to-speech voice development would be an asset
Speaking some other languages preferred

We offer:

At Nuance Communications we believe our people are our most valuable asset.

We offer competitive compensation packages and we offer you career development opportunities in a challenging technical but casual work environment.

As a Nuance team member you will work within a dynamic international team operating worldwide.

Does Nuance speak to you?

If you are interested in joining our team, please send your English CV including earliest starting date and salary expectations via our Recruiting tool https://jobs-nuance.icims.com/jobs/7795/job.

ILONA ALEXANDRA HOLTZ

Recruiter - Employment Specialist DACH

Human Resources

Nuance Communications Aachen GmbH

Site Ulm

Soeflingerstr. 100

D-89077 Ulm, Germany

Fon +49 731 - 379 50 1166

Fax +49 731 - 379 50 1106 (Zentrale)

Mobil +49 170 56 15 235

6-37

(2012-06-15) INESC-ID Open Positions, Lisbon, Portugal

INESC-ID Open Positions

INESC-ID invites applications for researchers, starting 2012. We are interested in PhD researchers, fluent in English and with autonomous research abilities. We expect the candidate to carry out scientific research on the topics described below.

The Institution

Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa (INESC-ID) is one of the most dynamic research institutes in Portugal in the areas of communication and information technologies. The activity of INESC-ID is focused on the following area: Interactive Intelligent Systems (http://www.inesc-id.pt).

Job Description

Successful candidates will be integrated in an existing research group at INESC-ID and will conduct research focusing on:

Development and evaluation of human-systems,
Natural language understanding,
Agents,
Graphical modeling and visualization and robot relations.

By addressing the creation of intelligent and affective relations with machines, together with intelligent visualization and virtual environments, the candidate will develop techniques that will advance the state of the art in building dialogue, multi-modal interaction and affective relations with machines.

The work will be carried out in one of the three groups: Spoken Language Systems, Intelligent Agents and Synthetic Characters or Visualization and Multi-modal Interactions.

The position holder is expected to develop scientific research preferably within these topics, and will be encouraged to start their own projects in coordination with colleagues. Excellent candidates in related areas are also strongly encouraged to apply.

Qualifications Required

Applicants should hold a PhD, be fluent in English and show evidence of autonomous research abilities. They should be willing to work in a team and also have a strong publication record. The successful candidate should propose an innovative research project with relevance for the research area where she/he will be integrated.

Contract Conditions

INESC-ID is an equal opportunity employer that implements the principle of equal treatment and training irrespective of religion or belief, disability, age or sexual orientation in employment. The contract corresponds to the salary of a Research Assistant Professor in Portugal. Successful applicants will be in post from August 2012.

Application Deadline

July 15th, 2012

Application details

INESC-ID invites eligible individuals to submit their expressions of interest, which must include an application letter, a detailed CV, and a 1-page outline of a proposed research program; including reference letters is highly recommended.

All documents should be sent by email to applications@inesc-id.pt and/or snail mail to: Direcção INESC-ID, R. Alves Redol, 9, 1000-029 Lisboa, Portugal.

6-38

(2012-06-15) Language Processing Software Engineer at ONMOBILE

Language Processing Software Engineer

At ONMOBILE SA, an IT and Telecom VAS company, we are hiring, at the earliest possible date, an experienced natural language processing (NLP) software engineer for new research and development projects on automatic speech recognition, text processing, and multilingual question-answering systems.

We are looking for a software development engineer with NLP development or research background on either commercial or academic speech recognition systems. You will be familiar with and have practical experience in the following areas:

State-of-the-art NLP technologies (robust parsing, finite-state transducers FSTs, statistical language modelling etc.)

Semantic Web technologies like RDF, OWL, SPARQL

Ability to develop with Eclipse RCP

Strong programming skills with modern programming languages (C++, Java) and scripting languages

Expertise in speech recognition, acoustic modelling, and audio/video processing is a plus

You should have an engineering school or university degree in computer science or related disciplines. A PhD or an equivalent level of experience would be helpful.

An application-oriented perspective, a concern for customers, and strong analytical and problem-solving skills are required. You should be able to work independently when needed.

Other skills: transparent behaviour, clarity of expression, ability to work in a multi-cultural team.

Languages: fluent in English, French

The position is based in Paris, France.

Contact by email: WenXuan TENG (teng.wenxuan@onmobile.com)

6-39

(2012-06-15) Speech and Audio Processing Software Engineer at ONMOBILE

Speech and Audio Processing Software Engineer

At ONMOBILE SA, an IT and Telecom VAS company, we are hiring at the earliest possible date an experienced Speech and Audio Processing Software Engineer for research and development projects on automatic speech recognition, text processing, and multilingual question-answering systems.

We are looking for a software development engineer with speech recognition and audio processing development or research background on either commercial or academic speech recognition systems. You will be familiar with and have practical experience in the following areas:

State-of-the-art speech recognition technologies (decoder, language models, acoustic models, signal processing) and their implementation within efficient recognition and training systems

Audio and signal processing for de-noising, acoustic feature extraction, audio fingerprinting, karaoke scoring etc.

Strong programming skills with modern programming languages (C++, Java) and scripting languages

Expertise in NLP technologies is a plus

You should have an engineering school or university degree in signal processing, computer science or related disciplines. A PhD or an equivalent level of experience would be helpful.

An application-oriented perspective, a concern for customers, and strong analytical and problem-solving skills are required. You should be able to work independently when needed.

Other skills: transparent behaviour, clarity of expression, ability to work in a multi-cultural team.

Languages: fluent in English, French

The position is based in Paris, France.

Contact by email: WenXuan TENG (teng.wenxuan@onmobile.com)

6-40

(2012-06-17) 2-4 PhD positions in Speech Technology and Communication at KTH Stockholm Sweden

2-4 PhD positions in Speech Technology and Communication

The goal of the positions is to contribute to the research foundation for speech technology in tomorrow's conversational systems.

Anticipated specializations

Data-driven Dialogue Management
Incremental Input Fusion and Understanding
Avatars that Interact Through Speech, Gesture or Sign Language
Novel Methods for Automatic Speech Recognition and Understanding

The positions include free tuition and are salaried four-year employments, currently starting at 2800 euro/month and increasing to 3400 euro/month in the last year.

To get information about how to apply for the positions go to http://www.speech.kth.se/vacancies/

6-41

(2012-06-21) Mandarin TTS (Text To Speech) Manager / Expert / Research Engineer at Nuance, Shanghai, China (6 positions)

Nuance Communications is a listed US$1.5B global software company and the world leader in speech, text and imaging solutions for businesses and consumers around the world, with aggressive growth plans in the Asia Pacific region.

Job title: Mandarin TTS (Text To Speech) Manager / Expert / Research Engineer
Positions: 6
Location: Shanghai, China

Job description and requirements

Overview: Nuance's Mobility Division builds innovative, intelligent and intuitive touch and speech interfaces to simplify and enhance the way people interact with mobile devices, applications, and services. Nuance Mobile solutions make mobile devices and in-car systems easier to use, automate customer self-service, and optimize the access and discovery of even the most advanced mobile applications and content, regardless of technical know-how, location, environment, or physical and literacy capabilities.

Responsibilities: Reporting to the TTS manager, the research scientist will conduct and lead innovative research and development on speech synthesis technologies for Asian languages. In the role of research scientist, your goal will be to continuously drive improvements and innovation in the Nuance Chinese TTS system, for commercial deployment in all types of markets and platforms.

Representative tasks will include:
• Design, implementation, evaluation, optimization and testing of new algorithms and tools for text-to-speech synthesis, for both signal generation and text processing/understanding
• Product integration supervision of proven innovation results
• Defining the team's innovation and technical agenda, in cooperation with TTS management
• Creation of demonstrators and evaluators of new technologies

Required skills: Digital speech processing, strong mathematics knowledge, excellent computer programming skills (preferably in C, C++ and scripting languages), familiarity with different OS and computing platforms, excellent English and communication skills, strong team player, proven track record of achievements in Chinese TTS R&D, fluent spoken Mandarin

Preferred skills: Hands-on experience in one or more of these areas: TTS R&D, software engineering, natural language processing and understanding, project management, parametric and/or unit selection TTS development

Education: PhD or Master's degree in EE or CS from a well-known university

You can also find the job details by searching for jobs 7812, 7806 and 7808 to 7811 at http://www.nuance.com/company/careers/index.htm

Contact information: Further questions and resumes can be sent to Lily He, Recruiter for Nuance Greater China, at lily.he@nuance.com. Thank you for your time and consideration. I look forward to hearing from you!

Regards,

Lily He

Recruiter, Greater China

NUANCE COMMUNICATIONS, INC.

6-42

(2012-06-30) CIFRE PhD thesis at Orange Labs (Issy-les-Moulineaux, Paris)

CIFRE PhD thesis at Orange Labs (Issy-les-Moulineaux, Paris)

Thesis topic: Reinforcement learning in an incremental dialogue system

The overall objective of the thesis is to implement, adapt, optimize and integrate a reinforcement learning algorithm in an incremental dialogue system.

The thesis will produce an extensive, near-exhaustive analysis of reinforcement learning in an event-driven environment, an area that the scientific literature has so far barely explored. This analysis will lead to the implementation of several algorithms, which will be benchmarked. The best-performing or most promising algorithm will be analyzed in greater depth and integrated into an experimental dialogue application to demonstrate its effectiveness in its natural environment.

The scientific challenges are numerous. The thesis brings together several scientific fields that are both highly specialized and heterogeneous: machine learning, distributed systems, and dialogue, which is itself an inherently multidisciplinary field, ranging from natural language processing to signal processing by way of cognitive psychology. A first challenge is therefore to step back far enough to integrate all these components into a single overall picture. A second, more mathematical challenge is to adapt reinforcement learning algorithms to an event-driven decision model. A third challenge, an engineering one, concerns the development of an end-to-end dialogue application and the integration of the algorithm into it. Finally, a fourth challenge, to which we will pay particular attention, is to frame this scientific work within a long-term goal of industrialization.

Contacts: Romain Laroche, Orange Labs (romain.laroche...at___orange.com) and Fabrice Lefevre, Université d'Avignon (fabrice.lefevre''''at***univ-avignon.fr).

6-43

(2012-06-12) Senior research scientist at Pearson

Pearson has one defining goal: to help people progress in their lives through learning. We champion innovation and we invest in models for education that deliver on our promise for effective, accessible, and personal learning from early literacy, college and career readiness to professional education, through data informed instruction and inventive applications for mobile and digital learning.

Pearson, the world's leading learning company, has global-reach and market leading businesses in education, business, and consumer publishing and is listed on the London and New York stock exchanges (UK: PSON; NYSE: PSO). For more information, visit www.pearson.com.

Pearson is an Equal Opportunity and Affirmative Action Employer, and a member of E-Verify. All qualified applicants, including minorities, women, veterans, and people with disabilities are encouraged to apply.

Responsibilities of this role:
• The whole R&D cycle (building language models, training acoustic models, building statistical models for measuring performance, etc.) for automated grading of different test systems.
• Design new algorithms for different purposes (such as improving the grading performance).
• Do data analysis for different requirements.
• Write scripts and tools to support the sales, marketing and test development teams.
• Maintain the grading system.

Qualifications:

Personality attributes/skills required:
• Knowledge of speech recognition, probabilistic systems, statistical models.
• Strong background in statistical modeling and machine learning.
• Extensive experience programming in C/C++.
• Proficiency with SQL, Perl and Matlab.
• Demonstrated willingness to learn and apply a wide range of technologies.
• Experience with linguistic and statistical analysis of natural language would be highly desirable.
• Ph.D. with 3-5 years of applied experience in the field.
• Experience in designing and running ASR experiments using HTK is a plus.
• Strong written and verbal communication skills.

Please apply online at www.pearsoned.com/careers

6-44

(2012-06-12) Research scientist at Pearson

Research Scientist:

Responsible for advancing the state of the company's automated grading technology while working with the sales, marketing and test development teams to solve practical problems from the field. The position demands a person who has broad interests and is motivated to design and implement improvements to the company’s system. Opportunities for further research in applicable areas are available.

Day-to-day responsibilities of this role:
• The whole R&D cycle (building language models, training acoustic models, building statistical models for measuring performance, etc.) for automated grading of different test systems.
• Design new algorithms for different purposes (such as improving the grading performance).
• Do data analysis for different requirements.
• Write scripts and tools to support the sales, marketing and test development teams.
• Maintain the grading system.

Qualifications:

Apply online at www.pearsoned.com/careers

6-45

(2012-07-05) PhD at LIG (Grenoble-France)

PhD proposal: Collaborative Annotation of multi-modal, multi-lingual and multimedia documents

Project objective

This PhD will be proposed and funded in the context of the CHIST-ERA / ANR Camomile Project (Collaborative Annotation of multi-MOdal, MultI-Lingual and multi-mEdia documents).

Human activity is constantly generating large volumes of heterogeneous data, in particular via the Web. These data can be collected and explored to gain new insights in social sciences, linguistics, economics, behavioral studies as well as artificial intelligence and computer sciences. In this regard, 3M (multimodal, multimedia, multilingual) data could be seen as a paradigm of sharing an object of study, human data, between many scientific domains. But, to be really useful, these data should be annotated and available in very large amounts. Annotated data is useful for computer sciences, which process human data with statistical machine learning methods, but also for social sciences, which increasingly use the large corpora available to support new insights in a way that was not imaginable a few years ago.

However, annotating data is costly as it involves a large amount of manual work, and 3M data, for which we need to annotate different modalities at different levels of abstraction, is especially costly. Current annotation frameworks involve some local manual annotation, sometimes with the help of automatic tools. The Camomile Project aims at developing a first prototype of a collaborative annotation framework for 3M data, in which the manual annotation will be done remotely on many sites, while the final annotation will be localized on the main site. Furthermore, following the same principle, systems devoted to automatic processing of the modalities (speech, vision) present in the multimedia data will help the transcription by producing automatic pre-annotations.

PhD proposal

This PhD is dedicated to the proposal of semi-supervised and unsupervised methods for the annotation of MMM data. Different scenarios of semi-supervised annotation will be tested on different types of videos. More precisely, we shall study:
• innovative retraining / adaptation strategies to update the different systems using new annotations. Since we consider a real scenario where new annotations are produced continuously, we will focus especially on iterative learning techniques where models are updated instead of being fully retrained;
• new data selection methods for active learning strategies; we will focus on active learning for multimodal and heterogeneous systems, which makes the data selection task much more difficult.

As a case study, we shall focus our work on developing technologies to answer the questions "who is seen?" and "who is speaking?" in videos. Depending on the type of video and the feedback from the supervision group, we may extend our work to the automatic annotation of objects ("what is seen?") or activities ("what is going on?").

Required Skills

The applicant must have a master's degree in either computer science or computer engineering and have some knowledge of speech, image or video processing and of machine learning. We are also looking for a candidate with very good programming skills.

LIG GETALP and MRIM collaboration

The PhD work is to be carried out between the GETALP and MRIM teams of LIG.

LIG / GETALP website: http://getalp.imag.fr
LIG / MRIM website: http://mrim.imag.fr

Contacts:
Laurent Besacier, Laurent.Besacier@imag.fr
Georges Quénot, Georges.Quenot@imag.fr

Targeted starting date: fall 2012

6-46

(2012-07-08) Faculty position in Phonetic Science and Speech Technology at Nanjing Normal University, China

Faculty position in Phonetic Science and Speech Technology at Nanjing Normal University, China

(Urgent job announcement)

The Institute of Linguistic Science and Technology at Nanjing Normal University, China, invites applications for a faculty position in the area of Phonetic Science and Speech Technology. The position can be Lecturer, Associate Professor, or Professor, depending on the qualifications and experience of the applicant.

Nanjing Normal University (NNU) is situated in Nanjing, a city in China not only famous for its great history and culture but also proud of its excellence in education and academia. With Chinese-style buildings and a garden-like environment, the Suiyuan Campus of NNU is often called the “Most Beautiful Campus in the Orient.”

Nanjing Normal University is among the top 5 universities of China in the area of Linguistics. Placing strong emphasis on interdisciplinary research, the Institute of Linguistic Science and Technology at NNU is unique in that it bridges the studies of theoretical and applied linguistics, phonetics, cognitive sciences, neural sciences, and information technologies. The phonetic laboratory is very well equipped, with a sound-proof recording studio, professional audio facilities, physiological instruments (e.g., WAVE system, PowerLab, EGG, EPG, airflow and pressure module, and nasality sensor), EEG for ERP studies, an eye tracker, etc. The laboratory very successfully organized the international symposium TAL 2012 (www.TAL2012.org) at the end of May.

We welcome interested colleagues to join us. The research can cover any area in phonetic sciences and speech technologies, including but not limited to speech production, speech perception, prosodic modeling, speech synthesis, automatic speech recognition and understanding, spoken language acquisition, computer-aided language learning, and ERP studies of spoken languages. Outstanding research support will be offered.

Requirements:

* A PhD degree (or an expected one) in related disciplines (e.g., linguistics, psychology, physics, applied mathematics, computer sciences, and electronic engineering);

* Good publication/patent record in phonetic sciences or speech technologies;

* Good oral and written communication skills in both Chinese and English;

* Team work spirit in a multidisciplinary group.

Interested candidates should submit a CV, a detailed list of publications, copies of their best two or three publications, and the contact information of two references to:

Prof. Wentao GU

Email: wtgu@njnu.edu.cn; wentaogu@gmail.com

Phone: (office) +86-25-8359-8624, (mobile) +86-189-3687-2840

The position will remain open until it is filled. Early application is strongly recommended.

6-47

(2012-07-26) PhD position in spelling correction by statistical machine translation, Univ. Le Mans, France

Funded PhD position at the Laboratoire d'Informatique de l'Université du Maine (LIUM) in the field of automatic spelling correction using statistical machine translation methods.

Location: LIUM (Le Mans)
Date: 1/10/2012
Duration: 3 years

This thesis is part of the 'investissement d'avenir' project PACTE, led by the company Diadeis, whose partners also include the Alpage team (INRIA and Paris 7) and the companies A2ia and Isako. PACTE aims to improve the spelling quality of texts produced by different text capture methods. The emphasis is on OCR output (optical character recognition on scanned printed texts), but the project also covers data obtained by handwriting recognition, manual keying, and direct writing. The techniques used will be both statistical and hybrid, drawing on computational linguistics tools and resources. The project's main application domain is the digitization of written heritage in a multilingual context. A second thesis will start at Alpage, with an emphasis on using linguistic knowledge to help optimize the spelling quality of texts automatically or near-automatically. Within the PACTE project, LIUM, Alpage and Diadeis will collaborate closely.

In this context, the aim of the thesis at LIUM is to analyze how statistical machine translation techniques can be used for error correction. Error correction can indeed be viewed as a process of translating from an erroneous language into a correct one. A similar approach, known as 'statistical post-editing (SPE)', has already been used successfully to correct the output of rule-based translation systems. The thesis will therefore study how a similar approach can be applied to spelling correction.

An important aspect of this thesis is the development of efficient language models that give good results with a small memory footprint. Back-off n-gram models will be favored, but other methods will also be explored, in particular continuous space language models. We will also look at integrating morphosyntactic knowledge, in collaboration with the Alpage team. The languages studied will primarily be French and English, as well as German. An application to Spanish, Italian, or other European languages is possible.

Candidate profile:
- good computer science skills (proficiency with Linux is essential; C++ programming, use of scripts, Perl, etc.);
- knowledge of statistical machine translation is desirable or, failing that, of machine learning;
- experience with the Moses toolkit is a plus.

The thesis will be carried out within the LST team at LIUM. LIUM is internationally known for its research in statistical machine translation, and we have numerous collaborations with universities and companies in Europe and the United States.

Contact: Holger Schwenk Holger.Schwenk@lium.univ-lemans.fr

6-48

(2012-08-03) PhD Studentship in Speaker Diarization at EURECOM, Sophia Antipolis, Alpes Maritimes, France

PhD Studentship in Speaker Diarization at EURECOM

Department: Multimedia Communications
URL:        http://www.eurecom.fr/mm
Start date: 01/10/12
Duration:   Duration of the thesis

Description:

EURECOM’s Multimedia Communications Department invites applications for a PhD studentship in speaker diarization within its Speech and Audio Processing Research Group. Speaker diarization is commonly referred to as the task of detecting ‘who spoke when’ in a multiple-speaker audio signal. In its most general form it is performed without any prior knowledge regarding the number of speakers or speaker identities. Applications include speech recognition, speaker recognition (biometrics), multimedia indexing, content structuring and general multimedia document processing.

As with any modelling or statistical pattern recognition task, performance is affected by unwanted nuisance variation and by the amount of data available for any given class. In the case of speaker diarization, performance can be affected by background noise, varying linguistic content and differences in speaker floor times. Our recent work has developed new normalization approaches to marginalise linguistic variation in order to increase speaker discrimination and improve speaker diarization performance.

This fully-funded PhD position aims to extend this work to further improve the robustness of speaker diarization in the case of linguistic variation and varying speaker floor times. The work will develop a novel phone adaptive training algorithm and investigate other, new normalisation and marginalization approaches to improve speaker modelling. The position is an opportunity to make a contribution in an increasingly important field of speech and audio processing. You will join a small, but dynamic research group which participates in a growing number of European, national and industrially-funded research projects and will have the opportunity for international travel and participation in competitive evaluations.

Requirements:

The successful candidate will have a Master’s degree in engineering, mathematics, computing or a related, relevant discipline. You will be highly motivated to undertake challenging research, have strong expertise in mathematics and programming and have excellent communication skills. Knowledge of C/C++ and Matlab is strongly desirable. Good English language speaking and writing skills are essential. Knowledge of French is a bonus.

Application:

Screening of applications will begin immediately, and the search will continue until the position is filled. Applicants should send, to the address below, (i) a one-page statement of research interests and motivation, (ii) a CV and (iii) contact details for three referees.

Applications should be submitted by e-mail to secretariat@eurecom.fr

Contact:         Dr. Nicholas Evans
Postal address: 2229 route des Crêtes
                 B.P. 193
                 06904 Sophia Antipolis
                 France
Email:           evans@eurecom.fr
Web page:        http://www.eurecom.fr/mm
Phone number:    +33 4 93 00 81 14
Fax number:      +33 4 93 00 82 00

EURECOM is a graduate school and a Research Centre in Communication Systems, located in the Sophia Antipolis technology park, in close proximity to a large number of research units of leading multinational corporations in the telecommunications, semiconductor and biotechnology sectors, as well as other outstanding research and teaching institutions. EURECOM was founded in 1991 by TELECOM ParisTech (Ecole Nationale Supérieure des Télécommunications) and EPFL (Swiss Federal Institute of Technology in Lausanne) as a consortium combining academic and industrial partners.

EURECOM deploys its expertise around three major fields: Networking and security, Multimedia Communications and Mobile Communications and has a strong international scope and strategy. EURECOM is particularly active in research in its areas of excellence while also training a large number of doctoral candidates. Its contractual research is recognized across Europe and contributes largely to its budget.
