ISCA - International Speech Communication Association



ISCApad #210

Sunday, December 13, 2015 by Chris Wellekens

6 Jobs
6-1(2015-07-01) A 3 year fully-funded PhD studentship at University of Sheffield, UK

We have a 3 year fully-funded PhD studentship in the use of Spoken Language Dialogue Systems in Assistive Technology. Full details are at

 http://www.sheffield.ac.uk/dcs/resdegrees/funded_phds


Closing date is 16th August 2015. Please circulate to suitable candidates.
 
 
Dr. Heidi Christensen
Lecturer, Department of Computer Science, University of Sheffield
Centre for Assistive Technology and Connected Healthcare (http://www.catch.org.uk/)

6-2(2015-07-20) Post-Doctoral Researcher at the Advanced Digital Sciences Center, Singapore
-------------------------------------------------------------
WHO: Post-Doctoral Researcher wanted
WHY: Massively Multilingual Automatic Speech Recognition
WHERE: Advanced Digital Sciences Center, Singapore
WHEN: September, 2015
 
Speech input permits people to find data (maps, search, contacts) by talking to their cell phones.  Of the 6700 languages spoken in the world, speech input is available in 40.  Why so few?  The problem is data.  Before it can be used, speech input software must learn a language by studying hundreds of hours of transcribed audio.  In most languages, finding somebody who can transcribe hundreds of hours of audio (somebody who is computer literate, yet has time available to perform this task) is nearly impossible.  Faced with this problem, we proposed a radical solution: solicit transcription from people who don't speak the language.  Non-native listeners make many mistakes.  By building a probabilistic model of their mistakes, we are able to infer correct transcriptions, and thus to train speech technology in any language.
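For illustration, here is a minimal sketch of the inference idea (the phone inventory, confusion probabilities and independence assumption are invented for this example; the project's actual mismatch models are far richer):

# Sketch: infer the most likely 'true' phone from noisy labels produced
# by transcribers who do not speak the language. All symbols and
# probabilities below are invented for illustration.
import math

# p(annotator writes y | true phone is x): a toy confusion model
confusion = {
    ('b', 'b'): 0.7, ('b', 'p'): 0.3,   # non-natives often miss voicing
    ('p', 'p'): 0.8, ('p', 'b'): 0.2,
}

def infer_phone(annotations, inventory=('b', 'p'), prior=0.5):
    """MAP estimate of the true phone given independent noisy labels."""
    scores = {}
    for x in inventory:
        logp = math.log(prior)
        for y in annotations:
            logp += math.log(confusion.get((x, y), 1e-6))
        scores[x] = logp
    return max(scores, key=scores.get)

# Three listeners heard 'b', one heard 'p': the mistake model still
# recovers 'b' as the most probable underlying phone.
print(infer_phone(['b', 'b', 'p', 'b']))  # -> 'b'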
 
We are seeking a post-doctoral researcher who can scale these algorithms to commercial relevance.  Necessary qualifications include a Ph.D. in speech technology, natural language processing, information theory or machine learning.  Objectives of the research include the derivation, implementation, testing, and publication of new algorithms that train state-of-the-art speech input technologies from probabilistic transcription in the under-resourced languages of Southeast Asia.
 
This is a 20-month post-doctoral research position at the Advanced Digital Sciences Center (ADSC) in Singapore.  The post-doc will work most closely with Dr. Nancy Chen, A*STAR, Singapore, and with Dr. Preethi Jyothi and Prof. Mark Hasegawa-Johnson, University of Illinois at Urbana-Champaign.  For inquiries contact probtranspostdoc@gmail.com.
 

6-3(2015-07-21) PhD position at Telecom ParisTech, France

PhD position in Feature Function Learning for Sentiment Analysis in speech interactions

Telecom ParisTech (http://www.telecom-paristech.fr/eng/)
46 rue Barrault  75013 Paris - France

Advisors: 
Chloé Clavel  (http://clavel.wp.mines-telecom.fr/)
Slim Essid  (http://perso.telecom-paristech.fr/~essid/)

Starting date: Early Autumn 2015

Funding: Secured with the Telecom ParisTech Machine Learning for Big Data Chair (http://machinelearningforbigdata.telecom-paristech.fr)

Keywords: Sentiment Analysis, Opinion Mining, Deep Learning, Conditional Random Fields, Natural Language Processing, Speech Processing

Applications are invited for a 36 month PhD.

Topic: 
Sentiment analysis and opinion mining have gained increasing interest with the explosion of textual content conveying users' opinions (e.g. film reviews, forum debates, tweets). Hence, natural language processing researchers have dedicated a great deal of effort to the development of methods for opinion detection in such texts, though often simplifying the problem to one of classification along the valence (positive vs. negative) and intensity axes. As for sentiment analysis in speech signals, there have hardly been any attempts. Further challenges arise in this case, where not only must the specific features of spoken language be taken into account, but also prosodic features and the potential errors of automatic speech recognition systems.

The research work will focus on the development of sentiment analysis methods in the context of speech interactions (phone conversations, face-to-face human-agent interactions). The main research direction will consist in creating effective computational models of appraisal expressions. In particular, Conditional Random Fields and deep learning approaches will be considered, with feature functions encoding the semantic rules usually used for this task.
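As a rough illustration of what such feature functions might look like (a sketch only: the lexicons, labels and rules are invented, and sklearn-crfsuite is just one possible toolkit, not the one prescribed by the project):

# Sketch: encoding simple semantic rules as CRF feature functions for
# opinion tagging. The lexicons, labels and rules are invented;
# sklearn-crfsuite is one possible toolkit among others.
import sklearn_crfsuite

POSITIVE = {'great', 'excellent'}
NEGATIVE = {'awful', 'boring'}

def token_features(tokens, i):
    w = tokens[i].lower()
    return {
        'word': w,
        'in_positive_lexicon': w in POSITIVE,  # semantic rule as feature
        'in_negative_lexicon': w in NEGATIVE,
        'negated': i > 0 and tokens[i - 1].lower() in {'not', 'never'},
    }

def sentence_features(tokens):
    return [token_features(tokens, i) for i in range(len(tokens))]

X_train = [sentence_features(['the', 'film', 'was', 'not', 'boring'])]
y_train = [['O', 'O', 'O', 'B-OPINION', 'I-OPINION']]

crf = sklearn_crfsuite.CRF(algorithm='lbfgs', max_iterations=50)
crf.fit(X_train, y_train)
print(crf.predict(X_train))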

IDEAL CANDIDATE: 
Master's student or Master's graduate with a background in
-        Machine learning / pattern recognition
-        Speech processing, natural language processing
-        Excellent programming skills (Python, Java, C/C++)
-        Good English level

APPLICATIONS:
To be sent to chloe.clavel@telecom-paristech.fr and slim.essid@telecom-paristech.fr:
-        Curriculum Vitae
-        Statement of interest (in the body of the email)
-        Academic records
-        List of references

Incomplete applications will not be considered.


6-4(2015-08-04) PhD offer at IRISA, Lannion, France

A PhD offer (3 years, beginning in October 2015) is available in the EXPRESSION team at IRISA, Lannion, on the characterisation and generation of expressivity for audiobook creation.

 
Competences: Computer science, software development (Python, Perl, C++), machine learning.


Contact: damien.lolive@irisa.fr

 
 

Damien Lolive
Associate Professor 
IRISA - Team Expression
University of Rennes 1

6-5(2015-08-04) Post doctoral position at IRISA, Lannion, France
A post-doctoral position on pronunciation variants modelling for speech synthesis is available at IRISA, Lannion, France.
 
Position available from November 2015.
Salary: depending on experience
 

Regards,

Damien Lolive
Associate Professor 
IRISA - Team Expression
University of Rennes 1

6-6(2015-08-15) Internship opportunity at Orange Labs

 

Internship opportunity at Orange Labs

Incomplete requests management in human/machine dialogue.

Entity: Orange Labs.

Department/Team: CRM&DA/NADIA.

Duration: 6 months.

Contact: Hatim KHOUZAIMI (hatim.khouzaimi@orange.com)

About our team:

Orange Labs is the Research and Development division of Orange, the leading telecommunication company in France. The mission of the CRM&DA department (Customer Relationship Management & Data Analytics) is to invent new solutions to improve the company’s interactions with its customers by using data analysis techniques. You will be part of NADIA (Natural DIAlogue interaction), which is one of the teams composing CRM&DA and whose mission is to develop and maintain a human/machine dialogue solution, which is already widely used by customers.

Your mission:

Thanks to recent improvements in Automatic Speech Recognition (ASR) technology, research in the field of Spoken Dialogue Systems (SDSs) has been very active over the last few years. The main challenge is to design user-initiative dialogue strategies, where the user can use natural language to utter complex requests carrying a lot of information, as opposed to system-initiative ones, where the request is entered chunk by chunk. However, due to the user's unfamiliarity with the system and the noise induced by the ASR module, the request captured by the system is often incomplete, hence rejected. The objective of this internship is to develop solutions to detect whether a request is incomplete rather than incorrect and, if so, to extract the partial information it contains. This will later be used by the Dialogue Manager module to ask the user to add the missing information.
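A toy illustration of the intended behaviour (the task frame and slot names are invented for this sketch):

# Sketch: flag a request as incomplete (rather than incorrect) by
# checking which slots of a task frame were filled, so the dialogue
# manager can ask only for what is missing. Slots are invented here.
REQUIRED_SLOTS = {'destination', 'date'}

def analyse_request(parsed_slots: dict) -> dict:
    """Return the filled slots plus the slots still to elicit."""
    missing = REQUIRED_SLOTS - parsed_slots.keys()
    return {
        'status': 'complete' if not missing else 'incomplete',
        'partial': parsed_slots,          # keep what was understood
        'ask_for': sorted(missing),       # follow-up questions
    }

# ASR only caught the destination; the request is incomplete, not wrong.
print(analyse_request({'destination': 'Paris'}))
# {'status': 'incomplete', 'partial': {'destination': 'Paris'}, 'ask_for': ['date']}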

In addition, researchers in the field of SDSs are more and more interested in improving the system's floor management capacities. Instead of adopting a walkie-talkie approach, where each dialogue participant has to wait for the other to release the floor before processing their utterance and coming up with a response, incremental dialogue suggests that the listener processes the speaker's utterance on the fly, hence being able to interrupt them. In this framework, the system processes growing partial requests, which is another application of the solutions that will be studied. Incremental dialogue capacities are crucial for the development of a new generation of dialogue systems that are more human-like, more reactive and less error-prone.

Essential functions:

You will improve the current dialogue solution that is developed and maintained by our team. For that, you will have to interact with researchers in the field as well as with developers. Depending on the quality of the proposed solutions, your results may be published at scientific conferences or lead to a patent.

Qualifications and skills:

- MSc in Computer Science or a related field.

- A specialisation in Natural Language Processing is very welcome.

- Object-Oriented Programming.

- Good background in applied mathematics: probability and statistics.

- Good English level.

- Interest in Human Machine Interaction and Artificial Intelligence.

- Team work.

If you want to be part of an innovative experience in a team of talented people with state-of-the-art skills in the field, please submit your resume by email to hatim.khouzaimi@orange.com.


6-7(2015-08-18) 1 year engineer position at INRIA Bordeaux

1 year engineer position at INRIA Bordeaux

 

The French National Institute of Research in Computing and Automation (INRIA, http://www.inria.fr/), research centre of Bordeaux-Sud Ouest (http://www.inria.fr/centre/bordeaux) is recruiting an engineer/developer for 12 months, in the framework of a partnership between the GEOSTAT team (http://geostat.bordeaux.inria.fr) and BATVOICE TECHNOLOGIES Company.

The field of activity is pathological speech processing. The successful candidate will take part in the valorisation of our research and in its transfer into a cutting-edge architecture.

The aim is the emergence of a new technology in a multi-support medical application.

Profile

Engineer/developer, Master 2 graduate or PhD with a good knowledge of:

  • object-oriented C++ for real-time and multithreaded processing;

  • application development and signal processing, particularly regarding numerical precision;

  • development with Eclipse under Linux;

  • analysis and integration of modules from open-source projects.

Mission

Implementation of speech processing algorithms, based on data captured by microphone.

Based on the algorithm specifications, the implementations should take the form of class modules gathered into directly executable applications.

The successful candidate will be supervised by an experienced researcher and work in close collaboration with a developer/integrator from BATVOICE.

The modules developed will be designed for large-scale data processing, with execution and input/output in console mode only.

The development will include proper handling of all exceptions and error cases.

Location

Join a team of talented researchers and engineers at the cutting edge of science, in the stimulating environment of the French National Institute of Research in Computing and Automation (INRIA), in the famous town of Bordeaux in the south-west of France.

Collaborate with a technology company working in the MedTech area, close to the dynamic Parisian ecosystem.

Evolution

By the end of this mission, the selected candidate will have the opportunity to be recruited by Batvoice Technologies.

Salary

Depending on profile, between €30K and €45K per year.

 

Starting date

Between September 1, 2015 and November 30, 2015.

 

Contact

Dr. Khalid DAOUDI, khalid.daoudi@inria.fr



6-9(2015-08-25) Postdoc / Spontaneous speech recognition and understanding, IMAG, Grenoble, France

 


Postdoc / Spontaneous speech recognition and understanding

 

You will work on a research and development project (CASSIE) involving academic and industrial stakeholders in spoken dialogue and assistive technologies. The postdoc's objective is to advance the state of the art in spontaneous speech recognition and understanding. More precisely, one application of the project is a chatterbot which assists users in interacting with a smart home environment. The heart of the research will be twofold:

-improve/adapt the LIG ASR system to spontaneous speech 

-build probabilistic and/or deep-learning based models for spoken language understanding in the context of assistive technologies.

 For the experimental development and validation, the research will benefit from the fully-equipped LIG smart home (DOMUS).

Start : Fall 2015 (flexible start from Sept to Dec 2015)

Duration : 18 months (postdoc)

Contact : Laurent.Besacier@imag.fr ; Benjamin.Lecouteux@imag.fr

 
Profiles
The applicants must hold a PhD degree in Computational Linguistics, Computer Science or Cognitive Sciences, preferably with experience in the fields of speech processing and/or natural language processing and/or machine learning. A good background in programming is also required. Experience with deep learning architectures and word embeddings is a plus.
The successful candidate will also be involved in experiments with human participants, either French or English speakers. For this reason, a good level of English is required, as well as (ideally) a good command of French. Finally, effective communication skills in English, both written and verbal, are mandatory.
 
Location
Grenoble is a high-tech city with 4 universities. It is located at the heart of the Alps, in outstanding scientific and natural surroundings. It is 3h by train from Paris, 2h from Geneva, 1h from Lyon, 2h from Torino, and less than 1h from Lyon international airport.
 
Research Group Website : http://getalp.imag.fr 
 
Dates
Interviews will be held in July 2015 (until September 2015 if needed). Meetings during Interspeech 2015 in Dresden (Germany) can also be organized.

6-10(2015-08-25) Mother-tongue Pashto at Vocapia.

CONTEXT:
As part of a study on languages with weakly standardised writing systems, Vocapia is recruiting one native speaker of Pashto who will be in charge of the following tasks:
- lemmatisation and normalisation of web texts
- searching the web for audio documents (from radio, TV, YouTube...)
- downloading these documents and classifying them by topic
- transcription of audio documents.

PROFILE:
The recruited person is a native speaker of Pashto, preferably with a background in linguistics and/or natural language processing. Experience in the study and/or processing of linguistic data would be a plus.

DURATION:
4-6 months; start of contract: as soon as possible

SALARY: according to experience.

PLACE OF WORK: Orsay, France (91400)

Interested candidates are invited to send a CV to recruit@vocapia.com


6-11(2015-08-25) Mother-tongue Somali at Vocapia

CONTEXT:
As part of a study on languages with weakly standardised writing systems, Vocapia is recruiting one native speaker of Somali who will be in charge of the following tasks:
- lemmatisation and normalisation of web texts
- searching the web for audio documents (from radio, TV, YouTube...)
- downloading these documents and classifying them by topic
- transcription of audio documents.

PROFILE:
The recruited person is a native speaker of Somali, preferably with a background in linguistics and/or natural language processing. Experience in the study and/or processing of linguistic data would be a plus.

DURATION:
4-6 months; start of contract: as soon as possible

SALARY: according to experience.

PLACE OF WORK: Orsay, France (91400)

Interested candidates are invited to send a CV to recruit@vocapia.com


6-12(2015-10-06) POST-DOC OPENING IN STATISTICAL NATURAL LANGUAGE PROCESSING AT LIMSI-CNRS, FRANCE

POST-DOC OPENING
IN STATISTICAL NATURAL LANGUAGE PROCESSING
AT LIMSI-CNRS, FRANCE
****************************************************************

The 'Spoken Language Processing' team at LIMSI-CNRS, Orsay (25 km south of
Paris) is seeking qualified postdoctoral researchers in the field of Statistical
Natural Language Processing (see https://www.limsi.fr/en/research/tlp/).

* Description of the work

This position is related to a collaborative project aimed at developing an experimental platform for online monitoring of social media and information streams, with self-adaptive properties, in order to detect, collect, process, categorize, and analyze multilingual news streams. The platform includes advanced linguistic analysis, discourse analysis, extraction of entities and terminology, topic detection, and translation; the project also includes studies on unsupervised and cross-lingual adaptation.

In this context, the candidate is expected to develop innovative methods for
performing unsupervised cross-domain and/or cross-lingual adaptation of
statistical NLP tools.

* Requirements and objectives

Candidates are expected to hold an Engineering BSc/MSc degree and a PhD in Computer Science. Knowledge of statistical or example-based approaches for Speech or Natural Language Processing is required; the candidate is also expected to have strong programming skills, to be familiar with statistical machine learning techniques, and to have a good publication record in the field.

Salary and other conditions of employments will follow CNRS standard rules
for non-permanent researchers, according to the experience of the candidate.

* Contacts : Francois Yvon - francois.yvon at limsi.fr

Interested candidates should send a short cover letter stating motivation and
interests, along with their CV (in .pdf format, only) and names and addresses of
two references, to the email address given above as soon as possible, and by November 1st, 2015 at the latest. The contract is expected to start January 1st, 2016.

Informal questions regarding this position should be directed to the same address.




6-13(2015-10-16) Fixed-term position (CDD, 6 months) at the Institut français de l'Éducation (IFÉ - ENS de Lyon), France

Six-month fixed-term contract (CDD) offer:

Selection of an automatic text phonetisation system, adaptation to the context of learning to read, then integration into the web platform of the CADOE project (Calcul de l'Autonomie de Déchiffrage Offerte aux Élèves).

Context

The Institut français de l'Éducation (IFÉ - ENS de Lyon) is conducting nationwide research on learning to read and write in the first year of primary school (the LireEcrireCP project). This research brings together 60 faculty researchers and PhD students across France. 131 classes have been observed, and a large amount of data has been collected and analysed. A notable finding is that texts of which less than 31% of the content is directly decipherable by pupils hinder their learning, while texts where this proportion exceeds 55% favour the learning of the pupils who are weakest in the 'code' (that is, in letter-sound correspondences) at the start of the school year. The analysis required to determine this directly decipherable share is complex and cannot be carried out on a daily basis by teachers, even experienced ones. They therefore lack a piece of information that is crucial for choosing the texts used to teach reading.

The goal of the CADOÉ project is to build a platform that gives teachers access to this share of text directly decipherable by their pupils. To that end, teachers will enter the progression of their teaching of letter-sound correspondences (the study of the 'code'), indicate which words have been learned in class, and submit candidate texts to be used as reading material. These texts will be automatically analysed and segmented into graphemic units. Comparing the taught 'code' and the words learned in class with the result of this decomposition makes it possible to compute, and return to the user, the share of the text that is directly decipherable by the pupils, in other words the pupils' level of deciphering autonomy on the submitted text.
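A minimal sketch of this computation (the grapheme inventory, sight words and greedy segmentation below are deliberately simplistic stand-ins for the output of a real phonetisation tool):

# Sketch: share of a text directly decipherable by pupils, given the
# grapheme-phoneme correspondences already taught and the words learned
# by sight. The segmentation is naive; a real phonetiser is needed.
TAUGHT_GRAPHEMES = {'a', 'i', 'o', 'l', 'm', 'r', 'ch'}   # taught 'code'
LEARNED_WORDS = {'est', 'un'}                             # sight words

def decipherable(word: str) -> bool:
    if word in LEARNED_WORDS:
        return True
    i = 0
    while i < len(word):               # greedy longest-match segmentation
        for size in (2, 1):
            if word[i:i + size] in TAUGHT_GRAPHEMES:
                i += size
                break
        else:
            return False               # contains an untaught grapheme
    return True

def decipherable_share(text: str) -> float:
    words = text.lower().split()
    return sum(decipherable(w) for w in words) / len(words)

print(decipherable_share('mila lira est un rat'))  # -> 0.8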

Expected skills

We are looking for candidates with either a computer science background specialised in natural language processing (NLP) or speech processing, or a linguistics background specialised in language engineering and computing.

The recruited person will survey existing phonetisation tools and select one. He or she will then configure or adapt it so that its output conforms to the expected decomposition proposed by Riou (2015), and will carry out the tests. He or she will own the project and be responsible for its progress, will coordinate with the web development team on the exchange formats between the platform and the phonetisation tool, and will work in close collaboration with J. Riou, research officer at IFÉ and scientific lead of the project, who will validate the configuration of the phonetisation tool and coordinate the user tests of the developed environment, as well as with P. Daubias, Research Engineer in Computer Science and technical lead. The degree of technical involvement in building the CADOÉ web platform may vary according to the profile and skills of the selected candidate.

The development process will be iterative (spiral model), refining the specifications in light of the successive mock-ups.

Technical aspects

Development will target Linux and must be efficient. The first tool considered (lia_phon) is written in C, but adapting it does not necessarily require in-depth knowledge of that language. Other phonetisation tools may be considered (IrisaPhon, for example), and the technology choices are not fixed.

Administrative aspects

Location: Institut français de l'Éducation - École Normale Supérieure de Lyon, Bâtiment D6, 19 allée de Fontenay, 69007 Lyon (Métro B: Debourg)

Salary: according to level and pay scale: from €1700 to €2500 gross per month.

Start of contract: as soon as possible.

Please send your application (CV + cover letter) to: lire.ecrire@ens-lyon.fr

A first review of applications will take place in early November 2015.

Contract duration: 6 months


6-14(2015-12-02) Master2 position at Multispeech Team, LORIA (Nancy, France)

Master2 position at Multispeech Team, LORIA (Nancy, France)

Automatic speech recognition: contextualisation of the language model based on neural networks by dynamic adjustment

Framework of ANR project ContNomina

The technologies involved in information retrieval in large audio/video databases are often based on the analysis of large, but closed, corpora, and on machine learning techniques and statistical modeling of the written and spoken language. The effectiveness of these approaches is now widely acknowledged, but they nevertheless have major flaws, particularly concerning proper names, which are crucial for the interpretation of the content.

In the context of diachronic data (data which change over time) new proper names appear constantly requiring dynamic updates of the lexicons and language models used by the speech recognition system.

As a result, the ANR project ContNomina (2013-2017) focuses on the problem of proper names in automatic audio processing systems by exploiting in the most efficient way the context of the processed documents. To do this, the student will address the contextualization of the recognition module through the dynamic adjustment of the language model in order to make it more accurate.

Subject

Current systems for automatic speech recognition are based on statistical approaches. They require three components: an acoustic model, a lexicon and a language model. This internship will focus on the language model. The language model of our recognition system is based on a neural network learned from a large corpus of text. The problem is to re-estimate the language model parameters for a new proper name depending on its context and a small amount of adaptation data. Several tracks can be explored: adapting the language model, using a class model, or studying the notion of analogy.
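One simple formulation of the class-model track (a sketch with invented probabilities; a real system would derive these from the neural language model and the adaptation data):

# Sketch of the class-model idea: a proper name unseen in training
# receives probability through its class, avoiding full retraining.
# All probabilities below are invented for illustration.
import math

# p(class | context), as the trained (e.g. neural) LM would provide
p_class_given_ctx = {('elected', 'president'): {'PERSON': 0.4, 'OTHER': 0.6}}

# p(word | class), re-estimated cheaply from a little adaptation data
p_word_given_class = {'PERSON': {'Macron': 0.01, 'Obama': 0.05}}

def logp_name(context, name, cls='PERSON'):
    """log p(name | context) = log p(cls | context) + log p(name | cls)"""
    return (math.log(p_class_given_ctx[context][cls])
            + math.log(p_word_given_class[cls][name]))

# A name added after training still receives a usable LM score.
print(logp_name(('elected', 'president'), 'Macron'))  # about -5.52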

Our team has developed a fully automatic system for speech recognition to transcribe a radio broadcast from the corresponding audio file. The student will develop a new module whose function is to integrate new proper names in the language model.

Required skills

Background in statistics and object-oriented programming.

Localization and contacts

Loria laboratory, Multispeech team, Nancy, France

Irina.illina@loria.fr, dominique.fohr@loria.fr

Candidates should email a detailed CV and diploma

References

[1] J. Gao, X. He, L. Deng. Deep Learning for Web Search and Natural Language Processing, Microsoft slides, 2015.

[2] X. Liu, Y. Wang, X. Chen, M. J. F. Gales, and P. C. Woodland. Efficient lattice rescoring using recurrent neural network language models, in Proc. ICASSP, 2014, pp. 4941-4945.

[3] M. Sundermeyer, H. Ney, and R. Schlüter. From Feedforward to Recurrent LSTM Neural Networks for Language Modeling. IEEE/ACM Transactions on Audio, Speech, and Language Processing, volume 23, number 3, pages 517-529, March 2015.


6-15(2015-10-22) Scientific collaborator for the multimodal project ADNVIDEO, Marseille, France

Application deadline: 12/31/2015
Starting: as soon as possible.

Description:
The ADNVIDEO project (http://amidex.kalysee.com/), funded in the
framework of A*MIDEX (http://amidex.univ-amu.fr/en/home), aims at
extending multimodal analysis models. It focuses on jointly processing
audio, speech transcripts, images, scenes, text overlays and user
feedback. Using as starting point the corpus, annotations and
approaches developed during the REPERE challenge
(http://defi-repere.fr), this project aims at going beyond single-modality indexing by incorporating information retrieval methods, not
only from broadcast television shows, but more generally on video
documents requiring multimodal scene analysis. The novelty here is to
combine and correlate information from different sources to enhance
the description of the content. The application for this project
relates to the issue of recommendation applied to videos in the
context of Massive Open Online Courses where video content can be
matched to student needs.

Objectives:
The candidate will participate in the development of a prototype for
video recommendation:

- Integration of existing multimodal high-level descriptors in prototype
- Generation of textual descriptors from videos (such as automatic
image captioning, scene title generation, etc)
- Implementation of deep learning methods for video analysis

The allocation of the tasks can be adjusted depending on the wishes
and skills of the candidate.

Skills:
For this project, we are looking for one candidate with a PhD degree
in the areas of machine learning, artificial vision, natural language
processing, or information retrieval:
- Strong programming skills (C++, Java, Python...).
- Desire to produce functioning end-to-end systems and full-scale live demos
- Scientific rigor
- Imagination
- Top notch publications
- Excellent communication skills
- Enjoy teamwork
Candidates must presently work outside of France.

Location:
The work will be conducted at Aix-Marseille University, in the Laboratoire des Sciences de l'Information et des Systèmes (LSIS, http://www.lsis.org), within the ADNVidéo project, supported by funding from the A*MIDEX foundation in collaboration with Kalyzee (http://www.kalyzee.com/). Both LSIS and Kalyzee are located in the historical and sunny city of Marseille, in the south of France (http://www.marseille.fr/sitevdm/versions-etrangeres/english--discover-marseille).

Contact: sebastien.fournier@lsis.org
Duration: 6 months

Candidates should email a letter of application, a detailed CV
including a complete list of publications, and source code showcasing
programming skills.


6-16(2015-11-05) Engineer for the LINKMEDIA project at IRISA, Rennes, France

The LINKMEDIA team (http://www-linkmedia.irisa.fr) at IRISA develops technologies for describing and accessing multimedia content through content analysis: computer vision, speech and language processing, audio processing, and data mining. Our work relies on an indexing platform which provides, besides a hardware infrastructure, a software offering in the form of web services.

To develop and promote the services offered on IRISA's multimedia indexing platform, we are recruiting an engineer specialised in multimedia data processing. The assigned tasks are:
- integrating existing modules into the platform
- developing new modules implementing state-of-the-art techniques
- making the full set of modules consistent and documenting them
- building demonstrations of multimedia applications for education and industrial transfer
- taking part in international evaluation campaigns

The engineer will join the LINKMEDIA research team and will work in close collaboration with its researchers and their industrial partners on R&D projects.

The candidate, holding a Master's or PhD-level degree, should have a marked interest in multimedia and web technologies. He or she should also have significant programming experience (C/C++, Perl, Python), for instance through projects and internships in the case of recent graduates. Experience in running large software projects will be appreciated. Given the international working context, a good command of English is essential.

To apply, please send a CV together with a cover letter. For further details about the position, contact us.

Employer: Centre National de la Recherche Scientifique
Workplace: IRISA, Rennes
Contract: fixed-term, 12 to 16 months, starting as soon as possible
Salary: €24K to €35K gross per year, according to degree and experience
Contact: Guillaume Gravier, guillaume.gravier@irisa.fr


6-17(2015-11-13) Ph.D. at Limsi, Orsay, France

LIMSI (http://www.limsi.fr) seeks qualified candidates for one fully funded PhD position in the field of automatic speaker recognition. The research will be conducted in the framework of the ANR-funded project ODESSA (Online Diarization Enhanced by recent Speaker identification and Structured prediction Approaches) in partnership with EURECOM (France) and IDIAP (Switzerland).

Master students are welcome to apply for a preliminary internship (starting no later than April 2016) that may lead to this PhD position.

Broadly, the goal of an automatic speaker recognition system is to authenticate or to identify a person through the speech signal. Speaker diarization is an unsupervised process that aims at identifying each speaker within an audio stream and determining the intervals during which each speaker is active.

The overall goal of the position is to advance the state-of-the-art in speaker recognition and diarization.
Specifically, the research will explore the use of structured prediction techniques for speaker diarization.

Conversations between several speakers are usually highly structured, and the speech turns of a given person are not uniformly distributed over time. Hence, knowing that someone is speaking at a particular time t tells us a lot about the probability that he or she will also speak a few seconds later. However, state-of-the-art approaches seldom take this intrinsic structure into account.
The goal of this task is to demonstrate that structured prediction techniques (such as graphical models or SVMstruct) can be applied to speaker diarization.
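To give the flavour of how turn structure can be exploited (a toy sketch only: the per-frame scores and the 'sticky' transition prior are invented, and real structured-prediction models are far richer):

# Sketch: Viterbi smoothing of per-frame speaker scores with a 'sticky'
# transition prior, so the structure of turns (speakers tend to keep
# the floor) shapes the diarization output. All numbers are invented.
import numpy as np

log_emit = np.log(np.array([   # p(frame | speaker) for 2 speakers
    [0.8, 0.2], [0.6, 0.4], [0.45, 0.55], [0.4, 0.6], [0.7, 0.3]]))
log_trans = np.log(np.array([[0.9, 0.1],    # staying with the same
                             [0.1, 0.9]]))  # speaker is more likely

T, K = log_emit.shape
delta = np.zeros((T, K))
psi = np.zeros((T, K), dtype=int)
delta[0] = np.log(1.0 / K) + log_emit[0]
for t in range(1, T):
    scores = delta[t - 1][:, None] + log_trans   # scores[i, j]
    psi[t] = scores.argmax(axis=0)
    delta[t] = scores.max(axis=0) + log_emit[t]

path = [int(delta[-1].argmax())]
for t in range(T - 1, 0, -1):
    path.append(int(psi[t][path[-1]]))
print(path[::-1])
# -> [0, 0, 0, 0, 0]: the sticky prior overrides the brief local
# preference for speaker 1 at frames 2-3.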

The proposed research is a collaboration between EURECOM, IDIAP and LIMSI.
The research will rely on previous knowledge and software developed at LIMSI. Reproducible research is a cornerstone of the project. Hence a strong involvement in data collection and in open-source libraries is expected.

The ideal candidate should hold a Master degree in computer science, electrical engineering or related fields. She or he should have a background in statistics or applied mathematics, optimization, linear algebra and signal processing. The applicant should also have strong programming skills and be familiar with Python, various scripting languages and with the Linux environment. Knowledge in speech processing and machine learning is an asset.

Starting date is as early as possible and no later than October 2016.

LIMSI is a CNRS laboratory with 250 people and 120 permanent members. The Spoken Language Processing group involved in the project is composed of 41 people, including 17 permanent members. The group is internationally recognized for its work on spoken language processing, and in particular for its developments in automatic speech recognition. The research carried out in the Spoken Language Processing group aims at understanding the speech communication processes and developing models for use in automatic speech processing. This research area is inherently multidisciplinary; topics addressed include speech recognition, speaker recognition, corpus linguistics, error analysis, spoken language dialogue, question answering in spoken data, multimodal indexing of audio and video documents, and machine translation of both spoken and written language.

Contact : Hervé Bredin (bredin@limsi.fr) and Claude Barras (barras@limsi.fr)


6-18(2015-11-15) 1 (W/M) researcher position at IRCAM, Paris, France

 

Position: 1 (W/M) researcher position at IRCAM

Starting: January 4th, 2016

Duration: 18 months

Deadline for application: December 1st, 2015

Description of the project:

The goal of the ABC-DJ project (European H2020 ICT-19 project) is to develop advanced Audio Branding technologies (recommending music for a trade-mark). For this, ABC-DJ will rely on Music Content and Semantic Analysis. Within this project, IRCAM will develop:
- new music content analysis algorithms (auto-tagging into genre, emotions and instrumentation; estimation of tonality and tempo);
- new tools for advanced DJ-ing (audio quality measurement, segmentation into vocal parts, full hierarchical structure analysis, intelligent track summary, audio source separation).

Position description 201511ABCRES:

For this project IRCAM is looking for a researcher to develop the music content analysis and advanced DJ-ing technologies.

Required profile:
- High skill in audio signal processing (spectral analysis, audio feature extraction, parameter estimation); the candidate should preferably hold a PhD in this field
- High skill in machine learning (the candidate should preferably hold a PhD in this field)
- High skill in Matlab/Python programming, skills in C/C++ programming
- Good knowledge of Linux, Windows and Mac OS environments
- High productivity, methodical work, excellent programming style

The hired researcher will also collaborate with the development team and participate in the project activities (evaluation of technologies, meetings, specifications, reports).

Introduction to IRCAM:

IRCAM is a leading non-profit organization associated to the Centre Pompidou, dedicated to music production, R&D and education in sound and music technologies. It hosts composers, researchers and students from many countries cooperating in contemporary music production, scientific and applied research. The main topics addressed in its R&D department include acoustics, audio signal processing, computer music, interaction technologies and musicology. IRCAM is located in the centre of Paris near the Centre Pompidou, at 1, Place Igor Stravinsky, 75004 Paris.

Salary:

According to background and experience.

Applications:

Please send an application letter with the reference 201511ABCRES, together with your resume and any suitable information addressing the above issues, preferably by email, to: peeters at ircam dot fr, with cc to vinet at ircam dot fr and roebel at ircam dot fr.

 


6-19(2015-11-13) Internship at Loria, Vandoeuvre-lès-Nancy, France
Speech intelligibility: how to determine the degree of nuisance
 
General information
Supervisors
Irina Illina, LORIA, Campus Scientifique - BP 239, 54506 Vandoeuvre-lès-Nancy, illina@loria.fr
Patrick Chevret, INRS, 1 rue du Morvan, 54519 Vandoeuvre-lès-Nancy, patrick.chevret@inrs.fr
 
Motivations
Speech intelligibility refers to the ability of speech in a conversation to be understood by a listener located nearby. The level of speech intelligibility depends on several criteria: the level of ambient noise, the possible absorption of part of the sound spectrum, acoustic distortion, echoes, etc. Speech intelligibility is used to assess the performance of telecommunication systems or the absorption of rooms.
 
The speech intelligibility can be evaluated:
- subjectively: listeners hear several words or sentences and answer various questions (transcription of sounds, percentage of perceived consonants, etc.); the scores constitute the intelligibility value;
- objectively, without involving listeners, using acoustic measures: the speech transmission index (STI) and the speech interference level.
 
Subjective measures depend on the listeners and require a large number of them, which is difficult to achieve, especially across different types of environments; moreover, the measure must be evaluated for each listener. Objective measures have the advantage of being automatically quantifiable and precise. However, which objective measures can capture the nuisance of the environment on speech intelligibility and on people's health remains an open problem. For example, the STI is based on measuring energy modulation, but energy modulation can also be produced by machines even though it does not correspond to speech.
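For reference, the STI family of measures reduces modulation transfer values to a weighted average of band-wise transmission indices; a simplified sketch (the band weights and m-values below are illustrative, not the full IEC 60268-16 procedure):

# Simplified STI-style computation: modulation transfer values m are
# turned into apparent SNRs, clipped to +/-15 dB, normalised to [0,1]
# transmission indices, then averaged with band weights. The weights
# and m-values below are illustrative only.
import math

def transmission_index(m: float) -> float:
    snr = 10.0 * math.log10(m / (1.0 - m))   # apparent SNR in dB
    snr = max(-15.0, min(15.0, snr))         # clip to +/-15 dB
    return (snr + 15.0) / 30.0               # normalise to [0, 1]

# one m per octave band (illustrative), with illustrative band weights
m_values = [0.9, 0.8, 0.7, 0.6, 0.6, 0.5, 0.5]
weights  = [0.13, 0.14, 0.11, 0.12, 0.19, 0.17, 0.14]

sti = sum(w * transmission_index(m) for w, m in zip(weights, m_values))
print(round(sti, 2))  # ~0.6, 'fair' intelligibility on the STI scale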
 
Subject
In this internship, we focus on the study of various objective measures of speech intelligibility. The goal is to find reliable measures of the level of nuisance that the environment inflicts on speech understanding, on people's long-term mental health, and on productivity. One possible approach is to correlate word confidence measures and noise estimates with subjective measures of speech intelligibility. An automatic speech recognition system will be used to develop these measures.
 
This internship is a collaboration between the Multispeech team of LORIA and INRS (the French National Institute of Research and Safety). INRS works on occupational risk identification, analysis of their impact on health, and prevention. INRS has a rich corpus of recordings and subjective measures of speech intelligibility, which will be used in the context of this internship. The Multispeech team has strong expertise in signal processing, has developed several methodologies for noise estimation, and has built a complete automatic speech recognition system.
 
Required skills
Background in statistics and object-oriented programming.

6-20(2015-11-15) Two internship offers for 2016 at LIA, Avignon, on vocal interaction


Here are two internship topics for 2016 at LIA, Avignon, on human-machine vocal interaction. Please circulate these offers to potentially interested students (Masters in Computer Science, Linguistics, Cognitive Science, AI, Data Processing, Mathematics...).

========================================================================
Connectionist models for automatic text generation in vocal interaction

Supervisors: Dr Stéphane Huet, Dr Bassam Jabaian, Prof. Fabrice Lefèvre

Internship description:
Vocal interaction systems used in applications such as booking flights or hotels, or for dialogue with a robot, involve several components. Among these is the text generation module, which produces the system's natural-language response from an internal semantic representation created by the dialogue manager.

Current dialogue systems integrate generation modules based on manually defined rules or lexical patterns, e.g.:

confirm(type=$U, food=$W, drinks=dontcare)
→ Let me confirm, you are looking for a $U serving $W food and any kind of drinks, right?

These modules would benefit from machine learning methods, to ease the portability of dialogue systems to new tasks and to improve the diversity of the generated responses. Among such methods are neural networks, which have seen renewed interest since the introduction of 'deep learning'. Such networks have already been used by Google's research laboratory for an image description task (http://googleresearch.blogspot.fr/2014/11/a-picture-is-worth-thousand-coherent.html) close to the one of interest here. The objective of this internship is thus to study the use of these models specifically in the context of vocal interaction.
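For concreteness, here is a rough sketch of the delexicalisation trick commonly used around such neural generators (the template lookup below stands in for a trained model and reuses the example act above):

# Sketch: delexicalised generation, a common preprocessing step for
# neural generators. Slot values are replaced by placeholders so the
# model only has to learn act -> template; values are restored after.
def act_to_template(act: str) -> str:
    # a trained seq2seq model would predict this template from the act;
    # it is faked here with a lookup for illustration
    templates = {
        'confirm': ('Let me confirm, you are looking for a $type '
                    'serving $food food and any kind of drinks, right?'),
    }
    return templates[act]

def relexicalise(template: str, slots: dict) -> str:
    out = template
    for name, value in slots.items():
        out = out.replace('$' + name, value)
    return out

slots = {'type': 'restaurant', 'food': 'Thai'}
print(relexicalise(act_to_template('confirm'), slots))
# Let me confirm, you are looking for a restaurant serving Thai food
# and any kind of drinks, right?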

While an interest in machine learning and natural language processing is desirable, good software development skills are above all expected from the intern. The intern will work within a complete vocal interaction platform and may broaden the scope of investigation to its other components. Several paths towards a follow-up PhD are open.

Internship duration: 6 months
Stipend: approximately €529/month
Internship topics: human-machine dialogue systems, natural language generation, machine learning...

========================================================================
Humour and vocal interaction systems

Supervisors: Dr Bassam Jabaian, Dr Stéphane Huet, Prof. Fabrice Lefèvre

Internship description: automating the production of humour.

Previous work in linguistics has laid the foundations of a taxonomy of interactional humour mechanisms. Starting from this basis, the question we wish to address in this work is: can the production of humorous remarks be automated in a task-oriented human-machine dialogue and, if so, what is the impact on dialogue performance?

Of course, the point cannot be to reproduce exactly the general abilities of a human, which are very complex to describe and certainly impossible to automate, but rather to extract mechanisms regular enough to be formalised and executed in a dialogue situation. This should produce an offbeat effect, giving the interaction system a dimension of likeability in the user's perception.

From a pragmatic point of view, several types of production (more or less independent of the humour mechanism used) are already envisaged, either reactive or generative:
1. In the first case, an opportunity is detected (presence of connectors), and the system reacts (generation of disjunctors). This is the case of humour based on polysemous words, i.e. the system spots a word that it pretends to understand in its 'awkward' or unsuitable sense.
2. In the second case, the system produces a humorous remark ex nihilo, or after detecting a need for facilitation, for instance when a misalignment appears (the normal course of the dialogue is hindered by one or more misunderstandings). These would be puns, witticisms or jokes. A predefined base of jokes can be used, selected according to the dialogue context by means of classical information retrieval techniques, as sketched below.

L'objectif est de pouvoir implémenter les solutions retenues sur les robots NAO
d'Aldebaran, disponibles au laboratoire, dans le contexte d'une tâche simple (jeux).
Au-delà de l'intérêt pour la thématique de l'intelligence artificielle sous-jacente au
sujet il est principalement attendu du stagiaire de très bonnes compétences en
développement logiciel. Ce stage ouvre sur plusieurs possibilités de poursuite en thèse
dans le domaine de la communication Homme-Machine pour l'intelligence artificielle.

Durée du stage : 6 mois
Rémunération : Environ 529? / mois
Thématique associée au stage : Systèmes de dialogue homme-machine, compréhension de la
parole, gestion du dialogue, apprentissage automatique.
========================================================================

Les étudiants intéressés sont invités à envoyer un email à
fabrice.lefevreAtuniv-avignon.fr, bassam.jabaianAtuniv-avignon.fr et
stephane.huetAtuniv-avignon.fr en indiquant le sujet visé (ou les 2) et en joignant un
dossier d'évaluation (avec au moins un CV, un relevé de notes des 2 dernières années et
une lettre de motivation).

Une première sélection aura lieu le 24/11/15.

Bien cordialement,
- Fabrice Lefevre


6-21(2015-11-18) PhD and Postdoctoral Opportunities in Multilingual Speech Recognition at Idiap, Martigny, Switzerland

PhD and Postdoctoral Opportunities in Multilingual Speech Recognition

In the context of a new EU funded collaborative project, Idiap Research Institute has PhD
and postdoctoral opportunities in multilingual speech recognition.

For more details and to apply, see the respective entries on our recruitment page:
 http://www.idiap.ch/education-and-jobs


6-22(2015-11-20) Research engineer at LPL, Aix-en-Provence, France

The Laboratoire Parole et Langage in Aix-en-Provence offers a research engineer position, open to transfer within the CNRS NOEMi winter campaign.

The detailed job profile is attached.

Profile summary:

The project manager will be in charge of signal processing (acoustic, kinematic, physiological, electro-encephalographic and video signals) and statistical analyses of multimodal data; of the development of programs for signal pre-processing and processing (filtering, synthesis/resynthesis, time/frequency transformations, editing, parameter extraction, segmentation/annotation) and for statistical analysis (e.g. linear mixed-effects models, graphical representations, etc.).

For more information, please contact the LPL management (noel.nguyen@lpl-aix.fr) or the coordinator of the Centre d'Expérimentation sur la Parole (alain.ghio@lpl-aix.fr).

 


6-23(2015-11-20) Postdoctoral position in speech intelligibility at IRIT Toulouse, France

Title: Postdoctoral position in speech intelligibility

Application deadline: 1/31/2016

Description: The decreasing mortality of head and neck cancers highlights the importance of reducing the impact on Quality of Life (QoL). However, the usual tools for assessing QoL are not suited to measuring the impact of treatment on the main functions affected by the sequelae. Validated tools for measuring the functional outcomes of carcinologic treatment are missing, in particular for speech disorders. Some assessments are available for voice disorders in laryngeal cancer, but only very poor tools exist for oral and pharyngeal cancers, which affect the articulation of speech more than the voice.

In this context, the C2SI (Carcinologic Speech Severity Index) project proposes to develop a severity index of speech disorders describing the outcomes of therapeutic protocols completing the survival rates. There is a strong collaboration between linguists, phoneticians, speech therapists and computer science researchers, in particular those from the Toulouse Institute of Computer Science Research (IRIT), within the SAMoVA team (http://www.irit.fr/recherches/SAMOVA/).

Intelligibility of speech is the usual way to quantify the severity of neurologic speech disorders. But this measure is not valid in clinical practice because of several difficulties, such as the familiarity effect of this kind of speech and the poor inter-judge reproducibility. Moreover, transcription intelligibility scores do not accurately reflect listener comprehension. Therefore, our hypothesis is that an automatic assessment technique can measure the impact of speech disorders on communication abilities, yielding a severity index of speech in patients treated for head and neck cancer, and particularly for oral and pharyngeal cancer.

The main objective is then to demonstrate that the C2SI, obtained by an automatic speech processing tool, produces outcomes equivalent or superior to a speech intelligibility score obtained from human listeners, in terms of QoL and of predicting the speech handicap, after the treatment of oral and/or pharyngeal cancer.
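A sketch of the kind of validation this implies (all scores below are invented; scipy is just one possible tool):

# Sketch: validate an automatic severity index against perceptual
# intelligibility scores via rank correlation. All values are invented.
from scipy.stats import spearmanr

# automatic index per patient (e.g. derived from ASR confidence/accuracy)
automatic_index = [0.91, 0.75, 0.40, 0.62, 0.28, 0.83]
# mean intelligibility score from a jury of human listeners
human_scores    = [8.5,  7.0,  3.5,  6.0,  2.0,  8.0]

rho, pval = spearmanr(automatic_index, human_scores)
print(f'Spearman rho = {rho:.2f} (p = {pval:.3f})')
# a high rho would support replacing jury listening tests with the
# automatic index in clinical follow-up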

The database is currently being recorded at the Institut Universitaire du Cancer in Toulouse, and includes CVC pseudo-words, read passages, short sentences focusing on prosody, and spontaneous descriptions of pictures.

Roadmap to develop an automatic system that will evaluate the intelligibility of impaired speech:

- Study existing SAMoVA technologies and evaluate them with the C2SI protocol,

- Find relevant features in the audio signal that support intelligibility,

- Merge those features to obtain the C2SI,

- Correlate it with the speech intelligibility scores obtained by human listeners,

- Study in which way the features support understandability as well.

Skills:

For this project, we are looking for a candidate with a PhD degree in the areas of machine learning or signal processing, and also with programming skills, scientific rigour, creativity, a good publication record, excellent communication skills, and a taste for teamwork.

Salary and other conditions of employments will follow CNRS (French National Center for Scientific Research) standard rules for non-permanent researchers, according to the experience of the candidate.

Location: the work will be conducted in the SAMoVA team of the IRIT, Toulouse (France).

Contact: Jérôme Farinas (jerome.farinas@irit.fr), Julie Mauclair (julie.mauclair@irit.fr)

Duration: 12 to 24 months

Candidates should email a letter of application, a detailed CV including a complete list of publications, and source code showcasing programming skills if available.



6-25(2015-12-03) ,PostDoc position in the field of automatic speaker recognition (ASR) at Idiap, Martigny, Switzerland

The Idiap Research Institute (http://www.idiap.ch) seeks qualified candidates for one PostDoc position in the field of automatic speaker recognition (ASR).

The research will be conducted in the framework of the SNSF funded project ODESSA (Online
Diarization Enhanced by recent Speaker identification and Structured prediction
Approaches) in partnership with LIMSI and EURECOM in France.

Broadly, the goal of an automatic speaker recognition system is to authenticate or to identify a person through the speech signal.
Speaker diarization is an unsupervised process that aims at identifying each speaker
within an audio stream and determining the intervals during which each speaker is active.

The overall goal of the position is to advance the state of the art in speaker
recognition and diarization. Specifically, the research will:
* investigate i-vectors and deep neural networks for ASR and their application to the
problem of speaker diarization (a minimal scoring sketch follows this list),
* explore the use of domain adaptation.
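
As a rough illustration of the i-vector point above, the following Python sketch scores a single verification trial by cosine similarity between an enrollment i-vector and a test i-vector. The vectors, dimension and threshold are synthetic and purely illustrative; a full pipeline (such as one built with Spear) would typically add whitening, length normalization and PLDA scoring.

    import numpy as np

    # Cosine scoring of two i-vectors: a standard lightweight decision
    # rule for speaker verification. All data below are synthetic.
    def cosine_score(enroll, test):
        return np.dot(enroll, test) / (np.linalg.norm(enroll) * np.linalg.norm(test))

    rng = np.random.default_rng(1)
    speaker_model = rng.normal(size=400)                 # enrollment i-vector (hypothetical)
    trial = speaker_model + 0.3 * rng.normal(size=400)   # noisy same-speaker trial

    score = cosine_score(speaker_model, trial)
    threshold = 0.5  # would be tuned on development data
    print(f"score={score:.3f} -> {'accept' if score > threshold else 'reject'}")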

The proposed research is a collaboration between LIMSI (Hervé Bredin, Claude Barras),
EURECOM (Nicholas Evans) and the Biometrics group (Dr. Sebastien Marcel,
http://www.idiap.ch/~marcel) at Idiap.
The research will build on previous knowledge and software developed at Idiap, more
specifically the Bob toolkit (http://idiap.github.io/bob/) and Spear
(https://pypi.python.org/pypi/bob.spear). In addition, the use
of libraries for deep learning (Torch, Caffe or Theano) is considered.

Reproducible research is a cornerstone of the project. Hence, strong involvement in data
collection and in open source libraries such as Bob and Spear is expected.

The ideal candidate should hold a PhD degree in computer science, electrical engineering
or a related field. She or he should have a background in statistics or applied
mathematics, optimization, linear algebra and signal processing. The applicant should
also have strong programming skills and be familiar with Python, C/C++ (MATLAB is not a
plus), various scripting languages and the Linux environment. Knowledge of speech
processing and machine learning is an asset. Shortlisted candidates may undergo a series
of tests, including technical reading and writing in English and programming (in Python
and/or C/C++).

Appointment is for 3 years, subject to successful progress, and may be renewed depending
on funding opportunities. The gross salary starts at 80'000 CHF in the first year.
The starting date is early 2016.

Apply online here:
http://www.idiap.ch/webapps/jobs/ors/applicant/position/index.php?PHP_APE_DR_9e581720b5ef40dc7af21c41bac4f4eb=%7B__TO%3D%27detail%27%3B__PK%3D%2710179%27%7D

Back  Top

6-26(2015-12-12) Ph.D. Position in Speech Recognition at Saarland University, Germany

Ph.D. Position in Speech Recognition at Saarland University

 

The Spoken Language Systems group at Saarland University in Germany anticipates the availability of a Ph.D. position in the area of speech recognition. This position is part of the Horizon 2020 project MALORCA, a research project on long-term unsupervised adaptation of the acoustic and language models of a speech recognition system. The research will be carried out together with a European consortium of high-profile research institutes and companies.
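
As a concrete, hypothetical illustration of such unsupervised adaptation, the Python sketch below implements confidence-based self-training: unlabeled utterances are decoded, only high-confidence automatic transcripts are kept, and the selected pairs are fed back as training data. The decode() stub stands in for a real recognizer and is not part of the project.

    from dataclasses import dataclass

    @dataclass
    class Hypothesis:
        text: str
        confidence: float  # decoder posterior confidence in [0, 1]

    def decode(utterance: str) -> Hypothesis:
        # Stand-in for a real ASR decoder returning a 1-best with confidence.
        return Hypothesis(text=utterance.upper(), confidence=0.9)

    def self_train(unlabeled, threshold=0.8):
        # Keep only transcripts reliable enough to retrain on.
        selected = []
        for utt in unlabeled:
            hyp = decode(utt)
            if hyp.confidence >= threshold:
                selected.append((utt, hyp.text))
        return selected  # fed back to re-estimate acoustic and language models

    print(self_train(['runway two seven cleared to land']))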

Requirements:

  • Degree in computer science, electrical engineering or a related discipline.

  • Excellent programming skills in C/C++, Python and/or Perl.

  • Experience with Linux and bash scripting.

  • Very good math background.

  • Very good oral and written English communication skills.

  • Interest in speech recognition research.

 

Salary:

  • The position is fully funded, with a salary in the range of 45,000 to 55,000 Euros per year, depending on the qualifications and professional experience of the successful candidate.

  • The position is full time, for two years (with possibility of extension).

  • Starting date is April 1st, 2016.

 

Research at Saarland University:

Saarland University is one of the leading European research sites in computational linguistics and offers an active, stimulating research environment. Close working relationships are maintained between the Departments of Computational Linguistics and Computer Science. Both are part of the Cluster of Excellence, which also includes the Max Planck Institutes for Informatics (MPI-INF) and Software Systems (MPI-SWS) and the German Research Center for Artificial Intelligence (DFKI).
Each application should include:

  • Curriculum Vitae, including relevant research experience and a list of publications (if applicable).

  • Transcript of records (BSc/MSc).

  • Statement of interest (letter of motivation).

  • Names of two references.

  • Any other supporting information or documents.

Applications (all documents in PDF format, merged into a single file) should be sent no later than Sunday, January 10th, to: sekretariat@LSV.Uni-Saarland.De

 

Further inquiries regarding the project should be directed to:

Youssef.Oualil@LSV.Uni-Saarland.De

or

Dietrich.Klakow@LSV.Uni-Saarland.De
Back  Top

6-27(2015-12-12) PostDoc Position in Speech Recognition at Saarland University, Germany

PostDoc Position in Speech Recognition at Saarland University

 

The Spoken Language Systems group at Saarland University in Germany anticipates the availability of a PostDoc position in the area of speech recognition. This position is part of the Horizon 2020 project MALORCA, a research project on long-term unsupervised adaptation of the acoustic and language models of a speech recognition system. The research will be carried out together with a European consortium of high-profile research institutes and companies.
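
On the language-model side, one simple form of the adaptation targeted here is linear interpolation of a background model with a model estimated from automatic transcripts. The toy unigram sketch below is purely illustrative; the counts and the interpolation weight are invented, not project data.

    from collections import Counter

    def unigram(counts):
        # Maximum-likelihood unigram probabilities from raw counts.
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()}

    # Hypothetical counts: a background model and one from automatic transcripts.
    background = unigram(Counter({'cleared': 5, 'to': 20, 'land': 5, 'taxi': 10}))
    automatic = unigram(Counter({'cleared': 8, 'to': 10, 'land': 7}))

    lam = 0.7  # background weight; tuned on held-out data in practice
    vocab = set(background) | set(automatic)
    adapted = {w: lam * background.get(w, 0.0) + (1 - lam) * automatic.get(w, 0.0)
               for w in vocab}

    print({w: round(p, 3) for w, p in adapted.items()})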

Requirements:

  • Degree in computer science, electrical engineering or a related discipline.

  • Excellent programming skills in C/C++, Python and/or Perl.

  • Experience with Linux and bash scripting.

  • Very good math background.

  • Very good oral and written English communication skills.

  • Interest in speech recognition research.

 

Salary:

  • The position is fully funded, with a salary in the range of 45,000 to 55,000 Euros per year, depending on the qualifications and professional experience of the successful candidate.

  • The position is full time, for two years.

  • Starting date is April 1st, 2016.

 

Research at Saarland University:

Saarland University is one of the leading European research sites in computational linguistics and offers an active, stimulating research environment. Close working relationships are maintained between the Departments of Computational Linguistics and Computer Science. Both are part of the Cluster of Excellence, which also includes the Max Planck Institutes for Informatics (MPI-INF) and Software Systems (MPI-SWS) and the German Research Center for Artificial Intelligence (DFKI).
Each application should include:

  • Curriculum Vitae, including relevant research experience and a list of publications (if applicable).

  • Transcript of records (BSc/MSc, plus Ph.D. if applicable).

  • Statement of interest (letter of motivation).

  • Names of two references.

  • Any other supporting information or documents.

Applications (all documents in PDF format, merged into a single file) should be sent no later than Sunday, January 10th, to: sekretariat@LSV.Uni-Saarland.De

 

Further inquiries regarding the project should be directed to:

Youssef.Oualil@LSV.Uni-Saarland.De

or

Dietrich.Klakow@LSV.Uni-Saarland.De
Back  Top