ISCA Services

Post-doctoral positions at LIMSI-CNRS, Paris Saclay (Paris Sud) University.

LIMSI is a multi-disciplinary research unit that addresses the automatic processing of human language for a range of tasks. LIMSI invites applications for one 1-year postdoctoral position in Natural Language Processing. The topic is as follows: dialogue management in a human-machine dialogue system where the system plays the role of a patient during a medical consultation with a doctor.

CONTEXT
The postdoctoral fellow will contribute to the following project: Patient Genesys (http://www.patient-genesys.com/). In the framework of continuous medical education, the goal of the project is to design and develop a framework for virtual patient-doctor consultation. This is a collaborative project including a hospital and small and medium enterprises.

JOB REQUIREMENTS
- Ph.D. in Computer Science, Natural Language Processing, Computational Linguistics
- Solid programming skills
- Strong publication record
- A good command of French is a plus
- Knowledge of medical terminologies and ontologies is a plus

ADDITIONAL INFORMATION
Net salary: between 2000 and 2400 € per month according to experience
Benefits: LIMSI offers a generous benefit package including health insurance and 44 days of vacation per year.
Duration: 12 months, renewable depending on performance and funding availability
Start date: 1st October 2014
Location: Orsay, greater Paris area, France

TO APPLY
Please send:
* a cover letter
* a curriculum vitae, including a list of publications
* the names and contact information of at least two referees
to all of the following:
Sophie Rosset (rosset@limsi.fr)
Anne-Laure (annlor@limsi.fr)
Pierre Zweigenbaum (pz@limsi.fr)

Application deadline: October 15th, 2014
Applications will be examined in the following week.

ABOUT LIMSI-CNRS
LIMSI is a laboratory of the French National Center for Research (CNRS), a leading research institution in Europe.
LIMSI is a multi-disciplinary research unit that covers a number of fields from thermodynamics to cognition, encompassing fluid mechanics, energetics, acoustic and voice synthesis, spoken language and text processing, vision, visualisation and perception, virtual and augmented reality. LIMSI hosts about 200 researchers, professors, research support staff and graduate students. It is located in a green area about 30 minutes south of Paris.

BC: 69721 (draft)
BC: 69724 (draft)

Istituto Italiano di Tecnologia (http://www.iit.it) is a private Foundation with the objective of promoting Italy's technological development and higher education in science and technology. Research at IIT is carried out in highly innovative scientific fields with state-of-the-art technology.

The iCub Facility (http://www.iit.it/icub) and the Robotics, Brain and Cognitive Sciences (http://www.iit.it/rbcs) departments are looking for 2 post-docs to work on the H2020 project EcoMode, funded by the European Commission under the H2020-ICT-2014-1 call (topic ICT-22-2014 – Multimodal and natural computer interaction).

Job Description: Robust automatic speech recognition in realistic environments for human-robot interaction, where speech is noisy and distant, is still a challenging task. Vision can be used to increase speech recognition robustness by adding complementary information related to speech production. In this project, visual information will be provided by an event-driven (ED) camera. ED vision sensors transmit information as soon as a change occurs in their visual field, achieving very high temporal resolution, coupled with an extremely low data rate and automatic segmentation of significant events. In an audio-visual speech recognition setting, ED vision can not only provide additional visual information to the speech recognizer, but also drive the temporal processing of speech by locating (in the temporal dimension) visual events related to speech production landmarks.
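As a rough illustration of the "transmit only on change" principle described above, the following is a minimal sketch of a single event-driven pixel. It assumes a temporal-contrast model in which an event fires whenever log-intensity has moved by a fixed threshold since the last event; the function name, the threshold value and the input format are hypothetical and are not taken from the project.

```python
import math


def temporal_contrast_events(samples, threshold=0.3):
    """Emit (time, polarity) events for one simulated pixel.

    samples: iterable of (time, intensity) pairs, intensity > 0.
    threshold: log-intensity change that triggers an event
               (illustrative value, not a sensor specification).
    """
    events = []
    reference = None  # log-intensity at the last emitted event
    for t, intensity in samples:
        level = math.log(intensity)
        if reference is None:
            # First sample only initialises the reference level.
            reference = level
            continue
        # A large change produces several events of the same polarity.
        while abs(level - reference) >= threshold:
            polarity = 1 if level > reference else -1
            reference += polarity * threshold
            events.append((t, polarity))
    return events


if __name__ == "__main__":
    static_scene = [(0, 1.0), (1, 1.0), (2, 1.0)]
    brightening = [(0, 1.0), (1, 2.0), (2, 2.0)]
    print(temporal_contrast_events(static_scene))
    print(temporal_contrast_events(brightening))
```

A static input yields no output at all, which is where the low data rate comes from, while the timing of each event is tied to the moment of change rather than to a frame clock.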

The goal of the proposed research is the exploitation of highly dynamical information from ED vision sensors for robust speech processing. The temporal information provided by EDC sensors will make it possible to experiment with new models of speech temporal dynamics based on events, as opposed to the typical fixed-length segments (i.e. frames).
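The contrast between fixed-length frames and event-based segments can be sketched as follows. This is only an illustrative assumption about what "segmentation driven by events" could look like at the level of sample indices; the function names and the example event positions are hypothetical.

```python
def fixed_frames(n_samples, frame_len, hop):
    """Return (start, end) index pairs for fixed-length, possibly overlapping frames."""
    return [(start, start + frame_len)
            for start in range(0, n_samples - frame_len + 1, hop)]


def event_segments(n_samples, event_indices):
    """Return (start, end) index pairs whose boundaries fall on event positions.

    event_indices: sample indices at which a (hypothetical) visual event,
    such as a lip closure, was detected. Out-of-range indices are ignored.
    """
    bounds = sorted({i for i in event_indices if 0 < i < n_samples})
    edges = [0] + bounds + [n_samples]
    return list(zip(edges[:-1], edges[1:]))


if __name__ == "__main__":
    # Frames: same length everywhere, regardless of what the signal does.
    print(fixed_frames(10, frame_len=4, hop=2))
    # Segments: variable length, boundaries placed where events occurred.
    print(event_segments(10, event_indices=[3, 7]))
```

With fixed frames the segment length is a design constant, whereas with event-based segmentation it follows the rhythm of the detected events, so slower or irregular speech naturally produces longer or uneven segments.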

In this context, we are looking for 2 highly motivated post-docs, one tackling vision (Research Challenge 1) and the other speech processing (Research Challenge 2), as outlined below:

Research Challenge 1 (vision @ iCub facility – BC 69721): the post-doc will mostly work on the detection of features from event-driven cameras that are instrumental for improving speech recognition (e.g. lip closure, protrusion, shape). The temporal features extracted from the visual signal will be used for crossmodal event-driven speech segmentation that will drive the processing of speech. In an attempt to increase robustness to acoustic noise and atypical speech, acoustic and visual features will be combined to recover phonetic gestures of the inner vocal tract (articulatory features).

Research Challenge 2 (speech processing @ RBCS – BC 69724): the post-doc will mainly develop a novel speech recognition system based on visual, acoustic and (recovered) articulatory features, targeted at users with mild speech impairments. The temporal information provided by EDC sensors will make it possible to experiment with new strategies to model the temporal dynamics of normal and atypical speech. The main outcome of the project will be an audio-visual speech recognition system that robustly recognizes the most relevant commands (key phrases) delivered by users to devices in real-world usage scenarios.

The resulting methods for improving speech recognition will be exploited for the implementation of a tablet with robust speech processing. Given the automatic adaptation of the speech processing to the speech production rhythm, the speech recognition system will target speakers with mild speech impairments, specifically subjects with atypical speech flow and rhythm, typical of some disabilities and of the ageing population. The same approach will then be applied to the humanoid robot iCub to improve its interaction with humans in cooperative tasks.

Skills: We are looking for highly motivated people and inquisitive minds with the curiosity to use a new and challenging technology that requires a rethinking of visual and speech processing to achieve a high payoff in terms of speed, efficiency and robustness. The candidates we are looking for should also have the following additional skills:

PhD in Computer Science, Robotics, Engineering (or equivalent) with a background in machine learning, signal processing or related areas;
ability to analyze, improve and propose new algorithms;
good knowledge of the C and C++ programming languages, with proven experience.

Team-work, PhD tutoring and general lab-related activities are expected.

An internationally competitive salary depending on experience will be offered.

Please note that these positions are pending the signature of the grant agreement with the European Commission (expected start date in early 2015).

How to apply:
Challenge 1: Send applications and informal enquiries to jobpost.69721@icub.iit.it
Challenge 2: Send applications and informal enquiries to jobpost.69724@rbcs.iit.it

The application should include a curriculum vitae listing all publications and PDF files of the most representative publications (maximum 2). If possible, please also give the names of three independent referees.

Presumed Starting Date:
Challenge 1: January 2015 (but later starts are also possible).
Challenge 2: June 2015 (but later starts are also possible).
Evaluation of the candidates starts immediately and officially closes on November 10th, 2014, but will continue until the position is filled.

References:
Lichtsteiner, P., Posch, C., & Delbruck, T. (2008). A 128×128 120 dB 15 μs latency asynchronous temporal contrast vision sensor. IEEE Journal of Solid-State Circuits, 43(2), 566-576.
Rea, F., Metta, G., & Bartolozzi, C. (2013). Event-driven visual attention for the humanoid robot iCub. Frontiers in Neuroscience, 7.
Benosman, R., Clercq, C., Lagorce, X., Ieng, S.-H., & Bartolozzi, C. (2014). Event-based visual flow. IEEE Transactions on Neural Networks and Learning Systems, 25(2), 407-417. doi: 10.1109/TNNLS.2013.2273537
Potamianos, G., Neti, C., Gravier, G., Garg, A., & Senior, A. W. (2003). Recent advances in the automatic recognition of audiovisual speech. Proceedings of the IEEE, 91, 1306-1326.
Glass, J. (2003). A probabilistic framework for segment-based speech recognition. Computer Speech and Language, 17, 137-152.
Badino, L., Canevari, C., Fadiga, L., & Metta, G. (2012). Deep-level acoustic-to-articulatory mapping for DBN-HMM based phone recognition. In IEEE SLT 2012, Miami, Florida.

MIT Postdoctoral Associate Opportunity

The Spoken Language Systems Group in the MIT Computer Science and Artificial Intelligence Laboratory is seeking a Postdoctoral Associate to participate in research and development of deep learning methods applied to the problems of multilingual speech recognition, dialect recognition and speaker recognition. This position is expected to start in early 2015 for a one-year period, with the possibility of extension.

A Ph.D. in Computer Science or Electrical Engineering is required. Individuals must have at least four years of hands-on, computer-based experience in algorithm and system development in speech processing related to speech recognition, speaker recognition or language recognition, and must have strong programming skills (especially, but not limited to, C++ and Python) and familiarity with the Linux environment. This individual must be able to work both independently and cooperatively with others, and have good communication and writing skills.

Interested candidates should send their CVs to Jim Glass, glass@mit.edu