ISCApad #167 |
Sunday, May 13, 2012 by Chris Wellekens |
6-1 | (2011-11-02) Postdoc in physiological data acquisition and processing in speech production- Laboratoire de phonetique et Phonologie (UMR 7018) Université de Paris 3 We offer a one year postdoc position in physiological data acquisition and processing in speech production within the 10-year LABEX project 'Empirical foundations of linguistics' that started in 2011. The position is based in Paris, at the Laboratoire de phonetique et Phonologie (UMR 7018) from the Université de Paris 3. This postdoctoral position is linked to the strand « Phonetic and phonological complexity - boosting empirical approaches » of the LABEX project, and will be supervised by Lise CREVIER-BUCHMAN.
The successful candidate will contribute to physiological and acoustic data collection, and will develop signal processing methods and tools to analyse and assess voice and speech in normal and pathological populations. Responsibilities will include selecting and recording acoustic and articulatory data (ultrasound, EGG, video with high-speed imaging and aerodynamics) and carrying out acoustic and articulatory analyses. She/he will undertake experiments of two types. The first will consist in defining parameters for acoustic and physiological data collection and designing analysis methods. The second will consist in attempting to model phonatory and articulatory behaviours for i) normal standard and non-standard productions, such as whispered voice or singing styles, and ii) pathological productions. The aim is to better understand deviant speech production mechanisms and compensation capabilities.
The candidate will have a PhD in engineering or signal processing (audio and video) with, if possible, some knowledge of experimental phonetics and speech sciences. More information on the position can be found at http://www.labex-efl.org/?q=en/hiring/2012 or by contacting Lise CREVIER-BUCHMAN (lise.crevier@univ-paris3.fr).
Application deadline is April 10th 2012.
Candidates should send a CV with a list of publications, a website where publications can be found, and the names of two referees to Martine Adda-Decker (madda@limsi.fr).
| |||||
6-2 | (2011-11-05) Postdoc position at Northwestern University Applications are invited for an anticipated postdoctoral position at Northwestern University in the area of computational modeling of the lexicon in many different languages. The postdoctoral fellow will work with linguistics professor Janet Pierrehumbert as a member of an interdisciplinary team that includes outstanding computer scientists and electrical engineers at four other American research institutions. The Northwestern component of the collaborative project will focus on novel methods for bootstrapping lexical networks from sparse statistical samples. The ideal candidate will have a Ph.D. in computational linguistics, EECS, applied mathematics, or physics, and the demonstrated ability to create and document robust software. Familiarity with C, Python and Unix/Linux environments is a plus.
Negotiable start date in early 2012. Competitive salary scale comparable to postdoctoral positions in computer science. Initial appointment for one year with renewal subject to performance and availability of funding. Please apply by email with a statement of research interests and a CV including the names and contact information for three references. Send the application to: linguistics@northwestern.edu using the subject line: Pierrehumbert postdoc. The application should arrive by December 10, 2011 to receive fullest consideration.
Northwestern is an Equal Opportunity Employer, and applications are encouraged from all qualified junior researchers. Hiring is contingent on eligibility to work in the United States. Non-US-residents will receive consideration for the position.
| |||||
6-3 | (2011-11-09) Postdoc at Aix-en-Provence, France Postdoctoral position in Aix-en-Provence (France), starting February 2012, for one year, renewable once. ANR MINDPROGEST: The role of theory of mind in the construction of meaning. The general objective of the ANR MINDPROGEST project is to determine, for French, what role the attribution of mental states (intention, belief, knowledge) to others, also called theory of mind (ToM) or mindreading, plays in the construction of meaning. Since conversation is the fundamental site of language use, a major challenge will be to determine this role of ToM in the context of social interaction in individuals with schizophrenia and in people without pathology. Prosody is known to play a major role in the construction of meaning, notably in the expression and recognition of intentions (via its attitudinal function). The interest in meaning, increasingly evident today in prosody research, will translate into a more specific exploration of the intonational dimension of prosody. This project will therefore study intonation contours insofar as they convey information about the attribution of mental states to others and the recognition of those states, with a view to constructing meaning. A multidisciplinary approach (linguistics, psychology, neuroscience and mental health) will be adopted to study the contribution of linguistic and cognitive mechanisms to the construction of meaning. Keywords: prosody, intonational meaning, intonation contours, pragmatics, theory of mind, schizophrenia. Candidate profile: For this multidisciplinary project, the candidate will hold a PhD in prosody (preferably on the prosody of French). Native or near-native command of French is required.
A good knowledge of phonological approaches to intonation and of the issue of intonational meaning, as well as command of experimental methods for investigating these questions, will be decisive. The application should include: a) a CV b) a cover letter describing the candidate's research interests c) 2 letters of recommendation and/or the names and contact details of 2 referees. Funding: ANR (Agence Nationale de la Recherche) grant Salary: according to CNRS scales Contact: Maud Champagne-Lavau email: maud.champagne-lavau@lpl-aix.fr Laboratoire Parole et Langage UMR 6057, CNRS 5 Av. Pasteur B.P. 80975 13604 Aix-en-Provence, France Telephone: (33) 04 88 78 57 07 Laboratoire Parole et Langage http://www.lpl.univ-aix.fr/ http://lpl-aix.fr/person/bertrand http://lpl-aix.fr/person/portes
| |||||
6-4 | (2011-11-15) Postdoc position at Institut Eurecom, Sophia Antipolis, France Post-Doctoral Research Position (M/F) at EURECOM, France Title: Speaker modeling, recognition and diarization Department: Multimedia Communications URL: http://www.eurecom.fr/mm.en.htm Start date: January/February 2012 Duration: 18-month contract Description: EURECOM's Multimedia Communications Department invites applications for a full-time, 18-month post-doctoral research position (M/F) to work in speaker modeling, recognition and diarization within its Speech and Audio Processing Research Group. The group participates in a growing number of national, European and direct, industrial research projects. The performance of text-independent speaker recognition and speaker diarization systems typically degrades when only limited speaker-specific training data is available. Under such circumstances speaker models can be biased or poorly normalized across phones. Linguistic mismatch in test or clustering data leads to degraded speaker recognition and diarization performance respectively. This work will investigate new normalisation and marginalization approaches to improve speaker modeling and hence the robustness of recognition and diarization systems to nuisance and linguistic variation. The position is linked to the on-going Adaptable Ambient LIving ASsistant (ALIAS) project which is funded through the joint-national/European Ambient Assisted Living (AAL) programme. The ALIAS project, which involves several European academic and industrial partners, aims to develop a mobile robot system that interacts with elderly users, monitors and provides cognitive assistance in daily life, and promotes social inclusion by creating connections to people and events in the wider world.
You will be required to work on the research, development and integration of speaker diarization, localization and recognition systems to identify and track different users in the context of an adaptable speech interface. The candidate is expected to represent EURECOM at technical project meetings and related international scientific events and conferences. Requirements: The successful candidate will have been awarded a PhD degree in a relevant field of speech processing prior to their joining EURECOM. You will have a strong research track record with significant publications at leading international conferences. Experience of collaborative research projects at the European level is desirable. You will be highly motivated to undertake challenging, applied research and have excellent English language speaking and writing skills. French language skills are a bonus. Applications: Please send to the address below (i) a one page statement of research interests and motivation, (ii) your CV and (iii) contact details for two referees (preferably one from your PhD or most recent research supervisor). The screening of applications will begin immediately and the search will continue until the position is filled. As part of its policy of promoting gender diversity, especially in computer science and scientific professions, EURECOM encourages women to apply for positions predominantly held by men. 
Contact: Dr Nicholas Evans Postal address: 2229 Route des Crêtes BP 193, F-06904 Sophia Antipolis cedex, France Email address: evans@eurecom.fr Web address: http://www.eurecom.fr/main/institute/job.en.htm Phone: +33/0 4 93 00 81 14 Fax: +33/0 4 93 00 82 00 EURECOM is a graduate school and a research center in Communication Systems, located in the Sophia Antipolis technology park, in close proximity to a large number of research units of leading multinational corporations in the telecommunications, semiconductor and biotechnology sectors, as well as other outstanding research and teaching institutions. EURECOM was founded in 1991 by TELECOM ParisTech (Ecole Nationale Supérieure des Télécommunications) and EPFL (Swiss Federal Institute of Technology in Lausanne) in the form of a consortium combining academic and industrial partners. EURECOM deploys its expertise around three major fields: Networking and security, Multimedia Communications and Mobile Communications and has a strong international scope and strategy. EURECOM is particularly active in research in its areas of excellence while also training a large number of doctoral candidates. Its contractual research is recognized across Europe and contributes largely to its budget.
| |||||
6-5 | (2011-11-15) 2-4 PhD positions in Speech Technology and Communication at KTH 2-4 PhD positions in Speech Technology and Communication at KTH The goal of the positions is to contribute to the research foundation for speech technology in tomorrow's conversational systems. Anticipated specializations: 1. Conversational human-robot interaction 2. Speech synthesis for human-like dialog 3. Avatars that interact through speech, gesture or sign language 4. Studies of situation-sensitive dialogue behavior Links to more detailed descriptions on KTH's main website: http://www.kth.se/om/work-at-kth/vacancies/phd-students-in-speech-technology-and-communication-1.205567?l=en_UK More information about our department can be found here: http://www.speech.kth.se/
| |||||
6-6 | (2011-11-15) Research Associate/Research Fellow - Natural Speech Technology Research Associate/Research Fellow - Natural Speech Technology Fixed-term for 3 years Salary: Research Associate - Grade 7: £28,251 to £35,788 per annum Research Fellow - Grade 8: £36,862 to £44,016 per annum Closing Date 6 December 2011 The Speech and Hearing research group in the Department of Computer Science (SPandH) is a partner in the EPSRC Programme Grant in Natural Speech Technology (NST), in collaboration with the Universities of Edinburgh and Cambridge. NST is a large and ambitious project, aiming to significantly advance the state-of-the-art in speech technology by making it more natural, approaching human levels of reliability, adaptability and conversational richness. The total duration of the NST programme is 5 years and it is organised in themes that cover a diverse set of collaborative studies in speech recognition and synthesis. Applications, practical demonstrations and interaction with technology users in industry are also part of the programme. The successful applicant will work on speech recognition research topics under the NST programme at Sheffield. SPandH has developed state-of-the-art automatic speech recognition systems that have repeatedly shown best performance in international competitions (U.S. NIST) and are publicly available (www.webasr.org). In clinical applications, SPandH has introduced a user-driven methodology for personalised speech technology. Together, these advances form the foundation for Sheffield work within NST. Excellent computing resources are available to allow ambitious experiments with innovative ideas. This is an opportunity to work in a well-connected international team with world-leading reputations in speech recognition research and in collaboration with outstanding groups at the Centre for Speech Technology Research at Edinburgh and the Machine Intelligence Lab at Cambridge University. 
Applicants should have a PhD (or have equivalent experience) in a related subject area. Applicants are required to have a good track record in research on speech recognition and/or machine learning topics. Experience in one or more of the following areas will be an advantage: statistical machine learning, pattern processing, signal processing, acoustic or language modelling for automatic speech recognition. Solid knowledge of Unix-type operating systems and programming in C/C++ is required. For an appointment at Research Fellow level, experience in research management is essential as candidates are expected to take a leading role in site scientific management. For further information see http://www.jobs.ac.uk/job/ADM425/research-associate-research-fellow For informal enquiries please contact Thomas Hain (t.hain@dcs.shef.ac.uk) or Phil Green (p.green@dcs.shef.ac.uk).
| |||||
6-7 | (2011-11-15) Three Funded Studentships in Speech Technology & Machine Learning at Sheffield Three Funded Studentships in Speech Technology & Machine Learning at Sheffield Up to three funded Ph.D. studentships are available in the Speech and Hearing Research Group in the Department of Computer Science, University of Sheffield UK. These studentships are supported by the EPSRC programme grant in Natural Speech Technology (http://www.natural-speech-technology.org/) and associated funding sources. Project topics will be defined within the following areas: - Techniques for unsupervised learning from continuous streams of speech data, - Models that adapt to new scenarios and speaking styles, - Recognisers that can detect 'who spoke what, when, and how' in any acoustic environment and task domain, - Personal Adaptive Listening systems for people with communication disorders. For more details, visit http://www.jobs.ac.uk/job/ADM394/phd-research-studentships-natural-speech-technology/ For informal enquiries please contact Thomas Hain (t.hain@dcs.shef.ac.uk) or Phil Green (pdg@dcs.shef.ac.uk).
| |||||
6-8 | (2011-11-24) PhD student in Sonification Research at the Institute for Electronic Music and Acoustics of the University of Music and Performing Arts Graz (Austria) PhD student in Sonification Research
| |||||
6-9 | (2011-12-01) Ph.D. Program Carnegie Mellon | PORTUGAL Ph.D. Program Carnegie Mellon | PORTUGAL in the area of Language and Information Technologies Deadline: December 15, 2011 The Language Technologies Institute (LTI) of the School of Computer Science at Carnegie Mellon University offers a dual degree Ph.D. Program in Language and Information Technologies in cooperation with Portuguese Universities. This Ph.D. program is part of the Carnegie Mellon | Portugal Partnership. The Language Technologies Institute, a world leader in the areas of speech processing, language processing, information retrieval, machine translation, machine learning, and bio-informatics, was formed 20 years ago. The breadth of language technologies expertise at LTI enables new research in combinations of the core subjects, for example, in speech-to-speech translation, spoken dialog systems, language-based tutoring systems, and question/answering systems. The Portuguese consortium of Universities includes (but is not limited to) the Spoken Language Systems Lab (L2F) of INESC-ID Lisbon/IST, the University of Lisbon (FLUL), the University of Beira Interior (UBI) and the University of Algarve (UALG). These Universities share expertise in the same language technologies as LTI, although with a strong focus on processing the Portuguese language. The LT program involves 1 or 2 new PhD students every year. Each Ph.D. student will receive a dual degree from LTI and the selected Portuguese University, being co-supervised by one advisor from each institute, and spending approximately half of the 5-year doctoral program at each institute. The academic part will be done during the first 2 years, including a maximum of 8 courses, with a proper balance of focus areas (Linguistics, Computer Science, Statistical/Learning, Task Orientation). The remaining 3 years of the doctoral program will be dedicated to research.
The thesis topic will be in one of the research areas of the cooperation program, defined by the two advisors. Two multilingual topics have been identified as primary research areas (although other areas of human language technologies may also be considered): computer assisted language learning (CALL) and speech-to-speech machine translation (S2SMT). The doctoral students will be involved in one of the collaborative projects between LTI and the Portuguese Universities aimed at building real HLT systems. The scholarship will be funded by the Foundation for Science and Technology (FCT), Portugal. How to Apply The application deadline for the LT Ph.D. program in the scope of the CMU-Portugal partnership is December 15. Students interested in the dual doctoral program must apply by filling in the corresponding form at the LTI webpage. For more information about the joint degree doctoral program in LT, send email to the coordinators of the program: • Isabel.Trancoso at inesc-id dot pt • LTI_Portugal_Admissions at cs dot cmu dot edu The applications will be screened by a joint committee formed by representatives of LTI and of the Portuguese Universities. The candidates should indicate their scores in GRE and TOEFL tests. Despite the particular focus on the Portuguese language, applications are not in any way restricted to native or non-native speakers of Portuguese. Program Highlights REAP.PT project http://call.l2f.inesc-id.pt/reap.public/ PT-STAR project http://pt-star.l2f.inesc-id.pt/ptstar/ See also Priberam Machine Learning Lunch Seminars http://www.priberam.pt/Empresa/Inovacao/Seminarios.aspx Lisbon Machine Learning Summer School http://lxmls.it.pt/
| |||||
6-10 | (2011-12-01) Post-doctoral position in Cognitive Neuroscience Post-doctoral position in Cognitive Neuroscience
| |||||
6-11 | (2011-12-23) Postdoctoral position in speech processing, LIA Avignon France Postdoctoral position in speech processing. Avignon, LIA. Application deadline: 15 February 2012 As part of the ANR-funded "DECODA" project (http://decoda.univ-avignon.fr), the LIA wishes to recruit a postdoctoral researcher in the field of automatic speech processing. The DECODA project deals with the mining of audio data from large call centres. It is led by the LIA, the LIF, the RATP and the company Sonear.
The LIA (http://lia.univ-avignon.fr) is an Equipe d'Accueil (host research team, EA no. 4128) that brings together the faculty members of the Université d'Avignon et des Pays de Vaucluse (UAPV) belonging to section 27 of the CNU, together with engineers, doctoral students and Master's interns during the period devoted to their research work. The postdoctoral researcher will join the laboratory's language research area and will work with the members of that area involved in the DECODA project. The position, lasting 12 months (with possible extensions), is available from 15 January 2012. The net salary is between 2000 and 2400 euros per month, depending on the candidate's experience.
| |||||
6-12 | (2011-12-23) Researcher and Research Software Engineer Positions , AT&T Labs - Research AT&T Labs - Research
| |||||
6-13 | (2011-12-24) Final-year internship with a view to hiring: Development Engineer Final-year internship with a view to hiring: Development Engineer
| |||||
6-14 | (2012-01-04) Audio Indexing Researcher W/M position at IRCAM – 3DTV project Audio Indexing Researcher W/M position at IRCAM – 3DTV project Starting : January - February , 2012 Duration : 18 months
Introduction to IRCAM IRCAM is a leading non-profit organization associated with the Centre Pompidou, dedicated to music production, R&D and education in acoustics and music. It hosts composers, researchers and students from many countries cooperating in contemporary music production and in scientific and applied research. The main topics addressed in its R&D department include acoustics, audio signal processing, computer music, interaction technologies and musicology. IRCAM is located in the centre of Paris near the Centre Pompidou, at 1, Place Igor Stravinsky 75004 Paris.
Introduction to the 3DTVS project The goal of the 3DTVS project is to devise scalable 3DTV AV content description, indexing, search and browsing methods across open platforms, using mobile and desktop user interfaces, and to incorporate such functionalities in 3D audiovisual content archives. 3D multichannel audio analysis targets audio event detection based on fusion techniques that combine the feature analysis performed in the individual channels with source localization and separation algorithms for the detection of moving audio sources. The results will be used in 3D audio/cross-modal indexing and retrieval. Multimodal 3D audiovisual content analysis will build on the results of 3D video and audio analysis. 3DTV content description and search mechanisms will be developed to enable fast replies to semantic queries.
Role of IRCAM in the 3DTVS project In the 3DTVS project, IRCAM is in charge of the research and development of technologies related to - Audio event detection using multi-channel audio scenes - Speaker diarization - Segmentation into movie scenes from the audio signal - Sound source separation, localization and identification
Position description The hired researcher will be in charge of the development of technologies related to: - Audio event detection using multi-channel audio scenes
The researcher will also collaborate with the development team and participate in the project activities (evaluation, meetings, specifications).
Required profiles
Salary According to background and experience
Applications Please send an application letter together with your resume and any suitable information addressing the above issues preferably by email to: peeters_a_t_ircam dot fr with cc to vinet_a_t_ircam dot fr, roebel_at_ircam_dot_fr
IRCAM is recruiting a Researcher (M/F) on a full-time, 18-month fixed-term contract for the 3DTVs project. Position available from 1 January 2012
Introduction to IRCAM IRCAM is a non-profit association, associated with the Centre National d'Art et de Culture Georges Pompidou, whose missions include research, creation and teaching activities around twentieth-century music and its relations with science and technology. Within its R&D department, specialised teams carry out research and software development in the fields of acoustics, audio signal processing, interaction technologies, computer music and musicology. IRCAM is located in the centre of Paris near the Centre Georges Pompidou, at 1, Place Stravinsky 75004 Paris.
Introduction to the 3DTVs project The goal of the 3DTVs project is to design scalable descriptions of 3DTV content, along with indexing, search and browsing methods across open platforms, using mobile and desktop user interfaces, and to integrate such 3D functionalities into audiovisual content archives. 3D multichannel audio analysis targets audio event detection based on fusion techniques that combine the audio analysis performed in the individual channels with source localization and separation algorithms for detecting the movements of audio sources. The results will be used for 3D audio and cross-modal indexing as well as for retrieval. Multimodal 3D audio/video indexing of audiovisual content will build on the results of 3D video and 3D audio indexing. Content description and search methods will be developed to enable fast responses to semantic queries.
Role of IRCAM in the 3DTVs project In the 3DTVs project, IRCAM is in charge of the research and development of technologies related to - Audio event detection using multi-channel audio scenes - Segmentation into speaker turns - Segmentation of movie scenes from the audio - Sound source separation, localization and identification
Duties The researcher will be in charge of developing technologies related to: - Audio event detection using multi-channel audio scenes The researcher will also collaborate with the development team and participate in the project activities (evaluation, meetings, specifications).
Required profile
Salary According to background and professional experience
Applications Please send a cover letter and a CV detailing your level of experience/expertise in the areas mentioned above (along with any other relevant information) to peeters_a_t_ircam dot fr with copy to vinet_a_t_ircam dot fr, roebel_at_ircam_dot_fr
| |||||
6-15 | (2012-01-10) TENURE TRACK OR TENURED POSITION IN NATURAL LANGUAGE PROCESSING, INSTITUTE OF INFORMATION SCIENCE, ACADEMIA SINICA (TAIWAN) TENURE TRACK OR TENURED POSITION IN NATURAL LANGUAGE PROCESSING, INSTITUTE OF INFORMATION SCIENCE, ACADEMIA SINICA (TAIWAN)
The Institute of Information Science (http://www.iis.sinica.edu.tw), Academia Sinica (http://www.sinica.edu.tw), Taiwan, R.O.C. is recruiting junior and senior research fellows (equivalent in rank to assistant, associate, and full professors) in the area of natural language processing (NLP), in particular Chinese language processing. All candidates for the position should have a Ph.D. degree in computer science with a specialty in NLP, machine learning or semantic processing, together with a good research background and publication record.
For additional information, please contact Deputy Director of the Institute, Dr. Hsin-Min Wang (whm@iis.sinica.edu.tw).
| |||||
6-16 | (2012-01-10) Research Assistant or Postdoctoral Position - Machine Translation at KIT Karlsruhe, Germany Karlsruhe Institute of Technology (KIT) is the result of the merger of the University of Karlsruhe and the Research Center, Karlsruhe. It is a unique institution in Germany, which combines the mission of a university with that of a large-scale research center of the Helmholtz Association. With 8000 employees and an annual budget of EUR 650 million, KIT is one of the largest research and education institutions worldwide. At the Institute for Anthropomatics, Lehrstuhl Prof. Waibel, several positions are to be filled: Research Assistant or Postdoctoral Position in the field of Machine Translation with a salary according to TV-L E13. The responsibilities include basic research in the area of statistical machine translation, as well as participation in application-targeted research projects in the area of multimodal dialog-centered human-machine interaction. The candidate is expected to participate in the design, development, and exploration of innovative methods, algorithms and techniques to be successfully integrated in humanoid robots. Within the framework of the international center for advanced communication technology (interACT), our center operates in three locations: Karlsruhe Institute of Technology, Germany, and Carnegie Mellon University in Pittsburgh and Silicon Valley, USA. This offers a first-class research environment, working in advanced research projects at two of the top computer science research universities in the US and Europe. International joint and collaborative research at and between our centers is common and encouraged, and offers great international exposure and activity. The focus of our research is to develop better communication and computing services that take advantage of an understanding of the human context and activities.
One of our current research interests is the field of simultaneous translation of lectures, video conferencing, and monologues in general. We seek qualified candidates with an M.S. or a PhD degree in Computer Science, Electrical Engineering, or related fields. The position offers the opportunity to work toward a Ph.D. degree or toward the advancement of an academic career. A record of academic achievements, relevant experience and knowledge in relevant areas, and excellent programming skills are expected. KIT is pursuing a gender equality policy. Women are therefore particularly encouraged to apply. If equally qualified, handicapped applicants will be preferred. Questions may be directed to E-Mail: sebastian.stueker@kit.edu, http://isl.anthropomatik.kit.edu The application should be sent to Professor Alex Waibel, Institut für Anthropomatik, Karlsruhe Institute of Technology, Postfach 6980, 76049 Karlsruhe, Germany KIT – University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association.
| |||||
6-17 | (2012-01-10) Research Assistant or Postdoctoral Position-Automatic Speech Recognition at KIT Karlsruhe, RFA Karlsruhe Institute of Technology (KIT) is the result of the merger of the University of Karlsruhe and the Research Center, Karlsruhe. It is a unique institution in Germany, which combines the mission of a university with that of a large-scale research center of the Helmholtz Association. With 8000 employees and an annual budget of EUR 650 million, KIT is one of the largest research and education institutions worldwide. At the Institute for Anthropomatics, Lehrstuhl Prof. Waibel several positions are to be filled: Research Assistant or Postdoctoral Position in the field of Automatic Speech Recognition with a salary according to TV-L E13. The responsibilities include basic research in the area of automatic speech recognition, as well as participation in application targeted research projects in the area of multimodal dialog centered Human-machine interaction. The candidate is expected to contribute to the state-of-the art of modern recognition systems. He/she will participate in the design, development, and exploration of innovative methods, algorithms and techniques towards acoustic and language modeling leading to improvements in recognizer performance. Within the framework of the international center for advanced communication technology (interACT) , our center operates in three locations, Karlsruhe Institute of Technology, Germany and at Carnegie Mellon University, Pittsburgh and Silicon Valley, USA. This offers a first class research environment, working in advanced research projects at two of the top computer science research universities in the US and Europe. International joint and collaborative research at and between our centers is common and encouraged, and offers great international exposure and activity. 
Our current research interests include the simultaneous translation of lectures, video conferencing and general monologues, keyword spotting in a multitude of languages, multilingual speech recognition, and speech recognition for under-resourced languages. We seek qualified candidates with an M.S. or a Ph.D. degree in Computer Science, Electrical Engineering, or related fields. The position offers the opportunity to work toward a Ph.D. degree or toward the advancement of an academic career. A record of academic achievements, relevant experience and knowledge in relevant areas, and excellent programming skills are expected. KIT is pursuing a gender equality policy. Women are therefore particularly encouraged to apply. If equally qualified, handicapped applicants will be preferred. Questions may be directed to: E-mail: sebastian.stueker@kit.edu, http://isl.anthropomatik.kit.edu. The application should be sent to Professor Alex Waibel, Institut für Anthropomatik, Karlsruhe Institute of Technology, Postfach 6980, 76049 Karlsruhe, Germany. KIT – University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association.
| |||||
6-18 | (2012-01-10) Research Assistant or Postdoctoral Position - Dialog Modeling at KIT Karlsruhe, Germany Karlsruhe Institute of Technology (KIT) is the result of the merger of the University of Karlsruhe and the Research Center Karlsruhe. It is a unique institution in Germany, which combines the mission of a university with that of a large-scale research center of the Helmholtz Association. With 8000 employees and an annual budget of EUR 650 million, KIT is one of the largest research and education institutions worldwide. At the Institute for Anthropomatics, Lehrstuhl Prof. Waibel, several positions are to be filled: Research Assistant or Postdoctoral Position in the field of Dialog Modeling with a salary according to TV-L E13. The responsibilities include basic research in the area of dialog systems, as well as participation in application-targeted research projects in the area of multimodal, dialog-centered human-machine interaction. The candidate is expected to participate in the design, development, and exploration of innovative methods, algorithms and techniques to be successfully integrated in humanoid robots. Within the framework of the international center for advanced communication technology (interACT), our center operates in three locations: Karlsruhe Institute of Technology, Germany, and Carnegie Mellon University in Pittsburgh and Silicon Valley, USA. This offers a first-class research environment, working in advanced research projects at two of the top computer science research universities in the US and Europe. International joint and collaborative research at and between our centers is common and encouraged, and offers great international exposure and activity. The focus of our research is to develop better communication and computing services that take advantage of an understanding of the human context and activities. One of our current research interests is the field of multimodal perception and dialog-centered human-machine interaction.
We seek qualified candidates with an M.S. or a Ph.D. degree in Computer Science, Electrical Engineering, or related fields. The position offers the opportunity to work toward a Ph.D. degree or toward the advancement of an academic career. A record of academic achievements, relevant experience and knowledge in relevant areas, and excellent programming skills are expected. KIT is pursuing a gender equality policy. Women are therefore particularly encouraged to apply. If equally qualified, handicapped applicants will be preferred. Questions may be directed to: E-mail: sebastian.stueker@kit.edu, http://isl.anthropomatik.kit.edu. The application should be sent to Professor Alex Waibel, Institut für Anthropomatik, Karlsruhe Institute of Technology, Postfach 6980, 76049 Karlsruhe, Germany. KIT – University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association.
| |||||
6-19 | (2012-01-10) Immediate postdoctoral position openings, USC Signal Analysis and Interpretation Lab (SAIL)
| |||||
6-20 | (2012-01-12) Engineer: Website for teaching and remediation of French pronunciation for learners Engineer position: website for teaching and remediation of French pronunciation for learners, for a LabEx EFL project (PRES Sorbonne Paris Cité) Operation PPC10 (knowledge transfer) of the EFL LabEx The laboratory of excellence "Empirical Foundations of Linguistics: data, methods, models" is seeking a (web) developer/programmer for a site-creation assignment. Context The EFL project is a "laboratory of excellence" created in 2011 for 10 years (2011-2021) by the French Ministry of Research. It brings together 13 research teams from 5 institutions (Paris 3, Paris 5, Paris 7, Paris 13 and INALCO) belonging to the PRES Sorbonne Paris-Cité, around linguistics and related disciplines whose object of study is language. It represents 150 researchers and about a hundred doctoral students. It is organised around 7 scientific strands (each managed by a partner institution) plus training and knowledge-transfer components. The (web) developer/programmer will be tasked with creating a website/software as part of a project on web design and development for distance learning. The position is available for 6 months, full time. Place of work: Laboratoire de Phonétique et Phonologie, ILPGA, 19 rue des Bernardins, 75005 Paris. Salary: engineer grade, depending on experience. The position The fixed-term contract is intended for a specialist in distance learning, if possible in the field of language teaching, who will study feasibility and build a prototype. All members of the LabEx will be invited to test this prototype. The engineer will be responsible for two main tasks: Evaluate existing solutions and test the feasibility of the project, and write a report on the available solutions.
Build a test website for setting up these online courses (in part with direct contact over the internet with the future teachers, all qualified phoneticians or speech therapists), with remote use of certain specialised sensors (detection of nasality, aspiration, voice quality, etc.) and online and offline diagnosis of the problems to be solved. In addition to files in common formats, the selected engineer will handle files in various formats: video, acoustic, textual. He or she will be asked to convert these files into generic formats allowing long-term storage via the internet. The person recruited must have a good knowledge of web programming (MySQL, PHP), including interface development (HCI), and a good knowledge of e-learning platforms. Profile sought - 1 to 5 years of experience in website design - Initial experience building a site for languages or linguistics is recommended - Knowledge of the formats used in speech research would be a plus. Skills - Fluent English and French (spoken and written) - Speed, efficiency, ability to adapt to the research and university environment To apply, send: - a CV - a cover letter - letters of recommendation are welcome The application must be sent as a single file in .pdf or .doc format, with the file name firstnamelastname_cv Application deadline: 1 March 2012 Contract start: as soon as available, from 1 March 2012 To: jacqueline.vaissiere@univ-paris3.fr
| |||||
6-21 | (2012-01-20) Scientist in Learning, data mining and interaction at the University of Grenoble Profile: Learning, Data Mining, Interaction
Recruitment competition (concours): 46-1
Status of the position: likely to become vacant (vacancy date: 01/09/2012)
Full profile available on the LIG website: http://liglab.imag.fr/
Research description:
The laboratory encourages outstanding applications that will strengthen its research supervision and contribute to the laboratory's momentum in the field of information access. The candidate will be expected to develop a scientific project on one of the following themes:
– Learning: knowledge engineering, machine learning, human learning
– Data mining: mining of massive and complex data, text mining, multimedia information retrieval
– Interaction: methods, models and tools for designing innovative human-computer interfaces, interaction techniques, human-machine dialogue, multimodality, multilingualism, perceptive and adaptive systems, virtual worlds
The candidate will be expected to join one of the LIG teams and to draw up a research project on one of the aspects of the position profile.
| |||||
6-22 | (2012-01-20) M2R internship proposal (data mining and speech) at LIMSI Contact: Sophie Rosset (rosset@limsi.fr) Location: LIMSI - CNRS, bat 508, BP 133, 91403 Orsay Cedex, Spoken Language Processing group (Traitement du Langage Parlé) Title: Data mining applied to audio data: errors and named entities Context This M2 internship falls within the fields of Natural Language Processing and Speech Processing, as well as data mining. We are particularly interested in characterising the errors of a speech transcription system whose output is used by a named entity recognition system. The aim is to set up a method for classifying and characterising the errors of several speech transcription systems by quantifying their impact on one (or more) named entity recognition systems. This method should be generalisable to other types of applications such as machine translation or a human-machine dialogue system. Topic Speech recognition systems are evaluated using the word error rate (WER), which treats every word as equally important. However, this evaluation metric does not measure the difficulty that an information extraction system will face. In other words, if the same named entity detection system is applied to two recognition outputs that have the same WER, the error rate of the named entity detection system will be different. The goal of this internship is therefore to characterise the errors of a speech recognition system with respect to a named entity detection task and the impact of those errors. During this internship we will focus on broadcast news speech, using data from a recent evaluation campaign.
This campaign revealed a very large drop in the performance of named entity recognition systems on the output of automatic speech recognition systems (a 30% loss) [1]. The outputs of three transcription systems will be studied. Their impact is to be studied on at least one named entity identification system, also provided by LIMSI. These systems are state of the art and can therefore serve as a first baseline. [1] Olivier Galibert; Sophie Rosset; Cyril Grouin; Pierre Zweigenbaum; Ludovic Quintard. Structured and Extended Named Entity Evaluation in Automatic Speech Transcriptions. IJCNLP 2011 (http://aclweb.org/anthology-new/I/I11/I11-1058.pdf) Practical information The internship, lasting 5 months, will take place at LIMSI, in the Spoken Language Processing group, and the intern will receive a stipend (around 480 euros/month).
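The observation that WER weights every word equally can be illustrated with a small sketch. It is not the LIMSI evaluation method: the sentences, the entity list and the helper names `word_error_rate` and `entities_kept` are all made up for illustration, and the entity check is a naive word lookup rather than a real named entity recogniser.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def entities_kept(hypothesis: str, entities: set[str]) -> int:
    """Toy stand-in for entity detection: count entity words that survived."""
    return len(entities & set(hypothesis.split()))


# Hypothetical data: one reference and two transcripts with one error each.
reference = "president obama visited paris on monday"
hyp_a = "president obama visited paris in monday"   # error on a function word
hyp_b = "president obama visited paros on monday"   # error on an entity
entities = {"obama", "paris"}

for name, hyp in (("A", hyp_a), ("B", hyp_b)):
    print(name, round(word_error_rate(reference, hyp), 3), entities_kept(hyp, entities))
```

Both toy transcripts contain a single substitution in six words, so their WER is identical, yet transcript A keeps both entity words while transcript B loses one of them.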
| |||||
6-23 | (2012-01-20) Professor of General Linguistics and French Linguistics, Field Linguistics, Université de Lyon, France As part of the 2012 recruitment campaign for faculty members, a professorship in CNU section 7 will be opened at Université Lumière Lyon 2, assigned to the Department of Language Sciences and attached to the Dynamique Du Langage laboratory -- Director of the Dynamique Du Langage laboratory, UMR 5596 CNRS - Université Lumière Lyon 2 Tel. (+33/0)4 72 72 64 94 DDL - ISH 14, av. Berthelot 69363 Lyon Cedex 7 / France
| |||||
6-24 | (2012-02-02) Research and Development Opportunities for Next Generation Technology at Microsoft
Do you want to impact billions of people all over the world with speech technology that you create?
We are looking for PhD level scientists and senior scientists, who will work on research problems in spoken language understanding, statistical dialog modeling, natural language generation, machine learning, statistical language modeling, and acoustic modeling.
Microsoft is all-in on the Natural User Interface to bring computing to larger audiences in more applications. To drive this mission we are bringing together scientists and engineers in the areas of speech recognition, natural language understanding, dialog modeling, machine learning and synthesis to develop and deliver robust, natural and scalable solutions across a rich set of scenarios and languages.
Join the excitement to be part of the newly formed team of scientists within Microsoft and to impact the lives of billions of people all over the world. We’re talking about Bing, Windows, XBOX, Mobile, Exchange Server and Tellme, just to name a few. Microsoft is dedicated to improving everyday life using speech. And not just in a few countries - but around the world.
How to apply: MICROSOFT CORPORATION Attention: Recruiting, One Microsoft Way, STE 303, Redmond WA 98052-8303
Or email resume to: Tom Swanson toswanso@microsoft.com Please reference Speech in the subject line.
| |||||
6-25 | (2012-02-05) NSF-Supported Summer Research for Undergraduates
| |||||
6-27 | (2012-02-15) Contract Maître de Conférences (associate professor), ESPCI ParisTech A contract Maître de Conférences position (1 year, renewable) is available at the SIGMA laboratory (SIGnaux, Modèles, Apprentissage statistique) of ESPCI ParisTech from April 2012.
| |||||
6-28 | (2012-02-10) MCF (associate professor) position in Computer Science [27MCF519 Paris-Sorbonne] The position requires a dual competence: a high level of scientific excellence in computer science and in applications of computer science to the humanities and social sciences (in particular paralinguistic processing of speech and language, computational sociology, ...). The interest in applying computer science theory to the humanities and social sciences is one of the distinctive features of computer science teaching at Université Paris-Sorbonne. The candidate will teach computer science in various bachelor's (LFTI) and master's (ILGII, IILGI) programmes. He or she will also be involved in supervising new dual-track bachelor's degrees (Sciences and Language Sciences, ...) planned within the PRES Sorbonne Universités.
| |||||
6-29 | (2012-02-15) PhDs at Tilburg Center for Cognition and Communication (TiCC) research program 'Language, Communication and Cognition' (LCC), The Netherlands For the Tilburg Center for Cognition and Communication (TiCC) research program 'Language, Communication and Cognition' (LCC), we are looking for two new, enthusiastic and competent PhD colleagues.
If you are interested in one of these positions, you will need to identify a potential research topic related to one of the research themes of the LCC program. Current themes include:
- Social media and interpersonal communication.
- Professional communication (medical, business, etc.).
- Alignment and adaptation in communication.
- Social exclusion and other social aspects of interaction.
- Emotion and speech.
- Language acquisition and learning.
- Multimodality and communication.
- Language and speech production.
- Visual communication (diagrams, metaphors, etc.).
- Gesture and other forms of non-verbal behavior.
For the positions we seek candidates with a background in a relevant discipline, including Psycholinguistics, Communication & Information Sciences, Linguistics, Cognitive Science, Psychology or some related area, with experience in doing experiments and analyzing data.
PhD candidates should have a good (research) master's degree in one of the aforementioned areas, a strong interest in doing research, excellent writing skills and a good command of English. Developing and defending a research plan is part of the procedure.
Tilburg University is rated among the top Dutch employers, offering excellent terms of employment. The collective labour agreement of Tilburg University applies. The selected candidates will start with a contract for one year, concluded by an evaluation. Upon a positive outcome of the first-year evaluation, the candidate will be offered an employment contract for the remaining years. Candidates with a Research Master (MPhil) will be offered a 1+2-year contract. Master students might be offered a 1+3-year contract. It is also possible to work 80% instead of full time. The PhD candidates will be ranked in the Dutch university job ranking system (UFO) as a PhD student (promovendus) with a starting salary of € 2.042,-- gross per month in the first year, up to € 2.612,-- in the fourth year (full-time amounts). The selected candidate is expected to have written a PhD thesis by the end of the contract (which may be based on articles).
Research in the Department of Communication and Information Sciences is located in the Tilburg Center for Cognition and Communication (TiCC). TiCC consists of two research programs: Language, Communication and Cognition (LCC) and Creative Computing (CC). There is a strong emphasis on experimental research and interdisciplinary cooperation. More information about the research programs can be found at http://www.tilburguniversity.edu/research/institutes-and-research-groups/ticc/. The department DCI is responsible for a flourishing academic programme, Communication and Information Sciences (CIW), which annually attracts about 120 Bachelor students, 130 Pre-master and 200 Master students. The department is also co-responsible for the Research Master Language and Communication. More information about the DCI department can be found at www.tilburguniversity.nl/faculties/humanities/dci/.
For more information on the positions, please contact one of the LCC program leaders, prof.dr. Emiel Krahmer (E.J.Krahmer@uvt.nl, +311346630700) or prof.dr. Marc Swerts (M.G.J.Swerts@uvt.nl, +31134662922).
Applications should include:
- a cover letter.
- a Curriculum Vitae.
- a 2-page research proposal on a selected theme, plus names of potential supervisor and promotor.
- names of two references.
The only way to apply is via the online link at the bottom of this vacancy: 'apply direct'. If you received this vacancy via e.g. e-mail, please look at the vacancy located at: http://www.tilburguniversity.edu/about-tilburg-university/working-at/wp/. Applications should be sent before the application deadline of March 24, 2012. Interviews are expected to be held in April 2012. Starting dates are flexible, so applicants who expect to graduate in the summer of 2012 are also invited to apply.
| |||||
6-30 | (2012-02-15) Maître de Conférences, Université Sorbonne Nouvelle, Paris A Maître de Conférences (associate professor) position is open for recruitment for the start of the 2012 academic year at Université Sorbonne Nouvelle Paris 3. The description is given below. (more details at http://lpp.univ-paris3.fr/postes/offres.htm)
| |||||
6-31 | (2012-02-15) Postdoc at University of Trento, Italy - Machine Translation/Social Computing
| |||||
6-32 | (2012-02-21) Two Postdoctoral Research Associates in Speech Technology, University of Edinburgh -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
| |||||
6-33 | (2012-02-21) Senior Researcher in Speech Technology, University of Edinburgh
| |||||
6-34 | (2012-02-16) 4 PhD positions in spoken dialogue systems research, Charles University in Prague
| |||||
6-35 | (2012-02-27) PhD Student 'Increasing Robustness of Speech Recognition', Radboud University, Nijmegen, The Netherlands
PhD Student 'Increasing Robustness of Speech Recognition' (1,0 fte) Faculty of Arts Vacancy number: 23.02.12 Closing date: 1 April 2012 Responsibilities As a PhD student you will participate in the FP7 Marie Curie Initial Training Network Investigating Speech Processing In Realistic Environments (INSPIRE). This network provides research opportunities for 13 PhD students and 3 postdocs. You will become a member of an international team of researchers whose aim is to gain a better understanding of how listeners recognize speech, even under non-ideal circumstances. You will contribute to urgently needed solutions that help alleviate the serious communication problems that arise, especially for older and hearing-impaired persons, when different combinations of 'adverse' conditions affect the speech processing system. As a PhD student you will conduct research as part of a project called 'Increasing robustness of speech recognition by using multiple signal representations'. Speech processing in the human brain presumably involves competition between multiple, intermediate signal representations. The redundancy of these different representations is assumed to help improve the robustness of recognition. In some cases, however, they may lead to conflicting interpretations resulting in intelligibility problems. The goal of this PhD project is to investigate to what extent human recognition errors with regard to speech in 'adverse' conditions can be replicated by machines that were trained on multiple input representations which are partially redundant. Work environment The project will be carried out at the Centre for Language and Speech Technology (CLST). CLST is a research unit within the Faculty of Arts of Radboud University Nijmegen and hosts a large international group of senior researchers and PhD students who conduct research at the frontier of science and develop innovative applications.
What we expect from you You should: - hold a Master's degree in engineering or science; - have a strong background in machine learning (experience with dynamic Bayesian networks would be an advantage), mathematical and/or statistical modelling, and signal processing; - have excellent programming skills; - be willing to spend several months at the Technical University of Denmark. Prior exposure to courses in linguistics or speech- or hearing-related fields would be an advantage. Furthermore, you should comply with the rules set out by the FP7 Marie Curie ITNs, i.e. you should: - not have resided or performed your main research activity in the Netherlands for more than 12 months in the last three years; - be willing to work in at least one other country in the INSPIRE network; - have less than 4 years of research experience since you obtained your Master's degree, and not hold a PhD. What we have to offer We offer you: - employment: 1,0 fte; - in addition to the salary: an 8% holiday allowance and an 8.3% end-of-year bonus; - the starting salary is €2,042 per month on a full-time basis; the salary will increase to €2,492 per month in the third year; - in addition to the salary, you will receive travel and training allowances on the basis of generous Marie Curie ITN provisions; - duration of the contract: 18 months with the possibility of extension by another 18 months. Are you interested in our excellent employment conditions (http://www.ru.nl/newstaff/working_at_radboud/conditions_of/)? The Radboud University is an equal opportunity employer. Female researchers are strongly encouraged to apply for this vacancy. Would you like to know more? Further information on: Investigating Speech Processing In Realistic Environments (http://www.ru.nl/clst/projects/speech/inspire/) Dr. Bert Cranen, assistant professor Speech science Telephone: +31 24 3612904 E-mail: B.Cranen@let.ru.nl Applications Are you interested?
Please include with your application: - a CV; - a 2-page description of your research interests explaining why the INSPIRE goals appeal to you, how the INSPIRE team may benefit from your participation, and your career perspectives as expected from INSPIRE; - university transcripts; - names and email addresses of two potential referees (or alternatively letters of recommendation). It is Radboud University Nijmegen's policy to only accept applications by e-mail. Please send your application, stating vacancy number 23.02.12, to vacatures@let.ru.nl, for the attention of drs. M.J.M. van Nijnatten, before 1 April 2012.
| |||||
6-36 | (2012-03-03) 3 Post-doctoral positions at the Bruno Kessler Foundation, Center for Information Technology, Trento Italy
3 Post-doctoral positions available in the 'Human Language Technologies - HLT' Research Unit at the Bruno Kessler Foundation, Center for Information Technology. Workplace description: The Human Language Technology unit is a multi-disciplinary research unit that addresses the automatic processing of human language for a range of tasks. In particular, the research unit focuses on automatic speech recognition, machine translation and content processing. The HLT unit has been developing state-of-the-art technology in all the main research areas it operates in. The group has consistently performed well in several international evaluations, and is currently engaged in international projects for open source software development (e.g. the Moses platform for statistical machine translation). The unit also provides technological support and high-level services in order to optimize the internal research activities, namely a shared and efficient computing environment, software tools, up to the creation and management of large-scale linguistic resources. The HLT group is part of the larger network of research labs focusing on Human Language Technologies and related domains in the Trento region, which is quickly becoming one of the areas with the highest concentration of researchers in HLT and related fields anywhere in Europe. More information about the HLT Unit is available at http://hlt.fbk.eu The HLT Research Unit is looking for 3 candidates to carry out research activities in the fields of Textual Inferences, Machine Translation and Speech Recognition. Each research position will be funded through one of the following European research projects: MateCat: http://www.matecat.com EU-Bridge: http://www.eu-bridge.eu EXCITEMENT: website in progress Open positions: A Postdoctoral position in Textual Inferences (Ref.Code HLT_PostDoc2012_TI) The candidate is expected to carry out research activities in the context of the EU-funded project EXCITEMENT on multilingual semantic processing.
The goal of the EXCITEMENT project is to develop generic semantic 'engines' or platforms for robust textual inference that are applicable across languages and linguistic frameworks. These inference platforms will be leveraged for unsupervised text exploration on customer interaction data. Concrete systems will be developed for English, German, and Italian. Project partners are Bar-Ilan University, DFKI Saarbrücken, University of Heidelberg, Almawave S.r.l, NICE Systems, and OMQ GmbH. The selected candidate will join the FBK research group with the aim of advancing the state of the art on component-based textual entailment. A Postdoctoral position in Machine Translation (Ref.Code HLT_PostDoc2012_MT) The candidate is expected to contribute original research results inside leading edge international projects. The aim is to advance the state of the art in the integration of statistical MT in computer assisted translation and in adaptive MT, by drawing ideas and contributions from different areas, such as machine learning, statistical language processing, high performance computing, etc. A Postdoctoral position in Speech Recognition (Ref.Code HLT_PostDoc2012_SR) The candidate is expected to contribute original research results inside a leading edge international project. The aim is to advance the state-of-the-art in multilingual speech processing by improving acoustic modelling, language modelling, and adaptation to different domains, conditions and genres. The contribution will be evaluated on application scenarios that include both efficient annotation of audiovisual archives and live processing of audio streams. 
Job requirements: Applicants should have a PhD degree related to any of the specific research areas mentioned (computational linguistics, speech processing or related fields); experience in statistical modelling, speech processing or machine learning (preferably on approaches applied to NLP tasks); experience in distributed software development (open source); skills in experimental work and development of algorithms; ability to work and deliver in funded research projects; oral and written proficiency in English. In adherence to FBK's policy to promote equal opportunity and gender balance, in case of equal applications, female candidates will be given preference. Employment: Contract type: Full time, 30-month contract (may be extended up to 6 months). Number of positions: 3 Gross salary: from 37,500 to 41,500 € per year (depending on the candidate's experience) Benefits: 28 vacation days per year, flexi-time, company-subsidized cafeteria or meal vouchers, internal car park, welcome office support for visa formalities, accommodation, social security, etc., reductions on bank accounts, public transportation, sport, accommodation and language course fees. Start date: Spring 2012 Location: Povo, Trento (Italy) Application process: To apply online, please send your detailed CV (.pdf format) including a list of publications, a statement of research interests and contact information for at least 2 references. Please include in your CV your authorization for the handling of your personal information as per the Personal Data Protection Code, Legislative Decree no. 196/2003 June 2003. Applications must be sent to jobs@fbk.eu Emails should have the reference code related to the position of interest (HLT_PostDoc2012_TI, HLT_PostDoc2012_MT or HLT_PostDoc2012_SR) Application deadline: 9 April 2012 Short-listed candidates will be contacted for an interview. Non-selected applicants will be notified of their exclusion at the end of the selection process.
Please note that FBK may contact short-listed candidates who were not selected for the current openings within a period of 6 months for any selection process for similar positions. For transparency purposes, the name of the selected candidate, upon his/her acceptance of the position, will be published on the FBK website at the bottom of the selection notice
| |||||
6-37 | (2012-03-08) Post-docs at the Speech Processing and Transmission Lab, Universidad de Chile, Santiago, Chile The Speech Processing and Transmission Lab (LPTV, Laboratorio de Procesamiento y Transmisión de Voz) at Universidad de Chile, Santiago, Chile, is looking for post-doc researchers in the following fields:
- Robust speech recognition
- Robust speaker verification
- Second language learning assessment
The grants are funded by Conicyt (Chilean funding Agency): http://www.conicyt.cl
Applicants are required to present a brief research proposal prepared in collaboration with the director of the LPTV. For further information, contact:
Néstor Becerra Yoma, Ph.D. Professor Speech Processing and Transmission Laboratory Department of Electrical Engineering Universidad de Chile Av. Tupper 2007, P.O. Box 412-3 Santiago, Chile
Tel. +56 2 978 4205 Fax. +56 2 695 3881 E-mail: nbecerra@ing.uchile.cl http://www.cec.uchile.cl/~labptvoz/
| |||||
6-38 | (2012-03-10) Senior Researcher/Research Associate in Statistical Dialogue Systems at Cambridge UK Senior Researcher/Research Associate in Statistical Dialogue Systems
Applications are invited at either the Senior Research Associate or Research Associate
level to work on an EU-funded project called Parlance which aims to build mobile voice-
driven systems for interactive hyper-local search.
Candidates should have a PhD or comparable research experience in spoken dialogue
systems and noise robust automatic speech recognition and understanding. Good
programming skills are essential and familiarity with HTK would be an advantage.
Appointment at the senior level will require at least 3 years post-doctoral experience
and evidence of independent standing. Salary range is from £27578 to £46846.
This is an exciting opportunity to join one of the leading groups in statistical speech and
language processing. Cambridge provides excellent research facilities and there are
extensive opportunities for collaboration, visits and attending conferences.
Contact Prof Steve Young (sjy@eng.cam.ac.uk) for further information.
Application details can be found at: http://www.jobs.cam.ac.uk/job/-14472
| |||||
6-39 | (2012-03-12) Postdoc position: Acoustic to articulatory mapping of fricative sounds LORIA Nancy France Postdoc position: Acoustic to articulatory mapping of fricative sounds
15 months, start between September and December 2012 at LORIA (Nancy, France).
Contact : Yves.Laprie@loria.fr
Context This subject deals with acoustic-to-articulatory mapping [Maeda et al. 2006], i.e. the recovery of the vocal tract shape from the speech signal, possibly supplemented by images of the speaker’s face. This is one of the great challenges in automatic speech processing and has not yet received a satisfactory answer. The development of efficient algorithms would open new directions of research in second language learning, language acquisition and automatic speech recognition.
The objective is to develop inversion algorithms for fricative sounds. Numerical simulation models for fricatives now exist. The acoustics and dynamics of fricatives are better known than those of stops, so they will be the first category of sounds to be inverted after vowels, for which the Speech group has already developed efficient algorithms. The production of fricatives differs from that of vowels in two respects:
The approach proposed is analysis-by-synthesis. This means that the signal, or the speech spectrum, is compared to a signal or a spectrum synthesized by means of a speech production model which incorporates two components: an articulatory model intended to approximate the geometry of the vocal tract, and an acoustical simulation intended to generate a spectrum or a signal from the vocal tract geometry and the noise source. The articulatory model is geometrically adapted to a speaker from MRI images and is used to build a table of pairs, each associating one articulatory vector with the corresponding acoustic image vector. During inversion, all the articulatory shapes whose acoustic parameters are close to those observed in the speech signal are recovered. Inversion is thus an advanced table lookup method which we used successfully for vowels [Ouni & Laprie 2005] [Potard et al. 2008].
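The table lookup described above can be illustrated with a minimal sketch. It is not the Speech group's implementation: the function names, the Euclidean distance, the fixed tolerance and the toy values are all assumptions chosen for illustration, and a real codebook would hold far more entries with richer acoustic parameters.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length acoustic vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def invert(codebook, observed, tolerance):
    """Return every articulatory vector whose stored acoustic image lies
    within `tolerance` of the observed acoustic vector, closest first.

    `codebook` is a list of (articulatory_vector, acoustic_vector) pairs.
    The distance and the threshold are illustrative assumptions.
    """
    matches = []
    for articulatory, acoustic in codebook:
        d = euclidean(acoustic, observed)
        if d <= tolerance:
            matches.append((d, articulatory))
    matches.sort(key=lambda m: m[0])
    return [articulatory for _, articulatory in matches]

# Toy codebook with made-up values: (articulatory vector, acoustic vector).
codebook = [
    ((0.0, 0.5), (1.0, 2.0)),
    ((0.2, 0.4), (1.1, 2.1)),
    ((0.9, 0.1), (3.0, 0.5)),
]

# Several articulatory shapes may explain the same observation.
candidates = invert(codebook, observed=(1.0, 2.0), tolerance=0.2)
```

Returning all candidates within the tolerance, rather than only the nearest one, reflects the fact that several vocal tract shapes can produce similar acoustics; later constraints can then choose among them.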
Activities The success of an analysis-by-synthesis method relies on the implicit assumption that synthesis can correctly approximate the speech production process of the speaker whose speech is inverted. There exist fairly realistic acoustic simulations of fricative sounds, but they strongly depend on the precision of the geometrical approximation of the vocal tract used as input. There also exist articulatory models of the vocal tract which yield very good results for vowels. These models are, however, inadequate for consonants, which often require a very accurate articulation at the front of the vocal tract. The first part of the work will be the elaboration of articulatory models that are adapted to the production of both consonants and vowels. The validation will consist of driving the acoustic simulation from the geometry and of assessing the quality of the synthetic speech signal with respect to the natural one. This work will be carried out on X-ray films for which the acoustic signal recorded during acquisition is of sufficiently good quality.
The second part of the work will address several aspects of the inversion strategy. Firstly, it is now accepted that spectral parameters involving fairly marked smoothing and frequency integration have to be used, which is the case of MFCC (Mel Frequency Cepstral Coefficients) vectors. However, the most suitable spectral distance for comparing natural and synthetic spectra has to be investigated. Another solution consists in modeling the source so as to limit its impact on the computation of the spectral distance.
The second point is the construction of the articulatory table, which has to be revisited for two reasons: (i) only the cavity downstream of the constriction plays an acoustic role, (ii) the location of the noise source is an additional parameter but it depends on the other articulatory parameters. The third point concerns the way of taking into account the vocal context. Indeed, the context is likely to provide important information about the vocal tract deformations before and after the fricative sound, and thus constraints for inversion.
A very complete software environment already exists in the Speech group for acoustic-to-articulatory inversion, which can be exploited by the post-doctoral student.
References
- [Ouni & Laprie 2005] S. Ouni and Y. Laprie, Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion, Journal of the Acoustical Society of America, 118, 2005, pp. 444-460.
- [Potard et al. 2008] B. Potard, Y. Laprie and S. Ouni, Incorporation of phonetic constraints in acoustic-to-articulatory inversion, Journal of the Acoustical Society of America, 123(4), 2008, pp. 2310-2323.
- [Maeda et al. 2006] Technology inventory of audiovisual-to-articulatory inversion, http://aspi.loria.fr/Save/survey-1.pdf
Expected skills Knowledge of speech processing and articulatory modeling, Acoustics, Computer sciences, Applied mathematics
| |||||
6-40 | (2012-03-15) Research engineer at IRISA Lannion France A research engineer position (24-month fixed-term contract) is open in the Cordial research team at Irisa in Lannion. This position, whose research profile is in speech processing and speech signal processing, is offered within the ANR project Phorevox. The position is to be filled as soon as possible. The detailed profile is available at the following link:
| |||||
6-41 | (2012-03-15) RESEARCH AND DEVELOPMENT ENGINEER at the Laboratoire National de Métrologie et d’Essais. TESTING DIRECTORATE Environmental Testing Division
RESEARCH AND DEVELOPMENT ENGINEER IN NATURAL LANGUAGE PROCESSING M/F Ref: CL/TAL/DE
Context:
The Laboratoire National de Métrologie et d’Essais (LNE) offers services for evaluating the performance of natural language and speech processing systems on a given task (transcription, translation, information extraction, ...). Within the EMC, Electrical Safety and Information Technologies Department, the multimedia information processing team works on the different steps that define an evaluation. Its main missions are:
- To define relevant tasks to evaluate according to application and/or theoretical needs, - To determine the characteristics of the data to be used for the task considered, - To establish measures that reflect how well a system performs a given task.
Missions:
Within research and development programmes, your mission will be to contribute to the development of the activity, in particular through the following:
- Setting up and managing research and development projects in the multimedia field,
- Designing protocols to address evaluation issues in Natural Language Processing:
- Developing international partnerships in order to strengthen the position of the LNE in the field.
Profile:
You are a PhD-level engineer in computer science, specialized in Natural Language Processing (NLP). You have initial professional experience (3 to 5 years in addition to the thesis), during which you worked on the evaluation of automatic systems. You are proficient in project management and at ease with client relations and with organizing and leading meetings and/or seminars. You have solid knowledge of programming and data analysis (data mining). Rigorous, dynamic, determined and personable, you will quickly integrate into the teams and demonstrate the leadership and expertise required for the success of your mission. Fluent English is essential. Travel is to be expected (about ten trips per year of 1 to 3 days, mostly in France). Permanent position based in Trappes (78).
Contact:
Apply quoting reference CL/TAL/DE, for the attention of Ms Christelle LEBRAULT - By email: recrut@lne.fr
| |||||
6-42 | (2012-03-25) Audio Indexing Researcher W/M position at IRCAM – 3DTV project If you already applied for this position, please just send us a quick email telling us you are still interested and we will get back to you. Audio Indexing Researcher W/M position at IRCAM – 3DTV project Starting: April 2012 (as soon as possible) Duration: 18 months Introduction to IRCAM IRCAM is a leading non-profit organization associated with the Centre Pompidou, dedicated to music production, R&D and education in acoustics and music. It hosts composers, researchers and students from many countries cooperating in contemporary music production, scientific and applied research. The main topics addressed in its R&D department include acoustics, audio signal processing, computer music, interaction technologies and musicology. Ircam is located in the centre of Paris near the Centre Pompidou, at 1, Place Igor Stravinsky 75004 Paris. Introduction to the 3DTVs project The goal of the 3DTVs project is to devise scalable 3DTV AV content description, indexing, search and browsing methods across open platforms, using mobile and desktop user interfaces, and to incorporate such functionalities into 3D audiovisual content archives. 3D multichannel audio analysis targets audio event detection based on fusion techniques that combine the feature analysis performed in the individual channels with source localization and separation algorithms for the detection of moving audio sources. The results will be used in 3D audio/cross-modal indexing and retrieval. Multimodal 3D audiovisual content analysis will build on the results of 3D video and audio analysis. 3DTV content description and search mechanisms will be developed to enable fast replies to semantic queries.
Role of IRCAM in the 3DTVs project In the 3DTVs project, IRCAM is in charge of the research and development of technologies related to
- Audio event detection using multi-channel audio scenes
- Speaker diarization
- Segmentation into movie scenes from the audio signal
- Sound source separation, localization and identification
Position description The hired researcher will be in charge of the development of technologies related to:
The Researchers will also collaborate with the development team and participate in the project activities (evaluation, meetings, specifications). Required profiles
· High productivity, methodical work, excellent programming style. Salary According to background and experience Applications Please send an application letter together with your resume and any suitable information addressing the above issues, preferably by email, to: peeters_a_t_ircam dot fr with cc to vinet_a_t_ircam dot fr, roebel_at_ircam_dot_fr
Ircam is recruiting a Researcher M/F – 18-month full-time fixed-term contract – 3DTVs project Position available from April 2012
Introduction to Ircam Ircam is a non-profit organization associated with the Centre National d'Art et de Culture Georges Pompidou, whose missions include research, creation and education activities around twentieth-century music and its relations with science and technology. Within its R&D department, specialized teams carry out research and software development in acoustics, audio signal processing, interaction technologies, computer music and musicology. Ircam is located in the centre of Paris near the Centre Georges Pompidou, at 1, Place Stravinsky 75004 Paris.
Introduction to the 3DTVs project The goal of the 3DTVs project is to devise scalable descriptions of 3DTV content, together with indexing, search and browsing methods across open platforms, using mobile and desktop user interfaces, and to incorporate such functionalities into 3D audiovisual content archives. 3D multichannel audio analysis targets audio event detection based on fusion techniques combining the audio analysis performed in the individual channels with source localization and separation algorithms for detecting moving audio sources. The results will be used for 3D audio and cross-modal indexing and for retrieval. Multimodal 3D audio/video indexing of audiovisual content will build on the results of 3D video and 3D audio indexing. Content description and search methods will be developed to enable fast replies to semantic queries.
Role of Ircam in the 3DTVs project In the 3DTVs project, Ircam is in charge of the research and development of technologies related to - Audio event detection using multi-channel audio scenes - Segmentation into speaker turns - Sound source separation, localization and identification
Missions The Researcher will be in charge of the development of technologies related to: - Audio event detection using multi-channel audio scenes The researcher will also collaborate with the development team and participate in the project activities (evaluation, meetings, specifications).
Required profile
Salary According to background and professional experience
Applications Please send a cover letter and a CV detailing your level of experience/expertise in the areas mentioned above (together with any other relevant information) to peeters _a_t_ ircam dot fr with copy to vinet _a_t_ ircam dot fr, roebel _at_ ircam _dot_ fr
| |||||
6-43 | (2012-03-27) Post Doctoral Fellow or Research Associate, Toronto, Canada Position: Post Doctoral Fellow or Research Associate (scientific) Site: Toronto Rehabilitation Institute, University Centre, Toronto, Canada
KEY RESPONSIBILITIES:
KEY REQUIREMENTS:
ASSET REQUIREMENTS:
Alshaer dot Hisham at torontorehab dot on dot ca
| |||||
6-44 | (2012-04-02) Research position in Spoken Language Dialogue Systems Development for Serious Games ; University of Ulm Germany Research Position with perspective of a PhD degree in Spoken Language -- Wolfgang Minker Ulm University Communications Engineering - Dialogue Systems Albert-Einstein-Allee 43 D-89081 Ulm Phone: +49 731 502 6254/-6251 Fax: +49 731 501 226254 http://dialogue-systems.org
| |||||
6-45 | (2012-04-04) PhD fellowship- Fondazione Bruno Kessler (FBK), Trento, Italy
| |||||
6-46 | (2012-04-04) Post-Doctoral Research Position, Aalto University Post-Doctoral Research Position, Aalto University
Title: Statistical speech synthesis Department: Department of Signal Processing and Acoustics
URL: http://spa.aalto.fi/en/ Start date: August-October 2012 Duration: 12-18 months contract
Department of Signal Processing and Acoustics, Aalto University (Espoo, Finland), invites applications for a post-doctoral researcher position in speech technology. The position is funded by the Simple4all project (http://simple4all.org/), which is a collaboration between Aalto University, University of Edinburgh (coordinator), University of Helsinki, Universidad Politécnica de Madrid, and Universitatea Tehnica Cluj-Napoca. Simple4All is a 3 year project, funded by EC’s FP7 ICT Programme, whose general aim is to create speech synthesis technology that learns from data with little or no expert supervision and continually improves itself, simply by being used.
The work at the Department of Signal Processing and Acoustics focuses on novel vocoding technologies in statistical parametric speech synthesis. More specifically, we are interested in using, in statistical speech synthesis, speech models that are closer to the human speech production mechanism and are inherently able to produce many voice qualities. Applicants for the post-doctoral researcher position must have a PhD (or equivalent experience) in speech processing, digital signal processing or computer science. They must have a background in statistical speech synthesis; experience in the development of vocoders is particularly appreciated. In addition, experience of project development and project leadership in a research context, together with excellent communication, presentation, and organisational skills, is highly desirable.
To apply, please send your CV (.pdf format) including a list of publications and your contact information, a statement of research interests and contact information for at least 2 references. Applications must be sent to paavo.alku@aalto.fi using the subject line: Post-doc position in statistical speech synthesis Application deadline: 30 June 2012
| |||||
6-47 | (2012-04-15) Full Time Research Programmer, Dialog Research Center, CMU Pittsburgh Full Time Research Programmer, Dialog Research Center
| |||||
6-48 | (2012-04-20) PhD grant: Prosodic markers at IRIT Toulouse Modelling trajectories of prosodic and linguistic markers; application to characterizing speakers' intentions in audiovisual discourse
Contact Jérôme Farinas, jfarinas@irit.fr SAMOVA team http://www.irit.fr/recherches/SAMOVA/
Description of the topic In automatic audio processing, current systems have become mature enough to extract fairly reliably information about the speakers present, the language used and the transcription of speech. One goal of current research is to use this information to structure speakers' interventions and, more broadly, radio and television content.
In this context, the SAMOVA team at IRIT has in recent years acquired strong expertise in speaker modelling and automatic speaker segmentation [Louradour 2007, El Khoury 2010], automatic language identification [Pellegrino 1998, Farinas 2002, Rouas 2005], speech/music/singing segmentation [Pinquier 2004, Lachambre 2009], jingle extraction [Pinquier 2004], speech transcription [Campagne ESTER 2004], detection of conversational speech zones [Projet EPAC 2010] and of keywords [Le Blouch 2009]. Building on this work, the team is working on structuring broadcasts based on speakers' interventions and their interactions [Bigot 2011] as well as on video [Ercolessi 2011].
Starting from a characterization of the participants' roles (anchor, dominant speaker...), our objective is to study more precisely the interactions between speakers in order to distinguish what in the message pertains to interaction (opening, closing, introduction of a guest, turn-taking management) from what pertains to exchanges of opinion. More broadly, the proposed thesis topic aims to study intention in people's audiovisual interventions. The modelling of intentions is mainly based on the modelling of prosody, which through intonation and rhythm shapes the form of the discourse. This modelling will have to take into account short-term and long-term prosody [Farinas2002,Rouas2004]. Two levels of modelling will therefore be implemented in order to characterize sentence modality and the modification of word prosody. This will involve choosing appropriate prosodic parameters (F0, energy) and modelling these parameters statistically. Temporal evolution can be taken into account using stochastic models and trajectory models. The study will proceed in two phases:
The applications of this research concern the structuring of audiovisual content to support document archiving and information retrieval in this content. This structuring and characterization of interaction zones is also of interest for building audiovisual summaries.
The candidate must hold a Master's degree with strong skills in computer science. Knowledge of signal processing and speech recognition (speech recognition and prosody) would be desirable.
Références [Louradour 2007] Noyaux de séquences pour la vérification du locuteur par Machines à Vecteurs de Support. Thèse de doctorat, Université Paul Sabatier, janvier 2007 [El Khoury 2010] Unsupervised Video Indexing based on Audiovisual Characterization of Persons. Thèse de doctorat, Université de Toulouse, juin 2010 [Pellegrino 1998] Une approche phonétique en identification automatique des langues : la modélisation acoustique des systèmes vocaliques. Thèse de doctorat, Université Paul Sabatier, décembre / december 1998. [Farinas 2002] Une modélisation automatique du rythme pour l'identification des langues. Thèse de doctorat, Université Paul Sabatier, novembre 2002. [Rouas 2005] Caractérisation et identification automatique des langues. Thèse de doctorat, Université Paul Sabatier, mars 2005. [Pinquier 2004] Indexation sonore : recherche de composantes primaires pour une structuration audiovisuelle. Thèse de doctorat, Université Paul Sabatier, décembre 2004. [Lachambre 2009] Caractérisation de l'environnement musical dans les documents audiovisuels. Thèse de doctorat, Université de Toulouse, décembre 2009. [Campagne ESTER 2004] G. Gravier, J.F. Bonastre, S. Galliano, E. Geoffrois, K. Mc Tait and K. Choukri. ESTER, une campagne d'évaluation des systèmes d'indexation d'émissions radiophoniques, Proc. Journées d'Etude sur la Parole, Avril 2004. [projet EPAC 2010] Yannick Estève, Thierry Bazillon, Jean-Yves Antoine, Frédéric Béchet, Jérôme Farinas. The EPAC corpus: manual and automatic annotations of conversational speech in French broadcast news (regular paper). Dans : Language Resources and Evaluation Conference (LREC 2010), Valletta, Malte, 19/05/2010-21/05/2010, Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk (Eds.), European Language Resources Association (ELRA), p. 1686-1689, 2011. [Le Blouch 2009] Décodage acoustico-phonétique et applications à l'indexation audio automatique. 
Thèse de doctorat, Université Paul Sabatier, juin 2009. [Bigot 2011] Benjamin Bigot, Isabelle Ferrané, Julien Pinquier, Régine André-Obrecht. Speaker Role Recognition to help Spontaneous Conversational Speech Detection (regular paper). Dans : International workshop on Searching Spontaneous Conversational Speech SCSS (SCSS 2010), Firenze, Italy, 25/10/2010-29/10/2010, ACM, p. 5-10, octobre 2010. [Ercolessi 2011] Philippe Ercolessi, Hervé Bredin, Christine Sénac and Philippe Joly, Segmenting TV series into scenes using speaker diarization, WIAMIS 12th International Workshop on Image Analysis for Multimedia Interactive Services, Delft, Pays-Bas,13-15 avril 2011.
Keywords Automatic Speech Processing, Phonetic Decoding, Keyword Spotting, Prosody, Acoustics, Structuring Programs, Video
| |||||
6-49 | (2012-04-20) Engineer at INRIA France Inria is seeking a recently graduated engineer to develop its audio source separation toolbox FASST (http://bass-db.gforge.inria.fr/fasst/) and to carry out research on noise-robust speech recognition.
| |||||
6-50 | (2012-05-01) PhD: Automatic continuous speech recognition: spontaneous speech, LORIA Nancy France PhD topic: Automatic continuous speech recognition: spontaneous speech
Supervisors for this topic: Location: Inria-LORIA Nancy
The topic is posted on the website of the IAEM doctoral school http://www.iaem.uhp-nancy.fr/ , under 'propositions contrats doctoraux'. Application deadline: 1 June
Context: Speech recognition is a process by which a computer transforms the acoustic signal of spoken speech into text. During this process, the recognition system uses acoustic models, language models and a pronunciation lexicon.
The aim of this thesis is to contribute elements of a solution to this problem by proposing new methods that better take into account the characteristics of spontaneous pronunciation in automatic speech recognition.
References: [Brun et al. 2005] A. Brun, C. Cerisara, D. Fohr and I. Illina. ANTS : le système de transcription automatique du LORIA. Workshop ESTER, 2005.
| |||||
6-51 | (2012-05-13) PhD position: Characterization of the sound ambience in ethnomusicological recordings IRIT Toulouse France Title: Characterization of the sound ambience in ethnomusicological recordings
Supervisors: Régine André-Obrecht and Julien Pinquier (IRIT, SAMoVA team) obrecht@irit.fr and pinquier@irit.fr
This thesis concerns the processing of ethnomusicological data from the archives of the CNRS-Musée de l'Homme, managed by the Centre de Recherche en EthnoMusicologie (CREM) of the Laboratoire d'Ethnologie et de Sociologie Comparative (LESC). These documents are currently being digitized and catalogued (3500 hours of unpublished recordings, from 1900 to the present day, of traditional music and ethnographic surveys from all over the world, and 3500 hours of old and rare documents). This collection is of great historical importance and is unique in the world. In this application context, a set of automatic audio processing tools (speech, music, singing, noise...) must be developed in order to produce (semi-)automatic indexing for intelligent access to the collection of sound recordings. This work is mainly intended for (expert) researchers in ethnomusicology.
The planned study aims at a finer characterization of the Speech, Music, Singing and Noise components in order to define the generic sound environment. In addition, introducing a semi-supervised approach (taking into account available metadata or user input) should make it possible to characterize specific sound environments.
After becoming familiar with the various systems previously developed at IRIT for speech and music detection, the PhD student will be in charge of adapting them to the project corpus. The analysis of the detected speech and singing-voice zones should lead to a segmentation into speaker turns and singing turns, followed by the grouping of these segments by voice similarity search. Since the sound recordings are made in natural conditions, once the speech, music and singing zones have been identified there remain sound zones of interest to an ethnomusicologist, because listening to them helps to specify the sound context of the recording session, which is called the "sound ambience". It is proposed to locate these noise zones of interest and to specify a labelling. To this end, two strategies are envisaged: - a supervised mode using classical acoustic features (generic approach), - an unsupervised mode introducing knowledge from ethnomusicologists (specific approach) via the Telemeta platform (http://crem.telemeta.org/).
This PhD will be funded by the ANR project DIADEMS, which will start in October 2012. It would be desirable for the candidate to have knowledge of pattern recognition and of speech and music processing.
|