ISCApad #197, Jobs (2014-09-18)
Wednesday, November 12, 2014 by Chris Wellekens
Post-doc position at LORIA (Nancy, France)

Automatic speech recognition: contextualisation of the language model by dynamic adjustment

Framework of ANR project ContNomina

The technologies involved in information retrieval in large audio/video databases are often based on the analysis of large, but closed, corpora, and on machine learning techniques and statistical modeling of written and spoken language. The effectiveness of these approaches is now widely acknowledged, but they nevertheless have major flaws, particularly concerning proper names, which are crucial for interpreting the content.

In the context of diachronic data (data which change over time), new proper names appear constantly, requiring dynamic updates of the lexicons and language models used by the speech recognition system. The ANR project ContNomina (2013-2017) therefore focuses on the problem of proper names in automatic audio processing systems by exploiting the context of the processed documents as efficiently as possible. To this end, the post-doc will address the contextualization of the recognition module through dynamic adjustment of the language model in order to make it more accurate.

Post-doc subject

The language model of the recognition system (an n-gram model learned from a large text corpus) is available. The problem is to estimate the probability of a new proper name depending on its context. Several tracks will be explored: adapting the language model, using a class model, or studying the notion of analogy.

Our team has developed a fully automatic speech recognition system that transcribes a radio broadcast from the corresponding audio file. The post-doc will develop a new module whose function is to integrate new proper names into the language model.

Required skills

A PhD in NLP (Natural Language Processing), familiarity with automatic speech recognition tools, a background in statistics, and computer programming skills (C and Perl).

Post-doc duration

12 months, starting during 2014 (there is some flexibility)

Location and contacts

Loria laboratory, Speech team, Nancy, France
Irina.illina@loria.fr
dominique.fohr@loria.fr

Candidates should email a letter of application, a detailed CV with a list of publications, and their diploma.