ISCApad #175 |
Thursday, January 10, 2013 by Chris Wellekens |
6-1 | (2012-07-05) PhD at LIG (Grenoble-France) PhD proposal : Collaborative Annotation of multi-modal, multi-lingual and multimedia documents
| ||
6-2 | (2012-07-08) Faculty position in Phonetic Science and Speech Technology at Nanjing Normal University, China Faculty position in Phonetic Science and Speech Technology at Nanjing Normal University, China (Urgent job announcement) The Institute of Linguistic Science and Technology at Nanjing Normal University, China, invites applications for a faculty position in the area of Phonetic Science and Speech Technology. The position can be Lecturer, Associate Professor, or Professor, depending on the qualifications and experience of the applicant. Nanjing Normal University (NNU) is situated in Nanjing, a city in China famous not only for its great history and culture but also for its pride in excellence in education and academia. With its Chinese-style buildings and garden-like environment, the Suiyuan Campus of NNU is often called the “Most Beautiful Campus in the Orient.” Nanjing Normal University is among the top 5 universities in China in the area of Linguistics. Placing strong emphasis on interdisciplinary research, the Institute of Linguistic Science and Technology at NNU is unique in that it bridges the studies of theoretical and applied linguistics, phonetics, cognitive sciences, neural sciences, and information technologies. The phonetics laboratory is very well equipped, with a sound-proof recording studio, professional audio facilities, physiological instruments (e.g., WAVE system, PowerLab, EGG, EPG, airflow and pressure module, and nasality sensor), EEG for ERP studies, an eye tracker, etc. The laboratory successfully organized the international symposium TAL 2012 (www.TAL2012.org) at the end of May. We welcome interested colleagues to join us. The research can cover any area in phonetic sciences and speech technologies, including but not limited to speech production, speech perception, prosodic modeling, speech synthesis, automatic speech recognition and understanding, spoken language acquisition, computer-aided language learning, and ERP studies of spoken languages. Outstanding research support will be offered. Requirements: * A PhD degree (or an expected one) in related disciplines (e.g., linguistics, psychology, physics, applied mathematics, computer sciences, and electronic engineering); * Good publication/patent record in phonetic sciences or speech technologies; * Good oral and written communication skills in both Chinese and English; * Team spirit in a multidisciplinary group. Interested candidates should submit a CV, a detailed list of publications, copies of the two or three best publications, and the contact information of two references to: Prof. Wentao GU Email: wtgu@njnu.edu.cn; wentaogu@gmail.com Phone: (office) +86-25-8359-8624, (mobile) +86-189-3687-2840 The position will remain open until it is filled. Early application is strongly recommended.
| ||
6-3 | (2012-07-26) PhD position in spelling correction by statistical machine translation, Univ. Le Mans, France. Funded PhD position at the Computer Science Laboratory of the Université du Maine (LIUM) in the field of automatic spelling correction using statistical machine translation methods. Location: LIUM (Le Mans). Start date: 1/10/2012. Duration: 3 years. This thesis is part of the 'investissement d'avenir' project PACTE, led by the company Diadeis, whose other partners are the Alpage team (INRIA and Paris 7) and the companies A2ia and Isako. PACTE aims to improve the orthographic quality of texts produced by various text-capture methods. The emphasis is on OCR output (optical character recognition of scanned printed texts), but the project also concerns data obtained through handwriting recognition, manual keying, and direct authoring. The techniques used will be both statistical and hybrid, drawing on computational-linguistics tools and resources. The project's main application domain is the digitization of the written heritage, in a multilingual context. A second thesis will start at Alpage, focusing on the use of linguistic knowledge to help optimize, automatically or semi-automatically, the orthographic quality of texts. Within the PACTE project there will be close collaboration between LIUM, Alpage and Diadeis. In this context, the goal of the thesis at LIUM is to investigate how statistical machine translation techniques can be used for error correction. Error correction can indeed be viewed as a process of translating from an erroneous language into a correct one. A similar approach has already been used successfully to correct the output of rule-based translation systems, an approach known as 'statistical post-editing' (SPE). The thesis will therefore study how a similar approach can be applied to spelling correction. An important aspect of the thesis concerns the development of efficient language models that give good results with a small memory footprint. Back-off n-gram models will be favoured, but other methods will also be explored, in particular continuous-space language models. We will also investigate the integration of morphosyntactic knowledge, in collaboration with the Alpage team. The languages studied will primarily be French and English, as well as German. An application to Spanish, Italian, or other European languages is possible. Desired profile: - good computer-science skills (proficiency with Linux is essential; C++ programming, scripting, Perl, etc.); - knowledge of statistical machine translation is desirable, or, failing that, of machine learning; - experience with the Moses toolkit is a plus. The thesis will take place in the LST team of LIUM. LIUM is internationally known for its research in statistical machine translation, and we have many collaborations with universities and companies in Europe and the United States. Contact: Holger Schwenk Holger.Schwenk@lium.univ-lemans.fr
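As an illustration of the 'erroneous language to correct language' view described above, the toy Python sketch below (the models and vocabulary are invented placeholders, not part of the project) scores candidate corrections by combining a channel model P(observed | correct) with a small bigram language model, the two ingredients that an SMT-based correction system such as a Moses SPE setup would estimate from aligned data.

# Illustrative sketch only: spelling correction as noisy-channel decoding,
# the idea underlying SMT-based correction. A real system would use phrase
# tables and large back-off n-gram LMs; here a toy channel model and a toy
# bigram LM are combined greedily, word by word.
import math
from collections import defaultdict

channel = {("teh", "the"): 0.8, ("teh", "teh"): 0.2, ("cat", "cat"): 1.0}
bigram_lm = defaultdict(lambda: 1e-6, {("<s>", "the"): 0.3, ("the", "cat"): 0.2,
                                       ("<s>", "teh"): 1e-5, ("teh", "cat"): 1e-5})

def correct(tokens, vocab):
    """Pick, word by word, the correction maximizing LM score times channel score."""
    prev, output = "<s>", []
    for obs in tokens:
        candidates = [w for w in vocab if (obs, w) in channel] or [obs]
        best = max(candidates,
                   key=lambda w: math.log(bigram_lm[(prev, w)])
                               + math.log(channel.get((obs, w), 1e-9)))
        output.append(best)
        prev = best
    return output

print(correct(["teh", "cat"], vocab={"the", "teh", "cat"}))  # -> ['the', 'cat']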
| ||
6-4 | (2012-08-03) PhD Studentship in Speaker Diarization at EURECOM, Sophia Antipolis, Alpes Maritimes, France PhD Studentship in Speaker Diarization at EURECOM
| ||
6-5 | (2012-08-08) Junior researcher position in language sciences, Univ. Mons-Hainaut, Belgium. Job offer: 'Researcher in speech sciences' _____________________________________________________________________________________
Service de Métrologie et Sciences du langage, Laboratoire de phonétique,
Université de Mons, Mons, Belgium _____________________________________________________________________________________
The Service de Métrologie et Sciences du Langage is building an open recruitment pool with a view to possible appointments to grant-funded junior researcher positions (M/F) in speech sciences. Candidate profile (M/F):
Position profile: The post holder (M/F) will contribute to the research efforts of the Service within one of the themes belonging to one of the two projects described in the appendices. He or she will prepare a doctoral thesis connected with one of these projects and may be asked to take a minor part in the teaching-support activities of the Service. Some positions are already available in the Service; others may be awarded following an application in a competition (internal and/or external to UMONS), with the support of the Service. The research grant runs for four years, in renewable one-year instalments, with a starting date no earlier than 1 October 2012. Interested candidates should send as soon as possible a file comprising:
in PDF format (exclusively) to: bernard.harmegnies@umons.ac.be Appendix 1: PAROLPATHOS project Project title
Acoustic and auditory evaluation of the speech signal of French-speaking speakers in situations of disability. Contributions of clinical phonetics to the development of procedures for the holistic evaluation of the communicating subject situated in his or her ecosystem.
Project summary and objectives
The acoustic phenomena produced by a speaker when realizing a communicative intention do not only comprise the material manifestation of the linguistic forms prescribed by the systems of the language. The speech signal also carries many elements that are unrelated to the linguistic signs but causally linked to various aspects of the speaker's functioning or state, and which can therefore serve as cues.
When the subject is in a situation of disability, elements linked to a pathological context may thus surface in the speech signal. The speaker's vocal productions can then bear the mark both of problems located in one or another of the language-processing procedures and of non-linguistic difficulties that have more or less direct repercussions on language.
Clinical phonetics, a discipline that has been emerging over the past two decades in the English-speaking world and only in recent years in the French-speaking world, focuses on these phenomena, with the aim of putting its laboratory methods and techniques at the service of understanding how the speaker functions when a pathology turns the communication situation into a situation of disability. Today, although these studies show the approaches of clinical phonetics to be of very high interest, it must be acknowledged that the state of scientific knowledge in this area remains embryonic and uneven. On the one hand, research most often focuses on phenomena observed in English-language speech samples: from this point of view, French remains the poor relation. On the other hand, some pathological domains have received far less effort than others (if any at all). Moreover, different teams often use different methodologies even when working on identical pathological conditions, so it is difficult to assess the reliability and validity of the various measurement approaches. In addition, in many cases, acoustically based quantitative approaches, however sophisticated, do not reach a clinical finesse comparable to that of the human expert. Finally, most of the tools available today come with technical and methodological constraints that make them difficult to use in ecological communication contexts.
Our project consequently aims to produce a general synthesis of the available evaluation methods, valid for the productions of French-speaking speakers expressing themselves in situations of disability with a pathological dimension. It draws on a wide variety of clinical pictures (articulation disorders, fluency disorders, acquired laryngeal pathologies, carcinomas of the upper aerodigestive tract, neurological pathologies not specifically related to the language sphere, physiological vs. pathological language ageing) and on a diversified panel of methodological approaches, in order to study the crossing of clinical pictures and methodologies across a broad conceptual territory. It compares these acoustically based measurement approaches with evaluations made by human listeners with varying types and levels of expertise. In doing so, it aims not only at cross-validation, but also questions the cognitive processes that allow the observer to build and exercise his or her expertise. It further studies the generalizability of measurements and evaluations performed in artificial contexts (laboratory, hospital) to ecological life contexts (family, services, institutions) and proposes an integrated approach to evaluating the contribution of communicative quality to quality of life.
Appendix 2: COGNIPHON project Project title
Cognitive control of speech sound production during L2 acquisition
Project summary and objectives
An individual who, having acquired mastery of language through his or her mother tongue alone, wishes to learn another language is confronted with the need to process in the L2 (in perception as in production) sounds similar to those of the L1 in a different way (for example, certain realizations of Spanish /e/ can be acoustically very similar to those of Portuguese /ɛ/), or even to perceive and produce in the L2 sounds that do not exist in the L1 (for example, the production of /ʔ/ in Arabic does not correspond to the realization of any English phoneme). The idea that, in the course of this learning, the subject inappropriately resorts to the strategies habitual in the L1 has long inspired linguists, educators and, in particular, scientists interested in human cognition, who may see in it the inappropriate deployment of strategies routinized through the use of the L1.
Teachers of oral skills in foreign languages (particularly the verbo-tonal school), concerned with addressing this propensity of the learner, have proposed various means of intervention within approaches customarily gathered under the label of phonetic correction. This has resulted in a technical corpus of didactic procedures resting essentially on practitioners' expertise. Language teachers generally agree on the value of these techniques. Nevertheless, the objective study not only of their effectiveness but also, more fundamentally, of how they work has received very little attention, apart from the work of a few all-too-rare teams. The current situation is therefore paradoxical: even though verbal exchanges are now at the centre of L2 classroom practice, the question of the pedagogical treatment of the acquisition of the cognitive processes that manage phonic material has largely been overlooked in empirical research, and very little knowledge reliably grounded in experimental evidence is in fact available. This is all the more regrettable given that the social demand for high-quality multilingual oral performance is constantly increasing and that various agencies today sell, at a high price, often commonplace know-how purportedly founded on scientific evidence that is in fact unproven. Our project aims to develop a research programme whose purpose is precisely to help fill this gap through experimental designs capable of assessing the weight of the various causal elements involved in acquiring new phonic control skills, and applicable to any subject, whatever his or her intrinsic characteristics. Although our thinking originates in observations made in concrete teaching-learning situations and is also nourished by teachers' expertise, we nonetheless resolutely place ourselves within a perspective of fundamental research. Our object of study is none other than the set of extrinsic factors that can be manipulated to foster the foreign-language learner's mastery of new possibilities of phonic control, whether or not these factors have been identified and/or deliberately exploited in the pedagogical setting. We are thus at the heart of the cognitive processes 'involved in the acquisition, perception, comprehension and production of spoken language [...]' which Ferrand & Grainger (2004, p. 11) define precisely as the very object of the scientific concerns of cognitive psycholinguistics.
| ||
6-6 | (2012-08-24) Post-docs call for application at Brain and Language Research Institute, Aix en Provence France Brain and Language Research Institute
| ||
6-7 | (2012-08-30) Post-doc position at LIMSI-CNRS in the Spoken Language Processing group, Paris Post-doc position at LIMSI-CNRS Post-doc position in the Spoken Language Processing group A post-doc position will be proposed at LIMSI-CNRS, in the context of the ANR-funded CHIST-ERA CAMOMILE Project (Collaborative Annotation of multi-MOdal, MultI-Lingual and multi-mEdia documents).
Description Human activity is constantly generating large volumes of heterogeneous data, in particular via the Web. These data can be collected and explored to gain new insights in social sciences, linguistics, economics, behavioural studies as well as artificial intelligence and computer sciences. Skills A PhD in a field related to the project is required. Contacts
Agenda
| ||
6-8 | (2012-08-30) A Post-doc position at Bruno Kessler Foundation, Center for Information Technology (Trento-Italy) A Post-doc position is available in the Speech-acoustic scene analysis and interpretation - SHINE Unit at Bruno Kessler Foundation, Center for Information Technology. The Bruno Kessler Foundation (FBK) conducts research activities in Information Technology, Materials and Microsystems, Theoretical Physics, Mathematics, Italian-Germanic historical studies, Religious studies and International Relations. Through its network, it also develops research in the fields of international relationships, conflict causes and effects, European economic institutions, behavioral economics and evaluative assessment of public policies. Workplace description The SHINE unit conducts research on acoustic signal processing and interpretation, mainly concerning speech signals acquired by multi-microphone systems in indoor environments. The research aims to progress in the scientific areas of Acoustic Scene Analysis and Speech Interaction under noisy and reverberant conditions, in particular with a speaker at a distance from the microphones. More information about the SHINE unit is available at the following link: http://shine.fbk.eu
Job description The SHINE Research Unit is looking for a candidate to carry out research activities in the field of Distant Speech Recognition. Applications are invited for a post-doctoral researcher who will work under the DIRHA project funded by the EU (http://dirha.fbk.eu) and on other internal research activities. This project aims to study voice-based systems in domestic environments supporting natural speech interaction using distant microphones, e.g. for supporting motor-impaired persons. Main fields of research are multi-channel acoustic processing, distant speech recognition and understanding, speaker identification and verification, and spoken dialogue management. Job requirements
Employment Type of contract: 30-month contract Number of positions: 1 Gross salary: from 33,000 to 41,000 € per year (depending on the candidate’s experience) Benefits: company-subsidized cafeteria or meal vouchers, internal car park, welcome office support for visa formalities, accommodation, social security, etc., reductions on bank accounts, public transportation, sport, accommodation and language course fees. Start date: Autumn 2012 Place: Povo, Trento (Italy) Application process To apply online, please send your detailed CV (.pdf format) including a list of publications, a statement of research interests and contact information for at least 2 references. Please include in your CV your authorization for the handling of your personal information as per the Italian Personal Data Protection Code, Legislative Decree no. 196/2003 of June 2003. Applications must be sent to jobs@fbk.eu Emails should have the following reference code: SHINE_PostDoc2012_DSR Application deadline: September 25th 2012 For more information, please contact: Maurizio Omologo (e-mail: omologo@fbk.eu) Candidates who pass the preliminary CV screening will be contacted shortly for an interview. Applicants who are not selected will be notified at the end of the selection process. Please note that FBK may contact short-listed candidates who were not selected for the current openings within a period of 6 months for any selection process for similar positions. For transparency purposes, the name of the selected candidate, upon his/her acceptance of the position, will be published on the FBK website at the bottom of the selection notice.
| ||
6-9 | (2012-09-07) 4 positions as Google's Dublin office as Speech Linguistic Project Managers There are four temporary positions opening at Google's Dublin office as Speech Linguistic Project Managers for French, Italian, German and Spanish (see description below). The role would suit someone with an advanced degree in (Computational) Linguistics (Master's degree or Ph.D.) and a native speaker of one of these languages.
These positions were recently advertised on the Linguist List (http://linguistlist.org/jobs/get-jobs.cfm?JobID=98660&SubID=4551801) where all the relevant information can be found. A description can be found below as well.
Job title:
Speech Linguistic Project Manager (French, German, Italian, Iberian Spanish)
Job description:
As a Linguistic Project Manager and a native speaker of one of the target languages, you will oversee and manage all work related to achieving high data quality for speech projects in your own language.
You will be based in the Dublin office, managing a team of Data Evaluators and working on a number of projects towards Speech research: ASR, TTS, and NLP
This includes:
- managing and overseeing the work of your team
- creating verbalisation rules, such as expanding URLs, email addresses, numbers
- providing expertise on pronunciation and phonotactics
- building and maintaining a database of speech recognition patterns
- creating pronunciations for new lexicon entries, maintaining the lexicon
- working with QA tools according to given guidelines and using in-house tools
Job requirements:
- native-level speaker of one of the target languages (with good command of the standard dialect) and fluent in English
- keen ear for phonetic nuances and attention to detail; knowledge of the language's phonology
- must have attended elementary school in the country where the language is spoken
- ability to quickly grasp technical concepts
- excellent oral and written communication skills
- good organizational skills, previous experience in managing external resources
- previous experience with speech/NLP-related projects a plus
- advanced degree in Linguistics, Computational Linguistics preferred
- also a plus: proficiency with HTML, XML, and some programming language; previous experience working in a Linux environment
Project duration: 6-9 months (with potential for extension)
For immediate consideration, please email your CV and cover letter in English (PDF format preferred) with 'Speech Linguistic Project Manager [language]' in the subject line.
Email Address for Applications: DataOpsMan@gmail.com
Contact Person: Linne Ha
Closing date: open until filled
| ||
6-10 | (2012-09-26) Speech Recognition/Machine Learning Engineers ar Cantab Research, Cambridge,UK Speech Recognition/Machine Learning Engineers
| ||
6-11 | (2012-10-05) ASSOCIATE RESEARCH SCIENTIST POSITION at ETS Princeton, NJ, USA ASSOCIATE RESEARCH SCIENTIST POSITION Speech Educational Testing Service Headquartered in Princeton, NJ, ETS is the world’s premier educational measurement institution and a leader in educational research. As a nonprofit corporation and an innovator in developing tests for clients in education, government, and business, we are dedicated to advancing educational excellence for the communities we serve. ETS's Research & Development division has an opening for a researcher in the NLP & Speech Group. The Group currently consists of about 15 Ph.D.-level research scientists in areas related to NLP and speech. Its main focus is on foundational research as well as on development of new capabilities to automatically score written and spoken test responses in a wide range of ETS test programs including TOEFL(R) iBT and GRE(R). PRIMARY RESPONSIBILITIES Provide scientific and technical skills in conceptualizing, designing, obtaining support for, conducting, managing, and disseminating results of research projects in the field of speech technology, or portions of large-scale research studies or programs in the same field. Develop and/or modify speech theories to conceptualize and implement new capabilities in automated scoring and speech-based analysis and evaluation systems which are used to improve assessments, learning tools and test development practices. Apply scientific, technical and software engineering skills in designing and conducting research studies and capability development in support of educational products and services. Develop and oversee the conduct of selected portions of research proposals and project budgets. Design and conduct complex scientific studies, functioning as an expert in major facets of the projects. Assist in the conduct of research projects by accomplishing directed tasks according to schedule and within budget. Participate in dissemination activities through the publication of research papers, progress and technical reports, the presentation of seminars or other appropriate communication vehicles. Develop professional relationships as a representative, consultant or advisor to external advisory and policy boards and councils, research organizations, educational institutions and educators. REQUIREMENTS A Ph.D. in Language Technologies, Natural Language Processing, Computer Science or Electrical Engineering, with strong emphasis on speech technology and preferably some education in linguistics, is required. Evidence of substantive research experience and/or experience in developing and deploying speech capabilities is required. Demonstrable contributions to new and/or modified theories of speech processing and their implementation in automated systems. Demonstrable expertise in the application of speech recognition systems and fluency in at least one major programming language (e.g., Java, Perl, C/C++, Python). HOW TO APPLY Please apply online at www.ets.org/careers – position #124337. ETS offers competitive salaries, outstanding benefits, a stimulating work environment, and attractive growth potential. ETS is an Equal Opportunity, Affirmative Action Employer.
| ||
6-12 | (2012-10-05) Researcher in Speech Technology at Vicomtech-IK4, San Sebastian, Spain Researcher in Speech Technology
| ||
6-13 | (2012-10-10) Dolby Research Beijing looking for world-class talent!
| ||
6-14 | (2012-10-11) Faculty Position at the Center for Spoken Language Understanding, Portland, Oregon
The Institute on Development & Disability/Center for Spoken Language Understanding invites applications at all ranks for a faculty position in Natural Language Processing, to include technologies for analysis of speech, language, or both. Special interest in applications to behavioral manifestations of neurological disorders is essential.
The primary interest is to extend our existing program in developing behavioral technologies that allow early detection and remediation of a wide range of neurological disorders, including Autism and Parkinson’s.
The Institute on Development & Disability/Center for Spoken Language Understanding is at the forefront of this new, exciting area of research. The faculty member will be expected to teach courses supporting the research program and appropriate background areas such as Machine Learning and Computational Linguistics. We seek a researcher with a well-developed program in Natural Language Processing, to collaborate with the CSLU team and with clinicians throughout OHSU. The appointee will be expected to maintain an independent, extramurally funded research program.
Requirements:
Please contact: Jan van Santen, vansantj@ohsu.edu
| ||
6-15 | (2012-10-11) Research Programmer at the College of Pharmacy at the University of Minnesota Brief Description: The College of Pharmacy at the University of Minnesota is seeking a talented, pro-active and innovative individual for a Research Programmer position to work on several projects in the Center for Clinical and Cognitive Neuropharmacology (C3N). C3N is engaged in conducting interdisciplinary research focused on cognitive effects of medications and neurodegenerative disorders such as Alzheimer's disease. Computerized assessment is used to measure these cognitive effects. The successful candidate for this position will be responsible for a variety of computer-related tasks including creating and maintaining innovative computer-based neuropsychological testing applications that involve spontaneous speech and language collection and analysis. The successful candidate will also be responsible for creating and maintaining databases used to store and organize experimental samples and web-enabled interfaces to the databases and data analysis tools. The successful candidate will also be expected to work with graduate and undergraduate students on specific programming and research projects to meet the needs of the Center.
Full Description is available here on the official University of Minnesota job posting site:
| ||
6-16 | (2012-10-12) Postdoctoral research position: Automatic identification of South African languages Postdoctoral research position: Automatic identification of South African languages U. Stellenbosch, South Africa
A postdoc position focussing on automatic language identification for the eleven official languages of South Africa is available in the Digital Signal Processing Group of the Department of Electrical and Electronic Engineering at the University of Stellenbosch. Specific project objectives include the development of a research system, the production of associated publishable outputs, and the development of a web-based demonstrator. The position is part of a bilateral project grant between the Netherlands and South Africa. Applicants should hold a PhD in the field of Electronic/Electrical Engineering, Information Engineering, or Computer Science, or other relevant disciplines. Suitable candidates must have strong computer programming, analytical and mathematical skills, and be familiar with a Linux computing environment. Candidates must also be self-motivated and able to work independently. Finally, candidates must have excellent English writing skills and have an explicit interest in scientific research and publication. The position will be available for one year, with a possible extension to a second year, depending on progress and available funds. The proposed starting date is not later than 15 January 2013. Applications should include a covering letter, curriculum vitae, list of publications, research projects, conference participation and details of three contactable referees and should be sent as soon as possible to: Prof Thomas Niesler, Department of Electrical and Electronic Engineering, University of Stellenbosch, Private Bag X1, Matieland 7602. Applications can also be sent by email to: trn@sun.ac.za. The successful applicant will be subject to University policies and procedures. Interested applicants are welcome to contact me at the above e-mail address for further information regarding the project.
| ||
6-17 | (2012-10-18) Dolby’s technology group in Beijing looking for software engineers! Dolby’s technology group in Beijing looking for software engineers!
Job Title: Embedded SW-Engineer (Audio)
Summary Description: This position is in the Engineering organization of Dolby Laboratories and is located in Beijing, China. The main focus of this position is to implement Dolby’s audio technologies, including creating the reference code and porting it to embedded platforms such as ARM cores or TI DSPs. The position requires deep knowledge of signal processing algorithms, fixed-point algorithms and optimization techniques, including the use of assembly language, as well as an excellent understanding of DSP architectures.
We are looking for a highly motivated individual for whom working with different tool chains under various operating systems in hardware-close environments is fun rather than a challenge.
The candidate will be part of the engineering team in Beijing and work closely with other Dolby engineering entities in the US, Germany and Australia. We expect the candidate to build up expert knowledge of highly efficient Dolby audio engines. Working in an international environment requires excellent verbal and written English communication skills. Essential Job Functions:
Teamwork & Communications
Education, Skills, Abilities, and Experience:
Please send your CV to cv.engineerin.beijing@dolby.com
| ||
6-18 | (2012-10-19) Post-doc at LABEX EFL, Paris 3. Within the LABEX EFL 'Empirical Foundations of Linguistics' (http://www.labex-efl.org/), a 10-year project launched in 2011, we offer a 6-month full-time post-doctoral project within the research operation 'Assessing phonetic and phonological complexity in motoric speech disorders'. The candidate will take part in research on phonetic and phonological complexity in the context of motor speech disorders. He or she will work under the responsibility of Cécile Fougeron and in collaboration with Dr Lise Crevier-Buchman at the Laboratoire de Phonétique et Phonologie in Paris. The candidate will hold a PhD in phonetics or in Speech and Hearing Sciences, with experience in clinical phonetics. Training in acoustic phonetics and in experimentation with patients is required. Experience in physiological investigation with ultrasound will be a plus. Application files should contain the following documents and be sent to martine.adda-decker_at_univ-paris3.fr before 21/11/2012:
For more information, see http://www.labex-efl.org/?q=fr/node/136 Contact: Cécile Fougeron (LPP-P3)
Supervisor's address: cecile.fougeron@univ-paris3.fr
University: Université Paris 3
Level: postdoctoral researcher
Duration: 6 months
Salary: €12,000 net / 6 months
Specialties: clinical phonetics, acoustics, tongue ultrasound, dysarthria
Application deadline: 21 November 2012
Application address: martine.adda-decker_at_univ-paris3.fr
Application reference: EFL-PPC5
| ||
6-19 | (2012-10-20) Large-Scale Audio Indexing Researchers/Engineers: 2 W/M positions at IRCAM-Paris Job Openings: Large-Scale Audio Indexing Researchers/Engineers: 2 W/M positions at IRCAM
Starting : January, 2013
Duration : 18 months
Position description A
The hired Researcher will be in charge of the research and development of scalable technologies for supervised learning (i.e. scaling GMM, PCA or SVM algorithms) so that they are applicable to millions of annotated data items.
He/she will then be in charge of applying the developed technologies to the training of large-scale music genre and music mood models and of applying those models to large-scale music catalogues.
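As a rough illustration of what scalable supervised learning can mean in practice, the sketch below (an assumption on our part, using scikit-learn rather than any specific IRCAM framework) trains a linear classifier incrementally, chunk by chunk, so that millions of annotated feature vectors can be processed without loading them all into memory.

# Illustrative sketch only (assumes scikit-learn; not IRCAM's actual stack):
# incremental (out-of-core) training so that a genre/mood classifier can be
# fit on feature vectors that do not fit in memory at once.
import numpy as np
from sklearn.linear_model import SGDClassifier

classes = np.array([0, 1, 2])            # e.g. three music-genre labels
clf = SGDClassifier(loss="hinge")        # linear SVM trained by SGD

def feature_chunks(n_chunks=100, chunk_size=10_000, dim=64):
    """Placeholder generator; a real system would stream features from disk."""
    rng = np.random.default_rng(0)
    for _ in range(n_chunks):
        yield rng.normal(size=(chunk_size, dim)), rng.integers(0, 3, chunk_size)

for X_chunk, y_chunk in feature_chunks():
    clf.partial_fit(X_chunk, y_chunk, classes=classes)  # one pass per chunk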
Required profile for A:
Position description B
The hired Engineer/Researcher will be in charge of developing the framework for scalable storage, management and access of distributed data (audio and meta-data). He/she will also be in charge of developing scalable search algorithms.
Required profile for B:
The hired Engineers/Researchers A and B will also collaborate with the development team and participate in the project activities (evaluation of technologies, meetings, specifications, reports).
Introduction to IRCAM
IRCAM is a leading non-profit organization associated with the Centre Pompidou, dedicated to music production, R&D and education in sound and music technologies. It hosts composers, researchers and students from many countries cooperating in contemporary music production and in scientific and applied research. The main topics addressed in its R&D department include acoustics, audio signal processing, computer music, interaction technologies and musicology. IRCAM is located in the centre of Paris near the Centre Pompidou, at 1 Place Igor Stravinsky, 75004 Paris.
Salary
According to background and experience
Applications
Please send an application letter together with your resume and any suitable information addressing the above issues preferably by email to: peeters_a_t_ircam dot fr with cc to vinet_a_t_ircam dot fr, roebel_at_ircam_dot_fr
| ||
6-20 | (2012-11-03) Research Scientist at Yandex Zurich Title: Research Scientist
| ||
6-21 | (2012-11-08) Postdoc position at IBM Language and Knowledge Center, Trento, Italy The newly established IBM Language and Knowledge Center, Trento, Italy, has a postdoc position in the following area: - Natural Language Dialog The postdoc scholar will be part of a research project aiming at designing machines that interact with humans and support them in complex and large-scale knowledge and decision-making tasks. The team includes researchers from IBM and the TrentoRise Human Language Technology Center founded by the University of Trento and FBK. Candidates with a strong research background in at least one of the following: - Conversational Dialogue Systems - Statistical models of Dialogue - Natural Language Understanding - Machine Learning - Question Answering Systems are invited to apply. The official postdoc position application site: http://www.trentorise.eu/call-for-participation/bando-di-selezione-l-call-positions If you would like to enquire about the position, send an email along with your CV addressed to Prof. Giuseppe Riccardi sisl-jobs@disi.unitn.it Subject: Postdoc Position at IBM, Trento Deadline: February 5, 2013
| ||
6-22 | (2012-10-12) PhD position on The influence of robots on the development of language, New Zealand PhD Position: The influence of robots on the development of language
Job Posting
Project description
The ‘Wordovators’ project is a three-year project funded by the John Templeton Foundation. The project will conduct large-scale experiments in the form of computerized word games. These games will be designed to probe the factors underpinning word creation and creativity, and how these develop through the life-span. One strand of the project will probe particular issues surrounding interactions between people and humanoid Robots. How are new words created and adopted in contexts involving such interactions? This PhD position is for a highly motivated student to join the project team, and conduct work that explores the ways that robots might shape human languages. These studies will analyze the factors and processes that might contribute to the influence of robots on the vocabularies of English and of artificial languages in imaginary worlds.
This project is a collaboration between the University of Canterbury, New Zealand, and Northwestern University, USA. The PhD candidate will enroll for a PhD degree in the HIT Lab NZ at the University of Canterbury, and will be primarily supervised by Dr Christoph Bartneck (HIT Lab NZ). Other associated faculty are Professor Jen Hay (NZILBB), Janet Pierrehumbert (Northwestern University / Adjunct Professor NZILBB), and Professor Stephanie Stokes (NZILBB). The PhD student will be encouraged to regularly visit Northwestern University.
Your skills: You should have an interest in human language and have a strong background in robotics or computer science. The HIT Lab NZ: The Human Interface Technology Laboratory New Zealand (HIT Lab NZ) is a world-leading research institution developing and commercializing technology that improves human-computer interaction. The HIT Lab NZ has over 50 staff and students and has extensive experience in Human-Computer Interaction and Science & Technology Studies. The HIT Lab NZ is located at the University of Canterbury in Christchurch, New Zealand. The University of Canterbury has the top Engineering School in New Zealand, including a highly ranked Department of Computer Science. For more information about the HIT Lab NZ see http://www.hitlabnz.org/.
NZILBB: The HIT Lab NZ at the University of Canterbury is affiliated with the New Zealand Institute of Language, Brain and Behaviour (NZILBB). NZILBB is a multi-disciplinary centre dedicated to the study of human language. The researchers come from a wide range of disciplines, forging connections across linguistics, speech production and perception, language acquisition, language disorders, social cognition, memory, brain imaging, cognitive science, bilingual education, and interface technologies. More information is available at: http://www.nzilbb.canterbury.ac.nz/.
Christchurch
Christchurch is the second largest city in New Zealand and offers an exciting and easy lifestyle for students. It is the most affordable major city to live in. It is easy to get around whether you are biking, walking, driving or using the excellent public transport system. Christchurch also offers outstanding opportunities for outdoor activities, and is close to both surf beaches and ski-fields. Appointment and Scholarship Support: The PhD scholarship is full time for a duration of three years, with an annual scholarship of $25,000 NZD. The scholarship will also cover the tuition fees. The research in this project must conclude with the writing of a PhD thesis within the Human Interface Technology PhD program of the HIT Lab NZ. For more information about the PhD program in Human Interface Technology, please see http://www.hitlabnz.org/index.php/education/phd-program. Further Information and Application: Further information can be obtained by contacting Christoph Bartneck (christoph.bartneck@canterbury.ac.nz). Information about the HIT Lab NZ is available at: http://www.hitlabnz.org. Please upload your application as one PDF file at http://www.hitlabnz.org/index.php/jobs/job/37/ Your application must include a letter explaining your specific interest in the project, an extensive curriculum vitae, your academic records, and a list of two references. Applications will be accepted until November 15th, 2012, or until the position is filled.
International applicants will be required to arrange for their NZ student visa after an offer of a place. Please check http://www.immigration.govt.nz for information about what type of visa might be most suitable and the process of acquiring it. The university has various types of accommodation available on campus. Please check http://www.canterbury.ac.nz/accom/ for information about the options and prices. International students should also consult the International Student website at http://www.canterbury.ac.nz/international/ to learn about the cost of living, fees, and insurances.
| ||
6-23 | (2012-11-15) Faculty positions at CSLP at the Johns Hopkins University in Baltimore, USA The Center for Language and Speech Processing at the Johns Hopkins University in Baltimore, USA,
| ||
6-24 | (2012-11-26) INRIA-Internship for Master2 Students INRIA-Internship for Master2 Students Title: Speech analysis for Parkinson's disease detection Description: Parkinson's disease (PD) is one of the most common neurodegenerative disorders and its clinical diagnosis, particularly early diagnosis, is still a difficult task. Recent research has shown that the speech signal may be useful for discriminating people with PD from healthy ones, based on clinical evidence which suggests that the former typically exhibit some form of vocal disorder. In fact, vocal disorder may be amongst the earliest PD symptoms, detectable up to five years prior to clinical diagnosis. The range of symptoms present in speech includes reduced loudness, increased vocal tremor, and breathiness (noise). Vocal impairment relevant to PD is described as dysphonia (inability to produce normal vocal sounds) and dysarthria (difficulty in pronouncing words). The use of sustained vowels, where the speaker is requested to sustain phonation for as long as possible, attempting to maintain steady frequency and amplitude at a comfortable level, is commonplace in clinical practice. Research has shown that the sustained vowel “aaaaa” is sufficient for many voice assessment applications, including PD status prediction. The first goal of this internship is to implement/improve some state-of-the-art algorithms for dysphonia measures and use them within an appropriate classifier (such as an SVM) to discriminate between disordered and healthy voices. These measures are based on linear and nonlinear speech analysis and are well documented in [1]. The experiments will be carried out on the well-established Kay Elemetrics Disordered Voice Database (http://www.kayelemetrics.com/). The second goal is to try to develop new dysphonia measures based on novel nonlinear speech analysis algorithms recently developed in the GeoStat team [2]. These algorithms have indeed shown significant improvements w.r.t. state-of-the-art techniques in many applications including speech segmentation, glottal inverse filtering and sparse modeling. The work of this internship will be conducted in collaboration with Dr. Max Little (MIT Media Lab and Imperial College London) and should lead to a PhD fellowship proposal. References: [1] A. Tsanas, M.A. Little, P.E. McSharry, J. Spielman, L.O. Ramig. Novel speech signal processing algorithms for high-accuracy classification of Parkinson’s disease. IEEE Transactions on Biomedical Engineering, 59(5):1264-1271, 2012. [2] PhD thesis of Vahid Khanagha. INRIA Bordeaux-Sud Ouest. January 2013. Prerequisites: A good level in mathematics and signal/speech processing is necessary, as well as Matlab and C/C++ programming. Knowledge of machine learning would be an advantage. Supervisor: Khalid Daoudi (khalid.daoudi@inria.fr), GeoStat team (http://geostat.bordeaux.inria.fr). Location: INRIA Bordeaux-Sud Ouest (http://www.inria.fr/bordeaux), Bordeaux, France. Starting date: February/March 2013. Duration: 6 months Salary: 1200 euros / month
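For orientation, a minimal sketch of the first goal, assuming scikit-learn and a pre-computed matrix of dysphonia measures (replaced here by random placeholders), could look as follows; it is not the internship's prescribed code, only an illustration of SVM-based discrimination between disordered and healthy voices in the spirit of [1].

# Illustrative sketch: classifying voices as PD vs. healthy from dysphonia
# measures with an SVM. X is assumed to hold per-recording measures (e.g.
# jitter, shimmer, HNR, RPDE, DFA, PPE) extracted beforehand from sustained
# /a/ phonations; y holds labels (1 = PD, 0 = healthy).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))          # placeholder for real dysphonia features
y = rng.integers(0, 2, size=60)       # placeholder labels

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross-validated accuracy
print("CV accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))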
| ||
6-25 | (2012-12-07) Doctoral and Post-doctoral Positions in Signal Processing for Hearing Instruments, Bochum, Germany Doctoral and Post-doctoral Positions in Signal Processing for Hearing Instruments Position Description The ITN ICanHear, starting on 1 January 2013, will provide cutting-edge research projects for 12 doctoral and 5 post-doctoral research fellows in digital signal processing for hearing instruments. ICanHear aims to develop models based on emerging knowledge about higher-level processing within the auditory pathway and exploit that knowledge to develop creative solutions that will improve the performance of hearing instruments. Attractive grants and a wide variety of international training activities, including collaborations with ICanHear Associated Partners in the U.K., Switzerland, Belgium, U.S.A., and Canada, will be made available to successful candidates, who will stay in the network for a period of 12 to 36 months.
Research and training positions will be available in the following ICanHear labs:
Requirements for Candidates and Application procedure: Early-stage (doctoral) Research Fellows have less than four years experience in research after obtaining a Masters degree in engineering, computer science, or similar. Experienced (post-doctoral) Researcher Fellows are already in possession of a doctoral degree or have at least 4 years but less than 5 years of research experience in engineering and/or hearing research. In order to ensure transnational mobility candidates may have resided no more than 12 months (during the last 3 years) in the country of the host institution they wish to apply to. For all positions excellent English language skills are required. To apply please send in the following documents via e-mail to the ICanHear coordination office (icanhear@rub.de): CV, certified copies of all relevant diplomas and transcripts, two letters of recommendation, proof of proficiency in English, letter of motivation (research interest, reasons for applying to programme and host). For further information on research projects available, application details and eligibility please visit the ICanHear web-site (http://www.icanhear-itn.eu) or contact the project coordinator Rainer Martin (rainer.martin@rub.de).
| ||
6-26 | (2012-12-15) Technician in scientific instrumentation, experimentation and measurement, Aix-en-Provence, France. NOEMI CAMPAIGN, WINTER 2012-2013. POSITION PROFILE. Unit description: Unit code: UMR 7309. Unit name: Laboratoire Parole et Langage. Director: Noël NGUYEN. City: Aix-en-Provence. Regional delegation: DR12. Institute: INSHS. Position description: NOEMI NUMBER: T54030. GRADE: Technician. BAP: C. Job type: C4B21 - Technician in scientific instrumentation, experimentation and measurement. Function: Platform technician. Mission: Within the Laboratoire Parole et Langage (LPL), assigned to the Centre d’Expérimentation sur la Parole (CEP), the appointee will be responsible for supporting experiments in collaboration with the platform coordinator. Activities: The main activity consists of providing daily support for the operation of the platform, which may include: - managing the loan and tracking of equipment used on the platform or elsewhere; - mounting and assembling sub-systems (notably audio and video) for experiments; - assisting experimenters during experimental sessions by applying a defined protocol; - modifying or adapting experimental set-ups; - performing maintenance and first-level interventions, detecting and diagnosing failures; - making audio and video recordings (set-up and recording itself); - managing the consumables needed to run the experiments; - using instrument-control software applications. Skills: The candidate must show strong motivation for this support position, which is essential to the operation of the speech experimentation centre. Basic training in electronics and/or physical measurement is desired, in order to build, where necessary, elementary modules for synchronization between instruments, to synchronize the audio and video recording systems, and to mount and assemble sub-systems for experimental set-ups. The candidate should enjoy teamwork, since he or she will work closely with the platform coordinator. The person must be able to learn new techniques and have good interpersonal skills, as he or she will be in contact with a large number of users, and must show great rigour in following the procedures in place. Adherence to the health and safety rules in force is essential. Context: The Laboratoire Parole et Langage is a research unit of the CNRS and Aix-Marseille University. It hosts phoneticians, linguists, computer scientists, psychologists, neuroscientists, physicists and physicians. The LPL’s activities concern the study of the mechanisms of language and speech production and perception. The LPL distinguishes itself through research methods that combine experimentation, instrumental investigation and formalization, an original approach in this scientific field, which spans the humanities, the life sciences and the engineering sciences.
This particularity explains, beyond a strong fundamental research activity, the importance of the applications developed from this work in the fields of written-text processing, intelligibility of the spoken message, high-quality text-to-speech conversion, and the evaluation and rehabilitation of voice and language disorders. These characteristics make the Laboratoire Parole et Langage a research unit suited to the scientific challenges of the language sciences, while also being involved in their technological stakes. The LPL currently brings together more than 80 permanent staff (researchers, faculty members, engineers, technicians and administrative staff), plus 40 doctoral students, 20 of whom hold grants. It is the largest French laboratory in this scientific field and one of the leading ones in Europe. The LPL now has a technical platform bringing together a set of instruments for investigating speech production and perception: electroencephalography, eye tracking, articulography, electropalatography, articulatory evaluation, etc. This resource, unique in Europe, is shared within the Centre d’Expérimentation sur la Parole (http://www.lpl.univ-aix.fr/~cep/), the technical platform to which the position will be assigned.
| ||
6-27 | (2012-12-16) Master project, IRISA Rennes, France. Computer Science Internship, CORDIAL group. Title: Voice conversion from non-parallel corpora. Description: The main goal of a voice conversion system (VCS) is to transform the speech signal uttered by a speaker (the source speaker) so that it sounds as if it had been uttered by another person (the target speaker). The applications of such techniques are limitless. For example, a VCS can be combined with a Text-To-Speech system in order to produce multiple high-quality synthetic voices. In the entertainment domain, a VCS can be used to dub an actor with his or her own voice. State-of-the-art VCSs use Gaussian Mixture Models (GMM) to capture the transformation from the acoustic space of the source to the acoustic space of the target. Most of the models are source-target joint models that are trained on paired source-target observations. Those paired observations are often gathered from parallel corpora, that is, speech signals resulting from the two speakers uttering the same set of sentences. Parallel corpora are hard to come by. Moreover, they do not guarantee that the pairing of vectors is accurate. Indeed, the pairing process is unsupervised and uses Dynamic Time Warping under the strong (and unrealistic) hypothesis that the two speakers truly uttered the same sentence with the same speaking style. This assertion is often wrong and results in non-discriminant models that tend to over-smooth the speakers' distinctive characteristics. The goal of this Master's project is to remove the need for parallel corpora in the process of training joint GMMs for voice conversion. We suggest pairing speech segments on high-level speech descriptors such as those used in unit-selection Text-To-Speech. These descriptors contain not only segmental information (acoustic class, for example) but also supra-segmental information such as phoneme context, speed, prosody, power, etc. In a first step, both source and target corpora are segmented and tagged with descriptors. In a second step, each class from one corpus is paired with the equivalent class from the other corpus. Finally, a classical DTW algorithm can be applied to each paired class. The expected result is to derive transform models that can both take into account speaker variability and be more robust to pairing errors. Keywords: Voice Conversion, Gaussian Mixture Models. Contacts: Vincent Barreaud (vincent.barreaud@irisa.fr) Bibliography: [1] H. Benisty and D. Malah. Voice conversion using GMM with enhanced global variance. In Conference of the International Speech Communication Association (Interspeech), pages 669-672, 2011. [2] L. Mesbahi, V. Barreaud, and O. Boeffard. Non-parallel hierarchical training for voice conversion. In Proceedings of the 16th European Signal Processing Conference, Lausanne, Switzerland, 2008. [3] Y. Stylianou, O. Cappé, and E. Moulines. Continuous probabilistic transform for voice conversion. IEEE Transactions on Speech and Audio Processing, 6(2):131-142, 1998.
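To make the GMM-based mapping step concrete, the sketch below shows the classical joint-GMM conversion in the spirit of Stylianou et al. [3]; it assumes NumPy/SciPy/scikit-learn and DTW-paired feature frames (replaced here by synthetic placeholders), and is an illustration rather than the group's actual code.

# Minimal sketch of joint-GMM voice conversion: a GMM is trained on stacked
# source/target vectors z = [x; y] (e.g. DTW-paired MFCCs), then a source
# frame x is converted via the conditional expectation E[y | x].
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

D = 13                                        # feature dimension per speaker
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, D))                # placeholder source frames
Y = X + 0.5 * rng.normal(size=(5000, D))      # placeholder paired target frames

gmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0)
gmm.fit(np.hstack([X, Y]))                    # joint model over [x; y]

def convert(x, gmm, D):
    """Map one source frame x (shape (D,)) to the expected target frame."""
    mu, cov, w = gmm.means_, gmm.covariances_, gmm.weights_
    post = np.array([w[m] * multivariate_normal.pdf(x, mu[m, :D], cov[m, :D, :D])
                     for m in range(len(w))])
    post /= post.sum()                        # component posteriors P(m | x)
    y = np.zeros(D)
    for m in range(len(w)):
        cross = cov[m, D:, :D] @ np.linalg.inv(cov[m, :D, :D])
        y += post[m] * (mu[m, D:] + cross @ (x - mu[m, :D]))
    return y

print(convert(X[0], gmm, D)[:3])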
| ||
6-28 | (2012-12-16) Master project 2, IRISA Rennes, France. Computer Science Internship, CORDIAL group. Title: Unit-selection speech synthesis guided by a stochastic model of spectral and prosodic parameters. A Text-To-Speech (TTS) system produces a speech signal corresponding to the vocalization of a given text. Such a system is composed of a linguistic processing stage followed by an acoustic one, which complies as closely as possible with the linguistic directives. For the second stage, the most widely used approaches are: (1) the corpus-based synthesis approach, which relies on the selection and concatenation of unit sequences extracted from a large continuous speech corpus; it has been popular for 20 years, yielding an unmatched sound quality but still bearing some artefacts due to spectral discontinuities; (2) the statistical approach: a new generation of TTS systems has emerged in recent years, reintroducing rule-based systems; the rules are no longer deterministic as in the first systems of the 1950s, but are replaced by stochastic models. HTS, an HMM-based speech synthesis system, is currently the most widely used statistical system. HTS-type systems yield a good acoustic continuum, but with a sound quality that strongly depends on the underlying acoustic model. Recently, hybrid synthesis systems have been proposed, combining the statistical approach with unit selection. They consist in using the acoustic descriptions and melodic contours generated by a statistical system to drive the cost function during the natural-speech unit-selection phase, or in substituting units derived from a statistical system for poor-quality natural speech units. The framework of this subject is corpus-based TTS. Considering the combinatorial problem posed by the search for an optimal unit sequence with blind sequencing, the work consists in determining heuristics to reduce the search space and meet a real-time objective. These assumptions, based on spectral and prosodic parameters generated by HTS, will make it possible to implement pre-selection filters or to propose new cost functions within the corpus-based system developed by the Cordial group. The output of the hybrid system will be evaluated and compared via listening tests with standard systems such as HTS and a corpus-based system. Keywords: TTS, corpus-based speech synthesis, statistical learning, experiments. Contacts: Olivier Boeffard. Bibliography: [1] A. W. Black and K. A. Lenzo, Optimal data selection for unit selection synthesis, 4th ISCA Tutorial and Research Workshop on Speech Synthesis, 2001. [2] H. Kawai, T. Toda, J. Ni, M. Tsuzaki and K. Tokuda, Ximera: a new TTS from ATR based on corpus-based technologies. ISCA Tutorial and Research Workshop on Speech Synthesis, 2004. [3] S. Rouibia and O. Rosec, Unit selection for speech synthesis based on a new acoustic target cost, Interspeech, 2005. [4] H. Zen, K. Tokuda and A. W. Black, Statistical parametric speech synthesis. Speech Communication, v.51, n.11, pages 1039-1064, 2009. [5] H. Silen, E. Helander, J. Nurminen, K. Koppinen and M. Gabbouj, Using Robust Viterbi Algorithm and HMM-Modeling in Unit Selection TTS to Replace Units of Poor Quality, Interspeech 2010.
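As an illustration of how parameters generated by a statistical model can drive pre-selection and the target cost, the following sketch (function names, weights and features are assumptions, not the Cordial system's actual interface) scores candidate units against an HTS-style target and keeps only the cheapest ones before the Viterbi search.

# Illustrative sketch: HTS-predicted spectral/prosodic targets used both as a
# pre-selection filter (shrinking the search space) and in a target cost.
import numpy as np

def target_cost(candidate, target, w_mcep=1.0, w_f0=0.5, w_dur=0.2):
    """Weighted distance between a candidate unit and the predicted target.
    Each argument is a dict with 'mcep' (vector), 'f0' (log-Hz) and 'dur' (ms)."""
    return (w_mcep * np.linalg.norm(candidate["mcep"] - target["mcep"])
            + w_f0 * abs(candidate["f0"] - target["f0"])
            + w_dur * abs(candidate["dur"] - target["dur"]))

def preselect(candidates, target, keep=10):
    """Keep the 'keep' cheapest candidates, reducing the unit-selection search space."""
    return sorted(candidates, key=lambda c: target_cost(c, target))[:keep]

# Toy usage with random candidate units for one target segment.
rng = np.random.default_rng(1)
units = [{"mcep": rng.normal(size=24), "f0": 5.0 + rng.normal(0, 0.2),
          "dur": 80 + rng.normal(0, 15)} for _ in range(200)]
hts_target = {"mcep": np.zeros(24), "f0": 5.0, "dur": 85.0}
shortlist = preselect(units, hts_target, keep=10)
print(len(shortlist), target_cost(shortlist[0], hts_target))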
| ||
6-29 | (2012-12-16) Master project 3 IRISA Rennes France

Computer Science Internship, CORDIAL group

Title: Grapheme-to-phoneme conversion adaptation using conditional random fields

Description: Grapheme-to-phoneme conversion consists in generating possible pronunciations for an isolated word or for a sequence of words. More formally, this conversion is a transliteration of a sequence of graphemes, i.e., letters, into a sequence of phonemes, the symbolic units representing the elementary sounds of a language. Grapheme-to-phoneme converters are used in speech processing either to help automatic speech recognition systems decode words from a speech signal, or as a means to tell speech synthesizers how a written input should be acoustically produced. A problem with such tools is that they are trained on large and varied amounts of aligned sequences of graphemes and phonemes, leading to generic ways of pronouncing words in a given language. As a consequence, they are not adequate as soon as one wants to recognize or synthesize specific voices, for instance accented speech, stressed speech, or dictating voices versus chatting voices [1].

While multiple methods have been proposed for grapheme-to-phoneme conversion [2, 3], the primary goal of this internship is to propose a method for adapting grapheme-to-phoneme models to conditions specified by the user. More precisely, the use of conditional random fields (CRFs) will be studied to model generic French pronunciation and variants of it [4]. CRFs are state-of-the-art statistical tools widely used for labelling problems in natural language processing [5] (a minimal sketch follows the references below). A further important goal is to be able to automatically characterize the distinctive pronunciation features of a given specific voice as compared to a generic voice. This means highlighting and generalizing differences between sequences of phonemes derived from the same sequence of graphemes. The results of this internship would be integrated into the team's speech synthesis platform in order to easily and automatically simulate and imitate specific voices.

Technical skills: C/C++ and a scripting language (e.g., Perl or Python)

Keywords: Natural language processing, speech processing, machine learning, statistical learning

Contact: Gwénolé Lecorvé (gwenole.lecorve@irisa.fr)

References:
[1] B. Hutchinson and J. Droppo. Learning non-parametric models of pronunciation. In Proceedings of ICASSP, 2011.
[2] M. Bisani and H. Ney. Joint-sequence models for grapheme-to-phoneme conversion. In Speech Communication, 2008.
[3] S. Hahn, P. Lehnen, and H. Ney. Powerful extensions to CRFs for grapheme to phoneme conversion. In Proceedings of ICASSP, 2011.
[4] I. Illina, D. Fohr, and D. Jouvet. Multiple pronunciation generation using grapheme-to-phoneme conversion based on conditional random fields. In Proceedings of SPECOM, 2011.
[5] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML, 2001.
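As referenced above, here is a minimal sketch of grapheme-to-phoneme conversion cast as sequence labelling with a linear-chain CRF. It uses the sklearn-crfsuite package as a stand-in (the announcement does not prescribe a toolkit) and assumes the training lexicon is already aligned one grapheme to one phoneme label; the two-word toy lexicon and the '_' silent-letter label are purely illustrative.

# Sketch: grapheme-to-phoneme conversion as CRF sequence labelling.
import sklearn_crfsuite

def grapheme_features(word, i):
    """Features for the letter at position i: the letter, its neighbours, its position."""
    feats = {'g': word[i]}
    feats['pos'] = 'start' if i == 0 else ('end' if i == len(word) - 1 else 'mid')
    if i > 0:
        feats['g-1'] = word[i - 1]
    if i < len(word) - 1:
        feats['g+1'] = word[i + 1]
    return feats

# Toy aligned lexicon: one phoneme label per grapheme ('_' marks a silent letter).
lexicon = [('chat', ['S', '_', 'a', '_']),
           ('ami',  ['a', 'm', 'i'])]
X = [[grapheme_features(w, i) for i in range(len(w))] for w, _ in lexicon]
y = [phones for _, phones in lexicon]

crf = sklearn_crfsuite.CRF(algorithm='lbfgs', c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, y)
print(crf.predict([[grapheme_features('ami', i) for i in range(len('ami'))]]))

Adapting such a model to a specific voice could then amount to re-estimating or biasing the CRF parameters on a small amount of voice-specific aligned data, which is the kind of question the internship addresses.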
| ||
6-30 | (2013-01-14) Ph.D. Researcher in Speech Synthesis, Trinity College, Dublin, Ireland

Post Specification
Post Title: Ph.D. Researcher in Speech Synthesis
Post Status: 3 years
Department/Faculty: Centre for Language and Communication Studies (CLCS)
Location: Phonetics and Speech Laboratory
Salary: €16,000 per annum (plus fees paid)
Closing Date: 31st January 2013

Post Summary
A Ph.D. Researcher is required to work in the area of speech synthesis at the Phonetics and Speech Laboratory, School of Linguistic, Speech and Communication Sciences. The position will involve carrying out research on the topic of Hidden Markov Model (HMM)-based speech synthesis. Specifically, we are looking for a researcher to work on developing source-filter based acoustic modelling for HMM-based speech synthesis which is closely related to the human speech production process and which can facilitate modification of the voice source and vocal tract filter components at synthesis time (a minimal sketch of this source-filter idea is given at the end of this announcement).

Background to the Post
Much of the research carried out to date in the Phonetics and Speech Laboratory has been concerned with the role of the voice source in speech. This research involves the development of accurate voice source processing, both as a window on human speech production and for exploitation in voice-sensitive technology, particularly synthesis. The laboratory team is interdisciplinary and includes engineers, linguists, phoneticians and technologists. This post will in the main be funded by the ongoing Abair project, which has developed the first speech synthesisers for Irish (www.abair.ie), and the researcher will exploit the current Abair synthesis platform. In this project the aim is to deliver multi-dialect synthesis with multiple personages and voices that can be made appropriate to different contexts of use. The post will also be linked to the FastNet project, which aims at voice-sensitive speech technologies. A specific goal of our laboratory team is to leverage our expertise on the voice by improving the naturalness of parametric speech synthesis, as well as making more flexible synthesis platforms which allow modification of voice characteristics (e.g., for creating different personalities/characters, different forms of expression, etc.).

Standard duties of the Post
Initially the researcher will be required to attend some lectures as part of the Masters programme on Speech and Language Processing. This, together with a supervised reading programme, will provide a background in the area of voice production, analysis and synthesis.
* In the very early stages the researcher will be required to develop synthetic voices, using the Irish corpora, with the standard HMM-based synthesis platform (i.e. HTS). Note that working with the Irish corpora does not require a background in the Irish language, as there will be collaboration with experts in this field.
* The researcher will be required to familiarise themselves with existing speech synthesis platforms which provide explicit modelling of the voice source (e.g., Cabral et al. 2011, Raitio et al. 2011, Anumanchipalli et al. 2010).
* The researcher will then need to first implement similar versions of these systems and then work towards developing novel vocoding methods which would allow full parametric flexibility of both voice source and vocal tract filter components at synthesis time.

Person Specification
Qualifications
* Bachelor's degree in Electrical Engineering, Computer Science with specialisation in speech signal processing, or related areas.
Knowledge & Experience (Essential & Desirable)
* Strong digital signal processing skills (Essential)
* Good knowledge of HTS, including previous experience developing synthetic voices (Essential)
* Knowledge of speech production and perception (Desirable)
* Experience in speech recognition (Desirable)
Skills & Competencies
* Good knowledge of written and spoken English.
Benefits
* Opportunity to work with a world-class interdisciplinary speech research group.

To apply, please email a brief cover letter and CV, including the names and addresses of two academic referees, to: kanejo@tcd.ie and to cegobl@tcd.ie
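As mentioned in the Post Summary, here is a minimal Python sketch of the source-filter separation that motivates the post: a parametric excitation is passed through a separate all-pole vocal-tract filter, so either component can be modified independently at synthesis time. The impulse-train source and the hard-coded formant values are crude illustrative stand-ins for a proper glottal-flow model (e.g. LF) and for per-frame parameters predicted by an HMM/HTS system.

# Sketch: source-filter synthesis with independently controllable source and filter.
import numpy as np
from scipy.signal import lfilter

fs = 16000                        # sampling rate (Hz)
f0 = 120.0                        # source parameter: fundamental frequency (Hz)
dur = 0.5                         # duration (s)

# Crude voice source: an impulse train at the glottal pulse rate.
n = int(fs * dur)
excitation = np.zeros(n)
excitation[::int(fs / f0)] = 1.0

# Vocal-tract filter: all-pole filter built from illustrative formant resonances.
formants = [(700, 130), (1220, 70), (2600, 160)]      # (frequency Hz, bandwidth Hz)
a = np.array([1.0])
for f, bw in formants:
    r = np.exp(-np.pi * bw / fs)                      # pole radius from bandwidth
    theta = 2.0 * np.pi * f / fs                      # pole angle from frequency
    a = np.convolve(a, [1.0, -2.0 * r * np.cos(theta), r * r])

speech = lfilter([1.0], a, excitation)
# Changing f0 alters only the source; changing `formants` alters only the filter.

A source-filter vocoder of the kind sought here would replace the impulse train by a modelled glottal waveform and have the statistical system predict both source and filter parameters frame by frame.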
|