6-1 | (2017-11-05) 15 PhD positions from mid-2018
The Training Network on Automatic Processing of Pathological Speech (TAPAS) is an H2020 MSCA-ITN-ETN project that will provide 15 PhD students with broad and intensive training in pathological speech processing. The TAPAS consortium includes clinical practitioners, academic researchers and industrial partners, with expertise spanning speech engineering, linguistics and clinical science. The TAPAS work programme is organized around three main themes:
- Detection of pathological speech (4 theses)
- Assessment of pathological speech and therapy (6 theses)
- Communication technologies for assisted living and rehabilitation (5 theses)
The TAPAS consortium includes:
- Idiap Research Institute (CH)
- Friedrich-Alexander-Universitaet (DE)
- Interuniversitair Micro-Electronicacentrum IMEC VZW (BE)
- INESC ID - Instituto de Engenharia de Sistemas E Computadores Investigacao E Desenvolvimento em Lisboa (PT)
- Ludwig-Maximilians-Universitaet Muenchen (DE)
- Stichting Het Nederlands Kanker Instituut-Antoni Van Leeuwenhoek Ziekenhuis (NL)
- Philips Electronics Nederland B.V. (NL)
- Stichting Katholieke Universiteit (NL)
- Universität Augsburg (DE)
- Université Toulouse III-Paul Sabatier (FR)
- The University of Sheffield United Kingdom (UK)
- Universitair Ziekenhuis Antwerpen (BE)
For more information and to apply, please refer to the website http://www.tapas-etn-eu.org
Best regards,
Julie Mauclair MCF IRI Université Paris Descartes
|
6-2 | (2017-11-08) Visiting Assistant Professor in Computational Linguistics and Language Science, Rochester, NY, USA
Visiting Assistant Professor in Computational Linguistics and Language Science
URL:
http://apptrkr.com/1116774
Requisition Number: 3499BR
Detailed Job Description:
The Department of English invites applications for a Visiting Assistant Professor position, beginning in January 2018, with specialization in computational linguistics and/or innovative technical or scientific methods in language science at Rochester Institute of Technology (RIT), with a focus on one or more areas of application. Possible areas include:
· Deep learning for natural language understanding
· Speech and speech technology
· Multimodal and linguistic sensors
· Human-computer interaction
· Linguistic narrative analytics
The applicant should demonstrate a fit with our commitment to collaborate with colleagues across the university on initiatives in artificial intelligence and in digital humanities and social sciences. The position has the possibility of extension beyond Spring 2018.
The successful applicant will be a researcher and teacher with an agenda that emphasizes innovative technical methods in linguistics, for instance in natural language processing, linguistic/multimodal sensors, speech and speech technology, and/or other computational or technical approaches applied to language data. We are seeking a scholar who engages in disciplinary and interdisciplinary teamwork, student mentoring, and has a coherent plan for grant seeking activities. The right candidate will contribute to advancing our interdisciplinary language science curriculum in a college of liberal arts at a technical university. Contributions that build students' global education experiences are additionally valued.
The teaching assignment may be Introduction to Language Science, Language Technology, Introduction to NLP, Science and Analytics of Speech (acoustic and experimental phonetics), Spoken Language Processing (automatic speech recognition and text-to-speech synthesis), Seminar in Computational Linguistics, self-designed courses, or another course depending on background.
We are seeking an individual who has the ability and interest in contributing to a community committed to student-centeredness; professional development and scholarship; integrity and ethics; respect, diversity and pluralism; innovation and flexibility; and teamwork and collaboration.
Department Description:
THE UNIVERSITY AND ROCHESTER COMMUNITY: RIT is a national leader in professional and career-oriented education. Talented, ambitious, and creative students of all cultures and backgrounds from all 50 states and more than 100 countries have chosen to attend RIT. Founded in 1829, Rochester Institute of Technology is a privately endowed, coeducational university with nine colleges emphasizing career education and experiential learning. With approximately 15,000 undergraduates and 2,900 graduate students, RIT is one of the largest private universities in the nation. RIT offers a rich array of degree programs in engineering, science, business, and the arts, and is home to the National Technical Institute for the Deaf. RIT has been honored by The Chronicle of Higher Education as one of the "Great Colleges to Work For" for four years. RIT is a National Science Foundation ADVANCE Institutional Transformation site. RIT is responsive to the needs of dual-career couples by our membership in the Upstate NY HERC.
Rochester, situated between Lake Ontario and the Finger Lakes region, is the 51st largest metro area in the United States and the third largest city in New York State. The Greater Rochester region, which is home to nearly 1.1 million people, is rich in cultural and ethnic diversity, with a population comprised of approximately 18% African and Latin Americans and another 3% of international origin. It is also home to one of the largest deaf communities per capita in the U.S. Rochester ranks 4th for "Most Affordable City" by Forbes Magazine, and MSN selected Rochester as the "#1 Most Livable Bargain Market" (for real-estate). Kiplinger named Rochester one of the top five "Best City for Families."
Job Requirements:
· Ph.D. with training in Computational Linguistics, Linguistics, or an allied field for language science, in hand prior to appointment date.
· Advanced graduate coursework in computational linguistics, including natural language and/or spoken language processing or technical methods in linguistics.
· Publication record and coherent plan for research and grant seeking activities.
· Evidence of outstanding teaching.
· Ability to contribute in meaningful ways to the college's continuing commitment to cultural diversity, pluralism, and individual differences.
How to Apply:
Apply online at http://apptrkr.com/1116774. Please submit your online application, curriculum vitae, cover letter addressing the listed qualifications and upload the following attachments:
· A research statement
· A teaching statement
· Copy of transcripts of graduate coursework
· A sample publication
· The names, addresses, and phone numbers for three references
· Statement of diversity
Questions regarding this position can be directed to the search committee chair, Dr. Cecilia Ovesdotter Alm, at coagla@rit.edu.
Review of applications will begin on November 25, 2017 and will continue until an acceptable candidate is found.
|
6-3 | (2017-11-10) Principal Speech Recognition Engineer, Speechmatics, Cambridge, UK
Principal Speech Recognition Engineer
Location: Cambridge, UK
Contact: careers@speechmatics.com
Background
Speechmatics’ versatile automatic speech recognition technology, based on decades of research and experience in neural networks, is enabling world-leading companies to power a speech-enabled future. Having already transcribed millions of hours of audio and helped customers across a diverse range of use cases and applications, the team’s mission is to build the best speech technology for any application, anywhere, in any language and put speech back at the heart of communication.
In the office, we pride ourselves on a relaxed but productive environment enabling both commercial success and personal development - we often host lunch and learn sessions and attend regular academic and commercial conferences. When we're not working hard, we regularly host company outings and events where your plus-one is welcomed to enjoy great food, great drinks, and great company! We also reward ourselves occasionally with massages, and even get our bikes fixed onsite!
We think it’s important to give a little back too, so everyone is eligible for some time off for charity work plus we’ll match your contribution via the Give As You Earn scheme. See more about our great perks below!
We are expanding rapidly and are seeking talented people to join us as we continue to push the boundaries of speech recognition. This is an opportunity to join a high growth team and form a major part of its future direction.
The Opportunity
We are looking for a talented and experienced speech recognition engineer to help us build the best speech technology for anybody, anywhere, in any language. You will be a part of a team that is building language packs and developing our core ASR capabilities including improving our speed, accuracy and support for all languages. Your work will feed into the ‘Automatic Linguist’, our ground-breaking framework to support the building of ASR models and hence the delivery of every language pack published by the company. Alongside the wider team you will be responsible for keeping our system the most accurate and useful commercial speech recognition available.
Because you will be joining a rapidly expanding team, you will need to be a team player, who thrives in a fast paced environment, with a focus on rapidly moving research developments into products. We strongly encourage versatility and knowledge transfer within the team, so we can share efficiently what needs to be done to meet our commitments to the rest of the company.
Key Responsibilities
- Delivering the artefacts comprising high quality speech recognition software products
- Keeping us ahead of the rest of the world in terms of speech recognition capabilities
- Transferring knowledge to the wider team and company
Experience
Essential
- Proven track record as one of the best in the world at modern LVCSR
- Extensive practical experience of speech recognition, covering all aspects (acoustic, pronunciation and language modelling as well as decoders / search)
- Experience working with standard speech and ML toolkits, e.g., Kaldi, KenLM, TensorFlow, etc.
- Solid Python programming skills
- Experience using Unix / Linux systems
- Proven ability to effectively communicate highly technical subjects
Desirable
- Expertise in all aspects of modern speech recognition, including WFSTs, lattice processing, neural nets (RNN / DNN / LSTM etc.), etc.
- Knowledge of computational linguistics.
- Deep production-grade software development experience, especially with Python, C/C++ and / or Go.
- Experience working effectively with software engineering teams or as a Software Engineer.
- Experience of team leadership and line management
- Experience working in an Agile framework
Salary
We offer a competitive salary and bonus scheme, pension contribution matching and a generous EMI share option scheme. We also have several additional benefits including private medical insurance, holiday purchase, life assurance, childcare vouchers, cycle scheme, massages, bike doctor, fully stocked drinks fridge, and fresh fruit available daily to name just a few!
|
6-4 | (2017-11-10) Principal Language Modelling Engineer, Speechmatics, Cambridge, UK
Principal Language Modelling Engineer
Location: Cambridge, UK
Contact: careers@speechmatics.com
Background
Speechmatics’ versatile automatic speech recognition technology, based on decades of research and experience in neural networks, is enabling world-leading companies to power a speech-enabled future. Having already transcribed millions of hours of audio and helped customers across a diverse range of use cases and applications, the team’s mission is to build the best speech technology for any application, anywhere, in any language and put speech back at the heart of communication.
In the office, we pride ourselves on a relaxed but productive environment enabling both commercial success and personal development - we often host lunch and learn sessions and attend regular academic and commercial conferences. When we're not working hard, we regularly host company outings and events where your plus-one is welcomed to enjoy great food, great drinks, and great company! We also reward ourselves occasionally with massages, and even get our bikes fixed onsite!
We think it’s important to give a little back too, so everyone is eligible for some time off for charity work plus we’ll match your contribution via the Give As You Earn scheme. See more about our great perks below!
We are expanding rapidly and are seeking talented people to join us as we continue to push the boundaries of speech recognition. This is an opportunity to join a high growth team and form a major part of its future direction.
The Opportunity
We are looking for a talented and experienced Language Modelling expert to help us build the best speech technology for anybody, anywhere, in any language. You will be a part of a team that is working on our core ASR capabilities to improve our speed, accuracy and support for all languages. Your role will include making sure our Language Modelling capability remains at the head of the field. Your work will feed into the ‘Automatic Linguist’, our ground-breaking framework to support the building of ASR models, and hence the delivery of every language pack published by the company. Alongside the wider team you will be responsible for keeping our system the most accurate and useful commercial speech recognition available.
Because you will be joining a rapidly expanding team, you will need to be a team player who thrives in a fast paced environment, with a focus on rapidly moving research developments into products. We strongly encourage versatility and knowledge transfer within the team, so we can share efficiently what needs to be done to meet our commitments to the rest of the company.
Key Responsibilities
- Analysing advances in the field of ASR – especially language modelling – and reporting back on what is the latest and greatest
- Ensuring we can implement the best ASR technology in a production environment
- Leading the language modelling extension of our ML framework
- Being part of a team delivering all the artefacts required to make up the best speech recognition available to our customers
Experience
Essential
Desirable
- MSc, PhD or equivalent qualification in the academic aspects of speech recognition
- Extensive experience working with standard language modelling and ML toolkits, e.g. pocolm, KenLM, SRILM, TensorFlow, etc.
- Expertise in all aspects of modern speech recognition, including WFSTs, lattice processing, neural net (RNN / DNN / LSTM), acoustic and language models, Viterbi decoding
- Experience translating academic advances in ASR into production systems
- Comprehensive knowledge of machine learning and statistical modelling
- Expertise in Python and/or C++ software development
- Experience working effectively with software engineering teams or as a Software Engineer
- Experience of technical leadership of a team / teams
- Experience of team leadership and line management
- Experience of working in an Agile framework
Salary
We offer a competitive salary and bonus scheme, pension contribution matching and a generous EMI share option scheme. We also have several additional benefits including private medical insurance, holiday purchase, life assurance, childcare vouchers, cycle scheme, massages, bike doctor, fully stocked drinks fridge, and fresh fruit available daily to name just a few!
|
6-5 | (2017-11-10) Principal Acoustic Modelling Engineer, Speechmatics, Cambridge, UK
Principal Acoustic Modelling Engineer
Location: Cambridge, UK
Contact: careers@speechmatics.com
Background
Speechmatics’ versatile automatic speech recognition technology, based on decades of research and experience in neural networks, is enabling world-leading companies to power a speech-enabled future. Having already transcribed millions of hours of audio and helped customers across a diverse range of use cases and applications, the team’s mission is to build the best speech technology for any application, anywhere, in any language and put speech back at the heart of communication.
In the office, we pride ourselves on a relaxed but productive environment enabling both commercial success and personal development - we often host lunch and learn sessions and attend regular academic and commercial conferences. When we're not working hard, we regularly host company outings and events where your plus-one is welcomed to enjoy great food, great drinks, and great company! We also reward ourselves occasionally with massages, and even get our bikes fixed onsite!
We think it’s important to give a little back too, so everyone is eligible for some time off for charity work plus we’ll match your contribution via the Give As You Earn scheme. See more about our great perks below!
We are expanding rapidly and are seeking talented people to join us as we continue to push the boundaries of speech recognition. This is an opportunity to join a high growth team and form a major part of its future direction.
The Opportunity
We are looking for a talented and experienced Acoustic Modelling expert to help us build the best speech technology for anybody, anywhere, in any language. You will be a part of a team that is working on our core ASR capabilities to improve our speed, accuracy and support for all languages. Your role will include making sure our Acoustic Modelling capability remains at the head of the field. Your work will feed into the ‘Automatic Linguist’, our ground-breaking framework to support the building of ASR models, and hence the delivery of every language pack published by the company. Alongside the wider team you will be responsible for keeping our system the most accurate and useful commercial speech recognition available.
Because you will be joining a rapidly expanding team, you will need to be a team player who thrives in a fast paced environment, with a focus on rapidly moving research developments into products. We strongly encourage versatility and knowledge transfer within the team, so we can share efficiently what needs to be done to meet our commitments to the rest of the company.
Key Responsibilities
- Analysing advances in the field of ASR – especially acoustic modelling – and reporting back on what is the latest and greatest
- Ensuring we can implement the best ASR technology in a production environment
- Leading the acoustic modelling extension of our ML framework
- Being part of a team delivering all the artefacts required to make up the best speech recognition available to our customers
Experience
Essential
Desirable
- MSc, PhD or equivalent qualification in the academic aspects of speech recognition
- Extensive experience working with standard acoustic modelling and ML toolkits, e.g. Kaldi, TensorFlow, etc.
- Expertise in all aspects of modern speech recognition, including WFSTs, lattice processing, neural net (RNN / DNN / LSTM), acoustic and language models, Viterbi decoding
- Experience translating academic advances in ASR into production systems
- Comprehensive knowledge of machine learning and statistical modelling
- Expertise in Python and/or C++ software development
- Experience working effectively with software engineering teams or as a Software Engineer
- Experience of technical leadership of a team / teams
- Experience of team leadership and line management
- Experience of working in an Agile framework
Salary
We offer a competitive salary and bonus scheme, pension contribution matching and a generous EMI share option scheme. We also have several additional benefits including private medical insurance, holiday purchase, life assurance, childcare vouchers, cycle scheme, massages, bike doctor, fully stocked drinks fridge, and fresh fruit available daily to name just a few!
|
6-6 | (2017-11-10) Speech Recognition Intern, Speechmatics, Cambridge, UK
Speech Recognition Intern
Location: Cambridge, UK
Contact: careers@speechmatics.com
Background
Speechmatics’ versatile automatic speech recognition technology, based on decades of research and experience in neural networks, is enabling world-leading companies to power a speech-enabled future. Having already transcribed millions of hours of audio and helped customers across a diverse range of use cases and applications, the team’s mission is to build the best speech technology for any application, anywhere, in any language and put speech back at the heart of communication.
In the office, we pride ourselves on a relaxed but productive environment enabling both commercial success and personal development - we often host lunch and learn sessions and attend regular academic and commercial conferences. When we’re not working hard, we regularly host company outings and events where your plus-one is welcomed to enjoy great food, great drinks and great company! We also reward ourselves occasionally with massages, and even get our bikes fixed onsite!
We are expanding rapidly and are seeking talented people to join us as we continue to push the boundaries of speech recognition. This is an opportunity to join a high growth team and form a major part of its future direction.
The Opportunity
We are looking for a bright, enthusiastic and talented speech recognition intern to help us build the best speech technology for anybody, anywhere, in any language. You will be a part of a team that is building language packs and developing our core ASR capabilities including improving our speed, accuracy and support for all languages. Your work will feed into the ‘Automatic Linguist’, our ground-breaking framework to support the building of ASR models and hence the delivery of every language pack published by the company. Alongside the wider team you will be responsible for keeping our system the most accurate and useful commercial speech recognition available.
Because you will be joining a rapidly expanding team, you will need to be a team player who thrives in a fast paced environment, happy to pick up whatever needs to be done, with a focus on rapidly moving research developments into products. We strongly encourage versatility and knowledge transfer within the team, so we can share efficiently what needs to be done to meet our commitments to the rest of the company.
Key Responsibilities
Experience
Essential
- Interest in speech recognition
- Experience using Python
- Experience using Unix / Linux systems
- A team player willing to contribute to all sprint activities
Desirable
- Experience in speech recognition or related fields
- Experience working with standard speech and ML toolkits, e.g. Kaldi, KenLM, TensorFlow, etc.
- Knowledge of computational linguistics.
- Knowledge of modern software development practices
Salary
This will be a paid internship. We also have several additional benefits including massages, bike doctor, fully stocked drinks fridge, and fresh fruit available daily to name just a few!
|
6-7 | (2017-11-10) Speech Recognition Engineer, Speechmatics, Cambridge, UK
Speech Recognition Engineer
Location: Cambridge, UK
Contact: careers@speechmatics.com
Background
Speechmatics’ versatile automatic speech recognition technology, based on decades of research and experience in neural networks, is enabling world-leading companies to power a speech-enabled future. Having already transcribed millions of hours of audio and helped customers across a diverse range of use cases and applications, the team’s mission is to build the best speech technology for any application, anywhere, in any language and put speech back at the heart of communication.
In the office, we pride ourselves on a relaxed but productive environment enabling both commercial success and personal development - we often host lunch and learn sessions and attend regular academic and commercial conferences. When we're not working hard, we regularly host company outings and events where your plus-one is welcomed to enjoy great food, great drinks, and great company! We also reward ourselves occasionally with massages, and even get our bikes fixed onsite!
We think it’s important to give a little back too, so everyone is eligible for some time off for charity work plus we’ll match your contribution via the Give As You Earn scheme. See more about our great perks below!
We are expanding rapidly and are seeking talented people to join us as we continue to push the boundaries of speech recognition. This is an opportunity to join a high growth team and form a major part of its future direction.
The Opportunity
We are looking for a talented speech recognition engineer to help us build the best speech technology for anybody, anywhere, in any language. You will be a part of a team that is building language packs and developing our core ASR capabilities including improving our speed, accuracy and support for all languages. Your work will feed into the ‘Automatic Linguist’, our ground-breaking framework to support the building of ASR models and hence the delivery of every language pack published by the company. Alongside the wider team you will be responsible for keeping our system the most accurate and useful commercial speech recognition available.
Because you will be joining a rapidly expanding team, you will need to be a team player who thrives in a fast paced environment, happy to pick up whatever needs to be done, with a focus on rapidly moving research developments into products. We strongly encourage versatility and knowledge transfer within the team, so we can share efficiently what needs to be done to meet our commitments to the rest of the company.
Key Responsibilities
Experience
Essential
- Practical experience of speech recognition or a related field with crossover knowledge
- Interest in speech recognition
- Experience using Python
- Experience using Unix / Linux systems
- A team player willing to contribute to all sprint activities
Desirable
- Experience in modern speech recognition, such as WFSTs, lattice processing, neural nets (RNN / DNN / LSTM etc.), acoustic and language modelling, etc.
- Experience working with standard speech and ML toolkits, e.g. Kaldi, KenLM, TensorFlow, etc.
- Knowledge of computational linguistics.
- Production-grade software development experience, especially with Python, C/C++ and / or Go.
- Experience working effectively with software engineering teams or as a Software Engineer.
- Experience working in an Agile framework
Salary
We offer a competitive salary and bonus scheme, pension contribution matching and a generous EMI share option scheme. We also have several additional benefits including private medical insurance, holiday purchase, life assurance, childcare vouchers, cycle scheme, massages, bike doctor, fully stocked drinks fridge, and fresh fruit available daily to name just a few!
|
6-8 | (2017-11-10) Senior Speech Recognition Engineer, Speechmatics, Cambridge, UK
Senior Speech Recognition Engineer
Location: Cambridge, UK
Contact: careers@speechmatics.com
Background
Speechmatics’ versatile automatic speech recognition technology, based on decades of research and experience in neural networks, is enabling world-leading companies to power a speech-enabled future. Having already transcribed millions of hours of audio and helped customers across a diverse range of use cases and applications, the team’s mission is to build the best speech technology for any application, anywhere, in any language and put speech back at the heart of communication.
In the office, we pride ourselves on a relaxed but productive environment enabling both commercial success and personal development - we often host lunch and learn sessions and attend regular academic and commercial conferences. When we're not working hard, we regularly host company outings and events where your plus-one is welcomed to enjoy great food, great drinks, and great company! We also reward ourselves occasionally with massages, and even get our bikes fixed onsite!
We think it’s important to give a little back too, so everyone is eligible for some time off for charity work plus we’ll match your contribution via the Give As You Earn scheme. See more about our great perks below!
We are expanding rapidly and are seeking talented people to join us as we continue to push the boundaries of speech recognition. This is an opportunity to join a high growth team and form a major part of its future direction.
The Opportunity
We are looking for a talented and experienced speech recognition engineer to help us build the best speech technology for anybody, anywhere, in any language. You will be a part of a team that is building language packs and developing our core ASR capabilities including improving our speed, accuracy and support for all languages. Your work will feed into the ‘Automatic Linguist’, our ground-breaking framework to support the building of ASR models and hence the delivery of every language pack published by the company. Alongside the wider team you will be responsible for keeping our system the most accurate and useful commercial speech recognition available.
Because you will be joining a rapidly expanding team, you will need to be a team player who thrives in a fast paced environment, happy to pick up whatever needs to be done, with a focus on rapidly moving research developments into products. We strongly encourage versatility and knowledge transfer within the team, so we can share efficiently what needs to be done to meet our commitments to the rest of the company.
Key Responsibilities
- Delivering the artefacts comprising high quality speech recognition software products
- Keeping us ahead of the rest of the world in terms of speech recognition capabilities
- Transferring knowledge to the wider team and company
Experience
Essential
- Practical experience of speech recognition, covering all aspects (acoustic, pronunciation and language modelling as well as decoders / search)
- Experience working with standard speech and ML toolkits, e.g., Kaldi, KenLM, TensorFlow, etc.
- Solid Python programming skills
- Experience using Unix / Linux systems
- A team player willing to contribute to all sprint activities
- Ability to effectively communicate highly technical subjects
Desirable
- Expertise in all aspects of modern speech recognition, including WFSTs, lattice processing, neural nets (RNN / DNN / LSTM etc.), etc.
- Knowledge of computational linguistics.
- Deep production-grade software development experience, especially with Python, C/C++ and / or Go.
- Experience working effectively with software engineering teams or as a Software Engineer.
- Experience of team leadership and line management
- Experience working in an Agile framework
Salary
We offer a competitive salary and bonus scheme, pension contribution matching and a generous EMI share option scheme. We also have several additional benefits including private medical insurance, holiday purchase, life assurance, childcare vouchers, cycle scheme, massages, bike doctor, fully stocked drinks fridge, and fresh fruit available daily to name just a few!
|
6-9 | (2017-11-12) Language Resources Project Manager - Junior (m/f), ELDA, Paris, France
The European Language resources Distribution Agency (ELDA), a company specialized in Human Language Technologies within an international context, is currently seeking to fill an immediate vacancy for a Language Resources Project Manager - Junior position. This offers excellent opportunities for young, creative, and motivated candidates wishing to participate actively in the Language Engineering field.
Language Resources Project Manager - Junior (m/f)
Under the supervision of the Language Resources Manager, the Language Resources Project Manager - Junior will be in charge of the identification of Language Resources (LRs), the negotiation of rights in relation with their distribution, as well as the data preparation, documentation and curation.
The position includes, but is not limited to, the responsibility of the following tasks:
- Identification of LRs and Cataloguing
- Negotiation of distribution rights, including interaction with LR providers, drafting of distribution agreements, definition of prices of language resources to be integrated in the ELRA catalogue or for research projects
- LR Packaging within production projects
- Data preparation, documentation and curation
Profile:
- PhD in computational linguistics or similar fields
- Experience in managing NLP tools
- Experience in project management and participation in European projects, as well as practice in contract and partnership negotiation at an international level, would be a plus
- Good knowledge of script programming (Perl, Python or other languages)
- Good knowledge of Linux
- Dynamic and communicative, flexible to combine and work on different tasks
- Ability to work independently and as part of a team
- Proficiency in English, with strong writing and documentation skills. Communication skills required in a French-speaking working environment
- Citizenship of (or residency papers for) a European Union country
All positions are based in Paris. Applications will be considered until the position is filled.
Salary is commensurate with qualifications and experience. Applicants should email a cover letter addressing the points listed above together with a curriculum vitae to:
ELDA 9, rue des Cordelières 75013 Paris FRANCE Fax : 01 43 13 33 30 Mail job@elda.org
ELDA is acting as the distribution agency of the European Language Resources Association (ELRA). ELRA was established in February 1995, with the support of the European Commission, to promote the development and exploitation of Language Resources (LRs). Language Resources include all data necessary for language engineering, such as monolingual and multilingual lexica, text corpora, speech databases and terminology. The role of this non-profit membership Association is to promote the production of LRs, to collect and to validate them and, foremost, make them available to users. The association also gathers information on market needs and trends.
For further information about ELDA and ELRA, visit: http://www.elra.info
|
6-10 | (2017-11-15) PHD RESEARCH FELLOWSHIPS (ML/Dialogue/Language/Speech), University of Trento, Italy
Title: 2018 PHD RESEARCH FELLOWSHIPS (ML/Dialogue/Language/Speech)
Location: University of Trento, Italy
You may have enjoyed reading about bots, artificial intelligence, machine learning, digital assistants, and systems that support doctors, teachers and customers and help people. If so, consider taking a front-row seat and joining the research team that has been training intelligent machines and evaluating AI-based systems for more than two decades, collaborating with the best research labs in the world and deploying these systems in the real world.
Here is a sample of the projects ( http://sisl.disi.unitn.it/demo/ ) the Signals and Interactive Systems Lab (University of Trento, Italy) has been leading:
-Natural Language Understanding systems for massive amount of human language data: http://www.sensei-conversation.eu
-Amazon Alexa challenge on Conversational Systems: http://sisl.disi.unitn.it/university-of-trento-is-selected-by-amazon-for-the-alexa-challenge/
-Designing AI personal agents for healthcare domain: http://sisl.disi.unitn.it/pha/
We are looking for top candidates for our funded PhD research fellowships. Candidates should have a background in at least one of the following areas:
- Speech Processing
- Natural Language Understanding
- Conversational Systems
- Machine Learning
Candidates will be working on research domains such as Conversational Agents, Intelligent Systems, Speech/Text Document Mining and Summarization, Human Behavior Understanding, Crowd Computing and AI-based systems for tutoring.
For more info on research and projects, visit the lab website at http://sisl.disi.unitn.it/
The SIS Lab research is driven by an interdisciplinary approach to research, attracting researchers from disciplines such as Digital Signal Processing, Speech Processing, Computational Linguistics, Psychology, Neuroscience and Machine Learning.
The official language ( research and teaching ) of the department is English.
FELLOWSHIP
The gross amount of the fellowships (internship and PhD) is competitive, approximately 1.600 Euro/month. Students may qualify for reduced rates on campus lodging, transportation and the cafeteria.
For more information about cost of living, campus, graduate education programs, please visit the graduate school website at http://ict.unitn.it/
DEADLINES
Immediate openings with start date as early as March 2018. Open until filled.
REQUIREMENTS
A degree at Master level or above in Computer Science, Electrical Engineering, Computational Linguistics or a similar or related discipline is strictly required. Students with other backgrounds (Physics, Applied Math) may apply as well. Background in at least one of the posted research areas is required. All applicants should have very good programming and math skills and be used to teamwork.
HOW TO APPLY
Interested applicants should send their 1) CV, 2) statement of research interest and 3) three reference letters to:
Email: sisl-jobs@disi.unitn.it
For more info:
Signals and Interactive Systems Lab : http://sisl.disi.unitn.it/
PhD School : http://ict.unitn.it/
Department : http://disi.unitn.it/
Information Engineering and Computer Science Department (DISI)
DISI has a strong focus on cross-disciplinarity with professors from different faculties of the University (Physical Science, Electrical Engineering, Economics, Social Science, Cognitive Science, Computer Science) with international background. DISI aims at exploiting the complementary experiences present in the various research areas in order to develop innovative methods, technologies and applications.
University of Trento
The University of Trento is consistently ranked as a premier Italian university. See http://www.unitn.it/en/node/1636/mid/2573
University of Trento is an equal opportunity employer.
|
6-11 | (2017-11-20) Audio Signal Processing Maverick (Applied Research), AVA, Paris, France
Audio Signal Processing Maverick (Applied Research)
ASR research is, sadly, largely done in big companies. Sure, talent is around, but it's really only in a less crowded space that you can really shine and see the impact that you can make. A.K.A. an early-stage startup, when you’re still a group of friends with a crazy ambition to change the world.
Here’s what drives us nuts: Products where ASR is really the key component are 99% of the time made for: *answering quick/superficial requests from lazy users (Siri, Google Now, Cortana, Echo) *dealing with angry customers on the phone (all the IVRs) *dictating emails for busy people (Nuance)
What if you could truly change 400M lives instead? Turn a lifetime of frustration into a deep connection?
Ava aims at captioning the world to make it fully accessible, 24/7, to deaf & hard-of-hearing people. Mobile-first, the app is the fastest & most advanced captioning system in the world, beating what tech giants have done, by cleverly using speech and speaker identification technologies to make conversations between deaf & hard-of-hearing people and hearing people possible.
At Ava, the CEO is the only hearing person in a family of deaf people, and the CTO is deaf and non-speaking - both were Forbes 30 under 30 2017. We use our ASR-based product every day to communicate. Our motivations are aligned with the change we want to make in the world. We care about the millions of people out there who struggle every day just to have a social & professional life, and whom YOUR tech will help. If it wasn't for Ava, the next best solution would take 10X time (it's not a solution) or 100X cost (it's not accessible to all). We’re working with companies such as GE, Nike, Salesforce, but also universities, stores, and even churches to fulfill our mission to make the world truly accessible.
What we need to get to the next level? You - someone with prior research experience in audio signal processing. The core mission will be to enhance a speech recognition system used in real world cocktail party situations. The signal is acquired via an array of ad-hoc microphones, and is processed to optimize its quality for the transcription, using a set of techniques: source localization, Time Difference of Arrival, noise cancelling, source separation… all in real time.
Interested to learn more about it? Let’s chat.
Especially if:
● You just finished a PhD in audio signal processing.
● You’re ready to be a pioneer in the field, and do what is necessary to make things work in real world situations.
● You're of the persistent, yet open-minded and collaborative type: you reason by independent thinking first, but you know that together, we're stronger.
What you get
● Early-stage -> massive equity opportunity.
● An opportunity to apply cutting-edge technologies to solve real world problems, right now
● Competitive salary
The job will be based in our Paris office.
Interested? Let us know at alex@ava.me
|
6-12 | (2017-11-20) Speaker Identification Maverick (Applied Research) , Ava, Paris, France
Speaker Identification Maverick (Applied Research)
ASR research is, sadly, largely done in big companies. Sure, talent is around, but it's really only in a less crowded space that you can really shine and see the impact that you can make. A.K.A. an early-stage startup, when you’re still a group of friends with a crazy ambition to change the world.
Here’s what drives us nuts: Products where ASR is really the key component are 99% of the time made for: *answering quick/superficial requests from lazy users (Siri, Google Now, Cortana, Echo) *dealing with angry customers on the phone (all the IVRs) *dictating emails for busy people (Nuance)
What if you could truly change 400M lives instead? Turn a lifetime of frustration into a deep connection?
Ava aims at captioning the world to make it fully accessible, 24/7, to deaf & hard-of-hearing people. Mobile-first, the app is the fastest & most advanced captioning system in the world, beating what tech giants have done, by cleverly using speech and speaker identification technologies to make conversations between deaf & hard-of-hearing people and hearing people possible.
At Ava, the CEO is the only hearing person in a family of deaf people, and the CTO is deaf and non-speaking - both were Forbes 30 under 30 2017. We use our ASR-based product every day to communicate. Our motivations are aligned with the change we want to make in the world. We care about the millions of people out there who struggle every day just to have a social & professional life, and whom YOUR tech will help. If it wasn't for Ava, the next best solution would take 10X time (it's not a solution) or 100X cost (it's not accessible to all). We’re working with companies such as GE, Nike, Salesforce, but also universities, stores, and even churches to fulfill our mission to make the world truly accessible.
What we need to get to the next level? You - someone with prior research exposure to speaker identification (deep learning interest/experience is a big plus).
The core of your mission will be to reinvent what voice recognition can do to understand real world conversations: crack the cocktail-party problem. Interested to learn more about it? Let’s chat. Especially if:
● You just finished a PhD in Machine Learning.
● Experience in Speaker Identification, ASR, NLP, acoustic modeling, language models or source separation is a plus.
● Your ambition is to be a pioneer in the field, and to do what is necessary to make things work in real world situations.
● You're of the persistent, yet open-minded and collaborative type: you reason by independent thinking first, but you know that together, we're stronger.
What we offer
● Early-stage -> massive equity opportunity.
● An opportunity to apply cutting-edge technologies to solve real world problems, right now
● Competitive salary
The job will be based in our Paris office.
Interested? Let us know at alex@ava.me.
|
6-13 | (2017-11-21) PhD position in Opinion Analysis in human-agent interactions, Telecom ParisTech, Paris France
PhD position in Opinion Analysis in human-agent interactions
Telecom ParisTech [1]
46 rue Barrault, 75013 Paris - France
Starting date: from Now to Early Autumn 2018
Possibility to start with an internship during first semester 2018.
Duration of the PhD funding: 36 months
*Position description*
The PhD student will take part in the ANR JCJC MAOI (Multimodal Analysis of Opinions in Interactions) at Telecom-ParisTech. He/She will tackle the following challenging issue: the integration of opinion mining methods in human-agent interactions (i.e. companion robots or virtual vocal assistants such as Siri, Google Now, Cortana, etc.)
The role of the PhD will consist in developing machine learning methods for the multimodal (i.e. speech and text) analysis of the user's opinion during his/her interaction with an agent. The main challenge will be to integrate the interaction context in machine-learning opinion detection methods.
The work will include:
- the development of machine learning/deep learning approaches (Conditional Random Fields, Long-Short-Term-Memory networks)
- the integration of complex and interactional linguistic features in machine-learning models for the detection of opinions in interactions
- the integration of acoustic features in multimodal models
- the evaluation of the system in interaction context.
The PhD will join the Social Computing topic [2] in the S2a group [3] at Telecom-ParisTech.
Selected references for this position from [4] :
Barriere, V., Clavel, C., and Essid, E. (2017). Opinion dynamics modeling for movie review transcripts classification with hidden conditional random fields. Interspeech.
Clavel, C.; Callejas, Z., Sentiment analysis: from opinion mining to human-agent interaction, Affective Computing, IEEE Transactions on, 7.1 (2016): 74-93
Langlet, C. and Clavel, C. (2015). Improving social relationships in face-to-face human-agent interactions: when the agent wants to know user's likes and dislikes. In ACL, Beijing, China.
Langlet, C. and Clavel, C. (2016). Grounding the detection of the user's likes and dislikes on the topic structure of human-agent interactions. Knowledge-Based Systems.
* Candidate profile*
As a minimum requirement, the successful candidate will have:
- A master's degree or equivalent in one or more of the following areas: machine learning, natural language processing, affective computing
- Excellent programming skills (preferably in Python)
- Good command of English
The ideal candidate will also (optionally) have:
- Knowledge in natural language processing
- Knowledge in probabilistic graphical models and deep learning
-- More about the position
- Place of work: Paris, France
- For more information about Telecom ParisTech see [1]
-- How to apply
Applications are to be sent to Chloé Clavel [4]
The application should be formatted as a single pdf file and should include:
- A complete and detailed curriculum vitae
- A letter of motivation
- The transcript of grades
- The names and addresses of two referees
[1] https://www.telecom-paristech.fr/eng/
[2] https://www.tsi.telecom-paristech.fr/recherche/themes-de-recherche/analyse-automatique-des-donnees-sociales-social-computing/
[3] http://www.tsi.telecom-paristech.fr/ssa/#
[4] https://clavel.wp.imt.fr/publications/
|
6-14 | (2017-11-20) ASSISTANT PROFESSOR IN HUMAN-CENTERED COMPUTING, Virginia Tech, USA
ASSISTANT PROFESSOR IN HUMAN-CENTERED COMPUTING
The Department of Computer Science at Virginia Tech (www.cs.vt.edu <http://www.cs.vt.edu/>) seeks applicants for a tenure-track assistant professor position in human-centered computing. Exceptional candidates at higher ranks may also be considered. Strong candidates from any area related to human-computer interaction, user experience, or interactive computing are encouraged to apply. We especially encourage applicants with interests in novel interactive experiences and technologies—including immersive environments (virtual reality and augmented reality), multi-sensory displays, multi-modal input, visualization, visual analytics, human-robot interaction, game design, and creative technologies.
The successful candidate will have the opportunity to engage in transdisciplinary research, curriculum, and outreach initiatives with other university faculty working in the Creativity & Innovation (C&I) Strategic Growth Area, one of several new university-wide initiatives at Virginia Tech (see provost.vt.edu/destination-areas <http://provost.vt.edu/destination-areas>). The C&I Strategic Growth Area is focused on empowering partners and stakeholders to collaborate on creativity, innovation, and entrepreneurship efforts that transcend disciplinary boundaries. Faculty working together in this area comprise a vibrant ecosystem that melds the exploration of innovative technologies and the design of creative experiences with best practices for developing impact-driven and meaningful outcomes and solutions. Candidates with demonstrated experience in interdisciplinary teaching or research that aligns with the C&I vision (provost.vt.edu/destination-areas/sga-overview/sga-creativity.html <http://provost.vt.edu/destination-areas/sga-overview/sga-creativity.html>) are especially encouraged to apply. The successful candidate will also have opportunities for collaboration in the interdisciplinary Center for Human-Computer Interaction (www.hci.vt.edu <http://www.hci.vt.edu/>) that includes nearly 40 faculty across campus; the Institute for Creativity, Arts, and Technology (icat.vt.edu <http://icat.vt.edu/>) housed in the new Moss Arts Center; and the Discovery Analytics Center (dac.cs.vt.edu <http://dac.cs.vt.edu/>).
Applications must be submitted online to jobs.vt.edu <https://listings.jobs.vt.edu/postings/80519> for posting #TR0170152. Applicant screening will begin on December 1, 2017 and continue until the position is filled. Inquiries should be directed to Dr. Doug Bowman, Search Committee Chair, dbowman@vt.edu.
-- Doug A. Bowman Frank J. Maher Professor, Computer Science Director, Center for Human-Computer Interaction Fellow, Institute for Creativity, Arts, and Technology Virginia Tech dbowman@vt.edu Personal: http://people.cs.vt.edu/~bowman/ Group: http://research.cs.vt.edu/3di/ Center: http://hci.vt.edu/ Twitter: @CHCI_VT
|
6-15 | (2017-11-20) Three Postdoctoral Researchers/Project Researchers (Speech processing and deep learning), University of Eastern Finland, Finland
Three Postdoctoral Researchers/Project Researchers (Speech processing and deep learning)
The University of Eastern Finland, UEF, is one of the largest multidisciplinary universities in Finland. We offer education in nearly one hundred major subjects, and are home to approximately 15,000 students and 2,500 members of staff. From 1 August 2018 onwards, we'll be operating on two campuses, in Joensuu and Kuopio. In international rankings, we are ranked among the leading universities in the world.
The Faculty of Science and Forestry operates on the Kuopio and Joensuu campuses of the University of Eastern Finland. The mission of the faculty is to carry out internationally recognised scientific research and to offer research-education in the fields of natural sciences and forest sciences. The faculty invests in all of the strategic research areas of the university. The faculty's environments for research and learning are international, modern and multidisciplinary. The faculty has approximately 3,800 Bachelor's and Master's degree students and some 490 postgraduate students. The number of staff amounts to 560. http://www.uef.fi/en/lumet/etusivu
We are now inviting applications for three Postdoctoral Researcher/Project Researcher positions in speech processing and deep learning funded by Academy of Finland, School of Computing, Joensuu Campus.
o Two positions in automatic speaker recognition, voice conversion, anti-spoofing (NOTCH project)
o One position in deep reinforcement learning for physical agents (DEEPEN project)
The two projects share similarities in terms of machine learning methods being used and developed further, but are otherwise differently focused.
The NOTCH research project (NOn-cooperaTive speaker CHaracterization), being led by Associate Professor Tomi Kinnunen, aims at advancing state-of-the-art in automatic speaker verification (defense) and voice conversion (attack) under a generic umbrella of non-cooperative speech, whether being induced by spoofing attacks, disguise, or other intentional voice modifications. A successful applicant needs to have background in speaker verification, anti spoofing, voice conversion, machine learning or closely related topics.
The DEEPEN research project (Deep Reinforcement Learning for Physical Agents) is run in cooperation between UEF and the robotics group at Aalto University. UEF's part, led by Senior Researcher Ville Hautamäki, aims at designing new statistical models for simulated robot control and taking steps towards solving the so-called "reality gap" problem. The post-doc may also contribute to speech and deep learning topics. A successful applicant needs to have a background in deep learning, reinforcement learning, speech technology or machine vision. Practical experience in DRL research environments (e.g. VizDoom or MuJoCo) will be counted as a plus.
The Machine Learning group of the School of Computing, at the facilities of Joensuu Science Park, provides access to modern research infrastructure and is a strongly international working environment. We hosted the Odyssey 2014 conference, were a partner in the H2020-funded OCTAVE project, and are a co-founder of the Automatic Speaker Verification and Countermeasures (ASVspoof) challenge series (http://www.asvspoof.org/).
A person to be appointed as a postdoctoral researcher shall hold a suitable doctoral degree that has been awarded less than five years ago. If the doctoral degree has been awarded more than five years ago, the post will be one of a project researcher. The doctoral degree should be in spoken language technology, electrical engineering, computer science, machine learning or a closely related field. Researchers finishing their PhD in the near future are also encouraged to apply for the positions. However, they are expected to hold a PhD degree by the starting date of the position. We expect strong hands-on experience and creative out-of-the-box problem solving attitude. A successful applicant needs to have an internationally proven track record in topics relevant to the project he or she applies to.
English may be used as the language of instruction and supervision in these positions.
The positions will be filled from January 1, 2018 at the earliest, for a period of 12 months. The continuation of the position will be agreed separately. The position will be filled for a fixed term because it pertains to a specific project (postdoctoral researcher positions shall always be filled for a fixed term, UEF University Regulations 31 §).
The salary of the position is determined in accordance with the salary system of Finnish universities and is based on level 5 of the job requirement level chart for teaching and research staff (€2.865,30/month). In addition to the job requirement component, the salary includes a personal performance component, which may be a maximum of 46.3% of the job requirement component.
A probationary period is applied to all new members of the staff.
You can use the same electronic form to apply for both research projects. The electronic application should contain the following appendices:
- a résumé or CV
- a list of publications
- copies of the applicant's academic degree certificates/diplomas, and copies of certificates/diplomas relating to the applicant's language proficiency, if not indicated in the academic degree certificates/diplomas
- motivation letter
- a cover letter indicating the position to be applied for
- The names and contact information of at least two referees are requested in the application form.
The application needs to be submitted no later than December 22, 2017 (by 24:00 EET) by using the electronic application form. Navigate to http://www.uef.fi/en/uef/en-open-positions and search for "Three Postdoctoral Researchers/Project Researchers (Speech processing and deep learning)" to find the link to the electronic application form.
|
6-16 | (2017-12-03) Machine Learning Engineer, Speech Recognition, Aja-la studios, Green Richmond UK
Machine Learning Engineer, Speech Recognition
Location: London, UK
Contact: hello@ajalastudios.com
Summary & Opportunity
AJA.LA Studios is a funded early-stage startup developing speech and natural language understanding technologies for under-resourced languages. We are looking to hire an engineer, to be based in London, to participate in developing acoustic and language models, and related algorithms, for our suite of proprietary speech recognition products for a broad library of under-resourced languages. This role provides a unique opportunity to pursue research and commercialization of speech recognition for under-resourced languages. Ideally, candidates should be comfortable working with large quantities of data, have an interest in and/or demonstrate experience working with under-resourced languages, and have an interest in working on the entire R&D/product-development cycle.
Skills & Requirements
The ideal candidate should possess a combination of the following skills and qualifications:
• Masters or PhD in an analytical discipline through which you have acquired a strong knowledge of topics including
o Theory and practice of speech recognition and/or speech processing (LVCSR)
o Signal Processing/Pattern Recognition
o Probability theory
o Bayesian inference
o Machine learning and related topics
• Strong software development skills
o Required: C/C++, Python, CUDA/Nsight IDE, shell scripting, Perl, GitHub/SVN
o Optional/Additional: Java/Android/Gradle/Android Studio, Objective C/Xcode/Cocos2dx
• Speech processing, Neural Network and Natural Language platforms and libraries
o Kaldi, KenLM, OpenFST, and HTS
o Theano, PDNN, PyTorch, TensorFlow
• Operating Systems: Unix/Linux/Mac OS
1 The Green, Richmond, TW9 1PL, UK
www.ajalastudios.com
Salary
We offer a competitive salary, pension contribution, private medical insurance, share options and flexible working hours, amongst other benefits.
|
6-17 | (2017-12-04) Machine Learning Engineer, Speech Synthesis, Aja-la studios, Green Richmond UK
Machine Learning Engineer, Speech Synthesis
Location: London, UK
Contact: hello@ajalastudios.com
Summary & Opportunity
AJA.LA Studios is a funded early-stage startup developing speech and natural language understanding technologies for under-resourced languages. We are looking to hire an engineer, to be based in London, to participate in developing unit-selection and parametric speech synthesis for a broad library of under-resourced languages. This role provides a unique opportunity to pursue research and commercialization of speech recognition for under-resourced languages. Ideally, candidates should be comfortable working with large quantities of data, have an interest in and/or demonstrate experience working with under-resourced languages, an interest in working on the entire R&D/product-development cycle, and possess the following skills and qualifications.
Skills & Requirements
• Masters or PhD in an analytical discipline through which you have acquired a strong knowledge of topics including
o Theory and practice of speech synthesis and/or speech processing, e.g. vocoding
o Signal Processing/Pattern Recognition
o Probability theory
o Bayesian inference
o Machine learning and related topics
• Strong software development skills
o Required: C/C++, Python, CUDA/Nsight IDE, shell scripting, Perl, Github/SVN
o Optional/Additional: Java/Android/Gradle/Android Studio, Objective C/Xcode/Cocos2dx
• Speech processing, Neural Network and Natural Language platforms and libraries
o Festival, HTK, and HTS
o Theano, PDNN, PyTorch, TensorFlow
• Operating Systems: Unix/Linux/Mac OS
1 The Green, Richmond, TW9 1PL, UK
www.ajalastudios.com
Salary
We offer a competitive salary, pension contribution, private medical insurance, share options and flexible working hours, amongst other benefits.
|
6-18 | (2017-12-03) Research Assistant/Associate in Speech Processing, at Cambridge University Engineering Department, Cambridge, UK.
Research Assistant/Associate in Speech Processing, at Cambridge University Engineering Department, Cambridge, UK.
|
6-19 | (2017-12-05) One-year post-doctoral position in speech production, GIPSA, Grenoble, France
One-year post-doctoral position in speech production, in the framework of the StopNCo ANR project (http://www.agence-nationale-recherche.fr/Project-ANR-14-CE30-0017), starting from March 2018 (at the latest in October 2018). More details at: https://www.gipsa-lab.grenoble-inp.fr/~maeva.garnier/mes_documents/PostDocPosition-StopNCo.pdf I would be thankful if you could circulate this job offer in your research institution and forward it to anyone who may be interested.
Maëva Garnier
|
6-20 | (2017-12-06) PhD Position in Social Signal Processing for Multi-Sensor Conversation Quality Modeling, Delft University, The Netherlands
Job Link: https://tinyurl.com/MINGLEPhD
PhD Position in Social Signal Processing for Multi-Sensor Conversation Quality Modeling
Location: Delft University of Technology, The Netherlands
Deadline: January 12 2018 (see below for application procedure)
Project Description:
An important but under-explored problem in computer science is the automated analysis of conversational dynamics in large unstructured social gatherings such as networking or mingling events. Research has shown that attending such events contributes greatly to career and personal success. While much progress has been made in the analysis of small pre-arranged conversations, scaling up robustly presents a number of fundamentally different challenges.
Unlike analysing small pre-arranged conversations, during mingling, sensor data is seriously contaminated. Moreover, determining who is talking with whom is difficult because groups can split and merge at will. A fundamentally different approach is needed to handle both the complexity of the social situation as well as the uncertainty of the sensor data when analysing such scenes.
The successful applicants will develop automated techniques to analyse multi-sensor data (video, acceleration, audio, etc) of human social behavior. They will work as part of a team on the NWO Funded Vidi project MINGLE (Modelling Group Dynamics in Complex Conversational Scenes from Non-Verbal Behaviour). They will have the opportunity to interact with researchers from both computer science and social science both locally and internationally.
The main aim of the project is to address the following question: How can multi-sensor processing and machine learning methods be developed to model the dynamics of conversational interaction in large social gatherings using only non-verbal behaviour? The two projects advertised focus on developing novel computational methods to measure conversation quality (e.g. involvement, rapport) from multi-sensor streams in crowded environments.
Job requirements:
We are looking for students who have recently completed, or expect very soon to complete, an MSc or equivalent degree in computer science, electrical/electronic engineering, applied mathematics, applied physics, or a related discipline. Experience in the following or related fields is preferred: signal/audio/speech processing, computer vision, machine learning, and pattern recognition. Some experience with embedded systems is a bonus, though not necessary.
The successful applicant will have:- good programming skills;- curiosity and analytical skills;- the ability to work in a multi-disciplinary team;- motivation to meet deadlines;- an affinity with the relevant social science research;- good oral and written communication skills;-proficiency in English;- an interest in communicating their research results to a wider audience;
Institution:
The Department of Intelligent Systems is part of the Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS) at Delft University of Technology. The faculty offers an internationally competitive interdisciplinary setting for its 500 employees, 350 PhD students and 1700 undergraduates. Together they work on a broad range of technical innovations in the fields of sustainable energy, quantum engineering, microelectronics, intelligent systems, software technology, and applied mathematics.
The Pattern Recognition and BioInformatics Group is one of five groups in the department, consisting of 7 faculty and over 20 postdocs and PhD students. Within this group, research is carried out in three core subjects: pattern recognition, computer vision, and bioinformatics. One of the main focuses of the group is on developing tools and theories, and gaining knowledge and understanding applicable to a broad range of general problems but typically involving sensory data, e.g. time signals, images, video streams, or other physical measurement data.
For information about the TU Delft Graduate School, please visit www.phd.tudelft.nl.
Application Procedure:
Interested applicants should send an up-to-date curriculum vitae, degree transcripts, letter of application, and the names and the contact information (telephone number and email address) of two references to Hr-eemcs@tudelft.nl with the subject heading '[MINGLE PhD]'.
The letter of application should summarise (i) why the applicant wants to do a PhD, (ii) why the project is of interest to the applicant, (iii) evidence of suitability for the job, and (iv) what the applicant hopes to gain from the position.
The application procedure is ongoing until the position is filled, so interested candidates are encouraged to apply as soon as possible and before January 12 2018. Note that candidates who apply after this deadline may still be considered but applications before the deadline will be given priority.
|
6-21 | (2017-12-08) PhD grant at IRISA, Rennes France
The Expression team at IRISA is recruiting a PhD student in computer science on the topic 'characterisation of language registers by sequential pattern extraction', as part of the ANR project TREMoLo.
Details of the offer:
https://www-expression.irisa.fr/files/2017/12/these_TREMoLo_2017.pdf
Application file (* : required items):
- detailed CV*
- cover letter*
- academic transcripts (with ranking if possible)*
- contacts for recommendations*
- research internship report(s).
Send to: del.battistelli@gmail.com, nicolas.bechet@irisa.fr, gwenole.lecorve@irisa.fr.
Best regards,
Gwénolé Lecorvé.
|
6-22 | (2017-12-15) Internship 1 at LIA, Avignon, France
Adaptation of deep neural networks for speech transcription systems
Keywords: speech transcription system, language model, unsupervised adaptation
Description:
Automatic Speech Recognition (ASR) consists of transcribing into text the words spoken in an audio or video recording. The most robust ASR systems often rely on a multi-pass architecture (Gauvain and Lee 1994) (Gales 1998), in which each pass aims to produce a better transcription of the audio signal than the previous one. In some cases, the outputs of the previous pass are thus used to adapt the models of the current pass. The idea behind this adaptation is to obtain models specialised to the recording, and therefore to be more robust to the "variability" of audio recordings (different acoustic conditions, unknown speakers, spontaneous speech, environmental noise, etc.).
The general objective of the internship is to advance the state of the art in automatic speech transcription. More specifically, the internship will explore the unsupervised adaptation of deep neural networks. One of the main challenges is to use neural networks as language models and to be able to adapt them to a first transcription produced by decoding. This topic may lead to a PhD thesis.
Candidate profile:
Master 2 student in computer science. The candidate must have good programming skills (C/C++ and/or Python). Some knowledge of natural language processing, speech processing or machine learning would be a plus.
Location: LIA, 339, chemin des Meinajariès, 84911 Avignon
Duration and pay: 6 months, approximately 580€ per month.
Contact:
- Mickaël Rouvier, Associate Professor, mickael.rouvier@univ-avignon.fr
- Richard Dufour, Associate Professor, richard.dufour@univ-avignon.fr
Bibliography:
Gales, Mark J. F. "Maximum likelihood linear transformations for HMM-based speech recognition." Computer Speech and Language (CSL), 1998.
Gauvain, Jean-Luc, and Chin-Hui Lee. "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains." IEEE Transactions on Speech and Audio Processing (TASP), 1994.
|
6-23 | (2017-12-15) Internship 2 at LIA Avignon, France
Automatic video summarisation by contextualising video from text
Keywords: extractive automatic summarisation
Description:
Automatic summarisation is a way of producing summaries that extract the essential content and present it as concisely as possible. In this internship we focus on extractive video summarisation methods based on text analysis [Li11, Trione14, Favre15]. One classic approach to the automatic generation of video summaries goes through an intermediate textual representation: the audio content of the video (and sometimes overlaid text) is extracted, transcribed and then summarised. This text summary is then used to assemble a video summary.
The general objective of the internship is to explore methods for contextualising videos or images from the text transcription. This contextualisation should help compose the final video summary. This topic may lead to a PhD thesis.
Candidate profile:
Master 2 student in computer science. The candidate must have good programming skills (C/C++ and/or Python). Some knowledge of natural language processing or machine learning would be a plus.
Location: LIA, 339, chemin des Meinajariès, 84911 Avignon
Duration and pay: 6 months, approximately 580€ per month.
Contact:
- Mickaël Rouvier, Associate Professor, mickael.rouvier@univ-avignon.fr
- Richard Dufour, Associate Professor, richard.dufour@univ-avignon.fr
Bibliography:
[Li11] Li, Y., Merialdo, B., Rouvier, M., & Linares, G. (2011). Static and dynamic video summaries. In Proceedings of the 19th ACM international conference on Multimedia (pp. 1573-1576). ACM.
[Trione14] Trione, J. (2014). Extraction methods for automatic summarization of spoken conversations from call centers (Méthodes par extraction pour le résumé automatique de conversations parlées provenant de centres d’appels) [in French]. In Proceedings of TALN 2014 (Volume 4: RECITAL-Student Research Workshop) (Vol. 4, pp. 104-111).
[Favre15] Favre, B., Stepanov, E. A., Trione, J., Béchet, F., & Riccardi, G. (2015). Call Centre Conversation Summarization: A Pilot Task at Multiling 2015. In SIGDIAL Conference (pp. 232-236).
|
6-24 | (2017-12-13) Internship and PhD position at Telecom-ParisTech and LTCI lab, Paris, France
Internship and PhD position in machine learning for multimodal engagement analysis
in human-robot interactions (HRI)
Telecom ParisTech [1], LTCI lab [2]
Duration: 6-month internship to be continued as 3-year PhD contract Start: Any date from February 1st, 2018
Salary: according to background and experience
*Position description*
The internship/PhD project will take part in a collaboration between Softbank Robotics and Télécom ParisTech on the topic of engagement analysis in interactions of humans with Softbank's robots.
The role of the intern/PhD student will consist in developing robust machine learning systems able to effectively take advantage of the multimodal signals acquired by the robot's sensors during its interaction with a human. The work will include:
- the design of appropriate elicitation protocols and multimodal data acquisition procedures ;
- the development of multimodal feature learning and dynamic classification procedures capable of handling noisy observations with missing values, especially exploiting deep learning techniques ;
- the evaluation of the system in realistic scenarios involving end-users.
The PhD project will be hosted at the Images, Data and Signals department of Telecom ParisTech [3], jointly by the social computing [4] and the audio data analysis and signal processing [5] teams.
*Candidate profile*
As a minimum requirement, the successful candidate will have:
- A Master's degree (possibly to be granted in 2018) in one of the following areas: computer science, artificial intelligence, machine learning, signal processing, affective computing, applied mathematics
- Excellent programming skills (preferably in Python)
- Good command of English
The ideal candidate will also (optionally) have:
- Knowledge of deep learning techniques
-- More about the position
- Place of work: Paris, France
- For more information about Télécom ParisTech see [1]
-- How to apply
Applications are to be sent to Chloé Clavel [6], Giovanna Varni [7] and Slim Essid [8] by email (using <firstname.lastname>@telecom-paristech.fr)
The application should be formatted as a single pdf file and should include:
- A complete and detailed curriculum vitae
- A letter of motivation
- Academic records of the last two years
- The names and addresses of two referees
[1] http://www.tsi.telecom-paristech.fr
[2] https://www.ltci.telecom-paristech.fr/?lang=en
[3] http://www.tsi.telecom-paristech.fr/en/
[4] https://www.tsi.telecom-paristech.fr/recherche/themes-de-recherche/analyse-automatique-des-donnees-sociales-social-computing/
[5] http://www.tsi.telecom-paristech.fr/aao/en/
[6] https://clavel.wp.mines-telecom.fr/
[7] http://sites.google.com/site/gvarnisite/
[8] http://www.telecom-paristech.fr/~essid
|
6-25 | (2017-12-16) Position at INA, Bry/Marne, France
The Institut national de l’audiovisuel (INA), a public audiovisual and digital company, collects, preserves and passes on France's audiovisual heritage. As part of a usage-oriented approach to innovation, INA promotes its content and shares it with the widest possible audience: on ina.fr for the general public, on inamediapro.com for professionals, and at the InaTHÈQUE for researchers. The institute thus develops offers and services to get closer to its users and customers, in France and internationally.
Its Research and Innovation department supports a strong and ambitious culture of innovation. Our Ina-Signature technology (a "fingerprint" technology), which came out of INA's R&D, has established itself with renowned customers thanks to a strategy focused on performance and quality. Our offering continues to evolve with the spread of SaaS (software as a service) and the cloud.
In this role, reporting to the Head of the Research unit, you will be responsible for the design, implementation, integration or adaptation of machine learning, data analysis and data fusion technologies within research projects that experiment with new ways of promoting content.
As such, you will be in charge of:
1 – Carrying out scientific and technological research
- Defining the research and development directions related to this topic;
- Designing, implementing, testing and evaluating innovative technological tools for the Institute's existing or anticipated uses;
- Collaborating with all internal and external stakeholders of the department;
- Contributing to the unit's research and development strategy;
- Supervising interns and, in time, PhD students;
- Writing or contributing to scientific papers and presenting them at conferences;
- Demonstrating research work at conferences, seminars or trade shows;
- Contributing to activity-related documents (activity reports and project deliverables in particular).
2 – Providing R&D in service of the Institute
- Proposing, preparing, coordinating and taking part in internal research and development projects in conjunction with the operational departments;
- Proposing, leading and taking part in internal consultation and discussion initiatives and working groups.
3 – Building partnerships
- Proposing, preparing, coordinating and taking part in collaborative research and development projects, national or international, with academic, institutional or industrial partners;
- Proposing, coordinating and taking part in scientific and technological cooperation bodies (COMUE, competitiveness clusters, research groups).
4 – Contributing to functional management
- Taking part in the coordination of the unit (coordination meetings);
- Taking part in managing the unit's IT and technical resources;
- Taking part in the life of the unit (unit meetings, activity monitoring, reports).
Profile:
You will hold a doctorate in machine learning and/or data analysis, or have professional experience accepted as equivalent.
Complemented by the following skills:
- Command of and experience in the following field(s): machine learning (deep learning), data analysis and fusion, image and/or audio analysis, software development;
- Sound experience of academic and/or industrial research;
- Experience with scientific publications;
- Good knowledge and experience of collaborative projects;
- Knowledge of the French audiovisual landscape;
- Knowledge of the academic world;
- Proficiency with office software;
- Interest in the audiovisual and media world;
- Interest in the humanities and social sciences and in the digital humanities.
Analytical and synthesis skills, creativity and imagination, initiative, interpersonal skills and team spirit will be your best assets for succeeding in the position.
Job details:
- Contract: permanent (CDI)
- Status: Cadre (managerial status)
- Position to be filled: as soon as possible
- Salary: according to experience
- Application deadline: 31 January 2018
- Contact: jcarrive@ina.fr
- Location: Bry-sur-Marne (94)
Jean Carrive
Deputy Head of Department
Research and Digital Innovation
Delegated Directorate for Distribution and Innovation
Direct line: +33 1 49 83 34 29 - jcarrive@ina.fr
institut.ina.fr
|
6-26 | (2017-12-16) Post-doc position at Uniklinik RWTH Aachen (Germany)
The Uniklinik RWTH Aachen (Germany) is looking for a postdoctoral researcher in the field of articulatory modelling of the vocal tract for the analysis of dysarthria from MRI images.
The position is for 14 months on the German pay scale TV-L 13 (typically around 2300 € net after all tax deductions) and is expected to start around March 2018.
All details can be found here: http://antoine.serrurier.free.fr/index_documents/2017_12_PostDoc_DysArtMod_EN.pdf
|
6-27 | (2017-12-13) PhD position in Conversational systems and Social robotics, KTH, Stockholm, Sweden
PhD position in Conversational systems and Social robotics, KTH, Sweden
KTH Royal Institute of Technology in Stockholm has grown to become one of Europe's leading technical and engineering universities, as well as a key centre of intellectual talent and innovation. We are Sweden's largest technical research and learning institution and home to students, researchers and faculty from around the world.
We are looking for a doctoral student that will work on situated spoken interaction between humans and robots, under the supervision of Assoc. Prof. Gabriel Skantze, at the Department of Speech Music and Hearing. A central research question will be how social robots should adapt their conversational behavior to the users' level of attention, understanding and engagement. This means that the robot must be able to monitor gaze and feedback behaviour from the user, and then for example adjust the pace of information delivery, in real time. The work will involve implementation of components for conversational systems, collecting data and doing experiments with users interacting with the system, and using this data to build models of the users' behaviours.
Applicants should have a Master degree (or similar) in a subject relevant for the research, such as computer science, language technology, or cognitive science. Applicants are expected to have good skills in programming, and knowledge in either experimental methods and statistics, or machine learning. Applicants must be strongly motivated for doctoral studies, possess the ability to work independently and perform critical analysis, and possess good levels of cooperative and communicative abilities. Good command of English, in writing and speaking, is a prerequisite for presenting research results in international periodicals and at conferences. We also expect applicants to have a deep interest in spoken language interaction between humans and between humans and machines.
The position is mainly a research position for 4-5 years, with a small fraction of departmental duties (e.g. teaching). The starting date is open for discussion, though ideally we would like the successful candidate to start as soon as possible.
For more information, see: https://www.kth.se/en/om/work-at-kth/lediga-jobb/what:job/jobID:178626/where:4/
|
6-28 | (2017-12-13) 2 funded PhD positions in interactive virtual characters and social robots at KTH, Stockholm, Sweden
** 2 funded PhD positions in interactive virtual characters and social robots at KTH, Sweden**
Embodied Social Agents Lab
KTH Royal Institute of Technology
Stockholm, Sweden
Deadline: 15th January 2018
ABOUT KTH
KTH Royal Institute of Technology in Stockholm has grown to become one of Europe's leading technical and engineering universities, as well as a key center of intellectual talent and innovation. We are Sweden's largest technical research and learning institution and home to students, researchers and faculty from around the world. Our research and education covers a wide area including natural sciences and all branches of engineering, as well as architecture, industrial management, urban planning, history and philosophy.
The Embodied Social Agents Lab (http://www.csc.kth.se/~chpeters/ESAL/) led by Dr. Christopher Peters aims to develop virtual characters and other systems capable of interacting socially with humans for real-world application to areas such as education. The lab is already involved in a number of local and international initiatives involving virtual characters, social robots and education. It is based out of the Visualization Studio (VIC) at KTH, a research, teaching and dissemination resource with some of the most advanced interactive visualization technologies in the world, supporting platforms for interacting with sophisticated virtual characters.
JOB DESCRIPTION
Two PhD positions are available in the area of interactive virtual characters and social robots for application to education. Research in this area brings together multidisciplinary expertise to address new challenges and opportunities in the area of virtual characters, based on real-time computer graphics and animation techniques, to investigate multimodal and natural interaction for both individuals and groups, multimodal generation of expressions, individualization of behaviour and effects of embodiment (appearance, virtual versus physical objects). Applications are the design of interactive virtual and physical systems for educational purposes.
The topics to be pursued respectively in the PhDs are:
1. Compliant Small Group Behaviour (ref: ESR5)
Develop socially compliant behaviours allowing agents to join and leave free-standing formations based on their varying roles as teachers, teaching assistants and learners in pedagogical scenarios. Investigate the impact of variations in the artificial behaviour of agents on the efficacy of pedagogical approaches and potential for application to mobile robots through virtual replicas.
2. Impact of Appearance Customisation on Interaction (ref: ESR15)
Investigate technological approaches for customising the appearances and behaviours of avatars (user controlled virtual characters and robot replicas) in relation to their users and assess the impact on interactions during learning scenarios.
Both of the PhDs involve crossovers between virtual and augmented reality, virtual characters and mobile social robots and take place within the Horizon 2020 Marie Sklodowska Curie European Training Network ANIMATAS.
ANIMATAS will establish a leading European Training Network (ETN) devoted to the development of a new generation of creative and critical research leaders and innovators who have a skill-set tailored for the creation of social capabilities necessary for realising step changes in the development of intuitive human-machine interaction (HMI) in educational settings. 15 early-stage researcher (ESR) positions are available within ANIMATAS.
The successful candidates will participate in the network's training activities offered by the European academic and industrial participating teams. PhD students will have the opportunity to work with the partners of the ANIMATAS project, such as Uppsala University, Jacobs University Bremen, Institut Mines-Télécom, University of Wisconsin-Madison, Pierre et Marie Curie University and Softbank Robotics, with possible opportunities for secondments at these institutions according to the ESR.
QUALIFICATIONS
The candidates must have an MSc degree in computer science or related areas relevant to the PhD topics. Good programming skills are required. A background in computer graphics and animation techniques or similar areas is appreciated. The PhD positions are highly interdisciplinary and require an understanding and/or interest in psychology and social sciences. The applicant should have excellent communication skills and be motivated to work in an interdisciplinary environment involving multiple stakeholders across academia, industry and education. An excellent level of written and spoken English is essential.
The positions are for four years.
HOW TO APPLY
To apply, candidates must submit their CV, a letter of application, two letters of reference and academic credentials to the ANIMATAS recruitment committee: Mohamed Chetouani (network coordinator), Ana Paiva and Arvid Kappas at contact-animatas@listes.upmc.fr, and to the main supervisor of the research project of interest (Christopher Peters, chpeters@kth.se). All applications should be made in English.
Please include the keyword "ANIMATAS" somewhere in the subject line and specify which project you are applying for (ESR5 or ESR15).
The application deadline is 15th January 2018.
Information about the positions can be provided by Dr. Christopher Peters, chpeters@kth.se.
|
6-29 | (2018-01-09) Two postdoc positions at IDIAP, Martigny, Switzerland
=========================================== position 1: ===========================================
Name : Multimodal people monitoring using sound (and vision)
Type : Postdoc
Description : The Idiap Research Institute together with the Swiss Center for Electronics and Microtechnology (CSEM) invite applications for a post-doctoral position in research and development for multimodal people monitoring. The position is funded for one year by Idiap (with a possible extension depending on the candidate's performance).
The successful candidate will work with Dr. Petr Motlicek in Idiap's Speech and Audio Processing group, engaged in world-class research in speech processing. An exceptionally qualified candidate can also be considered for a longer-term Research Associate position.
Detailed description: We have witnessed a large interest and potential of self-dependent smart sound devices to be deployed for security, surveillance, or emergency applications. Recent developments performed by CSEM in building occupancy detection and monitoring using embedded vision have led to the creation of successful monitoring applications. This project will focus on a combination of the visual and speech information which will take place in an embedded platform providing industrial grade vision sensing together with an acoustic front-end.
CSEM will provide expertise in embedded platforms, visual analysis and data fusion. The Idiap postdoctoral position will mainly focus on the speech-related aspects of the project, including speaker identification and keyword spotting, aiming to operate with limited resources.
We envisage three related research threads for this position:
1. Parameter reduction, in which we will apply sparsity and relevance constraints to train neural networks that function using as few parameters as possible.
2. Acoustic modeling sharing between different applications, in which we will build on the commonality between technologies for automatic speech recognition or keyword spotting and speaker recognition to create a single system with multiple capabilities.
3. Far-field speech processing, in which we will process signals recorded by a microphone array to substantially increase the SNR of the input signal.
The successful candidate will work at Idiap in Martigny, but in close collaboration with CSEM's R&D team based in Switzerland. The project is a unique combination of applied science and academic research expected to yield both reference designs and academic publications.
Profile: Candidates should have either or both of:
1. A strong background in engineering, mathematics or a related discipline, along with the associated familiarity with modern distributed programming environments and languages such as C++, Python and Perl.
2. An exceptional academic record and a clear aptitude for creative (and independent) research in a related discipline.
In either case, familiarity with speech processing tools such as Kaldi and deep learning toolkits such as Torch will be a distinct advantage. Although a PhD is normally a prerequisite for a post-doctoral position, candidates without a PhD may be considered in exceptional cases.
Timescale: The position is offered on a one-year basis with the possibility of renewal based on funding and performance. The starting salary will be 80,000 CHF/year. Starting date could be immediate, but otherwise as soon as possible in 2018.
=========================================== position 2: ===========================================
Name : Speech and Speaker recognition for HMI devices
Type : Postdoc
Description : The Idiap Research Institute together with a global industry partner, leader in Consumer Electronics, invite applications for two post-doctoral positions in speech and speaker recognition for HMI devices. The positions are funded for two years by the Swiss Commission for Technology and Innovation (CTI), enabling a collaboration between Idiap and an innovative product company.
The successful candidates will work with Dr. Philip N. Garner, and/or Dr. Petr Motlicek in Idiap's Speech and Audio Processing group, engaged in world-class research in speech processing. Exceptionally qualified candidates can also be considered for a longer-term Research Associate position.
Description
In recent years, the state of the art in speech and speaker recognition has been dominated by deep learning. Such technology is typically highly parametric; training can require significant CPU or GPU resources. The goal of the project is to investigate the application of the state of the art to the more limited resources of consumer-grade embedded systems which operate in combination with cloud services.
We envisage three related research threads:
1. Parameter reduction, in which we will apply sparsity and relevance constraints to train networks that function using as few parameters as possible.
2. Smart handover, in which we will assess the complexity of voice commands to optimise workload between local devices and cloud-based services.
3. System combination, in which we will build on the commonality between technologies for multilingual speech recognition, keyword spotting and speaker recognition to create a single system with multiple capabilities.
The successful candidates will work at Idiap in Martigny, but in close collaboration with the partner's R&D team based in Switzerland. The project is a unique combination of applied science and academic research expected to yield both reference designs and academic publications.
Profile
Candidates should have either or both of:
1. A strong background in engineering, mathematics or a related discipline, along with the associated familiarity with modern distributed programming environments and languages such as C++, Python and Perl.
2. An exceptional academic record and a clear aptitude for creative (and independent) research in a related discipline.
In either case, familiarity with speech processing tools such as Kaldi and deep learning toolkits such as Torch will be a distinct advantage. Although a PhD is normally a prerequisite for a post-doctoral position, candidates without a PhD may be considered in exceptional cases.
Timescale
All positions are offered on a one-year basis with the possibility of renewal based on funding and performance. The starting salary will be 80,000 CHF/year. Starting date could be immediate, but otherwise as soon as possible in 2018.
|
6-30 | (2018-01-10) Postdocs at Monash University, Melbourne, Australia
The Faculty of Information Technology (https://www.monash.edu/it) at Monash University in Melbourne, Australia, is establishing a new group in HCI and creative technologies. We invite accomplished and creative PhDs to apply for a 3-year postdoctoral fellowship in multimodal interfaces and behavior analytics. The selected candidate will join a rapidly expanding multidisciplinary group with expertise in areas such as mobile and multimodal-multisensor interfaces, agent-based conversational interfaces, brain-computer and adaptive interfaces, wearable and contextually-aware personalized interfaces, education and health interfaces, data analytics for predicting user cognition and health status, and other topics.
We are especially interested in adding faculty in these preferred areas:
(1) Wearable, contextually-aware and personalized interfaces
(2) Mobile and multimodal-multisensor interfaces, including fusion-based ones
(3) Data analytics for predicting user emotion, cognition, and health status
(4) Agent-based conversational dialogue interfaces
(5) Brain-computer and adaptive interfaces
This position involves research on predicting user cognition and health status, based on analysis of different modalities (e.g., speech, writing, images, sensors) during naturally occurring activities. These analyses involve exploring predictive patterns at the signal, activity pattern, lexical, and/or transactional levels. The ideal candidate would be an initiating researcher with a strong publication record who is interested in pioneering in emerging research areas. He/she would have an interest in developing new technologies to identify users' cognitive and health status, and using this information to develop personalized and adaptive interfaces that promote learning, performance, and health.
Requirements:
- PhD in computer science, engineering, information sciences, cognitive or linguistic sciences, or related field
- Training in HCI, multimodal interfaces, data science and analytics, modeling human behavior & communication
- Experience collecting and analyzing speech, images, handwriting, and/or other sensor data
- Experience applying machine learning/deep learning, empirical/statistical, linguistic, or hybrid analysis methods
- Interest in human cognition and educational technologies, and/or health and mental health technologies
- Strong interpersonal, teamwork, communication and writing skills
- Ability to work with diverse partners: domain experts (teachers, clinicians), industry, undergraduate/graduate students
- Prefer candidate with 2-3 years post-PhD research or work experience
HCI Group: The HCI group designs, builds, and evaluates state-of-the-art interface technologies. Our multidisciplinary interests span computer science and engineering, cognitive and learning sciences, communications, health, media design, and other topics. We are interested in applications such as health, education, communications, personal assistance, and digital arts. The HCI group has partnerships with CSIRO-Data61 and industry. The HCI area director is Dr. Sharon Oviatt, an ACM Fellow and international pioneer in human-centered, mobile, and multimodal interfaces (see https://www.monash.edu/it/our-research/graduate-research/scholarship-funded-phd-research-projects/projects/human-centred-mobile-and-multimodal-interfaces)
Monash is Australia's largest university, and ranks in the top 60 universities worldwide, with Computer and Information Systems rated in the top 70 worldwide (QS World University rankings 2018). In addition to growing rapidly in human-centered computing, software, and cyber-security, it includes data science and machine learning, artificial intelligence and robotics, computational biology, social computing, and basic computer science.
Experimental Labs & Design Spaces: The university has made recent strategic investments in facilities for prototyping innovative concepts, collecting and analyzing data, and displaying digital installations and interactive media, including sensiLab (supporting tangible, wearable, augmented and virtual reality, multimodal-multimedia, maker-space), the Immersive Visualization platform and Analytics lab, the Centre for Data Science, and the ARC Centre of Excellence on Integrative Brain Function (pioneering new multimodal imaging techniques for data exploration). The university is currently investing in HCI group facilities for prototyping and developing new mobile, multimodal and multisensor interfaces, analyzing human multimodal interaction (e.g., whole-body activity, speech), and predicting users' cognitive and health status.
Melbourne Area: Melbourne recently has been rated the #1 city worldwide for quality of life (see Economist & Guardian, http://www.economist.com/blogs/graphicdetail/2016/08/daily-chart-14 and https://www.theguardian.com/australia-news/2016/aug/18/melbourne-wins-worlds-most-liveable-city-award-sixth-year-in-a-row), with excellent education, healthcare, infrastructure, low crime, and exceptional cuisine, cultural activities, and creative design. The regional area is renowned for its dramatic coastline, extensive parks, exotic wildlife, and Yarra Valley wine region.
Position & Compensation: This position is full-time for 3 years, with competitive salary (Academic level B-6, $119,683 AUD) and benefits, including 17% superannuation retirement fund, health insurance options, relocation, and seed funds for equipment and travel. Start date is negotiable after April 1, 2018. For enquiries, contact Oviatt@incaadesigns.org.
To apply: Submit an online application at http://careers.pageuppeople.com/513/cw/en/job/571150/research-fellow-multimodal-interfaces-behaviour-analytics
Required application materials include: (1) cover letter (indicating date of availability); (2) current CV with publication list, research and teaching interests, and 3 references with email/phone contact; (3) graduate transcripts; and (4) three representative publications. Monash has a Women in IT Program, and participates in the Athena Swan Charter to enhance gender equality. We welcome female, minority and international applicants.
|
6-31 | (2018-01-11) Postdoc position at IDIAP, Martigny, Switzerland
We have a new opening for a post-doctoral researcher at Idiap Research Institute. It is a joint position with the Swiss Center for Electronics and Microtechnology (CSEM), and involves investigation of deep learning in the context of speech and image processing. For a full description, please see the link: http://www.idiap.ch/education-and-jobs/job-10236
Idiap is located in French speaking Switzerland, although the lab hosts many nationalities, and functions in English. All positions offer quite generous salaries.
Several similar positions at PhD, post-doc and senior level are available at the institute in general. http://www.idiap.ch/en/join-us/job-opportunities
|
6-32 | (2018-01-18) (SENIOR) SPEECH SCIENTIST at Voicebox, München, Germany
(SENIOR) SPEECH SCIENTIST
Voicebox is an acknowledged pioneer in the voice technology and application industry. Our patented innovations create compelling AI voice interfaces of unparalleled flexibility and usability in app dev. Trusted by many of the world’s leading companies, we have established a leadership position in the automotive market. And because our capabilities were developed against our vision of a single, unifying interface across all connected devices, our continued growth is being driven by new markets, such as connected TVs and mobile computing. Our ability to capture this growth requires that we continue to add to our diverse team of talented professionals. Our opportunity is your opportunity.
As Speech Scientist you will work closely with the Cloud ASR team on cutting-edge conversational voice information retrieval systems. Within this team of researchers and engineers you will improve existing products and develop brand new technologies. You will be responsible for the design, development and testing of large vocabulary speech application technologies and also engage in all aspects of research activities (e.g., writing proposals, conducting novel research, presenting and publishing your research results).
Typical work packages are:
• Training and adaptation of acoustic models (car, mobile, far-field) for a multitude of languages
• Developing statistical language models for various applications and languages
• Designing and developing new speech applications
• Tuning and maintaining speech applications
• Adapting speech resources to specific customers’ requirements
• Research, development and implementation of new algorithms in ASR
• Performing statistical analysis on large datasets (multiple terabytes of data)
Skills:
• Highly independent and capable of fulfilling multiple project commitments concurrently
• Passionate about analyzing data to solve problems and improve systems
• Good analytical and diagnostic skills, quick learner
• Coding skills in languages such as C/C++ or Java
• Knowledge of multiple natural languages is a strong plus
• Knowledge of digital signal processing
• Ability to write technical specifications and requirements in English
• Good English communication skills
Experience:
• Prior experience training models for HMM/DNN-based recognizers (Kaldi/HTK)
• Understanding of ASR training/decoding processes
• Understanding of ASR front-end components
• Know-how in far-field and microphone array processing
• Experience in a UNIX environment, strong in scripting languages such as Bash, Perl and Python
• Ph.D. or Master in Computer Science, Electrical Engineering, Mathematics, or relevant field, or strong professional experience
VoiceBox Technologies Deutschland GmbH, Ramersdorfer Straße 1, 81669 München, Germany
michaelw@voicebox.com
|
6-33 | (2018-01-18) 3 permanent (indefinite tenure) faculty positions at Telecom ParisTech, Paris, France
Telecom ParisTech has three new permanent (indefinite tenure) faculty positions:
- Faculty position (Full Professor) in audio/speech/music signal processing
- Faculty position (Associate Professor) in deep learning applied to temporal data analysis
- Faculty position (Associate Professor) in deep learning for image processing
More information on: http://www.tsi.telecom-paristech.fr/en/blog/2018/01/05/three-new-permanent-tenure-faculty-positions/
Please note the following important dates:
- February 28th, 2018: closing date
- Mid-April 2018: hearings of preselected candidates (tentative dates for hearings are April 11th and/or 13th)
- September 2018: tentative starting date
|
6-34 | (2018-01-20) Machine learning Software Engineer, Adobe Research - Speech Recognition, San Jose, CA, USA
Machine learning Software Engineer, Adobe Research - Speech Recognition
Description
Creative Intelligence Lab within Adobe Research plays a key role in creating next-generation applications and features in Adobe’s flagship products, including Photoshop, Lightroom, Audition, and Acrobat. Creative Intelligence Lab is searching for a machine learning software engineer specializing in speech recognition. Responsibilities include working closely with researchers, engineers, user experience designers, and product managers to build prototypes that showcase new research technologies, and to help integrate those technologies into Adobe’s products. This position will initially focus on building a speech recognition system for our creative assistant, and may expand to include additional areas of innovation, such as text to speech, NLP, HCI, machine learning, and dialog systems.
We’re looking for exceptional candidates with expertise in computer science or software engineering. For this position, we will give preference to candidates with experience in applied machine learning and speech recognition. For successful candidates, nearly all of the following will be true:
- You have significant experience in building robust, complex software systems.
- You are excellent at collaborating with a team to get work done.
- You are comfortable both building prototypes from scratch and writing maintainable code inside large existing codebases.
- You have shipped software in a commercial environment (start-ups a plus) and can deal with last-minute bug fixes and schedule changes.
Requirements
- M.S. degree or higher in Computer Science or a related field.
- Significant experience developing speech recognition systems (acoustic models, language models) using popular machine learning and deep learning software libraries, such as Kaldi and/or Tensorflow.
- Ability to write efficient, clean, and reusable code.
- Strong communication and collaboration skills.
- Ability and willingness to learn new technologies quickly.
At Adobe, you will be immersed in an exceptional work environment that is recognized throughout the world on Best Companies lists. You will also be surrounded by colleagues who are committed to helping each other grow through our unique Check-In approach where ongoing feedback flows freely. If you’re looking to make an impact, Adobe's the place for you. Discover what our employees are saying about their career experiences on the Adobe Life blog and explore the meaningful benefits we offer.
Adobe is an equal opportunity employer. We welcome and encourage diversity in the workplace regardless of race, gender, sexual orientation, gender identity, disability or veteran status.
Contact: Trung Bui (bui@adobe.com) or Walter Chang (wachang@adobe.com)
|
6-35 | (2018-01-25) 2018 PHD RESEARCH FELLOWSHIPS at University of Trento, Italy
2018 PHD RESEARCH FELLOWSHIPS (ML/Dialogue/Language/Speech) Location: University of Trento, Italy
You may have enjoyed reading about bots, artificial intelligence, machine learning, digital assistants, and systems that support doctors, teachers, and customers and help people. If so, consider taking a front-row seat and joining the research team that has been training intelligent machines and evaluating AI-based systems for more than two decades, collaborating with the best research labs in the world and deploying these systems in the real world.
Here is a sample of the projects ( http://sisl.disi.unitn.it/demo/ ) the Signals and Interactive Systems Lab (University of Trento, Italy) has been leading:
- Natural Language Understanding systems for massive amounts of human language data: http://www.sensei-conversation.eu
- Amazon Alexa challenge on Conversational Systems: http://sisl.disi.unitn.it/university-of-trento-is-selected-by-amazon-for-the-alexa-challenge/
- Designing AI personal agents for the healthcare domain: http://sisl.disi.unitn.it/pha/
The lab is looking for top candidates for its funded PhD research fellowships. Candidates should have a background in at least one of the following areas:
- Speech Processing
- Natural Language Understanding
- Conversational Systems
- Machine Learning
Candidates will be working on research domains such as Conversational Agents, Intelligent Systems, Speech/Text Document Mining and Summarization, Human Behavior Understanding, Crowd Computing and AI-based systems for tutoring.
For more information on research and projects, visit the lab website at http://sisl.disi.unitn.it/
The SIS Lab research is driven by an interdisciplinary approach to research, attracting researchers from disciplines such as Digital Signal Processing, Speech Processing, Computational Linguistics, Psychology, Neuroscience and Machine Learning.
The official language (research and teaching) of the department is English.
FELLOWSHIP
The gross amount of the fellowships (internship and PhD) is competitive, approximately 1,600 Euro/month. Students may qualify for reduced rates on campus lodging, transportation and cafeteria meals.
For more information about cost of living, campus, graduate education programs, please visit the graduate school website at http://ict.unitn.it/
DEADLINES
Immediate openings with start date as early as March 2018. Open until filled.
REQUIREMENTS
A strict requirement is at least a Master-level degree in Computer Science, Electrical Engineering, Computational Linguistics or a similar or related discipline. Students with other backgrounds (Physics, Applied Math) may apply as well. Background in at least one of the posted research areas is required. All applicants should have very good programming and math skills and be used to teamwork.
HOW TO APPLY
Interested applicants should send their 1) CV, 2) statement of research interest, and 3) three reference letters to:
Email: sisl-jobs@disi.unitn.it
For more info:
Signals and Interactive Systems Lab : http://sisl.disi.unitn.it/
PhD School : http://ict.unitn.it/
Department : http://disi.unitn.it/
Information Engineering and Computer Science Department (DISI)
DISI has a strong focus on cross-disciplinarity with professors from different faculties of the University (Physical Science, Electrical Engineering, Economics, Social Science, Cognitive Science, Computer Science) with international background. DISI aims at exploiting the complementary experiences present in the various research areas in order to develop innovative methods, technologies and applications.
University of Trento
The University of Trento is consistently ranked as a premier Italian university. See http://www.unitn.it/en/node/1636/mid/2573
University of Trento is an equal opportunity employer.
|
6-36 | (2018-01-25) PhD student in Robot-assisted Language Learning at KTH Royal Institute of Technology, Stockholm, Sweden
PhD student in Robot-assisted Language Learning at KTH Royal Institute of Technology, Stockholm, Sweden
Ending January 31st 2018.
Olov Engwall
Professor in Speech Communication
|
6-37 | (2018-01-26) Associate Professor (MCF) position at ENSIMAG, Grenoble, France
Associate Professor (maître de conférences, MCF) position at ENSIMAG.
Host school: ENSIMAG
Teaching profile:
Ensimag is recruiting an associate professor (maître de conférences) in applied mathematics or computer science to develop teaching in statistical learning, artificial intelligence, data visualization, high-performance computing, or big data. The application must demonstrate the candidate's interdisciplinary profile, their ability to take on responsibilities within the school, and a substantial list of works or publications related to one or more branches of data science. In addition to teaching data science (program synthesis from data, decision support), the person recruited will be expected to contribute to the Ensimag core curriculum (first year and about 75% of the second-year tracks), which forms the foundation of our engineering students' training. They will be expected to get involved and take on responsibilities in programs of the school such as the "mastère big data" or the "Data Science" master's. In partnership with industry, the person recruited could supervise the organization of challenges and hackathons in order to expand the school's contacts in the fields of artificial intelligence and big data. In collaboration with the teaching teams concerned, they will be expected to take part in setting up project-based teaching and digital learning.
RESEARCH
Host laboratory: LIG / LJK
Research profile:
The candidate will carry out research in artificial intelligence or data science, and will show openness to the different possible approaches in this field. Preferred topics are learning on complex, structured or unstructured data, deep learning and neural networks, and in particular questions of optimization, causality, and generalization capacity and their mathematical analysis. Among the applications of machine learning and deep learning, particular interest is given to signal and image processing, representation learning, learning with multimedia data, language data for problems arising in natural language processing, the transparency of learning mechanisms, as well as applications in biology, health, the humanities, social networks, physics, the environment, etc.
The recruitment will strengthen the links between LIG and LJK in data science and machine learning. The two laboratories are located on the Saint Martin d'Hères campus and have active collaborations, in particular within the large-scale data and knowledge processing axis (AMA, GETALP, MRIM, SLIDE teams), the PERVASIVE and TYREX teams of LIG, and within the Proba-Stat department (DAO, SVH, MISTIS, FIGAL teams) and the THOTH team of LJK. Joint projects between the two laboratories also include prediction and classification problems with structured functional data, optimal transport for learning, and sparsity and regularization problems for multi-task learning and their solution by stochastic optimization methods. The person recruited will show their ability to play an active role in academic (ANR, FUI, PFIA, EU...) and industrial contract projects on these very promising topics.
ADMINISTRATIVE ACTIVITIES
Specific features or particular constraints of the position:
Administrative activities tied to the duties of an associate professor: responsibility for teaching units, tracks, or year groups.
Expected skills:
Knowledge: teaching of computer science, artificial intelligence and data science
Know-how: pedagogy and responsibilities within the school
Interpersonal skills: teamwork
Keywords: artificial intelligence, data science, big data, machine learning
------------------------
Laurent Besacier
Professor at Univ. Grenoble Alpes (UGA)
Laboratoire d'Informatique de Grenoble (LIG)
Junior Member of the Institut Universitaire de France (IUF 2012-2017)
Head of the GETALP team at LIG
Director of the MSTII doctoral school (ED)
-------------------------
!! New contact details !!: LIG
Laboratoire d'Informatique de Grenoble, Bâtiment IMAG, 700 avenue Centrale, Domaine Universitaire - 38401 St Martin d'Hères
New tel: 0457421454
--------------------------
|
6-38 | (2018-01-27) Post-doctoral researcher at Idiap Research Institute, Martigny, Switzerland
Post-doctoral researcher at Idiap Research Institute.
It is a joint position with the Swiss Center for Electronics and Microtechnology (CSEM), and involves investigation of deep learning in the context of speech and image processing. For a full description, please see the link: http://www.idiap.ch/education-and-jobs/job-10236
Idiap is located in French speaking Switzerland, although the lab hosts many nationalities, and functions in English. All positions offer quite generous salaries.
Several similar positions at PhD, post-doc and senior level are available at the institute in general. http://www.idiap.ch/en/join-us/job-opportunities
|
6-39 | (2018-01-27) Post Doctoral Position (12 months) at INRIA Nancy, France
Post Doctoral Position (12 months)
Natural language processing: automatic speech recognition system using deep neural networks without out-of-vocabulary words
_______________________________________
- Location: INRIA Nancy Grand Est research center, France
- Research theme: PERCEPTION, COGNITION, INTERACTION
- Project-team: Multispeech
- Scientific Context:
More and more audio and video content appears on the Internet each day: about 300 hours of multimedia are uploaded per minute. In these multimedia sources, audio data represents a very important part. If these documents are not transcribed, automatic content retrieval is difficult or impossible. The classical approach for spoken content retrieval from audio documents is automatic speech recognition followed by text retrieval.
An automatic speech recognition (ASR) system uses a lexicon containing the most frequent words of the language, and only the words of the lexicon can be recognized by the system. New proper names (PNs) appear constantly, requiring dynamic updates of the lexicons used by the ASR system. These PNs evolve over time and no vocabulary will ever contain all existing PNs. When a person searches for a document, proper names are used in the query. If these PNs have not been recognized, the document cannot be found. These missing PNs can be very important for the understanding of the document.
In this study, we will focus on the problem of proper names in automatic recognition systems. The problem is how to model relevant proper names for the audio document we want to transcribe.
- Missions:
We assume that an audio document to be transcribed contains missing proper names, i.e. proper names that are pronounced in the audio document but that are not in the lexicon of the automatic speech recognition system; these proper names cannot be recognized (out-of-vocabulary proper names, OOV PNs). The purpose of this work is to design a methodology to find and model a list of relevant OOV PNs that correspond to an audio document.
Assuming that we have an approximate transcription of the audio document and a huge text corpus extracted from the Internet, several methodologies could be studied:
- From the approximate OOV pronunciation in the transcription, generate the possible spellings of the word (phoneme-to-character conversion) and search for this word in the text corpus.
- A deep neural network can be designed to predict OOV proper names and their pronunciations, with the training objective of maximizing the retrieval of relevant OOV proper names.
The proposed approaches will be validated using the ASR developed in our team.
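The first methodology listed above (phoneme-to-character conversion followed by a corpus search) can be illustrated with a minimal Python sketch. The phoneme-to-grapheme table `P2G`, the function names, and the one-sentence corpus are hypothetical and chosen only for illustration; the posting does not say how the conversion would be implemented, and a real system would presumably learn the mapping from data rather than enumerate it by hand.

```python
from itertools import product

# Hypothetical toy phoneme-to-grapheme table (an assumption for illustration;
# a real system would learn this mapping from pronunciation data).
P2G = {
    "f": ["f", "ph"],
    "o": ["o", "au", "eau"],
    "r": ["r", "rr", "hr"],
}

def candidate_spellings(phonemes, table=P2G):
    """Return every spelling obtained by picking one grapheme per phoneme."""
    options = [table[p] for p in phonemes]
    return ["".join(choice) for choice in product(*options)]

def find_in_corpus(phonemes, corpus_tokens):
    """Keep only the candidate spellings that are attested in the corpus."""
    vocab = {tok.lower() for tok in corpus_tokens}
    return sorted(s for s in candidate_spellings(phonemes) if s in vocab)

# Made-up corpus standing in for the large text corpus collected from the Internet.
corpus = "Reporter Fohr covered the match in Nancy".split()
matches = find_in_corpus(["f", "o", "r"], corpus)
print(matches)
```

Exhaustive enumeration grows exponentially with the number of phonemes, so this only conveys the idea; any practical variant would need pruning or a probabilistic model.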
Keywords: deep neural networks, automatic speech recognition, lexicon, out-of-vocabulary words.
- Bibliography
[Mikolov2013] Mikolov, T., Chen, K., Corrado, G. and Dean, J. "Efficient estimation of word representations in vector space", Workshop at ICLR, 2013.
[Deng2013] Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y. and Acero, A. "Recent advances in deep learning for speech research at Microsoft", Proceedings of ICASSP, 2013.
[Sheikh2016] Sheikh, I., Illina, I., Fohr, D. and Linarès, G. "Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition", Interspeech, 2016.
[Li2017] Li, J., Ye, G., Zhao, R., Droppo, J. and Gong, Y. "Acoustic-to-Word Model without OOV", ASRU, 2017.
- Skills and profile: PhD in computer science, background in statistics and natural language processing, experience with deep learning tools (keras, kaldi, etc.) and programming skills (Perl, Python).
- Additional information:
Supervision and contact: Irina Illina, LORIA/INRIA (illina@loria.fr), Dominique Fohr INRIA/LORIA (dominique.fohr@loria.fr) https://members.loria.fr/IIllina/, https://members.loria.fr/DFohr/
Additional links : Ecole Doctorale IAEM Lorraine
Deadline to apply: June 6th
Selection results: end of June
Duration: 12 months.
Starting date: between Nov. 1st 2018 and Jan. 1st 2019
Salary: about 2,115 euros net, medical insurance included
The candidates must have defended their PhD later than Sept. 1st 2016 and before the end of 2018.
The candidates are required to provide the following documents in a single pdf or ZIP file:
- CV including a description of your research activities (2 pages max) and a short description of what you consider to be your best contributions and why (1 page max and 3 contributions max); the contributions could be theoretical or practical. Web links to the contributions should be provided. Include also a brief description of your scientific and career projects, and your scientific positioning regarding the proposed subject.
- The report(s) from your PhD external reviewer(s), if applicable.
- If you haven't defended yet, the list of expected members of your PhD committee (if known) and the expected date of defence.
In addition, at least one recommendation letter, from the PhD advisor, should be sent directly by its author to the prospective postdoc advisor.
|
6-40 | (2018-01-27) PhD grant Natural language processing: adding new words to a speech recognition system using Deep Neural Networks, INRIA/LORIA, Nancy, France
Natural language processing: adding new words to a speech recognition system using Deep Neural Networks
- Location: INRIA/LORIA Nancy Grand Est research center, France
- Research theme: Perception, Cognition, Interaction
- Project-team: Multispeech
- Scientific Context:
Voice is seen as the next big field for computer interaction. The research company Gartner reckons that by 2018, 30% of all interactions with devices will be voice-based: people can speak up to four times faster than they can type, and the technology behind voice interaction is improving all the time.
As of October 2017, Amazon Echo is present in about 4% of American households. Voice assistants are proliferating in smartphones too: Apple's Siri handles over 2 billion commands a week, and 20% of Google searches on Android-powered handsets in America are done by voice input.
Proper names (PNs) play a particular role: they are often important for understanding a message and can vary enormously. For example, a voice assistant should know the names of all your friends; a search engine should know the names of all famous people and places, names of museums, etc.
An automatic speech recognition (ASR) system uses a lexicon containing the most frequent words of the language, and only the words of the lexicon can be recognized by the system. It is impossible to add all possible proper names because there are millions of proper names and new ones appear every day. A competitive solution is to dynamically add new PNs to the ASR system. The idea is to add only relevant proper names: for instance, if we want to transcribe a video document about football results, we should add the names of famous football players and not politicians.
In this study, we will focus on the problem of proper names in automatic recognition systems. The problem is to find relevant proper names for the audio document we want to transcribe. To select the relevant proper names, we propose to use an artificial neural network.
- Missions:
We assume that an audio document to be transcribed contains missing proper names, i.e. proper names that are pronounced in the audio document but that are not in the lexicon of the automatic speech recognition system; these proper names cannot be recognized (out-of-vocabulary proper names, OOV PNs).
The goal of this PhD thesis is to find a list of relevant OOV PNs that correspond to an audio document and to integrate them into the speech recognition system. We will use a deep neural network (DNN) to find relevant OOV PNs. The input of the DNN will be the approximate transcription of the audio document and the output will be the list of relevant OOV PNs with their probabilities. The retrieved proper names will be added to the lexicon and a new recognition pass over the audio document will be performed.
During the thesis, the student will investigate methodologies based on deep neural networks [Deng2013]. The candidate will study different DNN structures and different document representations [Mikolov2013]. The student will validate the proposed approaches using the automatic radio broadcast transcription system developed in our team.
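To make the data flow of the mission concrete (approximate transcription in, ranked OOV proper names with probabilities out, lexicon extended), here is a minimal Python sketch. It is not the DNN the posting proposes: as a clearly labeled stand-in, it scores each candidate name by bag-of-words cosine similarity between the transcript and a short context text for that name, then normalizes the scores with a softmax. The names, context texts, and function names are all hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_oov_names(transcript, name_contexts):
    """Return (name, probability) pairs sorted by decreasing probability.

    Stand-in for the DNN: similarity scores are turned into a probability
    distribution with a softmax.
    """
    doc = Counter(transcript.lower().split())
    scores = {name: cosine(doc, Counter(ctx.lower().split()))
              for name, ctx in name_contexts.items()}
    z = sum(math.exp(s) for s in scores.values())
    probs = {name: math.exp(s) / z for name, s in scores.items()}
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

def extend_lexicon(lexicon, ranked, top_k=1):
    """Add the top_k retrieved names to a copy of the lexicon."""
    return lexicon | {name for name, _ in ranked[:top_k]}

# Hypothetical candidate names with made-up context words from a text corpus.
contexts = {
    "Alvarez": "football striker goal match league",
    "Dupont": "minister parliament vote election law",
}
ranked = rank_oov_names("the striker scored a goal in the football match", contexts)
new_lexicon = extend_lexicon({"goal", "match"}, ranked, top_k=1)
print(ranked, new_lexicon)
```

In the workflow described above, the extended lexicon would then be used for a second recognition pass over the same audio document.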
- Bibliography:
[Mikolov2013] Mikolov, T., Chen, K., Corrado, G. and Dean, J. "Efficient estimation of word representations in vector space", Workshop at ICLR, 2013.
[Deng2013] Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y. and Acero, A. "Recent advances in deep learning for speech research at Microsoft", Proceedings of ICASSP, 2013.
[Sheikh2016] Sheikh, I., Illina, I., Fohr, D. and Linarès, G. "Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition", Interspeech, 2016.
- Skills and profile: Master in computer science, background in statistics and natural language processing, experience with deep learning tools (keras, kaldi, etc.) and programming skills (Perl, Python).
- Additional information:
Supervision and contact: Irina Illina, LORIA/INRIA (illina@loria.fr), Dominique Fohr INRIA/LORIA (dominique.fohr@loria.fr) https://members.loria.fr/IIllina/, https://members.loria.fr/DFohr/
Additional links: Ecole Doctorale IAEM Lorraine
Duration: 3 years
Starting date: between Oct. 1st 2018 and Jan. 1st 2019
Deadline to apply: May 1st 2018
The candidates are required to provide the following documents in a single pdf or ZIP file:
- CV
- A cover/motivation letter describing their interest in the topic
- Degree certificates and transcripts for Bachelor and Master (or the last 5 years)
- Master thesis (or equivalent) if it is already completed, or otherwise a description of the work in progress
- The publications (or web links) of the candidate, if any (it is not expected that they have any)
In addition, one recommendation letter from the person who supervises or supervised the Master thesis (or research project or internship) should be sent directly by its author to the prospective PhD advisor.
|
6-41 | (2018-02-01) Junior Linguist (French), Paris, France
Job Title:
Junior Linguist [French]
Linguistic Field(s):
Phonetics, Phonology, Morphology, Semantics, Syntax, Lexicography, NLP
Location:
Paris, France
Job description:
The role of the Junior Linguist is to annotate and review linguistic data in French. The Junior Linguist will also contribute to a number of other tasks to improve natural language processing. The tasks include:
- Providing phonetic/phonemic transcription of lexicon entries
- Analyzing acoustic data to evaluate speech synthesis
- Annotating and reviewing linguistic data
- Labeling text for disambiguation, expansion, and text normalization
- Annotating lexicon entries according to guidelines
- Evaluating current system outputs
- Deriving NLP data for new and on-going projects
- Working independently with confidence and little oversight
Minimum Requirements:
- Native speaker of French and fluent in English
- Extensive knowledge of phonetic/phonemic transcriptions
- Familiarity with TTS tools and techniques
- Experience in annotation work
- Knowledge of phonetics, phonology, semantics, syntax, morphology or lexicography
- Excellent oral and written communication skills
- Attention to detail and good organizational skills
Desired Skills:
- Degree in Linguistics or Computational Linguistics or Speech processing
- Ability to quickly grasp technical concepts; learn in-house tools
- Keen interest in technology and computer-literate
- Listening Skills
- Fast and Accurate Keyboard Typing Skills
- Familiarity with Transcription Software
- Editing, Grammar Check and Proofing Skills
- Research Skills
To apply, send a CV and motivation letter to: maroussia.houimli@adeccooutsourcing.fr
|
6-42 | (2018-02-16) Postdoctoral Research Scientist: Computational Linguistics, Rochester, NY, USA
Postdoctoral Research Scientist: Computational Linguistics
We invite applications for an interdisciplinary postdoctoral position with specialization in computational linguistics and/or technical or scientific methods in language science at Rochester Institute of Technology (RIT), in Rochester, NY. This is a one-year position with opportunity for renewal. The applicant should demonstrate a fit with our commitment to collaborate with colleagues across the university on research initiatives in Personalized Healthcare Technology. In addition to engaging in research projects, the right candidate will be able to teach a total of two courses per year - one course each in the College of Liberal Arts and the Golisano College of Computing and Information Sciences at RIT. The teaching assignment may be Computer Science Principles, Introduction to Language Science, Language Technology, Introduction to Natural Language Processing, Science and Analytics of Speech (acoustic and experimental phonetics), Spoken Language Processing (automatic speech recognition and text-to-speech synthesis), Seminar in Computational Linguistics, or another course depending on background.
Required Minimum Qualifications
- PhD, with training in Computational Linguistics, Linguistics, or an allied field
- Advanced graduate coursework in computational linguistics (natural language processing or speech processing), linguistics, or language science broadly
- Publication record and plan for research and grant seeking activities
- Ability to contribute in meaningful ways to our commitment to cultural diversity, pluralism, and individual differences
Required Application Documents
Cover Letter, Curriculum Vitae or Resume, List of References, Research Statement
How To Apply
Please apply at: http://careers.rit.edu/staff. Click the link for search openings and in the keyword search field, enter the title of the position or 3599BR.
|
6-43 | (2018-02-18) Postdoctoral Research Associate (PDRA) at University of Kent, UK
A Postdoctoral Research Associate (PDRA) position is available in English Language and Linguistics at the University of Kent. We are looking for an enthusiastic candidate to join our interdisciplinary team working on a project funded by the Leverhulme Trust, "Does Language Have Groove? Sensorimotor Synchronisation for the Study of Linguistic Rhythm". The project brings together expertise from phonetics, cognitive and movement sciences, and involves a collaboration between the University of Kent (UK) and the University of Montreal (Canada), in particular the International Laboratory for Brain, Music and Sound Research (BRAMS). The post holder will be based at Kent in Canterbury, with opportunities to travel to collaborative meetings and conferences.
This project aims to resolve current controversies surrounding rhythmic properties of language. Using paradigms based on sensorimotor synchronisation and rhythmic movement to cross-linguistic materials, we will develop new approaches to the study of linguistic rhythm. The PDRA will have strong computational skills and expertise in data analyses and modelling. S/he will be responsible for collecting and analysing tapping and perception data, for disseminating the project's findings to academic and non-academic audiences, and writing up the results for publication.
The successful candidate must have a PhD in Phonetics, Experimental Psychology, Cognitive Science, Computing, Movement Sciences or related disciplines. Excellent programming skills in MATLAB/R environments as well as expertise in experimental design and statistical methods (e.g., multivariate statistics, linear mixed models, time-series analyses) are required. A publication record (commensurate with the applicant's career stage), knowledge of advanced non-linear modelling techniques and experience working with different speech perception paradigms are highly desirable. The candidate should enjoy working in an interdisciplinary environment and be interested in working with speakers from typologically diverse language backgrounds. Knowledge of languages in addition to English (French, Italian and/or non-European languages) is a plus.
The required application documents include (1) a cover letter outlining the candidate's background and suitability for the post and (2) a CV with a list of publications and contact details of three academic referees who are available to provide a reference letter for the shortlisted candidates prior to the interview date.
- Start date for applications: 24 January 2018
- Closing date for applications: 26 February 2018
- Interviews are to be held: 29 March 2018
- Start date of the post: 1 May 2018
Please use this link to view the full job description and also to apply for this post. If you require further information regarding the application process please contact The Resourcing Team on jobs@kent.ac.uk quoting ref number: HUM0834. For informal enquiries about the post, please contact Dr Tamara Rathcke (t.v.rathcke@kent.ac.uk) +44 1227 826540.
|
6-44 | (2018-02-19) PhD at Loria/Inria and Télécom ParisTech
We are offering a PhD thesis topic on speech enhancement using deep learning at Loria/Inria Nancy Grand-Est and LTCI/Télécom ParisTech.
The call for applications closes on 30 April 2018.
|
6-45 | (2018-02-19) Post-Doctoral Researcher at Paderborn University, Germany
The department of Communications Engineering at Paderborn University, Faculty of Electrical Engineering, Informatics and Mathematics, offers a position as Post-Doctoral Researcher (pay scale 13 TV-L) at full time employment (100 %). The position is according to the German Wissenschaftszeitvertragsgesetz (WissZeitVG) and aims for scientific qualification in the area of project management and original research. It is temporary for 1.5 years, which is considered suitable for the qualification aim. An extension is, in principle, possible.
The project: Your work will be concerned with signal processing and machine learning for wireless acoustic sensor networks. The research is carried out in the context of a multidisciplinary and multi-site collaborative research initiative.
About us: We are a highly motivated research group working on cutting-edge techniques for robust acquisition and processing of speech and audio, such as enhancement, beamforming, source separation, recognition and acoustic scene understanding. We have strong links to both national and international research groups.
About you:
- You hold a PhD in Electrical Engineering, Computer Science or a closely related field
- You gained thorough and extensive knowledge in the fields of signal processing and machine learning for audio or speech
- You have a strong publication record demonstrating innovative research achievements
- You have excellent programming skills
Applications of women are particularly welcome and, in case of equal qualifications and experiences, will receive preferential treatment according to the North Rhine-Westphalian Equal Opportunities Act (LGG), unless there are preponderant reasons to give preference to another applicant. Part-time employment is, in principle, possible. Applications from disabled people with appropriate suitability are explicitly welcome. This also applies to people with equal opportunities in accordance with the German social law SGB IX.
Information about the Department of Communications Engineering can be found at http://ei.uni-paderborn.de/nt/
Please send your application including letter of motivation, research profile, CV, certificates, list of publications, and contact data for reference letters by mail citing the reference number 3291, not later than 31 March 2018, to:
Prof. Dr. Reinhold Häb-Umbach
Fachgebiet Nachrichtentechnik
Universität Paderborn
Warburger Str. 100
33098 Paderborn
GERMANY
(haeb@nt.uni-paderborn.de)
|
6-46 | (2018-02-20) 1 (W/M) researcher position at IRCAM, Paris, France
Position: 1 (W/M) researcher position at IRCAM
Starting: April 1st, 2018
Duration: 12 months
Deadline for application: March 15th, 2018
Position description 201802UMGRES: IRCAM is looking for a researcher for the development of music content analysis technologies (such as tempo, chord, structure, auto-tagging) in the framework of a technology transfer with Universal Music Group.
Required profile:
• Very high skill in audio signal processing (spectral analysis, audio-feature extraction, parameter estimation) (the candidate should preferably hold a PhD in this field)
• Very high skill in machine learning (SVM, ConvNet) (the candidate should preferably hold a PhD in this field)
• High skill in distributed computing
• High skill in Matlab and Python programming, skills in C/C++ programming
• Good knowledge of UNIX environments (GNU-Linux or MacOSX)
• High productivity, methodical work, excellent programming style.
The hired researcher will also collaborate with the development team and participate in the project activities (evaluation of technologies, meetings, specifications, reports).
Introduction to IRCAM: IRCAM is a leading non-profit organization associated to Centre Pompidou, dedicated to music production, R&D and education in sound and music technologies. It hosts composers, researchers and students from many countries cooperating in contemporary music production, scientific and applied research. The main topics addressed in its R&D department include acoustics, audio signal processing, computer music, interaction technologies and musicology. IRCAM is located in the centre of Paris near the Centre Pompidou, at 1, Place Igor Stravinsky 75004 Paris.
Salary: According to background and experience
Applications: Please send an application letter with the reference 201802UMGRES together with your resume and any suitable information addressing the above issues preferably by email to: peeters at ircam dot fr with cc to vinet at ircam dot fr.
|
6-47 | (2018-02-22) Postdoctoral Researcher at Saarland University, Germany
Postdoctoral Researcher
(computational linguistics or computer science)
Models of Intercomprehension in Speech and Language
The Language Science and Technology department at Saarland University seeks to fill a postdoctoral position. Applications are invited from individuals with research expertise in any field related to speech science and speech technology. Our research project is concerned with the analysis of cross-lingual mutual intelligibility between Slavic languages. It studies the auditory-perceptual intercomprehension of Slavic languages based on analyses of the acoustic, phonetic and phonological structure of spoken utterances. This line of investigation will be complemented by using adaptation techniques established in speech synthesis and recognition to measure the distance between languages. In addition, similarity will be determined on the level of complete utterances.
The postdoc will join a vibrant community of speech and language researchers at Saarland University whose expertise spans areas such as computational linguistics, psycholinguistics, language and speech technology, speech science, theoretical and corpus linguistics, computer science, and psychology.
Requirements: The successful candidate should have a Ph.D./Master's in Computer Science, Computational Linguistics, or a related discipline, with a strong background in speech science and speech technology, in particular TTS and ASR. Strong programming skills are essential. A good command of English is mandatory. Working knowledge of German is desirable but not a prerequisite. Candidates must have completed their Ph.D. by the time of the appointment.
The position is a full position (100%) on the German E13 scale and subject to the final approval by the funding agency. Starting dates can be between July and October, 2018. The appointment will be for between one and four years.
About the department: The department of Language Science and Technology is one of the leading departments in the speech and language area in Europe. The flagship project at the moment is the CRC on Information Density and Linguistic Encoding. Furthermore, the department is involved in the cluster of excellence Multimodal Computing and Interaction. It also runs a significant number of European and nationally funded projects. In total it has seven faculty and around 50 postdoctoral researchers and PhD students.
How to apply: Please send us: (1) a letter of motivation, (2) your CV, (3) your transcripts, (4) a list of publications, and (5) the names and contact information of at least two references, as a single PDF or a link to a PDF if the file size is more than 3 MB.
Please apply by April 3rd, 2018.
Contacts: If you are interested in the project, please send an email to Bernd Möbius (moebius@coli.uni-saarland.de) and Dietrich Klakow (dietrich.klakow@lsv.uni-saarland.de).
|
6-48 | (2018-03-02) POSTDOCTORAL FELLOW POSITION, CNRS and INSERM, Lyon, France
POSTDOCTORAL FELLOW POSITION
Applications are invited for a 12-month full-time (with possible 12-month extension) Postdoctoral Position in cognitive neuroscience in Lyon, to collect and analyze fMRI data on language processing. The post-doc is part of an exciting new project, which is a collaboration between Drs Alice Roy and Véronique Boulenger from the Laboratory Dynamics of Language (CNRS), and Dr Claudio Brozzoli from the Lyon Neuroscience Research Centre (INSERM).
The project lies in the context of embodied cognition theories and aims at uncovering the functional role of the motor system in second language processing. It will examine, using fMRI, the dynamics of cortical activation in motor regions before and after phonological training in a foreign language.
The project will be conducted in Lyon, which offers a vibrant and stimulating neuroscience environment and a culturally rich city life, and is ideally located just an hour away from the Alps, 2 hours from Paris and an hour and a half from Marseille and the Mediterranean Sea (by train).
Key requirements for the candidates:
The ideal candidate will have a PhD in neuroscience, cognitive sciences or a related field and will have substantial experience in fMRI imaging analyses (e.g. SPM, connectivity analysis, resting state) and good programming skills (MATLAB). A background in speech and language is required.
Applications in the form of a cover letter with statement of research interests and a CV with full publication list should be sent by email to alice.roy@cnrs.fr and veronique.boulenger@cnrs.fr, with cc to claudio.brozzoli@inserm.fr.
Applicants from outside the European Union are welcome but they must qualify for a valid visa. French speaking is not a requirement (although it is an asset) as long as the English language is mastered.
Starting date: 2018; please contact us for further information.
Net salary: ~2000 € / month
Applications will be considered until the position is filled.
Please feel free to forward this announcement to colleagues and students who could be interested in this position.
http://www.ddl.ish-lyon.cnrs.fr/equipes/index.asp?Langue=EN&Equipe=7&Page=Presentation&
--
Véronique Boulenger
Chargée de Recherche CNRS
Laboratoire Dynamique Du Langage
UMR5596 CNRS/Université de Lyon
04.72.72.79.24
veronique.boulenger@cnrs.fr
|
6-49 | (2018-03-03) Associate Professor (Maître de conférences), ENSIMAG, Grenoble, France
Home school: ENSIMAG
Teaching profile:
Ensimag is recruiting an associate professor (maître de conférences) in applied mathematics or computer science to develop its teaching in statistical learning, artificial intelligence, data visualization, high-performance computing, or "big data". The application should show the candidate's interdisciplinary profile, their ability to take on responsibilities within the school, and a substantial list of work or publications related to one or more branches of data science. In addition to teaching data science (program synthesis from data, decision support), the person recruited will be expected to contribute to the Ensimag core curriculum (1st year and about 75% of the 2nd-year tracks), which forms the foundation for our engineering students. They will be expected to get involved and take on responsibilities in the school's programs, such as the "mastère big data" or the "Data Science" master's. In partnership with industry, the person recruited could oversee the organization of challenges and hackathons in order to expand the school's contacts in the fields of artificial intelligence and big data. In collaboration with the relevant teaching teams, they will be expected to take part in setting up project-based teaching and digital learning.
RESEARCH
Host laboratory: LIG / LJK
Research profile:
The candidate will carry out research in artificial intelligence or data science and will show openness to the different possible approaches in this field. The preferred topics are learning on complex, structured or unstructured data, deep learning and neural networks, and in particular issues of optimization, causality, and generalization capacity, together with their mathematical analysis. Among the applications of machine learning and deep learning, particular interest is given to signal and image processing, representation learning, learning with multimedia data, language data for problems arising in natural language processing, the transparency of learning mechanisms, as well as applications in biology, health, human sciences, social networks, physics, the environment, etc.
The recruitment will strengthen the links between LIG and LJK in data science and machine learning. Both laboratories are located on the Saint Martin d'Hères campus and collaborate actively, in particular within the large-scale data and knowledge processing axis (AMA, GETALP, MRIM, and SLIDE teams), the PERVASIVE and TYREX teams of LIG, the Proba-Stat department (DAO, SVH, MISTIS, and FIGAL teams), and the THOTH team of LJK. Joint projects between the two laboratories also include prediction and classification problems with structured functional data, optimal transport for learning, and sparsity and regularization problems for multi-task learning and their solution by stochastic optimization methods. The person recruited will demonstrate the ability to play an active role in academic (ANR, FUI, PFIA, EU...) and industrial contract-funded projects on these highly promising topics.
ADMINISTRATIVE ACTIVITIES
Specific features of the position or particular constraints:
Administrative activities associated with the duties of an associate professor: responsibility for teaching units, for tracks, or for a year of study.
Expected skills:
Knowledge: teaching of computer science, artificial intelligence, and data science
Know-how: pedagogy and responsibilities within the school
Interpersonal skills: teamwork
Artificial intelligence, Data science, Big data, Machine learning
|
6-50 | (2018-03-03) Research Linguist at ObEN, Los Angeles, CA, USA
RESEARCH LINGUIST Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company. As a Research Linguist, you will collaborate with other scientists who are experts in speech engineering, natural language processing, and computer vision. You will be working on a variety of tasks to improve technologies for speech synthesis, speech recognition, visual speech, and natural language processing. The tasks include:
● Design material and procedures to collect spoken and written language data;
● Design schemas and label/tag sets to annotate recordings and text with phonetic, prosodic, semantic, and syntactic features;
● Design methods and protocols to ensure the quality of linguistic data and annotations;
● Design perceptual or linguistic tests to evaluate the performance of speech and language systems;
● Contribute to the formalization of speech and language models by offering linguistic knowledge, identifying issues and providing solutions.
Basic qualifications:
● Master's or higher degree in Linguistics or a closely related field
● Specialization in Phonetics or Phonology
● Native or near-native proficiency in Japanese or Korean
● Ability to use programming scripts
Preferred qualifications:
● Knowledge of scripting languages, e.g., Python
● Background in Psychology/Psycholinguistics
● Willingness to accept reprioritization as necessary
Contact: pierre@oben.com
|
6-51 | (2018-03-03) Speech Research Scientist (ASR) at ObEN, Los Angeles, CA, USA
SPEECH RESEARCH SCIENTIST (ASR) Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company. As an ASR Research Scientist, you’ll be working on developing tools to automate speech data acquisition and selection from diverse sources of data for the training of ObEN’s speech technology components.
Responsibilities:
● Develop and extend ObEN’s proprietary ASR systems for different languages (English, Chinese, Korean, Japanese), in view of improving the robustness against environmental and channel distortion;
● Develop long (>1h) speech-text alignment systems;
● Develop lyrics-singing voice alignment systems;
● Develop tools and measures for data selection (confidence scores, acoustic measures);
● Develop tools for metadata extraction from speech and text (e.g. emotion, speaker ID, etc.).
Requirements:
● PhD with strong research experience in ASR demonstrated by publications in top Speech Journals and Conferences (ICASSP, Interspeech, ASRU, etc.);
● Experience with robust ASR, long speech-text alignment, lightly supervised approaches and confidence measures computation;
● Fluent in Python and C++, excellent knowledge of Kaldi;
● Strong machine learning background and familiar with standard statistical modeling techniques applied to speech;
● Good knowledge of deep learning packages (TensorFlow, Theano, Keras, etc.).
Contact: pierre@oben.com
|
6-52 | (2018-03-03) Speech Research Scientist (Prosody Modeling) at ObEN, Los Angeles, CA, USA
SPEECH RESEARCH SCIENTIST (Prosody Modeling) Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company. As a Speech Research Scientist focusing on Prosody Modeling, you will be working on developing new prosody models for different languages (Chinese, English, Japanese, Korean) to improve the naturalness and the similarity of the synthesized voice and to allow a better control of its expressivity. Responsibilities:
● Develop new prosody models for different languages, adaptable using a small amount of data;
● Develop generic prosodic models for different expressivity which can be applied to any voice;
● Develop sentiment analysis algorithms to control expressivity from text input.
Requirements:
● PhD with strong experience in Prosody Modeling for Speech Synthesis demonstrated by publications in top Speech Journals and Conferences (Speech Prosody, ICASSP, Interspeech, etc.);
● Strong implementation skills and general knowledge in ML;
● Fluent in Python and C++, and good knowledge of deep learning packages;
● Familiarity with linguistic phonetics;
● Knowledge of basic digital signal processing techniques for audio.
Contact: pierre@oben.com
|
6-53 | (2018-03-03) Speech Research Scientist (Singing Voice Synthesis) at ObEN, Los Angeles, CA, USA
SPEECH RESEARCH SCIENTIST (Singing Voice Synthesis) Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company. As a Speech Research Scientist focusing on singing voice generation, you’ll be working on improving the overall quality and control of ObEN’s virtual singing technology. Responsibilities:
● Develop and improve ObEN’s virtual singing voice technology based on a novel voice model with improved glottal source modeling;
● Explore new approaches for singing voice generation based on deep generative models;
● Develop a singing voice generation approach from musical annotation.
Requirements:
● PhD with strong experience in speech synthesis, preferably singing voice synthesis, demonstrated by publications in top Speech journals and conferences (ICASSP, Interspeech, etc.);
● Good experience in deep generative models and sequential modelling;
● Strong implementation skills and knowledge in ML;
● Fluent in Python and C++, and good knowledge of deep learning packages;
● Familiarity with linguistic phonetics;
● Knowledge of basic digital signal processing techniques for audio.
Contact: pierre@oben.com
|
6-54 | (2018-03-03) Speech Research Scientist (Speech Synthesis) at ObEN, Los Angeles, CA, USA
SPEECH RESEARCH SCIENTIST (Speech Synthesis) Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company. As a Speech Research Scientist focusing on Speech Synthesis, you’ll be working on improving ObEN’s speech synthesis technology. This will include the improvement of our current voice model and the development of new speech generation approaches based on deep generative models. Responsibilities:
● Develop and extend ObEN’s glottal source model, in view of improving the quality, flexibility and control (e.g. voice quality, expressivity) of ObEN’s speech and singing voice synthesis system;
● Develop new speech generation approaches based on deep generative models (e.g. WaveNet) with reduced amount of data and better control.
Requirements:
● PhD with strong experience in Speech Synthesis demonstrated by publications in top Speech Journals and Conferences (ICASSP, Interspeech, etc.);
● Expertise in signal processing, in particular in the design of voice models (glottal source model, ...) allowing a fine control of the characteristics of the synthesized voice (speech and singing voice);
● Experience in deep generative models of raw audio (WaveNet) and Generative Adversarial Networks (WGAN);
● Fluent in Python and C++, and good knowledge of deep learning packages (TensorFlow, Theano, Keras, etc.);
● Familiarity with linguistic phonetics;
● Knowledge of basic digital signal processing techniques for audio.
Contact: pierre@oben.com
|
6-55 | (2018-03-03) Speech Research Scientist (TTS) at ObEN, Los Angeles, CA, USA
SPEECH RESEARCH SCIENTIST (TTS) Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company. As a Speech Research Scientist focusing on Text-to-Speech, you will be working on developing cutting-edge deep learning algorithms for voice personalization. This will include the development of structured acoustic models for synthesis allowing the control of factors such as voice timbre, voice quality, language, accent, expressiveness and speaking style and the adaptation/conversion towards a target voice using a reduced amount of data.
Responsibilities:
● Develop and extend ObEN’s proprietary TTS system, in view of improving the quality and the naturalness of the synthesized voice as well as the similarity to the target voice and reducing the amount of data for speaker adaptation;
● Develop deep generative models of raw speech waveform;
● Develop cross-lingual approaches (e.g. phonetic posteriorgrams).
Requirements:
● PhD with strong research experience in Adaptation of DNN-based TTS systems demonstrated by publications in top Speech journals and conferences (ICASSP, Interspeech, etc.);
● Strong machine learning background and familiar with standard statistical modeling techniques applied to speech;
● Research experience in deep generative models of raw audio (WaveNet) and Generative Adversarial Networks (WGAN);
● Fluent in Python and C++, and expert knowledge of deep learning packages (TensorFlow, Theano, Keras, etc.);
● Familiarity with linguistic phonetics;
● Knowledge of basic digital signal processing techniques for audio.
Contact: pierre@oben.com
|
6-56 | (2018-03-06) Post Doc position at University of Saarland, Germany
Post Doc position (computer science, computational linguistics, physics or similar)
Integrating Models of Vision, Knowledge and Language
Statistical models of natural language so far have only considered the preceding words as a context. We would now like to generalize this to also include knowledge from images or databases. To this end, suitable neural network architectures will be explored and their properties analysed. Alternative machine learning based approaches will also be considered.
Requirements: The successful candidate should have a Master's degree in Computer Science, Computational Linguistics, Physics or a related discipline, with a strong background in mathematics and programming. A good command of English is mandatory as English is the working language of the department.
The position is full time on the German E13 scale and subject to the final approval by the funding agency. Starting dates can be between July and October, 2018. The appointment will be for up to four years.
About the department: The department of Language Science and Technology is one of the leading departments in the speech and language area in Europe. The flagship project at the moment is the CRC on Information Density and Linguistic Encoding. Furthermore, the department is involved in the cluster of excellence Multimodal Computing and Interaction. It also runs a significant number of European and nationally funded projects. In total it has seven faculty and around 50 postdoctoral researchers and PhD students.
How to apply: Please send us a letter of motivation, your CV, your transcripts, a list of publications, and the names and contact information of at least two references, as a single PDF or a link to a PDF if the file size is more than 3 MB.
Please apply by April 10th, 2018.
Contact: If you are interested in the project, please send an email to Dietrich Klakow (dietrich.klakow@lsv.uni-saarland.de)
|
6-57 | (2018-03-14) PhD Position in Experimental Mechanics of Materials/Structures (vocal-fold 3D structure), Gipsa Lab, Grenoble
PhD Position in Experimental Mechanics of Materials/Structures
“From the vocal-fold 3D structure and micro-mechanics to the design of biomimetic materials”
Location: 3SR Lab, CoMHet team, Grenoble, France (lucie.bailly@3sr-grenoble.fr, laurent.orgeas@3sr-grenoble.fr, sabine.rollandduroscoat@3sr-grenoble.fr)
Collaboration: GIPSA-lab, VSLD team, Grenoble, France (Nathalie.Henrich@gipsa-lab.fr)
Project summary
The vocal folds are soft multi-layered laryngeal tissues with remarkable vibro-mechanical performance. Composed of networks of collagen and elastin microfibrils, the upper layers play a major role in vocal-fold vibrations. However, the impact of these tissues' histological features on their mechanical behavior is still poorly understood. This is mainly due to the difficulty of characterizing them experimentally at the scale of their fibrous networks. This PhD project therefore aims to gain an in-depth understanding of the link between the micromechanics of vocal-fold tissues and their unique vibratory performance at the macroscale. The strategy will be:
1. To go further in the investigation of the vocal-fold 3D architecture, micromechanics and behaviour under finite deformation. This step will be based on experimental biomechanical campaigns and unprecedented in situ synchrotron X-ray microtomography;
2. To use these data to mimic and process fibrous biomaterials with tailored structural and biomechanical properties;
3. To characterize the vibro-mechanical properties of these biomaterials at different scales (macro/micro) and frequencies (low/high).
Location and practical aspects
The successful applicant will be hosted by the laboratory Soils, Solids, Structures, Risks (3SR, UMR5521 - Grenoble, France - www.3sr-grenoble.fr/) in the “CoMHet” team. Part of his/her work will also be conducted in the Images, Speech, Signal and Automation Laboratory (GIPSA-lab, UMR5216 - Grenoble, France - www.gipsalab.grenoble-inp.fr/).
This project will benefit from an existing collaboration between researchers in mechanical engineering, voice production and clinicians from Grenoble University Hospital (LADAF). The PhD fellowship is available from September 2018 (the starting date can be adjusted if need be) for a period of 3 years (financial support acquired from the ANR MicroVoice project).
Applications
Candidates are expected to have an academic background in solid mechanics, materials science and engineering. Specific skills in dynamics of composites, vibromechanics and experimental mechanics will be appreciated. Additional knowledge of acoustics and/or biomechanics of soft tissues will be considered with interest. Interested candidates should send their CV, a cover letter and official transcripts of the last two years before April 30th, 2018 to Lucie BAILLY, lucie.bailly@3sr-grenoble.fr, (+33) (0)4 76 82 70 85.
|
6-58 | (2018-03-15) Internship at ELDA, Paris, France
We are looking for an intern for a project that aims to update an inventory of language resources for the regional languages of France and to negotiate the rights needed to share them with the language technology community.
The main tasks will be:
- Updating the existing inventory of language resources
- Carrying out a technical and legal study of the current sharing conditions for these resources (analysing the formats in which the resources are used and identifying usage rights, in cooperation with an in-house legal expert)
- Negotiating with providers, defining the conditions for reuse of the language resources, and drawing up distribution agreements
- Describing the available resources and integrating them into the ELRA catalogue
- Writing a final report
Profile:
- Master's level (second year, M2) in natural language processing or a related field
- Duration: 6 months
- Ability to work both independently and as part of a team
- Strong writing and analytical skills
- Internship agreement (convention de stage) required
Applications will be considered until the position is filled. The position is based in Paris.
Salary: depending on qualifications and experience.
Applications (cover letter and curriculum vitae) should be sent by email to: ELDA 9, rue des Cordelières 75013 Paris FRANCE Email: job@elda.org
ELDA, the operational unit of the ELRA association, is responsible for promoting the development of language resources in all usable electronic forms, in particular spoken and written corpora, lexicons and terminology databases. Since its creation in 1995, ELDA has established itself as a unique centre in Europe for the distribution of language resources, able to meet the varied needs of technology developers. Its activities are now expanding to new types of language resources (multimodal/multimedia data). Some language resources are created during projects (co-)funded by ELDA. These are then compiled into a catalogue of language resources. ELDA is involved in a number of European and national projects. ELDA also takes an interest in legal issues relating to language resources, regularly carries out market studies on user needs, and works to improve resource validation procedures.
For more information about ELRA/ELDA, see: http://www.elra.info or www.elda.org
|
6-59 | (2018-03-19) Faculty position (Associate professor) at Telecom ParisTech, Paris
Telecom ParisTech has one new permanent (indefinite tenure) faculty position (Associate Professor) in machine learning. Applicants working on machine learning for speech processing, natural language processing or affective computing are welcome.
More information on the social computing topic is available here: https://www.tsi.telecom-paristech.fr/en/research/1885-2/social-computing-topic/
********************************************************
Faculty position (Associate professor) at Telecom ParisTech in
Machine-Learning.
Important Dates
- May 25th, 2018: closing date
- Mid June: hearings of preselected candidates
Telecom ParisTech's [1] machine learning, statistics and signal processing group (a.k.a. S²A group) [2], within the laboratoire de traitement et communication de l'information (LTCI) [5], is inviting applications for a permanent (indefinite tenure) faculty position at the *Associate Professor* level (Maître de Conférences) in *Machine learning*.
Main missions
The recruit will be expected to:
Research activities
- Develop groundbreaking research in the field of theoretical or applied machine learning, targeting applications that are well aligned with the topics of the S²A group [3] and the Images, Data & Signals department [4], which include (but are not restricted to) time series analysis (audio, ...), reinforcement learning, natural language processing, social signal processing, predictive maintenance, biomedical or physiological signal analysis, recommendation, finance, health, ...
- Develop both academic and industrial collaborations on the same topic, including collaborative activities with other Telecom ParisTech research departments and teams, and research contracts with industrial players
- Set up research grants and take part in national and international collaborative research projects
Teaching activities
- Participate in teaching activities at Telecom ParisTech and its partner academic institutions (as part of joint Master programs), especially in machine learning and data science, including life-long training programs (e.g. the local Data Scientist certificate)
Impact
- Publish high-quality research work in leading journals and conferences
- Be an active member of the research community (serving in scientific committees and boards, organizing seminars, workshops, special sessions...)
Candidate profile
As a minimum requirement, the successful candidate will have:
- A PhD degree
- A track record of research and publication in one or more of the following areas: machine learning, applied mathematics, signal processing,
- Experience in teaching
- Good command of English
The ideal candidate will also (optionally) have:
- Experience in temporal data analysis problems (sequence prediction, multivariate time series, probabilistic graphical models, recurrent neural networks...)
NOTE:
The candidate does *not* need to speak French to apply, just to be willing to learn the language (teaching will be mostly given in English)
Other skills expected include:
- Capacity to work in a team and develop good relationships with colleagues and peers
- Good writing and pedagogical skills
More about the position
- Place of work: Paris until 2019, then Saclay (Paris outskirts)
- For more information about being an Associate Professor at Telecom ParisTech, check [6] (in French)
How to apply
Applications are to be sent by e-mail to: recrutement@telecom-paristech.fr
The application should include:
- A complete and detailed curriculum vitae
- A letter of motivation
- A document detailing past activities of the candidate in teaching and research: the two types of activities will be described with the same level of detail and rigor.
- The texts of the main publications
- The names and addresses of two referees
- A short teaching project and a research project (maximum 3 pages)
Contacts :
Slim Essid (Coordinator of the ADASP team)
Florence d'Alché-Buc (Professor, Machine Learning)
Stéphan Clémençon (Head of the S²A group)
Gaël Richard (Head of the IDS department)
[1] http://www.tsi.telecom-paristech.fr
[2] http://www.tsi.telecom-paristech.fr/ssa/
[3] http://www.tsi.telecom-paristech.fr/aao/en/
|
6-60 | (2018-03-24) PhD student in a research project investigating strategies for human-robot interaction, Bielefeld, Germany
The Social Cognitive Systems group (headed by Prof. Dr. Stefan Kopp; Cluster of Excellence Cognitive Interaction Technology, Bielefeld University, Germany) is currently looking for a PhD student for a research project investigating strategies for human-robot interaction, with a focus on the generation of multimodal spoken dialogue behaviour.
Applicants should have a master's degree in computer science or (computational) linguistics with a focus on machine learning, statistical methods in natural language processing, and/or dialogue modelling. They should also have strong communication skills and be motivated to work in an interdisciplinary team of computer scientists, psychologists, engineers, and designers.
The position is fully paid (TV-L 13) with funding for three years. The official job advertisement (in German) can be found here: https://scs.techfak.uni-bielefeld.de/scswp/wordpress/wp-content/uploads/2018/03/wiss18072.pdf The deadline for applications to receive full consideration is 2018-04-06.
If you have any questions or want to know more about the research project, our research group, or living and working in Bielefeld, don't hesitate to contact Stefan Kopp <skopp@techfak.uni-bielefeld.de>.
--
Hendrik Buschmeier
Social Cognitive Systems Group, CITEC, Bielefeld University
https://purl.org/net/hbuschme
|
6-61 | (2018-03-26) PhD grant in Machine learning, Lannion, France
The Expression team at IRISA is offering a PhD in Computer Science, co-funded by the DGA, on the following topic: "Machine learning models for multimodal detection of anomalous behaviors".
The topic description is available at this location:
Candidate profile: Candidates must hold a research Master's degree in computer science. They must also have good software development skills (C/C++/Python/...) as well as knowledge of machine learning and, if possible, signal processing. The DGA requires that candidates hold the nationality of a European member country. An excellent level of English is required.
Application deadline: April 10, 2018
Location: Lannion
|
6-62 | (2018-03-15) PhD grant at LJK and LIG, Grenoble, France
CDP TITLE: Performance Laboratory
SUBJECT TITLE: Computational Video Editing for Stage Performances
SCIENTIFIC DEPARTMENT (LABORATORY’S NAME): LJK+LIG
DOCTORAL SCHOOL: MSTII (Mathématiques appliquées et informatique)
SUPPORTER’S NAME: Rémi Ronfard & Benjamin Lecouteux
The PERFORMANCE LABORATORY cross-fertilises UGA’s performing arts, geography-urban studies
and computer science communities to produce innovative performance as research. This new
interdisciplinary community of 41 academics will allow the development of cutting edge art research, digital
documentation, performance literacy tools and innovative forms of material and immaterial heritage. This
will push the very boundaries of the scientific disciplines themselves, both methodologically and
epistemologically, and in turn, create a new pluridisciplinary ecosystem at CUGA.
SUBJECT DESCRIPTION:
Context: This PhD thesis is proposed as part of an ongoing collaboration between computer scientists and
performing arts researchers at Univ. Grenoble Alpes and INRIA to use video in teaching and researching
the performing arts. In a previous project, the IMAGINE team at LJK and INRIA developed methods for
automatic generation of cinematic rushes from ultra high definition video recordings of stage performances
[1]. Here, we would like to propose techniques for making documentary movies from the generated rushes,
based on an analysis of the script of the performance and a formalization of the rules of film editing. Ideally,
the proposed techniques should be completely non-invasive (not requiring sensors on actors or on stage)
and intuitive enough to be used by performing arts students, professors and researchers, without any
expertise in video production.
Description: The goal of the PhD thesis will be to propose novel interaction techniques to students,
professors and researchers in the performing arts for making movies from stage performances recorded on
stage. On the one hand, we will propose novel algorithms for editing cinematographic rushes together into
movie clips automatically, based on computational models of film editing « idioms » and machine analysis
of the actors' speech and motion. On the other hand, we will propose novel user interfaces for easily
choosing between available idioms as in [2] and creating new idioms for the specific purpose of teaching
and researching mise en scene and acting techniques.
During his/her thesis, the PhD student will create an extensive database of stage performance recordings,
as part of a collaboration with the performing arts department at Univ. Grenoble Alpes and associated
theatre companies. The raw recordings and the generated movies will be used as supporting material for
teaching mise-en-scène and acting techniques, and for researching multiple aspects of expressive human
motion, verbal and non-verbal communication, and dramaturgic techniques, as part of the new cross-disciplinary
research project « Performance Lab ».
References:
[1] Vineet Gandhi, Rémi Ronfard, Michael Gleicher. Multi-Clip Video Editing from a Single Viewpoint.
CVMP 2014 - European Conference on Visual Media Production, Nov 2014.
[2] Mackenzie Leake, Abe Davis, Anh Truong, and Maneesh Agrawala. Computational video editing for
dialogue-driven scenes. ACM Trans. Graph. 36, 4, July 2017.
ELIGIBILITY CRITERIA
Applicants:
- must hold a Master's degree (or be about to earn one) or have a university degree equivalent to a
European Master's (5-year duration),
Applicants will have to send an application letter in English and attach:
- Their last diploma
- Their CV
- A short presentation of their scientific project (2 to 3 pages max)
- Letters of recommendation are welcome.
SELECTION PROCESS
Application deadline: May 15th 2018 at 17:00 (CET)
Applications will be evaluated through a three-step process:
1. Eligibility check of applications on May 17th 2018
2. 1st round of selection: the applications will be evaluated by a Review Board and results will be given on May 25th.
3. 2nd round of selection: shortlisted candidates will be invited for an interview session in Grenoble on
May 31st 2018 (if necessary).
4. Final decision will be given June 30.
TYPE of CONTRACT: temporary, 3-year doctoral contract
JOB STATUS: Full time
HOURS PER WEEK: 35
OFFER STARTING DATE: October 1 2018
APPLICATION DEADLINE: May 15th 2018
Salary: between 1768.55 € and 2100 € (gross) per month (depending on complementary activity or not)
|
6-63 | (2018-03-16) Research Linguist at ObEN, Inc, Pasadena, California, USA
RESEARCH LINGUIST
Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company.
As a Research Linguist, you will collaborate with other scientists who are experts in speech engineering, natural language processing, and computer vision. You will be working on a variety of tasks to improve technologies for speech synthesis, speech recognition, visual speech, and natural language processing. The tasks include:
● Design material and procedures to collect spoken and written language data;
● Design schemas and label/tag sets to annotate recordings and text with phonetic, prosodic, semantic, and syntactic features;
● Design methods and protocols to ensure the quality of linguistic data and annotations;
● Design perceptual or linguistic tests to evaluate the performance of speech and language systems;
● Contribute to the formalization of speech and language models by offering linguistic knowledge, identifying issues and providing solutions.
Basic qualifications:
● Master's or higher degree in Linguistics or a closely related field
● Specialization in Phonetics or Phonology
● Native or near-native proficiency in Japanese or Korean
● Ability to use programming scripts
Preferred qualifications:
● Knowledge of scripting languages, e.g., Python
● Background in Psychology/Psycholinguistics
● Willingness to accept reprioritization as necessary
Contact: pierre@oben.com
|
6-64 | (2018-03-16) SPEECH RESEARCH SCIENTIST (ASR) at ObEN, Inc, Pasadena, California, USA
SPEECH RESEARCH SCIENTIST (ASR)
Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company.
As an ASR Research Scientist, you'll be working on developing tools to automate speech data acquisition and selection from diverse sources of data for the training of ObEN's speech technology components.
Responsibilities:
● Develop and extend ObEN's proprietary ASR systems for different languages (English, Chinese, Korean, Japanese), with a view to improving robustness against environmental and channel distortion;
● Develop long (>1h) speech-text alignment systems;
● Develop lyrics-singing voice alignment systems;
● Develop tools and measures for data selection (confidence scores, acoustic measures);
● Develop tools for metadata extraction from speech and text (e.g., emotion, speaker ID, etc.).
Requirements:
● PhD with strong research experience in ASR demonstrated by publications in top speech journals and conferences (ICASSP, Interspeech, ASRU, etc.);
● Experience with robust ASR, long speech-text alignment, lightly supervised approaches and confidence measure computation;
● Fluent in Python and C++, excellent knowledge of Kaldi;
● Strong machine learning background and familiarity with standard statistical modeling techniques applied to speech;
● Good knowledge of deep learning packages (TensorFlow, Theano, Keras, etc.).
Contact: pierre@oben.com
|
6-65 | (2018-03-16) SPEECH RESEARCH SCIENTIST (Prosody Modeling) at ObEN Inc., Pasadena, California, USA
SPEECH RESEARCH SCIENTIST (Prosody Modeling)
Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company.
As a Speech Research Scientist focusing on Prosody Modeling, you will be working on developing new prosody models for different languages (Chinese, English, Japanese, Korean) to improve the naturalness and the similarity of the synthesized voice and to allow better control of its expressivity.
Responsibilities:
● Develop new prosody models for different languages, adaptable using a small amount of data;
● Develop generic prosodic models for different kinds of expressivity which can be applied to any voice;
● Develop sentiment analysis algorithms to control expressivity from text input.
Requirements:
● PhD with strong experience in Prosody Modeling for Speech Synthesis demonstrated by publications in top speech journals and conferences (Speech Prosody, ICASSP, Interspeech, etc.);
● Strong implementation skills and general knowledge in ML;
● Fluent in Python and C++, and good knowledge of deep learning packages;
● Familiarity with linguistic phonetics;
● Knowledge of basic digital signal processing techniques for audio.
Contact: pierre@oben.com
|
6-66 | (2018-03-16) SPEECH RESEARCH SCIENTIST (Singing Voice Synthesis) at ObEN Inc., Pasadena, California, USA
SPEECH RESEARCH SCIENTIST (Singing Voice Synthesis)
Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company.
As a Speech Research Scientist focusing on singing voice generation, you'll be working on improving the overall quality and control of ObEN's virtual singing technology.
Responsibilities:
● Develop and improve ObEN's virtual singing voice technology based on a novel voice model with improved glottal source modeling;
● Explore new approaches for singing voice generation based on deep generative models;
● Develop an approach for generating singing voice from musical annotation.
Requirements:
● PhD with strong experience in speech synthesis, preferably singing voice synthesis, demonstrated by publications in top speech journals and conferences (ICASSP, Interspeech, etc.);
● Good experience in deep generative models and sequential modelling;
● Strong implementation skills and knowledge in ML;
● Fluent in Python and C++, and good knowledge of deep learning packages;
● Familiarity with linguistic phonetics;
● Knowledge of basic digital signal processing techniques for audio.
Contact: pierre@oben.com
|
6-67 | (2018-03-16) SPEECH RESEARCH SCIENTIST (Speech Synthesis) at ObEN Inc., Pasadena, California, USA
SPEECH RESEARCH SCIENTIST (Speech Synthesis)
Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company.
As a Speech Research Scientist focusing on Speech Synthesis, you'll be working on improving ObEN's speech synthesis technology. This will include the improvement of our current voice model and the development of new speech generation approaches based on deep generative models.
Responsibilities:
● Develop and extend ObEN's glottal source model, with a view to improving the quality, flexibility and control (e.g. voice quality, expressivity) of ObEN's speech and singing voice synthesis system;
● Develop new speech generation approaches based on deep generative models (e.g. WaveNet) with a reduced amount of data and better control.
Requirements:
● PhD with strong experience in Speech Synthesis demonstrated by publications in top speech journals and conferences (ICASSP, Interspeech, etc.);
● Expertise in signal processing, in particular in the design of voice models (glottal source model, ...) allowing fine control of the characteristics of the synthesized voice (speech and singing voice);
● Experience in deep generative models of raw audio (WaveNet) and Generative Adversarial Networks (WGAN);
● Fluent in Python and C++, and good knowledge of deep learning packages (TensorFlow, Theano, Keras, etc.);
● Familiarity with linguistic phonetics;
● Knowledge of basic digital signal processing techniques for audio.
Contact: pierre@oben.com
|
6-68 | (2018-03-16) SPEECH RESEARCH SCIENTIST (TTS) at ObEN Inc., Pasadena, California, USA
SPEECH RESEARCH SCIENTIST (TTS)
Come join us and build Personal Artificial Intelligence (PAI) -- intelligent 3D avatars that look, sound, and behave like the individual user! ObEN is an artificial intelligence company developing a decentralized AI platform for Personal AI (PAI). Founded in 2014, ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company.
As a Speech Research Scientist focusing on Text-to-Speech, you will be working on developing cutting-edge deep learning algorithms for voice personalization. This will include the development of structured acoustic models for synthesis allowing the control of factors such as voice timbre, voice quality, language, accent, expressiveness and speaking style, and the adaptation/conversion towards a target voice using a reduced amount of data.
Responsibilities:
● Develop and extend ObEN's proprietary TTS system, with a view to improving the quality and naturalness of the synthesized voice as well as its similarity to the target voice, and reducing the amount of data needed for speaker adaptation;
● Develop deep generative models of raw speech waveforms;
● Develop cross-lingual approaches (e.g. phonetic posteriorgrams).
Requirements:
● PhD with strong research experience in adaptation of DNN-based TTS systems demonstrated by publications in top speech journals and conferences (ICASSP, Interspeech, etc.);
● Strong machine learning background and familiarity with standard statistical modeling techniques applied to speech;
● Research experience in deep generative models of raw audio (WaveNet) and Generative Adversarial Networks (WGAN);
● Fluent in Python and C++, and expert knowledge of deep learning packages (TensorFlow, Theano, Keras, etc.);
● Familiarity with linguistic phonetics;
● Knowledge of basic digital signal processing techniques for audio.
Contact: pierre@oben.com
|
6-69 | (2018-03-17) 2 PhD grants and 2 postdoc positions (2-year), at Aix-Marseille/Avignon , France
2 PhD grants and 2 postdoc positions (2-year)
at Aix-Marseille/Avignon
on Language, Communication and the Brain
The Center of Excellence on Brain and Language (BLRI, www.blri.fr/) and the Institute of Language, Communication and the Brain (ILCB, http://www.ilcb.fr/ ) award :
- 2 PhD grants (3-year) on any topic that falls within the area of language, communication, brain and modelling.
- 2 postdoc positions (2-year) on any topic that falls within the area of language, communication, brain and modelling.
The BLRI-ILCB is located in Aix-en-Provence, Avignon and Marseille and regroups several research centers in linguistics, psychology, cognitive neuroscience, medicine, computer science, and mathematics.
Interested candidates need to find one or more PhD or postdoc supervisors amongst the members of the BLRI-ILCB. Together with the supervisor(s), they then need to write a 3-year PhD project or a 2-year postdoc project. Priority is given to interdisciplinary co-supervision and to projects that involve two different laboratories of the institute.
- PhD grants: Monthly salary: 1 685 € (1 368 € net) for a period of 3 years
- Postdoc: Monthly salary: ~2000 € net (depending on experience)
- Deadline: June 17, 2018
HOW TO APPLY
Candidates should first contact potential supervisor(s) among the members of the ILCB/BLRI. A list of potential projects and supervisors that will be given priority for this call can be found here. However, you can also apply on any subject, under the supervision of any ILCB/BLRI member (http://www.blri.fr/members.html).
When the research project is finalized and approved by the supervisor(s), the application must be sent to nadera.bureau@blri.fr.
|
6-70 | (2018-03-22) Research scientist at the University of Trento, Italy
At the University of Trento (Italy) we are looking for a highly motivated researcher to join our research team and work on Natural Language Understanding and Dialog Modeling and Systems.
The Signals and Interactive Systems Lab at the University of Trento attracts researchers from computational linguistics, computer science and electrical engineering to design and train the most advanced interactive and conversational systems.
You will join a research team that has been training intelligent machines and evaluating AI-based systems for more than two decades, collaborating with leading research labs and successful startups around the world.
You can check a sample of the projects in the area of Natural Language Understanding, Conversational Systems and Personal Agents (and more) at: http://sisl.disi.unitn.it/demo/
Candidates should have a strong background and a record of past achievements in at least one of the following areas:
- Natural Language Understanding
- Conversational Modeling and Systems
- Machine Learning
For more information on research and projects, visit the lab website at http://sisl.disi.unitn.it/
The official language (research and graduate teaching) of the department is English.
FELLOWSHIP
The research fellowship will depend on experience and will be in the range of 19367 - 33000 Euros per year. The position is for one year, renewable. For more information about cost of living and campus, please visit the graduate school website at http://ict.unitn.it/
DEADLINES
Immediate openings with start date as early as May 2018. Open until filled.
REQUIREMENTS
- PhD degree in Computer Science, Computational Linguistics, Machine Learning or similar or related disciplines
- Strong academic record (publications in top conferences and journals)
- Strong programming skills
- Excellent command of oral and written English
- Excellent understanding of experimental design methodology and statistics
- Excellent understanding of natural language processing
- Excellent understanding of machine learning methods
- Experience working on research projects
- Excellent team-work skills
- Supervision of students
HOW TO APPLY
Interested applicants should send their
1) CV
2) At least three reference letters
to:
Email: sisl-jobs@disi.unitn.it
For more info:
Signals and Interactive Systems Lab: http://sisl.disi.unitn.it/
PhD School: http://ict.unitn.it/
Department: http://disi.unitn.it/
Information Engineering and Computer Science Department (DISI)
DISI has a strong focus on cross-disciplinarity, with professors from different faculties of the University (Physical Science, Electrical Engineering, Economics, Social Science, Cognitive Science, Computer Science) and with international backgrounds. DISI aims to exploit the complementary experience present in the various research areas in order to develop innovative methods, technologies and applications.
University of Trento
The University of Trento is consistently ranked as a premier Italian university. See http://www.unitn.it/en/node/1636/mid/2573
The University of Trento is an equal opportunity employer.
|
6-71 | (2018-04-09) ATER (temporary teaching and research associate) positions in Natural Language and Speech Processing, Sorbonne Université, Paris, France
ATER positions in Natural Language and Speech Processing are available at the Faculty of Arts and Humanities of Sorbonne Université. The link to apply is http://concours.univ-paris4.fr/PostesAter?entiteBean=posteCandidatureCourant
The eligibility conditions are available at http://lettres.sorbonne-universite.fr/ater.
Best regards,
Claude Montacié claude.montacie@sorbonne-universite.fr
|
6-72 | (2018-04-11) A three-year doctoral position at the University Sorbonne Nouvelle, Paris, France
Dear colleagues, Please find below the description of a three-year doctoral position at the University Sorbonne Nouvelle, to be filled in the last term of 2018.
The Laboratory of Phonetics and Phonology (http://lpp.in2p3.fr/), Paris, France, offers a funded position for a PhD candidate for a period of three years on the acoustic-phonetic markers of inter- and intra-speaker variability, with particular attention to the standardization of procedures.
We would be most grateful if you could also distribute this information to anyone who may be interested in this offer.
Cédric Gendrot and Cécile Fougeron
Description of the position:
Doctoral contract offered by the Laboratoire de Phonétique et Phonologie: "Phonetic and acoustic markers of inter- and intra-individual variability"
The Laboratoire de Phonétique et Phonologie offers a 3-year doctoral contract funded by the ANR, starting in the 2018 academic year.
The aim of the proposed doctoral topic is to analyse the phonetic and acoustic markers of inter- and intra-speaker variability. Particular attention will be paid to standardizing the proposed analysis methods, so that they can be transferred to related application domains, including automatic speech processing.
The work will take into account characteristics of voice and speech that are closely tied to the context of voice comparison. Since speech variation is multifactorial, it appears essential to establish standards for objective measures, for which recent methodologies from experimental phonetics can provide a guarantee.
Of particular interest are the acoustic markers that reflect individual physiological properties, as well as articulatory habits, which are vectors of social identity.
The doctoral student will carry out his/her research at the LPP (Laboratoire de Phonétique et de Phonologie), a joint research unit of CNRS and Université Paris 3 Sorbonne Paris Cité. See the laboratory's work on this topic at http://lpp.in2p3.fr
The selected candidate will be supervised by Cédric Gendrot and Cécile Fougeron, respectively lecturer-researcher (enseignant-chercheur) at Université Sorbonne Nouvelle and Research Director at CNRS. He/she will be attached to Doctoral School ED268 of Université Sorbonne Nouvelle.
The doctoral student will benefit from the resources of the laboratory, of Doctoral School ED268 and of the interdisciplinary research environment of the Laboratoire d'Excellence EFL. He/she will be able to attend weekly research seminars in phonetics and phonology at the LPP and in other research teams, as well as lectures by internationally renowned visiting professors, training courses, conferences and summer schools.
Conditions
- have a good command of French.
- have successfully completed a first personal research project.
- there is no nationality requirement.
- have very good knowledge of acoustic-phonetic data processing.
- knowledge of computer science and statistical analysis would be a plus.
Documents to include in the application
1. a CV
2. a cover letter
3. the Master 2 thesis in phonetics
4. the names of two referees (with their email addresses)
Application deadline: 30 June 2018
Complete applications must be sent by email no later than 30 June 2018 to Cédric Gendrot (cgendrot@univ-paris3.fr) and Cécile Fougeron (cecile.fougeron@univ-paris3.fr)
- Preselection based on the application file, followed by interviews of the shortlisted candidates
Shortlisted candidates will be interviewed between 2 and 6 July 2018, on site or by videoconference.
Contact for more information:
Cédric Gendrot: cgendrot@univ-paris3.fr
Cécile Fougeron: cecile.fougeron@univ-paris3.fr
|
6-73 | (2018-04-12) Post-doc in forensic science, LNE, Trappes, France
POST-DOC 18 months - Voice comparison in forensic science: defining a methodology and a reference framework for the certification of laboratories
Location: Trappes (78). Laboratoire national de métrologie et d'essais (LNE) REF: ML/VOX/DE
CONTEXT: The ANR project VoxCrim (2017-2021) aims to establish, on an objective scientific basis, the conditions under which voice comparison can be carried out in forensic science. It has two main objectives: a) to set up an ISO 17025-type accreditation methodology for police laboratories, and b) to establish standards for objective measures. The project will make voice comparison easier to handle within police services and will strengthen the admissibility of the evidence in court. The post-doctoral topic is part of the VoxCrim sub-project "Accreditation, certification, standardization and metrology". This sub-project builds on the existing material available from the Association Française de Normalisation (AFNOR) and the Comité Français d'Accréditation (COFRAC). The work consists first in assessing what exists and the adaptations needed for the accreditation of laboratories performing voice comparisons, and in developing the corresponding metrology protocols. By the end of the project, the sub-project aims to deliver a complete definition of a practical accreditation and certification solution for voice comparison.
MISSIONS: The work is organized into three tasks:
- Report on the state of the art. This task consists in surveying existing work to identify the standards and guidelines that must be followed, that need to evolve, or that should serve as inspiration wherever possible. The work will include a European and international dimension (the work of NIST-OSAC, for example) and will rely mainly on the ISO 17025, 17043 and 13528 standards to set up the ecosystem needed to validate voice comparison methods. Since these standards mainly concern physical measurement, the post-doctoral researcher will also study ISO 15189, which sets out requirements for laboratories where samples are taken from humans.
- Specification of intra- and inter-laboratory metrology protocols adapted to voice comparison, and more specifically to forensic science.
- The post-doctoral researcher will check that the identified protocols fit the sets of operating conditions for voice comparison developed by the other project members.
In addition to the support of the Evaluation of Information Processing Systems and Mathematics-Statistics teams, the post-doctoral researcher will receive training:
- At the start of the contract, a one-day training course on inter-laboratory comparison and accreditation methods, given by LNE to the members of the VoxCrim consortium.
- Short practical placements at the SDPTS (Sous-Direction de la Police Scientifique et Technique, Ecully) and/or the IRCGN (Institut de Recherche Criminalistique de la Gendarmerie Nationale) to understand the issues involved in forensic voice comparison.
- Participation in the VoxCrim study days organized by the consortium members at the SDPTS.
Publications (and presentations, where applicable) in international conferences and journals are expected from the post-doctoral researcher.
DURATION: 18 months. Start preferably in September 2018.
PROFILE: You hold a PhD in computer science or language sciences, with a specialization in automatic speech processing. You have knowledge of evaluation methodology and voice biometrics. Knowledge of standardization would be a real asset.
To apply, please send your CV to recrut@lne.fr, quoting the reference: ML/VOX/DE
|
6-74 | (2018-04-14) 2 PhD positions, IRIT Toulouse France
Two PhD positions are still available at IRIT Toulouse France starting ideally in Sept. 2018.
Position 1: Deep learning approaches to assess head and neck cancer voice intelligibility
Position 2: Clinical relevance of the intelligibility measures
These positions are in the framework of the TAPAS European Project.
For official information and applications, see https://www.tapas-etn-eu.org/positions
You may obtain further information from Julie Mauclair (phone: +33 5 61 55 60 55, julie.mauclair@irit.fr) and Thomas Pellegrini (phone: +33 5 61 55 68 86, thomas.pellegrini@irit.fr)
|
6-75 | (2018-04-16) Post doc position at INRIA Nancy France
Postdoctoral Position (12 months)
Natural language processing: automatic speech recognition system using deep neural networks without out-of-vocabulary words
_______________________________________
- Location: INRIA Nancy Grand Est research center, France
- Research theme: PERCEPTION, COGNITION, INTERACTION
- Project-team: Multispeech
- Scientific Context:
More and more audio and video content appears on the Internet each day. About 300 hours of multimedia are uploaded per minute. Audio data represents a very important part of these multimedia sources. If these documents are not transcribed, automatic content retrieval is difficult or impossible. The classical approach for spoken content retrieval from audio documents is automatic speech recognition followed by text retrieval.
An automatic speech recognition (ASR) system uses a lexicon containing the most frequent words of the language, and only the words of the lexicon can be recognized by the system. New proper names (PNs) appear constantly, requiring dynamic updates of the lexicons used by the ASR system. These PNs evolve over time and no vocabulary will ever contain all existing PNs. When a person searches for a document, proper names are used in the query. If these PNs have not been recognized, the document cannot be found. These missing PNs can be very important for the understanding of the document.
In this study, we will focus on the problem of proper names in automatic recognition systems. The problem is how to model relevant proper names for the audio document we want to transcribe.
- Missions:
We assume that an audio document to be transcribed contains missing proper names, i.e. proper names that are pronounced in the audio document but are not in the lexicon of the automatic speech recognition system; these proper names cannot be recognized (out-of-vocabulary proper names, OOV PNs). The purpose of this work is to design a methodology to find and model a list of relevant OOV PNs that correspond to an audio document.
Assuming that we have an approximate transcription of the audio document and a huge text corpus extracted from the Internet, several methodologies could be studied:
- From the approximate OOV pronunciation in the transcription, generate the possible writings of the word (phoneme to character conversion) and search this word in the text corpus.
- A deep neural network can be designed to predict OOV proper names and their pronunciations with the training objective to maximize the retrieval of relevant OOV proper names.
The proposed approaches will be validated using the ASR developed in our team.
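The first methodology above (phoneme-to-character conversion followed by a corpus search) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny phoneme-to-grapheme table, the phoneme symbols and the three-word corpus vocabulary are hypothetical, and a real system would learn the conversion from data rather than enumerate a hand-written table.

```python
from itertools import product

# Hypothetical, heavily simplified phoneme-to-grapheme table (an assumption
# for illustration; a real system would learn this mapping from data).
PHONEME_TO_GRAPHEMES = {
    "m": ["m"],
    "a": ["a"],
    "k": ["c", "k", "ck"],
    "r": ["r"],
    "o~": ["on", "om"],
}


def candidate_spellings(phonemes):
    """Return every spelling obtained by picking one grapheme per phoneme."""
    options = [PHONEME_TO_GRAPHEMES[p] for p in phonemes]
    return {"".join(choice) for choice in product(*options)}


def find_in_corpus(phonemes, corpus_vocabulary):
    """Keep only the candidate spellings that are attested in the corpus."""
    lowered = {word.lower(): word for word in corpus_vocabulary}
    candidates = candidate_spellings(phonemes)
    return sorted(lowered[c] for c in candidates if c in lowered)


# Hypothetical vocabulary extracted from a text corpus.
corpus_vocabulary = {"Macron", "Paris", "Merkel"}
matches = find_in_corpus(["m", "a", "k", "r", "o~"], corpus_vocabulary)
print(matches)
```

Exhaustive enumeration grows exponentially with the number of phonemes, so this only conveys the idea; the posting leaves the actual conversion and search methods open.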
Keywords: deep neural networks, automatic speech recognition, lexicon, out-of-vocabulary words.
- Bibliography
[Mikolov2013] Mikolov, T., Chen, K., Corrado, G. and Dean, J. "Efficient estimation of word representations in vector space", Workshop at ICLR, 2013.
[Deng2013] Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y. and Acero A. "Recent advances in deep learning for speech research at Microsoft", Proceedings of ICASSP, 2013.
[Sheikh2016] Sheikh, I., Illina, I., Fohr, D., Linarès, G. "Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition", Interspeech, 2016.
[Li2017] J. Li, G. Ye, R. Zhao, J. Droppo, Y. Gong, "Acoustic-to-Word Model without OOV", ASRU, 2017.
- Skills and profile: PhD in computer science, background in statistics and natural language processing, experience with deep learning tools (Keras, Kaldi, etc.) and programming skills (Perl, Python).
- Additional information:
Supervision and contact: Irina Illina, LORIA/INRIA (illina@loria.fr), Dominique Fohr INRIA/LORIA (dominique.fohr@loria.fr) https://members.loria.fr/IIllina/, https://members.loria.fr/DFohr/
Additional links : Ecole Doctorale IAEM Lorraine
Deadline to apply: May 20th
Selection results: end of June
Duration: 12 months.
Starting date: between Nov. 1st 2018 and Jan. 1st 2019
Salary: about 2,115 euros net, medical insurance included
The candidates must have defended their PhD after Sept. 1st 2016 and before the end of 2018.
The candidates are required to provide the following documents in a single pdf or ZIP file:
- CV including a description of your research activities (2 pages max) and a short description of what you consider to be your best contributions and why (1 page max and 3 contributions max); the contributions could be theoretical or practical. Web links to the contributions should be provided. Include also a brief description of your scientific and career projects, and your scientific positioning regarding the proposed subject.
- The report(s) from your PhD external reviewer(s), if applicable.
- If you haven't defended yet, the list of expected members of your PhD committee (if known) and the expected date of defence.
In addition, at least one recommendation letter from the PhD advisor should be sent directly by their author(s) to the prospective postdoc advisor.
|
6-76 | (2018-04-16) PhD grant, INRIA Nancy France
Natural language processing: adding new words to a speech recognition system using Deep Neural Networks
- Location: INRIA/LORIA Nancy Grand Est research center France
- Project-team: Multispeech
- Scientific Context:
Voice is seen as the next big field for computer interaction. The research company Gartner reckons that by 2018, 30% of all interactions with devices will be voice-based: people can speak up to four times faster than they can type, and the technology behind voice interaction is improving all the time.
As of October 2017, Amazon Echo is present in about 4% of American households. Voice assistants are proliferating in smartphones too: Apple's Siri handles over 2 billion commands a week, and 20% of Google searches on Android-powered handsets in America are done by voice input.
Proper names (PNs) play a particular role: they are often important for understanding a message and can vary enormously. For example, a voice assistant should know the names of all your friends; a search engine should know the names of all famous people and places, names of museums, etc.
An automatic speech recognition (ASR) system uses a lexicon containing the most frequent words of the language, and only the words of the lexicon can be recognized by the system. It is impossible to add all possible proper names because there are millions of proper names and new ones appear every day. A competitive solution is to dynamically add new PNs into the ASR system. The idea is to add only relevant proper names: for instance, if we want to transcribe a video document about football results, we should add the names of famous football players and not politicians.
In this study, we will focus on the problem of proper names in automatic recognition systems. The problem is to find relevant proper names for the audio document we want to transcribe. To select the relevant proper names, we propose to use an artificial neural network.
- Missions:
We assume that an audio document to be transcribed contains missing proper names, i.e. proper names that are pronounced in the audio document but are not in the lexicon of the automatic speech recognition system; these proper names cannot be recognized (out-of-vocabulary proper names, OOV PNs).
The goal of this PhD thesis is to find a list of relevant OOV PNs that correspond to an audio document and to integrate them in the speech recognition system. We will use a deep neural network (DNN) to find relevant OOV PNs. The input of the DNN will be the approximate transcription of the audio document and the output will be the list of relevant OOV PNs with their probabilities. The retrieved proper names will be added to the lexicon and a new recognition of the audio document will be performed.
During the thesis, the student will investigate methodologies based on deep neural networks [Deng2013]. The candidate will study different DNN structures and different representations of documents [Mikolov2013]. The student will validate the proposed approaches using the automatic transcription system for radio broadcasts developed in our team.
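The input/output contract described above (approximate transcription in, candidate OOV PNs with probabilities out) can be sketched with a deliberately tiny stand-in for the DNN: a single linear layer followed by a softmax over a bag-of-words vector. The vocabulary, the candidate names and the hand-set weights are all hypothetical; they are not trained parameters and do not reflect the team's actual model.

```python
import math

# Hypothetical toy vocabulary and candidate OOV proper names (assumptions).
VOCAB = ["goal", "match", "election", "vote"]
CANDIDATES = ["Zidane", "Merkel"]

# Hand-set weights standing in for trained parameters:
# WEIGHTS[i][j] links vocabulary word i to candidate name j.
WEIGHTS = [
    [2.0, 0.0],  # goal
    [1.5, 0.0],  # match
    [0.0, 2.0],  # election
    [0.0, 1.5],  # vote
]


def bag_of_words(transcription):
    """Count how often each vocabulary word occurs in the transcription."""
    tokens = transcription.lower().split()
    return [tokens.count(word) for word in VOCAB]


def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_oov_names(transcription):
    """Return (name, probability) pairs, most probable first."""
    x = bag_of_words(transcription)
    scores = [
        sum(x[i] * WEIGHTS[i][j] for i in range(len(VOCAB)))
        for j in range(len(CANDIDATES))
    ]
    probabilities = softmax(scores)
    return sorted(zip(CANDIDATES, probabilities), key=lambda p: p[1], reverse=True)


ranking = rank_oov_names("a late goal decided the match")
print(ranking)
```

In the workflow described in the posting, the top-ranked names would then be added to the lexicon before a second recognition pass.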
- Bibliography:
[Mikolov2013] Mikolov, T., Chen, K., Corrado, G. and Dean, J. "Efficient estimation of word representations in vector space", Workshop at ICLR, 2013.
[Deng2013] Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y. and Acero A. "Recent advances in deep learning for speech research at Microsoft", Proceedings of ICASSP, 2013.
[Sheikh2016] Sheikh, I., Illina, I., Fohr, D., Linarès, G. "Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition", Interspeech, 2016.
- Skills and profile: Master in computer science, background in statistics and natural language processing, experience with deep learning tools (Keras, Kaldi, etc.) and programming skills (Perl, Python).
Deadline to apply : May 1st 2018
The candidates are required to provide the following documents in a single pdf or ZIP file:
- CV
- A cover/motivation letter describing their interest in the topic
- Degree certificates and transcripts for Bachelor and Master (or the last 5 years)
- Master thesis (or equivalent) if it is already completed, or a description of the work in progress otherwise
- The publications (or web links) of the candidate, if any (it is not expected that they have any)
In addition, one recommendation letter from the person who supervises or supervised the Master thesis (or research project or internship) should be sent directly by its author to the prospective PhD advisor.
|
6-77 | (2018-04-17) PhD at LORIA Nancy France
Impact LUE Open Language and Knowledge for Citizens (OLKi)
Application for a PhD grant 2018 co-supervised by the Crem and the Loria: "Online hate speech against migrants"
Deadline to apply : May 1st 2018
According to the 2017 International Migration Report, the number of international migrants worldwide has continued to grow rapidly in recent years, reaching 258 million in 2017, up from 220 million in 2010 and 173 million in 2000. In 2017, 64 per cent of all international migrants worldwide (165 million people) lived in high-income countries; 78 million of them were residing in Europe. Since 2000, Germany and France have figured among the countries hosting the largest number of international migrants. A key reason for the difficulty EU leaders have had in taking a decisive and coherent approach to the refugee crisis has been the high level of public anxiety about immigration and asylum across Europe. Indeed, across the EU, attitudes towards asylum and immigration have hardened in recent years because of (Berri et al., 2015): (i) the increase in the number and visibility of migrants in recent years, (ii) the economic crisis and austerity policies enacted since the 2008 Global Financial Crisis, (iii) the role of the mass media in influencing public and elite political attitudes towards asylum and migration. Refugees and migrants tend to be framed negatively as a problem, potentially nourishing.
Indeed, the BRICkS (Building Respect on the Internet by Combating Hate Speech) EU project (http://www.bricks-project.eu/wp/about-the-project/) has revealed a significant increase in the use of hate speech towards immigrants and minorities, who are often blamed for current economic and social problems. The participatory web and social media seem to accelerate this tendency, accentuated by the rapid online spread of fake news, which often feeds online violence towards migrants. Based on existing research, Carla Schieb and Mike Preuss (2016) highlight that hate speech deepens prejudice and stereotypes in a society (Citron & Norton, 2011). It also has a detrimental effect on the mental health and emotional well-being of targeted groups, especially of targeted individuals (Festl & Quandt, 2013), and is a source of harm in general for those under attack (Waldron, 2012) when it culminates in violent acts incited by hateful speech. Such violent hate crimes may erupt in the aftermath of certain key events, e.g. anti-Muslim hate crimes in response to the 9/11 terrorist attacks (King & Sutton, 2013).
Hate speech and fake news are not, of course, just problems of our times. Hate speech has always been part of antisocial behavior such as bullying or stalking (Delgado & Stefancic, 2014); "trapped", emotional, unverified and/or biased contents have always existed (Dauphin, 2002; Froissart, 2002, 2004; Lebre, 2014) and need to be understood on an anthropological level as reflections of people's fears, anxieties or fantasies. They reveal what Marc Angenot calls a certain "state of society" (Angenot, 1978; 1989; 2006). Indeed, according to this author, the analysis of situated specific discourses sheds light on some of the topoi (common premises and patterns) that characterize public doxa. This "gnoseological" perspective reveals the ways in which visions of the "world" can be systematically schematized on linguistic materials at a certain moment.
Within this context, the PhD project jointly proposed by the Crem and the Loria aims to analyse hate speech towards migrants in social media, and more particularly on Twitter. It seeks to provide answers to the following questions:
- What are the representations of migrants as they emerge in hate speech on Twitter?
- What themes are they associated with?
- What can the latter tell us about the "state" of our society, in the sense previously given to this term by Marc Angenot?
Secondary questions will also be addressed to refine the main results:
- What is the origin of these messages? (individual accounts, political party accounts, bots, etc.)
- What is the circulation of these messages? (reactions, retweets, interactions, etc.)
- Can we measure the emotional dimension of these messages? Based on which indicators?
- Can a scale be established to measure the intensity of hate in speech?
More and more audio, video and text content appears on the Internet each day. About 300 hours of multimedia are uploaded per minute. With these multimedia sources, manual content retrieval is difficult or impossible. The classical approach for spoken content retrieval from multimedia documents is automatic text retrieval. Automatic text classification is one of the technologies widely used for these purposes. In text classification, text documents are usually represented in a so-called vector space and then assigned to predefined classes through supervised machine learning. Each document is represented as a numerical vector, which is computed from the words of the document. How to represent the terms numerically in an appropriate way is a basic problem in text classification tasks and directly affects classification accuracy. Sometimes, in text classification, the classes cannot be defined in advance. In this case, unsupervised machine learning is used and the challenge consists in finding underlying structures in unlabeled data. We will use these methodologies to perform one of the important tasks of text classification: automatic hate speech detection.
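The vector-space pipeline described above (each document turned into a numerical vector computed from its words, then assigned to a predefined class by supervised learning) can be sketched as follows. This is a minimal illustration under stated assumptions: it uses raw word counts and a nearest-centroid classifier with cosine similarity, which is one simple choice among many and not the method of the project, and the neutral topic labels and sentences are hypothetical stand-ins for real annotated classes.

```python
import math
from collections import Counter


def vectorize(text, vocabulary):
    """Represent a document as a vector of word counts over the vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either one is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def train_centroids(labelled_docs, vocabulary):
    """Supervised step: average the document vectors of each class."""
    sums, counts = {}, Counter()
    for text, label in labelled_docs:
        acc = sums.setdefault(label, [0.0] * len(vocabulary))
        for i, x in enumerate(vectorize(text, vocabulary)):
            acc[i] += x
        counts[label] += 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


def classify(text, centroids, vocabulary):
    """Assign the class whose centroid is most similar to the document vector."""
    vec = vectorize(text, vocabulary)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical labelled training documents with neutral topic labels.
training = [
    ("the team won the match", "sport"),
    ("a great goal in the match", "sport"),
    ("the parliament passed the law", "politics"),
    ("voters elected a new parliament", "politics"),
]
vocabulary = sorted({w for text, _ in training for w in text.lower().split()})
centroids = train_centroids(training, vocabulary)
prediction = classify("the match ended with a late goal", centroids, vocabulary)
print(prediction)
```

Words unseen during training ("ended", "late") are simply ignored here, which is one reason the choice of term representation affects accuracy.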
Developments in neural networks (Mikolov et al., 2013a) have led to renewed interest in the field of distributional semantics, more specifically in learning word embeddings (representations of words in a continuous space). Computational efficiency was one major factor in popularizing word embeddings. Word embeddings capture syntactic as well as semantic properties of words (Mikolov et al., 2013b). As a result, they have outperformed several other word vector representations on different tasks (Baroni et al., 2014).
Our methodology for hate speech classification will build on recent approaches to text classification with neural networks and word embeddings. In this context, fully connected feed-forward networks (Iyyer et al., 2015; Nam et al., 2014), Convolutional Neural Networks (CNN) (Kim, 2014; Johnson and Zhang, 2015) and also Recurrent/Recursive Neural Networks (RNN) (Dong et al., 2014) have been applied. On the one hand, the approaches based on CNN and RNN capture rich compositional information and have outperformed the state-of-the-art results in text classification; on the other hand, they are computationally intensive and require careful hyperparameter selection and/or regularization (Dai and Le, 2015).
This thesis aims at proposing concepts, analysis and software components (Hate Speech Domain Specific Analysis and related software tools in connection with migrants in social media) to bridge the gap between conceptual requirements and multi-source information from social media. Automatic hate speech detection software will be tested on the modeling of various hate speech phenomena, and their domain relevance will be assessed with both partners. The language of the analysed messages will be primarily French, although links with other languages (including messages written in English) may appear throughout the analysis.
This PhD project complies with the Impact OLKi (Open Language and Knowledge for Citizens) framework because:
- It is centred on language.
- It aims to implement new methods to study and extract knowledge from linguistic data (indicators, scales of measurement).
- It opens perspectives for producing technical solutions (applications, etc.) for citizens and digital platforms, to better control the potential negative use of language data.
Scientific challenges:
- to study and extract knowledge from linguistic data concerning hate speech towards migrants in social media;
- to better understand hate speech as a social phenomenon, based on the data extracted and analysed;
- to propose and assess new methods based on Deep Learning for the automatic detection of documents containing hate speech. This will make it possible to set up an online hate speech management protocol.
Keywords: hate speech, migrants, social media, natural language processing.
Doctoral school: Computer Science (IAEM)
Principal supervisor: Irina Illina, Assistant Professor in Computer Science, irina.illina@loria.fr
Co-supervisors:
- Crem: Angeliki Monnier, Professor Information-Communication, angeliki.monnier@univ-lorraine.fr
- Loria: Dominique Fohr, Research scientist CNRS, dominique.fohr@loria.fr
References
Angenot M (1978) Fonctions narratives et maximes idéologiques. Orbis Litterarum 33: 95-100.
Angenot M (1989) 1889 : un état du discours social. Montréal : Préambule.
Angenot M (2006) Théorie du discours social. Notions de topographie des discours et de coupures cognitives, COnTEXTES. https://contextes.revues.org/51.
Baroni, M., Dinu, G., and Kruszewski, G. (2014). "Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors". In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 238-247.
Berri M, Garcia-Blanco I, Moore K (2015), Press coverage of the Refugee and Migrant Crisis in the EU: A Content Analysis of five European Countries, Report prepared for the United Nations High Commission for Refugees, Cardiff School of Journalism, Media and Cultural Studies.
Chouliaraki L, Georgiou M and Zaborowski R (2017), The European "migration crisis" and the media: A cross-European press content analysis. The London School of Economics and Political Science, London, UK.
Citron, D. K., Norton, H. L. (2011), "Intermediaries and hate speech: Fostering digital citizenship for our information age", Boston University Law Review, 91, 1435.
Dai, A. M. and Le, Q. V. (2015). "Semi-supervised sequence learning". In Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R., editors, Advances in Neural Information Processing Systems 28, pages 3061-3069. Curran Associates, Inc.
Dauphin F (2002), Rumeurs électroniques : synergie entre technologie et archaïsme. Sociétés 76 : 71-87.
Delgado R., Stefancic J. (2014), "Hate speech in cyberspace", Wake Forest Law Review, 49.
Dong, L., Wei, F., Tan, C., Tang, D., Zhou, M., and Xu, K. (2014). "Adaptive recursive neural network for target-dependent Twitter sentiment classification". In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL, Baltimore, MD, USA, Volume 2, pages 49-54.
Festl R., Quandt T (2013), Social relations and cyberbullying: The influence of individual and structural attributes on victimization and perpetration via the internet, Human Communication Research, 39(1), 101-126.
Froissart P (2002) Les images rumorales, une nouvelle imagerie populaire sur Internet. Bry-Sur-Marne : INA.
Froissart P (2004) Des images rumorales en captivité : émergence d'une nouvelle catégorie de rumeur sur les sites de référence sur Internet. Protée 32(3) : 47-55.
Johnson, R. and Zhang, T. (2015). "Effective use of word order for text categorization with convolutional neural networks". In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 103-112.
Iyyer, M., Manjunatha, V., Boyd-Graber, J., and Daumé, H. (2015). "Deep unordered composition rivals syntactic methods for text classification". In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 1681-1691.
Kim, Y. (2014). "Convolutional neural networks for sentence classification". In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1746-1751.
King R. D., Sutton G. M. (2013). High times for hate crimes: Explaining the temporal clustering of hate-motivated offending. Criminology, 51(4), 871-894.
Lebre J (2014) Des idées partout : à propos du partage des hoaxes entre droite et extrême droite. Lignes 45: 153-162.
Mikolov, T., Yih, W.-t., and Zweig, G. (2013a). "Linguistic regularities in continuous space word representations". In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 746-751.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013b). "Distributed representations of words and phrases and their compositionality". In Advances in Neural Information Processing Systems 26, pages 3111-3119. Curran Associates, Inc.
Nam, J., Kim, J., Loza Mencía, E., Gurevych, I., and Fürnkranz, J. (2014). "Large-scale multi-label text classification: revisiting neural networks". In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD-14), Part 2, volume 8725, pages 437-452.
Schieb C, Preuss M (2016), Governing Hate Speech by Means of Counter Speech on Facebook, 66th ICA Annual Conference, Fukuoka, Japan.
United Nations (2018), International Migration Report 2017. Highlights, New York, Department of Economic and Social Affairs.
Waldron J. (2012), The harm in hate speech, Harvard University Press.
|
6-78 | (2018-04-17) PhD grant at Loria, Nancy, France
Thesis title: Expressive speech synthesis based on deep learning
Location: INRIA Nancy Grand Est research center, LORIA Laboratory, Nancy, France
Research theme: Perception, Cognition, Interaction
Project-team: MULTISPEECH (https://team.inria.fr/multispeech/)
Scientific context
Over the last decades, text-to-speech synthesis (TTS) has reached good quality and intelligibility, and is now commonly used in information delivery services, for example in call center automation, navigation systems, and voice assistants. In the past, the main goal when developing TTS systems was to achieve high intelligibility. The speech style was then typically a “reading style”, which resulted from the style of the speech data used to develop TTS systems (reading of a large set of sentences). Although a reading style is acceptable for occasional interactions, TTS systems would benefit from more variability and expressivity in the generated synthetic speech, for example for lengthy interactions between machines and humans, or for entertainment applications. This is the goal of recent and emerging research on expressive speech synthesis. In contrast to neutral speech, which is typically read speech that conveys no particular emotion, expressive speech can be defined as speech carrying an emotion, speech spoken as in spontaneous speech, or speech with emphasis placed on some words.
Missions (objectives, approach, etc.)
Deep learning approaches lead to good speech synthesis quality; however, the main scientific and technological barrier remains the need for a speech corpus that matches the speaker and the target style conditions, here expressive speech. This thesis aims at investigating approaches to overcome this barrier. More precisely, the objective is to propose and investigate approaches allowing expressive speech synthesis for a given speaker voice, using both the neutral speech data of that speaker, or the corresponding neutral speech model, and expressive speech data from other speakers. This will avoid lengthy and costly recording of specific ad hoc expressive speech corpora (e.g., emotional speech data from the target voice speaker).
Let us recall that three main steps are involved in parametric speech synthesis: the generation of sequences of basic units (phonemes, pauses, etc.) from the source text; the generation of prosody parameters (durations of sounds, pitch values, etc.); and finally the generation of acoustic parameters, which leads to the synthetic speech signal. All the levels are involved in expressive speech synthesis: alteration of pronunciations and presence of pauses, modification of prosody correlates, and modification of the spectral characteristics. The thesis will essentially focus on the last two points, i.e., a correct prediction of prosody and spectral characteristics to produce expressive speech through deep learning-based approaches. Some aspects to be investigated include the combined use of only the neutral speech data of the target voice speaker and expressive speech of other speakers in the training process, or in an adaptation process, as well as data augmentation processes.
The baseline experiments will rely on neutral speech corpora and expressive speech corpora previously collected for speech synthesis in the Multispeech team. Further experiments will consider using other expressive speech data, possibly extracted from audiobooks.
Skills and profile:
- Master in automatic language processing or in computer science
- Background in statistics and in deep learning
- Experience with deep learning tools
- Good computer skills (preferably in Python)
- Experience in speech synthesis is a plus
Bibliography:
[Sch01] M. Schröder. Emotional speech synthesis: A review. Proc. EUROSPEECH, 2001.
[Sch09] M. Schröder. Expressive speech synthesis: Past, present, and possible futures. Affective information processing, pp. 111–126, 2009.
[ICHY03] A. Iida, N. Campbell, F. Higuchi and M. Yasumura. A corpus-based speech synthesis system with emotion. Speech Communication, vol. 40, n. 1, pp. 161–187, 2003.
[PBE+06] J.F. Pitrelli, R. Bakis, E.M. Eide, R. Fernandez, W. Hamza and M.A. Picheny. The IBM expressive text-to-speech synthesis system for American English. IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, n. 4, pp. 1099–1108, 2006.
[JZSC05] D. Jiang, W. Zhang, L. Shen and L. Cai. Prosody analysis and modeling for emotional speech synthesis. Proc. ICASSP, 2005.
[WSV+15] Z. Wu, P. Swietojanski, C. Veaux, S. Renals, S. King. A study of speaker adaptation for DNN-based speech synthesis. Proc. INTERSPEECH, pp. 879–883, 2015.
Additional information:
Supervision and contact: Denis Jouvet (denis.jouvet@loria.fr; https://members.loria.fr/DJouvet/), Vincent Colotte (Vincent.colotte@loria.fr; https://members.loria.fr/VColotte/)
Additional link: Ecole Doctorale IAEM Lorraine (http://iaem.univ-lorraine.fr/)
Duration: 3 years
Starting date: autumn 2018
Deadline to apply: May 1st, 2018
The candidates are required to provide the following documents in a single PDF or ZIP file:
- CV
- A cover/motivation letter describing their interest in the topic
- Degree certificates and transcripts for Bachelor and Master (or the last 5 years)
- Master thesis (or equivalent) if it is already completed, or otherwise a description of the work in progress
- The publications (or web links) of the candidate, if any (it is not expected that they have any)
In addition, one recommendation letter from the person who supervises (or supervised) the Master thesis (or research project or internship) should be sent directly by its author to the prospective PhD advisor.
|
6-79 | (2018-04-19) PhD at Le Mans University, France
Title of the PhD thesis: Automatic speech processing in meetings using a microphone array
Keywords: Reverberant environment – Array & beamforming – Signal processing – Deep learning – Transcription and speaker recognition
Supervision: Silvio Montrésor (LAUM), Anthony Larcher (LIUM), Jean-Hugh Thomas (LAUM)
Funding: LMAC (Scientific bets of Le Mans Acoustique)
Beginning: September 2018
Contact: jean-hugh.thomas@univ-lemans.fr
Aim of the PhD thesis
The subject is supported by two laboratories of Le Mans Université: the acoustics lab (LAUM) and the computer science lab (LIUM). The aim is to improve automatic speech processing in meetings, namely transcription and speaker recognition, by using a microphone-array recording device and the associated audio signal processing.
Subject of the PhD thesis
The work consists of implementing a hands-free system able to localise the speakers in a room, separate the signals emitted by these speakers, and enhance the speech signal and its processing.
The thesis will address the following issues:
- Define an array geometry adapted to distant sound recording with few microphones.
- Propose processing that takes advantage of the acoustic data provided by the array and selects the parts of the audio signals (reflection orders) most relevant for improving the performance of the LIUM automatic speech recognition system. The processing should take into account the confined environment (meeting room). It will also use source separation algorithms to identify the different speakers during the meeting.
- Propose new developments to the usual feature extraction methods so that the extracted features are more relevant for the neural network.
- Propose a learning strategy for the neural network to enhance the transcription performance.
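For readers unfamiliar with array processing, the simplest way to combine the channels of a microphone array is delay-and-sum beamforming: each channel is shifted by the arrival delay of the target speaker and the aligned channels are averaged. The sketch below is only an illustration of that baseline, not the method to be developed in the thesis. The helper names (`simulate_array`, `delay_and_sum`), the integer sample delays, and the toy sine source are all assumptions made for the example; in a real meeting room the delays would have to be estimated from the array geometry and the reverberation handled separately.

```python
import math


def simulate_array(source, delays):
    """Hypothetical helper: each microphone hears the source delayed by d samples."""
    return [[0.0] * d + source[:len(source) - d] for d in delays]


def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of the target source at each microphone,
              in samples (non-negative integers, assumed known here).
    """
    n = len(channels[0])
    out = [0.0] * n
    for signal, d in zip(channels, delays):
        for t in range(n - d):
            # Advance the channel by d samples so all copies line up.
            out[t] += signal[t + d]
    return [v / len(channels) for v in out]


# Toy example: a sine source reaching three microphones at different times.
source = [math.sin(2 * math.pi * t / 16) for t in range(64)]
delays = [0, 2, 5]  # assumed arrival delays, in samples
enhanced = delay_and_sum(simulate_array(source, delays), delays)
```

With noise-free channels the aligned average reproduces the source, except for the last few samples where the most delayed channels have run out of data. With independent noise on each microphone, the same averaging would attenuate the noise relative to the aligned speech.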
Some references
[1] J. H. L. Hansen, T. Hasan, Speaker recognition by machines and humans, IEEE Signal Processing Magazine, 74, 2015.
[2] L. Deng, G. Hinton, B. Kingsbury, New types of deep neural network learning for speech recognition and related applications: An overview, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 8599-8603).
[3] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, B. Kingsbury, Deep neural networks for acoustic modelling in speech recognition, IEEE Signal Processing Magazine, 82, 2012.
[4] P. Bell, M. J. F. Gales, T. Hain, J. Kilgour, P. Lanchantin, X. Liu, A. McParland, S. Renals, O. Saz, M. Wester, et al., The MGB challenge: Evaluating multi-genre broadcast media recognition, Proc. of ASRU, Arizona, USA, 2015.
[5] T. B. Spalt, Background noise reduction in wind tunnels using adaptive noise cancellation and cepstral echo removal techniques for microphone array applications, Master of Science in Mechanical Engineering, Hampton, Virginia, USA, 2010.
[6] D. Blacodon, J. Bulté, Reverberation cancellation in a closed test section of a wind tunnel using a multi-microphone cepstral method, Journal of Sound and Vibration 333, 2669-2687 (2014).
[7] Q.-G. Liu, B. Champagne, P. Kabal, A microphone array processing technique for speech enhancement in a reverberant space, Speech Communication 18 (1996) 317-334.
[8] S. Doclo, Multi-microphone noise reduction and de-reverberation techniques for speech applications, Thesis, Leuven (Belgium), 2003.
[9] Y. Liu, N. Nower, S. Morita, M. Unoki, Speech enhancement of instantaneous amplitude and phase for applications in noisy reverberant environments, Speech Communication 84 (2016) 1-14.
[10] Feng, X., Zhang, Y., & Glass, J. (2014, May). Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1759-1763). IEEE.
[11] Kinoshita, K., Delcroix, M., Yoshioka, T., Nakatani, T., Sehr, A., Kellermann, W., & Maas, R. (2013, October). The reverb challenge: A common evaluation framework for dereverberation and recognition of reverberant speech. In 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (pp. 1-4). IEEE.
[12] Xiao X., Watanabe S., Erdogan H., Lu L., Hershey J., Seltzer M. L., Chen G., Zhang Y., Mandel M., Yu D., Deep Beamforming Networks for Multi-Channel Speech Recognition, Proceedings of ICASSP 2016, pp. 5745-5749.
|
6-80 | (2018-04-15) PhD Project Australia-France
PhD Project – Call for Applications: Situated Learning for Collaboration across Language Barriers
People working in development are often deployed to remote locations where they work alongside locals who speak an unwritten minority language. Outsiders and locals share knowhow and pick up phrases in each other’s languages. They are performing a type of situated learning of language and culture. This situation is found across the world, in developing countries, border zones, and in indigenous communities. This project will develop computational tools to help people work together across language barriers. The research will be evaluated in terms of the quality of the social interaction, the mutual acquisition of language and culture, the effectiveness of cross-lingual collaboration, and the quantity of translated speech data collected. The ultimate goal is to contribute to the grand task of documenting the world’s languages. The project will involve working between France and Australia, and will include fieldwork with a remote indigenous community.
We’re looking for outstanding and highly motivated candidates to work on a PhD on this subject. Competencies in two or more of the following areas are mandatory:
• machine learning for natural language processing;
• speech processing for interactive systems;
• participatory design;
• mobile software development;
• documenting and describing unwritten languages.
The project will build on previous work in the following areas: mobile platforms for collecting spoken language data [6, 7]; respeaking as a technique for improving the value of recordings made ‘in the wild’ and an alternative to traditional transcription practices [12, 13]; machine learning of structure in phrase-aligned bilingual speech recordings [2, 3, 4, 8, 9, 10, 11]; participatory design of mobile technologies for working with minority languages [5]; managing multilingual databases of text, speech and images [1].
Some recent indicative PhD theses include: Computer Supported Collaborative Language Documentation (Florian Hanke, 2017); Automatic Understanding of Unwritten Languages (Oliver Adams, 2018); Collecter, Transcrire, Analyser : quand la Machine Assiste le Linguiste dans son Travail de Terrain (Elodie Gauthier, 2018); Enriching Endangered Language Resources using Translations (Antonios Anastasopoulos, in prep); Digital Tool Deployment for Language Documentation (Mat Bettinson, in prep); Bayesian and Neural Modeling for Multi Level and Crosslingual Alignment (Pierre Godard, in prep).
Details of the position. Funding includes remission of university fees, a stipend of approximately €17,500 per year, and a travel allowance. The position starts in Fall 2018 (i.e., from September) and lasts for three years. The research will be supervised by Steven Bird (Charles Darwin University, Australia) and Laurent Besacier (Univ. Grenoble Alpes, France). Acceptance will be subject to approval by both host institutions (Grenoble and Darwin). Given the cross-cultural nature of the project, the successful candidate will have demonstrated substantial experience of cross-cultural living.
Apply. To apply, please contact laurent.besacier@univ-grenoble-alpes.fr and steven.bird@cdu.edu.au, including a cover letter, curriculum vitae, academic transcripts and a reference letter from your MSc thesis advisor.
Institutions. The University of Grenoble offers an excellent research environment with ample compute hardware to solve hard speech and natural language processing problems, as well as remarkable surroundings to explore over the weekends. Charles Darwin University is a research-intensive university attracting students from over 50 countries. CDU is situated in Australia’s tropical north, in the midst of one of the world’s hot-spots for linguistic diversity and language endangerment. Darwin is a youthful, multicultural, cosmopolitan city in a territory that is steeped in Aboriginal tradition and culture and which enjoys a close interaction with the peoples of Southeast Asia.
References
[1] Steven Abney and Steven Bird. The Human Language Project: building a universal corpus of the world’s languages. In Proceedings of the 48th Meeting of the Association for Computational Linguistics, pages 88–97. ACL, 2010.
[2] Oliver Adams, Graham Neubig, Trevor Cohn, and Steven Bird. Learning a translation model from word lattices. In Interspeech 2016, pages 2518–22, 2016.
[3] Antonios Anastasopoulos, Sameer Bansal, David Chiang, Sharon Goldwater, and Adam Lopez. Spoken term discovery for language documentation using translations. In Proceedings of the Workshop on Speech-Centric NLP, pages 53–58, 2017.
[4] Antonios Anastasopoulos and David Chiang. A case study on using speech-to-translation alignments for language documentation. In Proc. Workshop on Use of Computational Methods in Study of Endangered Languages, pages 170–178, 2017.
[5] Steven Bird. Designing mobile applications for endangered languages. In Kenneth Rehg and Lyle Campbell, editors, Oxford Handbook of Endangered Languages. Oxford University Press, 2018.
[6] Steven Bird, Florian R. Hanke, Oliver Adams, and Haejoong Lee. Aikuma: A mobile app for collaborative language documentation. In Proceedings of the Workshop on the Use of Computational Methods in the Study of Endangered Languages. ACL, 2014.
[7] David Blachon, Elodie Gauthier, Laurent Besacier, Guy-Noël Kouarata, Martine Adda-Decker, and Annie Rialland. Parallel speech collection for under-resourced language studies using the Lig-Aikuma mobile device app. In Proceedings of the Fifth Workshop on Spoken Language Technologies for Under-resourced languages, volume 81, pages 61–66, 2016.
[8] V. H. Do, N. F. Chen, B. P. Lim, and M. A. Hasegawa-Johnson. Multitask learning for phone recognition of underresourced languages using mismatched transcription. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26:501–514, 2018.
[9] Ewan Dunbar, Xuan Nga Cao, Juan Benjumea, Julien Karadayi, Mathieu Bernard, Laurent Besacier, Xavier Anguera, and Emmanuel Dupoux. The zero resource speech challenge 2017. In Automatic Speech Recognition and Understanding (ASRU), 2017 IEEE Workshop on. IEEE.
[10] Long Duong, Antonios Anastasopoulos, David Chiang, Steven Bird, and Trevor Cohn. An attentional model for speech translation without transcription. In Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 949–959, 2016.
[11] Pierre Godard, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Laurent Besacier, Helene Bonneau-Maynard, Guy-Noël Kouarata, Kevin Löser, Annie Rialland, and François Yvon. Preliminary experiments on unsupervised word discovery in Mboshi. In Interspeech 2016, 2016.
[12] Mark Liberman, Jiahong Yuan, Andreas Stolcke, Wen Wang, and Vikramjit Mitra. Using multiple versions of speech input in phone recognition. In ICASSP, pages 7591–95. IEEE, 2013.
[13] Anthony C. Woodbury. Defining documentary linguistics. Language Documentation and Description, 1:35–51, 2003.
|
6-81 | (2018-04-19) Joint PhD, Rennes/Dublin
Funded joint PhD between Univ Rennes and DIT, Dublin. The subject is "Deep neural natural language style transfer".
|