ISCA - International Speech Communication Association



ISCApad #314

Friday, August 09, 2024 by Chris Wellekens

4 Academic and Industry Notes
4-1New Master curriculum integrating advanced study and research covering all areas of language science, Univ. of Paris, France

The Paris Graduate School of Linguistics (PGSL) is a newly-formed Paris-area graduate program covering all areas of language science.

It offers a comprehensive Master curriculum integrating advanced study and research, in close connection with PhD programs as well as with the Empirical Foundations of Linguistics consortium. 

Research plays a central part in the program, and students also take elective courses to develop an interdisciplinary outlook. Prior knowledge of French is not required.

For more details, please see https://paris-gsl.org/index.html

New funding opportunity: https://u-paris.fr/en/call-for-applications-international-students-miem-scholarship-program/

Application deadline: February 1st, 2021 (program starting September 1st, 2021)

PGSL is funded by Smarts-UP (Student-centered iMproved, Active Research-based Training Strategy at Université de Paris) through the ANR SFRI grant « Grandes universités de recherche » (PIA3) 2020-2029.


4-2Cambridge's Machine Learning and Machine Intelligence MPhil


Are you interested in speech and language processing, computer vision & robotics, human-computer interaction, or machine learning? Please consider applying to the University of Cambridge's Machine Learning and Machine Intelligence (MLMI) MPhil programme.

 

The MLMI MPhil is an elite 11-month programme with a small cohort of about 30 students each year. Thanks to its small size, there is the opportunity to carry out PhD-like research projects on the course (see here for previous students' dissertations), as well as a number of bespoke taught modules with many opportunities to interact with the faculty and other members of the course (see here for a list of modules and here for a list of the teaching staff).

 

Previous members of the MPhil have gone on to study for PhDs in top research groups (e.g. at Oxford, Cambridge, Stanford, and MIT), and have gone into top industry positions (e.g. Microsoft Research, Facebook AI Research, Open AI, and AstraZeneca).

 

This year our programme is restructuring around four overlapping tracks: speech and language processing, computer vision & robotics, human-computer interaction, and machine learning. You apply to one of these tracks and this choice shapes your module options and the research project that you will take on. We are especially interested in candidates who are interested in speech and language processing, computer vision & robotics, and human-computer interaction as we have significant capacity to expand in these areas this year.

 

Details about the application process can be found on our website. The application deadline is 2nd December 2021.


4-3Discord server for early-career speech researchers
We would like to announce the creation of a Discord server for early-career speech researchers: https://discord.gg/kSgaZp7yg9

This discussion space aims to bring together the community of early-career speech researchers in France (Master's students, PhD students, postdocs...).
You can use it to share your latest published papers, ask questions or request help with software (or anything else), or simply get in touch with people working in your field or in related areas. Nothing is set in stone; the server is meant to evolve as it is used! It can also help us meet up at conferences and other events. As early-career researchers with small networks, we all know the uncomfortable feeling of being alone at a conference, despite the 1,500 people around us. So rather than each stressing in our own corner, let's meet up and share it all together by arranging to meet via the discussion server!

This Discord server was created in response to the call from the JEP organizing committee for an event aimed at early-career speech researchers during the Journées d'Études en Parole, to be held in Noirmoutier from June 13 to 17, 2022 (https://jep2022.univ-nantes.fr/). We first created it for early-career researchers to discuss what kind of event we could imagine, and then thought it would be worthwhile to open it to everyone, to gather your needs and wishes and see what would interest the largest number of people. It is therefore also intended to help identify everyone's training needs. Rest assured, it is split in two: one category for everyone, and one for those interested in organizing study days, so there is no unnecessary spam. If you would like to join the second category, please notify me in the #général channel or send me a private message*.

See you on the server to bring our community of early-career researchers to life!

Thanks to the JEP organizers and to the AFCP for their support.

4-4The European Language Equality project

The European Language Equality project aims to establish a strategic research and
innovation agenda for achieving digital language equality in Europe by 2030. As part
of this project, the partners have produced reports documenting the state of
technologies and resources for each official language, as well as for certain
non-official languages (D1.4-D1.36). State-of-the-art reports covering four major
areas have also been produced (D2.12-D2.16):
- machine translation
- speech technologies
- technologies for language analysis and understanding
- resources and knowledge bases.

All these reports are available on the project website:
https://european-language-equality.eu/deliverables/

The report on the state of technologies for the French language and for French Sign
Language has been translated into French. It is available here:
https://hal.archives-ouvertes.fr/hal-03637784


4-5Distribution Agreement between ELDA and Lexicala for Multilingual Lexical Data Dissemination

Press Release – immediate
Paris, France and Tel Aviv, Israel, October 12, 2023

 

Distribution Agreement between ELDA and Lexicala for Multilingual Lexical Data Dissemination

ELDA and Lexicala by K Dictionaries are delighted to announce their new cooperation on distributing Language Resources for 50 languages.

ELDA is now making available Lexicala’s high-quality lexical data designed to enhance language learning, and support Machine Translation and diverse Natural Language Processing and Artificial Intelligence applications.

The Lexicala resources consist of different groups of datasets. Full descriptions can be found in the ELRA Catalogue of Language Resources under the following links:

  1. GLOBAL Multilingual Lexical Data: a network of lexicographic cores for major world languages, comprising monolingual cores, bilingual pairs, and multilingual combinations for 25 languages.
  2. MULTIGLOSS Multilingual Glossaries: a series of innovative word-to-sense glossaries for over 30 languages into 45 more languages.
  3. Morphological lexicons: extensive morphological lists linking inflected forms to main lemmas for 15 languages.
  4. Parallel Corpora & Domains: parallel corpora for nearly 400 language pairs and numerous multilingual combinations, featuring general language and vertical domain vocabularies.
  5. Biographical & Geographical Names:
  • English BIO Biographical Names: 4,200 dictionary entries regarding prominent persons worldwide.
  • English GEO Geographical Names: 7,200 dictionary entries regarding major locations worldwide.
  • GEOLINGUAL Tables: multilingual tables of over 200 countries and geographical names – including their adjectives, persons, and main languages – in 16 languages.
  • Audio Pronunciation & Phonetic Transcription: human voice recordings of single-word lemmas and multiword expressions, as well as IPA and alternative scripts for 21 languages.

 

 For more information, please write to contact@elda.org.

 About Lexicala

Lexicala by K Dictionaries offers multi-layer lexical data for and across 50 languages, drawing on 30 years' experience in pedagogical and multilingual lexicography worldwide. Lexicala combines manual content creation and curation with automated data processing, helping to enhance machine translation and other natural language processing applications, as well as language learning and model training.

To find out more about Lexicala, please visit: https://lexicala.com/

 

About ELDA

The Evaluations and Language resources Distribution Agency (ELDA) identifies, collects, markets, and distributes language resources, and disseminates general information in the field of Human Language Technologies (HLT). ELDA has considerable knowledge and skills in HLT applications and takes part in major French, European, and international projects in the field.

To find out more about ELDA, please visit: http://www.elda.org/


4-6Cf bids for the 2025 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2025)

Following the success of the biennial ASRU workshop over the past decade, the IEEE Speech and Language Technical Committee (SLTC) invites proposals to host the 2025 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2025). Past ASRU workshops have fostered a collegial atmosphere through a thoughtful selection of venues, offering a unique opportunity for researchers to interact and learn.

 

The proposal should include the information outlined below. 

  • Workshop location and practicalities
    • Geographical location
    • Workshop venue (facilities, meeting rooms, network access during the workshop, audio/visual equipment)
    • Accommodation -- hotel availability and pricing
    • Meals
    • Transportation options -- major airports, logistics, visas
    • Climate
    • Information on the policy regarding online presentation and participation
  • Approximate workshop dates
    • Previous workshops have been held in the month of December
  • Approximate total cost for participants to attend, including accommodation, meals, and registration fees.
  • Estimated budget for 300-400+ participants and expected sponsorships, including venue costs, administration, banquet, coffee breaks, publication costs, etc.
  • Committee Personnel
    • General chair(s)
    • Technical chair(s)
    • Local arrangements chair(s)
    • Other committee members and roles
  • Tentative program
    • Dates for paper submission, notification of acceptance, proposals for
      demonstrations, early registration
    • Reception, talks, posters, demo session, banquet, etc.
    • Substantial program additions/changes vs. past instances of ASRU


The deadline for proposals is June 14, 2024.


Please send proposals and questions to the Workshop Sub-Committee:

Zhijian Ou 

Abdelrahman Mohamed 

Seokhwan Kim 

Hagai Aronowitz 

Sibel Oyman 

Haitao Mi 

 

In July 2024, the IEEE SLTC will review proposals, and selection results are expected by August 15, 2024. 

 

If you are interested in submitting a proposal, we encourage you to contact the IEEE SLTC Workshops Sub-Committee in advance of submitting a proposal. They can provide an example of a past successful proposal and example budget. Further, proposers who make contact before April 1, 2024 may be invited to briefly present in-person or virtually at the annual IEEE SLTC meeting at ICASSP 2024 (https://2024.ieeeicassp.org/) to obtain feedback from the SLTC. Presentations should be reasonably specific but need not be complete. Note that the SLTC does not have funding available for travel to ICASSP. 

 

The organizers of the ASRU workshop do not have to be SLTC members, and we encourage submissions from all potential organizers. IEEE SLTC members are welcome to participate in proposals, and the organizing committees of past ASRU events have included many SLTC members. To maintain fairness of selection, SLTC members who are affiliated with any ASRU 2025 proposals will not participate in the proposal selection vote. Further, the members of the Workshops Sub-Committee may not be affiliated with any ASRU 2025 proposals. Please feel free to distribute this call for proposals far and wide, and invite members of the speech and language community at large to submit a proposal to organize the next ASRU workshop.

 

For more information on the most recent workshops, please see:

http://www.asru2023.org/ for information about ASRU 2023 in Taipei, Taiwan

And feel free to contact the Workshops Sub-committee with questions.

 

Zhijian Ou, Abdelrahman Mohamed, Seokhwan Kim, Hagai Aronowitz, Sibel Oyman, Haitao Mi

IEEE SLTC Workshops Sub-committee


4-7Invitation to the DISPLACE challenge: DIarization of SPeaker and LAnguage in Conversational Environments
We invite participants to register and help advance the field of diarization in the Interspeech 2024 special session on the DIarization of SPeaker and LAnguage in Conversational Environments (DISPLACE) Challenge.

 

The second DISPLACE challenge entails a first-of-its-kind task: performing speaker and language diarization, as well as speech recognition, on a demanding dataset containing multi-speaker social conversations in multilingual, code-mixed speech. Current speech processing systems are not equipped to perform meaningfully in such conditions, as the first DISPLACE challenge in 2023 amply illustrated: https://arxiv.org/pdf/2311.12564.pdf


With this motivation, the second DISPLACE challenge aims to advance the field by benchmarking and improving Speaker Diarization (SD) in multilingual settings and Language Diarization (LD) in multi-speaker settings, using the same underlying dataset. In addition, the speech recognition track aims to improve transcription of code-mixed, multi-speaker speech. Registrations are open for this challenge, which comprises three tracks: a) speaker diarization, b) language diarization, and c) ASR.

 

A baseline system and an open leaderboard are available to participants. The DISPLACE challenge is split into two phases: the first phase is linked to the Interspeech paper submission deadline, while the second aligns with the camera-ready submission deadline. For more details, dates, and registration, please visit the DISPLACE challenge website: https://displace2024.github.io

 

We look forward to your team competing to 'displace' the state of the art in speaker diarization, language diarization, and/or ASR systems.

 

Thank you and Namaste,

The DISPLACE team 

 


4-8Call for membership: ManyLanguages
We are excited to launch ManyLanguages, a globally distributed network of laboratories that helps coordinate Big Team Science studies on human language. 

Our mission is to connect language science researchers in order to diversify the languages, participants, researchers, and projects represented in the language sciences. We will facilitate the collection of evidence across the language sciences by supporting a distributed laboratory network that is ongoing, diverse, and inclusive. We embrace open science principles by sharing collected data, materials, translations, and other research outputs from the network. We strive to engage research across a broad spectrum of the language sciences, creating an inclusive and diverse environment for ideas, investigation, and participation.

Join us as a member and learn more about our plans to help advance the language sciences. Currently, we are accepting proposals for big team science projects that replicate experimental linguistic phenomena across many languages. Selected projects will be supported by our team and external experts throughout the entire project.

Find more information here: https://many-languages.com 
Join us as a member here: https://many-languages.com/join.html 

Get in touch with us: many-languages@googlegroups.com 

And follow us on social media for updates.


4-9Bids for Speech Prosody 2026

Dear Speech Prosody SIG Members,

 

While we're all looking forward to Speech Prosody 2024 in Leiden …

 

… it's also time to start planning Speech Prosody 2026.  Accordingly, members of SProSIG with a history of attendance at Speech Prosody conferences are encouraged to submit bids to host Speech Prosody 13.  Bids submitted by May 31 will garner a presentation spot during Speech Prosody in Leiden.  The final deadline is August 9, 2024, and written bids received by that date will be posted at http://sprosig.org.   The SProSIG membership will then be invited to consider the bids and to vote their preferences.  Bids may include any information that you believe is likely to sway the members, but should contain at least:

- City and Country

- General Chair

- Organizing Committee Members

- Proposed conference dates

- Expected registration fee

- Sponsoring Organization (university, company, or agency; may be tentative)

- Venue (may be tentative)

- Access from the closest major airport

- Accommodation options

 

Further information and the bid template are available at http://sprosig.org/about.html.  Please submit bids to Martine Grice martine.grice@uni-koeln.de and Aoju Chen aoju.chen@uu.nl with a cc to Nigel Ward  nigelward@acm.org .

 

Plinio Barbosa, Hongwei Ding, Martine Grice, Aoju Chen, Nigel Ward

(Speech Prosody SIG Officers)


4-10The Speech Prosody Conference program 2024

 

  1. The Speech Prosody Conference program is now available at https://www.universiteitleiden.nl/sp2024/program.

 

  2. The online lecture series resumes next month, with a talk on speech synthesis needs by Zofia Malisz; details below, also at https://sprosig.org/index.html. After that, the tentative schedule is:
  • Gabriel Skantze, KTH, May 15.
  • Simon Roessig, York, September.
  • Sam Tilsen, Cornell, October.
  • Sasha Calhoun, Victoria University of Wellington, November.
  • Robert Xu, Stanford, December.

 

 

The speech synthesis phoneticians need is both realistic and controllable: A survey and a roadmap towards modern synthesis tools for phonetics.
Zofia Malisz, KTH Royal Institute of Technology.
April 17th, 2 pm (Brasilia time). viewing link

ABSTRACT
In the last decade, data-driven and machine-learning methods for speech synthesis have greatly improved its quality, so much so that the realism achievable by current neural synthesisers can rival natural speech. However, modern neural synthesis methods have not yet been adopted as tools for experimentation in the speech and language sciences, because modern systems still lack the ability to manipulate low-level acoustic characteristics of the signal, such as formant frequencies.
In this talk, I survey recent advances in speech synthesis and discuss their potential as experimental tools for phonetic research. I argue that speech scientists and speech engineers would benefit from working more closely with each other again, in particular in the pursuit of prosodic and acoustic parameter control in neural speech synthesis. I showcase several approaches to fine-grained synthesis control that I have implemented with colleagues: WavebenderGAN and a system that mimics the source-filter model of speech production. These systems make it possible to manipulate formant frequencies and other acoustic parameters with the same (or better) accuracy as, e.g., Praat, but with far superior signal quality.
Finally, I discuss ways to improve synthesis evaluation paradigms so that they meet not only industry benchmarks but also those of speech science experimentation. My hope is to inspire more students and researchers to take up these research challenges and explore the potential of working at the intersection of speech technology and speech science.

Outline:
  1. I briefly discuss the history of advancements in speech synthesis, starting in the formant synthesis era, and explain where the improvements came from.
  2. I present experiments I have done showing that modern synthetic speech is processed no differently from natural speech by humans in a lexical decision task, as evidence that the realism ('naturalness') goal has been largely achieved.
  3. I explain how realism came at the expense of controllability, show why controllability is an indispensable feature for speech synthesis to be adopted in phonetic experimentation, and survey the current state of research on controllability in speech engineering, concentrating on prosodic and formant control.
  4. I propose how we can fix this, explaining the work I have done with colleagues on several systems that feature both realism and control.
  5. I sketch a roadmap for improving synthesis tools for phonetics, focusing on benchmarking systems according to scientific criteria.


 

Nigel Ward, SProSIG Chair, Professor of Computer Science, University of Texas at El Paso

nigel@utep.edu    https://www.cs.utep.edu/nigel/   



