ISCApad #237 |
Sunday, March 11, 2018 by Chris Wellekens |
7-1 | CfP Special Issue of Speech Communication on *REALISM IN ROBUST SPEECH AND LANGUAGE PROCESSING*
*Deadline: May 31st, 2017*
How can you be sure that your research has actual impact in real-world applications? This is one of the major challenges currently faced in many areas of speech processing as laboratory solutions migrate to real-world applications, which is what we address by the term 'realism'. Real application scenarios involve acoustic, speaker, and language variabilities that challenge the robustness of systems. Because early evaluations in the targeted practical scenarios are rarely feasible, many developments are based on simulated data, which raises concerns about the viability of these solutions in real-world environments.
Information about which conditions a dataset must meet to be realistic, and experimental evidence about which ones actually matter for the evaluation of a given task, are sparse in the literature. Motivated by the growing importance of robustness in commercial speech and language processing applications, this Special Issue aims to provide a venue for research advancements, recommendations for best practices, and tutorial-like papers about realism in robust speech and language processing.
Prospective authors are invited to submit original papers in areas related to the problem of realism in robust speech and language processing, including: speech enhancement, automatic speech, speaker and language recognition, language modeling, speech synthesis and perception, affective speech processing, paralinguistics, etc. Contributions may include, but are not limited to:
- Position papers from researchers or practitioners for best practice recommendations and advice regarding different kinds of real and simulated setups for a given task
- Objective experimental characterization of real scenarios in terms of acoustic conditions (reverberation, noise, sensor variability, source/sensor movement, environment change, etc.)
- Objective experimental characterization of real scenarios in terms of speech characteristics (spontaneous speech, number of speakers, vocal effort, effect of age, non-neutral speech, etc.)
- Objective experimental characterization of real scenarios in terms of language variability
- Real data collection protocols
- Data simulation algorithms
- New datasets suitable for research on robust speech processing
- Performance comparison on real vs. simulated datasets for a given task and a range of methods
- Analysis of advantages vs. weaknesses of simulated and/or real data, and techniques for addressing these weaknesses
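Several of the topics above concern how simulated data is generated. A common approach for building simulated noisy datasets is to mix clean speech with noise scaled to a target signal-to-noise ratio; the following is a minimal illustrative sketch (the function name and the random stand-in signals are assumptions, not from this call):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Mix clean speech with noise at a target signal-to-noise ratio (dB)."""
    noise = np.resize(noise, speech.shape)  # loop/trim noise to speech length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Scale noise so that 10*log10(p_speech / p_scaled_noise) == snr_db.
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)  # stand-in for 1 s of 16 kHz speech
noise = rng.standard_normal(8000)
noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

Real-data collection protocols, by contrast, capture effects such as Lombard speech and sensor movement that this kind of additive simulation cannot reproduce, which is precisely the real-vs-simulated comparison the topics above call for.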
Papers written by practitioners and industry researchers are especially welcome. If there is any doubt about the suitability of your paper for this special issue, please contact us before submission.
*Submission instructions: *
Manuscript submissions shall be made through EVISE at https://www.evise.com/profile/#/SPECOM/login
Select article type 'SI:Realism Speech Processing'
*Important dates: *
March 1, 2017: Submission portal open
May 31, 2017: Paper submission
September 30, 2017: First review
November 30, 2017: Revised submission
April 30, 2018: Completion of revision process
*Guest Editors: *
Dayana Ribas, CENATAV, Cuba
Emmanuel Vincent, Inria, France
John Hansen, UTDallas, USA
7-2 | CfP IEEE Journal of Selected Topics in Signal Processing: Special Issue on End-to-End Speech and Language Processing
End-to-end (E2E) systems have achieved competitive results compared to conventional hybrid hidden Markov model-deep neural network based automatic speech recognition (ASR) systems. Such E2E systems are attractive because they do not require initial alignments between input acoustic features and output graphemes or words. Very deep convolutional networks and recurrent neural networks have also been very successful in ASR systems due to their added expressive power and better generalization.
ASR is often not the end goal of real-world speech information processing systems. Instead, an important end goal is information retrieval, in particular keyword search (KWS), which involves retrieving speech documents containing a user-specified query from a large database. Conventional keyword search uses an ASR system as a front-end that converts the speech database into a finite-state transducer (FST) index containing a large number of likely word or sub-word sequences for each speech segment, along with associated confidence scores and time stamps. A user-specified text query is then composed with this FST index to find the putative locations of the keyword along with confidence scores. More recently, inspired by E2E approaches, ASR-free keyword search systems have been proposed, with limited success so far. Machine learning methods have also been very successful in question answering, parsing, language translation, analytics, and deriving representations of morphological units, words, or sentences.
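The conventional KWS pipeline described above can be sketched without a full FST toolkit. Here a toy posting-list index stands in for the FST index, and the segment names, time stamps, and confidence scores are made-up assumptions for illustration only:

```python
from collections import defaultdict

# Toy index standing in for the FST index produced by the ASR front-end:
# word -> list of (segment_id, start_time, end_time, confidence).
index = defaultdict(list)
hypotheses = [
    ("utt1", "open",   0.10, 0.35, 0.92),
    ("utt1", "the",    0.35, 0.45, 0.88),
    ("utt1", "window", 0.45, 0.90, 0.75),
    ("utt2", "window", 1.20, 1.70, 0.40),
    ("utt2", "widow",  1.20, 1.70, 0.35),
]
for seg, word, start, end, conf in hypotheses:
    index[word].append((seg, start, end, conf))

def search(query, index, max_gap=0.2):
    """Return putative keyword locations with combined confidence scores."""
    words = query.lower().split()
    # Candidate chains start at every occurrence of the first query word.
    chains = list(index.get(words[0], []))
    for w in words[1:]:
        extended = []
        for seg, start, end, conf in chains:
            for seg2, s2, e2, c2 in index.get(w, []):
                # Successive words must lie in the same segment, close in time.
                if seg2 == seg and 0 <= s2 - end <= max_gap:
                    extended.append((seg, start, e2, conf * c2))
        chains = extended
    return sorted(chains, key=lambda h: -h[3])

hits = search("open the window", index)
```

A real system composes the query with the FST index directly (and indexes sub-word units to handle out-of-vocabulary queries); this sketch only mirrors the retrieve-by-composition idea of matching consecutive hypotheses and multiplying their confidences.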
Challenges such as the Zero Resource Speech Challenge aim to construct systems that learn an end-to-end spoken dialog (SD) system, in an unknown language, from scratch, using only information available to a language-learning infant (zero linguistic resources). The principal objective of the recently concluded IARPA Babel program was to develop a keyword search system that delivers high accuracy for any new language given very limited transcribed speech, noisy acoustic and channel conditions, and a limited system build time of one to four weeks. This special issue will showcase the power of novel machine learning methods not only for ASR, but for keyword search and for the general processing of speech and language. Topics of interest in the special issue include (but are not limited to):
• Novel end-to-end speech and language processing
• Query-by-example search
• Deep learning based acoustic and word representations
• Question answering systems
• Multilingual dialogue systems
• Multilingual representation learning
• Low and zero resource speech processing
• Deep learning based ASR-free keyword search
• Deep learning based media retrieval
• Kernel methods applied to speech and language processing
• Acoustic unit discovery
• Computational challenges for deep end-to-end systems
• Adaptation strategies for end-to-end systems
• Noise robustness for low resource speech recognition systems
• Spoken language processing: speech-to-speech translation, speech retrieval, extraction, and summarization
• Machine learning methods applied to morphological, syntactic, and pragmatic analysis
• Computational semantics: document analysis, topic segmentation, categorization, and modeling
• Named entity recognition, tagging, chunking, and parsing
• Sentiment analysis, opinion mining, and social media analytics
• Deep learning in human computer interaction
Dates:
Manuscript submission: April 1, 2017
First review completed: June 1, 2017
Revised manuscript due: July 15, 2017
Second review completed: August 15, 2017
Final manuscript due: September 15, 2017
Publication: December 2017
Guest Editors:
Nancy F. Chen, Institute for Infocomm Research (I2R), A*STAR, Singapore
Mary Harper, Army Research Laboratory, USA
Brian Kingsbury, IBM Watson, IBM T.J. Watson Research Center, USA
Kate Knill, Cambridge University, U.K.
Bhuvana Ramabhadran, IBM Watson, IBM T.J. Watson Research Center, USA
7-3 | Travaux Interdisciplinaires sur la Parole et le Langage (TIPA). The TIPA editorial team is pleased to announce the publication of the latest issue of the journal on Revues.org: Travaux Interdisciplinaires sur la Parole et le Langage, TIPA No. 32, 2016. This issue will be complemented by No. 33 (2017), which will address the same theme.
7-4 | CfP IEEE Journal of Selected Topics in Signal Processing: Special Issue on End-to-End Speech and Language Processing
TIPA journal, No. 34, 2018
Travaux interdisciplinaires sur la parole et le langage
http://tipa.revues.org/
SIGN LANGUAGE IS LIKE THAT ('LA LANGUE DES SIGNES, C'EST COMME ÇA')
Sign language: state of the art, description, formalization, usage
Guest editor
Mélanie Hamm,
Laboratoire Parole et Langage, Aix-Marseille Université
'La langue des signes, c'est comme ça' refers to Yves Delaporte's book 'Les sourds, c'est comme ça' (2002), which describes the world of the deaf, French Sign Language, and its specific features. One particularity of French Sign Language is the specific sign meaning LIKE THAT[1], a frequent expression among deaf signers that conveys a certain respectful, non-judgmental distance toward the world around us. It is with this same outlook, close to plain and precise scientific probity, that we will attempt to approach signed languages.
Even though the linguistics of signed languages in general, and of French Sign Language in particular, has advanced, notably since the work of Christian Cuxac (1983), Harlan Lane (1991), and Susan D. Fischer (2008), sign language linguistics remains an underdeveloped field. Moreover, French Sign Language is an endangered language, threatened with extinction (Moseley, 2010; Unesco, 2011). But what is this language? How can it be defined? What are its 'mechanisms'? What is its structure? How should it be 'considered', from which angle and with which approaches? This silent language undermines a number of linguistic postulates, such as the universality of the phoneme, and raises many questions that have not yet been answered satisfactorily. In what ways is it similar to and different from spoken languages? Does it belong only to deaf speakers? Should it be studied, shared, preserved, and documented like any language belonging to the intangible heritage of humanity (Unesco, 2003)? How should it be taught, and with what means? What does history tell us about this? What future awaits signed languages? What do those most concerned say? A set of open and very contemporary questions.
Issue 34 of the journal Travaux Interdisciplinaires sur la Parole et le Langage aims to take stock of the state of research and of the various works on this singular language, while avoiding 'locking' it into a single discipline. We are looking for original articles on sign languages, and on French Sign Language in particular. They may offer descriptions, formalizations, or overviews of the usage of signed languages. Articles may also take a comparative approach across sign languages, reflect on variants and variation, offer sociolinguistic, semantic, and structural considerations, or analyze the etymology of signs. In addition, space will be reserved for possible testimonies from deaf signers.
Articles submitted to TIPA are read and evaluated by the journal's review committee. They may be written in French or English and may include images, photos, and videos (see 'instructions to authors' at https://tipa.revues.org/222). A length of 10 to 20 pages is desired for each article, i.e. roughly 35,000 to 80,000 characters or 6,000 to 12,000 words; the recommended average length is about 15 pages. Authors are asked to provide an abstract in the language of the article (French or English; 120 to 200 words), a long abstract of about two pages in the other language (French if the article is in English, and vice versa), and 5 keywords in both languages (French and English). Articles must be in .doc (Word) format and sent to TIPA electronically at the following addresses: tipa@lpl-aix.fr and melanie.hamm@lpl-aix.fr.
References:
COMPANYS, Monica (2007). Prêt à signer. Guide de conversation en LSF. Angers : Éditions Monica Companys.
CUXAC, Christian (1983). Le langage des sourds. Paris : Payot.
DELAPORTE, Yves (2002). Les sourds, c'est comme ça. Paris : Maison des sciences de l'homme.
FISCHER, Susan D. (2008). Sign Languages East and West. In : Piet Van Sterkenburg, Unity and Diversity of Languages. Philadelphia/Amsterdam : John Benjamins Publishing Company.
LANE, Harlan (1991). Quand l'esprit entend. Histoire des sourds-muets. Traduction de l'américain par Jacqueline Henry. Paris : Odile Jacob.
MOSELEY, Christopher (2010). Atlas des langues en danger dans le monde. Paris : Unesco.
UNESCO (2011). Nouvelles initiatives de l'UNESCO en matière de diversité linguistique : http://fr.unesco.org/news/nouvelles-initiatives-unesco-matiere-diversite-linguistique.
UNESCO (2003). Convention de 2003 pour la sauvegarde du patrimoine culturel immatériel : http://www.unesco.org/culture/ich/doc/src/18440-FR.pdf.
Schedule
April 2017: call for contributions
September 2017: article submission (version 1)
October-November 2017: committee feedback; acceptance, requested revisions (of version 1), or rejection
End of January 2018: submission of the revised version (version 2)
February 2018: committee feedback (on version 2)
March-June 2018: submission of the final version
May-June 2018: publication
Instructions to authors
Please send 3 files electronically to tipa@lpl-aix.fr and melanie.hamm@lpl-aix.fr:
- one .doc file containing the title, name, and affiliation of the author(s)
- two anonymous files, one in .doc format and the other in .pdf
For further details, authors may follow this link: http://tipa.revues.org/222
[1] See for example image 421, page 334, in Companys, 2007, or the photo above.
Call for Papers
Special Issue of COMPUTER SPEECH AND LANGUAGE
Speech and Language Processing for Behavioral and Mental Health Research and Applications
The promise of speech and language processing for behavioral and mental health research and clinical applications is profound. Advances in all aspects of speech and language processing and their integration, ranging from speech activity detection, speaker diarization, and speech recognition to various aspects of spoken language understanding and multimodal paralinguistics, offer novel tools both for scientific discovery and for creating innovative approaches to clinical screening, diagnostics, and intervention support. Owing to the potential for widespread impact, research sites across all continents are actively engaged in this societally important research area, tackling a rich set of challenges including the inherent multilingual and multicultural underpinnings of behavioral manifestations. The objective of this Special Issue on Speech and Language Processing for Behavioral and Mental Health Applications is to bring together and share these advances in order to shape the future of the field. It will focus on technical issues and applications of speech and language processing for behavioral and mental health applications. Original, previously unpublished submissions are encouraged within (but not limited to) the following scope:
Important Dates
Guest Editors
Submission Procedure
Authors should follow the Elsevier Computer Speech and Language manuscript format described at the journal site https://www.elsevier.com/journals/computer-speech-and-language/0885-2308/guidefor-authors#20000. Prospective authors should submit an electronic copy of their complete manuscript through the journal Manuscript Tracking System at http://www.evise.com/evise/jrnl/CSL. When submitting your paper, select 'VSI:SLP-Behavior-mHealth' as the article type.
Release of the latest issue of the IEEE CIS Newsletter on Cognitive and Developmental Systems (open access).
This is a biannual newsletter addressing the sciences of developmental and cognitive processes in natural and artificial organisms, from humans to robots, at the crossroads of cognitive science, developmental psychology, artificial intelligence and neuroscience.
It is available at: http://goo.gl/dyrg6s
Featuring dialog:
=== 'Exploring Robotic Minds by Predictive Coding Principle'
Multimodal Interaction in Automotive Applications
=================================================
With the smartphone becoming ubiquitous, pervasive distributed computing is becoming a reality. Aspects of the Internet of Things increasingly find their way into our daily lives. Users interact multimodally with their smartphones, and expectations regarding natural interaction have increased dramatically in the past years. Moreover, users have started to project these expectations onto all kinds of interfaces encountered in their daily lives. These expectations are not yet fully met by car manufacturers, since automotive development cycles are still much longer than those of the software industry. However, the clear trend is that manufacturers add technology to cars to deliver on their vision and promise of a safer drive. Multiple modalities are already available in today's dashboards, including haptic controllers, touch screens, 3D gestures, voice, secondary displays, and gaze.
In fact, car manufacturers are aiming for a personal assistant with a deep understanding of the car and the ability to meet driving-related demands and non-driving-related needs. For instance, such an assistant can naturally answer any question about the car and help schedule service when needed. It can find the preferred gas station along the route or, even better, plan a stop and ensure the driver arrives in time for a meeting. It understands that a perfect business meal involves more than finding a sponsored restaurant, and includes unbiased reviews, availability, budget, and trouble-free parking, and it notifies all invitees of the meeting time and location. Moreover, multimodality can be a source for fatigue detection. The main goal of multimodal interaction and driver assistance systems is to ensure that the driver can focus on the primary task of driving safely.
This is why the biggest innovations in today's cars have happened in the way we interact with integrated devices such as the infotainment system. For instance, voice-based interaction has been shown to be less distracting than interaction with a visual-haptic interface, but it is only one piece of how we interact multimodally in today's cars, shifting away from the GUI as the only means of interaction. This also requires additional effort to establish a mental model for the user: with a plethora of available modalities requiring multiple mental maps, learnability has decreased considerably. Multimodality may also help here to decrease distraction. In this special issue we will present the challenges and opportunities of multimodal interaction for reducing cognitive load and increasing learnability, as well as current research that has the potential to be employed in tomorrow's cars.
In this special issue, we especially invite researchers, scientists, and developers to submit contributions that are original, unpublished, and not under submission to any other journal, magazine, or conference. We expect at least 30% novel content. We are soliciting original research related to multimodal smart and interactive media technologies in areas including, but not limited to, the following:
* In-vehicle multimodal interaction concepts
* Multimodal Head-Up Displays (HUDs) and Augmented Reality (AR) concepts
* Reducing driver distraction and cognitive load and demand with multimodal interaction
* (pro-active) in-car personal assistant systems
* Driver assistance systems
* Information access (search, browsing, etc.) in the car
* Interfaces for navigation
* Text input and output while driving
* Biometrics and physiological sensors as a user interface component
* Multimodal affective intelligent interfaces
* Multimodal automotive user-interface frameworks and toolkits
* Naturalistic/field studies of multimodal automotive user interfaces
* Multimodal automotive user-interface standards
* Detecting and estimating user intentions employing multiple modalities
Guest Editors
=============
Dirk Schnelle-Walka, Harman International, Connected Car Division, Germany
Phil Cohen, Voicebox, USA
Bastian Pfleging, Ludwig-Maximilians-Universität München, Germany
Submission Instructions
=======================
1-page abstract submission: Feb 5, 2018
Invitation for full submission: March 15, 2018
Full Submission: April 28, 2018
Notification about acceptance: June 15, 2018
Final article submission: July 15, 2018
Tentative Publication: ~ Sept 2018
Companion website: https://sites.google.com/view/multimodalautomotive/
Authors are requested to follow instructions for manuscript submission to the Journal of Multimodal User Interfaces (http://www.springer.com/computer/hci/journal/12193) and to submit manuscripts at the following link: https://easychair.org/conferences/?conf=mmautomotive2018.