7-1 | SPECIAL ISSUE OF SPEECH COMMUNICATION on Sensing Emotion and Affect - Facing Realism in Speech Processing
Call for Papers
SPECIAL ISSUE OF SPEECH COMMUNICATION on
Sensing Emotion and Affect - Facing Realism in Speech Processing
http://www.elsevier.com/framework_products/promis_misc/specomsensingemotion.pdf
_______________________________________________________________________________
Human-machine and human-robot dialogues of the next generation will be dominated by natural speech, which is fully spontaneous and thus driven by emotion. Systems will be expected not only to cope with affect throughout actual speech recognition, but at the same time to detect emotional and related patterns, such as non-linguistic vocalizations (e.g. laughter) and further social signals, for appropriate reaction. In most cases, this analysis clearly must be made independently of the speaker and for all speech that 'comes in', rather than only for pre-selected and pre-segmented prototypical cases. In addition, as in any speech processing task, noise, coding, and blind speaker separation artefacts, together with transmission errors, need to be dealt with. To provide appropriate back-channelling and socially competent reactions that fit the speaker's emotional state in time, on-line and incremental processing will be among further concerns (a toy sketch of such incremental, confidence-gated processing follows the topic list below).
Once affective speech processing is applied in real life, novel issues such as standards, confidences, distributed analysis, speaker adaptation, and emotional profiling come up, next to appropriate interaction and system design. In this respect, the Interspeech Emotion Challenge 2009, organized by the guest editors, provided the first forum for comparison of results obtained under exactly the same realistic conditions. In this special issue, we will, on the one hand, summarise the findings from this challenge and, on the other hand, provide space for novel original contributions that further the analysis of natural, spontaneous, and thus emotional speech, whether by late-breaking technological advances, recent experience with realistic data, the revealing of black holes for future research endeavours, or a broad overview. Original, previously unpublished submissions are encouraged within the following scope of topics:
* Machine Analysis of Naturalistic Emotion in Speech and Text
* Sensing Affect in Realistic Environments (Vocal Expression, Nonlinguistic Vocalization)
* Social Interaction Analysis in Human Conversational Speech
* Affective and Socially-aware Speech User Interfaces
* Speaker Adaptation, Clustering, and Emotional Profiling
* Recognition of Group Emotion and Coping with Blind Speaker Separation Artefacts
* Novel Research Tools and Platforms for Emotion Recognition
* Confidence Measures and Out-of-Vocabulary Events in Emotion Recognition
* Noise, Echo, Coding, and Transmission Robustness in Emotion Recognition
* Effects of Prototyping on Performance
* On-line, Incremental, and Real-time Processing
* Distributed Emotion Recognition and Standardization Issues
* Corpora and Evaluation Tasks for Future Comparative Challenges
* Applications (Spoken Dialog Systems, Emotion-tolerant ASR, Call-Centers, Education, Gaming, Human-Robot Communication, Surveillance, etc.)
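As a concrete illustration of the on-line, incremental processing with confidences that the call above refers to, here is a minimal Python sketch of a frame-by-frame emotion classifier with a confidence gate. Everything in it (the two toy features, the frame length, the stand-in training data, the label set) is an illustrative assumption, not the Challenge's baseline:

import numpy as np
from sklearn.linear_model import LogisticRegression

LABELS = ["negative", "idle"]    # a two-class task, chosen for illustration
FRAME_LEN = 400                  # 25 ms at 16 kHz, an assumption

def extract_features(frame: np.ndarray) -> np.ndarray:
    """Toy frame-level descriptors: energy and zero-crossing rate."""
    energy = float(np.mean(frame ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0)
    return np.array([energy, zcr])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))          # stand-in training features
y = rng.integers(0, 2, size=200)       # stand-in training labels
clf = LogisticRegression().fit(X, y)

def stream_decisions(signal: np.ndarray, threshold: float = 0.7):
    """Emit one decision per frame as audio arrives; defer when unsure."""
    for start in range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN):
        feats = extract_features(signal[start:start + FRAME_LEN])
        proba = clf.predict_proba(feats.reshape(1, -1))[0]
        k = int(np.argmax(proba))
        # Confidence gating: commit to a label only when the posterior is high.
        yield (LABELS[k] if proba[k] >= threshold else "undecided", float(proba[k]))

for label, confidence in stream_decisions(rng.normal(size=16000)):
    pass  # a dialogue manager would consume these incremental decisions

The point of the sketch is the shape of the loop: decisions are produced while speech is still coming in, each tagged with a confidence that downstream components can use for back-channelling or deferral.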
Composition and Review Procedures
_______________________________________________________________________________
This Special Issue of Speech Communication on Sensing Emotion and Affect - Facing Realism in Speech Processing will consist of papers on data-based evaluations and papers on applications. The balance between these will be adjusted to maximize the issue's impact. Submissions will undergo the normal review process.
Guest Editors
_______________________________________________________________________________
Björn Schuller, Technische Universität München, Germany
Stefan Steidl, Friedrich-Alexander-University, Germany
Anton Batliner, Friedrich-Alexander-University, Germany
Important Dates
_______________________________________________________________________________
Submission Deadline April 1st, 2010
First Notification July 1st, 2010
Revisions Ready September 1st, 2010
Final Papers Ready November 1st, 2010
Tentative Publication Date December 1st, 2010
Submission Procedure
_______________________________________________________________________________
Prospective authors should follow the regular guidelines of the Speech Communication Journal for electronic submission (http://ees.elsevier.com/specom/default.asp). During submission, authors must select 'Special Issue: Sensing Emotion' when they reach the 'Article Type' step.
__________________________________________
Dr. Björn Schuller
Senior Researcher and Lecturer
LIMSI-CNRS
BP 133, 91403 Orsay cedex
France
Technische Universität München
Institute for Human-Machine Communication
D-80333 München, Germany
schuller@IEEE.org
|
7-2 | EURASIP Journal on Advances in Signal Processing Special Issue on Emotion and Mental State Recognition from Speech
EURASIP Journal on Advances in Signal Processing Special Issue on Emotion and Mental State Recognition from Speech
http://www.hindawi.com/journals/asp/si/emsr.html
_____________________________________________________
As research in speech processing has matured, attention has shifted from linguistic-related applications such as speech recognition towards paralinguistic speech processing problems, in particular the recognition of speaker identity, language, emotion, gender, and age. Determination of emotion or mental state is a particularly challenging problem, in view of the significant variability in its expression posed by linguistic, contextual, and speaker-specific characteristics within speech. Some of the key research problems addressed to date include isolating emotion-specific information in the speech signal, extracting suitable features, forming reduced-dimension feature sets, developing machine learning methods applicable to the task, reducing feature variability due to speaker and linguistic content, comparing and evaluating diverse methods, robustness, and constructing suitable databases. Automatic detection of other types of mental state that share some characteristics with emotion is also now being explored, for example depression, cognitive load, and 'cognitive epistemic' states such as interest or skepticism.
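The key research problems enumerated above (feature extraction, reduced-dimension feature sets, machine learning methods) are commonly combined into a single recognition pipeline. A minimal sketch of such a pipeline in Python follows; the feature dimensionality, the PCA size, the SVM classifier, and the random stand-in data are all illustrative assumptions rather than anything prescribed by this call:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_utterances, n_features = 300, 384               # e.g. a large prosodic/spectral set
X = rng.normal(size=(n_utterances, n_features))   # stand-in utterance features
y = rng.integers(0, 4, size=n_utterances)         # stand-in emotion labels

# Normalization addresses part of the speaker/channel variability problem
# (a per-speaker z-score would need speaker labels; a global one is used here),
# PCA yields the reduced-dimension feature set, and the SVM is one of many
# applicable machine learning methods.
model = make_pipeline(StandardScaler(),
                      PCA(n_components=40),
                      SVC(probability=True))
model.fit(X, y)
print(model.predict_proba(X[:1]))                 # posterior over emotion classes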
Topics of interest in this special issue include, but are not limited to:
* Signal processing methods for acoustic feature extraction in emotion recognition
* Robustness issues in emotion classification, including speaker and speaker group normalization and reduction of mismatch due to coding, noise, channel, and transmission effects
* Applications of prosodic and temporal feature modeling in emotion recognition
* Novel pattern recognition techniques for emotion recognition
* Automatic detection of depression or psychiatric disorders from speech
* Methods for measuring stress, emotion-related indicators, or cognitive load from speech
* Studies relating speech production or perception to emotion and mental state recognition
* Recognition of nonprototypical spontaneous and naturalistic emotion in speech
* New methods for multimodal emotion recognition, where nonverbal speech content has a central role
* Emotional speech synthesis research with clear implications for emotion recognition
* Emerging research topics in recognition of emotion and mental state from speech
* Novel emotion recognition systems and applications
* Applications of emotion modeling to other related areas, for example, emotion-tolerant automatic speech recognition and recognition of nonlinguistic vocalizations
Before submission, authors should carefully read the journal's Author Guidelines, located at http://www.hindawi.com/journals/asp/guidelines.html. Prospective authors should submit an electronic copy of their complete manuscript through the journal Manuscript Tracking System at http://mts.hindawi.com/ according to the following timetable:
_____________________________________________________
Manuscript Due August 1, 2010
First Round of Reviews November 1, 2010
Publication Date February 1, 2011
_____________________________________________________
Lead Guest Editor (for correspondence)
Julien Epps, The University of New South Wales, Australia; National ICT Australia, Australia
Guest Editors
Roddy Cowie, Queen's University Belfast, UK
Shrikanth Narayanan, University of Southern California, USA
Björn Schuller, Technische Universität München, Germany
Jianhua Tao, Chinese Academy of Sciences, China
|
7-3 | Special Issue on Speech and Language Processing of Children's Speech for Child-machine Interaction Applications
ACM Transactions on Speech and Language Processing Special Issue on Speech and Language Processing of Children's Speech for Child-machine Interaction Applications
The state of the art in automatic speech recognition (ASR) technology is suitable for a broad range of interactive applications. Although children represent an important user segment for speech processing technologies, the acoustic and linguistic variability present in children's speech poses additional challenges for designing successful interactive systems for children.
Acoustic and linguistic characteristics of children's speech differ widely from those of adults, and children's voice interaction with computers opens challenging research issues concerning how to develop effective acoustic, language and pronunciation models for reliable recognition of children's speech. Furthermore, the behavior of children interacting with a computer also differs from that of adults. When using a conversational interface, for example, children follow a different language strategy for initiating and guiding conversational exchanges, and may adopt different linguistic registers than adults do.
In order to develop reliable voice-interactive systems, further studies are needed to better understand the characteristics of children's speech and the different aspects of speech-based interaction, including the role of speech in multimodal interfaces. The development of pilot systems for a broad range of applications is also important, both to provide experimental evidence of the degree of progress in ASR technologies and to focus research on application-specific problems that emerge when systems are used in realistic operating environments.
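One widely used technique for bridging the acoustic mismatch described above, not prescribed by this call but common in children's ASR, is vocal tract length normalization (VTLN): children's shorter vocal tracts shift formants upward, and a frequency warp applied before filterbank analysis can compensate. A minimal Python sketch of the piecewise-linear warping function, with an assumed 8 kHz Nyquist frequency and cutoff:

def vtln_warp(f_hz: float, alpha: float,
              f_nyq: float = 8000.0, f_cut: float = 7000.0) -> float:
    """Piecewise-linear VTLN warp: scale by alpha below f_cut, then map the
    remaining band linearly so that the Nyquist frequency stays fixed."""
    if f_hz <= f_cut:
        return alpha * f_hz
    # Linear segment from (f_cut, alpha * f_cut) up to (f_nyq, f_nyq).
    slope = (f_nyq - alpha * f_cut) / (f_nyq - f_cut)
    return alpha * f_cut + slope * (f_hz - f_cut)

# alpha < 1 compresses a child's spectrum toward adult-trained acoustic models;
# in practice alpha is searched per speaker, e.g. over 0.80-1.20.
print(vtln_warp(1000.0, alpha=0.88), vtln_warp(7500.0, alpha=0.88))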
We invite prospective authors to submit papers describing original and previously unpublished work in the following broad research areas: analysis of children's speech, core technologies for ASR of children's speech, conversational interfaces, multimodal child-machine interaction and computer instructional systems for children. Specific topics of interest include, but are not limited to:
- Acoustic and linguistic analysis of children's speech
- Discourse analysis of spoken language in child-machine interaction
- Intra- and inter-speaker variability in children's speech
- Age-dependent characteristics of spoken language
- Acoustic, language and pronunciation modeling in ASR for children
- Spoken dialogue systems
- Multimodal speech-based child-machine interaction
- Computer assisted language acquisition and language learning
- Tools for children with special needs (speech disorders, autism, dyslexia, etc.)
Papers should have a major focus on analysis and/or acoustic and linguistic processing of children's speech. Analysis studies should be clearly related to technology development issues and implications should be extensively discussed in the papers. Manuscripts will be peer reviewed according to the standard ACM TSLP process.
Submission Procedure
Authors should follow the ACM TSLP manuscript preparation guidelines described on the journal web site (http://tslp.acm.org) and submit an electronic copy of their complete manuscript through the journal manuscript submission site (http://mc.manuscriptcentral.com/acm/tslp). Authors are required to specify that their submission is intended for this Special Issue by including, on the first page of the manuscript and in the field 'Author's Cover Letter', the note 'Submitted for the Special Issue on Speech and Language Processing of Children's Speech for Child-machine Interaction Applications'. Without this indication, your submission cannot be considered for this Special Issue.
Schedule
Submission deadline: May 12, 2010
Notification of acceptance: November 1, 2010
Final manuscript due: December 15, 2010
Guest Editors
Alexandros Potamianos, Technical University of Crete, Greece (potam@telecom.tuc.gr)
Diego Giuliani, Fondazione Bruno Kessler, Italy (giuliani@fbk.eu)
Shrikanth Narayanan, University of Southern California, USA (shri@sipi.usc.edu)
Kay Berkling, Inline Internet Online GmbH, Karlsruhe, Germany (Kay@Berkling.com)
|
7-4 | ACM TSLP - Special Issue: Call for Papers: “Machine Learning for Robust and Adaptive Spoken Dialogue Systems”
ACM TSLP - Special Issue: Call for Papers: “Machine Learning for Robust and Adaptive Spoken Dialogue Systems”
* Submission Deadline 1 July 2010 * http://tslp.acm.org/specialissues.html
During the last decade, research in the field of Spoken Dialogue Systems (SDS) has experienced increasing growth, and new applications include interactive search, tutoring and “troubleshooting” systems, games, and health agents. The design and optimization of such SDS requires the development of dialogue strategies which can robustly handle uncertainty, and which can automatically adapt to different types of users (novice/expert, youth/senior) and noise conditions (room/street). New statistical learning techniques are also emerging for training and optimizing speech recognition, parsing / language understanding, generation, and synthesis for robust and adaptive spoken dialogue systems.
Automatic learning of adaptive, optimal dialogue strategies is currently a leading domain of research. Among machine learning techniques for spoken dialogue strategy optimization, reinforcement learning using Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs) has become a particular focus. One concern for such approaches is the development of appropriate dialogue corpora for training and testing. However, the small amount of data generally available for learning and testing dialogue strategies does not contain enough information to explore the whole space of dialogue states (and of strategies). Therefore dialogue simulation is most often required to expand existing datasets, and man-machine spoken dialogue stochastic modelling and simulation has become a research field in its own right. User simulation for different types of users is a particular new focus of interest.
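To make the approach sketched in the previous paragraph concrete, here is a minimal, self-contained Python example: a toy slot-filling dialogue cast as an MDP, a stochastic simulated user standing in for scarce corpus data, and tabular Q-learning optimizing the strategy. The state abstraction, rewards, and 20% simulated ASR error rate are illustrative assumptions only:

import random

N_SLOTS, ACTIONS = 2, ("ask_slot", "confirm", "close")
GAMMA, ALPHA, EPS = 0.95, 0.2, 0.1
Q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}

def simulated_user(state, action):
    """Stochastic user/environment model: returns (next_state, reward, done)."""
    if action == "ask_slot" and state < N_SLOTS:
        understood = random.random() > 0.2        # assumed 20% ASR error rate
        return (state + 1 if understood else state, -1, False)
    if action == "close":
        return (state, 20 if state == N_SLOTS else -10, True)
    return (state, -1, False)                     # confirming costs a turn

def policy(state):
    if random.random() < EPS:                     # epsilon-greedy exploration
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

for _ in range(5000):                             # 5000 simulated dialogues
    state, done = 0, False
    while not done:
        a = policy(state)
        nxt, r, done = simulated_user(state, a)
        best_next = 0.0 if done else max(Q[(nxt, b)] for b in ACTIONS)
        Q[(state, a)] += ALPHA * (r + GAMMA * best_next - Q[(state, a)])  # Q-learning update
        state = nxt

# The learned greedy strategy: keep asking until all slots are filled, then close.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_SLOTS + 1)})

POMDP-based systems replace the fully observed slot count with a belief distribution over user goals, but the training loop against a simulated user has the same structure.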
Specific topics of interest include, but are not limited to:
• Robust and adaptive dialogue strategies
• User simulation techniques for robust and adaptive strategy learning and testing
• Rapid adaptation methods
• Modelling uncertainty about user goals
• Modelling the evolution of user goals over time
• Partially Observable MDPs in dialogue strategy optimization
• Methods for cross-domain optimization of dialogue strategies
• Statistical spoken language understanding in dialogue systems
• Machine learning and context-sensitive speech recognition
• Learning for adaptive Natural Language Generation in dialogue
• Machine learning for adaptive speech synthesis (emphasis, prosody, etc.)
• Corpora and annotation for machine learning approaches to SDS
• Approaches to generalising limited corpus data to build user models and user simulations
• Evaluation of adaptivity and robustness in statistical approaches to SDS and user simulation
Submission Procedure: Authors should follow the ACM TSLP manuscript preparation guidelines described on the journal web site http://tslp.acm.org and submit an electronic copy of their complete manuscript through the journal manuscript submission site http://mc.manuscriptcentral.com/acm/tslp. Authors are required to specify that their submission is intended for this Special Issue by including on the first page of the manuscript and in the field “Author’s Cover Letter” the note “Submitted for the Special Issue of Speech and Language Processing on Machine Learning for Robust and Adaptive Spoken Dialogue Systems”. Without this indication, your submission cannot be considered for this Special Issue.
Schedule:
• Submission deadline: 1 July 2010
• Notification of acceptance: 1 October 2010
• Final manuscript due: 15 November 2010
Guest Editors: Oliver Lemon, Heriot-Watt University, Interaction Lab, School of Mathematics and Computer Science, Edinburgh, UK. Olivier Pietquin, Ecole Supérieure d’Électricité (Supelec), Metz, France.
http://tslp.acm.org/cfp/acmtslp-cfp2010-02.pdf
|
7-5 | Special issue on Content based Multimedia Indexing in Multimedia Tools and Applications Journal
Special Issue on Content-Based Multimedia Indexing CBMI’2010 Second call for submissions
This call is related to the CBMI’2010 workshop but is open to all contributions on a relevant topic, whether submitted at CBMI’2010 or not.
This special issue of the Multimedia Tools and Applications Journal will contain selected papers, after resubmission and review, from the 8th International Workshop on Content-Based Multimedia Indexing (CBMI’2010). Following seven successful previous events (Toulouse 1999, Brescia 2001, Rennes 2003, Riga 2005, Bordeaux 2007, London 2008, Chania 2009), the 2010 International Workshop on Content-Based Multimedia Indexing will be held on June 23-25, 2010 in Grenoble, France, organized by the Laboratoire d'Informatique de Grenoble (http://www.liglab.fr/). CBMI 2010 aims at bringing together the various communities involved in the different aspects of content-based multimedia indexing, such as image processing and information retrieval, with current industrial trends and developments. Research in multimedia indexing covers a wide spectrum of topics in content analysis, content description, content adaptation and content retrieval. Hence, topics of interest for the Special Issue include, but are not limited to:
- Multimedia indexing and retrieval (image, audio, video, text)
- Matching and similarity search
- Construction of high level indices
- Multimedia content extraction
- Identification and tracking of semantic regions in scenes
- Multi-modal and cross-modal indexing
- Content-based search
- Multimedia data mining
- Metadata generation, coding and transformation
- Large scale multimedia database management
- Summarisation, browsing and organization of multimedia content
- Presentation and visualization tools
- User interaction and relevance feedback
- Personalization and content adaptation
Paper Format
Papers must be typed in a font size no smaller than 10 pt, and presented in single-column format with double line spacing, on one side of A4 paper. All pages should be numbered. The manuscript should be formatted according to the requirements of the journal. Detailed information about the journal, including an author guide and detailed formatting information, is available at: http://www.springer.com/computer/information+systems/journal/11042.
Paper Submission
All papers must be submitted through the journal's Editorial Manager system: http://mtap.edmgr.com. When uploading your paper, please ensure that your manuscript is marked as being for this special issue.
Important Dates
Manuscript due: 19 April 2010
Notification of acceptance: 1 July 2010
Publication date: January 2011
Guest Editors
Dr. Georges Quénot, LIG UMR 5217, INPG-INRIA-University Joseph Fourier-UPMF-CNRS, Campus Scientifique, BP 53, 38041 Grenoble Cedex 9, France, e-mail: Georges.Quenot@imag.fr
Prof. Jenny Benois-Pineau, University of Bordeaux 1, LABRI UMR 5800, Universities Bordeaux-CNRS, e-mail: jenny.benois@labri.fr
Prof. Régine André-Obrecht, University Paul Sabatier, Toulouse, IRIT UMR UPS/CNRS/UT1/UTM, France, e-mail: obrecht@irit.fr
http://www.springer.com/cda/content/document/cda_downloaddocument/CFP-11042-20091003.pdf
|
7-6 | New book series: Frontiers in Mathematical Linguistics and Language Theory.
New book series: Mathematics, Computing, Language, and Life: Frontiers in Mathematical Linguistics and Language Theory, to be published by Imperial College Press starting in 2010. Editor: Carlos Martin-Vide (carlos.martin@urv.cat).
|
7-7 | CfP: Speech Recognition in Adverse Conditions, special issue of Language and Cognitive Processes
Call for papers: Special issue on Speech Recognition in Adverse Conditions, in Language and Cognitive Processes / Cognitive Neuroscience of Language
Language and Cognitive Processes, jointly with Cognitive Neuroscience of Language, is launching a call for submissions for a special issue on:
Speech Recognition in Adverse Conditions
This special issue is a unique opportunity to promote the development of a unifying thematic framework for understanding the perceptual, cognitive and neuro-physiological mechanisms underpinning speech recognition in adverse conditions. In particular, we seek papers focusing on the recognition of acoustically degraded speech (e.g., speech in noise, “accented” or motor-disordered speech), speech recognition under cognitive load (e.g., divided attention, memory load) and speech recognition by theoretically relevant populations (e.g., children, elderly or non-native listeners). We welcome both cognitive and neuroscientific perspectives on the topic, reporting strong and original empirical data firmly grounded in theory.
Guest editors: Sven Mattys, Ann Bradlow, Matt Davis, and Sophie Scott.
Submission deadline: 30 November 2010.
Please see URL below for further details:
http://www.tandf.co.uk/journals/cfp/PLCPcfp2.pdf
|
7-8 | Special Issue of Speech Communication on Advanced Voice Function Assessment
Speech Communication
Call for Papers for the Special Issue on “Advanced Voice Function Assessment”
Every day we use our voice to communicate and to express emotions and feelings. Voice is also an important instrument for many professionals, such as teachers, singers, actors, lawyers, managers, and salesmen. The modern style of life has increased the risk of experiencing voice alterations, and it is believed that around 19% of the population suffer or have suffered dysphonic voicing due to some kind of disease or dysfunction. There is therefore a need for new and objective ways to evaluate the quality of voice, its connection with vocal fold activity, and the complex interaction between the larynx and the voluntary movements of the articulators (i.e. mouth, tongue, velum, jaw, etc.). Diagnosis of voice disorders, the screening of vocal and voice diseases (and particularly their early detection), the objective determination of vocal function alterations, and the evaluation of surgical as well as pharmacological treatments and rehabilitation are considered major goals of voice function assessment. Applications of voice function assessment also include control of voice quality for voice professionals such as teachers, singers, and speakers, as well as the evaluation of stress, vocal fatigue, loading, etc. Although the state of the art reports significant achievements in understanding the voice production mechanism and in assessing voice quality, there is a continuous need for improving the existing models of the normal and pathological voice source to analyse healthy and pathological voices (a toy sketch of two classical objective measures follows the topic list below).
This special issue aims at offering an interdisciplinary platform for presenting new knowledge in the field of models and analysis of voice signals, in conjunction with videoendoscopic images, with applications in occupational, pathological, and oesophageal voices. The scope of the special issue includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies. Original, previously unpublished submissions are encouraged within the following scope of topics:
- Databases of voice disorders
- Robust analysis of pathological and oesophageal voices
- Inverse filtering for voice function assessment
- Automatic detection of voice disorders from voice and speech
- Automatic assessment and classification of voice quality
- Multi-modal analysis of disordered speech (voice, speech, vocal folds images using videolaryngoscopy, videokymography, fMRI and other emerging techniques)
- New strategies for parameterization and modelling of normal and pathological voices (e.g. biomechanical-based parameters, chaos modelling, etc.)
- Signal processing to support remote diagnosis
- Assessment of voice quality in rehabilitation
- Speech enhancement for pathological and oesophageal speech
- Technical aids and hands-free devices: vocal prostheses and aids for the disabled
- Non-speech vocal emissions (e.g. infant cry, cough and snoring)
- Relationship between speech and neurological dysfunctions (e.g. epilepsy, autism, schizophrenia, stress, etc.)
- Computer-based diagnostic and training systems for speech dysfunctions
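As the toy sketch promised above, here is a minimal Python example of two classical objective voice-quality measures, local jitter and local shimmer, computed from per-cycle periods and peak amplitudes. Reliable cycle (pitch-mark) extraction from real pathological voices is itself a research problem within this call's scope; the cycles below are synthetic stand-ins:

import numpy as np

def local_jitter(periods: np.ndarray) -> float:
    """Mean absolute difference of consecutive periods over the mean period."""
    return float(np.mean(np.abs(np.diff(periods))) / np.mean(periods))

def local_shimmer(amplitudes: np.ndarray) -> float:
    """Mean absolute difference of consecutive peak amplitudes over the mean."""
    return float(np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes))

rng = np.random.default_rng(0)
# Synthetic sustained vowel at ~100 Hz: 10 ms cycles with small perturbations.
periods = 0.010 * (1 + 0.01 * rng.normal(size=100))    # seconds per cycle
amplitudes = 1.0 * (1 + 0.05 * rng.normal(size=100))   # peak amplitude per cycle

print(f"jitter (local):  {100 * local_jitter(periods):.2f} %")
print(f"shimmer (local): {100 * local_shimmer(amplitudes):.2f} %")
# Values elevated relative to normative thresholds are one cue exploited by
# automatic detection of voice disorders.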
Composition and Review Procedures
The emphasis of this special issue is on both basic and applied research related to the evaluation of voice quality and diagnosis schemes, as well as on the results of voice treatments. The submissions received for this Special Issue of Speech Communication on Advanced Voice Function Assessment will undergo the normal review process.
Guest Editors
• Juan I. Godino-Llorente, Universidad Politécnica de Madrid, Spain, igodino@ics.upm.es
• Yannis Stylianou, University of Crete, Greece, yannis@csd.uoc.gr
• Philippe H. DeJonckere, University Medical Center Utrecht, The Netherlands, ph.dejonckere@umcutrecht.nl
• Pedro Gómez-Vilda, Universidad Politécnica de Madrid, Spain, pedro@pino.datsi.fi.upm.es
Important Dates
Deadline for submission: June 15, 2010
First notification: September 15, 2010
Revisions ready: October 30, 2010
Final notification: November 30, 2010
Final papers ready: December 30, 2010
Tentative publication date: January 30, 2011
Submission Procedure
Prospective authors should follow the regular guidelines of the Speech Communication Journal for electronic submission (http://ees.elsevier.com/specom). During submission, authors must select the Section “Special Issue Paper”, not “Regular Paper”, and reference the title of the special issue (Special Issue on Advanced Voice Function Assessment) in the “Comments” page, along with any other relevant information.
|
7-9 | Special Issue on Deep Learning for Speech and Language Processing, IEEE Trans. ASLP
IEEE Transactions on Audio, Speech, and Language Processing
IEEE Signal Processing Society
Special Issue on Deep Learning for Speech and Language Processing
Over the past 25 years or so, speech recognition technology has been dominated largely by hidden Markov models (HMMs). Significant technological success has been achieved using complex and carefully engineered variants of HMMs. Next generation technologies require solutions to technical challenges presented by diversified deployment environments. These challenges arise from the many types of variability present in the speech signal itself. Overcoming these challenges is likely to require “deep” architectures with efficient and effective learning algorithms.
There are three main characteristics in the deep learning paradigm: 1) layered architecture; 2) generative modeling at the lower layer(s); and 3) unsupervised learning at the lower layer(s) in general (a toy sketch of this layer-wise recipe follows the topic list below). For speech and language processing and related sequential pattern recognition applications, some attempts have been made in the past to develop layered computational architectures that are “deeper” than conventional HMMs, such as hierarchical HMMs, hierarchical point-process models, hidden dynamic models, layered multilayer perceptrons, tandem-architecture neural-net feature extraction, multi-level detection-based architectures, deep belief networks, hierarchical conditional random fields, and deep-structured conditional random fields. While positive recognition results have been reported, there has been a conspicuous lack of systematic learning techniques and theoretical guidance to facilitate the development of these deep architectures. Recent communication between machine learning researchers and speech and language processing researchers has revealed a wealth of research results pertaining to insightful applications of deep learning to some classical speech recognition and language processing problems. These results can potentially further advance the state of the art in speech and language processing.
In light of the substantial research activity already taking place in this exciting space, and its importance, we invite papers describing various aspects of deep learning and related techniques/architectures as well as their successful applications to speech and language processing. Submissions must not have been previously published, with the exception that substantial extensions of conference or workshop papers will be considered. The submissions must have a specific connection to audio, speech, and/or language processing. The topics of particular interest include, but are not limited to:
• Generative models and discriminative statistical or neural models with deep structure
• Supervised, semi-supervised, and unsupervised learning with deep structure
• Representing sequential patterns in statistical or neural models
• Robustness issues in deep learning
• Scalability issues in deep learning
• Optimization techniques in deep learning
• Deep learning of relationships between the linguistic hierarchy and data-driven speech units
• Deep learning models and techniques in applications such as (but not limited to) isolated or continuous speech recognition, phonetic recognition, music signal processing, language modeling, and language identification
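As the toy sketch of the layer-wise recipe referred to above, the following plain-NumPy example builds a layered architecture by greedy unsupervised training of the lower layers (tied-weight autoencoders are used here for brevity, in place of the RBMs of a deep belief network) and then trains a supervised classifier on the top representation. Layer sizes, learning rate, and the random stand-in data are all assumptions:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder_layer(X, n_hidden, lr=0.1, epochs=50):
    """One unsupervised layer: minimize squared reconstruction error."""
    n = X.shape[1]
    W = 0.1 * rng.normal(size=(n, n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(n)
    for _ in range(epochs):
        H = sigmoid(X @ W + b)        # encode
        R = H @ W.T + c               # decode with tied weights
        err = R - X                   # reconstruction error
        dH = (err @ W) * H * (1 - H)  # backprop through the encoder
        W -= lr * (X.T @ dH + err.T @ H) / len(X)
        b -= lr * dH.mean(axis=0)
        c -= lr * err.mean(axis=0)
    return W, b

X = rng.normal(size=(500, 64))        # stand-in "speech feature" vectors
y = rng.integers(0, 3, size=500)      # stand-in class labels

# Greedy layer-wise unsupervised pretraining of two lower layers.
reps, layers = X, []
for n_hidden in (32, 16):
    W, b = train_autoencoder_layer(reps, n_hidden)
    layers.append((W, b))
    reps = sigmoid(reps @ W + b)      # feed the learned codes to the next layer

# Supervised learning only at the top, on the learned deep representation;
# full fine-tuning would backpropagate through all layers as well.
clf = LogisticRegression(max_iter=500).fit(reps, y)
print("train accuracy:", clf.score(reps, y))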
The authors are required to follow the Author’s Guide for manuscript submission to the IEEE Transactions on Audio, Speech, and Language Processing at http://www.signalprocessingsociety.org/publications/periodicals/taslp/taslp-author-information
Submission deadline: September 15, 2010
Notification of acceptance: March 15, 2011
Final manuscripts due: May 15, 2011
Date of publication: August 2011
For further information, please contact the guest editors:
Dong Yu (dongyu@microsoft.com)
Geoffrey Hinton (hinton@cs.toronto.edu)
Nelson Morgan (morgan@ICSI.Berkeley.edu)
Jen-Tzung Chien (jtchien@mail.ncku.edu.tw)
Shigeki Sagayama (sagayama@hil.t.u-tokyo.ac.jp)
|
7-11 | IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING Special Issue on New Frontiers in Rich Transcription
IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING
Special Issue on New Frontiers in Rich Transcription
A rich transcript is a transcript of a recorded event along with
metadata to enrich the word stream with useful information such as
identifying speakers, sentence units, proper nouns, speaker locations,
etc. As the volume of online media increases and additional, layered
content extraction technologies are built, rich transcription has
become a critical foundation for delivering extracted content to
down-stream applications such as spoken document retrieval,
summarization, semantic navigation, speech data mining, and others.
The special issue on 'New Frontiers in Rich Transcription' will focus
on the recent research on technologies that generate rich
transcriptions automatically and on its applications. The field of
rich transcription draws on expertise from a variety of disciplines
including: (a) signal acquisition (recording room design, microphone
and camera design, sensor synchronization, etc.), (b) automatic
content extraction and supporting technologies (signal processing,
room acoustics compensation, spatial and multichannel audio
processing, robust speech recognition, speaker
recognition/diarization/tracking, spoken language understanding,
speech recognition, multimodal information integration from audio and
video sensors, etc.), (c) corpora infrastructure (meta-data
standards, annotations procedures, etc.), and (d) performance
benchmarking (ground truthing, evaluation metrics, etc.). In the end,
rich transcriptions serve as enablers of a variety of spoken document
applications.
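One concrete example of an enabling technology from list (b) above,
sketched here in Python under toy assumptions, is speaker change
detection with the Bayesian Information Criterion (BIC), a classic
building block of speaker diarization (Chen and Gopalakrishnan, 1998).
Real systems would apply it to MFCC streams; random stand-in features
are used below:

import numpy as np

def delta_bic(X1: np.ndarray, X2: np.ndarray, lam: float = 1.0) -> float:
    """Positive values favour modelling X1 and X2 as two speakers
    (two full-covariance Gaussians) rather than one."""
    X = np.vstack([X1, X2])
    n, d = X.shape

    def logdet_cov(Z):
        return np.linalg.slogdet(np.cov(Z, rowvar=False))[1]

    penalty = 0.5 * lam * (d + d * (d + 1) / 2) * np.log(n)
    return 0.5 * (n * logdet_cov(X)
                  - len(X1) * logdet_cov(X1)
                  - len(X2) * logdet_cov(X2)) - penalty

rng = np.random.default_rng(0)
same = delta_bic(rng.normal(size=(200, 12)), rng.normal(size=(200, 12)))
diff = delta_bic(rng.normal(size=(200, 12)), 3 + rng.normal(size=(200, 12)))
print(f"same speaker: {same:.1f}   different speakers: {diff:.1f}")
# A diarization front end slides this test along the feature stream and
# hypothesizes a speaker change wherever the criterion peaks above zero.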
Many large international projects (e.g. the NIST RT evaluations) have
been active in the area of rich transcription, engaging in efforts of
extracting useful content from a range of media such as broadcast
news, conversational telephone speech, multi-party meeting recordings,
and lecture recordings. The current special issue aims to be one of the
first in bringing together the enabling technologies that are critical
in rich transcription of media with a large variety of speaker styles,
spoken content and acoustic environments. This area has also led to
new research directions recently, such as multimodal signal processing
or automatic human behavior modeling.
The purpose of this special issue is to present overview papers,
recent advances in Rich Transcription research as well as new ideas
for the direction of the field. We encourage submissions about the
following and other related topics:
* Robust Automatic Speech Recognition for Rich Transcription
* Speaker Diarization and Localization
* Speaker-attributed-Speech-to-Text
* Data collection and Annotation
* Benchmarking Metrology for Rich Transcription
* Natural language processing for Rich Transcription
* Multimodal Processing for Rich Transcription
* Online Methods for Rich Transcription
* Future Trends in Rich Transcription
Submissions must not have been previously published, with the
exception that substantial extensions of conference papers will be
considered.
Submissions must be made through IEEE's Manuscript Central at
http://mc.manuscriptcentral.com/sps-ieee, selecting this special
issue as the target.
Important Dates:
EXTENDED Submission deadline: 1 September 2010
Notification of acceptance: 1 January 2011
Final manuscript due: 1 July 2011
For further information, please contact the guest editors:
Gerald Friedland, fractor@icsi.berkeley.edu
Jonathan Fiscus, jfiscus@nist.gov
Thomas Hain, T.Hain@dcs.shef.ac.uk
Sadaoki Furui, furui@cs.titech.ac.jp
|