ISCApad #155 |
Saturday, May 14, 2011 by Chris Wellekens |
7-1 | IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING Special Issue on New Frontiers in Rich Transcription

A rich transcript is a transcript of a recorded event along with metadata to enrich the word stream with useful information such as identifying speakers, sentence units, proper nouns, speaker locations, etc. As the volume of online media increases and additional, layered content extraction technologies are built, rich transcription has become a critical foundation for delivering extracted content to down-stream applications such as spoken document retrieval, summarization, semantic navigation, speech data mining, and others. The special issue on 'New Frontiers in Rich Transcription' will focus on recent research on technologies that generate rich transcriptions automatically and on their applications.

The field of rich transcription draws on expertise from a variety of disciplines including: (a) signal acquisition (recording room design, microphone and camera design, sensor synchronization, etc.), (b) automatic content extraction and supporting technologies (signal processing, room acoustics compensation, spatial and multichannel audio processing, robust speech recognition, speaker recognition/diarization/tracking, spoken language understanding, speech recognition, multimodal information integration from audio and video sensors, etc.), (c) corpora infrastructure (meta-data standards, annotation procedures, etc.), and (d) performance benchmarking (ground truthing, evaluation metrics, etc.). In the end, rich transcriptions serve as an enabler of a variety of spoken document applications.

Many large international projects (e.g. the NIST RT evaluations) have been active in the area of rich transcription, engaging in efforts to extract useful content from a range of media such as broadcast news, conversational telephone speech, multi-party meeting recordings, and lecture recordings. The current special issue aims to be one of the first to bring together the enabling technologies that are critical in rich transcription of media with a large variety of speaker styles, spoken content and acoustic environments. This area has also led to new research directions recently, such as multimodal signal processing or automatic human behavior modeling.

The purpose of this special issue is to present overview papers, recent advances in Rich Transcription research as well as new ideas for the direction of the field. We encourage submissions about the following and other related topics:
* Robust Automatic Speech Recognition for Rich Transcription
* Speaker Diarization and Localization
* Speaker-attributed-Speech-to-Text
* Data collection and Annotation
* Benchmarking Metrology for Rich Transcription
* Natural language processing for Rich Transcription
* Multimodal Processing for Rich Transcription
* Online Methods for Rich Transcription
* Future Trends in Rich Transcription

Submissions must not have been previously published, with the exception that substantial extensions of conference papers are considered. Submissions must be made through IEEE's Manuscript Central at http://mc.manuscriptcentral.com/sps-ieee, selecting the special issue as the target.

Important Dates:
* EXTENDED Submission deadline: 1 September 2010
* Notification of acceptance: 1 January 2011
* Final manuscript due: 1 July 2011

For further information, please contact the guest editors:
* Gerald Friedland, fractor@icsi.berkeley.edu
* Jonathan Fiscus, jfiscus@nist.gov
* Thomas Hain, T.Hain@dcs.shef.ac.uk
* Sadaoki Furui, furui@cs.titech.ac.jp
| ||
7-2 | IEEE Transactions on Audio, Speech, and Language Processing: Special Issue on Deep Learning for Speech and Language Processing
| ||
7-3 | A New Journal on Speech Sciences and Call for Papers for Special Issue on Experimental Prosody

Dear fellow prosodists,

It is with special joy that Sandra Madureira and I announce here the launch of a new electronic journal which follows the principles of the Directory of Open Access Journals (DOAJ)*. The Journal of Speech Sciences (<http://www.journalofspeechsciences.org>) is sponsored by the Luso-Brazilian Association of Speech Sciences, an organisation founded in 2007, initially to help organise Speech Prosody 2008.

This journal proposes to occupy an ecological niche not covered by other journals where our community can publish, especially as regards its strength in linguistic and linguistically-related aspects of speech sciences research (but also speech pathology, new methodologies and techniques, etc.). Another reason for its special place in the speech research ecosystem is the optionality of language choice. Though English is the journal's main language, people wanting to disseminate their work in Portuguese or French can do so, provided that they add an extended abstract in English (a way to make their work more visible outside the luso- and francophone communities).

This journal was only made possible thanks to a great team working for the journal, and an exceptionally good editorial board. See the journal web page for details: <http://www.journalofspeechsciences.org>.

For its first issue we propose a special issue on Experimental Prosody. Please see the Call for Papers below and send your paper to us!

All the best,
Plinio (State Univ. of Campinas, Brazil) and Sandra (Catholic Univ. of São Paulo, Brazil)

* Official registration with the DOAJ and assignment of an ISSN can only take place after the first issue.
--
Call for Papers

The Journal of Speech Sciences (JoSS) is an open access journal which follows the principles of the Directory of Open Access Journals (DOAJ), meaning that its readers can freely read, download, copy, distribute, print, search, or link to the full texts of any article electronically published in the journal. It is accessible at <http://www.journalofspeechsciences.org>.
| ||
7-4 | Special issue of Signal Processing: LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION

The journal Signal Processing, published by Elsevier, is issuing a call for a special issue on latent variable models and source separation. Papers dealing with multi-talker ASR and noise-robust ASR using source separation techniques are highly welcome.
| ||
7-5 | IEEE Signal Processing Magazine: Special Issue on Fundamental Technologies in Modern Speech Recognition

Guest Editors:
* Sadaoki Furui, Tokyo Institute of Technology, Tokyo, Japan (furui@cs.titech.ac.jp)
* Li Deng, Microsoft Research, Redmond, USA (deng@microsoft.com)
* Mark Gales, University of Cambridge, Cambridge, UK (mjfg@eng.cam.ac.uk)
* Hermann Ney, RWTH Aachen University, Aachen, Germany (ney@cs.rwth-aachen.de)
* Keiichi Tokuda, Nagoya Institute of Technology, Nagoya, Japan (tokuda@nitech.ac.jp)

Recently, various statistical techniques that form the basis of the fundamental technologies underlying today's automatic speech recognition (ASR) research and applications have attracted renewed attention. These techniques have significantly contributed to progress in ASR, including speaker recognition, and their various applications. The purpose of this special issue is to bring together leading experts from various disciplines to explore the impact of statistical approaches on ASR. The special issue will provide a comprehensive overview of recent developments and open problems.

This Call for Papers invites researchers to contribute articles that have a broad appeal to the signal processing community. Such an article could be, for example, a tutorial on the fundamentals or a presentation of a state-of-the-art method.
Examples of the topics that could be addressed in the article include, but are not limited to:
* Supervised, unsupervised, and lightly supervised training/adaptation
* Speaker-adaptive and noise-adaptive training
* Discriminative training
* Large-margin based methods
* Model complexity optimization
* Dynamic Bayesian networks for various levels of speech modeling and decoding
* Deep belief networks and related deep learning techniques
* Sparse coding for speech feature extraction and modeling
* Feature parameter compensation/normalization
* Acoustic factorization
* Conditional random fields (CRF) for modeling and decoding
* Acoustic source separation by PCA and ICA
* De-reverberation
* Rapid language adaptation for multilingual speech recognition
* Weighted-finite-state-transducer (WFST) based decoding
* Uncertainty decoding
* Speaker recognition, especially text-independent speaker verification
* Statistical framework for human-computer dialogue modeling
* Automatic speech summarization and information extraction

Submission Procedure: Prospective authors should submit their white papers to the web submission system at http://mc.manuscriptcentral.com/spmag-ieee.

Schedule:
* White paper due: October 1, 2011
* Invitation notification: November 1, 2011
* Manuscript due: February 1, 2012
* Acceptance notification: April 1, 2012
* Final manuscript due: May 15, 2012
* Publication date: September 15, 2012
|