ISCApad #156, Tuesday, June 21, 2011, by Chris Wellekens
7-1 | IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING Special Issue on New Frontiers in Rich Transcription

A rich transcript is a transcript of a recorded event along with metadata that enrich the word stream with useful information such as speaker identities, sentence units, proper nouns, speaker locations, etc. As the volume of online media increases and additional, layered content-extraction technologies are built, rich transcription has become a critical foundation for delivering extracted content to downstream applications such as spoken document retrieval, summarization, semantic navigation, speech data mining, and others. The special issue on 'New Frontiers in Rich Transcription' will focus on recent research on technologies that generate rich transcriptions automatically and on their applications. The field of rich transcription draws on expertise from a variety of disciplines, including: (a) signal acquisition (recording room design, microphone and camera design, sensor synchronization, etc.), (b) automatic content extraction and supporting technologies (signal processing, room acoustics compensation, spatial and multichannel audio processing, robust speech recognition, speaker recognition/diarization/tracking, spoken language understanding, multimodal information integration from audio and video sensors, etc.), (c) corpora infrastructure (metadata standards, annotation procedures, etc.), and (d) performance benchmarking (ground truthing, evaluation metrics, etc.). In the end, rich transcriptions serve as an enabler of a variety of spoken document applications. Many large international projects (e.g. the NIST RT evaluations) have been active in the area of rich transcription, extracting useful content from a range of media such as broadcast news, conversational telephone speech, multi-party meeting recordings, and lecture recordings. The current special issue aims to be one of the first to bring together the enabling technologies that are critical to rich transcription of media with a large variety of speaker styles, spoken content, and acoustic environments. This area has also recently led to new research directions, such as multimodal signal processing and automatic human behavior modeling. The purpose of this special issue is to present overview papers and recent advances in rich transcription research, as well as new ideas for the direction of the field.

We encourage submissions on the following and other related topics:
* Robust Automatic Speech Recognition for Rich Transcription
* Speaker Diarization and Localization
* Speaker-Attributed Speech-to-Text
* Data Collection and Annotation
* Benchmarking Metrology for Rich Transcription
* Natural Language Processing for Rich Transcription
* Multimodal Processing for Rich Transcription
* Online Methods for Rich Transcription
* Future Trends in Rich Transcription

Submissions must not have been previously published, with the exception that substantial extensions of conference papers will be considered. Submissions must be made through IEEE's Manuscript Central at http://mc.manuscriptcentral.com/sps-ieee, selecting the special issue as the target.

Important Dates:
EXTENDED submission deadline: 1 September 2010
Notification of acceptance: 1 January 2011
Final manuscript due: 1 July 2011

For further information, please contact the guest editors:
Gerald Friedland, fractor@icsi.berkeley.edu
Jonathan Fiscus, jfiscus@nist.gov
Thomas Hain, T.Hain@dcs.shef.ac.uk
Sadaoki Furui, furui@cs.titech.ac.jp
7-2 | IEEE Transactions on Audio, Speech, and Language Processing: Special Issue on Deep Learning for Speech and Language Processing
7-3 | A New Journal on Speech Sciences and Call for Papers for a Special Issue on Experimental Prosody

Dear fellow prosodists,

It is with special joy that Sandra Madureira and I announce here the launch of a new electronic journal which follows the principles of the Directory of Open Access Journals (DOAJ)*. The Journal of Speech Sciences (<http://www.journalofspeechsciences.org>) is sponsored by the Luso-Brazilian Association of Speech Sciences, an organisation founded in 2007 initially to help organise Speech Prosody 2008. This journal proposes to occupy an ecological niche not covered by other journals in which our community can publish, especially as regards its strength in linguistic and linguistically related aspects of speech sciences research (but also speech pathology, new methodologies and techniques, etc.). Another reason for its special place in the speech research ecosystem is the optionality of language choice. Though English is the journal's main language, people wanting to disseminate their work in Portuguese or French can do so, provided that they add an extended abstract in English (a way to make their work more visible outside the luso- and francophone communities). This journal was only made possible thanks to a great team working for the journal and an exceptionally good editorial board; see the journal web page: <http://www.journalofspeechsciences.org>. For its first issue we propose a special issue on Experimental Prosody. Please see the Call for Papers below and send your paper to us!

All the best,
Plinio (State Univ. of Campinas, Brazil) and Sandra (Catholic Univ. of São Paulo, Brazil)

* Official inscription in the DOAJ and attribution of an ISSN number can only occur after the first issue.
Call for Papers

The Journal of Speech Sciences (JoSS) is an open access journal which follows the principles of the Directory of Open Access Journals (DOAJ), meaning that its readers can freely read, download, copy, distribute, print, search, or link to the full texts of any article electronically published in the journal. It is accessible at <http://www.journalofspeechsciences.org>.
7-4 | Special Issue of Signal Processing: LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION

The journal Signal Processing, published by Elsevier, has issued a call for a special issue on latent variable models and source separation. Papers dealing with multi-talker ASR and noise-robust ASR using source separation techniques are highly welcome.
7-5 | IEEE Signal Processing Magazine: Special Issue on Fundamental Technologies in Modern Speech Recognition

Guest Editors:
Sadaoki Furui, Tokyo Institute of Technology, Tokyo, Japan (furui@cs.titech.ac.jp)
Li Deng, Microsoft Research, Redmond, USA (deng@microsoft.com)
Mark Gales, University of Cambridge, Cambridge, UK (mjfg@eng.cam.ac.uk)
Hermann Ney, RWTH Aachen University, Aachen, Germany (ney@cs.rwth-aachen.de)
Keiichi Tokuda, Nagoya Institute of Technology, Nagoya, Japan (tokuda@nitech.ac.jp)

Recently, various statistical techniques that form the basis of the fundamental technologies underlying today's automatic speech recognition (ASR) research and applications have attracted renewed attention. These techniques have significantly contributed to progress in ASR, including speaker recognition, and in their various applications. The purpose of this special issue is to bring together leading experts from various disciplines to explore the impact of statistical approaches on ASR. The special issue will provide a comprehensive overview of recent developments and open problems. This Call for Papers invites researchers to contribute articles that have broad appeal to the signal processing community. Such an article could be, for example, a tutorial on the fundamentals or a presentation of a state-of-the-art method.
Examples of topics that could be addressed in the article include, but are not limited to:
* Supervised, unsupervised, and lightly supervised training/adaptation
* Speaker-adaptive and noise-adaptive training
* Discriminative training
* Large-margin based methods
* Model complexity optimization
* Dynamic Bayesian networks for various levels of speech modeling and decoding
* Deep belief networks and related deep learning techniques
* Sparse coding for speech feature extraction and modeling
* Feature parameter compensation/normalization
* Acoustic factorization
* Conditional random fields (CRF) for modeling and decoding
* Acoustic source separation by PCA and ICA
* De-reverberation
* Rapid language adaptation for multilingual speech recognition
* Weighted-finite-state-transducer (WFST) based decoding
* Uncertainty decoding
* Speaker recognition, especially text-independent speaker verification
* Statistical framework for human-computer dialogue modeling
* Automatic speech summarization and information extraction

Submission Procedure: Prospective authors should submit their white papers through the web submission system at http://mc.manuscriptcentral.com/spmag-ieee.

Schedule:
* White paper due: October 1, 2011
* Invitation notification: November 1, 2011
* Manuscript due: February 1, 2012
* Acceptance notification: April 1, 2012
* Final manuscript due: May 15, 2012
* Publication date: September 15, 2012
7-6 | Call for Papers: Journal of Speech Sciences (JoSS), Volume 1, Number 2 (regular issue)
This is the call for the second issue of the Journal of Speech Sciences (JoSS). The JoSS covers experimental work on scientific aspects of speech, language, and linguistic communication processes. Coverage also includes articles dealing with pathological topics, or articles of an interdisciplinary nature, provided that experimental and linguistic principles underlie the work reported. Experimental approaches are emphasized in order to stimulate the development of new methodologies, new annotated corpora, and new techniques aimed at fully testing current theories of speech production and perception, as well as phonetic and phonological theories and their interfaces. Original, previously unpublished contributions should be sent through the journal website (www.journalofspeechsciences.org) by July 15th. The primary language of the Journal is English. Contributions in Portuguese, Spanish (Castilian), and French are also accepted, provided a 1-page (500-600 words) abstract in English is provided. The goal of this policy is to ensure wide dissemination of quality research written in these Romance languages. Contributions will be reviewed by at least two independent reviewers, though the final decision as to publication is taken by the two editors. For preparing the manuscript, please follow the instructions on the JoSS webpage. If accepted, authors must use the template given on the website to prepare the paper for publication.

Important Dates
Submission deadline: July 15th, 2011*
* Papers arriving after that date will follow the schedule for the third issue (volume 2).
First issue titles and authors (to appear in July 2011)

Regular papers
* SILVA, Jair de Almeida; MEIRELES, Alexsandro Rodrigues (Universidade Federal do Espírito Santo, Brazil). Sociophonetic study of the Capixaba speech rhythm (in Portuguese)
* SILVA, Carolina Garcia de Carvalho; NAME, Maria Cristina (Universidade Federal de Juiz de Fora, Brazil). Is “limpa” a verb or an adjective? The role of phonological phrase boundaries on syntactic parsing (in Portuguese)
Acceptance rate: 33%
Invited papers
* HIRST, Daniel (Université de Provence). The analysis by synthesis of speech melody: from data to models*
* SAN-SEGUNDO, Rubén (Universidad Politécnica de Madrid); BONAFONTE, Antonio (Universidad Politécnica de Cataluña); MARTÍNEZ-HINAREJOS, Carlos D. (Universidad Politécnica de Valencia); ORTEGA, Alfonso (Universidad de Zaragoza). Review of Research on Speech Technology: main contributions from Spanish research groups
* XU, Yi (University College London). Experimental Prosody: A methodological review*

*These titles are provisional.
About the JoSS

The Journal of Speech Sciences (JoSS) is an open access journal which follows the principles of the Directory of Open Access Journals (DOAJ), meaning that its readers can freely read, download, copy, distribute, print, search, or link to the full texts of any article electronically published in the journal. It is accessible at <http://www.journalofspeechsciences.org>. The JoSS covers experimental work on scientific aspects of speech, language, and linguistic communication processes. The JoSS is supported by the initiative of the Luso-Brazilian Association of Speech Sciences (LBASS), <http://www.lbass.org>. Founded on 16 February 2007, the LBASS aims at promoting, stimulating, and disseminating research and teaching in the speech sciences in Brazil and Portugal, as well as establishing a channel with sister associations abroad.
Editors Plinio A. Barbosa (Speech Prosody Studies Group/State University of Campinas, Brazil) Sandra Madureira (LIACC/Catholic University of São Paulo, Brazil)
E-mail: {pabarbosa, smadureira}@journalofspeechsciences.org
7-7 | EURASIP Journal on Audio, Speech, and Music Processing: Special Issue on Multilingual Spoken Document Organization, Understanding and Retrieval

Call for Papers

The increasing popularity of digital technologies has led to a rapidly growing volume of multimedia data. Spoken documents in education, business, and entertainment, such as lectures, history archives, news, and movies, are growing rapidly. As an informative signal, speech carries not only its language content but also acoustic, prosodic, and linguistic information. Efficient management of spoken documents, including organization, understanding, and retrieval, is therefore an important issue. We invite investigators to contribute original research articles as well as review articles that will stimulate the continuing efforts to manage spoken documents and the development of strategies to organize and understand them. We are particularly interested in articles describing new methods for multilingual speech analysis; advances in spoken document retrieval; new insights into speech summarization using acoustic and linguistic information; and current concepts in spoken document organization using prosody, recognition confidence, and syntactic and semantic grammar strategies.
Potential topics include, but are not limited to:
* Recent developments in spoken document organization and understanding
* Role of multilingual speech recognition in spoken document analysis
* Latest technologies for spoken document summarization and measuring outcomes
* Recent advances in spoken document retrieval
* Spoken document understanding using statistical learning methods
* Role of prosody, acoustic, and linguistic information in spoken document summarization and retrieval
* Advances in voice search, music retrieval, speaker search and indexing

Before submission, authors should carefully read the Instructions for Authors, located at http://asmp.eurasipjournals.com/authors/instructions. Prospective authors should submit an electronic copy of their complete manuscript through the SpringerOpen submission system at http://asmp.eurasipjournals.com/manuscript according to the submission schedule. They should specify the manuscript as a submission to the 'Special Issue on Multilingual Spoken Document Organization, Understanding, and Retrieval' in the cover letter. All submissions will undergo initial screening by the Guest Editors for fit to the theme of the Special Issue and prospects for successfully negotiating the review process.

Manuscript Due: August 1, 2011
First Round of Reviews: September 15, 2011
Publication Date: December 15, 2011

Lead Guest Editor
Chien-Lin Huang, Human Language Technology Department, Institute for Infocomm Research, A*STAR, Singapore; clhuang@i2r.a-star.edu.sg

Guest Editors
Bin Ma, Human Language Technology Department, Institute for Infocomm Research, A*STAR, Singapore; mabin@i2r.a-star.edu.sg
Hsin-Min Wang, Academia Sinica, Taiwan; whm@iis.sinica.edu.tw
Ea-Ee Jan, IBM T. J. Watson Research Center, USA; ejan@us.ibm.com
Berlin Chen, Department of Computer Science and Information Engineering, National Taiwan Normal University, Taiwan; berlin@csie.ntnu.edu.tw
Gareth Jones, School of Computing, Dublin City University, Ireland; Gareth.Jones@computing.dcu.ie