ISCA - International Speech
Communication Association

ISCApad #155

Saturday, May 14, 2011 by Chris Wellekens

7 Journals

7-1

IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING: Special Issue on New Frontiers in Rich Transcription

A rich transcript is a transcript of a recorded event along with
metadata to enrich the word stream with useful information such as
identifying speakers, sentence units, proper nouns, speaker locations,
etc. As the volume of online media increases and additional, layered
content extraction technologies are built, rich transcription has
become a critical foundation for delivering extracted content to
down-stream applications such as spoken document retrieval,
summarization, semantic navigation, speech data mining, and others.

The special issue on 'New Frontiers in Rich Transcription' will focus
on recent research on technologies that generate rich
transcriptions automatically and on their applications. The field of
rich transcription draws on expertise from a variety of disciplines
including: (a) signal acquisition (recording room design, microphone
and camera design, sensor synchronization, etc.), (b) automatic
content extraction and supporting technologies (signal processing,
room acoustics compensation, spatial and multichannel audio
processing, robust speech recognition, speaker
recognition/diarization/tracking, spoken language understanding,
speech recognition, multimodal information integration from audio and
video sensors, etc.), (c) corpora infrastructure (meta-data
standards, annotation procedures, etc.), and (d) performance
benchmarking (ground truthing, evaluation metrics, etc.). In the end,
rich transcriptions serve as an enabler of a variety of spoken document
applications.

Many large international projects (e.g. the NIST RT evaluations) have
been active in the area of rich transcription, working to extract
useful content from a range of media such as broadcast
news, conversational telephone speech, multi-party meeting recordings,
and lecture recordings. The current special issue aims to be one of the
first to bring together the enabling technologies that are critical
in rich transcription of media with a large variety of speaker styles,
spoken content and acoustic environments. This area has also recently
led to new research directions, such as multimodal signal processing
and automatic human behavior modeling.

The purpose of this special issue is to present overview papers,
recent advances in Rich Transcription research as well as new ideas
for the direction of the field.  We encourage submissions about the
following and other related topics:
  * Robust Automatic Speech Recognition for Rich Transcription
  * Speaker Diarization and Localization
  * Speaker-attributed Speech-to-Text
  * Data collection and Annotation
  * Benchmarking Metrology for Rich Transcription
  * Natural language processing for Rich Transcription
  * Multimodal Processing for Rich Transcription
  * Online Methods for Rich Transcription
  * Future Trends in Rich Transcription

Submissions must not have been previously published, with the
exception that substantial extensions of conference papers are
considered.

Submissions must be made through IEEE's Manuscript Central, selecting
the special issue as the target:
http://mc.manuscriptcentral.com/sps-ieee

Important Dates:
EXTENDED Submission deadline: 1 September 2010
Notification of acceptance: 1 January 2011
Final manuscript due:  1 July 2011

For further information, please contact the guest editors:
Gerald Friedland, fractor@icsi.berkeley.edu
Jonathan Fiscus, jfiscus@nist.gov
Thomas Hain, T.Hain@dcs.shef.ac.uk
Sadaoki Furui, furui@cs.titech.ac.jp

7-2

IEEE Transactions on Audio, Speech, and Language Processing: Special Issue on Deep Learning for Speech and Language Processing

Over the past 25 years or so, speech recognition
technology has been dominated largely by hidden Markov
models (HMMs). Significant technological success has been
achieved using complex and carefully engineered variants
of HMMs. Next generation technologies require solutions to
technical challenges presented by diversified deployment
environments. These challenges arise from the many types
of variability present in the speech signal itself.
Overcoming these challenges is likely to require “deep”
architectures with efficient and effective learning
algorithms.

There are three main characteristics in the deep learning
paradigm: 1) layered architecture; 2) generative modeling
at the lower layer(s); and 3) unsupervised learning at the
lower layer(s) in general. For speech and language
processing and related sequential pattern recognition
applications, some attempts have been made in the past to
develop layered computational architectures that are
“deeper” than conventional HMMs, such as hierarchical HMMs,
hierarchical point-process models, hidden dynamic models,
layered multilayer perceptrons, tandem-architecture
neural-net feature extraction, multi-level detection-based
architectures, deep belief networks, hierarchical
conditional random fields, and deep-structured conditional
random fields. While positive recognition results have been
reported, there has been a conspicuous lack of systematic
learning techniques and theoretical guidance to facilitate
the development of these deep architectures. Recent
communication between machine learning researchers and
speech and language processing researchers revealed a
wealth of research results pertaining to insightful
applications of deep learning to some classical speech
recognition and language processing problems. These
results can potentially further advance the state of the
art in speech and language processing.

In light of the substantial research activity that has already
taken place in this exciting space, and of its importance,
we invite papers describing various aspects of deep
learning and related techniques/architectures as well as
their successful applications to speech and language
processing. Submissions must not have been previously
published, with the exception that substantial extensions
of conference or workshop papers will be considered.

Submissions must have a specific connection to audio,
speech, and/or language processing. Topics of
particular interest include, but are not limited to:

• Generative models and discriminative statistical or neural models with deep structure
• Supervised, semi-supervised, and unsupervised learning with deep structure
• Representing sequential patterns in statistical or neural models
• Robustness issues in deep learning
• Scalability issues in deep learning
• Optimization techniques in deep learning
• Deep learning of relationships between the linguistic hierarchy and data-driven speech units
• Deep learning models and techniques in applications such as (but not limited to) isolated or continuous speech recognition, phonetic recognition, music signal processing, language modeling, and language identification.

The authors are required to follow the Author’s Guide for
manuscript submission to the IEEE Transactions on Audio,
Speech, and Language Processing at
http://www.signalprocessingsociety.org/publications/periodicals/taslp/taslp-author-information

Submission deadline: September 15, 2010
Notification of Acceptance: March 15, 2011
Final manuscripts due: May 15, 2011
Date of publication: August 2011

For further information, please contact the guest editors:
Dong Yu (dongyu@microsoft.com)
Geoffrey Hinton (hinton@cs.toronto.edu)
Nelson Morgan (morgan@ICSI.Berkeley.edu)
Jen-Tzung Chien (jtchien@mail.ncku.edu.tw)
Shigeki Sagayama (sagayama@hil.t.u-tokyo.ac.jp)

7-3

A New Journal on Speech Sciences and Call for Papers for a Special Issue on Experimental Prosody

Dear fellow prosodists,

It is with special joy that Sandra Madureira and I announce the launch of a new electronic journal which follows the principles of the Directory of Open Access Journals (DOAJ)*. The Journal of Speech Sciences (<http://www.journalofspeechsciences.org>) is sponsored by the Luso-Brazilian Association of Speech Sciences, an organisation founded in 2007, initially to help organise Speech Prosody 2008.

This journal aims to occupy an ecological niche not covered by other journals in which our community can publish, especially as regards its strength in linguistic and linguistically related aspects of speech sciences research (but also speech pathology, new methodologies and techniques, etc.). Another reason for its special place in the speech research ecosystem is the optional choice of language. Though English is the journal's main language, authors wanting to disseminate their work in Portuguese or French can do so, provided that they add an extended abstract in English (a way to make their work more visible outside the Lusophone and Francophone communities).

This journal was only made possible thanks to a great team working for it and an exceptionally good editorial board. See the journal web page for details: <http://www.journalofspeechsciences.org>.

For its first issue we propose a special issue on Experimental Prosody. Please see the Call for Papers below and send your paper to us!

All the best, Plinio (State Univ. of Campinas, Brazil) and Sandra (Catholic Univ. of São Paulo, Brazil)

* Official registration with the DOAJ and the attribution of an ISSN can only take place after the first issue.
--

Call for Papers

The Journal of Speech Sciences (JoSS) is an open access journal which follows the principles of the Directory of Open Access Journals (DOAJ), meaning that its readers can freely read, download, copy, distribute, print, search, or link to the full texts of any article electronically published in the journal. It is accessible at <http://www.journalofspeechsciences.org>.

The JoSS covers experimental work on the scientific aspects of speech, language and linguistic communication processes. Coverage also includes articles dealing with pathological topics, or articles of an interdisciplinary nature, provided that experimental and linguistic principles underlie the work reported. Experimental approaches are emphasized in order to stimulate the development of new methodologies, of new annotated corpora, and of new techniques aimed at fully testing current theories of speech production and perception, as well as phonetic and phonological theories and their interfaces.

The JoSS is supported by the Luso-Brazilian Association of Speech Sciences (LBASS), <http://www.lbass.org>. Founded on 16 February 2007, the LBASS aims at promoting, stimulating and disseminating research and teaching in Speech Sciences in Brazil and Portugal, as well as establishing a channel with sister associations abroad.

The JoSS editorial team decided to launch the journal with a special issue on Experimental Prosody. The purpose of this Special Issue is to present recent progress and significant advances in areas of speech science devoted to experimental approaches in prosody research. Submitted papers must address a topic specific to experimental prosody in one of the following research areas:

Experimental Prosodic Phonology; Acoustics of prosody; Articulatory prosody; Perception of prosody; Prosody, discourse and dialogue; Emotion and Expression; Paralinguistic and nonlinguistic cues of prosody; Prosody physiology and pathology; Prosody acquisition; Prosody and the brain (especially neuro-imagery and EEG evidence of syntax-prosody interface and prosody functions); Corpus design and annotation for prosody research; Psycholinguistics of speech prosody processing.

Original, previously unpublished contributions will be reviewed by at least two independent reviewers, though the final decision as to publication is taken by the two editors. The primary language of the Journal is English. Contributions in Portuguese and in French are also accepted, provided that a 1-page (circa 500 words) abstract in English is included. The goal of this policy is to ensure a wide dissemination of quality research written in these two Romance languages.

To prepare the manuscript, please follow the instructions on the JoSS webpage and submit it to the editors with the subject “Submission: Special Issue on Experimental Prosody”.

Editors

Plinio A. Barbosa (Speech Prosody Studies Group/State University of Campinas, Brazil)
Sandra Madureira (LIACC/Catholic University of São Paulo, Brazil)

E-mail: {pabarbosa, smadureira}@journalofspeechsciences.org

Important Dates

Submission deadline: January 30th, 2011
Notification of acceptance: March 10th, 2011
Final manuscript due: March 25th, 2011
Publication date: April 2011

7-4

Signal Processing: Special Issue on LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION

The journal Signal Processing published by Elsevier is issuing a call for a special issue on latent variable models and source separation. Papers dealing with multi-talker ASR and noise-robust ASR using source separation techniques are highly welcome.

                         SIGNAL PROCESSING
               http://www.elsevier.com/locate/sigpro

                          Special issue on
           LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION

                     DEADLINE: JANUARY 15, 2011

While independent component analysis and blind signal separation have become mainstream topics in signal and image processing, new approaches have emerged to solve problems involving nonlinear signal mixtures or various other types of latent variables, such as semi-blind models and matrix or tensor decompositions. All these recent topics lead to new developments and promising applications. They were the main focus of the LVA/ICA 2010 conference, which took place in Saint-Malo, France, from September 27 to 30, 2010.

The aim of this special issue is to present up-to-date developments in Latent Variable Analysis and Signal Separation, including theoretical analysis, algorithms and applications. Contributions are welcome both from attendees of the above conference and from authors who did not attend the conference but are active in these areas of research.

Examples of topics relevant to the special issue include:
- Non-negative matrix factorization
- Joint tensor factorization
- Latent variables
- Source separation
- Nonlinear ICA
- Noisy ICA
- BSS/ICA applications: image analysis, speech and audio data, encoding of natural scenes and sound, telecommunications, data mining, medical data processing, genomic data analysis, finance,...
- Unsolved and emerging problems: causality detection, feature selection, data mining,...

SUBMISSION INSTRUCTIONS:
Manuscript submissions shall be made through the Elsevier Editorial System (EES) at
http://ees.elsevier.com/sigpro/
Once logged in, click on “Submit New Manuscript” then select “Special Issue: LVA” in the “Choose Article Type” dropdown menu.

IMPORTANT DATES:
January 15, 2011: Manuscript submission deadline
May 15, 2011: Notification to authors
September 15, 2011: Final manuscript submission
December 15, 2011: Publication

GUEST EDITORS:
Vincent Vigneron, University of Evry – Val d’Essonne, France
Remi Gribonval, INRIA, France
Emmanuel Vincent, INRIA, France
Vicente Zarzoso, University of Nice – Sophia Antipolis, France
Terrence J. Sejnowski, Salk Institute, USA

7-5

IEEE Signal Processing Magazine: Special Issue on Fundamental Technologies in Modern Speech Recognition

Guest Editors:
Sadaoki Furui, Tokyo Institute of Technology, Tokyo, Japan (furui@cs.titech.ac.jp)
Li Deng, Microsoft Research, Redmond, USA (deng@microsoft.com)
Mark Gales, University of Cambridge, Cambridge, UK (mjfg@eng.cam.ac.uk)
Hermann Ney, RWTH Aachen University, Aachen, Germany (ney@cs.rwth-aachen.de)
Keiichi Tokuda, Nagoya Institute of Technology, Nagoya, Japan (tokuda@nitech.ac.jp)

Recently, various statistical techniques that form the basis of the fundamental technologies underlying today’s automatic speech recognition (ASR) research and applications have attracted renewed attention. These techniques have significantly contributed to progress in ASR, including speaker recognition, and their various applications.  The purpose of this special issue is to bring together leading experts from various disciplines to explore the impact of statistical approaches on ASR.  The special issue will provide a comprehensive overview of recent developments and open problems.

This Call for Papers invites researchers to contribute articles that have a broad appeal to the signal processing community.  Such an article could be, for example, a tutorial on the fundamentals or a presentation of a state-of-the-art method.  Examples of topics that could be addressed include, but are not limited to:
 * Supervised, unsupervised, and lightly supervised training/adaptation
 * Speaker-adaptive and noise-adaptive training
 * Discriminative training
 * Large-margin based methods
 * Model complexity optimization
 * Dynamic Bayesian networks for various levels of speech modeling and decoding
 * Deep belief networks and related deep learning techniques
 * Sparse coding for speech feature extraction and modeling
 * Feature parameter compensation/normalization
 * Acoustic factorization
 * Conditional random fields (CRF) for modeling and decoding
 * Acoustic source separation by PCA and ICA
 * De-reverberation
 * Rapid language adaptation for multilingual speech recognition
 * Weighted-finite-state-transducer (WFST) based decoding
 * Uncertainty decoding
 * Speaker recognition, especially text-independent speaker verification
 * Statistical framework for human-computer dialogue modeling
 * Automatic speech summarization and information extraction

Submission Procedure:
Prospective authors should submit their white papers to the web submission system at http://mc.manuscriptcentral.com/spmag-ieee.

Schedule:
 * White paper due:         October 1, 2011
 * Invitation notification: November 1, 2011
 * Manuscript due:          February 1, 2012
 * Acceptance notification: April 1, 2012
 * Final manuscript due:    May 15, 2012
 * Publication date:        September 15, 2012
