ISCA - International Speech
Communication Association

ISCApad #267

Thursday, September 10, 2020 by Chris Wellekens

5-1 Books
5-1-1 Emmanuel Vincent (Editor), Tuomas Virtanen (Editor), Sharon Gannot (Editor), 'Audio Source Separation and Speech Enhancement', Wiley

Emmanuel Vincent (Editor), Tuomas Virtanen (Editor), Sharon Gannot (Editor)

Audio Source Separation and Speech Enhancement

ISBN: 978-1-119-27989-1

October 2018

504 pages

This 500-page book provides a unifying view of source separation and enhancement,
including but not limited to array processing, matrix factorization, and deep learning
based methods, and speech and music applications, with consistent notation and
terminology across all chapters.


5-1-2 Jen-Tzung Chien, 'Source Separation and Machine Learning', Academic Press

Source Separation and Machine Learning presents the fundamentals in adaptive learning
algorithms for Blind Source Separation (BSS) and emphasizes the importance of machine
learning perspectives. It illustrates how BSS problems are tackled through adaptive
learning algorithms and model-based approaches using the latest information on mixture
signals to build a BSS model that is seen as a statistical model for a whole system.
Looking at different models, including independent component analysis (ICA), nonnegative
matrix factorization (NMF), nonnegative tensor factorization (NTF), and deep neural
network (DNN), the book addresses how they have evolved to deal with multichannel and
single-channel source separation.

Key features:
  • Emphasizes modern model-based Blind Source Separation (BSS), which closely connects
    the latest research topics of BSS and Machine Learning
  • Includes coverage of Bayesian learning, sparse learning, online learning,
    discriminative learning and deep learning
  • Presents a number of case studies of model-based BSS, using a variety of learning
    algorithms that provide solutions for the construction of BSS systems


5-1-3 Ingo Feldhausen, 'Methods in prosody: A Romance language perspective', Language Science Press (open access)

We are pleased to announce the publication of a peer-reviewed volume devoted to research methods in prosody. The book is entitled 'Methods in prosody: A Romance language perspective'.

It is published by Language Science Press, an open-access publisher. The book can be downloaded free of charge from the following link:

The table of contents is as follows:


Ingo Feldhausen, Jan Fliessbach & Maria del Mar Vanrell

Pilar Prieto

I Large corpora and spontaneous speech

1) Using large corpora and computational tools to describe prosody: An exciting challenge for the future with some (important) pending problems to solve
Juan María Garrido Almiñana

2) Intonation of pronominal subjects in Porteño Spanish: Analysis of spontaneous speech
Andrea Pešková

II Approaches to prosodic analysis

3) Multimodal analyses of audio-visual information: Some methods and issues in prosody research
Barbara Gili Fivela

4) The realizational coefficient: Devising a method for empirically determining prominent positions in Conchucos Quechua
Timo Buchholz & Uli Reich

5) On the role of prosody in disambiguating wh-exclamatives and wh-interrogatives in Cosenza Italian
Olga Kellert, Daniele Panizza & Caterina Petrone

III Elicitation methods

6) The Discourse Completion Task in Romance prosody research: Status quo and outlook
Maria del Mar Vanrell, Ingo Feldhausen & Lluïsa Astruc

7) Describing the intonation of speech acts in Brazilian Portuguese: Methodological aspects
João Antônio de Moraes & Albert Rilliard

Indexes


Please feel free to share news of this publication with colleagues who may be interested.

Best regards,

Ingo Feldhausen
(Volume co-editor)


5-1-4 Nigel Ward, 'Prosodic Patterns in English Conversation', Cambridge University Press, 2019

Prosodic Patterns in English Conversation

Nigel G. Ward, Professor of Computer Science, University of Texas at El Paso

Cambridge University Press, 2019.


Spoken language is more than words: it includes the prosodic features and patterns that speakers use, subconsciously, to frame meanings and achieve interactional goals. Thanks to the application of simple processing techniques to spoken dialog corpora, this book goes beyond intonation to describe how pitch, timing, intensity and voicing properties combine to form meaningful temporal configurations: prosodic constructions. Combining new findings with hitherto-scattered observations from diverse research traditions, this book enumerates twenty of the principal prosodic constructions of English.   


5-1-5 J.H. Esling, Scott R. Moisik, Allison Benner, Lise Crevier-Buchman, 'Voice Quality: the Laryngeal Articulator Model', Cambridge University Press

Voice Quality

The Laryngeal Articulator Model

Hardback 978-1-108-49842-5

John H. Esling, University of Victoria, British Columbia

Scott R. Moisik, Nanyang Technological University, Singapore

Allison Benner, University of Victoria, British Columbia

Lise Crevier-Buchman, Centre National de la Recherche Scientifique (CNRS), Paris

The first description of voice quality production in forty years, this book
provides a new framework for its study: The Laryngeal Articulator Model.
Informed by instrumental examinations of the laryngeal articulatory
mechanism, it revises our understanding of articulatory postures to explain
the actions, vibrations and resonances generated in the epilarynx and
pharynx. It focuses on the long-term auditory-articulatory component of
accent in the languages of the world, explaining how voice quality relates to
segmental and syllabic sounds. Phonetic illustrations of phonation types
and of laryngeal and oral vocal tract articulatory postures are provided.
Extensive video and audio material is available on a companion website.
The book presents computational simulations, the laryngeal and voice
quality foundations of infant speech acquisition, speech/voice disorders and
surgeries that entail compensatory laryngeal articulator adjustment, and an
exploration of the role of voice quality in sound change and of the larynx in
the evolution of speech.


1. Voice and voice quality; 2. Voice quality classification; 3. Instrumental case
studies and computational simulations of voice quality; 4. Linguistic, paralinguistic
and extralinguistic illustrations of voice quality; 5. Phonological implications of
voice quality theory; 6. Infant acquisition of speech and voice quality; 7. Clinical
illustrations of voice quality; 8. Laryngeal articulation and voice quality in sound
change, language ontogeny.


5-1-6 Albert di Cristo, 'Les langues naturelles', HAL archive ouverte

Albert di Cristo, Les langues naturelles. Première partie : La structure informationnelle et ses déterminants, 2019, 548 p.

This work is the first part of a large study devoted to the ways in which natural languages package information and to the role that prosody plays in expressing this packaging. This first part sets out to analyse the notion of information structure in its various aspects (chiefly epistemological ones), in particular its relationship with grammar, and to examine in detail the determinants that form the framework of this structure. To that end, the discussion covers the notions of theme, topic and 'given', as well as those of focus, focalization and contrast, which are analysed in depth. The discussion seeks to capture these notions in terms of their formal properties, their function and the meanings they help to convey. An entire chapter of this first part is devoted to the study of questioning and to the way the organization of information is managed in that activity. The book includes a bibliography of more than two thousand references.

This work will be completed by a second part, currently being written, which will deal mainly with prosody and its role in the packaging of information.


5-1-7 Benjamin Weiss, 'Talker Quality in Human and Machine Interaction - Modeling the Listener’s Perspective in Passive and Interactive Scenarios'. T-Labs Series in Telecommunication Services. Springer Nature, Cham. (2020)

Benjamin Weiss (2020): 'Talker Quality in Human and Machine Interaction - Modeling the Listener’s Perspective in Passive and Interactive Scenarios'. T-Labs Series in Telecommunication Services. Springer Nature, Cham.

This book presents the background, the state of research, and the author's own contributions to the assessment and prediction of talker quality as constituted in voice perception and in dialog. Starting from theories and empirical findings from human interaction, major results and approaches are transferred to the domain of human-computer interaction. The main aim of this book is to contribute to the evaluation of spoken interaction, both between humans and between human and computer, and in particular to the quality subsequently attributed to the speaking system or person, based on the listening and interactive experience.


5-1-8 W.F. Katz, P.F. Assmann, 'The Routledge Handbook of Phonetics', Routledge.


The Routledge Handbook of Phonetics

Edited by William F. Katz and Peter F. Assmann

The Routledge Handbook of Phonetics provides a comprehensive and up-to-date compilation of research, history and techniques in phonetics. With contributions from 41 prominent authors from North America, Europe, Australia and Japan, and including over 130 figures to illustrate key points, this handbook covers all the most important areas in the field, including:

  • The history and scope of techniques used, including speech synthesis, vocal tract imaging techniques, and obtaining information on under-researched languages from language archives;
  • The physiological bases of speech and hearing, including auditory, articulatory, and neural explanations of hearing, speech, and language processes;
  • Theories and models of speech perception and production related to the processing of consonants, vowels, prosody, tone, and intonation;



5-1-9 Proceedings of SLTU-CCURL2020

Dear all,

we are very happy to announce that the SLTU-CCURL2020 Proceedings are available online:
This year, LREC2020 would have featured an extraordinary event: the first joint SLTU-CCURL2020 Workshop, which was planned as a two-day workshop, with 54 papers accepted as either oral or poster presentations.
The workshop program was enriched by two tutorials and two keynote speeches.
We will miss the presentations, the discussions and the overall stimulating environment very deeply. 
We are thankful to ELRA and ISCA for their support of the workshop, to our sponsor Google, and to the 60 experts of the Program Committee, who worked tirelessly to help us select the best papers, representing a wide perspective on NLP, speech and computational linguistics addressing less-resourced languages.
Looking forward to better times when we will be able to meet in person again, we hope that you will find these workshop proceedings relevant and stimulating for your own research.
With our best wishes,

Claudia Soria, Laurent Besacier, Dorothee Beermann, and Sakriani Sakti
