ISCA - International Speech
Communication Association



ISCApad #222

Saturday, December 10, 2016 by Chris Wellekens

5-1 Books
5-1-1 Björn Schuller, Anton Batliner, Computational Paralinguistics: Emotion, Affect and Personality in Speech and Language Processing, Wiley, ISBN: 978-1-119-97136-8, 344 pages, November 2013
Description

This book presents the methods, tools and techniques that are currently being used to recognise (automatically) the affect, emotion, personality and everything else beyond linguistics ('paralinguistics') expressed by or embedded in human speech and language.

It is the first book to provide such a systematic survey of paralinguistics in speech and language processing. The technology described has evolved mainly from automatic speech and speaker recognition and processing, but also takes into account recent developments within speech signal processing, machine intelligence and data mining.

Moreover, the book offers a hands-on approach by integrating actual data sets, software, and open-source utilities which will make the book invaluable as a teaching tool and similarly useful for those professionals already in the field.

Key features:

  • Provides an integrated presentation of basic research (in phonetics/linguistics and humanities) with state-of-the-art engineering approaches for speech signal processing and machine intelligence.
  • Explains the history and state of the art of all of the sub-fields which contribute to the topic of computational paralinguistics.
  • Covers the signal processing and machine learning aspects of the actual computational modelling of emotion and personality and explains the detection process from corpus collection to feature extraction and from model testing to system integration.
  • Details aspects of real-world system integration including distribution, weakly supervised learning and confidence measures.
  • Outlines machine learning approaches including static, dynamic and context-sensitive algorithms for classification and regression.
  • Includes a tutorial on freely available toolkits, such as the open-source 'openEAR' toolkit for emotion and affect recognition co-developed by one of the authors, and a listing of standard databases and feature sets used in the field to allow for immediate experimentation enabling the reader to build an emotion detection model on an existing corpus.

Links:

  • The book: http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1119971365.html
  • Table of Contents (pdf): http://media.wiley.com/product_data/excerpt/65/11199713/1119971365-16.pdf
  • Chapter01 (pdf): http://media.wiley.com/product_data/excerpt/65/11199713/1119971365-14.pdf

 


5-1-2 Li Deng and Dong Yu, Deep Learning: Methods and Applications, Foundations and Trends in Signal Processing
Foundations and Trends in Signal Processing (www.nowpublishers.com/sig) has published the following issue:

Volume 7, Issue 3-4
Deep Learning: Methods and Applications
By Li Deng and Dong Yu (Microsoft Research, USA)
http://dx.doi.org/10.1561/2000000039

5-1-3 O. Niebuhr, R. Skarnitzl, 'Tackling the Complexity in Speech', Prague University Press

Tackling the Complexity in Speech

Editors: Oliver Niebuhr, Radek Skarnitzl
Publisher: Univerzita Karlova v Praze, Filozofická fakulta
Release year: 2015
ISBN: 978-80-7308-558-2
Series: Opera Facultatis philosophicae
Pages: 230

The present volume is meant to give the reader an impression of the range of questions and topics that are currently the subject of international research in three areas: the discovery of complexity, the organization of complexity, and the modelling of complexity. These three areas form the main sections of our volume, and each section includes four carefully selected chapters. The chapters deal with facets of speech production, speech acoustics, and/or speech perception or recognition, place them in an integrated phonetic-phonological perspective, and relate them more or less explicitly to aspects of speech technology. We therefore hope that this volume can help speech scientists with traditional training in phonetics and phonology to keep up with the latest developments in speech technology. In the opposite direction, speech researchers starting from a technological perspective will hopefully be inspired by reading about the questions, phenomena, and communicative functions that are currently addressed in phonetics and phonology. Either way, the future of speech research lies in international, interdisciplinary collaborations, and our volume is meant to reflect and facilitate such collaborations.

https://e-shop.ff.cuni.cz/books/monographs_eng/opera_facultatis_philosophicae_eng/tackling_the_complexity_in_spee_-1153


5-1-4 J. Li, L. Deng, R. Haeb-Umbach and Y. Gong, 'Robust Automatic Speech Recognition', Academic Press

'Robust Automatic Speech Recognition'

  • The first book that provides a comprehensive review of noise- and reverberation-robust speech recognition methods in the era of deep neural networks
  • Connects robust speech recognition techniques to machine learning paradigms with rigorous mathematical treatment
  • Provides elegant and structural ways to categorize and analyze noise-robust speech recognition techniques
  • Written by leading researchers who have been actively working on the subject matter in both industrial and academic organizations for many years

http://store.elsevier.com/Robust-Automatic-Speech-Recognition/Jinyu-Li/isbn-9780128023983/


5-1-5 Barbosa, P. A. and Madureira, S. Manual de Fonética Acústica Experimental. Aplicações a dados do português. 591 p. São Paulo: Cortez, 2015. [In Portuguese]



http://www.cortezeditora.com.br/manual-de-fonetica-acustica-experimental-1599.aspx/p

This manual of Experimental Acoustic Phonetics is conceived for undergraduate and graduate classes in areas such as Acoustic Phonetics, Phonology, Communications Engineering, Music, Acoustic Physics, and Speech Therapy, among others. Starting with a theoretical and methodological presentation of Acoustic Phonetics theory and techniques in five chapters, including a chapter on experimental methods, the book continues with detailed acoustic analyses of all classes of sounds, using audio files from both European and Brazilian Portuguese as data.
All analyses are explained step by step using Praat. The audio files are available on the book's web site and can be downloaded. All techniques can, of course, be applied to any language. Exercises proposed at the end of each chapter allow the teacher to evaluate the students' progress.

 


5-1-6 Damien Nouvel (Inalco), Maud Ehrmann (EPFL), Sophie Rosset (CNRS), Les entités nommées pour le traitement automatique des langues [In French]

Les entités nommées pour le traitement automatique des langues [Named entities for natural language processing; in French]

Damien Nouvel (Inalco), Maud Ehrmann (EPFL), Sophie Rosset (CNRS)

The book is available as an ebook for 9.90 euros.
(Price for individual buyers only - PDF readable on any device - available only from iste-editions.fr)
The print edition is available for 40.00 euros.

The digitized, connected world produces large quantities of data. Analysing natural language automatically is a major challenge for applications such as Web search, news tracking, data mining, monitoring, and opinion analysis.

Research in information extraction has shown the importance of certain units, such as names of persons, places and organizations, dates, and amounts. Processing these elements, the "named entities", has led to the development of algorithms and resources used by computer systems.

Both theoretical and practical, this book offers tools for defining these entities, identifying them, linking them to knowledge bases, and evaluating systems.

Contents

1. Named entities for information access
2. Named entities as referential units
3. Resources for named entities
4. Recognizing named entities
5. Linking named entities to reference resources
6. Evaluating named entity recognition

168 pages - October 2015
Print edition - paperback
ISBN 978-1-78405-104-4

5-1-7 R. Fuchs, 'Speech Rhythm in Varieties of English', Springer

R. Fuchs, 'Speech Rhythm in Varieties of English' has been published by Springer in the 'Prosody, Phonology and Phonetics' series: https://www.springer.com/gp/book/9783662478172


5-1-8 Pejman Mowlaee et al., 'Phase-Aware Signal Processing in Speech Communication: Theory and Practice', Wiley 2016

Phase-Aware Signal Processing in Speech Communication: Theory and Practice

Pejman Mowlaee, Johannes Stahl, Josef Kulmer, Florian Mayer

http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1119238811.html

An overview of the challenging new topic of phase-aware signal processing

Speech communication technology is a key factor in human-machine interaction, digital hearing aids, mobile telephony, and automatic speech/speaker recognition. With the proliferation of these applications, there is a growing requirement for advanced methodologies that can push the limits of the conventional solutions relying on processing the signal magnitude spectrum.

Single-Channel Phase-Aware Signal Processing in Speech Communication provides a comprehensive guide to phase signal processing. It reviews the history of phase importance in the literature, basic problems in phase processing, and the fundamentals of phase estimation, together with several applications that demonstrate the usefulness of phase processing.

Key features:

  • Analyses recent advances demonstrating the positive impact of phase-based processing in pushing the limits of conventional methods.
  • Offers unique coverage of the historical context and fundamentals of phase processing, and provides several examples in speech communication.
  • Provides a detailed review of many references and discusses the existing signal processing techniques required to deal with phase information in different speech applications.
  • Supplies various examples and MATLAB® implementations delivered within the PhaseLab toolbox.

Single-Channel Phase-Aware Signal Processing in Speech Communication is a valuable single source for students, non-expert DSP engineers, academics and graduate students.



