ISCA - International Speech
Communication Association

ISCApad #172

Sunday, October 07, 2012 by Chris Wellekens

5-1 Books

5-1-1

Dorothea Kolossa and Reinhold Haeb-Umbach: Robust Speech Recognition of Uncertain or Missing Data

Title: Robust Speech Recognition of Uncertain or Missing Data
Editors: Dorothea Kolossa and Reinhold Haeb-Umbach
Publisher: Springer
Year: 2011
ISBN 978-3-642-21316-8
Link:
http://www.springer.com/engineering/signals/book/978-3-642-21316-8?detailsPage=authorsAndEditors

Automatic speech recognition suffers from a lack of robustness with
respect to noise, reverberation and interfering speech. The growing
field of speech recognition in the presence of missing or uncertain
input data seeks to ameliorate those problems by using not only a
preprocessed speech signal but also an estimate of its reliability to
selectively focus on those segments and features that are most reliable
for recognition. This book presents the state of the art in recognition
in the presence of uncertainty, offering examples that utilize
uncertainty information for noise robustness, reverberation robustness,
simultaneous recognition of multiple speech signals, and audiovisual
speech recognition.

The book is appropriate for scientists and researchers in the field of
speech recognition who will find an overview of the state of the art in
robust speech recognition, professionals working in speech recognition
who will find strategies for improving recognition results in various
conditions of mismatch, and lecturers of advanced courses on speech
processing or speech recognition who will find a reference and a
comprehensive introduction to the field. The book assumes an
understanding of the fundamentals of speech recognition using Hidden
Markov Models.

5-1-2

Mohamed Embarki and Christelle Dodane: La coarticulation

LA COARTICULATION
Des indices à la représentation

Mohamed Embarki and Christelle Dodane

Speech is made up of complex articulatory gestures that overlap in space and time. These overlaps, conceptualized by the term coarticulation, spare no articulator. They can be observed in the movements of the jaw, the lips, the tongue, the velum and the vocal folds. Coarticulation is also expected by the listener: coarticulated segments are better perceived. It plays a part in the cognitive and linguistic processes of speech encoding and decoding. Much more than a simple process, coarticulation is a structured research field with its own concepts and models. This collective volume brings together previously unpublished contributions by international researchers who address coarticulation from motor, acoustic, perceptual and linguistic points of view. It is the first book published in French on this topic and the first to explore it across different languages.

Series: Langue & Parole, L'Harmattan

ISBN: 978-2-296-55503-7 • €25 • 260 pages

Mohamed Embarki is a maître de conférences-HDR (associate professor accredited to supervise research) in phonetics at the University of Franche-Comté (Besançon) and a member of Laseldi (E.A. 2281). His work addresses the (co)articulatory and acoustic aspects of modern Arabic varieties and their sociophonetic motivations.

Christelle Dodane is a maître de conférences (associate professor) in phonetics at Paul-Valéry University (Montpellier 3), affiliated with the DIPRALANG laboratory (E.A. 739). Her research focuses on language communication in young children (12-36 months), in particular the role of prosody in the transition from the pre-linguistic to the linguistic level, in the construction of early syntax, and in child-directed speech.

5-1-3

Ben Gold, Nelson Morgan, Dan Ellis: Speech and Audio Signal Processing: Processing and Perception of Speech and Music [Digital]

Speech and Audio Signal Processing: Processing and Perception of Speech and Music [2nd edition], by Ben Gold, Nelson Morgan, Dan Ellis

Digital copy: http://www.amazon.com/Speech-Audio-Signal-Processing-Perception/dp/product-description/1118142888

Hardcopy available: http://www.amazon.com/Speech-Audio-Signal-Processing-Perception/dp/0470195363/ref=sr_1_1?s=books&ie=UTF8&qid=1319142964&sr=1-1

5-1-4

Video Proceedings ERMITES 2011

The video proceedings of the ERMITES 2011 workshop, 'Décomposition Parcimonieuse, Contraction et Structuration pour l'Analyse de Scènes' (Sparse Decomposition, Contraction and Structuring for Scene Analysis), are online at: http://glotin.univ-tln.fr/ERMITES11

They include (in .mpg) some twenty hours of lectures by:

Y. Bengio, Montréal
    « Apprentissage Non-Supervisé de Représentations Profondes » (Unsupervised Learning of Deep Representations)
     http://lsis.univ-tln.fr/~glotin/ERMITES_2011_Y_Bengio_1sur4.mp4 ...

S. Mallat, Paris
    « Scattering & Matching Pursuit for Acoustic Sources Separation »
     http://lsis.univ-tln.fr/~glotin/ERMITES_2011_Mallat_1sur3.mp4 ...

J.-P. Haton, Nancy
    « Analyse de Scène et Reconnaissance Stochastique de la Parole » (Scene Analysis and Stochastic Speech Recognition)
     http://lsis.univ-tln.fr/~glotin/ERMITES_2011_JP_Haton_1sur4.mp4 ...

M. Kowalski, Paris
    « Sparsity and structure for audio signal: a *-lasso therapy »
     http://lsis.univ-tln.fr/~glotin/ERMITES_2011_Kowalski_1sur5.mp4 ...

O. Adam, Paris
    « Estimation de Densité de Population de Baleines par Analyse de leurs Chants » (Estimating Whale Population Density by Analyzing Their Songs)
     http://lsis.univ-tln.fr/~glotin/ERMITES_2011_Adam.mp4

X. Halkias, New-York
    « Detection and Tracking of Dolphin Vocalizations »
     http://lsis.univ-tln.fr/~glotin/ERMITES_2011_Halkias.mp4

J. Razik, Toulon
    « Sparse coding : from speech to whales »
     http://lsis.univ-tln.fr/~glotin/ERMITES_2011_Razik.mp4

H. Glotin, Toulon
    « Suivi & reconstruction du comportement de cétacés par acoustique passive » (Tracking and reconstructing cetacean behavior with passive acoustics)

P.S.: ERMITES 2012 will focus on vision (Y. LeCun, Y. Thorpe, P. Courrieu, M. Perreira, M. Van Gerven, ...)

5-1-5

Zeki Majeed Hassan and Barry Heselwood (Eds): Instrumental Studies in Arabic Phonetics

Instrumental Studies in Arabic Phonetics
Edited by Zeki Majeed Hassan and Barry Heselwood
University of Gothenburg / University of Leeds
[Current Issues in Linguistic Theory, 319] 2011. xii, 365 pp.
Publishing status: Available
Hardbound – Available
ISBN 978 90 272 4837 4 | EUR 110.00 | USD 165.00
e-Book – Forthcoming
ISBN 978 90 272 8322 1 | EUR 110.00 | USD 165.00
Brought together in this volume are fourteen studies using a range of modern instrumental methods – acoustic and articulatory – to investigate the phonetics of several North African and Middle Eastern varieties of Arabic. Topics covered include syllable structure, quantity, assimilation, guttural and emphatic consonants and their pharyngeal and laryngeal mechanisms, intonation, and language acquisition. In addition to presenting new data and new descriptions and interpretations, a key aim of the volume is to demonstrate the depth of objective analysis that instrumental methods can enable researchers to achieve. A special feature of many chapters is the use of more than one type of instrumentation to give different perspectives on phonetic properties of Arabic speech which have fascinated scholars since medieval times. The volume will be of interest to phoneticians, phonologists and Arabic dialectologists, and provides a link between traditional qualitative accounts of spoken Arabic and modern quantitative methods of instrumental phonetic analysis.

Acknowledgements vii – viii
List of contributors ix – x
Transliteration and transcription symbols for Arabic xi – xii
Introduction
Barry Heselwood and Zeki Majeed Hassan 1 – 26
Part I. Issues in syntagmatic structure
Preliminary study of Moroccan Arabic word-initial consonant clusters and syllabification using electromagnetic articulography
Adamantios I. Gafos, Philip Hoole and Chakir Zeroual 27 – 46
An acoustic phonetic study of quantity and quantity complementarity in Swedish and Iraqi Arabic
Zeki Majeed Hassan 47 – 62
Assimilation of /l/ to /r/ in Syrian Arabic: An electropalatographic and acoustic study
Barry Heselwood, Sara Howard and Rawya Ranjous 63 – 98
Part II. Guttural consonants
A study of the laryngeal and pharyngeal consonants in Jordanian Arabic using nasoendoscopy, videofluoroscopy and spectrography
Barry Heselwood and Feda Al-Tamimi 99
A phonetic study of guttural laryngeals in Palestinian Arabic using laryngoscopic and acoustic analysis
Kimary N. Shahin 129 – 140
Airflow and acoustic modelling of pharyngeal and uvular consonants in Moroccan Arabic
Mohamed Yeou and Shinji Maeda 141 – 162
Part III. Emphasis and coronal consonants
Nasoendoscopic, videofluoroscopic and acoustic study of plain and emphatic coronals in Jordanian Arabic
Feda Al-Tamimi and Barry Heselwood 163 – 192
Acoustic and electromagnetic articulographic study of pharyngealisation: Coarticulatory effects as an index of stylistic and regional variation in Arabic
Mohamed Embarki, Slim Ouni, Mohamed Yeou, M. Christian Guilleminot and Sallal Al-Maqtari 193 – 216
Investigating the emphatic feature in Iraqi Arabic: Acoustic and articulatory evidence of coarticulation
Zeki Majeed Hassan and John H. Esling 217 – 234
Glottalisation and neutralisation in Yemeni Arabic and Mehri: An acoustic study
Janet C.E. Watson and Alex Bellem 235 – 256
The phonetics of localising uvularisation in Ammani-Jordanian Arabic: An acoustic study
Bushra Adnan Zawaydeh and Kenneth de Jong 257 – 276
EMA, endoscopic, ultrasound and acoustic study of two secondary articulations in Moroccan Arabic: Labial-velarisation vs. emphasis
Chakir Zeroual, John H. Esling and Philip Hoole 277 – 298
Part IV. Intonation and acquisition
Acoustic cues to focus and givenness in Egyptian Arabic
Sam Hellmuth 299 – 324
Acquisition of Lebanese Arabic and Yorkshire English /l/ by bilingual and monolingual children: A comparative spectrographic study
Ghada Khattab 325 – 354
Appendix: Phonetic instrumentation used in the studies 355 – 358

5-1-6

G. Bailly, P. Perrier & E. Vatikiotis-Bateson (eds.): Audiovisual Speech Processing

'Audiovisual Speech Processing', edited by G. Bailly, P. Perrier & E. Vatikiotis-Bateson, published by Cambridge University Press.

'When we speak, we configure the vocal tract which shapes the visible motions of the face
and the patterning of the audible speech acoustics. Similarly, we use these visible and
audible behaviors to perceive speech. This book showcases a broad range of research
investigating how these two types of signals are used in spoken communication, how they
interact, and how they can be used to enhance the realistic synthesis and recognition of
audible and visible speech. The volume begins by addressing two important questions about
human audiovisual performance: how auditory and visual signals combine to access the
mental lexicon and where in the brain this and related processes take place. It then
turns to the production and perception of multimodal speech and how structures are
coordinated within and across the two modalities. Finally, the book presents overviews
and recent developments in machine-based speech recognition and synthesis of AV speech.'

5-1-7

Fuchs, Susanne / Weirich, Melanie / Pape, Daniel / Perrier, Pascal (eds.): Speech Planning and Dynamics, Publisher: P. Lang

Fuchs, Susanne / Weirich, Melanie / Pape, Daniel / Perrier, Pascal (eds.)

Speech Planning and Dynamics

Frankfurt am Main, Berlin, Bern, Bruxelles, New York, Oxford, Wien, 2012. 277 pp., 50 fig., 8 tables

Speech Production and Perception. Vol. 1

Edited by Susanne Fuchs and Pascal Perrier

Print:

ISBN 978-3-631-61479-2 hb.

SFR 60.00 / €* 52.95 / €** 54.50 / € 49.50 / £ 39.60 / US$ 64.95

eBook :

ISBN 978-3-653-01438-9

SFR 63.20 / €* 58.91 / €** 59.40 / € 49.50 / £ 39.60 / US$ 64.95

Order online: www.peterlang.com

5-1-8

Video archive of Odyssey Speaker and Language Recognition Workshop, Singapore 2012

The Odyssey Speaker and Language Recognition Workshop 2012, the workshop of the ISCA SIG Speaker and Language Characterization, was held in Singapore on 25-28 June 2012. Its video recordings have been included in the ISCA Video Archive: http://www.isca-speech.org/iscaweb/index.php/archive/video-archive
