5-1 Books
5-1-1 L'imagerie médicale pour l'étude de la parole

Alain Marchal, Christian Cave

Publisher: Hermes Lavoisier

99 euros • 304 pages • 16 x 24 • 2009 • ISBN: 978-2-7462-2235-9

From the laryngeal mirror to today's videofibroscopy, from static impression-taking to dynamic palatography, from the beginnings of radiography to magnetic resonance imaging and magnetoencephalography, this book reviews the imaging techniques used to study speech, from the standpoint of both production and perception. The advantages, drawbacks, and limits of each technique are examined, along with the main results obtained with each and their prospects for development. Written by specialists who have taken care to be accessible to a broad readership, the book is intended for everyone who studies or deals with speech in their professional work, such as phoniatricians, ENT specialists, speech therapists and, of course, phoneticians and linguists.

5-1-2 Korpusbasierte Sprachverarbeitung

Author: Christoph Draxler
Title: Korpusbasierte Sprachverarbeitung
Publisher: Narr Francke Attempto Verlag Tübingen
Year: 2008

Summary: Spoken language is a major area of linguistic research and speech technology development. This handbook presents an introduction to the technical foundations and shows how speech data is collected, annotated, analysed, and made accessible in the form of speech databases. The book focuses on web-based procedures for the recording and processing of high-quality speech data, and it is intended as a desktop reference for practical recording and annotation work. A chapter is devoted to the Ph@ttSessionz database, the first large-scale speech data collection (860+ speakers, 40 locations in Germany) performed via the Internet. The companion web site contains audio examples, software tools, solutions to the exercises, important links, and checklists.

5-1-3 Linear Predictive Coding and the Internet Protocol, by Robert M. Gray

Linear Predictive Coding and the Internet Protocol, by Robert M. Gray, is a special edition hardback book from Foundations and Trends in Signal Processing (FnT SP). The book brings together two forthcoming issues of FnT SP, the first being a survey of LPC, the second a unique history of realtime digital speech on packet networks.


Volume 3, Issue 3
A Survey of Linear Predictive Coding: Part 1 of LPC and the IP
By Robert M. Gray (Stanford University)

Volume 3, Issue 4
A History of Realtime Digital Speech on Packet Networks: Part 2 of LPC and the IP
By Robert M. Gray (Stanford University)



5-1-4 Modern Trends in Arabic Dialectology, M. Embarki and M. Ennaji

Modern Trends in Arabic Dialectology,
M. Embarki & M. Ennaji (eds.), Trenton (USA): The Red Sea Press.

Mohamed Embarki and Moha Ennaji
Part I: Theoretical and Historical Perspectives and Methods in Arabic Dialectology
Chapter 1: Arabic Dialects: A Discussion
Janet C. E. Watson p. 3
Chapter 2: The Emergence of Western Arabic: A Likely Consequence of Creolization
Federico Corriente p. 39
Chapter 3: Acoustic Cues for the Classification of Arabic Dialects
Mohamed Embarki p. 47
Chapter 4: Variation and Attitudes: A Sociolinguistic Analysis of the Qaaf
Maher Bahloul p. 69

Part II: Eastern Arabic Dialects
Chapter 5: Arabic Bedouin Dialects and their Classification
Judith Rosenhouse p. 97
Chapter 6: Evolution of Expressive Structures in Egyptian Arabic
Amr Helmy Ibrahim p. 121
Chapter 7: Ḥadramī Arabic Lexicon
Abdullah Hassan Al-Saqqaf p. 139

Part III: Western Arabic Dialects
Chapter 8: Dialectal Variation in Moroccan Arabic
Moha Ennaji p. 171
Chapter 9: Formation and Evolution of Andalusi Arabic and its Imprint on Modern Northern Morocco
Ángeles Vicente p. 185
Chapter 10: The Phonetic Implementation of Falling Pitch Accents in Dialectal Maltese: A Preliminary Study of the Intonation of Gozitan Żebbuġi
Alexandra Vella p. 211
Index p. 239

5-1-5 Gokhan Tur, R. De Mori, Spoken Language Understanding: Systems for Extracting Semantic Information from Speech

Title: Spoken Language Understanding: Systems for Extracting Semantic Information from Speech

Editors: Gokhan Tur and Renato De Mori


Brief Description:

Spoken language understanding (SLU) is an emerging field between speech and language processing, investigating human/machine and human/human communication by leveraging technologies from signal processing, pattern recognition, machine learning and artificial intelligence. SLU systems are designed to extract the meaning from speech utterances, and their applications are vast, from voice search on mobile devices to meeting summarization, attracting interest from both commercial and academic sectors.

Both human/machine and human/human communications can benefit from the application of SLU, using differing tasks and approaches to better understand and utilize such communications. This book covers the state-of-the-art approaches for the most popular SLU tasks with chapters written by well-known researchers in the respective fields. Key features include:

Presents a fully integrated view of the two distinct disciplines of speech processing and language processing for SLU tasks.

Defines what is possible today for SLU as an enabling technology for enterprise (e.g., customer care centers or company meetings), and consumer (e.g., entertainment, mobile, car, robot, or smart environments) applications and outlines the key research areas.

Provides a unique source of distilled information on methods for computer modeling of semantic information in human/machine and human/human conversations.

This book can be used for graduate courses in electronics engineering, computer science or computational linguistics. Moreover, technologists interested in processing spoken communications will find it a useful source of collated information on the topic, drawn from the two distinct disciplines of speech processing and language processing under the new area of SLU.

5-2 Database
5-2-1 LDC Newsletter (April 2011)

In this newsletter:

- Spring 2011 LDC Data Scholarship Recipients
- LDC at NEALLT 2011

New publications:

- 2008/2010 NIST Metrics for Machine Translation (MetricsMaTr) GALE Evaluation Set
- NIST/USF Evaluation Resources for the VACE Program – Meeting Data Training Set Part 1

Spring 2011 LDC Data Scholarship Recipients

LDC is pleased to announce the student recipients of the Spring 2011 LDC Data Scholarship program! The LDC Data Scholarship program provides university students with access to LDC data at no cost. Students were asked to complete an application consisting of a proposal describing their intended use of the data and a letter of support from their thesis adviser. LDC received many solid applications from both undergraduate and graduate students attending universities across the globe. After careful deliberation, we have chosen eight proposals to support. These students will receive no-cost copies of LDC data:

Roberto Aceves - Monterrey Institute of Technology and Superior Studies, ITESM (Mexico), graduate student, Computer Science.  Roberto has been awarded a copy of the Speech in Noisy Environments (SPINE) database for his research in automatic speech recognition in noisy environments.

Daniel Escobar - Monterrey Institute of Technology and Superior Studies, ITESM (Mexico), graduate student, Mechatronics and Automation. Daniel has been awarded a copy of Switchboard-2 and NIST SRE for designing a parallel joint factor analysis architecture for a speaker verification system.

Erhan Guven - The George Washington University (USA), graduate student, Computer Science.  Erhan has been awarded a copy of Emotional Prosody (LDC2002S28) for his work in extracting speaker emotional state from spectrograms.

Anup Kolya - Jadavpur University (India), graduate student, Computer Science and Engineering.  Anup has been awarded a copy of ACE 2005 English SpatialML Annotations (LDC2008T03), ACE Time Normalization (TERN) 2004 English Evaluation Data V1.0 (LDC2010T18), and ACE Time Normalization (TERN) 2004 English Training Data v 1.0 (LDC2005T07) for his research in temporal information extraction.

Benjamín Martínez Elizalde - Monterrey Institute of Technology and Superior Studies, ITESM (Mexico), graduate student, Computer Science. Benjamín has been awarded a copy of Switchboard-2 and NIST SRE to support his research in speaker verification modeling.

Hanan Waer - Newcastle University (UK), graduate student, Educational and Applied Linguistics.  Hanan has been awarded a copy of CALLHOME Egyptian Arabic Transcripts (LDC97T19), CALLHOME Egyptian Arabic Transcripts Supplement (LDC2002T38), and Egyptian Colloquial Arabic Lexicon (LDC99L22) for her research in comparing Arabic/English code switching in everyday Arabic conversation and academic discourse.

Muhua Zhu - Northeastern University (China), graduate student, Natural Language Processing. Muhua has been awarded a copy of Chinese Treebank 7.0 (LDC2010T07) to support the development of a high-accuracy Chinese parser.

Vignesh Kalaiselvan, Ganapathy Raman Kasi, Preetham Samue, Ramsrinivas Anantharamakrishnan, and Sathyanarayan Jeevan - Amrita Vishwa Vidyapeetham University (India), undergraduate students, Electronics and Communication Engineering. The group has been awarded CALLHOME Speech, Transcripts, and Lexicon in Egyptian Arabic and German for their research in deriving robust features for multilingual acoustic modeling.

Please join us in congratulating our student winners!   The next LDC Data Scholarship program is scheduled for the Fall 2011 semester.

LDC at NEALLT 2011

LDC will be exhibiting at the upcoming NEALLT (North East Association for Language Learning Technology) conference, which will be held at the University of Pennsylvania from 1-3 April 2011. NEALLT is the regional chapter of the International Association for Language Learning Technology and works to improve language instruction through the use of technology.

LDC's Dr Mohamed Maamouri will discuss how resources developed and distributed by LDC can aid language education in the presentation “Incorporating Resources and New Technologies in Language Education” on Saturday, April 2 (Session 9: 4.00-4.20 pm, Cohen G17). The presentation will highlight the LDC Arabic Reading Enhancement Tool, designed to support the development of reading skills for learning Arabic as a first and second language.

We hope to see you there!
New Publications

(1) 2008/2010 NIST Metrics for Machine Translation (MetricsMaTr) GALE Evaluation Set (LDC2011T05) is a package containing source data, reference translations, machine translations and associated human judgments used in the NIST 2008 and 2010 MetricsMaTr evaluations. The package was compiled by researchers at NIST, making use of Arabic and Chinese broadcast, newswire and web data and reference translations collected and developed by LDC for Phase 2 and Phase 2.5 of the DARPA GALE program.

NIST MetricsMaTr is a series of research challenge events for machine translation (MT) metrology, promoting the development of innovative MT metrics that correlate highly with human assessments of MT quality. Participants submit their metrics to NIST (National Institute of Standards and Technology). NIST runs those metrics on certain held-back test data for which it has human assessments measuring quality and then calculates correlations between the automatic metric scores and the human assessments. Specifically, the goals of MetricsMATR are: to inform other MT technology evaluation campaigns and conferences with regard to improved metrology; to establish an infrastructure that encourages the development of innovative metrics; to build a diverse community that will bring new perspectives to MT metrology research; and to provide a forum for MT metrology discussion and for establishing future directions of MT metrology.
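The correlation step described above can be illustrated with a minimal sketch. The newsletter does not say which correlation coefficient NIST uses, so the Pearson coefficient below is an assumption, and the segment scores are invented purely for illustration; they are not MetricsMaTr data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-segment scores (assumption, not real evaluation data):
# an automatic metric's output and human judgments on a 7-point scale.
metric_scores = [0.21, 0.35, 0.50, 0.62, 0.80]
human_adequacy = [2, 3, 4, 5, 7]

r = pearson(metric_scores, human_adequacy)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would indicate that the automatic metric ranks and scales translations much as the human judges do; a rank-based coefficient such as Spearman's could be substituted if only the ordering matters.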

The first MetricsMaTr challenge was held in 2008; the development data from the 2008 program is available from LDC as 2008 NIST Metrics for Machine Translation (MetricsMATR08) Development Data (LDC2009T05). The MetricsMaTr10 evaluation plan is included in this release.

This release contains 149 documents with corresponding reference translations (Arabic-to-English and Chinese-to-English), system translations and human assessments. The human assessments include the following: Adequacy7 (a 7-point scale for judging the meaning of a system translation with respect to the reference translation); Adequacy Yes/No (whether the given system segment meant essentially the same as the reference translation); Preference (the judges' preference between two candidate translations when compared to a human reference translation); and HTER (Human-targeted Translation Edit Rate, based on the human edits needed to give a system translation the same meaning as a reference translation).

2008/2010 NIST Metrics for Machine Translation (MetricsMaTr) GALE Evaluation Set is distributed via web download.

2011 Subscription Members will automatically receive two copies of this corpus on disc. 2011 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$250.


(2) NIST/USF Evaluation Resources for the VACE Program – Meeting Data Training Set Part 1 (LDC2011V01) was developed by researchers at the Department of Computer Science and Engineering, University of South Florida (USF), Tampa, Florida and the Multimodal Information Group at the National Institute of Standards and Technology (NIST). It contains approximately fifteen hours of meeting room video data collected in 2001 and 2002 at NIST's Meeting Data Collection Laboratory and annotated for the 2005 face, person and hand detection and tracking tasks of the VACE (Video Analysis and Content Extraction) program.

The VACE program was established to develop novel algorithms for automatic video content extraction, multi-modal fusion, and event understanding. During VACE Phases I and II, the program made significant progress in the automated detection and tracking of moving objects including faces, hands, people, vehicles and text in four primary video domains: broadcast news, meetings, street surveillance, and unmanned aerial vehicle motion imagery. Initial results were also obtained on automatic analysis of human activities and understanding of video sequences.

Three performance evaluations were conducted under the auspices of the VACE program between 2004 and 2007.  The 2005 evaluation was administered by USF in collaboration with NIST and guided by an advisory forum including the evaluation participants.

NIST's Meeting Data Collection Laboratory is designed to collect corpora to support research, development and evaluation in meeting recognition technologies. It is equipped to look and sound like a conventional meeting space. The data collection facility includes five Sony EVI-D30 video cameras: four have stationary views of a center conference table with a fixed focus and viewing angle, and an additional 'floating' camera is used to focus on particular participants, the whiteboard or the conference table, depending on the meeting forum. The data is captured in a NIST-internal file format. The video data was extracted from the NIST format and encoded using the MPEG-2 standard in NTSC format.

NIST/USF Evaluation Resources for the VACE Program – Meeting Data Training Set Part 1 is distributed on eight DVD-ROMs.

2011 Subscription Members will automatically receive two copies of this corpus. 2011 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$2500.

5-2-2 ELDA Distribution Campaign 2010


ELDA is launching a special distribution campaign offering very favorable conditions, including discounts on public prices, for acquiring language resources from the ELRA Catalogue of Language Resources.

This offer will be open until the end of December 2010.
For more information on this offer, please contact Valérie Mapelli.


5-2-3 SpeechOcean China

SpeechOcean China has more than 200 large language resources, and some of the databases can be used free of charge by our members for academic research purposes. As an ISCA member, we will also be glad to share these databases with other ISCA members.

5-2-4 ELRA - Language Resources Catalogue - Update (2011-05)


ELRA is happy to announce that 2 new Multimodal and 3 new Speech Resources are now available in its catalogue.
Moreover, two Speech Resources previously announced are now available at better pricing conditions.

1) New Language Resources:

ELRA-S0314 LILA Marathi database
The LILA Marathi database comprises 2,002 Marathi speakers (992 males and 1,010 females) recorded over the Indian mobile telephone network. Each speaker uttered around 46 read and spontaneous items.

ELRA-S0315 A-SpeechDB
A-SpeechDB© is an Arabic speech database which contains about 20 hours of continuous speech recorded through one desktop omnidirectional microphone by 205 native speakers (about 30% female and 70% male), aged between 20 and 45. Automatically generated transcriptions are provided, together with a manually revised version for each sentence.

ELRA-S0316 SmartKom Home (SKH)
Release SKH 1.0 contains 130 recordings in the technical setup ('scenario') SmartKom Home, which is intended as an intelligent communication assistant for the private environment. Naive users were asked to test a 'prototype' for a market study, not knowing that the system was in fact controlled by two human operators. They were asked to solve two tasks in a period of 4.5 minutes while they were left alone with the system.

ELRA-S0317 SmartKom Mobil (SKM)
Release SKM 1.0 contains 146 recordings in the technical setup ('scenario') SmartKom Mobil, which is a portable PDA equipped with a net link and additional intelligent communication devices. Naive users were asked to test a 'prototype' for a market study, not knowing that the system was in fact controlled by two human operators. They were asked to solve two tasks in a period of 4.5 minutes while they were left alone with the system.

ELRA-S0318 SmartKom Audio (SKAUDIO)
Release SKAUDIO 1.0 contains all audio channel recordings of the SmartKom corpora SmartKom Public (cf. ELRA-S0136), SmartKom Home (cf. ELRA-S0316) and SmartKom Mobil (cf. ELRA-S0317).

2) Revised Language Resources (new pricing conditions):

ELRA-S0136 SmartKom Public (SKP)
Release SKP 2.0 contains 172 recordings in the technical setup ('scenario') SmartKom Public which is comparable to a traditional public phone booth but equipped with additional intelligent communication devices. Naive users were asked to test a 'prototype' for a market study not knowing that the system was in fact controlled by two human operators. They were asked to solve two tasks in a period of 4.5 minutes while they were left alone with the system.

ELRA-S0281 LILA Hindi-L1 database
The LILA Hindi-L1 database comprises 2,030 Hindi speakers (1,012 males and 1,018 females, all speakers with Hindi as first language) recorded over the Indian mobile telephone network. Each speaker uttered around 60 read and spontaneous items.

For more information on the catalogue, please contact Valérie Mapelli


5-3 Software

