
ISCApad #163

Wednesday, January 11, 2012 by Chris Wellekens

5-2 Database
5-2-1 Nominations for the Antonio Zampolli Prize (ELRA)

The ELRA Board has created a prize to honour the memory of its first President, Professor Antonio Zampolli, a pioneer and visionary scientist
who was internationally recognized in the field of computational linguistics and Human Language Technologies (HLT). He also contributed
much through the establishment of ELRA and the LREC conference.

To reflect Antonio Zampolli's specific interest in our field, the Prize will be awarded to individuals whose work lies within the areas of
Language Resources and Language Technology Evaluation with acknowledged contributions to their advancements.

The Prize will be awarded for the fifth time in May 2012 at the LREC 2012 conference in Istanbul (21-27 May 2012). Completed nominations should
be sent to the ELRA President, Stelios Piperidis, at AntonioZampolli-Prize@elra.info no later than February 1st, 2012.
 

On behalf of the ELRA Board
Stelios Piperidis
President

Please visit the ELRA web site for the Antonio Zampolli Prize statutes and the nomination procedure: http://www.elra.info/Antonio-Zampolli-Prize.html


5-2-2 ELRA - Language Resources Catalogue - Update (2012-01)

*****************************************************************
ELRA - Language Resources Catalogue - Update
*****************************************************************

ELRA is happy to announce that 14 new Speech Resources are now available in its catalogue.

ELRA-S0324 Catalan-SpeechDat for the Fixed Telephone Network Database
This speech database contains the recordings of 2000 Catalan speakers who called from fixed telephones and were recorded over the fixed PSTN using an ISDN-BRI interface. Each speaker uttered around 50 read and spontaneous items. The speech database follows the specifications made within the SpeechDat (II) project. The database was validated by UVIGO. The Catalan-SpeechDat for the Fixed Telephone Network Database was funded by the Catalan Government.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_38&products_id=1146

ELRA-S0325 Catalan-SpeechDat for the Mobile Telephone Network Database
This speech database contains the recordings of 2000 Catalan speakers who called from GSM telephones and were recorded over the fixed PSTN using an ISDN-BRI interface. Each speaker uttered around 50 read and spontaneous items. The speech database follows the specifications made within the SpeechDat (II) project. The database was validated by UVIGO. The Catalan-SpeechDat for the Mobile Telephone Network Database was funded by the Catalan Government.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_38&products_id=1147

ELRA-S0326 Catalan SpeechDat-Car database
The Catalan SpeechDat-Car database contains the in-car recordings of 300 speakers who each uttered around 120 read and spontaneous items. Each speaker recorded two sessions. Recordings were made through 4 different channels via in-car microphones (1 close-talk microphone, 3 far-talk microphones). The 300 Catalan speakers were selected from 5 different dialectal regions and are balanced in gender and age groups. The database was validated by UVIGO. The Catalan SpeechDat-Car Database was funded by the Catalan Government.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1148

ELRA-S0327 Catalan Speecon database
The Catalan Speecon database comprises the recordings of 550 adult Catalan speakers who uttered over 290 items (read and spontaneous). The data were recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). The speech database follows the specifications made within the EU-funded Speecon project. The database was validated by UVIGO. The Catalan Speecon Database was funded by the Catalan Government.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1149

ELRA-S0328 Spanish EUROM.1
EUROM.1 is a multilingual European speech database. It contains over 60 speakers per language who pronounced numbers, sentences, isolated words, etc., using a close-talking microphone in an anechoic room. Equivalent corpora exist for each of the European languages, with the same number of speakers selected in the same way and recorded under the same conditions with common file formats.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1150

ELRA-S0329 Emotional speech synthesis database
This database contains the recordings of one male and one female professional Spanish speaker recorded in a noise-reduced room. It consists of recordings and annotations of read text material in neutral style plus six MPEG expressions, all in fast, slow, soft and loud speech styles. The text material is composed of 184 items including phonetically balanced sentences, digits and isolated words. The text material was the same for all modes and styles, giving a total of 3h 59min of recorded speech for the male speaker and 3h 53min for the female speaker. The Emotional speech synthesis database was created within the scope of the EU-funded Interface project.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1151

ELRA-S0330 FESTCAT Catalan TTS baseline male speech database
This database contains the recordings of one male Catalan professional speaker recorded in a noise-reduced room simultaneously through a close-talk microphone and a mid-distance microphone, together with a laryngograph signal. It consists of recordings and annotations of read text material of approximately 10 hours of speech for baseline applications (Text-to-Speech systems). The FESTCAT Catalan TTS Baseline Male Speech Database was created within the scope of the FESTCAT project, funded by the Catalan Government.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1152

ELRA-S0331 FESTCAT Catalan TTS baseline female speech database
This database contains the recordings of one female Catalan professional speaker recorded in a noise-reduced room simultaneously through a close-talk microphone and a mid-distance microphone, together with a laryngograph signal. It consists of recordings and annotations of read text material of approximately 10 hours of speech for baseline applications (Text-to-Speech systems). The FESTCAT Catalan TTS Baseline Female Speech Database was created within the scope of the FESTCAT project, funded by the Catalan Government.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1153

ELRA-S0332 FESTCAT Catalan TTS baseline speech database - 8 speakers
This database contains the recordings of four female and four male Catalan professional speakers recorded in a noise-reduced room simultaneously through a close-talk microphone and a mid-distance microphone, together with a laryngograph signal. It consists of the recordings and annotations of read text material of approximately 1 hour of speech per speaker for baseline applications (Text-to-Speech systems). The FESTCAT Catalan TTS baseline speech database - 8 speakers was created within the scope of the FESTCAT project, funded by the Catalan Government.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1154

ELRA-S0333 Spanish Festival HTS models - male speech
This database contains the Festival HTS models trained with 10h of speech from the TC-STAR Spanish Baseline Male Speech Database (ELRA-S0310).
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1155

ELRA-S0334 Spanish Festival HTS models - female speech
This database contains the Festival HTS models trained with 10h of speech from the TC-STAR Spanish Baseline Female Speech Database (ELRA-S0309).
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1156

ELRA-S0335 Bilingual (Spanish-English) Speech synthesis HTS models
This database contains Bilingual (English and Spanish) Festival HTS models. Models were trained with 9h of speech from 2 female bilingual speakers and 2 male bilingual speakers. Each speaker recorded 2h 15 min per language. The speech data can be found in the TC-STAR Bilingual Voice-Conversion Spanish Speech Database (ELRA-S0311) and in the TC-STAR Bilingual Expressive Spanish Speech Database (ELRA-S0313).
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1157

ELRA-S0336 Spanish Festival voice male
This database contains the recordings of one male Spanish speaker recorded in a noise-reduced room simultaneously through a close-talk microphone and a mid-distance microphone, together with a laryngograph signal. It comprises read text material of approximately 10 hours of speech for baseline applications (Text-to-Speech systems). The database includes Festival-compatible annotations. The recordings can also be found in the TC-STAR Spanish Baseline Male Speech Database (ELRA-S0310).
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1158

ELRA-S0337 Spanish Festival voice female
This database contains the recordings of one female Spanish speaker recorded in a noise-reduced room simultaneously through a close-talk microphone and a mid-distance microphone, together with a laryngograph signal. It comprises read text material of approximately 10 hours of speech for baseline applications (Text-to-Speech systems). The database includes Festival-compatible annotations. The recordings can also be found in the TC-STAR Spanish Baseline Female Speech Database (ELRA-S0309).
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1159
 

For more information on the catalogue, please contact Valérie Mapelli (mapelli@elda.org)

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/LRs-Announcements.html


5-2-3 LDC Newsletter (December 2011)

In this newsletter:

  •  Spring 2012 LDC Data Scholarship Program - deadline approaching!
  •  LDC Exhibiting at LSA 2012 Annual Meeting
  •  LDC Hosts Satellite Workshop at LSA 2012
  •  LDC to Close for Winter Break

New publications:

  •  LDC2011S10 - 2006 NIST Speaker Recognition Evaluation Test Set Part 1
  •  LDC2011S11 - 2008 NIST Speaker Recognition Evaluation Supplemental Set


 Spring 2012 LDC Data Scholarship Program - deadline fast approaching!

The deadline for the Spring 2012 LDC Data Scholarship Program is less than a month away!   Applications are being accepted through January 15, 2012.  The LDC Data Scholarship program provides university students with access to LDC data at no cost.  This program is open to students pursuing both undergraduate and graduate studies in an accredited college or university. LDC Data Scholarships are not restricted to any particular field of study; however, students must demonstrate a well-developed research agenda and a bona fide inability to pay. 

Students will need to complete an application which consists of a data use proposal and letter of support from their adviser.  For further information on application materials and program rules, please visit the LDC Data Scholarship page. 

Students can email their applications to the LDC Data Scholarship program. Decisions will be sent by email from the same address.

LDC Exhibiting at LSA 2012 Annual Meeting

LDC looks forward to mingling with linguists and language specialists when we exhibit at the 86th Annual Meeting of the Linguistic Society of America (LSA). The main conference will be held January 5-8, 2012, at the Hilton and Executive Tower in Portland, OR, and the exhibit hall will be open January 6-8 (limited hours on Sunday the 8th). Please stop by our display for news on what 2012 will hold for LDC and to receive some of our conference giveaways.

LSA 2012 will feature plenary talks on the following topics:

  •  Patrice Speeter Beddor (University of Michigan): 'The Dynamics of Speech Perception: Constancy, Variation, and Change'
  •  Dan Jurafsky (Stanford University): 'Computing Meaning: Learning and Extracting Meaning from Text'
  •  Ted Supalla (University of Rochester): 'Rethinking the Emergence of Grammatical Structure in Signed Languages: New Evidence from Variation and Historical Change in American Sign Language'

For further information visit the LSA Annual Meeting website. If you would like to learn more about LDC’s conference preparations, please ‘like’ our Facebook page.

We hope to see you there!

LDC Hosts Satellite Workshop at LSA 2012

LDC will co-host a satellite workshop entitled Sociolinguistic Archival Preparation on January 4-5, 2012 in conjunction with the LSA 2012 Annual Meeting  in Portland, OR.  This two-day workshop will focus on techniques to permit the archiving of data, for cross-community sharing of corpora as well as for subsequent 'panel' studies. Recent discussions within the field have concluded that present protocols need to be expanded to permit adequate archiving. Specifically:

  •  Institutional Review Board (IRB) paperwork needs to be adapted to provide protection for interviewees while permitting their speech data to be more generally sharable (and therefore archivable);
  •  Demographic, situational, and attitudinal protocols are needed to provide a unified resource serving multiple research communities as well as the contributing researchers.

The sooner IRB forms and research protocols are aligned with each other, the sooner sharable, archivable corpora will become available, permitting intergroup comparison and interdisciplinary collaboration.

LDC's Executive Director, Christopher Cieri, and LDC consultant and University of Arizona scholar, Malcah Yaeger-Dror, are the workshop organizers. This workshop is funded in part by the National Science Foundation (BCS#1144480). Further information about the workshop is available on the LSA Annual Meeting website.

LDC to Close for Winter Break

LDC will be closed from Monday, December 26, 2011 through Monday, January 2, 2012 in accordance with the University of Pennsylvania Winter Break Policy.  Our offices will reopen on Tuesday, January 3, 2012.  Requests received for membership renewals and corpora during the Winter Break will be processed at that time.

Best wishes for a happy and safe holiday season!

New Publications

(1) 2006 NIST Speaker Recognition Evaluation Test Set Part 1 was developed by LDC and the National Institute of Standards and Technology (NIST). It contains 437 hours of conversational telephone and microphone speech in English, Arabic, Bengali, Chinese, Farsi, Hindi, Korean, Russian, Spanish, Thai and Urdu, and associated English transcripts used as test data in the NIST-sponsored 2006 Speaker Recognition Evaluation (SRE).

The ongoing series of yearly SRE evaluations conducted by NIST is intended to be of interest to researchers working on the general problem of text-independent speaker recognition. The task of the 2006 SRE evaluation was speaker detection, that is, to determine whether a specified speaker is speaking during a given segment of conversational telephone speech. The task was divided into 15 distinct tests, each involving one of five training conditions and one of four test conditions. Further information about the test conditions and additional documentation is available at the NIST web site for the 2006 SRE and within the 2006 SRE Evaluation Plan.

The speech data in this release was collected by LDC as part of the Mixer project, in particular Mixer Phases 1, 2 and 3. The Mixer project supports the development of robust speaker recognition technology by providing carefully collected and audited speech from a large pool of speakers recorded simultaneously across numerous microphones and in different communicative situations and/or in multiple languages. The data is mostly English speech, but includes some speech in Arabic, Bengali, Chinese, Farsi, Hindi, Korean, Russian, Spanish, Thai and Urdu.

The telephone speech segments are multi-channel data collected simultaneously from a number of auxiliary microphones. The files are organized into four types: two-channel excerpts of approximately 10 seconds, two-channel conversations of approximately 5 minutes, summed-channel conversations also of approximately 5 minutes and a two-channel conversation with the usual telephone speech replaced by auxiliary microphone data in the putative target speaker channel. The auxiliary microphone conversations are also of approximately five minutes in length.

English language transcripts in .ctm format were produced using an automatic speech recognition (ASR) system.
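CTM is NIST's time-marked transcript convention, which typically lists one word per line as 'file channel start duration word [confidence]'. As a rough illustration only, the sketch below shows how such a transcript could be read under that assumed layout; the example path is hypothetical and not part of the release.

# Minimal sketch for reading a CTM-style transcript
# (assumed layout: "<file> <channel> <start> <duration> <word> [<confidence>]").
# The file name used in the usage comment is hypothetical.
from collections import namedtuple

Token = namedtuple("Token", "file channel start duration word confidence")

def read_ctm(path):
    tokens = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith(";;"):   # skip blanks and comment lines
                continue
            parts = line.split()
            conf = float(parts[5]) if len(parts) > 5 else None
            tokens.append(Token(parts[0], parts[1],
                                float(parts[2]), float(parts[3]),
                                parts[4], conf))
    return tokens

# Hypothetical usage:
# for tok in read_ctm("sre06_part1/example.ctm"):
#     print(f"{tok.start:7.2f} {tok.word}")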

2006 NIST Speaker Recognition Evaluation Test Set Part 1 is distributed on five DVD-ROMs.

2011 Subscription Members will automatically receive two copies of this corpus. 2011 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$2000.

*

(2) 2008 NIST Speaker Recognition Evaluation Supplemental Set was developed by LDC and the National Institute of Standards and Technology (NIST) and contains additional data distributed after the main 2008 Speaker Recognition Evaluation (SRE). Specifically, the corpus consists of 770 hours of English microphone speech along with transcripts and other materials used as supplemental data in the 2008 NIST Speaker Recognition Evaluation (SRE) and in a follow-up evaluation to SRE08.

The 2008 evaluation was distinguished from prior evaluations by including not only conversational telephone speech data but also conversational speech data of comparable duration recorded over a microphone channel involving an interview scenario. The follow-up evaluation focused on speaker detection in the context of conversational interview type speech and was designed to measure the performance of SRE08 systems in previously unexposed test segment channel conditions.

LDC previously released the main 2008 NIST SRE Evaluation in three parts as 2008 NIST Speaker Recognition Evaluation Training Set Part 1 LDC2011S05, 2008 NIST Speaker Recognition Evaluation Training Set Part 2 LDC2011S07 and 2008 NIST Speaker Recognition Evaluation Test Set LDC2011S08.

The speech data in this release was collected in 2007 by LDC at its Human Subjects Data Collection Laboratories in Philadelphia and by the International Computer Science Institute (ICSI) at the University of California, Berkeley. This collection was part of the Mixer 5 project, which was designed to support the development of robust speaker recognition technology by providing carefully collected and audited speech from a large pool of speakers recorded simultaneously across numerous microphones and in different communicative situations and/or in multiple languages. Mixer participants were native English and bilingual English speakers. The microphone speech in this corpus is in English and consists of approximately 3-minute and 30-minute interview excerpts.

This supplemental data is split into four different parts which provide:

  •  new training data distributed to 2008 SRE participants
  •  additional data distributed to participants in the 2008 SRE follow-up evaluation
  •  interviewer channel files for the 2008 SRE main test (released after the evaluations)
  •  supplemental training data (released after the evaluations)

English language transcripts in .ctm format were produced using an automatic speech recognition (ASR) system and are included for some, but not all, of the speech data.

2008 NIST Speaker Recognition Evaluation Supplemental Set is distributed on five DVD-ROMs.

2011 Subscription Members will automatically receive two copies of this corpus. 2011 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$2000.


5-2-4 Speechocean December 2011 update

Speechocean - Language Resource Catalogue - New Releases (2011-10)
Speechocean, a global provider of language resources and data services, has more than 200 large-scale databases available in 80+ languages and accents, covering the fields of Text-to-Speech, Automatic Speech Recognition, Text, Machine Translation, Web Search, Videos, Images, etc.
Speechocean is glad to announce that more Speech Resources have been released:
Canadian French Speech Recognition Database - Sentences (Desktop) -- 200 speakers
This Canadian French speech recognition database was collected by Speechocean’s project team in Canada. It contains the voices of 200 different native speakers who were demographically balanced by age (mainly 16-30, 31-45, 46-60), gender (50±5% male, 50±5% female) and regional accent. A script pool with a total of 20,000 simple sentences was phonetically designed for both training and testing of speech recognizers. Each speaker recorded 300 sentences randomly selected from the script pool. All speakers were recorded in a quiet office room through two professional microphones. Each prompted utterance is stored in a separate file, and each signal file is accompanied by an ASCII SAM label file containing the relevant descriptive information.
A pronunciation lexicon with a phonemic SAMPA transcription is also included.
For more information, please see the technical document at the following link:
http://www.speechocean.com/en-ASR-Corpora/616.html
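As a rough illustration of how the SAMPA pronunciation lexicon mentioned above might be consumed, the sketch below assumes a simple two-column layout (orthographic word, a tab, then space-separated SAMPA phonemes); the file name and the example entry are hypothetical and not taken from the database documentation.

# Minimal sketch for loading a pronunciation lexicon with SAMPA transcriptions.
# Assumed layout: one entry per line, "word<TAB>SAMPA phonemes"; the path and
# the example lookup below are hypothetical.
def load_lexicon(path):
    lexicon = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            word, phones = line.split("\t", 1)
            # A word may have several pronunciation variants.
            lexicon.setdefault(word.lower(), []).append(phones.split())
    return lexicon

# Hypothetical usage:
# lex = load_lexicon("ca_fr_lexicon.txt")
# print(lex.get("bonjour"))   # e.g. [['b', 'o~', 'Z', 'u', 'R']]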


UK English Speech Recognition Database - Sentences (Desktop) - 200 Speakers
This UK English desktop speech recognition database was collected by Speechocean’s project team in the UK. It is one of the databases of our Speech Data - Desktop Project (SDD), which currently contains database collections for 30 languages.
It contains the voices of 200 different native speakers, balanced by age (mainly 16-30, 31-45, 46-60), gender (106 males, 94 females) and regional accent (for details, please see the technical document). The script was specially designed to provide material for both training and testing of many classes of speech recognizers. Each speaker was recorded in a quiet office environment and read 300 phonetically rich sentences randomly selected from a specially designed sentence pool.
The speech data are stored as sequences of 48.1 kHz, 16-bit, uncompressed samples. A pronunciation lexicon with a phonemic transcription in SAMPA is also included. The net recording time is 189.1 hours. Phoneme labelling was carried out manually for 6843 sentences (100034 words) chosen from 24 speakers.
For more information, please see the technical document at the following link:
http://www.speechocean.com/en-ASR-Corpora/792.html
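The entry above describes the audio format only in broad terms. As a rough illustration, the sketch below shows how one such utterance file might be loaded, assuming headerless 16-bit little-endian PCM at the stated sampling rate; the file path is hypothetical, and if the files carry RIFF/WAV headers, Python's wave module should be used instead.

# Minimal sketch for loading one utterance, assuming headerless 16-bit
# little-endian PCM at the sampling rate stated in the catalogue entry.
# The path in the usage comment is hypothetical.
import numpy as np

SAMPLE_RATE = 48_100   # 48.1 kHz, as stated above

def load_pcm(path, sample_rate=SAMPLE_RATE):
    # Read raw 16-bit samples and scale to the range [-1.0, 1.0).
    samples = np.fromfile(path, dtype="<i2").astype(np.float32) / 32768.0
    duration = len(samples) / sample_rate
    return samples, duration

# Hypothetical usage:
# audio, seconds = load_pcm("uk_sdd/speaker_001/utt_0001.pcm")
# print(f"{seconds:.2f} s of audio")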

UK TTS Speech database (Female)
The UK English Speech Corpus consists of recordings of one native UK English professional broadcaster (female, 35 years old), recorded in a studio with a high SNR (>35 dB) over two channels (an AKG C4000B microphone and an electroglottography (EGG) sensor).
The Corpus includes the following sub-corpora:
1. Sentence sub-corpus: 3000 short sentences (7~12 words) and 2000 sentences of normal length (13~20 words). To cover a wide range of linguistic phenomena, all sentences were extracted from everyday articles published in England, such as national and international news and pieces about life, travel, and so on. Sentences containing political, religious, obscene or pornographic words which might provoke negative emotions were carefully excluded.
2. Emotional sub-corpus: 100 exclamatory sentences and 100 interrogative sentences which can be used for emotional TTS studies;
3. Digit sub-corpus: many kinds of digit data, such as isolated digits, connected digit blocks, and natural and ordinal number readings;
4. Expression sub-corpus: general expressions, such as date, time, money and measure expressions;
5. Spell sub-corpus: alphabet characters, Greek characters and general abbreviations.
All reading prompts were manually revised, and prosody annotations were made according to the actual speech.
All speech data are segmented and labelled at the phone level. A pronunciation lexicon and pitch contours extracted from the EGG signal can also be provided on request.
For more information, please see the technical document at the following link:
http://www.speechocean.com/en-TTS-Corpora/799.html

US TTS speech database (Female)
The US English Speech Corpus consists of recordings of one native US English professional broadcaster (female, 36 years old), recorded in a studio with a high SNR (>35 dB) over two channels (an AKG C4000B microphone and an electroglottography (EGG) sensor).
The Corpus includes the following sub-corpora:
1. Sentence sub-corpus: 3000 short sentences (7~12 words) and 2000 sentences of normal length (13~20 words). To cover a wide range of linguistic phenomena, all sentences were extracted from everyday articles published in the USA, such as national and international news and pieces about life, travel, and so on. Sentences containing political, religious, obscene or pornographic words which might provoke negative emotions were carefully excluded.
2. Emotional sub-corpus: 100 exclamatory sentences and 100 interrogative sentences which can be used for emotional TTS studies;
3. Digit sub-corpus: many kinds of digit data, such as isolated digits, connected digit blocks, and natural and ordinal number readings;
4. Expression sub-corpus: general expressions, such as date, time, money and measure expressions;
5. Spell sub-corpus: alphabet characters, Greek characters and general abbreviations.
All reading prompts were manually revised, and prosody annotations were made according to the actual speech.
All speech data are segmented and labelled at the phone level. A pronunciation lexicon and pitch contours extracted from the EGG signal can also be provided on request.

Italian TTS speech database (Female)
The Italian Speech Corpus consists of recordings of one native Italian professional broadcaster (female, 26 years old), recorded in a studio with a high SNR (>35 dB) over two channels (an AKG C4000B microphone and an electroglottography (EGG) sensor).
The Corpus includes the following sub-corpora:
1. Sentence sub-corpus: 3000 short sentences (7~12 words) and 2000 sentences of normal length (13~20 words). To cover a wide range of linguistic phenomena, all sentences were extracted from everyday articles published in Italy, such as national and international news and pieces about life, travel, and so on. Sentences containing political, religious, obscene or pornographic words which might provoke negative emotions were carefully excluded.
2. Emotional sub-corpus: 100 exclamatory sentences and 100 interrogative sentences which can be used for emotional TTS studies;
3. Digit sub-corpus: many kinds of digit data, such as isolated digits, connected digit blocks, and natural and ordinal number readings;
4. Expression sub-corpus: general expressions, such as date, time, money and measure expressions;
5. Spell sub-corpus: alphabet characters, Greek characters and general abbreviations.
All reading prompts were manually revised, and prosody annotations were made according to the actual speech.
All speech data are segmented and labelled at the phone level. A pronunciation lexicon and pitch contours extracted from the EGG signal can also be provided on request.

For more information about our databases and services, please visit our website www.speechocean.com or our on-line catalogue at http://www.speechocean.com/en-Product-Catalogue/Index.html
If you have any inquiries regarding our databases and services, please feel free to contact us:
Xianfeng Cheng: Chengxianfeng@speechocean.com
Marta Gherardi: Marta@speechocean.com


5-2-5 ELDA Distribution Campaign 2011

*****************************************************************

ELDA Distribution Campaign 2011

*****************************************************************

ELDA is launching a special distribution campaign offering very favorable conditions for the acquisition of language resources from the ELRA Catalogue of Language Resources (see http://catalog.elra.info), including discounts on public prices.

This offer will be open until the end of December 2011.
 
For more information on this offer, please contact Valérie Mapelli (mapelli@elda.org)

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/LRs-Announcements.html
