ISCA - International Speech
Communication Association



ISCApad #158

Thursday, August 04, 2011 by Chris Wellekens

5-2 Database
5-2-1 ELRA - Language Resources Catalogue - Update (2011-05)


*****************************************************************
ELRA - Language Resources Catalogue - Update
*****************************************************************

ELRA is happy to announce that 2 new Multimodal and 3 new Speech Resources are now available in its catalogue.
Moreover, two Speech Resources previously announced are now available at better pricing conditions.

1) New Language Resources:

ELRA-S0314 LILA Marathi database
The LILA Marathi database comprises 2,002 Marathi speakers (992 males and 1,010 females) recorded over the Indian mobile telephone network. Each speaker uttered around 46 read and spontaneous items.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1136

ELRA-S0315 A-SpeechDB
A-SpeechDB© is an Arabic speech database containing about 20 hours of continuous speech recorded through one desktop omnidirectional microphone by 205 native speakers (about 30% female and 70% male), aged between 20 and 45. Automatically generated transcriptions are provided, together with a manually revised version for each sentence.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1140

ELRA-S0316 SmartKom Home (SKH)
Release SKH 1.0 contains 130 recordings in the technical setup ('scenario') SmartKom Home which should be an intelligent communication assistant for the private environment. Naive users were asked to test a 'prototype' for a market study not knowing that the system was in fact controlled by two human operators. They were asked to solve two tasks in a period of 4.5 minutes while they were left alone with the system.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1137

ELRA-S0317 SmartKom Mobil (SKM)
Release SKM 1.0 contains 146 recordings in the technical setup ('scenario') SmartKom Mobil which is a portable PDA equipped with a net link and additional intelligent communication devices. Naive users were asked to test a 'prototype' for a market study not knowing that the system was in fact controlled by two human operators. They were asked to solve two tasks in a period of 4.5 minutes while they were left alone with the system.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1138

ELRA-S0318 SmartKom Audio (SKAUDIO)
Release SKAUDIO 1.0 contains all audio channel recordings of the SmartKom corpora SmartKom Public (cf. ELRA-S0136), SmartKom Home (cf. ELRA-S0316) and SmartKom Mobil (cf. ELRA-S0317).
For more information, see: http://catalog.elra.info/product_info.php?products_id=1139


2) Revised Language Resources (new pricing conditions):

ELRA-S0136 SmartKom Public (SKP)
Release SKP 2.0 contains 172 recordings in the technical setup ('scenario') SmartKom Public which is comparable to a traditional public phone booth but equipped with additional intelligent communication devices. Naive users were asked to test a 'prototype' for a market study not knowing that the system was in fact controlled by two human operators. They were asked to solve two tasks in a period of 4.5 minutes while they were left alone with the system.
For more information, see: http://catalog.elra.info/product_info.php?cPath=37_39&products_id=1130

ELRA-S0281 LILA Hindi-L1 database
The LILA Hindi-L1 database comprises 2,030 Hindi speakers (1,012 males and 1,018 females, all speakers with Hindi as first language) recorded over the Indian mobile telephone network. Each speaker uttered around 60 read and spontaneous items.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1071


For more information on the catalogue, please contact Valérie Mapelli mailto:mapelli@elda.org

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/LRs-Announcements.html 


5-2-2 LDC Newsletter (July 2011)

In this newsletter:

LDC Sponsors a Student Group at 2011 International Linguistics Olympiad

LDC Receives META Prize from META-NET

New publications:

2005 NIST Speaker Recognition Evaluation Test Data

2006 NIST Spoken Term Detection Evaluation Set

NIST/USF Evaluation Resources for the VACE Program - Meeting Data Test Set Part 2



LDC Sponsors a Student Group at 2011 International Linguistics Olympiad

LDC is happy to support the 2011 International Linguistics Olympiad (IOL) by sponsoring a student team. The IOL is one of the twelve International Science Olympiads and is an annual event that brings together students from around the world to compete in linguistically based challenges. This year’s competition takes place July 24-30 at Carnegie Mellon University, Pittsburgh, PA, USA. Students do not need a background in linguistics to participate, since they typically use analysis and deductive reasoning to solve the competition problems.

Please visit the 2011 IOL website for additional details. We wish good luck to all of the participants!

 

LDC Receives META Prize from META-NET

 LDC was awarded a ‘2nd META Prize’ from META-NET ‘for outstanding long term commitment to the preparation and distribution of language resources and technologies.’

 The META Prize is awarded by META-NET to those who provide outstanding products or services that support the European Multilingual Information Society. META-NET is a Network of Excellence dedicated to fostering the technological foundations of a multilingual European information society. Several organizations were honored at this year’s META Forum in Budapest; LDC and ELRA were both honored for supporting and developing language resources.

New Publications

(1) 2005 NIST Speaker Recognition Evaluation Test Data was developed at LDC and NIST (National Institute of Standards and Technology). It consists of 525 hours of conversational telephone speech in English, Arabic, Mandarin Chinese, Russian and Spanish, with associated English transcripts, used as test data in the NIST-sponsored 2005 Speaker Recognition Evaluation (SRE). The ongoing series of yearly SRE evaluations conducted by NIST is intended to be of interest to researchers working on the general problem of text-independent speaker recognition. To that end, the evaluations are designed to be simple, to focus on core technology issues, and to be fully supported and accessible.

The task of the 2005 SRE evaluation was speaker detection, that is, to determine whether a specified speaker is speaking during a given segment of conversational speech. The task was divided into 20 distinct tests, each combining one of five training conditions with one of four test conditions. Further information about the task conditions is contained in The NIST Year 2005 Speaker Recognition Evaluation Plan.

The speech data consists of conversational telephone speech with 'multi-channel' data collected by LDC simultaneously from a number of auxiliary microphones. The files are organized into two segment types: 10-second two-channel excerpts (continuous segments from single conversations that are estimated to contain approximately 10 seconds of actual speech in the channel of interest) and 5-minute two-channel conversations.

The data are stored as 8-bit u-law speech signals in NIST SPHERE format. In addition to the standard header fields, the SPHERE header for each file contains some auxiliary information that includes the language of the conversation and whether the data were recorded over a telephone line. English language word transcripts in .cmt format were produced using an automatic speech recognition (ASR) system with error rates in the range of 15-30%.
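
As an illustration of the 8-bit u-law encoding mentioned above, the following minimal Python sketch expands u-law bytes to 16-bit linear PCM using the standard G.711 expansion. It is not taken from any NIST or LDC tool: the function names are hypothetical, and the sketch assumes that the SPHERE header has already been stripped and that the sample payload is uncompressed.

```python
BIAS = 0x84  # G.711 u-law bias (132)


def ulaw_to_linear(byte):
    """Expand one 8-bit G.711 u-law code word to a signed linear PCM sample."""
    u = ~byte & 0xFF              # u-law code words are stored bit-inverted
    sign = u & 0x80               # top bit: sign
    exponent = (u >> 4) & 0x07    # next three bits: segment number
    mantissa = u & 0x0F           # low four bits: step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(payload):
    """Decode a bytes object of u-law samples (header already removed)."""
    return [ulaw_to_linear(b) for b in payload]


if __name__ == "__main__":
    # Three sample code words: silence, largest positive, largest negative.
    print(decode_ulaw(bytes([0xFF, 0x80, 0x00])))
```

For two-channel files, the decoded samples would still need to be split by channel; how the channels are laid out is described by the SPHERE header and is not handled in this sketch.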

2005 NIST Speaker Recognition Evaluation Test Data is distributed on 7 DVD-ROM.

2011 Subscription Members will automatically receive two copies of this corpus. 2011 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$2000.

*

(2) 2006 NIST Spoken Term Detection Evaluation Set was compiled by researchers at NIST (National Institute of Standards and Technology) and contains approximately eighteen hours of  Arabic, Chinese and English broadcast news, English conversational telephone speech and English meeting room speech used in NIST's 2006 Spoken Term Detection (STD) evaluation. The STD initiative is designed to facilitate research and development of technology for retrieving information from archives of speech data with the goals of exploring promising new ideas in spoken term detection, developing advanced technology incorporating these ideas, measuring the performance of this technology and establishing a community for the exchange of research results and technical insights.

The 2006 STD task was to find all of the occurrences of a specified 'term' (a sequence of one or more words) in a given corpus of speech data. The evaluation was intended to develop technology for rapidly searching very large quantities of audio data. Although the evaluation used modest amounts of data, it was structured to simulate the very large data situation and to make it possible to extrapolate the speed measurements to much larger data sets. Therefore, systems were implemented in two phases: indexing and searching. In the indexing phase, the system processes the speech data without knowledge of the terms. In the searching phase, the system uses the terms, the index, and optionally the audio to detect term occurrences.
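
The two-phase structure described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the method of any evaluated system: it assumes a hypothetical input of time-aligned words (for example, a one-best recognizer output), whereas real systems typically index richer structures and attach confidence scores. All names are illustrative.

```python
from collections import defaultdict


def build_index(words):
    """Indexing phase: map each word to the positions where it occurs.

    `words` is a time-ordered list of (word, start_time, end_time) tuples.
    No search terms are known at this point.
    """
    index = defaultdict(list)
    for pos, (word, _start, _end) in enumerate(words):
        index[word.lower()].append(pos)
    return index


def search(term, index, words):
    """Searching phase: return (start, end) times for each occurrence of
    `term`, a sequence of one or more words."""
    tokens = term.lower().split()
    if not tokens:
        return []
    hits = []
    # Use the index to jump to candidate positions of the first word,
    # then verify the remaining words of the term in sequence.
    for pos in index.get(tokens[0], []):
        window = words[pos:pos + len(tokens)]
        if [w.lower() for w, _, _ in window] == tokens:
            hits.append((window[0][1], window[-1][2]))
    return hits


if __name__ == "__main__":
    words = [("the", 0.0, 0.2), ("spoken", 0.2, 0.6), ("term", 0.6, 0.9),
             ("detection", 0.9, 1.5), ("task", 1.5, 1.9),
             ("spoken", 2.0, 2.4), ("term", 2.4, 2.7)]
    index = build_index(words)   # built once, before any term is known
    print(search("spoken term", index, words))
```

Because the index is built once and reused for every query, the cost of the searching phase depends mainly on how often the first word of a term occurs rather than on the total amount of audio, which is the property that makes extrapolation to larger archives meaningful.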

The evaluation corpus consists of three data genres: broadcast news (BNews), conversational telephone speech (CTS) and conference room meetings (CONFMTG). The broadcast news material was collected in 2003 and 2004 by LDC's broadcast collection system from the following sources: ABC (English), Aljazeera (Arabic), China Central TV (Chinese), CNN (English), CNBC (English), Dubai TV (Arabic), New Tang Dynasty TV (Chinese), Public Radio International (English) and Radio Free Asia (Chinese). The CTS data was taken from the Switchboard data sets (e.g., Switchboard-2 Phase 1 LDC98S75, Switchboard-2 Phase 2 LDC99S79) and the Fisher corpora (e.g., Fisher English Training Speech Part 1 LDC2004S13), also collected by LDC. The conference room meeting material consists of goal-oriented, small group round table meetings and was collected in 2004 and 2005 by NIST, the International Computer Science Institute (Berkeley, California), Carnegie Mellon University (Pittsburgh, PA), TNO (The Netherlands) and Virginia Polytechnic Institute and State University (Blacksburg, VA) as part of the AMI corpus project. This evaluation corpus includes scoring software, which uses the inputs described in the STD Evaluation plan to complete the evaluation of a system.

Each BNews recording is a 1-channel, PCM-encoded, 16 kHz, SPHERE formatted file. CTS recordings are 2-channel, u-law encoded, 8 kHz, SPHERE formatted files. The CONFMTG files contain a single recorded channel.

2006 NIST Spoken Term Detection Evaluation Set is distributed on 1 DVD-ROM.

2011 Subscription Members will automatically receive two copies of this corpus. 2011 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$800.

*

(3) NIST/USF Evaluation Resources for the VACE Program - Meeting Data Test Set Part 2 was developed by researchers at the Department of Computer Science and Engineering, University of South Florida (USF), Tampa, Florida and the Multimodal Information Group at the National Institute of Standards and Technology (NIST). It contains approximately thirteen hours of meeting room video data collected in 2001 and 2002 at NIST's Meeting Data Collection Laboratory and used in the VACE (Video Analysis and Content Extraction) 2005 evaluation.

The VACE program was established to develop novel algorithms for automatic video content extraction, multi-modal fusion, and event understanding. During VACE Phases I and II, the program made significant progress in the automated detection and tracking of moving objects including faces, hands, people, vehicles and text in four primary video domains: broadcast news, meetings, street surveillance, and unmanned aerial vehicle motion imagery. Initial results were also obtained on automatic analysis of human activities and understanding of video sequences.

Three performance evaluations were conducted under the auspices of the VACE program between 2004 and 2007.  The 2005 evaluation was administered by USF in collaboration with NIST and guided by an advisory forum including the evaluation participants.

LDC has previously released NIST/USF Evaluation Resources for the VACE Program -- Meeting Data Training Set Part 1 LDC2011V01, NIST/USF Evaluation Resources for the VACE Program -- Meeting Data Training Set Part 2 LDC2011V02 and NIST/USF Evaluation Resources for the VACE Program -- Meeting Data Test Set Part 1 LDC2011V03.

NIST's Meeting Data Collection Laboratory is designed to collect corpora to support research, development and evaluation in meeting recognition technologies. It is equipped to look and sound like a conventional meeting space. The data collection facility includes five Sony EVI-D30 video cameras, four of which have stationary views of a center conference table (one view from each surrounding wall) with a fixed focus and viewing angle, and an additional 'floating' camera which is used to focus on particular participants, whiteboard or conference table depending on the meeting forum. The data is captured in a NIST-internal file format. The video data was extracted from the NIST format and encoded using the MPEG-2 standard in NTSC format. Further information concerning the video data parameters can be found in the documentation included with this corpus.

NIST/USF Evaluation Resources for the VACE Program - Meeting Data Test Set Part 2 is distributed on 8 DVD-ROM.

2011 Subscription Members will automatically receive two copies of this corpus. 2011 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$2500.

 


 


5-2-3 ELRA receives the META Prize at the META-FORUM 2011 in Budapest

Paris, France, July 18, 2011

ELRA receives the META Prize at the META-FORUM 2011 in Budapest


During the META-FORUM 2011, which took place in Budapest (Hungary) on June 27-28, 2011, ELRA, a world-wide leading player providing Language Resources-related services to the HLT community, received the META Prize for “Outstanding long-term commitment to the preparation and distribution of language resources and technologies”. The Prize is shared with the Linguistic Data Consortium (LDC). The META Prize is awarded by the META-NET Technology Council, a 30-member committee of HLT R&D and industry executives and managers.


*** About ELRA ***
The European Language Resources Association (ELRA) is a world-wide leading player in language resource distribution, sharing and production. ELRA provides services to the language communities related to distribution and production of resources, including production on demand, technology evaluation and benchmarking. ELRA organizes LREC, the major international conference devoted to language resources and evaluation, with over 1,200 attendees from academia and industry. The next LREC edition will be held in Istanbul in 2012.
To find out more about ELRA, please visit our web site: http://www.elra.info


*** About META ***
META, the Multilingual Europe Technology Alliance, brings together researchers, commercial technology providers, private and corporate language technology users, language professionals and other information society stakeholders.
To find out more about META, please visit the web site: http://www.meta-net.eu

Contact: Helene Mazo, mazo@elda.org



