ISCA - International Speech
Communication Association



ISCApad #227

Thursday, May 11, 2017 by Chris Wellekens

5-2-1 ELRA - Language Resources Catalogue - Update (April 2017)
  
ELRA - Language Resources Catalogue - Update
*****************************************************************

We are happy to announce that 1 Evaluation Package, 1 Written Corpus, 3 Desktop/Microphone Speech Resources and 1 Broadcast Speech Resource are now available in our catalogue.

ELRA-E0046 ETAPE Evaluation Package
ISLRN: 425-777-374-455-4

The ETAPE Evaluation Package consists of ca. 30 hours of radio and TV data, selected to include mostly unplanned speech and a reasonable proportion of multi-speaker data. All data were carefully transcribed, including named entity annotation. The package contains the material used for the ETAPE evaluation campaign: resources, scoring tools, results of the campaign, etc., that were used or produced during the campaign. Its aim is to enable external players to evaluate their own systems and compare their results with those obtained during the campaign itself.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1299

ELRA-W0117 Danish Propbank (DPB)
ISLRN: 213-212-351-142-5

The Danish Propbank (DPB) is an 87,000-token treebank from a variety of genres, annotated with morphosyntactic and semantic information, namely propositions/frames with VerbNet classes and semantic roles for both arguments and satellites. There are over 12,000 frames with 32,000 role instances. The corpus has also been annotated with 20 Named Entity classes and a 200-category semantic ontology for nouns.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1300

ELRA-S0388 GlobalPhone Bulgarian Pronunciation Dictionary 260k entries (extended version)
ISLRN: 799-402-906-876-5

This extended version of the Bulgarian Pronunciation Dictionary called Bulgarian-Dict260k contains pronunciations of more than 260,000 word forms.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1301

ELRA-S0389 Accented English GlobalPhone
ISLRN: 574-579-221-841-3

The Accented English part of the GlobalPhone resources contains 63 recording sessions of Bulgarian, Chinese, German, and Indian native speakers reading 37 English sentences each, produced in GlobalPhone style, i.e. 16 kHz PCM-encoded audio recordings of utterance-segmented read speech from the newspaper domain.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1302

ELRA-S0390 Parallel EMG-Acoustic English GlobalPhone
ISLRN: 910-309-096-523-6

The Parallel EMG-Acoustic English GlobalPhone language resource contains 63 recording sessions from 8 speakers articulating speech in three speaking modes (audible, whispered, and silent) by reading 3 × 50 English sentences in GlobalPhone style, i.e. 16 kHz PCM-encoded audio recordings of utterance-segmented read speech from the newspaper domain. Speech is recorded in parallel, i.e. synchronously by a standard close-talking microphone and by surface electrodes capturing the activity of the articulatory muscles in the face (electromyography, EMG).
For more information, see: http://catalog.elra.info/product_info.php?products_id=1303

ELRA-S0391 The FAME! Speech Corpus
ISLRN: 340-994-352-616-4

This Frisian corpus consists of 203 audio segments, each approximately 5 minutes long, extracted from various radio programs covering a time span of almost 50 years (1966-2015), which adds a longitudinal dimension to the database. The content of the recordings is very diverse, including radio programs about culture, history, literature, sports, nature, agriculture, politics, society and languages. There are 309 identified speakers in the FAME! Speech Corpus, 21 of whom appear at least 3 times in the database. The total duration of the manually annotated radio broadcasts is 18 hours, 33 minutes and 57 seconds.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1304


For more information on the catalogue, please contact Valérie Mapelli: mapelli@elda.org

If you would like to enquire about having your resources distributed by ELRA, please do not hesitate to contact us.

Visit our On-line Catalogue: http://catalog.elra.info

Visit the Universal Catalogue: http://universal.elra.info

Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/en/catalogues/language-resources-announcements/






