ISCApad #241
Tuesday, July 10, 2018 by Chris Wellekens
ELRA - Language Resources Catalogue - Update (February 2018)
-------------------------------------------------------
We are happy to announce that 1 new Monolingual Lexicon, 1 new Written Corpus and 2 new Speech resources are now available in our catalogue.
ELRA-L0100 French dictionary of definitions (SYNAPSE)
ISLRN: 357-949-964-163-0
The French dictionary of definitions (SYNAPSE) consists of 216,835 entries (147,378 nouns, 80,552 adjectives, 24,001 verbs, 4,677 adverbs, 1,560 prefixes, 107 prepositions, 614 interjections, 147 pronouns, 42 conjunctions, 27 articles), 309,078 definitions and 7,395 phraseological units (phrases). Grammatical information for each entry consists of: grammatical category, gender, number, inflected forms. This dictionary is provided in XML format together with its DTD.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1315

ELRA-W0124 English-Vietnamese Parallel Corpus
ISLRN: 838-483-738-912-8
This is a corpus of 500,000 English-Vietnamese sentence pairs. The parallel corpus contains English documents translated by professional translators into Vietnamese. The source texts include books, dictionaries, newspapers, online news. The texts are provided in TEI format.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1316

ELRA-S0394 Metalogue Multi-Issue Bargaining Dialogue
ISLRN: 217-906-813-531-9
This corpus consists of approximately 2.5 hours of semantically annotated English dialogue data that includes speech and transcripts. Six unique subjects (undergraduates between 19 and 25 years of age) participated in the collection. The dialogue speech was captured with two headset microphones and saved in 16 kHz, 16-bit mono linear PCM FLAC format. Transcripts were produced semi-automatically, using an automatic speech recognizer followed by manual correction. All text is presented in UTF-8 as either plain text or XML.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1317

ELRA-S0395 Nautilus Speaker Characterization (NSC) Corpus
ISLRN: 157-037-166-491-1
This corpus comprises clean microphone recordings of conversational speech from 300 German speakers (126 males and 174 females) aged 18 to 35 years, with no marked dialect/accent. The recordings were performed in an acoustically-isolated room in 2016/2017. Four scripted and four semi-spontaneous dialogs were elicited from the speakers, simulating telephone call inquiries. Additionally, spontaneous neutral and emotional speech utterances and questions were produced. All labels are provided, together with the speech recordings and the speakers' metadata.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1318

For more information on the catalogue, please contact Valérie Mapelli (mailto:mapelli@elda.org).
If you would like to enquire about having your resources distributed by ELRA, please do not hesitate to contact us.

Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/en/catalogues/language-resources-announcements/