We are happy to announce that 2 new speech corpora, 1 new lexicon and 2 new bilingual terminological resources are now available in our catalogue.
MGB-5 Moroccan Dialect
ISLRN: 938-639-614-524-5
The MGB-5 Moroccan Dialect corpus comprises 14 hours of Moroccan Arabic speech extracted from 93 YouTube videos distributed across seven genres: comedy, cooking, family/children, fashion, drama, sports, and science clips. The 93 YouTube clips have been manually labelled for speech and non-speech segments. About 12 minutes from each program were selected for transcription. In addition to the 14 transcribed hours, the full programs are also provided, which amount to 48 hours for the 93 programs.
Chinese-Vietnamese - PhraseBank with audio files
ISLRN: 428-557-564-826-7
This Chinese-Vietnamese phrase bank contains 4002 sentence pairs from daily conversations, spoken by native speakers. The scripts include Pinyin, Topic, Cat, and the Vietnamese translation, with corresponding audio in Chinese and Vietnamese. The corpus is provided in XML and WAV formats.
Vietnamese WordNet
ISLRN: 166-795-507-589-2
Manual translation of version 2.1 of the English WordNet into Vietnamese, containing 211000 entries and provided in Excel format.
Idioms French-Vietnamese Dictionary
ISLRN: 167-512-984-991-8
French-Vietnamese idioms dictionary of 448 entries, with French terms translated into Vietnamese and one idiomatic sentence per Vietnamese word, provided in XML format.
Vietnamese Etymology Dictionary
ISLRN: 627-237-063-692-6
Vietnamese etymology dictionary of 3100 entries, containing Vietnamese terms with their correspondence in Kanji + Exp, along with meanings and examples, provided in XML format.
For more information on the catalogue or if you would like to enquire about having your resources distributed by ELRA, please contact us.
_________________________________________
Visit the ELRA Catalogue of Language Resources
Visit the Universal Catalogue
Archives of ELRA Language Resources Catalogue Updates