ISCApad #160
Saturday, October 08, 2011 by Chris Wellekens
SpeechOcean China has more than 200 large language resources, and some of these databases can be used free of charge by our members for academic research purposes. As an ISCA member, we will also be glad to share these databases with other ISCA members.
Speechocean - Language Resource Catalogue - New Releases (2011-10)
Speechocean, as a global provider of language resources and data services, has more than 200 large-scale databases available in 80+ languages and accents, covering the fields of Text to Speech, Automatic Speech Recognition, Text, Machine Translation, Web Search, Videos, Images etc. Speechocean is glad to announce that more speech resources have been released:
Turkish speech recognition Database (Desktop) --- 201 speakers
This Turkish desktop speech recognition database was collected by Speechocean’s project team in Turkey. It is one of the databases of our Speech Data - Desktop Project (SDD), which currently contains database collections in 30 languages.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/789.html
Turkish speech recognition Database (In-car) --- 316 speakers
This Turkish in-car speech recognition database was collected by Speechocean’s project team in Turkey. It is one of the databases of our Speech Data - Car (SDC) Project, which currently contains database collections in more than 30 languages. The script was specially designed to provide material for both training and testing of many classes of speech recognizers, and contains 320 utterances covering 15 categories and 35 sub-categories for each speaker. Each speaker was recorded in two environments, in three variations (parked, city driving and highway driving), under various recording conditions such as motor running, fan on/off and window up/down. A total of 320 utterances were recorded for each speaker across the two environments (160 utterances and spontaneous sentences per environment).
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/793.html
France French speech recognition Database (Desktop) --- 200 speakers
This France French desktop speech recognition database was collected by Speechocean’s project team in France. It is one of the databases of our Speech Data - Desktop Project (SDD), which currently contains database collections in 30 languages.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/796.html
Spain Spanish speech recognition Database (Desktop) --- 210 speakers
This Spain Spanish desktop speech recognition database was collected by Speechocean’s project team in Spain. It is one of the databases of our Speech Data - Desktop Project (SDD), which currently contains database collections in 30 languages.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/795.html
UK English speech recognition Database (Desktop) --- 200 speakers
This UK English desktop speech recognition database was collected by Speechocean’s project team in the UK. It is one of the databases of our Speech Data - Desktop Project (SDD), which currently contains database collections in 30 languages.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/792.html
Portugal Portuguese speech recognition Database (Desktop) --- 200 speakers
This Portugal Portuguese desktop speech recognition database was collected by Speechocean’s project team in Portugal. It is one of the databases of our Speech Data - Desktop Project (SDD), which currently contains database collections in 30 languages.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/791.html
Swedish speech recognition Database (Desktop) --- 200 speakers
This Swedish desktop speech recognition database was collected by Speechocean’s project team in Sweden. It is one of the databases of our Speech Data - Desktop Project (SDD), which currently contains database collections in 30 languages.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/790.html
Canadian French Desktop speech recognition Corpus (200 speakers) was launched in Canada
In response to our client's urgent demands, the Canadian French desktop speech recognition database (200 speakers) was collected by Speechocean’s project team in Canada. This database belongs to Speechocean's Desktop Speech Data Project.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/733.html
Chinese Mandarin In-car Speech Recognition Database was Successfully Released!
The Chinese Mandarin In-car Speech Recognition Database was successfully released under the serial number King-ASR-122 in our Catalogue. This database was made for tuning and testing in-car speech recognition systems. It belongs to SPC’s Multi-language In-car Speech Data Project. The script was specially designed to provide material for both training and testing of many classes of speech recognizers, and contains 320 utterances covering 15 categories and 35 sub-categories for each speaker.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/781.html
The American Spanish Mobile Speech Recognition Database was Successfully Released!
The American Spanish Mobile Speech Recognition Database was successfully released under the catalogue serial number King-ASR-119. This database was made for tuning and testing speech recognition systems for IVR / mobile use. It belongs to SPC’s Multi-language Mobile Speech Data Project.
All audio files are manually transcribed and labelled. A pronunciation lexicon with a phonetic transcription in SAMPA is also included. For more information, please see the technical document at the following link: http://www.speechocean.com/en-ASR-Corpora/779.html
Visit our on-line Catalogue: http://www.speechocean.com/en-Product-Catalogue/Index.html
For more information about our databases and services, please visit our website: www.speechocean.com
If you have any inquiries regarding our databases and services, please feel free to contact us:
XiangFeng Cheng mailto:Chengxianfeng@speechocean.com
Marta Gherardi mailto:Marta@speechocean.com