Speechocean – update (June 2014):
Speechocean: A global language resources and data services supplier
Speechocean has over 500 large-scale databases available in 110+ languages and accents, covering desktop, in-car, telephony and tablet PC platforms. Our large and diverse data repository includes ASR databases, TTS databases, lexica, text corpora, etc.
Speechocean is pleased to announce the release of the following additional resources:
ASR Databases
Speechocean provides corpora in 110+ regional languages, available in a variety of formats, speaking styles, recording environments and platforms, including in-car, mobile phone, fixed-line and desktop speech recognition corpora. This month we released more European language databases (Part One), built for tuning and testing speech recognition systems in ASR applications.
In-Car
Serial Number | Kingline Data Names | Sound Parameter | Utterances
King-ASR-129 | Canadian French Speech Recognition Corpus (In Car) Sentence (328 Speakers) | 16 kHz, 16-bit, four channels | 361,560
King-ASR-132 | France French Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 360,000
King-ASR-134 | Turkish Speech Recognition Corpus (In Car) Sentence (316 Speakers) | 16 kHz, 16-bit, four channels | 398,692
King-ASR-141 | Spain Spanish Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 360,000
Telephony
Serial Number | Kingline Data Names | Sound Parameter | Utterances
King-ASR-220 | German Speech Recognition Corpus (Telephone) Conversational 1000 Speakers | 8 kHz, 16-bit, one channel | 150,000
Mobile
Serial Number | Kingline Data Names | Sound Parameter | Utterances
King-ASR-106 | Catalan Speech Recognition Corpus (Mobile) 200 Speakers | 16 kHz, 16-bit, one channel | 60,000
King-ASR-116 | Polish Speech Recognition Corpus (Mobile) 600 Speakers | 16 kHz, 16-bit, one channel | 180,000
King-ASR-124 | Russian Speech Recognition Corpus (Mobile) Sentence (604 Speakers) | 16 kHz, 16-bit, one channel | 180,542
King-ASR-128 | Romanian Speech Recognition Corpus (Mobile) 600 Speakers | 16 kHz, 16-bit, one channel | 180,000
King-ASR-133 | Swedish Speech Recognition Corpus (Mobile) 300 Speakers | 16 kHz, 16-bit, one channel | 45,000
Desktop
Serial Number | Kingline Data Names | Sound Parameter | Utterances
King-ASR-207 | Brazilian Portuguese Speech Recognition Corpus (Desktop) (203 Speakers) | 44.1 kHz, 16-bit, two channels | 121,780
King-ASR-075 | European Portuguese Speech Recognition Corpus (Desktop) 200 Speakers | 44.1 kHz, 16-bit, four channels | 319,908
King-ASR-171 | France French Speech Recognition Corpus (Desktop) Sentence (203 Speakers) | 44.1 kHz, 16-bit, two channels | 121,642
King-ASR-182 | German Speech Recognition Corpus (Desktop) Sentence (200 Speakers) | 44.1 kHz, 16-bit, four channels | 239,940
TTS Databases
Speechocean licenses a variety of databases in more than 40 languages for speech synthesis, including broadcast speech, emotional speech, etc., which can be used with different algorithms.
Serial No. | Kingline Data Names | Sound Parameter | Utterances | Recording Hours
King-TTS-004 | Arabic Speech Synthesis Database 1 (Male) | 16 kHz, 16-bit, two channels | 8,055 | 11.7
King-TTS-005 | Arabic Speech Synthesis Database 2 (Male) | 16 kHz, 16-bit, two channels | 8,039 | 12.01
King-TTS-008 | Spain Spanish Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction
King-TTS-009 | France French Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction
King-TTS-010 | German Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction
King-TTS-015 | Italian Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 10,300 | 13.13
Text Corpora
Speechocean licenses many kinds of text corpora in many languages, which are well suited to language model training.
ID | Kingline Data Names | Languages | Size
King-NLP-017 | Spain Spanish Personal Names Corpus | Spain Spanish | Under construction
King-NLP-018 | Spain Spanish Address Corpus | Spain Spanish | Under construction
King-NLP-021 | Polish Address Corpus | Polish | Under construction
King-NLP-025 | Turkish Personal Names Corpus | Turkish | Under construction
King-NLP-026 | Turkish Address Corpus | Turkish | Under construction
Lexica
Speechocean builds pronunciation lexica in many languages, which can be licensed to customers.
No. | Name | Phoneme Set
King-Lexicon-019 | Italian Pronunciation Lexicon | SAMPA
King-Lexicon-020 | Polish Pronunciation Lexicon | SAMPA
King-Lexicon-021 | Dutch Pronunciation Lexicon | SAMPA
King-Lexicon-022 | Swedish Pronunciation Lexicon | X-SAMPA
King-Lexicon-024 | Finnish Pronunciation Lexicon | Under construction
King-Lexicon-025 | Romanian Pronunciation Lexicon | Under construction
Contact Information
Xianfeng Cheng
Business Manager of Commercial Department
Tel: +86-10-62660928; +86-10-62660053 ext.8080
Mobile: +86 13681432590
Skype: xianfeng.cheng1
Email: chengxianfeng@speechocean.com; cxfxy0cxfxy0@gmail.com
Website: www.speechocean.com