Speechocean – update (May 2014):
Speechocean: A global language resources and data services supplier
Speechocean has over 500 large-scale databases available in 110+ languages and accents, covering desktop, in-car, telephony and tablet PC platforms. Our large and diverse data repository includes ASR databases, TTS databases, lexica, text corpora, etc.
Speechocean is glad to announce the release of additional resources:
ASR Databases
Speechocean provides corpora in 110+ regional languages, available in a variety of formats, situational styles, scene environments and platform systems, including in-car, mobile phone, fixed-line and desktop speech recognition corpora. This month we released more European-language databases designed for tuning and testing speech recognition systems in ASR applications.
1.1 In-Car
| Serial Number | Kingline Data Names | Sound Parameter | Utterances |
| King-ASR-129 | Canadian French Speech Recognition Corpus (In Car) Sentence (328 Speakers) | 16 kHz, 16-bit, four channels | 361,560 |
| King-ASR-132 | France French Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 360,000 |
| King-ASR-134 | Turkish Speech Recognition Corpus (In Car) Sentence (316 Speakers) | 16 kHz, 16-bit, four channels | 398,692 |
| King-ASR-141 | Spain Spanish Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 360,000 |
| King-ASR-147 | Italian Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 360,230 |
| King-ASR-153 | Russian Speech Recognition Corpus (In Car) 308 Speakers | 16 kHz, 16-bit, four channels | 392,200 |
| King-ASR-157 | Polish Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 356,000 |
| King-ASR-162 | Dutch Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 360,030 |
| King-ASR-170 | Danish Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 360,058 |
| King-ASR-172 | Brazilian Portuguese Speech Recognition Corpus (In Car) 300 Speakers | 16 kHz, 16-bit, four channels | 360,020 |
1.2 Telephony
| Serial Number | Kingline Data Names | Sound Parameter | Utterances |
| King-ASR-220 | German Speech Recognition Corpus (Telephone) Conversational 1000 Speakers | 8 kHz, 16-bit, one channel | 150,000 |
| King-ASR-219 | Spanish Speech Recognition Corpus (Telephone) Conversational 1000 Speakers | 8 kHz, 16-bit, one channel | 300,000 |
1.3 Mobile
| Serial Number | Kingline Data Names | Sound Parameter | Utterances |
| King-ASR-106 | Catalan Speech Recognition Corpus (Mobile) 200 Speakers | 16 kHz, 16-bit, one channel | 60,000 |
| King-ASR-116 | Polish Speech Recognition Corpus (Mobile) 600 Speakers | 16 kHz, 16-bit, one channel | 180,000 |
| King-ASR-124 | Russian Speech Recognition Corpus (Mobile) Sentence (604 Speakers) | 16 kHz, 16-bit, one channel | 180,542 |
| King-ASR-128 | Romanian Speech Recognition Corpus (Mobile) 600 Speakers | 16 kHz, 16-bit, one channel | 180,000 |
| King-ASR-133 | Swedish Speech Recognition Corpus (Mobile) 300 Speakers | 16 kHz, 16-bit, one channel | 45,000 |
| King-ASR-149 | Finnish Speech Recognition Corpus (Mobile) 200 Speakers | 8 kHz, 16-bit, one channel | 60,000 |
| King-ASR-154 | Brazilian Portuguese Speech Recognition Corpus (Mobile) Sentence (301 Speakers) | 16 kHz, 16-bit, one channel | 90,266 |
| King-ASR-155 | European Portuguese Speech Recognition Corpus (Mobile) 300 Speakers | 16 kHz, 16-bit, one channel | 90,000 |
| King-ASR-205 | Turkish Speech Recognition Corpus (Mobile) Sentence (302 Speakers) | 16 kHz, 16-bit, one channel | 99,471 |
| King-ASR-206 | Greek Speech Recognition Corpus (Mobile) 300 Speakers | 22 kHz, 16-bit, one channel | 45,000 |
| King-ASR-209 | Dutch Speech Recognition Corpus (Mobile) 200 Speakers | 16 kHz, 16-bit, one channel | 60,000 |
1.4 Desktop
| Serial Number | Kingline Data Names | Sound Parameter | Utterances |
| King-ASR-207 | Brazilian Portuguese Speech Recognition Corpus (Desktop) (203 Speakers) | 44.1 kHz, 16-bit, two channels | 121,780 |
| King-ASR-075 | European Portuguese Speech Recognition Corpus (Desktop) 200 Speakers | 44.1 kHz, 16-bit, four channels | 319,908 |
| King-ASR-171 | France French Speech Recognition Corpus (Desktop) - Sentence (203 Speakers) | 44.1 kHz, 16-bit, two channels | 121,642 |
| King-ASR-182 | German Speech Recognition Corpus (Desktop) - Sentence (200 Speakers) | 44.1 kHz, 16-bit, four channels | 239,940 |
| King-ASR-181 | Italian Speech Recognition Corpus (Desktop) - Sentence (201 Speakers) | 44.1 kHz, 16-bit, four channels | 240,880 |
| King-ASR-212 | Polish Speech Recognition Corpus (Desktop) - Sentence (200 Speakers) | 44.1 kHz, 16-bit, two channels | 119,948 |
| King-ASR-084 | Russian Speech Recognition Corpus (Desktop) - Comprehensive Utterances (200 Speakers) | 44.1 kHz, 16-bit, four channels | 239,936 |
| King-ASR-158 | Swedish Speech Recognition Corpus (Desktop) - Sentence (200 Speakers) | 44.1 kHz, 16-bit, two channels | 119,938 |
| King-ASR-159 | Turkish Speech Recognition Corpus (Desktop) - Sentence (201 Speakers) | 44.1 kHz, 16-bit, two channels | 120,578 |
TTS Databases
Speechocean licenses a variety of databases in more than 40 languages covering speech synthesis, broadcast speech, emotional speech, etc., which can be used with different algorithms.
| Serial No. | Kingline Data Names | Sound Parameter | Utterances | Recording Hours |
| King-TTS-004 | Arabic Speech Synthesis Database 1 (Male) | 16 kHz, 16-bit, two channels | 8,055 | 11.7 |
| King-TTS-005 | Arabic Speech Synthesis Database 2 (Male) | 16 kHz, 16-bit, two channels | 8,039 | 12.01 |
| King-TTS-008 | Spain Spanish Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction |
| King-TTS-009 | France French Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction |
| King-TTS-010 | German Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction |
| King-TTS-015 | Italian Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 10,300 | 13.13 |
| King-TTS-016 | Italian Speech Synthesis Database (Male) | 44.1 kHz, 16-bit, two channels | 7,211 | 9.09 |
| King-TTS-017 | Portugal Portuguese Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 9,900 | 12.6 |
| King-TTS-018 | Portugal Portuguese Speech Synthesis Database (Male) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction |
| King-TTS-019 | Russian Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 8,143 | 14.59 |
| King-TTS-020 | Russian Speech Synthesis Database (Male) | 44.1 kHz, 16-bit, two channels | 8,216 | 12.32 |
| King-TTS-021 | Polish Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction |
| King-TTS-022 | Turkish Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction |
| King-TTS-029 | Ukrainian Speech Synthesis Database (Female) | 44.1 kHz, 16-bit, two channels | 5,000 | Under construction |
Text Corpora
Speechocean licenses many kinds of text corpora in many languages, which are well suited to language model training.
| ID | Kingline Data Names | Languages | Size |
| King-NLP-017 | | | |