ISCApad #242
Friday, August 10, 2018 by Chris Wellekens
Speechocean – update (August 2018):
Speechocean: A global language resources and data services supplier
About Speechocean
Speechocean is a well-known provider of language resources and services in the fields of Human Computer Interaction and Human Language Technology. We currently provide data services in 110+ languages and dialects across the world.
KingLine Data Center: Data Sharing Platform
KingLine Data Center is operated and supervised by Speechocean. It focuses mainly on creating and providing language resources for research and development in human language technology.
Its diverse corpora are widely used for research and development in the fields of Speech Recognition, Speech Synthesis, Natural Language Processing, Machine Translation, Web Search, etc. All corpora are openly accessible to users all over the world, including scientific research institutions, enterprises and individuals.
For more detailed information, please visit our website: http://kingline.speechocean.com
Newly released data:
1. Russian Speech Synthesis Corpus (Female)
http://kingline.speechocean.com/exchange.php?id=4693&act=view
ID: King-TTS-019
Name: Russian Speech Synthesis Corpus (Female)
Recording Hours: 15.88 Hours
Utterances: 8,930
Parameters: 44.1 kHz, 16-bit
Channels: 2 Channels (waveform + EGG signal)
Phoneme Set: SAMPA
The whole database contains waveform data, phone boundary labeling files, prosody labeling, stress labeling, part-of-speech labeling files and a pronunciation lexicon.
The voice talent was a 28-year-old female broadcaster. She was born and raised in Moscow.
The recording took place in a professional studio. The voice talent recorded 2 to 3 times per week over about 2 months. During recording, 2 recording engineers (one of them a native speaker) monitored and guided the voice talent. They were responsible for recording quality and for the correctness of the text and pronunciation.
2. Chinese Mandarin Speech Synthesis Corpus (Female)
ID: King-TTS-031
Name: Chinese Mandarin Speech Synthesis Corpus (Female)
Recording Hours: 41.86 Hours
Utterances: 24,733 (Chinese / English / Mixed)
Parameters: 48 kHz, 16-bit
Channels: 1 Channel (waveform)
Phoneme Set: Pinyin and CMU
The whole database contains waveform data, phone boundary labeling files and prosody labeling.
The voice talent was a female broadcaster. She was born and raised in Shan Xi, China.
The recording took place in a professional studio. The voice talent recorded 1 to 2 times per week over about 12 months. During recording, 2 recording engineers (native speakers) monitored and guided the voice talent. They were responsible for recording quality and for the correctness of the text and pronunciation.
3. American English Speech Synthesis Corpus (Female)
ID: King-TTS-033
Name: American English Speech Synthesis Corpus (Female)
Recording Hours: 19.86 Hours
Utterances: 13,709
Parameters: 48 kHz, 16-bit
Channels: 1 Channel (waveform)
Phoneme Set: CMU
The whole database contains waveform data, phone boundary labeling files, prosody labeling and a pronunciation lexicon.
The voice talent was a 33-year-old female broadcaster. She was born and raised in Minneapolis, USA.
The recording took place in a professional studio. The voice talent recorded 2 to 3 times per week over about 2 months. During recording, 2 recording engineers (one of them a native speaker) monitored and guided the voice talent. They were responsible for recording quality and for the correctness of the text and pronunciation.
4. Chinese Child’s Voice Speech Synthesis Corpus (Female)
ID: King-TTS-040
Name: Chinese Child’s Voice Speech Synthesis Corpus (Female)
Recording Hours: 1.15 Hours
Utterances: 1,001
Parameters: 48 kHz, 24-bit
Channels: 1 Channel (waveform)
Phoneme Set: Pinyin
The whole database contains waveform data, scripts, phone boundary labeling files, prosody labeling and POS labeling.
The voice talent was a 7-year-old girl. She was born and raised in Shan Dong, China.
The recording took place in a professional studio. It lasted about 2 months, and the voice talent recorded 80 to 100 utterances per session. During recording, 2 recording engineers (native speakers) monitored and guided the voice talent. They were responsible for recording quality and for the correctness of the text and pronunciation.
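For readers comparing the four corpora, the listed figures can be turned into an average utterance length and a rough uncompressed audio size. The sketch below is only an illustration: it assumes that "Parameters" gives the sample rate in kHz and the bit depth, that the audio is stored as plain uncompressed PCM, and that labeling files and file headers are ignored. The class and field names are hypothetical and are not part of any Speechocean tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corpus:
    """Figures copied from the listing above; field names are illustrative."""
    corpus_id: str
    hours: float
    utterances: int
    sample_rate_hz: int  # assumption: "44.1k" / "48k" mean kHz sample rates
    bit_depth: int
    channels: int

    def avg_utterance_seconds(self) -> float:
        # Total recorded seconds divided by the number of utterances.
        return self.hours * 3600 / self.utterances

    def raw_pcm_gib(self) -> float:
        # Assumption: uncompressed PCM, no headers, no labeling files.
        bytes_per_second = self.sample_rate_hz * (self.bit_depth // 8) * self.channels
        return self.hours * 3600 * bytes_per_second / 2**30


CORPORA = [
    Corpus("King-TTS-019", 15.88, 8930, 44100, 16, 2),
    Corpus("King-TTS-031", 41.86, 24733, 48000, 16, 1),
    Corpus("King-TTS-033", 19.86, 13709, 48000, 16, 1),
    Corpus("King-TTS-040", 1.15, 1001, 48000, 24, 1),
]

for c in CORPORA:
    print(
        f"{c.corpus_id}: {c.avg_utterance_seconds():.1f} s per utterance, "
        f"about {c.raw_pcm_gib():.1f} GiB of raw PCM"
    )
```

These are estimates derived from the published totals only; actual package sizes depend on the file format and the accompanying annotation files.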
Contact Information
Xianfeng Cheng
Tel: +86-10-62660928; +86-10-62660053 ext.8080
Mobile: +86-13681432590
Skype: xianfeng.cheng1
Email: chengxianfeng@speechocean.com
Website: http://en.speechocean.com/