ISCApad #242

Friday, August 10, 2018 by Chris Wellekens

5-2-12 Speechocean – update (August 2018)

Speechocean: A global language resources and data services supplier

About Speechocean

Speechocean is a well-known provider of language resources and services in the fields of Human-Computer Interaction and Human Language Technology. At present, we provide data services for more than 110 languages and dialects worldwide.

KingLine Data Center: Data Sharing Platform

KingLine Data Center is operated and supervised by Speechocean. It focuses on creating and providing language resources for research and development in human language technology.

These diverse corpora are widely used for research and development in Speech Recognition, Speech Synthesis, Natural Language Processing, Machine Translation, Web Search, and related fields. All corpora are openly accessible to users all over the world, including scientific research institutions, enterprises, and individuals.

For more detailed information, please visit our website: http://kingline.speechocean.com

Newly released data:

1. Russian Speech Synthesis Corpus (Female)

http://kingline.speechocean.com/exchange.php?id=4693&act=view

ID: King-TTS-019

Name: Russian Speech Synthesis Corpus (Female)

Recording Hours: 15.88 Hours

Utterances: 8,930

Parameters: 44.1 kHz, 16-bit

Channels: 2 Channels (waveform + EGG signal)

Phoneme Set: SAMPA

The whole database contains waveform data, phone boundary labeling files, prosody labeling, stress labeling, part-of-speech labeling files and a pronunciation lexicon.

The voice talent was a 28-year-old female broadcaster. She was born and raised in Moscow.

The recording took place in a professional studio. The voice talent recorded 2 to 3 times per week, and the sessions lasted for about 2 months. During recording, 2 recording engineers (one of them a native speaker) monitored and guided the voice talent. They were responsible for recording quality and for the correctness of the text and pronunciation.

2. Chinese Mandarin Speech Synthesis Corpus (Female)

ID: King-TTS-031

Name: Chinese Mandarin Speech Synthesis Corpus (Female)

Recording Hours: 41.86 Hours

Utterances: 24,733 (Chinese / English / Mixed)

Parameters: 48 kHz, 16-bit

Channels: 1 Channel (waveform)

Phoneme Set: Pinyin and CMU

The whole database contains waveform data, phone boundary labeling files and prosody labeling.

The voice talent was a female broadcaster. She was born and raised in Shan Xi, China.

The recording took place in a professional studio. The voice talent recorded 1 to 2 times per week, and the sessions lasted for about 12 months. During recording, 2 recording engineers (native speakers) monitored and guided the voice talent. They were responsible for recording quality and for the correctness of the text and pronunciation.

3. American English Speech Synthesis Corpus (Female)

ID: King-TTS-033

Name: American English Speech Synthesis Corpus (Female)

Recording Hours: 19.86 Hours

Utterances: 13,709

Parameters: 48 kHz, 16-bit

Channels: 1 Channel (waveform)

Phoneme Set: CMU

The whole database contains waveform data, phone boundary labeling files, prosody labeling and a pronunciation lexicon.

The voice talent was a 33-year-old female broadcaster. She was born and raised in Minneapolis, USA.

4. Chinese Child’s Voice Speech Synthesis Corpus (Female)

ID: King-TTS-040

Name: Chinese Child’s Voice Speech Synthesis Corpus (Female)

Recording Hours: 1.15 Hours

Utterances: 1,001

Parameters: 48 kHz, 24-bit

Channels: 1 Channel (waveform)

Phoneme Set: Pinyin

The whole database contains waveform data, scripts, phone boundary labeling files, prosody labeling and POS labeling.

The voice talent was a 7-year-old girl. She was born and raised in Shan Dong, China.

The recording took place in a professional studio. It lasted for about 2 months, and the voice talent recorded 80 to 100 utterances per session. During recording, 2 recording engineers (native speakers) monitored and guided the voice talent. They were responsible for recording quality and for the correctness of the text and pronunciation.
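
For a rough sense of scale, the figures listed for the four corpora above can be turned into an average utterance length and an approximate audio size. The Python sketch below is illustrative only: the `Corpus` class and the function names are hypothetical, "44.1k" and "48k" are read as sample rates in kHz, and the size estimate assumes uncompressed linear PCM, which the listings do not state. The delivered packages may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corpus:
    """Figures copied from one corpus listing (hypothetical container)."""
    corpus_id: str
    hours: float          # "Recording Hours"
    utterances: int       # "Utterances"
    sample_rate_hz: int   # from "Parameters", assumed to be a rate in Hz
    bit_depth: int        # from "Parameters"
    channels: int         # "Channels"


CORPORA = [
    Corpus("King-TTS-019", 15.88, 8930, 44100, 16, 2),
    Corpus("King-TTS-031", 41.86, 24733, 48000, 16, 1),
    Corpus("King-TTS-033", 19.86, 13709, 48000, 16, 1),
    Corpus("King-TTS-040", 1.15, 1001, 48000, 24, 1),
]


def avg_utterance_seconds(corpus: Corpus) -> float:
    """Mean utterance duration: total recorded seconds / utterance count."""
    return corpus.hours * 3600 / corpus.utterances


def pcm_size_gb(corpus: Corpus) -> float:
    """Estimated size in GB (10^9 bytes), assuming uncompressed linear PCM."""
    bytes_per_sample = corpus.bit_depth / 8
    total_seconds = corpus.hours * 3600
    total_bytes = (
        total_seconds * corpus.sample_rate_hz * bytes_per_sample * corpus.channels
    )
    return total_bytes / 1e9


if __name__ == "__main__":
    for corpus in CORPORA:
        print(
            f"{corpus.corpus_id}: "
            f"{avg_utterance_seconds(corpus):.2f} s per utterance on average, "
            f"about {pcm_size_gb(corpus):.2f} GB as raw PCM"
        )
```

These are estimates derived from the published totals, not measurements of the delivered files. Labeling files and lexicons are not included in the size figure.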

Contact Information

Xianfeng Cheng

Vice President

Tel: +86-10-62660928; +86-10-62660053 ext.8080

Mobile: +86-13681432590

Skype: xianfeng.cheng1

Email: chengxianfeng@speechocean.com

cxfxy0cxfxy0@gmail.com

Website: http://en.speechocean.com/
