ISCA - International Speech
Communication Association



ISCApad #240

Tuesday, June 12, 2018 by Chris Wellekens

5-2-12 Speechocean – update (May 2018)
  

 


 

 

 

Speechocean: A global language resources and data services supplier

 

 

 

About Speechocean

 

Speechocean is one of the world's well-known providers of language resources and services in the fields of Human-Computer Interaction and Human Language Technology. At present, we provide data services in more than 110 languages and dialects from around the world.

 

 

 

KingLine Data Center: Data Sharing Platform

 

KingLine Data Center is operated and supervised by Speechocean. It focuses on creating and providing language resources for research and development in human language technology.

 

Its diversified corpora are widely used for research and development in Speech Recognition, Speech Synthesis, Natural Language Processing, Machine Translation, Web Search, etc. All corpora are openly accessible to users all over the world, including scientific research institutions, enterprises, and individuals.

 

 

 

For more detailed information, please visit our website: http://kingline.speechocean.com

 

 

 

Newly released data:

 

 

 

1. Chinese Mandarin VPR Corpus (Mobile)-Comprehensive Sentences-300 Speakers

 

 

 

S.N: King-ASR-620

 

 

 

Details:

 

 

 

This Chinese Mandarin VPR Speech Recognition Corpus was collected in Beijing, China, in November 2017. It contains 190,281 utterances in total.

 

 

 

Three mobile operating systems were used simultaneously during data collection: iOS, Android, and Windows Phone.

 

 

 

This corpus contains the voices of 300 different speakers (133 males, 167 females), each recorded in quiet home and office environments. Most speakers were 16 to 30 years old. Each speaker was scheduled to record two sessions on different days, more than one week apart. In each session, a speaker recorded around 106 utterances in approximately 30 minutes, speaking as naturally as possible.
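The figures above can be cross-checked with a rough calculation. The sketch below assumes that each of the three mobile devices produced its own recording of every utterance; this is an inference from the description, not something the announcement states explicitly. All variable and function names are illustrative.

```python
# Figures quoted in the corpus description.
SPEAKERS = 300
SESSIONS_PER_SPEAKER = 2
UTTERANCES_PER_SESSION = 106   # "around 106", so the result is only approximate
DEVICES = 3                    # iOS, Android, Windows Phone (assumed: one recording each)
REPORTED_TOTAL = 190_281


def expected_total(speakers: int, sessions: int, utterances: int, devices: int) -> int:
    """Estimate the corpus size if every device records every utterance."""
    return speakers * sessions * utterances * devices


estimate = expected_total(
    SPEAKERS, SESSIONS_PER_SPEAKER, UTTERANCES_PER_SESSION, DEVICES
)
relative_gap = abs(estimate - REPORTED_TOTAL) / REPORTED_TOTAL

print(f"estimated: {estimate}, reported: {REPORTED_TOTAL}")
print(f"relative gap: {relative_gap:.2%}")
```

Under that assumption the estimate is 190,800 utterances, within about 0.3% of the reported 190,281, which is consistent with "around 106" utterances per session.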

 

 

 

All scripts used for recording were designed by us and include common application words, isolated digits, phone numbers, and phonetically rich sentences. The sentences were selected from different domains: news, conversations, Twitter, etc. We removed sentences that included offensive or negative words or phrases.

 

 

 

A pronunciation lexicon is available with a phonemic transcription in pinyin. All data were manually checked. All audio files were manually transcribed and annotated by native transcribers.

 

 

 

 

 

2. Chinese Mandarin Whisper Speech Recognition Corpus (Mobile)-Sentences-21 Speakers

 

 

 

S.N: King-ASR-603

 

 

 

Details:

 

 

 

This Chinese Mandarin Whisper Speech Recognition Corpus was collected in China.

 

 

 

The corpus contains 3,139 utterances and the voices of 21 different speakers (11 males, 10 females). Each speaker was recorded in a quiet office environment.

 

 

 

An Android mobile platform was used for speech collection. A pronunciation lexicon is available with a phonemic transcription in pinyin. All data were manually checked. All audio files were manually transcribed and annotated by native transcribers.

 

 

 

Speech recognition systems are generally used by people speaking at an ordinary volume. However, in some circumstances, such as places where quiet is required (e.g. cinemas, libraries, meeting rooms) or when the conversation is private or classified, whispered speech recognition can be a great convenience.

 

 

 

 

 

3. Russian Speech Synthesis Voice Font - Female

 

 

 

S.N: King-TTS-019

 

 

 

Details:

 

 

 

Size: 8.65 GB

 

Recording Hours: 15.88 Hours

 

Parameters: 44.1 kHz, 16-bit; 2 channels

 

 

 

The Russian Speech Synthesis Voice Font contains the recordings of one female voice talent. She is a broadcaster who was 28 years old at the time of recording and was born and raised in Moscow.

 

 

 

This voice font contains 8,930 utterances. It was recorded in a professional studio on two channels: the speech waveform and the electroglottography (EGG) signal. Speech rate, energy, and timbre were strictly controlled during the recording process.

 

 

 

Each utterance was carefully proofread by linguists and stored in Windows uncompressed PCM format. Prosody labeling and phone boundary labeling are included. A pronunciation dictionary is available. All data were manually checked.
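For readers who want to work with the two-channel recordings, the following is a minimal sketch of separating a 16-bit stereo PCM WAV file into its two channels using only the Python standard library. The function name and the file name in the usage comment are hypothetical, and the announcement does not say which channel carries the speech waveform and which carries the EGG signal, so the channel order must be checked against the delivered data.

```python
import array
import sys
import wave


def split_stereo(path):
    """Return (channel_0, channel_1, sample_rate) for a 16-bit stereo PCM WAV file.

    Which channel holds speech and which holds EGG is NOT specified in the
    announcement; treat the order as an assumption to verify.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit, 2-channel PCM audio")
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")      # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":      # WAV data is little-endian
        samples.byteswap()

    # Stereo frames are interleaved: ch0, ch1, ch0, ch1, ...
    return samples[0::2], samples[1::2], sample_rate


# Usage (hypothetical file name):
# first, second, rate = split_stereo("utt_0001.wav")
```

Reading with the standard `wave` module keeps the sketch dependency-free; for large-scale processing a numerical library would be a more practical choice.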

 

 

 

 

 

4. Modern Standard Arabic Pronunciation + Vowel Lexicon

 

 

 

S.N: King-Lexicon-036-2

 

 

 

Details:

 

 

 

Entries: 180,000

 

Phoneme Inventory: Computer Readable IPA (It can be converted to other phone sets such as SAMPA, X-SAMPA, etc., on demand.)

 

Stress: Included

 

Syllable Boundary: Included

 

 

 

 

 

Contact Information

 

Xianfeng Cheng

 

VP

 

Tel: +86-10-62660928; +86-10-62660053 ext.8080

 

Mobile: +86 13681432590

 

Skype: xianfeng.cheng1

 

Email: chengxianfeng@speechocean.com; cxfxy0cxfxy0@gmail.com

 

Website: www.speechocean.com


 


 

 


