ISCA - International Speech Communication Association



ISCApad #202

Monday, April 13, 2015 by Chris Wellekens

5-2-14 Speechocean – update (March 2015)
  

 


Speechocean: A global language resources and data services supplier

 

 

 

Speechocean offers over 500 large-scale databases in more than 110 languages and accents, covering desktop, in-car, telephony and tablet PC platforms. Our data repository is large and diverse, and includes ASR databases, TTS databases, lexica, text corpora and more.

 

 

 

Speechocean is glad to announce 3 more resources that are available in its catalogue:

 

 

 

  1. Chinese (Taiwan) Speech Recognition Corpus (Mobile) – 3300 Speakers

 

ID: King-ASR-044

 

This is a Mandarin Chinese (Taiwan) speech database collected on three mobile operating systems: iOS, Android and Windows Phone. The database is owned by Speechocean. In total, 3300 speakers were recorded, each in one session in a quiet environment. After unqualified utterances were discarded, the corpus contains 2,580,000 utterances of Mandarin Chinese (Taiwan) speech from all the speakers. The pure recording time of the whole corpus is about 1200 hours, including leading and trailing silence (about 500 ms).
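To give a feel for the scale of this corpus, the quoted figures can be turned into rough per-speaker and per-utterance averages. The short Python sketch below is only an illustration: the function and variable names are hypothetical, and because the recording time is stated as "about 1200 hours", the results are approximations rather than official corpus statistics.

```python
# Figures quoted in the King-ASR-044 description (the hour count is approximate).
SPEAKERS = 3300
UTTERANCES = 2_580_000
RECORDING_HOURS = 1200  # "about 1200 hours", leading/trailing silence included


def corpus_averages(speakers: int, utterances: int, hours: float) -> dict:
    """Return rough per-speaker and per-utterance averages for a corpus."""
    total_seconds = hours * 3600
    return {
        "utterances_per_speaker": utterances / speakers,
        "seconds_per_utterance": total_seconds / utterances,
        "minutes_per_speaker": total_seconds / speakers / 60,
    }


averages = corpus_averages(SPEAKERS, UTTERANCES, RECORDING_HOURS)
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

With the figures above, this works out to roughly 782 utterances and about 22 minutes of audio per speaker, with an average utterance length of about 1.7 seconds, silence included.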

 

 

 

  2. Japanese Speech Recognition Database – Spontaneous Dialog (Telephony) – 200 Speakers

 

ID: King-ASR-222

 

This is a Japanese telephony speech database, collected in Japan over fixed telephone channels in a quiet office environment. The database is owned by Beijing Haitian Ruisheng Science Technology Ltd (SpeechOcean, www.speechocean.com).

 

The corpus contains 100 pairs of spontaneous dialog recordings from 200 speakers. Each dialog consists of three audio files recorded simultaneously: one for each of the two speakers and one for the mixed channel. The pure recording time of the mixed channel is about 307.1 hours. The database covers 33 topics, and its total size is 16.4 GB.

 

 

 

Contact Information

 

Xianfeng Cheng

 

Business Manager of Commercial Department

 

Tel: +86-10-62660928; +86-10-62660053 ext.8080

 

Mobile: +86 13681432590

 

Skype: xianfeng.cheng1

 

Email: chengxianfeng@speechocean.com; cxfxy0cxfxy0@gmail.com

 

Website: www.speechocean.com

 

 

 

 

 

 

 

 

 

 

 

 

 




 

 

 

 

 

 


