ISCA - International Speech
Communication Association



ISCApad #315

Friday, September 13, 2024 by Chris Wellekens

5-2-9 Datatang
  

Datatang is a leading global data provider specializing in customized data solutions, with a focus on the collection, annotation, and crowdsourcing of speech, image, and text data.

 

Summary of the new datasets (2018) and a brief plan for 2019.

 

 

 

Speech data (with annotation) completed in 2018

 

Language                     Dataset length (hours)
French                       794
British English              800
Spanish                      435
Italian                      1,440
German                       1,800
Spanish (Mexico/Colombia)    700
Brazilian Portuguese         1,000
European Portuguese          1,000
Russian                      1,000

 

Ongoing speech projects in 2019

 

Type                            Project Name
Europeans speak English         1000 Hours-Spanish Speak English
                                1000 Hours-French Speak English
                                1000 Hours-German Speak English
Call Center Speech              1000 Hours-Call Center Speech
Off-the-shelf data expansion    1000 Hours-Chinese Speak English
                                1500 Hours-Mixed Chinese and English Speech Data

 

 

 

In addition to the above, more speech data collections are planned, including Japanese speech data, children's speech data, and dialect speech data.

 

We will also continue to provide these data at competitive prices while maintaining a high accuracy rate.

 

 

 

If you have any questions or need more details, do not hesitate to contact us at jessy@datatang.com.

 

We can send you a sample or a specification of the data on request.

 

 

 



