ISCApad #243 |
Sunday, September 23, 2018 by Chris Wellekens |
5-1-1 | J.Li, L.Deng, R.Haeb-Umbach and Y.Gong, 'Robust Automatic Speech Recognition', Academic Press 'Robust Automatic Speech Recognition'
| |||||||
5-1-2 | Barbosa, P. A. and Madureira, S. Manual de Fonética Acústica Experimental. Aplicações a dados do português. 591 p. São Paulo: Cortez, 2015. [In Portuguese] Barbosa, P. A. and Madureira, S. Manual de Fonética Acústica Experimental. Aplicações a dados do português. 591 p. São Paulo: Cortez, 2015. [In Portuguese]
| |||||||
5-1-3 | Damien Nouvel, Inalco, Maud Ehrmann, EPFL,Sophie Rosset, CNRS. Les entités nommées pour le traitement automatique des langues Les entités nommées pour le traitement automatique des langues
| |||||||
5-1-4 | R.Fuchs, 'Speech Rhythm in Varieties of English' , Springer R.Fuchs, 'Speech Rhythm in Varieties of English' has appeared with Springer, in the 'Prosody, Phonology and Phonetics' series: https://www.springer.com/gp/book/9783662478172
| |||||||
5-1-5 | Pejman Mowlaee et al., 'Phase-Aware Signal Processing in Speech Communication: Theory and Practice', Wiley 2016Phase-Aware Signal Processing in Speech Communication: Theory and Practice
http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1119238811.html
| |||||||
5-1-6 | Jean Caelen, Anne Xuereb, 'Dialogue : altérité, interaction, énaction'
Jean Caelen,Anne Xuereb Dialogue : altérité, interaction, énaction Editions universitaires européennes
| |||||||
5-1-7 | Bäckström, Tom (with Guillaume Fuchs, Sascha Disch, Christian Uhle and Jeremie Lecomte), 'Speech Coding with Code-Excited Linear Prediction', Springer
| |||||||
5-1-8 | Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey (Eds), 'New Era for Robust Seech Recognition', Springer. Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey (Eds), 'New Era for Robust Seech Recognition', Springer. https://link.springer.com/book/10.1007%2F978-3-319-64680-0
| |||||||
5-1-9 | Fabrice Marsac, Rudolph Sock, CONSÉCUTIVITÉ ET SIMULTANÉITÉ en Linguistique, Langues et Parole, L'Harmattan,France Nous avons le plaisir de vous annoncer la parution du volume thématique « CONSÉCUTIVITÉ ET SIMULTANÉITÉ en Linguistique, Langues et Parole » dans la Collection Dixit Grammatica (L’Harmattan, France) :
|
5-2-1 | Linguistic Data Consortium (LDC) update (August 2018)
In this newsletter: LDC at Interspeech 2018 Fall 2018 LDC Data Scholarship Program
2011 NIST Language Recognition Evaluation Test Set
LDC at Interspeech 2018 LDC will participate in various ways at Interspeech 2018 held this year in Hyderabad, India, September 2-6. It is co-organizing the special session, The First DIHARD Speech Diarization Challenge, on September 3 and is a sponsor of the September 1 pre-conference workshop, Young Female Researchers in Speech Science & Technology (YFRSW). Results of recent work will be presented during the poster session on September 3, “Global TIMIT: Acoustic Phonetic Datasets for the World’s Languages.” Fall 2018 LDC Data Scholarship Program Students can apply for the Fall 2018 Data Scholarship Program now through September 15, 2018. The LDC Data Scholarship program provides students with access to LDC data at no cost. For more information on application requirements and program rules, please visit LDC Data Scholarships.
(1) BOLT English SMS/Chat was developed by LDC and consists of naturally-occurring Short Message Service (SMS) and Chat (CHT) data collected through data donations and live collection from native English speakers. The corpus contains 18,429 conversations totaling 3,674,802 words across 375,967 messages. The BOLT (Broad Operational Language Translation) program developed machine translation and information retrieval for less formal genres, focusing particularly on user-generated content. LDC supported the BOLT program by collecting informal data sources -- discussion forums, text messaging, and chat -- in Chinese, Egyptian Arabic, and English. The collected data was translated and annotated for various tasks including word alignment, treebanking, propbanking, and co-reference. BOLT English SMS/Chat is available via web download.
2018 Subscription Members will receive copies of this corpus. 2018 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US $1750.
*
(2) CIEMPIESS Balance (Corpus de Investigación en Español de México del Posgrado de Ingeniería Eléctrica y Servicio Social) was developed by the Development of Speech Technologies program at the School of Engineering at the National Autonomous University of Mexico (UNAM) and consists of approximately 18 hours of Mexican Spanish broadcast speech with associated transcripts. The goal of this work was to create acoustic models for automatic speech recognition. For more information and documentation see the CIEMPIESS-UNAM Project website.
CIEMPIESS Balance is a companion corpus to CIEMPIESS Light, released by LDC as LDC2017S23. It was developed so that the data sets together constitute a gender-balanced corpus. The gender breakdown in CIEMPIESS Light is approximately 75% male and 25% female. In CIEMPIESS Balance, the gender breakdown is approximately 25% male and 75% female.
The majority of the speech recordings were collected from Radio-IUS, a UNAM radio station. Other recordings were taken from IUS Canal Multimedia and Centro Universitario de Estudios Jurídicos (CUEJ UNAM). These two channels feature videos with speech around legal issues and topics related to UNAM.
CIEMPIESS Balance is available via web download.
2018 Subscription Members will receive copies of this corpus. 2018 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data at no cost.
*
(3) 2011 NIST Language Recognition Evaluation Test Set contains selected training data and the evaluation test set for the 2011 NIST Language Recognition Evaluation. It consists of approximately 204 hours of conversational telephone speech and broadcast audio collected by LDC between 2009 and 2011 in the following 24 languages and dialects: Arabic (Iraqi), Arabic (Levantine), Arabic (Maghrebi), Arabic (Standard), Bengali, Czech, Dari, English (American), English (Indian), Farsi, Hindi, Lao, Mandarin, Punjabi, Pashto, Polish, Russian, Slovak, Spanish, Tamil, Thai, Turkish, Ukrainian, and Urdu.
The 2011 evaluation emphasized the language pair condition and involved both conversational telephone speech (CTS) and broadcast narrow-band speech (BNBS).
This release includes training data for nine language varieties that had not been represented in prior LRE cycles -- Arabic (Iraqi), Arabic (Levantine), Arabic (Maghrebi), Arabic (Standard), Czech, Lao, Punjabi, Polish, and Slovak -- contained in 893 audited segments of roughly 30 seconds duration and in 400 full-length CTS recordings. The evaluation test set comprises a total of 29,511 audio files, all manually audited at LDC for language and divided equally into three different test conditions according to the nominal amount of speech content per segment.
LDC released the prior LREs as:
2011 NIST Language Recognition Evaluation Test Set is distributed via web download.
2018 Subscription Members will receive copies of this corpus. 2018 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US $2000.
*
Membership Office University of Pennsylvania T: +1-215-573-1275 E: ldc@ldc.upenn.edu M: 3600 Market St. Suite 810 Philadelphia, PA 19104
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-2 | ELRA - Language Resources Catalogue - Update (February 2018) ELRA - Language Resources Catalogue - Update
-------------------------------------------------------
We are happy to announce that 1 new Monolingual Lexicon, 1 new Written Corpus and 2 new Speech resources are now available in our catalogue.
ELRA-L0100 French dictionary of definitions (SYNAPSE)
ISLRN: 357-949-964-163-0 The French dictionary of definitions (SYNAPSE) consists of 216,835 entries (147,378 nouns, 80,552 adjectives, 24,001 verbs, 4,677 adverbs, 1,560 prefixes, 107 prepositions, 614 interjections, 147 pronouns, 42 conjunctions, 27 articles), 309,078 definitions and 7,395 phraseological units (phrases). Grammatical information for each entry consists of: grammatical category, gender, number, inflected forms. This dictionary is provided in XML format together with its DTD. For more information, see: http://catalog.elra.info/product_info.php?products_id=1315 ELRA-W0124 English-Vietnamese Parallel Corpus ISLRN: 838-483-738-912-8 This is a corpus of 500,000 English-Vietnamese sentence pairs. The parallel corpus contains English documents translated by professional translators into Vietnamese. The source texts include books, dictionaries, newspapers, online news. The texts are provided in TEI format. For more information, see: http://catalog.elra.info/product_info.php?products_id=1316 ELRA-S0394 Metalogue Multi-Issue Bargaining Dialogue
ISLRN: 217-906-813-531-9 This corpus consists of approximately 2.5 hours of semantically annotated English dialogue data that includes speech and transcripts. Six unique subjects (undergraduates between 19 and 25 years of age) participated in the collection. The dialogue speech was captured with two headset microphones and saved in 16kHz, 16-bit mono linear PCM FLAC format. Transcripts were produced semi-automatically, using an automatic speech recognizer followed by manual correction. All text is presented in UTF-8 as either plain text or XML. For more information, see: http://catalog.elra.info/product_info.php?products_id=1317 ELRA-S0395 Nautilus Speaker Characterization (NSC) Corpus ISLRN: 157-037-166-491-1
This corpus comprises clean microphone recordings of conversational speech from 300 German speakers (126 males and 174 females) aged 18 to 35 years, with no marked dialect/accent. The recordings were performed in an acoustically-isolated room in 2016/2017. Four scripted and four semi-spontaneous dialogs were elicited from the speakers, simulating telephone call inquiries. Additionally, spontaneous neutral and emotional speech utterances and questions were produced. All labels are provided, together with the speech recordings and the speakers' metadata. For more information, see: http://catalog.elra.info/product_info.php?products_id=1318 For more information on the catalogue, please contact Valérie Mapelli mailto:mapelli@elda.org If you would like to enquire about having your resources distributed by ELRA, please do not hesitate to contact us. Visit the Universal Catalogue: http://universal.elra.info Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/en/catalogues/language-resources-announcements/
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-3 | Appen ButlerHill
Appen ButlerHill A global leader in linguistic technology solutions RECENT CATALOG ADDITIONS—MARCH 2012 1. Speech Databases 1.1 Telephony
2. Pronunciation Lexica Appen Butler Hill has considerable experience in providing a variety of lexicon types. These include: Pronunciation Lexica providing phonemic representation, syllabification, and stress (primary and secondary as appropriate) Part-of-speech tagged Lexica providing grammatical and semantic labels Other reference text based materials including spelling/mis-spelling lists, spell-check dictionar-ies, mappings of colloquial language to standard forms, orthographic normalization lists. Over a period of 15 years, Appen Butler Hill has generated a significant volume of licensable material for a wide range of languages. For holdings information in a given language or to discuss any customized development efforts, please contact: sales@appenbutlerhill.com
4. Other Language Resources Morphological Analyzers – Farsi/Persian & Urdu Arabic Thesaurus Language Analysis Documentation – multiple languages
For additional information on these resources, please contact: sales@appenbutlerhill.com 5. Customized Requests and Package Configurations Appen Butler Hill is committed to providing a low risk, high quality, reliable solution and has worked in 130+ languages to-date supporting both large global corporations and Government organizations. We would be glad to discuss to any customized requests or package configurations and prepare a cus-tomized proposal to meet your needs.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-4 | OFROM 1er corpus de français de Suisse romande Nous souhaiterions vous signaler la mise en ligne d'OFROM, premier corpus de français parlé en Suisse romande. L'archive est, dans version actuelle, d'une durée d'environ 15 heures. Elle est transcrite en orthographe standard dans le logiciel Praat. Un concordancier permet d'y effectuer des recherches, et de télécharger les extraits sonores associés aux transcriptions.
Pour accéder aux données et consulter une description plus complète du corpus, nous vous invitons à vous rendre à l'adresse suivante : http://www.unine.ch/ofrom.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-5 | Real-world 16-channel noise recordings We are happy to announce the release of DEMAND, a set of real-world
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-6 | Aide à la finalisation de corpus oraux ou multimodaux pour diffusion, valorisation et dépôt pérenne Aide à la finalisation de corpus oraux ou multimodaux pour diffusion, valorisation et dépôt pérenne
Le consortium IRCOM de la TGIR Corpus et l’EquipEx ORTOLANG s’associent pour proposer une aide technique et financière à la finalisation de corpus de données orales ou multimodales à des fins de diffusion et pérennisation par l’intermédiaire de l’EquipEx ORTOLANG. Cet appel ne concerne pas la création de nouveaux corpus mais la finalisation de corpus existants et non-disponibles de manière électronique. Par finalisation, nous entendons le dépôt auprès d’un entrepôt numérique public, et l’entrée dans un circuit d’archivage pérenne. De cette façon, les données de parole qui ont été enrichies par vos recherches vont pouvoir être réutilisées, citées et enrichies à leur tour de manière cumulative pour permettre le développement de nouvelles connaissances, selon les conditions d’utilisation que vous choisirez (sélection de licences d’utilisation correspondant à chacun des corpus déposés).
Cet appel d’offre est soumis à plusieurs conditions (voir ci-dessous) et l’aide financière par projet est limitée à 3000 euros. Les demandes seront traitées dans l’ordre où elles seront reçues par l’ IRCOM. Les demandes émanant d’EA ou de petites équipes ne disposant pas de support technique « corpus » seront traitées prioritairement. Les demandes sont à déposer du 1er septembre 2013 au 31 octobre 2013. La décision de financement relèvera du comité de pilotage d’IRCOM. Les demandes non traitées en 2013 sont susceptibles de l’être en 2014. Si vous avez des doutes quant à l’éligibilité de votre projet, n’hésitez pas à nous contacter pour que nous puissions étudier votre demande et adapter nos offres futures.
Pour palier la grande disparité dans les niveaux de compétences informatiques des personnes et groupes de travail produisant des corpus, L’ IRCOM propose une aide personnalisée à la finalisation de corpus. Celle-ci sera réalisée par un ingénieur IRCOM en fonction des demandes formulées et adaptées aux types de besoin, qu’ils soient techniques ou financiers.
Les conditions nécessaires pour proposer un corpus à finaliser et obtenir une aide d’IRCOM sont :
Les demandes peuvent concerner tout type de traitement : traitements de corpus quasi-finalisés (conversion, anonymisation), alignement de corpus déjà transcrits, conversion depuis des formats « traitement de textes », digitalisation de support ancien. Pour toute demande exigeant une intervention manuelle importante, les demandeurs devront s’investir en moyens humains ou financiers à la hauteur des moyens fournis par IRCOM et ORTOLANG.
IRCOM est conscient du caractère exceptionnel et exploratoire de cette démarche. Il convient également de rappeler que ce financement est réservé aux corpus déjà largement constitués et ne peuvent intervenir sur des créations ex-nihilo. Pour ces raisons de limitation de moyens, les propositions de corpus les plus avancés dans leur réalisation pourront être traitées en priorité, en accord avec le CP d’IRCOM. Il n’y a toutefois pas de limite « théorique » aux demandes pouvant être faites, IRCOM ayant la possibilité de rediriger les demandes qui ne relèvent pas de ses compétences vers d’autres interlocuteurs.
Les propositions de réponse à cet appel d’offre sont à envoyer à ircom.appel.corpus@gmail.com. Les propositions doivent utiliser le formulaire de deux pages figurant ci-dessous. Dans tous les cas, une réponse personnalisée sera renvoyée par IRCOM.
Ces propositions doivent présenter les corpus proposés, les données sur les droits d’utilisation et de propriétés et sur la nature des formats ou support utilisés.
Cet appel est organisé sous la responsabilité d’IRCOM avec la participation financière conjointe de IRCOM et l’EquipEx ORTOLANG.
Pour toute information complémentaire, nous rappelons que le site web de l'Ircom (http://ircom.corpus-ir.fr) est ouvert et propose des ressources à la communauté : glossaire, inventaire des unités et des corpus, ressources logicielles (tutoriaux, comparatifs, outils de conversion), activités des groupes de travail, actualités des formations, ... L'IRCOM invite les unités à inventorier leur corpus oraux et multimodaux - 70 projets déjà recensés - pour avoir une meilleure visibilité des ressources déjà disponibles même si elles ne sont pas toutes finalisées.
Le comité de pilotage IRCOM
Utiliser ce formulaire pour répondre à l’appel : Merci.
Réponse à l’appel à la finalisation de corpus oral ou multimodal
Nom du corpus :
Nom de la personne à contacter : Adresse email : Numéro de téléphone :
Nature des données de corpus :
Existe-t’il des enregistrements : Quel média ? Audio, vidéo, autre… Quelle est la longueur totale des enregistrements ? Nombre de cassettes, nombre d’heures, etc. Quel type de support ? Quel format (si connu) ?
Existe-t’il des transcriptions : Quel format ? (papier, traitement de texte, logiciel de transcription) Quelle quantité (en heures, nombre de mots, ou nombre de transcriptions) ?
Disposez vous de métadonnées (présentation des droits d’auteurs et d’usage) ?
Disposez-vous d’une description précise des personnes enregistrées ?
Disposez-vous d’une attestation de consentement éclairé pour les personnes ayant été enregistrées ? En quelle année (environ) les enregistrements ont eu lieu ?
Quelle est la langue des enregistrements ?
Le corpus comprend-il des enregistrements d’enfants ou de personnes ayant un trouble du langage ou une pathologie ? Si oui, de quelle population s’agit-il ?
Dans un souci d’efficacité et pour vous conseiller dans les meilleurs délais, il nous faut disposer d’exemples des transcriptions ou des enregistrements en votre possession. Nous vous contacterons à ce sujet, mais vous pouvez d’ores et déjà nous adresser par courrier électronique un exemple des données dont vous disposez (transcriptions, métadonnées, adresse de page web contenant les enregistrements).
Nous vous remercions par avance de l’intérêt que vous porterez à notre proposition. Pour toutes informations complémentaires veuillez contacter Martine Toda martine.toda@ling.cnrs.fr ou à ircom.appel.corpus@gmail.com.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-7 | Rhapsodie: un Treebank prosodique et syntaxique de français parlé Rhapsodie: un Treebank prosodique et syntaxique de français parlé
Nous avons le plaisir d'annoncer que la ressource Rhapsodie, Corpus de français parlé annoté pour la prosodie et la syntaxe, est désormais disponible sur http://www.projet-rhapsodie.fr/
Le treebank Rhapsodie est composé de 57 échantillons sonores (5 minutes en moyenne, au total 3h de parole, 33000 mots) dotés d’une transcription orthographique et phonétique alignées au son.
Il s'agit d’une ressource de français parlé multi genres (parole privée et publique ; monologues et dialogues ; entretiens en face à face vs radiodiffusion, parole plus ou moins interactive et plus ou moins planifiée, séquences descriptives, argumentatives, oratoires et procédurales) articulée autour de sources externes (enregistrements extraits de projets antérieurs, en accord avec les concepteurs initiaux) et internes. Nous tenons en particulier à remercier les responsables des projets CFPP2000, PFC, ESLO, C-Prom ainsi que Mathieu Avanzi, Anne Lacheret, Piet Mertens et Nicolas Obin.
Les échantillons sonores (wave & MP3, pitch nettoyé et lissé), les transcriptions orthographiques (txt), les annotations macrosyntaxiques (txt), les annotations prosodiques (xml, textgrid) ainsi que les metadonnées (xml & html) sont téléchargeables librement selon les termes de la licence Creative Commons Attribution - Pas d’utilisation commerciale - Partage dans les mêmes conditions 3.0 France. Les annotations microsyntaxiques seront disponibles prochainement Les métadonnées sont également explorables en ligne grâce à un browser. Les tutoriels pour la transcription, les annotations et les requêtes sont disponibles sur le site Rhapsodie. Enfin, L’annotation prosodique est interrogeable en ligne grâce au langage de requêtes Rhapsodie QL. L'équipe Ressource Rhapsodie (Modyco, Université Paris Ouest Nanterre) Sylvain Kahane, Anne Lacheret, Paola Pietrandrea, Atanas Tchobanov, Arthur Truong. Partenaires : IRCAM (Paris), LATTICE (Paris), LPL (Aix-en-Provence), CLLE-ERSS (Toulouse).
******************************************************** Rhapsodie: a Prosodic and Syntactic Treebank for Spoken French We are pleased to announce that Rhapsodie, a syntactic and prosodic treebank of spoken French created with the aim of modeling the interface between prosody, syntax and discourse in spoken French is now available at http://www.projet-rhapsodie.fr/ The Rhapsodie treebank is made up of 57 short samples of spoken French (5 minutes long on average, amounting to 3 hours of speech and a 33 000 word corpus) endowed with an orthographical phoneme-aligned transcription . The corpus is representative of different genres (private and public speech; monologues and dialogues; face-to-face interviews and broadcasts; more or less interactive discourse; descriptive, argumentative and procedural samples, variations in planning type). The corpus samples have been mainly drawn from existing corpora of spoken French and partially created within the frame of theRhapsodie project. We would especially like to thank the coordinators of the CFPP2000, PFC, ESLO, C-Prom projects as well as Piet Mertens, Mathieu Avanzi, Anne Lacheret and Nicolas Obin. The sound samples (waves, MP3, cleaned and stylized pitch), the orthographic transcriptions (txt), the macrosyntactic annotations (txt), the prosodic annotations (xml, textgrid) as well as the metadata (xml and html) can be freely downloaded under the terms of the Creative Commons licence Attribution - Noncommercial - Share Alike 3.0 France. Microsyntactic annotation will be available soon. The metadata are searchable on line through a browser. The prosodic annotation can be explored on line through the Rhapsodie Query Language. The tutorials of transcription, annotations and Rhapsodie Query Language are available on the site.
The Rhapsodie team (Modyco, Université Paris Ouest Nanterre : Sylvain Kahane, Anne Lacheret, Paola Pietrandrea, Atanas Tchobanov, Arthur Truong. Partners: IRCAM (Paris), LATTICE (Paris), LPL (Aix-en-Provence),CLLE-ERSS (Toulouse).
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-8 | Annotation of “Hannah and her sisters” by Woody Allen. We have created and made publicly available a dense audio-visual person-oriented ground-truth annotation of a feature movie (100 minutes long): “Hannah and her sisters” by Woody Allen. Jean-Ronan Vigouroux, Louis Chevallier Patrick Pérez Technicolor Research & Innovation
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-9 | French TTS Text to Speech Synthesis:
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-10 | Google 's Language Model benchmark A LM benchmark is available at:https://github.com/ciprian-chelba/1-billion-word-language-modeling-benchmark
Here is a brief description of the project.
'The purpose of the project is to make available a standard training and test setup for language modeling experiments. The training/held-out data was produced from a download at statmt.org using a combination of Bash shell and Perl scripts distributed here. This also means that your results on this data set are reproducible by the research community at large. Besides the scripts needed to rebuild the training/held-out data, it also makes available log-probability values for each word in each of ten held-out data sets, for each of the following baseline models:
ArXiv paper: http://arxiv.org/abs/1312.3005
Happy benchmarking!'
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-11 | ISLRN new portal Opening of the ISLRN Portal
ELRA, LDC, and AFNLP/Oriental-COCOSDA announce the opening of the ISLRN Portal @ www.islrn.org.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-12 | Speechocean – update (September 2018)
Speechocean – update (September 2018):
Speechocean: Global A.I. Data Resource & Service Supplier
About Speechocean
Speechocean is one of the world well-known language related resources & services provider in the fields of Human Computer Interaction and Human Language Technology. At present, we can provide data services with 110+ languages and dialects across the world.
KingLine Data Center---Data Sharing Platform
Kingline Data Center is operated and supervised by Speechocean, which is mainly focused on language resources creating and providing for research and development of human language technology.
These diversified corpora are widely used for the research and development in the fields of Speech Recognition, Speech Synthesis, Natural Language Processing, Machine Translation, Web Search, etc. All corpora are openly accessible for users all over the world, including users from scientific research institutions, enterprises or individuals.
For more detailed information, please visit our website: http://kingline.speechocean.com
New released data:
1. Chinese Mandarin Speech Recognition Corpus
S.N:King-ASR-459
Size: 776 GB
Recording Hours: 2596 Hours
Recording Platform: Desktop
Parameters: 44.1k, 16bit; 2 Channels
Introduction:
The Chinese Mandarin Speech Recognition Corpus was collected in China.
The corpus contains 2,400,698 utterances in total. Each utterance wave was stored in a separate file and uncompressed.
This corpus contains the voices of 2404 different speakers (1184 males, 1220 females) who were balanced distributed in age (18 – 30, 31 – 45, 46 – 60), gender and regional accents. Each speaker was recorded in quiet office and home environments.
Desktop platform was used for speech collection. A pronunciation lexicon is available with a phonemic transcription in Pinyin. All audio files were manually transcribed and annotated by native transcribers.
2. American English Speech Recognition Corpus
Size: 123 GB
Recording Hours: 1146.6 Hours
Recording Platform: Mobile
Parameters: 16k, 16bit; 1 Channel
Introduction:
This American English Speech Recognition Corpus was collected in USA.
The script contains 829,631 utterances in total. Each utterance wave was stored in a separate file and uncompressed.
This corpus contains the voices of 2,602 different speakers (1,232 males, 1,370 females) who were balanced distributed in age (16 – 30, 31 – 45, >45), gender and regional accents. Each speaker was recorded in quiet office and home environment.
Mobile platform, i.e. Android, was used for speech collection. A pronunciation lexicon is available with a phonemic transcription in CMU. All audio files were manually transcribed and annotated by native transcribers.
3. Taiwan Chinese Speech Synthesis Corpus - Male
S.N: King-TTS-037
Size: 9.54 GB
Recording Hours: 29.55 Hours
Parameters: 48k, 16bit; 1 Channel
Introduction:
The Taiwan Chinese Speech Synthesis Corpus contains the recordings of 1 male voice talent. He is a broadcaster and he was born and grew up in Taiwan.
The corpus contains 20,736 utterances. It was recorded in a professional studio over one channel of waveform. Speech rate, energy and timbre were strictly controlled during recording process.
Each utterance was carefully proofreaded by linguists and was stored in Windows uncompressed PCM format. Prosody labeling and phone boundary labeling are included. All data were manually checked.
4. Prounciation Lexicon of Person and Location Name of US English
S.N: King-Lexicon-078
Entries: 359,246
Phoneme Inventory: Computer Readable IPA(It can be converted to the phoneset Sampa, XSampa, and etc., based on demand.)
Stress: Included
Syllable Boundary: Included
Contact Information
Xianfeng Cheng
vice president
+86-10-62660053 ext.8080
Mobile: +86-13681432590
Skype: xianfeng.cheng1
Email: chengxianfeng@speechocean.com; cxfxy0cxfxy0@gmail.com
Website: www.speechocean.com
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-13 | kidLUCID: London UCL Children’s Clear Speech in Interaction Database kidLUCID: London UCL Children’s Clear Speech in Interaction Database We are delighted to announce the availability of a new corpus of spontaneous speech for children aged 9 to 14 years inclusive, produced as part of the ESRC-funded project on ‘Speaker-controlled Variability in Children's Speech in Interaction’ (PI: Valerie Hazan). Speech recordings (a total of 288 conversations) are available for 96 child participants (46M, 50F, range 9;0 to 15;0 years), all native southern British English speakers. Participants were recorded in pairs while completing the diapix spot-the-difference picture task in which the pair verbally compared two scenes, only one of which was visible to each talker. High-quality digital recordings were made in sound-treated rooms. For each conversation, a stereo audio recording is provided with each speaker on a separate channel together with a Praat Textgrid containing separate word- and phoneme-level segmentations for each speaker. There are six recordings per speaker pair made in the following conditions:
The kidLUCID corpus is available online within the OSCAAR (Online Speech/Corpora Archive and Analysis Resource) archive (https://oscaar.ci.northwestern.edu/). Free access can be requested for research purposes. Further information about the project can be found at: http://www.ucl.ac.uk/pals/research/shaps/research/shaps/research/clear-speech-strategies This work was supported by Economic and Social Research Council Grant No. RES-062- 23-3106.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-14 | Robust speech datasets and ASR software tools
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-15 | International Standard Language Resource Number (ISLRN) implemented by ELRA and LDC ELRA and LDC partner to implement ISLRN process and assign identifiers to all the Language Resources in their catalogues.
Following the meeting of the largest NLP organizations, the NLP12, and their endorsement of the International Standard Language Resource Number (ISLRN), ELRA and LDC partnered to implement the ISLRN process and to assign identifiers to all the Language Resources (LRs) in their catalogues. The ISLRN web portal was designed to enable the assignment of unique identifiers as a service free of charge for all Language Resource providers. To enhance the use of ISLRN, ELRA and LDC have collaborated to provide the ISLRN 13-digit ID to all the Language Resources distributed in their respective catalogues. Anyone who is searching the ELRA and LDC catalogues can see that each Language Resource is now identified by both the data centre ID and the ISLRN number. All providers and users of such LRs should refer to the latter in their own publications and whenever referring to the LR.
ELRA and LDC will continue their joint involvement in ISLRN through active participation in this web service.
Visit the ELRA and LDC catalogues, respectively at http://catalogue.elra.info and https://catalog.ldc.upenn.edu
Background The International Standard Language Resource Number (ISLRN) aims to provide unique identifiers using a standardised nomenclature, thus ensuring that LRs are correctly identified, and consequently, recognised with proper references for their usage in applications within R&D projects, product evaluation and benchmarking, as well as in documents and scientific papers. Moreover, this is a major step in the networked and shared world that Human Language Technologies (HLT) has become: unique resources must be identified as such and meta-catalogues need a common identification format to manage data correctly.
***About NLP12*** Representatives of the major Natural Language Processing and Computational Linguistics organizations met in Paris on 18 November 2013 to harmonize and coordinate their activities within the field.
*** About ELRA *** The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and research laboratories that creates and distributes linguistic resources for language-related education, research and technology development. To find out more about LDC, please visit our web site: https://www.ldc.upenn.edu
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-16 | Base de données LIBRE et GRATUITE pour la reconnaissance du locuteur Je me permet de vous solliciter pour contribuer à la création
d?une base de données LIBRE et GRATUITE
pour la reconnaissance du locuteur.
Plus de détails et la marche à suivre ci-dessous.
Merci beaucoup,
Anthony Larcher
Récemment, un certain nombre de laboratoires spécialisés dans la reconnaissance du locuteur dépendante du texte ont initié le projet RedDots.
Il s?agit d?une initiative volontaire sur financement propre des laboratoires.
Ce projet encourage des discussions sur les thèmes de la reconnaissance du locuteur,
la collection de corpus et les cas d?usage propres à cette technologie à travers un Google Group.
Dans le cadre du projet RedDots, l?Institute for Infocomm Research (Singapour) a développé une application Android
qui permet d?enregistrer des données sur un téléphone portable.
Cette base de données a pour but de pallier certaines lacunes des corpus existants:
- le coût (certaines bases standard sont vendues à plusieurs milliers d?euro)
- la taille limitée (le nombre limité de locuteurs ne permet plus d?évaluer les systèmes de reconnaissance de manière significative)
- la variabilité limitée (les données sont actuellement enregistrées dans plus de 5 pays dans le monde entier)
Afin de distributer une base de données, qui puisse bénéficier librement
à l?ensemble de la communauté de recherche nous vous sollicitons.
Comment faire et en combien de temps?
- inscrivez vous en 2 minutes à l?adresse suivante
- installez l?application Android sur votre téléphone en 2 minutes, saisissez l'ID et mot de passe qui vous seront envoyé par email
- enregistrez une session 3 minutes sur votre téléphone
Tout se fait en moins de 10 minutes?
Une des limitations principale des corpus existant est le nombre limité de sessions
enregistrée par locuteur et le court intervalle de temps au cours duquel ces sessions sont enregistrées.
Afin de combler ce manque nous espérons que chaque participant acceptera d?enregistrer
plusieurs sessions dans les mois à venir.
Idealement, chaque participant enregistrera 3 ou 4 minutes par semaine pendant un an.
Ou vont mes données et pour quoi sont elles utilisées?
Les données sont actuellement envoyées sur un serveur de l?Institute for Infocomm Research
à Singapour. Un institut de recherche public.
En vous enregistrant, vous acceptez que ces données soient utilisées à des fins de recherche
uniquement. ces données seront mise à disposition en ligne gratuitement tout au long du projet.
Merci pour votre contribution, n?hésitez pas à faire circuler cet email.
Plus de détails seront données prochainement dans un article soumis à INTERSPEECH 2015.
Anthony Larcher
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-17 | ISLRN adopted by Joint Research Center (JRC) of the European Commission JRC, the EC's Joint Research Centre, an important LR player: First to adopt the ISLRN initiative
The Joint Research Centre (JRC), the European Commission's in house science service, is the first organisation to use the International Standard Language Resource Number (ISLRN) initiative and has requested ISLRN 13-digit unique identifiers to its Language Resources (LR).
The current JRC LRs (downloadable from https://ec.europa.eu/jrc/en/language-technologies) with an ISLRN ID are:
Background The International Standard Language Resource Number (ISLRN) aims to provide unique identifiers using a standardised nomenclature, thus ensuring that LRs are correctly identified, and consequently, recognised with proper references for their usage in applications within R&D projects, product evaluation and benchmarking, as well as in documents and scientific papers. Moreover, this is a major step in the networked and shared world that Human Language Technologies (HLT) has become: unique resources must be identified as such and meta-catalogues need a common identification format to manage data correctly.
*** About the JRC *** As the Commission's in-house science service, the Joint Research Centre's mission is to provide EU policies with independent, evidence-based scientific and technical support throughout the whole policy cycle.
*** About ELRA ***
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-18 | Forensic database of voice recordings of 500+ Australian English speakers Forensic database of voice recordings of 500+ Australian English speakers
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-19 | Audio and Electroglottographic speech recordings
Audio and Electroglottographic speech recordings from several languages We are happy to announce the public availability of speech recordings made as part of the UCLA project 'Production and Perception of Linguistic Voice Quality'. http://www.phonetics.ucla.edu/voiceproject/voice.html Audio and EGG recordings are available for Bo, Gujarati, Hmong, Mandarin, Black Miao, Southern Yi, Santiago Matatlan/ San Juan Guelavia Zapotec; audio recordings (no EGG) are available for English and Mandarin. Recordings of Jalapa Mazatec extracted from the UCLA Phonetic Archive are also posted. All recordings are accompanied by explanatory notes and wordlists, and most are accompanied by Praat textgrids that locate target segments of interest to our project. Analysis software developed as part of the project – VoiceSauce for audio analysis and EggWorks for EGG analysis – and all project publications are also available from this site. All preliminary analyses of the recordings using these tools (i.e. acoustic and EGG parameter values extracted from the recordings) are posted on the site in large data spreadsheets. All of these materials are made freely available under a Creative Commons Attribution-NonCommercial-ShareAlike-3.0 Unported License. This project was funded by NSF grant BCS-0720304 to Pat Keating, Abeer Alwan and Jody Kreiman of UCLA, and Christina Esposito of Macalester College. Pat Keating (UCLA)
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-20 | Press release: Opening of the ELRA License Wizard Press Release - Immediate - Paris, France, April 2, 2015
Currently, the License Wizard allows the user to choose among several licenses that exist for the use of Language Resources: ELRA, Creative Commons and META-SHARE.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-21 | EEG-face tracking- audio 24 GB data set Kara One, Toronto, Canada We are making 24 GB of a new dataset, called Kara One, freely available. This database combines 3 modalities (EEG, face tracking, and audio) during imagined and articulated speech using phonologically-relevant phonemic and single-word prompts. It is the result of a collaboration between the Toronto Rehabilitation Institute (in the University Health Network) and the Department of Computer Science at the University of Toronto.
In the associated paper (abstract below), we show how to accurately classify imagined phonological categories solely from EEG data. Specifically, we obtain up to 90% accuracy in classifying imagined consonants from imagined vowels and up to 95% accuracy in classifying stimulus from active imagination states using advanced deep-belief networks.
Data from 14 participants are available here: http://www.cs.toronto.edu/~complingweb/data/karaOne/karaOne.html.
If you have any questions, please contact Frank Rudzicz at frank@cs.toronto.edu.
Best regards, Frank
PAPER Shunan Zhao and Frank Rudzicz (2015) Classifying phonological categories in imagined and articulated speech. In Proceedings of ICASSP 2015, Brisbane Australia ABSTRACT This paper presents a new dataset combining 3 modalities (EEG, facial, and audio) during imagined and vocalized phonemic and single-word prompts. We pre-process the EEG data, compute features for all 3 modalities, and perform binary classi?cation of phonological categories using a combination of these modalities. For example, a deep-belief network obtains accuracies over 90% on identifying consonants, which is signi?cantly more accurate than two baseline supportvectormachines. Wealsoclassifybetweenthedifferent states (resting, stimuli, active thinking) of the recording, achievingaccuraciesof95%. Thesedatamaybeusedtolearn multimodal relationships, and to develop silent-speech and brain-computer interfaces.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-22 | TORGO data base free for academic use. In the spirit of the season, I would like to announce the immediate availability of the TORGO database free, in perpetuity for academic use. This database combines acoustics and electromagnetic articulography from 8 individuals with speech disorders and 7 without, and totals over 18 GB. These data can be used for multimodal models (e.g., for acoustic-articulatory inversion), models of pathology, and augmented speech recognition, for example. More information (and the database itself) can be found here: http://www.cs.toronto.edu/~complingweb/data/TORGO/torgo.html.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-23 | Datatang Datatang is a global leading data provider that specialized in data customized solution, focusing in variety speech, image, and text data collection, annotation, crowdsourcing services.
1, Speech data collection 2, Speech data synthesis 3, Speech data transcription I’ve attached our company introduction as reference, as well as available speech data lists as follows:
If you find any particular interested datasets, we could provide you samples with costs too.
Regards
Runze Zhao Oversea Sales Manager | Datatang Technology China M: +86 185 1698 2583 18 Zhongguancun St. Kemao Building Tower B 18F Beijing 100190
US M: +1 617 763 4722 640 W California Ave, Suite 210 Sunnyvale, CA 94086
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-24 | The International Standard Language Resource Number (ISLRN) assigned to LRE Map 2016 Language Resources Press Release ? Immediate
Paris, France, June 8, 2017
The International Standard Language Resource Number (ISLRN) assigned to LRE Map 2016 Language Resources
As a follow-up of the LRE Map 2016 initiative, ELRA has processed the information on existing and newly-created Language Resources provided by the authors submitting at LREC 2016 Conference (http://lrec2016.lrec-conf.org). In order to increase the visibility of these resources, ELRA has allocated ISLRNs to 106 submitted languages resources. They distribute as follows:
The meta-information for these language resources is also available on the ISLRN website with a broad international audience.
Background
As part of an international effort to document and archive the various language resource development efforts around the world, a system of assigning ISLRNs was established in November 2013. The ISLRN is a unique persistent identifier to be assigned to each language resource. The establishment of ISLRNs was a major step in the networked and shared world of human language technologies. Unique resources must be identified as they are, and meta-catalogues require a common identification format to manage data correctly. Therefore, language resources should carry identical identification schemes independent of their representations, whatever their types and wherever their physical locations (on hard drives, internet or intranet). Visit: http://islrn.org/.
About LRE Map
Initiated by ELRA and FlareNet at LREC 2010 (http://www.lrec-conf.org), the LRE Map is a mechanism intended to monitor the use and creation of language resources by collecting information on both existing and newly-created resources during the submission process. Apart from providing a portrait of the resources behind the community, of their uses and usability, the LRE Map intends to be a measuring instrument for monitoring the field of language resources. The feature has been so successful that it has been implemented also at other major conferences like COLING, IJCNLP, Interspeech, LTC, ACLHT, O-COCOSDA, RANLP, in addition to the LRE Journal. Visit: http://lremap.elra.info
About ELRA
The European Language Resources Association (ELRA) is a non-profit-making organisation founded by the European Commission in 1995, with the mission of providing a clearing house for language resources and promoting human language technologies.
To find out more about ELRA, please visit the website: http://www.elra.info
Contact: info@elda.org
The International Standard Language Resource Number (ISLRN) is becoming an increasingly widespread persistent identifier
Since the deployment of ISLRN, 3 years ago, the number of Language Resources which were allocated an ISLRN has grown significantly to reach 2500+. These LRs include raw and annotated corpora, lexicons and dictionaries, speech resources (conversational, synthesis, etc.), evaluation sets and multimodal resources, and cover 219 distinct languages (including sign languages).
In the first place, the ISLRN system has been endorsed by two large data centers, namely ELRA (European Language Resources Association) and LDC (Linguistic Data Consortium) which team up to maintain jointly the assignment process. Other significant contributions come from institutions like the Joint Research Centre (JRC), the Resource Management Agency (RMA), the Institute for Applied Linguistics (IULA) at the Universitat Pompeu Fabra (UPF).
Moreover, authors are invited to quote the ISLRN of each Language Resource they are referring to in the paper(s) they are submitting to LREC Conferences, which makes the persistent identifier a key factor of the LR citation process.
Background
As part of an international effort to document and archive the various Language Resource development efforts around the world, a system assigning ISLRNs was established in November 2013 and deployed in April 2014. The ISLRN is a unique persistent identifier to be assigned to each Language Resource. The establishment of ISLRNs was a major step in the networked and shared world of Human Language Technologies. Unique resources must be identified as they are, and meta-catalogues require a common identification format to manage data correctly. Therefore, Language Resources should carry identical identification schemes regardless their representations, their types and their storage place (hard drives, internet or intranet) (http://islrn.org/).
About ELRA
The European Language Resources Association (ELRA) is a non-profit-making organisation founded by the European Commission in 1995, with the mission of providing a clearing house for Language Resources and promoting Human Language Technologies.
To find out more about ELRA, please visit the website: http://www.elra.info
References
LDC: https://www.ldc.upenn.edu
JRC: https://ec.europa.eu/jrc/en/research-topic/internet-surveillance-systems
RMA: http://rma.nwu.ac.za
UPF: http://www.iula.upf.edu and https://www.upf.edu/web/universitat
LREC Conferences: www.lrec-conf.org
Read also: Valérie Mapelli, Vladimir Popescu, Lin Liu and Khalid Choukri, Language Resource Citation: the ISLRN Dissemination and Further Developments, in Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portoro?, Slovenia: http://www.lrec-conf.org/proceedings/lrec2016/summaries/1251.html
Contact: mapelli@elda.org
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-25 | Fearless Steps Corpus (University of Texas, Dallas) Fearless Steps Corpus John H.L. Hansen, Abhijeet Sangwan, Lakshmish Kaushik, Chengzhu Yu Center for Robust Speech Systems (CRSS), Eric Jonsson School of Engineering, The University of Texas at Dallas (UTD), Richardson, Texas, U.S.A.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-26 | SIWIS French Speech Synthesis Database The SIWIS French Speech Synthesis Database includes high quality French speech recordings and associated text files, aimed at building TTS systems, investigate multiple styles, and emphasis. A total of 9750 utterances from various sources such as parliament debates and novels were uttered by a professional French voice talent. A subset of the database contains emphasised words in many different contexts. The database includes more than ten hours of speech data and is freely available.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-27 | The new ELRA Catalogue of Language Resources (February 2018) Press Release - Immediate
|
5-3-1 | ROCme!: a free tool for audio corpora recording and management ROCme!: nouveau logiciel gratuit pour l'enregistrement et la gestion de corpus audio.
| ||
5-3-2 | VocalTractLab 2.0 : A tool for articulatory speech synthesis VocalTractLab 2.0 : A tool for articulatory speech synthesis
| ||
5-3-3 | Bob signal-processing and machine learning toolbox (v.1.2..0)
It is developed by the Biometrics Group at Idiap in Switzerland.
Dr. Elie Khoury Post Doctorant Biometric Person Recognition Group
IDIAP Research Institute (Switzerland) Tel : +41 27 721 77 23
| ||
5-3-4 | COVAREP: A Cooperative Voice Analysis Repository for Speech Technologies ======================
CALL for contributions
======================
We are pleased to announce the creation of an open-source repository of advanced speech processing algorithms called COVAREP (A Cooperative Voice Analysis Repository for Speech Technologies). COVAREP has been created as a GitHub project (https://github.com/covarep/covarep) where researchers in speech processing can store original implementations of published algorithms.
Over the past few decades a vast array of advanced speech processing algorithms have been developed, often offering significant improvements over the existing state-of-the-art. Such algorithms can have a reasonably high degree of complexity and, hence, can be difficult to accurately re-implement based on article descriptions. Another issue is the so-called 'bug magnet effect' with re-implementations frequently having significant differences from the original. The consequence of all this has been that many promising developments have been under-exploited or discarded, with researchers tending to stick to conventional analysis methods.
By developing the COVAREP repository we are hoping to address this by encouraging authors to include original implementations of their algorithms, thus resulting in a single de facto version for the speech community to refer to.
We envisage a range of benefits to the repository:
1) Reproducible research: COVAREP will allow fairer comparison of algorithms in published articles.
2) Encouraged usage: the free availability of these algorithms will encourage researchers from a wide range of speech-related disciplines (both in academia and industry) to exploit them for their own applications.
3) Feedback: as a GitHub project users will be able to offer comments on algorithms, report bugs, suggest improvements etc.
SCOPE
We welcome contributions from a wide range of speech processing areas, including (but not limited to): Speech analysis, synthesis, conversion, transformation, enhancement, speech quality, glottal source/voice quality analysis, etc.
REQUIREMENTS
In order to achieve a reasonable standard of consistency and homogeneity across algorithms we have compiled a list of requirements for prospective contributors to the repository. However, we intend the list of the requirements not to be so strict as to discourage contributions.
LICENCE
Getting contributing institutions to agree to a homogenous IP policy would be close to impossible. As a result COVAREP is a repository and not a toolbox, and each algorithm will have its own licence associated with it. Though flexible to different licence types, contributions will need to have a licence which is compatible with the repository, i.e. {GPL, LGPL, X11, Apache, MIT} or similar. We would encourage contributors to try to obtain LGPL licences from their institutions in order to be more industry friendly.
CONTRIBUTE!
We believe that the COVAREP repository has a great potential benefit to the speech research community and we hope that you will consider contributing your published algorithms to it. If you have any questions, comments issues etc regarding COVAREP please contact us on one of the email addresses below. Please forward this email to others who may be interested.
Existing contributions include: algorithms for spectral envelope modelling, adaptive sinusoidal modelling, fundamental frequncy/voicing decision/glottal closure instant detection algorithms, methods for detecting non-modal phonation types etc.
Gilles Degottex <degottex@csd.uoc.gr>, John Kane <kanejo@tcd.ie>, Thomas Drugman <thomas.drugman@umons.ac.be>, Tuomo Raitio <tuomo.raitio@aalto.fi>, Stefan Scherer <scherer@ict.usc.edu>
Website - http://covarep.github.io/covarep
GitHub - https://github.com/covarep/covarep
| ||
5-3-5 | Release of the version 2 of FASST (Flexible Audio Source Separation Toolbox).Release of the version 2 of FASST (Flexible Audio Source Separation Toolbox). http://bass-db.gforge.inria.fr/fasst/ This toolbox is intended to speed up the conception and to automate the implementation of new model-based audio source separation algorithms. It has the following additions compared to version 1: * Core in C++ * User scripts in MATLAB or python * Speedup * Multichannel audio input We provide 2 examples: 1. two-channel instantaneous NMF 2. real-world speech enhancement (2nd CHiME Challenge, Track 1)
| ||
5-3-6 | Cantor Digitalis, an open-source real-time singing synthesizer controlled by hand gestures. We are glad to announce the public realease of the Cantor Digitalis, an open-source real-time singing synthesizer controlled by hand gestures. It can be used e.g. for making music or for singing voice pedagogy. A wide variety of voices are available, from the classic vocal quartet (soprano, alto, tenor, bass), to the extreme colors of childish, breathy, roaring, etc. voices. All the features of vocal sounds are entirely under control, as the synthesis method is based on a mathematic model of voice production, without prerecording segments. The instrument is controlled using chironomy, i.e. hand gestures, with the help of interfaces like stylus or fingers on a graphic tablet, or computer mouse. Vocal dimensions such as the melody, vocal effort, vowel, voice tension, vocal tract size, breathiness etc. can easily and continuously be controlled during performance, and special voices can be prepared in advance or using presets. Check out the capabilities of Cantor Digitalis, through performances extracts from the ensemble Chorus Digitalis: http://youtu.be/_LTjM3Lihis?t=13s. In pratice, this release provides:
Regards,
The Cantor Digitalis team (who loves feedback — cantordigitalis@limsi.fr) Christophe d'Alessandro, Lionel Feugère, Olivier Perrotin http://cantordigitalis.limsi.fr/
| ||
5-3-7 | MultiVec: a Multilingual and MultiLevel Representation Learning Toolkit for NLP
We are happy to announce the release of our new toolkit “MultiVec” for computing continuous representations for text at different granularity levels (word-level or sequences of words). MultiVec includes Mikolov et al. [2013b]’s word2vec features, Le and Mikolov [2014]’s paragraph vector (batch and online) and Luong et al. [2015]’s model for bilingual distributed representations. MultiVec also includes different distance measures between words and sequences of words. The toolkit is written in C++ and is aimed at being fast (in the same order of magnitude as word2vec), easy to use, and easy to extend. It has been evaluated on several NLP tasks: the analogical reasoning task, sentiment analysis, and crosslingual document classification. The toolkit also includes C++ and Python libraries, that you can use to query bilingual and monolingual models.
The project is fully open to future contributions. The code is provided on the project webpage (https://github.com/eske/multivec) with installation instructions and command-line usage examples.
When you use this toolkit, please cite:
@InProceedings{MultiVecLREC2016, Title = {{MultiVec: a Multilingual and MultiLevel Representation Learning Toolkit for NLP}}, Author = {Alexandre Bérard and Christophe Servan and Olivier Pietquin and Laurent Besacier}, Booktitle = {The 10th edition of the Language Resources and Evaluation Conference (LREC 2016)}, Year = {2016}, Month = {May} }
The paper is available here: https://github.com/eske/multivec/raw/master/docs/Berard_and_al-MultiVec_a_Multilingual_and_Multilevel_Representation_Learning_Toolkit_for_NLP-LREC2016.pdf
Best regards,
Alexandre Bérard, Christophe Servan, Olivier Pietquin and Laurent Besacier
| ||
5-3-8 | An android application for speech data collection LIG_AIKUMA We are pleased to announce the release of LIG_AIKUMA, an android application for speech data collection, specially dedicated to language documentation. LIG_AIKUMA is an improved version of the Android application (AIKUMA) initially developed by Steven Bird and colleagues. Features were added to the app in order to facilitate the collection of parallel speech data in line with the requirements of a French-German project (ANR/DFG BULB - Breaking the Unwritten Language Barrier).
The resulting app, called LIG-AIKUMA, runs on various mobile phones and tablets and proposes a range of different speech collection modes (recording, respeaking, translation and elicitation). It was used for field data collections in Congo-Brazzaville resulting in a total of over 80 hours of speech.
Users who just want to use the app without access to the code can download it directly from the forge direct link: https://forge.imag.fr/frs/download.php/706/MainActivity.apk
Code is also available on demand (contact elodie.gauthier@imag.fr and laurent.besacier@imag.fr).
More details on LIG_AIKUMA can be found on the following paper: http://www.sciencedirect.com/science/article/pii/S1877050916300448
| ||
5-3-9 | Web services via ALL GO from IRISA-CNRS It is our pleasure to introduce A||GO (https://allgo.inria.fr/ or http://allgo.irisa.fr/), a platform providing a collection of web-services for the automatic analysis of various data, including multimedia content across modalities. The platform builds on the back-end web service deployment infrastructure developed and maintained by Inria?s Service for Experimentation and Development (SED). Originally dedicated to multimedia content, A||GO progressively broadened to other fields such as computational biology, networks and telecommunications, computational graphics or computational physics.
| ||
5-3-10 | Clickable map - Illustrations of the IPA Clickable map - Illustrations of the IPA
| ||
5-3-11 | LIG-Aikuma running on mobile phones and tablets
|