ISCApad #245 |
Saturday, November 10, 2018 by Chris Wellekens |
5-1-1 | J.Li, L.Deng, R.Haeb-Umbach and Y.Gong, 'Robust Automatic Speech Recognition', Academic Press 'Robust Automatic Speech Recognition'
| ||||||||||
5-1-2 | Barbosa, P. A. and Madureira, S. Manual de Fonética Acústica Experimental. Aplicações a dados do português. 591 p. São Paulo: Cortez, 2015. [In Portuguese] Barbosa, P. A. and Madureira, S. Manual de Fonética Acústica Experimental. Aplicações a dados do português. 591 p. São Paulo: Cortez, 2015. [In Portuguese]
| ||||||||||
5-1-3 | Damien Nouvel, Inalco, Maud Ehrmann, EPFL,Sophie Rosset, CNRS. Les entités nommées pour le traitement automatique des langues Les entités nommées pour le traitement automatique des langues
| ||||||||||
5-1-4 | R.Fuchs, 'Speech Rhythm in Varieties of English' , Springer R.Fuchs, 'Speech Rhythm in Varieties of English' has appeared with Springer, in the 'Prosody, Phonology and Phonetics' series: https://www.springer.com/gp/book/9783662478172
| ||||||||||
5-1-5 | Pejman Mowlaee et al., 'Phase-Aware Signal Processing in Speech Communication: Theory and Practice', Wiley 2016Phase-Aware Signal Processing in Speech Communication: Theory and Practice
http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1119238811.html
| ||||||||||
5-1-6 | Jean Caelen, Anne Xuereb, 'Dialogue : altérité, interaction, énaction'
Jean Caelen,Anne Xuereb Dialogue : altérité, interaction, énaction Editions universitaires européennes
| ||||||||||
5-1-7 | Bäckström, Tom (with Guillaume Fuchs, Sascha Disch, Christian Uhle and Jeremie Lecomte), 'Speech Coding with Code-Excited Linear Prediction', Springer
| ||||||||||
5-1-8 | Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey (Eds), 'New Era for Robust Seech Recognition', Springer. Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey (Eds), 'New Era for Robust Seech Recognition', Springer. https://link.springer.com/book/10.1007%2F978-3-319-64680-0
| ||||||||||
5-1-9 | Fabrice Marsac, Rudolph Sock, CONSÉCUTIVITÉ ET SIMULTANÉITÉ en Linguistique, Langues et Parole, L'Harmattan,France Nous avons le plaisir de vous annoncer la parution du volume thématique « CONSÉCUTIVITÉ ET SIMULTANÉITÉ en Linguistique, Langues et Parole » dans la Collection Dixit Grammatica (L’Harmattan, France) :
| ||||||||||
5-1-10 | Emmanuel Vincent (Editor), Tuomas Virtanen (Editor), Sharon Gannot (Editor), 'Audio Source Separation and Speech Enhancement', Wiley Emmanuel Vincent (Editor), Tuomas Virtanen (Editor), Sharon Gannot (Editor), Audio Source Separation and Speech Enhancement:
ISBN: 978-1-119-27989-1 October 2018 504 pages
|
5-2-1 | Linguistic Data Consortium (LDC) update (October 2018)
In this newsletter:
Fall 2018 LDC Data Scholarship Recipients
Membership Year 2019 Publication Preview
Concretely Annotated English Gigaword
Fall 2018 LDC Data Scholarship Recipients
Utkrist Adhikari: University of Bonn (Germany); M.Sc, Computer Science. Utkrist is awarded a copy of Treebank-2 for his research in named entity recognition, super sense tagging, and semantic role labeling.
Vitaliya Remneva: Higher School of Economics, National Research University (Russia); M.Sc, System and Software Engineering. Vitaliya is awarded a copy of ETS Corpus of Non-Native Written English for her work in author profiling through natural language processing.
Tian Xiaoyu: Shanghai International Studies University (China); MA, Linguistics. Tian is awarded a copy of Tagged Chinese Gigaword Version 2.0 for her research in causative construction variations in Mainland Chinese, Taiwan Chinese, and Singapore Chinese.
W. Victor H. Yarlott: Florida International University (US); Ph.D., School of Computing and Information Sciences. Victor is awarded a copy of ACE2005 Multilingual Training Corpus for his research in relation extraction.
For information about the program, visit the Data Scholarship page.
Membership Year 2019 Publication Preview
The 2019 Membership Year is fast approaching and plans for next year’s publications are in progress. Among the expected releases are:
SRI Speech-Based Collaborative Learning Corpus: speech from over 100 US middle school students performing collaborative learning tasks, includes audio recordings, orthographic transcriptions, manual annotation of collaboration, and related documentation
Chinese Abstract Meaning Representation (AMR): developed by Nanjing Normal University and Brandeis University, semantic representation of approximately 10,000 Chinese sentences following the basic principles of AMR using web source data from Chinese Treebank 8.0 (LDC2013T21)
Multilanguage conversational telephone speech: developed to support language identification research in related languages (Arabic, East Asian, English, Mandarin)
TAC KBP: English entity discovery and linking, nugget detection, and event argument data, Chinese slot-filling data
IARPA Babel Language Packs (telephone speech and transcripts): languages include Amharic, Guarani, Igbo, and Lithuanian
HAVIC Med Progress Test data: web video, metadata, and annotations for developing multimedia systems
BOLT: discussion forums, SMS, word-aligned and tagged data in all languages (Chinese, Egyptian Arabic, English)
Check your inbox in the coming weeks for more information about membership renewal.
(1) Concretely Annotated English Gigaword was developed by Johns Hopkins University's Human Language Technology Center of Excellence. It adds multiple kinds and instances of automatically-generated syntactic, semantic, and coreference annotations to English Gigaword Fifth Edition (LDC2011T07). Concrete is a schema for representing structured, hierarchical, and overlapping linguistic annotations. This release provides multiple tool outputs producing the same annotation types as different annotation theories under a shared tokenization.
Concretely Annotated English Gigaword contains the nearly ten million documents (over four billion words) of the original English Gigaword Fifth Edition, which consists of newswire stories from seven sources collected by LDC between 1994-2010.
Concretely Annotated English Gigaword is distributed via hard drive.
2018 Subscription Members will automatically receive copies of this corpus. 2018 Standard Members may request a copy as part of their 16 free membership corpora. Any organization that licensed English Gigaword Fifth Edition (LDC2011T07) or Annotated English Gigaword (LDC2012T21) may request a copy of Concretely Annotated English Gigaword for a $250 media fee. Non-members may license this data for $6,000.
*
(2) TAC KBP English Regular Slot Filling - Comprehensive Training and Evaluation Data 2009-2014 was developed by LDC and contains training and evaluation data produced in support of the TAC KBP Slot Filling evaluation track conducted from 2009 to 2014.
Text Analysis Conference (TAC) is a series of workshops organized by the National Institute of Standards and Technology (NIST). TAC was developed to encourage research in natural language processing and related applications by providing a large test collection, common evaluation procedures, and a forum for researchers to share their results.
The regular English Slot Filling evaluation track involved mining information about entities from text. In completing the task, participating systems and LDC annotators searched a corpus for information on certain attributes (slots) of person and organization entities and attempted to return all valid answers (slot fillers) in the source collection. For more information about English Slot Filling, please refer to the 2014 track home page.
This release contains queries, the 'manual runs' (human-produced responses to the queries), and the final rounds of assessment results.
TAC KBP English Regular Slot Filling - Comprehensive Training and Evaluation Data 2009-2014 is distributed via web download.
2018 Subscription Members will automatically receive copies of this corpus. 2018 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for $1,000.
*
(3) TRAD Arabic-French Parallel Text -- Newswire was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 20,000 Arabic words from NIST 2008 Open Machine Translation (OpenMT) Evaluation (LDC2010T21). The purpose of the PEA-TRAD project (Translation as a Support for Document Analysis) was to develop speech-to-speech translation technology for multiple languages (e.g., Arabic, Chinese, Pashto) from a variety of domains.
This release consists of 813 segments (translations units) from 74 documents. The Arabic source file contains 19,902 words and the French reference translation contains 29,104 words. The source data is Arabic newswire text collected and translated into English by LDC. Information about the ELDA translation team, translation guidelines, and validation results is contained in the documentation accompanying this release.
TRAD Arabic-French Parallel Text -- Newswire is distributed via web download.
2018 Subscription Members will receive copies of this corpus provided they have submitted a completed copy of the special license agreement. 2018 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for $350.
*
Membership Office
University of Pennsylvania
T: +1-215-573-1275
E: ldc@ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-2 | ELRA - Language Resources Catalogue - Update (October 2018) ELRA - Language Resources Catalogue - Update
-------------------------------------------------------
We are happy to announce that 2 new Written Corpora and 4 new Speech resources are now available in our catalogue. ELRA-W0126 Training and test data for Arabizi detection and transliteration ISLRN: 986-364-744-303-9 The dataset is composed of : a collection of mixed English and Arabizi text intended to train and test a system for the automatic detection of code-switching in mixed English and Arabizi texts ; and a set of 3,452 Arabizi tokens manually transliterated into Arabic, intended to train and test a system that performs Arabizi to Arabic transliteration. For more information, see: http://catalog.elra.info/en-us/repository/browse/ELRA-W0126/ ELRA-W0127 Normalized Arabic Fragments for Inestimable Stemming (NAFIS)
ISLRN: 305-450-745-774-1 This is an Arabic stemming gold standard corpus composed by a collection of 37 sentences, selected to be representative of Arabic stemming tasks and manually annotated. Compiled sentences belong to various sources (poems, holy Quran, books, and periodics) of diversified kinds (proverb and dictum, article commentary, religious text, literature, historical fiction). NAFIS is represented according to the TEI standard. For more information, see: http://catalog.elra.info/en-us/repository/browse/ELRA-W0127/ ELRA-S0396 Mbochi speech corpus ISLRN: 747-055-093-447-8 This corpus consists of 5131 sentences recorded in Mbochi, together with their transcription and French translation, as well as the results from the work made during JSALT workshop: alignments at the phonetic level and various results of unsupervised word segmentation from audio. The audio corpus is made up of 4,5 hours, downsampled at 16kHz, 16bits, with Linear PCM encoding. Data is distributed into 2 parts, one for training consisting of 4617 sentences, and one for development consisting of 514 sentences. For more information, see: http://catalog.elra.info/en-us/repository/browse/ELRA-S0396/ ELRA-S0397 Chinese Mandarin (South) database ISLRN: 503-886-852-083-2 This database contains the recordings of 1000 Chinese Mandarin speakers from Southern China (500 males and 500 females), from 18 to 60 years? old, recorded in quiet studios. Recordings were made through microphone headsets and consist of 341 hours of audio data (about 30 minutes per speaker), stored in .WAV files as sequences of 48 KHz Mono, 16 bits, Linear PCM. For more information, see: http://catalog.elra.info/en-us/repository/browse/ELRA-S0397/ ELRA-S0398 Chinese Mandarin (North) database
ISLRN: 353-548-770-894-7 This database contains the recordings of 500 Chinese Mandarin speakers from Northern China (250 males and 250 females), from 18 to 60 years? old, recorded in quiet studios. Recordings were made through microphone headsets and consist of 172 hours of audio data (about 30 minutes per speaker), stored in .WAV files as sequences of 48 KHz Mono, 16 bits, Linear PCM. For more information, see: http://catalog.elra.info/en-us/repository/browse/ELRA-S0398/ ELRA-S0401 Persian Audio Dictionary ISLRN: 133-181-128-420-9 This dictionary consists of more than 50,000 entries (along with almost all wordforms and proper names) with corresponding audio files in MP3 and English transliterations. The words have been recorded with standard Persian (Farsi) pronunciation (all by a single speaker). This dictionary is provided with its software. For more information, see: http://catalog.elra.info/en-us/repository/browse/ELRA-S0401/ For more information on the catalogue, please contact Valérie Mapelli mailto:mapelli@elda.org If you would like to enquire about having your resources distributed by ELRA, please do not hesitate to contact us. Visit the Universal Catalogue: http://universal.elra.info Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/en/catalogues/language-resources-announcements/
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-3 | Appen ButlerHill
Appen ButlerHill A global leader in linguistic technology solutions RECENT CATALOG ADDITIONS—MARCH 2012 1. Speech Databases 1.1 Telephony
2. Pronunciation Lexica Appen Butler Hill has considerable experience in providing a variety of lexicon types. These include: Pronunciation Lexica providing phonemic representation, syllabification, and stress (primary and secondary as appropriate) Part-of-speech tagged Lexica providing grammatical and semantic labels Other reference text based materials including spelling/mis-spelling lists, spell-check dictionar-ies, mappings of colloquial language to standard forms, orthographic normalization lists. Over a period of 15 years, Appen Butler Hill has generated a significant volume of licensable material for a wide range of languages. For holdings information in a given language or to discuss any customized development efforts, please contact: sales@appenbutlerhill.com
4. Other Language Resources Morphological Analyzers – Farsi/Persian & Urdu Arabic Thesaurus Language Analysis Documentation – multiple languages
For additional information on these resources, please contact: sales@appenbutlerhill.com 5. Customized Requests and Package Configurations Appen Butler Hill is committed to providing a low risk, high quality, reliable solution and has worked in 130+ languages to-date supporting both large global corporations and Government organizations. We would be glad to discuss to any customized requests or package configurations and prepare a cus-tomized proposal to meet your needs.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-4 | OFROM 1er corpus de français de Suisse romande Nous souhaiterions vous signaler la mise en ligne d'OFROM, premier corpus de français parlé en Suisse romande. L'archive est, dans version actuelle, d'une durée d'environ 15 heures. Elle est transcrite en orthographe standard dans le logiciel Praat. Un concordancier permet d'y effectuer des recherches, et de télécharger les extraits sonores associés aux transcriptions.
Pour accéder aux données et consulter une description plus complète du corpus, nous vous invitons à vous rendre à l'adresse suivante : http://www.unine.ch/ofrom.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-5 | Real-world 16-channel noise recordings We are happy to announce the release of DEMAND, a set of real-world
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-6 | Aide à la finalisation de corpus oraux ou multimodaux pour diffusion, valorisation et dépôt pérenne Aide à la finalisation de corpus oraux ou multimodaux pour diffusion, valorisation et dépôt pérenne
Le consortium IRCOM de la TGIR Corpus et l’EquipEx ORTOLANG s’associent pour proposer une aide technique et financière à la finalisation de corpus de données orales ou multimodales à des fins de diffusion et pérennisation par l’intermédiaire de l’EquipEx ORTOLANG. Cet appel ne concerne pas la création de nouveaux corpus mais la finalisation de corpus existants et non-disponibles de manière électronique. Par finalisation, nous entendons le dépôt auprès d’un entrepôt numérique public, et l’entrée dans un circuit d’archivage pérenne. De cette façon, les données de parole qui ont été enrichies par vos recherches vont pouvoir être réutilisées, citées et enrichies à leur tour de manière cumulative pour permettre le développement de nouvelles connaissances, selon les conditions d’utilisation que vous choisirez (sélection de licences d’utilisation correspondant à chacun des corpus déposés).
Cet appel d’offre est soumis à plusieurs conditions (voir ci-dessous) et l’aide financière par projet est limitée à 3000 euros. Les demandes seront traitées dans l’ordre où elles seront reçues par l’ IRCOM. Les demandes émanant d’EA ou de petites équipes ne disposant pas de support technique « corpus » seront traitées prioritairement. Les demandes sont à déposer du 1er septembre 2013 au 31 octobre 2013. La décision de financement relèvera du comité de pilotage d’IRCOM. Les demandes non traitées en 2013 sont susceptibles de l’être en 2014. Si vous avez des doutes quant à l’éligibilité de votre projet, n’hésitez pas à nous contacter pour que nous puissions étudier votre demande et adapter nos offres futures.
Pour palier la grande disparité dans les niveaux de compétences informatiques des personnes et groupes de travail produisant des corpus, L’ IRCOM propose une aide personnalisée à la finalisation de corpus. Celle-ci sera réalisée par un ingénieur IRCOM en fonction des demandes formulées et adaptées aux types de besoin, qu’ils soient techniques ou financiers.
Les conditions nécessaires pour proposer un corpus à finaliser et obtenir une aide d’IRCOM sont :
Les demandes peuvent concerner tout type de traitement : traitements de corpus quasi-finalisés (conversion, anonymisation), alignement de corpus déjà transcrits, conversion depuis des formats « traitement de textes », digitalisation de support ancien. Pour toute demande exigeant une intervention manuelle importante, les demandeurs devront s’investir en moyens humains ou financiers à la hauteur des moyens fournis par IRCOM et ORTOLANG.
IRCOM est conscient du caractère exceptionnel et exploratoire de cette démarche. Il convient également de rappeler que ce financement est réservé aux corpus déjà largement constitués et ne peuvent intervenir sur des créations ex-nihilo. Pour ces raisons de limitation de moyens, les propositions de corpus les plus avancés dans leur réalisation pourront être traitées en priorité, en accord avec le CP d’IRCOM. Il n’y a toutefois pas de limite « théorique » aux demandes pouvant être faites, IRCOM ayant la possibilité de rediriger les demandes qui ne relèvent pas de ses compétences vers d’autres interlocuteurs.
Les propositions de réponse à cet appel d’offre sont à envoyer à ircom.appel.corpus@gmail.com. Les propositions doivent utiliser le formulaire de deux pages figurant ci-dessous. Dans tous les cas, une réponse personnalisée sera renvoyée par IRCOM.
Ces propositions doivent présenter les corpus proposés, les données sur les droits d’utilisation et de propriétés et sur la nature des formats ou support utilisés.
Cet appel est organisé sous la responsabilité d’IRCOM avec la participation financière conjointe de IRCOM et l’EquipEx ORTOLANG.
Pour toute information complémentaire, nous rappelons que le site web de l'Ircom (http://ircom.corpus-ir.fr) est ouvert et propose des ressources à la communauté : glossaire, inventaire des unités et des corpus, ressources logicielles (tutoriaux, comparatifs, outils de conversion), activités des groupes de travail, actualités des formations, ... L'IRCOM invite les unités à inventorier leur corpus oraux et multimodaux - 70 projets déjà recensés - pour avoir une meilleure visibilité des ressources déjà disponibles même si elles ne sont pas toutes finalisées.
Le comité de pilotage IRCOM
Utiliser ce formulaire pour répondre à l’appel : Merci.
Réponse à l’appel à la finalisation de corpus oral ou multimodal
Nom du corpus :
Nom de la personne à contacter : Adresse email : Numéro de téléphone :
Nature des données de corpus :
Existe-t’il des enregistrements : Quel média ? Audio, vidéo, autre… Quelle est la longueur totale des enregistrements ? Nombre de cassettes, nombre d’heures, etc. Quel type de support ? Quel format (si connu) ?
Existe-t’il des transcriptions : Quel format ? (papier, traitement de texte, logiciel de transcription) Quelle quantité (en heures, nombre de mots, ou nombre de transcriptions) ?
Disposez vous de métadonnées (présentation des droits d’auteurs et d’usage) ?
Disposez-vous d’une description précise des personnes enregistrées ?
Disposez-vous d’une attestation de consentement éclairé pour les personnes ayant été enregistrées ? En quelle année (environ) les enregistrements ont eu lieu ?
Quelle est la langue des enregistrements ?
Le corpus comprend-il des enregistrements d’enfants ou de personnes ayant un trouble du langage ou une pathologie ? Si oui, de quelle population s’agit-il ?
Dans un souci d’efficacité et pour vous conseiller dans les meilleurs délais, il nous faut disposer d’exemples des transcriptions ou des enregistrements en votre possession. Nous vous contacterons à ce sujet, mais vous pouvez d’ores et déjà nous adresser par courrier électronique un exemple des données dont vous disposez (transcriptions, métadonnées, adresse de page web contenant les enregistrements).
Nous vous remercions par avance de l’intérêt que vous porterez à notre proposition. Pour toutes informations complémentaires veuillez contacter Martine Toda martine.toda@ling.cnrs.fr ou à ircom.appel.corpus@gmail.com.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-7 | Rhapsodie: un Treebank prosodique et syntaxique de français parlé Rhapsodie: un Treebank prosodique et syntaxique de français parlé
Nous avons le plaisir d'annoncer que la ressource Rhapsodie, Corpus de français parlé annoté pour la prosodie et la syntaxe, est désormais disponible sur http://www.projet-rhapsodie.fr/
Le treebank Rhapsodie est composé de 57 échantillons sonores (5 minutes en moyenne, au total 3h de parole, 33000 mots) dotés d’une transcription orthographique et phonétique alignées au son.
Il s'agit d’une ressource de français parlé multi genres (parole privée et publique ; monologues et dialogues ; entretiens en face à face vs radiodiffusion, parole plus ou moins interactive et plus ou moins planifiée, séquences descriptives, argumentatives, oratoires et procédurales) articulée autour de sources externes (enregistrements extraits de projets antérieurs, en accord avec les concepteurs initiaux) et internes. Nous tenons en particulier à remercier les responsables des projets CFPP2000, PFC, ESLO, C-Prom ainsi que Mathieu Avanzi, Anne Lacheret, Piet Mertens et Nicolas Obin.
Les échantillons sonores (wave & MP3, pitch nettoyé et lissé), les transcriptions orthographiques (txt), les annotations macrosyntaxiques (txt), les annotations prosodiques (xml, textgrid) ainsi que les metadonnées (xml & html) sont téléchargeables librement selon les termes de la licence Creative Commons Attribution - Pas d’utilisation commerciale - Partage dans les mêmes conditions 3.0 France. Les annotations microsyntaxiques seront disponibles prochainement Les métadonnées sont également explorables en ligne grâce à un browser. Les tutoriels pour la transcription, les annotations et les requêtes sont disponibles sur le site Rhapsodie. Enfin, L’annotation prosodique est interrogeable en ligne grâce au langage de requêtes Rhapsodie QL. L'équipe Ressource Rhapsodie (Modyco, Université Paris Ouest Nanterre) Sylvain Kahane, Anne Lacheret, Paola Pietrandrea, Atanas Tchobanov, Arthur Truong. Partenaires : IRCAM (Paris), LATTICE (Paris), LPL (Aix-en-Provence), CLLE-ERSS (Toulouse).
******************************************************** Rhapsodie: a Prosodic and Syntactic Treebank for Spoken French We are pleased to announce that Rhapsodie, a syntactic and prosodic treebank of spoken French created with the aim of modeling the interface between prosody, syntax and discourse in spoken French is now available at http://www.projet-rhapsodie.fr/ The Rhapsodie treebank is made up of 57 short samples of spoken French (5 minutes long on average, amounting to 3 hours of speech and a 33 000 word corpus) endowed with an orthographical phoneme-aligned transcription . The corpus is representative of different genres (private and public speech; monologues and dialogues; face-to-face interviews and broadcasts; more or less interactive discourse; descriptive, argumentative and procedural samples, variations in planning type). The corpus samples have been mainly drawn from existing corpora of spoken French and partially created within the frame of theRhapsodie project. We would especially like to thank the coordinators of the CFPP2000, PFC, ESLO, C-Prom projects as well as Piet Mertens, Mathieu Avanzi, Anne Lacheret and Nicolas Obin. The sound samples (waves, MP3, cleaned and stylized pitch), the orthographic transcriptions (txt), the macrosyntactic annotations (txt), the prosodic annotations (xml, textgrid) as well as the metadata (xml and html) can be freely downloaded under the terms of the Creative Commons licence Attribution - Noncommercial - Share Alike 3.0 France. Microsyntactic annotation will be available soon. The metadata are searchable on line through a browser. The prosodic annotation can be explored on line through the Rhapsodie Query Language. The tutorials of transcription, annotations and Rhapsodie Query Language are available on the site.
The Rhapsodie team (Modyco, Université Paris Ouest Nanterre : Sylvain Kahane, Anne Lacheret, Paola Pietrandrea, Atanas Tchobanov, Arthur Truong. Partners: IRCAM (Paris), LATTICE (Paris), LPL (Aix-en-Provence),CLLE-ERSS (Toulouse).
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-8 | Annotation of “Hannah and her sisters” by Woody Allen. We have created and made publicly available a dense audio-visual person-oriented ground-truth annotation of a feature movie (100 minutes long): “Hannah and her sisters” by Woody Allen. Jean-Ronan Vigouroux, Louis Chevallier Patrick Pérez Technicolor Research & Innovation
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-9 | French TTS Text to Speech Synthesis:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-10 | Google 's Language Model benchmark A LM benchmark is available at:https://github.com/ciprian-chelba/1-billion-word-language-modeling-benchmark
Here is a brief description of the project.
'The purpose of the project is to make available a standard training and test setup for language modeling experiments. The training/held-out data was produced from a download at statmt.org using a combination of Bash shell and Perl scripts distributed here. This also means that your results on this data set are reproducible by the research community at large. Besides the scripts needed to rebuild the training/held-out data, it also makes available log-probability values for each word in each of ten held-out data sets, for each of the following baseline models:
ArXiv paper: http://arxiv.org/abs/1312.3005
Happy benchmarking!'
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-11 | ISLRN new portal Opening of the ISLRN Portal
ELRA, LDC, and AFNLP/Oriental-COCOSDA announce the opening of the ISLRN Portal @ www.islrn.org.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-12 | Speechocean – update (November 2018)
Spanish ASR & TTS Corpus --- Speechocean
Speechocean: The World’s Leading A.I. Data Resource & Service Supplier
At present, we can provide data services with 110+ languages and dialects across the world. For more detailed information, please visit our website: http://kingline.speechocean.com
Spanish ASR & TTS Corpus
More Information
About ASR Corpus…
About TTS Corpus…
Contact Information: Name: Xianfeng Cheng Position: Vice President Tel: +86-10-62660928; +86-10-62660053 ext.8080 Mobile: +86-13681432590 Skype: xianfeng.cheng1 Email: chengxianfeng@speechocean.com Website: www.speechocean.com http://kingline.speechocean.com
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-13 | kidLUCID: London UCL Children’s Clear Speech in Interaction Database kidLUCID: London UCL Children’s Clear Speech in Interaction Database We are delighted to announce the availability of a new corpus of spontaneous speech for children aged 9 to 14 years inclusive, produced as part of the ESRC-funded project on ‘Speaker-controlled Variability in Children's Speech in Interaction’ (PI: Valerie Hazan). Speech recordings (a total of 288 conversations) are available for 96 child participants (46M, 50F, range 9;0 to 15;0 years), all native southern British English speakers. Participants were recorded in pairs while completing the diapix spot-the-difference picture task in which the pair verbally compared two scenes, only one of which was visible to each talker. High-quality digital recordings were made in sound-treated rooms. For each conversation, a stereo audio recording is provided with each speaker on a separate channel together with a Praat Textgrid containing separate word- and phoneme-level segmentations for each speaker. There are six recordings per speaker pair made in the following conditions:
The kidLUCID corpus is available online within the OSCAAR (Online Speech/Corpora Archive and Analysis Resource) archive (https://oscaar.ci.northwestern.edu/). Free access can be requested for research purposes. Further information about the project can be found at: http://www.ucl.ac.uk/pals/research/shaps/research/shaps/research/clear-speech-strategies This work was supported by Economic and Social Research Council Grant No. RES-062- 23-3106.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-14 | Robust speech datasets and ASR software tools
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-15 | International Standard Language Resource Number (ISLRN) implemented by ELRA and LDC ELRA and LDC partner to implement ISLRN process and assign identifiers to all the Language Resources in their catalogues.
Following the meeting of the largest NLP organizations, the NLP12, and their endorsement of the International Standard Language Resource Number (ISLRN), ELRA and LDC partnered to implement the ISLRN process and to assign identifiers to all the Language Resources (LRs) in their catalogues. The ISLRN web portal was designed to enable the assignment of unique identifiers as a service free of charge for all Language Resource providers. To enhance the use of ISLRN, ELRA and LDC have collaborated to provide the ISLRN 13-digit ID to all the Language Resources distributed in their respective catalogues. Anyone who is searching the ELRA and LDC catalogues can see that each Language Resource is now identified by both the data centre ID and the ISLRN number. All providers and users of such LRs should refer to the latter in their own publications and whenever referring to the LR.
ELRA and LDC will continue their joint involvement in ISLRN through active participation in this web service.
Visit the ELRA and LDC catalogues, respectively at http://catalogue.elra.info and https://catalog.ldc.upenn.edu
Background The International Standard Language Resource Number (ISLRN) aims to provide unique identifiers using a standardised nomenclature, thus ensuring that LRs are correctly identified, and consequently, recognised with proper references for their usage in applications within R&D projects, product evaluation and benchmarking, as well as in documents and scientific papers. Moreover, this is a major step in the networked and shared world that Human Language Technologies (HLT) has become: unique resources must be identified as such and meta-catalogues need a common identification format to manage data correctly.
***About NLP12*** Representatives of the major Natural Language Processing and Computational Linguistics organizations met in Paris on 18 November 2013 to harmonize and coordinate their activities within the field.
*** About ELRA *** The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and research laboratories that creates and distributes linguistic resources for language-related education, research and technology development. To find out more about LDC, please visit our web site: https://www.ldc.upenn.edu
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-16 | Base de données LIBRE et GRATUITE pour la reconnaissance du locuteur Je me permet de vous solliciter pour contribuer à la création
d?une base de données LIBRE et GRATUITE
pour la reconnaissance du locuteur.
Plus de détails et la marche à suivre ci-dessous.
Merci beaucoup,
Anthony Larcher
Récemment, un certain nombre de laboratoires spécialisés dans la reconnaissance du locuteur dépendante du texte ont initié le projet RedDots.
Il s?agit d?une initiative volontaire sur financement propre des laboratoires.
Ce projet encourage des discussions sur les thèmes de la reconnaissance du locuteur,
la collection de corpus et les cas d?usage propres à cette technologie à travers un Google Group.
Dans le cadre du projet RedDots, l?Institute for Infocomm Research (Singapour) a développé une application Android
qui permet d?enregistrer des données sur un téléphone portable.
Cette base de données a pour but de pallier certaines lacunes des corpus existants:
- le coût (certaines bases standard sont vendues à plusieurs milliers d?euro)
- la taille limitée (le nombre limité de locuteurs ne permet plus d?évaluer les systèmes de reconnaissance de manière significative)
- la variabilité limitée (les données sont actuellement enregistrées dans plus de 5 pays dans le monde entier)
Afin de distributer une base de données, qui puisse bénéficier librement
à l?ensemble de la communauté de recherche nous vous sollicitons.
Comment faire et en combien de temps?
- inscrivez vous en 2 minutes à l?adresse suivante
- installez l?application Android sur votre téléphone en 2 minutes, saisissez l'ID et mot de passe qui vous seront envoyé par email
- enregistrez une session 3 minutes sur votre téléphone
Tout se fait en moins de 10 minutes?
Une des limitations principale des corpus existant est le nombre limité de sessions
enregistrée par locuteur et le court intervalle de temps au cours duquel ces sessions sont enregistrées.
Afin de combler ce manque nous espérons que chaque participant acceptera d?enregistrer
plusieurs sessions dans les mois à venir.
Idealement, chaque participant enregistrera 3 ou 4 minutes par semaine pendant un an.
Ou vont mes données et pour quoi sont elles utilisées?
Les données sont actuellement envoyées sur un serveur de l?Institute for Infocomm Research
à Singapour. Un institut de recherche public.
En vous enregistrant, vous acceptez que ces données soient utilisées à des fins de recherche
uniquement. ces données seront mise à disposition en ligne gratuitement tout au long du projet.
Merci pour votre contribution, n?hésitez pas à faire circuler cet email.
Plus de détails seront données prochainement dans un article soumis à INTERSPEECH 2015.
Anthony Larcher
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-17 | ISLRN adopted by Joint Research Center (JRC) of the European Commission JRC, the EC's Joint Research Centre, an important LR player: First to adopt the ISLRN initiative
The Joint Research Centre (JRC), the European Commission's in house science service, is the first organisation to use the International Standard Language Resource Number (ISLRN) initiative and has requested ISLRN 13-digit unique identifiers to its Language Resources (LR).
The current JRC LRs (downloadable from https://ec.europa.eu/jrc/en/language-technologies) with an ISLRN ID are:
Background The International Standard Language Resource Number (ISLRN) aims to provide unique identifiers using a standardised nomenclature, thus ensuring that LRs are correctly identified, and consequently, recognised with proper references for their usage in applications within R&D projects, product evaluation and benchmarking, as well as in documents and scientific papers. Moreover, this is a major step in the networked and shared world that Human Language Technologies (HLT) has become: unique resources must be identified as such and meta-catalogues need a common identification format to manage data correctly.
*** About the JRC *** As the Commission's in-house science service, the Joint Research Centre's mission is to provide EU policies with independent, evidence-based scientific and technical support throughout the whole policy cycle.
*** About ELRA ***
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-18 | Forensic database of voice recordings of 500+ Australian English speakers Forensic database of voice recordings of 500+ Australian English speakers
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-19 | Audio and Electroglottographic speech recordings
Audio and Electroglottographic speech recordings from several languages We are happy to announce the public availability of speech recordings made as part of the UCLA project 'Production and Perception of Linguistic Voice Quality'. http://www.phonetics.ucla.edu/voiceproject/voice.html Audio and EGG recordings are available for Bo, Gujarati, Hmong, Mandarin, Black Miao, Southern Yi, Santiago Matatlan/ San Juan Guelavia Zapotec; audio recordings (no EGG) are available for English and Mandarin. Recordings of Jalapa Mazatec extracted from the UCLA Phonetic Archive are also posted. All recordings are accompanied by explanatory notes and wordlists, and most are accompanied by Praat textgrids that locate target segments of interest to our project. Analysis software developed as part of the project – VoiceSauce for audio analysis and EggWorks for EGG analysis – and all project publications are also available from this site. All preliminary analyses of the recordings using these tools (i.e. acoustic and EGG parameter values extracted from the recordings) are posted on the site in large data spreadsheets. All of these materials are made freely available under a Creative Commons Attribution-NonCommercial-ShareAlike-3.0 Unported License. This project was funded by NSF grant BCS-0720304 to Pat Keating, Abeer Alwan and Jody Kreiman of UCLA, and Christina Esposito of Macalester College. Pat Keating (UCLA)
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-20 | Press release: Opening of the ELRA License Wizard Press Release - Immediate - Paris, France, April 2, 2015
Currently, the License Wizard allows the user to choose among several licenses that exist for the use of Language Resources: ELRA, Creative Commons and META-SHARE.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-21 | EEG-face tracking- audio 24 GB data set Kara One, Toronto, Canada We are making 24 GB of a new dataset, called Kara One, freely available. This database combines 3 modalities (EEG, face tracking, and audio) during imagined and articulated speech using phonologically-relevant phonemic and single-word prompts. It is the result of a collaboration between the Toronto Rehabilitation Institute (in the University Health Network) and the Department of Computer Science at the University of Toronto.
In the associated paper (abstract below), we show how to accurately classify imagined phonological categories solely from EEG data. Specifically, we obtain up to 90% accuracy in classifying imagined consonants from imagined vowels and up to 95% accuracy in classifying stimulus from active imagination states using advanced deep-belief networks.
Data from 14 participants are available here: http://www.cs.toronto.edu/~complingweb/data/karaOne/karaOne.html.
If you have any questions, please contact Frank Rudzicz at frank@cs.toronto.edu.
Best regards, Frank
PAPER Shunan Zhao and Frank Rudzicz (2015) Classifying phonological categories in imagined and articulated speech. In Proceedings of ICASSP 2015, Brisbane Australia ABSTRACT This paper presents a new dataset combining 3 modalities (EEG, facial, and audio) during imagined and vocalized phonemic and single-word prompts. We pre-process the EEG data, compute features for all 3 modalities, and perform binary classi?cation of phonological categories using a combination of these modalities. For example, a deep-belief network obtains accuracies over 90% on identifying consonants, which is signi?cantly more accurate than two baseline supportvectormachines. Wealsoclassifybetweenthedifferent states (resting, stimuli, active thinking) of the recording, achievingaccuraciesof95%. Thesedatamaybeusedtolearn multimodal relationships, and to develop silent-speech and brain-computer interfaces.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-22 | TORGO data base free for academic use. In the spirit of the season, I would like to announce the immediate availability of the TORGO database free, in perpetuity for academic use. This database combines acoustics and electromagnetic articulography from 8 individuals with speech disorders and 7 without, and totals over 18 GB. These data can be used for multimodal models (e.g., for acoustic-articulatory inversion), models of pathology, and augmented speech recognition, for example. More information (and the database itself) can be found here: http://www.cs.toronto.edu/~complingweb/data/TORGO/torgo.html.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-23 | Datatang Datatang is a global leading data provider that specialized in data customized solution, focusing in variety speech, image, and text data collection, annotation, crowdsourcing services.
1, Speech data collection 2, Speech data synthesis 3, Speech data transcription I’ve attached our company introduction as reference, as well as available speech data lists as follows:
If you find any particular interested datasets, we could provide you samples with costs too.
Regards
Runze Zhao Oversea Sales Manager | Datatang Technology China M: +86 185 1698 2583 18 Zhongguancun St. Kemao Building Tower B 18F Beijing 100190
US M: +1 617 763 4722 640 W California Ave, Suite 210 Sunnyvale, CA 94086
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-24 | The International Standard Language Resource Number (ISLRN) assigned to LRE Map 2016 Language Resources Press Release ? Immediate
Paris, France, June 8, 2017
The International Standard Language Resource Number (ISLRN) assigned to LRE Map 2016 Language Resources
As a follow-up of the LRE Map 2016 initiative, ELRA has processed the information on existing and newly-created Language Resources provided by the authors submitting at LREC 2016 Conference (http://lrec2016.lrec-conf.org). In order to increase the visibility of these resources, ELRA has allocated ISLRNs to 106 submitted languages resources. They distribute as follows:
The meta-information for these language resources is also available on the ISLRN website with a broad international audience.
Background
As part of an international effort to document and archive the various language resource development efforts around the world, a system of assigning ISLRNs was established in November 2013. The ISLRN is a unique persistent identifier to be assigned to each language resource. The establishment of ISLRNs was a major step in the networked and shared world of human language technologies. Unique resources must be identified as they are, and meta-catalogues require a common identification format to manage data correctly. Therefore, language resources should carry identical identification schemes independent of their representations, whatever their types and wherever their physical locations (on hard drives, internet or intranet). Visit: http://islrn.org/.
About LRE Map
Initiated by ELRA and FlareNet at LREC 2010 (http://www.lrec-conf.org), the LRE Map is a mechanism intended to monitor the use and creation of language resources by collecting information on both existing and newly-created resources during the submission process. Apart from providing a portrait of the resources behind the community, of their uses and usability, the LRE Map intends to be a measuring instrument for monitoring the field of language resources. The feature has been so successful that it has been implemented also at other major conferences like COLING, IJCNLP, Interspeech, LTC, ACLHT, O-COCOSDA, RANLP, in addition to the LRE Journal. Visit: http://lremap.elra.info
About ELRA
The European Language Resources Association (ELRA) is a non-profit-making organisation founded by the European Commission in 1995, with the mission of providing a clearing house for language resources and promoting human language technologies.
To find out more about ELRA, please visit the website: http://www.elra.info
Contact: info@elda.org
The International Standard Language Resource Number (ISLRN) is becoming an increasingly widespread persistent identifier
Since the deployment of ISLRN, 3 years ago, the number of Language Resources which were allocated an ISLRN has grown significantly to reach 2500+. These LRs include raw and annotated corpora, lexicons and dictionaries, speech resources (conversational, synthesis, etc.), evaluation sets and multimodal resources, and cover 219 distinct languages (including sign languages).
In the first place, the ISLRN system has been endorsed by two large data centers, namely ELRA (European Language Resources Association) and LDC (Linguistic Data Consortium) which team up to maintain jointly the assignment process. Other significant contributions come from institutions like the Joint Research Centre (JRC), the Resource Management Agency (RMA), the Institute for Applied Linguistics (IULA) at the Universitat Pompeu Fabra (UPF).
Moreover, authors are invited to quote the ISLRN of each Language Resource they are referring to in the paper(s) they are submitting to LREC Conferences, which makes the persistent identifier a key factor of the LR citation process.
Background
As part of an international effort to document and archive the various Language Resource development efforts around the world, a system assigning ISLRNs was established in November 2013 and deployed in April 2014. The ISLRN is a unique persistent identifier to be assigned to each Language Resource. The establishment of ISLRNs was a major step in the networked and shared world of Human Language Technologies. Unique resources must be identified as they are, and meta-catalogues require a common identification format to manage data correctly. Therefore, Language Resources should carry identical identification schemes regardless their representations, their types and their storage place (hard drives, internet or intranet) (http://islrn.org/).
About ELRA
The European Language Resources Association (ELRA) is a non-profit-making organisation founded by the European Commission in 1995, with the mission of providing a clearing house for Language Resources and promoting Human Language Technologies.
To find out more about ELRA, please visit the website: http://www.elra.info
References
LDC: https://www.ldc.upenn.edu
JRC: https://ec.europa.eu/jrc/en/research-topic/internet-surveillance-systems
RMA: http://rma.nwu.ac.za
UPF: http://www.iula.upf.edu and https://www.upf.edu/web/universitat
LREC Conferences: www.lrec-conf.org
Read also: Valérie Mapelli, Vladimir Popescu, Lin Liu and Khalid Choukri, Language Resource Citation: the ISLRN Dissemination and Further Developments, in Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portoro?, Slovenia: http://www.lrec-conf.org/proceedings/lrec2016/summaries/1251.html
Contact: mapelli@elda.org
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-25 | Fearless Steps Corpus (University of Texas, Dallas) Fearless Steps Corpus John H.L. Hansen, Abhijeet Sangwan, Lakshmish Kaushik, Chengzhu Yu Center for Robust Speech Systems (CRSS), Eric Jonsson School of Engineering, The University of Texas at Dallas (UTD), Richardson, Texas, U.S.A.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-26 | SIWIS French Speech Synthesis Database The SIWIS French Speech Synthesis Database includes high quality French speech recordings and associated text files, aimed at building TTS systems, investigate multiple styles, and emphasis. A total of 9750 utterances from various sources such as parliament debates and novels were uttered by a professional French voice talent. A subset of the database contains emphasised words in many different contexts. The database includes more than ten hours of speech data and is freely available.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5-2-27 | The new ELRA Catalogue of Language Resources (February 2018) Press Release - Immediate
|
5-3-1 | ROCme!: a free tool for audio corpora recording and management ROCme!: nouveau logiciel gratuit pour l'enregistrement et la gestion de corpus audio.
| |||||
5-3-2 | VocalTractLab 2.0 : A tool for articulatory speech synthesis VocalTractLab 2.0 : A tool for articulatory speech synthesis
| |||||
5-3-3 | Bob signal-processing and machine learning toolbox (v.1.2..0)
It is developed by the Biometrics Group at Idiap in Switzerland.
Dr. Elie Khoury Post Doctorant Biometric Person Recognition Group
IDIAP Research Institute (Switzerland) Tel : +41 27 721 77 23
| |||||
5-3-4 | COVAREP: A Cooperative Voice Analysis Repository for Speech Technologies ======================
CALL for contributions
======================
We are pleased to announce the creation of an open-source repository of advanced speech processing algorithms called COVAREP (A Cooperative Voice Analysis Repository for Speech Technologies). COVAREP has been created as a GitHub project (https://github.com/covarep/covarep) where researchers in speech processing can store original implementations of published algorithms.
Over the past few decades a vast array of advanced speech processing algorithms have been developed, often offering significant improvements over the existing state-of-the-art. Such algorithms can have a reasonably high degree of complexity and, hence, can be difficult to accurately re-implement based on article descriptions. Another issue is the so-called 'bug magnet effect' with re-implementations frequently having significant differences from the original. The consequence of all this has been that many promising developments have been under-exploited or discarded, with researchers tending to stick to conventional analysis methods.
By developing the COVAREP repository we are hoping to address this by encouraging authors to include original implementations of their algorithms, thus resulting in a single de facto version for the speech community to refer to.
We envisage a range of benefits to the repository:
1) Reproducible research: COVAREP will allow fairer comparison of algorithms in published articles.
2) Encouraged usage: the free availability of these algorithms will encourage researchers from a wide range of speech-related disciplines (both in academia and industry) to exploit them for their own applications.
3) Feedback: as a GitHub project users will be able to offer comments on algorithms, report bugs, suggest improvements etc.
SCOPE
We welcome contributions from a wide range of speech processing areas, including (but not limited to): Speech analysis, synthesis, conversion, transformation, enhancement, speech quality, glottal source/voice quality analysis, etc.
REQUIREMENTS
In order to achieve a reasonable standard of consistency and homogeneity across algorithms we have compiled a list of requirements for prospective contributors to the repository. However, we intend the list of the requirements not to be so strict as to discourage contributions.
LICENCE
Getting contributing institutions to agree to a homogenous IP policy would be close to impossible. As a result COVAREP is a repository and not a toolbox, and each algorithm will have its own licence associated with it. Though flexible to different licence types, contributions will need to have a licence which is compatible with the repository, i.e. {GPL, LGPL, X11, Apache, MIT} or similar. We would encourage contributors to try to obtain LGPL licences from their institutions in order to be more industry friendly.
CONTRIBUTE!
We believe that the COVAREP repository has a great potential benefit to the speech research community and we hope that you will consider contributing your published algorithms to it. If you have any questions, comments issues etc regarding COVAREP please contact us on one of the email addresses below. Please forward this email to others who may be interested.
Existing contributions include: algorithms for spectral envelope modelling, adaptive sinusoidal modelling, fundamental frequncy/voicing decision/glottal closure instant detection algorithms, methods for detecting non-modal phonation types etc.
Gilles Degottex <degottex@csd.uoc.gr>, John Kane <kanejo@tcd.ie>, Thomas Drugman <thomas.drugman@umons.ac.be>, Tuomo Raitio <tuomo.raitio@aalto.fi>, Stefan Scherer <scherer@ict.usc.edu>
Website - http://covarep.github.io/covarep
GitHub - https://github.com/covarep/covarep
| |||||
5-3-5 | Release of the version 2 of FASST (Flexible Audio Source Separation Toolbox).Release of the version 2 of FASST (Flexible Audio Source Separation Toolbox). http://bass-db.gforge.inria.fr/fasst/ This toolbox is intended to speed up the conception and to automate the implementation of new model-based audio source separation algorithms. It has the following additions compared to version 1: * Core in C++ * User scripts in MATLAB or python * Speedup * Multichannel audio input We provide 2 examples: 1. two-channel instantaneous NMF 2. real-world speech enhancement (2nd CHiME Challenge, Track 1)
| |||||
5-3-6 | Cantor Digitalis, an open-source real-time singing synthesizer controlled by hand gestures. We are glad to announce the public realease of the Cantor Digitalis, an open-source real-time singing synthesizer controlled by hand gestures. It can be used e.g. for making music or for singing voice pedagogy. A wide variety of voices are available, from the classic vocal quartet (soprano, alto, tenor, bass), to the extreme colors of childish, breathy, roaring, etc. voices. All the features of vocal sounds are entirely under control, as the synthesis method is based on a mathematic model of voice production, without prerecording segments. The instrument is controlled using chironomy, i.e. hand gestures, with the help of interfaces like stylus or fingers on a graphic tablet, or computer mouse. Vocal dimensions such as the melody, vocal effort, vowel, voice tension, vocal tract size, breathiness etc. can easily and continuously be controlled during performance, and special voices can be prepared in advance or using presets. Check out the capabilities of Cantor Digitalis, through performances extracts from the ensemble Chorus Digitalis: http://youtu.be/_LTjM3Lihis?t=13s. In pratice, this release provides:
Regards,
The Cantor Digitalis team (who loves feedback — cantordigitalis@limsi.fr) Christophe d'Alessandro, Lionel Feugère, Olivier Perrotin http://cantordigitalis.limsi.fr/
| |||||
5-3-7 | MultiVec: a Multilingual and MultiLevel Representation Learning Toolkit for NLP
We are happy to announce the release of our new toolkit “MultiVec” for computing continuous representations for text at different granularity levels (word-level or sequences of words). MultiVec includes Mikolov et al. [2013b]’s word2vec features, Le and Mikolov [2014]’s paragraph vector (batch and online) and Luong et al. [2015]’s model for bilingual distributed representations. MultiVec also includes different distance measures between words and sequences of words. The toolkit is written in C++ and is aimed at being fast (in the same order of magnitude as word2vec), easy to use, and easy to extend. It has been evaluated on several NLP tasks: the analogical reasoning task, sentiment analysis, and crosslingual document classification. The toolkit also includes C++ and Python libraries, that you can use to query bilingual and monolingual models.
The project is fully open to future contributions. The code is provided on the project webpage (https://github.com/eske/multivec) with installation instructions and command-line usage examples.
When you use this toolkit, please cite:
@InProceedings{MultiVecLREC2016, Title = {{MultiVec: a Multilingual and MultiLevel Representation Learning Toolkit for NLP}}, Author = {Alexandre Bérard and Christophe Servan and Olivier Pietquin and Laurent Besacier}, Booktitle = {The 10th edition of the Language Resources and Evaluation Conference (LREC 2016)}, Year = {2016}, Month = {May} }
The paper is available here: https://github.com/eske/multivec/raw/master/docs/Berard_and_al-MultiVec_a_Multilingual_and_Multilevel_Representation_Learning_Toolkit_for_NLP-LREC2016.pdf
Best regards,
Alexandre Bérard, Christophe Servan, Olivier Pietquin and Laurent Besacier
| |||||
5-3-8 | An android application for speech data collection LIG_AIKUMA We are pleased to announce the release of LIG_AIKUMA, an android application for speech data collection, specially dedicated to language documentation. LIG_AIKUMA is an improved version of the Android application (AIKUMA) initially developed by Steven Bird and colleagues. Features were added to the app in order to facilitate the collection of parallel speech data in line with the requirements of a French-German project (ANR/DFG BULB - Breaking the Unwritten Language Barrier).
The resulting app, called LIG-AIKUMA, runs on various mobile phones and tablets and proposes a range of different speech collection modes (recording, respeaking, translation and elicitation). It was used for field data collections in Congo-Brazzaville resulting in a total of over 80 hours of speech.
Users who just want to use the app without access to the code can download it directly from the forge direct link: https://forge.imag.fr/frs/download.php/706/MainActivity.apk
Code is also available on demand (contact elodie.gauthier@imag.fr and laurent.besacier@imag.fr).
More details on LIG_AIKUMA can be found on the following paper: http://www.sciencedirect.com/science/article/pii/S1877050916300448
| |||||
5-3-9 | Web services via ALL GO from IRISA-CNRS It is our pleasure to introduce A||GO (https://allgo.inria.fr/ or http://allgo.irisa.fr/), a platform providing a collection of web-services for the automatic analysis of various data, including multimedia content across modalities. The platform builds on the back-end web service deployment infrastructure developed and maintained by Inria?s Service for Experimentation and Development (SED). Originally dedicated to multimedia content, A||GO progressively broadened to other fields such as computational biology, networks and telecommunications, computational graphics or computational physics.
| |||||
5-3-10 | Clickable map - Illustrations of the IPA Clickable map - Illustrations of the IPA
| |||||
5-3-11 | LIG-Aikuma running on mobile phones and tablets
| |||||
5-3-12 | Python Library Nous sommes heureux d'annoncer la mise à disposition du public de la
première bibliothèque en langage Python pour convertir des nombres écrits en
français en leur représentation en chiffres.
L'analyseur est robuste et est capable de segmenter et substituer les expressions
de nombre dans un flux de mots, comme une conversation par exemple. Il reconnaît les différentes
variantes de la langue (quantre-vingt-dix / nonante?) et traduit aussi bien les
ordinaux que les entiers, les nombres décimaux et les séquences formelles (n° de téléphone, CB?).
Nous espérons que cet outil sera utile à celles et ceux qui, comme nous, font du traitment
du langage naturel en français.
Cette bibliothèque est diffusée sous license MIT qui permet une utilisation très libre.
Sources : https://github.com/allo-media/text2num
|