ISCApad #191 |
Monday, May 12, 2014 by Chris Wellekens |
5-1-1 | Tuomas Virtanen, Rita Singh, Bhiksha Raj (editors), Techniques for Noise Robustness in Automatic Speech Recognition, Wiley
| ||
5-1-2 | Niebuhr, Oliver (ed.), Understanding Prosody: The Role of Context, Function and Communication. Series: Language, Context and Cognition 13, De Gruyter. http://www.degruyter.com/view/product/186201?format=G or http://linguistlist.org/pubs/books/get-book.cfm?BookID=63238
The volume represents a state-of-the-art snapshot of the research on prosody for phoneticians, linguists and speech technologists. It covers well-known models and languages. How are prosodies linked to speech sounds? What are the relations between prosody and grammar? What does speech perception tell us about prosody, particularly about the constituting elements of intonation and rhythm? The papers of the volume address questions like these with a special focus on how the notion of context-based coding, the knowledge of prosodic functions and the communicative embedding of prosodic elements can advance our understanding of prosody.
| ||
5-1-3 | Albert Di Cristo: « La Prosodie de la Parole : Une Introduction », Editions de Boeck-Solal (296 p).
Contents:
Foreword; Introduction;
Ch. 1: Elements of definition;
Ch. 2: The place of prosody in the language sciences and in the study of communication;
Ch. 3: Prosody on the two sides of interindividual oral communication (production and comprehension);
Ch. 4: Prosody and the brain;
Ch. 5: The materiality of prosody;
Ch. 6: Levels of analysis and representation of prosody;
Ch. 7: Theories and models of prosody and their formal apparatus;
Ch. 8: The plural functionality of prosody;
Ch. 9: The relations between prosody and meaning;
Epilogue;
Suggested reading;
Index of terms;
Index of proper names.
| ||
5-1-4 | Pierre-Yves Oudeyer, 'Aux sources de la parole: auto-organisation et évolution', Odile Jacob
Pierre-Yves Oudeyer, research director at Inria, has just published 'Aux sources de la parole: auto-organisation et évolution' with Odile Jacob (Sept. 2013).
The book discusses the evolution and acquisition of speech in children and in robots.
Bringing biology, linguistics, neuroscience and robotic experiments into dialogue,
it examines in particular the self-organisation phenomena that allow new languages to form spontaneously in a population of individuals.
It presents experiments in which a population of digital robots invents, shapes and negotiates its own speech system,
and explains how such robotic experiments can help us better understand humans.
It also presents recent robotic experiments, drawing on new perspectives in artificial intelligence, in which curiosity mechanisms allow a robot to discover by itself its own body, the objects around it and, finally, vocal interactions with its peers. Its cognitive development thus self-organises, and new hypotheses emerge for understanding child development.
Book website: http://goo.gl/A6EwTJ
Pierre-Yves Oudeyer,
Research Director, Inria
Head of the Flowers team
Inria Bordeaux Sud-Ouest and Ensta-ParisTech, France
Twitter: https://twitter.com/pyoudeyer
| ||
5-1-5 | Björn Schuller, Anton Batliner, Computational Paralinguistics: Emotion, Affect and Personality in Speech and Language Processing, Wiley, ISBN: 978-1-119-97136-8, 344 pages, November 2013
|
5-2-1 | ELRA - Language Resources Catalogue - Update (2014-05)
| ||
5-2-2 | ELRA releases free Language Resources
| ||
5-2-3 | LDC Newsletter (April 2014)
New publications:
(1) Domain-Specific Hyponym Relations was developed by the Shaanxi Province Key Laboratory of Satellite and Terrestrial Network Technology at Xi’an Jiaotong University, Xi’an, Shaanxi, China. It provides more than 5,000 English hyponym relations in five domains: data mining, computer networks, data structures, Euclidean geometry and microbiology. All hypernym and hyponym words were taken from Wikipedia article titles.
A hyponym relation is a word sense relation that is an IS-A relation. For example, dog is a hyponym of animal and binary tree is a hyponym of tree structure. Among the applications for domain-specific hyponym relations are taxonomy and ontology learning, query result organization in a faceted search and knowledge organization and automated reasoning in knowledge-rich applications.
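As a minimal sketch of how such IS-A pairs can support taxonomy learning, the Python snippet below stores (hyponym, hypernym) pairs and collects every transitive hypernym of a term. The function names and the pair "tree structure" IS-A "data structure" are illustrative assumptions; they are not taken from the corpus or its XML schema.

```python
from collections import defaultdict


def build_hypernym_index(relations):
    """Map each hyponym to the set of its direct hypernyms."""
    parents = defaultdict(set)
    for hyponym, hypernym in relations:
        parents[hyponym].add(hypernym)
    return parents


def ancestors(term, parents):
    """Return all direct and indirect hypernyms of `term`."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


# The first two pairs come from the text above; the third is hypothetical.
relations = [
    ("dog", "animal"),
    ("binary tree", "tree structure"),
    ("tree structure", "data structure"),
]
index = build_hypernym_index(relations)
print(sorted(ancestors("binary tree", index)))
```

Walking the relation transitively is what turns a flat list of pairs into a taxonomy: a term inherits every category above it, not only its direct hypernym.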
The data is presented in XML format, and each file provides hyponym relations in one domain. Within each file, the term, Wikipedia URL, hyponym relation and the names of the hyponym and hypernym words are included. The distribution of terms and relations is set forth in the table below:
*
(2) GALE Arabic-English Parallel Aligned Treebank -- Web Training. Parallel aligned treebanks are treebanks annotated with morphological and syntactic structures aligned at the sentence level and the sub-sentence level. Such data sets are useful for natural language processing and related fields, including automatic word alignment system training and evaluation, transfer-rule extraction, word sense disambiguation, translation lexicon extraction and cultural heritage and cross-linguistic studies. With respect to machine translation system development, parallel aligned treebanks may improve system performance with enhanced syntactic parsers, better rules and knowledge about language pairs and reduced word error rate.
In this release, the source Arabic data was translated into English. Arabic and English treebank annotations were performed independently. The parallel texts were then word aligned.
LDC previously released Arabic-English Parallel Aligned Treebanks as follows:
This release consists of Arabic source web data (newsgroups, weblogs) collected by LDC in 2004 and 2005. All data is encoded as UTF-8. A count of files, words, tokens and segments is below.
Note: Word count is based on the untokenized Arabic source; token count is based on the ATB-tokenized Arabic source.
The purpose of the GALE word alignment task was to find correspondences between words, phrases or groups of words in a set of parallel texts. Arabic-English word alignment annotation consisted of the following tasks:
GALE Arabic-English Parallel Aligned Treebank -- Web Training is distributed via web download.
2014 Subscription Members will automatically receive two copies of this data on disc. 2014 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$1750.
*
(3) Multi-Channel WSJ Audio (MCWSJ) was developed by the Centre for Speech Technology Research at the University of Edinburgh and contains approximately 100 hours of recorded speech from 45 British English speakers. Participants read Wall Street Journal texts published in 1987-1989 in three recording scenarios: a single stationary speaker, two stationary overlapping speakers and one single moving speaker.
This corpus was designed to address the challenges of speech recognition in meetings, which often occur in rooms with non-ideal acoustic conditions and significant background noise, and may contain large sections of overlapping speech. Using headset microphones represents one approach, but meeting participants may be reluctant to wear them. Microphone arrays are another option. MCWSJ supports research in large vocabulary tasks using microphone arrays. The news sentences read by speakers are taken from WSJCAM0 Cambridge Read News, a corpus originally developed for large vocabulary continuous speech recognition experiments, which in turn was based on CSR-I (WSJ0) Complete, made available by LDC to support large vocabulary continuous speech recognition initiatives.
Speakers reading news text from prompts were recorded using a headset microphone, a lapel microphone and an eight-channel microphone array. In the single speaker scenario, participants read from six fixed positions. Fixed positions were assigned for the entire recording in the overlapping scenario. For the moving scenario, participants moved from one position to the next while reading.
Fifteen speakers were recorded for the single scenario, nine pairs for the overlapping scenario and nine individuals for the moving scenario. Each read approximately 90 sentences.
Multi-Channel WSJ Audio is distributed on two DVD-ROMs.
2014 Subscription Members will receive a copy of this data provided that they have completed the User License Agreement for Multi-Channel WSJ Audio LDC2014S03. 2014 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$1500.
| ||
5-2-4 | Appen ButlerHill
Appen ButlerHill
A global leader in linguistic technology solutions
RECENT CATALOG ADDITIONS - MARCH 2012
1. Speech Databases
1.1 Telephony
2. Pronunciation Lexica
Appen Butler Hill has considerable experience in providing a variety of lexicon types. These include:
Pronunciation lexica providing phonemic representation, syllabification, and stress (primary and secondary as appropriate);
Part-of-speech tagged lexica providing grammatical and semantic labels;
Other reference text-based materials, including spelling/mis-spelling lists, spell-check dictionaries, mappings of colloquial language to standard forms, and orthographic normalization lists.
Over a period of 15 years, Appen Butler Hill has generated a significant volume of licensable material for a wide range of languages. For holdings information in a given language or to discuss any customized development efforts, please contact: sales@appenbutlerhill.com
4. Other Language Resources
Morphological Analyzers: Farsi/Persian & Urdu
Arabic Thesaurus
Language Analysis Documentation: multiple languages
For additional information on these resources, please contact: sales@appenbutlerhill.com
5. Customized Requests and Package Configurations
Appen Butler Hill is committed to providing a low-risk, high-quality, reliable solution and has worked in 130+ languages to date, supporting both large global corporations and Government organizations. We would be glad to discuss any customized requests or package configurations and prepare a customized proposal to meet your needs.
| ||
5-2-5 | OFROM, the first corpus of French from French-speaking Switzerland
We would like to announce the online release of OFROM, the first corpus of French spoken in French-speaking Switzerland (Suisse romande). In its current version, the archive is about 15 hours long. It is transcribed in standard orthography using the Praat software. A concordancer makes it possible to search the corpus and to download the sound excerpts associated with the transcriptions.
To access the data and read a fuller description of the corpus, please visit: http://www.unine.ch/ofrom.
| ||
5-2-6 | Real-world 16-channel noise recordings We are happy to announce the release of DEMAND, a set of real-world
| ||
5-2-7 | Support for finalising oral or multimodal corpora for distribution, promotion and long-term archiving
The IRCOM consortium of the TGIR Corpus and the EquipEx ORTOLANG are joining forces to offer technical and financial support for finalising corpora of oral or multimodal data, with a view to their distribution and long-term preservation through the EquipEx ORTOLANG. This call does not cover the creation of new corpora, only the finalisation of existing corpora that are not available electronically. By finalisation we mean deposit in a public digital repository and entry into a long-term archiving workflow. In this way, the speech data enriched by your research can in turn be reused, cited and further enriched, cumulatively, to enable the development of new knowledge, under the conditions of use that you choose (a usage licence is selected for each deposited corpus).
This call is subject to several conditions (see below), and financial support is limited to 3000 euros per project. Requests will be processed in the order in which IRCOM receives them. Requests from EAs or from small teams without technical "corpus" support will be given priority. Requests are to be submitted from 1 September 2013 to 31 October 2013. Funding decisions rest with the IRCOM steering committee. Requests not processed in 2013 may be processed in 2014. If you have doubts about the eligibility of your project, do not hesitate to contact us so that we can examine your request and adapt our future offers.
To compensate for the wide disparity in computing skills among the people and working groups producing corpora, IRCOM offers personalised support for corpus finalisation. It will be carried out by an IRCOM engineer according to the requests made and adapted to the type of need, whether technical or financial.
The conditions required to propose a corpus for finalisation and obtain support from IRCOM are:
Requests may concern any type of processing: processing of nearly finalised corpora (conversion, anonymisation), alignment of already transcribed corpora, conversion from word-processor formats, digitisation of old media. For any request requiring substantial manual work, applicants will have to commit human or financial resources matching those provided by IRCOM and ORTOLANG.
IRCOM is aware of the exceptional and exploratory nature of this initiative. It should also be recalled that this funding is reserved for corpora that are already largely built and cannot be used for creation from scratch. Because resources are limited, proposals for the corpora furthest along in their development may be given priority, in agreement with the IRCOM steering committee. There is, however, no "theoretical" limit on the requests that can be made, as IRCOM can redirect requests outside its remit to other contacts.
Responses to this call are to be sent to ircom.appel.corpus@gmail.com. Proposals must use the two-page form below. In all cases, IRCOM will send a personalised reply.
Proposals must describe the corpora proposed, the information on usage and ownership rights, and the nature of the formats or media used.
This call is organised under the responsibility of IRCOM, with joint financial participation from IRCOM and the EquipEx ORTOLANG.
For further information, note that the IRCOM website (http://ircom.corpus-ir.fr) is open and offers resources to the community: glossary, inventory of research units and corpora, software resources (tutorials, comparisons, conversion tools), working-group activities, training news, etc. IRCOM invites research units to inventory their oral and multimodal corpora (70 projects already listed) to give better visibility to the resources already available, even if not all of them are finalised.
The IRCOM steering committee
Please use this form to respond to the call. Thank you.
Response to the call for finalisation of an oral or multimodal corpus
Corpus name:
Name of contact person: Email address: Telephone number:
Nature of the corpus data:
Are there recordings? Which media? Audio, video, other… What is the total length of the recordings? Number of cassettes, number of hours, etc. What type of medium? What format (if known)?
Are there transcriptions? In what format? (paper, word processor, transcription software) How much (in hours, number of words, or number of transcriptions)?
Do you have metadata (statement of copyright and usage rights)?
Do you have a precise description of the people recorded?
Do you have a statement of informed consent for the people recorded? In what year (approximately) were the recordings made?
What is the language of the recordings?
Does the corpus include recordings of children or of people with a language disorder or a pathology? If so, which population?
For the sake of efficiency, and to advise you as quickly as possible, we need examples of the transcriptions or recordings in your possession. We will contact you about this, but you can already send us by email a sample of the data you have (transcriptions, metadata, address of a web page containing the recordings).
Thank you in advance for your interest in our proposal. For any further information, please contact Martine Toda (martine.toda@ling.cnrs.fr) or write to ircom.appel.corpus@gmail.com.
| ||
5-2-8 | Rhapsodie: a Prosodic and Syntactic Treebank for Spoken French
We are pleased to announce that Rhapsodie, a syntactic and prosodic treebank of spoken French created with the aim of modeling the interface between prosody, syntax and discourse in spoken French, is now available at http://www.projet-rhapsodie.fr/
The Rhapsodie treebank is made up of 57 short samples of spoken French (5 minutes long on average, amounting to 3 hours of speech and 33,000 words), each with an orthographic and a phonetic transcription aligned to the sound.
The corpus covers different genres (private and public speech; monologues and dialogues; face-to-face interviews and broadcasts; more or less interactive and more or less planned speech; descriptive, argumentative, oratorical and procedural sequences). The samples were mainly drawn from existing corpora of spoken French, in agreement with their original designers, and partly created within the Rhapsodie project. We would especially like to thank the coordinators of the CFPP2000, PFC, ESLO and C-Prom projects, as well as Mathieu Avanzi, Anne Lacheret, Piet Mertens and Nicolas Obin.
The sound samples (WAV and MP3, with cleaned and stylized pitch), the orthographic transcriptions (txt), the macrosyntactic annotations (txt), the prosodic annotations (xml, TextGrid) and the metadata (xml and html) can be freely downloaded under the terms of the Creative Commons Attribution - Noncommercial - Share Alike 3.0 France licence. Microsyntactic annotations will be available soon. The metadata can be browsed online. The prosodic annotation can be queried online through the Rhapsodie Query Language (Rhapsodie QL). Tutorials for transcription, annotation and queries are available on the site.
The Rhapsodie team (Modyco, Université Paris Ouest Nanterre): Sylvain Kahane, Anne Lacheret, Paola Pietrandrea, Atanas Tchobanov, Arthur Truong. Partners: IRCAM (Paris), LATTICE (Paris), LPL (Aix-en-Provence), CLLE-ERSS (Toulouse).
| ||
5-2-9 | COVAREP: A Cooperative Voice Analysis Repository for Speech Technologies
======================
CALL for contributions
======================
We are pleased to announce the creation of an open-source repository of advanced speech processing algorithms called COVAREP (A Cooperative Voice Analysis Repository for Speech Technologies). COVAREP has been created as a GitHub project (https://github.com/covarep/covarep) where researchers in speech processing can store original implementations of published algorithms.
Over the past few decades a vast array of advanced speech processing algorithms have been developed, often offering significant improvements over the existing state-of-the-art. Such algorithms can have a reasonably high degree of complexity and, hence, can be difficult to accurately re-implement based on article descriptions. Another issue is the so-called 'bug magnet effect' with re-implementations frequently having significant differences from the original. The consequence of all this has been that many promising developments have been under-exploited or discarded, with researchers tending to stick to conventional analysis methods.
By developing the COVAREP repository we are hoping to address this by encouraging authors to include original implementations of their algorithms, thus resulting in a single de facto version for the speech community to refer to.
We envisage a range of benefits to the repository:
1) Reproducible research: COVAREP will allow fairer comparison of algorithms in published articles.
2) Encouraged usage: the free availability of these algorithms will encourage researchers from a wide range of speech-related disciplines (both in academia and industry) to exploit them for their own applications.
3) Feedback: as a GitHub project users will be able to offer comments on algorithms, report bugs, suggest improvements etc.
SCOPE
We welcome contributions from a wide range of speech processing areas, including (but not limited to): Speech analysis, synthesis, conversion, transformation, enhancement, speech quality, glottal source/voice quality analysis, etc.
REQUIREMENTS
In order to achieve a reasonable standard of consistency and homogeneity across algorithms we have compiled a list of requirements for prospective contributors to the repository. However, we do not intend these requirements to be so strict as to discourage contributions.
LICENCE
Getting contributing institutions to agree to a homogeneous IP policy would be close to impossible. As a result COVAREP is a repository and not a toolbox, and each algorithm will have its own licence associated with it. Though flexible to different licence types, contributions will need to have a licence which is compatible with the repository, i.e. {GPL, LGPL, X11, Apache, MIT} or similar. We would encourage contributors to try to obtain LGPL licences from their institutions in order to be more industry friendly.
CONTRIBUTE!
We believe that the COVAREP repository has great potential benefit for the speech research community and we hope that you will consider contributing your published algorithms to it. If you have any questions, comments, issues, etc. regarding COVAREP, please contact us at one of the email addresses below. Please forward this email to others who may be interested.
Existing contributions include: algorithms for spectral envelope modelling, adaptive sinusoidal modelling, fundamental frequency estimation, voicing decision and glottal closure instant detection, methods for detecting non-modal phonation types, etc.
Gilles Degottex <degottex@csd.uoc.gr>, John Kane <kanejo@tcd.ie>, Thomas Drugman <thomas.drugman@umons.ac.be>, Tuomo Raitio <tuomo.raitio@aalto.fi>, Stefan Scherer <scherer@ict.usc.edu>
Website - http://covarep.github.io/covarep
GitHub - https://github.com/covarep/covarep
| ||
5-2-10 | Annotation of “Hannah and her sisters” by Woody Allen
We have created and made publicly available a dense audio-visual person-oriented ground-truth annotation of a feature movie (100 minutes long): “Hannah and her sisters” by Woody Allen.
Jean-Ronan Vigouroux, Louis Chevallier, Patrick Pérez
Technicolor Research & Innovation
| ||
5-2-11 | French TTS Text to Speech Synthesis:
| ||
5-2-12 | Google's Language Model benchmark
An LM benchmark is available at: https://code.google.com/p/1-billion-word-language-modeling-benchmark/
Here is a brief description of the project.
'The purpose of the project is to make available a standard training and test setup for language modeling experiments. The training/held-out data was produced from a download at statmt.org using a combination of Bash shell and Perl scripts distributed here. This also means that your results on this data set are reproducible by the research community at large. Besides the scripts needed to rebuild the training/held-out data, it also makes available log-probability values for each word in each of ten held-out data sets, for each of the following baseline models:
ArXiv paper: http://arxiv.org/abs/1312.3005
Happy benchmarking!'
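Because the benchmark publishes a log-probability for each held-out word, a model's perplexity can be recomputed directly from those values. The sketch below assumes the values have already been read into a list of floats and that they are base-10 logarithms; the function name and the default base are assumptions to check against the distributed files.

```python
import math


def perplexity(log_probs, base=10.0):
    """Perplexity from per-word log-probabilities.

    Computes base ** (-(1/N) * sum(log_probs)), where each entry of
    `log_probs` is the logarithm (in `base`) of a word's probability.
    """
    if not log_probs:
        raise ValueError("need at least one log-probability")
    average = sum(log_probs) / len(log_probs)
    return base ** (-average)


# Toy data: every word assigned probability 0.25 gives a perplexity of about 4.
toy_scores = [math.log10(0.25)] * 4
print(perplexity(toy_scores))
```

If the files store natural logarithms instead, pass `base=math.e`.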
| ||
5-2-13 | International Standard Language Resource Number (ISLRN) (ELRA Press release)
Press Release - Immediate - Paris, France, December 13, 2013
Establishing the International Standard Language Resource Number (ISLRN)
12 major NLP organisations announce the establishment of the ISLRN, a Persistent Unique Identifier, to be assigned to each Language Resource.
On November 18, 2013, 12 NLP organisations agreed to announce the establishment of the International Standard Language Resource Number (ISLRN), a Persistent Unique Identifier to be assigned to each Language Resource. Experiment replicability, an essential feature of scientific work, would be enhanced by such a unique identifier. Set up by ELRA, LDC and AFNLP/Oriental-COCOSDA, the ISLRN Portal will provide unique identifiers using a standardised nomenclature, as a service free of charge for all Language Resource providers. It will be supervised by a steering committee composed of representatives of participating organisations and enlarged whenever necessary.
For more information on ELRA and the ISLRN, please contact: Khalid Choukri choukri@elda.org
For more information on ELDA, please contact: Hélène Mazo mazo@elda.org
ELRA, 55-57, rue Brillat Savarin, 75013 Paris (France)
Tel.: +33 1 43 13 33 33 Fax: +33 1 43 13 33 30
| ||
5-2-14 | Speechocean May 2014 update
Speechocean – update (May 2014):
Speechocean: A global language resources and data services supplier
Speechocean has over 500 large-scale databases available in 110+ languages and accents, covering desktop, in-car, telephony and tablet PC platforms. Our data repository is large and diverse, and includes ASR databases, TTS databases, lexica, text corpora, etc.
Speechocean is glad to announce more resources that have been released:
ASR Databases
Speechocean provides corpora in 110+ regional languages, available in a variety of formats, speaking styles, recording environments and platforms, including in-car, mobile phone, fixed-line and desktop speech recognition corpora. This month we released more European-language databases built for tuning and testing speech recognition systems.
1.3 Mobile
Speechocean licenses a variety of databases in more than 40 languages for speech synthesis, broadcast speech, emotional speech, etc., which can be used with different algorithms.
Speechocean licenses many kinds of text corpora in many languages, which are well suited to language model training.
| ||
5-2-15 | ISLRN new portal Opening of the ISLRN Portal
ELRA, LDC, and AFNLP/Oriental-COCOSDA announce the opening of the ISLRN Portal @ www.islrn.org.
|
5-3-1 | ROCme!: a new free tool for audio corpora recording and management
| ||
5-3-2 | VocalTractLab 2.0: A tool for articulatory speech synthesis
| ||
5-3-3 | Voice analysis toolkit
Having just completed my PhD, I have made the algorithms I developed during it available online: https://github.com/covarep/covarep
The so-called Voice Analysis Toolkit contains algorithms for glottal source and voice quality analysis. In making the code available online I hope that people in the speech processing community can benefit from it. I would really appreciate it if you could include a link to this in the software section of the next ISCApad (section 5-3).
thanks for this.
John Researcher
Centre for Language and Communication Studies,
School of Linguistics, Speech and Communication Sciences,
Trinity College Dublin, College Green, Dublin 2
Phone: (+353) 1 896 1348
Website: http://www.tcd.ie/slscs/postgraduate/phd-masters-research/student-pages/johnkane.php
Check out our workshop: http://muster.ucd.ie/workshops/iast/
| ||
5-3-4 | Bob signal-processing and machine learning toolbox (v.1.2.0)
It is developed by the Biometrics Group at Idiap in Switzerland.
--
Dr. Elie Khoury
Postdoctoral researcher, Biometric Person Recognition Group
IDIAP Research Institute (Switzerland)
Tel: +41 27 721 77 23
| ||
5-3-5 | Release of version 2 of FASST (Flexible Audio Source Separation Toolbox)
http://bass-db.gforge.inria.fr/fasst/
This toolbox is intended to speed up the design and automate the implementation of new model-based audio source separation algorithms. It has the following additions compared to version 1:
* Core in C++
* User scripts in MATLAB or Python
* Speedup
* Multichannel audio input
We provide 2 examples:
1. two-channel instantaneous NMF
2. real-world speech enhancement (2nd CHiME Challenge, Track 1)
|