ISCA - International Speech Communication Association

ISCApad #192

Thursday, June 12, 2014 by Chris Wellekens

5-3 Software

5-3-1

ROCme!: a free tool for audio corpora recording and management

ROCme!: new free software for recording and managing audio corpora.

ROCme! provides streamlined, autonomous, and paperless management of the recording of read-speech corpora.

Key features:
- free
- compatible with Windows and Mac
- configurable interface for collecting speaker metadata
- speakers scroll through the sentences on screen and record them on their own
- configurable audio format

Available for download at:
www.ddl.ish-lyon.cnrs.fr/rocme

5-3-2

VocalTractLab 2.0: A tool for articulatory speech synthesis

It is my pleasure to announce the release of the new major version 2.0 of VocalTractLab. VocalTractLab is an articulatory speech synthesizer and a tool to visualize and explore the mechanism of speech production with regard to articulation, acoustics, and control. It is available from http://www.vocaltractlab.de/index.php?page=vocaltractlab-download .
Compared to version 1.0, the new version brings many improvements to the implemented models of the vocal tract and the vocal folds, to the acoustic simulation and articulatory control, and to the user interface. Most importantly, the new version comes with a manual.

If you like, give it a try. Bug reports and any other feedback are welcome.

Peter Birkholz

5-3-3

Voice analysis toolkit

Having just completed my PhD, I have made the algorithms I developed during it available online: https://github.com/covarep/covarep

The so-called Voice Analysis Toolkit contains algorithms for glottal source and voice quality analysis. By making the code available online, I hope that people in the speech processing community can benefit from it. I would really appreciate it if you could include a link to it in the software section of the next ISCApad (section 5-3).

Thanks for this.

John

Researcher

Phonetics and Speech Laboratory (Room 4074) Arts Block,

Centre for Language and Communication Studies,
School of Linguistics, Speech and Communication Sciences, Trinity College Dublin, College Green Dublin 2
Phone: (+353) 1 896 1348 Website: http://www.tcd.ie/slscs/postgraduate/phd-masters-research/student-pages/johnkane.php

Check out our workshop!! http://muster.ucd.ie/workshops/iast/

5-3-4

Bob signal-processing and machine learning toolbox (v1.2.0)

Release 1.2.0 of the Bob signal-processing and machine learning toolbox is available.
Bob provides efficient implementations of several machine learning algorithms as well as a framework to help researchers publish reproducible research.

It is developed by the Biometrics Group at Idiap in Switzerland.

The previous release of Bob provided:
* image, video and audio I/O interfaces such as jpg, avi, wav,
* database accessors such as FRGC, Labeled Faces in the Wild, and many others,
* image processing: Local Binary Patterns (LBPs), Gabor Jets, SIFT,
* machines and trainers such as Support Vector Machines (SVMs), k-Means, Gaussian Mixture Models (GMMs), Inter-Session Variability modeling (ISV), Joint Factor Analysis (JFA), Probabilistic Linear Discriminant Analysis (PLDA), Bayesian intra/extra (personal) classifier.

The new release of Bob adds the following features and improvements:
* Unified implementation of Local Binary Patterns (LBPs),
* Histograms of Oriented Gradients (HOG) implementation,
* Total variability (i-vector) implementation,
* Conjugate-gradient-based implementation of logistic regression,
* Improved multi-layer perceptron implementation (back-propagation can now easily be used in combination with any optimizer, e.g. L-BFGS),
* Pseudo-inverse-based method for Linear Discriminant Analysis,
* Covariance-based method for Principal Component Analysis,
* Whitening and within-class covariance normalization techniques,
* Module for object detection and keypoint localization (bob.visioner),
* Module for audio processing, including feature extraction such as LFCC and MFCC,
* Improved extensions (satellite packages) that now support both Python and C++ code within an easy-to-use framework,
* Improved documentation and new tutorials,
* Support for Intel's MKL (in addition to ATLAS),
* Extended platform support (Arch Linux).
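
As an illustration of one item in the list above, the covariance-based method for Principal Component Analysis, here is a minimal NumPy sketch. It is not Bob's API and makes no claim about how Bob implements the method: the function names `pca_via_covariance` and `project` are hypothetical, and the sketch only assumes the textbook approach of taking the eigenvectors of the sample covariance matrix.

```python
import numpy as np


def pca_via_covariance(data):
    """Return (eigenvalues, components) sorted by decreasing variance.

    data: array of shape (n_samples, n_features).
    components: matrix whose columns are the principal directions.
    Hypothetical helper for illustration; not part of Bob.
    """
    centered = data - data.mean(axis=0)
    # Sample covariance matrix of the features (n_features x n_features).
    cov = centered.T @ centered / (data.shape[0] - 1)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]


def project(data, components, n_components):
    """Project mean-centered data onto the first n_components directions."""
    centered = data - data.mean(axis=0)
    return centered @ components[:, :n_components]


# Example: reduce synthetic 3-dimensional data to 2 dimensions.
rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.1])
variances, directions = pca_via_covariance(samples)
reduced = project(samples, directions, 2)
```

Working from the covariance matrix is convenient when the number of features is small relative to the number of samples, since the eigenproblem is only of size n_features by n_features.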

This release represents a major milestone for Bob, with many functionality improvements (more than 640 commits in total) and many bug fixes.
• Sources and documentation
• Binary packages:
    • Ubuntu: 10.04, 12.04, 12.10 and 13.04
    • Mac OS X: works with 10.6 (Snow Leopard), 10.7 (Lion) and 10.8 (Mountain Lion)

For instructions on how to install the pre-packaged version on Ubuntu or OS X, consult our quick installation instructions. (N.B. The OS X MacPorts package has not yet been upgraded. This will be done very soon; cf. https://trac.macports.org/ticket/39831.)


Best regards,
Elie Khoury (on behalf of the Biometric Group at Idiap, led by Sebastien Marcel)

--
Dr. Elie Khoury
Postdoctoral Researcher
Biometric Person Recognition Group
IDIAP Research Institute (Switzerland)
Tel: +41 27 721 77 23

5-3-5

Release of version 2 of FASST (Flexible Audio Source Separation Toolbox)

Release of version 2 of FASST (Flexible Audio Source Separation Toolbox).
http://bass-db.gforge.inria.fr/fasst/

This toolbox is intended to speed up the design and automate the implementation of new model-based audio source separation algorithms.

It has the following additions compared to version 1:
* Core in C++
* User scripts in MATLAB or Python
* Speedup
* Multichannel audio input

We provide 2 examples:
1. two-channel instantaneous NMF
2. real-world speech enhancement (2nd CHiME Challenge, Track 1)
