ISCA - International Speech
Communication Association


ISCApad #206

Thursday, August 20, 2015 by Chris Wellekens

4 Academic and Industry Notes
4-1 Carnegie Speech


Carnegie Speech produces systems that teach people to speak another language understandably. Its products include NativeAccent, SpeakIraqi, SpeakRussian, and ClimbLevel4. The company was also recognized with a Best Breakout Idea of 2009 award.


4-2 Research in Interactive Virtual Experiences at USC, CA, USA

REU Site: Research in Interactive Virtual Experiences



The Institute for Creative Technologies (ICT) offers a 10-week summer research program for undergraduates in interactive virtual experiences. A multidisciplinary research institute affiliated with the University of Southern California, the ICT was established in 1999 to combine leading academic researchers in computing with the creative talents of Hollywood and the video game industry. Having grown to encompass a total of 170 faculty, staff, and students in a diverse array of fields, the ICT represents a unique interdisciplinary community brought together with a core unifying mission: advancing the state-of-the-art for creating virtual reality experiences so compelling that people will react as if they were real.


Reflecting the interdisciplinary nature of ICT research, we welcome applications from students in computer science, as well as many other fields, such as psychology, art/animation, interactive media, linguistics, and communications. Undergraduates will join a team of students, research staff, and faculty in one of several labs focusing on different aspects of interactive virtual experiences. In addition to participating in seminars and social events, students will prepare a final written report and present their projects to the rest of the institute at the end-of-summer research fair.


Students will receive $5000 over ten weeks, plus an additional $2800 stipend for housing and living expenses.  Non-local students can also be reimbursed for travel up to $600.  The ICT is located in West Los Angeles, just north of LAX and only 10 minutes from the beach.


This Research Experiences for Undergraduates (REU) site is supported by a grant from the National Science Foundation. The site is expected to begin summer 2013, pending final award issuance.


Students can apply online at:

Application deadline: March 31, 2013


For more information, please contact Evan Suma at


4-3 Announcing the Master of Science in Intelligent Information Systems

Carnegie Mellon University


The Master of Science in Intelligent Information Systems (MIIS) is a degree designed for students who want to rapidly master advanced content-analysis, mining, and intelligent information technologies prior to beginning or resuming leadership careers in industry and government. Just over half of the curriculum consists of graduate courses. The remainder provides direct, hands-on, project-oriented experience working closely with CMU faculty to build systems and solve problems using state-of-the-art algorithms, techniques, tools, and datasets. A typical MIIS student completes the program in one year (12 months) of full-time study at the Pittsburgh campus. Part-time and distance education options are available to students employed at affiliated companies. The application deadline for the Fall 2013 term is December 14, 2012. For more information about the program, please visit


4-4 Master's in Linguistics (Aix-Marseille), France

Master's in Linguistics (Aix-Marseille Université): Linguistic Theories, Field Linguistics and Experimentation

TheLiTEx offers advanced training in Linguistics. This specialty presents, in an original way, the links between corpus linguistics and scientific experimentation on the one hand, and laboratory and field methodologies on the other. On the basis of a common set of courses (offered within the first year), TheLiTEx offers two paths: Experimental Linguistics (LEx) and Language Contact & Typology (LCT).

The goal of LEx is the study of language, speech and discourse on the basis of scientific experimentation and quantitative modeling of linguistic phenomena and behavior. It takes a multidisciplinary approach which borrows its methodologies from the human, physical and biological sciences and its tools from computer science, clinical approaches, engineering, etc. The courses offered cover semantics, phonetics/phonology, morphology, syntax, pragmatics, prosody and intonation, and the interfaces between these linguistic levels, in their interactions with the real world and the individual, from a biological, cognitive and social perspective. Within the second year, a set of more specialized courses is offered, such as Language and the Brain and Laboratory Phonology.

LCT aims at understanding the world's linguistic diversity, focusing on language contact, language change and variation (European, Asian and African languages, Creoles, sign language, etc.). This path focuses, from a linguistic and sociolinguistic perspective, on issues of field linguistics, taking into account both the human and socio-cultural dimensions of language (speakers, communities). It also focuses on documenting rare and endangered languages and on fostering reflection on linguistic minorities. This path also provides expertise and intervention models (language policy and planning) in order to train students in the management of contact phenomena and their impact on speakers, languages and societies.

More info at:




A new, one-year Master in Brain and Cognition will begin its activities in the Academic Year 2014-15 in Barcelona, Spain, organized by the Universitat Pompeu Fabra.

The core of the master's programme is composed of the research groups at UPF's Center for Brain and Cognition. These groups are directed by renowned scientists in areas such as computational neuroscience, cognitive neuroscience, psycholinguistics, vision, multisensory perception, human development and comparative cognition. Students will be exposed to the ongoing research projects at the Center for Brain and Cognition and will be integrated in one of its main research lines, where they will conduct original research for their final project.

The application period is now open. Please visit the Master's web page for further information.


4-6 Master's programmes at the Sorbonne (Paris)

The language engineering master's programmes of Paris-Sorbonne, ILGII (R) and IILGI (P), are now merged into a single specialty within the Littérature, Philosophie, Linguistique degree track.
The two years of the Langue et Informatique master's provide fundamental knowledge of language and its automatic processing, of language interactions and the modelling of paralinguistic phenomena, and of knowledge engineering. The specialty courses also develop knowledge and practical skills in: text analysis and understanding; speech recognition and synthesis; affective sciences and dialogue systems; computer-assisted summarization and translation; knowledge extraction and construction; and business intelligence. The methodological courses of the track's common core link these specialized courses with the epistemology of literature, philology and linguistics. The master's has two paths: a professional path, "Ingénierie de la Langue pour la Société Numérique (ILSN)", and a research path, "Informatique, Langue et Interactions (ILI)". The two paths diverge in semester 4.



4-7 The International Standard Language Resource Number (ISLRN)

JRC, the EC's Joint Research Centre, an important LR player: First to adopt the ISLRN initiative


The Joint Research Centre (JRC), the European Commission's in-house science service, is the first organisation to use the International Standard Language Resource Number (ISLRN) initiative and has requested ISLRN 13-digit unique identifiers for its Language Resources (LR).
Thus, anyone who is using JRC LRs may now refer to this number in their own publications.


The current JRC LRs with an ISLRN ID are:




The International Standard Language Resource Number (ISLRN) aims to provide unique identifiers using a standardised nomenclature, thus ensuring that LRs are correctly identified, and consequently, recognised with proper references for their usage in applications within R&D projects, product evaluation and benchmarking, as well as in documents and scientific papers. Moreover, this is a major step in the networked and shared world that Human Language Technologies (HLT) has become: unique resources must be identified as such and meta-catalogues need a common identification format to manage data correctly.
The ISLRN portal can be accessed from,


*** About the JRC ***

As the Commission's in-house science service, the Joint Research Centre's mission is to provide EU policies with independent, evidence-based scientific and technical support throughout the whole policy cycle.
Within its research in the field of global security and crisis management, the JRC develops open source intelligence and analysis systems that can automatically harvest and analyse a huge amount of multilingual information from internet-based sources. In this context, the JRC has developed Language Technology resources and tools that can be used for highly multilingual text analysis and cross-lingual applications.
To find out more about JRC's research in open source information monitoring, please visit To access media monitoring applications directly, go to


*** About ELRA ***
The European Language Resources Association (ELRA) is a non-profit making organisation founded by the European Commission in 1995, with the mission of providing a clearing house for language resources and promoting Human Language Technologies (HLT).
To find out more about ELRA, please visit our web site:

For more information, contact


4-8 New Masters in Machine Learning, Speech and Language Processing at Cambridge University, UK
This is a new twelve-month full-time MPhil programme offered by the Computational and Biological Learning Group (CBL) and the Speech Group in the Cambridge University Department of Engineering, with a unique, joint emphasis on both machine learning and on speech and language technology. The course aims: to teach the state of the art in machine learning, speech and language processing; to give students the skills and expertise necessary to take leading roles in industry; to equip students with the research skills necessary for doctoral study.
Applications from UK and EU students should be completed by 9 January 2015 for admission in October 2015. A limited number of studentships may be available for exceptional UK and eligible EU applicants.

Self-funding students who do not wish to be considered for support from the Cambridge Trusts have until 30 June 2015 to submit their complete applications.

More information about the course can be found here:


4-9 MediaEval 2015 Multimedia Benchmark


Call for Participation
MediaEval 2015 Multimedia Benchmark Evaluation
Early registration deadline: 1 May 2015

MediaEval is a multimedia benchmark evaluation that offers tasks promoting research and
innovation in areas related to human and social aspects of multimedia. MediaEval 2015
focuses on aspects of multimedia including and going beyond visual content, such as
language, speech, music, and social factors. Participants carry out one or more of the
tasks offered and submit runs to be evaluated. They then write up their results and
present them at the MediaEval 2015 workshop.

For each task, participants receive a task definition, task data and accompanying
resources (dependent on task) such as shot boundaries, keyframes, visual features, speech
transcripts and social metadata. In order to encourage participants to develop techniques
that push forward the state-of-the-art, a 'required reading' list of papers will be
provided for each task.

Participation is open to all interested research groups. To sign up, please click the
"MediaEval 2015 Registration" link at:

The following tasks are available to participants at MediaEval 2015:

*QUESST: Query by Example Search on Speech Task*
The task involves searching FOR audio content WITHIN audio content USING an audio content
query. This task is particularly interesting for speech researchers in the area of spoken
term detection or low-resource/zero-resource speech processing. The primary  performance
metric will be the normalized cross entropy cost (Cnxe).

*Multimodal Person Discovery in Broadcast TV (New in 2015!)*
Given raw TV broadcasts, each shot must be automatically tagged with the name(s) of
people who can be both seen as well as heard in the shot. The list of people is not known
a priori and their names must be discovered in an unsupervised way from provided text
overlay or speech transcripts. The task will be evaluated on a new French corpus
(provided by INA) and the AGORA Catalan corpus, using standard information retrieval
metrics based on a posteriori collaborative annotation of the corpus.

*C@merata: Querying Musical Scores*
The input is a natural language phrase referring to a musical feature (e.g., "consecutive
fifths") together with a classical music score, and the required output is a list of
passages in the score which contain that feature. Scores are in the MusicXML format,
which can capture most aspects of Western music notation. Evaluation is via versions of
Precision and Recall relative to a Gold Standard produced by the organisers.

*Affective Impact of Movies (including Violent Scenes Detection)*
In this task participating teams are expected to classify short movie scenes by their
affective content according to two use cases: (1) the presence of depicted violence, and
(2) their emotional impact (valence, arousal). The training data consists of short
Creative Commons-licensed movie scenes (both professional and amateur) together with
human annotations of violence and valence-arousal ratings. The results will be evaluated
using standard retrieval and classification metrics.

*Emotion in Music (An Affect Task)*
We aim at detecting emotional dynamics of music using its content. Given a set of songs,
participants are asked to automatically generate continuous emotional representations in
arousal and valence.

*Retrieving Diverse Social Images*
This task requires participants to refine a ranked list of Flickr photos with location
related information using provided visual, textual and user credibility information.
Results are evaluated with respect to their relevance to the query and the diverse
representation of it.

*Placing: Multimodal Geo-location Prediction*
The Placing Task requires participants to estimate the locations where multimedia items
(photos or videos) were captured solely by inspecting the content and metadata of these
items, and optionally exploiting additional knowledge sources such as gazetteers.
Performance is evaluated using the distance to the ground truth coordinates of the
multimedia items.
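
As an illustration of distance-based scoring, here is a minimal sketch, not the task's official scorer. It assumes coordinates are given as (latitude, longitude) pairs in degrees and uses the haversine great-circle formula with a mean Earth radius of 6371 km. The function names and the choice of formula are assumptions made for this example only.

```python
import math

# Mean Earth radius in kilometres (an assumption of this sketch).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    a = min(1.0, a)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def location_errors_km(predicted, truth):
    """Per-item distance between predicted and ground-truth (lat, lon) pairs."""
    return [haversine_km(p[0], p[1], t[0], t[1])
            for p, t in zip(predicted, truth)]
```

A scorer could then summarize the list returned by `location_errors_km`, for example by taking its median; which summary the task uses is not stated here.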

*Verifying Multimedia Use (New in 2015!)*
For this task, the input is a tweet about an event that has the profile to be of interest
in the international news, and the accompanying multimedia item (image or video).
Participants must build systems that output a binary decision representing a verification
of whether the multimedia item reflects the reality of the event in the way purported by
the tweet. The task is evaluated using the F1 score. Participants are also requested to
return a short explanation or evidence for the verification decision.
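
For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed from binary decisions. It is not the task's official evaluation script. The function name and the convention that `True` marks the positive class are assumptions, since which class counts as positive is not specified here.

```python
def f1_score(predicted, truth):
    """F1 (harmonic mean of precision and recall) for binary decisions.

    `predicted` and `truth` are equal-length sequences of booleans, where
    True marks the positive class (a convention assumed for this sketch).
    """
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)        # true positives
    fp = sum(1 for p, t in pairs if p and not t)    # false positives
    fn = sum(1 for p, t in pairs if not p and t)    # false negatives
    if tp == 0:
        # No correct positive decisions: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With one true positive, one false positive and one false negative, precision and recall are both 0.5, so the F1 score is also 0.5.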

*Context of Experience: Recommending Videos Suiting a Watching Situation (New in 2015!)*
This task develops multimodal techniques for automatic prediction of multimedia in a
specific consumption context. In particular, we focus on the context of predicting movies
that are suitable to watch on airplanes. Input to the prediction methods are movie
trailers, and metadata from IMDb. Output is evaluated using the Weighted F1 score, with
expert labels as ground truth.

*Reliability of Social Multimedia Annotations (New in 2015!)*
Input is a set of underwater photos with user-generated annotations and other additional
social information taken from a social scuba divers website, and output is a ranked list
of the least reliable user-generated annotations. Systems will be evaluated using a
labeling of fish species created by expert annotators.

*Synchronization of Multi-User Event Media*
This task addresses the challenge of automatically creating a chronologically-ordered
outline of multiple multimedia collections corresponding to the same event. Given N media
collections (galleries) taken by different users/devices at the same event, the goal is
to find the best (relative) time alignment among them and detect the significant
sub-events over the whole gallery. Performance is evaluated using ground truth time codes
and actual event schedules.

*DroneProtect: Mini-drone Video Privacy Task (New in 2015!)*
Recent popularity of mini-drones and their rapidly increasing adoption in various areas,
including photography, news reporting, cinema, mail delivery, cartography, agriculture,
and military, raises concerns for privacy protection and personal safety. Input to the
task is drone video, and output is a version of the video which protects privacy while
retaining key information about the event or situation recorded.

*Search and Anchoring in Video Archives*
The 2015 Search and Anchoring in Video Archives task consists of two sub-tasks: search
for multimedia content and automatic anchor selection. In the "search for multimedia
content" sub-task, participants use multimodal textual and visual descriptions of content
of interest to retrieve potentially relevant video segments from within a collection. In
the "automatic anchor selection" sub-task, participants automatically predict key
elements of videos as anchor points for the formation of hyperlinks to relevant content
within the collection. The video collection consists of professional broadcasts from BBC
or semi-professional user generated content. Participant submissions will be assessed
using professionally-created anchors, and crowdsourcing-based evaluation.

MediaEval 2015 Timeline
(dates vary slightly from task to task; see the individual task pages for the individual deadlines)

Mid-March-May: Registration and return usage agreements.
May-June: Release of development/training data.
June-July: Release of test data.
Mid-Aug.: Participants submit their completed runs, and receive results.
End Aug: Participants submit their 2-page working notes papers.
14-15 September: MediaEval 2015 Workshop, Wurzen, Germany. The workshop is a satellite event
of Interspeech 2015, held nearby in Dresden the previous week.

We ask you to register by 1 May (because of the timing of the first wave of data
releases). After that point, late registration will be possible, but we encourage teams
to register as early as they can.

For questions or additional information please contact Martha Larson or visit

The ISCA SIG SLIM: Speech and Language in Multimedia is a key
supporter of MediaEval. This year, the MediaEval workshop will be held as a satellite
event of Interspeech.

A large number of organizations and projects make a contribution to MediaEval
organization, including the projects (alphabetical): Camomile, CrowdRec, EONS, PHENICX,
Reveal, VideoSense, and Visen.



4-10 AASP TC Challenges

Three years after its launch, the AASP TC Challenges series has achieved its goal of
stimulating ground-breaking approaches to hot topics in Audio and Acoustic Signal
Processing. The challenges run so far have been great successes, leading to unprecedented
participation, new publicly available datasets, and highly attended special sessions.
They include:
-    the CHiME Challenge on speech separation and recognition in domestic environments
-    the D-Case Challenge on detection and classification of acoustic scenes and events
-    the REVERB Challenge on single- and multichannel speech dereverberation

The coming year will see the ACE Challenge on acoustic characterization of
environments unfold, and future challenges are now eagerly awaited by academia
and industry.

In order to pursue this endeavor, we are issuing a call for expressions of interest in
organizing new challenges. This is an open call with no deadline. Prospective organizers
should provide a brief description of the challenge, the planned test data and evaluation
methodology, and their value to the community. Challenges at the crossroads of other
communities such as speech processing or machine learning are especially welcome. For
more details, see

The AASP TC Challenges Subcommittee will help organizers run a successful Challenge by
providing scientific and organizational feedback, sharing industrial sponsorship
contacts, and awarding official prizes to the most reproducible challenge entries.

We are looking forward to your proposals!

On behalf of the AASP TC Challenges Subcommittee
Emmanuel Vincent, Chair


4-11 MULTILING evaluation campaign

This year the Multiling evaluation campaign on automatic summarization
focuses on summarizing spoken conversations through the CCCS
(Call-Center Conversation Summarization) task. Participating systems
will be evaluated on their ability to generate an abstractive summary
of a telephone conversation that recounts the problems encountered by
the caller and the solutions provided by the agent.

The submission deadline is 24 April, which leaves you just enough time
to take part.

The training data come from the DECODA (French) and LUNA (Italian)
corpora, along with manual English translations of these corpora. 100
conversations from each corpus are annotated with several reference
summaries, and 1000 additional conversations are provided for
unsupervised learning. The corpus contains manual transcriptions of
each conversation.

If you are interested in this task, please contact me for more
information.

Benoit Favre.


4-12 Questionnaire on ethics for the speech community

During the Ethique et TRaitemeNt Automatique des Langues (ETeRNAL) workshop at TALN, a questionnaire was proposed in order to learn about the community's practices and expectations regarding ethics.

Log on to the workshop site, or go directly to the questionnaire, and
take 5 minutes to give us your opinion.

The results of this questionnaire will only be meaningful if many people fill it in,
including those who do not feel concerned by the subject a priori and those
who think that ethics is not a problem.

The results of the survey will be widely circulated, on the workshop site and also
on the blog created after the workshop.

5 minutes, we promise!


Gilles Adda (LIMSI)


4-13 Invitation to host the 8th International Conference on Multimedia Retrieval

Invitation to host the ACM ICMR2018 conference
ICMR Steering Committee

The ICMR Steering Committee invites interested parties to submit proposals to host and
organize the 8th International Conference on Multimedia Retrieval, ICMR2018 (sponsored by

ACM ICMR is the premier scientific conference for multimedia retrieval. Its mission is to
provide a forum to discuss, promote and advance the state of the art in multimedia
retrieval by bringing together researchers and practitioners in the field. It is thus
essential to ensure that the conference includes sessions for presenting high-quality
research papers and for sharing practitioner experience.

We expect ICMR2018 to be held in the Asia / Australasia region.

Parties interested in hosting ICMR2018 are invited to submit their proposals (20 pages or
less) by Friday, 16 October 2015 by email with the subject line: ICMR2018 to the steering
committee chair, Prof Tat-Seng CHUA.

Guidelines for potential conference hosts

The ICMR steering committee will evaluate all bids using the guidelines set out below.
Anyone interested in bidding is welcome to make informal contact with the steering
committee chair prior to the deadline for the proposals. Proposals will be judged on the
strength of the organizing committee (track records in multimedia retrieval, diversity
and experience of international members), the plan for the conference (vision, ideas,
etc.), and location (appeal, accessibility, etc.).  Decisions are made by majority vote
within the steering committee.  Only one proposal from all the submissions will be
selected by the steering committee.

The steering committee will aim to review proposals and make its decision within four
weeks of the submission.

ICMR should facilitate interactions between multimedia retrieval community members, who
include both researchers and practitioners.  Specific objectives of ICMR are as follows:

1. To provide a setting for the presentation and discussion of high-quality original
research papers in all aspects of multimedia retrieval.

2. To provide a forum for the exchange of ideas between researchers and practitioners in
the field, ideally by maintaining a separate 'industry' track.

3. To provide a range of complementary events such as panel sessions, system
demonstrations, exhibitions, workshops, and large-scale media evaluations and challenges.

4. To provide suitable facilities for informal networking and exchange of ideas between

The host organization is therefore expected to arrange for refereeing of all submitted
papers to international standards, using ICMR's existing international program committees
as their primary source of referees, and to liaise with ACM over the publication of
conference proceedings.

Timing and location

The date for ICMR should be around June each year. In particular, in order to improve the
interactions between SIGMM meetings, the timing of ICMR should be coordinated with the
ACM MM Organizers of that year so that the Technical Program Committee (TPC) meeting of
ACM MM (usually held in June) can be co-located with ICMR (see Section on Checklist

The conference location should be easily accessible by people from around the world, with
good air, rail and road links. The bid should include a short description of the locality
and any remarkable or outstanding features that would make it particularly attractive to
potential delegates.

The venue of the conference should preferably be in an education/research Institute or a
hotel, with space for about 180 delegates.

Accommodation and social events

Proposers should demonstrate that they have suitable accommodation for delegates (e.g.
en-suite rooms in student halls and/or local hotels), for a meeting of at least three
days duration. The availability of low-cost accommodation for student delegates such as
youth hostels or inexpensive student halls would also be an advantage. The Organizer is
expected to organize a drink reception (possibly externally sponsored) and a relatively
formal conference dinner. Other types of activities such as sightseeing visits would also
be appreciated.

Conference web site

Proposers are expected to organize and maintain a web site for the conference, providing
at least links for paper submission and delegate registration, and pages for the
organizing, program, and steering committees.

Budget preparation and costs

The ACM SIGMM has agreed to full sponsorship of ICMR.  Proposers must produce a budget
for the conference. Costs should be estimated based on 140 attendees, inclusive of
organizers and volunteer helpers.  It is expected that the conference should break even
at a minimum, and should preferably show a small surplus.  Various projections of income
and expenditure, with different price bands for members of any collaborating or
sponsoring bodies (normally a 10% discount on the standard rate), non-members and student
delegates, with corresponding break even points, should be provided. Registration fees
for students should be kept as low as possible in order to encourage young researchers to
participate. Incentives for early registration are advisable.

Please see for an example list of typical budget


The proposal should include the following information, in 20 pages or less, as a PDF:

* names, affiliations, and email contact information of the main organizers.

* a copy of the first call for papers for the conference including dates. To facilitate
close interactions among SIGMM sponsored conferences, and in particular ACM Multimedia,
proposers of ICMR should coordinate with ACMMM organizers of the same year and propose a
conference schedule that allows the ACMMM TPC meeting (typically held in the first two
weeks of June - you can ask the steering committee chair about this too) to be co-located
with ICMR.

* highlights of the conference and justification/support of the conference dates,
location, and venue.

* a draft of the organizing and program committees (please specify if a member is
tentative; also if an ICMR steering committee member's name is used without making it
very clear he agreed, the name will be marked over (erased) to ensure fairness in the

* a draft programme for the conference

* a draft of the conference budget including the ACM contingency fee and the VAT if

* a schedule of activities

* plans for publicizing the conference

Tat-Seng Chua
ACM ICMR Steering Committee Chair
July 2015

on behalf of the ACM ICMR Steering Committee

