ISCA - International Speech
Communication Association


ISCApad #194

Monday, August 04, 2014 by Chris Wellekens

3-3-22 (2014-10-16) CfP MediaEval 2014 Multimedia Benchmark Evaluation, Barcelona (SP)

Call for Participation
MediaEval 2014 Multimedia Benchmark Evaluation
Early registration deadline: 1 May 2014

MediaEval is a multimedia benchmark evaluation that offers tasks promoting research and innovation in areas related to human and social aspects of multimedia. MediaEval 2014 focuses on aspects of multimedia that include speech and audio. Participants carry out one or more of the tasks offered and submit runs to be evaluated. They then write up their results and present them at the MediaEval 2014 workshop.

The tasks that focus on speech are:

*QUESST: Query by Example Search on Speech Task (ex SWS)*
*Search and Hyperlinking*

The entire list of tasks and their descriptions is below.

For each task, participants receive a task definition, task data and accompanying resources (dependent on task) such as shot boundaries, keyframes, visual features, speech transcripts and social metadata. To encourage participants to develop techniques that push forward the state of the art, a 'required reading' list of papers will be provided for each task. Participation is open to all interested research groups. To sign up, please click the “MediaEval 2014 registration site” link at:

The following tasks are available to participants at MediaEval 2014:

*Synchronization of multi-user Event Media (New!)*
This task requires participants to automatically create a chronologically ordered outline of multiple image galleries corresponding to the same event, in which the data collections are synchronized with one another and either aligned along parallel lines over the same time axis or mixed in the correct order.

*C@merata: Question Answering on Classical Music Scores (New!)*
In this task, systems take as input a noun phrase (e.g., 'harmonic perfect fifth') and a short score in MusicXML (e.g., J.S. Bach, Suite No. 3 in C Major for Cello, BWV 1009, Sarabande) and return an answer stating the location of the requested feature (e.g., 'Bar 206').

*Retrieving Diverse Social Images Task*
This task requires participants to automatically refine a ranked list of Flickr photos with landmarks using provided visual and textual information. The objective is to select only a small number of photos that are equally representative matches but also diverse representations of the query.

*Search and Hyperlinking*
This task requires participants to find video segments relevant to an information need and to provide a list of useful hyperlinks for each of these segments. The hyperlinks point to other video segments in the same collection and should allow the user of the system to explore the collection with respect to the current information need in a non-linear fashion. The task focuses on television data provided by the BBC and real information needs from home users.

*QUESST: Query by Example Search on Speech Task (ex SWS)*
The task involves searching FOR audio content WITHIN audio content USING an audio content query. This task is particularly interesting for speech researchers in the area of spoken term detection or low-resource speech processing.

*Visual Privacy*
This task requires participants to implement privacy filtering solutions that provide an optimal balance between obscuring information that personally identifies people in a video and retaining information that otherwise allows viewers to interpret the video.

*Emotion in Music (an Affect Task)*
We aim at detecting emotional dynamics of music using its content. Given a set of songs, participants are asked to automatically generate continuous emotional representations in arousal and valence.

*Placing: Geo-coordinate Prediction for Social Multimedia*
This task requires participants to estimate the geographical coordinates (latitude and longitude) of multimedia items (photos, videos and accompanying metadata), and to predict how “placeable” a media item actually is. The Placing Task integrates all aspects of multimedia: textual metadata, audio, image, video, location, time, users and context.

*Affect Task: Violent Scenes Detection*
This task requires participants to automatically detect portions of movies depicting violence. Participants are encouraged to deploy multimodal approaches (audio, visual, text) to solve the task.

*Social Event Detection in Web Multimedia*
This task requires participants to discover, retrieve and summarize social events within a collection of Web multimedia. Social events are events that are planned by people, attended by people and for which the social multimedia are also captured by people.

*Crowdsourcing: Crowdsorting Multimedia Comments (New!)*
This task asks participants to combine human computation (i.e., input from the crowd) with automatic computation to carry out classification. The classification involves sorting timed-comments in music, i.e., comments that users have made at certain points in a song, into categories according to their type (e.g., useful vs. non-useful and informative vs. affective).

Tasks marked 'New!' are the 2014 Brave New Tasks. If you sign up for these tasks, please be aware that you will be asked to keep in close touch with the task organizers concerning the details of the task over the course of the benchmarking cycle. We ask for extra-tight communication in order to ensure that these tasks have the flexibility they need to reach their goals.

MediaEval 2014 Timeline
(dates vary slightly from task to task; see the individual task pages for the individual deadlines):

April-May: Registration and return usage agreements.
May-June: Release of development/training data.
June-July: Release of test data.
Mid-Sept.: Participants submit their completed runs.
Mid-Sept.-End-Sept.: Evaluation of submitted runs. Participants write their 2-page working notes papers.
16-17+18 October: MediaEval 2014 Workshop, Barcelona, Spain

We ask you to register by 1 May, when the first task will release its data set. After that point, late registration will be possible, but we encourage teams to register as early as they can.

For questions or additional information please contact Martha Larson or visit

MediaEval 2014 Organization Committee:

Martha Larson at Delft University of Technology and Gareth Jones at Dublin City University act as the overall coordinators of MediaEval. Individual tasks are coordinated by a group of task organizers, who form the MediaEval Organizing Committee. It is the collective effort of this group of people that makes MediaEval possible. The complete list of MediaEval organizers is at:

A large number of organizations and projects make a contribution to MediaEval organization, including the projects (alphabetical): AXES, CUbRIK, CNGL, CrowdRec, Glocal, LinkedTV, Media Mixer, Mucke, Promise, Quaero, Sealinc Media, SocialSensor, and VideoSense.
