The third workshop on Speech, Language and Audio in Multimedia (SLAM) aims to bring together researchers working in speech, language and audio processing to analyze, index and access multimedia data. Multimedia data are now available in enormous volumes and in a wide variety of formats and qualities, from professional content to user-generated content: lectures, meetings, interviews, debates, conversational broadcasts, podcasts, social videos on the Web, etc. Such data, along with the associated use scenarios, raise specific challenges: robustness to high variability in quality; efficiency in handling very large amounts of data; semantics shared across modalities; potentially high error rates in transcription; etc. Worldwide, several national and international research projects are focusing on audio and language analysis of multimedia data. Similarly, various benchmark initiatives have devoted effort to offering tasks related to multimodal multimedia challenges (e.g., TRECVid, CLEF, MediaEval).
Following SLAM 2013 in Marseille, France, and SLAM 2014 in Pinang, Malaysia, both collocated with the Interspeech conference, SLAM 2015 moves to the multimedia community. To make the most of the collocation with ACM Multimedia, the workshop features a dedicated session to highlight work on multimodality and fusion, at the intersection of speech, audio, language and computer vision.
SLAM gathers players from the fields of speech and audio processing and of multimedia to share recent research results, discuss ongoing and future projects, explore potential areas for interdisciplinary collaboration or sharing of ideas, and develop new benchmarking initiatives of mutual interest to multimedia and language researchers. We expect contributions on ongoing research work, project descriptions, evaluation initiatives, demonstrations and applications emphasizing the speech and/or language and/or audio contribution to any type of multimedia technology.
As a special focus of SLAM 2015, we particularly welcome contributions on video hyperlinking, as a case study where the speech and language modalities are complemented by audio and vision.
*Important dates*
Paper submission deadline: July 10, 2015
Notification of acceptance: August 2, 2015
Camera-ready paper: August 10, 2015