A growing amount of multimedia content is being shared or stored in online archives. Separate research efforts in the speech processing and multimedia analysis communities are developing and improving speech and multimedia processing technologies in parallel, often using each other's work as "black boxes". However, genuinely combining the two would appear to be a better strategy for exploiting the synergies between the modalities of content that contains multiple potential sources of information.
This session seeks to bring together the speech and multimedia research communities to report on current work and to explore potential synergies and opportunities for creative research collaborations between speech and multimedia technologies. From the speech perspective, the session aims to explore how the fundamentals of speech technology can benefit multimedia applications, and from the multimedia perspective, to explore the crucial role that speech can play in multimedia analysis.
The list of topics of interest includes (but is not limited to):
- Navigation in multimedia content using advanced speech analysis features;
- Large-scale speech and video analysis;
- Multimedia content segmentation and structuring using audio and visual features;
- Multimedia content hyperlinking and summarization;
- Natural language processing for multimedia;
- Multimodality-enhanced metadata extraction, e.g. entity extraction, keyword extraction, etc.;
- Generation of descriptive text for multimedia;
- Multimedia applications and services using speech analysis features;
- Affective and behavioural analytics based on multimodal cues;
- Audio event detection and video classification;
- Multimodal speaker identification and clustering.
Important dates:
20 Mar 2015 paper submission deadline
01 Jun 2015 paper notification of acceptance/rejection
10 Jun 2015 paper camera-ready
20 Jun 2015 early registration deadline
6-10 Sept 2015 Interspeech 2015, Dresden, Germany
Submission takes place via the general Interspeech submission system. Paper contributions must comply with the INTERSPEECH paper submission guidelines, cf. http://interspeech2015.org/papers. There will be no extension to the full paper submission deadline. We are looking forward to receiving your contribution!
Organizers:
- Maria Eskevich, Communications Multimedia Group, EURECOM, France (maria.eskevich@eurecom.fr)
- Robin Aly, Database Management Group, University of Twente, The Netherlands (r.aly@utwente.nl)
- Roeland Ordelman, Human Media Interaction Group, University of Twente, The Netherlands (roeland.ordelman@utwente.nl)
- Gareth J.F. Jones, CNGL Centre for Global Intelligent Content, Dublin City University, Ireland (gjones@computing.dcu.ie)