ISCA - International Speech
Communication Association


ISCApad #267

Thursday, September 10, 2020 by Chris Wellekens

7-4 IEEE JSTSP Special issue: Deep Learning for Multi-modal Intelligence across Speech, Language, Vision, and Heterogeneous Signals (extended deadline)
  

Call for Papers
IEEE JSTSP Special Issue on

Deep Learning for Multi-modal Intelligence across
Speech, Language, Vision, and Heterogeneous Signals

 

Deadline extended to September 15th


In recent years, thanks to disruptive advances in deep learning, significant progress has been made in speech processing, language processing, computer vision, and applications that span multiple modalities. Despite these strong empirical results, important issues remain to be addressed. Both theoretical and empirical advances are expected to drive further performance improvements, which in turn would create new opportunities for in-depth studies of emerging learning and modeling methodologies. Moreover, many problems in artificial intelligence involve more than one modality, such as language, vision, speech, and heterogeneous signals, and techniques developed for one modality can often be successfully applied to another. It is therefore of great interest to study modeling and learning approaches that span more than one modality. The goal of this special issue is to bring together a diverse but complementary set of contributions on emerging deep learning methods for problems across multiple modalities.

Topics of interest in this special issue include (but are not limited to):

  • Fundamental problems and methods for processing multimodal data, including language, speech, image, video, and heterogeneous signals
  • Pre-training, representation learning, multitask learning, low-shot learning, and reinforcement learning for multimodal problems across natural language, speech, image, and video
  • Deep learning methods and applications for cross-modal tasks, such as image captioning, visual question answering, visual storytelling, text-to-image synthesis, and vision-language navigation
  • Evaluation metrics for multimodal applications


Prospective authors should follow the instructions given on the IEEE JSTSP webpages and submit their manuscripts through the web submission system.


