Visually-Grounded Interaction and Language Workshop
NIPS 2017
Friday, December 8
Long Beach, CA, USA
https://nips2017vigil.github.io/

Please address questions to: nips2017vigil@gmail.com
********************************************************************************
Everyday interactions require a common understanding of language: for people to communicate effectively, words (for example, "cat") should invoke similar beliefs over physical concepts (what cats look like, the sounds they make, how they behave, what their skin feels like, etc.). However, how this "common understanding" emerges is still unclear. One appealing hypothesis is that language is tied to how we interact with the environment. As a result, meaning emerges by "grounding" language in the modalities of our environment (images, sounds, actions, etc.).
Recent concurrent work in machine learning has focused on bridging visual and natural language understanding through visually-grounded language learning tasks, e.g. through natural images (Visual Question Answering, Visual Dialog) or through interactions with virtual physical environments. In cognitive science, progress in fMRI enables creating a semantic atlas of the cerebral cortex and decoding semantic information from visual input. And in psychology, recent studies show that a baby's most likely first words are based on their visual experience, laying the foundation for a new theory of infant language acquisition and learning.
As the grounding problem requires an interdisciplinary approach, this workshop aims to gather researchers with broad expertise in various fields (machine learning, computer vision, natural language processing, neuroscience, and psychology) to discuss their cutting-edge work as well as perspectives on future directions in this exciting space of grounding and interaction.
We invite you to submit papers related to the following topics:
- language acquisition or learning through interactions
- visual captioning, dialog, and question-answering
- reasoning in language and vision
- visual synthesis from language
- transfer learning in language and vision tasks
- navigation in virtual worlds with natural-language instructions
- machine translation with visual cues
- novel tasks that combine language, vision, and actions
- understanding and modeling the relationship between language and vision in humans
- semantic systems and modeling of natural language and visual stimuli representations in the human brain
Important dates
---------------------
Submission deadline: 3rd November 2017
Acceptance notification: 10th November 2017
Workshop: 8th December 2017
Paper details
------------------
- Contributed papers may include novel research, preliminary results, extended abstracts, position papers, or surveys
- Papers are limited to 4 pages, excluding references, in the latest camera-ready NIPS format: https://nips.cc/Conferences/2017/PaperInformation/StyleFiles
- Papers published at the main conference can be submitted without reformatting
- Please submit via email: nips2017vigil@gmail.com
Accepted papers
-----------------------
- All accepted papers will be presented during two poster sessions
- Up to 5 accepted papers will be invited to deliver short talks
- Accepted papers will be made publicly available as non-archival reports, allowing future submission to archival conferences and journals
Invited Speakers
-----------------------
Raymond J. Mooney - University of Texas
Sanja Fidler - University of Toronto
Olivier Pietquin - DeepMind
Jack Gallant - University of California, Berkeley
Devi Parikh - Georgia Tech / FAIR
Felix Hill - DeepMind
... and more to come!
Organizers
---------------
Florian Strub - University of Lille, Inria
Harm de Vries - University of Montreal
Abhishek Das - Georgia Tech
Satwik Kottur - Carnegie Mellon University
Stefan Lee - Georgia Tech
Mateusz Malinowski - DeepMind
Dhruv Batra - Georgia Tech / FAIR
Aaron Courville - University of Montreal
Olivier Pietquin - DeepMind
Devi Parikh - Georgia Tech / FAIR
Jeremie Mary - Criteo