ISCApad #169

Tuesday, July 10, 2012 by Chris Wellekens

3-2 ISCA Supported Events
3-2-1(2012-09-09) Special Session at Interspeech 2012: Speech and Audio Analysis of Consumer and Semi-Professional Multimedia

  Special Session at Interspeech 2012

Speech and Audio Analysis of Consumer and Semi-Professional Multimedia

             http://interspeech2012.org/Special.html

**********************************************************************


Consumer-grade and semi-professional multimedia material (video) is becoming abundant on the Internet and in other online archives. It is easier than ever to download material of any kind. With cell phones now featuring video recording capability along with broadband connectivity, multimedia material can be recorded and distributed across the world just as easily as text could only a couple of years ago. The easy availability of vast amounts of text gave a huge boost to the Natural Language Processing and Information Retrieval research communities; the multimedia material described above is set to do the same for multi-modal audio and video analysis and generation. We argue that the speech and language research community should embrace this trend: it would profit vastly from the availability of this material, and it has significant know-how and experience of its own to contribute, which will help shape this field.

Consumer-created (as opposed to broadcast news, “professional style”) multimedia material offers a great opportunity for research on all aspects of human-to-human as well as man-machine interaction. Such interactions can be processed offline, on a much larger scale than is possible in online, controlled experiments. Speech is naturally an important part of these interactions and can link visual objects, people, and other observations across modalities. Research results will inform future research and development directions in interactive settings, e.g. robotics and interactive agents, and give a significant boost to core (offline) analysis techniques such as robust audio and video processing, speech and language understanding, and multimodal fusion.

Large-scale multi-modal analysis of audio-visual material is beginning in a number of multi-site research projects across the world, driven by various communities such as information retrieval, video search, and copyright protection. While each of these has slightly different targets, they face largely the same challenges: how to robustly and efficiently process large amounts of data, how to represent and then fuse information across modalities, how to train classifiers and segmenters on unlabeled data, how to include human feedback, etc. Speech, language and audio researchers have considerable interest and experience in these areas, and should be at the core and forefront of this research. To make progress at a useful rate, researchers must be connected in a focused way and be aware of each other’s work, in order to discuss algorithmic approaches, ideas for evaluation and comparison across corpora and modalities, training methods with various degrees of supervision, available data sets, etc. Sharing software, databases, research results and project descriptions is among the key elements of success and is at the core of the Speech and Language in Multimedia (SLIM) SIG's objectives.

The special session will serve these goals by bringing together researchers from different fields – speech, but also audio, multimedia – to share experience, resources and foster new research directions and initiatives. Contributions are expected on all aspects of speech and audio processing for multimedia contents: research results but also presentation of ongoing research projects or software, multimedia databases and benchmarking initiatives, etc. A special session, as opposed to a regular session, offers unique opportunities to emphasize interaction between participants with the goal of strengthening and growing the SLIM community. The following format will be adopted: a few selected talks targeting a large audience (e.g., project or dataset descriptions, overview) will open the session, followed by a panel and open discussion on how to develop our community along with poster presentations.


                                                                                                            
  Assistant Research Professor
  Language Technologies Institute
  School of Computer Science
  Carnegie Mellon University


3-2-2(2012-09-14) Symposium on Machine Learning in Speech and Language Processing (MLSLP)
Symposium on Machine Learning in Speech and Language Processing (MLSLP) http://ttic.edu/sigml/symposium2012/
This is the second annual meeting of the ISCA Special Interest Group on Machine Learning (SIGML).  
It will include invited talks and general submissions.  The deadline for general submissions is June 15, 2012.  
Please see the web site for up-to-date information.

Call for Participation

The goal of the symposium is to foster communication and collaboration between researchers in machine learning and in speech and language processing, two synergistic areas, taking advantage of the nearby location of Interspeech 2012. It is the second annual meeting of the Machine Learning Special Interest Group (SIGML) of the International Speech Communication Association (ISCA).

Topics

The symposium will feature both invited talks and general submissions. Submissions focusing on novel research are solicited. In addition, we especially encourage position and review papers addressing topics that are relevant to speech, machine learning, and NLP research. These areas include, but are not limited to, applications to speech/NLP of SVMs, log-linear models, neural networks, kernel methods, discriminative transforms, large-margin training, discriminative training, active/semi-supervised/unsupervised learning, structured prediction, Bayesian modeling, deep learning, and sparse representations.

Paper Submission

Prospective authors are invited to submit papers written in English via the 'Submissions' link on the symposium web site. Each paper will be reviewed by at least two reviewers, and each accepted paper must have at least one registered author.

Invited Speakers

Shai Ben-David, Inderjit Dhillon, Mark Gales, Brian Roark, Dirk van Compernolle, additional speakers TBA

 

Organizing Committee

Scientific Chair: Joseph Keshet, TTI-Chicago
Speech Processing Chair: Karen Livescu, TTI-Chicago
Natural Language Processing Chair: David Chiang, University of Southern California and Information Sciences Institute
Machine Learning Chair: Fei Sha, University of Southern California
Local Organization: Mark Hasegawa-Johnson, University of Illinois at Urbana-Champaign
 

3-2-3(2012-11-28) International Workshop on Spoken Dialog Systems (IWSDS 2012): Towards a Natural Interaction with Robots, Knowbots and Smartphones. Paris, France

International Workshop on Spoken Dialog Systems (IWSDS 2012)
Towards a Natural Interaction with Robots, Knowbots and Smartphones.
Paris, France, November 28-30, 2012

www.iwsds.org

** Final Announcement **

Following the success of IWSDS’2009 (Irsee, Germany), IWSDS’2010 (Gotemba Kogen Resort, Japan) and IWSDS’2011 (Granada, Spain), the Fourth International Workshop on Spoken Dialog Systems (IWSDS 2012) will be held in Paris (France) on November 28-30, 2012.

The IWSDS Workshop series provides an international forum for the presentation of research and applications and for lively discussions among researchers as well as industrialists, with a special interest in the practical implementation of Spoken Dialog Systems in everyday applications. Scientific achievements in language processing now result in successful applications such as IBM Watson, Evi, Apple Siri or Google Assistant for access to knowledge and interaction with smartphones, while the advent of domestic robots calls for the development of powerful means of communication with their human users and fellow robots.

We therefore place this year's workshop under the theme “Towards a Natural Interaction with Robots, Knowbots and Smartphones”, which covers:
- Dialog for robot interaction (including ethics),
- Dialog for Open Domain knowledge access,
- Dialog for interacting with smartphones,
- Mediated dialog (including multilingual dialog involving Speech Translation),
- Dialog quality evaluation.

We would also like to encourage the discussion of common issues of theories, applications, evaluation, limitations, general tools and techniques, and therefore also invite the submission of original papers in any related area, including but not limited to:
- Speech recognition and understanding,
- Dialog management, Adaptive dialog modeling,
- Recognition of emotions from speech, gestures, facial expressions and physiological data,
- Emotional and interactional dynamic profile of the speaker during dialog, User modeling,
- Planning and reasoning capabilities for coordination and conflict description,
- Conflict resolution in complex multi-level decisions,
- Multi-modality such as graphics, gesture and speech for input and output,
- Fusion, fission and information management, Learning and adaptability,
- Visual processing and recognition for advanced human-computer interaction,
- Spoken Dialog databases and corpora, including methodologies and ethics,
- Objective and subjective Spoken Dialog evaluation methodologies, strategies and paradigms,
- Spoken Dialog prototypes and products, etc.

Invited speakers: Jérôme Bellegarda (Apple Inc. (USA)), Axel Buendia (SpirOps (France)), Jonathan Ginzburg (Univ. Paris-Diderot (France)), Alex Waibel (KIT (Germany), CMU (USA) and IMMI (France)), Marilyn Walker (University of California at Santa Cruz (USA))

PAPER SUBMISSION

We particularly welcome papers that can be illustrated by a demonstration, and we will organize the conference in order to best accommodate these papers, whatever their category.

As usual, it is planned that a selection of accepted papers will be published in a book by Springer following the conference.

We distinguish between the following categories of submissions:

Long Research Papers are reserved for reports on mature research results. The expected length of a long paper should be in the range of 8-12 pages.
Short Research Papers should not exceed 6 pages in total. Authors may choose this category if they wish to report on smaller case studies or on ongoing but interesting and original research efforts.
Demo - System Papers: Authors who wish to demonstrate their system may choose this category and provide a description of their system and demo. System papers should not exceed 6 pages in total.

IMPORTANT DATES
Deadline for submission: July 16, 2012
Notification of acceptance: September 15, 2012
Deadline for final submission of accepted paper: October 8, 2012
Deadline for Early Bird registration: October 8, 2012
Final program available online: November 5, 2012
Workshop: November 28-30, 2012

VENUE: IWSDS 2012 will be held as a two-day residential seminar in the wonderful Castle of Ermenonville (http://www.chateau-ermenonville.com/en) near Paris, France, where attendees will be accommodated.

IWSDS Steering Committee: Gary Geunbae Lee (POSTECH, Pohang, Korea), Ramón López-Cózar (Univ. of Granada, Spain), Joseph Mariani (LIMSI and IMMI-CNRS, Orsay, France), Wolfgang Minker (Ulm Univ., Germany), Satoshi Nakamura (Nara Institute of Science and Technology, Japan)

IWSDS 2012 Program Committee: Joseph Mariani (LIMSI & IMMI-CNRS, Chair), Laurence Devillers (LIMSI-CNRS & Univ. Paris-Sorbonne 4), Martine Garnier-Rizet (IMMI-CNRS), Sophie Rosset (LIMSI-CNRS).

Organizing Committee: Martine Garnier-Rizet (Chair), Lynn Barreteau, Joseph Mariani (IMMI-CNRS).

Scientific Committee: Jan Alexandersson (DFKI, Saarbrücken, Germany), Masahiro Araki (Interactive Intelligence lab, Kyoto Institute of Technology, Japan), Frédéric Béchet (LIF, Marseille, France), André Berton (Daimler R&D, Ulm, Germany), Axel Buendia (SpirOps, Paris, France), Susanne Burger (Carnegie Mellon University, Pittsburgh, PA, USA), Felix Burkhardt (Deutsche Telekom Laboratories, Berlin, Germany), Zoraida Callejas (University of Granada, Spain), Nick Campbell (Trinity College, Dublin, Ireland), Heriberto Cuayáhuitl (DFKI, Saarbrücken, Germany), Yannick Estève (LIUM, Université du Maine, Le Mans, France), Sadaoki Furui (Tokyo Institute of Technology, Tokyo, Japan), Jon Ander Gomez (Polytechnic University of Valencia, Spain), David Griol (Carlos III University of Madrid, Spain), Joakim Gustafson (KTH, Stockholm, Sweden), Olivier Hamon (ELDA, Paris, France), Tobias Heinroth (Ulm University, Germany), Paul Heisterkamp (Daimler Research, Ulm, Germany), Luis Alfonso Hernandez (Polytechnic University of Madrid), Dirk Heylen (University of Twente, The Netherlands), Ryuichiro Higashinaka (NTT Cyber Space Laboratories, Yokosuka, Japan), Julia Hirschberg (Columbia University, New York, USA), M. Ehsan Hoque (MIT Media Lab, Cambridge, USA), Chiori Hori (NICT, Kyoto, Japan), Kristiina Jokinen (University of Helsinki, Finland), Tatsuya Kawahara (Kyoto University, Japan), Seokhwan Kim (Institute for Infocomm Research, Singapore), Harksoo Kim (Kangwon National University, Korea), Hong Kook Kim (Gwangju Institute of Science and Technology, Korea), Lin-Shan Lee (National Taiwan University, Taiwan), Fabrice Lefèvre (LIA, Université d'Avignon et des Pays du Vaucluse, France), Haizhou Li (Institute for Infocomm Research, Singapore), Michael McTear (University of Ulster, UK), Yasuhiro Minami (NTT Cyber Space Laboratories, Yokosuka, Japan), Teruhisa Misu (NICT, Kyoto, Japan), Mikio Nakano (Honda Research Institute, Japan), Shrikanth S. Narayanan (SAIL, Signal Analysis and Interpretation Laboratory, Los Angeles, USA), Elmar Nöth (University of Erlangen, Germany), Roberto Pieraccini (ICSI, Berkeley, USA), Olivier Pietquin (Sup'Elec, Metz, France), Sylvia Quarteroni (Politecnico di Milano, Italy), Matthieu Quignard (ICAR, ENS Lyon, France), Norbert Reithinger (DFKI, Berlin, Germany), Alexander Schmitt (Ulm University, Germany), Björn Schuller (Institute for Human-Machine Communication, Technische Universität München, Germany), Elizabeth Shriberg (Microsoft, USA), Gabriel Skantze (KTH, Stockholm, Sweden), Sebastian Stüker (KIT, Karlsruhe, Germany), Kazuya Takeda (University of Nagoya, Japan), Alessandro Vinciarelli (University of Glasgow, United Kingdom), Marilyn Walker (University of California, Santa Cruz, USA), Hsin-min Wang (Academia Sinica, Taipei, Taiwan).

Participating organizations: IMMI-CNRS and LIMSI-CNRS (France), POSTECH (Korea), Nara Institute of Science and Technology (NAIST) and National Institute of Information and Communications Technology (NICT) (Japan), University of Granada (Spain), Ulm University (Germany).

Sponsors: European Language Resources Association (ELRA), European Language and Speech Network (ELSNET).

Supporting organizations: Association Francophone pour la Communication Parlée (AFCP), Association pour le Traitement Automatique des Langues (ATALA), HUMAINE Emotion Research Network, International Speech Communication Association (ISCA), Korean Society of Speech Sciences (KSSS), Spanish Thematic Network on Advanced Dialogue Systems (RTSDA), SIGdial.

** Please contact iwsds2012@immi-labs.org or visit www.iwsds.org to get more information. **

