ISCA - International Speech
Communication Association

ISCApad #172

Sunday, October 07, 2012 by Chris Wellekens

3-3-19 (2013-06-18) Urgent Cf Participation NTCIR-10 IR for Spoken Documents Task (SpokenDoc-2)

Call for Participation

    NTCIR-10 IR for Spoken Documents Task (SpokenDoc-2)
    http://www.cl.ics.tut.ac.jp/~sdpwg/index.php?ntcir10

== INTRODUCTION

The growth of the internet and falling storage costs are driving a
rapid increase in multimedia content. The text-based tag information
available for retrieving this content is limited. Spoken Document
Retrieval (SDR) is a promising technology for retrieving such content
using the speech data it contains. Following the NTCIR-9 SpokenDoc
task, we will continue to evaluate SDR under realistic ASR conditions,
where the target documents are spontaneous speech data with a high
word error rate and a high out-of-vocabulary rate.

== TASK OVERVIEW

The new speech data, recordings of the first to sixth annual Spoken
Document Processing Workshops, will be used as target documents in
SpokenDoc-2. The larger speech data set, spoken lectures in the Corpus
of Spontaneous Japanese (CSJ), will also be used, as in the previous
SpokenDoc-1 task. The task organizers will provide reference automatic
transcriptions for these speech data. These transcriptions enable
researchers who are interested in SDR but lack access to their own ASR
system to participate in the tasks. They also enable comparison of IR
methods based on the same underlying ASR performance.

Targeting these documents, two subtasks will be conducted.

Spoken Term Detection: 
  Within spoken documents, find the positions at which a queried term
  occurs. The evaluation should consider both efficiency (search
  time) and effectiveness (precision and recall).

Spoken Content Retrieval: 
  Among spoken documents, find the segments containing information
  relevant to the query, where a segment is either a document
  (resulting in a document retrieval task) or a passage (resulting in
  a passage retrieval task). This is like an ad-hoc text retrieval
  task, except that the target documents are speech data.
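
As a rough illustration of the effectiveness measures named for the
Spoken Term Detection subtask, the following Python sketch computes
precision and recall for a list of detections. The matching rule used
here (a detection counts as correct if its time falls inside an
unmatched reference interval of the same recording) and all names and
data are assumptions made for illustration only; the official
evaluation criteria are defined on the task page and may differ.

```python
from typing import List, Tuple

# Hypothetical data shapes, not the official NTCIR-10 formats.
Detection = Tuple[str, float]         # (recording id, detected time in seconds)
Reference = Tuple[str, float, float]  # (recording id, start, end in seconds)


def precision_recall(detections: List[Detection],
                     references: List[Reference]) -> Tuple[float, float]:
    """Return (precision, recall) under the assumed matching rule."""
    matched = set()  # indices of references already credited
    hits = 0
    for rec_id, t in detections:
        for i, (ref_id, start, end) in enumerate(references):
            # Each reference occurrence may be credited at most once.
            if i not in matched and rec_id == ref_id and start <= t <= end:
                matched.add(i)
                hits += 1
                break
    precision = hits / len(detections) if detections else 0.0
    recall = hits / len(references) if references else 0.0
    return precision, recall


# Made-up example: three true occurrences, four system detections.
refs = [("lec1", 1.0, 1.5), ("lec1", 7.0, 7.4), ("lec2", 3.0, 3.6)]
dets = [("lec1", 1.2), ("lec1", 4.0), ("lec2", 3.1), ("lec2", 9.0)]
p, r = precision_recall(dets, refs)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up example two of the four detections are correct and two
of the three true occurrences are found.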
  
== FOR MORE DETAILS

Please visit
http://www.cl.ics.tut.ac.jp/~sdpwg/index.php?ntcir10
A link to the NTCIR-10 task participant registration page
is now available from this page.

Please note that the registration deadline is Jun 30, 2012 (for
all NTCIR-10 tasks).

== ORGANIZERS

Kiyoaki Aikawa (Tokyo University of Technology)
Tomoyosi Akiba (Toyohashi University of Technology)
Xinhui Hu (National Institute of Information and Communications Technology)
Yoshiaki Itoh (Iwate Prefectural University)
Tatsuya Kawahara (Kyoto University)
Seiichi Nakagawa (Toyohashi University of Technology)
Hiroaki Nanjo (Ryukoku University)
Hiromitsu Nishizaki (University of Yamanashi)
Yoichi Yamashita (Ritsumeikan University)

If you have any questions, please e-mail the task organizers'
mailing list: ntcadm-spokendoc2@nlp.cs.tut.ac.jp

======================================================================
