(2nd call) FEARLESS STEPS Challenge Phase-2 for ISCA INTERSPEECH-2020
ISCApad #266
Monday, August 10, 2020 by Chris Wellekens
ISCApad INTERSPEECH March 2020
March 10, 2020

The Fearless Steps Challenge (Phase 2: FS#2)

TIMELINE:
Challenge Start Date (Data Release): January 25, 2020
INTERSPEECH-2020 Papers dealing with FEARLESS STEPS deadline: May 8, 2020

Registration Link: https://bit.ly/2qZ5tic
Challenge Tasks in Phase-2 (FS#2):
1. Speech Activity Detection (SAD)
2. Speaker Identification (SID)
3. Speaker Diarization:
   3a. Track 1: Diarization using reference SAD
   3b. Track 2: Diarization using system SAD
4. Automatic Speech Recognition (ASR):
   4a. Track 1: ASR using reference Diarization
   4b. Track 2: Continuous stream ASR

Website Link: https://fearless-steps.github.io/ChallengePhase2/
Background:
The Fearless Steps Initiative by UTDallas-CRSS led to the digitization, recovery, and diarization of 19,000 hours of original analog audio data, as well as the development of algorithms to extract meaningful information from this multichannel naturalistic data resource. As an initial step to motivate a streamlined and collaborative effort from the speech and language community, UTDallas-CRSS is hosting a series of progressively complex tasks to promote advanced research on naturalistic “Big Data” corpora.

This series began with ISCA INTERSPEECH-2019: 'The FEARLESS STEPS Challenge: Massive Naturalistic Audio (FS-#1)'. The first edition of the challenge encouraged the development of core unsupervised/semi-supervised speech and language systems for single-channel data with low resource availability, serving as the “First Step” towards extracting high-level information from such massive unlabeled corpora.

As a natural progression following the successful Inaugural Challenge FS#1, the FEARLESS STEPS Challenge Phase-#2 focuses on the development of single-channel supervised learning strategies. FS#2 provides 80 hours of ground-truth data through Training and Development sets, with an additional 20 hours of blind-set Evaluation data. Based on feedback from the Fearless Steps participants, additional Tracks for streamlined speech recognition and speaker diarization have been included in FS#2. The results for this Challenge will be presented at the ISCA INTERSPEECH-2020 Special Session.

We encourage participants to explore any and all research tasks of interest with the Fearless Steps Corpus, with suggested Task Domains listed above. Research participants can, however, also use the FS#2 corpus to explore additional problems dealing with naturalistic data, which we welcome as part of the special session.

Organizers:
John H.L. Hansen (john.hansen@utdallas.edu)
Aditya Joglekar (aditya.joglekar@utdallas.edu)
Meena Chandra Shekar (meena.chandrashekar@utdallas.edu)
Abhijeet Sangwan (abhijeet.sangwan@utdallas.edu)