|    | We have created and made publicly available a dense, person-oriented audio-visual ground-truth annotation of a feature-length film (100 minutes long): “Hannah and Her Sisters” by Woody Allen.
  The annotation includes:
  • Face tracks in video (densely annotated, i.e., in every frame, and person-labeled)
  • Speech segments in audio (person-labeled)
  • Shot boundaries in video
 
 
  The annotation can be useful for evaluating:
  • Person-oriented video-based tasks (e.g., face tracking, automatic character naming)
  • Person-oriented audio-based tasks (e.g., speaker diarization or recognition)
  • Person-oriented multimodal tasks (e.g., audio-visual character naming)
 
 
  Details on the Hannah dataset and access to it are available here:
  https://research.technicolor.com/rennes/hannah-home/
  https://research.technicolor.com/rennes/hannah-download/
 
 
  Acknowledgments:
  This work is supported by the AXES EU project: http://www.axes-project.eu/
 
 
 
 
 
 
 
 
 
  Alexey Ozerov (Alexey.Ozerov@technicolor.com)  
Jean-Ronan Vigouroux  
Louis Chevallier  
Patrick Pérez  
Technicolor Research & Innovation
 
 
     |