ISCA - International Speech Communication Association



ISCApad #233

Friday, November 10, 2017 by Chris Wellekens

6-13 (2017-07-12) Open PhD and postdoc positions at LIMSI - CNRS, Orsay, France
  
Automatic enrichment of TV series and movie transcripts
Keywords: natural language processing, speech processing, machine learning, deep learning

The goal of this project is to fully exploit the audio stream to automatically enrich speech transcripts and subtitles of TV series and movies with the name and position of the characters.

speaker A: 'Nice to meet you, I am Leonard, and this is Sheldon. We live across the hall.'
speaker B: 'Oh. Hi. I'm Penny.'

speaker A: 'Sheldon, what the hell are you doing?'
speaker C: 'I am not quite sure yet. Do you know where Howard lives?'

Just looking at these two short conversations, a human can easily infer that 'speaker A' is actually 'Leonard', 'speaker B' is 'Penny', and 'speaker C' is 'Sheldon'. The objective of this project is to combine natural language processing, speech processing, and computer vision to do the same automatically.
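
To make the inference above concrete, here is a minimal text-only sketch in Python. It is not the project's method: the two regular expressions, the function name guess_names, and the rule that a name called out at the start of a turn belongs to the next speaker are all assumptions chosen for illustration. It relies on two cues a human reader would also use: self-introductions ("I am X", "I'm X") and a name used to address someone at the start of a turn.

```python
import re

# Toy cues (assumptions for illustration only, not the project's approach).
SELF_INTRO = re.compile(r"\bI(?: am|'m) ([A-Z][a-z]+)")  # "I am Leonard", "I'm Penny"
VOCATIVE = re.compile(r"^([A-Z][a-z]+),")                # "Sheldon, what ..."


def guess_names(conversations):
    """Map anonymous speaker labels to character names using two text cues."""
    names = {}
    for turns in conversations:
        for i, (speaker, text) in enumerate(turns):
            # Cue 1: the speaker introduces themselves.
            intro = SELF_INTRO.search(text)
            if intro:
                names.setdefault(speaker, intro.group(1))
            # Cue 2: a name called out at the start of a turn is assumed
            # to belong to whoever speaks next in the same conversation.
            vocative = VOCATIVE.match(text)
            if vocative and i + 1 < len(turns):
                addressee = turns[i + 1][0]
                if addressee != speaker:
                    names.setdefault(addressee, vocative.group(1))
    return names


conversations = [
    [
        ("speaker A", "Nice to meet you, I am Leonard, and this is Sheldon. "
                      "We live across the hall."),
        ("speaker B", "Oh. Hi. I'm Penny."),
    ],
    [
        ("speaker A", "Sheldon, what the hell are you doing?"),
        ("speaker C", "I am not quite sure yet. Do you know where Howard lives?"),
    ],
]

print(guess_names(conversations))
```

Hand-written rules like these break quickly on real dialogue (for example, a turn starting with "Oh, hi" would be misread as addressing someone named "Oh"), which is one reason the project proposes learned models that combine text with audio and visual cues.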
 
