ISCA - International Speech
Communication Association

ISCApad #172

Sunday, October 07, 2012 by Chris Wellekens

6-26 (2012-07-05) PhD at LIG (Grenoble-France)

PhD proposal: Collaborative Annotation of multi-modal, multi-lingual and multimedia documents
Project objective
This PhD is proposed and funded in the context of the CHIST-ERA / ANR Camomile Project (Collaborative Annotation of multi-MOdal, MultI-Lingual and multi-mEdia documents).
Human activity constantly generates large volumes of heterogeneous data, in particular via the Web. These data can be collected and explored to gain new insights in social sciences, linguistics, economics and behavioral studies, as well as in artificial intelligence and computer science. In this regard, 3M (multimodal, multimedia, multilingual) data can be seen as a paradigm for sharing one object of study, human data, between many scientific domains. To be really useful, however, these data must be annotated and available in very large amounts. Annotated data is useful for computer science, which processes human data with statistical machine learning methods, but also for the social sciences, which increasingly rely on large corpora to support new insights in a way that was not imaginable a few years ago.
Annotating data is costly, however, because it involves a large amount of manual work. 3M data is especially costly, since different modalities must be annotated at different levels of abstraction. Current annotation frameworks rely on local manual annotation, sometimes with the help of automatic tools. The Camomile Project aims to develop a first prototype of a collaborative annotation framework for 3M data, in which manual annotation is done remotely at many sites while the final annotation is located at the main site. Following the same principle, systems devoted to automatic processing of the modalities present in the multimedia data (speech, vision) will assist the transcription by producing automatic pre-annotations.
PhD proposal
This PhD is dedicated to proposing semi-supervised and unsupervised methods for the annotation of 3M data. We will experiment with different semi-supervised annotation scenarios on different types of videos. More precisely, we shall study:
- innovative retraining / adaptation strategies to update the different systems using new annotations. Since we consider a real scenario in which new annotations are produced continuously, we will especially focus on iterative learning techniques in which models are updated instead of being fully retrained;
- new data selection methods for active learning strategies. We will focus on active learning for multimodal and heterogeneous systems, a setting that makes the data selection task much more difficult.
As a case study, we shall focus on developing technologies to answer the questions "who is seen?" and "who is speaking?" in videos. Depending on the type of video and the feedback from the supervision group, we may extend our work to the automatic annotation of objects ("what is seen?") or activities ("what is going on?").
Required Skills
The applicant must have a master's degree in either computer science or computer engineering, and some knowledge of speech, image or video processing and of machine learning. We are also looking for a candidate with very good programming skills.
LIG GETALP and MRIM collaboration
The PhD work is to be carried out between the GETALP and MRIM teams of LIG.
LIG / GETALP website: http://getalp.imag.fr
LIG / MRIM website: http://mrim.imag.fr
Contacts
Laurent Besacier: Laurent.Besacier@imag.fr
Georges Quénot: Georges.Quenot@imag.fr
Targeted starting date: fall 2012
