ISCA Services

ISCA - International Speech Communication Association


ISCApad #252

Tuesday, June 11, 2019 by Chris Wellekens

6-27 (2019-01-24) Internship at LABRI, Talence, France

M2 internship topic: sleepiness detection and characterization in voice recordings.

Advisors:

- Jean-Luc Rouas – CR CNRS, LaBRI: rouas@labri.fr

- Pr. Pierre Philippe – PU-PH, SANPSY: pr.philip@free.fr

Subject: Detecting sleepiness is useful for many reasons: for instance, it can help prevent road traffic accidents and monitor workers in critical environments (air traffic control, nuclear plants, etc.). Beyond these important applications, it can also be used clinically in the follow-up of sleep-deprived patients. Obstructive sleep apnea is nowadays recognised as a major public health problem with many consequences: road traffic accidents, increased heart failure rates, behavioural and cognitive troubles, etc. To address these problems, we devised an experiment with the SANPSY research unit (Sommeil - Addiction - Neuropsychiatrie, Université Bordeaux Ségalen, CNRS USR 3413) to assess whether the sleepiness level of a patient can be evaluated using only a simple speech recording.

Previous research has shown that this task is feasible; however, most studies on sleepiness detection from speech rely on corpora with self-reported labels on the Karolinska Sleepiness Scale (KSS) [1]. For instance, the Interspeech 2011 Speaker State Challenge [2] uses data in German from 99 speakers, mixing different tasks (isolated vowels, read speech, command requests, spontaneous speech). The annotations are self-reported using the KSS and are divided into two classes: sleepy (S) and not sleepy (NS). The best system [3] in the challenge achieved a reported accuracy slightly above the baseline, with around 72 % of samples correctly identified. Other efforts on sleepiness detection from speech often use the same kind of data. For example, in [4], 77 participants are recorded speaking isolated vowels, and the annotation also uses self-reported KSS scores. Reported performance on two classes (S and NS) is around 78 % correct identification. In a more recent paper [5], the number of participants is larger (402) and the recordings are read passages from 7 texts. However, the classification task is not the same, since the classifier tries to predict the value of the KSS score.

In our project, in close partnership with the SANPSY unit, we have started to record patients (78 recorded so far) while assessing their sleepiness state through various measurements, including EEG, as well as clinical expertise. The recordings follow a strict clinical methodology, resulting in sets of 4 recordings per patient, always at the same time of day. Three categories of sleepiness level were defined by the health professionals (instead of the usual two in previous research on sleepiness detection in speech): very sleepy, intermediate and normal. Using these recordings and categories, we began to test different features and classification methods. With a relatively small set of features and simple classification techniques, we obtained a global classification rate of 70 % correct in a cross-validation procedure.

The task of the intern is to further explore the possibilities in terms of features and machine learning methods as the data collection continues, and to carry out a thorough analysis of the results so as to understand the influence of several factors such as gender, age, or pathology.
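To make the evaluation setting concrete, here is a minimal Python sketch of a three-class cross-validation in which all recordings of a given patient stay in the same fold. Everything in it is an assumption made for illustration: the posting does not say which features, classifier, or fold scheme were used, so the synthetic features, the one-label-per-patient simplification, the nearest-centroid classifier, and the helper names (`make_synthetic_data`, `patient_folds`, `cross_validate`) are all hypothetical stand-ins.

```python
import random
from collections import defaultdict

# The three categories named in the posting.
LABELS = ("very sleepy", "intermediate", "normal")


def make_synthetic_data(n_patients=12, recs_per_patient=4, n_features=3, seed=0):
    """Hypothetical stand-in data: 4 recordings per patient, one label per patient."""
    rng = random.Random(seed)
    data = []  # list of (patient_id, feature_vector, label)
    for pid in range(n_patients):
        label = LABELS[pid % len(LABELS)]
        centre = float(LABELS.index(label))  # class-dependent offset (made up)
        for _ in range(recs_per_patient):
            feats = [centre + rng.gauss(0.0, 0.3) for _ in range(n_features)]
            data.append((pid, feats, label))
    return data


def patient_folds(data, n_folds):
    """Assign whole patients to folds so no patient is in both train and test."""
    patients = sorted({pid for pid, _, _ in data})
    return {pid: i % n_folds for i, pid in enumerate(patients)}


def centroids(train):
    """Mean feature vector per class, computed on the training recordings."""
    sums, counts = {}, defaultdict(int)
    for _, feats, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(feats)
        sums[label] = [s + f for s, f in zip(sums[label], feats)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def predict(cents, feats):
    """Return the label of the closest class centroid (squared Euclidean distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(c, feats))
    return min(cents, key=lambda lab: sqdist(cents[lab]))


def cross_validate(data, n_folds=4):
    """Patient-grouped k-fold cross-validation; returns the global rate of correct labels."""
    fold_of = patient_folds(data, n_folds)
    correct = 0
    for k in range(n_folds):
        train = [r for r in data if fold_of[r[0]] != k]
        test = [r for r in data if fold_of[r[0]] == k]
        cents = centroids(train)
        correct += sum(predict(cents, f) == lab for _, f, lab in test)
    return correct / len(data)


if __name__ == "__main__":
    recordings = make_synthetic_data()
    print(f"global classification rate: {cross_validate(recordings):.2%}")
```

Grouping folds by patient is a design choice of this sketch, not something the posting states: with several recordings per patient, splitting them across training and test sets could let a classifier benefit from speaker identity instead of sleepiness cues.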

References:

[1] Shahid, A., Wilkinson, K., Marcu, S., & Shapiro, C. M. (2011). Karolinska sleepiness scale (KSS). In STOP, THAT and One Hundred Other Sleep Scales (pp. 209-210). Springer New York.

[2] Schuller, B., Steidl, S., Batliner, A., Schiel, F., & Krajewski, J. (2011). The Interspeech 2011 Speaker State Challenge. In Interspeech 2011, ISCA, Florence, Italy.

[3] Huang, D.-Y., Zhang, Z., & Ge, S. S. (2014). Speaker state classification based on fusion of asymmetric simple partial least squares (SIMPLS) and support vector machines. Computer Speech & Language, 28(2), 392-419. ISSN 0885-2308. https://doi.org/10.1016/j.csl.2013.06.002

[4] Krajewski, J., Schnieder, S., Sommer, D., Batliner, A., & Schuller, B. (2012). Applying multiple classifiers and non-linear dynamics features for detecting sleepiness from speech. Neurocomputing, 84, 65-75.

[5] Krajewski, J., Schnieder, S., Monschau, C., Titt, R., Sommer, D., & Golz, M. (2016, October). Large Sleepy Reading Corpus (LSRC): Applying Read Speech for Detecting Sleepiness. In Proceedings of Speech Communication; 12. ITG Symposium (pp. 1-4). VDE.

Requested skills:

- speech processing and/or signal processing techniques

- machine learning

- programming languages: MATLAB, Python, C/C++

- interest in clinical research and/or cognitive sciences
