ISCA - International Speech
Communication Association

ISCApad #283

Monday, January 10, 2022 by Chris Wellekens

6-24 (2021-11-26) Four research internships at SteelSeries France R&D team

The SteelSeries France R&D team (formerly the Nahimic R&D team) is pleased to open four research internship positions for 2022.

The selected candidates will work on one of the following topics (more details below):

- Audio media classification

- Audio source classification

- Audio source separation

- Real-time speech restoration

Please reply/apply to nathan.souviraa-labastie@steelseries.com.

Audio media classification Master internship, Lille (France), 2022

Advisors — Pierre Biret, R&D Engineer, pierre.biret@steelseries.com — Nathan Souviraà-Labastie, R&D Engineer, PhD, nathan.souviraa-labastie@steelseries.com

Company description

SteelSeries is a leader in gaming peripherals focused on quality, innovation and functionality, and the fastest growing major gaming headset brand globally. Founded in 2001, SteelSeries improves performance through first-to-market innovations and technologies that enable gamers to play harder, train longer, and rise to the challenge. SteelSeries is a pioneering supporter of competitive gaming tournaments and eSports and connects gamers to each other, fostering a sense of community and purpose.

Nahimic joined the SteelSeries family in 2020 to bolster its reputation for industry-leading gaming audio performance across both hardware and software. Nahimic is the leading 3D gaming audio software editor, with more than 150 man-years of research and development in the gaming industry. Its team gathers a rare combination of world-class audio processing engineers and software geniuses based across France, Singapore and Taiwan. It is the worldwide leader in PC gaming audio software, which is embedded in millions of gaming devices, from gaming headsets to the most powerful gaming PCs by brands such as MSI, Dell, Asus and Gigabyte. Its technology offers the most precise and immersive sound for gamers, allowing them to be more efficient in any game and to feel more immersed.

We wish to meet passionate people full of energy and motivation, ready to take on great challenges to elevate everyone’s audio experience. We are currently looking for an AUDIO SIGNAL PROCESSING / MACHINE LEARNING RESEARCH INTERN to join the R&D team of SteelSeries’ Software & Services Business Unit in our French office (formerly the Nahimic R&D team).

Subject

The goal of the internship is to build a model able to classify audio streams into multiple media classes (class descriptions available upon request). The audio classification problem will be addressed using supervised machine learning. Hence, the first step of the internship will be to collect data and build a balanced corpus for this classification problem. Fortunately, a large amount of audio content for most potential classes is available within the company, so this task should not be a significant burden. Once the corpus is built, the intern will either tune the parameters of an existing internal model or develop a better-suited model from the state of the art [1] [2] [4] that still satisfies the real-time constraint.

A more advanced step of the internship would be to define a finer media type classification, for instance with sub-types within the same category. Once the relevant classes have been identified, the intern will incorporate these changes into their classification algorithm and framework. As the intern will receive support to turn their model into an in-product real-time prototype, this internship is a rare opportunity to bring research to product within a short time frame.
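To illustrate what building a balanced corpus can mean in practice, here is a minimal sketch that downsamples every class to the size of the rarest one. It is only one possible balancing strategy, not the team's actual procedure. The function name `balance_corpus`, the clip identifiers and the class labels are all hypothetical, since the real classes are only described upon request.

```python
import random
from collections import defaultdict


def balance_corpus(items, seed=0):
    """Downsample every class to the size of the rarest class.

    `items` is a list of (clip_id, label) pairs (hypothetical format).
    """
    by_label = defaultdict(list)
    for clip_id, label in items:
        by_label[label].append(clip_id)

    # The rarest class sets the number of clips kept per class.
    target = min(len(clips) for clips in by_label.values())

    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    balanced = []
    for label in sorted(by_label):
        for clip_id in rng.sample(by_label[label], target):
            balanced.append((clip_id, label))
    return balanced


# Toy, made-up corpus: 5 "music" clips, 3 "movie" clips, 2 "game" clips.
corpus = (
    [(f"music_{i}", "music") for i in range(5)]
    + [(f"movie_{i}", "movie") for i in range(3)]
    + [(f"game_{i}", "game") for i in range(2)]
)
balanced = balance_corpus(corpus)
```

Downsampling discards data from the larger classes. When audio content is abundant, as stated above, this is usually acceptable. Otherwise, class-weighted losses or oversampling are common alternatives.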

Skills

Who are we looking for?

You are preparing an engineering degree or a master’s degree, and you preferably have knowledge of the development and implementation of advanced algorithms for digital audio signal processing. Machine learning skills are a plus.

Although not mandatory, notions in the following fields would be appreciated:

- Audio, acoustics and psychoacoustics.
- Audio effects in general: compression, equalization, etc.
- Machine learning and artificial neural networks.
- Statistics, probabilistic approaches, optimization.
- Programming languages and frameworks: Matlab, Python, PyTorch, Keras, TensorFlow.
- Voice recognition, voice command.
- Computer programming and development: Max/MSP, C/C++/C#.
- Audio editing software: Audacity, Adobe Audition, etc.
- Scientific publications and patent applications.
- Fluent in English and French.
- Demonstrate intellectual curiosity.

Other offers

https://nahimic.welcomekit.co/

https://www.welcometothejungle.co/companies/nahimic/jobs

References

[1] DCASE Challenge. Low-Complexity Acoustic Scene Classification with Multiple Devices. URL: http://dcase.community/challenge2021/task-acoustic-scene-classification-results-a.

[2] B. Kim et al. Domain Generalization on Efficient Acoustic Scene Classification using Residual Normalization. 2021. arXiv:2111.06531 [cs.SD].

[3] Nahimic on MSI. URL: https://fr.msi.com/page/nahimic.

[4] sharathadavanne. seld-dcase2021. https://github.com/sharathadavanne/seld-dcase2021. 2021.

Audio source classification Master internship, Lille (France), 2022

Advisors — Nathan Souviraà-Labastie, R&D Engineer, PhD, nathan.souviraa-labastie@steelseries.com — Pierre Biret, R&D Engineer, pierre.biret@steelseries.com

Company description

See the company description under the Audio media classification internship above.

Subject

The goal of the internship is to build a model able to classify audio sources. By audio sources, we mean sources present inside a given, predetermined medium such as music, movies or video games, e.g., instruments in the case of music. The audio classification problem will be addressed using supervised machine learning. The intern will not start the project from scratch, as data and classification code from other projects can be reused with minor adaptation (description upon request). Once the corpus is reshaped for classification, the intern will either tune the parameters of an existing internal model or develop a better-suited model from the state of the art [1] [3] [5] that still satisfies a strong real-time constraint.

Multi-task approach

A more advanced step of the internship would be to explore multi-task models. The two target tasks would be 1/ the classification task that the intern will have previously addressed, and 2/ the audio source separation task on the same data type. This is a very challenging machine learning problem, especially because the different tasks are heterogeneous (classification, regression, signal estimation), in contrast to homogeneous multi-task classification, where a single classifier addresses several classification tasks. Moreover, only a few studies target heterogeneous multi-task learning for audio (exhaustive list to the advisors' knowledge: [2, 6, 7, 4]). Potential advantages of the multi-task approach are improved performance on the main task and reduced computational cost in products, as several tasks are performed at the same time. Previous internal bibliographic work and network architectures could be used as a starting point for this approach.
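To make the heterogeneous multi-task idea more concrete, here is a minimal sketch of a joint training objective: a weighted sum of a classification loss (cross-entropy) and a signal-estimation loss (mean squared error). This is an illustrative assumption, not the internal model or the formulation of the cited papers. The function names, the `weight` parameter and the toy values are hypothetical.

```python
import math


def cross_entropy(logits, target_index):
    """Classification loss: negative log-softmax of the target class."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target_index]


def mean_squared_error(estimate, reference):
    """Signal-estimation loss between an estimated and a reference signal."""
    return sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference)


def multitask_loss(logits, target_index, estimate, reference, weight=0.5):
    """Weighted sum of the two heterogeneous losses (`weight` is hypothetical)."""
    return (weight * cross_entropy(logits, target_index)
            + (1.0 - weight) * mean_squared_error(estimate, reference))


# Toy values: 3 class scores and a 3-sample separated signal.
loss = multitask_loss(
    logits=[2.0, 0.5, -1.0], target_index=0,
    estimate=[0.1, 0.4, -0.2], reference=[0.0, 0.5, -0.2],
)
```

The two terms have different scales and units, which is part of what makes the heterogeneous case difficult. In practice the balance between them generally has to be tuned or learned.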

Skills

Who are we looking for?

You are preparing an engineering degree or a master’s degree, and you preferably have knowledge of the development and implementation of advanced algorithms for digital audio signal processing. Machine learning skills are a plus.

Although not mandatory, notions in the following fields would be appreciated:

- Audio, acoustics and psychoacoustics.
- Machine learning and artificial neural networks.
- Audio effects in general: compression, equalization, etc.
- Statistics, probabilistic approaches, optimization.
- Programming languages and frameworks: Matlab, Python, PyTorch, Keras, TensorFlow.
- Sound spatialization effects: binaural synthesis, ambisonics, artificial reverberation.
- Voice recognition, voice command.
- Voice processing effects: noise reduction, echo cancellation, array processing.
- Computer programming and development: Max/MSP, C/C++/C#.
- Audio editing software: Audacity, Adobe Audition, etc.
- Scientific publications and patent applications.
- Fluent in English and French.
- Demonstrate intellectual curiosity.

Other offers

https://nahimic.welcomekit.co/

https://www.welcometothejungle.co/companies/nahimic/jobs

References

[1] DCASE Challenge. Low-Complexity Acoustic Scene Classification with Multiple Devices. URL: http://dcase.community/challenge2021/task-acoustic-scene-classification-results-a.

[2] P. Georgiev et al. "Low-resource Multi-task Audio Sensing for Mobile and Embedded Devices via Shared Deep Neural Network Representations". In: Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1.3 (Sept. 2017), 50:1-50:19.

[3] B. Kim et al. Domain Generalization on Efficient Acoustic Scene Classification using Residual Normalization. 2021. arXiv:2111.06531 [cs.SD].

[4] H. Phan et al. "On multitask loss function for audio event detection and localization". In: arXiv preprint arXiv:2009.05527 (2020).

[5] sharathadavanne. seld-dcase2021. https://github.com/sharathadavanne/seld-dcase2021. 2021.

[6] G. Pironkov, S. Dupont and T. Dutoit. "Multi-task learning for speech recognition: an overview". In: 24th European Symposium on Artificial Neural Networks 1 (2016).

[7] D. Stoller, S. Ewert and S. Dixon. "Jointly Detecting and Separating Singing Voice: A Multi-Task Approach". Apr. 2018. arXiv:1804.01650 [cs, eess].

Audio source separation Master internship, Lille (France), 2022

Advisors — Nathan Souviraà-Labastie, R&D Engineer, PhD, nathan.souviraa-labastie@steelseries.com — Damien Granger, R&D Engineer, damien.granger@steelseries.com

Company description

See the company description under the Audio media classification internship above.

Approaches and topics for the internship

Audio source separation consists of extracting the different sound sources present in an audio signal, in particular by estimating their frequency distributions and/or spatial positions. Many applications are possible, from karaoke generation to speech denoising. In 2020, our separation approaches [11, 1] matched the state of the art [12, 13] on a music separation task, and many avenues for improvement remain in terms of implementation (details below). The selected candidate will work on one or several of the following topics according to her/his aspirations, skills and bibliographic findings. In addition to those topics, candidates can also propose their own topic. She/he will also have the chance to work on our substantial internal datasets.

New core algorithm

Machine learning is a fast-changing research domain, and an algorithm can move from being state of the art to being obsolete in less than a year (see for instance the recent advances in music source separation [9, 3]). The task would be to try recent powerful neural network approaches, such as recent architectures or unit types that have proved beneficial in other research fields. For instance, the encoding and decoding part of [15] shows a large benefit compared to traditional audio codecs. Research domains outside audio (such as computer vision) might also be considered as sources of inspiration. For instance, the approaches in [14, 6] have shown promising results on other tasks, and previous internal work [1] managed to bring those benefits to audio source separation. Conversely, approaches like [10, 5] were tested without benefit for the separation tasks that we target. Overall, the targeted benefits of a new approach can be of two kinds: either improving audio separation performance or reducing computational costs (mainly CPU/GPU load and RAM usage).

Extension to multi-source

Another challenging problem would be to estimate all the different sources with a single network, either by selecting which source to output, as in [7], or by outputting all sources at the same time. In the case of music, most state-of-the-art approaches [12] have historically addressed the backing track problem (i.e., karaoke for instruments) as a one-instrument-versus-the-rest problem, hence using a specific network for each instrument when multiple instruments are present in the mix.

Pruning

Besides testing new architectures or unit types, pruning could be a simple and effective way to reduce computational costs. The original principle of pruning is to remove the least influential neural units in order to avoid overfitting. We would mainly be interested in reducing the total number of units and parameters. The theoretical and domain-agnostic literature [16, 4, 8], as well as the audio-specific literature [2], will be explored.

As the selected candidate would work on our most advanced model, this subject is an opportunity to have a direct impact on the company within a short time frame.
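As one possible reading of the pruning idea described above, the following minimal sketch removes the units of a single dense layer whose incoming weights have the smallest L2 norm. Using the weight norm as a proxy for a unit's influence is an assumption made for illustration; the cited literature proposes other criteria. The function name `prune_units`, the `keep_ratio` parameter and the weight values are hypothetical.

```python
def prune_units(weights, keep_ratio):
    """Keep the units (rows) with the largest L2 norm.

    `weights[i]` holds the incoming weights of unit i in one dense layer.
    Returns the kept rows and their original indices.
    """
    n_keep = max(1, round(len(weights) * keep_ratio))
    norms = [sum(w * w for w in row) ** 0.5 for row in weights]

    # Rank units from most to least influential under the norm criterion.
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore the original unit order
    return [weights[i] for i in kept], kept


# Made-up layer with 4 units and 3 inputs each; units 1 and 3 are near zero.
layer = [
    [0.9, -1.2, 0.3],
    [0.01, 0.02, -0.01],
    [-0.7, 0.8, 1.1],
    [0.05, -0.03, 0.0],
]
pruned, kept = prune_units(layer, keep_ratio=0.5)
```

In a full network, removing a unit also means removing the matching input column of the next layer, and the pruned model is usually fine-tuned afterwards to recover performance.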

Skills

Who are we looking for?

You are preparing an engineering degree or a master’s degree, and you preferably have knowledge of the development and implementation of advanced algorithms for digital audio signal processing. Machine learning skills are a plus.

Although not mandatory, notions in the following fields would be appreciated:

- Audio, acoustics and psychoacoustics.
- Machine learning and artificial neural networks.
- Audio effects in general: compression, equalization, etc.
- Statistics, probabilistic approaches, optimization.
- Programming languages and frameworks: Matlab, Python, PyTorch, Keras, TensorFlow.
- Sound spatialization effects: binaural synthesis, ambisonics, artificial reverberation.
- Voice recognition, voice command.
- Voice processing effects: noise reduction, echo cancellation, array processing.
- Computer programming and development: Max/MSP, C/C++/C#.
- Audio editing software: Audacity, Adobe Audition, etc.
- Scientific publications and patent applications.
- Fluent in English and French.
- Demonstrate intellectual curiosity.

Other offers

https://nahimic.welcomekit.co/

https://www.welcometothejungle.co/companies/nahimic/jobs