ISCA - International Speech
Communication Association



ISCApad #295

Monday, January 09, 2023 by Chris Wellekens

6-46 (2022-11-05) Audio detection for gaming, Master internship, Lille (France), 2022 @ SteelSeries France R&D team (former Nahimic R&D team), France
  

Audio detection for gaming

Master internship, Lille (France), 2022

Advisors
- Damien Granger, R&D Engineer, damien.granger@steelseries.com
- Nathan Souviraà-Labastie, R&D Engineer, PhD, nathan.souviraa-labastie@steelseries.com

Company description

About GN Group
GN was founded 150 years ago with a truly innovative and global mindset. Today, we honour that legacy with world-leading expertise in the human ear, sound and video processing, wireless technology, miniaturization and collaborations with leading technology partners. GN’s solutions are marketed by the brands ReSound, Beltone, Interton, Jabra, BlueParrott, SteelSeries and FalCom in 100 countries. The GN Group employs 6,500 people and is listed on Nasdaq Copenhagen (GN.CO).

About SteelSeries
SteelSeries is the worldwide leader in gaming and esports peripherals focused on premium quality, innovation, and functionality. SteelSeries’ family of professional and gaming enthusiasts are the driving force behind the company and help influence, design, and craft every single accessory and the brand’s software ecosystem, SteelSeries GG. In 2020, SteelSeries acquired Nahimic, the leader in 3D sound solutions for gaming.

We are currently looking for a machine learning / audio signal processing intern to join the R&D team of SteelSeries’ Software & Services Business Unit in our French office (former Nahimic R&D team).

Internship subject
This internship targets the detection of a known signal in an audio scene that contains multiple signals. For instance, some cues and feedback sounds in video games always use the same audio signal while the rest of the audio scene changes.

The current internal implementation is based on a legacy state-of-the-art music identification system [1, 2]. The objective of the internship is to address one or more of the following targets:
- Agnosticism to the source type (speech, music, audio gaming events, ...), since the current approach is designed for music
- Handling of shorter target signals
- Robustness to various types of overlapping audio noise from the audio scene
- Robustness to level variation over time (in the case of moving audio sources)
- Exploring the effect of multi-channel input signals: summing the channels might help to identify moving sounds, but with the potential drawback of lowering the signal-to-noise ratio
- Improvement of the above-mentioned aspects by adapting the approach to a smaller dictionary of target signals (not millions, as in the case of music)

Machine learning is the expected track, in particular pre-trained and potentially overfitted audio representations (embeddings). Here is a short list of examples:
- Attention-based audio embeddings [3]
- Autoencoders [4]
- Contrastive learning [5, 6]

Skills
Who are we looking for?
You are preparing an engineering degree or a master’s degree and preferably have knowledge of the development and implementation of advanced machine learning algorithms. Digital audio signal processing skills are a plus.

Although not mandatory, notions in the following additional fields would be appreciated:
- Audio effects in general: compression, equalization, etc.
- Statistics, probabilistic approaches, optimization.
- Programming languages and frameworks: Python, PyTorch, Keras, TensorFlow, Matlab.
- Voice recognition, voice command.
- Computer programming and development: Max/MSP, C/C++/C#.
- Audio editing software: Audacity, Adobe Audition, etc.
- Scientific publications and patent applications.
- Fluency in English and French.
- Intellectual curiosity.

References
[1] A. Wang. "The Shazam music recognition service". In: Communications of the ACM 49.8 (2006), pp. 44-48.
[2] A. Wang et al. "An industrial strength audio search algorithm". In: ISMIR. Vol. 2003. Washington, DC. 2003, pp. 7-13.
[3] A. Singh, K. Demuynck and V. Arora. "Attention-Based Audio Embeddings for Query-by-Example". In: arXiv preprint arXiv:2210.08624 (2022).
[4] A. Báez-Suárez et al. "SAMAF: Sequence-to-sequence Autoencoder Model for Audio Fingerprinting". In: ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 16.2 (2020), pp. 1-23.
[5] X. Wu and H. Wang. "Asymmetric Contrastive Learning for Audio Fingerprinting". In: IEEE Signal Processing Letters 29 (2022), pp. 1873-1877.
[6] Z. Yu et al. "Contrastive unsupervised learning for audio fingerprinting". In: arXiv preprint arXiv:2010.13540 (2020).


