SpiN 2018 :: program

Segregation enhancement for hearing impaired listeners using a deep neural networks separation algorithm

Lars Bramsløw^(a)
Eriksholm Research Centre, Snekkersten, Denmark

Gaurav Naithani
Tampere University of Technology, Finland

Atefeh Hafez
Oticon A/S, Smørum, Denmark

Tom Barker
Cirrus Logic Ltd, London, UK

Niels Pontoppidan
Eriksholm Research Centre, Snekkersten, Denmark

Tuomas Virtanen
Tampere University of Technology, Finland

(a) Presenting

Hearing aid users are challenged in noisy listening situations, especially speech-on-speech situations with two or more competing voices. Specifically, segregating two competing voices is very hard for them, which is not the case for normal-hearing listeners.

Recently, deep neural network algorithms have shown great potential in tasks like blind source separation of a single-channel (monaural) mixture of multiple voices. The idea is to train the algorithm on relatively short samples of clean speech, thus learning the characteristics of each voice. Once trained for those specific voices, the network can then be applied to mixtures of new speech samples from the same voices.
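One common way to picture single-channel separation is as a per-frequency mask applied to the mixture spectrum. The sketch below is an assumption made for illustration, not the algorithm evaluated in this study: it computes an oracle ratio mask from two known toy magnitude spectra, which is the kind of quantity a trained network would have to estimate from the mixture alone. All names and values are hypothetical.

```python
# Illustrative only: an oracle ratio mask on toy magnitude spectra.
# A trained DNN would have to predict such a mask from the mixture alone.

def ratio_mask(mag_a, mag_b, eps=1e-12):
    """Per-bin share of voice A in the combined magnitude."""
    return [a / (a + b + eps) for a, b in zip(mag_a, mag_b)]

def apply_mask(mask, mag_mix):
    """Scale each mixture bin by its mask value."""
    return [m * x for m, x in zip(mask, mag_mix)]

# Toy magnitudes for one time frame (hypothetical values).
voice_a = [4.0, 1.0, 0.0, 2.0]
voice_b = [1.0, 3.0, 2.0, 2.0]
# Simplifying assumption: magnitudes add in the mixture (phase is ignored).
mixture = [a + b for a, b in zip(voice_a, voice_b)]

mask_a = ratio_mask(voice_a, voice_b)
est_a = apply_mask(mask_a, mixture)
# The complementary mask assigns the rest of each bin to voice B.
est_b = apply_mask([1.0 - m for m in mask_a], mixture)
```

Under the additive-magnitude assumption the oracle mask recovers both toy spectra almost exactly; a real network only approximates the mask, so its output streams retain some residual interference.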

For this listening task, the benefit of a deep neural network (DNN) based stream segregation enhancement algorithm was tested on 15 hearing-impaired listeners. The newly developed Competing Voices Test (Bramsløw et al., 2016) was used, in which pairs of sentences are presented and the listener has to repeat a target sentence as cued on a monitor. The cue is a word from the target sentence, presented either before or after playback of the mixed sentences. The Competing Voices Test is based on the Danish HINT test (Nielsen & Dau, 2011; Nilsson et al., 1994).

A mixture of two HINT sentences was separated using a DNN, the resulting streams were presented individually to the two ears, and performance was measured as word score. The results indicate that DNNs have great potential for improving stream segregation and speech intelligibility in difficult scenarios with two equally important target voices. In such cases, the user will at any time be able to shift attention to the desired target voice.
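As an illustration of how a word score could be computed, the following sketch counts the fraction of target-sentence words that appear in the listener's response. The scoring rule (case-insensitive, word order ignored, each response word counted once) is an assumption and not necessarily the rule used in the Competing Voices Test; the example sentences are made up and are not HINT material.

```python
def word_score(target, response):
    """Fraction of target words present in the response (order ignored)."""
    target_words = target.lower().split()
    remaining = response.lower().split()
    hits = 0
    for word in target_words:
        if word in remaining:
            remaining.remove(word)  # each response word counts only once
            hits += 1
    return hits / len(target_words) if target_words else 0.0

# Hypothetical trial: the listener misses "ran" and one "the".
score = word_score("the boy ran to the shop", "the boy went to shop")
```

Here four of the six target words are matched, so the score is 4/6. Averaging such scores over trials would give one value per listener and condition.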

Last modified 2017-11-17 15:56:08