Smartglasses processing: audio examples

The proposed system [1] is designed to receive the desired signal produced by the smartglasses wearer and suppress undesired sound.

Short demonstration:

Scenario:
1 desired speaker (user) + 3 undesired speakers (fixed locations); input SNR = -5 dB.

Signals at input:  

Signal at output:
 

Long demonstration:

In the article [1], we described experiments in which different signals were combined and processed with a number of algorithms. The desired signal was broadcast from a Head and Torso Simulator (HATS) and received by the 8-channel glasses-mounted array. Similarly, a number of undesired signals were broadcast and received by the array. The received signals were then combined to create various scenarios, which were used to test and compare several algorithms.
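As a rough illustration of how separately recorded desired and undesired signals can be combined into a scenario with a prescribed input SNR, consider the following minimal sketch. It is not the code used in the article: the function name mix_at_snr, the (samples, channels) array layout, and the choice of measuring power on a reference channel are all assumptions made for this example.

```python
import numpy as np


def mix_at_snr(desired, undesired, snr_db, ref_channels=(0,)):
    """Scale `undesired` so the mixture has the requested input SNR.

    Hypothetical helper, for illustration only. Both inputs are arrays of
    shape (num_samples, num_channels). The SNR is measured as the ratio of
    mean powers over `ref_channels` (an assumed convention).
    """
    desired = np.asarray(desired, dtype=float)
    undesired = np.asarray(undesired, dtype=float)

    # Trim both recordings to a common length.
    n = min(desired.shape[0], undesired.shape[0])
    desired, undesired = desired[:n], undesired[:n]

    ref = list(ref_channels)
    p_desired = np.mean(desired[:, ref] ** 2)
    p_undesired = np.mean(undesired[:, ref] ** 2)

    # One common gain for all channels, so the spatial structure of the
    # undesired recording is left unchanged.
    gain = np.sqrt(p_desired / (p_undesired * 10.0 ** (snr_db / 10.0)))
    return desired + gain * undesired, gain


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    desired = rng.standard_normal((16000, 8))    # stand-in for the HATS recording
    undesired = rng.standard_normal((16000, 8))  # stand-in for the interference
    mixture, gain = mix_at_snr(desired, undesired, snr_db=-5.0)
    print(mixture.shape, round(float(gain), 3))
```

Applying a single gain to every channel changes only the level of the undesired recording; the relative amplitudes and delays between the array channels stay as they were recorded.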

DESIRED SIGNAL:
The following is a recording of the desired signal (the average of the two omnidirectional sensors):  

NOISE SIGNAL:
Two noise scenarios were created: (a) three stationary speech sources, and (b) a moving speech source.

(a) Stationary noise sources:
The following is a recording of the three stationary speech signals (this is a stereo recording with each channel corresponding to one of the omnidirectional sensors):  

(b) Moving noise source:
The following is a recording of a single undesired speech signal (this is a stereo recording with each channel corresponding to one of the omnidirectional sensors):  

RESULTS:
We present tables containing results of the tested scenarios.

(a) Stationary scenario:  

SNR = -10 dB    SNR = -5 dB    SNR = 0 dB    SNR = 5 dB
Input signal
fixed-MVDR
fixed-MPDR
Adaptive MPDR
oracle adaptation
unprocessed monopole average
proposed algorithm (*)

(*) without post-processing

(b) Moving interference scenario:  

SNR = -10 dB    SNR = -5 dB    SNR = 0 dB    SNR = 5 dB
Input signal
fixed-MVDR
fixed-MPDR
Adaptive MPDR
oracle adaptation
unprocessed monopole average
proposed algorithm (*)

(*) without post-processing

Post-processing results  
The following table demonstrates the effects of post-processing. The results pertain to the scenario with three stationary interferers. Two sets of parameters are used for the post-processing stage. The first set (termed "post1") is more conservative and the second ("post2") is more aggressive.

SNR = -10 dB    SNR = -5 dB    SNR = 0 dB    SNR = 5 dB
proposed algorithm (*)
post1
post2

(*) without post-processing

Reference:

  1. Dovid Y. Levin, Emanuël A.P. Habets, Sharon Gannot, Near-field signal acquisition for smartglasses using two acoustic vector-sensors, Speech Communication, Volume 83, October 2016, Pages 42-53.
    [arXiv document (open access)]; [Journal (possibly paywalled)]