Speech Enhancement Task

tl;dr: Check the full paper here.

On the test-set snippets

These examples come from LibriSpeech (train-clean-100) snippets of 2^16 samples, which were used to test the efficiency of the network. The next section shows the model applied to multiple SNR intervals and with full temporal context (complete audio files). As a quick reminder, the network was trained on noisy signals with an SNR between 5 dB and 15 dB.
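To make the SNR condition concrete, here is a minimal sketch of how a clean snippet could be mixed with noise at a chosen SNR. It is not the paper's data pipeline: the function names (`mix_at_snr`, `measure_snr_db`), the 16 kHz rate, the sine tone, and the white noise are all assumptions chosen for illustration, and plain Python lists stand in for real audio arrays to keep the example dependency-free.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR in dB."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0.0:
        raise ValueError("noise must not be silent")
    # SNR_dB = 10 * log10(P_clean / P_noise)
    # => required noise power = P_clean / 10^(SNR_dB / 10)
    target_noise_power = power(clean) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)
    return [c + gain * n for c, n in zip(clean, noise)]


def measure_snr_db(clean, noisy):
    """Measure the SNR in dB of `noisy`, treating noisy - clean as the noise."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(residual))


SNIPPET_LEN = 2 ** 16  # snippet length in samples, as stated in the text

rng = random.Random(0)
# Stand-ins for real audio (assumptions): a 440 Hz tone at 16 kHz and white noise.
clean = [math.sin(2.0 * math.pi * 440.0 * t / 16000.0) for t in range(SNIPPET_LEN)]
noise = [rng.gauss(0.0, 1.0) for _ in range(SNIPPET_LEN)]

target = rng.uniform(5.0, 15.0)  # draw an SNR from the 5 dB to 15 dB interval
noisy = mix_at_snr(clean, noise, target)
measured = measure_snr_db(clean, noisy)
```

After mixing, `measured` should match `target` up to floating-point error, which is a quick sanity check that the scaling is right. The same function covers the other intervals shown on this page by changing the bounds passed to `rng.uniform`.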

Noisy Signal Our Method Ground Truth

On long temporal context

Here we apply our method to more diverse audio recordings at their full length. The outputs of this evaluation are then fed to ASR algorithms to check whether there are performance gains.
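One common way to quantify such gains is the word error rate (WER) of the ASR transcripts before and after enhancement. The sketch below assumes WER is the metric of interest, which this page does not state; the function name `word_error_rate` and the example transcripts are invented for illustration and are not results.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two strings, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up transcripts, purely to show usage (not measured data).
reference = "the cat sat on the mat"
transcript_from_noisy = "the hat sat on mat"
transcript_from_enhanced = "the cat sat on the mat"

wer_noisy = word_error_rate(reference, transcript_from_noisy)
wer_enhanced = word_error_rate(reference, transcript_from_enhanced)
```

A lower WER on the enhanced audio than on the noisy audio, for the same ASR system and reference text, would indicate a gain.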

SNR between 10 dB and 20 dB

Noisy Signal Our Method Ground Truth

SNR between 5 dB and 15 dB

Noisy Signal Our Method Ground Truth

SNR between 0 dB and 10 dB

Noisy Signal Our Method Ground Truth

ASR Evaluation

@TODO