Multi-microphone Dereverberation and Intelligibility Estimation in Speech Processing

Microsoft Research

Published 17 August 2016, 3:06

When speech signals are captured by one or more microphones in realistic acoustic environments, they are contaminated by noise from surrounding sound sources and by reverberation caused by reflections off walls and other surfaces. Noise and reverberation can degrade the perceptual experience of a listener and, in more severe cases, cause loss of intelligibility. Many signal processing applications, such as speech codecs and speech recognizers, deteriorate rapidly in performance as noise and reverberation levels increase. Consequently, the challenging problems of noise reduction and dereverberation have received a great deal of research attention, especially with the advent of mobile telephony and voice over IP. Multi-microphone speech dereverberation is the topic of the first part of this talk. Two alternative methods will be introduced. The first is based on the source-filter model of speech production, while the second approaches the problem through blind identification and inversion of the room impulse responses. Simulation results will be presented to demonstrate the methods and to compare their dereverberation performance.

In the second part, the talk will focus on subject-based and automatic estimation of intelligibility in noisy and processed speech. In particular, the Bayesian Adaptive Speech Intelligibility Estimation (BASIE) method will be presented. BASIE is a tool for rapid subject-based estimation of a given speech reception threshold (SRT), and of the slope at that threshold, for multiple psychometric functions of speech intelligibility in noise. The core of BASIE is an adaptive Bayesian procedure that adjusts the signal-to-noise ratio of each subsequent stimulus so that the expected variances of the threshold and slope estimates are minimised. Furthermore, strategies for using BASIE to evaluate the effects of speech processing algorithms on intelligibility will be given, along with two illustrative examples for different noise reduction methods, supported by listening experiments.
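
To make the adaptive step concrete, here is a minimal Python sketch of a generic grid-based Bayesian procedure of the kind the abstract describes. It is not the BASIE implementation. The logistic form of the psychometric function, the flat prior, the cost (the sum of the two posterior variances, each divided by its prior variance), and all names (`psychometric`, `AdaptiveTrack`, `simulate`) are assumptions made for illustration. It also tracks a single psychometric function, whereas the talk covers several.

```python
import math
import random


def psychometric(snr_db, srt_db, slope):
    """Assumed logistic model: probability of a correct response at snr_db.

    srt_db is the 50% point; slope is the gradient (per dB) at that point.
    """
    p = 1.0 / (1.0 + math.exp(-4.0 * slope * (snr_db - srt_db)))
    return min(max(p, 1e-9), 1.0 - 1e-9)  # keep likelihoods strictly in (0, 1)


class AdaptiveTrack:
    """Discrete posterior over (SRT, slope) on a grid; hypothetical helper class."""

    def __init__(self, srt_grid, slope_grid, snr_candidates):
        self.params = [(t, s) for t in srt_grid for s in slope_grid]
        self.weights = [1.0 / len(self.params)] * len(self.params)  # flat prior
        self.snr_candidates = list(snr_candidates)
        # Prior variances put both cost terms on a comparable scale (assumption).
        # Each grid needs at least two distinct values so these are non-zero.
        self.scale = self._variances(self.weights)

    def _means(self, weights):
        mean_t = sum(w * t for w, (t, _) in zip(weights, self.params))
        mean_s = sum(w * s for w, (_, s) in zip(weights, self.params))
        return mean_t, mean_s

    def _variances(self, weights):
        mean_t, mean_s = self._means(weights)
        var_t = sum(w * (t - mean_t) ** 2 for w, (t, _) in zip(weights, self.params))
        var_s = sum(w * (s - mean_s) ** 2 for w, (_, s) in zip(weights, self.params))
        return var_t, var_s

    def _posterior(self, snr_db, correct):
        """Return the updated weights and the predictive probability of the response."""
        like = [psychometric(snr_db, t, s) for t, s in self.params]
        if not correct:
            like = [1.0 - p for p in like]
        joint = [w * l for w, l in zip(self.weights, like)]
        evidence = sum(joint)
        return [j / evidence for j in joint], evidence

    def expected_cost(self, snr_db):
        """Expected normalised variance after one more trial at snr_db."""
        cost = 0.0
        for correct in (True, False):
            post, prob = self._posterior(snr_db, correct)
            var_t, var_s = self._variances(post)
            cost += prob * (var_t / self.scale[0] + var_s / self.scale[1])
        return cost

    def next_snr(self):
        """Pick the candidate SNR with the smallest expected cost."""
        return min(self.snr_candidates, key=self.expected_cost)

    def update(self, snr_db, correct):
        self.weights, _ = self._posterior(snr_db, correct)

    def estimate(self):
        """Posterior means of SRT (dB) and slope (per dB)."""
        return self._means(self.weights)


def simulate(true_srt=-6.0, true_slope=0.1, trials=40, seed=0):
    """Run the track against a simulated listener with known parameters."""
    rng = random.Random(seed)
    track = AdaptiveTrack(
        srt_grid=[-20 + i for i in range(31)],            # -20 .. 10 dB
        slope_grid=[0.02 * (i + 1) for i in range(10)],   # 0.02 .. 0.20 per dB
        snr_candidates=[-20 + 2 * i for i in range(16)],  # -20 .. 10 dB
    )
    for _ in range(trials):
        snr = track.next_snr()
        correct = rng.random() < psychometric(snr, true_srt, true_slope)
        track.update(snr, correct)
    return track.estimate()
```

Calling `simulate()` runs the loop against a simulated listener and returns the posterior-mean SRT and slope. In a real listening test, the simulated response would be replaced by the subject's scored answer to a stimulus presented at the chosen SNR.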
