Full-rank Gaussian Modeling of Convolutive Audio Mixtures Applied to Source Separation

Published August 17, 2016, 3:10
We address the modeling of reverberant recording environments in the context of under-determined convolutive blind source separation, that is, the extraction of each source signal from a multichannel audio mixture containing more sources than channels. First, we propose a general Gaussian modeling framework in which the contribution of each source to all mixture channels in the time-frequency domain is modeled as a zero-mean Gaussian random variable whose covariance encodes both the spatial and the spectral characteristics of the source. To better account for the reverberant mixing process, we relax the conventional narrowband assumption, which yields a rank-1 spatial covariance, and propose a full-rank spatial covariance parameterization instead. We then design a general source separation architecture that is applicable to both linear and quadratic input time-frequency representations, and derive a family of Expectation-Maximization (EM) algorithms that estimate the model parameters in either the maximum likelihood (ML) sense or the maximum a posteriori (MAP) sense. The source separation results of the proposed approach are compared with those of several baseline and state-of-the-art algorithms on both simulated mixtures and real-world recordings in various scenarios.
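To make the modeling idea concrete, here is a minimal sketch of how source images could be recovered at a single time-frequency bin once the model parameters are known. It rests on assumptions the abstract does not spell out: that each source covariance factors into a scalar spectral variance times a full-rank spatial covariance matrix, and that separation is done with a multichannel Wiener filter. The two-channel setup, the function names, and all numeric values are hypothetical and chosen only for illustration; parameter estimation by EM is not shown.

```python
# Sketch under stated assumptions: one time-frequency bin, 2 channels, 3 sources.
# Assumed model (not quoted from the abstract): source covariance = v_j * R_j,
# with v_j a scalar variance and R_j a full-rank Hermitian 2x2 spatial covariance.

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_scale(s, a):
    return [[s * a[i][j] for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(a, x):
    return [sum(a[i][k] * x[k] for k in range(2)) for i in range(2)]

def mat_inv(a):
    # Closed-form inverse of a 2x2 matrix.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def wiener_separate(x, variances, spatial_covs):
    """Posterior-mean estimate of each source's contribution to the mixture x."""
    source_covs = [mat_scale(v, r) for v, r in zip(variances, spatial_covs)]
    # Sources are assumed independent, so the mixture covariance is the sum.
    mix_cov = [[0j, 0j], [0j, 0j]]
    for c in source_covs:
        mix_cov = mat_add(mix_cov, c)
    mix_inv = mat_inv(mix_cov)
    # Wiener filter for source j: (v_j R_j) (sum_k v_k R_k)^{-1} x
    return [mat_vec(mat_mul(c, mix_inv), x) for c in source_covs]

# Illustrative (made-up) full-rank Hermitian spatial covariances, one per source.
spatial_covs = [
    [[1.0, 0.3 + 0.2j], [0.3 - 0.2j, 0.8]],
    [[0.9, -0.4j], [0.4j, 1.1]],
    [[1.2, 0.5], [0.5, 0.7]],
]
variances = [2.0, 0.5, 1.0]          # made-up spectral variances at this bin
x = [1.0 + 0.5j, -0.3 + 0.8j]        # made-up mixture STFT coefficients

images = wiener_separate(x, variances, spatial_covs)
for j, img in enumerate(images):
    print(f"source {j}: {img}")
```

Because the individual filters sum to the identity matrix, the estimated source images add back up to the observed mixture. With rank-1 spatial covariances, the same code would model each source as arriving through a single fixed mixing vector, which is the narrowband assumption that the full-rank parameterization relaxes.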