Full-rank Gaussian Modeling of Convolutive Audio Mixtures Applied to Source Separation

Microsoft Research

Published August 17, 2016, 3:10

We address the modeling of reverberant recording environments in the context of under-determined convolutive blind source separation, that is, the extraction of the signal of each source from a multichannel audio mixture that has fewer channels than sources. First, we propose a general Gaussian modeling framework in which the contribution of each source to all mixture channels in the time-frequency domain is modeled as a zero-mean Gaussian random variable whose covariance encodes both the spatial and the spectral characteristics of the source. To better account for the reverberant mixing process, we relax the conventional narrowband assumption, which leads to a rank-1 spatial covariance, and propose a full-rank spatial covariance parameterization instead. We then design a general source separation architecture that is applicable to both linear and quadratic input time-frequency representations, and derive a family of Expectation-Maximization (EM) algorithms that estimate the model parameters in either the maximum likelihood (ML) sense or the maximum a posteriori (MAP) sense. The source separation results of the proposed approach are compared with those of several baseline and state-of-the-art algorithms on both simulated mixtures and real-world recordings in various scenarios.
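
To make the Gaussian model concrete, here is a minimal NumPy sketch of how source contributions could be recovered at a single time-frequency bin once the model parameters are known. It rests on assumptions that the abstract does not spell out: each source covariance is taken to factor into a scalar spectral variance `v[j]` times a spatial covariance matrix `R[j]`, and separation is done with the standard multichannel Wiener filter for zero-mean Gaussian sources. The function name `wiener_separate` and the random test parameters are hypothetical, and the EM parameter estimation described in the talk is not shown.

```python
import numpy as np


def wiener_separate(x, v, R):
    """Estimate each source's contribution to one time-frequency bin.

    x : (I,)      complex mixture coefficients over I channels
    v : (J,)      spectral variances of the J sources (assumed known)
    R : (J, I, I) Hermitian positive-definite spatial covariances (assumed known)
    Returns a (J, I) array of estimated source contributions.
    """
    # Assumed factorization: covariance of source j is v[j] * R[j].
    sigma = v[:, None, None] * R
    # Sources are modeled as independent, so the mixture covariance is the sum.
    sigma_x = sigma.sum(axis=0)
    # Wiener estimate: sigma_j @ inv(sigma_x) @ x, using solve instead of inv.
    y = np.linalg.solve(sigma_x, x)
    return sigma @ y


# Illustrative under-determined case: 3 sources, 2 channels, random parameters.
rng = np.random.default_rng(0)
I, J = 2, 3
A = rng.standard_normal((J, I, I)) + 1j * rng.standard_normal((J, I, I))
# A @ A^H is Hermitian positive semi-definite; the small ridge keeps it full rank.
R = A @ A.conj().transpose(0, 2, 1) + 1e-3 * np.eye(I)
v = rng.uniform(0.5, 2.0, size=J)
x = rng.standard_normal(I) + 1j * rng.standard_normal(I)

c = wiener_separate(x, v, R)
print(c.shape)
```

Under these assumptions the estimated contributions always add up to the observed mixture `x`, because the per-source filters sum to the identity. A rank-1 spatial covariance would correspond to `R[j]` being the outer product of a single mixing vector with its own conjugate; the full-rank case simply drops that restriction.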
