MSR Talk: Unsupervised Speech Reverberation Control with Diffusion Implicit Bridges

Published May 14, 2024, 15:32
Speaker(s): Eloi Moliner
Host: Hannes Gamper

Speech reverberation control involves manipulating the acoustic characteristics of speech recordings, and includes tasks such as speech dereverberation and reverberation time reduction. Diffusion implicit bridges are a recently proposed domain translation technique based on diffusion models and entropy-regularized optimal transport. They enable a bijective mapping between samples from different distributions by bridging through a Gaussian prior distribution. Diffusion bridges have the advantage of not requiring paired data samples for training, and they are optimized with a simple and stable Euclidean objective. This study applies diffusion implicit bridges to unsupervised speech reverberation control. We show that a naive implementation of this method produces numerous undesired artifacts, such as speaker identity changes or babbling, and attribute them to the curvature of the sampling trajectories. To mitigate these issues, we propose training the model with a chunk-based optimal transport coupling between speech and noise samples, which significantly straightens the learned trajectories and improves the semantic consistency of the speech content. We study the performance of different configurations of the model through a comprehensive objective evaluation. To demonstrate the versatility of the method, we additionally conduct experiments on other tasks, such as speech declipping and guitar distortion removal.
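
To make the idea of an optimal transport coupling between speech and noise samples more concrete, here is a minimal sketch. It is not the implementation from the talk: the chunk length, the squared Euclidean cost, and the brute-force solver are all assumptions chosen for illustration, and the function names (`squared_distance`, `ot_pairing`, `pairing_cost`) are hypothetical. The sketch pairs each data chunk with one Gaussian noise chunk so that the total pairing cost within a small batch is minimal, which is one common way such couplings are used to encourage straighter trajectories.

```python
import itertools
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length chunks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pairing_cost(data_chunks, noise_chunks, perm):
    """Total cost when data_chunks[i] is paired with noise_chunks[perm[i]]."""
    return sum(
        squared_distance(data_chunks[i], noise_chunks[perm[i]])
        for i in range(len(data_chunks))
    )


def ot_pairing(data_chunks, noise_chunks):
    """Exact minimum-cost one-to-one pairing by brute force.

    Assumption: the batch is tiny (the search is factorial in batch size).
    A practical version would swap in a proper assignment solver.
    """
    n = len(data_chunks)
    if n != len(noise_chunks):
        raise ValueError("batches must have the same number of chunks")
    best_perm, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        cost = pairing_cost(data_chunks, noise_chunks, perm)
        if cost < best_cost:
            best_perm, best_cost = list(perm), cost
    return best_perm, best_cost


# Illustrative batch: 5 stand-in "speech" chunks and 5 Gaussian noise chunks.
# The random vectors are placeholders, not real audio.
rng = random.Random(0)
chunk_len = 8  # hypothetical chunk length
data = [[rng.uniform(-1.0, 1.0) for _ in range(chunk_len)] for _ in range(5)]
noise = [[rng.gauss(0.0, 1.0) for _ in range(chunk_len)] for _ in range(5)]

perm, coupled_cost = ot_pairing(data, noise)
independent_cost = pairing_cost(data, noise, list(range(len(data))))
print("pairing:", perm)
print("coupled cost <= independent cost:", coupled_cost <= independent_cost)
```

During training, each data chunk would then be noised toward its paired noise chunk rather than toward an independently drawn one. For realistic batch sizes, an assignment solver such as `scipy.optimize.linear_sum_assignment` would replace the brute-force search.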

See more at microsoft.com/en-us/research/v...