Microsoft Research · 334K
Published December 2, 2020, 1:31
Acoustic signal compression techniques, which convert a floating-point waveform into a bitstream representation, serve as a cornerstone of current data storage and telecommunication infrastructure. The rise of data-driven approaches to acoustic coding brings not only potential but also challenges, among which model complexity is a major concern: on the one hand, this general-purpose computational paradigm delivers superior performance; on the other hand, most codecs are deployed on low-power devices that can barely afford the heavy computational overhead. In this talk, I will introduce several of our recent efforts toward a better trade-off between performance and efficiency in neural speech/audio coding. I will present cascaded cross-module residual learning, which performs multistage quantization within a deep learning framework; in addition, I will discuss a collaborative quantization scheme that simultaneously binarizes linear predictive coefficients and the corresponding residuals. If time permits, I will also discuss a novel perceptually salient objective function with psychoacoustic calibration.
Learn more about this and other talks at Microsoft Research: microsoft.com/en-us/research/v...