A novel paradigm for nonlinear speech processing through local singularity analysis

Published July 28, 2016, 1:40
The existence of nonlinear and turbulent phenomena in the speech production process has been established both theoretically and experimentally. However, most current approaches to speech processing are based on linear techniques that rely on the linear source-filter model. Despite their undeniable importance, these linear approaches cannot adequately capture the complex dynamics of speech. For this reason, nonlinear speech processing has gained significant attention in recent years. Among the numerous attempts to develop nonlinear methods and models for speech processing, one class has drawn analogies from the study of turbulent flows and dynamical systems in statistical physics. I will start the talk by giving a brief overview of such methods and argue that they belong to the first phase of complex systems theory, in which only global measurements of the degree of complexity can be obtained. This limitation, together with the difficulty of computing such measurements in practice, restricts the usefulness and applications of these methods: their most widespread application is signal classification, such as voice pathology detection. Since the 1990s, a new phase in complex systems theory has emerged in which it is possible to quantify complexity in a geometrical and local manner. Within this framework, the GeoStat Group and its collaborators have developed the so-called Microcanonical Multiscale Formalism (MMF) for natural image processing. In MMF, the relation between geometry and statistics is unlocked through the notions of local singularity/predictability exponents and system reconstructability. Over the last three years, we have been conducting research to adapt MMF to the particular case of speech signals, viewed as realizations of a complex system. A particular aspect of our strategy has been to study the potential of MMF in fundamental speech problems and to develop efficient and robust processing algorithms.
I will show that, with an appropriate definition and estimation of singularity exponents, critical system transitions can be identified, providing interesting descriptions of some speech dynamics and characteristics. As a consequence, we have achieved promising results and outperformed state-of-the-art linear techniques in several speech applications, such as speech segmentation, glottal closure instant (GCI) identification, sparse source modeling and coding. These results open up many perspectives, which we will discuss at the end of the talk.