Microsoft Research
Published June 29, 2016, 23:30
One of the characteristics of spontaneous speech that distinguishes it from written text is the presence of disfluencies, including filled pauses (um, uh), repetitions, and self-corrections. In spoken language processing applications, disfluencies are typically thought of as "noise" in the speech signal. However, there are several systematic patterns associated with where disfluencies occur that can be leveraged to automatically detect them and to improve natural language processing. Further, rates of different types of disfluencies appear to depend on multiple levels of speech production planning and to vary depending on the individual speaker and the social context. Thus, detecting different disfluency types provides additional information about spoken interactions, beyond the literal meaning of the words. In this talk, we describe both computational models for multi-domain disfluency detection and analyses of different corpora that provide insights into what disfluencies can tell us about the speaker in both high-stakes and casual contexts.