Exploring Richer Sequence Models in Speech and Language Processing

Published July 28, 2016, 23:03
Conditional and other feature-based models have become an increasingly popular methodology for combining evidence in speech and language processing. As one example, Conditional Random Fields (CRFs) have been shown by several research groups to give strong performance on a range of tasks by discriminatively training weighted combinations of features computed over the input. Linear-chain CRFs have proven useful for sequence labeling tasks such as phone recognition and named entity recognition. As we tackle problems of increasing complexity, it makes sense to investigate models that move beyond linear-chain CRFs in various ways -- for example, by considering richer graphical model structures that describe more complex interactions between linguistic variables, or by using CRF classifiers within a larger learning framework.

In this talk, I will describe recent research projects in the Speech and Language Technologies (SLaTe) Lab at Ohio State; each takes the basic CRF paradigm in a slightly different direction. The talk will cover two models for speech processing: Boundary-Factored CRFs, an extension of Segmental CRFs that allows fast processing of features related to state transitions, and Factorized CRFs, which we used to investigate articulatory-feature alignment. I will also discuss how CRFs play a role in a semi-supervised framework for event coreference resolution in clinical notes from electronic medical records.

Joint work with Yanzhang He, Rohit Prabhavalkar, Karen Livescu, Preethi Raghavan, and Albert Lai.
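For readers who want the underlying model: a linear-chain CRF (the standard formulation, not specific to the work described here) defines a globally normalized log-linear distribution over a label sequence $y = (y_1, \dots, y_T)$ given an input $x$:

\[
p(y \mid x) = \frac{1}{Z(x)} \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big),
\qquad
Z(x) = \sum_{y'} \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Big)
\]

where the $f_k$ are feature functions and the $\lambda_k$ are the discriminatively trained weights. Segmental CRFs generalize this by scoring variable-length segments of the input rather than individual frames, which can make features tied to segment transitions costly to compute; the Boundary-Factored extension mentioned above is aimed at speeding up exactly those features.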