Discriminative Graphical Models for Structured Data Prediction

Published September 6, 2016, 6:20
Structured data prediction is the problem of assigning labels (or segmentations) to observations that have inherent structure, such as parsing English sentences or predicting the structure of protein sequences. Recently, various discriminative graphical models, such as Conditional Random Fields and Max-Margin Markov Networks, have been proposed and have proven successful in different applications. However, most of them fail to handle effectively the long-range interactions that are common in various domains, for example co-reference in text and hydrogen bonding between residues that are far apart in protein sequences. In this talk, I will present a framework of discriminative graphical models that addresses this problem, with applications to protein structure prediction. In particular, we define a special kind of undirected graph whose nodes represent the states of the structural elements of interest (e.g. secondary structure components or text chunks) and whose edges indicate either local interactions between neighboring nodes or long-range interactions between distant nodes. Based on the types of long-range interactions in our application, we propose segmentation conditional random fields for general interactions within a single sequence, a chain graph model for repetitive patterns of interactions, and dynamic segmentation conditional random fields for complex interactions involving multiple protein sequences. I will also briefly discuss our ongoing work on applying these models to bio-literature text mining, and my previous work on automatically extracting hidden features from product description texts.
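
The graph described in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's implementation: the `Segment` class, the `build_edges` and `log_score` helpers, the example segmentation, and all feature weights are hypothetical and chosen only to show how local and long-range edges could contribute to an unnormalized log-score of one candidate segmentation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Edge = Tuple[int, int]


@dataclass(frozen=True)
class Segment:
    """One node of the graph: a labeled span of the sequence (hypothetical type)."""
    start: int  # inclusive position in the sequence
    end: int    # exclusive position in the sequence
    label: str  # state of the structural element, e.g. "strand"


def build_edges(segments: List[Segment],
                long_range_pairs: List[Edge]) -> Tuple[List[Edge], List[Edge]]:
    """Return (local edges, long-range edges) over segment indices."""
    # Local edges connect each segment to its immediate neighbor.
    local = [(i, i + 1) for i in range(len(segments) - 1)]
    # Long-range edges connect non-adjacent segments only.
    long_range = [(i, j) for i, j in long_range_pairs if abs(i - j) > 1]
    return local, long_range


def log_score(segments: List[Segment],
              local: List[Edge],
              long_range: List[Edge],
              node_f: Callable[[Segment], float],
              local_f: Callable[[Segment, Segment], float],
              long_f: Callable[[Segment, Segment], float]) -> float:
    """Unnormalized log-score: sum of node, local-edge and long-range-edge terms."""
    total = sum(node_f(s) for s in segments)
    total += sum(local_f(segments[i], segments[j]) for i, j in local)
    total += sum(long_f(segments[i], segments[j]) for i, j in long_range)
    return total


# Assumed toy segmentation: two strands separated by a coil, with the
# two strands paired by a long-range interaction.
segments = [Segment(0, 5, "strand"), Segment(5, 9, "coil"), Segment(9, 14, "strand")]
local, long_range = build_edges(segments, long_range_pairs=[(0, 2)])

# Assumed feature functions with made-up weights.
score = log_score(
    segments, local, long_range,
    node_f=lambda s: 1.0,
    local_f=lambda a, b: 1.0 if a.label != b.label else 0.0,
    long_f=lambda a, b: 2.0 if a.label == b.label == "strand" else 0.0,
)
print(score)
```

In a conditional random field this score would be exponentiated and normalized over all candidate segmentations of the same sequence; that normalization, and the learning of the weights, are omitted here.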