Structured Prediction in NLP: Dual Decomposition and Structured Sparsity

Published July 27, 2016, 23:38
In the first half of the talk, I will describe a new dual decomposition method for structured classification that is suitable for logically constrained problems, with applications in NLP. Dual decomposition has recently been proposed as a way of combining complementary models, with a boost in predictive power. However, in cases where lightweight decompositions are not readily available (e.g., due to the presence of rich features or logical constraints), the original subgradient algorithm is inefficient. We sidestep that difficulty by adopting an augmented Lagrangian method (ADMM) that accelerates model consensus by regularizing towards the averaged votes. We show how first-order logical constraints can be handled efficiently, even though the corresponding subproblems are no longer combinatorial, and report experiments in dependency parsing, with state-of-the-art results.

In the second half of the talk, I will discuss structured sparse modeling in structured prediction. Linear models have enjoyed great success in structured prediction in NLP. While a lot of progress has been made on efficient training with several loss functions, the problem of endowing learners with a mechanism for feature selection remains open. Common approaches employ ad hoc filtering or L1 regularization; both ignore the structure of the feature space, preventing practitioners from encoding structural prior knowledge. We fill this gap by adopting regularizers that promote structured sparsity, along with efficient algorithms to handle them. Experiments on three tasks (chunking, entity recognition, and dependency parsing) show gains in performance, compactness, and model interpretability. This is joint work with Mario Figueiredo, Pedro Aguiar, Noah Smith and Eric Xing.
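To make the "consensus by regularizing towards the averaged votes" idea concrete, here is a minimal sketch of consensus ADMM. It is not the talk's actual method: the real subproblems are combinatorial (parsers, logical constraints), whereas here each component is a toy quadratic so the example runs end to end; `solve_component`, `rho`, and the target vectors are illustrative assumptions.

```python
import numpy as np

# Minimal consensus-ADMM sketch: each component solves its own subproblem,
# pulled towards a shared consensus variable (the "averaged votes"), and
# dual variables penalize disagreement. Toy quadratics stand in for the
# combinatorial subproblems of the actual dual decomposition.

def solve_component(target, u, z, rho):
    """Closed-form minimizer of 0.5*||x - target||^2 + (rho/2)*||x - z + u||^2."""
    return (target + rho * (z - u)) / (1.0 + rho)

targets = [np.array([1.0, 0.0, 2.0]),     # vote component 1 would like to cast
           np.array([0.0, 1.0, 1.0])]     # vote component 2 would like to cast
rho = 1.0                                  # augmented-Lagrangian penalty strength
z = np.zeros(3)                            # consensus variable (averaged votes)
us = [np.zeros(3) for _ in targets]        # scaled dual variables, one per component

for _ in range(50):
    # Each component solves its subproblem, regularized towards the consensus.
    xs = [solve_component(t, u, z, rho) for t, u in zip(targets, us)]
    # Consensus update: average the dual-corrected votes.
    z = np.mean([x + u for x, u in zip(xs, us)], axis=0)
    # Dual update: accumulate each component's disagreement with the consensus.
    us = [u + x - z for u, x in zip(us, xs)]

print(z)  # the components end up agreeing on this shared solution
```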
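The abstract does not name the structured-sparsity regularizers; a common instance is the non-overlapping group lasso, whose proximal step is sketched below. The function name, group layout, and parameter values are assumptions for illustration, not the talk's actual regularizer or training algorithm.

```python
import numpy as np

def prox_group_lasso(w, groups, lam, step):
    """Proximal operator of the group-lasso penalty lam * sum_g ||w_g||_2.

    Block soft-thresholding: a group whose norm falls below the threshold is
    zeroed out entirely, which is how structured sparsity discards whole
    feature templates rather than individual features.
    """
    w = w.copy()
    for g in groups:                            # g indexes one feature group
        norm = np.linalg.norm(w[g])
        if norm <= lam * step:
            w[g] = 0.0                          # prune the whole group
        else:
            w[g] *= 1.0 - lam * step / norm     # shrink the group towards zero
    return w

# Toy usage: nine weights split into three groups (three feature templates).
rng = np.random.default_rng(0)
w = rng.normal(size=9)
groups = [np.arange(0, 3), np.arange(3, 6), np.arange(6, 9)]
print(prox_group_lasso(w, groups, lam=1.0, step=0.5))  # weak groups become exactly zero
```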