UW - MSR Machine Learning workshop 2015 - Session 2

Published 22 June 2016, 1:34
11:00 Sure Screening for Gaussian Graphical Models - Daniela Witten
In an undirected graphical model, the nodes represent random variables, and an edge represents conditional dependence between the corresponding pair of nodes. In recent years, the statistical and machine learning communities have focused considerable attention on the problem of learning the structure of a graph in the high-dimensional setting. We propose graphical sure screening, or GRASS, an extremely simple and computationally efficient screening procedure for this task: we simply threshold the elements of the sample covariance matrix (a minimal illustrative sketch appears after the abstracts below). With very high probability, the GRASS estimated edge set contains the true edge set; furthermore, we can control the size of the estimated edge set and the rate of false positives. In empirical studies, GRASS also outperforms much more complex and computationally demanding state-of-the-art estimators for high-dimensional graph estimation.

11:25 Quantum Deep Learning - Ashish Kapoor
In recent years, deep learning has had a profound impact on machine learning and artificial intelligence. An important open question is whether quantum computing will lead to improved methods for deep learning. We show that quantum computing can not only reduce the time required to train a deep restricted Boltzmann machine, but also provides a much more natural environment for deep learning than classical computing does. Specifically, it obviates the need to use the contrastive divergence approximation, to impose directionality on the graphical model, or to use greedy layer-by-layer training, thereby significantly improving the optimization of the underlying objective function. Our quantum approach also allows richer classes of models for deep learning, including full Boltzmann machines and multilayer, fully connected models, to be trained efficiently. This is joint work with Nathan Wiebe and Krysta Svore.

11:50 Spotlight: Learning Graphical Models with Hubs - Kean Ming Tan
We consider the problem of learning a high-dimensional graphical model in which a few hub nodes are densely connected to many other nodes. Many authors have studied the use of an l1 penalty to learn a sparse graph in the high-dimensional setting. However, the l1 penalty implicitly assumes that each edge is equally likely and independent of all other edges. We propose a general framework to accommodate more realistic networks with hub nodes, using a convex formulation that involves a row-column overlap norm penalty. We apply this general framework to three widely used probabilistic graphical models: the Gaussian graphical model, the covariance graph model, and the binary Ising model. An alternating direction method of multipliers algorithm is used to solve the corresponding convex optimization problems. On synthetic data, we demonstrate that our proposed framework outperforms competitors that do not explicitly model hub nodes. We illustrate our proposal on a webpage data set and a gene expression data set.

11:55 Spotlight: Recursive Decomposition for Nonconvex Optimization - Abram Friesen
Continuous optimization is an important problem in many areas of AI, including vision, robotics, probabilistic inference, and machine learning. Unfortunately, most real-world optimization problems are nonconvex, causing standard convex techniques to find only local optima, even with extensions like random restarts and simulated annealing.
We observe that, in many cases, the local modes of the objective function have combinatorial structure, and thus ideas from combinatorial optimization can be brought to bear. Based on this, we propose a problem-decomposition approach to nonconvex optimization. Similarly to DPLL-style SAT solvers and recursive conditioning in probabilistic inference, our algorithm, RDIS, recursively sets variables so as to simplify and decompose the objective function into approximately independent sub-functions, until the remaining functions are simple enough to be optimized by standard techniques like gradient descent. The variables to set are chosen by graph partitioning, ensuring decomposition whenever possible. We show analytically that RDIS can solve a broad class of nonconvex optimization problems exponentially faster than gradient descent with random restarts. We observe experimentally that RDIS outperforms standard techniques on problems like structure from motion and protein folding.
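
The screening step in the GRASS abstract amounts to thresholding the off-diagonal entries of the sample covariance matrix. Below is a minimal sketch of that step in Python, assuming a user-chosen threshold tau; the function name grass_screen and the toy data are illustrative only, and the principled threshold choices and error-rate guarantees discussed in the talk are not reproduced here.

import numpy as np

def grass_screen(X, tau):
    """GRASS-style screening sketch: flag as candidate edges all variable pairs
    whose sample covariance exceeds a threshold in absolute value.

    X   : (n, p) data matrix, one observation per row
    tau : screening threshold (treated here as a user-chosen tuning parameter)
    Returns a set of (j, k) index pairs with j < k.
    """
    S = np.cov(X, rowvar=False)        # p x p sample covariance matrix
    p = S.shape[0]
    return {(j, k)
            for j in range(p) for k in range(j + 1, p)
            if abs(S[j, k]) > tau}

# Toy usage: a chain x1 -> x2 -> x3 plus an independent x4.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.5 * rng.normal(size=500)
x3 = x2 + 0.5 * rng.normal(size=500)
x4 = rng.normal(size=500)
X = np.column_stack([x1, x2, x3, x4])
print(grass_screen(X, tau=0.3))        # pairs within {0, 1, 2}; nothing touches 3

Because screening is based on marginal covariances, the returned set is expected to be a superset of the true conditional-dependence edges (the toy chain also yields the (0, 2) pair); the guarantee stated in the abstract is containment of the true edge set with high probability, not exact recovery.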
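
The RDIS idea of fixing a small separator of variables so that the remaining objective splits into independent sub-functions can be illustrated on a toy additive objective. The sketch below is not the authors' implementation: it replaces graph partitioning with a greedy most-connected-variable heuristic and replaces gradient descent on the sub-functions with a coarse grid search, purely to keep the decompose-and-recurse structure visible. All names (rdis_sketch, split_components, choose_separator) are hypothetical.

from itertools import product

def split_components(terms, fixed):
    """Group terms into sub-problems that share no free (unassigned) variable."""
    groups = []  # each entry: (set of free variables, list of terms)
    for tvars, fn in terms:
        fv = {v for v in tvars if v not in fixed}
        new_vars, new_terms, keep = set(fv), [(tvars, fn)], []
        for gvars, gterms in groups:
            if gvars & fv:                       # shares a free variable: merge
                new_vars |= gvars
                new_terms += gterms
            else:
                keep.append((gvars, gterms))
        groups = keep + [(new_vars, new_terms)]
    return groups

def choose_separator(terms, fixed, k=1):
    """Greedy stand-in for graph partitioning: pick the k free variables that
    touch the most terms, hoping their removal disconnects the rest."""
    counts = {}
    for tvars, _ in terms:
        for v in tvars:
            if v not in fixed:
                counts[v] = counts.get(v, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)[:k]

def rdis_sketch(terms, variables, fixed=None, grid=(-1.0, 0.0, 1.0)):
    """Minimize sum(fn(assignment)) over the terms by recursive decomposition.

    terms     : list of (variable-name tuple, fn), where fn maps a dict of
                variable values to a real number
    variables : variable names to assign
    grid      : candidate values tried for separator and leaf variables (a
                stand-in for the continuous optimization used by RDIS proper)
    Returns (best objective value, best assignment dict).
    """
    fixed = dict(fixed or {})
    free = [v for v in variables if v not in fixed]
    if len(free) <= 2:                            # base case: tiny brute force
        best_val, best_asgn = float("inf"), dict(fixed)
        for vals in product(grid, repeat=len(free)):
            cand = {**fixed, **dict(zip(free, vals))}
            val = sum(fn(cand) for _, fn in terms)
            if val < best_val:
                best_val, best_asgn = val, cand
        return best_val, best_asgn
    sep = choose_separator(terms, fixed, k=1)
    best_val, best_asgn = float("inf"), None
    for vals in product(grid, repeat=len(sep)):
        cand = {**fixed, **dict(zip(sep, vals))}
        total, assign = 0.0, dict(cand)
        # with the separator fixed, the sub-problems are independent
        for comp_vars, comp_terms in split_components(terms, cand):
            v, a = rdis_sketch(comp_terms, sorted(comp_vars), cand, grid)
            total += v
            assign.update(a)
        if total < best_val:
            best_val, best_asgn = total, assign
    return best_val, best_asgn

# Toy objective: (x1 - x2)^2 + (x2 - x3 - 1)^2 + (x4 + 1)^2.
# Fixing x2 splits it into independent pieces over x1, x3, and x4.
terms = [
    (("x1", "x2"), lambda a: (a["x1"] - a["x2"]) ** 2),
    (("x2", "x3"), lambda a: (a["x2"] - a["x3"] - 1) ** 2),
    (("x4",),      lambda a: (a["x4"] + 1) ** 2),
]
print(rdis_sketch(terms, ["x1", "x2", "x3", "x4"]))

The abstract's analytical claim of exponential speedups over gradient descent with random restarts is not reproduced here; the sketch only shows why fixing a well-chosen separator lets the remaining sub-functions be optimized independently.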