Microsoft Research
Published October 20, 2016, 20:26
We provide tight upper and lower bounds on the complexity of minimizing the average of $m$ convex functions using gradient and prox information for the component functions. We show a significant gap between the complexity of deterministic vs. randomized optimization. For smooth functions, we show that accelerated gradient descent and an accelerated variant of SVRG are optimal in the deterministic and randomized settings respectively, and that a gradient oracle is sufficient for that optimal rate. For non-smooth functions, having access to a prox oracle reduces the complexity, and we present optimal methods based on smoothing AGD that improve over methods using only gradient accesses.
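As a rough illustration of the finite-sum setting discussed in the talk, the sketch below runs plain (unaccelerated) SVRG on a synthetic least-squares objective $F(x) = \frac{1}{m}\sum_i f_i(x)$. It is not the accelerated variant analyzed in the talk; the data, step size, and epoch length are assumptions chosen only for illustration.

```python
import numpy as np

# Minimal SVRG sketch for F(x) = (1/m) * sum_i f_i(x), with smooth convex
# components f_i(x) = 0.5 * (a_i @ x - b_i)^2 (synthetic least squares).
# Hyperparameters below are illustrative assumptions, not tuned values.

rng = np.random.default_rng(0)
m, d = 200, 10
A = rng.standard_normal((m, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(m)

def grad_i(x, i):
    """Gradient of the i-th component f_i."""
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    """Gradient of the average F."""
    return A.T @ (A @ x - b) / m

def svrg(x0, step=0.01, epochs=30, inner=2 * m):
    x_ref = x0.copy()
    for _ in range(epochs):
        g_ref = full_grad(x_ref)      # full gradient at the reference point
        x = x_ref.copy()
        for _ in range(inner):
            i = rng.integers(m)
            # variance-reduced stochastic gradient estimate
            g = grad_i(x, i) - grad_i(x_ref, i) + g_ref
            x -= step * g
        x_ref = x                     # use the last inner iterate as new reference
    return x_ref

x_hat = svrg(np.zeros(d))
print("||grad F(x_hat)|| =", np.linalg.norm(full_grad(x_hat)))
```

Each epoch computes one full gradient plus $O(m)$ cheap component-gradient corrections, which is what gives variance-reduced methods their advantage over deterministic full-gradient descent in the randomized setting.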
See more on this video at microsoft.com/en-us/research/v...