Oral Session 9

Microsoft Research

Published July 7, 2016, 23:54

Neural Reinforcement Learning - Reinforcement learning has become a wide and deep conduit that links ideas and results in computer science, statistics, control theory and economics to psychological data on animal and human decision-making, and the neural basis of choice. There is a ready and free flow of ideas among these disciplines, providing a powerful foundation for exploring some of the complexities of both normal and abnormal behaviours. I will outline some of the happy circumstances that led us to this point; discuss current computational, algorithmic and implementational themes; and provide some pointers to the future.

Actor-Critic Algorithms for Risk-Sensitive MDPs - In many sequential decision-making problems we may want to manage risk by minimizing some measure of variability in rewards in addition to maximizing a standard criterion. Variance-related risk measures are among the most common risk-sensitive criteria in finance and operations research. However, optimizing many such criteria is known to be a hard problem. In this paper, we consider both discounted and average reward Markov decision processes. For each formulation, we first define a measure of variability for a policy, which in turn gives us a set of risk-sensitive criteria to optimize. For each of these criteria, we derive a formula for computing its gradient. We then devise actor-critic algorithms for estimating the gradient and updating the policy parameters in the ascent direction. We establish the convergence of our algorithms to locally risk-sensitive optimal policies. Finally, we demonstrate the usefulness of our algorithms in a traffic signal control application.
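To make the idea of "a measure of variability for a policy" concrete, here is a minimal Python sketch. It is not the actor-critic algorithm from the talk: it only estimates the mean and variance of the discounted return by plain Monte Carlo simulation and combines them in a variance-penalized score. The one-state problem, the reward values, the `risk_weight` parameter, and all function names are hypothetical and chosen purely for illustration.

```python
import random


def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one trajectory."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def sample_reward(p_risky, rng):
    """Hypothetical one-state problem: the safe action pays 1.0; the risky
    action pays 0.0 or 2.2 with equal probability (mean 1.1)."""
    if rng.random() < p_risky:
        return 2.2 if rng.random() < 0.5 else 0.0
    return 1.0


def estimate_mean_variance(p_risky, gamma=0.9, horizon=20, episodes=5000, seed=0):
    """Monte Carlo estimate of the mean and variance of the discounted return
    for a policy that picks the risky action with probability p_risky."""
    rng = random.Random(seed)
    returns = [
        discounted_return([sample_reward(p_risky, rng) for _ in range(horizon)], gamma)
        for _ in range(episodes)
    ]
    mean = sum(returns) / episodes
    variance = sum((g - mean) ** 2 for g in returns) / episodes
    return mean, variance


def penalized_objective(mean, variance, risk_weight):
    """Assumed criterion for illustration: expected return minus a weighted
    variance penalty. Not taken from the talk."""
    return mean - risk_weight * variance
```

In this toy setting the always-risky policy earns a higher expected return than the always-safe one, but its return also varies much more, so a sufficiently large `risk_weight` reverses the ranking. A risk-sensitive method of the kind described above would adjust the policy parameters by gradient steps on such a criterion rather than comparing two fixed policies.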
