Scalable Trust-Region Method for Deep Reinforcement Learning Using Kronecker-Factored Approximation

Microsoft Research

Published October 11, 2017, 20:47

In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature. We extend the framework of natural policy gradient and propose to optimize both the actor and the critic using Kronecker-factored approximate curvature (K-FAC) with a trust region; hence we call our method Actor Critic using Kronecker-Factored Trust Region (ACKTR). To the best of our knowledge, this is the first scalable trust region natural gradient method for actor-critic methods. The method also learns non-trivial continuous control tasks, as well as discrete control policies, directly from raw pixel inputs. We tested our approach on discrete domains in Atari games and on continuous domains in the MuJoCo environment. With the proposed method, we achieve higher rewards and a 2- to 3-fold improvement in sample efficiency on average, compared to previous state-of-the-art on-policy actor-critic methods.
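To make the idea in the abstract more concrete, the following is a minimal sketch of a Kronecker-factored natural-gradient step with a trust-region step size for a single linear layer. It is not the authors' implementation. It assumes the usual K-FAC factorization of the layer's Fisher matrix into an input-activation factor A and an output-gradient factor G, and it assumes the step size is capped so that the quadratic estimate of the policy change stays within a radius delta. The function names, the damping term, and the default values of delta and eta_max are illustrative assumptions, not values taken from the talk.

```python
# Sketch only: K-FAC-style natural gradient with a trust-region step size
# for ONE linear layer. All names and default values are hypothetical.

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def second_moment(rows, damping):
    """Return (1/N) * sum_n r_n r_n^T + damping * I for a list of N row vectors."""
    n, d = len(rows), len(rows[0])
    M = matmul(transpose(rows), rows)
    return [[M[i][j] / n + (damping if i == j else 0.0) for j in range(d)]
            for i in range(d)]

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (adequate for small damped matrices)."""
    d = len(M)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(d)]
           for i, row in enumerate(M)]
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(d):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[d:] for row in aug]

def kfac_trust_region_step(acts, grads, grad_W, delta=0.01, eta_max=0.25, damping=1e-2):
    """acts: N x in layer inputs; grads: N x out backpropagated gradients;
    grad_W: out x in loss gradient. Returns (weight update, step size, quadratic form)."""
    A = second_moment(acts, damping)    # in x in factor
    G = second_moment(grads, damping)   # out x out factor
    # Natural-gradient direction under F ~ A (x) G: G^{-1} grad_W A^{-1}
    nat = matmul(matmul(inverse(G), grad_W), inverse(A))
    # vec(nat)^T (A (x) G) vec(nat) equals the elementwise sum of nat * (G nat A)
    GnA = matmul(matmul(G, nat), A)
    quad = sum(n * m for rn, rm in zip(nat, GnA) for n, m in zip(rn, rm))
    # Cap the step so that 0.5 * eta^2 * quad <= delta (assumed trust-region rule)
    eta = eta_max if quad <= 0.0 else min(eta_max, (2.0 * delta / quad) ** 0.5)
    update = [[-eta * v for v in row] for row in nat]
    return update, eta, quad
```

The two small factors are inverted separately, so the full Fisher matrix of the layer is never formed. This is what makes the approach scale to larger networks.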

See more on this video at microsoft.com/en-us/research/v...
