End-to-end Reinforcement Learning for the Large-scale Traveling Salesman Problem

Microsoft Research

Published December 12, 2022, 20:34

2022 Data-driven Optimization Workshop: End-to-end Reinforcement Learning for the Large-scale Traveling Salesman Problem

Speaker: Yan Jin, Huazhong University of Science and Technology

The Traveling Salesman Problem (TSP) is one of the most studied routing problems arising in practical logistics applications. Traditional approaches rely on hand-crafted expert rules and spend considerable time on iterative search, which limits their use in time-sensitive scenarios such as on-call routing and ride-hailing services. We propose an end-to-end approach based on hierarchical reinforcement learning for the large-scale TSP. Following a divide-and-conquer strategy, the upper-level policy chooses a small subset of cities from those that remain to be visited, while the lower-level policy applies a Transformer model to the chosen cities to solve a shortest-path problem with prescribed starting and ending cities. The two policies are jointly trained by reinforcement learning algorithms, and TSP solutions are generated directly, without any search procedure. The proposed approach takes advantage of the inference efficiency of the Transformer model and provides highly competitive results.
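The two-level control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the speaker's implementation: both learned policies are replaced by simple stand-ins, and every name here (`select_subset`, `solve_open_path`, `hierarchical_tour`, the parameter `k`) is hypothetical. The upper level is assumed to pick the `k` remaining cities nearest the current end of the tour, and the lower level is assumed to solve the fixed-endpoint path by brute force instead of a Transformer.

```python
import math
import random
from itertools import permutations


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def path_length(points, order):
    """Length of the open path that visits `order` in sequence."""
    return sum(dist(points[order[i]], points[order[i + 1]])
               for i in range(len(order) - 1))


def select_subset(points, remaining, anchor, k):
    """Stand-in for the upper-level policy (assumption, not learned):
    the k remaining cities closest to the current tour end, nearest first."""
    return sorted(remaining, key=lambda c: dist(points[anchor], points[c]))[:k]


def solve_open_path(points, start, end, interior):
    """Stand-in for the lower-level policy (assumption, not a Transformer):
    brute-force shortest path from `start` to `end` through `interior`."""
    best = min(permutations(interior),
               key=lambda p: path_length(points, [start, *p, end]))
    return [start, *best, end]


def hierarchical_tour(points, k=5):
    """Build a tour segment by segment, starting from city 0."""
    remaining = set(range(1, len(points)))
    tour = [0]
    while remaining:
        chosen = select_subset(points, remaining, tour[-1], k)
        # The farthest chosen city serves as the prescribed ending city.
        end = chosen[-1]
        segment = solve_open_path(points, tour[-1], end, chosen[:-1])
        tour.extend(segment[1:])  # segment[0] is already the tour end
        remaining.difference_update(chosen)
    return tour


random.seed(0)
cities = [(random.random(), random.random()) for _ in range(30)]
tour = hierarchical_tour(cities, k=5)
closed_length = path_length(cities, tour) + dist(cities[tour[-1]], cities[tour[0]])
print(tour, round(closed_length, 3))
```

Keeping `k` small bounds the cost of each lower-level call, which is the point of the divide-and-conquer split. In the approach described above, both stand-ins would be replaced by jointly trained policies.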
