End-to-end Reinforcement Learning for the Large-scale Traveling Salesman Problem

1 232
18.7
Опубликовано 12 декабря 2022, 20:34
2022 Data-driven Optimization Workshop: End-to-end Reinforcement Learning for the Large-scale Traveling Salesman Problem

Speaker: Yan Jin, Huazhong University of Science and Technology

Traveling Salesman Problem (TSP) is one of the most studied routing problems that arise in the practical applications of logistics. Traditional approaches not only rely on hand-crafted rules of experts, but also are time-consuming on iterative search. This limits their applications in time sensitive scenarios, e.g., on-call routing and ride hailing service. We propose an end-to-end approach based on hierarchical reinforcement learning for addressing the large-scale TSP. Using a divide-and-conquer strategy, the upper-level policy chooses a small subset of cities from all remaining cities that are to be traversed, while the lower-level policy takes a Transformer model on the chosen cities to solve a shortest path with prescribed starting and ending cities. These two policies are jointly trained by reinforcement learning algorithms, and the TSP solutions can be directly generated without any search procedure. The proposed approach takes advantage of inference efficiency of Transformer model and provides highly competitive results.
автотехномузыкадетское