Publications

Conference Paper

Utilizing reinforcement learning to continuously improve a primitive-based motion planner

Goddard, Zachary C.; Wardlaw, Kenneth; Krishnan, Rohith; Tsiotras, Panagiotis; Smith, Michael R.; Sena, Mary R.; Parish, Julie M.; Mazumdar, Anirban

This paper describes how the performance of motion-primitive-based planning algorithms can be improved using reinforcement learning. Specifically, we describe and evaluate a framework for policy improvement via the discovery of new motion primitives. Our approach combines the predictable behavior of deterministic planning methods with the exploration capability of reinforcement learning. The framework consists of three phases: evaluation, exploration, and extraction. This framework can be iterated continuously to provide successive improvement. The evaluation step scores the performance of a motion primitive library using value iteration to create a cost map. A local difference metric is then used to identify regions that need improvement. The exploration step uses reinforcement learning to examine new trajectories in the regions of greatest need. The extraction step encodes the agent's experiences into new primitives. The framework is tested on a point-to-point navigation task using a 6-DOF nonlinear F-16 model. One iteration of the framework discovered 17 new primitives and provided a maximum planning-time reduction of 96.91%. After three full iterations, 123 primitives had been added, with a maximum time reduction of 97.39%. The proposed framework is easily extensible to a range of vehicles, environments, and cost functions.
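
The evaluate-explore-extract loop can be illustrated with a minimal sketch. The snippet below is not the authors' implementation: it replaces the 6-DOF F-16 dynamics with a toy 2-D grid, uses a straight-line lower bound as a stand-in for the paper's local difference metric, and stubs out the reinforcement-learning exploration step with a hypothetical new primitive. All function names, parameters, and costs are illustrative assumptions.

```python
# Minimal sketch of the evaluate-explore-extract loop on a toy 2-D grid.
# Everything here (grid world, lower-bound metric, stubbed RL step) is an
# illustrative assumption, not the paper's actual implementation.
import numpy as np

GRID = 20                       # small 2-D grid standing in for the state space
GOAL = (GRID - 1, GRID - 1)

def evaluate_library(primitives):
    """Evaluation phase: value iteration with the current primitive library
    produces a cost-to-go map over the grid."""
    cost = np.full((GRID, GRID), np.inf)
    cost[GOAL] = 0.0
    for _ in range(4 * GRID):                      # enough sweeps to converge
        for x in range(GRID):
            for y in range(GRID):
                for dx, dy, c in primitives:       # primitive = (dx, dy, cost)
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < GRID and 0 <= ny < GRID:
                        cost[x, y] = min(cost[x, y], c + cost[nx, ny])
    return cost

def regions_needing_improvement(cost, k=5):
    """Simplified local difference metric: compare the library's cost map
    against a straight-line lower bound and return the k worst cells."""
    xs, ys = np.meshgrid(np.arange(GRID), np.arange(GRID), indexing="ij")
    lower_bound = np.hypot(GOAL[0] - xs, GOAL[1] - ys)
    gap = np.where(np.isfinite(cost), cost - lower_bound, np.inf)
    worst = np.argsort(gap.ravel())[-k:]
    return [np.unravel_index(i, gap.shape) for i in worst]

def explore_and_extract(region):
    """Exploration + extraction phases, stubbed: an RL agent would roll out
    trajectories near `region`, and its experience would be encoded as a new
    primitive. Here we simply return a hypothetical diagonal primitive."""
    return (1, 1, 1.2)

# One pass of the framework per loop iteration: evaluate, explore, extract.
library = [(1, 0, 1.0), (0, 1, 1.0)]              # initial axis-aligned moves
for iteration in range(3):
    cost_map = evaluate_library(library)
    for region in regions_needing_improvement(cost_map):
        new_primitive = explore_and_extract(region)
        if new_primitive not in library:
            library.append(new_primitive)
    print(f"iteration {iteration}: {len(library)} primitives, "
          f"worst finite cost {np.max(cost_map[np.isfinite(cost_map)]):.2f}")
```

In this toy setting, adding the cheaper diagonal primitive lowers the cost-to-go across the grid, mirroring how the paper's discovered primitives reduce planning time; the real framework performs the exploration step with a reinforcement-learning agent acting through the nonlinear F-16 model rather than a hand-written stub.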