Reinforcement Learning-Based Guidance of Autonomous Vehicles

Joseph Clemmons and Yufang Jin
University of Texas at San Antonio


Reinforcement learning (RL) has attracted significant research interest as a way to guide an autonomous vehicle (AV) along a collision-free path, because it can account for interactions among multiple vehicles and a dynamic environment. This study deploys a Deep Q-Network (DQN)-based RL algorithm with reward shaping to control an ego AV in an environment with multiple vehicles. Specifically, the state space of the RL algorithm depends on the desired destination, the ego vehicle's location and orientation, and the locations of the other vehicles in the system. The proposed RL algorithm trains in much less time than most current image-based algorithms. It also provides an extensible framework that can accommodate different numbers of vehicles in the environment and can be adapted to different maps without changing the setup of the RL algorithm. Three scenarios were simulated to validate the effectiveness of the proposed RL algorithm in guiding the ego AV as it interacts with multiple vehicles on straight and curved roads. Our results showed that the ego AV learned to reach its destination within 5,000 episodes in all scenarios tested.
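To make the state description concrete, the following is a minimal Python sketch of how a low-dimensional, non-image state vector could be assembled from the quantities named above, together with one possible shaped reward. It is illustrative only: the abstract does not specify the encoding, so the function names `build_state` and `shaped_reward`, the ordering of the entries, the zero-padding used to handle different numbers of vehicles, and every reward constant are assumptions rather than the method used in this study.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def build_state(
    ego_xy: Point,
    ego_heading: float,
    dest_xy: Point,
    others_xy: Sequence[Point],
    max_others: int,
) -> List[float]:
    """Flatten the scene into a fixed-length state vector (hypothetical layout).

    Layout: [ego_x, ego_y, ego_heading, dest_x, dest_y, x_1, y_1, ..., x_n, y_n],
    where n = max_others. Unused vehicle slots are zero-padded (an assumption)
    so the vector length does not change with the number of vehicles present.
    """
    state = [ego_xy[0], ego_xy[1], ego_heading, dest_xy[0], dest_xy[1]]
    for i in range(max_others):
        if i < len(others_xy):
            state.extend([others_xy[i][0], others_xy[i][1]])
        else:
            state.extend([0.0, 0.0])  # padding for an absent vehicle
    return state


def shaped_reward(
    prev_dist: float, new_dist: float, collided: bool, reached: bool
) -> float:
    """One possible shaped reward (all constants are assumed, not from the paper).

    Terminal events dominate; otherwise the agent receives a small bonus
    proportional to the progress made toward the destination in this step.
    """
    if collided:
        return -1.0
    if reached:
        return 1.0
    return 0.01 * (prev_dist - new_dist)


# Example: one other vehicle in a scene that allows up to two.
state = build_state((0.0, 0.0), 0.0, (10.0, 0.0), [(5.0, 1.0)], max_others=2)
```

Under these assumptions the state has 5 + 2 * `max_others` entries regardless of how many vehicles are present, which is one simple way a DQN with a fixed input size could be reused across scenarios with different vehicle counts.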