Abstract:
Stable controllers are the basic unit of steady robot navigation, and the selection of control values is highly environment dependent. Generalization is the key to reusing control parameters, enabling robots to adapt and perform reliably in environments about which they have no prior knowledge; Reinforcement Learning (RL) based control systems are promising for this purpose. However, tuning appropriate parameters to train an RL algorithm is a challenge. We therefore designed a continuous reward function that minimizes reward sparsity and stabilizes policy convergence, in order to attain control generalization for a differential drive robot. We implemented Twin Delayed Deep Deterministic Policy Gradient (TD3) on the OpenAI Gym Race Car environment. The system was trained to learn a smart primitive control policy: moving forward toward the goal while maintaining an appropriate distance from walls to avoid collisions. The resulting policy was tested on unseen environments and performed precisely. A comparative analysis of TD3 and DDPG showed that the TD3 policy outperformed the DDPG policy in both the training and testing phases, indicating that TD3 is more resource efficient and stable.