Trajectory optimization to reduce the radar exposure of aerial vehicles is an active area
of research. Revisiting the radar avoidance tactics through intelligent algorithms may
improve the existing stealth paradigm. This research discusses the use of deep
reinforcement learning algorithms to obtain optimal paths for an aerial vehicle with the
aim of avoiding or minimizing radar detection and tracking. A modular approach is adopted
in formulating the problem, which comprises an aircraft kinematics module, an aircraft
radar cross-section (RCS) model, and a radar tracking model. An OpenAI Gym-based environment
is designed for single- and multiple-radar cases to obtain optimal paths. Three deep
reinforcement learning (DRL) algorithms, namely deep deterministic policy gradient (DDPG),
trust region policy optimization (TRPO), and proximal policy optimization (PPO), are used
to find optimal paths for five test cases. The comparison is carried out based on six
performance indicators. The results demonstrate the usefulness of these RL algorithms
for optimal path planning. Overall, PPO and TRPO outperformed DDPG. PPO generally
produced better paths, whereas TRPO demonstrated the fastest convergence.
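
To make the modular formulation concrete, the following is a minimal sketch of a Gym-style environment (a `reset`/`step` interface) written in plain Python. It is not the environment used in this study: the class name `RadarAvoidanceEnv`, the planar constant-speed kinematics, the constant RCS, the reward weights, and all default values are assumptions made purely for illustration. The only physical relation it relies on is that the radar echo power scales with RCS / R^4, as in the standard radar range equation.

```python
import math


class RadarAvoidanceEnv:
    """Hypothetical 2D sketch with a Gym-style reset/step interface."""

    def __init__(self, radars=((5.0, 0.0),), goal=(10.0, 0.0),
                 speed=0.5, dt=1.0, rcs=1.0, max_steps=200):
        self.radars = radars        # radar positions (x, y); assumed values
        self.goal = goal            # target position (x, y); assumed value
        self.speed = speed          # constant speed (assumption)
        self.dt = dt                # integration time step
        self.rcs = rcs              # constant RCS (assumption; real RCS is aspect dependent)
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        # Start at the origin, heading along the +x axis.
        self.x, self.y, self.heading = 0.0, 0.0, 0.0
        self.steps = 0
        return (self.x, self.y, self.heading)

    def exposure(self):
        # Sum of RCS / R^4 over all radars (proportional to echo power).
        total = 0.0
        for rx, ry in self.radars:
            r = max(math.hypot(self.x - rx, self.y - ry), 1e-3)
            total += self.rcs / r ** 4
        return total

    def step(self, turn_rate):
        # Kinematics: clip the commanded turn rate, then integrate the pose.
        turn_rate = max(-0.5, min(0.5, turn_rate))
        self.heading += turn_rate * self.dt
        self.x += self.speed * math.cos(self.heading) * self.dt
        self.y += self.speed * math.sin(self.heading) * self.dt
        self.steps += 1

        dist = math.hypot(self.x - self.goal[0], self.y - self.goal[1])
        reached = dist < self.speed
        exposure = self.exposure()
        # Reward: penalize exposure and elapsed time, reward reaching the goal.
        reward = -exposure - 0.01 + (10.0 if reached else 0.0)
        done = reached or self.steps >= self.max_steps
        obs = (self.x, self.y, self.heading)
        return obs, reward, done, {"exposure": exposure}
```

A DRL agent such as DDPG, TRPO, or PPO would call `reset()` once per episode and then `step(turn_rate)` repeatedly, learning to trade path length against the accumulated exposure penalty. Adding entries to `radars` gives a multiple-radar case.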