dc.description.abstract |
The emergence of the Metaverse introduces a new paradigm of interconnected virtual
spaces, fostering social interactions and immersive experiences. However, this virtual
realm is not devoid of privacy challenges, ranging from potential stalking and avatar
staring to more sophisticated threats such as identity theft and unauthorized data access. This research addresses these concerns through the lens of deep reinforcement
learning, leveraging Unity's ML-Agents toolkit within a simulated supermarket environment. The proposed approach trains reinforcement learning (RL) agents
using the Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) algorithms to
detect and respond to follower and staring threats in the Metaverse. Results indicate
that PPO exhibits faster convergence, with a decision requester interval of
1 showing the most promising outcomes. For detection of the follower threat, PPO converges
at 45,000 steps in 7.95 min with a mean reward of 1, while for the staring threat
PPO converges in 8.89 min. Hyperparameter tuning of PPO for the follower threat yields
a marginal improvement, reducing convergence time to 7.71 min, while fine-tuning
PPO for the staring threat, specifically adjusting the learning rate and beta, improves results
and expedites convergence to 8.85 min. SAC, by contrast, displays a slower but
steadily increasing reward trajectory. The research highlights PPO
as the preferred algorithm for its efficiency in training RL agents. |
en_US |