Abstract:
Employing transformer-based architectures in image inpainting has significantly advanced the quality of generated results. By leveraging self-attention mechanisms, transformers capture long-range dependencies within an image, making them particularly effective at restoring missing regions coherently. Recent developments, such as the HINT framework, have introduced enhanced attention mechanisms that further improve inpainting performance by incorporating mask-aware encoding. However, transformer models often struggle to process high-resolution images due to their substantial hardware requirements, which limits their usability in broader applications and real-time scenarios. Reducing the image resolution causes information loss, which degrades inpainting quality by producing blurred artifacts and vague structures in the reconstructed output. We propose two models, HINT Initial and HINT Optimized. HINT Initial employs transfer learning. HINT Optimized leverages advanced hyperparameter tuning (using Keras Tuner to optimize model parameters for performance) and architectural refinements (an MPD module and SCAL, which enhance image inpainting by improving attention and downsampling). Our methods are evaluated on two benchmark datasets, Places2 and CelebA-HQ. Simulation experiments validated the proposed methods, which showed significant improvements over state-of-the-art image inpainting models. Notably, HINT Optimized effectively captured the complex relationships between pixels on both datasets. On the CelebA-HQ dataset, HINT Initial improved L1↓ (loss), FID↓ (Fréchet Inception Distance), and LPIPS↓ (Learned Perceptual Image Patch Similarity), whereas HINT Optimized improved PSNR↑ (Peak Signal-to-Noise Ratio) and SSIM↑ (Structural Similarity Index Measure). On the Places2 dataset, HINT Initial improved L1↓ and LPIPS↓, while HINT Optimized showed improvements on PSNR↑, FID↓, and SSIM↑. Both models demonstrated significant improvements in accuracy and loss metrics, reflecting enhanced performance and a more efficient learning process.