Abstract:
Unconstrained offline handwriting recognition is challenging in general and
particularly difficult for Arabic-like scripts, and it remains an active research area.
The recent use of Transformer models for English handwriting recognition has shown
promising results. The proposed solution places a Convolutional Neural Network
before a vanilla Transformer architecture and uses printed Urdu text along with
handwritten text during training. The convolutional blocks decrease the spatial
resolution to compensate for the quadratic, O(n^2), complexity of the Transformer’s attention layers.
Moreover, the use of printed text along with handwritten text aids in learning diverse
ligatures and a better language model for the transformer during training. On the
publicly accessible NUST-UHWR dataset, the proposed model achieves state-of-the-art
accuracy with a character error rate (CER) of 5.31 percent.
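
As a rough illustration of why reducing spatial resolution before the Transformer matters, the sketch below counts the pairwise attention scores for a feature map flattened into a sequence of length n. The 64 x 512 input size and the 8x reduction per axis are assumptions chosen for the arithmetic only; they are not taken from the proposed model.

```python
def attention_pairs(height, width, downsample=1):
    """Count pairwise attention scores (n^2) for a feature map that is
    flattened into a sequence of n = (height/downsample) * (width/downsample)
    positions."""
    n = (height // downsample) * (width // downsample)
    return n * n


# Assumed sizes, for illustration only.
full = attention_pairs(64, 512)                    # no spatial reduction
reduced = attention_pairs(64, 512, downsample=8)   # 8x reduction per axis

# Shrinking each axis by a factor k shrinks n by k^2 and the n^2 cost by k^4.
print(full // reduced)
```

Under these assumed numbers, the attention cost falls by a factor of 8^4, which is the effect the convolutional blocks are meant to provide.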