Abstract:
Advancements in artificial intelligence over the last decade have transformed
numerous fields, including biotechnology. Recent developments in deep learning
have led to the creation of models capable of generating antibody sequences with
remarkable efficiency. These models, built on architectures adapted from natural
language processing (NLP) and trained on extensive protein sequence datasets, exploit
the information encoded in those sequences, from structural conformations to binding affinities. By
leveraging deep learning, these methods can potentially reduce the reliance on
traditional, resource-intensive experimental procedures for antibody development.
However, generating antibody sequences specific to a given antigen remains an open
problem. This study introduces AbAtT5, a transformer-based model designed to generate
antigen-specific antibodies from full-length antigen sequences. AbAtT5 is obtained by
fine-tuning ProtT5, a large protein language model, and exploits transfer learning by
updating the weights and biases of the pre-trained model. AbAtT5 outperformed existing
models such as HERN and EAGLE, achieving improvements of up to 1.88% in VAAR and
18.04% in SeqID. These findings underscore the model's potential to accelerate the
antibody design process by generating sequences more accurately. The higher sequence
identity and alignment rates achieved by AbAtT5 highlight its promise as a tool for
computational antibody generation, offering a more efficient approach to identifying
potent antigen-specific antibody candidates.