Abstract:
With the rise of computational methods in medicine, there have been several important
breakthroughs. Virtual screening, molecular docking and molecular dynamics simulations have revolutionized the field of drug design over the decades. More recently,
artificial intelligence has also made major contributions to drug design. However, the
problem of computational chemistry is a combinatorial one: molecular function is nonlinear and combinatorial in nature. Finding a relationship between chemical space and
functional space has been quite challenging. Fortunately, deep reinforcement learning
provides some hope in approaching this problem. Earlier work, named MOLDQN,
used a discrete, deterministic approach to modeling molecules from scratch; this thesis
aims to use a more general, probabilistic approach. It uses the Actor-Critic
formulation in reinforcement learning to explore the chemical space in terms of the
quantitative estimate of drug-likeness (QED), the Tanimoto index, and a newly designed
diversity score that penalizes highly similar molecules. Results from the algorithm
show that the system can learn to model chemical bonds better than earlier work; however, the system cannot model aromatic rings accurately. This may be because
the three-dimensional nature of resonance structures is not captured by the Morgan
fingerprint that the algorithm uses.