Can training on artificial binding pockets enhance the performance of a generative diffusion model for molecular docking?

Original title: Boosting performance of generative diffusion model for molecular docking by training on artificial binding pockets

Authors: Taras Voitsitskyi, Volodymyr Bdzhola, Roman Stratiichuk, Ihor Koleiev, Zakhar Ostrovsky, Volodymyr Vozniak, Ivan Khropachov, Pavlo Henitsoi, Leonid Popryho, Roman Zhytar, Semen O Yesylevskyy, Alan Nafiev, Serhii Starosyla

The article introduces PocketCFDM, a new generative diffusion model for predicting the poses of small molecules in protein binding pockets. The model relies on a data augmentation technique: artificial binding pockets are built to mimic the statistical patterns of non-bond interactions observed in real protein-ligand complexes. To that end, the authors developed an algorithm that assesses these interaction patterns and reproduces them in the artificial binding pockets.
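
As a rough illustration of the augmentation idea described above, the following Python sketch places pseudo-atoms around a ligand at distances drawn from a histogram of contact distances. It is not the authors' algorithm: the function name `build_artificial_pocket`, the distance bins, their weights, and the 3.0 Å minimum separation are all hypothetical values chosen for the example, whereas the paper derives its statistics from real protein-ligand complexes.

```python
import math
import random


def random_unit_vector(rng):
    # Uniform direction on the sphere from normalized Gaussian samples.
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-9:
            return [c / norm for c in v]


def build_artificial_pocket(ligand_coords, distance_bins, bin_weights,
                            min_distance=3.0, attempts_per_atom=20, seed=0):
    """Place at most one pseudo-atom near each ligand atom.

    distance_bins and bin_weights stand in for an empirical contact-distance
    histogram (hypothetical values, not taken from the paper).
    """
    rng = random.Random(seed)
    pocket = []
    for atom in ligand_coords:
        for _ in range(attempts_per_atom):
            # Pick a histogram bin by weight, then a distance inside that bin.
            low, high = rng.choices(distance_bins, weights=bin_weights, k=1)[0]
            d = rng.uniform(low, high)
            u = random_unit_vector(rng)
            candidate = tuple(a + d * c for a, c in zip(atom, u))
            # Reject candidates that sit too close to any ligand atom.
            if all(math.dist(candidate, other) >= min_distance
                   for other in ligand_coords):
                pocket.append(candidate)
                break
    return pocket


# Toy ligand: three atoms on a line, coordinates in angstroms.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
bins = [(3.0, 3.5), (3.5, 4.0), (4.0, 4.5)]   # assumed distance bins
weights = [0.2, 0.5, 0.3]                     # assumed bin frequencies
pocket = build_artificial_pocket(ligand, bins, weights)
```

A real implementation would also have to assign atom or residue types and distinguish interaction classes such as hydrogen bonds and hydrophobic contacts; the sketch only captures the sample-and-reject pattern.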

The study reports that including artificial binding pockets in the training process significantly improves the model’s performance. PocketCFDM outperformed DiffDock in non-bond interaction quality, number of steric clashes, and inference speed.
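
To make the steric clash criterion concrete, here is a minimal Python sketch that counts ligand-protein atom pairs lying closer than the sum of their van der Waals radii minus a tolerance. The exact clash definition used in the paper is not given in this summary, so the function name `count_steric_clashes` and the 0.5 Å tolerance are assumptions; the radii are the commonly cited Bondi values.

```python
import math

# Bondi van der Waals radii in angstroms for a few common elements.
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def count_steric_clashes(ligand_atoms, protein_atoms, overlap_tolerance=0.5):
    """Count ligand-protein atom pairs that overlap too strongly.

    Atoms are (element, (x, y, z)) tuples. overlap_tolerance is an assumed
    value, not the threshold used by the authors.
    """
    clashes = 0
    for lig_element, lig_xyz in ligand_atoms:
        for prot_element, prot_xyz in protein_atoms:
            limit = (VDW_RADII[lig_element] + VDW_RADII[prot_element]
                     - overlap_tolerance)
            if math.dist(lig_xyz, prot_xyz) < limit:
                clashes += 1
    return clashes


# Toy example: two ligand atoms, one nearby and one distant protein atom.
ligand_atoms = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
protein_atoms = [("N", (2.0, 0.0, 0.0)), ("C", (6.0, 0.0, 0.0))]
n_clashes = count_steric_clashes(ligand_atoms, protein_atoms)
```

The double loop is quadratic in the number of atoms, which is acceptable for a single pocket; a spatial grid or k-d tree would be the usual choice for large-scale evaluation.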

The authors also discuss future development and optimization of the model. The inference code and final model weights of PocketCFDM are publicly available in a GitHub repository.

All authors of the article are employees of Receptor.AI INC., and some of them hold shares in the company.

Original article: https://www.biorxiv.org/content/10.1101/2023.11.22.568238v1