Ligand-based virtual screening aims to reduce the cost and duration of drug discovery campaigns. Shape similarity can be used to screen large databases, with the goal of predicting potential new hits by comparison with molecules that have known favourable properties. RGMolSA is a new alignment-free and mesh-free surface-based molecular shape descriptor derived from the mathematical theory of Riemannian geometry. Treating a molecule as a series of intersecting spheres allows its surface geometry to be described using the Riemannian metric, obtained by considering the spectrum of the Laplacian. This gives a simple vector descriptor consisting of the weighted surface area and eight non-zero eigenvalues, which capture the surface shape. The full method is described here.
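To make the idea of comparing vector descriptors concrete, here is a minimal sketch of scoring two nine-element descriptors (surface area followed by eight eigenvalues). The Euclidean distance, the `1 / (1 + d)` score, the function names and the numbers are all assumptions chosen for illustration; they are not necessarily the distance or score used by RGMolSA.

```python
import math

def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.dist(a, b)

def similarity(a, b):
    """Map the distance to (0, 1]; identical descriptors score 1.0."""
    return 1.0 / (1.0 + descriptor_distance(a, b))

# Made-up values for illustration only: [surface area, eigenvalue 1..8]
query = [1.0, 0.50, 0.48, 0.45, 0.30, 0.28, 0.25, 0.20, 0.18]
candidate = [1.1, 0.52, 0.47, 0.44, 0.31, 0.27, 0.26, 0.21, 0.17]

print(similarity(query, query))      # identical descriptors
print(similarity(query, candidate))  # slightly different descriptors
```

Because the descriptor is a short fixed-length vector, a comparison like this needs no molecular alignment, which is what makes the approach attractive for screening large databases.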
RGMolSA should currently be considered a beta version under development. This initial sample runs for the supplied PDE5 inhibitor test sets (as discussed in the above paper), but is not guaranteed to work for all molecules.
Run the following in a terminal from the directory into which the software should be cloned:
git clone https://github.com/RPirie96/RGMolSA.git
The Jupyter notebook "run_RGMolSA.ipynb" runs the code for the examples provided in the paper. Before running it, change the paths specified in the first cell to point to the directory where the Python scripts and data were cloned.
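As a rough illustration of what setting the paths might involve, the sketch below checks that a clone directory exists and makes its Python modules importable. The helper name `add_clone_to_path` and the example location are hypothetical; the variable names actually used in the notebook's first cell may differ.

```python
import sys
from pathlib import Path

def add_clone_to_path(clone_dir):
    """Check that the clone directory exists and make its modules importable."""
    clone_dir = Path(clone_dir).expanduser().resolve()
    if not clone_dir.is_dir():
        raise FileNotFoundError(f"No such directory: {clone_dir}")
    if str(clone_dir) not in sys.path:
        sys.path.insert(0, str(clone_dir))
    return clone_dir

# Hypothetical usage; replace with wherever the repository was cloned:
# repo_dir = add_clone_to_path("~/RGMolSA")
```

Failing early with a clear error is usually easier to debug than an import error raised later in the notebook.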
- SVT.sdf: structure data file containing a single conformer for each of Sildenafil, Vardenafil and Tadalafil.
- X_confs_10.sdf: structure data files for Sildenafil, Vardenafil and Tadalafil, each containing 10 low energy conformers generated using RDKit.
- X_confs_10random.sdf: structure data files for Sildenafil, Vardenafil and Tadalafil, each containing 10 random conformers generated using RDKit.
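A quick way to sanity-check the data files listed above is to count the records in each one. The sketch below relies only on the SDF convention that every record ends with a line containing `$$$$`; the helper name `count_sdf_records` is an assumption for illustration and is not part of RGMolSA.

```python
def count_sdf_records(path):
    """Count records in an SDF file; each record ends with a '$$$$' line."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip() == "$$$$")

# Hypothetical usage, assuming the file is in the current directory.
# Per the list above, SVT.sdf should hold one record per molecule (3 in total):
# print(count_sdf_records("SVT.sdf"))
```

With RDKit installed, `Chem.SDMolSupplier` is the usual way to parse the molecules themselves; the function above only confirms that a file contains the expected number of conformers.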