Commit
Resolves bugs in writing cif file and parity similarity calculation (#28)

* corrected the reversal of bondtype from zero to dative
* use mol from sanitized result
* enable parsing a subset of ccds from components cif file
* fix: fix error in writing cif file
  This fixes the error in generating inchikey while inchi is missing
* style: formatting
* chore: removed unnecessary None
* doc: add download badge and CLC feature
* chore: add citation file
* fix: write rdkit properties if it is present only
* fix: correct use of length of list in python
* fix: change bondtype from dative to zero for parity method
  SMARTS with dative bondtype fails to find substructures, hence dative bond types are changed to zero
* test: add HEM for parity test
* chore: linting and formatting
* test: removed parity test using HEM to check if the segmentation fault in GitHub workflow is due to this
* fix: remove PDBe from unichem mapping sources
* chore: Update dependencies and package management
  Update the project's dependencies and package management to use Poetry instead of pip. This includes installing Poetry and running `poetry install --with tests` to install the project dependencies. Remove the installation of `rdkit==2023.09.6` and `pre-commit`.
* chore: Update pre-commit hooks and dependencies
  Update the pre-commit hooks to use the Ruff pre-commit hook repository and remove the black and flake8 hooks. Also, add the rST Formatter hook from the rstfmt repository.
* chore: Update tests.yml to use Poetry for pre-commit and pytest commands
  Update the tests.yml file to use Poetry for the pre-commit and pytest commands. This ensures that the project's dependencies are installed correctly and that the pre-commit hooks are run using Poetry. The changes include replacing "pre-commit install" with "poetry run pre-commit install" and "pytest --cov=pdbeccdutils" with "poetry run pytest --cov=pdbeccdutils".
* Minor formatting
* chore: Install pre-commit package in tests.yml
  Add the installation of the pre-commit package in the tests.yml file to ensure that pre-commit hooks are run during the testing process. This will help catch and fix any code style or formatting issues before committing the changes.
* chore: Update docs and publish pipelines to use poetry
* test: add HEM to parity test
* bump up version number
* 🔥 remove __init__ file
* 🔧 update documentation action to use poetry
* 🔧 update publish action to use poetry
* 🔧 update test action to use poetry
* 🎨 move details from setup to pyproject.toml file
* ♻️ use version information from pyproject.toml
* 🩹 add CCDC to unichem resources
* ✏️ fix typos
* ✏️ fix typos
* ♻️ refactor configs
* 🎨 use single function to get properties of rdkit objects
* 🎨 use rdkit_object_property function instead of get_componet_atom_id
* ✨ get name of clc from entities
* 🩹 import importlib.metadata
* 🎨 linting and formatting
* 🩹 removed "data" from the path
* 🔧 add poetry.toml file
* 🔧 update poetry.lock file
* 🔧 remove pre-commit from test workflow
* 🎨 linting and formatting
* 📝 replace github downloads with pypi downloads
* bump up version
* 📝 update changelog for release 0.8.6
* 🔧 create hook to generate poetry.lock file
* 📝 update readme with installation using poetry
* 🔧 update the rdkit version number
* 🔧 update poetry lock file

---------

Co-authored-by: roshan <[email protected]>
Co-authored-by: Sreenath Sasidharan Nair <[email protected]>
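Two items in this message ("use version information from pyproject.toml" and "import importlib.metadata") suggest that the package version is now read from the installed distribution metadata instead of being hard-coded. A minimal sketch of that pattern follows. The distribution name `pdbeccdutils` comes from the README; the helper `get_version` and its fallback string are hypothetical and not taken from the commit.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name: str, default: str = "unknown") -> str:
    """Return the installed version of a distribution, or a fallback string."""
    try:
        # Reads the version recorded at install time (sourced from pyproject.toml).
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed, e.g. when running from a bare checkout.
        return default


if __name__ == "__main__":
    print(get_version("pdbeccdutils"))
```

With this approach the version is declared once in pyproject.toml, so a release bump touches a single file.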
1 parent dbe3b87, commit 618f680
Showing 39 changed files with 1,626 additions and 235 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
cff-version: 1.2.0 | ||
message: "If you use this software, please cite it as below." | ||
authors: | ||
- family-names: "Kunnakkattu" | ||
given-names: "Ibrahim Roshan" | ||
orcid: "https://orcid.org/0000-0002-8646-0969" | ||
- family-names: "Pravda" | ||
given-names: "Lukas" | ||
- family-names: "Yuan" | ||
given-names: "Qi" | ||
- family-names: "S.Smart" | ||
given-names: "Oliver" | ||
- family-names: "Nadzirin" | ||
given-names: "Nurul" | ||
- family-names: "Anyango" | ||
given-names: "Stephen" | ||
- family-names: "Nair" | ||
given-names: "Sreenath" | ||
|
||
title: "PDBe CCDUtils" | ||
version: 0.8.5 | ||
date-released: 22/05/2024 | ||
url: "https://github.com/PDBeurope/ccdutils" | ||
preferred-citation: | ||
type: article | ||
authors: | ||
- family-names: "Kunnakkattu" | ||
given-names: "Ibrahim Roshan" | ||
orcid: "https://orcid.org/0000-0002-8646-0969" | ||
- family-names: "Choudhary" | ||
given-names: "Preeti" | ||
orcid: "https://orcid.org/0000-0003-2340-3278" | ||
- family-names: "Pravda" | ||
given-names: "Lukas" | ||
- family-names: "Yuan" | ||
given-names: "Qi" | ||
- family-names: "S.Smart" | ||
given-names: "Oliver" | ||
- family-names: "Nadzirin" | ||
given-names: "Nurul" | ||
- family-names: "Anyango" | ||
given-names: "Stephen" | ||
- family-names: "Nair" | ||
given-names: "Sreenath" | ||
- family-names: "Velankar" | ||
given-names: "Sameer" | ||
orcid: "https://orcid.org/0000-0002-8439-5964" | ||
doi: "10.1186/s13321-023-00786-w" | ||
journal: "Journal of Cheminformatics" | ||
month: 12 | ||
title: "PDBe CCDUtils: an RDKit-based toolkit for handling and analysing small molecules in the Protein Data Bank" | ||
volume: 15 | ||
year: 2023 |
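The Citation File Format 1.2.0 schema expects `date-released` as an ISO 8601 date (`YYYY-MM-DD`), whereas the file above records `22/05/2024`. A small sketch of normalising such a value is shown below, assuming the original string is day/month/year; the helper name `to_cff_date` is hypothetical.

```python
from datetime import datetime


def to_cff_date(value: str, fmt: str = "%d/%m/%Y") -> str:
    """Convert a date string in `fmt` to the YYYY-MM-DD form used by CFF."""
    return datetime.strptime(value, fmt).date().isoformat()


if __name__ == "__main__":
    print(to_cff_date("22/05/2024"))  # 2024-05-22
```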
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,76 +1,83 @@ | ||
[![CodeFactor](https://www.codefactor.io/repository/github/pdbeurope/ccdutils/badge/master)](https://www.codefactor.io/repository/github/pdbeurope/ccdutils/overview/master) ![PYPi](https://img.shields.io/pypi/v/pdbeccdutils?color=green&style=flat) ![GitHub](https://img.shields.io/github/license/pdbeurope/ccdutils) ![ccdutils documentation](https://github.com/PDBeurope/ccdutils/workflows/ccdutils%20documentation/badge.svg) ![ccdutils tests](https://github.com/PDBeurope/ccdutils/workflows/ccdutils%20tests/badge.svg) | ||
[![CodeFactor](https://www.codefactor.io/repository/github/PDBeurope/ccdutils/badge/master)](https://www.codefactor.io/repository/github/PDBeurope/ccdutils/overview/master) ![PYPi](https://img.shields.io/pypi/v/pdbeccdutils?color=green&style=flat) ![GitHub](https://img.shields.io/github/license/PDBeurope/ccdutils) ![ccdutils documentation](https://github.com/PDBeurope/ccdutils/workflows/ccdutils%20documentation/badge.svg) ![ccdutils tests](https://github.com/PDBeurope/ccdutils/workflows/ccdutils%20tests/badge.svg) ![PyPI Downloads](https://img.shields.io/pypi/dm/pdbeccdutils) | ||
|
||
|
||
# pdbeccdutils | ||
|
||
* A set of python tools to deal with PDB chemical components definitions | ||
for small molecules, taken from the [wwPDB Chemical Component Dictionary](https://www.wwpdb.org/data/ccd) and [wwPDB The Biologically Interesting Molecule Reference Dictionary](https://www.wwpdb.org/data/bird) | ||
An RDKit-based python toolkit for parsing and processing small molecule definitions in [wwPDB Chemical Component Dictionary](https://www.wwpdb.org/data/ccd) and [wwPDB The Biologically Interesting Molecule Reference Dictionary](https://www.wwpdb.org/data/bird). `pdbeccdutils` provides streamlined access to all metadata of small molecules in the PDB and offers a set of convenient methods to compute various properties of small molecules using RDKit, such as 2D depictions, 3D conformers, physicochemical properties, matching common fragments and scaffolds, and mapping to small-molecule databases using UniChem. | ||
|
||
## Features | ||
|
||
* The tools use: | ||
* [RDKit](http://www.rdkit.org/) for chemistry. Presently tested with `2022.09.4` | ||
* `gemmi` CCD read/write. | ||
* Generation of 2D depictions (`No image available` generated if the flattening cannot be done) along with the quality check. | ||
* Generation of 3D conformations. | ||
* Fragment library search (PDBe hand-curated library, ENAMINE, DSI). | ||
* Chemical scaffolds (Murcko scaffold, Murcko general, BRICS). | ||
* Lightweight implementation of [parity method](https://doi.org/10.1016/j.str.2018.02.009) by Jon Tyzack. | ||
* RDKit molecular properties per component. | ||
* UniChem mapping. | ||
* Generating complete representation of multiple [Covalently Linked Components (CLC)](https://www.ebi.ac.uk/pdbe/news/introducing-covalently-linked-components) | ||
|
||
## Dependencies | ||
|
||
* [RDKit](http://www.rdkit.org/) for small molecule representation. Presently tested with `2023.9.6` | ||
* [GEMMI](https://gemmi.readthedocs.io/en/latest/index.html) for parsing mmCIF files. | ||
* [scipy](https://www.scipy.org/) for depiction quality check. | ||
* [numpy](https://www.numpy.org/) for molecular scaling. | ||
* [networkx](https://networkx.org/) for bound-molecules. | ||
|
||
* Please note that the project is under active development. | ||
|
||
## Installation instructions | ||
## Installation | ||
|
||
* `pdbeccdutils` requires RDKit to be installed. | ||
The official RDKit documentation has [installation instructions for a variety of platforms](http://www.rdkit.org/docs/Install.html). | ||
For Linux/macOS this is most easily done using the Anaconda Python with commands similar to: | ||
create a [virtual environment](https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/#create-and-use-virtual-environments) and install using pip | ||
|
||
```console | ||
conda create -n rdkit-env rdkit python=3.9 | ||
conda activate rdkit-env | ||
``` | ||
|
||
* Once you have installed RDKit, as described above then install `pdbeccdutils` using `pip`: | ||
|
||
```console | ||
```bash | ||
pip install pdbeccdutils | ||
``` | ||
|
||
## Features | ||
## Contribution | ||
We encourage you to contribute to this project. The package uses [poetry](https://python-poetry.org/) for packaging and dependency management. You can develop locally using: | ||
|
||
* `gemmi` CCD read/write. | ||
* Generation of 2D depictions (`No image available` generated if the flattening cannot be done) along with the quality check. | ||
* Generation of 3D conformations. | ||
* Fragment library search (PDBe hand-curated library, ENAMINE, DSI). | ||
* Chemical scaffolds (Murcko scaffold, Murcko general, BRICS). | ||
* Lightweight implementation of [parity method](https://doi.org/10.1016/j.str.2018.02.009) by Jon Tyzack. | ||
* RDKit molecular properties per component. | ||
* UniChem mapping. | ||
```bash | ||
git clone https://github.com/PDBeurope/ccdutils.git | ||
cd ccdutils | ||
pip install poetry | ||
poetry install --with tests,docs | ||
pre-commit install | ||
``` | ||
|
||
## TODO list | ||
The pre-commit hook will run linting, formatting and update `poetry.lock`. The `poetry.lock` file will lock all dependencies and ensure that they match pyproject.toml versions. | ||
|
||
* Add more unit/regression tests to get higher code coverage. | ||
* Further improvements of the documentation. | ||
To add a new dependency | ||
|
||
```bash | ||
# Latest resolvable version | ||
poetry add <package> | ||
|
||
## Documentation | ||
# Optionally fix a version | ||
poetry add <package>@<version> | ||
``` | ||
|
||
To change a version of a dependency, either edit pyproject.toml and run: | ||
|
||
The documentation depends on the following packages: | ||
```bash | ||
poetry sync --with dev | ||
``` | ||
|
||
* `sphinx` | ||
* `sphinx_rtd_theme` | ||
* `myst-parser` | ||
* `sphinx-autodoc-typehints` | ||
or | ||
|
||
Note that `sphinx` needs to be a part of the virtual environment, if you want to generate documentation by yourself. | ||
Otherwise it cannot pick `rdkit` module. `sphinx_rtd_theme` is a theme providing nice `ReadtheDocs` mobile friendly style. | ||
```bash | ||
poetry add <package>@<version> | ||
``` | ||
|
||
* Generate *.rst* files to be included as a part of the documentation. Inside the directory `pdbeccdutils/doc` run the following commands to generate documentation. | ||
* Alternatively, use the `myst-parser` package to get the Markdown working. | ||
|
||
Use the following to generate initial markup files to be used by sphinx. This needs to be used when adding another sub-packages. | ||
## Documentation | ||
|
||
```console | ||
sphinx-apidoc -f -o /path/to/output/dir ../pdbeccdutils/ | ||
``` | ||
The documentation is generated using `sphinx` in `sphinx_rtd_theme` and hosted on GitHub Pages. To generate the documentation locally, | ||
|
||
Use this to re-generate the documentation from the doc/ directory: | ||
```bash | ||
cd doc | ||
poetry run sphinx-build -b html . _build/html | ||
|
||
```console | ||
make html | ||
# See the documentation at http://localhost:8080. | ||
python -m http.server 8080 -d _build/html | ||
``` |
This file was deleted.