- Building and testing modules
- Python package
- Modules
- Test-managed development
- `spt` tab-completion
- Throwaway testing
- Add a new workflow
The modules in this repository are built, tested, and deployed using make
and Docker.
| Development environment software requirements | Version required or tested under |
|---|---|
| Unix-like operating system | Darwin 20.6.0 and Ubuntu 20.04 |
| GNU Make | 4.2.1 |
| Docker Engine | 20.10.17 |
| Docker Compose | 2.10.2 |
| bash | >= 4 |
| python | >=3.7, <3.12 |
| postgresql | 13.4 |
| toml | 0.10.2 |
A typical development workflow looks like:
- Modify or add source files.
- Add new tests.
```sh
$ make clean
Checking that Docker daemon is running ......................................... Running. (1s)
Running docker compose rm (remove) ............................................. Down. (1s)
```
```sh
$ make test
Creating pyproject.toml ........................................................ Created. (0s)
Building development image precursor ........................................... Built. (0s)
Building development image ..................................................... Built. (1s)
Building apiserver Dockerfile .................................................. Built. (0s)
Building graphs Dockerfile ..................................................... Built. (0s)
Building ondemand Dockerfile ................................................... Built. (0s)
Building db Dockerfile ......................................................... Built. (0s)
Building workflow Dockerfile ................................................... Built. (0s)
Checking for Docker credentials in ~/.docker/config.json ....................... Found. (0s)
Building Docker image nadeemlab/spt-db ......................................... Built. (6s)
Building test-data-loaded spt-db image (1small) ................................ Built. (0s)
Building test-data-loaded spt-db image (1) ..................................... Built. (0s)
Building test-data-loaded spt-db image (1and2) ................................. Built. (1s)
Building Docker image nadeemlab/spt-apiserver .................................. Built. (5s)
Building Docker image nadeemlab/spt-graphs ..................................... Built. (6s)
Building Docker image nadeemlab/spt-ondemand ................................... Built. (6s)
Building Docker image nadeemlab/spt-workflow ................................... Built. (6s)
Running docker compose rm (remove) ............................................. Down. (1s)
apiserver (setup testing environment) .......................................... Setup. (4s)
  study names .................................................................. Passed. (1s)
  record counts ................................................................ Passed. (4s)
  API internal basic database accessor ......................................... Passed. (1s)
  expressions in db ............................................................ Passed. (1s)
apiserver (teardown testing environment) ....................................... Down. (1s)
graphs (setup testing environment) ............................................. Setup. (2s)
  image runs properly .......................................................... Passed. (1s)
graphs (teardown testing environment) .......................................... Down. (1s)
ondemand (setup testing environment) ........................................... Setup. (4s)
  binary expression viewer ..................................................... Passed. (1s)
  intensity values imported .................................................... Passed. (4s)
ondemand (teardown testing environment) ........................................ Down. (0s)
db (setup testing environment) ................................................. Setup. (3s)
  guess channels from object files ............................................. Passed. (1s)
  drop recreate database constraints ........................................... Passed. (13s)
  shapefile polygon extraction ................................................. Passed. (1s)
db (teardown testing environment) .............................................. Down. (0s)
workflow (setup testing environment) ........................................... Setup. (3s)
  centroid pulling ............................................................. Passed. (3s)
  feature matrix extraction .................................................... Passed. (26s)
  stratification pulling ....................................................... Passed. (2s)
  signature cell set subsetting ................................................ Passed. (1s)
  sample stratification ........................................................ Passed. (1s)
workflow (teardown testing environment) ........................................ Down. (1s)
Building test-data-loaded spt-db image (1smallnointensity) ..................... Built. (0s)
apiserver (setup testing environment) .......................................... Setup. (5s)
  phenotype criteria ........................................................... Passed. (0s)
  proximity .................................................................... Passed. (4s)
  phenotype summary ............................................................ Passed. (0s)
  retrieval of umap plots ...................................................... Passed. (5s)
  retrieval of hi res umap ..................................................... Passed. (1s)
  study summary retrieval ...................................................... Passed. (0s)
  counts query delegation edge cases ........................................... Passed. (1s)
apiserver (teardown testing environment) ....................................... Down. (1s)
graphs (teardown testing environment) .......................................... Down. (0s)
ondemand (setup testing environment) ........................................... Setup. (4s)
  expression data caching ...................................................... Passed. (13s)
  class counts cohoused datasets ............................................... Passed. (1s)
  edge cases few markers ....................................................... Passed. (1s)
  single signature count query ................................................. Passed. (0s)
ondemand (teardown testing environment) ........................................ Down. (1s)
db (setup testing environment) ................................................. Setup. (3s)
  basic health of database ..................................................... Passed. (5s)
  expression table indexing .................................................... Passed. (13s)
  record counts cohoused datasets .............................................. Passed. (4s)
  fractions assessment ......................................................... Passed. (4s)
db (teardown testing environment) .............................................. Down. (0s)
workflow (setup testing environment) ........................................... Setup. (3s)
  proximity pipeline ........................................................... Passed. (95s)
  umap plot creation ........................................................... Passed. (56s)
workflow (teardown testing environment) ........................................ Down. (1s)
```
Optionally, if the images are ready to be released:
```sh
$ make build-and-push-docker-images
Checking for Docker credentials in ~/.docker/config.json ....................... Found. (0s)
Pushing Docker container nadeemlab/spt-apiserver ............................... Pushed. (16s)
Pushing Docker container nadeemlab/spt-ondemand ................................ Pushed. (15s)
Pushing Docker container nadeemlab/spt-db ...................................... Pushed. (23s)
Pushing Docker container nadeemlab/spt-workflow ................................ Pushed. (27s)
```
If the package source code is ready to be released to PyPI:
```sh
$ make release-package
Checking for PyPI credentials in ~/.pypirc for spatialprofilingtoolbox ......... Found. (0s)
Uploading spatialprofilingtoolbox==0.11.0 to PyPI .............................. Uploaded. (3s)
```
The source code is contained in one Python package, `spatialprofilingtoolbox`. The package metadata uses the declarative `pyproject.toml` format.
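For orientation, a declarative `pyproject.toml` has the following general shape. This is a minimal illustrative sketch, not this package's actual metadata; the version and backend shown are placeholders.

```toml
# Minimal declarative package metadata (illustrative sketch only).
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "spatialprofilingtoolbox"
version = "0.0.0"  # placeholder; see the repository's pyproject.toml for real values
description = "Toolbox for spatial profiling analysis"
requires-python = ">=3.7,<3.12"
```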
The main functionality is provided by the following modules, designed to operate as services. Each module's source is wrapped in a Docker image.
| Module name | Description |
|---|---|
| `apiserver` | FastAPI application supporting queries over cell data. |
| `graphs` | Command-line tool to apply cell graph neural network models to data stored in an SPT framework. |
| `ondemand` | Optimized program for class counting and other metrics calculations, served by a custom TCP server. |
| `db` | Data model/interface and PostgreSQL database-management SQL fragments. |
| `workflow` | Nextflow-orchestrated computation workflows. |
- The `db` module is for testing only. A real PostgreSQL database should generally not be deployed in a container.
Test scripts are located under `test/`.
These tests serve multiple purposes for us:
- To verify preserved functionality during source code modification.
- To exemplify typical usage of classes and functions, including how they are wrapped in a container and how that container is set up.
Each test is performed inside an isolated, development-only Docker container with `spatialprofilingtoolbox` loaded, in the presence of a running module-specific Docker composition that provides the given module's service as well as other modules' services (if needed).
You might want to install `spatialprofilingtoolbox` on your local machine in order to initiate database control actions, ETL, etc. In this case, bash completion is available that lets you readily discover the functionality provided at the command line. This reduces the need for some kinds of documentation, since such documentation is already folded into the executables in a way that can be readily accessed.
After installation of the Python package, an entry point `spt` is created. (Use `spt-enable-completion` to manually install the completion to a shell profile file.)

- `spt [TAB]` yields the submodules which can be typed next.
- `spt <module name> [TAB]` yields the commands provided by the given module.
- `spt <module name> <command name> [TAB]` yields the `--help` text for the command.
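The mechanism behind this can be sketched as follows. This is a hypothetical illustration of how a bash completion function for an entry point like `spt` gets registered; it is not the actual `spt-enable-completion` implementation, and the hardcoded word list is just the module names from the table above.

```bash
# Hypothetical sketch of bash tab-completion wiring for an entry point.
# NOT the real spt implementation; the word list below is illustrative.
_spt_complete() {
    local current_word="${COMP_WORDS[COMP_CWORD]}"
    # Offer submodule names matching the partially-typed word.
    COMPREPLY=( $(compgen -W "apiserver graphs ondemand db workflow" -- "$current_word") )
}
# Register the function so bash calls it when completing arguments to `spt`.
complete -F _spt_complete spt
```

In the real package, the word lists are derived from the installed submodules and commands rather than hardcoded.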
Development often entails "throwaway" test scripts that you modify and run frequently in order to check your understanding of functionality and verify that it works as expected.
For this purpose, a pattern that has worked for me in this repository is:
- Ensure at least one successful run of `make build-docker-images` at the top level of this repository's directory, for each module that you will use.
- Go into the build area for a pertinent module: `cd build/<module name>`.
- Create `throwaway_script.py`.
- Set up the testing environment: `docker compose up -d`
- As many times as you need to, run your script with the following (replacing `<module name>`):

```sh
test_cmd="cd /mount_sources/<module name>/; python throwaway_script.py" ;
docker run \
  --rm \
  --network <module name>_isolated_temporary_test \
  --mount type=bind,src=$(realpath ..),dst=/mount_sources \
  -t nadeemlab-development/spt-development:latest \
  /bin/bash -c "$test_cmd";
```
- Tear down the testing environment when you're done:

```sh
docker compose down;
docker compose rm --force --stop;
```
You can of course also modify the testing environment, involving more or fewer modules, or even containers from external images, by editing `compose.yaml`.
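For example, a stripped-down `compose.yaml` that brings up only the database service might look like the following. This is a hedged sketch, not one of the repository's actual compose files; the service name is illustrative, and the real files define environment variables, health checks, and other details omitted here. The network name matches the `<module name>_isolated_temporary_test` convention used in the `docker run` command above, since Docker Compose prefixes network names with the project (directory) name.

```yaml
# Illustrative sketch of a minimal testing composition (not the actual file).
services:
  testing-database:               # service name is hypothetical
    image: nadeemlab/spt-db:latest
    networks:
      - isolated_temporary_test
networks:
  isolated_temporary_test:
```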
The computation workflows are orchestrated with Nextflow, using the process definition script `main_visitor.nf`. "Visitor" refers to the visitor pattern, whereby the process steps access the database, do some reads, do some computations, and return some results by sending them to the database.
Each workflow consists of:
- "job" definition (in case the workflow calls for parallelization)
- initialization
- core jobs
- integration/wrap-up
To make a new workflow: copy the `phenotype_proximity` subdirectory to a sibling directory with a new name. Update the components accordingly, and update `workflow/__init__.py` with a new entry for your workflow, to ensure that it is discovered. You'll also need to update `pyproject.toml` to declare your new subpackage.
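Assuming subpackages are declared with an explicit setuptools package list (check the repository's actual `pyproject.toml` for the key it uses), the new entry might look like this hypothetical sketch:

```toml
# Hypothetical sketch; the actual key and existing entries may differ.
[tool.setuptools]
packages = [
    "spatialprofilingtoolbox.workflow.phenotype_proximity",
    "spatialprofilingtoolbox.workflow.my_new_workflow",  # hypothetical new subpackage
]
```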
It is often useful during development to run just one test (e.g. a new test for a new feature). This is a little tricky in our environment, which creates an elaborate test harness to simulate the production environment. However, it can be done with the following snippet, replacing `SUBMODULE_NAME` and `TEST_FILENAME` appropriately.
```sh
SHELL=$(realpath build/build_scripts/status_messages_only_shell.sh) \
MAKEFLAGS=--no-builtin-rules \
BUILD_SCRIPTS_LOCATION_ABSOLUTE=$(realpath build/build_scripts) \
MESSAGE='bash ${BUILD_SCRIPTS_LOCATION_ABSOLUTE}/verbose_command_wrapper.sh' \
DOCKER_ORG_NAME=nadeemlab \
DOCKER_REPO_PREFIX=spt \
TEST_LOCATION_ABSOLUTE=$(realpath test) \
TEST_LOCATION=test \
make --no-print-directory -C build/SUBMODULE_NAME test-../../test/SUBMODULE_NAME/module_tests/TEST_FILENAME
```