NOTE (2019-09-05): This should still largely be accurate, but I haven't had time to go over it since reconfiguring the build process, so take with a grain of salt until updated.
This document describes how the application stack is built and how it is operated.
The first time the cluster is brought up, three Docker volumes are created:

- `redis-data` - Persists the data store for the `redis` container
- `postgis-data` - Persists the data for the `postgis` container
- `postgis-extensions` - Persists the `/usr/share/postgresql` directory on the `postgis` container, so that the built extension files may be installed onto new databases created during runtime
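
To verify those volumes exist, standard `docker volume` commands work; note that Compose may prefix the names with the project name, which is why a substring filter is used here:

```
# List the cluster's volumes (names may carry a Compose project prefix)
docker volume ls --filter "name=postgis"

# Show the host mountpoint for the postgis data volume
docker volume ls -q --filter "name=postgis-data" | xargs docker volume inspect
```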
Once those volumes are created, the PostgreSQL data cluster has to be initialized. This happens automatically, because the underlying `postgres:10` official Docker image runs a built-in init process when it is started with an empty data directory. As part of that process, PostgreSQL will execute any shell or SQL scripts it finds on the container in the `/docker-entrypoint-initdb.d` directory. The Dockerfile for the `postgis` container copies one startup script there, `multi-svc-cartodb/docker/postgis/initdb.d/00_setup_carto_pg.sh`. Once the PostgreSQL cluster is initialized, that script is used to:
- Create the `publicuser` and `tileuser` PostgreSQL roles
- Globally install the `plpythonu` PostgreSQL extension
- Create the PostgreSQL template database `template_postgis`, and install the following extensions to it:
    - `plpgsql` - The pgsql procedural language
    - `postgis` - The PostGIS core extension
    - `postgis_topology` - The PostGIS topology extension
    - `plpythonu` - The python procedural language
    - `plproxy` - The PL/Proxy database partitioning system
    - `crankshaft` - One of Carto's geospatial extensions, found here: https://github.com/cartodb/crankshaft
- Create the `geocoder_api` PostgreSQL role
- Create the `dataservices_db` database (owned by `geocoder_api`), and install the following extensions to it:
    - `plproxy`
    - `plpythonu`
    - `postgis`
    - `cartodb` - A Carto PostgreSQL extension that includes a substantial amount of the core program logic for the PostgreSQL side of the Carto applications. Found here: https://github.com/cartodb/cartodb-postgresql
    - `cdb_geocoder` - Carto extension that powers their internal geocoder: https://github.com/cartodb/data-services
    - `cdb_dataservices_server` - Carto extension that provides access to their geocoder: https://github.com/cartodb/dataservices-api
    - `cdb_dataservices_client` - Carto extension that gives client databases access to the geocoder API from `cdb_dataservices_server`. Also found at https://github.com/cartodb/dataservices-api
    - `observatory` - Carto extension that 'implements the row level functions needed by the Observatory service'. Found here: https://github.com/CartoDB/observatory-extension
- Load the fixture tables and data the Observatory extension depends on
- Make grants to the `geocoder_api` role for tables and functions in the schema `observatory`
- Add Carto application-specific configuration inside PostgreSQL, by making a number of `SELECT` statements that use the `cartodb.CDB_Conf_SetConf()` function
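
If you want to spot-check the results of that script, you can query the cluster directly. A verification sketch using standard catalog views (the role, database, and extension names come from the list above):

```
# Roles created by the init script
docker-compose exec postgis psql -U postgres -c \
    "SELECT rolname FROM pg_roles WHERE rolname IN ('publicuser', 'tileuser', 'geocoder_api');"

# Extensions installed on the template database
docker-compose exec postgis psql -U postgres -d template_postgis -c \
    "SELECT extname, extversion FROM pg_extension ORDER BY extname;"

# Extensions installed on dataservices_db
docker-compose exec postgis psql -U postgres -d dataservices_db -c \
    "SELECT extname FROM pg_extension ORDER BY extname;"
```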
When the `postgis` container has completed its cluster init, and the `cartodb` container is up and running, the `/carto/docker-entrypoint.sh` script on that container will do the following:
- If the `CARTO_USE_HTTPS` value in your `.env` file was `true` at the time the container was built, the script will adjust a line of code in `/carto/cartodb/config/initializers/carto_db.rb` to allow HTTPS usage while `RAILS_ENV` is set to `development`.
- If `CARTO_USE_HTTPS` was not `true`, the script will copy the contents of `config/app_config_no_https.yml` into `config/app_config.yml`.
- If the database name provided for the current environment in `config/database.yml` does not exist in the PostgreSQL cluster, it is created via the `db:create` rake task in the Carto Rails application.
- If there are fewer than 60 tables found in that database, the script will execute the `db:migrate` rake task. (Running migrations multiple times wouldn't be harmful, but it takes time, so we skip it if the database already has a lot of tables.)
- If there's no entry in the `users` table in that database for the `CARTO_DEFAULT_USER` and `CARTO_DEFAULT_EMAIL` values from your `.env` file, that user is created by executing the `script/better_create_dev_user.sh` script (one we've added to make that process more transparent). That script will also update a number of user settings in the database.
- If no entries exist for the organization and org user defined by the `CARTO_ORG_NAME`, `CARTO_ORG_USER`, and `CARTO_ORG_EMAIL` entries in your `.env` file, that organization and user are created via the `script/setup_organization.sh` script (another of ours). That script also makes some settings changes for the organization and user.
- Runs the `script/restore_redis` script
- Starts the Resque process (the RoR job runner that Carto uses)
- Clears existing API keys and regenerates them
- Starts the rails server on port 80 (which the `router` container will reverse proxy to when incoming HTTPS connections are made)
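
Once the entrypoint has finished, you can confirm its database work by looking for the default user it should have created. A sketch, assuming the development database is named `carto_db_development` (check `config/database.yml` for the actual name in your environment):

```
# The users table should contain the CARTO_DEFAULT_USER from your .env file
docker-compose exec postgis psql -U postgres -d carto_db_development -c \
    "SELECT username, email FROM users;"
```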
If you are working on the initialization process, or if you would like to test or re-run the PostgreSQL initialization script(s), it will be necessary to at least destroy (or otherwise make unavailable) the `postgis-data` Docker volume. If the PostgreSQL process finds anything other than a completely empty data directory (at `/var/lib/postgresql/data`), it will skip the cluster initialization process. To remove the `postgis-data` volume, you can run:
```
docker-compose down
docker volume ls -q --filter "name=postgis-data" | xargs docker volume rm
```
If you would like to remove all of the volumes for the cluster, there is a utility script in this repo's `/scripts` directory called `remove_docker_volumes.sh`. Note that if you remove them all, you may need to run `docker-compose build postgis` to repopulate the PostgreSQL extensions directory before bringing the cluster back up.
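
Put together, a full reset looks roughly like this (the script path comes from this repo's `/scripts` directory; whether you actually need the build step depends on whether your `postgis` image is still current):

```
docker-compose down                  # stop and remove the containers
./scripts/remove_docker_volumes.sh   # remove all of the cluster's volumes
docker-compose build postgis         # rebuild so the extensions volume can be repopulated
docker-compose up                    # recreate the volumes and re-run the full init
```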
To start the cluster, you can use `docker-compose up`. This will create a Docker network and the Docker volumes if they do not already exist, then start the containers in dependency order.
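
If you'd rather not tie a terminal up with the aggregated log output, the standard Compose flags apply; for example:

```
# Start in the background, then follow the logs of a single service
docker-compose up -d
docker-compose logs -f cartodb
```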
To stop the cluster, use `docker-compose stop`. This will attempt to gracefully stop all containers referenced in the `docker-compose.yml` file. If they cannot be gracefully stopped, they are sent a hard stop signal and killed.

Note: Stopping the cluster does not remove the containers; it simply stops them. If you want to both stop the cluster and remove the stopped containers, use `docker-compose down`.
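
For reference, the related lifecycle commands differ as follows (all standard Compose behavior):

```
docker-compose stop     # stop containers; they and their state remain
docker-compose start    # restart the same stopped containers
docker-compose down     # stop and remove the containers and default network
docker-compose down -v  # as above, but also remove the named volumes
```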
If you would like to get a shell on any of the containers, you may do so with `docker-compose exec`, called from the root of the repo:

```
cd multi-svc-cartodb
docker-compose exec cartodb /bin/bash
```
You can get a shell on any of the containers, but note that because both the `redis` and `router` containers are built on very minimal Alpine Linux installs, they do not have a `bash` shell by default. For those, use `docker-compose exec redis /bin/sh`.
Other than the `router` container, the `postgis` container is the only one that opens a port on the host machine. Consequently you can get an interactive session on that container's PostgreSQL instance by hitting `localhost:5432` as the user `postgres`:

```
psql -U postgres -h localhost
```
Alternatively, you can use the `psql` client on the container itself, by connecting to it with `docker-compose exec postgis psql -U postgres -h localhost` (or by getting a bash shell and running `psql` from there).
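
Once connected, standard `psql` usage applies; for instance, listing the cluster's databases from the host:

```
# List all databases in the cluster, including the template and Carto databases
psql -U postgres -h localhost -l
```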
You can't directly connect to the Redis instance from the host machine, so you'll have to do it via the `redis-cli` utility on the container itself:

```
docker-compose exec redis redis-cli
```

Or by getting a `/bin/sh` shell on the container and calling `redis-cli` from that.
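
A couple of stock Redis commands (nothing Carto-specific) are handy for a quick sanity check:

```
# Confirm the server is responsive
docker-compose exec redis redis-cli PING

# See which logical databases currently hold keys
docker-compose exec redis redis-cli INFO keyspace
```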