Investigate the impact of the auto-scale-down jobs #157

WadeBarnes · 2024-01-23T17:30:41Z

Platform services has started running jobs that scale down any pods that have not been updated (rolled out) in over a year. These scripts will be run every Tuesday from now on.
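
The selection rule described above (anything not rolled out in over a year) can be sketched roughly as follows. This is only an illustration, not Platform Services' actual script: the workload names, the `last_rollouts` input, and the 365-day threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: "over a year" is approximated as 365 days.
STALE_AFTER = timedelta(days=365)


def find_stale_workloads(last_rollouts, now=None):
    """Return the names of workloads whose last rollout is older than STALE_AFTER.

    last_rollouts maps a workload name to a timezone-aware datetime of its most
    recent rollout (hypothetical input; the real job's data source is not known).
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, rolled_out in last_rollouts.items()
        if now - rolled_out > STALE_AFTER
    )


if __name__ == "__main__":
    # Hypothetical workloads and timestamps, for illustration only.
    sample = {
        "example-app-old": datetime(2022, 6, 1, tzinfo=timezone.utc),
        "example-app-recent": datetime(2023, 11, 15, tzinfo=timezone.utc),
    }
    run_date = datetime(2024, 1, 23, tzinfo=timezone.utc)
    print(find_stale_workloads(sample, now=run_date))
```

Running a check like this ahead of the Tuesday job would give a list of pods at risk of being scaled down.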

The idea is to eliminate abandoned projects and free the associated resources, and to encourage best practices around pod/application maintenance.

The best practice set forth is to rebuild and redeploy application pods at least once a month in order to pick up updates and patches applied to the base image(s). This will have knock-on effects in some of our projects, such as those dependent on aca-py images.

As a workaround, application pods can be rolled out; this updates the resource manifests to include the current date.
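
One common way a rollout stamps the current date into a manifest is the `kubectl.kubernetes.io/restartedAt` pod-template annotation that `kubectl rollout restart` sets on a Deployment. The sketch below only builds a patch body of that shape. Whether the scale-down jobs look at this particular field is an assumption, and the helper name `build_restart_patch` is hypothetical; applying the patch to the cluster is left out.

```python
import json
from datetime import datetime, timezone

# Annotation key that `kubectl rollout restart` sets on a Deployment's pod template.
RESTARTED_AT = "kubectl.kubernetes.io/restartedAt"


def build_restart_patch(now=None):
    """Build a patch body that changes the pod template, which triggers a rollout."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "spec": {
            "template": {
                "metadata": {"annotations": {RESTARTED_AT: stamp}}
            }
        }
    }


if __name__ == "__main__":
    # Print the patch as JSON so it can be reviewed before being applied.
    print(json.dumps(build_restart_patch(), indent=2))
```

Note that a rollout like this only refreshes the manifest date; it does not pick up new base images, which still requires a rebuild.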

For now we want to review the pods that did get scaled down and identify what's needed to update them. We also want to identify what other pods may have been scaled down, since there are some services in the tools and deployment environments we don't actively monitor.

A separate ticket will be opened to discuss and design the update strategy moving forward.

WadeBarnes · 2024-01-23T18:35:30Z

Summary:

Backup containers

Update to the latest version.

Databases

Short term, rebuild to pick up the latest base images. Long term, upgrade databases to a newer version of Postgres.
Many are S2I builds, which are used to pick up server configuration files.

S2I Builds

Short term, update the base containers where possible and rebuild the application images.
Determine the alternative to S2I base images. It appears Red Hat is not supporting S2I containers to the same level anymore. There are Fedora based S2I images available here: https://quay.io/organization/fedora, but the long term plan should likely be to migrate away from using S2I base images.
My comment about Red Hat not supporting S2I images as much anymore is incorrect. They've just made it painfully difficult to find them (aka, their search feature is lacking in some areas); UBI based S2I base images can be found here, and Postgres S2I images can be found here. Note there is no Postgres 14 image from Red Hat. We've been using the one from Fedora, which is built from the same source: https://quay.io/repository/fedora/postgresql-14

Email verification services specifically

Short term, redeploy existing services.
Short to medium term, retire and archive.

BC Registries FDW Database

Used for connecting to the COLIN databases.
Requires a fair amount of updating to use medium to long term.
Look for alternatives.
Short term, rebuild and deploy. Medium to long term, look for alternatives, or update.

Others

Including controller-buybc, aries-endorser-api, and issuer-admin-bcvcpilot
Update as indicated.
Rebuild and redeploy.

Details

Monitored Applications Affected:

e79518-dev (Digital Trust Services Trust Over IP)
- controller-buybc
  - Docker build that could use a regular python image as its base. No need to be building off bcgovimages/von-image:py36-1.16-0. Upgrade to a newer Debian release, Bullseye at minimum.
  - https://github.com/esune/aries-vcr-issuer-controller.git
    - ./issuer_controller
    - master
4a9599-test (Digital Trust Shared Service)
- aries-endorser-backup
  - Could be upgraded to use the latest backup build/image.
  - artifacts.developer.gov.bc.ca/docker-remote/centos/postgresql-13-centos7:20210722-70dc4d3
  - https://github.com/BCDevOps/backup-container.git
    - ./docker
    - 2.5.1
- aries-endorser-db
  - Short term, rebuild using latest postgresql-13 image. Long term, update to newer version of Postgres
  - registry.redhat.io/rhel9/postgresql-13:latest
  - https://github.com/bcgov/aries-endorser-service
    - ./docker/wallet/config
    - main
- aries-endorser-api
  - Docker build. Rebuild. Upgrade to newer Debian release, Bullseye at minimum.
  - artifacts.developer.gov.bc.ca/docker-remote/python:3.10-slim-buster
  - https://github.com/bcgov/aries-endorser-service.git
    - ./endorser
    - main
- aries-endorser-wallet
  - Uses aries-endorser-db image

Others Affected

e79518-test (Digital Trust Services Trust Over IP)
- wallet-buybc
  - S2I build. Database upgrade highly recommended.
  - registry.access.redhat.com/rhscl/postgresql-10-rhel7:latest
  - https://github.com/bcgov/von-bc-registries-agent-configurations.git
    - ./openshift/templates/db/config/postgresql-cfg
    - master
a99fd4-dev (Digital Trust Demo Apps)
- email-verification-agent
  - Docker build used just to pull the aca-py image. Should be able to upgrade to a newer aca-py image from ghcr.io/hyperledger/aries-cloudagent-python
  - bcgovimages/aries-cloudagent:py36-1.15-1_0.6.0
  - https://github.com/bcgov/indy-email-verification.git
    - ./
    - master
- issuer-kit-wallet
  - S2I build. Database upgrade highly recommended. We've done this with some other issure79518
  - registry.access.redhat.com/rhscl/postgresql-10-rhel7:latest
  - https://github.com/bcgov/issuer-kit.git
    - ./wallet/config
    - main
- email-verification-service
  - S2I build. Upgrade to a newer python image. Looks like Red Hat is not supporting S2I containers to the same level anymore. There are Fedora based images available here: https://quay.io/organization/fedora
  - registry.access.redhat.com/rhscl/python-36-rhel7:latest
  - https://github.com/bcgov/indy-email-verification.git
    - ./src
    - master
- email-verification-service-db
  - Image only. Database upgrade highly recommended.
  - registry.access.redhat.com/rhscl/postgresql-10-rhel7
a99fd4-test (Digital Trust Demo Apps)
- email-verification-service-db
  - See email-verification-service-db in a99fd4-dev
- issuer-admin-bcvcpilot
  - Multi-stage build. Rebuild to pick up new base image(s).
  - Build
    - artifacts.developer.gov.bc.ca/docker-remote/node:hydrogen
    - https://github.com/bcgov/issuer-kit.git
      - ./
      - main
  - Runtime
    - artifacts.developer.gov.bc.ca/docker-remote/caddy:alpine
    - w/ artifacts from build
- email-verification-demo
  - S2I Build. Upgrade to newer base image
  - registry.fedoraproject.org/f32/python3:latest
  - https://github.com/bcgov/vc-visual-verifier.git
    - ./src
    - main
- email-verification-agent
  - Docker build used just to pull the aca-py image. Should be able to upgrade to a newer aca-py image from ghcr.io/hyperledger/aries-cloudagent-python
  - bcgovimages/aries-cloudagent:py36-1.15-1_0.6.0
  - https://github.com/bcgov/indy-email-verification.git
    - ./
    - master
8ad0ea-dev (OrgBook BC)
- backup-bc
  - Could be upgraded to use the latest backup build/image.
  - artifacts.developer.gov.bc.ca/docker-remote/centos/postgresql-13-centos7:20210722-70dc4d3
  - https://github.com/BCDevOps/backup-container.git
    - ./docker
    - 2.5.1
8ad0ea-test (OrgBook BC)
- backup-bc
  - See backup-bc in 8ad0ea-dev
7cba16-dev (BC Registries Agent)
- event-processor-log-db-primary
  - Uses same image as wallet-primary
- backup-primary
  - Could be upgraded to use the latest backup build/image.
  - artifacts.developer.gov.bc.ca/docker-remote/centos/postgresql-13-centos7:20210722-70dc4d3
  - https://github.com/BCDevOps/backup-container.git
    - ./docker
    - 2.5.1
- wallet-primary
  - S2I build. Short term, rebuild using latest postgresql-12 image. Long term, update to newer version of Postgres
  - postgresql:12
  - https://github.com/bcgov/von-bc-registries-agent-configurations.git
    - ./openshift/templates/db/config
    - main
- event-db-primary
  - Uses same image as wallet-primary
7cba16-test (BC Registries Agent)
- wallet-primary
  - See wallet-primary in 7cba16-dev
- event-processor-log-db-primary
  - Uses same image as wallet-primary
- event-db-primary
  - Uses same image as wallet-primary
- bc-reg-fdw-primary
  - Docker build. Used rhel7 and Postgres 9.6. Requires an upgrade to both.
  - registry.access.redhat.com/rhel7:latest
  - https://github.com/bcgov/openshift-postgresql-oracle_fdw.git
    - ./
    - master
- backup-primary
  - See backup-primary in 7cba16-dev
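
An inventory like the one above lends itself to a simple scripted triage of base image references. The following is a minimal sketch under stated assumptions: the marker list is loosely drawn from the notes in this comment and is not an authoritative end-of-life database, and the function name `triage_images` is hypothetical.

```python
# Substrings treated as "needs an upgrade", loosely drawn from the notes above.
# This list is an illustrative assumption, not an authoritative EOL database.
OUTDATED_MARKERS = {
    "buster": "older Debian release; notes suggest Bullseye at minimum",
    "postgresql-10": "database upgrade highly recommended",
    "python-36": "upgrade to a newer python image",
    "py36": "upgrade to a newer aca-py or python image",
    "rhel7": "RHEL 7 based image; notes flag it for upgrade",
}


def triage_images(images):
    """Map each flagged image reference to the list of reasons it was flagged."""
    flagged = {}
    for image in images:
        reasons = [why for marker, why in OUTDATED_MARKERS.items() if marker in image]
        if reasons:
            flagged[image] = reasons
    return flagged


if __name__ == "__main__":
    # Image references copied from the inventory above.
    inventory = [
        "artifacts.developer.gov.bc.ca/docker-remote/python:3.10-slim-buster",
        "registry.access.redhat.com/rhscl/postgresql-10-rhel7:latest",
        "registry.redhat.io/rhel9/postgresql-13:latest",
        "bcgovimages/aries-cloudagent:py36-1.15-1_0.6.0",
    ]
    for image, reasons in triage_images(inventory).items():
        print(image)
        for why in reasons:
            print("  -", why)
```

Substring matching is crude, so the output would be a starting point for manual review rather than a definitive list.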

WadeBarnes · 2024-01-23T21:28:38Z

I've spun the application pods back up and reviewed the environments for any other containers that were spun down. Next step is to review and identify what can be done to update the affected application pods.

WadeBarnes · 2024-01-29T15:09:02Z

Summary here: #157 (comment)

WadeBarnes · 2024-02-01T18:16:47Z

Closing this. The investigation is complete. Addressing the issues is covered by #158

WadeBarnes added this to CDT Enterprise Apps Jan 23, 2024

WadeBarnes self-assigned this Jan 23, 2024

WadeBarnes converted this from a draft issue Jan 23, 2024

WadeBarnes mentioned this issue Jan 29, 2024

DevOps processes and Continuous Delivery - Moving Forward #158

Open

9 tasks

WadeBarnes closed this as completed Feb 1, 2024

github-project-automation bot moved this from In Progress to In Review in CDT Enterprise Apps Feb 1, 2024
