PostgreSQL 16 upgrade #2968
Unanswered
plasticviking asked this question in General
-
I’ve done a ton of research and collected my thoughts on our best possible upgrade path to a modern version of Postgres.
We’re currently using version 9.6 (which was EOL’d in November 2021) both in the cluster and in docker. I’m also using 9.6 on a local bare-metal Debian server for compatibility/testing.
Upgrading to 16 brings a number of benefits (mostly totally transparent things like better parallel query performance), but also some great quality of life stuff like better/more mature JSON handling.
Upgrading, especially across that many versions at once, is not trivial, and we have three environments (local, Docker, OpenShift) and two scenarios (upgrade-in-place, fresh install) to consider, for six combinations in total.
The `pg_upgrade` tool (bundled with the Postgres server packages) works well. I used it on my local server to go directly from 9.6 to 16. It did choke on one invalid function definition, so it’s necessary to do it like this:
a) install both versions in parallel
b) connect to 9.6 and `drop function public.convert_string_list_to_array_elements("unknown");`
c) shut down both postmasters
d) run `pg_upgrade`
e) start up the new service, keeping the old one offline as a backup
f) run the generated `update_extensions.sql` file that `pg_upgrade` makes for you
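For reference, the invocation looks roughly like this (the paths here are assumptions for a stock Debian layout; adjust them to match yours):

```sh
# Minimal sketch, assuming stock Debian paths; run as the postgres user
# with both postmasters stopped.
/usr/lib/postgresql/16/bin/pg_upgrade \
  --old-bindir=/usr/lib/postgresql/9.6/bin \
  --new-bindir=/usr/lib/postgresql/16/bin \
  --old-datadir=/var/lib/postgresql/9.6/main \
  --new-datadir=/var/lib/postgresql/16/main \
  --old-options='-c config_file=/etc/postgresql/9.6/main/postgresql.conf' \
  --new-options='-c config_file=/etc/postgresql/16/main/postgresql.conf' \
  --check   # dry run first; drop --check to do the real upgrade
```

The `--old-options`/`--new-options` lines are needed because Debian keeps the config files under /etc/postgresql rather than in the data directories.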
Works great locally. It won’t work in the cluster without serious docker-image manipulation, since the filesystem is read-only. I think the way to handle this in-cluster is:
Do it in dev and test first, prod after validation… (a rough sketch of the dump/restore commands follows this list)
a) shut off the API server so no data is being modified
b) modify the OpenShift service definition for the database to point to nothing (delete the selector)
c) shell into the pod and run a `pg_dump` of prod data
d) download it locally and load it into a fresh, running local 9.6 service
e) do the above steps to upgrade the database
f) run `pg_dump` again and load the result into a new instance running version 16 (which may optionally be a CrunchyDB cluster)
g) set up the service definition to point to the new pods, with the same name as before. It should be transparent to the API pod.
h) validate everything by testing the app. If something doesn’t look right, revert step (b) to get the old service back.
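For concreteness, here is a rough sketch of the dump/restore legs (the pod, database, and port names are invented; substitute our real ones):

```sh
# c) dump prod from the 9.6 pod (custom format, so pg_restore can parallelize)
oc exec postgres-96-pod -- pg_dump -U postgres -Fc appdb > appdb-96.dump

# d/e) load the dump into a fresh local 9.6, then pg_upgrade it to 16 as above

# f) dump the upgraded local database and restore it into the new 16 pods
pg_dump -U postgres -h localhost -p 5433 -Fc appdb > appdb-16.dump
oc exec -i postgres-16-pod -- pg_restore -U postgres -d appdb < appdb-16.dump
```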
This doesn’t handle local docker upgrades. To be honest, I think the best approach for local docker is to upgrade the image definitions, delete the old data volumes, and make a clean break. If that’s not possible, something very similar to the cluster procedure above will work as well. I can be available to generate these upgraded dumps for anyone who really needs them.
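The clean break amounts to roughly this (the compose volume and service names are made up):

```sh
docker compose down                # stop the stack
docker volume rm myapp_pgdata      # destroys the old 9.6 data volume!
# bump the image tag in docker-compose.yml: postgres:9.6 -> postgres:16
docker compose up -d db            # fresh 16 data dir; then run migrations/seeds
```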
Speaking of fresh installs:
Our migrations are totally broken. They cannot run successfully on a fresh install right now; I don’t know if anyone has tried recently. They’ll blow up on 0029, and then again on a bunch of others if you delete the offending definitions (view definitions referencing tables or views that don’t exist). I tried untangling it, but there are dependencies on certain seeds being run at certain points in the process, and it quickly became a real headache.
They don’t work on either 9.6 or 16 right now for a totally clean install. I have a fix to suggest here too. Assuming that prod is the data closest to what is required (I know it’s at migration 0075 and latest is 0076, but that’s easily bumped up), I propose that we freeze the creation of new migrations while we do the database upgrade, and shortly after upgrading prod we take it as the "gold standard": do a `pg_dump --schema-only` into a new migration `0001_fresh_start` to capture the structure, and create a seed with the /current/ values of the code tables (and other necessary ancillary tables like email_templates). Have the migration run the seed (you can do this with `knex` programmatically; see the sketch after this list). Truncate the migration table in dev/test/prod and manually enter `0001_fresh_start` as having been run. Do the same for any other environment that has data (local and docker instances). The end result of this unpleasant task is:
a) One migration and one seed, both of which are known to work on Postgres 16 and have the structure we actually need
b) The ability to do fresh installs
c) We can clean up some of the things that are broken in our current migrations, like hard-coded schema/role names.
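A rough shape for that migration (the file and seed names are placeholders, and it assumes knex’s programmatic seed runner is callable from inside a migration):

```js
// migrations/0001_fresh_start.js -- a sketch only; names are placeholders
const fs = require('fs');
const path = require('path');

exports.up = async function (knex) {
  // replay the schema captured from prod with `pg_dump --schema-only`
  const ddl = fs.readFileSync(
    path.join(__dirname, '0001_fresh_start.sql'), 'utf8');
  await knex.raw(ddl);
  // run the code-table seed programmatically
  await knex.seed.run({ specific: '0001_code_tables.js' });
};

exports.down = async function () {
  throw new Error('0001_fresh_start is not reversible');
};
```

Marking it as applied in environments that already have the schema is then just the manual truncate-and-insert on the knex migrations table described above.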
I propose to `pg_upgrade` dev to version 16, without modifying any data.
-
Replies: 1 comment

I don't have a problem with doing one fresh migration to get past the upgrade; we did the same thing for knex at one point. I think on the shortlist after the upgrade we should look at making some better seeds for testing, which should save us from "can't build from scratch" states like the one you're in now.