Summary: a fully automated major-version upgrade of bitnami/postgresql-ha from 14 to 15 using pg_upgrade
Infrastructural notes
Reasons to upgrade
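Initially, our product planned a k8s-based cluster edition, and Postgres needed a reliable HA solution with automatic primary/standby failover. Rather than reinvent the wheel, we chose bitnami/postgresql-ha. During evaluation we found that with witness/pg_rewind enabled, pg restarted repeatedly, so we disabled witness/pg_rewind and went to production. Later, our test pg cluster with on the order of a hundred GB of storage once lost all of its data; after investigating, we decided to enable pg_rewind to prevent data loss. We then found that the latest bitnami/postgresql-ha is based on Postgres 15 and that pg_rewind did not work when enabled on pg 14. Since Postgres 15 had already been proven in production for some time, we decided to try upgrading from pg 14 to 15.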
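The officially documented upgrade options are:
- Dump & restore: the simplest and also the slowest method, with longer downtime. In one of our test environments, using an approach that does not write the dump to disk, 30 GB of data took about 30 minutes; larger databases or different hardware may take a different amount of time.
- Logical replication: the option with the shortest downtime (the service is briefly unavailable, on the order of seconds, while switching from the old cluster to the new one after all database instances have finished syncing). Logical replication has been built in since pg10.
- PostgreSQL official upgrade tool: pg_upgrade, a command-line tool maintained by the PostgreSQL project for this scenario.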
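Our scenario requires an in-place upgrade: replacing the old cluster with a new bitnami/postgresql-ha cluster is not acceptable, but a relatively short maintenance window (with the service stopped) is. We therefore chose option 3, the PostgreSQL official upgrade tool pg_upgrade. If you have strict downtime requirements, option 2, logical replication, is still the recommended choice.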
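For orientation, pg_upgrade takes the old and new binary directories plus the old and new data directories, and `--check` runs only the consistency checks without modifying any data. The sketch below merely assembles such a command line and prints it. The function name `pg_upgrade_check_cmd`, the Debian-style `/usr/lib/postgresql/<major>/bin` layout, and the example data paths are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Print (do not run) a pg_upgrade dry-run command line.
# Assumption: binaries live under /usr/lib/postgresql/<major>/bin (Debian-style layout).
pg_upgrade_check_cmd() {
  local old_major="$1" new_major="$2" old_data="$3" new_data="$4"
  local bin_root=/usr/lib/postgresql
  printf '%s --old-bindir=%s --new-bindir=%s --old-datadir=%s --new-datadir=%s --check\n' \
    "$bin_root/$new_major/bin/pg_upgrade" \
    "$bin_root/$old_major/bin" \
    "$bin_root/$new_major/bin" \
    "$old_data" "$new_data"
}

# Example: the dry-run command for a 14 -> 15 upgrade of two hypothetical data directories.
pg_upgrade_check_cmd 14 15 /data/old /data/new
```

Running the printed command with `--check` first is a cheap way to surface missing libraries or configuration problems before the maintenance window starts.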
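Because ours is a containerized setup, with all database and business services deployed in a single k8s cluster, we first get familiar with pg_upgrade in a Docker environment.
Exploring the Postgres upgrade process with Docker
While researching, we came across a blog post describing a fairly ideal migration approach for containerized setups, Terabyte-scale PostgreSQL upgrade from 9.6 to 14, which uses a purpose-built Docker image for the upgrade. That project describes itself as: "This is a PoC for using pg_upgrade inside Docker -- learn from it, adapt it for your needs; don't expect it to work as-is!" Next, we use this method to reproduce a major-version upgrade.
Upgrade process
Here we first start Postgres 14, then use tianon/postgres-upgrade:14-to-15 to upgrade from 14 to 15, and finally start pg15 to verify that the upgrade succeeded.
First, the docker-compose.yaml file: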
version: "3"
services:
  db14:
    image: postgres:14-bookworm
    container_name: db14
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: test
      POSTGRES_DB: test
      PGDATA: /var/lib/postgresql/all/db14
    volumes:
      - ./data/:/var/lib/postgresql/all
    ports:
      - 5432:5432
  db15:
    image: postgres:15-bookworm
    container_name: db15
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: test
      POSTGRES_DB: test
      PGDATA: /var/lib/postgresql/all/db15
    volumes:
      - ./data/:/var/lib/postgresql/all
    ports:
      - 5433:5432
  upgrade:
    image: tianon/postgres-upgrade:14-to-15
    container_name: upgrade
    environment:
      PGDATAOLD: /var/lib/postgresql/all/db14
      PGDATANEW: /var/lib/postgresql/all/db15
      POSTGRES_USER: test
      POSTGRES_PASSWORD: password
    command: ["tail", "-f", "/dev/null"]
    volumes:
      - ./data/:/var/lib/postgresql/all
First, start pg 14 and create some data.
docker-compose up -d db14
[+] Building 0.0s (0/0) docker-container:multiarch
[+] Running 2/2
✔ Network docker_default Created 0.1s
✔ Container db14 Started
export database='postgres://postgres:test@localhost:5432/test?sslmode=disable'
docker-compose exec db14 psql "$database" -c 'select version();'
------------------------------------------------------------------------------------------------------------
-----------------
PostgreSQL 14.10 (Debian 14.10-1.pgdg120+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14
) 12.2.0, 64-bit
(1 row)
docker-compose exec db14 psql "$database" -c 'CREATE TABLE IF NOT EXISTS t_test( ID INT NOT NULL, NAME TEXT NOT NULL, AGE INT NOT NULL, ADDRESS CHAR(50), SALARY REAL );'
CREATE TABLE
docker-compose exec db14 psql "$database" -c 'select count(*) from t_test;'
count
-------
0
(1 row)
docker-compose exec db14 psql "$database" -c 'insert into t_test SELECT generate_series(1,1) as key,repeat( chr(int4(random()*26)+65),4), (random()*(6^2))::integer,null,(random()*(10^4))::integer;'
INSERT 0 1
docker-compose exec db14 psql "$database" -c 'select count(*) from t_test;'
count
-------
1
(1 row)
Start the upgrade
docker-compose stop db14
[+] Stopping 1/1
✔ Container db14 Stopped
docker-compose up -d upgrade
[+] Running 1/1
✔ Container upgrade Started
docker-compose exec upgrade docker-upgrade pg_upgrade
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /var/lib/postgresql/all/db15 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
pg_ctl -D /var/lib/postgresql/all/db15 -l logfile start
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database user is the install user ok
Checking database connection settings ok
Checking for prepared transactions ok
Checking for system-defined composite types in user tables ok
Checking for reg* data types in user tables ok
Checking for contrib/isn with bigint-passing mismatch ok
Creating dump of global objects ok
Creating dump of database schemas
ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for new cluster tablespace directories ok
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.
Performing Upgrade
------------------
Analyzing all rows in the new cluster ok
Freezing all rows in the new cluster ok
Deleting files from new pg_xact ok
Copying old pg_xact to new server ok
Setting oldest XID for new cluster ok
Setting next transaction ID and epoch for new cluster ok
Deleting files from new pg_multixact/offsets ok
Copying old pg_multixact/offsets to new server ok
Deleting files from new pg_multixact/members ok
Copying old pg_multixact/members to new server ok
Setting next multixact ID and offset for new cluster ok
Resetting WAL archives ok
Setting frozenxid and minmxid counters in new cluster ok
Restoring global objects in the new cluster ok
Restoring database schemas in the new cluster
ok
Copying user relation files
ok
Setting next OID for new cluster ok
Sync data directory to disk ok
Creating script to delete old cluster ok
Checking for extension updates ok
Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade.
Once you start the new server, consider running:
/usr/lib/postgresql/15/bin/vacuumdb --all --analyze-in-stages
Running this script will delete the old cluster's data files:
./delete_old_cluster.sh
----------------
ExecutionTime: 0h:00m:12s
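A quick sanity check after pg_upgrade finishes is to read the `PG_VERSION` file that PostgreSQL keeps at the top of every data directory. The sketch below assumes the `./data` volume mapping from the compose file, so the new cluster sits at `./data/db15` on the host; the function name `pg_major_version` is made up for this example, and on Linux the directory may only be readable by root or the container's postgres user.

```shell
#!/usr/bin/env bash
# Print the major version recorded in a data directory's PG_VERSION file.
pg_major_version() {
  local datadir="$1"
  if [ ! -r "$datadir/PG_VERSION" ]; then
    echo "cannot read $datadir/PG_VERSION" >&2
    return 1
  fi
  # The file holds e.g. "15" (or "9.6" for releases before 10); strip whitespace.
  tr -d '[:space:]' < "$datadir/PG_VERSION"
  echo
}

# Example (assumed host path from the ./data volume mapping); skipped if not readable.
if [ -r ./data/db15/PG_VERSION ]; then
  echo "new cluster version: $(pg_major_version ./data/db15)"
fi
```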
docker-compose up -d db15
[+] Running 1/1
✔ Container db15 Started
docker-compose logs db15
db15 |
db15 | PostgreSQL Database directory appears to contain a database; Skipping initialization
db15 |
db15 | 2023-12-24 02:50:09.279 UTC [1] LOG: starting PostgreSQL 15.5 (Debian 15.5-1.pgdg120+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
db15 | 2023-12-24 02:50:09.279 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
db15 | 2023-12-24 02:50:09.279 UTC [1] LOG: listening on IPv6 address "::", port 5432
db15 | 2023-12-24 02:50:09.282 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
db15 | 2023-12-24 02:50:09.290 UTC [30] LOG: database system was shut down at 2023-12-24 02:48:38 UTC
db15 | 2023-12-24 02:50:09.304 UTC [1] LOG: database system is ready to accept connections
db15 | 2023-12-24 02:50:24.376 UTC [34] WARNING: database "test" has a collation version mismatch
db15 | 2023-12-24 02:50:24.376 UTC [34] DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
db15 | 2023-12-24 02:50:24.376 UTC [34] HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE test REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
db15 | 2023-12-24 02:50:39.360 UTC [35] WARNING: database "postgres" has a collation version mismatch
db15 | 2023-12-24 02:50:39.360 UTC [35] DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
db15 | 2023-12-24 02:50:39.360 UTC [35] HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
docker-compose exec db15 psql "$database" -c 'ALTER DATABASE test REFRESH COLLATION VERSION;'
WARNING: database "test" has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE test REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
NOTICE: changing version from 2.31 to 2.36
ALTER DATABASE
docker-compose exec db15 vacuumdb --username=postgres --all --analyze-in-stages
WARNING: database "postgres" has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
WARNING: database "postgres" has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
vacuumdb: processing database "postgres": Generating minimal optimizer statistics (1 target)
WARNING: database "template1" has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE template1 REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
vacuumdb: processing database "template1": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "test": Generating minimal optimizer statistics (1 target)
WARNING: database "postgres" has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
vacuumdb: processing database "postgres": Generating medium optimizer statistics (10 targets)
WARNING: database "template1" has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE template1 REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
vacuumdb: processing database "template1": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "test": Generating medium optimizer statistics (10 targets)
WARNING: database "postgres" has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
vacuumdb: processing database "postgres": Generating default (full) optimizer statistics
WARNING: database "template1" has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE template1 REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
vacuumdb: processing database "template1": Generating default (full) optimizer statistics
vacuumdb: processing database "test": Generating default (full) optimizer statistics
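The output above shows that "postgres" and "template1" still report a collation version mismatch, since only "test" was refreshed. As a minimal sketch, the helper below extracts the affected database names from log text on stdin; `collation_mismatch_dbs` is a hypothetical name, and the sample lines are copied from the logs shown earlier.

```shell
#!/usr/bin/env bash
# Read log text on stdin and print each database that reports a
# collation version mismatch, one per line, without duplicates.
collation_mismatch_dbs() {
  sed -n 's/.*WARNING:[[:space:]]*database "\([^"]*\)" has a collation version mismatch.*/\1/p' | sort -u
}

# Demo on a few lines taken from the db15 log.
collation_mismatch_dbs <<'EOF'
db15 | 2023-12-24 02:50:24.376 UTC [34] WARNING: database "test" has a collation version mismatch
db15 | 2023-12-24 02:50:24.376 UTC [34] DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
db15 | 2023-12-24 02:50:39.360 UTC [35] WARNING: database "postgres" has a collation version mismatch
db15 | 2023-12-24 02:50:39.360 UTC [35] WARNING: database "postgres" has a collation version mismatch
EOF
# prints:
# postgres
# test

# Possible use against the running container (not executed here):
# docker-compose logs db15 | collation_mismatch_dbs | while read -r db; do
#   docker-compose exec -T db15 psql "$database" \
#     -c "ALTER DATABASE \"$db\" REFRESH COLLATION VERSION;" < /dev/null
# done
```

As the HINT in the log says, objects that use the default collation should be rebuilt before the recorded version is refreshed; the loop in the comment only performs the refresh.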
Verify that the upgrade succeeded
docker-compose exec db15 psql "$database" -c 'select count(*) from t_test;'
count
-------
1
(1 row)
docker-compose exec db15 psql "$database" -c 'select * from t_test;'
id | name | age | address | salary
----+------+-----+---------+--------
1 | EEEE | 29 | | 9179
(1 row)
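Verification passed.
Upgrade notes
If you use PostgreSQL extensions, they must be installed manually
Install them manually in the upgrade container (the commands below cover both the old and the new PG version):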
apt update && apt install postgresql-14-pgaudit
apt update && apt install postgresql-15-pgaudit
Old environment postgres:14-alpine, pg_upgrade run with postgres:15-bookworm, new runtime environment postgres:15-alpine
Running pg_upgrade with postgres:15-bookworm produced this warning:
database "postgres" has no actual collation version, but a version was recorded
Trying to fix it by reindexing and refreshing the collation version fails:
REINDEX DATABASE splat;
ALTER DATABASE splat REFRESH COLLATION VERSION;
ERROR: invalid collation version change
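A change in the glibc version or OS version causes errors after pg_upgrade. The official recommendation is that the old, new, and upgrade containers all use the same Linux runtime; in this example they should all use alpine:3.14.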
Finally, we wrote a complete automated verification script:
# docker-compose.yaml
version: "3"
services:
  db14:
    image: postgres:14-bookworm
    container_name: db14
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: test
      POSTGRES_DB: test
      PGDATA: /var/lib/postgresql/all/db14
    volumes:
      - ./data/:/var/lib/postgresql/all
    ports:
      - 5432:5432
  db15:
    image: postgres:15-bookworm
    container_name: db15
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: test
      POSTGRES_DB: test
      PGDATA: /var/lib/postgresql/all/db15
    volumes:
      - ./data/:/var/lib/postgresql/all
    ports:
      - 5433:5432
  upgrade:
    image: tianon/postgres-upgrade:14-to-15
    container_name: upgrade
    environment:
      PGDATAOLD: /var/lib/postgresql/all/db14
      PGDATANEW: /var/lib/postgresql/all/db15
      POSTGRES_USER: test
      POSTGRES_PASSWORD: password
    command: ["tail", "-f", "/dev/null"]
    volumes:
      - ./data/:/var/lib/postgresql/all
#!/usr/bin/env bash
# test.sh: end-to-end check of the 14 -> 15 upgrade; expects docker-compose.yaml in the same directory
set -o errtrace
set -o errexit
set -o nounset
set -o pipefail
set -o xtrace
cd "$(dirname "$0")"
# psql runs inside the containers, where the server listens on 5432 for both db14 and db15
export database='postgres://postgres:test@localhost:5432/test?sslmode=disable'
# Start from a clean slate
docker-compose down
rm -rf data
sleep 3
# Bring up pg14 and create test data
docker-compose up -d db14
sleep 5
docker-compose exec db14 pg_isready -d "$database" -t 20
docker-compose exec db14 psql "$database" -c 'select version();'
docker-compose exec db14 psql "$database" -c 'CREATE TABLE IF NOT EXISTS t_test( ID INT NOT NULL, NAME TEXT NOT NULL, AGE INT NOT NULL, ADDRESS CHAR(50), SALARY REAL );'
docker-compose exec db14 psql "$database" -c 'select count(*) from t_test;'
docker-compose exec db14 psql "$database" -c 'insert into t_test SELECT generate_series(1,1) as key,repeat( chr(int4(random()*26)+65),4), (random()*(6^2))::integer,null,(random()*(10^4))::integer;'
docker-compose exec db14 psql "$database" -c 'select count(*) from t_test;'
# Stop the old server, then run pg_upgrade in the upgrade container
docker-compose stop db14
docker-compose up -d upgrade
docker-compose exec upgrade docker-upgrade pg_upgrade
# Start pg15 on the upgraded data and run the post-upgrade steps
docker-compose up -d db15
sleep 3
docker-compose exec db15 psql "$database" -c 'ALTER DATABASE test REFRESH COLLATION VERSION;'
docker-compose exec db15 vacuumdb --username=postgres --all --analyze-in-stages
# Follow the pg15 logs (blocks until interrupted)
docker-compose logs -f db15
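The fixed `sleep` calls in the script are a guess at how long the containers need to come up. One possible alternative, sketched here under the assumption that a readiness command such as `pg_isready` is available in the container, is a small retry loop; `retry` is a hypothetical helper name, not part of docker-compose.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while ! "$@"; do
    if [ "$i" -ge "$attempts" ]; then
      echo "gave up after $attempts attempts: $*" >&2
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Possible use in test.sh in place of "sleep 5" (not executed here):
# retry 20 1 docker-compose exec -T db14 pg_isready -U postgres
```

Because the command runs in the `while` condition, a failing attempt does not trip `set -o errexit`; only the final `return 1` does.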
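Attempting the bitnami/postgresql-ha upgrade on k8s
Terabyte-scale PostgreSQL upgrade from 9.6 to 14 describes an in-place upgrade method; here we reproduce it and automate the process.
Reproducing the upgrade method.
First, create a Postgres 14 cluster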
helm pull oci://registry-1.docker.io/bitnamicharts/postgresql-ha --version 12.3.1
helm upgrade --install shared postgresql-ha-12.3.1.tgz --set postgresql.image.tag='14.10.0-debian-11-r6' --set postgresql.replicaCount=1 --set postgresql.sharedPreloadLibraries='"pgaudit, repmgr, pg_stat_statements"'
First, prepare the upgrade pod
# Stop PostgreSQL
kubectl scale --replicas=0 sts shared-postgresql-ha-postgresql
cat <<'EOF' | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: pgupgrade
spec:
  completions: 1
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      volumes:
        - name: data-sharedx-pgha-0
          persistentVolumeClaim:
            claimName: data-shared-postgresql-ha-postgresql-0
      containers:
        - name: pg-upgrade
          image: tianon/postgres-upgrade:14-to-15
          imagePullPolicy: IfNotPresent
          command:
            - "sh"
            - "-c"
            - "tail -f /dev/null"
          env:
            - name: PGDATABASE
              value: /bitnami/postgresql/
            - name: PGDATAOLD
              value: /bitnami/postgresql/data
            - name: PGDATANEW
              value: /bitnami/postgresql/datanew
          volumeMounts:
            - mountPath: "/bitnami/postgresql"
              name: data-sharedx-pgha-0
EOF
# Get a shell in the upgrade pod
kubectl exec -it "$(kubectl get pod -l batch.kubernetes.io/job-name=pgupgrade -o name)" -- bash
First attempt
docker-upgrade pg_upgrade
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
*failure*
Consult the last few lines of "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T060414.361/log/pg_upgrade_server.log" for
the probable cause of the failure.
connection to server on socket "/var/lib/postgresql/.s.PGSQL.50432" failed: No such file or directory
Is the server running locally and accepting connections on that socket?
could not connect to source postmaster started with the command:
"/usr/lib/postgresql/14/bin/pg_ctl" -w -l "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T060414.361/log/pg_upgrade_server.log" -D "/bitnami/postgresql/data" -o "-p 50432 -b -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start
Failure, exiting
cat /bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T060414.361/log/pg_upgrade_server.log
-----------------------------------------------------------------
pg_upgrade run on Tue Nov 28 06:04:14 2023
-----------------------------------------------------------------
command: "/usr/lib/postgresql/14/bin/pg_ctl" -w -l "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T060414.361/log/pg_upgrade_server.log" -D "/bitnami/postgresql/data" -o "-p 50432 -b -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start >> "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T060414.361/log/pg_upgrade_server.log" 2>&1
waiting for server to start....postgres: could not access the server configuration file "/bitnami/postgresql/data/postgresql.conf": No such file or directory
stopped waiting
pg_ctl: could not start server
Examine the log output.
It looks like the old data directory needs a postgresql.conf before the server can start. Copy the old configuration into the data directory:
kubectl exec -it shared-postgresql-ha-postgresql-0 -- cp /opt/bitnami/postgresql/conf/postgresql.conf /bitnami/postgresql/data
root@task-pg-upgrade-545cbc8cdf-6z8mf:/var/lib/postgresql# cat /bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T064300.020/log/pg_upgrade_server.log
-----------------------------------------------------------------
pg_upgrade run on Tue Nov 28 06:43:00 2023
-----------------------------------------------------------------
command: "/usr/lib/postgresql/14/bin/pg_ctl" -w -l "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T064300.020/log/pg_upgrade_server.log" -D "/bitnami/postgresql/data" -o "-p 50432 -b -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start >> "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T064300.020/log/pg_upgrade_server.log" 2>&1
waiting for server to start....2023-11-28 06:43:00.206 GMT [177] LOG: could not open configuration directory "/bitnami/postgresql/data/conf.d": No such file or directory
2023-11-28 06:43:00.207 GMT [177] FATAL: configuration file "/bitnami/postgresql/data/postgresql.conf" contains errors
stopped waiting
pg_ctl: could not start server
Examine the log output.
Fix it by running mkdir -p /bitnami/postgresql/data/conf.d, then try again:
cat /bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T064744.773/log/pg_upgrade_server.log
-----------------------------------------------------------------
pg_upgrade run on Tue Nov 28 06:47:44 2023
-----------------------------------------------------------------
command: "/usr/lib/postgresql/14/bin/pg_ctl" -w -l "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T064744.773/log/pg_upgrade_server.log" -D "/bitnami/postgresql/data" -o "-p 50432 -b -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start >> "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T064744.773/log/pg_upgrade_server.log" 2>&1
waiting for server to start....2023-11-28 06:47:44.970 GMT [255] FATAL: 58P01: could not access file "repmgr": No such file or directory
2023-11-28 06:47:44.970 GMT [255] LOCATION: internal_load_library, dfmgr.c:208
2023-11-28 06:47:44.970 GMT [255] LOG: 00000: database system is shut down
2023-11-28 06:47:44.970 GMT [255] LOCATION: UnlinkLockFiles, miscinit.c:970
stopped waiting
pg_ctl: could not start server
Examine the log output.
The configuration preloads repmgr and pgaudit; they are not needed during the upgrade, so remove them from the configuration:
sed -i 's/repmgr, pgaudit, repmgr, //' /bitnami/postgresql/data/postgresql.conf
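The sed above only matches when the libraries appear in exactly that order and spacing. A more tolerant variant, offered as a sketch rather than the method used here, parses the `shared_preload_libraries` value and drops the named entries wherever they occur. `strip_preload_libs` is a hypothetical helper; it assumes the value is single-quoted on one line and discards any trailing comment on that line.

```shell
#!/usr/bin/env bash
# strip_preload_libs CONF LIB...: remove the named libraries from the
# shared_preload_libraries line of CONF, keeping the remaining ones in order.
strip_preload_libs() {
  local conf="$1"
  shift
  local tmp
  tmp="$(mktemp)" || return 1
  awk -v drop=" $* " -v q="'" '
    /^[ \t]*shared_preload_libraries[ \t]*=/ {
      first = index($0, q)
      rest = substr($0, first + 1)
      last = index(rest, q)
      if (first > 0 && last > 0) {
        n = split(substr(rest, 1, last - 1), libs, ",")
        out = ""
        for (i = 1; i <= n; i++) {
          name = libs[i]
          gsub(/^[ \t]+|[ \t]+$/, "", name)           # trim each entry
          if (name == "" || index(drop, " " name " ") > 0) continue
          out = (out == "" ? name : out ", " name)
        }
        $0 = "shared_preload_libraries = " q out q
      }
    }
    { print }
  ' "$conf" > "$tmp" && cat "$tmp" > "$conf"            # overwrite in place, keeping owner and mode
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Possible use inside the upgrade pod (not executed here):
# strip_preload_libs /bitnami/postgresql/data/postgresql.conf repmgr pgaudit
```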
After the fix, try again
command: "/usr/lib/postgresql/14/bin/pg_ctl" -w -l "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T070009.292/log/pg_upgrade_server.log" -D "/bitnami/postgresql/data" -o "-p 50432 -b -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start >> "/bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T070009.292/log/pg_upgrade_server.log" 2>&1
waiting for server to start....2023-11-28 07:00:09.671 GMT [522] FATAL: 58P01: could not open log file "/opt/bitnami/postgresql/logs/postgresql.log": No such file or directory
2023-11-28 07:00:09.671 GMT [522] LOCATION: logfile_open, syslogger.c:1223
2023-11-28 07:00:09.688 GMT [522] LOG: 00000: database system is shut down
2023-11-28 07:00:09.688 GMT [522] LOCATION: UnlinkLockFiles, miscinit.c:970
stopped waiting
pg_ctl: could not start server
Examine the log output.
mkdir -p /opt/bitnami/postgresql/logs/
chown -R 999:999 /opt/bitnami/postgresql/logs/
chown -R 999:999 /opt/bitnami/postgresql/
After the fix, try again
connection to server on socket "/tmp/.s.PGSQL.50432" failed: fe_sendauth: no password supplied
echo 'local all all trust' >> /bitnami/postgresql/data/pg_hba.conf # does not work: pg_hba.conf uses the first matching rule, so the earlier md5 line still wins
sed -i 's/local all all md5/local all all trust/' /bitnami/postgresql/data/pg_hba.conf # works
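If the columns in the real pg_hba.conf are separated by several spaces or tabs, the literal single-space pattern above would not match. The following is a whitespace-tolerant sketch of the same edit; `hba_local_trust` is a hypothetical name, and the list of auth methods it rewrites (md5, scram-sha-256, peer) is an assumption.

```shell
#!/usr/bin/env bash
# hba_local_trust FILE: switch the "local all all <method>" rule to trust,
# whatever whitespace separates the columns. A backup is kept as FILE.bak.
hba_local_trust() {
  local hba="$1"
  sed -i.bak -E \
    's/^(local[[:space:]]+all[[:space:]]+all[[:space:]]+)(md5|scram-sha-256|peer)[[:space:]]*$/\1trust/' \
    "$hba"
}

# Possible use inside the upgrade pod (not executed here):
# hba_local_trust /bitnami/postgresql/data/pg_hba.conf
```

Trust authentication on the local socket is only meant for the duration of the upgrade; the `.bak` copy makes it easy to restore the original rule afterwards.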
After the fix, try again
Failure, exiting
cat /bitnami/postgresql/datanew/pg_upgrade_output.d/20231128T081622.981/loadable_libraries.txt
could not load library "$libdir/repmgr": ERROR: could not access file "$libdir/repmgr": No such file or directory
In database: repmgr
The missing repmgr library has to be installed for the new version in the upgrade container (see https://opensource-db.com/the-quick-and-easy-way-to-upgrade-postgres-using-pg_upgrade/):
apt install postgresql-15-repmgr
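Finally, it works.
Automation
Upgrading bitnami/postgresql-ha on k8s takes quite a few steps, so they are collected and automated here; the code is in exfly/bitnami-pg-upgrade. Adapt it as needed.
References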