I'm playing with AlloyDB Omni, which is standard PostgreSQL wrapped in a container and packed with some GCP (Google) steroids. Everything is working well: I was able to build a simple configuration with a primary and a single standby. I was also able to use repmgr to test the switchover and switchback operations, which also work fine.
The problem starts when I try to use repmgr with automatic failover:
Symptoms:
I'm able to start the repmgrd service on both nodes:
on prim:
repmgr -f /var/alloydb/config/repmgr.conf daemon start --verbose
NOTICE: using provided configuration file "/var/alloydb/config/repmgr.conf"
INFO: connecting to local node
NOTICE: executing: "sudo /usr/bin/systemctl start repmgrd"
NOTICE: repmgrd was successfully started
Jun 24 04:24:39 omnidbv-repli-03 systemd[1]: Starting LSB: Start/stop repmgrd...
Jun 24 04:24:39 omnidbv-repli-03 repmgrd[10531]: Starting PostgreSQL replication management and monitoring daemon: repmgrd.
Jun 24 04:24:39 omnidbv-repli-03 systemd[1]: Started LSB: Start/stop repmgrd.
on stby:
repmgr -f /var/alloydb/config/repmgr.conf daemon start --verbose
NOTICE: using provided configuration file "/var/alloydb/config/repmgr.conf"
INFO: connecting to local node
NOTICE: executing: "sudo /usr/bin/systemctl start repmgrd"
NOTICE: repmgrd was successfully started
Jun 24 04:24:39 omnidbv-repli-03 systemd[1]: Starting LSB: Start/stop repmgrd...
Jun 24 04:24:39 omnidbv-repli-03 repmgrd[10531]: Starting PostgreSQL replication management and monitoring daemon: repmgrd.
Jun 24 04:24:39 omnidbv-repli-03 systemd[1]: Started LSB: Start/stop repmgrd.
The repmgr extension is installed on both nodes:
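repmgr=# SELECT * FROM pg_extension;
 oid   | extname                | extowner | extnamespace | extrelocatable | extversion | extconfig           | extcondition
-------+------------------------+----------+--------------+----------------+------------+---------------------+--------------
 14204 | plpgsql                | 10       | 11           | f              | 1.0        |                     |
 99377 | google_columnar_engine | 10       | 2200         | t              | 1.0        |                     |
 99567 | google_db_advisor      | 10       | 2200         | t              | 1.0        |                     |
 99661 | hypopg                 | 10       | 2200         | t              | 1.3.2      |                     |
 50059 | repmgr                 | 47598    | 50058        | f              | 5.4        | {50060,50076,50083} | {"","",""}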
Both "repmgr service status" and "repmgr daemon status" show the repmgrd PIDs but report repmgrd as 'not running':
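 ID | Name          | Role    | Status    | Upstream      | repmgrd     | PID   | Paused? | Upstream last seen
----+---------------+---------+-----------+---------------+-------------+-------+---------+--------------------
 1  | omnidbv-03-n1 | primary | * running |               | not running | 52598 | no      | n/a
 2  | omnidbv-03-n2 | standby |   running | omnidbv-03-n1 | not running | 10536 | no      | 0 second(s) ago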
Any clue why this might be happening? What checks does repmgr perform to determine the daemon status (besides the repmgrd_is_running function)? I'd appreciate any help in debugging.
BTW, why does the log file report "set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid" rather than the configured REPMGRD_PIDFILE=/var/run/repmgrd.pid?
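One check that can be run by hand is whether the PID recorded in the pidfile is visible from the place where the check runs. The sketch below is only an assumption about what a liveness test looks like in general (read the pidfile, then send signal 0). It is not taken from the repmgr source, and pid_is_visible is a made-up helper name.

```shell
# Hypothetical helper: report whether the PID stored in a pidfile is visible
# to the current shell. Returns 0 if visible, 1 if not, 2 if the pidfile is
# missing or does not contain a number.
pid_is_visible() {
    piv_file=$1
    if [ ! -r "$piv_file" ]; then
        echo "pidfile not readable: $piv_file"
        return 2
    fi
    # Take the first line and strip whitespace so only the PID remains.
    piv_pid=$(head -n 1 "$piv_file" | tr -d '[:space:]')
    case $piv_pid in
        ''|*[!0-9]*)
            echo "no numeric PID in $piv_file"
            return 2
            ;;
    esac
    # Signal 0 delivers nothing; it only tests whether the process can be
    # signalled. For a non-root caller it also fails when the process
    # belongs to another user.
    if kill -0 "$piv_pid" 2>/dev/null; then
        echo "PID $piv_pid is visible"
    else
        echo "PID $piv_pid is not visible from here"
        return 1
    fi
}
```

Running it once on the host and once inside the AlloyDB Omni container, with the pidfile path from the log message (for example pid_is_visible /tmp/repmgrd.pid), would show whether both sides see the same file and the same process.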
Versions:
repmgr --version
repmgr 5.4.1
postgres --version
postgres (PostgreSQL) 15.5
Configuration:
A) repmgrd content (/etc/default/repmgrd):
REPMGRD_ENABLED=yes
REPMGRD_CONF="/var/alloydb/config/repmgr.conf"
REPMGRD_OPTS="--daemonize=false"
REPMGRD_USER=postgres
REPMGRD_BIN=/usr/bin/repmgrd
REPMGRD_PIDFILE=/var/run/repmgrd.pid
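To confirm which pidfile path this file actually yields, it can be sourced the way an init script typically would. This is a minimal sketch, assuming the init script reads /etc/default/repmgrd by sourcing it (the usual Debian convention, not verified here); configured_pidfile is a made-up helper name.

```shell
# Hypothetical helper: print the pidfile path defined in a defaults file.
# Pass a path that contains a slash, otherwise "." searches PATH.
configured_pidfile() {
    # Source inside a subshell so the variables do not leak into the caller.
    (
        REPMGRD_PIDFILE=
        . "$1" || exit 1
        printf '%s\n' "${REPMGRD_PIDFILE:-unset}"
    )
}
```

Calling configured_pidfile /etc/default/repmgrd and comparing the result with the path in the set_repmgrd_pid() log message shows whether the daemon was started with the value from this file.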
B) repmgr configuration (/var/alloydb/config/repmgr.conf):
failover=automatic
promote_command='/usr/bin/repmgr standby promote -f /var/alloydb/config/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /var/alloydb/config/repmgr.conf --log-to-file --upstream-node-id=%n'
repmgrd_service_start_command='sudo /usr/bin/systemctl start repmgrd'
repmgrd_service_stop_command='sudo /usr/bin/systemctl stop repmgrd'
monitoring_history=yes
log_level=INFO
log_file='/var/log/postgres/repmgrd.log'
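Because repmgr.conf is a flat list of key=value lines, a key that is accidentally assigned twice is easy to overlook. The following is a rough sketch of a sanity check for that; find_duplicate_keys is a made-up helper name, and how repmgr itself treats a repeated key is not something this check assumes or verifies.

```shell
# Hypothetical helper: print every key that is assigned more than once
# in a key=value configuration file.
find_duplicate_keys() {
    awk -F= '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        {
            key = $1
            gsub(/[ \t]/, "", key)           # strip whitespace around the key
            if (++seen[key] == 2) print key  # report once, on the second sighting
        }
    ' "$1"
}
```

Usage: find_duplicate_keys /var/alloydb/config/repmgr.conf prints nothing when every key appears only once.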
Symptoms:
I'm able to start the repmgrd service on both nodes:
on prim:
repmgr -f /var/alloydb/config/repmgr.conf daemon start --verbose
NOTICE: using provided configuration file "/var/alloydb/config/repmgr.conf"
INFO: connecting to local node
NOTICE: executing: "sudo /usr/bin/systemctl start repmgrd"
NOTICE: repmgrd was successfully started
prim output:
● repmgrd.service - LSB: Start/stop repmgrd
Loaded: loaded (/etc/init.d/repmgrd; generated)
Active: active (running) since Mon 2024-06-24 04:24:39 EDT; 16min ago
Docs: man:systemd-sysv-generator(8)
Process: 10531 ExecStart=/etc/init.d/repmgrd start (code=exited, status=0/SUCCESS)
Tasks: 1 (limit: 19151)
Memory: 1.3M
CPU: 532ms
CGroup: /system.slice/repmgrd.service
└─10536 /usr/lib/postgresql/15/bin/repmgrd --config-file /var/alloydb/config/repmgr.conf --daemonize=false
Jun 24 04:24:39 omnidbv-repli-03 systemd[1]: Starting LSB: Start/stop repmgrd...
Jun 24 04:24:39 omnidbv-repli-03 repmgrd[10531]: Starting PostgreSQL replication management and monitoring daemon: repmgrd.
Jun 24 04:24:39 omnidbv-repli-03 systemd[1]: Started LSB: Start/stop repmgrd.
on stby:
repmgr -f /var/alloydb/config/repmgr.conf daemon start --verbose
NOTICE: using provided configuration file "/var/alloydb/config/repmgr.conf"
INFO: connecting to local node
NOTICE: executing: "sudo /usr/bin/systemctl start repmgrd"
NOTICE: repmgrd was successfully started
stby output:
● repmgrd.service - LSB: Start/stop repmgrd
Loaded: loaded (/etc/init.d/repmgrd; generated)
Active: active (running) since Mon 2024-06-24 04:24:39 EDT; 17min ago
Docs: man:systemd-sysv-generator(8)
Process: 10531 ExecStart=/etc/init.d/repmgrd start (code=exited, status=0/SUCCESS)
Tasks: 1 (limit: 19151)
Memory: 1.3M
CPU: 567ms
CGroup: /system.slice/repmgrd.service
└─10536 /usr/lib/postgresql/15/bin/repmgrd --config-file /var/alloydb/config/repmgr.conf --daemonize=false
Jun 24 04:24:39 omnidbv-repli-03 systemd[1]: Starting LSB: Start/stop repmgrd...
Jun 24 04:24:39 omnidbv-repli-03 repmgrd[10531]: Starting PostgreSQL replication management and monitoring daemon: repmgrd.
Jun 24 04:24:39 omnidbv-repli-03 systemd[1]: Started LSB: Start/stop repmgrd.