You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Running CREATE INDEX CONCURRENTLY on subscriber can break subscription (subscriber: postgres 15.6 + pglogical 2.4.4) - error message in logs: could not read block 0 in file ""base/[DB_OID]/[IDX_RELATION_FILENODE]"": read only 0 of 8192 bytes
#469
Open
aristocrates opened this issue
Apr 9, 2024
· 1 comment
I don't have a minimal repro for this right now, but the symptom I have observed on a subscriber that I am syncing for an upgrade is:
(context) I have modified the subscriber schema to just have primary key indexes and no constraints for the initial data sync - plan is to create the other indexes concurrently on the subscriber once all tables have synced, and then after index creation to create all foreign key constraints (with NOT VALID followed by VALIDATE CONSTRAINT)
But, on some indexes in the desired final schema, running CREATE UNIQUE INDEX CONCURRENTLY [options]; on the subscriber results in the subscription going status: down with a log line similar to the following in the CSV log:
2024-04-02 15:16:17.297 PDT,,"$DBNAME",4712,,6608afb3.1268,3738,"",2024-03-30 17:34:59 PDT,7/184779565,0,ERROR,XX001,"could not read block 0 in file ""base/16400/105657687"": read only 0 of 8192 bytes",,,,,"apply INSERT from remote relation $TABLE_NAME in commit before 45415/83927478, xid 197646281 committed at 1999-12-31 16:12:45.411377-08 (action #2) from node replorigin 1",,,,"pglogical apply 16400:988484226","pglogical apply 16400:988484226",,0
In this particular example, 105657687 is indeed the result of calling pg_relation_filenode($INDEX_NAME) on the currently-INVALID $INDEX_NAME that is being created concurrently at the time the subscription goes down:
$DBNAME=# select pg_relation_filenode($INDEX_NAME); -- select oid from pg_database where datname = $DBNAME; yields 16400
pg_relation_filenode
----------------------
105657687
(1 row)
postgres@[host]:/path/to/data/base/16400$ ls -alh 105657687
-rw------- 1 postgres postgres 0 Apr 2 15:16 105657687
The subscriber is on postgres 15.6 and pglogical 2.4.4
A workaround that seems to work fine is to just create the index non-concurrently, in which case pglogical apply on the subscriber blocks on obtaining a RowExclusiveLock until the create index finishes, and the replication slot stays put on the provider.
Based on #344 (comment) (and other guides around using pglogical on DBs with large tables) the expected behavior is for CREATE INDEX CONCURRENTLY to work fine on the subscriber:
You can always create indexes after the initial load. If you are concerned about index build that locks out writes, you can always use CONCURRENTLY.
The text was updated successfully, but these errors were encountered:
I have not noticed this on all indexes, just on a handful. I would be surprised if this happened every time an update, insert, or delete replicated over while the concurrent index creation was happening, since I do not recall running into this on other DB upgrades between the same postgres versions, but the behavior replicated when I retried index creation on these particular indexes, even after re-initializing the subscriber from scratch on a new machine.
I don't have a minimal repro for this right now, but the symptom I have observed on a subscriber that I am syncing for an upgrade is:
CREATE UNIQUE INDEX CONCURRENTLY [options];
on the subscriber results in the subscription goingstatus: down
with a log line similar to the following in the CSV log:In this particular example, 105657687 is indeed the result of calling
pg_relation_filenode($INDEX_NAME)
on the currently-INVALID$INDEX_NAME
that is being created concurrently at the time the subscription goes down:The subscriber is on postgres 15.6 and pglogical 2.4.4
A workaround that seems to work fine is to just create the index non-concurrently, in which case pglogical apply on the subscriber blocks on obtaining a RowExclusiveLock until the create index finishes, and the replication slot stays put on the provider.
Based on #344 (comment) (and other guides around using pglogical on DBs with large tables) the expected behavior is for
CREATE INDEX CONCURRENTLY
to work fine on the subscriber:The text was updated successfully, but these errors were encountered: