Skip to content

Commit

Permalink
Toasting for AO tables should still use custom toast_tuple_target
Browse files Browse the repository at this point in the history
In commit a0684da we made a change to use the fixed TOAST_TUPLE_TARGET
instead of custom toast_tuple_target that's calculated based on the the
blocksize of the AO table for toasting AO table tuples.

This caused an issue that some tuples that are well beyond the
toast_tuple_target are not being toasted (because they're smaller than
TOAST_TUPLE_TARGET). This is ok when the tuples can still fit into the
AO table's blocksize. But if not, an error would occur. E.g.:

  postgres=# create table t (a int, b int[]) WITH (appendonly=true, blocksize=8192);
  CREATE TABLE
  postgres=# insert into t select 1, array_agg(x) from generate_series(1, 2030) x;
  ERROR:  item too long (check #1): length 8160, maxBufferLen 8168
  CONTEXT:  Append-Only table 't'

Therefore, we should revert those changes, including:

1. Still use the toast_tuple_target for toasting 'x' and 'e' columns for AO tables.
There's also a GPDB_12_MERGE_FIXME in the original comment suggesting to use
RelationGetToastTupleTarget for AO table (in order to unify the logic to calculate
maxDataLen for heap and AO tuples, I suppose). But that's not possible currently
because the calculated toast_tuple_target is stored in AppendOnlyInsertDesc and
we cannot get it from the Relation struct. And it seems to me that we won't have
a way to do that easily. Therefore, keep this FIXME comment being removed.

2. Still use the toast_tuple_target for toasting 'm' columns for AO tables. Also restore
the FIXME comment suggesting that we should use a bigger target size for 'm' columns.
This should be something that worth investigating more into.

Commit a0684da also includes a change to use custom toast_tuple_target for heap tables,
which should be a valid change. I think that fixed an oversight when merging the upstream
commit c251336.
  • Loading branch information
huansong authored and reshke committed Jan 28, 2025
1 parent 2a4af7f commit 972ee7f
Show file tree
Hide file tree
Showing 3 changed files with 36 additions and 5 deletions.
16 changes: 11 additions & 5 deletions src/backend/access/heap/heaptoast.c
Original file line number Diff line number Diff line change
Expand Up @@ -210,10 +210,8 @@ heap_toast_insert_or_update_generic(Relation rel, void *newtup, void *oldtup,
}
else
{
/* Since reloptions for AO table is not permitted, so using TOAST_TUPLE_TARGET */
hoff = sizeof(MemTupleData);
hoff = MAXALIGN(hoff);
maxDataLen = TOAST_TUPLE_TARGET - hoff;
maxDataLen = toast_tuple_target;
hoff = -1; /* keep compiler quiet about using 'hoff' uninitialized */
}

/*
Expand Down Expand Up @@ -295,7 +293,15 @@ heap_toast_insert_or_update_generic(Relation rel, void *newtup, void *oldtup,
* increase the target tuple size, so that MAIN attributes aren't stored
* externally unless really necessary.
*/
maxDataLen = TOAST_TUPLE_TARGET_MAIN - hoff;
/*
* FIXME: Should we do something like this with memtuples on
* AO tables too? Currently we do not increase the target tuple size for AO
* table, so there are occasions when columns of type 'm' will be stored
* out-of-line but they could otherwise be accommodated in-block
* c.f. upstream Postgres commit ca7c8168de76459380577eda56a3ed09b4f6195c
*/
if (!ismemtuple)
maxDataLen = TOAST_TUPLE_TARGET_MAIN - hoff;

while (heap_compute_data_size(tupleDesc,
toast_values, toast_isnull) > maxDataLen &&
Expand Down
13 changes: 13 additions & 0 deletions src/test/regress/expected/toast.out
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,19 @@ SELECT char_length(a), char_length(b), c FROM toastable_ao ORDER BY c;
10000000 | 10000032 | 3
(3 rows)

-- Check that small tuples can be correctly toasted as long as it's beyond the toast
-- target size (about 1/4 of the table's blocksize)
CREATE TABLE toastable_ao2(a int, b int[]) WITH (appendonly=true, blocksize=8192);
INSERT INTO toastable_ao2 SELECT 1, array_agg(x) from generate_series(1, 1000) x;
INSERT INTO toastable_ao2 SELECT 1, array_agg(x) from generate_series(1, 2030) x;
SELECT gp_segment_id, get_rel_toast_count('toastable_ao2') FROM gp_dist_random('gp_id') order by gp_segment_id;
gp_segment_id | get_rel_toast_count
---------------+---------------------
0 | 0
1 | 2
2 | 0
(3 rows)

-- UPDATE
-- (heap rel only) and toast the large tuple
UPDATE toastable_heap SET a=repeat('A',100000) WHERE c=1;
Expand Down
12 changes: 12 additions & 0 deletions src/test/regress/sql/toast.sql
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,10 @@ CREATE TABLE toastable_heap(a text, b varchar, c int) distributed by (b);
CREATE TABLE toastable_ao(a text, b varchar, c int) with(appendonly=true, compresslevel=1) distributed by (b);
ALTER TABLE toastable_ao ALTER COLUMN a SET STORAGE EXTERNAL;

-- start_ignore
create language plpython3u;
-- end_ignore

-- Helper function to generate a random string with given length. (Due to the
-- implementation, the length is rounded down to nearest multiple of 32.
-- That's good enough for our purposes.)
Expand Down Expand Up @@ -33,6 +37,14 @@ INSERT INTO toastable_ao VALUES(randomtext(10000000), randomtext(10000032), 3);
SELECT char_length(a), char_length(b), c FROM toastable_heap ORDER BY c;
SELECT char_length(a), char_length(b), c FROM toastable_ao ORDER BY c;

-- Check that small tuples can be correctly toasted as long as it's beyond the toast
-- target size (about 1/4 of the table's blocksize)
CREATE TABLE toastable_ao2(a int, b int[]) WITH (appendonly=true, blocksize=8192);
INSERT INTO toastable_ao2 SELECT 1, array_agg(x) from generate_series(1, 1000) x;
INSERT INTO toastable_ao2 SELECT 1, array_agg(x) from generate_series(1, 2030) x;
SELECT gp_segment_id, get_rel_toast_count('toastable_ao2') FROM gp_dist_random('gp_id') order by gp_segment_id;


-- UPDATE
-- (heap rel only) and toast the large tuple
UPDATE toastable_heap SET a=repeat('A',100000) WHERE c=1;
Expand Down

0 comments on commit 972ee7f

Please sign in to comment.