From f3641976cc3d9ca8c211acd84845b8883376cef3 Mon Sep 17 00:00:00 2001 From: Siddharth Shah Date: Mon, 13 Jan 2025 21:52:14 +0530 Subject: [PATCH 1/5] doc changes --- .../advanced-configuration.md | 10 +++++++++- .../cdc-get-started.md | 10 +++++++--- .../preview/reference/configuration/yb-master.md | 2 +- .../preview/reference/configuration/yb-tserver.md | 8 +++++++- .../advanced-configuration.md | 10 +++++++++- .../cdc-get-started.md | 10 +++++++--- .../stable/reference/configuration/yb-master.md | 2 +- .../stable/reference/configuration/yb-tserver.md | 8 +++++++- 8 files changed, 48 insertions(+), 12 deletions(-) diff --git a/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md b/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md index 6809e8850412..831640235014 100644 --- a/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md +++ b/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md @@ -30,6 +30,14 @@ CDC retains resources (such as WAL segments) that contain information related to Retaining resources has an impact on the system. Clients are expected to consume these transactions within configurable duration limits. Resources will be released if the duration exceeds these configured limits. -Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag to control the duration for which resources are retained. +Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) & [cdc_wal_retention_time_secs](../../../../reference/configuration/yb-tserver/#cdc_wal_retention_time_secs) flag to control the duration for which resources are retained. Resources are retained for each tablet of a table that is part of a database whose changes are being consumed using a replication slot. 
This includes those tables that may not be currently part of the publication specification. + +Starting from 2024.2.1, the data retention configuration for Change Data Capture (CDC) has been updated. The default retention period is now set to 8 hours, with support for maximum retention up to 24 hours. Prior to 2024.2.1, the default retention for CDC is 4 hours. + +{{< warning title="Important" >}} +When using replica identity FULL or DEFAULT, CDC preserves previous row values for UPDATE and DELETE operations. This is accomplished by retaining history for each row in the database through a suspension of the compaction process. Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of history for streamed rows. + +The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period, defaulting to 8 hours. Users should be aware that any interruption in CDC consumption for extended periods with the above-mentioned replica identities may lead to potential read performance degradation. This happens because compaction activities are halted in the database with these replica identities, leading to inefficient key lookups as reads must traverse multiple SST files, which degrades read performance. 
+{{< /warning >}} diff --git a/docs/content/preview/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md b/docs/content/preview/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md index 6942824e333c..d8415d8f0c11 100644 --- a/docs/content/preview/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md +++ b/docs/content/preview/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md @@ -535,12 +535,16 @@ You can use several flags to fine-tune YugabyteDB's CDC behavior. These flags ar ## Retaining data for longer durations -To increase retention of data for CDC, change the two flags, `cdc_intent_retention_ms` and `cdc_wal_retention_time_secs` as required. +The following flags are responsible for retention of data required by CDC: +- `cdc_wal_retention_time_secs` (default value: 28800s) +- `cdc_intent_retention_ms` (default value: 28800000ms) -{{< warning title="Important" >}} +Starting from 2024.2.1, the data retention configuration for Change Data Capture (CDC) has been updated. The default retention period is now set to 8 hours, with support for maximum retention up to 24 hours. Prior to 2024.2.1, the default retention for CDC is 4 hours. -Longer values of `cdc_intent_retention_ms`, coupled with longer CDC lags (periods of downtime where the client is not requesting changes) can result in increased memory footprint in the YB-TServer and affect read performance. +{{< warning title="Important" >}} +When using before image modes ALL, FULL_ROW_NEW_IMAGE or MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES, CDC preserves previous row values for UPDATE and DELETE operations. This is accomplished by retaining history for each row in the database through a suspension of the compaction process. Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. 
These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of streamed rows. +The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period, defaulting to 8 hours. Users should be aware that any interruption in CDC consumption for extended periods with the above-mentioned before image modes may lead to potential read performance degradation. This happens because compaction activities are halted in the database with these before image modes, leading to inefficient key lookups as reads must traverse multiple SST files, which degrades read performance. {{< /warning >}} ## Content-based routing diff --git a/docs/content/preview/reference/configuration/yb-master.md b/docs/content/preview/reference/configuration/yb-master.md index 7cdc0199f8b5..b9b744787136 100644 --- a/docs/content/preview/reference/configuration/yb-master.md +++ b/docs/content/preview/reference/configuration/yb-master.md @@ -931,7 +931,7 @@ Default: `0` (Use the same default number of tablets as for regular tables.) WAL retention time, in seconds, to be used for tables for which a CDC stream was created. Used in both xCluster and CDCSDK. -Default: `14400` (4 hours) +Default: `28800` (8 hours) ##### --enable_tablet_split_of_cdcsdk_streamed_tables diff --git a/docs/content/preview/reference/configuration/yb-tserver.md b/docs/content/preview/reference/configuration/yb-tserver.md index 2898f510ed91..55f7bc2ce236 100644 --- a/docs/content/preview/reference/configuration/yb-tserver.md +++ b/docs/content/preview/reference/configuration/yb-tserver.md @@ -1336,7 +1336,13 @@ Default: `102400` The time period, in milliseconds, after which the intents will be cleaned up if there is no client polling for the change records. 
-Default: `14400000` (4 hours) +Default: `28800000` (8 hours) + +##### --cdc_wal_retention_time_secs + +WAL retention time, in seconds, to be used for tables for which a CDC stream was created. Used in both xCluster and CDCSDK. + +Default: `28800` (8 hours) ##### --cdcsdk_table_processing_limit_per_run diff --git a/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-configuration.md b/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-configuration.md index 4b34d5255a78..353fc4612bfe 100644 --- a/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-configuration.md +++ b/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-configuration.md @@ -28,6 +28,14 @@ CDC retains resources (such as WAL segments) that contain information related to Retaining resources has an impact on the system. Clients are expected to consume these transactions within configurable duration limits. Resources will be released if the duration exceeds these configured limits. -Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag to control the duration for which resources are retained. +Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) & [cdc_wal_retention_time_secs](../../../../reference/configuration/yb-tserver/#cdc_wal_retention_time_secs) flag to control the duration for which resources are retained. Resources are retained for each tablet of a table that is part of a database whose changes are being consumed using a replication slot. This includes those tables that may not be currently part of the publication specification. + +Starting from 2024.2.1, the data retention configuration for Change Data Capture (CDC) has been updated. The default retention period is now set to 8 hours, with support for maximum retention up to 24 hours. 
Prior to 2024.2.1, the default retention for CDC is 4 hours. + +{{< warning title="Important" >}} +When using replica identity FULL or DEFAULT, CDC preserves previous row values for UPDATE and DELETE operations. This is accomplished by retaining history for each row in the database through a suspension of the compaction process. Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of history for streamed rows. + +The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period, defaulting to 8 hours. Users should be aware that any interruption in CDC consumption for extended periods with the above-mentioned replica identities may lead to potential read performance degradation. This happens because compaction activities are halted in the database with these replica identities, leading to inefficient key lookups as reads must traverse multiple SST files, which degrades read performance. +{{< /warning >}} diff --git a/docs/content/stable/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md b/docs/content/stable/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md index 822f5ec7d24c..1512dabdb5ca 100644 --- a/docs/content/stable/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md +++ b/docs/content/stable/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md @@ -532,12 +532,16 @@ You can use several flags to fine-tune YugabyteDB's CDC behavior. 
These flags ar ## Retaining data for longer durations -To increase retention of data for CDC, change the two flags, `cdc_intent_retention_ms` and `cdc_wal_retention_time_secs` as required. +The following flags are responsible for retention of data required by CDC: +- `cdc_wal_retention_time_secs` (default value: 28800s) +- `cdc_intent_retention_ms` (default value: 28800000ms) -{{< warning title="Important" >}} +Starting from 2024.2.1, the data retention configuration for Change Data Capture (CDC) has been updated. The default retention period is now set to 8 hours, with support for maximum retention up to 24 hours. Prior to 2024.2.1, the default retention for CDC is 4 hours. -Longer values of `cdc_intent_retention_ms`, coupled with longer CDC lags (periods of downtime where the client is not requesting changes) can result in increased memory footprint in the YB-TServer and affect read performance. +{{< warning title="Important" >}} +When using before image modes ALL, FULL_ROW_NEW_IMAGE or MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES, CDC preserves previous row values for UPDATE and DELETE operations. This is accomplished by retaining history for each row in the database through a suspension of the compaction process. Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of streamed rows. +The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period, defaulting to 8 hours. Users should be aware that any interruption in CDC consumption for extended periods with the above-mentioned before image modes may lead to potential read performance degradation. 
This happens because compaction activities are halted in the database with these before image modes, leading to inefficient key lookups as reads must traverse multiple SST files, which degrades read performance. {{< /warning >}} ## Content-based routing diff --git a/docs/content/stable/reference/configuration/yb-master.md b/docs/content/stable/reference/configuration/yb-master.md index 55658ad5f7d5..d86cfef7e3aa 100644 --- a/docs/content/stable/reference/configuration/yb-master.md +++ b/docs/content/stable/reference/configuration/yb-master.md @@ -939,7 +939,7 @@ Default: `0` (Use the same default number of tablets as for regular tables.) WAL retention time, in seconds, to be used for tables for which a CDC stream was created. Used in both xCluster and CDCSDK. -Default: `14400` (4 hours) +Default: `28800` (8 hours) ##### --enable_tablet_split_of_cdcsdk_streamed_tables diff --git a/docs/content/stable/reference/configuration/yb-tserver.md b/docs/content/stable/reference/configuration/yb-tserver.md index 52744c7e6c91..b042d1e31fef 100644 --- a/docs/content/stable/reference/configuration/yb-tserver.md +++ b/docs/content/stable/reference/configuration/yb-tserver.md @@ -1344,7 +1344,13 @@ Default: `102400` The time period, in milliseconds, after which the intents will be cleaned up if there is no client polling for the change records. -Default: `14400000` (4 hours) +Default: `28800000` (8 hours) + +##### --cdc_wal_retention_time_secs + +WAL retention time, in seconds, to be used for tables for which a CDC stream was created. Used in both xCluster and CDCSDK. 
+ +Default: `28800` (8 hours) ##### --cdcsdk_table_processing_limit_per_run From 063eb7a2ff784ecf9c2d541980d66fb027f2bab6 Mon Sep 17 00:00:00 2001 From: Siddharth Shah Date: Wed, 29 Jan 2025 22:50:04 +0530 Subject: [PATCH 2/5] add flag changes for dynamic table support in logical replication --- .../using-logical-replication/advanced-topic.md | 14 -------------- .../content/preview/explore/change-data-capture.md | 2 +- .../preview/reference/configuration/yb-tserver.md | 4 ++-- .../using-logical-replication/advanced-topic.md | 14 -------------- docs/content/stable/explore/change-data-capture.md | 2 +- .../stable/reference/configuration/yb-tserver.md | 4 ++-- .../using-logical-replication/advanced-topic.md | 14 -------------- .../content/v2024.1/explore/change-data-capture.md | 2 +- .../v2024.1/reference/configuration/yb-tserver.md | 4 ++-- 9 files changed, 9 insertions(+), 51 deletions(-) diff --git a/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-topic.md b/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-topic.md index d4c79055b967..982b731e2bb8 100644 --- a/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-topic.md +++ b/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-topic.md @@ -132,20 +132,6 @@ The value of this flag can be changed at run time, but the change becomes effect To enable dynamic table addition, perform the following steps: -1. Set the [cdcsdk_enable_dynamic_table_support](../../../../reference/configuration/yb-tserver/#cdcsdk-enable-dynamic-table-support) to true. - - Because it is a preview flag, first add it to the `allowed_preview_flags_csv` list. - - ```sh - ./yb-ts-cli --server_address= set_flag allowed_preview_flags_csv cdcsdk_enable_dynamic_table_support - ``` - - Then set the `cdcsdk_enable_dynamic_table_support` flag to true. 
- - ```sh - ./yb-ts-cli --server_address= set_flag cdcsdk_enable_dynamic_table_support true - ``` - 1. Set the [cdcsdk_publication_list_refresh_interval_secs](../../../../reference/configuration/yb-tserver/#cdcsdk-publication-list-refresh-interval-secs) flag to a lower value, such as 60 or 120 seconds. Note that the effect of this setting takes place after the upcoming publication refresh is performed. ```sh diff --git a/docs/content/preview/explore/change-data-capture.md b/docs/content/preview/explore/change-data-capture.md index 8fc2b99d576f..be94dfb12cb8 100644 --- a/docs/content/preview/explore/change-data-capture.md +++ b/docs/content/preview/explore/change-data-capture.md @@ -57,7 +57,7 @@ To set up pg_recvlogical, create and start the local cluster by running the foll ./bin/yugabyted start \ --advertise_address=127.0.0.1 \ --base_dir="${HOME}/var/node1" \ - --tserver_flags="allowed_preview_flags_csv={cdcsdk_enable_dynamic_table_support},cdcsdk_enable_dynamic_table_support=true,cdcsdk_publication_list_refresh_interval_secs=2" + --tserver_flags="cdcsdk_publication_list_refresh_interval_secs=2" ``` ### Create tables diff --git a/docs/content/preview/reference/configuration/yb-tserver.md b/docs/content/preview/reference/configuration/yb-tserver.md index 55f7bc2ce236..efe8f22d008e 100644 --- a/docs/content/preview/reference/configuration/yb-tserver.md +++ b/docs/content/preview/reference/configuration/yb-tserver.md @@ -1360,9 +1360,9 @@ Default: `CHANGE` ##### --cdcsdk_enable_dynamic_table_support -Tables created after the creation of a replication slot are referred as Dynamic tables. This preview flag can be used to switch the dynamic addition of tables to the publication ON or OFF. +Tables created after the creation of a replication slot are referred as Dynamic tables. This flag can be used to switch the dynamic addition of tables to the publication ON or OFF. 
-Default: `false` +Default: `true` ##### --cdcsdk_publication_list_refresh_interval_secs diff --git a/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-topic.md b/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-topic.md index c96447426e82..b710c61d5da8 100644 --- a/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-topic.md +++ b/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-topic.md @@ -130,20 +130,6 @@ The value of this flag can be changed at run time, but the change becomes effect To enable dynamic table addition, perform the following steps: -1. Set the [cdcsdk_enable_dynamic_table_support](../../../../reference/configuration/yb-tserver/#cdcsdk-enable-dynamic-table-support) to true. - - Because it is a preview flag, first add it to the `allowed_preview_flags_csv` list. - - ```sh - ./yb-ts-cli --server_address= set_flag allowed_preview_flags_csv cdcsdk_enable_dynamic_table_support - ``` - - Then set the `cdcsdk_enable_dynamic_table_support` flag to true. - - ```sh - ./yb-ts-cli --server_address= set_flag cdcsdk_enable_dynamic_table_support true - ``` - 1. Set the [cdcsdk_publication_list_refresh_interval_secs](../../../../reference/configuration/yb-tserver/#cdcsdk-publication-list-refresh-interval-secs) flag to a lower value, such as 60 or 120 seconds. Note that the effect of this setting takes place after the upcoming publication refresh is performed. 
```sh diff --git a/docs/content/stable/explore/change-data-capture.md b/docs/content/stable/explore/change-data-capture.md index ebe6b2f029c1..1eff18651c43 100644 --- a/docs/content/stable/explore/change-data-capture.md +++ b/docs/content/stable/explore/change-data-capture.md @@ -57,7 +57,7 @@ To set up pg_recvlogical, create and start the local cluster by running the foll ./bin/yugabyted start \ --advertise_address=127.0.0.1 \ --base_dir="${HOME}/var/node1" \ - --tserver_flags="allowed_preview_flags_csv={cdcsdk_enable_dynamic_table_support},cdcsdk_enable_dynamic_table_support=true,cdcsdk_publication_list_refresh_interval_secs=2" + --tserver_flags="cdcsdk_publication_list_refresh_interval_secs=2" ``` ### Create tables diff --git a/docs/content/stable/reference/configuration/yb-tserver.md b/docs/content/stable/reference/configuration/yb-tserver.md index b042d1e31fef..686be98c3cf6 100644 --- a/docs/content/stable/reference/configuration/yb-tserver.md +++ b/docs/content/stable/reference/configuration/yb-tserver.md @@ -1368,9 +1368,9 @@ Default: `CHANGE` ##### --cdcsdk_enable_dynamic_table_support -Tables created after the creation of a replication slot are referred as Dynamic tables. This preview flag can be used to switch the dynamic addition of tables to the publication ON or OFF. +Tables created after the creation of a replication slot are referred as Dynamic tables. This flag can be used to switch the dynamic addition of tables to the publication ON or OFF. 
-Default: `false` +Default: `true` ##### --cdcsdk_publication_list_refresh_interval_secs diff --git a/docs/content/v2024.1/develop/change-data-capture/using-logical-replication/advanced-topic.md b/docs/content/v2024.1/develop/change-data-capture/using-logical-replication/advanced-topic.md index 6e38525d61dc..6596b1ede6d6 100644 --- a/docs/content/v2024.1/develop/change-data-capture/using-logical-replication/advanced-topic.md +++ b/docs/content/v2024.1/develop/change-data-capture/using-logical-replication/advanced-topic.md @@ -130,20 +130,6 @@ The value of this flag can be changed at run time, but the change becomes effect To enable dynamic table addition, perform the following steps: -1. Set the [cdcsdk_enable_dynamic_table_support](../../../../reference/configuration/yb-tserver/#cdcsdk-enable-dynamic-table-support) to true. - - Because it is a preview flag, first add it to the `allowed_preview_flags_csv` list. - - ```sh - ./yb-ts-cli --server_address= set_flag allowed_preview_flags_csv cdcsdk_enable_dynamic_table_support - ``` - - Then set the `cdcsdk_enable_dynamic_table_support` flag to true. - - ```sh - ./yb-ts-cli --server_address= set_flag cdcsdk_enable_dynamic_table_support true - ``` - 1. Set the [cdcsdk_publication_list_refresh_interval_secs](../../../../reference/configuration/yb-tserver/#cdcsdk-publication-list-refresh-interval-secs) flag to a lower value, such as 60 or 120 seconds. Note that the effect of this setting takes place after the upcoming publication refresh is performed. 
```sh diff --git a/docs/content/v2024.1/explore/change-data-capture.md b/docs/content/v2024.1/explore/change-data-capture.md index 753a796aa04e..87c3524eca81 100644 --- a/docs/content/v2024.1/explore/change-data-capture.md +++ b/docs/content/v2024.1/explore/change-data-capture.md @@ -57,7 +57,7 @@ To set up pg_recvlogical, create and start the local cluster by running the foll ./bin/yugabyted start \ --advertise_address=127.0.0.1 \ --base_dir="${HOME}/var/node1" \ - --tserver_flags="allowed_preview_flags_csv={cdcsdk_enable_dynamic_table_support},cdcsdk_enable_dynamic_table_support=true,cdcsdk_publication_list_refresh_interval_secs=2" + --tserver_flags="cdcsdk_publication_list_refresh_interval_secs=2" ``` ### Create tables diff --git a/docs/content/v2024.1/reference/configuration/yb-tserver.md b/docs/content/v2024.1/reference/configuration/yb-tserver.md index c35345b649e5..2d814f3c5c9b 100644 --- a/docs/content/v2024.1/reference/configuration/yb-tserver.md +++ b/docs/content/v2024.1/reference/configuration/yb-tserver.md @@ -1334,9 +1334,9 @@ Default: `CHANGE` ##### --cdcsdk_enable_dynamic_table_support -Tables created after the creation of a replication slot are referred as Dynamic tables. This preview flag can be used to switch the dynamic addition of tables to the publication ON or OFF. +Tables created after the creation of a replication slot are referred as Dynamic tables. This flag can be used to switch the dynamic addition of tables to the publication ON or OFF. 
-Default: `false` +Default: `true` ##### --cdcsdk_publication_list_refresh_interval_secs From b1a56bfaedcf0c0829d7e63c8a98f5dde4b133ee Mon Sep 17 00:00:00 2001 From: Dwight Hodge <79169168+ddhodge@users.noreply.github.com> Date: Wed, 29 Jan 2025 21:35:34 -0500 Subject: [PATCH 3/5] Update docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md --- .../using-logical-replication/advanced-configuration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md b/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md index 831640235014..0d896a7cbc92 100644 --- a/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md +++ b/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md @@ -30,7 +30,7 @@ CDC retains resources (such as WAL segments) that contain information related to Retaining resources has an impact on the system. Clients are expected to consume these transactions within configurable duration limits. Resources will be released if the duration exceeds these configured limits. -Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) & [cdc_wal_retention_time_secs](../../../../reference/configuration/yb-tserver/#cdc_wal_retention_time_secs) flag to control the duration for which resources are retained. +Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) and [cdc_wal_retention_time_secs](../../../../reference/configuration/yb-tserver/#cdc-wal-retention-time-secs) flags to control the duration for which resources are retained. Resources are retained for each tablet of a table that is part of a database whose changes are being consumed using a replication slot. 
This includes those tables that may not be currently part of the publication specification. From 80f02319a73e32bba84ab34cb3b1b4aabe70d461 Mon Sep 17 00:00:00 2001 From: Dwight Hodge Date: Thu, 30 Jan 2025 14:15:02 -0500 Subject: [PATCH 4/5] apply suggestions --- .../advanced-configuration.md | 6 +++--- .../cdc-get-started.md | 15 ++++++++------- 2 files changed, 11 insertions(+), 10 deletions(-) diff --git a/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md b/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md index 0d896a7cbc92..aaa990334f13 100644 --- a/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md +++ b/docs/content/preview/develop/change-data-capture/using-logical-replication/advanced-configuration.md @@ -34,10 +34,10 @@ Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver Resources are retained for each tablet of a table that is part of a database whose changes are being consumed using a replication slot. This includes those tables that may not be currently part of the publication specification. -Starting from 2024.2.1, the data retention configuration for Change Data Capture (CDC) has been updated. The default retention period is now set to 8 hours, with support for maximum retention up to 24 hours. Prior to 2024.2.1, the default retention for CDC is 4 hours. +Starting from v2024.2.1, the default data retention for CDC is 8 hours, with support for maximum retention up to 24 hours. Prior to v2024.2.1, the default retention for CDC was 4 hours. {{< warning title="Important" >}} -When using replica identity FULL or DEFAULT, CDC preserves previous row values for UPDATE and DELETE operations. This is accomplished by retaining history for each row in the database through a suspension of the compaction process. 
Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of history for streamed rows. +When using FULL or DEFAULT replica identities, CDC preserves previous row values for UPDATE and DELETE operations. This is done by retaining history for each row in the database through a suspension of the compaction process. Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of history for streamed rows. -The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period, defaulting to 8 hours. Users should be aware that any interruption in CDC consumption for extended periods with the above-mentioned replica identities may lead to potential read performance degradation. This happens because compaction activities are halted in the database with these replica identities, leading to inefficient key lookups as reads must traverse multiple SST files, which degrades read performance. +The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period (default 8 hours). Be aware that any interruption in CDC consumption for extended periods using these replica identities may degrade read performance. This happens because compaction activities are halted in the database when these replica identities are used, leading to inefficient key lookups as reads must traverse multiple SST files. 
{{< /warning >}} diff --git a/docs/content/preview/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md b/docs/content/preview/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md index d8415d8f0c11..b688e4119d26 100644 --- a/docs/content/preview/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md +++ b/docs/content/preview/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md @@ -533,18 +533,19 @@ You can use several flags to fine-tune YugabyteDB's CDC behavior. These flags ar - [cdc_max_stream_intent_records](../../../../reference/configuration/yb-tserver/#cdc-max-stream-intent-records) - Controls how many intent records can be streamed in a single `GetChanges` call. Essentially, intents of large transactions are broken down into batches of size equal to this flag, hence this controls how many batches of `GetChanges` calls are needed to stream the entire large transaction. The default value of this flag is 1680, and transactions with intents less than this value are streamed in a single batch. The value of this flag can be increased, if the workload has larger transactions and CDC throughput needs to be increased. Note that high values of this flag can increase the latency of each `GetChanges` call. -## Retaining data for longer durations +## Retain data for longer durations -The following flags are responsible for retention of data required by CDC: -- `cdc_wal_retention_time_secs` (default value: 28800s) -- `cdc_intent_retention_ms` (default value: 28800000ms) +The following flags control the retention of data required by CDC: -Starting from 2024.2.1, the data retention configuration for Change Data Capture (CDC) has been updated. The default retention period is now set to 8 hours, with support for maximum retention up to 24 hours. Prior to 2024.2.1, the default retention for CDC is 4 hours. 
+- `cdc_wal_retention_time_secs` (default: 28800s) +- `cdc_intent_retention_ms` (default: 28800000ms) + +Starting from v2024.2.1, the default data retention for CDC is 8 hours, with support for maximum retention up to 24 hours. Prior to v2024.2.1, the default retention for CDC was 4 hours. {{< warning title="Important" >}} -When using before image modes ALL, FULL_ROW_NEW_IMAGE or MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES, CDC preserves previous row values for UPDATE and DELETE operations. This is accomplished by retaining history for each row in the database through a suspension of the compaction process. Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of streamed rows. +When using ALL, FULL_ROW_NEW_IMAGE, or MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES before image modes, CDC preserves previous row values for UPDATE and DELETE operations. This is done by retaining history for each row in the database through a suspension of the compaction process. Compaction is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of streamed rows. -The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period, defaulting to 8 hours. Users should be aware that any interruption in CDC consumption for extended periods with the above-mentioned before image modes may lead to potential read performance degradation.
This happens because compaction activities are halted in the database with these before image modes, leading to inefficient key lookups as reads must traverse multiple SST files, which degrades read performance. +The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period (default 8 hours). Be aware that any interruption in CDC consumption for extended periods using these before image modes may degrade read performance. This happens because compaction activities are halted in the database when these before image modes are used, leading to inefficient key lookups as reads must traverse multiple SST files. {{< /warning >}} ## Content-based routing From 90a4dec26bcd3004f904e1d293261c33b75a2a67 Mon Sep 17 00:00:00 2001 From: Dwight Hodge Date: Thu, 30 Jan 2025 14:20:19 -0500 Subject: [PATCH 5/5] apply suggestions --- .../advanced-configuration.md | 8 ++++---- .../cdc-get-started.md | 15 ++++++++------- 2 files changed, 12 insertions(+), 11 deletions(-) diff --git a/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-configuration.md b/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-configuration.md index 353fc4612bfe..c3fa20ccd748 100644 --- a/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-configuration.md +++ b/docs/content/stable/develop/change-data-capture/using-logical-replication/advanced-configuration.md @@ -28,14 +28,14 @@ CDC retains resources (such as WAL segments) that contain information related to Retaining resources has an impact on the system. Clients are expected to consume these transactions within configurable duration limits. Resources will be released if the duration exceeds these configured limits. 
-Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) & [cdc_wal_retention_time_secs](../../../../reference/configuration/yb-tserver/#cdc_wal_retention_time_secs) flag to control the duration for which resources are retained. +Use the [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) and [cdc_wal_retention_time_secs](../../../../reference/configuration/yb-tserver/#cdc-wal-retention-time-secs) flags to control the duration for which resources are retained. Resources are retained for each tablet of a table that is part of a database whose changes are being consumed using a replication slot. This includes those tables that may not be currently part of the publication specification. -Starting from 2024.2.1, the data retention configuration for Change Data Capture (CDC) has been updated. The default retention period is now set to 8 hours, with support for maximum retention up to 24 hours. Prior to 2024.2.1, the default retention for CDC is 4 hours. +Starting from v2024.2.1, the default data retention for CDC is 8 hours, with support for maximum retention up to 24 hours. Prior to v2024.2.1, the default retention for CDC was 4 hours. {{< warning title="Important" >}} -When using replica identity FULL or DEFAULT, CDC preserves previous row values for UPDATE and DELETE operations. This is accomplished by retaining history for each row in the database through a suspension of the compaction process. Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of history for streamed rows. +When using FULL or DEFAULT replica identities, CDC preserves previous row values for UPDATE and DELETE operations.
This is done by retaining history for each row in the database through a suspension of the compaction process. Compaction is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of history for streamed rows. -The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period, defaulting to 8 hours. Users should be aware that any interruption in CDC consumption for extended periods with the above-mentioned replica identities may lead to potential read performance degradation. This happens because compaction activities are halted in the database with these replica identities, leading to inefficient key lookups as reads must traverse multiple SST files, which degrades read performance. +The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period (default 8 hours). Be aware that any interruption in CDC consumption for extended periods using these replica identities may degrade read performance. This happens because compaction activities are halted in the database with these replica identities, leading to inefficient key lookups as reads must traverse multiple SST files. 
{{< /warning >}} diff --git a/docs/content/stable/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md b/docs/content/stable/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md index 1512dabdb5ca..2c74db6d48ba 100644 --- a/docs/content/stable/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md +++ b/docs/content/stable/develop/change-data-capture/using-yugabytedb-grpc-replication/cdc-get-started.md @@ -530,18 +530,19 @@ You can use several flags to fine-tune YugabyteDB's CDC behavior. These flags ar - [cdc_max_stream_intent_records](../../../../reference/configuration/yb-tserver/#cdc-max-stream-intent-records) - Controls how many intent records can be streamed in a single `GetChanges` call. Essentially, intents of large transactions are broken down into batches of size equal to this flag, hence this controls how many batches of `GetChanges` calls are needed to stream the entire large transaction. The default value of this flag is 1680, and transactions with intents less than this value are streamed in a single batch. The value of this flag can be increased, if the workload has larger transactions and CDC throughput needs to be increased. Note that high values of this flag can increase the latency of each `GetChanges` call. -## Retaining data for longer durations +## Retain data for longer durations -The following flags are responsible for retention of data required by CDC: -- `cdc_wal_retention_time_secs` (default value: 28800s) -- `cdc_intent_retention_ms` (default value: 28800000ms) +The following flags control the retention of data required by CDC: -Starting from 2024.2.1, the data retention configuration for Change Data Capture (CDC) has been updated. The default retention period is now set to 8 hours, with support for maximum retention up to 24 hours. Prior to 2024.2.1, the default retention for CDC is 4 hours. 
+- `cdc_wal_retention_time_secs` (default: 28800s) +- `cdc_intent_retention_ms` (default: 28800000ms) + +Starting from v2024.2.1, the default data retention for CDC is 8 hours, with support for maximum retention up to 24 hours. Prior to v2024.2.1, the default retention for CDC was 4 hours. {{< warning title="Important" >}} -When using before image modes ALL, FULL_ROW_NEW_IMAGE or MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES, CDC preserves previous row values for UPDATE and DELETE operations. This is accomplished by retaining history for each row in the database through a suspension of the compaction process. Compaction process is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of streamed rows. +When using ALL, FULL_ROW_NEW_IMAGE, or MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES before image modes, CDC preserves previous row values for UPDATE and DELETE operations. This is done by retaining history for each row in the database through a suspension of the compaction process. Compaction is halted by setting retention barriers to prevent cleanup of history for those rows that are yet to be streamed to the CDC client. These retention barriers are dynamically managed and advanced only after the CDC events are streamed and explicitly acknowledged by the client, thus allowing compaction of streamed rows. -The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period, defaulting to 8 hours. Users should be aware that any interruption in CDC consumption for extended periods with the above-mentioned before image modes may lead to potential read performance degradation.
This happens because compaction activities are halted in the database with these before image modes, leading to inefficient key lookups as reads must traverse multiple SST files, which degrades read performance. +The [cdc_intent_retention_ms](../../../../reference/configuration/yb-tserver/#cdc-intent-retention-ms) flag governs the maximum retention period (default 8 hours). Be aware that any interruption in CDC consumption for extended periods using these before image modes may degrade read performance. This happens because compaction activities are halted in the database when these before image modes are used, leading to inefficient key lookups as reads must traverse multiple SST files. {{< /warning >}} ## Content-based routing