From 20822a0a826a5028d467e3d170e34dc6b04c6307 Mon Sep 17 00:00:00 2001 From: atovpeko Date: Tue, 22 Oct 2024 12:25:47 +0300 Subject: [PATCH 01/13] draft --- .../data-tiering/enabling-data-tiering.md | 288 +++++++++++++++++- 1 file changed, 273 insertions(+), 15 deletions(-) diff --git a/use-timescale/data-tiering/enabling-data-tiering.md b/use-timescale/data-tiering/enabling-data-tiering.md index e58c29cedc..b3424760db 100644 --- a/use-timescale/data-tiering/enabling-data-tiering.md +++ b/use-timescale/data-tiering/enabling-data-tiering.md @@ -1,5 +1,5 @@ --- -title: Enabling the object storage tier +title: Manage tiering excerpt: How to enable the object storage tier products: [cloud] keywords: [tiered storage] @@ -9,27 +9,25 @@ cloud_ui: - [services, :serviceId, overview] --- -# Tier data to the object storage tier +# Manage tiering -Enable tiered storage to begin migrating rarely used data from Timescale's standard high-performance storage tier -to the object storage tier to save on storage costs. +You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage +to the object storage. With tiered storage enabled, you then either manually tier and untier data, or create tiering policies. -## Enabling the object storage tier +## Enable tiered storage -You can enable tiered storage from the Services Overview page in the Timescale -console. +You enable tiered storage from the `Overview` tab in Console. -### Enabling tiered storage +### Enable tiered storage -1. In the Timescale console, from the `Services` list, click the name of +1. In Timescale Console, from the `Services` list, click the name of the service you want to modify. 1. In the `Overview` tab, locate the `Tiered Storage` card, and click `Enable tiered storage`. Confirm the action. 1. Tiered storage can take a few seconds to turn on and once activated shows the amount of - data that has been tiered. 
Once enabled, data can be tiered by manually tiering - a chunk or by creating a tiering policy. + data that has been tiered. -After tiered storage is enabled you must either [manually tier data][manual-tier-chunk] or [setup a tiering policy][creating-data-tiering-policy] -to begin tiering data from your hypertables. +## Automate tiering with policies -[manual-tier-chunk]: /use-timescale/:currentVersion:/data-tiering/manual-tier-chunk/ -[creating-data-tiering-policy]: /use-timescale/:currentVersion:/data-tiering/creating-data-tiering-policy/ +To automate the archival of data not actively accessed, create a tiering policy that +automatically moves data to the object storage tier. Any chunks that only contain data +older than the `move_after` threshold are moved. This works similarly to a +[data retention policy](https://docs.timescale.com/use-timescale/latest/data-retention/), but chunks are moved rather than deleted. + +The tiering policy schedules a job that runs periodically to migrate +eligible chunks. The migration is asynchronous. +The chunks are tiered once they appear in the `timescaledb_osm.tiered_chunks` view. +Tiering does not influence your ability to query the chunks. + +To add a tiering policy, call the `add_tiering_policy` function: + +```sql +SELECT add_tiering_policy(hypertable REGCLASS, move_after INTERVAL, if_not_exists BOOL = false); +``` + +You can add a tiering policy to hypertables and continuous aggregates. In the following example, you tier chunks that are more than three days old in the `example` hypertable. + + + +### Add a tiering policy + +1. At the psql prompt, select the hypertable and duration: + +```sql +SELECT add_tiering_policy('example', INTERVAL '3 days'); +``` + + + +To remove an existing tiering policy, use the `remove_tiering_policy` function: + +```sql +SELECT remove_tiering_policy(hypertable REGCLASS, if_exists BOOL = false); +``` + + + +### Remove a tiering policy + +1. 
At the psql prompt, select the hypertable to remove the policy from: + +```sql +SELECT remove_tiering_policy('example'); +``` + + + +If you remove a tiering policy, the removal automatically prevents scheduled chunks from being tiered in the future. +Any chunks that were already tiered won't be untiered automatically. You can use the [untier_chunk][untier-data] procedure +to untier chunks to local storage that have already been tiered. + +The procedure for adding and removing tiering policy for a continuous aggregate is identical to a hypertable. The following example uses a continuous aggregate called `example_day_avg`. + + + +### Add a tiering policy for a continuous aggregate + +At the psql prompt, specify the continuous aggregate name and the interval after which chunks are moved to tiered storage: + +```sql +SELECT add_tiering_policy('example_day_avg', move_after => '1 month'::interval) +``` + + + + + +### Remove a tiering policy from a continuous aggregate + +At the psql prompt, specify the continuous aggregate to remove the policy from: + +```sql +SELECT remove_tiering_policy('example_day_avg'); +``` + + + + +## Manually tier and untier chunks + +Once tiered storage has been enabled on a service, individual chunks from a hypertable may be tiered to the object storage tier. + +Before you start, you need a list of chunks to tier. In this example, you use a hypertable called example, and tier chunks older than three days. +Data on the object storage tier cannot be modified - so inserts, updates, and deletes will not work on tiered data. So make sure that +you are not tiering data that is being actively modified to the object storage tier + + + +### Select chunks to tier + +1. At the psql prompt, select all chunks in the table `example` that are older + than three days: + + ```sql + SELECT show_chunks('example', older_than => INTERVAL '3 days'); + ``` + +1. This returns a list of chunks. 
Take a note of the chunk names: + + ```sql + |1|_timescaledb_internal_hyper_1_2_chunk| + |2|_timescaledb_internal_hyper_1_3_chunk| + ``` + + + +When you are happy with the list of chunks, you can use the `tier_chunk` function to manually tier each one. + + + +### Tier chunks manually + +1. At the psql prompt, tier the chunk: + + ```sql + SELECT tier_chunk( '_timescaledb_internal_hyper_1_2_chunk'); + ``` + + Tiering a chunk is an asynchronous process that schedules the chunk to be tiered. + +1. Repeat for all chunks you want to tier. + + + + + + +### List tiered chunks + + +Tiering a chunk schedules the chunk for migration to the object storage tier but, won't be tiered immediately. +It may take some time tiering to complete. You can continue to query a chunk during migration. + + +To see which chunks are tiered into the object storage tier, use the `tiered_chunks` +informational view: + +```sql +SELECT * FROM timescaledb_osm.tiered_chunks; +``` + + + + +### Find chunks that are scheduled to be tiered + +Chunks are tiered asynchronously. Chunks are tiered one at a time in order to minimize db resource +consumption during the tiering process. You can see chunks scheduled for tiering (either by the policy or +by a manual call to `tier_chunk`) but have not yet been moved to the object storage tier using this view. + +```sql +SELECT * FROM timescaledb_osm.chunks_queued_for_tiering ; +``` + +If you need to untier your data, see the +[manually untier data][untier-data] section. + + + + + +#### Untier data in a chunk + +Tiered data is stored on our object storage tier. Tiered data is immutable, and cannot +be changed. To update data in a tiered chunk, you need to move it back to local storage (Timescale's standard high-performance storage tier). +This is called untiering the data. You can untier data in a chunk using the `untier_chunk` stored procedure. + +Untiering chunks is a synchronous process that occurs when the `untier_chunk` +procedure is called. 
When you untier a chunk, the data is moved from the object storage tier +to local storage. Chunks are renamed when the data is untiered. + +1. At the `psql` prompt, check which chunks are currently tiered: + + ```sql + SELECT * FROM timescaledb_osm.tiered_chunks ; + ``` + + The output looks something like this: + + ```sql + hypertable_schema | hypertable_name | chunk_name | range_start | range_end + -------------------+-----------------+------------------+------------------------+------------------------ + public | sample | _hyper_1_1_chunk | 2023-02-16 00:00:00+00 | 2023-02-23 00:00:00+00 + (1 row) + ``` + +1. Run `untier_chunk`: + + ```sql + CALL untier_chunk('_hyper_1_1_chunk'); + ``` + +1. You can see the details of the chunk with the + `timescaledb_information.chunks` function. The chunk might have changed name + when it was untiered: + + ```sql + SELECT * FROM timescaledb_information.chunks; + ``` + + The output looks something like this: + + ```sql + -[ RECORD 1 ]----------+------------------------- + hypertable_schema | public + hypertable_name | sample + chunk_schema | _timescaledb_internal + chunk_name | _hyper_1_4_chunk + primary_dimension | ts + primary_dimension_type | timestamp with time zone + range_start | 2023-02-16 00:00:00+00 + range_end | 2020-03-23 00:00:00+00 + range_start_integer | + range_end_integer | + is_compressed | f + chunk_tablespace | + data_nodes | + ``` + + + + + +## Disable tiering on a hypertable + +If you no longer want to use tiered storage for a particular hypertable, you +can drop the associated metadata by calling the `disable_tiering` function. + + + +### Disable tiering + +1. Call [remove_tiering_policy][tiering-policy] and drop any tiering policy associated with this hypertable. + +1. Make sure that there is no tiered data associated with this hypertable: + + 1. List the tiered chunks associated with this hypertable: + ```sql + select * from timescaledb_osm.tiered_chunks + ``` + + 1. 
If you have any tiered chunks, either untier this data, or drop these chunks from tiered storage. + + You can use the [untier_chunk][untier-data] procedure to untier chunks that have already been tiered to local storage. + +1. Use `disable_tiering` to drop all tiering related metadata for the hypertable: + + ```sql + select disable_tiering('my_hypertable_name'); + ``` + +1. Verify that tiering has been disabled by listing the hypertables that have tiering enabled. + ```sql + select * from timescaledb_osm.tiered_hypertables; + ``` + + + +And that is it, you have disabled tiering on a hypertable. + +[untier-data]: /use-timescale/:currentVersion:/data-tiering/untier-data/ +[tiering-policy]: /use-timescale/:currentVersion:/data-tiering/creating-data-tiering-policy/ \ No newline at end of file From e44aadc1d57bb1f7f727e114f47428b66d5483d7 Mon Sep 17 00:00:00 2001 From: atovpeko Date: Tue, 22 Oct 2024 13:49:53 +0300 Subject: [PATCH 02/13] draft --- use-timescale/page-index/page-index.js | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/use-timescale/page-index/page-index.js b/use-timescale/page-index/page-index.js index 2952576426..0a60ce4c1e 100644 --- a/use-timescale/page-index/page-index.js +++ b/use-timescale/page-index/page-index.js @@ -486,7 +486,7 @@ module.exports = [ "A quick tour of tiered storage", }, { - title: "Enabling the object storage tier", + title: "Manage tiering", href: "enabling-data-tiering", excerpt: "How to enable the object storage tier", From 09f98630e81136ba1619a50d93dd7a282d70d6cb Mon Sep 17 00:00:00 2001 From: atovpeko Date: Wed, 23 Oct 2024 11:35:36 +0300 Subject: [PATCH 03/13] draft --- .../data-tiering/enabling-data-tiering.md | 173 +++++------------- use-timescale/page-index/page-index.js | 23 --- 2 files changed, 42 insertions(+), 154 deletions(-) diff --git a/use-timescale/data-tiering/enabling-data-tiering.md b/use-timescale/data-tiering/enabling-data-tiering.md index b3424760db..452e9dca3d 100644 --- 
a/use-timescale/data-tiering/enabling-data-tiering.md +++ b/use-timescale/data-tiering/enabling-data-tiering.md @@ -1,6 +1,6 @@ --- title: Manage tiering -excerpt: How to enable the object storage tier +excerpt: How to enable and use object storage tiering products: [cloud] keywords: [tiered storage] tags: [storage, data management] @@ -11,8 +11,9 @@ cloud_ui: # Manage tiering -You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage -to the object storage. With tiered storage enabled, you then either manually tier and untier data, or create tiering policies. +You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to the object storage. With tiered storage enabled, you then either create automated tiering policies or manually tier and untier data. + +Data on the object storage tier cannot be modified - so inserts, updates, and deletes will not work on tiered data. Make sure that you are not tiering data that is being actively modified to the object storage tier. ## Enable tiered storage @@ -20,13 +21,11 @@ You enable tiered storage from the `Overview` tab in Console. -### Enable tiered storage - 1. In Timescale Console, from the `Services` list, click the name of the service you want to modify. -1. In the `Overview` tab, locate the `Tiered Storage` card, and click +1. In the `Overview` tab, locate the `Tiered Storage` card and click `Enable tiered storage`. Confirm the action. -1. Tiered storage can take a few seconds to turn on and once activated shows the amount of +1. Tiered storage can take a few seconds to turn on and, once activated, shows the amount of data that has been tiered. - ## Automate tiering with policies -To automate the archival of data not actively accessed, create a tiering policy that -automatically moves data to the object storage tier. 
Any chunks that only contain data +A tiering policy automatically moves data to the object storage tier. Any chunks that only contain data older than the `move_after` threshold are moved. This works similarly to a -[data retention policy](https://docs.timescale.com/use-timescale/latest/data-retention/), but chunks are moved rather than deleted. +[data retention policy][data-retention], but chunks are moved rather than deleted. You can add tiering policies to hypertables, including continuous aggregates. -The tiering policy schedules a job that runs periodically to migrate -eligible chunks. The migration is asynchronous. -The chunks are tiered once they appear in the `timescaledb_osm.tiered_chunks` view. -Tiering does not influence your ability to query the chunks. +A tiering policy schedules a job that runs periodically to migrate eligible chunks. The migration is asynchronous. The chunks are considered tiered once they appear in the `timescaledb_osm.tiered_chunks` view. Tiering does not influence your ability to query the chunks. + +### Add a tiering policy -To add a tiering policy, call the `add_tiering_policy` function: +To add a tiering policy, call `add_tiering_policy`: ```sql SELECT add_tiering_policy(hypertable REGCLASS, move_after INTERVAL, if_not_exists BOOL = false); ``` -You can add a tiering policy to hypertables and continuous aggregates. In the following example, you tier chunks that are more than three days old in the `example` hypertable. - - - -### Add a tiering policy - -1. 
At the psql prompt, select the hypertable and duration: +For example, tier chunks that are more than three days old in the `example` hypertable in the following way: ```sql SELECT add_tiering_policy('example', INTERVAL '3 days'); ``` - +### Remove a tiering policy -To remove an existing tiering policy, use the `remove_tiering_policy` function: +To remove an existing tiering policy, call `remove_tiering_policy`: ```sql SELECT remove_tiering_policy(hypertable REGCLASS, if_exists BOOL = false); ``` - - -### Remove a tiering policy - -1. At the psql prompt, select the hypertable to remove the policy from: +For example, remove the tiering policy from the `example` hypertable in the following way: ```sql SELECT remove_tiering_policy('example'); ``` - - -If you remove a tiering policy, the removal automatically prevents scheduled chunks from being tiered in the future. -Any chunks that were already tiered won't be untiered automatically. You can use the [untier_chunk][untier-data] procedure -to untier chunks to local storage that have already been tiered. - -The procedure for adding and removing tiering policy for a continuous aggregate is identical to a hypertable. The following example uses a continuous aggregate called `example_day_avg`. - - - -### Add a tiering policy for a continuous aggregate - -At the psql prompt, specify the continuous aggregate name and the interval after which chunks are moved to tiered storage: - -```sql -SELECT add_tiering_policy('example_day_avg', move_after => '1 month'::interval) -``` - - - - - -### Remove a tiering policy from a continuous aggregate - -At the psql prompt, specify the continuous aggregate to remove the policy from: - -```sql -SELECT remove_tiering_policy('example_day_avg'); -``` - - - +If you remove a tiering policy, new scheduled chunks will not be tiered. However, already tiered chunks won't be untiered. You can [untier chunks manually](#manually-tier-and-untier-chunks) to the local storage. 
## Manually tier and untier chunks -Once tiered storage has been enabled on a service, individual chunks from a hypertable may be tiered to the object storage tier. +If tiering policies do not meet your current needs, you can tier and untier chunks manually. -Before you start, you need a list of chunks to tier. In this example, you use a hypertable called example, and tier chunks older than three days. -Data on the object storage tier cannot be modified - so inserts, updates, and deletes will not work on tiered data. So make sure that -you are not tiering data that is being actively modified to the object storage tier +### Tier chunks - +Tiering a chunk is an asynchronous process that schedules the chunk to be tiered. In this example, you use a hypertable called `example` and tier chunks older than three days. You then proceed to list tiered chunks. -### Select chunks to tier + 1. At the psql prompt, select all chunks in the table `example` that are older than three days: @@ -137,79 +90,44 @@ you are not tiering data that is being actively modified to the object st SELECT show_chunks('example', older_than => INTERVAL '3 days'); ``` -1. This returns a list of chunks. Take a note of the chunk names: + This returns a list of chunks. Take a note of the chunk names: ```sql |1|_timescaledb_internal_hyper_1_2_chunk| |2|_timescaledb_internal_hyper_1_3_chunk| ``` - - -When you are happy with the list of chunks, you can use the `tier_chunk` function to manually tier each one. - - - -### Tier chunks manually - -1. At the psql prompt, tier the chunk: +1. Call the `tier_chunk` function to manually tier each chunk: ```sql SELECT tier_chunk( '_timescaledb_internal_hyper_1_2_chunk'); ``` - Tiering a chunk is an asynchronous process that schedules the chunk to be tiered. - 1. Repeat for all chunks you want to tier. - - - - - -### List tiered chunks +1. 
To see which chunks are tiered into the object storage tier, use the `tiered_chunks` informational view: - -Tiering a chunk schedules the chunk for migration to the object storage tier but, won't be tiered immediately. -It may take some time tiering to complete. You can continue to query a chunk during migration. - - -To see which chunks are tiered into the object storage tier, use the `tiered_chunks` -informational view: + ```sql + SELECT * FROM timescaledb_osm.tiered_chunks; + ``` -```sql -SELECT * FROM timescaledb_osm.tiered_chunks; -``` - - -### Find chunks that are scheduled to be tiered +Tiering a chunk schedules it for migration to the object storage tier, but the migration won't happen immediately. Chunks are tiered one at a time in order to minimize database resource consumption. You can continue to query a chunk during migration. -Chunks are tiered asynchronously. Chunks are tiered one at a time in order to minimize db resource -consumption during the tiering process. You can see chunks scheduled for tiering (either by the policy or -by a manual call to `tier_chunk`) but have not yet been moved to the object storage tier using this view. +To see which chunks are scheduled for tiering either by policy or by a manual call, but have not yet been tiered, use this view: ```sql SELECT * FROM timescaledb_osm.chunks_queued_for_tiering ; ``` -If you need to untier your data, see the -[manually untier data][untier-data] section. +### Untier chunks - +Tiered data is immutable. To update data in a tiered chunk, move it back to local storage, that is, Timescale's standard high-performance storage tier. You can do so by using the `untier_chunk` stored procedure. - +Untiering chunks is a synchronous process. Chunks are renamed when the data is untiered. -#### Untier data in a chunk - -Tiered data is stored on our object storage tier. Tiered data is immutable, and cannot -be changed. 
To update data in a tiered chunk, you need to move it back to local storage (Timescale's standard high-performance storage tier). -This is called untiering the data. You can untier data in a chunk using the `untier_chunk` stored procedure. - -Untiering chunks is a synchronous process that occurs when the `untier_chunk` -procedure is called. When you untier a chunk, the data is moved from the object storage tier -to local storage. Chunks are renamed when the data is untiered. + 1. At the `psql` prompt, check which chunks are currently tiered: @@ -232,9 +150,8 @@ to local storage. Chunks are renamed when the data is untiered. CALL untier_chunk('_hyper_1_1_chunk'); ``` -1. You can see the details of the chunk with the - `timescaledb_information.chunks` function. The chunk might have changed name - when it was untiered: +1. See the details of the chunk with the + `timescaledb_information.chunks` function: ```sql SELECT * FROM timescaledb_information.chunks; @@ -262,43 +179,37 @@ to local storage. Chunks are renamed when the data is untiered. - -## Disable tiering on a hypertable +## Disable tiering If you no longer want to use tiered storage for a particular hypertable, you can drop the associated metadata by calling the `disable_tiering` function. -### Disable tiering - -1. Call [remove_tiering_policy][tiering-policy] and drop any tiering policy associated with this hypertable. +1. Call `remove_tiering_policy` and drop any tiering policy associated with this hypertable. 1. Make sure that there is no tiered data associated with this hypertable: 1. List the tiered chunks associated with this hypertable: + ```sql select * from timescaledb_osm.tiered_chunks ``` 1. If you have any tiered chunks, either untier this data, or drop these chunks from tiered storage. - You can use the [untier_chunk][untier-data] procedure to untier chunks that have already been tiered to local storage. - -1. Use `disable_tiering` to drop all tiering related metadata for the hypertable: +1. 
Use `disable_tiering` to drop all tiering-related metadata for the hypertable: ```sql select disable_tiering('my_hypertable_name'); ``` -1. Verify that tiering has been disabled by listing the hypertables that have tiering enabled. +1. Verify that tiering has been disabled by listing the hypertables that have tiering enabled: + ```sql select * from timescaledb_osm.tiered_hypertables; ``` -And that is it, you have disabled tiering on a hypertable. - -[untier-data]: /use-timescale/:currentVersion:/data-tiering/untier-data/ -[tiering-policy]: /use-timescale/:currentVersion:/data-tiering/creating-data-tiering-policy/ \ No newline at end of file +[data-retention]: /use-timescale/:currentVersion:/data-retention/ \ No newline at end of file diff --git a/use-timescale/page-index/page-index.js b/use-timescale/page-index/page-index.js index 0a60ce4c1e..5385bc0067 100644 --- a/use-timescale/page-index/page-index.js +++ b/use-timescale/page-index/page-index.js @@ -491,41 +491,18 @@ module.exports = [ excerpt: "How to enable the object storage tier", }, - { - title: "Manually tier data", - href: "manual-tier-chunk", - excerpt: - "How to manually tier data to the object storage tier", - }, - { - title: "Creating tiering policies", - href: "creating-data-tiering-policy", - excerpt: - "How to create a tiering policy", - }, { title: "Querying tiered data", href: "querying-tiered-data", excerpt: "How to query tiered data", }, - { - title: "Manually untier data", - href: "untier-data", - excerpt: "How to manualy untier data from the object storage tier", - }, { title: "Replicas and forks with tiered data", href: "tiered-data-replicas-forks", excerpt: "How tiered data works on replicas and forks", }, - { - title: "Disabling tiering for a hypertable", - href: "disabling-data-tiering", - excerpt: - "How to disable tiering for a hypertable", - }, { title: "Troubleshooting", href: "troubleshooting", From 6ad34adbe6177827d8867895956ed49b5058a040 Mon Sep 17 00:00:00 2001 From: atovpeko 
Date: Wed, 30 Oct 2024 12:15:25 +0200 Subject: [PATCH 04/13] review comments --- .../data-tiering/enabling-data-tiering.md | 37 +++++++++++-------- 1 file changed, 22 insertions(+), 15 deletions(-) diff --git a/use-timescale/data-tiering/enabling-data-tiering.md b/use-timescale/data-tiering/enabling-data-tiering.md index 452e9dca3d..e952bebfb5 100644 --- a/use-timescale/data-tiering/enabling-data-tiering.md +++ b/use-timescale/data-tiering/enabling-data-tiering.md @@ -7,12 +7,19 @@ tags: [storage, data management] cloud_ui: path: - [services, :serviceId, overview] +plans: [scale, enterprise] --- # Manage tiering You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to the object storage. With tiered storage enabled, you then either create automated tiering policies or manually tier and untier data. + + +Data tiering is available in [Scale and Enterprise](/about/latest/pricing-and-account-management/) pricing plans only. + + + Data on the object storage tier cannot be modified - so inserts, updates, and deletes will not work on tiered data. Make sure that you are not tiering data that is being actively modified to the object storage tier. ## Enable tiered storage @@ -21,31 +28,29 @@ You enable tiered storage from the `Overview` tab in Console. -1. In Timescale Console, from the `Services` list, click the name of - the service you want to modify. -1. In the `Overview` tab, locate the `Tiered Storage` card and click - `Enable tiered storage`. Confirm the action. -1. Tiered storage can take a few seconds to turn on and, once activated, shows the amount of - data that has been tiered. +1. In [Timescale Console][console], select the service to modify. + + You see the `Overview` section. + +1. Scroll down, then click `Enable tiered storage`. 
+ + ![Enable tiered storage](https://assets.timescale.com/docs/images/console-enable-tiered-storage.png) - The Timescale Console showing tiered storage enabled + When tiered storage is activated, you see the amount of data in the tiered object storage. ## Automate tiering with policies -A tiering policy automatically moves data to the object storage tier. Any chunks that only contain data -older than the `move_after` threshold are moved. This works similarly to a +A tiering policy automatically moves any data chunks that only contain data +older than the `move_after` threshold to the object storage tier. This works similarly to a [data retention policy][data-retention], but chunks are moved rather than deleted. You can add tiering policies to hypertables, including continuous aggregates. A tiering policy schedules a job that runs periodically to migrate eligible chunks. The migration is asynchronous. The chunks are considered tiered once they appear in the `timescaledb_osm.tiered_chunks` view. Tiering does not influence your ability to query the chunks. ### Add a tiering policy -To add a tiering policy, call `add_tiering_policy`: +To add a tiering policy, connect to your $SERVICE_SHORT and call `add_tiering_policy`: ```sql SELECT add_tiering_policy(hypertable REGCLASS, move_after INTERVAL, if_not_exists BOOL = false); @@ -79,7 +84,7 @@ If tiering policies do not meet your current needs, you can tier and untier chun ### Tier chunks -Tiering a chunk is an asynchronous process that schedules the chunk to be tiered. In this example, you use a hypertable called `example` and tier chunks older than three days. You then proceed to list tiered chunks. +Tiering a chunk is an asynchronous process that schedules the chunk to be tiered. In the following example, you tier chunks older than three days in the example [hypertable][hypertable]. You then list the tiered chunks. @@ -212,4 +217,6 @@ can drop the associated metadata by calling the `disable_tiering` function. 
-[data-retention]: /use-timescale/:currentVersion:/data-retention/ \ No newline at end of file +[data-retention]: /use-timescale/:currentVersion:/data-retention/ +[console]: https://console.cloud.timescale.com/ +[hypertable]: /use-timescale/:currentVersion:/hypertables/ \ No newline at end of file From c572d39a6db4fd7ed7359da6f2b83f1ef45727fb Mon Sep 17 00:00:00 2001 From: atovpeko <114177030+atovpeko@users.noreply.github.com> Date: Wed, 30 Oct 2024 12:41:31 +0200 Subject: [PATCH 05/13] Update use-timescale/data-tiering/enabling-data-tiering.md Co-authored-by: Iain Cox Signed-off-by: atovpeko <114177030+atovpeko@users.noreply.github.com> --- use-timescale/data-tiering/enabling-data-tiering.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/use-timescale/data-tiering/enabling-data-tiering.md b/use-timescale/data-tiering/enabling-data-tiering.md index 452e9dca3d..7eefdae8e0 100644 --- a/use-timescale/data-tiering/enabling-data-tiering.md +++ b/use-timescale/data-tiering/enabling-data-tiering.md @@ -11,7 +11,7 @@ cloud_ui: # Manage tiering -You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to the object storage. With tiered storage enabled, you then either create automated tiering policies or manually tier and untier data. +You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to the object storage. After you enable tiered storage, you then either create automated tiering policies or manually tier and untier data. Data on the object storage tier cannot be modified - so inserts, updates, and deletes will not work on tiered data. Make sure that you are not tiering data that is being actively modified to the object storage tier. 
From be82728430b222fdcf24fc4ebe873e5520f8157d Mon Sep 17 00:00:00 2001 From: atovpeko <114177030+atovpeko@users.noreply.github.com> Date: Wed, 30 Oct 2024 12:41:45 +0200 Subject: [PATCH 06/13] Update use-timescale/data-tiering/enabling-data-tiering.md Co-authored-by: Iain Cox Signed-off-by: atovpeko <114177030+atovpeko@users.noreply.github.com> --- use-timescale/data-tiering/enabling-data-tiering.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/use-timescale/data-tiering/enabling-data-tiering.md b/use-timescale/data-tiering/enabling-data-tiering.md index 7eefdae8e0..3695521c5f 100644 --- a/use-timescale/data-tiering/enabling-data-tiering.md +++ b/use-timescale/data-tiering/enabling-data-tiering.md @@ -41,7 +41,7 @@ A tiering policy automatically moves data to the object storage tier. Any chunks older than the `move_after` threshold are moved. This works similarly to a [data retention policy][data-retention], but chunks are moved rather than deleted. You can add tiering policies to hypertables, including continuous aggregates. -A tiering policy schedules a job that runs periodically to migrate eligible chunks. The migration is asynchronous. The chunks are considered tiered once they appear in the `timescaledb_osm.tiered_chunks` view. Tiering does not influence your ability to query the chunks. +A tiering policy schedules a job that runs periodically to asynchronously migrate eligible chunks to object storage. Chunks are considered tiered once they appear in the `timescaledb_osm.tiered_chunks` view. Tiering does not influence your ability to query the chunks. 
### Add a tiering policy From c34591adad850b313ff87afb6758424e57eb1013 Mon Sep 17 00:00:00 2001 From: atovpeko <114177030+atovpeko@users.noreply.github.com> Date: Wed, 30 Oct 2024 12:42:57 +0200 Subject: [PATCH 07/13] Update use-timescale/data-tiering/enabling-data-tiering.md Co-authored-by: Iain Cox Signed-off-by: atovpeko <114177030+atovpeko@users.noreply.github.com> --- use-timescale/data-tiering/enabling-data-tiering.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/use-timescale/data-tiering/enabling-data-tiering.md b/use-timescale/data-tiering/enabling-data-tiering.md index 3695521c5f..103b5dcd9f 100644 --- a/use-timescale/data-tiering/enabling-data-tiering.md +++ b/use-timescale/data-tiering/enabling-data-tiering.md @@ -71,7 +71,7 @@ For example, remove the tiering policy from the `example` hypertable in the foll SELECT remove_tiering_policy('example'); ``` -If you remove a tiering policy, new scheduled chunks will not be tiered. However, already tiered chunks won't be untiered. You can [untier chunks manually](#manually-tier-and-untier-chunks) to the local storage. +If you remove a tiering policy, new scheduled chunks are not be tiered. However, chunks in tiered storage are not untiered. You [untier chunks manually](#manually-tier-and-untier-chunks) to local storage. 
## Manually tier and untier chunks From ca3f4dfdeba109c8678a8f89aeb68a8fd82b590e Mon Sep 17 00:00:00 2001 From: atovpeko Date: Wed, 30 Oct 2024 13:52:11 +0200 Subject: [PATCH 08/13] review comment --- .../data-tiering/enabling-data-tiering.md | 87 +++++++++---------- 1 file changed, 43 insertions(+), 44 deletions(-) diff --git a/use-timescale/data-tiering/enabling-data-tiering.md b/use-timescale/data-tiering/enabling-data-tiering.md index 571163c0fe..63ac5b5cde 100644 --- a/use-timescale/data-tiering/enabling-data-tiering.md +++ b/use-timescale/data-tiering/enabling-data-tiering.md @@ -12,15 +12,9 @@ plans: [scale, enterprise] # Manage tiering -You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to the object storage. After you enable tiered storage, you then either create automated tiering policies or manually tier and untier data. +You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to the object storage. After you [enable tiered storage](#enable-tiered-storage), you then either [create automated tiering policies](#automate-tiering-with-policies) or [manually tier and untier data](#manually-tier-and-untier-chunks). - - -Data tiering is available in [Scale and Enterprise](/about/latest/pricing-and-account-management/) pricing plans only. - - - -Data on the object storage tier cannot be modified - so inserts, updates, and deletes will not work on tiered data. Make sure that you are not tiering data that is being actively modified to the object storage tier. +You can query the data on the object storage tier, but you cannot modify it. Make sure that you are not tiering data that needs to be **actively modified**. ## Enable tiered storage @@ -28,35 +22,41 @@ You enable tiered storage from the `Overview` tab in Console. -1. 
In [Timescale Console][console], select the service to modify. +1. **In [Timescale Console][console], select the service to modify**. You see the `Overview` section. -1. Scroll down, then click `Enable tiered storage`. +1. **Scroll down, then click `Enable tiered storage`**. ![Enable tiered storage](https://assets.timescale.com/docs/images/console-enable-tiered-storage.png) - When tiered storage is activated, you see the amount of data in the tiered object storage. + When tiered storage is enabled, you see the amount of data in the tiered object storage. + + + Data tiering is available in [Scale and Enterprise](/about/latest/pricing-and-account-management/) pricing plans only. + ## Automate tiering with policies -A tiering policy automatically moves any data chunks that only contain data +A tiering policy automatically moves any chunks that only contain data older than the `move_after` threshold to the object storage tier. This works similarly to a -[data retention policy][data-retention], but chunks are moved rather than deleted. You can add tiering policies to hypertables, including continuous aggregates. +[data retention policy][data-retention], but chunks are moved rather than deleted. -A tiering policy schedules a job that runs periodically to asynchronously migrate eligible chunks to object storage. Chunks are considered tiered once they appear in the `timescaledb_osm.tiered_chunks` view. Tiering does not influence your ability to query the chunks. +A tiering policy schedules a job that runs periodically to asynchronously migrate eligible chunks to object storage. Chunks are considered tiered once they appear in the `timescaledb_osm.tiered_chunks` view. + +You can add tiering policies to [hypertables][hypertable], including [continuous aggregates][caggs]. To manage tiering policies, [connect to your service][connect-to-service] and run the queries below in the data mode, the SQL editor, or using `psql`. 
### Add a tiering policy -To add a tiering policy, connect to your $SERVICE_SHORT and call `add_tiering_policy`: +To add a tiering policy, call `add_tiering_policy`: ```sql SELECT add_tiering_policy(hypertable REGCLASS, move_after INTERVAL, if_not_exists BOOL = false); ``` -For example, tier chunks that are more than three days old in the `example` hypertable in the following way: +For example, to tier chunks that are more than three days old in the `example` [hypertable][hypertable]: ```sql SELECT add_tiering_policy('example', INTERVAL '3 days'); @@ -70,47 +70,48 @@ To remove an existing tiering policy, call `remove_tiering_policy`: SELECT remove_tiering_policy(hypertable REGCLASS, if_exists BOOL = false); ``` -For example, remove the tiering policy from the `example` hypertable in the following way: +For example, to remove the tiering policy from the `example` hypertable: ```sql SELECT remove_tiering_policy('example'); ``` -If you remove a tiering policy, new scheduled chunks are not be tiered. However, chunks in tiered storage are not untiered. You [untier chunks manually](#manually-tier-and-untier-chunks) to local storage. +If you remove a tiering policy, the remaining scheduled chunks are not tiered. However, chunks in tiered storage are not untiered. You [untier chunks manually](#manually-tier-and-untier-chunks) to local storage. ## Manually tier and untier chunks -If tiering policies do not meet your current needs, you can tier and untier chunks manually. +If tiering policies do not meet your current needs, you can tier and untier chunks manually. To do so, [connect to your service][connect-to-service] and run the queries below in the data mode, the SQL editor, or using `psql`. ### Tier chunks -Tiering a chunk is an asynchronous process that schedules the chunk to be tiered. In the following example, you tier chunks older than three days in the example [hypertable][hypertable]. You then list the tiered chunks. 
+Tiering a chunk is an asynchronous process that schedules the chunk to be tiered. In the following example, you tier chunks older than three days in the `example` hypertable. You then list the tiered chunks. -1. At the psql prompt, select all chunks in the table `example` that are older - than three days: +1. **Select all chunks in `example` that are older than three days:** ```sql SELECT show_chunks('example', older_than => INTERVAL '3 days'); ``` - This returns a list of chunks. Take a note of the chunk names: + This returns a list of chunks. Take a note of the chunk names: ```sql |1|_timescaledb_internal_hyper_1_2_chunk| |2|_timescaledb_internal_hyper_1_3_chunk| ``` -1. Call the `tier_chunk` function to manually tier each chunk: +1. **Call `tier_chunk` to manually tier each chunk:** ```sql SELECT tier_chunk( '_timescaledb_internal_hyper_1_2_chunk'); ``` -1. Repeat for all chunks you want to tier. +1. **Repeat for all chunks you want to tier.** + + Tiering a chunk schedules it for migration to the object storage tier, but the migration won't happen immediately. Chunks are tiered one at a time in order to minimize database resource consumption. You can continue to query a chunk during migration. -1. To see which chunks are tiered into the object storage tier, use the `tiered_chunks` informational view: +1. **To see which chunks are tiered into the object storage tier, use the `tiered_chunks` informational view:** ```sql SELECT * FROM timescaledb_osm.tiered_chunks; @@ -118,8 +119,6 @@ Tiering a chunk is an asynchronous process that schedules the chunk to be tiered -Tiering a chunk schedules it for migration to the object storage tier, but the migration won't happen immediately. Chunks are tiered one at a time in order to minimize database resource consumption. You can continue to query a chunk during migration. 
- To see which chunks are scheduled for tiering either by policy or by a manual call, but have not yet been tiered, use this view: ```sql @@ -128,19 +127,19 @@ SELECT * FROM timescaledb_osm.chunks_queued_for_tiering ; ### Untier chunks -Tiered data is immutable. To update data in a tiered chunk, move it back to local storage, that is, Timescale's standard high-performance storage tier. You can do so by using the `untier_chunk` stored procedure. +To update data in a tiered chunk, move it back to the standard high-performance storage tier in $CLOUD_LONG. Untiering chunks is a synchronous process. Chunks are renamed when the data is untiered. -Untiering chunks is a synchronous process. Chunks are renamed when the data is untiered. +To untier a chunk, call the `untier_chunk` stored procedure. -1. At the `psql` prompt, check which chunks are currently tiered: +1. **Check which chunks are currently tiered:** ```sql SELECT * FROM timescaledb_osm.tiered_chunks ; ``` - The output looks something like this: + Sample output: ```sql hypertable_schema | hypertable_name | chunk_name | range_start | range_end @@ -149,20 +148,19 @@ Untiering chunks is a synchronous process. Chunks are renamed when the data is u (1 row) ``` -1. Run `untier_chunk`: +1. **Call `untier_chunk`**: ```sql CALL untier_chunk('_hyper_1_1_chunk'); ``` -1. See the details of the chunk with the - `timescaledb_information.chunks` function: +1. **See the details of the chunk with `timescaledb_information.chunks`**: ```sql SELECT * FROM timescaledb_information.chunks; ``` - The output looks something like this: + Sample output: ```sql -[ RECORD 1 ]----------+------------------------- @@ -186,14 +184,13 @@ Untiering chunks is a synchronous process. Chunks are renamed when the data is u ## Disable tiering -If you no longer want to use tiered storage for a particular hypertable, you -can drop the associated metadata by calling the `disable_tiering` function. 
+If you no longer want to use tiered storage for a particular hypertable, drop the associated metadata by calling `disable_tiering`. -1. Call `remove_tiering_policy` and drop any tiering policy associated with this hypertable. +1. **To drop all tiering policies associated with a table, call `remove_tiering_policy`**. -1. Make sure that there is no tiered data associated with this hypertable: +1. **Make sure that there is no tiered data associated with this hypertable**: 1. List the tiered chunks associated with this hypertable: @@ -203,13 +200,13 @@ can drop the associated metadata by calling the `disable_tiering` function. 1. If you have any tiered chunks, either untier this data, or drop these chunks from tiered storage. -1. Use `disable_tiering` to drop all tiering-related metadata for the hypertable: +1. **Use `disable_tiering` to drop all tiering-related metadata for the hypertable**: ```sql select disable_tiering('my_hypertable_name'); ``` -1. Verify that tiering has been disabled by listing the hypertables that have tiering enabled: +1. **Verify that tiering has been disabled by listing the hypertables that have tiering enabled**: ```sql select * from timescaledb_osm.tiered_hypertables; @@ -218,5 +215,7 @@ can drop the associated metadata by calling the `disable_tiering` function. 
[data-retention]: /use-timescale/:currentVersion:/data-retention/ -[console]: https://console.cloud.timescale.com/ -[hypertable]: /use-timescale/:currentVersion:/hypertables/ \ No newline at end of file +[console]: https://console.cloud.timescale.com/dashboard/services +[hypertable]: /use-timescale/:currentVersion:/hypertables/ +[connect-to-service]: /getting-started/:currentVersion:/services/#connect-to-your-service +[caggs]: /use-timescale/:currentVersion:/continuous-aggregates/ \ No newline at end of file From 6b90d0b6ee36f17be94b847f336c0e4db21d4165 Mon Sep 17 00:00:00 2001 From: atovpeko Date: Wed, 30 Oct 2024 14:32:05 +0200 Subject: [PATCH 09/13] pricing widget --- _troubleshooting/slow-tiering-chunks.md | 1 + use-timescale/data-tiering/about-data-tiering.md | 1 + use-timescale/data-tiering/querying-tiered-data.md | 1 + use-timescale/data-tiering/tiered-data-replicas-forks.md | 1 + use-timescale/data-tiering/tour-data-tiering.md | 1 + 5 files changed, 5 insertions(+) diff --git a/_troubleshooting/slow-tiering-chunks.md b/_troubleshooting/slow-tiering-chunks.md index bf7e3cd7df..c88d705339 100644 --- a/_troubleshooting/slow-tiering-chunks.md +++ b/_troubleshooting/slow-tiering-chunks.md @@ -5,6 +5,7 @@ products: [cloud] topics: [data tiering] keywords: [tiered storage] tags: [tiered storage] +plans: [scale, enterprise] --- diff --git a/use-timescale/data-tiering/about-data-tiering.md b/use-timescale/data-tiering/about-data-tiering.md index 53a4d98c35..84261f1a72 100644 --- a/use-timescale/data-tiering/about-data-tiering.md +++ b/use-timescale/data-tiering/about-data-tiering.md @@ -7,6 +7,7 @@ tags: [storage, data management] cloud_ui: path: - [services, :serviceId, overview] +plans: [scale, enterprise] --- # About the object storage tier diff --git a/use-timescale/data-tiering/querying-tiered-data.md b/use-timescale/data-tiering/querying-tiered-data.md index e5a6cf5fcb..d1d5d4066e 100644 --- a/use-timescale/data-tiering/querying-tiered-data.md +++ 
b/use-timescale/data-tiering/querying-tiered-data.md @@ -4,6 +4,7 @@ excerpt: How to query tiered data product: [ cloud ] keywords: [ tiered storage, tiering ] tags: [ storage, data management ] +plans: [scale, enterprise] --- # Querying tiered data diff --git a/use-timescale/data-tiering/tiered-data-replicas-forks.md b/use-timescale/data-tiering/tiered-data-replicas-forks.md index 6a4a292aef..6e3707fb79 100644 --- a/use-timescale/data-tiering/tiered-data-replicas-forks.md +++ b/use-timescale/data-tiering/tiered-data-replicas-forks.md @@ -4,6 +4,7 @@ excerpt: How tiered data works on replicas and forks product: [cloud] keywords: [tiered storage] tags: [storage, data management] +plans: [scale, enterprise] --- # How tiered data works on replicas and forks diff --git a/use-timescale/data-tiering/tour-data-tiering.md b/use-timescale/data-tiering/tour-data-tiering.md index afe22c347c..7833d127dd 100644 --- a/use-timescale/data-tiering/tour-data-tiering.md +++ b/use-timescale/data-tiering/tour-data-tiering.md @@ -4,6 +4,7 @@ excerpt: A quick tour of tiered storage product: [cloud] keywords: [tiered storage] tags: [storage, data management] +plans: [scale, enterprise] --- # Tiered Storage From ee380c4ab5f282c9ee9433b5909d827b1d800bf3 Mon Sep 17 00:00:00 2001 From: Iain Date: Wed, 30 Oct 2024 14:11:15 +0100 Subject: [PATCH 10/13] chore: add the plan widget to the index page. 
--- use-timescale/data-tiering/index.md | 1 + 1 file changed, 1 insertion(+) diff --git a/use-timescale/data-tiering/index.md b/use-timescale/data-tiering/index.md index f100f99a6c..0085342981 100644 --- a/use-timescale/data-tiering/index.md +++ b/use-timescale/data-tiering/index.md @@ -4,6 +4,7 @@ excerpt: Save on storage costs by tiering older data to a low-cost bottomless ob products: [cloud] keywords: [tiered storage] tags: [storage, data management] +plans: [scale, enterprise] --- # Tiered storage From d44a4a4b60cd5ece7942411faea9350987e52aa6 Mon Sep 17 00:00:00 2001 From: Iain Date: Wed, 30 Oct 2024 14:42:35 +0100 Subject: [PATCH 11/13] chore: tiny cleanup for clickthroughs. --- use-timescale/data-tiering/index.md | 34 +++++++++++++++-------------- 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/use-timescale/data-tiering/index.md b/use-timescale/data-tiering/index.md index 0085342981..17c11b6f5a 100644 --- a/use-timescale/data-tiering/index.md +++ b/use-timescale/data-tiering/index.md @@ -9,12 +9,11 @@ plans: [scale, enterprise] # Tiered storage -Tiered storage is Timescale's [hierarchical storage management architecture](https://en.wikipedia.org/wiki/Hierarchical_storage_management). -Engineered for infinite low-cost scalability, tiered storage is available for the -[Time series and analytics](https://www.timescale.com/products) instances you create in -[Timescale](https://console.cloud.timescale.com/). +Tiered storage is a [hierarchical storage management architecture](https://en.wikipedia.org/wiki/Hierarchical_storage_management) for +[Time series and analytics][create-service] services you create in [$CLOUD_LONG](https://console.cloud.timescale.com/). + +Engineered for infinite low-cost scalability, tiered storage consists of the: -Tiered storage consists of the: * **High-performance tier**: rapid access to the most recent, and frequently accessed data. 
* **Object storage tier**: store data that is rarely accessed and has lower performance requirements. @@ -43,30 +42,33 @@ solutions to offload data to secondary storage and fetch it back in when needed. we do the work for you. + Tiered storage is only available for the [Time series and analytics](https://www.timescale.com/products) -instances you create in [Timescale](https://console.cloud.timescale.com/). +instances you create in [$CLOUD_LONG](https://console.cloud.timescale.com/). + Tiered storage **DOES NOT** work on Self-hosted TimescaleDB or Managed Service for TimescaleDB. -In this section you can: -* [Learn about the object storage tier][about-data-tiering] before you start using tiered storage. -* Take a [tour of tiered storage features][tour-data-tiering]. -* [Learn how to enable the object storage tier][enabling-data-tiering] on your service. -* Manually [tier chunks][manual-tier-chunk] to schedule individual chunks to be tiered. -* Create a [Tiering Policy][creating-data-tiering-policy] to automatically schedule chunks to be tiered. -* [Learn how to query tiered data][querying-tiered-data]. -* Manually [untier chunks][untier-data] to move data back to the high-performance local storage tier. -* [Disable tiering on a hypertable][disabling-data-tiering] on an individual table if you no longer want to associate it with tiered storage. +This section explains the following: +* [Learn about the object storage tier][about-data-tiering]: understand tiered storage before you + [Manage tiering][enabling-data-tiering]. +* [Tour tiered storage][tour-data-tiering]: see the different features in tiered storage. +* [Manage tiering][enabling-data-tiering]: enable and disable data tiering, automate tiering with + policies or tier and untier manually. +* [Query tiered data][querying-tiered-data]: query and performance for tiered data. +* [Replicas and forks with tiered data][replicas-and-forks]: billing and tiered storage. 
[about-data-tiering]: /use-timescale/:currentVersion:/data-tiering/about-data-tiering/ [tour-data-tiering]: /use-timescale/:currentVersion:/data-tiering/tour-data-tiering/ [enabling-data-tiering]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/ +[replicas-and-forks]: /use-timescale/:currentVersion:/data-tiering/tiered-data-replicas-forks/ [manual-tier-chunk]: /use-timescale/:currentVersion:/data-tiering/manual-tier-chunk/ [disabling-data-tiering]: /use-timescale/:currentVersion:/data-tiering/disabling-data-tiering/ -[creating-data-tiering-policy]: /use-timescale/:currentVersion:/data-tiering/creating-data-tiering-policy/ +[creating-data-tiering-policy]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/#automate-tiering-with-policies [querying-tiered-data]: /use-timescale/:currentVersion:/data-tiering/querying-tiered-data/ [untier-data]: /use-timescale/:currentVersion:/data-tiering/untier-data/ [add-retention-policies]: /api/:currentVersion:/continuous-aggregates/add_policies/ +[create-service]: /getting-started/:currentVersion:/services/ From 9d2b86ee676673e344de4eafd080c9d9f6339186 Mon Sep 17 00:00:00 2001 From: Iain Date: Thu, 31 Oct 2024 15:36:16 +0100 Subject: [PATCH 12/13] chore: updates on review. --- use-timescale/data-tiering/index.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/use-timescale/data-tiering/index.md b/use-timescale/data-tiering/index.md index 17c11b6f5a..069f068c5c 100644 --- a/use-timescale/data-tiering/index.md +++ b/use-timescale/data-tiering/index.md @@ -52,13 +52,13 @@ Tiered storage **DOES NOT** work on Self-hosted TimescaleDB or Managed Service f This section explains the following: -* [Learn about the object storage tier][about-data-tiering]: understand tiered storage before you - [Manage tiering][enabling-data-tiering]. +* [Learn about the object storage tier][about-data-tiering]: understand tiered storage. 
* [Tour tiered storage][tour-data-tiering]: see the different features in tiered storage. * [Manage tiering][enabling-data-tiering]: enable and disable data tiering, automate tiering with policies or tier and untier manually. * [Query tiered data][querying-tiered-data]: query and performance for tiered data. -* [Replicas and forks with tiered data][replicas-and-forks]: billing and tiered storage. +* [Replicas and forks with tiered data][replicas-and-forks]: how tiered storage works + with forks and replicas. [about-data-tiering]: /use-timescale/:currentVersion:/data-tiering/about-data-tiering/ From dd535463289d09ca3d69d57266723bdbd2b4fda8 Mon Sep 17 00:00:00 2001 From: atovpeko Date: Fri, 1 Nov 2024 15:26:16 +0200 Subject: [PATCH 13/13] removed tiered storage tour --- .../data-tiering/about-data-tiering.md | 80 +++-- .../data-tiering/enabling-data-tiering.md | 7 +- .../data-tiering/querying-tiered-data.md | 152 ++++++--- .../data-tiering/tour-data-tiering.md | 289 ------------------ use-timescale/page-index/page-index.js | 6 - 5 files changed, 169 insertions(+), 365 deletions(-) diff --git a/use-timescale/data-tiering/about-data-tiering.md b/use-timescale/data-tiering/about-data-tiering.md index 84261f1a72..84f3ebae73 100644 --- a/use-timescale/data-tiering/about-data-tiering.md +++ b/use-timescale/data-tiering/about-data-tiering.md @@ -12,51 +12,77 @@ plans: [scale, enterprise] # About the object storage tier -The tiered storage architecture complements Timescale's standard high-performance storage tier with a low-cost object storage tier. +Timescale's tiered storage architecture includes a standard high-performance storage tier and a low-cost object storage tier built on Amazon S3. You can use the standard tier for data that requires quick access, and the object tier for rarely used historical data. Chunks from a single hypertable, including compressed chunks, can stretch across these two storage tiers.
A compressed chunk uses a different storage representation after tiering. -You can move your hypertable data across the different storage tiers to get the best price performance. -You can use the standard high-performance storage tier for data that requires quick access, -and the low-cost object storage tier for rarely used historical data. -Regardless of where your data is stored, you can still query it with -[standard SQL][querying-tiered-data]. -Because it's queried normally with SQL, you can still JOIN against tiered data, -build views on tiered data, and even define continuous aggregates on tiered data. -In fact, because the implementation of continuous aggregates also use hypertables, -they can be tiered to low-cost storage as well. +In the standard storage, chunks are stored in the block format. In the object storage, they are stored in a compressed, columnar format. This format is different from that of the internals of the database, for better interoperability across various platforms. It allows for more efficient columnar scans across longer time periods, and Timescale uses other metadata and query optimizations to reduce the amount of data that needs to be fetched from the object storage tier to satisfy a query. +Regardless of where your data is stored, you can still query it with standard SQL. A single SQL query transparently pulls data from the appropriate chunks using the chunk exclusion algorithms. You can `JOIN` against tiered data, build views, and even define continuous aggregates on it. In fact, because the implementation of continuous aggregates also uses hypertables, they can be tiered to low-cost storage as well. ## Benefits of the object storage tier -The object storage tier is more than an archiving solution: +The object storage tier is more than an archiving solution. It is also: -* **Cost effective.** Store high volumes of data cost-efficiently. +* **Cost-effective:** store high volumes of data at a lower cost. 
You pay only for what you store, with no extra cost for queries. -* **Scalable.** Scale past the restrictions imposed by storage that can be attached +* **Scalable:** scale past the restrictions imposed by storage that can be attached directly to a Timescale service (currently 16 TB). -* **Online.** Your data is always there and can be [queried when needed][querying-tiered-data]. +* **Online:** your data is always there and can be [queried when needed][querying-tiered-data]. ## Architecture -The tiered storage backend works by periodically and asynchronously moving older chunks to the object storage tier; -an object store built on Amazon S3. -There, it's stored in the Apache Parquet format, which is a compressed -columnar format well-suited for S3. Data remains accessible both during and after the migration. +The tiered storage backend works by periodically and asynchronously moving older chunks to the object storage tier. +There, it's stored in the Apache Parquet format, which is a compressed columnar format well-suited for S3. Within a Parquet file, a set of rows is grouped together to form a row group. Within a row group, values for a single column across multiple rows are stored together. By default, tiered data is not included when querying from a Timescale service. -However, it is possible to access tiered data by [enabling tiered reads][querying-tiered-data] for a session, query, or even for all sessions. +However, you can access tiered data by [enabling tiered reads][querying-tiered-data] for a query, a session, or even for all sessions. After you enable tiered reads, when you run regular SQL queries, a behind-the-scenes process transparently pulls data from wherever it's located: the standard high-performance storage tier, the object storage tier, or both. 
-With tiered reads enabled, when you run regular SQL queries, a behind-the-scenes process transparently -pulls data from wherever it's located: the standard high-performance storage tier, the object storage tier, or both. Various SQL optimizations limit what needs to be read from S3: -* Chunk exclusion avoids processing chunks that fall outside the query's time window -* The database uses metadata about row groups and columnar offsets, so only - part of an object needs to be read from S3 - -The result is transparent queries across standard PostgreSQL storage and S3 -storage, so your queries fetch the same data as before. +* Chunk pruning - exclude the chunks that fall outside the query time window. +* Row group pruning - identify the row groups within the Parquet object that satisfy the query. +* Column pruning - fetch only columns that are requested by the query. + +The result is transparent queries across standard PostgreSQL storage and S3 storage, so your queries fetch the same data as before. 
+ +The following query is against a tiered dataset and illustrates the optimizations: + +```sql +EXPLAIN ANALYZE +SELECT count(*) FROM +( SELECT device_uuid, sensor_id FROM public.device_readings + WHERE observed_at > '2023-08-28 00:00+00' and observed_at < '2023-08-29 00:00+00' + GROUP BY device_uuid, sensor_id ) q; + QUERY PLAN + +------------------------------------------------------------------------------------------------- + Aggregate (cost=7277226.78..7277226.79 rows=1 width=8) (actual time=234993.749..234993.750 rows=1 loops=1) + -> HashAggregate (cost=4929031.23..7177226.78 rows=8000000 width=68) (actual time=184256.546..234913.067 rows=1651523 loops=1) + Group Key: osm_chunk_1.device_uuid, osm_chunk_1.sensor_id + Planned Partitions: 128 Batches: 129 Memory Usage: 20497kB Disk Usage: 4429832kB + -> Foreign Scan on osm_chunk_1 (cost=0.00..0.00 rows=92509677 width=68) (actual time=345.890..128688.459 rows=92505457 loops=1) + Filter: ((observed_at > '2023-08-28 00:00:00+00'::timestamp with time zone) AND (observed_at < '2023-08-29 00:00:00+00'::timestamp with t +ime zone)) + Rows Removed by Filter: 4220 + Match tiered objects: 3 + Row Groups: + _timescaledb_internal._hyper_1_42_chunk: 0-74 + _timescaledb_internal._hyper_1_43_chunk: 0-29 + _timescaledb_internal._hyper_1_44_chunk: 0-71 + S3 requests: 177 + S3 data: 224423195 bytes + Planning Time: 6.216 ms + Execution Time: 235372.223 ms +(16 rows) +``` + +`EXPLAIN` illustrates which chunks are being pulled in from the object storage tier: + +1. Fetch data from chunks 42, 43, and 44 from the object storage tier. +1. Prune row groups and limit the fetch to a subset of the offsets in the +Parquet object that potentially match the query filter. Only fetch the data +for `device_uuid`, `sensor_id`, and `observed_at` as the query needs only these 3 columns. 
## Limitations diff --git a/use-timescale/data-tiering/enabling-data-tiering.md b/use-timescale/data-tiering/enabling-data-tiering.md index 63ac5b5cde..ed5ed3506b 100644 --- a/use-timescale/data-tiering/enabling-data-tiering.md +++ b/use-timescale/data-tiering/enabling-data-tiering.md @@ -62,6 +62,8 @@ For example, to tier chunks that are more than three days old in the `example` [ SELECT add_tiering_policy('example', INTERVAL '3 days'); ``` +By default, a tiering policy runs hourly on your database. To change this interval, call `alter_job`. + ### Remove a tiering policy To remove an existing tiering policy, call `remove_tiering_policy`: @@ -109,7 +111,7 @@ Tiering a chunk is an asynchronous process that schedules the chunk to be tiered 1. **Repeat for all chunks you want to tier.** - Tiering a chunk schedules it for migration to the object storage tier, but the migration won't happen immediately. Chunks are tiered one at a time in order to minimize database resource consumption. You can continue to query a chunk during migration. + Tiering a chunk schedules it for migration to the object storage tier, but the migration won't happen immediately. Chunks are tiered one at a time in order to minimize database resource consumption. A chunk is marked as migrated and deleted from the standard storage only after it has been durably stored in the object storage tier. You can continue to query a chunk during migration. 1. **To see which chunks are tiered into the object storage tier, use the `tiered_chunks` informational view:** @@ -181,7 +183,6 @@ To untier a chunk, call the `untier_chunk` stored procedure. - ## Disable tiering If you no longer want to use tiered storage for a particular hypertable, drop the associated metadata by calling `disable_tiering`. 
@@ -218,4 +219,4 @@ If you no longer want to use tiered storage for a particular hypertable, drop th [console]: https://console.cloud.timescale.com/dashboard/services [hypertable]: /use-timescale/:currentVersion:/hypertables/ [connect-to-service]: /getting-started/:currentVersion:/services/#connect-to-your-service -[caggs]: /use-timescale/:currentVersion:/continuous-aggregates/ \ No newline at end of file +[caggs]: /use-timescale/:currentVersion:/continuous-aggregates/ diff --git a/use-timescale/data-tiering/querying-tiered-data.md b/use-timescale/data-tiering/querying-tiered-data.md index d1d5d4066e..6068d08379 100644 --- a/use-timescale/data-tiering/querying-tiered-data.md +++ b/use-timescale/data-tiering/querying-tiered-data.md @@ -9,12 +9,9 @@ plans: [scale, enterprise] # Querying tiered data - - - Once rarely used data is tiered and migrated to the object storage tier, it can still be queried with standard SQL by enabling the `timescaledb.enable_tiered_reads` GUC. -By default, the GUC is set to false so that queries on TimescaleDB do not touch tiered data. +By default, the GUC is set to `false`, so that queries do not touch tiered data. The `timescaledb.enable_tiered_reads` GUC, or Grand Unified Configuration variable, is a setting that controls if tiered data is queried. The configuration variable can be set at different levels, @@ -24,36 +21,32 @@ sessions. With tiered reads enabled, you can query your data normally even when it's distributed across different storage tiers. Your hypertable is spread across the tiers, so queries and `JOIN`s work and fetch the same data as usual. - - By default, tiered data is not accessed by queries. Querying tiered data may slow down query performance as the data is not stored locally on Timescale's high-performance storage tier. - +## Enable querying tiered data for a single query -### querying tiered data in a single query + 1. 
Enable `timescaledb.enable_tiered_reads` before querying the hypertable with tiered data and reset it after it is complete: -```sql -set timescaledb.enable_tiered_reads = true; SELECT count(*) FROM example; set timescaledb.enable_tiered_reads = false; -``` + ```sql + set timescaledb.enable_tiered_reads = true; SELECT count(*) FROM example; set timescaledb.enable_tiered_reads = false; + ``` -This queries data from all chunks including tiered chunks and non tiered chunks: - - ```sql - ||count| - |---| - |1000| - ``` + This queries data from all chunks including tiered chunks and non tiered chunks: + + ```sql + ||count| + |---| + |1000| + ``` - - -### Querying tiered data for an entire session +## Enable querying tiered data for a single session All future queries within a session can be enabled to use the object storage tier by enabling `timescaledb.enable_tiered_reads` within a session. @@ -61,16 +54,16 @@ All future queries within a session can be enabled to use the object storage tie 1. Enable `timescaledb.enable_tiered_reads` for an entire session: -```sql -set timescaledb.enable_tiered_reads to true; -``` + ```sql + set timescaledb.enable_tiered_reads = true; + ``` -1. All future queries in that session are configured to read from tiered data and locally stored data. + All future queries in that session are configured to read from tiered data and locally stored data. -### Querying tiered data in all future sessions +## Enable querying tiered data in all future sessions You can also enable queries to read from tiered data always by following these steps: @@ -78,41 +71,120 @@ You can also enable queries to read from tiered data always by following these s 1. Enable `timescaledb.enable_tiered_reads` for all future sessions: -```sql -alter database tsdb set timescaledb.enable_tiered_reads to true; -``` + ```sql + alter database tsdb set timescaledb.enable_tiered_reads = true; + ``` -1. 
In all future created sessions, timescaledb.enable_tiered_reads initializes with enabled. + In all sessions created from now on, `timescaledb.enable_tiered_reads` is initialized to `true`. -## Performance considerations +## Query data in the object storage tier + +This section illustrates how querying tiered storage works. + +Consider a simple database with a standard `devices` table and a `metrics` hypertable. After enabling tiered storage, you can see which chunks are tiered to the object storage tier: -Queries over tiered data are expected to be slower than over local data. However, in a limited number of scenarios tiered reads can impact query planning time over local data as well. In order to prevent any unexpected performance degradation for application queries, we keep the GUC `timescaledb.enable_tiered_reads` to false. +```sql + chunk_name | range_start | range_end +------------------+------------------------+------------------------ + _hyper_2_4_chunk | 2015-12-31 00:00:00+00 | 2016-01-07 00:00:00+00 + _hyper_2_3_chunk | 2017-08-17 00:00:00+00 | 2017-08-24 00:00:00+00 +(2 rows) +``` -* Queries without time boundaries specified are expected to perform slower when querying tiered data, both during query planning and during query execution. TimescaleDB's chunk exclusion algorithms cannot be applied for this case. +The following query fetches data only from the object storage tier. This makes sense based on the +`WHERE` clause specified by the query and the chunk ranges listed above for this +hypertable. 
+```sql + EXPLAIN SELECT * FROM metrics where ts < '2017-01-01 00:00+00'; + QUERY PLAN +--------------------------------------------------------------------- + Foreign Scan on osm_chunk_2 (cost=0.00..0.00 rows=2 width=20) + Filter: (ts < '2017-01-01 00:00:00'::timestamp without time zone) + Match tiered objects: 1 + Row Groups: + _timescaledb_internal._hyper_2_4_chunk: 0 +(5 rows) ``` -SELECT * FROM device_readings WHERE id = 10; + +If your query does not need to touch the object storage tier, it will only +process the chunks in the standard storage. The following query refers to newer data that is not yet tiered to the object storage tier. +`Match tiered objects: 0` in the plan indicates that no tiered data matches the query constraint, so data in the object storage tier is not touched at all. + +```sql + EXPLAIN SELECT * FROM metrics where ts > '2022-01-01 00:00+00'; + QUERY PLAN + +-------------------------------------------------------------------------------- +---------------------------------- + Append (cost=0.15..25.02 rows=568 width=20) + -> Index Scan using _hyper_2_5_chunk_metrics_ts_idx on _hyper_2_5_chunk (co +st=0.15..22.18 rows=567 width=20) + Index Cond: (ts > '2022-01-01 00:00:00'::timestamp without time zone) + -> Foreign Scan on osm_chunk_2 (cost=0.00..0.00 rows=1 width=20) + Filter: (ts > '2022-01-01 00:00:00'::timestamp without time zone) + Match tiered objects: 0 + Row Groups: +(7 rows) ``` +Here is another example with a `JOIN` that does not touch tiered data: + +```sql + EXPLAIN SELECT ts, device_id, description FROM metrics + JOIN devices ON metrics.device_id = devices.id + WHERE metrics.ts > '2023-08-01'; + QUERY PLAN + +-------------------------------------------------------------------------------- + Hash Join (cost=32.12..184.55 rows=3607 width=44) + Hash Cond: (devices.id = _hyper_4_9_chunk.device_id) + -> Seq Scan on devices (cost=0.00..22.70 rows=1270 width=36) + -> Hash (cost=25.02..25.02 rows=568 width=12) + -> Append (cost=0.15..25.02 
rows=568 width=12) + -> Index Scan using _hyper_4_9_chunk_metrics_ts_idx on _hyper_4_ +9_chunk (cost=0.15..22.18 rows=567 width=12) + Index Cond: (ts > '2023-08-01 00:00:00+00'::timestamp with +time zone) + -> Foreign Scan on osm_chunk_3 (cost=0.00..0.00 rows=1 width=12 +) + Filter: (ts > '2023-08-01 00:00:00+00'::timestamp with time + zone) + Match tiered objects: 0 + Row Groups: +(11 rows) +``` + + +## Performance considerations + +Queries over tiered data are expected to be slower than over local data. However, in a limited number of scenarios tiered reads can impact query planning time over local data as well. In order to prevent any unexpected performance degradation for application queries, we keep the GUC `timescaledb.enable_tiered_reads` set to `false`. + +* Queries without time boundaries specified are expected to perform slower when querying tiered data, both during query planning and during query execution. Timescale's chunk exclusion algorithms cannot be applied for this case. + + ```sql + SELECT * FROM device_readings WHERE id = 10; + ``` + * Queries with predicates computed at runtime (such as `NOW()`) are not always optimized at planning time and as a result might perform slower than statically assigned values when querying against the object storage tier. - For example, this query is optimized at planning time - ``` + For example, this query is optimized at planning time: + + ```sql SELECT * FROM metrics WHERE ts > '2023-01-01' AND ts < '2023-02-01' ``` - while the following query does not do chunk pruning at query planning time - ``` + The following query does not do chunk pruning at query planning time: + + ```sql SELECT * FROM metrics WHERE ts < now() - '10 days':: interval ``` At the moment, queries against tiered data work best when the query optimizer can apply planning time optimizations. - * Text and non-native types (JSON, JSONB, GIS) filtering is slower when querying tiered data. 
- diff --git a/use-timescale/data-tiering/tour-data-tiering.md b/use-timescale/data-tiering/tour-data-tiering.md index 7833d127dd..e69de29bb2 100644 --- a/use-timescale/data-tiering/tour-data-tiering.md +++ b/use-timescale/data-tiering/tour-data-tiering.md @@ -1,289 +0,0 @@ ---- -title: Tour of tiered storage -excerpt: A quick tour of tiered storage -product: [cloud] -keywords: [tiered storage] -tags: [storage, data management] -plans: [scale, enterprise] ---- - -# Tiered Storage - -The tiered storage architecture complements Timescale's standard high-performance storage tier with -a low-cost object storage tier; an object store built on Amazon S3. -In particular, users have the ability to transparently tier hypertable chunks into -the object storage tier on the Timescale platform for highly scalable, long-term storage. - -But this is not just an archive! Once tiered, these chunks remain fully and -directly queryable from within your database using standard SQL. Chunks for a -given hypertable can now stretch across standard storage (in block form) and -the object storage tier (in object form), but a single SQL query transparently -pulls data from the appropriate chunks using TimescaleDB's chunk exclusion algorithms. - -In fact, chunks in the object storage tier are stored in compressed, columnar format -(in a different format from the internals of the database, for better -interoperability across various platforms). This format allows for more -efficient columnar scans across longer time periods, and Timescale uses other -metadata and query optimizations to reduce the amount of data that needs to be -fetched from the object storage tier to satisfy a query. - -Let's get started! - -First, [enable tiered storage][enabling-data-tiering] from the UI on the Timescale cloud console. - -In an existing database service with a hypertable, you can tier chunks to - the object storage tier via automated policies on the hypertable, or via manual -commands on specific chunks. 
While users will likely adopt automated policies -in production scenarios, the manual command is a good way to start -experimenting with tiered storage. - -## Manually tier a specific chunk - -Users can move a single chunk to the object storage tier by explicitly specifying the chunk's name. - -``` -SELECT tier_chunk('_timescaledb_internal._hyper_2_3_chunk'); -``` - -To get the name of a chunk for tiering, you can use the chunks informational -view. For example: - -``` -SELECT chunk_schema, chunk_name, range_start, range_end FROM timescaledb_information.chunks WHERE hypertable_name = 'metrics_table'; --[ RECORD 1 ]+----------------------- -chunk_schema | _timescaledb_internal -chunk_name | _hyper_2_3_chunk -range_start | 2017-08-02 20:00:00-04 -range_end | 2017-08-09 20:00:00-04 - -``` - -Executing the tier_chunk command on a specific chunk does not immediately and -synchronously move the chunk to the object storage tier, but instead schedules the -chunk for migration. In the background, a cloud service will asynchronously -migrate the chunk to the object storage tier, and only mark the chunk as migrated -(and delete it from within the database's primary storage) once it has been -durably stored in the object storage tier. - -You can view chunks in the tiering queue, that is, chunks that are scheduled - to be tiered, by using this query. - -``` -SELECT * FROM timescaledb_osm.chunks_queued_for_tiering ; --[ RECORD 1 ]-----+----------------- -hypertable_schema | public -hypertable_name | metrics_table -chunk_name | _hyper_2_3_chunk -``` - -For smaller chunks, this asynchronous migration should happen within seconds or -a few minutes, although the chunk will remain fully queryable while it is being - migrated: the database engine continues to access the chunk in primary storage - until it fully switches over to use the chunk in the object storage tier. 
And yes, -you can tier a compressed chunk seamlessly, although it uses a different -storage representation once tiered to the object storage tier. - -## Automate through a tiering policy - -Users can create a tiering policy to automate moving data to object -storage, such that any chunks whose time range falls before the move_after -threshold will be moved to the object storage tier. This interval-threshold-based -policy is similar to age thresholds with compression and data retention policies. - -The tiering policy operates at a chunk level, such that the policy starts -up a job periodically that will asynchronously move SELECTed chunks over to -the object storage tier. By default, the tiering policy runs hourly on your database; -this can be modified via alter_job. - -Example: - -``` - SELECT add_tiering_policy('metrics', INTERVAL '4 weeks'); -``` - -We also provide a [remove tiering policy][creating-data-tiering-policy] interface if you want to stop tiering -data. - -This function removes the background job that automates tiering. Any chunks -that were already moved to the object storage tier will remain there, however. Any -chunks that are scheduled for tiering will also not be affected by this command. 
- -## List a set of tiered chunks - -You can review the set of chunks that are tiered into the object storage tier via a -standard informational view within the database: - -``` - SELECT * FROM timescaledb_osm.tiered_chunks; - --[ RECORD 1 ]-----+----------------------- -hypertable_schema | public -hypertable_name | metrics -chunk_name | _hyper_1_4_chunk -range_start | 2022-04-28 00:00:00+00 -range_end | 2022-05-05 00:00:00+00 --[ RECORD 2 ]-----+----------------------- -hypertable_schema | public -hypertable_name | metrics -chunk_name | _hyper_1_1_chunk -range_start | 2022-05-26 00:00:00+00 -range_end | 2022-06-02 00:00:00+00 -``` - -## Querying data in the object storage tier - -Once a hypertable is tiered across storage, you can continue to query it as -normal, including JOINing it with other relational tables, and all that SQL -goodness. - -Consider a simple database with a standard devices table and a metrics hypertable. - -``` -CREATE TABLE devices ( id integer, description text); -CREATE TABLE metrics ( ts timestamp with time zone, device_id integer, val float); -SELECT create_hypertable('metrics', 'ts'); -``` - -Once you insert data into the tables, you can then tier some of the hypertable's data to the object storage tier. -A simple query against the informational view illustrates which chunks are tiered to the object storage tier. - -``` - SELECT chunk_name, range_start, range_end FROM timescaledb_osm.tiered_chunks where hypertable_name = 'metrics'; - chunk_name | range_start | range_end -------------------+------------------------+------------------------ - _hyper_2_4_chunk | 2015-12-31 00:00:00+00 | 2016-01-07 00:00:00+00 - _hyper_2_3_chunk | 2017-08-17 00:00:00+00 | 2017-08-24 00:00:00+00 -(2 rows) - -``` - -By default, querying the object storage tier is disabled. Lets first enable this and -then run the query. See [querying tiered data][querying-tiered-data] for -detailed steps on enabling reads from the object storage tier. 
-``` -set timescaledb.enable_tiered_reads = true; -``` - -This query fetches data only from the object storage tier. This makes sense based on the -WHERE clause specified by the query an the chunk ranges listed above for this -hypertable. -``` - EXPLAIN SELECT * FROM metrics where ts < '2017-01-01 00:00+00'; - QUERY PLAN ---------------------------------------------------------------------- - Foreign Scan on osm_chunk_2 (cost=0.00..0.00 rows=2 width=20) - Filter: (ts < '2017-01-01 00:00:00'::timestamp without time zone) - Match tiered objects: 1 - Row Groups: - _timescaledb_internal._hyper_2_4_chunk: 0 -(5 rows) -``` - -If your query predicate never needs to touch the object storage tier, it will only -process those chunks stored in regular storage; in this case, the time -predicate refers to newer data that is not yet tiered to the object storage tier. -This query does not touch the object storage tier at all. We know that because -`Match tiered objects :0 ` in the plan indicates that no tiered data matches - the query constraint. - -``` - EXPLAIN SELECT * FROM metrics where ts > '2022-01-01 00:00+00'; - QUERY PLAN - --------------------------------------------------------------------------------- ----------------------------------- - Append (cost=0.15..25.02 rows=568 width=20) - -> Index Scan using _hyper_2_5_chunk_metrics_ts_idx on _hyper_2_5_chunk (co -st=0.15..22.18 rows=567 width=20) - Index Cond: (ts > '2022-01-01 00:00:00'::timestamp without time zone) - -> Foreign Scan on osm_chunk_2 (cost=0.00..0.00 rows=1 width=20) - Filter: (ts > '2022-01-01 00:00:00'::timestamp without time zone) - Match tiered objects: 0 - Row Groups: -(7 rows) -``` - -Here is another example with a JOIN that does not touch tiered data. 
- -``` - EXPLAIN SELECT ts, device_id, description FROM metrics - JOIN devices ON metrics.device_id = devices.id - WHERE metrics.ts > '2023-08-01'; - QUERY PLAN - --------------------------------------------------------------------------------- - Hash Join (cost=32.12..184.55 rows=3607 width=44) - Hash Cond: (devices.id = _hyper_4_9_chunk.device_id) - -> Seq Scan on devices (cost=0.00..22.70 rows=1270 width=36) - -> Hash (cost=25.02..25.02 rows=568 width=12) - -> Append (cost=0.15..25.02 rows=568 width=12) - -> Index Scan using _hyper_4_9_chunk_metrics_ts_idx on _hyper_4_ -9_chunk (cost=0.15..22.18 rows=567 width=12) - Index Cond: (ts > '2023-08-01 00:00:00+00'::timestamp with -time zone) - -> Foreign Scan on osm_chunk_3 (cost=0.00..0.00 rows=1 width=12 -) - Filter: (ts > '2023-08-01 00:00:00+00'::timestamp with time - zone) - Match tiered objects: 0 - Row Groups: -(11 rows) -``` - -## Digging deeper into querying tiered data -Lets dig a bit deeper into how data is organized on S3. When chunks are tiered -they are written out as Parquet objects. Parquet is a columnar storage format. -Within a Parquet file, we group a set of rows together to form a row group. -Within the row group, values for a single column (across multiple rows) are -stored together. The query planner optimizes access to the object storage tier at - multiple stages: -1. Chunk pruning - match only chunks that satisfy the query constraints. -This is done by looking at the hypertable's dimension column metadata, typically a timestamp. -2. Row group pruning - Identify the row groups within the Parquet object that satisfy the query. -3. Column pruning - Fetch only columns that are requested by the query. - -The following query is against a bigger data set tiered on S3 and you can see -the query optimizations in action here. -EXPLAIN will illustrate which chunks are being pulled in from the object storage tier. -First, we only fetch data from chunks 42, 43 and 44 from the object storage tier. 
Then - we prune row groups and limit the fetch to a subset of the offsets in the - Parquet object that potentially match the query filter. We only fetch the data -for the columns device_uuid, sensor_id and observed_at as the query needs -only these 3 columns. - -``` -EXPLAIN ANALYZE -SELECT count(*) FROM -( SELECT device_uuid, sensor_id FROM public.device_readings - WHERE observed_at > '2023-08-28 00:00+00' and observed_at < '2023-08-29 00:00+00' - GROUP BY device_uuid, sensor_id ) q; - QUERY PLAN - -------------------------------------------------------------------------------------------------- - Aggregate (cost=7277226.78..7277226.79 rows=1 width=8) (actual time=234993.749..234993.750 rows=1 loops=1) - -> HashAggregate (cost=4929031.23..7177226.78 rows=8000000 width=68) (actual time=184256.546..234913.067 rows=1651523 loops=1) - Group Key: osm_chunk_1.device_uuid, osm_chunk_1.sensor_id - Planned Partitions: 128 Batches: 129 Memory Usage: 20497kB Disk Usage: 4429832kB - -> Foreign Scan on osm_chunk_1 (cost=0.00..0.00 rows=92509677 width=68) (actual time=345.890..128688.459 rows=92505457 loops=1) - Filter: ((observed_at > '2023-08-28 00:00:00+00'::timestamp with time zone) AND (observed_at < '2023-08-29 00:00:00+00'::timestamp with t -ime zone)) - Rows Removed by Filter: 4220 - Match tiered objects: 3 - Row Groups: - _timescaledb_internal._hyper_1_42_chunk: 0-74 - _timescaledb_internal._hyper_1_43_chunk: 0-29 - _timescaledb_internal._hyper_1_44_chunk: 0-71 - S3 requests: 177 - S3 data: 224423195 bytes - Planning Time: 6.216 ms - Execution Time: 235372.223 ms -(16 rows) -``` - -## Dropping tiered data -You can drop tiered data by using the Timescale [data retention policy and API ][about-data-retention] - -[enabling-data-tiering]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/ -[querying-tiered-data]: /use-timescale/:currentVersion:/data-tiering/querying-tiered-data/ -[creating-data-tiering-policy]: 
/use-timescale/:currentVersion:/data-tiering/creating-data-tiering-policy/ -[about-data-retention]: /use-timescale/:currentVersion:/data-retention/about-data-retention diff --git a/use-timescale/page-index/page-index.js b/use-timescale/page-index/page-index.js index 508779fd33..716b8c18bf 100644 --- a/use-timescale/page-index/page-index.js +++ b/use-timescale/page-index/page-index.js @@ -481,12 +481,6 @@ module.exports = [ excerpt: "Learn how the object storage tier helps you save on storage costs", }, - { - title: "Tour of tiered storage", - href: "tour-data-tiering", - excerpt: - "A quick tour of tiered storage", - }, { title: "Manage tiering", href: "enabling-data-tiering",