Reorg tiering policy sections into manage tiering #3524
Allow 10 minutes from last push for the staging site to build. If the link doesn't work, try using incognito mode instead. For internal reviewers, check web-documentation repo actions for staging build status. Link to build for this PR: http://docs-dev.timescale.com/docs-3508-docs-rfc-reorg-tiering-policy-sections-into-manage-tiering
…s-into-manage-tiering
Few comments, good stuff.
…s-into-manage-tiering
Co-authored-by: Iain Cox <[email protected]> Signed-off-by: atovpeko <[email protected]>
…-tiering' of github.com:timescale/docs into 3508-docs-rfc-reorg-tiering-policy-sections-into-manage-tiering
@gayyappan can you have a look at https://docs-dev.timescale.com/docs-3508-docs-rfc-reorg-tiering-policy-sections-into-manage-tiering/use-timescale/3508-docs-rfc-reorg-tiering-policy-sections-into-manage-tiering/data-tiering/enabling-data-tiering/ please? Please leave any comments in this PR.

@atovpeko: I have bad news for you. Now that we have https://docs-dev.timescale.com/docs-3508-docs-rfc-reorg-tiering-policy-sections-into-manage-tiering/use-timescale/3508-docs-rfc-reorg-tiering-policy-sections-into-manage-tiering/data-tiering/enabling-data-tiering/, I don't think we need https://docs-dev.timescale.com/docs-3508-docs-rfc-reorg-tiering-policy-sections-into-manage-tiering/use-timescale/3508-docs-rfc-reorg-tiering-policy-sections-into-manage-tiering/data-tiering/tour-data-tiering/ any more. @gayyappan, what do you think? Do we have a corresponding redirects PR?
…s-into-manage-tiering
use-timescale/data-tiering/index.md
Outdated
* [Disable tiering on a hypertable][disabling-data-tiering] on an individual table if you no longer want to associate it with tiered storage.
This section explains the following:
* [Learn about the object storage tier][about-data-tiering]: understand tiered storage before you
[Manage tiering][enabling-data-tiering].
should remove this.
use-timescale/data-tiering/index.md
Outdated
* [Manage tiering][enabling-data-tiering]: enable and disable data tiering, automate tiering with
policies or tier and untier manually.
* [Query tiered data][querying-tiered-data]: query and performance for tiered data.
* [Replicas and forks with tiered data][replicas-and-forks]: billing and tiered storage.
* [Replicas and forks with tiered data][replicas-and-forks]: billing and tiered storage.
* [Replicas and forks with tiered data][replicas-and-forks]: How does tiered storage work with forks and replicas.
older than the `move_after` threshold to the object storage tier. This works similarly to a
[data retention policy][data-retention], but chunks are moved rather than deleted.

A tiering policy schedules a job that runs periodically to asynchronously migrate eligible chunks to object storage. Chunks are considered tiered once they appear in the `timescaledb_osm.tiered_chunks` view.
A tiering policy schedules a job that runs periodically to asynchronously migrate eligible chunks to object storage. Chunks are considered tiered once they appear in the `timescaledb_osm.tiered_chunks` view.
A tiering policy schedules a job that runs periodically to asynchronously migrate eligible chunks to object storage. After chunks are tiered, they appear in the `timescaledb_osm.tiered_chunks` view.
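For anyone reviewing the wording of this paragraph, the behavior it describes can be sketched as a toy model in Python. This is only an illustration of the `move_after` rule as stated in the diff, not Timescale's implementation: `Chunk`, `run_tiering_job`, and the sample dates are hypothetical names and values, and the `tiered_view` list merely stands in for what the `timescaledb_osm.tiered_chunks` view would report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Chunk:
    # Hypothetical stand-in for a hypertable chunk.
    name: str
    range_end: datetime  # upper bound of the chunk's time range
    tiered: bool = False


def run_tiering_job(chunks, move_after, now):
    """One run of the periodic job: tier every untiered chunk whose
    time range ends before (now - move_after). Returns the moved names."""
    cutoff = now - move_after
    moved = []
    for chunk in chunks:
        if not chunk.tiered and chunk.range_end < cutoff:
            chunk.tiered = True  # the chunk now lives in object storage
            moved.append(chunk.name)
    return moved


now = datetime(2024, 6, 1)
chunks = [
    Chunk("chunk_a", datetime(2024, 1, 8)),   # older than the threshold
    Chunk("chunk_b", datetime(2024, 5, 27)),  # too recent to tier
]

moved = run_tiering_job(chunks, timedelta(weeks=3), now)

# Stand-in for querying the view of tiered chunks after the job has run.
tiered_view = [c.name for c in chunks if c.tiered]
```

The point of the sketch is the ordering that the suggested wording makes explicit: the job tiers a chunk first, and the chunk shows up in the view afterwards.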
minor edits suggested.
The overall "Manage tiering" section looks good!
These pages are really coming together.
* [Disable tiering on a hypertable][disabling-data-tiering] on an individual table if you no longer want to associate it with tiered storage.
This section explains the following:
* [Learn about the object storage tier][about-data-tiering]: understand tiered storage.
* [Tour tiered storage][tour-data-tiering]: see the different features in tiered storage.
Can you remove this link please.
---

# Tier data to the object storage tier
# Manage tiering
Maybe the title should state more clearly what the page covers. How about "Manage automatic and manual tiering"?
---

# About the object storage tier

The tiered storage architecture complements Timescale's standard high-performance storage tier with a low-cost object storage tier.
The Timescale's tiered storage architecture includes a standard high-performance storage tier and a low-cost object storage tier built on Amazon S3. You can use the standard tier for data that requires quick access, and the object tier for rarely used historical data. Chunks from a single hypertable, including compressed chunks, can stretch across these two storage tiers. A compressed chunk uses a different storage representation after tiering.
The Timescale's tiered storage architecture includes a standard high-performance storage tier and a low-cost object storage tier built on Amazon S3. You can use the standard tier for data that requires quick access, and the object tier for rarely used historical data. Chunks from a single hypertable, including compressed chunks, can stretch across these two storage tiers. A compressed chunk uses a different storage representation after tiering.
Timescale's tiered storage architecture includes a standard high-performance storage tier, and a low-cost object storage tier built on Amazon S3. You use the standard tier for data that requires quick access, and the object tier for rarely used historical data. Chunks from a single hypertable, including compressed chunks, can stretch across these two storage tiers. A compressed chunk uses a different storage representation after tiering.
build views on tiered data, and even define continuous aggregates on tiered data.
In fact, because the implementation of continuous aggregates also use hypertables,
they can be tiered to low-cost storage as well.
In the standard storage, chunks are stored in the block format. In the object storage, they are stored in a compressed, columnar format. This format is different from that of the internals of the database, for better interoperability across various platforms. It allows for more efficient columnar scans across longer time periods, and Timescale uses other metadata and query optimizations to reduce the amount of data that needs to be fetched from the object storage tier to satisfy a query.
In the standard storage, chunks are stored in the block format. In the object storage, they are stored in a compressed, columnar format. This format is different from that of the internals of the database, for better interoperability across various platforms. It allows for more efficient columnar scans across longer time periods, and Timescale uses other metadata and query optimizations to reduce the amount of data that needs to be fetched from the object storage tier to satisfy a query.
In high-performance storage, chunks are stored in the block format. In the object storage, they are stored in a compressed, columnar format. For better interoperability across various platforms, this format is different from that of the internals of the database. It allows for more efficient columnar scans across longer time periods, and Timescale Cloud uses other metadata and query optimizations to reduce the amount of data that needs to be fetched from the object storage tier to satisfy a query.
an object store built on Amazon S3.
There, it's stored in the Apache Parquet format, which is a compressed
columnar format well-suited for S3. Data remains accessible both during and after the migration.
The tiered storage backend works by periodically and asynchronously moving older chunks to the object storage tier.
The tiered storage backend works by periodically and asynchronously moving older chunks to the object storage tier.
The tiered storage backend works by periodically and asynchronously moving older chunks from high-performance storage to the object storage tier.
The result is transparent queries across standard PostgreSQL storage and S3
storage, so your queries fetch the same data as before.
* Chunk pruning - exclude the chunks that fall outside the query time window.
Can you put "Chunk pruning", etc. in bold to match the other lists on the page, please?
* Row group pruning - identify the row groups within the Parquet object that satisfy the query.
* Column pruning - fetch only columns that are requested by the query.

The result is transparent queries across standard PostgreSQL storage and S3 storage, so your queries fetch the same data as before.
The result is transparent queries across standard PostgreSQL storage and S3 storage, so your queries fetch the same data as before.
The result is transparent queries across high-performance storage and S3 object storage, so your queries fetch the same data as before.
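To make the three pruning steps in the list under review concrete, here is a minimal Python sketch of how they compose. It is a toy model under stated assumptions, not how Timescale or Parquet readers are implemented: `TieredChunk`, `RowGroup`, `scan`, the integer timestamps, and the per-row-group min/max fields are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    # Hypothetical row group with min/max timestamp statistics.
    min_ts: int
    max_ts: int
    rows: list  # each row is a dict of column name -> value


@dataclass
class TieredChunk:
    # Hypothetical tiered chunk covering the half-open range [start, end).
    start: int
    end: int
    row_groups: list


def scan(chunks, t_from, t_to, columns):
    """Return the requested columns for rows with t_from <= ts < t_to."""
    result = []
    for chunk in chunks:
        # Chunk pruning: skip chunks entirely outside the query time window.
        if chunk.end <= t_from or chunk.start >= t_to:
            continue
        for group in chunk.row_groups:
            # Row group pruning: skip groups whose statistics cannot match.
            if group.max_ts < t_from or group.min_ts >= t_to:
                continue
            for row in group.rows:
                if t_from <= row["ts"] < t_to:
                    # Column pruning: keep only the requested columns.
                    result.append({name: row[name] for name in columns})
    return result


chunks = [
    TieredChunk(0, 100, [
        RowGroup(0, 49, [{"ts": 10, "temp": 1.0, "site": "a"}]),
        RowGroup(50, 99, [{"ts": 60, "temp": 2.0, "site": "b"}]),
    ]),
    TieredChunk(100, 200, [
        RowGroup(100, 199, [{"ts": 150, "temp": 3.0, "site": "c"}]),
    ]),
]

result = scan(chunks, 50, 100, ["ts", "temp"])
```

In this example the second chunk is excluded by chunk pruning, the first row group by row group pruning, and the `site` column by column pruning, so only one narrow row is ever read.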
Enable tiered storage to begin migrating rarely used data from Timescale's standard high-performance storage tier
to the object storage tier to save on storage costs.
You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to the object storage. After you [enable tiered storage](#enable-tiered-storage), you then either [create automated tiering policies](#automate-tiering-with-policies) or [manually tier and untier data](#manually-tier-and-untier-chunks).
You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to the object storage. After you [enable tiered storage](#enable-tiered-storage), you then either [create automated tiering policies](#automate-tiering-with-policies) or [manually tier and untier data](#manually-tier-and-untier-chunks).
You use tiered storage to save on storage costs. Specifically, you can migrate rarely used data from Timescale's standard high-performance storage to object storage. After you [enable tiered storage](#enable-tiered-storage), you then either [create automated tiering policies](#automate-tiering-with-policies) or [manually tier and untier data](#manually-tier-and-untier-chunks).
@@ -23,95 +21,170 @@ sessions.
With tiered reads enabled, you can query your data normally even when it's distributed across different storage tiers.
Your hypertable is spread across the tiers, so queries and `JOIN`s work and fetch the same data as usual.

<!-- vale Google.Acronyms = YES -->

<Highlight type="warning">
I'd make this into a sentence without the warning and link to the performance section. If you must, make it an info admonition.
…s-into-manage-tiering
No description provided.