A node example to encode columns of a redshift table.
Best Practice when using Amazon Redshift is to have your columns encoded with the recommendation from ANALYZE COMPRESSION command.
"redshift_table_encode" is a module that is used to encode the table in two
ways:
A. tabularEncode -
useful for first time encoding (most of the columns are not encoded)
encode the whole table with the following steps:
- connect to redshift using pg - should be in this format:
postgres://USER:PASSWORD@DOMAIN-OF-REDSHIFT:PORT/DBNAME?tcpKeepAlive=true - analyze copmpression - to get the recommended encoding to all columns
- create new temp table with the encoded columns
- insert all the data from the table to the temp table
- drop the not encoded table
- rename the temp table to the original name
B. columnarEncode
- connect to redshift using pg - should be in this format: postgres://USER:PASSWORD@DOMAIN-OF-REDSHIFT:PORT/DBNAME?tcpKeepAlive=true
- analyze copmpression - to get the recommended encoding to all columns
- get current status of columns
- filter only the columns that are not encoded
- encode every column separately:
5.1 create a new encoded column with a temp name
5.2 update the new column with all the data from the original column
5.3 drop the original column
5.4 rename new column to original name
I'm open to suggestions, code reviews and contributions :)
hope this would be useful to you.