
[COST-5551] prep for AWS Glue Data Catalogue migration #5449

Merged · 17 commits merged into main from glue-prep on Jan 22, 2025

Conversation

@maskarb (Member) commented Jan 16, 2025

Jira Ticket

COST-5551

Description

This change will do a few things (the new settings are sketched below the list):

  1. rename the sql_params key schema_name to schema (for consistency)
  2. add trino_schema_prefix to all SQL files and sql_params where necessary
  3. add a TRINO_SCHEMA_PREFIX env variable -> defaults to empty initially
  4. add a TRINO_S3A_OR_S3 env variable -> defaults to s3a initially (the current scheme)
  5. add a SCHEMA_SUFFIX env variable -> currently unused, but it will be used for local and ephemeral environments after we switch to Glue
  6. add more separation between the HCS, ROS, and SUBS S3 params in settings, and move the HCS S3 upload to the CSVhandler
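
A minimal sketch of the new settings and the sql_params rename, assuming plain os.environ reads; the module layout and example values are illustrative, not koku's actual code:

```python
# Sketch only: the new env-driven settings and the sql_params rename described
# above. The module layout and example values are assumptions, not koku's code.
import os

# Defaults preserve today's behavior: no schema prefix, s3a:// URIs.
TRINO_SCHEMA_PREFIX = os.environ.get("TRINO_SCHEMA_PREFIX", "")
TRINO_S3A_OR_S3 = os.environ.get("TRINO_S3A_OR_S3", "s3a")
# Currently unused; intended for per-user schemas once local and ephemeral
# environments move to Glue.
SCHEMA_SUFFIX = os.environ.get("SCHEMA_SUFFIX", "")

# sql_params now carries "schema" (renamed from "schema_name") plus the
# prefix, so a SQL template can render {trino_schema_prefix}{schema}.
sql_params = {
    "schema": "org1234567",  # hypothetical tenant schema for illustration
    "trino_schema_prefix": TRINO_SCHEMA_PREFIX,
}
```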

Testing

hive:

  1. smokes

transition from Hive to Glue:

  1. configure POSTGRES_SQL_SERVICE_HOST, POSTGRES_SQL_SERVICE_PORT, DATABASE_USER, and DATABASE_PASSWORD to connect to RDS
  2. create a new RDS database (CREATE DATABASE db_name;) and update the DATABASE_NAME env variable
  3. set SCHEMA_SUFFIX=_username; set TRINO_S3A_OR_S3=s3a; set TRINO_SCHEMA_PREFIX="" (how these settings compose is sketched after this list)
  4. restart koku -> this WILL be slow because the tiny RDS instance is slow
  5. run make create-test-customer and make load-test-customer-data test_source=AWS, then allow the data to ingest
  6. once done, run the Glue migration script and ensure the data was imported into the Glue Data Catalog correctly
  7. set TRINO_S3A_OR_S3=s3; set TRINO_SCHEMA_PREFIX="hccm_"
  8. map the glue.properties file to hive.properties in the Trino container and remove the dependency on hive-metastore
  9. restart all of koku and ensure hive-metastore is not running
  10. run make load-test-customer-data test_source=GCP
  11. ensure data is ingesting into the Glue Data Catalog
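
As a quick illustration of how the two phases differ, here is a sketch of how the prefix, suffix, and URI-scheme settings might compose into schema names and table locations; the helper names and bucket/path values are hypothetical, not koku code:

```python
# Sketch only: how the three settings compose across the two test phases.
# build_schema_name and build_table_location are hypothetical helpers for
# illustration; they are not functions in koku.
def build_schema_name(prefix: str, schema: str, suffix: str) -> str:
    """Compose the Trino/Glue schema name from the configured parts."""
    return f"{prefix}{schema}{suffix}"

def build_table_location(scheme: str, bucket: str, path: str) -> str:
    """Compose a table location using the configured URI scheme."""
    return f"{scheme}://{bucket}/{path}"

# Phase 1, Hive (steps 3-6): TRINO_S3A_OR_S3=s3a, TRINO_SCHEMA_PREFIX=""
print(build_schema_name("", "org1234567", "_username"))          # org1234567_username
print(build_table_location("s3a", "my-bucket", "data/parquet"))  # s3a://my-bucket/data/parquet

# Phase 2, Glue (steps 7-11): TRINO_S3A_OR_S3=s3, TRINO_SCHEMA_PREFIX="hccm_"
print(build_schema_name("hccm_", "org1234567", "_username"))     # hccm_org1234567_username
print(build_table_location("s3", "my-bucket", "data/parquet"))   # s3://my-bucket/data/parquet
```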

Release Notes

  • proposed release note:
    [COST-5551](https://issues.redhat.com/browse/COST-5551) prep for AWS Glue Data Catalogue migration

@maskarb maskarb added the smoke-tests pr_check will build the image and run minimal required smokes label Jan 16, 2025
@maskarb maskarb requested review from a team as code owners January 16, 2025 00:39
codecov bot commented Jan 16, 2025

Codecov Report

Attention: Patch coverage is 88.37209% with 5 lines in your changes missing coverage. Please review.

Project coverage is 94.1%. Comparing base (0e027d7) to head (8009c22).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #5449     +/-   ##
=======================================
- Coverage   94.2%   94.1%   -0.0%     
=======================================
  Files        370     370             
  Lines      31539   31550     +11     
  Branches    3378    3381      +3     
=======================================
+ Hits       29702   29703      +1     
- Misses      1195    1199      +4     
- Partials     642     648      +6     

@maskarb maskarb changed the title [COST-5368] prep for AWS Glue Data Catalogue migration [COST-5551] prep for AWS Glue Data Catalogue migration Jan 16, 2025
@maskarb maskarb added full-run-smoke-tests pr_check will build the image and run all smoke tests and removed smoke-tests pr_check will build the image and run minimal required smokes labels Jan 22, 2025
@maskarb maskarb merged commit fb81de0 into main Jan 22, 2025
14 checks passed
@maskarb maskarb deleted the glue-prep branch January 22, 2025 16:53