Support virtual SQL for JSON TABLE #4

Merged
merged 9 commits into from
Jan 2, 2025

Conversation

GITHUBear
Contributor

Summary

  • Support DDL/DML for JSON TABLE:
    1. ALTER TABLE XXX CHANGE XXX
    2. ALTER TABLE XXX MODIFY XXX
    3. ALTER TABLE XXX ADD XXX
    4. ALTER TABLE XXX DROP XXX
    5. CREATE TABLE XXX
    6. INSERT INTO XXX
    7. UPDATE XXX SET XXX WHERE XXX
    8. DELETE FROM XXX WHERE XXX
    9. SELECT XXX FROM XXX WHERE XXX
  • Data types supported by JSON TABLE:
    1. BOOL
    2. TIMESTAMP
    3. VARCHAR
    4. DECIMAL
    5. INT
  • Test suites:
    1. tests/test_json_table.py
    2. tests/test_oceanbase_dialect.py

Solution Description

Introduce sqlglot and pydantic into pyobvector.

Signed-off-by: shanhaikang.shk <[email protected]>
Session = sessionmaker(bind=self.engine)
session = Session()
new_meta_cache_items = []
col_id = 16
Contributor

Why does col_id start at 16?

Contributor Author

The OceanBase user column ID also starts with 16, which can be considered an easter egg.

Contributor

There doesn't seem to be any problem, but it also doesn't feel very necessary.
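
The numbering discussed in this thread can be sketched as below. This is a minimal illustration and not the PR's code: the helper `assign_column_ids` and the sample column names are hypothetical, and only the starting value 16 comes from the diff.

```python
from itertools import count

# Starting value taken from the diff; per the author it mirrors the first
# OceanBase user column ID.
FIRST_USER_COL_ID = 16


def assign_column_ids(column_names, start=FIRST_USER_COL_ID):
    """Pair each column name with a sequential ID beginning at `start`."""
    ids = count(start)
    return {name: next(ids) for name in column_names}


# Hypothetical column names, for illustration only.
ids = assign_column_ids(["id", "name", "created_at"])
```

Keeping the start value in a named constant documents the choice in one place, which may answer the reviewer's question for future readers.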

    **kwargs,
):
    super().__init__(uri, user, password, db_name, **kwargs)
    self.Base.metadata.create_all(self.engine)
Collaborator

Will this DDL work as expected?

Contributor Author

A DBA will create these two tables in advance, so even users without DDL permissions can use them normally.
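
For context, SQLAlchemy's `MetaData.create_all` checks whether each table exists before creating it by default (`checkfirst=True`), so tables that were created in advance are skipped. The sketch below reproduces that check-then-create idea with the standard library's `sqlite3` instead of SQLAlchemy and OceanBase. The table names and columns are hypothetical, not the ones used by pyobvector.

```python
import sqlite3

# Hypothetical table definitions, for illustration only.
TABLES = {
    "meta_table": "CREATE TABLE meta_table (user_id INTEGER, jtable_name TEXT)",
    "data_table": "CREATE TABLE data_table (user_id INTEGER, jdata TEXT)",
}


def create_missing(conn):
    """Issue CREATE TABLE only for tables that do not exist yet."""
    existing = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    created = []
    for name, ddl in TABLES.items():
        if name not in existing:
            conn.execute(ddl)
            created.append(name)
    return created


conn = sqlite3.connect(":memory:")
conn.execute(TABLES["meta_table"])  # simulates a table pre-created by a DBA
first = create_missing(conn)   # creates only the missing table
second = create_missing(conn)  # nothing left to create
```

When every table already exists, no DDL statement is sent, which is why a user without DDL permissions would not hit a permission error in this sketch.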

def refresh_metadata(self):
    self.jmetadata.reflect(self.engine)

def perform_json_table_sql(self, sql: str):
Collaborator

I recommend adding a type annotation for the function's return value.

Contributor Author

done
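
A return annotation of the kind requested might look like the sketch below. The return type and the stub body are assumptions made for illustration; the thread does not show what `perform_json_table_sql` actually returns.

```python
from typing import Any, Optional, Sequence, get_type_hints


def perform_json_table_sql(sql: str) -> Optional[Sequence[Any]]:
    """Stub with an assumed contract: rows for SELECT, None otherwise."""
    if sql.lstrip().upper().startswith("SELECT"):
        return []  # placeholder for fetched rows
    return None


# The annotation is available to tooling and at runtime.
hints = get_type_hints(perform_json_table_sql)
```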

if jtable_name in self.jmetadata.meta_cache:
    raise ValueError("Table name duplicated")

Session = sessionmaker(bind=self.engine)
Collaborator

I think it would be better to declare the sessionmaker as an attribute of the client class.

Contributor Author

done
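
The suggestion in this thread can be sketched as follows, assuming a hypothetical client class. SQLAlchemy's `sessionmaker` is swapped for a `functools.partial` over `sqlite3.connect` so that the sketch runs on the standard library alone. The class name, the `meta` table, and its columns are illustrative and not taken from pyobvector.

```python
import sqlite3
from functools import partial


class JsonTableClientSketch:
    """Hypothetical client; only the factory-as-attribute pattern matters."""

    def __init__(self, db_path: str) -> None:
        # Built once here instead of inside every method, analogous to
        # storing sessionmaker(bind=self.engine) on the client.
        self.session_factory = partial(sqlite3.connect, db_path)
        self.meta_cache = {}

    def create_table(self, jtable_name: str, columns: list) -> None:
        if jtable_name in self.meta_cache:
            raise ValueError("Table name duplicated")
        conn = self.session_factory()  # reuse the shared factory
        try:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS meta (jtable_name TEXT, col TEXT)"
            )
            conn.executemany(
                "INSERT INTO meta VALUES (?, ?)",
                [(jtable_name, col) for col in columns],
            )
            conn.commit()
            # Update the in-memory cache only after the commit succeeded.
            self.meta_cache[jtable_name] = list(columns)
        finally:
            conn.close()
```

Holding the factory on the instance keeps its configuration in one place and avoids rebuilding it on every call.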

try:
    session.commit()
    self.jmetadata.meta_cache[jtable_name] = new_meta_cache_items
    logger.info(f"ADD METADATA CACHE ---- {jtable_name}: {new_meta_cache_items}")
Collaborator

Maybe logger.debug is more appropriate here.

Contributor Author

done
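
The practical effect of the change can be sketched with the standard `logging` module: a logger set to INFO drops DEBUG records, so the cache message only appears when debugging is enabled. The logger name and the sample values below are placeholders, not taken from the PR.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("json_table_sketch")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False  # keep the demo output out of the root logger

jtable_name = "t1"                   # placeholder value
new_meta_cache_items = ["c1", "c2"]  # placeholder value

# At INFO level the DEBUG record is discarded.
logger.setLevel(logging.INFO)
logger.debug("ADD METADATA CACHE ---- %s: %s", jtable_name, new_meta_cache_items)
at_info = stream.getvalue()

# At DEBUG level the same call is emitted.
logger.setLevel(logging.DEBUG)
logger.debug("ADD METADATA CACHE ---- %s: %s", jtable_name, new_meta_cache_items)
at_debug = stream.getvalue()
```

Passing the values as arguments rather than through an f-string also defers string formatting until a handler actually emits the record.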

pyobvector/client/ob_vec_json_table_client.py
Contributor

@caifeizhi caifeizhi left a comment

LGTM

else:
    ast.args['joins'] = [join_node]

extra_filter_str = f"user_id = {self.user_id} AND jtable_name = '{table_name}'"
Contributor

Qualify these columns with the table name to avoid conflicts; for example, user_id might also exist in a user-defined table.
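
The conflict described here can be reproduced in a small sketch. It uses the standard library's `sqlite3` as a stand-in for OceanBase, and the tables `meta` and `user_tbl` are hypothetical. When both joined tables have a `user_id` column, the unqualified filter is rejected as ambiguous, while the qualified one runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE meta (user_id INTEGER, jtable_name TEXT);
    CREATE TABLE user_tbl (user_id INTEGER, payload TEXT);
    INSERT INTO meta VALUES (1, 't1');
    INSERT INTO user_tbl VALUES (1, 'hello');
    """
)

JOIN_SQL = (
    "SELECT user_tbl.payload FROM user_tbl "
    "JOIN meta ON user_tbl.user_id = meta.user_id WHERE "
)


def run(filter_str):
    """Append a filter to the join and return all rows."""
    return conn.execute(JOIN_SQL + filter_str).fetchall()


# Unqualified filter, shaped like the one in the diff: user_id is ambiguous.
try:
    run("user_id = 1 AND jtable_name = 't1'")
    unqualified_error = None
except sqlite3.OperationalError as exc:
    unqualified_error = str(exc)

# Qualified filter: every column names its table.
rows = run("meta.user_id = 1 AND meta.jtable_name = 't1'")
```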

@GITHUBear GITHUBear merged commit e0bf21f into main Jan 2, 2025
2 checks passed
3 participants