-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-50666][SQL] Support hint for reading in JDBC data source #49564
base: master
Are you sure you want to change the base?
Conversation
cc @wangyum |
@@ -321,4 +323,5 @@ object JDBCOptions { | |||
val JDBC_CONNECTION_PROVIDER = newOption("connectionProvider") | |||
val JDBC_PREPARE_QUERY = newOption("prepareQuery") | |||
val JDBC_PREFER_TIMESTAMP_NTZ = newOption("preferTimestampNTZ") | |||
val JDBC_HINT_STRING = newOption("hint") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there no way to standardize the type of hints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we make the option is available for each of the built-in JDBC dialects ?
also cc @beliefer |
@@ -406,7 +406,7 @@ private case class MySQLDialect() extends JdbcDialect with SQLConfHelper with No | |||
} | |||
|
|||
options.prepareQuery + | |||
s"SELECT $columnList FROM ${options.tableOrQuery} $tableSampleClause" + | |||
s"SELECT ${options.hint}$columnList FROM ${options.tableOrQuery} $tableSampleClause" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there an spec to say that the "HINT" could only appear in the SELECT clause?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To @pan3793 , it's just a comment, but Apache Spark's HINT is also in SELECT clause only.
https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-hints.html#hints
@@ -249,6 +249,8 @@ class JDBCOptions( | |||
.map(_.toBoolean) | |||
.getOrElse(SQLConf.get.timestampType == TimestampNTZType) | |||
|
|||
val hint = parameters.get(JDBC_HINT_STRING).map(_ + " ").getOrElse("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the JDBC url starts with jdbc:oracle
or jdbc:mysql
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for @beliefer 's comment to be robust. (I deleted my previous comment.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @wayneguow .
This looks like a valid suggestion if we have a validation for the user-provided hint string is a valid comment like /*+... */
always.
BTW, did you exclude MSSQL
Hint because the syntax style is different and not a comment?
SELECT *
FROM Customers
WHERE City = 'San Francisco'
OPTION (USE INDEX = IX_City);
Also, cc @yaooqinn because this could be a nice improvement as a part of the umbrella JIRA, SPARK-47361 ( |
What changes were proposed in this pull request?
This PR aims to add a hint option for JDBC data source. This option is used to specify the hint for reading. It will apply only if the underlying DBMS supports the hint feature. Currently, this option is only supported by OracleDialect and MySQLDialect.
Why are the changes needed?
It's useful for performance tuning when reading from DBMS.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Passed GA and add a new test case.
Was this patch authored or co-authored using generative AI tooling?
No.