Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PLUGIN-1813] CloudSQLMySQL Escape column names #514

Conversation

psainics
Copy link
Contributor

@psainics psainics commented Oct 7, 2024

CloudSQLMySQL Escape column names

Jira : PLUGIN-1813

Description

If column name is a MySQL reserved keyword say insert the query will fail due to incorrect sql syntax.
This PR fixes this issue by escaping the column names using back ticks.

Similar column escapes is used in other plugins. Ref CloudSQLPostgreSQLSink

Code change

  • Modified CloudSQLMySQLSink.java
  • Modified MysqlSink.java

Unit Tests

  • Added CloudSQLMySQLSinkTest.java
  • image
  • Added MysqlSinkTest.java
  • image

@psainics psainics added the build label Oct 7, 2024
@psainics psainics changed the title [PLUGIN-1017] Escape column names [PLUGIN-1017] CloudSQLMySQLSink Escape column names Oct 7, 2024
@psainics psainics changed the title [PLUGIN-1017] CloudSQLMySQLSink Escape column names [PLUGIN-1017] CloudSQLMySQL Escape column names Oct 7, 2024
@psainics psainics force-pushed the pathc/escape-column-names-cloud-mysql branch from 58d2d12 to 307d311 Compare October 22, 2024 05:07
Copy link
Member

@itsankit-google itsankit-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this will fix one part of the problem but now this turns into a backward incompatible change because let's say if MySQL by default convert all identifiers to lowercase and if a cx has an existing pipeline which has any column name in uppercase, it will start failing.

The right thing to do would be as described in JIRA to introduce a new config named Field Mapping which can map any CDAP field in the schema to a different name and will be passed as quoted.

@psainics
Copy link
Contributor Author

psainics commented Oct 22, 2024

I know this will fix one part of the problem but now this turns into a backward incompatible change because let's say if MySQL by default convert all identifiers to lowercase and if a cx has an existing pipeline which has any column name in uppercase, it will start failing.

The right thing to do would be as described in JIRA to introduce a new config named Field Mapping which can map any CDAP field in the schema to a different name and will be passed as quoted.

Mysql Col names are not case sensitive.

@psainics psainics changed the title [PLUGIN-1017] CloudSQLMySQL Escape column names [PLUGIN-1813] CloudSQLMySQL Escape column names Oct 22, 2024
Copy link
Member

@itsankit-google itsankit-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do the same changes for MySQL sink as well.

@itsankit-google
Copy link
Member

I know this will fix one part of the problem but now this turns into a backward incompatible change because let's say if MySQL by default convert all identifiers to lowercase and if a cx has an existing pipeline which has any column name in uppercase, it will start failing.
The right thing to do would be as described in JIRA to introduce a new config named Field Mapping which can map any CDAP field in the schema to a different name and will be passed as quoted.

Mysql Col names are not case sensitivity.

Discussed offline, to address the larger problem in future.

@psainics psainics force-pushed the pathc/escape-column-names-cloud-mysql branch from 307d311 to b8a9f71 Compare October 23, 2024 05:58
@psainics psainics force-pushed the pathc/escape-column-names-cloud-mysql branch from b8a9f71 to 7be2c44 Compare October 23, 2024 09:30
Comment on lines +98 to +104
protected void insertOperation(PreparedStatement stmt) throws SQLException {
for (int fieldIndex = 0; fieldIndex < columnTypes.size(); fieldIndex++) {
ColumnType columnType = columnTypes.get(fieldIndex);
Schema.Field field = record.getSchema().getField(columnType.getName(), true);
writeToDB(stmt, field, fieldIndex);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Contributor Author

@psainics psainics Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ref:

@Override
protected void insertOperation(PreparedStatement stmt) throws SQLException {
for (int fieldIndex = 0; fieldIndex < columnTypes.size(); fieldIndex++) {
ColumnType columnType = columnTypes.get(fieldIndex);
// Get the field from the schema using the column name with ignoring case.
Schema.Field field = record.getSchema().getField(columnType.getName(), true);
writeToDB(stmt, field, fieldIndex);
}
}

This is done to get the cols values from schema with ignoring case.

If cols in CDAP schema are in different case from the cols in the mysql DB, ignoring case is required else we will get null values.

@psainics psainics merged commit 81c8853 into data-integrations:develop Nov 5, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants