Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sqlfluff linted queries #6210

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Sqlfluff linted queries #6210

wants to merge 3 commits into from

Conversation

Camyll
Copy link
Contributor

@Camyll Camyll commented Jan 22, 2025

clickhouse queries that were linted by new sqlfluff linter

Copy link

vercel bot commented Jan 22, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
torchci ✅ Ready (Inspect) Visit Preview Jan 24, 2025 7:36pm

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 22, 2025
.lintrunner.toml Outdated
'python3',
'tools/linter/adapters/pip_init.py',
'--dry-run={{DRYRUN}}',
'sqlfluff',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's pin the dependency here

Suggested change
'sqlfluff',
'sqlfluff==3.3.0',

with open(filename, 'r') as f:
original = f.read()
original = original.replace('{', '\'{').replace('}', '}\'')
with open(filename, 'w') as f:
Copy link
Contributor

@huydhn huydhn Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to use a temporary file here to support --dry-run mode. With this setup, sqlfluff will always overwrite the original file.

With a temporary file, you can apply sqlfluff on that file, then read the content and compare it with the original.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huydhn so if it's dry-run mode we just output the change but don't actually make it? And if it's not dry-run it behaves the current way?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yean, dry-run is the default mode for lintrunner, so lintrunner --all-files just print the diff. To make the change, you need lintrunner --all-files -a

@@ -338,6 +338,25 @@ init_command = [
]
is_formatter = true

[[linter]]
code = 'SQLFLUFF'
include_patterns = ['torchci/clickhouse_queries/**/*.sql']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's start with one file first, i.e. torchci/clickhouse_queries/workflow_load/query.sql as I can confirm that one is working correctly. There is no major issue with the linter adapter, but I think there is another issue with sqlfluff. It changes the function name from toUnixTimestamp into TOUNIXTIMESTAMP. That's a problem on the tool rather than on your adapter. So let's deal with it in a separate PR.

Copy link
Contributor

@huydhn huydhn Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I look into this issue a bit more. The rule that is responsible for this is CP03 https://docs.sqlfluff.com/en/stable/reference/rules.html#rule-capitalisation.functions. I suggest that we add a .sqlfluff config file to the repo, then either

  • Disable the rule (I think I prefer this given that this is the function name from ClickHouse, this also looks simpler)
[sqlfluff]
exclude_rules = capitalisation.functions
  • or make it use camel-case by default
[sqlfluff:rules:capitalisation.functions]
extended_capitalisation_policy = camel

Both works, but the second approach looks a bit weird because it changes some one-word function like IF into iF.

@Camyll Camyll force-pushed the sqlfluff_linted_queries branch from cf93bad to 96605d9 Compare January 24, 2025 19:35
@Camyll Camyll marked this pull request as ready for review January 24, 2025 19:37
@huydhn
Copy link
Contributor

huydhn commented Jan 24, 2025

My suggestion is that we land this after @clee2000 change in #6206 because #6206 implements a testing mechanism for SQL changes.

A case in point is that there are still some failures after linting from https://torchci-git-sqlflufflintedqueries-fbopensource.vercel.app/kpis I think. So, having a test suite would help detect similar failures. It's also hard to make sure nothing break by eye-balling the preview page anyway.

cc @clee2000

@Camyll I think #6209 is already done what is asked in the bootcamp task. So, thank you for going beyond with this change :)

@clee2000
Copy link
Contributor

My suggestion is that we land this after @clee2000 change in #6206 because #6206 implements a testing mechanism for SQL changes.

A case in point is that there are still some failures after linting from https://torchci-git-sqlflufflintedqueries-fbopensource.vercel.app/kpis I think. So, having a test suite would help detect similar failures. It's also hard to make sure nothing break by eye-balling the preview page anyway.

cc @clee2000

@Camyll I think #6209 is already done what is asked in the bootcamp task. So, thank you for going beyond with this change :)

I don't think 6206 is going to be finished before this is. It also only has tests for 1 query so far

@huydhn
Copy link
Contributor

huydhn commented Jan 24, 2025

I don't think 6206 is going to be finished before this is. It also only has tests for 1 query so far

Yeah, that's not an issue though. I think we should just announce the new linter after #6209 lands, then do a coordinate BE effort to lint more queries + add tests. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants