Avoid normalising identifiers #30

Dietr1ch · 2025-01-11T15:34:40Z

Normalising identifiers for column and table names leads to surprising behaviour (#29, apache/datafusion#13649). There are workarounds (quoting identifiers, renaming the data), but they are hard to discover for people who are just starting to use bdt.

I think that a tool shouldn't have surprising behaviour like this.

The tracked `ahash` no longer builds in nightly as some SIMD features were dropped.

This is awfully surprising to new users, and I think it's a bad thing to do even if people more familiar with SQL DB engines won't be too surprised about it.

Dietr1ch · 2025-01-11T18:43:26Z

This is a breaking change, but given the nature of the tool it's an improvement too, as this normalisation prevents querying non-snake_case tables and fields coming from imported files (csv, parquet, etc.).

This is also untested, as I didn't find existing tests to mimic. I did try a local build and it runs as expected. I am now able to query the sample parquet files for Iris data from https://www.tablab.app/parquet/sample, which is the kind of thing that bdt should help with, but couldn't because of normalisation (which is "expected" in the true DB/SQL world, but not in the real world of data).
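To illustrate the behaviour described above, here is a minimal, stdlib-only sketch of how identifier normalisation can hide a mixed-case column. It is a simplified model, not DataFusion's or bdt's actual code: the `resolve_ident` and `lookup` helpers and the `SepalLength` column name are hypothetical and exist only for illustration.

```rust
use std::collections::HashMap;

/// Simplified model of SQL identifier handling (an assumption, not DataFusion's code):
/// a double-quoted identifier keeps its exact spelling, while an unquoted one
/// is lowercased only when `normalize` is true.
fn resolve_ident(raw: &str, normalize: bool) -> String {
    if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
        // Quoted: strip the surrounding quotes and keep the case as written.
        raw[1..raw.len() - 1].to_string()
    } else if normalize {
        // Unquoted with normalisation on: fold to lowercase.
        raw.to_lowercase()
    } else {
        // Unquoted with normalisation off: use the name as written.
        raw.to_string()
    }
}

/// Look up a column type by the identifier as it was written in a query.
fn lookup(
    schema: &HashMap<String, &'static str>,
    raw: &str,
    normalize: bool,
) -> Option<&'static str> {
    schema.get(&resolve_ident(raw, normalize)).copied()
}

fn main() {
    // Hypothetical schema of an imported file with a mixed-case column name.
    let mut schema = HashMap::new();
    schema.insert("SepalLength".to_string(), "Float64");

    // With normalisation, the unquoted name is folded to `sepallength` and not found.
    assert_eq!(lookup(&schema, "SepalLength", true), None);
    // The quoting workaround preserves the case, so the lookup succeeds.
    assert_eq!(lookup(&schema, "\"SepalLength\"", true), Some("Float64"));
    // Without normalisation, the name as written matches the file's column.
    assert_eq!(lookup(&schema, "SepalLength", false), Some("Float64"));

    println!("all lookups behaved as described");
}
```

In this model, the only way to reach a mixed-case column with normalisation on is to quote it, which is the workaround that new users are unlikely to guess.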

Dietr1ch force-pushed the dev/raw-idents branch from 74b339e to afa8ae8 Compare January 11, 2025 15:36

Dietr1ch added 3 commits January 11, 2025 12:37

gitignore: Ignore Unix hidden files

22a7f46

Update dependencies

c512438

The tracked `ahash` no longer builds in nightly as some SIMD features were dropped.

Avoid normalising identifiers.

8377ee7

This is awfully surprising to new users, and I think it's a bad thing to do even if people more familiar with SQL DB engines won't be too surprised about it.

Dietr1ch force-pushed the dev/raw-idents branch from afa8ae8 to 8377ee7 Compare January 11, 2025 15:37
