Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exploration creation fails for tables with weird names #4088

Open
Anish9901 opened this issue Dec 17, 2024 · 1 comment
Open

Exploration creation fails for tables with weird names #4088

Anish9901 opened this issue Dec 17, 2024 · 1 comment
Labels
ready Ready for implementation restricted: maintainers Only maintainers can resolve this issue type: bug work: backend Related to Python, Django, and simple SQL

Comments

@Anish9901
Copy link
Member

To Reproduce

  1. Create a table with name Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clear
  2. Add a couple of columns with some data in them
  3. Try to explore the newly created table with the Data explorer
  4. Add more than 1 columns from the base table and notice the error.

Additional context

Screenshot 2024-12-17 at 11 15 44 PM

@Anish9901 Anish9901 added type: bug work: backend Related to Python, Django, and simple SQL restricted: maintainers Only maintainers can resolve this issue labels Dec 17, 2024
@Anish9901 Anish9901 added this to the v0.2.0 (beta release) milestone Dec 17, 2024
@Anish9901 Anish9901 changed the title Exploration creation fails for a tables with weird table names Exploration creation fails for tables with weird table names Dec 17, 2024
@Anish9901 Anish9901 changed the title Exploration creation fails for tables with weird table names Exploration creation fails for tables with weird names Dec 17, 2024
@kgodey kgodey removed the type: bug label Jan 17, 2025
@seancolsen seancolsen added ready Ready for implementation and removed needs: clarification beta: approved labels Jan 28, 2025
@seancolsen
Copy link
Contributor

seancolsen commented Jan 28, 2025

I was intrigued by this bug so I spent a little time troubleshooting it.

I had an easy time reproducing it, so I'm removing the needs: clarification label. This is clearly a bug.

PostgreSQL was hitting an "ambiguous column reference" error when attempting to run this SQL:

SQL
WITH
  anon_3 AS (
    SELECT
      public."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clea".id AS "Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clea_id",
      public."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clea".yes AS "Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clea_yes"
    FROM public."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clea"
  ),
  anon_2 AS (
    SELECT
      anon_3."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clea_id" AS "Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - saving_1",
      anon_3."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clea_yes" AS "Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - saving_2"
    FROM anon_3
    ORDER BY
      anon_3."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - savings_clea_id" ASC
  ),
  anon_1 AS (
    SELECT
      anon_2."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - saving_1" AS "Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - saving_1",
      anon_2."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - saving_2" AS "Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - saving_2"
    FROM anon_2
    LIMIT %(param_1)s
  )
SELECT
  anon_1."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - saving_1",
  anon_1."Sheet 1-Acct Statement_curr_sav_jan1-24-oct10-24 - saving_2"
FROM anon_1

Here's why

  1. Within the first CTE, we're concatenating the table name + an underscore + a column name. This concatenation seems to be the root cause.

  2. In PostgreSQL, the "identifier length" is limited to 63 bytes. If you write a CTE with a column alias longer than 63 bytes, the query planner will actually see a CTE with a truncated version of that identifier. And in this case the first 63 bytes of the identifiers were identical, so the columns in the CTE became ambiguous, thus unable to reference. Very interesting!!

It seems to me that we'll need to eliminate concatenation within aliases. In this case, the offending code seems likely to be python, but I've not traced it down fully. I think it's probably somewhere in mathesar/utils/explorations.py or deeper. I predict that it might not be straightforward to eliminate the concatenation!

There could be other cases of concatenation like this too. For example, here's some code that I've written which has concatenation inside an alias. I have a hunch there is a bug here too, though I've not attempted to identify any repro steps yet. Nonetheless, it would be quite tricky to eliminate the concatenation from that code!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready Ready for implementation restricted: maintainers Only maintainers can resolve this issue type: bug work: backend Related to Python, Django, and simple SQL
Projects
None yet
Development

No branches or pull requests

4 participants