Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Teradata incorrect sql_translation for scalar "as.double" and "as.character" #1544

Closed
rplsmn opened this issue Oct 14, 2024 · 0 comments · Fixed by #1545
Closed

Teradata incorrect sql_translation for scalar "as.double" and "as.character" #1544

rplsmn opened this issue Oct 14, 2024 · 0 comments · Fixed by #1545

Comments

@rplsmn
Copy link
Contributor

rplsmn commented Oct 14, 2024

Greetings,

In the backend translations for Teradata, the function sql_translation.Teradata has 2 wrong translations, for the base R function as.double() and as.character().

as.double()

The translation uses NUMERIC which errors because the correct SQL function for Teradata is FLOAT

con <- dbplyr::simulate_teradata()

tbl_test <- dbplyr::tbl_lazy(mtcars, con = con, name = "mtcars_td")

tbl_test |> 
  dplyr::select(mpg, cyl) |> 
  dplyr::mutate(mpg = as.double(mpg)) |> 
  dplyr::show_query()

Result :

<SQL>
SELECT CAST(`mpg` AS NUMERIC) AS `mpg`, `cyl`
FROM `mtcars_td`

The output should be

<SQL>
SELECT CAST(`mpg` AS FLOAT) AS `mpg`, `cyl`
FROM `mtcars_td`

The sql_translation.Teradata should have sql_cast("FLOAT") instead of sql_cast("NUMERIC"). The odbc package further helps validating this as the translation there for type double is FLOAT in Teradata.

To rectify, I've successfully modified the sql_translation.Teradata function with the above mentionned change and the fork works correctly now on our TD backend.

as.character()

The translation uses VARCHAR(MAX) which errors because the correct SQL function for Teradata is VARCHAR(N) where N must be a number, usually by default 255.

con <- dbplyr::simulate_teradata()

tbl_test <- dbplyr::tbl_lazy(mtcars, con = con, name = "mtcars_td")

tbl_test |> 
  dplyr::select(mpg, cyl) |> 
  dplyr::mutate(mpg = as.character(cyl)) |> 
  dplyr::show_query()

Result :

<SQL>
SELECT `mpg`, CAST(`cyl` AS VARCHAR(MAX)) AS `cyl`
FROM `mtcars_td`

The output should be, e.g :

<SQL>
SELECT `mpg`, CAST(`cyl` AS VARCHAR(255)) AS `cyl`
FROM `mtcars_td`

The sql_translation.Teradata should have at minimum sql_cast("VARCHAR(255)") instead of sql_cast("VARCHAR(MAX)"). The odbc package further helps validating this as the translation there for type character is VARCHAR(255).

To rectify, I've successfully modified the sql_translation.Teradata function with the above mentionned change and the fork works correctly now on our TD backend. I've also upgraded the function to mimic the behavior of as.numeric() which allows the user to specify the argument for digits, and allow the user to specify the length of VARCHAR(N)

sql_translation.Teradata <- function(con) {
  sql_variant(
    sql_translator(.parent = base_odbc_scalar,
      # Other scalars removed for ease of reading
      as.character  = function(x, nchar = 255L) {
        nchar <- vctrs::vec_cast(nchar, integer())
        sql_expr(CAST(!!x %as% VARCHAR(!!nchar)))
      },
   # Rest of function body
  )
)

I hope this is clear enough, thank you for your time, and I can provide a PR for this if that's how things work. I'm still very new to contributing so please bear with me and don't hesitate to tell me what I did wrong !

Cheers,

Raphael

@simonpcouch simonpcouch linked a pull request Oct 24, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant