Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbplyr 2.4.0 - table aliases with in_schema now break queries #1404

Closed
MKatz-DHSC opened this issue Nov 10, 2023 · 6 comments · Fixed by #1396
Closed

dbplyr 2.4.0 - table aliases with in_schema now break queries #1404

MKatz-DHSC opened this issue Nov 10, 2023 · 6 comments · Fixed by #1396
Labels
bug an unexpected problem or unintended behavior

Comments

@MKatz-DHSC
Copy link

New behaviour of prefixing * with the table name breaks Spark ODBC queries, is it possible to override this behaviour?

library(dplyr)
library(dbplyr)

x <- c("abc", "def", "ghif")

# works as generates SELECT *
lazy_frame(x = x, con = simulate_odbc()) %>% 
  show_query()

# breaks as generates SELECT `df`.*
lazy_frame(x = x, con = simulate_odbc()) %>% 
  head(5) %>% 
  show_query()
@mgirlich
Copy link
Collaborator

Are you sure this is due to df.*? This works for me

library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")

iris_tbl <- copy_to(sc, iris, overwrite = TRUE)
iris_tbl %>% 
  head(5)
#> # Source: spark<?> [?? x 5]
#>   Sepal_Length Sepal_Width Petal_Length Petal_Width Species
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#> 1          5.1         3.5          1.4         0.2 setosa 
#> 2          4.9         3            1.4         0.2 setosa 
#> 3          4.7         3.2          1.3         0.2 setosa 
#> 4          4.6         3.1          1.5         0.2 setosa 
#> 5          5           3.6          1.4         0.2 setosa

iris_tbl %>% 
  head(5) %>% 
  show_query()
#> <SQL>
#> SELECT `iris`.*
#> FROM `iris`
#> LIMIT 5

Created on 2023-11-10 with reprex v2.0.2

Can you please add a reprex?

@MKatz-DHSC
Copy link
Author

MKatz-DHSC commented Nov 10, 2023

Apologies, had a deeper dive and believe the issue is to do with the interaction between in_schema and the new use of a prefix for *. If you pass a named vector as the table argument of in_schema that name is used for the table alias (not sure if this is documented) which conflicts with the prefix.

library(dplyr)
library(dbplyr)

x <- c("abc", "def", "ghif")
table <- in_schema("schema", c("alias" = "table"))

# works as generates SELECT *
lazy_frame(x = x, con = simulate_odbc(), .name = table) %>% 
  show_query()
#> <SQL>
#> SELECT *
#> FROM `schema`.`table`

# breaks as generates SELECT `table`.* but with FROM `schema`.`table` AS `alias`
lazy_frame(x = x, con = simulate_odbc(), .name = table) %>% 
  head(5) %>% 
  show_query()
#> <SQL>
#> SELECT `table`.*
#> FROM `schema`.`table` AS `alias`
#> LIMIT 5

Do you want me to open a new issue or rename this one?

@MKatz-DHSC MKatz-DHSC changed the title dbplyr 2.4.0 breaks select * with Spark ODBC Driver dbplyr 2.4.0 - table aliases with in_schema now break queries Nov 17, 2023
@MKatz-DHSC MKatz-DHSC changed the title dbplyr 2.4.0 - table aliases with in_schema now break queries dbplyr 2.4.0 - table aliases with in_schema now break queries Nov 17, 2023
@hadley
Copy link
Member

hadley commented Dec 19, 2023

@MKatz-DHSC Are you deliberately or accidentally passing a named vector to in_schema()?

@hadley hadley added the bug an unexpected problem or unintended behavior label Dec 19, 2023
@MKatz-DHSC
Copy link
Author

The unexpected behaviour occurred when I was indexing a named vector of table names as part of a loop with [ rather than [[. Based on the documentation assumed it did not matter which I passed to in_schema().

@hadley
Copy link
Member

hadley commented Dec 20, 2023

Ok, just wanted to double check that the expected behaviour here is to drop the names.

@MKatz-DHSC
Copy link
Author

Yep, can't really see a use case where you would want to manually specify the alias name. Thanks for looking into this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants