Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query worked in 2.3.4 but not in 2.4.0 #1388

Closed
carlinstarrs opened this issue Oct 31, 2023 · 17 comments · Fixed by #1396
Closed

Query worked in 2.3.4 but not in 2.4.0 #1388

carlinstarrs opened this issue Oct 31, 2023 · 17 comments · Fixed by #1396
Labels
bug an unexpected problem or unintended behavior

Comments

@carlinstarrs
Copy link

carlinstarrs commented Oct 31, 2023

This used to work:

library("dbplyr")
cn <- DBI::dbConnect(odbc::odbc(), .connection_string = "Driver={SQL Server};server=servername;trusted_connection=yes;")
dplyr::tbl(cn, dbplyr::in_schema(dbplyr::sql("database.dbo"), dbplyr::sql("table_name")))

But now gets this error message:

Caused by error:
! nanodbc/nanodbc.cpp:1769: 42000: [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'database.dbo.table_name'.  [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. 
<SQL> 'SELECT *
FROM "database.dbo"."table_name" "q01"
WHERE (0 = 1)'

The update to 2.0 also broke our queries that included the database name within in_schema (see #552), and we've been using this workaround since then, but now the workaround is broken. Any chance this is something that would get fixed? The issue is the extra quotes around the . after dbo, i.e. the query should be:

'SELECT *
FROM "database.dbo.table_name" "q01"
WHERE (0 = 1)'

Reverting to 2.3.4 fixes it.

@whipson
Copy link

whipson commented Oct 31, 2023

I'm seeing similar behaviour after upgrading from 2.3.4 to 2.4.0 connecting to an Impala table.

When running 2.4.0:

impala <- DBI::dbConnect(odbc::odbc(), "Impala")
dplyr::tbl(impala, "reference.seat_capacity")

Output

It looks like you tried to incorrectly use a table in a schema as source.
ℹ If you want to specify a schema use `in_schema()` or `in_catalog()`.
ℹ If your table actually contains "." in the name use `check_from = FALSE` to
  silence this message.
Error in `as.character()`:
! Can't convert `x` <dbplyr_table_ident> to <character>.
Run `rlang::last_trace()` to see where the error occurred.

Using in_schema("reference", "seat_capacity") does the same thing. Reverting to 2.3.4 resolved it.

@carlinstarrs
Copy link
Author

carlinstarrs commented Oct 31, 2023

Adding to report this that this also no longer works:

library("dbplyr")
cn <- DBI::dbConnect(odbc::odbc(), .connection_string = "Driver={SQL Server};server=servername;trusted_connection=yes;")
dplyr::tbl(cn, DBI::Id(database = "database", schema = "dbo", table = "table_name"))
Error in `tbl_sql()`:
! `table` is an <Id> with unknown names "database".
i Only "catalog", "schema", and "table" are supported.

This does work for both 2.3.4 and 2.4.0 (changing 'database' to 'catalog'):

library("dbplyr")
cn <- DBI::dbConnect(odbc::odbc(), .connection_string = "Driver={SQL Server};server=servername;trusted_connection=yes;")
dplyr::tbl(cn, DBI::Id(catalog= "database", schema = "dbo", table = "table_name"))

@apalacio9502
Copy link

In my case in_schema works correctly for me, I think the problem is that the input of the variable schema and table are in SQL format and not character, as you can see the result is different

dbplyr::in_schema(dbplyr::sql("database.dbo"), dbplyr::sql("table_name"))
<SCHEMA> `database.dbo.table_name
dbplyr::in_schema("database.dbo", "table_name")
<SCHEMA> `database.dbo`.`table_name`

@mgirlich
Copy link
Collaborator

mgirlich commented Nov 2, 2023

@whipson We added the check for a . in the table name (like in your example dplyr::tbl(impala, "reference.seat_capacity")) to prevent issues and I think the error message is helpful. For me in_schema("reference", "seat_capacity") works. Can you add a reprex with in_schema("reference", "seat_capacity")?

@mgirlich
Copy link
Collaborator

mgirlich commented Nov 2, 2023

@carlinstarrs It definitely makes sense that in_schema(sql("database.dbo"), sql("table_name")) is translated to "database.dbo"."table_name". You already found the correct way of either using DBI::Id(catalog, schema, table) or you could use in_catalog().
It seems that for SQL Server you usually have database instead of catalog, so we might also support this.

@whipson
Copy link

whipson commented Nov 2, 2023

@mgirlich yes, I tried using in_schema and ran into an error. Here's what I see:

impala <- DBI::dbConnect(odbc::odbc(), "Impala")
dplyr::tbl(impala, dbplyr::in_schema("reference", "seat_capacity"))
Error in `as.character()`:
! Can't convert `x` <dbplyr_table_ident> to <character>.
Run `rlang::last_trace()` to see where the error occurred.

@mgirlich
Copy link
Collaborator

mgirlich commented Nov 2, 2023

@whipson Can you add the trace of the error. You can get it via rlang::last_trace() after the error occured.

@whipson
Copy link

whipson commented Nov 2, 2023

<error/vctrs_error_cast>
Error in `as.character()`:
! Can't convert `x` <dbplyr_table_ident> to <character>.
---
Backtrace:
     ▆
  1. ├─dplyr::tbl(impala, dbplyr::in_schema("yhz_reference", "aircraft_seat_capacity"))
  2. └─implyr:::tbl.src_impala(...)
  3.   └─dbplyr::tbl_sql("impala", src = src, from = from, ...)
  4.     ├─vars %||% dbplyr_query_fields(src$con, from)
  5.     └─dbplyr:::dbplyr_query_fields(src$con, from)
  6.       └─dbplyr:::dbplyr_fallback(con, "db_query_fields", ...)
  7.         ├─rlang::eval_bare(expr((!!fun)(con, ...)))
  8.         ├─dplyr::db_query_fields(con, ...)
  9.         └─implyr:::db_query_fields.impala_connection(con, ...)
 10.           └─base::grepl("\\s", sql)
 11.             ├─base::as.character(x)
 12.             └─vctrs:::as.character.vctrs_vctr(x)

@mgirlich
Copy link
Collaborator

mgirlich commented Nov 2, 2023

Thanks! This should be fixed in the implyr package. Please open an issue there.

@hadley
Copy link
Member

hadley commented Nov 2, 2023

@mgirlich you said:

It definitely makes sense that in_schema(sql("database.dbo"), sql("table_name")) is translated to "database.dbo"."table_name".

But the docs say

Use sql() to pass a raw name that won't get quoted.

@hadley hadley added the bug an unexpected problem or unintended behavior label Nov 2, 2023
@hadley
Copy link
Member

hadley commented Nov 4, 2023

I'm also starting to wonder if we've massively over indexed on a problem that few people actually have — how common is that people actually want to refer to a table containing a special character? Maybe we should just let foo.bar.baz flow through to the database, and if folks actually are trying to refer to a table with a weird name, we provide a tool for explicit quoting.

@andrew-schulman
Copy link

@hadley I agree. From my point of view, both table and schema.table are valid table names, and having to handle them differently in my code is just nuisance. (And if I include a . in a table name in a database that supports schema.table syntax, and if the database lets me do that, I deserve what I get.)

@carlganz
Copy link

carlganz commented Nov 6, 2023

Appears there is no way to call a UDF in SQL now:

Before:

# would frequently glue_sql the parameter into the udf
tbl(con, in_catalog("datawarehouse", "FACT", sql("MyUDF('blah')")

Now says whole object doesn't exist

@hadley
Copy link
Member

hadley commented Nov 6, 2023

At a minimum, I think we should make I("schema.table") the preferred syntax for declaring "I know what I'm doing and I have quoted my table names appropriately".

@hadley
Copy link
Member

hadley commented Nov 7, 2023

Also the requirement to name all the arguments to Id() is annoying, so I did a PR to remove it: r-dbi/DBI#417

@hadley hadley mentioned this issue Nov 7, 2023
5 tasks
@krlmlr
Copy link
Member

krlmlr commented Nov 10, 2023

Is I() the new ident_q() ? I was using the latter for this purpose before, this now gives a message every 8 hours.

@hadley
Copy link
Member

hadley commented Nov 10, 2023

@krlmlr yeah that's what I was thinking. I have a fuller rethink of the strategy in #1396. It's not finished yet but the basic idea is to lean into I() more to make the interface simpler, and do the escaping upfront to make the implementation easier.

hadley added a commit that referenced this issue Feb 20, 2024
The big idea in this PR is to simplify the implementation by immediately escaping as soon as we get a table name. That allows user facing functions to accept a wide range of table specifications while internal functions only need to deal with a new `table_path` class. Additionally, only the `table_path` class is vectorised, so we can ensure that user supplies a single table name (or SQL, where appropriate), while still providing vectorised tools internally. Since we escape table ids early we also need a new tool to extract the name of just the table; this is sort of the opposite problem to `table_ident` which needed tools to stitch components together instead of pulling them apart.

I've also simplified the implementation a little by taking `I()` to mean "don't escape". My expectation is that `I("foo.bar")`/`I("foo.bar.baz"))` will become the preferred way to specify nested tables.

Fixes #1388. Fixes #1396. Fixes #1413. Fixes #1408. Closes #1385. Fixes #1416. Fixes #1404.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants