Cannot compare two arrays of different types is thrown on a simple two-statement query comparing floats and ints #5538

Closed
milevin opened this issue Mar 9, 2023 · 1 comment · Fixed by #5820
Labels
bug Something isn't working

milevin commented Mar 9, 2023

Problem Statement

This statement works:

select CASE 10.5 WHEN 0 THEN null ELSE 10 END as col;

This pair of statements does not:

create table res as 
   select CASE 10.5 WHEN 0 THEN null ELSE 10 END as col;

select * from res;

They produce the following error:

Arrow error: Cast error: Cannot compare two arrays of different types (Int64 and Float64)

To Reproduce
Place the above statements into any of the .sql files in datafusion/core/tests/tpc-ds/ (e.g. 1.sql), then run the corresponding test (e.g. tpcds_logical_q1).

Expected behavior
According to SQL semantics, coercion and comparison between numeric types (including Float and Int) should work, so the statements above are expected to succeed.
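
To make the expected behavior concrete, here is a minimal Python sketch of numeric coercion for the comparison inside the CASE. It is an illustration only, not DataFusion's implementation: the names `common_numeric_type`, `cast`, and `eval_simple_case`, and the use of type-name strings, are all hypothetical.

```python
# Hypothetical sketch, not DataFusion code.
# Numeric types ranked from narrowest to widest for comparison purposes.
NUMERIC_RANK = {"Int64": 0, "Float64": 1}


def common_numeric_type(left, right):
    """Return the wider of two numeric type names."""
    for t in (left, right):
        if t not in NUMERIC_RANK:
            raise TypeError(f"not a numeric type: {t}")
    return left if NUMERIC_RANK[left] >= NUMERIC_RANK[right] else right


def cast(value, target):
    """Cast a scalar to the target type; NULL (None) stays NULL."""
    if value is None:
        return None
    return float(value) if target == "Float64" else int(value)


def eval_simple_case(operand, when_value, then_value, else_value):
    """Evaluate CASE operand WHEN when_value THEN then_value ELSE else_value END.

    operand and when_value are (value, type_name) pairs.
    """
    (lv, lt), (rv, rt) = operand, when_value
    target = common_numeric_type(lt, rt)
    lv, rv = cast(lv, target), cast(rv, target)
    # A comparison involving NULL is not true, so the ELSE branch is taken.
    if lv is None or rv is None:
        return else_value
    return then_value if lv == rv else else_value


# CASE 10.5 WHEN 0 THEN null ELSE 10 END
result = eval_simple_case((10.5, "Float64"), (0, "Int64"), None, 10)
```

Under these assumptions both sides are widened to Float64 before the equality check, so the comparison is well-defined and the ELSE branch is returned.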

Additional context
This example is a variation of TPC-DS query 39. That query currently passes, but if you prepend "create table xxx as " to its first line, it exhibits the same error.

We hit this in SDF because we tend to generate multi-statement queries in which all dependent tables are explicitly "create"-d in a statement.

Note: I tried this on current main and on the #5343 branch, hoping that the latter might address related concerns. Both show the same behavior.

milevin added the bug label Mar 9, 2023

Jefffrey (Contributor) commented

Note that the first statement itself does not actually work when executed; only the explain for it does (which is what the tpc-ds logical tests check). On latest main:

DataFusion CLI v19.0.0
❯ select CASE 10.5 WHEN 0 THEN null ELSE 10 END as col;
Arrow error: Cast error: Cannot compare two arrays of different types (Int64 and Float64)
❯ explain select CASE 10.5 WHEN 0 THEN null ELSE 10 END as col;
+---------------+-------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                            |
+---------------+-------------------------------------------------------------------------------------------------+
| logical_plan  | Projection: CASE Float64(10.5) WHEN Int64(0) THEN CAST(NULL AS Int64) ELSE Int64(10) END AS col |
|               |   EmptyRelation                                                                                 |
| physical_plan | ProjectionExec: expr=[CASE 10.5 WHEN 0 THEN CAST(NULL AS Int64) ELSE 10 END as col]             |
|               |   EmptyExec: produce_one_row=true                                                               |
|               |                                                                                                 |
+---------------+-------------------------------------------------------------------------------------------------+
2 rows in set. Query took 0.004 seconds.
❯

This bug seems separate from what #5343 addresses; my guess is that it is a type coercion bug.
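
The plan output above still shows the WHEN literal as Int64(0) next to a Float64(10.5) operand, with no cast between them, which is consistent with that guess. A minimal Python sketch of the failure mode follows, assuming a comparison kernel that refuses mixed types. `TypedArray`, `strict_eq`, `cast_to_float64`, and `coerce_then_eq` are hypothetical names, and none of this is Arrow or DataFusion code.

```python
from dataclasses import dataclass


@dataclass
class TypedArray:
    """Hypothetical stand-in for a typed column of values."""
    dtype: str
    values: list


def strict_eq(left, right):
    """Element-wise equality that rejects operands of different types."""
    if left.dtype != right.dtype:
        raise TypeError(
            "Cannot compare two arrays of different types "
            f"({left.dtype} and {right.dtype})"
        )
    return [a == b for a, b in zip(left.values, right.values)]


def cast_to_float64(arr):
    """Widen every non-NULL value to float."""
    return TypedArray("Float64", [None if v is None else float(v) for v in arr.values])


def coerce_then_eq(left, right):
    """Insert the missing casts before calling the strict kernel."""
    if left.dtype != right.dtype:
        left, right = cast_to_float64(left), cast_to_float64(right)
    return strict_eq(left, right)


operand = TypedArray("Float64", [10.5])  # CASE 10.5
when = TypedArray("Int64", [0])          # WHEN 0

try:
    strict_eq(operand, when)
    failed = False
except TypeError:
    failed = True  # the uncoerced comparison is rejected

mask = coerce_then_eq(operand, when)  # succeeds once both sides are Float64
```

In this sketch the fix is a coercion step that runs before the comparison, so the strict kernel only ever sees operands of one type.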

alamb changed the title from "Cannot compare two arrays of different types is thrown on a simple two-statement query" to "Cannot compare two arrays of different types is thrown on a simple two-statement query comparing floats and ints" Mar 12, 2023