Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comparison on Decimal dtypes does not work #2906

Closed
universalmind303 opened this issue Sep 24, 2024 · 6 comments
Closed

comparison on Decimal dtypes does not work #2906

universalmind303 opened this issue Sep 24, 2024 · 6 comments
Labels
bug Something isn't working p0 Priority 0 - to be addressed immediately

Comments

@universalmind303
Copy link
Collaborator

universalmind303 commented Sep 24, 2024

Describe the bug
A clear and concise description of what the bug is.

To Reproduce

df = (daft.from_pydict({
        'floats': [328.00, 327.00]
    })
    .where(col('floats').cast(daft.DataType.decimal128(15, 2)) > 300)
    .collect()
)

---------------------------------------------------------------------------
DaftCoreException                         Traceback (most recent call last)
Cell In[36], line 5
      1 # %%
      2 df = (daft.from_pydict({
      3         'floats': [328.00, 327.00]
      4     })
----> 5     .where(col('floats').cast(daft.DataType.decimal128(15, 2)) > 300)
      6     .collect()
      7 )
      8 df
File ~/Development/Daft/daft/api_annotations.py:26, in DataframePublicAPI.<locals>._wrap(*args, **kwargs)
     24 type_check_function(func, *args, **kwargs)
     25 timed_method = time_df_method(func)
---> 26 return timed_method(*args, **kwargs)
File ~/Development/Daft/daft/analytics.py:198, in time_df_method.<locals>.tracked_method(*args, **kwargs)
    195 @functools.wraps(method)
    196 def tracked_method(*args, **kwargs):
    197     if _ANALYTICS_CLIENT is None:
--> 198         return method(*args, **kwargs)
    200     start = time.time()
    201     try:
File ~/Development/Daft/daft/dataframe/dataframe.py:1379, in DataFrame.where(self, predicate)
   1376     from daft.sql.sql import sql_expr
   1378     predicate = sql_expr(predicate)
-> 1379 builder = self._builder.filter(predicate)
   1380 return DataFrame(builder)
File ~/Development/Daft/daft/logical/builder.py:163, in LogicalPlanBuilder.filter(self, predicate)
    162 def filter(self, predicate: Expression) -> LogicalPlanBuilder:
--> 163     builder = self._builder.filter(predicate._expr)
    164     return LogicalPlanBuilder(builder)
DaftCoreException: DaftError::External Unable to create logical plan node.
Due to: DaftError::TypeError Cannot perform comparison on types: 15.2, Int32
Details:
DaftError::TypeError Cannot perform comparison on types: 15.2, Int32

Expected behavior
the where clause should succeed with an implicit cast

should be equivalent to

dtype = daft.DataType.decimal128(15, 2)

df = (daft.from_pydict({
        'floats': [328.00, 327.00]
    })
    .where(col('floats').cast(dtype) > daft.lit(300).cast(dtype))
    .collect()
)
df
@universalmind303 universalmind303 added the bug Something isn't working label Sep 24, 2024
@samster25 samster25 added the p1 Important to tackle soon, but preemptable by p0 label Sep 24, 2024
@samster25
Copy link
Member

Good catch, tagging it as p1 to mark as a priority

@universalmind303
Copy link
Collaborator Author

universalmind303 commented Sep 24, 2024

Unassigning myself as I thought it'd be a quick fix and just add Decimal128 to get_supertype, but it looks like there's some issues when casting values to the supertype of Int128.

@universalmind303
Copy link
Collaborator Author

For more context, I tried updating get_supertype to cast (Decimal128(_,_), <numeric type>) => Int128 but for some reason the physical arrow type for Int128 is ArrowType::Decimal(32,32), and casting 300 to Decimal(32,32) results in None.

So there's definitely some funky stuff happening in our cast logic here. pinging @colin-ho since you recently did some work on cleaning up the casting logic.

@universalmind303
Copy link
Collaborator Author

Looks like Addition/subtraction also does not work on decimal dtype

@universalmind303 universalmind303 added p0 Priority 0 - to be addressed immediately and removed p1 Important to tackle soon, but preemptable by p0 labels Oct 22, 2024
@universalmind303
Copy link
Collaborator Author

another issue I just found:

 Daft error: DaftError::External Unable to create logical plan node.
Due to: DaftError::TypeError Invalid argument to mean supertype: Decimal(precision=7, scale=2)

@samster25
Copy link
Member

closed by: #3143

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working p0 Priority 0 - to be addressed immediately
Projects
None yet
Development

No branches or pull requests

2 participants