Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Input type for min/max/min_by/max_by should be restricted to "orderable" types #6718

Closed
mbasmanova opened this issue Sep 24, 2023 · 11 comments
Closed
Assignees
Labels
aggregates bug Something isn't working fuzzer-found new-fuzzer-feature fuzzer issues that are blocking new features triage Newly created issue that needs attention.

Comments

@mbasmanova
Copy link
Contributor

Bug description

In Presto, input to min/max is restricted to orderable types. For example, MAP type is not orderable. Velox currently allows MAP inputs.

com.facebook.presto.sql.analyzer.SemanticException: line 1:28: 
Unexpected parameters (map(real,varchar)) for function max. Expected: max(E) E:orderable, max(E, bigint) E:orderable

This issue is similar to #6712 and applies to all scalar and aggregate functions that sort inputs.

This issue was discovered by running AggregationFuzzer with Presto as source of truth.

CC: @laithsakka @duanmeng

System information

n/a

Relevant logs

No response

@mbasmanova mbasmanova added bug Something isn't working fuzzer-found triage Newly created issue that needs attention. labels Sep 24, 2023
@duanmeng
Copy link
Collaborator

Would be better to do this check and restriction in the analyzer layer?
Or remove the unorderable type from registration (through code review or add an orderable flag in the type?

@laithsakka laithsakka added the new-fuzzer-feature fuzzer issues that are blocking new features label Sep 25, 2023
@mbasmanova
Copy link
Contributor Author

Fixing this issue requires multiple changes.

  • Extend Velox Type class to add isOrderable() and isComparable() methods.
  • Extend Function Signature APIs to allow restricting argument types to orderable or comparable.
  • Fix array_sort implementation to reject non-orderable input types.
  • Modify AggregationFuzzer and ExpressionFuzzer to take 'orderable' and 'comparable' restrictions on argument types into account when generating expressions.

For example, this is how signature for min/max functions may look like:

  signatures.push_back(exec::AggregateFunctionSignatureBuilder()
                           .orderableTypeVariable("T")
                           .returnType("T")
                           .intermediateType("T")
                           .argumentType("T")
                           .build());

For example, this is how signature for array_sort may look like:

      exec::FunctionSignatureBuilder()
          .orderableTypeVariable("T")
          .returnType("array(T)")
          .argumentType("array(T)")
          .build(),

@duanmeng
Copy link
Collaborator

Extend Velox Type class to add isOrderable() and isComparable() methods.

@mbasmanova Can you help explain the difference between "orderable" and "comparable", I used to think "comparable" implies "orderable".

@mbasmanova
Copy link
Contributor Author

@duanmeng For example, MAP type values are comparable as long as both key and value types are comparable, but MAP type values are not orderable. Comparison allows for a == b to return true, false, or NULL, but sorting requires that a < b comparison never returns NULL.

presto> select map(array[1, 2], array[10, null]) = map(array[1, 2], array[10, null]);
 _col0
-------
 NULL
(1 row)

presto> select map(array[1, 2], array[10, 20]) = map(array[1, 2], array[10, 20]);
 _col0
-------
 true
(1 row)

@duanmeng
Copy link
Collaborator

@duanmeng For example, MAP type values are comparable as long as both key and value types are comparable, but MAP type values are not orderable. Comparison allows for a == b to return true, false, or NULL, but sorting requires that a < b comparison never returns NULL.

@mbasmanova Got it, thanks. Could you assign this to me, I will break it into multiple tasks to track.

@mbasmanova
Copy link
Contributor Author

@duanmeng Thank you for helping with this.

facebook-github-bot pushed a commit that referenced this issue Sep 29, 2023
Summary:
In Presto, input to min/max/array_sort is restricted to orderable types. For example, MAP type is not orderable. Velox currently allows MAP inputs.
This PR extends the Velox Type class to add `isOrderable()` and `isComparable()` methods.

Part of #6718 and #6712

Pull Request resolved: #6770

Reviewed By: Yuhta

Differential Revision: D49693596

Pulled By: mbasmanova

fbshipit-source-id: 1386904dde45691368e759831b4bf24287b0de5d
@duanmeng
Copy link
Collaborator

duanmeng commented Sep 30, 2023

cc @mbasmanova

@duanmeng
Copy link
Collaborator

duanmeng commented Sep 30, 2023

@mbasmanova I used to add the constantArgumentType in this PR Allow marking arguments as constant in the function signature, and modify the fuzzer in the following PR Add support for constant function arguments to Expression Fuzzer.

  • Extend Function Signature APIs to allow restricting argument types to orderable or comparable.
  • Modify AggregationFuzzer and ExpressionFuzzer to take 'orderable' and 'comparable' restrictions on argument types

Could these two requirements use the same solution?

Update: It should be similar to SignatureVariable.knownTypesOnly_

@mbasmanova
Copy link
Contributor Author

@duanmeng I'm not sure I understand the question, but, yes this work would be similar to prior work on adding constantArgumentType and knownType.

facebook-github-bot pushed a commit that referenced this issue Oct 5, 2023
…n signature (#6906)

Summary:
In Presto, input to min/max is restricted to orderable types.
For example, MAP type is not orderable. Velox currently allows MAP inputs.

This PR adds two APIs in `FunctionSignatureBuilder`: `orderableTypeVariable`
and `comparableTypeVariable`. Use type's `orderable` and `comparable` flag
added in #6770 to do the check
during type binding in `SignatureBinder.tryBind`.

Part of #6718

Pull Request resolved: #6906

Reviewed By: xiaoxmeng

Differential Revision: D49951511

Pulled By: mbasmanova

fbshipit-source-id: 17ee5fd75b25a7df636279d07e99773105349220
facebook-github-bot pushed a commit that referenced this issue Oct 12, 2023
Summary:
PR #6906 added support to restricting arguments as orderable.

This PR changes ArgumentTypeFuzzer to respect this constraint.

Part of #6718

Pull Request resolved: #6950

Reviewed By: Yuhta

Differential Revision: D50212902

Pulled By: mbasmanova

fbshipit-source-id: 318b5e443baba7b6800539fafcf34946e6f0e91d
ericyuliu pushed a commit to ericyuliu/velox that referenced this issue Oct 12, 2023
Summary:
In Presto, input to min/max/array_sort is restricted to orderable types. For example, MAP type is not orderable. Velox currently allows MAP inputs.
This PR extends the Velox Type class to add `isOrderable()` and `isComparable()` methods.

Part of facebookincubator#6718 and facebookincubator#6712

Pull Request resolved: facebookincubator#6770

Reviewed By: Yuhta

Differential Revision: D49693596

Pulled By: mbasmanova

fbshipit-source-id: 1386904dde45691368e759831b4bf24287b0de5d
ericyuliu pushed a commit to ericyuliu/velox that referenced this issue Oct 12, 2023
…n signature (facebookincubator#6906)

Summary:
In Presto, input to min/max is restricted to orderable types.
For example, MAP type is not orderable. Velox currently allows MAP inputs.

This PR adds two APIs in `FunctionSignatureBuilder`: `orderableTypeVariable`
and `comparableTypeVariable`. Use type's `orderable` and `comparable` flag
added in facebookincubator#6770 to do the check
during type binding in `SignatureBinder.tryBind`.

Part of facebookincubator#6718

Pull Request resolved: facebookincubator#6906

Reviewed By: xiaoxmeng

Differential Revision: D49951511

Pulled By: mbasmanova

fbshipit-source-id: 17ee5fd75b25a7df636279d07e99773105349220
ericyuliu pushed a commit to ericyuliu/velox that referenced this issue Oct 12, 2023
…#6950)

Summary:
PR facebookincubator#6906 added support to restricting arguments as orderable.

This PR changes ArgumentTypeFuzzer to respect this constraint.

Part of facebookincubator#6718

Pull Request resolved: facebookincubator#6950

Reviewed By: Yuhta

Differential Revision: D50212902

Pulled By: mbasmanova

fbshipit-source-id: 318b5e443baba7b6800539fafcf34946e6f0e91d
facebook-github-bot pushed a commit that referenced this issue Oct 17, 2023
…6928)

Summary:
In Presto, input to `array_sort`, and `array_sort_desc` are restricted to
orderable types. For example, MAP type is not orderable.

This PR uses the `orderableTypeVariable` to apply the restriction
to the input argument of array_sort. This restriction applies to the input
argument type of normal array_sort, and the return type of the lambda
of array_sort with a transform lambda, not the array_sort function
with a custom comparator.

For more details please check the blog https://velox-lib.io/blog/array-sort/

```SQL
presto> SELECT array_sort(ARRAY [map(array['a', 'b', 'c'], array[1, 2, 3])]);]
...
Expected: array_sort(array(E)) E:orderable, ...

presto> SELECT array_sort(ARRAY [map(array['a', 'b', 'c'], array[1, 2, 3])],
(x, y) -> IF(cardinality(x) < cardinality(y), -1, IF(cardinality(x) = cardinality(y), 0, 1)));
       _col0
-------------------
 [{a=1, b=2, c=3}]
(1 row)
```

Part of #6718

Pull Request resolved: #6928

Reviewed By: laithsakka

Differential Revision: D50003502

Pulled By: mbasmanova

fbshipit-source-id: cf425f539418ebcbcf5737677c58172a20a71016
@mbasmanova
Copy link
Contributor Author

@duanmeng Meng, are you still interested in working on this? Anything blocking?

@duanmeng
Copy link
Collaborator

@mbasmanova Yes, left the one task "min/max implementation to reject non-orderable input types", I will file a PR to resolve it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aggregates bug Something isn't working fuzzer-found new-fuzzer-feature fuzzer issues that are blocking new features triage Newly created issue that needs attention.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants