Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: temporary branch for IOx update (11-30-2023 to 12-09-2023) #8543

Conversation

appletreeisyellow
Copy link
Contributor

@appletreeisyellow appletreeisyellow commented Dec 14, 2023

⚠️ This draft PR is not intend to merge. This is a temporary branch for the purpose of updating DataFusion in InfluxData IOx.

Dec. 14, 2023

DataFusion commit history:

The head of this branch is at:

Cherry-picked two bug fixes:

Cherry-picked the commits from the main branch:

Dec. 15, 2023

Cherry-picked the commits from the main branch:

@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Dec 14, 2023
appletreeisyellow and others added 2 commits December 14, 2023 11:03
… different schemas and statistics (apache#8533)

* Add test for schema evolution

* Fix reading parquet statistics

* Update tests for fix

* Add comments to help explain the test

* Add another test
Weijun-H and others added 9 commits December 14, 2023 15:42
* support LargeList in array_empty

* update err info
* feat: test queries for to_timestamp(float) WIP

* feat: Float64 input for to_timestamp

* cargo fmt

* clippy

* docs: double input type for to_timestamp

* feat: cast floats to timestamp

* style: cargo fmt

* fix: float64 cast for timestamp nanos only
* Support User Defined Table Function

Signed-off-by: veeupup <[email protected]>

* fix comments

Signed-off-by: veeupup <[email protected]>

* add udtf test

Signed-off-by: veeupup <[email protected]>

* add file header

* Simply table function example, add some comments

* Simplfy exprs

* make clippy happy

* Update datafusion/core/tests/user_defined/user_defined_table_functions.rs

---------

Signed-off-by: veeupup <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* document timestamp input limis

* fix text

* prettier

* remove doc for nanoseconds

* Update datafusion/physical-expr/src/datetime_expressions.rs

Co-authored-by: Andrew Lamb <[email protected]>

---------

Co-authored-by: Andrew Lamb <[email protected]>
* fix: make ntile work in some corner cases

* fix comments

* minor

* Update datafusion/sqllogictest/test_files/window.slt

Co-authored-by: Mustafa Akur <[email protected]>

---------

Co-authored-by: Mustafa Akur <[email protected]>
Given that group keys inherently have few repeated values, especially
when grouping on a single column, the use of dictionary encoding is
unlikely to be yielding significant returns
* done

Signed-off-by: jayzhan211 <[email protected]>

* add more test

Signed-off-by: jayzhan211 <[email protected]>

* cleanup

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
@github-actions github-actions bot added sql SQL Planner physical-expr Physical Expressions labels Dec 14, 2023
alamb and others added 12 commits December 14, 2023 16:41
* Minor: Improve the documentation on `ScalarValue`

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <[email protected]>

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <[email protected]>

---------

Co-authored-by: Liang-Chi Hsieh <[email protected]>
* add benchmark

Signed-off-by: jayzhan211 <[email protected]>

* fmt

Signed-off-by: jayzhan211 <[email protected]>

* address clippy

Signed-off-by: jayzhan211 <[email protected]>

* cleanup

Signed-off-by: jayzhan211 <[email protected]>

* fix comment

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* minor changes

* PipelineStatePropagator tree refactor

* Remove duplications by children_unbounded()

* Remove on-the-fly tree construction

* Minor changes

---------

Co-authored-by: Mustafa Akur <[email protected]>
…8121)

* feat: support  LargeList in make_array and
array_length

* chore: add tests

* fix: update tests for nested array

* use usise_as

* add new_large_list

* refactor array_length

* add comment

* update test in sqllogictest

* fix ci

* fix macro

* use usize_as

* update comment

* return based on data_type in make_array
…che#8404)

- remove `unalias` TableScan filters
- refactor  CreateExternalTable
- fix typo
…ls (apache#8400)

* fix transforming LogicalPlan::Explain use TreeNode::transform fails

* Update datafusion/expr/src/logical_plan/plan.rs

Co-authored-by: Andrew Lamb <[email protected]>

---------

Co-authored-by: Andrew Lamb <[email protected]>
* Minor: Improve the document format of JoinHashMap

* Docs: Fix `array_except` documentation example
* Minor: Improve the document format of JoinHashMap

* support named query parameters

* cargo fmt

* add `ParamValues` conversion

* improve doc
…tests for ORDER BY cases with RANGE frame (apache#8410)

* fix: RANGE frame can be regularized to ROWS frame only if empty ORDER BY clause

* Fix flaky test

* Update test comment

* Add code comment

* Update
mustafasrepo and others added 24 commits December 15, 2023 14:03
* Relax schema check for optimize projections.

* Minor changes

* Update datafusion/optimizer/src/optimize_projections.rs

Co-authored-by: jakevin <[email protected]>

---------

Co-authored-by: jakevin <[email protected]>
* list cmp

Signed-off-by: jayzhan211 <[email protected]>

* remove cfg

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* Support parquet_metadata for datafusion-cli

Signed-off-by: veeupup <[email protected]>

* make tomlfmt happy

* display like duckdb

Signed-off-by: veeupup <[email protected]>

* add test & fix single quote

---------

Signed-off-by: veeupup <[email protected]>
* Fix nested count optimization

* fmt

* extend comment

* Clippy

* Update datafusion/optimizer/src/optimize_projections.rs

Co-authored-by: Liang-Chi Hsieh <[email protected]>

* Add sqllogictests

* Fmt

---------

Co-authored-by: Liang-Chi Hsieh <[email protected]>
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4 to 5.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v4...v5)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* change get zero to first()

Signed-off-by: Ruihang Xia <[email protected]>

* wake clone to wake_by_ref

Signed-off-by: Ruihang Xia <[email protected]>

* more first()

Signed-off-by: Ruihang Xia <[email protected]>

* try_from() to from()

Signed-off-by: Ruihang Xia <[email protected]>

---------

Signed-off-by: Ruihang Xia <[email protected]>
…e treated as constant sort (apache#8445)

* fix: RANGE frame for corner cases with empty ORDER BY clause should be treated as constant sort

* fix

* Make the test not flaky

* fix clippy
* fix: don't unifies projection if expr is non-trival

* Update datafusion/core/src/physical_optimizer/projection_pushdown.rs

Co-authored-by: Alex Huang <[email protected]>

---------

Co-authored-by: Alex Huang <[email protected]>
* Minor: Add new bloom filter tests

* fmt
* Aggregate rewrite for dataframe API.

* Simplifications

* Minor changes

* Minor changes

* Add new test

* Add new tests

* Minor changes

* Add rule, for aggregate simplification

* Simplifications

* Simplifications

* Simplifications

* Minor changes

* Simplifications

* Add new test condition

* Tmp

* Push requirement below aggregate

* Add join and subqeury alias

* Add cross join support

* Minor changes

* Add logical plan repartition support

* Add union support

* Add table scan

* Add limit

* Minor changes, buggy

* Add new tests, fix existing bugs

* change concat type array_concat

* Resolve some of the bugs

* Comment out a rule

* All tests pass, when single distinct is closed

* Fix aggregate bug

* Change analyze and explain implementations

* All tests pass

* Resolve linter errors

* Simplifications, remove unnecessary codes

* Comment out tests

* Remove pushdown projection

* Pushdown empty projections

* Fix failing tests

* Simplifications

* Update comments, simplifications

* Remove eliminate projection rule, Add method for group expr len aggregate

* Simplifications, subquery support

* Update comments, add unnest support, simplifications

* Remove eliminate projection pass

* Change name

* Minor changes

* Minor changes

* Add comments

* Fix failing test

* Minor simplifications

* update

* Minor

* Remove ordering

* Minor changes

* add merge projections

* Add comments, resolve linter errors

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Review Part 1

* Review Part 2

* Fix quadratic search, Change trim_expr impl

* Review Part 3

* Address reviews

* Minor changes

* Review Part 4

* Add case expr support

* Review Part 5

* Review Part 6

* Finishing touch: Improve comments

---------

Co-authored-by: berkaysynnada <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
* refactor data_trunc

* fix cast to timestamp array

* fix cast to timestamp scalar

* fix doc
* implement distinct func

implement slt & proto

fix null & empty list

* add comment for slt

Co-authored-by: Alex Huang <[email protected]>

* fix largelist

* add largelist for slt

* Use collect for rows & init capcity for offsets.

* fixup: remove useless match

* fix fmt

* fix fmt

---------

Co-authored-by: Alex Huang <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
…e#8377)

* Add `evaluate_demo` and `range_analysis_demo` to Expr examples

* Prettier

* Update datafusion-examples/examples/expr_api.rs

Co-authored-by: Raphael Taylor-Davies <[email protected]>

* rename ExprBoundaries::try_new_unknown --> ExprBoundaries::try_new_unbounded

---------

Co-authored-by: Raphael Taylor-Davies <[email protected]>
…ont/back` (apache#8401)

* array_element done

Signed-off-by: jayzhan211 <[email protected]>

* clippy

Signed-off-by: jayzhan211 <[email protected]>

* replace array_slice

Signed-off-by: jayzhan211 <[email protected]>

* fix get_indexed_field_empty_list

Signed-off-by: jayzhan211 <[email protected]>

* replace pop front and pop back

Signed-off-by: jayzhan211 <[email protected]>

* clippy

Signed-off-by: jayzhan211 <[email protected]>

* add doc and comment

Signed-off-by: jayzhan211 <[email protected]>

* fmt

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
* refactor trim

* add fmt for TrimType

* fix closure

* update comment
* add PlaceHolderRowExec

* Change produce_one_row=true calls to use PlaceHolderRowExec

* remove produce_one_row from EmptyExec, changes in proto serializer, working tests

* PlaceHolder => Placeholder

---------

Co-authored-by: Andrew Lamb <[email protected]>
@appletreeisyellow appletreeisyellow changed the title chore: temporary branch for IOx update chore: temporary branch for IOx update (11-30-2023 to 12-09-2023) Jan 2, 2024
@appletreeisyellow
Copy link
Contributor Author

Closing this as the update is done

@appletreeisyellow appletreeisyellow deleted the chunchun/temp-fix-12-14-2023 branch January 24, 2024 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate documentation Improvements or additions to documentation logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Physical Expressions sql SQL Planner sqllogictest SQL Logic Tests (.slt) substrait
Projects
None yet
Development

Successfully merging this pull request may close these issues.