refactor: Bump OpenDAL 0.46, arrow 51, tonic 0.11, reqwest 0.12, hyper 1, http 1 #15442

Xuanwo · 2024-05-08T15:11:12Z

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR bumps the following packages:

OpenDAL 0.46
arrow 51
tonic 0.11
reqwest 0.12
hyper 1
http 1

I have to submit them all in one PR to make sure databend can build.

Changes worth mentioning

OpenDAL v0.46 has transitioned from AsyncRead-based IO to range-based IO. Some internal implementations have been refactored to match, but not all of them perform optimally yet. We need to address this regression.

We have transitioned to range-based IO for both Parquet and native formats; please give this area extra attention.
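To make the idea of range-based IO concrete, here is a minimal sketch using only the standard library. The `RangeRead` trait, `MemObject`, and `read_tail` are hypothetical names invented for this illustration; they are not OpenDAL or Databend APIs. The point is only that a caller asks for an explicit byte range instead of pulling from a sequential stream, which suits formats such as Parquet that keep their metadata at the end of the file.

```rust
use std::ops::Range;

/// Hypothetical trait for illustration only; this is not OpenDAL's API.
trait RangeRead {
    /// Total size of the object in bytes.
    fn len(&self) -> u64;
    /// Return the bytes in `range`, clamped to the object size.
    fn read_range(&self, range: Range<u64>) -> Vec<u8>;
}

/// In-memory stand-in for a remote object.
struct MemObject(Vec<u8>);

impl RangeRead for MemObject {
    fn len(&self) -> u64 {
        self.0.len() as u64
    }

    fn read_range(&self, range: Range<u64>) -> Vec<u8> {
        // Clamp both ends so an oversized request never panics.
        let end = range.end.min(self.len()) as usize;
        let start = (range.start as usize).min(end);
        self.0[start..end].to_vec()
    }
}

/// Fetch the last `n` bytes with a single ranged request,
/// the way a reader might fetch a trailing footer.
fn read_tail<R: RangeRead>(obj: &R, n: u64) -> Vec<u8> {
    let len = obj.len();
    obj.read_range(len.saturating_sub(n)..len)
}
```

With a stream-style reader, the same footer access would need a seek or a read of everything before it; with ranges, each request is independent, which is also what makes issuing several of them at once possible.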

To minimize the number of changes in the PR, I refactored directly without robust abstraction, such as using opendal::Reader directly in the native format API. I intend to refine this aspect in upcoming PRs.

Benchmarks

Concurrent reading is enabled only for loading CSV/TSV files; other queries remain unaffected. I will first ensure there are no regressions, and then I'll optimize Parquet reading.

Query Performance

hits

tpch

Load (with 2 concurrent)

Load (with 4 concurrent)

Tests

Unit Test
Logic Test
Benchmark Test
No Test - Explain why

Type of change

Bug Fix (non-breaking change which fixes an issue)
New Feature (non-breaking change which adds functionality)
Breaking Change (fix or feature that could cause existing functionality not to work as expected)
Documentation Update
Refactoring
Performance Improvement
Other (please describe):

Signed-off-by: Xuanwo <[email protected]>

Xuanwo · 2024-05-09T14:39:55Z

I'm waiting for the benchmark results.

github-actions · 2024-05-09T14:46:10Z

Docker Image for PR

tag: pr-15442-9e07209

note: this image tag is only available for internal use,
please check the internal doc for more details.

Xuanwo · 2024-05-09T15:12:20Z

The benchmark seems quite slow; I'm working on this.

Signed-off-by: Xuanwo <[email protected]>

github-actions · 2024-05-10T10:53:30Z

ClickBench Report

github-actions · 2024-05-10T11:22:26Z

Docker Image for PR

tag: pr-15442-3c97076

note: this image tag is only available for internal use,
please check the internal doc for more details.

github-actions · 2024-05-10T12:06:50Z

ClickBench Report

Signed-off-by: Xuanwo <[email protected]>

github-actions · 2024-05-10T16:33:52Z

Docker Image for PR

tag: pr-15442-9013fda

note: this image tag is only available for internal use,
please check the internal doc for more details.

Signed-off-by: Xuanwo <[email protected]>

github-actions · 2024-05-10T17:16:58Z

ClickBench Report

github-actions · 2024-05-10T17:41:31Z

Docker Image for PR

tag: pr-15442-41987a4

note: this image tag is only available for internal use,
please check the internal doc for more details.

github-actions · 2024-05-10T18:18:15Z

ClickBench Report

Xuanwo · 2024-05-11T01:57:37Z

Hi, @youngsofun, this PR is ready for review now!

Signed-off-by: Xuanwo <[email protected]>

BohuTANG · 2024-05-11T02:13:30Z

The data load times improved significantly in OpenDAL 0.46, but query performance as measured by the hits benchmark seems unchanged. What specific optimizations did 0.46 include that sped up loading but not queries?

Xuanwo · 2024-05-11T02:16:28Z

What specific optimizations did 0.46 include that sped up loading but not queries?

OpenDAL v0.46 introduces a new feature known as concurrent read, also referred to as auto ranged read. This feature allows us to read large files with concurrency.

However, I've only enabled concurrency in a few continuous reading areas like the bytes reader. Other areas remain unchanged, and I'm satisfied with the current performance since there's no significant regression. I plan to optimize additional areas in upcoming PRs.
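As a rough illustration of what reading a large file "with concurrency" can look like, the sketch below splits an object into fixed-size ranges and fetches a few of them in parallel, then reassembles the bytes in order. It uses only the standard library; `split_ranges` and `concurrent_read` are hypothetical helpers written for this example, the in-memory slice stands in for a remote object, and the batching strategy is an assumption rather than a description of how OpenDAL implements the feature.

```rust
use std::ops::Range;
use std::thread;

/// Split `0..total` into consecutive ranges of at most `chunk` bytes.
fn split_ranges(total: u64, chunk: u64) -> Vec<Range<u64>> {
    assert!(chunk > 0, "chunk size must be positive");
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < total {
        let end = (start + chunk).min(total);
        ranges.push(start..end);
        start = end;
    }
    ranges
}

/// Read `data` range by range, with at most `concurrency` fetches
/// in flight per batch. The slice copy stands in for a remote ranged read.
fn concurrent_read(data: &[u8], chunk: u64, concurrency: usize) -> Vec<u8> {
    assert!(concurrency > 0, "concurrency must be positive");
    let ranges = split_ranges(data.len() as u64, chunk);
    let mut out = Vec::with_capacity(data.len());
    for batch in ranges.chunks(concurrency) {
        // Scoped threads may borrow `data` without requiring 'static.
        let parts: Vec<Vec<u8>> = thread::scope(|s| {
            let handles: Vec<_> = batch
                .iter()
                .map(|r| {
                    let r = r.clone();
                    s.spawn(move || data[r.start as usize..r.end as usize].to_vec())
                })
                .collect();
            // Joining in spawn order keeps the output in file order.
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        });
        for part in parts {
            out.extend_from_slice(&part);
        }
    }
    out
}
```

Calling `concurrent_read(&bytes, 64, 4)` would correspond loosely to the "4 concurrent" load setting benchmarked in this PR, and a `concurrency` of 2 to the "2 concurrent" one. Sequential consumers such as a CSV/TSV loader benefit because the whole file is read front to back anyway.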

Xuanwo added 8 commits May 7, 2024 15:10

Save current work

4bcbb80

Signed-off-by: Xuanwo <[email protected]>

Refactor layer

47704d0

Signed-off-by: Xuanwo <[email protected]>

Merge remote-tracking branch 'origin/main' into upgrade-opendal

a4edea5

Save work

f9dd7b3

Signed-off-by: Xuanwo <[email protected]>

Save current work

517ffa0

Signed-off-by: Xuanwo <[email protected]>

Build pass

32748ae

Signed-off-by: Xuanwo <[email protected]>

cargo fix

0129269

Signed-off-by: Xuanwo <[email protected]>

cargo check pass

aed49b2

Signed-off-by: Xuanwo <[email protected]>

Xuanwo changed the title ~~Upgrade opendal~~ refactor: Bump OpenDAL to 0.46, arrow to 51, tonic to 0.11, reqwest to 0.12, hyper to 1 May 8, 2024

github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label May 8, 2024

Xuanwo changed the title ~~refactor: Bump OpenDAL to 0.46, arrow to 51, tonic to 0.11, reqwest to 0.12, hyper to 1~~ refactor: Bump OpenDAL 0.46, arrow 51, tonic 0.11, reqwest 0.12, hyper 1, http 1 May 8, 2024

Merge branch 'main' into upgrade-opendal

fe0b61b

Xuanwo added 2 commits May 9, 2024 17:42

Cleanup deps

f657fbd

Signed-off-by: Xuanwo <[email protected]>

Merge branch 'main' into upgrade-opendal

545b4f6

Xuanwo mentioned this pull request May 9, 2024

ci: Bump version to 2024-02-08 (the same commit with 1.78) #15455

Merged

11 tasks

Format files

10cccc0

Signed-off-by: Xuanwo <[email protected]>

Xuanwo added the ci-benchmark Benchmark: run all test label May 9, 2024

Xuanwo marked this pull request as ready for review May 9, 2024 14:15

Xuanwo requested review from b41sh, sundy-li, zhang2014 and youngsofun May 9, 2024 14:15

sundy-li approved these changes May 9, 2024

Xuanwo marked this pull request as draft May 9, 2024 15:12

Fix bytes reader use too small range

deaced2

Signed-off-by: Xuanwo <[email protected]>

Xuanwo removed the ci-benchmark Benchmark: run all test label May 9, 2024

Xuanwo added ci-benchmark Benchmark: run all test and removed ci-benchmark Benchmark: run all test labels May 10, 2024

Xuanwo added 3 commits May 10, 2024 23:31

reduce to 2 concurrent

03ecd95

Signed-off-by: Xuanwo <[email protected]>

Also fix support for input pipeline

e5182bc

Signed-off-by: Xuanwo <[email protected]>

Merge branch 'main' into upgrade-opendal

97275f8

Xuanwo added ci-benchmark Benchmark: run all test and removed ci-benchmark Benchmark: run all test labels May 10, 2024

Xuanwo added 2 commits May 11, 2024 00:48

try 4 concurrent

2747979

Signed-off-by: Xuanwo <[email protected]>

Merge remote-tracking branch 'refs/remotes/xuanwo/upgrade-opendal' in…

ecc0846

…to upgrade-opendal

Xuanwo added ci-benchmark Benchmark: run all test and removed ci-benchmark Benchmark: run all test labels May 10, 2024

Xuanwo marked this pull request as ready for review May 10, 2024 17:21

Remove an extra head

46b8f31

Signed-off-by: Xuanwo <[email protected]>

youngsofun approved these changes May 11, 2024

Xuanwo added this pull request to the merge queue May 11, 2024

Merged via the queue into databendlabs:main with commit 220787d May 11, 2024
78 checks passed

Xuanwo deleted the upgrade-opendal branch May 11, 2024 07:44

This was referenced Jul 14, 2024

Link Checker Report databendlabs/databend-docs#955

Closed

Link Checker Report databendlabs/databend-docs#960

Closed
