Concurrently add partitions to Hive Metastore #15241

Merged: 4 commits into trinodb:master from the hive_finish_time branch, Dec 8, 2022

gaurav8297 (Member) commented Nov 29, 2022

Description

Improve the performance of the finishing step when inserting into a newly created partitioned table by adding partitions to the Hive Metastore concurrently.
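The general idea can be illustrated with a minimal sketch. This is not the code from this PR: the class `ConcurrentPartitionAdder`, the method `addPartitions`, and the `addBatch` callback are hypothetical names, and the batch size and thread count are arbitrary parameters chosen for illustration. It only shows one common way to submit partition batches in parallel rather than one after another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical illustration only; not the implementation merged in this PR.
public class ConcurrentPartitionAdder
{
    /**
     * Splits {@code partitions} into batches of at most {@code batchSize} and
     * hands each batch to {@code addBatch} on a fixed-size thread pool.
     * {@code addBatch} stands in for a metastore "add partitions" call.
     *
     * @return the number of batches submitted
     */
    public static <T> int addPartitions(
            List<T> partitions,
            int batchSize,
            int threads,
            Consumer<List<T>> addBatch)
            throws Exception
    {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int start = 0; start < partitions.size(); start += batchSize) {
                int end = Math.min(start + batchSize, partitions.size());
                List<T> batch = partitions.subList(start, end);
                // Each batch is an independent metastore request, so they can run in parallel.
                futures.add(executor.submit(() -> addBatch.accept(batch)));
            }
            // Wait for every batch; get() rethrows the first failure as an ExecutionException.
            // Batches that already succeeded are NOT rolled back in this sketch.
            for (Future<?> future : futures) {
                future.get();
            }
            return futures.size();
        }
        finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would pass its own metastore call as `addBatch`. A real implementation would also need to decide how to clean up partitions that were already added when a later batch fails, which this sketch leaves out.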

Benchmarks

For Glue:

Total partitions: 2000+
Total rows: 600M

Before:
Writing time: 2:29 mins
Finishing time: 4:32 mins
Total time: 7:01 mins

After:
Writing time: 2:27 mins
Finishing time: 1:01 mins
Total time: 3:28 mins

For Hive Thrift metastore (local setup):

Total partitions: 100+
Total rows: 1M

Before:
Writing time: 20 secs
Finishing time: 1:50 mins
Total time: 2:10 mins

After:
Writing time: 20 secs
Finishing time: 17.50 secs
Total time: 37.50 secs

Additional context and related issues

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Nov 29, 2022
@gaurav8297 gaurav8297 marked this pull request as ready for review November 30, 2022 03:43
sopel39 (Member) left a comment

lgtm % comments

@sopel39 sopel39 merged commit 9adb3ca into trinodb:master Dec 8, 2022
@sopel39 sopel39 mentioned this pull request Dec 8, 2022
@github-actions github-actions bot added this to the 404 milestone Dec 8, 2022
@gaurav8297 gaurav8297 deleted the hive_finish_time branch December 8, 2022 11:03