Produce Hive splits for bucketed tables in round-robin fashion #7031

Merged: 2 commits merged into prestodb:master from the splitloader branch on Jan 14, 2017

Conversation

@haozhun (Contributor) commented on Jan 10, 2017

This reduces the likelihood that the scheduler gets blocked when one worker has
more splits queued than the limit while other workers have no splits. Without
round robin, for a bucketed partition, the split loader would produce a series
of splits whose node affinity looks like C, C, ..., C, A, A, ..., A, D, D, ...,
D, B, B, ..., B. If there are more splits for node C than the number of queued
splits allowed, nodes A, B, and D would have no splits to run because the
scheduler is blocked.
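
Below is a rough, hypothetical sketch of the round-robin interleaving described
above; it is not the actual BackgroundHiveSplitLoader code, and the class and
method names are illustrative only.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative helper: turns per-node split lists (C..., A..., D..., B...)
// into round-robin order (C, A, D, B, C, A, D, B, ...) so that no single
// node's splits can fill the queue before other nodes get any work.
public final class RoundRobinSplitsSketch
{
    public static <T> List<T> roundRobin(List<List<T>> splitsPerNode)
    {
        List<Iterator<T>> iterators = new ArrayList<>();
        for (List<T> splits : splitsPerNode) {
            iterators.add(splits.iterator());
        }
        List<T> interleaved = new ArrayList<>();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (Iterator<T> iterator : iterators) {
                if (iterator.hasNext()) {
                    interleaved.add(iterator.next());
                    progress = true;
                }
            }
        }
        return interleaved;
    }
}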

In addition, this commit changes the policy that determines the target size
for initial splits. The original policy tries to produce a number of initial
splits of similar size. For example, assume maxInitialSplitSize = 9K,
maxSplitSize = 30K, and a file size of 48K.

  • When maxInitialSplits = 10, the file will be split into 8K, 8K, 8K, 8K, 8K, 8K.
  • When maxInitialSplits = 2, the file will be split into 8K, 8K, 16K, 16K.

In the new policy,

  • When maxInitialSplits = 10, the file will be split into 9K, 9K, 9K, 9K, 9K, 3K.
  • When maxInitialSplits = 2, the file will be split into 9K, 9K, 30K.

You can see that the old policy is better for case 1, while the new policy is
better for case 2. A smarter policy that is optimal in both cases does exist,
but it would have to know exactly how many initial splits are left, which is
not possible given the parallel nature of BackgroundHiveSplitLoader.
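
The following is a minimal sketch, under the assumptions of the example above
(maxInitialSplitSize = 9K, maxSplitSize = 30K, 48K file), of how the new
target-size policy behaves; the class and method names are hypothetical and do
not appear in the actual change.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: each chunk targets maxInitialSplitSize while initial
// splits remain, then falls back to maxSplitSize, reproducing the
// 9K, 9K, 30K and 9K x 5, 3K examples above.
public final class NewSplitSizePolicySketch
{
    public static List<Long> splitSizes(long fileSize, long maxInitialSplitSize, long maxSplitSize, int maxInitialSplits)
    {
        AtomicInteger remainingInitialSplits = new AtomicInteger(maxInitialSplits);
        List<Long> sizes = new ArrayList<>();
        long offset = 0;
        while (offset < fileSize) {
            long targetSize = remainingInitialSplits.getAndDecrement() > 0 ? maxInitialSplitSize : maxSplitSize;
            long size = Math.min(targetSize, fileSize - offset);
            sizes.add(size);
            offset += size;
        }
        return sizes;
    }

    public static void main(String[] args)
    {
        // maxInitialSplits = 2: prints [9216, 9216, 30720], i.e. 9K, 9K, 30K
        System.out.println(splitSizes(48 * 1024, 9 * 1024, 30 * 1024, 2));
        // maxInitialSplits = 10: prints [9216, 9216, 9216, 9216, 9216, 3072]
        System.out.println(splitSizes(48 * 1024, 9 * 1024, 30 * 1024, 10));
    }
}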


long chunkOffset = 0;
while (chunkOffset < blockLocation.getLength()) {
    if (remainingInitialSplits.decrementAndGet() < 0 && creatingInitialSplits) {
A reviewer (Contributor) commented:

It looks like you removed this decrementing. Unless I'm missing something, you need to add it back to make the initial splits work correctly.

@haozhun (Author) replied:

You're right. I should add the decrementing back.

Do you have any suggestions on verifying the correctness of this code? I tried adding tests yesterday. I guess I can sit down and take the time to manually construct a BackgroundHiveSplitLoader, if that's what we have to do.

@haozhun (Author) left a comment:
Updated


@haozhun merged commit 7aab59d into prestodb:master on Jan 14, 2017
@haozhun deleted the splitloader branch on Mar 11, 2018