Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scale table writers per task based on throughput #13111

Merged
merged 5 commits into from
Aug 30, 2022
Merged

Scale table writers per task based on throughput #13111

merged 5 commits into from
Aug 30, 2022

Conversation

gaurav8297
Copy link
Member

@gaurav8297 gaurav8297 commented Jul 7, 2022

Description

Currently, the task_writer_count per worker defaults to 1 to avoid many small files in some cases. This PR attempts to make it adaptive based on parameters like physicalWrittenBytes and buffer size, which is similar to what we are doing for scale_writers in ScaledWriterScheduler. This feature is behind a new flag scale_task_writers. Although, I don't like having two flags, but it gives more control.

  • scale_writers: To scale out writers by scheduling them on different workers based on throughput.
  • scale_task_writers: To scale out local parallel table writer jobs per worker based on throughput.

PS: For now, this only works for the unpartitioned table target, and for the partitioned target, I'm planning to create a separate PR.

Initial Results (~6x faster):

Before:

trino:gaurav_test_1> Insert into gaurav_test_1.lineitem_1 select * FROM tpch_sf300_orc_part.lineitem;
INSERT: 1799989091 rows

Query 20220706_113152_00007_2df8m, FINISHED, 7 nodes
Splits: 2,758 total, 2,758 done (100.00%)
26:25 [1.8B rows, 47.5GB] [1.14M rows/s, 30.7MB/s]

After:

trino:gaurav_test_1> set session scale_task_writers=true;
SET SESSION
trino:gaurav_test_1> Insert into gaurav_test_1.lineitem_1 select * FROM tpch_sf300_orc_part.lineitem;
INSERT: 1799989091 rows

Query 20220706_115833_00009_2df8m, FINISHED, 7 nodes
Splits: 2,800 total, 2,800 done (100.00%)
4:09 [1.8B rows, 47.5GB] [7.23M rows/s, 195MB/s]

Is this change a fix, improvement, new feature, refactoring, or other?

improvement

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

core query engine

How would you describe this change to a non-technical end user or system administrator?

Related issues, pull requests, and links

Documentation

( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jul 7, 2022
@gaurav8297 gaurav8297 changed the title [WIP] Table writer tasks per worker scaling based on throughput [WIP] Scaling table writer tasks per worker based on throughput Jul 7, 2022
@gaurav8297 gaurav8297 changed the title [WIP] Scaling table writer tasks per worker based on throughput Scaling table writer tasks per worker based on throughput Jul 13, 2022
@gaurav8297 gaurav8297 changed the title Scaling table writer tasks per worker based on throughput Scale table writer tasks per worker based on throughput Jul 13, 2022
@gaurav8297 gaurav8297 marked this pull request as ready for review July 13, 2022 21:51
Copy link
Contributor

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % comments

Copy link
Contributor

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % comments % @raunaqmorarka's approval

@raunaqmorarka raunaqmorarka changed the title Scale table writer tasks per worker based on throughput Scale table writers per task based on throughput Jul 19, 2022
@arhimondr
Copy link
Contributor

@gaurav8297 There are some test failures

@gaurav8297
Copy link
Member Author

All of these test failures seem to be unrelated.

@gaurav8297
Copy link
Member Author

ci / hive-tests (config-empty and ci / hive-tests (config-hdp3) are failing due to this issue. #13270

ci / test (plugin/trino-hive) is failing due to OOM and ci / test (plugin/trino-elasticsearch) because of container failure.

All of these issues seem unrelated to this change.

cc @arhimondr @raunaqmorarka

@github-actions github-actions bot added the docs label Jul 22, 2022
@gaurav8297
Copy link
Member Author

gaurav8297 commented Jul 22, 2022

This experiment was performed to find the best default value of task.min-scaled-writer-count.

When the value is 8:

trino:gaurav_test_1> Insert into gaurav_test_1.lineitem_current select * FROM tpch_sf300_orc_part.lineitem;
INSERT: 1799989091 rows

Query 20220722_081653_00003_k3qm4, FINISHED, 7 nodes
Splits: 2,800 total, 2,800 done (100.00%)
4:05 [1.8B rows, 47.5GB] [7.03M rows/s, 190MB/s]

When the value is 4:

trino:gaurav_test_1> Insert into gaurav_test_1.lineitem_current select * FROM tpch_sf300_orc_part.lineitem;
INSERT: 1799989091 rows

Query 20220722_083038_00002_dscnm, FINISHED, 7 nodes
Splits: 2,776 total, 2,776 done (100.00%)
6:26 [1.8B rows, 47.5GB] [4.71M rows/s, 128MB/s]

So the difference is somewhere around ~2.5mins. That's why we are going with 8 as default.

@raunaqmorarka
Copy link
Member

@sopel39 PTAL
It lgtm % comments about docs

@gaurav8297
Copy link
Member Author

Test failures are unrelated

Copy link
Member

@lukasz-stec lukasz-stec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % minor comments

Copy link
Contributor

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % comments remaining

@sopel39
Copy link
Member

sopel39 commented Aug 19, 2022

@dain do you ack adding local writer scaling (the meat of this change)?

@gaurav8297
Copy link
Member Author

gaurav8297 commented Aug 26, 2022

Simulation:

config:

MAX_WORKERS = 20
TOTAL_DATA_SIZE = 4000
MAX_TASK_WRITER_COUNT = 8
MIN_WRITER_SIZE = 32

Results:

Total Writers: 49
Small Writers: 15
Large Writers: 34
Small/Large Ratio: 0.4411764705882353

cc @sopel39

// with incoming data. In another word, buffer utilization is below 50%.
if (writerCount < buffers.size() && memoryManager.getBufferedBytes() >= bufferSizeMinScalingThreshold) {
long physicalWrittenBytes = physicalWrittenBytesSupplier.get();
if ((physicalWrittenBytes - lastScaleUpPhysicalWrittenBytes) >= writerCount * writerMinSize) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote i little simulation of this logic (output below).
I add 4GB of data in 1MB pages with bufferSizeMinScalingThreshold = 0 and maxWriterCount = 16.
Two things I noticed.

  1. Scaling to 8 writers takes about 900MB, where in theory I could be 8*32MB = 256MB.
  2. File sizes monotonically decrease (the numbers mean buffer index: pages/MBs written) :
    final buffers: 0:496, 1:464, 2:432, 3:400, 4:368, 5:336, 6:304, 7:272, 8:240, 9:208, 10:176, 11:144, 12:112, 13:80, 14:48, 15:16
    Is this expected behavior?
firs column is bytes written so far, second is MBs written per buffer
0 MB, buffers: 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
32 MB, buffers: 0:32, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.949-0500 WARNING Increased task writer count: 2
64 MB, buffers: 0:48, 1:16, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
96 MB, buffers: 0:64, 1:32, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.949-0500 WARNING Increased task writer count: 3
128 MB, buffers: 0:74, 1:43, 2:11, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
160 MB, buffers: 0:85, 1:54, 2:21, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
192 MB, buffers: 0:96, 1:64, 2:32, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.951-0500 WARNING Increased task writer count: 4
224 MB, buffers: 0:104, 1:72, 2:40, 3:8, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
256 MB, buffers: 0:112, 1:80, 2:48, 3:16, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
288 MB, buffers: 0:120, 1:88, 2:56, 3:24, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
320 MB, buffers: 0:128, 1:96, 2:64, 3:32, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.952-0500 WARNING Increased task writer count: 5
352 MB, buffers: 0:134, 1:103, 2:71, 3:38, 4:6, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
384 MB, buffers: 0:140, 1:109, 2:77, 3:45, 4:13, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
416 MB, buffers: 0:147, 1:116, 2:83, 3:51, 4:19, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
448 MB, buffers: 0:153, 1:122, 2:90, 3:58, 4:25, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
480 MB, buffers: 0:160, 1:128, 2:96, 3:64, 4:32, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.953-0500 WARNING Increased task writer count: 6
512 MB, buffers: 0:165, 1:134, 2:102, 3:69, 4:37, 5:5, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
544 MB, buffers: 0:170, 1:139, 2:107, 3:75, 4:43, 5:10, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
576 MB, buffers: 0:176, 1:144, 2:112, 3:80, 4:48, 5:16, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
608 MB, buffers: 0:181, 1:150, 2:118, 3:85, 4:53, 5:21, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
640 MB, buffers: 0:186, 1:155, 2:123, 3:91, 4:59, 5:26, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
672 MB, buffers: 0:192, 1:160, 2:128, 3:96, 4:64, 5:32, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.954-0500 WARNING Increased task writer count: 7
704 MB, buffers: 0:196, 1:165, 2:133, 3:101, 4:69, 5:36, 6:4, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
736 MB, buffers: 0:201, 1:170, 2:137, 3:105, 4:73, 5:41, 6:9, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
768 MB, buffers: 0:205, 1:174, 2:142, 3:110, 4:78, 5:46, 6:13, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
800 MB, buffers: 0:210, 1:179, 2:147, 3:114, 4:82, 5:50, 6:18, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
832 MB, buffers: 0:214, 1:183, 2:151, 3:119, 4:87, 5:55, 6:23, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
864 MB, buffers: 0:219, 1:188, 2:156, 3:124, 4:91, 5:59, 6:27, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
896 MB, buffers: 0:224, 1:192, 2:160, 3:128, 4:96, 5:64, 6:32, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.955-0500 WARNING Increased task writer count: 8
928 MB, buffers: 0:228, 1:196, 2:164, 3:132, 4:100, 5:68, 6:36, 7:4, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
960 MB, buffers: 0:232, 1:200, 2:168, 3:136, 4:104, 5:72, 6:40, 7:8, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
992 MB, buffers: 0:236, 1:204, 2:172, 3:140, 4:108, 5:76, 6:44, 7:12, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1024 MB, buffers: 0:240, 1:208, 2:176, 3:144, 4:112, 5:80, 6:48, 7:16, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1056 MB, buffers: 0:244, 1:212, 2:180, 3:148, 4:116, 5:84, 6:52, 7:20, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1088 MB, buffers: 0:248, 1:216, 2:184, 3:152, 4:120, 5:88, 6:56, 7:24, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1120 MB, buffers: 0:252, 1:220, 2:188, 3:156, 4:124, 5:92, 6:60, 7:28, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1152 MB, buffers: 0:256, 1:224, 2:192, 3:160, 4:128, 5:96, 6:64, 7:32, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.955-0500 WARNING Increased task writer count: 9
1184 MB, buffers: 0:259, 1:228, 2:196, 3:164, 4:132, 5:100, 6:67, 7:35, 8:3, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1216 MB, buffers: 0:263, 1:232, 2:199, 3:167, 4:135, 5:103, 6:71, 7:39, 8:7, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1248 MB, buffers: 0:266, 1:235, 2:203, 3:171, 4:139, 5:107, 6:75, 7:42, 8:10, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1280 MB, buffers: 0:270, 1:239, 2:207, 3:174, 4:142, 5:110, 6:78, 7:46, 8:14, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1312 MB, buffers: 0:273, 1:242, 2:210, 3:178, 4:146, 5:114, 6:82, 7:50, 8:17, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1344 MB, buffers: 0:277, 1:246, 2:214, 3:182, 4:149, 5:117, 6:85, 7:53, 8:21, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1376 MB, buffers: 0:280, 1:249, 2:217, 3:185, 4:153, 5:121, 6:89, 7:57, 8:25, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1408 MB, buffers: 0:284, 1:253, 2:221, 3:189, 4:157, 5:124, 6:92, 7:60, 8:28, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1440 MB, buffers: 0:288, 1:256, 2:224, 3:192, 4:160, 5:128, 6:96, 7:64, 8:32, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.956-0500 WARNING Increased task writer count: 10
1472 MB, buffers: 0:291, 1:260, 2:228, 3:195, 4:163, 5:131, 6:99, 7:67, 8:35, 9:3, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1504 MB, buffers: 0:294, 1:263, 2:231, 3:199, 4:167, 5:134, 6:102, 7:70, 8:38, 9:6, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1536 MB, buffers: 0:297, 1:266, 2:234, 3:202, 4:170, 5:138, 6:106, 7:73, 8:41, 9:9, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1568 MB, buffers: 0:300, 1:269, 2:237, 3:205, 4:173, 5:141, 6:109, 7:77, 8:45, 9:12, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1600 MB, buffers: 0:304, 1:272, 2:240, 3:208, 4:176, 5:144, 6:112, 7:80, 8:48, 9:16, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1632 MB, buffers: 0:307, 1:276, 2:244, 3:211, 4:179, 5:147, 6:115, 7:83, 8:51, 9:19, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1664 MB, buffers: 0:310, 1:279, 2:247, 3:215, 4:183, 5:150, 6:118, 7:86, 8:54, 9:22, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1696 MB, buffers: 0:313, 1:282, 2:250, 3:218, 4:186, 5:154, 6:122, 7:89, 8:57, 9:25, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1728 MB, buffers: 0:316, 1:285, 2:253, 3:221, 4:189, 5:157, 6:125, 7:93, 8:61, 9:28, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
1760 MB, buffers: 0:320, 1:288, 2:256, 3:224, 4:192, 5:160, 6:128, 7:96, 8:64, 9:32, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.957-0500 WARNING Increased task writer count: 11
1792 MB, buffers: 0:322, 1:291, 2:259, 3:227, 4:195, 5:163, 6:131, 7:99, 8:67, 9:35, 10:3, 11:0, 12:0, 13:0, 14:0, 15:0
1824 MB, buffers: 0:325, 1:294, 2:262, 3:230, 4:198, 5:166, 6:134, 7:102, 8:70, 9:38, 10:5, 11:0, 12:0, 13:0, 14:0, 15:0
1856 MB, buffers: 0:328, 1:297, 2:265, 3:233, 4:201, 5:169, 6:137, 7:105, 8:73, 9:40, 10:8, 11:0, 12:0, 13:0, 14:0, 15:0
1888 MB, buffers: 0:331, 1:300, 2:268, 3:236, 4:204, 5:172, 6:140, 7:108, 8:75, 9:43, 10:11, 11:0, 12:0, 13:0, 14:0, 15:0
1920 MB, buffers: 0:334, 1:303, 2:271, 3:239, 4:207, 5:175, 6:143, 7:110, 8:78, 9:46, 10:14, 11:0, 12:0, 13:0, 14:0, 15:0
1952 MB, buffers: 0:337, 1:306, 2:274, 3:242, 4:210, 5:178, 6:145, 7:113, 8:81, 9:49, 10:17, 11:0, 12:0, 13:0, 14:0, 15:0
1984 MB, buffers: 0:340, 1:309, 2:277, 3:245, 4:213, 5:180, 6:148, 7:116, 8:84, 9:52, 10:20, 11:0, 12:0, 13:0, 14:0, 15:0
2016 MB, buffers: 0:343, 1:312, 2:280, 3:248, 4:215, 5:183, 6:151, 7:119, 8:87, 9:55, 10:23, 11:0, 12:0, 13:0, 14:0, 15:0
2048 MB, buffers: 0:346, 1:315, 2:283, 3:250, 4:218, 5:186, 6:154, 7:122, 8:90, 9:58, 10:26, 11:0, 12:0, 13:0, 14:0, 15:0
2080 MB, buffers: 0:349, 1:318, 2:285, 3:253, 4:221, 5:189, 6:157, 7:125, 8:93, 9:61, 10:29, 11:0, 12:0, 13:0, 14:0, 15:0
2112 MB, buffers: 0:352, 1:320, 2:288, 3:256, 4:224, 5:192, 6:160, 7:128, 8:96, 9:64, 10:32, 11:0, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.958-0500 WARNING Increased task writer count: 12
2144 MB, buffers: 0:354, 1:323, 2:291, 3:259, 4:227, 5:195, 6:163, 7:131, 8:99, 9:66, 10:34, 11:2, 12:0, 13:0, 14:0, 15:0
2176 MB, buffers: 0:357, 1:326, 2:294, 3:262, 4:230, 5:197, 6:165, 7:133, 8:101, 9:69, 10:37, 11:5, 12:0, 13:0, 14:0, 15:0
2208 MB, buffers: 0:360, 1:328, 2:296, 3:264, 4:232, 5:200, 6:168, 7:136, 8:104, 9:72, 10:40, 11:8, 12:0, 13:0, 14:0, 15:0
2240 MB, buffers: 0:362, 1:331, 2:299, 3:267, 4:235, 5:203, 6:171, 7:139, 8:107, 9:74, 10:42, 11:10, 12:0, 13:0, 14:0, 15:0
2272 MB, buffers: 0:365, 1:334, 2:302, 3:270, 4:238, 5:205, 6:173, 7:141, 8:109, 9:77, 10:45, 11:13, 12:0, 13:0, 14:0, 15:0
2304 MB, buffers: 0:368, 1:336, 2:304, 3:272, 4:240, 5:208, 6:176, 7:144, 8:112, 9:80, 10:48, 11:16, 12:0, 13:0, 14:0, 15:0
2336 MB, buffers: 0:370, 1:339, 2:307, 3:275, 4:243, 5:211, 6:179, 7:147, 8:115, 9:82, 10:50, 11:18, 12:0, 13:0, 14:0, 15:0
2368 MB, buffers: 0:373, 1:342, 2:310, 3:278, 4:246, 5:213, 6:181, 7:149, 8:117, 9:85, 10:53, 11:21, 12:0, 13:0, 14:0, 15:0
2400 MB, buffers: 0:376, 1:344, 2:312, 3:280, 4:248, 5:216, 6:184, 7:152, 8:120, 9:88, 10:56, 11:24, 12:0, 13:0, 14:0, 15:0
2432 MB, buffers: 0:378, 1:347, 2:315, 3:283, 4:251, 5:219, 6:187, 7:155, 8:123, 9:90, 10:58, 11:26, 12:0, 13:0, 14:0, 15:0
2464 MB, buffers: 0:381, 1:350, 2:318, 3:286, 4:254, 5:221, 6:189, 7:157, 8:125, 9:93, 10:61, 11:29, 12:0, 13:0, 14:0, 15:0
2496 MB, buffers: 0:384, 1:352, 2:320, 3:288, 4:256, 5:224, 6:192, 7:160, 8:128, 9:96, 10:64, 11:32, 12:0, 13:0, 14:0, 15:0
2022-08-26T08:50:27.959-0500 WARNING Increased task writer count: 13
2528 MB, buffers: 0:386, 1:355, 2:323, 3:291, 4:259, 5:227, 6:195, 7:162, 8:130, 9:98, 10:66, 11:34, 12:2, 13:0, 14:0, 15:0
2560 MB, buffers: 0:388, 1:357, 2:325, 3:293, 4:261, 5:229, 6:197, 7:165, 8:133, 9:101, 10:69, 11:37, 12:5, 13:0, 14:0, 15:0
2592 MB, buffers: 0:391, 1:360, 2:328, 3:296, 4:264, 5:232, 6:199, 7:167, 8:135, 9:103, 10:71, 11:39, 12:7, 13:0, 14:0, 15:0
2624 MB, buffers: 0:393, 1:362, 2:330, 3:298, 4:266, 5:234, 6:202, 7:170, 8:138, 9:106, 10:74, 11:42, 12:9, 13:0, 14:0, 15:0
2656 MB, buffers: 0:396, 1:365, 2:333, 3:301, 4:269, 5:236, 6:204, 7:172, 8:140, 9:108, 10:76, 11:44, 12:12, 13:0, 14:0, 15:0
2688 MB, buffers: 0:398, 1:367, 2:335, 3:303, 4:271, 5:239, 6:207, 7:175, 8:143, 9:111, 10:79, 11:46, 12:14, 13:0, 14:0, 15:0
2720 MB, buffers: 0:401, 1:370, 2:338, 3:306, 4:273, 5:241, 6:209, 7:177, 8:145, 9:113, 10:81, 11:49, 12:17, 13:0, 14:0, 15:0
2752 MB, buffers: 0:403, 1:372, 2:340, 3:308, 4:276, 5:244, 6:212, 7:180, 8:148, 9:116, 10:83, 11:51, 12:19, 13:0, 14:0, 15:0
2784 MB, buffers: 0:406, 1:375, 2:343, 3:310, 4:278, 5:246, 6:214, 7:182, 8:150, 9:118, 10:86, 11:54, 12:22, 13:0, 14:0, 15:0
2816 MB, buffers: 0:408, 1:377, 2:345, 3:313, 4:281, 5:249, 6:217, 7:185, 8:153, 9:120, 10:88, 11:56, 12:24, 13:0, 14:0, 15:0
2848 MB, buffers: 0:411, 1:380, 2:347, 3:315, 4:283, 5:251, 6:219, 7:187, 8:155, 9:123, 10:91, 11:59, 12:27, 13:0, 14:0, 15:0
2880 MB, buffers: 0:413, 1:382, 2:350, 3:318, 4:286, 5:254, 6:222, 7:190, 8:157, 9:125, 10:93, 11:61, 12:29, 13:0, 14:0, 15:0
2912 MB, buffers: 0:416, 1:384, 2:352, 3:320, 4:288, 5:256, 6:224, 7:192, 8:160, 9:128, 10:96, 11:64, 12:32, 13:0, 14:0, 15:0
2022-08-26T08:50:27.960-0500 WARNING Increased task writer count: 14
2944 MB, buffers: 0:418, 1:387, 2:355, 3:323, 4:291, 5:258, 6:226, 7:194, 8:162, 9:130, 10:98, 11:66, 12:34, 13:2, 14:0, 15:0
2976 MB, buffers: 0:420, 1:389, 2:357, 3:325, 4:293, 5:261, 6:229, 7:197, 8:165, 9:132, 10:100, 11:68, 12:36, 13:4, 14:0, 15:0
3008 MB, buffers: 0:422, 1:391, 2:359, 3:327, 4:295, 5:263, 6:231, 7:199, 8:167, 9:135, 10:103, 11:71, 12:39, 13:6, 14:0, 15:0
3040 MB, buffers: 0:425, 1:394, 2:362, 3:329, 4:297, 5:265, 6:233, 7:201, 8:169, 9:137, 10:105, 11:73, 12:41, 13:9, 14:0, 15:0
3072 MB, buffers: 0:427, 1:396, 2:364, 3:332, 4:300, 5:268, 6:236, 7:203, 8:171, 9:139, 10:107, 11:75, 12:43, 13:11, 14:0, 15:0
3104 MB, buffers: 0:429, 1:398, 2:366, 3:334, 4:302, 5:270, 6:238, 7:206, 8:174, 9:142, 10:110, 11:77, 12:45, 13:13, 14:0, 15:0
3136 MB, buffers: 0:432, 1:400, 2:368, 3:336, 4:304, 5:272, 6:240, 7:208, 8:176, 9:144, 10:112, 11:80, 12:48, 13:16, 14:0, 15:0
3168 MB, buffers: 0:434, 1:403, 2:371, 3:339, 4:307, 5:274, 6:242, 7:210, 8:178, 9:146, 10:114, 11:82, 12:50, 13:18, 14:0, 15:0
3200 MB, buffers: 0:436, 1:405, 2:373, 3:341, 4:309, 5:277, 6:245, 7:213, 8:181, 9:148, 10:116, 11:84, 12:52, 13:20, 14:0, 15:0
3232 MB, buffers: 0:438, 1:407, 2:375, 3:343, 4:311, 5:279, 6:247, 7:215, 8:183, 9:151, 10:119, 11:87, 12:55, 13:22, 14:0, 15:0
3264 MB, buffers: 0:441, 1:410, 2:378, 3:345, 4:313, 5:281, 6:249, 7:217, 8:185, 9:153, 10:121, 11:89, 12:57, 13:25, 14:0, 15:0
3296 MB, buffers: 0:443, 1:412, 2:380, 3:348, 4:316, 5:284, 6:252, 7:219, 8:187, 9:155, 10:123, 11:91, 12:59, 13:27, 14:0, 15:0
3328 MB, buffers: 0:445, 1:414, 2:382, 3:350, 4:318, 5:286, 6:254, 7:222, 8:190, 9:158, 10:126, 11:93, 12:61, 13:29, 14:0, 15:0
3360 MB, buffers: 0:448, 1:416, 2:384, 3:352, 4:320, 5:288, 6:256, 7:224, 8:192, 9:160, 10:128, 11:96, 12:64, 13:32, 14:0, 15:0
2022-08-26T08:50:27.961-0500 WARNING Increased task writer count: 15
3392 MB, buffers: 0:450, 1:419, 2:387, 3:354, 4:322, 5:290, 6:258, 7:226, 8:194, 9:162, 10:130, 11:98, 12:66, 13:34, 14:2, 15:0
3424 MB, buffers: 0:452, 1:421, 2:389, 3:357, 4:325, 5:292, 6:260, 7:228, 8:196, 9:164, 10:132, 11:100, 12:68, 13:36, 14:4, 15:0
3456 MB, buffers: 0:454, 1:423, 2:391, 3:359, 4:327, 5:295, 6:263, 7:230, 8:198, 9:166, 10:134, 11:102, 12:70, 13:38, 14:6, 15:0
3488 MB, buffers: 0:456, 1:425, 2:393, 3:361, 4:329, 5:297, 6:265, 7:233, 8:201, 9:168, 10:136, 11:104, 12:72, 13:40, 14:8, 15:0
3520 MB, buffers: 0:458, 1:427, 2:395, 3:363, 4:331, 5:299, 6:267, 7:235, 8:203, 9:171, 10:139, 11:106, 12:74, 13:42, 14:10, 15:0
3552 MB, buffers: 0:460, 1:429, 2:397, 3:365, 4:333, 5:301, 6:269, 7:237, 8:205, 9:173, 10:141, 11:109, 12:77, 13:44, 14:12, 15:0
3584 MB, buffers: 0:462, 1:431, 2:399, 3:367, 4:335, 5:303, 6:271, 7:239, 8:207, 9:175, 10:143, 11:111, 12:79, 13:47, 14:15, 15:0
3616 MB, buffers: 0:465, 1:434, 2:401, 3:369, 4:337, 5:305, 6:273, 7:241, 8:209, 9:177, 10:145, 11:113, 12:81, 13:49, 14:17, 15:0
3648 MB, buffers: 0:467, 1:436, 2:404, 3:372, 4:339, 5:307, 6:275, 7:243, 8:211, 9:179, 10:147, 11:115, 12:83, 13:51, 14:19, 15:0
3680 MB, buffers: 0:469, 1:438, 2:406, 3:374, 4:342, 5:310, 6:277, 7:245, 8:213, 9:181, 10:149, 11:117, 12:85, 13:53, 14:21, 15:0
3712 MB, buffers: 0:471, 1:440, 2:408, 3:376, 4:344, 5:312, 6:280, 7:248, 8:215, 9:183, 10:151, 11:119, 12:87, 13:55, 14:23, 15:0
3744 MB, buffers: 0:473, 1:442, 2:410, 3:378, 4:346, 5:314, 6:282, 7:250, 8:218, 9:186, 10:153, 11:121, 12:89, 13:57, 14:25, 15:0
3776 MB, buffers: 0:475, 1:444, 2:412, 3:380, 4:348, 5:316, 6:284, 7:252, 8:220, 9:188, 10:156, 11:124, 12:91, 13:59, 14:27, 15:0
3808 MB, buffers: 0:477, 1:446, 2:414, 3:382, 4:350, 5:318, 6:286, 7:254, 8:222, 9:190, 10:158, 11:126, 12:94, 13:62, 14:29, 15:0
3840 MB, buffers: 0:480, 1:448, 2:416, 3:384, 4:352, 5:320, 6:288, 7:256, 8:224, 9:192, 10:160, 11:128, 12:96, 13:64, 14:32, 15:0
2022-08-26T08:50:27.962-0500 WARNING Increased task writer count: 16
3872 MB, buffers: 0:482, 1:450, 2:418, 3:386, 4:354, 5:322, 6:290, 7:258, 8:226, 9:194, 10:162, 11:130, 12:98, 13:66, 14:34, 15:2
3904 MB, buffers: 0:484, 1:452, 2:420, 3:388, 4:356, 5:324, 6:292, 7:260, 8:228, 9:196, 10:164, 11:132, 12:100, 13:68, 14:36, 15:4
3936 MB, buffers: 0:486, 1:454, 2:422, 3:390, 4:358, 5:326, 6:294, 7:262, 8:230, 9:198, 10:166, 11:134, 12:102, 13:70, 14:38, 15:6
3968 MB, buffers: 0:488, 1:456, 2:424, 3:392, 4:360, 5:328, 6:296, 7:264, 8:232, 9:200, 10:168, 11:136, 12:104, 13:72, 14:40, 15:8
4000 MB, buffers: 0:490, 1:458, 2:426, 3:394, 4:362, 5:330, 6:298, 7:266, 8:234, 9:202, 10:170, 11:138, 12:106, 13:74, 14:42, 15:10
4032 MB, buffers: 0:492, 1:460, 2:428, 3:396, 4:364, 5:332, 6:300, 7:268, 8:236, 9:204, 10:172, 11:140, 12:108, 13:76, 14:44, 15:12
4064 MB, buffers: 0:494, 1:462, 2:430, 3:398, 4:366, 5:334, 6:302, 7:270, 8:238, 9:206, 10:174, 11:142, 12:110, 13:78, 14:46, 15:14
final buffers: 0:496, 1:464, 2:432, 3:400, 4:368, 5:336, 6:304, 7:272, 8:240, 9:208, 10:176, 11:144, 12:112, 13:80, 14:48, 15:16

@gaurav8297
Copy link
Member Author

gaurav8297 commented Aug 28, 2022

benchmarks (4x better):
insert_unpart_benchmark.pdf

Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments

@sopel39
Copy link
Member

sopel39 commented Aug 29, 2022

@gaurav8297 We need to have a plan for config properties when we add local scaling for partitioned inserts

Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments

}

private PlanWithProperties visitPartitionedWriter(PlanNode node, Optional<PartitioningScheme> optionalPartitioning, PlanNode source, StreamPreferredProperties parentPreferences)
private PlanWithProperties visitPartitionedWriter(PlanNode node, Optional<PartitioningScheme> optionalPartitioning, PlanNode source, StreamPreferredProperties parentPreferences, WriterTarget writerTarget)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the name of this method is bad most likely

Scale up local writers when current buffer memory utilization
is more than 50% of maximum and physical written
bytes by the last scaled up writer is greater than
writerMinSize. The scaling will happen upto
task.scale-writers.max-writer-count.
@sopel39 sopel39 merged commit 75ca607 into trinodb:master Aug 30, 2022
@sopel39 sopel39 mentioned this pull request Aug 30, 2022
@github-actions github-actions bot added this to the 395 milestone Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

7 participants