Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize CreateDigest implementation. #16617

Merged
merged 2 commits into from
Aug 24, 2022

Conversation

danxmoran
Copy link
Contributor

@danxmoran danxmoran commented Aug 23, 2022

Closes #16570

  • Use a DigestTrie to create all snapshots at once, instead of creating them individually
  • Store all in-memory file contents in a single (batched) call, instead of storing them individually

[ci skip-build-wheels]


I restored the benchmark script from #14569 to test this. The results seem almost too good to be true, but here they are 🤔

size create_digest_before create_digest_after
20000 608 130
40000 1164 268
60000 2260 475
80000 3582 674
100000 5085 862
120000 6765 1057
140000 8818 1067
160000 10752 1361
180000 12619 1604

Somebody should probably double-check those numbers 😅 I'll re-run on my machine to sanity-check it as well

* Use a `DigestTrie` to create all snapshots at once, instead of
  creating them individually
* Store all in-memory file contents in a single (batched) call, instead
  of storing them individually

[ci skip-build-wheels]
@Eric-Arellano
Copy link
Contributor

Omg epic. @stuhood definitely cherry-pick this to 2.14. Thoughts on 2.13? I'm hoping to do an RC tonight if I can get CI green.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks.

src/rust/engine/src/intrinsics.rs Outdated Show resolved Hide resolved
@stuhood
Copy link
Member

stuhood commented Aug 23, 2022

@stuhood definitely cherry-pick this to 2.14. Thoughts on 2.13? I'm hoping to do an RC tonight if I can get CI green.

For sure 2.14. I'm definitely curious what the real-world impact will be, but I doubt it will be significant enough to be worth delaying 2.13 for. If it is though, can always go into a point release of 2.13.

@danxmoran danxmoran marked this pull request as ready for review August 23, 2022 22:46
It's not strictly needed, and skipping it can avoid a lot of IO.

[ci skip-build-wheels]
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@stuhood stuhood enabled auto-merge (squash) August 23, 2022 23:26
@stuhood stuhood added this to the 2.14.x milestone Aug 23, 2022
@stuhood stuhood merged commit cc3e78b into pantsbuild:main Aug 24, 2022
Eric-Arellano pushed a commit to Eric-Arellano/pants that referenced this pull request Aug 24, 2022
Closes pantsbuild#16570

* Use a `DigestTrie` to create all snapshots at once, instead of creating them individually
* Store all in-memory file contents in a single (batched) call, instead of storing them individually

[ci skip-build-wheels]

---

I restored the benchmark script from pantsbuild#14569 to test this.

| size | create_digest_before | create_digest_after |
| --- | --- | --- |
| 20000 | 608 | 130 |
| 40000 | 1164 | 268 |
| 60000 | 2260 | 475 |
| 80000 | 3582 | 674 |
| 100000 | 5085 | 862 |
| 120000 | 6765 | 1057 |
| 140000 | 8818 | 1067 |
| 160000 | 10752 | 1361 |
| 180000 | 12619 | 1604 |
Eric-Arellano added a commit that referenced this pull request Aug 29, 2022
Closes #16570

* Use a `DigestTrie` to create all snapshots at once, instead of creating them individually
* Store all in-memory file contents in a single (batched) call, instead of storing them individually

[ci skip-build-wheels]

---

I restored the benchmark script from #14569 to test this.

| size | create_digest_before | create_digest_after |
| --- | --- | --- |
| 20000 | 608 | 130 |
| 40000 | 1164 | 268 |
| 60000 | 2260 | 475 |
| 80000 | 3582 | 674 |
| 100000 | 5085 | 862 |
| 120000 | 6765 | 1057 |
| 140000 | 8818 | 1067 |
| 160000 | 10752 | 1361 |
| 180000 | 12619 | 1604 |
cczona pushed a commit to cczona/pants that referenced this pull request Sep 1, 2022
Closes pantsbuild#16570

* Use a `DigestTrie` to create all snapshots at once, instead of creating them individually
* Store all in-memory file contents in a single (batched) call, instead of storing them individually

[ci skip-build-wheels]

---

I restored the benchmark script from pantsbuild#14569 to test this.

| size | create_digest_before | create_digest_after |
| --- | --- | --- |
| 20000 | 608 | 130 |
| 40000 | 1164 | 268 |
| 60000 | 2260 | 475 |
| 80000 | 3582 | 674 |
| 100000 | 5085 | 862 |
| 120000 | 6765 | 1057 |
| 140000 | 8818 | 1067 |
| 160000 | 10752 | 1361 |
| 180000 | 12619 | 1604 |
@danxmoran danxmoran deleted the danxmoran/optimize-create-digest branch October 4, 2022 00:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CreateDigest time is non-linear in the number of inputs
3 participants