Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate a performant replacement for createPatch from diff library #103

Open
joyceerhl opened this issue Dec 14, 2021 · 3 comments
Open
Assignees
Labels
debt Code quality issues
Milestone

Comments

@joyceerhl
Copy link
Contributor

joyceerhl commented Dec 14, 2021

Loading a PR containing a very large file (34K LOC) with changes takes ~1 minute: microsoft/vscode-pull-request-github#3187

This is because GHPRI depends on RemoteHub for generating diffs, and RemoteHub uses the diff library, which suffers from known performance issues: kpdecker/jsdiff#239

GHPRI is mitigating this by not generating diffs until absolutely necessary, but a user can still get unlucky if the first file in the PR is very large and we auto-display the diff, or the user may have to wait a long time after clicking to view file changes in the diff view.

We should identify a more performant library, e.g. Google's diff-match-patch js implementation plus modifications to output unidiffs.

@joyceerhl joyceerhl added the debt Code quality issues label Dec 14, 2021
@joyceerhl joyceerhl self-assigned this Dec 14, 2021
@joyceerhl joyceerhl added this to the January 2022 milestone Dec 14, 2021
@joyceerhl
Copy link
Contributor Author

joyceerhl commented Dec 15, 2021

@alexr00, could you try setting "remoteHub.diff.useDiffMatchPatch": true in your settings.json and letting me know how it works for you over the next few days? The original diff library RemoteHub uses suffers from performance issues because it generates enough garbage that the VM spends >60% of its time GC-ing--I'm still looking into why.

From my testing, with the new library, it looks like a 50% speedup when loading https://insiders.vscode.dev/github/python/cpython/pull/29581 even without your recent GHPRI changes.
image

I am not sure if it has perfect feature parity with the original library so I don't want to enable it more broadly yet.

@alexr00
Copy link
Member

alexr00 commented Dec 15, 2021

Cool, I will try it out!

@joyceerhl
Copy link
Contributor Author

Looks like the new diff library is a big improvement perf-wise (still behind the remoteHub.diff.useDiffMatchPatch toggle). Here are some preliminary results from testing RemoteHub with GHPRI:

RemoteHub GHPRI Time to PR view Recording
Stable Stable 60s Recording
Insiders using new diff library Stable 22s Recording
Insiders using old diff library Insiders 15s Recording
Insiders using new diff library Insiders 14s Recording

Stable GHPRI, stable RemoteHub using old diff library

stable.mp4

Stable GHPRI, Insiders RemoteHub using new diff library

stable-rh.mp4

Insiders GHPRI, Insiders RemoteHub using old diff library

old-diff.mp4

Insiders GHPRI, Insiders RemoteHub using new diff library

new-diff.mp4

@joyceerhl joyceerhl modified the milestones: January 2022, February 2022 Jan 27, 2022
@joyceerhl joyceerhl modified the milestones: February 2022, March 2022 Feb 24, 2022
@joyceerhl joyceerhl modified the milestones: March 2022, Backlog Mar 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
debt Code quality issues
Projects
None yet
Development

No branches or pull requests

2 participants