
[performance] make sure decompress is set to 4.2.0 #7715

Merged: 1 commit merged into master on May 1, 2020
Conversation

@amiramw (Member) commented May 1, 2020

decompress 4.2.1 will cause a huge performance degradation in plugins deployment
See: kevva/decompress#77

Signed-off-by: Amiram Wingarten [email protected]

What it does

Set the decompress dependency in plugin-ext to exactly 4.2.0, since 4.2.1 brings a major performance degradation. This fixed version makes sure that Theia and dependent projects don't build their products with 4.2.1.
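For illustration, the exact pin described above would look like this in plugin-ext's package.json (the file layout here is a minimal sketch; only the decompress entry reflects this PR):

```json
{
  "dependencies": {
    "decompress": "4.2.0"
  }
}
```

With no range operator, npm and yarn treat "4.2.0" as exactly that version, so 4.2.1 can never be resolved for this dependency.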

How to test

If yarn.lock is deleted and Theia is built with decompress 4.2.1, loading vsix files consumes very high CPU and takes at least twice as long as it currently does.
With the current Theia yarn.lock, decompress is still resolved to 4.2.0, so there is no problem at the moment.


@JPinkney (Contributor) left a comment:

LGTM 👍

@amiramw amiramw merged commit 7498bd1 into master May 1, 2020
@amiramw amiramw deleted the decompress branch May 1, 2020 14:21
@akosyakov added labels dependencies (pull requests that update a dependency file) and performance (issues related to performance) on May 1, 2020
@vince-fugnitto (Member) commented Jul 30, 2020

@amiramw can we revert back to ^4.2.0 as a dependency range as the pinned version has a security vulnerability? (#7319)

If we revert to ^4.2.0 as the dependency range, downstream applications get the possibility to use a version without the security vulnerability, while downstream applications (such as yours) that are worried about performance degradation when deploying plugins can still pin the version themselves.

I think 4.2.0 is too restrictive.

cc @marcdumais-work

@amiramw (Member, Author) commented Jul 31, 2020

@vince-fugnitto the fix for the security vulnerability is what caused the performance issue. What is the difference if, in both cases, the downstream project has to set its own version via a resolution? Or did you mean pinning the version via some other method?

@vince-fugnitto (Member) commented:

> @vince-fugnitto the fix for the security vulnerability is what caused the performance issue. What is the difference if, in both cases, the downstream project has to set its own version via a resolution? Or did you mean pinning the version via some other method?

The problem is that with this change we specify a version range that is too restrictive; as far as I know, a downstream app then has no possibility to use a resolution to pin it to a higher version (without the vulnerability).

If we kept ^4.2.0 instead of 4.2.0 as the range, we would give downstream apps more possibility to pin the dependency to a version that suits their use case (e.g. performance vs. security) instead of pinning it upstream.

Note: If no operator is specified, then = is assumed in the version range. So the = operator is effectively optional.
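The caret-vs-exact distinction discussed above can be sketched without any library (a simplified model of npm/yarn caret semantics for majors >= 1; real caret rules also special-case 0.x versions):

```javascript
// Simplified model: does `version` satisfy the caret range `^base`?
// Caret keeps the major version fixed and allows any higher minor/patch.
function satisfiesCaret(version, base) {
  const [vMaj, vMin, vPat] = version.split('.').map(Number);
  const [bMaj, bMin, bPat] = base.split('.').map(Number);
  if (vMaj !== bMaj) return false;       // caret never crosses a major
  if (vMin !== bMin) return vMin > bMin; // a higher minor is allowed
  return vPat >= bPat;                   // or a higher (or equal) patch
}

console.log(satisfiesCaret('4.2.1', '4.2.0')); // true  -> "^4.2.0" admits 4.2.1
console.log('4.2.1' === '4.2.0');              // false -> exact "4.2.0" rejects it
```

This is why the exact pin blocks the insecure-but-fast 4.2.0 from ever being upgraded, while `^4.2.0` would leave the choice to the resolver and to downstream lockfiles.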

@amiramw (Member, Author) commented Jul 31, 2020

> The problem is that we specify a version range with this change that is too restrictive, we don't have the possibility as far as I know in a downstream app to use a resolution to pin it to a higher version (without the vulnerability).

I think you can define a resolution that is not within the range defined by your dependency. See the yarn documentation:

> You may be depending on a package that is not updated frequently, which depends on another package that got an important upgrade. In this case, if the version range specified by your direct dependency does not cover the new sub-dependency version, you are stuck waiting for the author.

If you want to pin the version via yarn.lock and not via a resolution, then I think you are right.
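As a sketch of the resolution mechanism being discussed, a downstream application could force its own decompress version through yarn's resolutions field in its root package.json (the version shown is illustrative):

```json
{
  "resolutions": {
    "**/decompress": "4.2.1"
  }
}
```

The glob pattern forces every transitive occurrence of decompress to the chosen version, regardless of the range declared by intermediate dependencies such as plugin-ext.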

Anyway, if other consumers care more about the security vulnerability than about this performance issue, then I don't mind the revert. It should probably come with breaking-change documentation explaining how to pin it.

@marcdumais-work (Contributor) commented Aug 1, 2020

> Anyway, if other consumers care more about the security vulnerability than this performance issue then I don't mind the revert.

The main issue in my eyes is that, currently, we effectively force any downstream project to use an insecure dependency version, unless they take action to force otherwise (e.g. through a resolution). We're making it hard to do the right thing, security-wise.

> decompress 4.2.1 will cause a huge performance degradation in plugins deployment

Potentially. For it to matter, there would need to be a big degradation that is noticeable to the user. I do not think I would notice if a plugin I am installing took 200ms to decompress during deployment instead of 100ms. Where it might be more noticeable is when we decompress many plugins at once while the user is waiting, such as during a fresh start of the application. It seems likely that the time spent decompressing plugins during startup is a relatively minor contribution to total app startup time, so doubling it would not be a disaster.

To test this I have forked the linked decompress-bug repo and modified it so that it fetches the vscode extensions we use in the example app. I unzip them all and measure the total time:

(open in Gitpod to take advantage of the automated build or follow the instructions in README)

Sample run on Gitpod

# (total execution time, decompress v4.2.0)
real    0m2.695s
user    0m2.546s
sys     0m1.109s

# (total execution time, decompress v4.2.1)
real    0m4.763s
user    0m4.383s
sys     0m2.555s

The relevant line to look at is "real":
v4.2.0: 2.7s
v4.2.1: 4.8s

So we're looking at a ballpark increase of ~2s in startup time for our example application, because of the extra time spent decompressing the plugins. We have ~70 plugins, which in total weigh ~94 MB unzipped and contain ~5000 files.
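A minimal Node sketch of the kind of timing harness behind these numbers (here `unzipAll` is a hypothetical stand-in; the real measurement calls decompress() on each .vsix fetched by the forked repo):

```javascript
// Generic wall-clock timing harness, in the spirit of `time node unzip.js`.
// `unzipAll` is a placeholder so the harness is runnable on its own; the
// real run would iterate over the downloaded .vsix files with decompress().
function unzipAll() {
  return new Promise(resolve => setTimeout(resolve, 100)); // placeholder work
}

async function timeIt(label, task) {
  const start = process.hrtime.bigint();   // monotonic, nanosecond resolution
  await task();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(0)}ms`);
  return ms;
}

timeIt('unzip all plugins', unzipAll);
```

Running the same harness once against each pinned decompress version is what produces the "real" figures compared above.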

> It should probably come with breaking-change documentation explaining how to pin it.

Not sure this qualifies as a breakage? Anyway, I am in favour of adding an entry in the CHANGELOG to let downstream users know that they might be using an insecure version of decompress, pinned in their yarn.lock, and how to upgrade. We can mention that upgrading comes with a performance degradation.

I would not offer any advice on how to keep the older version in this case, though downstream projects are free to do as they please.

@amiramw (Member, Author) commented Aug 5, 2020

The time penalty was huge when some heavy extensions were not bundled (I measured 3 minutes compared to 30 seconds).
If they are bundled, the penalty should be a few seconds.

@akosyakov (Member) commented:

Could you measure with installing the Java extension? It is 50 MB.

@amiramw (Member, Author) commented Aug 6, 2020

I think Python is more interesting, as it has ~2.5K files in the vsix.
The Red Hat Java extension has only 102.

@marcdumais-work (Contributor) commented Aug 6, 2020

> The time penalty was huge when some heavy extensions were not bundled (I measured 3 minutes compared to 30 seconds).

Can you clarify what you mean by "bundled / not bundled"? Do you mean whether an extension is defined in package.json vs being installed by end-user using the extensions view? Or does it perhaps refer to the extension(s) being bundled along the product in an electron-builder package vs not in a packaged form?

update: I think I understand now. The concern is that the example from the main repo, which I used as a base for measurements, does not pull some of the heaviest extensions.

@marcdumais-work (Contributor) commented:

I have updated my Gitpod example to add both the Java and Python extensions. Indeed the Python ext is the worst case I have seen so far.

Here are a few scenarios:

Only Python ext

# (total execution time, decompress v4.2.0)
real 0m5.497s

# (total execution time, decompress v4.2.1)
real 0m12.190s

Only Java ext

# (total execution time, decompress v4.2.0)
real 0m1.627s

# (total execution time, decompress v4.2.1)
real 0m2.000s

Python, Java and all extensions from main repo example app:

# (total execution time, decompress v4.2.0)
real 0m7.086s

# (total execution time, decompress v4.2.1)
real 0m14.488s


Conclusion: with all the extensions from the main repo's example application, plus Python and Java, it takes ~7s longer to decompress all extensions with the newer version of decompress than with the earlier one.

@amiramw (Member, Author) commented Aug 6, 2020

By bundling I mean using webpack or a similar tool to reduce the number of files in the vsix. When a vsix has thousands of files, like Python's, a lot of time and CPU is spent by decompress.
BTW, I believe it can also cause brief disconnections when the IDE is opened immediately after it comes up.

@marcdumais-work (Contributor) commented:

@amiramw thanks for the clarification.

> The time penalty was huge when some heavy extensions were not bundled (I measured 3 minutes compared to 30 seconds).

Everything else being equal, extrapolating from the Python case (~2.5k files take about 12s to decompress), your extension probably contained a huge number of files, as in multiple tens of thousands, to take 3 minutes? vsce must not be happy with this :)

I propose that the best way forward is to start by adopting the latest decompress and thus remove this security issue, making it easy for our extension's consumers to do the same. Then, in a follow-up, someone could try to replace decompress with something faster, or find a way to mitigate the impact on app startup time. I think @marechal-p might already have some ideas for improvements around this.

@amiramw (Member, Author) commented Aug 6, 2020

> Everything else being equal, extrapolating from the Python case (~2.5k files take about 12s to decompress), your extension probably contained a huge number of files, as in multiple tens of thousands, to take 3 minutes? vsce must not be happy with this :)

At the time there were several extensions with ~10K files each. And yes, vsce showed warnings and suggested bundling.
