Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable -Xshareclasses by default #5831

Closed
DanHeidinga opened this issue May 22, 2019 · 21 comments
Closed

Enable -Xshareclasses by default #5831

DanHeidinga opened this issue May 22, 2019 · 21 comments

Comments

@DanHeidinga
Copy link
Member

We've previously tried this and been blocked due to a performance regression related to the iprofiler data not being collected when using AOT from the SCC - see #4205

Previous attempt: #1646

Background: #3333

This needs to be resolved so we can re-enable SCC by default

@DanHeidinga
Copy link
Member Author

fyi @hangshao0 @pshipton

@DanHeidinga
Copy link
Member Author

DanHeidinga commented Jun 4, 2019

@mpirvu Can you add a reference to your recent change to keep the iprofiler enabled if only bootclasses are in the SCC?

Have the perf experiments shown this provides a sufficient win to be worth turning back on?

@dsouzai
Copy link
Contributor

dsouzai commented Jun 4, 2019

I'd like to voice a concern regarding having the SCC enabled during the build process of OpenJDK+OpenJ9. The build step doesn't use a stable JVM to build java code (that ships with the JDK) that can't be javac'd using the bootstrap JDK, but rather the JVM we've just built. It's bad enough that a JVM bug could result in some unintentional "trusting trust" like problems. Enabling the SCC just adds another layer of complexity.

If the build step can't use some other JDK (say the latest release of the same java version from AdoptOpenJDK) to build the bootstrap classes / JCL, then at the very least we should not use the SCC during the build step.

@mpirvu
Copy link
Contributor

mpirvu commented Jun 4, 2019

This PR #5967 eliminates the ILOG regression seen when the default shared class cache (the one that allows only bootstrap classes) is enabled.

This improvement is seen on my system. I would like to get further validation from our perf team.
Other applications could still be affected. For instance, my experiments with the old SPECjbb2005 benchmark show a remaining 8% regression (down from 26%). Maybe we don't care that much about SPECjbb2005 nowadays, but we need to look into other applications that we care about.

@DanHeidinga
Copy link
Member Author

@dsouzai while I'm sympathetic to the concerns you've raised here, it's not an approach I'm comfortable with. By enabling the SCC by default, we're asking everyone of our users to trust in the SCC / AOT by default. It's awfully hard to argue that's correct thing for our users if we're not doing it ourselves.

@dsouzai
Copy link
Contributor

dsouzai commented Jun 5, 2019

Yeah that's fair. I guess I'm voicing this concern mainly because of the more fundamental problem I have, namely that we're using the current build to compile the java code. In terms of asking our users to trust in the SCC/AOT, well that's fine for a release build because we're confident that we've tested the release build sufficiently. However, I can't have the same level of confidence for say a PR build or a nightly build; it's not out of the realm of possibility that a bug could be introduced (in a way that doesn't cause the test suite to fail).

Now, this problem does already exists because we're using the currently building JVM. However, if there's an intermittent bug in an interim build, the odds of it showing up in every new JVM invocation during the build step is low. However, if you have an intermittent bug that ended up in the code that's stored in the SCC, and it'll show up in every JVM invocation now.

The argument could be made that it's just as likely (really far more likely) that we'll generate code without said hypothetical bug and then we'd be fine because we'd always use the good code in the SCC. I guess I'm coming from the point of view that this is just playing with dice now.

At any rate, it's just an opinion. Given that I had to deal with issues during the build step the last time this was enabled by default (which makes debugging so much harder because the build goes away...), I figure it's better to voice this concern, so that it's at least discussed if nothing else.

@charliegracie
Copy link
Contributor

I do not think we are ready for this. The current issue raised on Windows where a run with a warm shared classes cache was slower than a run without any cache is a big concern for me.

@DanHeidinga
Copy link
Member Author

DanHeidinga commented Jun 5, 2019

The issue @charliegracie refers to is: #5918

And I agree it's an issue. As a tactical fix, we're looking at re-introducing the JDK8 zip patches while a more cohesive plan is being developed. @pshipton has a tentative initial approach for how to better handle timestamp checking.

@DanHeidinga
Copy link
Member Author

For instance, my experiments with the old SPECjbb2005 benchmark show a remaining 8% regression (down from 26%). Maybe we don't care that much about SPECjbb2005 nowadays, but we need to look into other applications that we care about.

@mpirvu can you clarify if that's a regression against the baseline (no SCC) or compared to a user-specified SCC?

@hangshao0
Copy link
Contributor

For Java 8, #5918 goes away with ibmruntimes/openj9-openjdk-jdk8#311

@hangshao0
Copy link
Contributor

hangshao0 commented Jun 5, 2019

I believe we are only enabling sharing bootstrap classes by default here. #5918 is more of a problem when application classes are shared. For Java 9+, the bootstrap classes are from the Jimage and the timestamp of Jimage is only checked once for the life time of a JVM. We can change the code to do the same for bootstrap jars in Java 8. But with ibmruntimes/openj9-openjdk-jdk8#311, #5918 is already gone there.

@mpirvu
Copy link
Contributor

mpirvu commented Jun 6, 2019

@mpirvu can you clarify if that's a regression against the baseline (no SCC) or compared to a user-specified SCC?

The regression is against the baseline (no SCC). However, when running the SPECjbb2005 benchmark I disabled hot and scorching compilations because I wanted to focus more on warm compilations. I will do some new runs allowing hot/scorching compilations.

@DanHeidinga
Copy link
Member Author

I will do some new runs allowing hot/scorching compilations.

@mpirvu Have you had a chance to do these runs?

@mpirvu
Copy link
Contributor

mpirvu commented Jun 11, 2019

Yes I did. With scorching compilations enabled there is no regression.
However, this means that there is still the potential for an application, under some circumstances to experience a slowdown.

@DanHeidinga
Copy link
Member Author

We are coming up on a go/no-go decision for this PR to make the 0.15.0 release.

@mpirvu with the perf data so far, do you have any objections to enabling this by default?

@hangshao0 @pshipton Similar questions for you as well - Do you think this is ready to go?

@DanHeidinga
Copy link
Member Author

My position is that we need to move this out to the next release as we don't have enough runway to fix any infra / test / code issues that appear. We should look at enabling this immediately after doing the code split.

@hangshao0
Copy link
Contributor

Similar questions for you as well - Do you think this is ready to go?

For Java 11 and up. Nothing from VM side is blocking this.
For Java 8, we need either #6022 to be fixed or ibmruntimes/openj9-openjdk-jdk8#311 to be merged.

@mpirvu
Copy link
Contributor

mpirvu commented Jun 18, 2019

Performance regressions that were holding up this item were resolved, but there is still the possibility of regressions for some applications in some configurations. For instance SPECjbb2005 is seeing a 8-9% regression when hot/scorching compilations are disallowed. Vijay would like to make more progress on that and I am investigating.

@DanHeidinga
Copy link
Member Author

Vijay would like to make more progress on that and I am investigating.

We would be branching ~Aug 26 for the 0.16.0 release and would need some time for this to "bake" before turning it on.

My preference is to turn it on immediately after branching for 0.15.0 and work on any perf regressions in parallel with any other issues that may arise. This gives us the longest time to validate this change.

@mpirvu
Copy link
Contributor

mpirvu commented Jun 20, 2019

My preference is to turn it on immediately after branching for 0.15.0 and work on any perf regressions in parallel with any other issues that may arise. This gives us the longest time to validate this change.

I agree

pshipton added a commit to pshipton/openj9 that referenced this issue Jun 25, 2019
@pshipton
Copy link
Member

pshipton commented Jul 8, 2019

As shared classes has been enabled by default in #6271, I expect this can be closed.

@pshipton pshipton closed this as completed Jul 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants