Parallelization performance severely degraded since Tycho 1.7.0 #342
Comments
Can you provide a small reproducer where this is visible, for testing purposes?
Have you tried Tycho 2.5.0? It contains some performance improvements for large build setups.
Oh, I completely missed the release of Tycho 2.5.0, thanks for the heads up. The numbers are in:
It is a bit faster but still nowhere near the 1.6.0 durations. The non-parallelized build seems to have become a little slower (although maybe within error margins). I will try to create a reproducible project, but I cannot make promises. I guess it will have to be a bunch of test projects with sleeps in them.
The gap in terms of the number of commits between Tycho 1.6.0 and 2.5.0 is huge. It's going to be very hard, and not so motivating, to identify the change that causes this issue. And if the issue was introduced between 1.6.0 and 1.7.0, it's also not so motivating to dig into outdated code.
I would guess the root cause is related to making the mojos thread-safe. We did that by simply introducing an artificial lock object in each and every mojo. Therefore, if your build spends most of its time in one and the same mojo across a large number of Maven modules, that one mojo may dominate the build and can no longer run in parallel. In other words, since that change you can run tycho-foo and tycho-bar in parallel, but not two executions of tycho-foo.
It seems the lock was introduced with Tycho 1.7.x in 8afcd61. From a quick look at the code, it seems that at least we do not need to lock globally, as each project uses its own work directory. My suggestion would be to move that method call outside the synchronized block and check whether that helps in your case or causes issues. If it helps, we can then integrate this change permanently into Tycho.
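To illustrate the effect described above, here is a minimal sketch and not Tycho's actual code. The class MojoLockSketch, its methods, and the 200 ms sleep that stands in for a test run are all hypothetical. It compares holding a shared lock around the long-running work with holding it only around a short preparation step, and reports how many "test runs" overlapped at peak.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from Tycho.
public class MojoLockSketch {
    // Stand-in for the artificial lock shared by all executions of one mojo.
    private static final Object LOCK = new Object();

    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    // Short setup step that is assumed to need the shared lock.
    private void prepare() {
        // nothing to do in this sketch
    }

    // Stand-in for a long test run; records how many runs overlap.
    private void runTests() throws InterruptedException {
        int now = running.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        Thread.sleep(200);
        running.decrementAndGet();
    }

    // Variant 1: the whole execution, including the tests, holds the lock.
    void executeLocked() throws InterruptedException {
        synchronized (LOCK) {
            prepare();
            runTests();
        }
    }

    // Variant 2: only the setup holds the lock; the tests run outside it.
    void executeUnlocked() throws InterruptedException {
        synchronized (LOCK) {
            prepare();
        }
        runTests();
    }

    // Runs one execution per "module" on its own thread and returns the
    // highest number of test runs that were active at the same time.
    public static int peakConcurrency(boolean testsInsideLock, int modules) throws Exception {
        MojoLockSketch mojo = new MojoLockSketch();
        ExecutorService pool = Executors.newFixedThreadPool(modules);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < modules; i++) {
                futures.add(pool.submit(() -> {
                    if (testsInsideLock) {
                        mojo.executeLocked();
                    } else {
                        mojo.executeUnlocked();
                    }
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // propagate failures and wait for completion
            }
        } finally {
            pool.shutdown();
        }
        return mojo.peak.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("tests inside lock, peak parallel runs: " + peakConcurrency(true, 4));
        System.out.println("tests outside lock, peak parallel runs: " + peakConcurrency(false, 4));
    }
}
```

With the tests inside the lock, the peak can never exceed one, no matter how many threads the build provides. With the tests outside the lock, several runs can overlap, which is the behaviour the suggested change is meant to restore.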
Your suspicion seems to be correct. I cloned the repository and tested your suggested change: The build is successful (no multithreading failures, although there are up to 16 test projects running in parallel) and the build time is back down to approximately 2 hours. I'm a bit unsure how to continue from here, though. As far as I have seen, I would need to sign the Eclipse ECA in order to commit the patch myself. I have not done that yet and would have to check with my company first whether I would actually be allowed to do that. Should I go ahead with that, or would you rather have one of you commit the change? I'd be fine with that too.
@k-spa please check with your company first, as it is generally a great opportunity to contribute (small or larger) patches to (Eclipse) OSS projects, and such a small change is a good starting point to get familiar with the process and tools. If it becomes too complicated, let me know and I'll write my own patch for this. But as you are currently the only one complaining, it's your time schedule, and it probably helps to convince your company to contribute if they are directly affected by this.
We'd rather have you submit the patch under your own name for proper and fair attribution, and also, since you've demonstrated some interest in the project and proven you have the skills to improve it, to get you familiar with the contribution process, as it's very likely you'll want to suggest more changes in the future.
Running tests while the lock is present leads to the tests being executed in series which severely degrades performance when the build is running with parallel maven projects.
Fixes #342: Run surefire tests outside of lock
We should probably mention this in the release notes. People with SWTBot UI tests and parallel build enabled would previously not have run into issues, but after this change multiple bot tests might try to manipulate windows and controls at the same time. That will probably fail, at least on Windows (because multiple separate interactive sessions cannot run in parallel). In fact, a large number of tests in our company follow that SWTBot pattern, and that's why we don't use the parallel build for surefire testing. Otherwise I might have noticed this specific bug much earlier myself. :)
Document parallel testing changes (fixes #342)
We have been using Java 11 / Tycho 1.6.0 in our project since the release of Java 11.
The build is quite lengthy and, without any special configuration, takes about 3 hours. Therefore we set the -T 1C command line option to build several projects in parallel. This logs warnings about unsupported parallelization but has always worked fine. Building in parallel drops the build time to 2 hours.
Anticipating the upcoming Java 17 support, I tried upgrading the build system to Tycho 2.4.0 to test for any major roadblocks.
With Tycho 2.4.0 the unparallelized build takes the same 3 hours as with Tycho 1.6.0, but setting the parallelization option actually increases the build time to 3 hours and 47 minutes!
I tested with older versions of Tycho, and it seems that the severe slowdown has been present since version 1.7.0. I did not expect that: minor changes in the build time are fine, but I expected the build time to stay at least in the region of 2 hours.
Compared with our current parallelized 2-hour build, that is an increase of roughly 90 % (from 2 hours to 3 hours and 47 minutes). Even falling back to the unparallelized 3-hour build is hardly an option.
Is there a problem with running the build parallelized since Tycho 1.7.0? Do I have to configure something differently?
Our maven call looks like this:
mvn install -Dmaven.repo.local=<repository location> -P <profile name> -T 1C -fae
The profile just determines which tests are executed and should not behave differently whether the -T 1C parameter is present or not.