Slowness while importing large projects or making POM changes #1670

Open
eliasbalasis opened this issue Feb 9, 2024 · 19 comments
Labels
question Further information is requested

Comments

@eliasbalasis

Past slowness was fixed in #155 and m2e 1.20.0, which was made part of Eclipse 2022-03.

However, the slowness has returned in 2023-09 with "M2E - Maven Integration for Eclipse" 2.4.100.20230827-1557.

This has become almost unbearable on projects with large number of modules.

The problem becomes even worse when importing hierarchies of projects and making a configuration change in one of the ancestor projects.

@laeubi
Member

laeubi commented Feb 9, 2024

@eliasbalasis please see here for how such reports should include the required information:

"This has become almost unbearable on projects with large number of modules." is quite vague and hard to address.

@laeubi laeubi added the question Further information is requested label Feb 9, 2024
@eliasbalasis
Author

Thanks @laeubi, the complaint is indeed vague but unfortunately the projects are private and I cannot publish them.

I have been following Eclipse IDE and M2E since their birth and I am confident of what I am expressing.

I will try to find alternative ways of describing the problem.

@laeubi
Member

laeubi commented Feb 9, 2024

> I will try to find alternative ways of describing the problem.

Any other open-source project that shows similar behavior will suffice :-)

@eliasbalasis
Author

eliasbalasis commented Feb 9, 2024

Eclipse.zip

I am afraid no public project that I could create would be big enough to reproduce the problem accurately.

That said, in relation to the original problem, it seems Eclipse is making unnecessary permutations.
The reason I am saying this is that the call chains in JVM snapshots I captured in the attached file seem to be insanely long, just like the old days (see #155 >> #123)

image
and it keeps getting deeper and deeper

Would this be helpful?

@laeubi
Member

laeubi commented Feb 9, 2024

Good question. If you can reproduce it, you probably want to start a debug session to look further into it and provide a PR?

@eliasbalasis
Author

@laeubi, I hope you will agree that I cannot afford to learn m2e-core only to reproduce a problem.

I was hoping for my observation to serve as a hint for someone familiar with the project.

@mickaelistria
Contributor

@eliasbalasis It's nearly impossible to fix such problems if we cannot reproduce them. Back then, there were some examples to reproduce; we could derive some from the huge Apache Camel project.
Even just trying to reproduce it is going to consume a tremendous amount of time, while there are already actionable items in the backlog, so it'd be hard to make it a priority.
What could be interesting is if you can monitor a "cluster" of subprojects you have which are involved in the apparent extra processing. From here, we might see a pattern, such as some mojo that affects some pom or other interesting data to cause some rebuild or something of that sort.

@laeubi
Member

laeubi commented Feb 9, 2024

If this is crucial to someone's business and they would like to speed up development in that area, sponsoring would allow me to assign more time slots to issues in general. A dedicated contract for a specific issue is also possible, and then I can even sign an NDA or similar to get access to the non-public example/code.

If there are no public real projects, as mentioned by @mickaelistria, a fix is nearly impossible and highly unlikely, since one can then neither prove there is a problem, nor analyze it, nor ever be sure that a certain optimization actually improves the situation.

@eliasbalasis
Author

I understand the difficulties.

I will try to monitor a "cluster" of my subprojects as suggested but I will also consider sponsoring or the NDA approach.

@ahoehma

ahoehma commented Feb 20, 2024

@eliasbalasis @laeubi what about this idea:

  • let's "generate" such a big project ... just with nonsense data (module-foo-1..n), random relations etc.
  • maybe also with many nonsense git commits etc. (because today I investigated a long wait while the m2e importer scanned the .git directory in my project)
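
A minimal sketch of what such a generator could look like, assuming a single aggregator POM with flat `module-foo-1..n` children and random dependencies between them. The coordinates (`org.example.synthetic`, `synthetic-root`) and the function names are hypothetical, and the layout is deliberately simpler than a real multi-layer hierarchy:

```python
import random
from pathlib import Path

GROUP_ID = "org.example.synthetic"  # hypothetical coordinates, not a real project
VERSION = "1.0.0-SNAPSHOT"


def pom(artifact_id, packaging, parent="", body=""):
    """Render a minimal POM as a string."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<project xmlns="http://maven.apache.org/POM/4.0.0">\n'
        "  <modelVersion>4.0.0</modelVersion>\n"
        f"{parent}"
        f"  <groupId>{GROUP_ID}</groupId>\n"
        f"  <artifactId>{artifact_id}</artifactId>\n"
        f"  <version>{VERSION}</version>\n"
        f"  <packaging>{packaging}</packaging>\n"
        f"{body}"
        "</project>\n"
    )


def generate_project(root, module_count, max_deps=3, seed=0):
    """Write a root aggregator POM plus module-foo-1..n under `root`."""
    rng = random.Random(seed)  # fixed seed keeps the layout reproducible
    root = Path(root)
    names = [f"module-foo-{i}" for i in range(1, module_count + 1)]

    modules = "".join(f"    <module>{n}</module>\n" for n in names)
    root.mkdir(parents=True, exist_ok=True)
    (root / "pom.xml").write_text(
        pom("synthetic-root", "pom", body=f"  <modules>\n{modules}  </modules>\n"),
        encoding="utf-8",
    )

    parent = (
        "  <parent>\n"
        f"    <groupId>{GROUP_ID}</groupId>\n"
        "    <artifactId>synthetic-root</artifactId>\n"
        f"    <version>{VERSION}</version>\n"
        "  </parent>\n"
    )
    for index, name in enumerate(names):
        # Depend only on earlier modules so the dependency graph stays acyclic.
        earlier = names[:index]
        deps = rng.sample(earlier, min(len(earlier), rng.randint(0, max_deps)))
        dep_xml = "".join(
            "    <dependency>\n"
            f"      <groupId>{GROUP_ID}</groupId>\n"
            f"      <artifactId>{d}</artifactId>\n"
            f"      <version>{VERSION}</version>\n"
            "    </dependency>\n"
            for d in deps
        )
        body = f"  <dependencies>\n{dep_xml}  </dependencies>\n" if deps else ""
        (root / name).mkdir(exist_ok=True)
        (root / name / "pom.xml").write_text(
            pom(name, "jar", parent=parent, body=body), encoding="utf-8"
        )
    return names
```

Calling `generate_project("synthetic-workspace", 500)` would write 501 POMs that can then be imported via "Import Maven Projects". Whether such a project reproduces the slowness reported here is an open question.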

@laeubi
Member

laeubi commented Feb 20, 2024

@ahoehma we don't want to optimize m2e for nonsense but for real-life projects. Also, you can make every process slow if you just use enough data, but this mostly does not replicate the actual issue.

> maybe also with many nonsense git commits etc. (because today I investigated a long wait while the m2e importer scanned the .git directory in my project)

What about CVS, SVN, Mercurial, ...?

@ahoehma

ahoehma commented Feb 20, 2024

I got your point, @laeubi. Another idea: maybe m2e could contain "something" that I as a user can enable to create a "report" which helps m2e developers find bottlenecks? Full-blown profiling is maybe too "private"; it would need to be something that is also okay for "closed source real world projects". A kind of build-metrics report, but "anonymized".
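
To make the idea concrete, here is a rough sketch of the anonymization step only. It assumes per-module timings and dependency edges have already been collected somehow; m2e does not offer such a report today, and `anonymize_report` is a hypothetical name, not an existing API:

```python
import hashlib
import json
import secrets


def anonymize_report(timings, edges, salt=None):
    """Replace module names with salted hashes, keeping timings and graph shape."""
    salt = salt or secrets.token_hex(16)  # keep the salt private, never publish it

    def alias(name):
        digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
        return "m-" + digest[:12]

    return {
        # Slowest modules first, so hot spots are visible at a glance.
        "modules": [
            {"id": alias(name), "seconds": seconds}
            for name, seconds in sorted(timings.items(), key=lambda kv: -kv[1])
        ],
        # Each edge is [dependent, dependency], expressed with aliases only.
        "edges": sorted([alias(a), alias(b)] for a, b in edges),
    }


if __name__ == "__main__":
    report = anonymize_report(
        {"billing-core": 2.5, "billing-web": 7.0},
        [("billing-web", "billing-core")],
    )
    print(json.dumps(report, indent=2))
```

The report keeps the shape of the module graph and the timings but drops every real name, so it could be shared even for closed-source projects.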

@laeubi
Member

laeubi commented Feb 20, 2024

If you would like to provide such a thing, it would be welcome. But consider that even if one can provide it, this does not mean the problem will or can be fixed (we are all working in our free time here!). If the project cannot be provided (i.e. closed source that you make money with), it is unlikely to get fixed "for free". So either you analyze it yourself (in which case anonymization is not important at all) and suggest a fix or improvement, or you hire a person who can then sign an NDA, in which case anonymization is also not that important.

@eliasbalasis
Author

Thanks @ahoehma, @laeubi for sharing your thoughts.

Unfortunately, it is a real life project that I am referring to.

I suspect, and I could be wrong, that the depth and complexity of the Maven module structure is leading to the slowness, perhaps due to an increased or unnecessarily large number of permutations generated by M2E. If so, creating a very big project equivalent to the real one would be rather difficult but not impossible.

I have to agree with @laeubi on the anonymized metrics report engine front.
It sounds quite demanding and may or may not reveal the actual problem, even though it is a good thought @ahoehma.

@laeubi
Member

laeubi commented Feb 21, 2024

BTW, the next m2e is out, so maybe test that. In general, following the snapshots could help in revealing problems fast and getting them fixed fast, as it is clear what is causing them; e.g. see this one:

@eliasbalasis
Author

Thanks @laeubi

I will give Eclipse 2023-12 with m2e 2.6.0 a try

@eliasbalasis
Author

eliasbalasis commented Feb 23, 2024

On Eclipse 2023-12 with m2e 2.6.0, when importing a descendant Maven project in a hierarchy of Maven projects, I notice the "Import Maven Projects" task reporting the import of ancestor modules that are already imported into the workspace. This seems unnecessary to me and sounds like extra permutations.

Does this give you any hints or clues?

@laeubi
Member

laeubi commented Feb 23, 2024

> Does this give you any hints or clues?

If you can construct an example of unnecessary imports, that would be good. In general it's sadly hard to decide, because a project might be imported standalone, or with its parent, then one can have aggregator poms that aggregate projects with different parents and so on...

@eliasbalasis
Author

eliasbalasis commented Feb 23, 2024

See the execution snapshot snapshot-1.zip, captured with VisualVM and compressed, during which I kept observing the extra permutations.
Perhaps this will be more helpful.
I am noticing a very deep chain of method calls on the "Import Maven Projects" thread, with many references to initParent().
image
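
For anyone who wants to quantify this without opening the snapshot, the following sketch counts stack depth and repeated methods for one thread. It assumes a plain-text thread dump in the usual `jstack` layout (a quoted thread name on the header line, then `at ...` frame lines), not the VisualVM snapshot file itself; `summarize_thread` is a hypothetical helper name:

```python
import re
from collections import Counter

# A frame line looks like: "\tat some.pkg.Class.method(File.java:123)"
FRAME = re.compile(r"^\s+at\s+([^\s(]+)\(")


def summarize_thread(dump_text, thread_name_part):
    """Return (stack depth, Counter of methods) for the first matching thread."""
    depth = 0
    methods = Counter()
    in_thread = False
    for line in dump_text.splitlines():
        if line.startswith('"'):  # thread header line
            if in_thread:
                break  # next thread begins, so the target stack is complete
            in_thread = thread_name_part in line
        elif in_thread:
            match = FRAME.match(line)
            if match:
                depth += 1
                methods[match.group(1)] += 1
    return depth, methods
```

With a dump saved to a file, `summarize_thread(text, "Import Maven Projects")` returns the frame count and a counter whose `most_common(5)` shows which methods repeat on the stack. Comparing these numbers between an import that feels fast and one that feels slow would turn the observation into something measurable.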

I intend to provide a Maven hierarchy similar to ours, in order to demonstrate the unnecessary imports or other undesirable behaviours, but it will take time that I don't have at the moment.

> a project might be imported standalone, or with its parent, then one can have aggregator poms that aggregate projects with different parents and so on...

This sounds a lot like our multi-ancestor layered structure. See the following repositories, imported in the order listed:
https://github.com/eliasbalasis/eclipse-lemminx-maven-issue-345-root
https://github.com/eliasbalasis/eclipse-lemminx-maven-issue-345-tools
https://github.com/eliasbalasis/eclipse-lemminx-maven-issue-345-project
https://github.com/eliasbalasis/eclipse-lemminx-maven-issue-345-impl

Our actual structure has a much larger number of modules and a couple more layers, but the driving thought process is the same: a chain of ancestor modules forming a baseline of inherited dependencies and build steps for concrete solutions of different types, all part of the same organization's practices.
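
As a way to describe such a structure without publishing it, the sketch below measures how long each module's parent chain is. It is a simplification under stated assumptions: modules are identified by `artifactId` only (groupId and version are ignored), every `pom.xml` uses the standard POM namespace, and all function names are hypothetical:

```python
import xml.etree.ElementTree as ET
from pathlib import Path

NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def read_ids(pom_path):
    """Return (own artifactId, parent artifactId or None) for one pom.xml."""
    project = ET.parse(str(pom_path)).getroot()
    own = project.findtext("m:artifactId", namespaces=NS)
    parent = project.findtext("m:parent/m:artifactId", namespaces=NS)
    return own, parent


def ancestor_depths(workspace):
    """Map each artifactId under `workspace` to the length of its parent chain."""
    parents = dict(read_ids(p) for p in Path(workspace).rglob("pom.xml"))

    def depth(artifact, seen=()):
        parent = parents.get(artifact)
        seen = seen + (artifact,)
        if parent is None or parent in seen:  # no parent declared, or a cycle
            return 0
        # A parent outside the workspace still counts as one level.
        return 1 + depth(parent, seen)

    return {artifact: depth(artifact) for artifact in parents}
```

Running `ancestor_depths` over the checkout and reporting only the resulting numbers (for example the maximum depth and how many modules sit at each depth) would convey the shape of the hierarchy without revealing any module names.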

Would this help?

@laeubi
I have also established, unfortunately, that we can't offer sponsorship for public projects.
Therefore I am afraid the sponsorship or NDA paths cannot be explored.

Either way,
I can confirm that unnecessary permutations seem to be performed while making POM changes or when importing Maven projects into an Eclipse workspace, with unrelated modules displayed in the "Progress" view. This delays the overall process, is rather suspicious, and feels like a repetition of older behaviours which I thought were fixed.
Unfortunately, this is not much but it is all I have confidently witnessed.
I will keep observing for hints and clues.
