Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore combinations of overrides whose intersection of markers is empty #4695

Conversation

radoering
Copy link
Member

@radoering radoering commented Nov 1, 2021

Pull Request Check List

Resolves: #3367
Resolves: #4670
Resolves: #4870
Resolves: #4947

  • Added tests for changed code.
  • Updated documentation for changed code.

Currently, poetry tries all combinations for multiple-constraints-dependencies, even if markers exclude each other, which means that the concerning combination is not relevant. As a consequence, dependency resolution may fail even if there is a solution as in #3367 or dependency resolution will take unnecessarily long as in #4670.

Instead of considering all combinations, only combinations whose intersection of markers is not empty should be considered in the overrides.

In the following example, it is clear that only two of the four combinations are relevant (one with both dependencies for 3.6 and one with both dependencies for 3.7). Mixed combinations are not sensible.

lib1= [
    {version = "1.0", python = "~3.6"},
    {version = "2.0", python = "~3.7"}
]
lib2 = [
    {version = "1.0", python = "~3.6"},
    {version = "2.0", python = "~3.7"},
]

However, there is one special case that has to be considered. Let's extend the previous example as follows:

lib1= [
    {version = "1.0", python = "~3.6"},
    {version = "2.0", python = "~3.7"}
]
lib2 = [
    {version = "1.0", python = "~3.6"},
    {version = "2.0", python = "~3.7"},
    {version = "3.0", python = "~3.8"}
]

Now, only considering the combinations whose intersection of markers is not empty is not enough because the constraint for lib2 for python 3.8 would be ignored. Thus, when determining the overrides, a dummy override with the inversion of the union of all markers of a dependency has to be added if there is no any marker dependency. In the example above, an empty constraint for lib1 with markers not (python = "~3.6" or python = "~3.7") (inversion of union of all markers of lib1) has to be added to the overrides. Thus, a solution for lib2 for python 3.8 can be found, not considering lib1 at all.

I am aware that this solution heavily relies on poetry's ability to recognize if the intersection of markers is empty and that poetry probably will fail in this task for more complex markers. However, I think that even in that case the drawback of this solution are only some additional (unnecessary but harmless) dependency resolutions (for the dummy overrides).

PR python-poetry/poetry-core#226 improves poetry's ability to recognize if the intersection of markers is empty.

@radoering radoering force-pushed the ignore-overrides-with-empty-marker-intersection branch 3 times, most recently from f03f330 to a8b5aa7 Compare November 21, 2021 08:49
@radoering radoering force-pushed the ignore-overrides-with-empty-marker-intersection branch from a8b5aa7 to 0ec6efa Compare November 28, 2021 16:45
@sonarcloud
Copy link

sonarcloud bot commented Nov 28, 2021

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
2.1% 2.1% Duplication

@maciejskorski
Copy link

I am watching this hopping poetry gets improved on markers!

However, let me share my concern regarding this specific PR.

On the file (real use-case)

[tool.poetry]
name = "test"
version = "0.1.0"
description = "demonstrates that poetry triggers overrides to easily (can be early pruned?)"
authors = ["Maciej Skorski <[email protected]>"]

[tool.poetry.dependencies]
python = "~3.8"
azureml-dataset-runtime = "1.*"

and

poetry install --dry-run -vv

I observe much more overrides and longer install time than in the release 1.2.0a2

@radoering
Copy link
Member Author

@maciejskorski

In order to be able to ignore irrelevant combinations of overrides, additional overrides for implicit "package not explicitly required" dependencies have to be added first. Otherwise, dependency resolution will be erroneous in some cases (see my second example in #4695 (comment)). Thus, the number of overrides can swing in both directions (less overrides if many of them can be ignored or more overrides if little of them can be ignored).
In some cases the number of overrides is reduced (e.g. in #4670 from 510 overrides to 24).
In other cases the number of overrides is increased (in your example from 6 to 12).

This PR heavily relies on the recognition of empty markers. With further improvement of marker intersection/union/inversion, the number of overrides could often be reduced. python-poetry/poetry-core#226 is just a first step in that direction. Nevertheless, I think this PR makes dependency resolution certainly more correct, probably more robust (see linked issues with failing resolution) and hopefully more often faster than slower.

I have analyzed your example. Its overrides originate in the following two multiple constraint dependencies:

  • azureml-dataset-runtime
    numpy (!=1.19.3); sys_platform == "linux"
    numpy (!=1.19.4); sys_platform == "win32"
  • msal-extensions
    portalocker (<2,>=1.0) ; python_version == "2.7" and platform_system != "Windows"
    portalocker (<2,>=1.6) ; python_version == "2.7" and platform_system == "Windows"
    portalocker (<3,>=1.0) ; python_version >= "3.5" and platform_system != "Windows"
    portalocker (<3,>=1.6) ; python_version >= "3.5" and platform_system == "Windows"

Without PR

Poetry is already able to ignore two of the portalocker dependencies because their python versions don't fit the project's requirement. Thus, there are two groups of multiple constraint dependencies each containing two dependencies.
Without the PR, this results in 2 + 2 x 2 == 6 overrides.

With PR

The PR does not help in this case because there is no empty intersection of markers (assuming that the connection between win32 and Windows is not considered by poetry at the moment).

In the first group, there is an implicit dependency that has to be considered: numpy is not explicitly required if sys_platform != "linux" and sys_platform != "win32". Without the PR this implicit dependency is not considered and strictly speaking that is just wrong. (Neglecting that numpy may always be required and the requirements specification of azureml-dataset-runtime might just not consider that there is something else than win32 and linux.)

In the second group, the implicit dependency is: portalocker is not explicitly required if python < "3.5" (Actually, it is (python < "3.5" or platform_system == "Windows") and (python < "3.5" or platform_system != "Windows") because that's the inversion of the union of all markers of the considered dependencies and poetry currently is not able to simplify this.) Thus, instead of two groups with two dependencies, we have two groups with three dependencies resulting in 3 + 3 x 3 == 12 overrides.

If poetry would be able to simplify the marker of the implicit portalocker dependency and build the intersection with the project's python restriction (python = "~3.8"), the marker of the implicit dependency would be empty so that the second group would still have only two dependencies resulting in 3 + 3 x 2 == 9 overrides. Further, if poetry would even be able to recognize that sys_platform == "linux" and platform_system == "Windows" as well as sys_platform == "win32" and platform_system != "Windows" as well as sys_platform != "linux" and sys_platform != "win32" and platform_system == "Windows" are empty markers,
3 combinations of overrides could be ignored resulting in only 6 overrides to consider.

@maciejskorski
Copy link

maciejskorski commented Dec 31, 2021

@radoering

thanks for that detailed clarification. I understand your patch may not help in this case, still increasing the number of costly operations is what worries me. Ideally I don't want the patch to be more expensive, or at least to have that communicated transparently. Per my understanding, you need to extend the logic to handle correctly some boundary cases. Do we really need that? Would they appear at al if the sub-dependencies were specified correctly?

Following up on your thoughts on environmental markets, I have prepared a draft proposal #4956 (tests included) for early filtering packages. Instead of exponentially many combinations arising from splitting logic, we could have match less by early matching to the system/platform markers. Would be great to hear your opinion on this

@finswimmer finswimmer requested a review from a team February 6, 2022 13:08
@radoering radoering force-pushed the ignore-overrides-with-empty-marker-intersection branch from 7e4af53 to 3dac1c3 Compare March 6, 2022 18:48
@radoering radoering added area/solver Related to the dependency resolver kind/enhancement Not a bug or feature, but improves usability or performance labels Mar 23, 2022
@radoering radoering force-pushed the ignore-overrides-with-empty-marker-intersection branch from 3dac1c3 to 2e331b9 Compare March 25, 2022 04:44
KaoruNishikawa added a commit to nanten2/necst-ros that referenced this pull request Mar 28, 2022
This is a workaround regarding Poetry's bug. If there are two identically constrained version specification, Poetry attempts to resolve dependency versions for every constraints combination, even if there's no intersection, which raises SolverProblemError. This behaviour isn't fixed yet but tracked in <python-poetry/poetry#3367> and will be resolved in <python-poetry/poetry#4695>.
@radoering radoering force-pushed the ignore-overrides-with-empty-marker-intersection branch from 2e331b9 to bafff7d Compare April 5, 2022 20:11
@abn abn added this to the 1.2 milestone Apr 17, 2022
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/solver Related to the dependency resolver kind/enhancement Not a bug or feature, but improves usability or performance
Projects
None yet
3 participants