2.0 regression: large overhead of libsolv's solver_unifyrules when multichannels are used #3393

Closed
Hind-M opened this issue Aug 5, 2024 · 14 comments · Fixed by #3521

@Hind-M
Member

Hind-M commented Aug 5, 2024

From @ndevenish in the QS lobby on gitter:
"
Are there any known issues with current micromamba about resource usage, possibly related to CentOS/RHEL? I've had two separate people come to me this week with issues with:
a) micromamba in a container build dying because it filled their entire temp disk (when installing very few packages).
b) what looked like an OOM kill after it took >60% of their memory. Both tasks have worked before.

The out-of-disk-space instance was running:
micromamba create -y -c conda-forge gnuplot python numpy pymca workflows>=1.7 xraylib zocalo
and it took at least 4 GB of scratch disk space (the smallest of the possible locations that podman was using for container work on their system).

The other instance didn't get past resolving (an admittedly rather large requirement) but was using >9 GB of RAM on a 16 GB machine the last time I checked before it died.
"

@Hind-M
Member Author

Hind-M commented Aug 5, 2024

Used micromamba version: 2.0.0rc0

@jjerphan
Member

jjerphan commented Aug 5, 2024

I cannot reproduce the errors which you report using conda-forge/label/micromamba_dev/linux-64/micromamba-2.0.0{rc0-1,rc1-2}.

On my machine, installing those packages takes around 1.5 GiB of storage in the $CONDA_PREFIX, while using less than 1 GiB of RAM.

@ndevenish: Could you compare the resource usage of your instances when using micromamba<2.0.0rc0 and when using micromamba>=2.0.0rc0?
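
One possible way to collect comparable numbers is sketched below. It is only a minimal illustration, assuming a Unix system: `peak_child_rss` is a hypothetical helper name, not part of micromamba, and it relies on Python's standard `resource` module, which reports `ru_maxrss` in KiB on Linux (bytes on macOS).

```python
import resource
import subprocess
import sys


def peak_child_rss(cmd):
    """Run cmd to completion and return the peak resident set size
    recorded for this process's waited-for children (KiB on Linux)."""
    subprocess.run(cmd, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Stand-in command so the sketch runs anywhere; replace it with the
# micromamba invocation under test, e.g. a "create" command line.
peak = peak_child_rss([sys.executable, "-c", "pass"])
print(f"peak RSS of child processes: {peak}")
```

Because `RUSAGE_CHILDREN` reports the maximum over all children waited for so far, each micromamba version should be measured from a fresh Python process for the two figures to be comparable.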

@ndevenish

When this ticket was made, it had been a while since I had seen it happen to people.

Now that 2.0.0 is out, I am seeing this happen on CI:
[Image: screenshot not preserved]

@ndevenish

On this environment file

@ndevenish

This is exactly 700 GB btw

@ndevenish

ndevenish commented Sep 28, 2024

RHEL8, 16GB memory machine:

curl -JLO https://raw.githubusercontent.com/dials/dials/refs/heads/main/.conda-envs/linux.txt
curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
psrecord --plot out.png "bin/micromamba create -yp ENV/ -c conda-forge --file linux.txt"

[Image: psrecord plot (out.png) for the command above, not preserved]

% bin/micromamba --version
2.0.0

@ndevenish

Possibly because it seems to be in a package-cache-fetching loop?
https://github.com/user-attachments/assets/cf71deec-db90-4735-93b1-b8e6365f2fe7

@jjerphan
Member

jjerphan commented Oct 1, 2024

The repodata.json is reparsed for each package (since conda-forge:: is specified for every one of them), causing major resource usage.

This is a regression of micromamba 2.0.0.

@jjerphan
Member

jjerphan commented Oct 1, 2024

From bisecting, e874e7e from #2986 is the culprit.

@ndevenish

Ah, excellent detective work. Removing the conda-forge:: prefix sounds like it should give us a way to solve the problem before a more widespread fix. From recollection, we started doing that in order to prevent pulling in from other places, but I think the only way that we generate installations now avoids that completely, so it shouldn't be needed any more.
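
A minimal sketch of that workaround, assuming the environment file lists one spec per line in `channel::spec` form; `strip_channel_prefix` is a hypothetical helper, not something shipped with micromamba or dials:

```python
def strip_channel_prefix(spec: str, channel: str = "conda-forge") -> str:
    """Drop a leading '<channel>::' from one spec line.

    Lines without that prefix (comments, bare specs) pass through
    unchanged apart from surrounding whitespace.
    """
    prefix = f"{channel}::"
    stripped = spec.strip()
    if stripped.startswith(prefix):
        return stripped[len(prefix):]
    return stripped


# Illustrative input only; these are not the contents of linux.txt.
lines = ["conda-forge::numpy", "conda-forge::python>=3.10", "# comment", "scipy"]
cleaned = [strip_channel_prefix(line) for line in lines]
print("\n".join(cleaned))
```

With the prefixes removed, the channel would then be supplied once on the command line with `-c conda-forge`, as in the commands earlier in this thread.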

@jjerphan
Member

jjerphan commented Oct 1, 2024

Yes, we must only parse the subdirectory once.

@jjerphan jjerphan self-assigned this Oct 1, 2024
ndevenish added a commit to cctbx/dxtbx that referenced this issue Oct 2, 2024
ndevenish added a commit to ndevenish/dials-fork that referenced this issue Oct 10, 2024
ndevenish added a commit to dials/dials that referenced this issue Oct 10, 2024
conda-forge:: prefix on package specification was causing redownload and
reparsing for every dependency.

See mamba-org/mamba#3393
@jjerphan
Member

jjerphan:mamba:fix/parsing-subdir is a WIP branch to resolve this issue; it is currently blocked by jbeder/yaml-cpp#1322.

dagewa pushed a commit to dials/dials that referenced this issue Oct 10, 2024
conda-forge:: prefix on package specification was causing redownload and
reparsing for every dependency.

See mamba-org/mamba#3393
@jjerphan
Member

Actually, the channel duplication is not the only cause: even after correcting it, most of the runtime is spent in a costly quicksort in libsolv's solver_unifyrules.

Using samply:

samply record $HOME/dev/mamba/build/micromamba/micromamba create -yp /tmp/5ENV/ -c conda-forge --file /tmp/linux.txt

With the conda-forge:: prefix:

[Image: samply profile with the conda-forge:: prefix, not preserved]

Without the conda-forge:: prefix:

[Image: samply profile without the conda-forge:: prefix, not preserved]

I guess this might be due to the comparison function used for package solvables when the resolution is run.
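
For readers unfamiliar with that step, the sketch below shows a generic sort-then-deduplicate pass over rules. It assumes, as a simplification and not as a description of libsolv's actual code, that unifying rules means sorting them and dropping adjacent duplicates. The rule tuples and the `unify_rules` name are invented for illustration, and whether duplicated input is what inflates the sort in this issue is not established here.

```python
def unify_rules(rules):
    """Sort rules, then keep only the first of each run of equal rules."""
    ordered = sorted(rules)  # cost grows with the number of input rules
    unique = []
    for rule in ordered:
        if not unique or unique[-1] != rule:
            unique.append(rule)
    return unique


base = [(-1, 2), (-2, 3), (-1, 3)]
duplicated = base * 4  # same rules contributed four times

print(len(duplicated), len(unify_rules(duplicated)))
```

In this toy setup the sort has to process four times as many rules while the unified result is the same as for `base`, so any extra cost lands entirely in the sort and its comparison function.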

@jjerphan
Member

Bisecting indicates that the regression was first introduced by e874e7e, the merge commit of #2986.

@jjerphan jjerphan changed the title from "Resource usage with micromamba" to "2.0 regression: large overhead of libsolv's solver_unifyrules when multichannels are used" Oct 17, 2024

3 participants