Question: dependency cache corruption / Is it supported to concurrently compile/sync separate projects by the same user? #1083

Closed
AndydeCleyre opened this issue Mar 2, 2020 · 2 comments
Labels
question User question

Comments

@AndydeCleyre
Contributor

I have some scripts that concurrently run pip-compile in separate projects.

Sometimes the overall process breaks:

The dependency cache seems to have been corrupted.
Inspect, or delete, the following file:
  /home/andy/.cache/pip-tools/depcache-cp3.8.json

I believe that the issue is concurrent pip-compile commands writing to that json file.

Environment Versions

  1. Arch Linux
  2. Python version: 3.8.1
  3. pip version: 20.0.2
  4. pip-tools version: 4.5.1

Steps to replicate

zsh:

autoload -Uz zargs

for req in requests httpx remarshal black ruamel.yaml.cmd flake8 ward howdoi; do
    mkdir -p projects/$req
    print $req > projects/$req/requirements.in
done

compile_in_subshell () {
    # first and only argument is <proj-dir>
    (
        set -e
        cd $1
        [[ -d venv ]] || python3 -m venv venv
        . ./venv/bin/activate
        python -m pip install -qU pip pip-tools
        pip-compile -U 2>&1
    )
}

rm -f ~/.cache/pip-tools/depcache-cp3.8.json

for i in {1..10}; do
    zargs -P 8 -l -- projects/* -- compile_in_subshell | grep -C 2 corrupted
done

Expected result

# <crickets>

Actual result

The dependency cache seems to have been corrupted.
Inspect, or delete, the following file:
  /home/andy/.cache/pip-tools/depcache-cp3.8.json

Questions:

  • Would it be more correct to pass a different --cache-dir for each project when doing this?
  • How inefficient would that be?
  • Is there any interest in making this cache file more concurrency-friendly, possibly by using something a little more db-like than plain json?

Thanks for any insight!

@atugushev atugushev added the question User question label Mar 5, 2020
@atugushev
Member

Hello @AndydeCleyre,

Thanks for the interesting question! The --cache-dir option exists precisely so that pip-compile can be run in parallel. See also #1022 (review).

  • Would it be more correct to pass a different --cache-dir for each project when doing this?

Yes, that's the purpose of the --cache-dir.

  • How inefficient would that be?

More time would be spent on a "cold start" of each project, but that's a trade-off.

  • Is there any interest in making this cache file more concurrency-friendly, possibly by using something a little more db-like than plain json?

The current format is perfectly fine for me and I'd prefer to keep things simple. But I would like to hear any other opinions.
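
For readers who want to apply the --cache-dir advice to the reproduction script, here is a minimal sketch rather than an official recipe. The function names and the ".pip-tools-cache" directory name are assumptions made up for illustration; only the --cache-dir option itself comes from pip-tools.

```shell
# Hypothetical helper: print an absolute, project-private cache path
# for <proj-dir>. The ".pip-tools-cache" name is an assumption.
project_cache_dir() {
    printf '%s/.pip-tools-cache\n' "$(cd "$1" && pwd -P)"
}

# Variant of compile_in_subshell from the reproduction steps: each run
# writes its depcache under its own project, so parallel runs do not
# share one JSON file.
compile_with_own_cache() {
    # first and only argument is <proj-dir>
    (
        set -e
        cache_dir=$(project_cache_dir "$1")
        cd "$1"
        [ -d venv ] || python3 -m venv venv
        . ./venv/bin/activate
        python -m pip install -qU pip pip-tools
        pip-compile -U --cache-dir "$cache_dir" 2>&1
    )
}
```

In the zsh reproduction, swapping compile_in_subshell for compile_with_own_cache in the zargs call should be the only other change needed. The cost is the cold start mentioned above: each project fills its own cache from scratch.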

@AndydeCleyre
Contributor Author

AndydeCleyre commented Mar 5, 2020

Thanks very much! For anyone interested, here's how I'm handling this for now:

I'm using a dedicated cache dir for each output file, keyed by its absolute path; more precisely, for each output-file + venv pairing. While unlikely, different pip-compile processes could be operating on the very same input file at the same time with different options, but concurrently generating the same output file would be past sanity.

Something like this in my pip-compile wrapper (zsh):

# set variable 'reqsin' to relative or absolute path of input file
# set variable 'reqstxt' to relative or absolute path of output file
local reqstxt_hash="${$(md5sum =(<<<${reqstxt:P}))%% *}"
PIP_TOOLS_CACHE_DIR=${VIRTUAL_ENV:-$(mktemp -d)}/pip-tools-caches/${reqstxt_hash} \
pip-compile --no-header --build-isolation -o $reqstxt $@ $reqsin
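
The wrapper above relies on zsh-only syntax (the :P modifier, =(...) and nested expansion). For bash or plain sh users, a rough equivalent might look like the sketch below. The function names are hypothetical, it assumes md5sum is available and that the output file's directory already exists, and it has not been checked against every pip-tools version.

```shell
# Hypothetical helper: map an output file to a cache directory named
# after the md5 of its absolute path. The body runs in a subshell so
# its variables stay local.
cache_dir_for_output() (
    dir=$(cd "$(dirname "$1")" && pwd -P) || exit 1
    abs="$dir/$(basename "$1")"
    # md5sum prints "<hash>  -"; keep only the hash
    hash=$(printf '%s' "$abs" | md5sum | cut -d' ' -f1)
    # Like the zsh original, fall back to a fresh temp dir outside a venv
    printf '%s/pip-tools-caches/%s\n' "${VIRTUAL_ENV:-$(mktemp -d)}" "$hash"
)

# usage: pip_compile_isolated <reqs.in> <reqs.txt> [extra pip-compile args...]
pip_compile_isolated() (
    reqsin=$1
    reqstxt=$2
    shift 2
    PIP_TOOLS_CACHE_DIR=$(cache_dir_for_output "$reqstxt") \
        pip-compile --no-header --build-isolation -o "$reqstxt" "$@" "$reqsin"
)
```

Because the hash is taken over the resolved absolute path, relative and absolute spellings of the same output file land in the same cache directory. Outside a virtualenv the mktemp fallback yields a new, empty cache on every call, so every run is a cold start.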
