Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use correct Poetry config when collecting Poetry projects #447

Merged
merged 10 commits into from
Jan 3, 2023

Conversation

oranav
Copy link
Contributor

@oranav oranav commented Jun 27, 2022

Description:
When collecting Poetry projects for caching, a **/poetry.lock glob is used. However, in order to process the Poetry configuration, the "poetry" command is run from the repo's root directory; this causes Poetry to return an invalid configuration when there is a Poetry project inside an inner directory.

Instead of running a single Poetry command, glob for the same pattern, and configure Poetry for every discovered project.

Related issue:
#446

Check list:

  • Mark if documentation changes are required.
  • Mark if tests were added or updated to cover the changes.

@oranav
Copy link
Contributor Author

oranav commented Jun 27, 2022

I still need to test this locally, hence this is a draft.

@oranav oranav force-pushed the relative-poetry-config branch 2 times, most recently from d3dceab to 5c369bd Compare October 8, 2022 10:33
@oranav oranav marked this pull request as ready for review October 8, 2022 10:41
@oranav oranav requested a review from a team October 8, 2022 10:41
@oranav
Copy link
Contributor Author

oranav commented Oct 8, 2022

@IvanZosimov Sorry for the long delay! This is finally ready for review. All tests are passing successfully, and I'm using my own branch as an action in another project.

@oranav
Copy link
Contributor Author

oranav commented Oct 10, 2022

I fixed the failing main workflow on Windows. Good catch.

That being said, I don't understand why e2e-cache fails with Pipenv and Windows. I haven't changed anything related to it. Can it be failing randomly?

@IvanZosimov
Copy link
Contributor

I fixed the failing main workflow on Windows. Good catch.

That being said, I don't understand why e2e-cache fails with Pipenv and Windows. I haven't changed anything related to it. Can it be failing randomly?

Hi, @oranav 👋 Thank you for this PR and sorry for the delay! The problem with e2e test related to pipenv on Windows is most probably caused by recent pipenv minor tag update, I hope that they will repair it. You also have 3 failing unit-tests in Main workflow, could take a look at them?

@oranav
Copy link
Contributor Author

oranav commented Oct 14, 2022

Thanks @IvanZosimov!

It seems that I've hit an unrelated bug that happens sporadically, which causes tests to timeout in some cases. Take a look at this action for example, which has failed for the same reason.

I've fixed it in this commit. In short, the cache distributor tests were looking for dependency file with their default glob (**/requirements.txt for example), which made it search for the file in the project's root instead of in the tests data folder. I've changed that, and it reduced the tests time on my machine from ~7s to less than 3 seconds.

After we run the workflows again I expect the main workflow to pass.

@oranav
Copy link
Contributor Author

oranav commented Oct 27, 2022

Hey @IvanZosimov, can you please take a look?

@dsame dsame self-assigned this Oct 28, 2022
@oranav
Copy link
Contributor Author

oranav commented Nov 1, 2022

I'm not sure why main workflow failed with Python 2.7 (which I haven't changed at all); I've seen that it failed in other runs as well, so I suspect it's unrelated to my PR.
I rebased my commits on top of main. Can we please re-run?

@oranav
Copy link
Contributor Author

oranav commented Nov 7, 2022

@dsame @IvanZosimov Can we please re-run the actions?

@oranav
Copy link
Contributor Author

oranav commented Nov 7, 2022

Sorry again. The problem was indeed unrelated to my PR, and it seems that #535 fixed the problem (for the curious - there is no Python 2.7 for Ubuntu 22.04, and the workflow previously used ubuntu-latest which got upgraded to 22.04). Unfortunately, #535 wasn't included in my rebase; I now rebased on top of it as well. Running again should result in a working workflow. Can we please re-run one more time?

I ran the workflow in my repo just in case, and it passed successfully:
https://github.com/oranav/setup-python/actions/runs/3412737221/jobs/5678633730

@oranav
Copy link
Contributor Author

oranav commented Nov 9, 2022

@dsame @IvanZosimov Can we please re-run the workflow?
Sorry for the back and forth :)

@dsame
Copy link
Contributor

dsame commented Nov 10, 2022

@oranav can you please explain the purpose of running poetry env use after the .venv already has been added to the list of cached directories? Is it just to create .venv directory?

i mean this https://github.com/actions/setup-python/pull/447/files#diff-5a0a81991280dd0bbc75f051187afdb8ca7cb621a0bbc95b0d8d5d9a30f143a1R48

@oranav
Copy link
Contributor Author

oranav commented Nov 10, 2022

@dsame It's actually the same code as before, only in a loop (that iterates over all nested projects).
I believe poetry env use tells Poetry which Python binary to use; since there might not be any cached venv (if it's the first run, or if it has been evicted), then it'll initialize the venv with the correct Python binary.

@MaksimZhukov
Copy link
Contributor

Hello @oranav!
Thank you for the contribution! LGTM!

I think documentation needs to be updated as well. At the moment it says that only virtualenv directory will be cached

@dsame
Copy link
Contributor

dsame commented Nov 22, 2022

Thanks for the PR @oranav. Did you test the compat case -- saving a cache with the existing version of setup-python and restoring with this change?

@oranav please response this comment, we need it to merge the PR

@oranav
Copy link
Contributor Author

oranav commented Nov 22, 2022

@brcrista @dsame Sorry, I'm out of home until Sunday. I did not test the compat case -- good catch! It might be broken since we're saving multiple directories instead of one, but I don't believe it is.

I can test this as soon as I get back on Sunday. Is there an automated test of some kind?

@brcrista
Copy link
Contributor

@oranav no, you'll have to test it manually.

@agucova
Copy link

agucova commented Dec 1, 2022

@oranav heads up that the compat case doesn't seem to work. In my testing, this update triggered Poetry to not be able to find the correct python version and give up (even though poetry env use was run). Clearing the cache fixed it.

@oranav
Copy link
Contributor Author

oranav commented Dec 3, 2022

@brcrista @dsame @agucova I've tested the compat case thoroughly. Here's a summary of the results:

Poetry located at virtualenvs.in-project Pre-PR workflow Post-PR workflow
Root false Pass Pass
Root true Pass Pass
Inner folder false Failed silently, cache saved (wrong Python version used due to wrong poetry env use) Fail with old cache (Failed because it loaded an old, corrupt cache; will pass after invalidating the cache)
Inner folder true Fail Pass

As you can see, the only interesting case is number 3, in which if the previous version of the workflow was run, then a corrupt cache was already saved, and the post-PR version will happily load this cache and will fail with a wrong Python version (because the virtualenv was built incorrectly).

I see two options:

  1. Do nothing. This actually makes sense, since the action did not work properly even before; it seems as though the action "passed", but it actually failed because it has built a virtualenv with the wrong Python version.
  2. Change the Poetry cache primary key such that all previous caches are invalidated, which will make everything pass happily after this PR is merged. Again, this means all Poetry caches will be invalidated.

I'm leaning towards option 1, since option 2 solves a very esoteric case. Thoughts?

@agucova
Copy link

agucova commented Dec 4, 2022

@oranav I was on the “fail silently” option and I think it makes sense to try to repair it a posteriori, especially because that happened when the correct setup was being used.

It's difficult to estimate, but I wouldn't be surprised if it was the case that there were lots of people using the action without the cache working.

But yeah, changing the key doesn't seem ideal. Is it possible to just detect the edge case and warn users accordingly?

@brcrista
Copy link
Contributor

brcrista commented Dec 5, 2022

@oranav thank you for testing that! The two things we want to avoid are, in priority order:

  1. Workflows that were passing to start failing after the change
  2. Workflows that were getting cache hits to start getting cache misses after the change (but still succeeding)

It looks like the interesting case there is

Poetry located at virtualenvs.in-project Pre-PR workflow Post-PR workflow
Inner folder false Failed silently, cache saved (wrong Python version used due to wrong poetry env use) Fail with old cache (Failed because it loaded an old, corrupt cache; will pass after invalidating the cache)

I see in the logs for the Pre-PR workflow:

The currently activated Python version 3.8.10 is not supported by the project (^3.10).
Trying to find and use a compatible version. 
Using python3 (3.10.8)

so before the run is passing but caching is broken, and after the run is failing. Caching should be transparent: if the cache is corrupt, we should ignore the cache, invalidate it, and create a new one.

How much work would it be to handle that case? Since this should only happen on upgrade, we can handle it with a new major version release of the action. But that would slow down adoption of this feature, so it's better to handle it if we can.

When collecting Poetry projects for caching, a '**/poetry.lock' glob is
used.  However, in order to process the Poetry configuration, the
"poetry" command is run from the repo's root directory; this causes
Poetry to return an invalid configuration when there is a Poetry project
inside an inner directory.

Instead of running a single Poetry command, glob for the same pattern,
and run a Poetry command for every discovered project.
When the default dependency path is used for cache distributors, they
are looking for the dependency file in the project's root (including the
source code), which leads to tests taking a significant amount of time,
especially on Windows runners.  We thus hit sporadic test failures.

Change the test cases such that dependency files are always searched for
inside of `__tests__/data`, ignoring the rest of the project.
@oranav oranav force-pushed the relative-poetry-config branch 2 times, most recently from a957cf8 to 58362bf Compare December 9, 2022 15:56
The virtualenv cache might contain invalid entries, such as virtualenvs
built in previous, buggy versions of this action.  The `poetry env use`
command will recreate virtualenvs in case they are invalid, but it has
to be run only *after* the cache is loaded.

Refactor `CacheDistributor` a bit such that the validation (and possible
recreation) of virtualenvs happens only after the cache is loaded.
@oranav oranav force-pushed the relative-poetry-config branch 2 times, most recently from 92163f3 to 4b197ad Compare December 24, 2022 14:44
@oranav
Copy link
Contributor Author

oranav commented Dec 24, 2022

Woah, I finally triaged the issue. The old action never ran poetry env use correctly, so the envs.toml file was never created; this in turn made Poetry ignore the venv in case it is broken.
With my fix applied, poetry env use is run correctly, therefore envs.toml is created, but the virtualenv directory already exists (from the cache), so it forces Poetry to use the broken venv.

Either way, we'll want to cache to be rewritten after this PR. I suggest that we'll just bump the Poetry cache distributor primary key, so users will receive a cache miss upon the first run of the new action, and from there everything will work as expected.

I've pushed my changes. Can we please re-run the workflows?

@oranav
Copy link
Contributor Author

oranav commented Dec 26, 2022

Just to be clear: I've re-run the failing case (before the change, after the change, then once again) and it looks okay now.

@oranav
Copy link
Contributor Author

oranav commented Dec 28, 2022

Anything still missing from my side or can we proceed to merging this? 😊

@oranav
Copy link
Contributor Author

oranav commented Jan 2, 2023

@brcrista @dsame @agucova Forgot to tag you in the previous comments, trying again in case there were no notifications. :)

Copy link
Contributor

@brcrista brcrista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @oranav ! I think that's a good fix for the compat issue. Thanks for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants