
fix return-before-assignment bug in notebook test script, add pre-commit, drop cuspatial notebooks #690

Merged: 2 commits into rapidsai:branch-24.08 on Jul 18, 2024

Conversation

@jameslamb (Member) commented Jul 17, 2024

CI has been failing here for a few weeks, as a result of some failing cuspatial notebooks.

As described in rapidsai/cuspatial#1406, I observed those notebooks failing like this:

cannot access local variable 'execution_time' where it is not associated with a value

It turns out that error was coming from the test_notebooks.py script in this repo, which calls return execution_time unconditionally even though it's possible for that variable to never have been assigned.

This fixes that, and proposes adding a pre-commit configuration and corresponding CI job, to help catch such things earlier in the future.
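For illustration, here is a minimal sketch of the pattern involved. The run_notebook functions and the nbconvert invocation are placeholders rather than the actual test_notebooks.py code; only the execution_time variable and the error message come from the real script.

import subprocess
import time

def run_notebook_buggy(path: str):
    # Buggy pattern (sketch): execution_time is assigned only on the success
    # path, but returned unconditionally, so a failed notebook raises
    # "cannot access local variable 'execution_time' where it is not
    # associated with a value".
    start = time.time()
    result = subprocess.run(["jupyter", "nbconvert", "--execute", "--to", "notebook", path])
    if result.returncode == 0:
        execution_time = time.time() - start
    return execution_time

def run_notebook_fixed(path: str):
    # Fixed pattern (sketch): give the variable a value on every code path
    # before it is returned.
    execution_time = None
    start = time.time()
    result = subprocess.run(["jupyter", "nbconvert", "--execute", "--to", "notebook", path])
    if result.returncode == 0:
        execution_time = time.time() - start
    return execution_time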

This also proposes skipping some cuspatial notebooks, to get CI working. I'd hoped to address that in rapidsai/cuspatial#1407, but most cuspatial maintainers are busy or unavailable this week. I think we should get CI working here again and deal with the cuspatial notebooks as a separate concern.

Notes for Reviewers

Any other changes you see in files here are the result of running pre-commit run --all-files (with some manual fixes applied to fix ruff warnings).

@jameslamb changed the title from "fix return-before-assignment bug in notebook test script, add pre-commit" to "fix return-before-assignment bug in notebook test script, add pre-commit, drop cuspatial notebooks" on Jul 18, 2024
@jakirkham (Member):
One of the jobs had startup issues. Restarted

@jameslamb (Member, Author):

blegh I see what happened.

One of the test jobs is failing like this:

Error response from daemon: manifest for rapidsai/staging:docker-notebooks-690-24.08a-cuda12.0-py3.10-amd64 not found: manifest unknown: manifest unknown

(build link)

I'm thinking that maybe the corresponding delete-temp-images job already ran.

[screenshot of the failing workflow run; source: https://github.com/rapidsai/docker/actions/runs/9993739485?pr=690]

Looks like #574 tried to fix that, but I suspect the if: always() results in the deletion running even if the tests failed.

delete-temp-images:
  if: always()
  needs: [compute-matrix, build, test, build-multiarch-manifest]

I think it'll take a full re-run to get CI passing here. I just kicked one off here.

@ajschmidt8 (Member) left a comment:
thanks James

@jakirkham (Member):

Thanks James! 🙏

Regarding CI, maybe we should write this up as an issue and see if there is a friendlier way to handle these one-off failures. I remember seeing this behavior before.

@jakirkham (Member):

/merge

@rapids-bot (bot) merged commit ddb36fd into rapidsai:branch-24.08 on Jul 18, 2024, with 34 of 35 checks passed.
@jakirkham (Member) commented Jul 18, 2024

Oops thought it would wait for CI 😳

At least we know nearly all of the checks passed, and the one that failed was most likely a fluke.

Will keep an eye on CI on branch-24.08

Edit: After discussing offline with Ray, it sounds like configuring pr-builder would fix the merging behavior. Raised issue: #691

@jameslamb deleted the fix/parsing branch on July 18, 2024 at 19:06
@jameslamb (Member, Author):

No problem, thanks John.

Just noting that the tests don't run on merges to main.

So we'll have to check the CI of another PR (like #689) to know if this worked.

@jakirkham (Member):

Hmm...good point. Seems like we have a few footguns to work on here 😅

Well, it does look like most of those passed (except CUDA 12.5, where we still need cuSpatial).

Opened this PR (#692) to test CI with no changes.

@jameslamb (Member, Author):

Yep definitely, thanks for helping us uncover them!

Labels: bug (Something isn't working)
4 participants