Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support Python 3.10 #3140

Merged
merged 11 commits into from
Dec 29, 2022
Merged

support Python 3.10 #3140

merged 11 commits into from
Dec 29, 2022

Conversation

hirosassa
Copy link
Contributor

Description

This PR adds Python 3.10 CI workflow to support Python 3.10.

Motivation and Context

To follow latest Python version.

Have you tested this? If so, how?

Ran tox on local Mac and Docker (ubuntu 20.04)

@hirosassa hirosassa requested review from dlstadther and a team as code owners January 21, 2022 22:13
@dlstadther
Copy link
Collaborator

@hirosassa what do we need to solve these 3.10 errors? There have been a couple 3.10 related PRs that i've merged in the past few days (or so) - a deprecation warning and the tenacity version upgrade. Are there additional changes that we need to make before updating this branch and retesting?

So sorry for my infrequent review and feedback here

@hirosassa
Copy link
Contributor Author

@dlstadther Thanks for your comment! I would like to discuss about the CI failure.
The CI error on Python 3.10 occurred only on GitHub Actions (not reproduced on local Docker and Mac).

I found the problem comes from the code below:
https://github.com/spotify/luigi/blob/master/luigi/lock.py#L65-L66
actual fh contains nothing, it is expected that it contains current executed command like echo hello.

I tried adding retry using tenacity like here, and CI sometimes pass but it also fails "sometime".
https://github.com/hirosassa/luigi/pull/1/files#diff-e126af1be65ab218abe136e1248d7ec7fca5e00f0c3322d3c1eef50ff4389b20R82-R85

Do you have any ideas?

@rjcortese
Copy link
Contributor

It could be possible that the GitHub Actions CI runners are using different configuration of procfs, such that the read of /proc/pid/cmdline seems empty (there is already a comment related to that in the code). Might be that reading empty file does not raise IOError. Do other python versions work on GitHub Actions CI runners?

Another reason /proc/pid/cmdline might be empty is if it is a zombie process: See the /proc/[pid]/cmdline section of https://man7.org/linux/man-pages/man5/proc.5.html.

Did Python 10 change anything about way open() or fh.read() works?

@rschmidtner
Copy link

Are there any plans to make luigi Python 3.10 compatible in the future?

@hirosassa
Copy link
Contributor Author

@rschmidtner In my environment (production, too), Luigi is working well with Python 3.10.
But, in this PR, some tests are failed.

@ravwojdyla
Copy link

ravwojdyla commented Dec 12, 2022

@hirosassa thanks for starting this PR! Regarding the cmdline issue from #3140 (comment), have you tried to add a short sleep between these two lines (e.g. time.sleep(0.42)):

luigi/test/lock_test.py

Lines 39 to 40 in 38b0c2b

external_process = subprocess.Popen(command)
result = luigi.lock.getpcmd(external_process.pid)

EDIT: I'm not saying that is a fix, but it might confirm a problem.

@hirosassa
Copy link
Contributor Author

@ravwojdyla Thanks for your comment! I added sleep at c5ff943 but it failed.

@ravwojdyla
Copy link

ravwojdyla commented Dec 12, 2022

I added sleep at c5ff943 but it failed.

test_getpcmd now did NOT fail, test_get_info failed, which is the other test that failed in the past, probably the same reason. Could you please add the sleep in the other test as well, here:

luigi/test/lock_test.py

Lines 61 to 62 in 38b0c2b

p = subprocess.Popen(["yes", u"à我ф"], stdout=subprocess.PIPE)
pid, cmd, pid_file = luigi.lock.get_info(self.pid_dir, p.pid)

@hirosassa
Copy link
Contributor Author

@ravwojdyla Wow! Thank you. Finally all the tests are succeed.

@ravwojdyla
Copy link

ravwojdyla commented Dec 12, 2022

Finally all the tests are succeed.

@hirosassa nice, good job! time.sleep(0.42) was somewhat arbitrary and very likely an overkill, some milliseconds would probably be sufficient. If you want to, in the comment you can mention that this sleep is "necessary" to make sure the test can read populated /proc/*/cmdline on linux.

EDIT: there might be a more idiomatic way to handle this in a test than sleep, maybe something could be synchronized etc, but I will leave that up the you/reviewers to decide.

@lallea
Copy link
Contributor

lallea commented Dec 12, 2022

The sleep is fragile. On a bad day, it won't be enough and a longer sleep slows down tests. I suggest moving the getpcmd call to a separate function and wrapping it with tenacity.retry with a timeout of tens of seconds.

@hirosassa
Copy link
Contributor Author

@lallea @ravwojdyla Thanks for your comment. I agree with your opinion. I'll try it.

@ravwojdyla
Copy link

I suggest moving the getpcmd call to a separate function and wrapping it with tenacity.retry with a timeout of tens of seconds.

@lallea that sounds good. afaiu you are suggesting to retry until you can get the cmd line from /proc/*/cmdline, just want to point out that there can be multiple reasons why cmdline file exists BUT it's empty. At least two reasons come to my mind: it can be because the cmdline has not be written yet (might be buffered, in process etc), or the process is a zombie. If you add retry with "tens of seconds", in the case of the zombie process (which I do not know how often that will be the case), you might actually wait for those "tens of seconds", and that would be in the "prod code", not test. So there's a potential downside to that solution.

@lallea
Copy link
Contributor

lallea commented Dec 13, 2022

I suggest moving the getpcmd call to a separate function and wrapping it with tenacity.retry with a timeout of tens of seconds.

@lallea that sounds good. afaiu you are suggesting to retry until you can get the cmd line from /proc/*/cmdline, just want to point out that there can be multiple reasons why cmdline file exists BUT it's empty. At least two reasons come to my mind: it can be because the cmdline has not be written yet (might be buffered, in process etc), or the process is a zombie. If you add retry with "tens of seconds", in the case of the zombie process (which I do not know how often that will be the case), you might actually wait for those "tens of seconds", and that would be in the "prod code", not test. So there's a potential downside to that solution.

I'm suggesting breaking out the call to a new function in lock_test.py and put the retry there, not in production code. I agree that it would be appropriate in prod code.

@hirosassa
Copy link
Contributor Author

@lallea @ravwojdyla I added retry on getpcmd and get_info in test code. Please check.

ravwojdyla
ravwojdyla previously approved these changes Dec 27, 2022
test/lock_test.py Outdated Show resolved Hide resolved
@hirosassa
Copy link
Contributor Author

@dlstadther Hi Dillon! Could you review this PR?

@dlstadther dlstadther merged commit 3217e67 into spotify:master Dec 29, 2022
@hirosassa hirosassa deleted the python-310 branch December 29, 2022 01:00
@hirosassa
Copy link
Contributor Author

hirosassa commented Dec 29, 2022

@dlstadther Thank you for your review and merge.
Could you release newer version on pypi? (there's many new feature released)

@dlstadther
Copy link
Collaborator

Apologies @hirosassa ; I have been a bit consumed as of late. @honnix , could Spotify devs prep and release a new Luigi version?

@honnix
Copy link
Member

honnix commented Jan 17, 2023

@dlstadther We will take care of it. Looking at #3220, it seems we should get that in before dropping a new release.

@dlstadther
Copy link
Collaborator

@dlstadther We will take care of it. Looking at #3220, it seems we should get that in before dropping a new release.

Thanks @honnix !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants