Fix problems with, and test against, Python 3.8 #8674
Comments
I'm currently working on this one 🐍 🤙 |
Additional known issue: |
Oops, accidentally unassigned myself. |
@jhtimmins fixed |
@mik-laj Thanks very much! |
PR #8794 is part of this issue |
Should Airflow support process spawning? Tl;Dr - Spawning creates the possibility of hard-to-detect bugs in child processes. It’s the default on MacOS, the sole option on Windows, and Linux has Overview
Differences - Fork vs Spawn
Issue in 3.8:
Problems with spawn: This class of bug can be mitigated by using environment variables to set config values globally, rather than local to the parent process. Using environment variables to pass state presents two issues:
Because of these issues, we should thoughtfully consider whether to use the system defaults, or to hardcode to only using fork on Mac and Linux (h/t to Daniel for suggesting this discussion). Considerations:
Recommendations:
If stability > multi-system support - If stability is key, then I recommend we hardcode forking on all systems. This will continue to make Airflow incompatible with Windows, and introduces the possibility of memory bugs on Mac. The benefit is that we don't need to store state globally on macOS and we don't need to hunt down pieces of code that are currently failing silently.
If multi-system support > stability - We should continue to support the default method on every system, and address bugs when they arise. We should come up with a standardized process for storing configuration values in either config or environment variables. Right now those methods are closely coupled, increasing the likelihood of errors.
I lean towards multi-system support, with the caveat that this will likely cause hard-to-find bugs. Those bugs will probably be localized to macOS, which should lower the risk for Airflow instances running on production Linux machines. |
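To make the fork-vs-spawn difference above concrete, here is a minimal sketch using only the standard `multiprocessing` module. `SETTINGS`, `DEMO_EXECUTOR`, and the function names are hypothetical stand-ins invented for illustration; they are not Airflow's configuration API.

```python
import multiprocessing as mp
import os

# Hypothetical stand-in for in-process configuration (NOT Airflow's config object).
SETTINGS = {"executor": "default"}


def _report(conn):
    """Runs in the child: send back the values the child can see."""
    conn.send((SETTINGS["executor"], os.environ.get("DEMO_EXECUTOR")))
    conn.close()


def settings_seen_by_child(start_method):
    """Start one child with the given start method and return what it reports."""
    # get_context() selects the start method for this call only,
    # without changing the process-wide default.
    ctx = mp.get_context(start_method)
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=_report, args=(child_conn,))
    proc.start()
    result = parent_conn.recv()
    proc.join()
    return result


def override_in_parent():
    """Change state in two ways: in this process's memory and in the environment."""
    SETTINGS["executor"] = "overridden"         # lives only in the parent's memory
    os.environ["DEMO_EXECUTOR"] = "overridden"  # inherited by children at start
```

After `override_in_parent()`, a child started with `"fork"` gets a copy of the parent's memory, so it should report both values as overridden. A child started with `"spawn"` re-imports the module, so it should see `SETTINGS["executor"]` back at `"default"` while still inheriting the environment variable. To try `"spawn"`, run the calls from a script file under an `if __name__ == "__main__":` guard. Passing `"fork"` to `get_context()` is also one way the "hardcode forking" option could be expressed.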
Great analysis @jhtimmins, thanks! Indeed, we discussed the fork/spawn case in the past, along with the macOS vs. Linux change in 3.8. I think we are not at all concerned with Windows: the only recommended way to run Airflow on Windows is WSL2 anyhow.
Regarding macOS vs. Linux: I think it's great that it works on macOS for development purposes, but for production I believe we only ever support Linux. Also, we are moving more and more towards Dockerised solutions. With the official production image, the coming docker-compose support, and the Helm Chart, we are Docker-first for running Airflow in production.
For development we also have the Breeze environment, which I think might become the default in the long term. While we still support (and will support) running a local virtualenv on macOS, we already have a number of tests that will not work straight from the IDE or when run locally. They need integrations (rabbitmq, Cassandra, databases etc.), and for those Breeze's Docker environment is the one that should be used. We have full macOS support for Breeze as well, so I think making sure tests pass and are fast on macOS natively is not a huge problem. I think our main concern should be that all true "unit" tests work in any environment, while the more complex "integration" tests can only work in Breeze.
I think it really depends on how many tests will not work. I believe this is a very small subset of tests. Those are the important ones, but I believe there are not many of them. If that's the case, then I personally would be for hard-coding forking and marking the tests that are bound to fail on macOS with a custom pytest marker (say @pytest.mark.linux). Those tests will then be automatically skipped when run on macOS, and the message we can put is "These tests only work in Linux environment, please use Breeze docker environment if you are running them on MacOS" or similar. This way we avoid people being surprised by failing tests and give them clear information on how to run them.
At the same time if people develop on Linux, they will still be able to run those tests "natively" without having to switch to breeze environment. |
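One way the Linux-only marker idea could be sketched in a `conftest.py` is shown below. The marker name `linux` and the skip message follow the suggestion above; the rest is an assumption about how it might be wired up with standard pytest hooks, not Airflow's actual test configuration.

```python
# conftest.py (sketch; marker name and message are taken from the suggestion above)
import sys

import pytest

SKIP_REASON = (
    "These tests only work in a Linux environment; please use the Breeze "
    "Docker environment if you are running them on macOS."
)


def pytest_configure(config):
    # Register the custom marker so that --strict-markers accepts it.
    config.addinivalue_line("markers", "linux: test only runs on Linux")


def pytest_collection_modifyitems(config, items):
    if sys.platform.startswith("linux"):
        return  # on Linux, marked tests run natively
    skip_marker = pytest.mark.skip(reason=SKIP_REASON)
    for item in items:
        # Only tests decorated with @pytest.mark.linux are skipped elsewhere.
        if item.get_closest_marker("linux") is not None:
            item.add_marker(skip_marker)
```

A test would then opt in with `@pytest.mark.linux`; on macOS it would be reported as skipped with the message above instead of failing.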
Let's separate out developing Airflow itself and developing DAGs etc. For the former: maybe, but it should never be a requirement on contributors. (And as long as I'm working on Airflow my default's gonna be direct development, not using Docker 😀. I am on Linux now though.) For the latter: being able to use OSX (and yes, eventually Windows) as a user writing DAGs and running a development Airflow instance without Docker is a critical goal of on-boarding for new users because:
We can solve some of that, but I am set on still supporting running directly for users on OSX.
Yes, it is. The only real issue in the spawn-vs-fork comes when we use the And the config mocking issue aside, I think being able to test on OSX/with spawn is useful to getting us towards testing it runs on OSX/Windows. |
Hi Airflow Team, I tried to run Airflow 1.10.10 with Python 3.8.3 (the same happens with 3.8.0) on macOS Mojave 10.14.6 as a daemon process, and it does not work this way:
probably, it can be related to https://bugs.python.org/issue39685 |
Fixed in 1.10.11 |
Airflow is currently tested against Python 3.5 only. This may be insufficient, as users may wish to run Airflow on newer versions, which may introduce breaking changes.
This is a bit of an "epic" issue to track the currently known Python 3.8 bugs.
Migrated from AIRFLOW-4762.