Removes limitations from Dask dependencies #22017
Dask dependencies were holding us back when it comes to upgrading some of the packages (for example apache-beam and looker in the google provider). This PR removes the limitations, but with a twist:
* Dask tests stop working. Since a very old version of the `distributed` library was used, we are reaching out to the Dask team for help in fixing them.
* The typing-extensions library was limited by `distributed`, but it seems that version 4.0.0+ breaks kubernetes tests.
cc: @alekseiloginov - related to #20882
DEFAULT_DATE = timezone.datetime(2017, 1, 1)
SUCCESS_COMMAND = ['airflow', 'tasks', 'run', '--help']
FAIL_COMMAND = ['airflow', 'tasks', 'run', 'false']

# For now we are temporarily removing Dask support until we get Dask Team help us in making the
# tests pass again
skip_dask_tests = True
@potiuk, out of curiosity, is this strictly maintained by the Dask team, or is it something I can take a look at? I was also looking at options for connecting to a local Dask cluster along with distributed.
It is not. I started a discussion on that here: https://lists.apache.org/thread/6stgcpjt5jb3xfw92oo1j486j33c8v7m
This is the second time I have started a similar discussion. The previous time was in Jan 2020 (https://lists.apache.org/thread/875fpgb7vfpmtxrmt19jmo8d3p6mgqnh), and then the Dask team chimed in and helped fix the tests.
But more than 1 year later we have a similar problem.
I also asked on Dask's Discourse whether they can help again: https://dask.discourse.group/t/potential-removal-of-dask-executor-support-in-airflow/433
All looks good now with those changes. I propose to merge it so that we are unblocked while we discuss next steps.
@potiuk isn't it possible (and easier) to skip these tests "from the outside" – ? I would think
How would you imagine "external" exclusion :)? I would like to avoid requiring anyone to add extra pytest arguments or command line switches to skip those. The idea is that whoever runs "all tests" for Airflow can see the "blessed tests" succeed without having to specify any exclusions. In our case, if you want to make sure that all tests are succeeding you do this:
Those three commands should lead to successful test execution (in the 3.7 + mysql combination). Having to add extra exclusion rules is bad, especially since we want Dask team members to help with fixing the tests. What we ask of them is to eventually remove the skipIf (initially by setting `skip_dask_tests` to False and running the failing tests). They do not have to learn anything about external scripts or "exclusions" we do outside of our Python code. Their fix will be limited to the Dask tests only; they don't have to understand what our CI does or how we exclude things externally. I think in this case a skipIf "close" to the tests being disabled is exactly what we need.
@pytest.mark.skipif(skip_dask_tests, reason="The tests are skipped because it needs testing from Dask team")
@pytest.mark.skipif(skip_dask_tests, reason="The tests are skipped because it needs testing from Dask team")
@pytest.mark.skip(reason="The tests are skipped because it needs testing from Dask team")
`skipif` is meant for skipping tests dynamically, depending on a value that varies between runs, e.g. the active database backend.
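To illustrate the distinction, here is a minimal sketch. It uses the standard library's `unittest.skip` and `unittest.skipIf` as stand-ins for `pytest.mark.skip` and `pytest.mark.skipif` so that it runs without extra dependencies. The `BACKEND` constant, the class, and the test names are hypothetical and not taken from Airflow.

```python
import unittest

# Hypothetical stand-in for "the active database backend" (an assumption,
# not an Airflow setting). It is a constant so the sketch is deterministic.
BACKEND = "sqlite"


class ExampleTests(unittest.TestCase):
    # Unconditional skip: the test never runs, whatever the environment is.
    @unittest.skip("always skipped")
    def test_unconditional(self):
        self.fail("never runs")

    # Conditional skip: the test only runs when the condition is false.
    @unittest.skipIf(BACKEND != "mysql", "requires the mysql backend")
    def test_mysql_only(self):
        self.assertEqual(BACKEND, "mysql")


def run_example():
    """Run ExampleTests and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run_example()
    # Each entry of result.skipped is a (test, reason) pair.
    print(sorted(reason for _, reason in result.skipped))
```

With `BACKEND = "sqlite"` both tests are skipped; changing the constant to `"mysql"` would let only the conditional one run.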
I wanted to give Dask committers an easy way to "play" with it. It's easier to flip one variable than to manually remove @pytest.mark.skip from every test case.
This is really a "temporary" state that I wanted to make easy for them to work with (same as last time).
I imagine the Dask developers will do it this way:
- set up Breeze
- get a Dask service to run tests on
- set `skip_dask_tests` to False
- fix the tests
- remove the skips
Having a single flag that enables all the tests simply makes it harder to forget to remove some of the skips.
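The single-flag approach can be sketched as follows. This is an illustration only: it uses `unittest.skipIf` from the standard library as a stand-in for `pytest.mark.skipif`, and `run_suite`, the class, and the placeholder test bodies are hypothetical, not the real Dask executor tests.

```python
import unittest


def run_suite(skip_dask_tests):
    """Define three flag-gated tests, run them, and return the TestResult."""

    # One shared condition: flipping the flag affects every gated test at once.
    needs_dask = unittest.skipIf(
        skip_dask_tests,
        "The tests are skipped because it needs testing from Dask team",
    )

    class DaskLikeTests(unittest.TestCase):
        @needs_dask
        def test_success_command(self):
            self.assertTrue(True)  # placeholder body

        @needs_dask
        def test_fail_command(self):
            self.assertTrue(True)  # placeholder body

        @needs_dask
        def test_queues(self):
            self.assertTrue(True)  # placeholder body

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DaskLikeTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    print(len(run_suite(True).skipped))   # 3: every gated test is skipped
    print(len(run_suite(False).skipped))  # 0: every gated test runs
```

Because every test reads the same flag, there is no per-test skip to overlook when re-enabling the suite.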
Merging for now. We can discuss further steps later.