[Testing] tfx sample pipeline broken in release not caught by postsubmit #5178
Notice that the postsubmit test before the release commit passed, yet the released version had a problematic tfx sample pipeline (ignore the failure on the release commit itself; that was expected). Integration test of the passing commit: https://oss-prow.knative.dev/view/gs/oss-prow/logs/kubeflow-pipeline-postsubmit-integration-test/1360157617023881216 What's even stranger, after the fix #5165, postsubmit tests started failing on the parameterized_tfx_sample pipeline.
One problem I found: the integration test in postsubmit incorrectly uses the presubmit test script: https://github.com/GoogleCloudPlatform/oss-test-infra/blob/f29fc29cd617497ea44164ff6a1734c7dee3c0f4/prow/prowjobs/kubeflow/pipelines/kubeflow-pipelines-postsubmits.yaml#L50 EDIT: this is not the root cause of this problem.
I think I found the root cause: an incomplete upgrade of the tfx dependencies. Let me explain the order of events:
Conclusion
The high-priority problems are fixed, and I made the samples test import the same requirements.in as the backend's requirements.in. What's still missing: people may update requirements.in but forget to regenerate all the requirements.txt files. /assign @chensun
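Since a shared requirements.in only helps if every derived requirements.txt is regenerated, one simple guard is a check that every package declared in requirements.in appears in the compiled file. A minimal sketch of such a check (the parsing, function names, and file contents here are illustrative, not KFP's actual tooling):

```python
import re

def parse_packages(text):
    """Extract lowercase package names from requirements-style text."""
    names = set()
    for line in text.splitlines():
        line = line.split("#")[0].strip()     # drop inline comments
        if not line or line.startswith("-"):  # skip options like -r / --hash
            continue
        match = re.match(r"[A-Za-z0-9._-]+", line)
        if match:
            names.add(match.group(0).lower())
    return names

def missing_pins(requirements_in, requirements_txt):
    """Packages declared in requirements.in but absent from requirements.txt."""
    return parse_packages(requirements_in) - parse_packages(requirements_txt)

# Illustrative file contents; a real check would read the files from disk.
req_in = "tfx>=0.26,<0.27\nkfp\n"
req_txt = "tfx==0.26.1\n"
print(sorted(missing_pins(req_in, req_txt)))  # -> ['kfp']
```

A presubmit step that fails when `missing_pins` is non-empty would have surfaced the forgotten regeneration before release rather than after.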
and #5187 is pending review
Aren't we deprecating
I see, thanks for the clarification. Checked again, and it did disable sdk/python. (I was under the wrong impression that sdk/python/requirements.txt was covered, because I saw an ignore list with some component paths, yet sdk is not in that list.)
@chensun Moving some of the discussion from the PR thread back to the main issue. I agree with Chen: testing what users would actually get is an important test. However, we used to do that before introducing requirements.{in,txt}, and the result was that from time to time presubmit broke without any clues and we needed to dig through the dependencies to find out why. I just want to make sure we are not going in circles; we should admit that both approaches have pros and cons. I think the discussion laid out in #4682 is not significantly different from what we have here. Maybe the best approach is to also have a requirements.txt, but set up a bot that updates it periodically via PRs. That way, if the update PR fails, we know users might hit the same problem, but it won't block presubmit tests (or other people not working on this problem).
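The periodic-bot idea boils down to: recompile the pins on a schedule, and open a PR only when the result differs from what is committed. A sketch of that freshness check (the function name and normalization are my own, not an existing KFP script):

```python
def needs_update(committed_txt, freshly_compiled_txt):
    """True when freshly compiled pins differ from the committed requirements.txt.

    Comments and blank lines are ignored so that header churn (e.g. a
    pip-compile timestamp comment) does not trigger spurious update PRs.
    """
    def normalize(text):
        return sorted(
            line.strip()
            for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")
        )
    return normalize(committed_txt) != normalize(freshly_compiled_txt)

# A bot job would run the compiler (e.g. pip-compile), then:
print(needs_update("tfx==0.26.0\n", "# autogenerated\ntfx==0.26.1\n"))  # True
```

Ignoring comment lines is deliberate: a bot that opens a PR for every regenerated header would train reviewers to ignore it.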
If I'm not mistaken, this is usually due to new versions of dependencies being incompatible with other existing dependencies, and that's a sign that we need to fix our dependency specification: many of the dependencies listed in kfp (lines 24 to 47 in 9bc63f5) have no upper version limit.
So I think one action item is to add upper limits, regardless of whether we use requirements.txt in tests. WDYT?
EDIT: created #5258
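The upper-limit action item amounts to giving each loose dependency a ceiling, so a breaking release of a transitive dependency cannot slip in silently. A hypothetical requirements.in fragment (package set and version numbers are illustrative, not KFP's real constraints):

```
# requirements.in -- illustrative bounds only
tfx>=0.26.0,<0.27.0   # ceiling guards against breaking minor releases
kfp>=1.4.0,<2.0.0     # allow patches and minors, block the next major
```

The trade-off discussed above still applies: ceilings prevent surprise breakage but require someone (or the update bot) to raise them deliberately when a new version is vetted.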
TODO: update the TFX upgrade documentation
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
In #5137, the tfx sample pipeline was broken, but this was not caught by the postsubmit test.
Originally posted in #5137 (comment)
I think this is a high-priority issue, because it caused a lot of extra effort after 1.4.0 was released.
/cc @numerology @chensun