Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Resolve errors in e2e tests for cypress in Katib UI #2384

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tariq-hasan
Copy link
Contributor

@tariq-hasan tariq-hasan commented Jul 15, 2024

What this PR does / why we need it: This PR fixed the errors in e2e tests for cypress.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2381

Checklist:

  • Docs included if any changes are user facing

Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign johnugeorge for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tariq-hasan
Copy link
Contributor Author

tariq-hasan commented Jul 15, 2024

This PR relies on a commit in my forked repository: kubeflow/kubeflow@master...tariq-hasan:kubeflow:fix-katib-ui-tests.

The commit needs to be merged into the kubeflow repository and the COMMIT file in this PR needs to be updated with the latest commit tag after the merge.

Also the addition of the process package in components/crud-web-apps/common/frontend/kubeflow-common-lib/package.json and the update to the associated package-lock.json may have repercussions for crud-web-apps. This needs to be assessed further.

@tariq-hasan
Copy link
Contributor Author

tariq-hasan commented Jul 15, 2024

The e2e tests involving Cypress have passed: https://github.com/kubeflow/katib/actions/runs/9935880934?pr=2384.

There is a failed test for Katib UI: https://github.com/kubeflow/katib/actions/runs/9935880964/job/27442941694?pr=2384.

This is coming from the npm ci commands in https://github.com/kubeflow/katib/blob/master/cmd/ui/v1beta1/Dockerfile.

I am investigating the issue and will have it fixed.

@andreyvelich
Copy link
Member

andreyvelich commented Jul 15, 2024

Thank you so much for investigating this and for this fix @tariq-hasan!
Please can you submit PR into kubeflow/kubeflow to add this library so we can take the appropriate commit for Katib repo?

@tariq-hasan
Copy link
Contributor Author

Thank you so much for investigating this and for this fix @tariq-hasan!
Please can you submit PR into kubeflow/kubeflow to add this library so we can take the appropriate commit for Katib repo?

Sounds good. I'll try to fix the failed Katib UI test so that I can identify if any other code changes are required for kubeflow/kubeflow. I am then creating a PR to merge the code changes in kubeflow/kubeflow and will then update this PR for review.

Failed Katib UI test: https://github.com/kubeflow/katib/actions/runs/9935880964/job/27442941694?pr=2384

@tariq-hasan tariq-hasan force-pushed the fix-katib-ui-tests branch 3 times, most recently from d0d4b5f to 734af5d Compare July 16, 2024 04:26
@andreyvelich
Copy link
Member

/rerun-workflow "Publish Katib Core Images"

@andreyvelich
Copy link
Member

/rerun-all

@andreyvelich
Copy link
Member

@tariq-hasan @tenzen-y It looks like /rerun-all is working only right now.
We might need to wait until this one will be merged to rerun specific workflow: estroz/rerun-actions#3

@tariq-hasan
Copy link
Contributor Author

tariq-hasan commented Jul 16, 2024

Thank you for the clarification.

I did some further investigation on the Publish Katib Core Images workflow.

I found that the workflow fails within the final stage of the Dockerfile due to the results coming from npm ci.

If I replace npm ci with npm install --no-package-lock the workflow succeeds.

That said, npm ci uses package-lock.json instead of package.json and is therefore a more stringent validation of the configuration in the frontend folder. In addition, npm ci is less resource-intensive compared to npm install. As such, I suppose that npm install --no-package-lock is not a feasible option.

As npm install --no-package-lock succeeds and npm ci fails the issue could be due to package-lock.json not working in the node:14 base image as it was generated in a different environment on my local.

I will try to do some more investigation to determine why the build is failing and find an appropriate fix.

@andreyvelich
Copy link
Member

Thank you for this deep investigation @tariq-hasan!

@kimwnasptd @elenzio9 @orfeas-k @kubeflow/wg-data-leads @ederign @alexcreasy
cc some UI folks on problems with Katib and Kubeflow UIs.
Please can you help us to resolve problems with npm and node ?

@orfeas-k
Copy link
Contributor

orfeas-k commented Jul 17, 2024

@tariq-hasan @andreyvelich I think the answer is already highlighted in that previous commnet

package-lock.json not working in the node:14 base image as it was generated in a different environment on my local.

If the build succeeds with npm install --no-package-lock, we 'd need to update package-lock.json file (and commit it) from the same node and npm version with the one being used in the CI.

@andreyvelich
Copy link
Member

What changes do we need to make for Kubeflow UI library ?

@ederign
Copy link
Member

ederign commented Jul 18, 2024

@andreyvelich I believe what @orfeas-k meant is that @tariq-hasan generated package-lock.json file in the PR needs to match the node (and npm) version of the node where the 'Publish Katib Core Images' is being executed.

@tariq-hasan, could you please double-check that?

Also, I'm new to the project, but is there a reason why we are upgrading to node 14 instead of node 16 like kubeflow/kubeflow is using?

@andreyvelich
Copy link
Member

Also, I'm new to the project, but is there a reason why we are upgrading to node 14 instead of node 16 like kubeflow/kubeflow is using?

I think, the common Kubeflow frontend component that we use doesn't use Node 16 today: https://github.com/kubeflow/kubeflow/blob/master/components/crud-web-apps/common/frontend/kubeflow-common-lib/package.json#L62

@tariq-hasan Did you check if cypress tests can be succeeded on Node 16 ? If yes, we should just update it.

@tariq-hasan
Copy link
Contributor Author

I am upgrading kubeflow-common-library in crud-webapps as well as katib-ui to node 16.

That said, the following components are still using node 12:

I will also try to reproduce the build from a Dockerfile on my local to ensure that the package-lock.json is correctly generated.

@tariq-hasan tariq-hasan force-pushed the fix-katib-ui-tests branch 2 times, most recently from db5c4f1 to c660b3c Compare July 29, 2024 22:46
@tariq-hasan
Copy link
Contributor Author

/rerun-all

1 similar comment
@tariq-hasan
Copy link
Contributor Author

/rerun-all

@tariq-hasan
Copy link
Contributor Author

tariq-hasan commented Jul 30, 2024

Using package-lock.json from a build of the Dockerfile did not fix the errors.

However, upgrades of both crud-webapps/common/frontend/kubeflow-common-library and katib-ui to node 16 fixed the errors.

Please let me know if there are any feedback.

I will then create a PR to move kubeflow-common-library to node 16.

@andreyvelich
Copy link
Member

Yes, @tariq-hasan please create PR into kubeflow/kubeflow to update Node version for common library.

@tariq-hasan
Copy link
Contributor Author

tariq-hasan commented Jul 31, 2024

Got it. As part of the PR, I will upgrade the following components from version 12 to version 16 to ensure compatibility with kubeflow-common-library and for the CI tests in kubeflow/kubeflow to pass successfully.

@tenzen-y
Copy link
Member

Got it. As part of the PR, I will upgrade the following components from version 12 to version 16 to ensure compatibility with kubeflow-common-library and for the CI tests in kubeflow/kubeflow to pass successfully.

Thank you for the great investigation!
Once you open the PR in the kubeflow/kubeflow repository, please put the link on this PR.

@andreyvelich andreyvelich changed the title fix: Resolve errors in e2e tests for cypress fix: Resolve errors in e2e tests for cypress in Katib UI Aug 7, 2024
@andreyvelich
Copy link
Member

@tariq-hasan Any success with submitting PR into kubeflow/kubeflow so we can fix the e2e tests in Katib UI ?

@tariq-hasan
Copy link
Contributor Author

tariq-hasan commented Aug 14, 2024

I apologize for the delay. I've updated the configuration for tensorboards. I am working through jupyter, volumes and centraldashboard-angular, and will create a PR in kubeflow/kubeflow shortly.

@andreyvelich
Copy link
Member

Thank you @tariq-hasan, appreciate your time for it!

@tariq-hasan
Copy link
Contributor Author

/rerun-all

@tariq-hasan
Copy link
Contributor Author

I have created a PR for kubeflow/kubeflow: kubeflow/kubeflow#7637.

This PR upgrades the node version from 12 to 16 for kubeflow-common-lib and its dependent components: jupyter, tensorboards, volumes and centraldashboard-angular.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix Katib UI Tests
5 participants