Migrate flyteconsole from Travis -> Github Actions #423

Closed
schottra opened this issue Jul 20, 2020 · 0 comments
@schottra
Contributor

We're currently running a hybrid setup with Travis and GitHub Actions. We can/should remove the Travis CI bits that are duplicated (the PR image build) and consider migrating any remaining checks (I believe just the unit tests and linting?) over to GitHub Actions.

@schottra added the task and ui (Admin console user interface) labels on Jul 20, 2020
@schottra self-assigned this on Jan 11, 2021
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 6, 2022
* Moving all plugins to a common package, for easy loading

Signed-off-by: Ketan Umare <[email protected]>

* updated

Signed-off-by: Ketan Umare <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 20, 2022
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 20, 2022
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 9, 2023
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 21, 2023
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Apr 30, 2024
…rg#423)

* WIP. Marked places where an acknowledgement before an update is needed.

Signed-off-by: Kamal Eybov <[email protected]>

* (1) Added error handling for methods fetching matchable attributes when attributes do not exist. (2) Added fetching data that is needed for diffing during updates.

Signed-off-by: Kamal Eybov <[email protected]>

* Diff and ask for ack.

Signed-off-by: Kamal Eybov <[email protected]>

* Fixed some of the TODOs.

Signed-off-by: Kamal Eybov <[email protected]>

* Cleaned up error handling.

Signed-off-by: Kamal Eybov <[email protected]>

* Updated tests.

Signed-off-by: Kamal Eybov <[email protected]>

* More tests.

Signed-off-by: Kamal Eybov <[email protected]>

* Fix case for No string to no (flyteorg#419)

Signed-off-by: Future Outlier <[email protected]>
Co-authored-by: Future Outlier <[email protected]>
Signed-off-by: Kamal Eybov <[email protected]>

* Replaced diffing implementation.

Signed-off-by: Kamal Eybov <[email protected]>

* Addressed pull request comments.

Signed-off-by: Kamal Eybov <[email protected]>

* Fixed linter errors.

Signed-off-by: Kamal Eybov <[email protected]>

---------

Signed-off-by: Kamal Eybov <[email protected]>
Signed-off-by: Future Outlier <[email protected]>
Co-authored-by: Future-Outlier <[email protected]>
Co-authored-by: Future Outlier <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Apr 30, 2024
austin362667 pushed a commit to austin362667/flyte that referenced this issue May 7, 2024
robert-ulbrich-mercedes-benz pushed a commit to robert-ulbrich-mercedes-benz/flyte that referenced this issue Jul 2, 2024
pvditt added a commit that referenced this issue Aug 20, 2024
## Overview
When the [informer cache has stale values](https://github.com/unionai/flyte/blob/1e82352dd95f89630e333fe6105d5fdb5487a24e/flytepropeller/pkg/controller/nodes/task/k8s/plugin_manager.go#L478), we cannot update the k8s resource when [clearing finalizers](https://github.com/unionai/flyte/blob/1e82352dd95f89630e333fe6105d5fdb5487a24e/flytepropeller/pkg/controller/nodes/task/k8s/plugin_manager.go#L450) and get `Error: Operation cannot be fulfilled on pods.` The current implementation bubbles up the error, which results in a system retry. By the next loop, the informer cache is up to date and the update can be applied. However, in an ArrayNode with many subnodes executing in parallel, the execution can easily run out of retries.

This update adds a basic retry with exponential backoff to wait for the informer cache to get up to date.

## Test Plan
Ran in dogfood-gcp
- https://buildkite.com/unionai/managed-cluster-staging-sync/builds/4622 + manually updated configmap to enable finalizers
- Run without change (https://dogfood-gcp.cloud-staging.union.ai/console/projects/flytesnacks/domains/development/executions/fd16ac81fd7b5480fb6f/nodes)
- Run with change (https://dogfood-gcp.cloud-staging.union.ai/console/projects/flytesnacks/domains/development/executions/f016a3be7fa304db5a77/nodeId/n0/nodes)
Confirmed in the logs that conflict errors like this one:
```
{"json":{"exec_id":"f016a3be7fa304db5a77","node":"n0/n42","ns":"development","res_ver":"146129599","routine":"worker-66","src":"plugin_manager.go:455","wf":"flytesnacks:development:tests.flytekit.integration.map_task_issue.wf8"},"level":"warning","msg":"Failed to clear finalizers for Resource with name: development/f016a3be7fa304db5a77-n0-0-n42-0. Error: Operation cannot be fulfilled on pods \"f016a3be7fa304db5a77-n0-0-n42-0\": the object has been modified; please apply your changes to the latest version and try again","ts":"2024-08-17T02:02:48Z"}
```
did not bubble up, and confirmed that the finalizers were removed:

```
➜  ~ k get pods -n development f016a3be7fa304db5a77-n0-0-n42-0 -o json | grep -i final
INFO[0000] [0] Couldn't find a config file []. Relying on env vars and pflags.
➜  ~
```

## Rollout Plan (if applicable)
- revert changes to customer's config that disabled finalizers

## Upstream Changes
Should this change be upstreamed to OSS (flyteorg/flyte)? If not, please uncheck this box, which is used for auditing. Note, it is the responsibility of each developer to actually upstream their changes. See [this guide](https://unionai.atlassian.net/wiki/spaces/ENG/pages/447610883/Flyte+-+Union+Cloud+Development+Runbook/#When-are-versions-updated%3F).
- [x] To be upstreamed to OSS

## Issue
fixes: https://linear.app/unionai/issue/COR-1558/investigate-why-finalizers-consume-system-retries-in-map-tasks

## Checklist
* [ ] Added tests
* [x] Ran a deploy dry run and shared the terraform plan
* [ ] Added logging and metrics
* [ ] Updated [dashboards](https://unionai.grafana.net/dashboards) and [alerts](https://unionai.grafana.net/alerting/list)
* [ ] Updated documentation

Signed-off-by: Paul Dittamo <[email protected]>
eapolinario pushed a commit that referenced this issue Aug 21, 2024
pmahindrakar-oss pushed a commit that referenced this issue Sep 9, 2024
Signed-off-by: pmahindrakar-oss <[email protected]>
bgedik pushed a commit to bgedik/flyte that referenced this issue Sep 12, 2024
…eorg#5673)

Signed-off-by: Bugra Gedik <[email protected]>