Stop a workflow during offload causes `upper: no more rows in this result set` #13220

shuangkun · 2024-06-19T11:26:53Z

Pre-requisites

I have double-checked my configuration
I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
I have searched existing issues and could not find a match for this bug
I'd like to contribute the fix myself (see contributing guide)

What happened/what did you expect to happen?

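When the controller saves the offloaded node status to the database, it also deletes the old records. If someone stops the workflow during that update, the update conflicts. The controller then tries to reapply the update, but it cannot load the node status for `woc.orig` because the offloaded record it references has already been deleted, which results in `upper: no more rows in this result set`.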
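The sequence above can be reproduced in miniature. The following is only a sketch under stated assumptions: `offloadStore`, `saveAndPrune`, and `get` are hypothetical in-memory stand-ins, not the controller's actual repository API, and the TTL filter on the delete is ignored for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRows stands in for the database error quoted in the issue title.
var errNoRows = errors.New("upper: no more rows in this result set")

// key identifies one offloaded node-status record.
type key struct{ uid, version string }

// offloadStore is a hypothetical in-memory stand-in for the offload table.
type offloadStore struct{ rows map[key]string }

func newOffloadStore() *offloadStore {
	return &offloadStore{rows: map[key]string{}}
}

// saveAndPrune writes the new version and then deletes every other version
// stored for the same uid, mirroring the behaviour described in the report.
func (s *offloadStore) saveAndPrune(uid, version, nodes string) {
	s.rows[key{uid, version}] = nodes
	for k := range s.rows {
		if k.uid == uid && k.version != version {
			delete(s.rows, k)
		}
	}
}

// get loads one version and fails the way a missing row would.
func (s *offloadStore) get(uid, version string) (string, error) {
	nodes, ok := s.rows[key{uid, version}]
	if !ok {
		return "", errNoRows
	}
	return nodes, nil
}

func main() {
	s := newOffloadStore()
	s.saveAndPrune("wf-1", "v1", "nodes@v1") // woc.orig references v1
	s.saveAndPrune("wf-1", "v2", "nodes@v2") // the controller saves v2; v1 is pruned

	// A concurrent stop makes the workflow update conflict, so the retry
	// starts again from woc.orig, which still points at the pruned v1.
	_, err := s.get("wf-1", "v1")
	fmt.Println(err)
}
```

The point of the sketch is the ordering: the old version disappears before the workflow update is known to have succeeded, so any retry that starts from the original object has nothing left to read.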

Version

latest

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

Any large workflow

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded


shuangkun · 2024-06-19T12:12:22Z

Maybe we don't need to delete oldRecord during the offload save, because there is periodic cleanup.

jswxstw · 2024-06-20T07:54:24Z

> Maybe we don't need to delete oldRecord during the offload save, because there is periodic cleanup.

uid is the primary key, use insert on update instead?

shuangkun · 2024-07-01T03:16:15Z

That is similar to dropping the cleanup of old records in the offload save phase. Dehydrate and update run very close together. If you insert after the update, you need to keep the offloadVersion generated in the Dehydrate phase.


jswxstw · 2024-07-01T07:19:21Z

Sorry, I misunderstood this issue. I thought it was an issue caused by workflow archiving.

imliuda · 2024-07-10T03:40:12Z

@shuangkun I have a question about this. When old records are deleted, there is an updatedat filter, which means your reapplyUpdate() ran for at least OFFLOAD_NODE_STATUS_TTL, right? Have you noticed any slow queries?

```go
rs, err := wdc.session.SQL().
	DeleteFrom(wdc.tableName).
	Where(db.Cond{"clustername": wdc.clusterName}).
	And(db.Cond{"uid": uid}).
	And(db.Cond{"version <>": version}).
	And(wdc.oldOffload()).
	Exec()
```

shuangkun · 2024-08-27T08:20:02Z

Sometimes a workflow is not updated for many cycles, for longer than OFFLOAD_NODE_STATUS_TTL. If the old version is deleted and the workflow changes at that moment, a conflict occurs and the workflow cannot be restored.


shuangkun added the type/bug label Jun 19, 2024

shuangkun added the area/controller Controller issues, panics label Jun 19, 2024

agilgur5 changed the title ~~Stop a workflow cause "upper: no more rows in this result set"~~ Stop a workflow cause upper: no more rows in this result set Jun 19, 2024

agilgur5 changed the title ~~Stop a workflow cause upper: no more rows in this result set~~ Stop a workflow during offload causea upper: no more rows in this result set Jun 19, 2024

agilgur5 changed the title ~~Stop a workflow during offload causea upper: no more rows in this result set~~ Stop a workflow during offload causes upper: no more rows in this result set Jun 19, 2024

agilgur5 added the P3 Low priority label Jun 19, 2024

shuangkun added a commit to shuangkun/argo-workflows that referenced this issue Jul 1, 2024

fix: don't clean up old records when save. Fixes: argoproj#13220

5b4d5e4

Signed-off-by: shuangkun <[email protected]>

shuangkun mentioned this issue Jul 1, 2024

fix: don't clean up old offloaded records during save. Fixes: #13220 #13286

Merged

juliev0 closed this as completed in #13286 Aug 31, 2024

juliev0 closed this as completed in b026a0f Aug 31, 2024

agilgur5 added the area/offloading Node status offloading label Aug 31, 2024

Joibel pushed a commit to pipekit/argo-workflows that referenced this issue Sep 19, 2024

fix: don't clean up old offloaded records during save. Fixes: argopro…

85578fe

…j#13220 (argoproj#13286) Signed-off-by: shuangkun <[email protected]>

Joibel pushed a commit that referenced this issue Sep 20, 2024

fix: don't clean up old offloaded records during save. Fixes: #13220 (#…

472843a

…13286) Signed-off-by: shuangkun <[email protected]>
