UI does not show all running workflows (it used to) / UI and argo list differ (missing running workflows) #9696

Closed
2 of 3 tasks
scravy opened this issue Sep 27, 2022 · 23 comments · Fixed by #11840
Labels: area/ui, P2, type/bug, type/regression

Comments

@scravy
Contributor

scravy commented Sep 27, 2022

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

When I submit a workflow in the UI, it is shown in the list of running workflows. After refreshing the page, all running workflows should still be listed.

We just upgraded to Argo 3.4.0, and since then not all running workflows are shown in the UI. If you wait long enough, the missing workflows eventually appear. In our day-to-day work we noticed that a workflow seems to be added to the UI once a transition happens, i.e. when the workflow enters a new step.

In short: the output of argo list and the workflows shown in the UI are not the same. Some running workflows are missing from the UI. They are added once they transition from one node to another while the UI is open.
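To make the discrepancy concrete, the names reported by the CLI can be diffed against the names visible in the UI. The following is a minimal sketch, assuming both sets of names have already been collected by hand (for example, copied from the argo list output and from the UI page). `missing_from_ui` and the sample data are hypothetical and not part of Argo.

```python
def missing_from_ui(cli_names, ui_names):
    """Return workflow names the CLI reports but the UI does not show, sorted."""
    return sorted(set(cli_names) - set(ui_names))


# Hypothetical sample data; the names reuse the workflows from the logs in this issue.
cli = ["zargo-debug-child-vcmgq", "zargo-debug-child-qnxdn"]
ui = ["zargo-debug-child-qnxdn"]
print(missing_from_ui(cli, ui))
```

An empty result would mean the UI shows everything the CLI reports. Any name in the result is a running workflow the UI has not picked up yet.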

Version

3.4.0

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: zargo-debug-child
  namespace: argo
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: node1
            template: work
        - - name: node2
            template: work
    - name: work
      script:
        image: bash:5.2.0-alpine3.15
        command:
          - bash
        source: |
          echo "Doing some work for 60 seconds"
          sleep 60

Logs from the workflow controller

time="2022-09-27T15:57:49.577Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.591Z" level=info msg="Updated phase  -> Running" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.593Z" level=info msg="Steps node zargo-debug-child-vcmgq initialized Running" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.593Z" level=info msg="StepGroup node zargo-debug-child-vcmgq-3865638504 initialized Running" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.593Z" level=info msg="Pod node zargo-debug-child-vcmgq-3922648685 initialized Pending" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.628Z" level=info msg="Created pod: zargo-debug-child-vcmgq[0].node1 (zargo-debug-child-vcmgq-work-3922648685)" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.628Z" level=info msg="Workflow step group node zargo-debug-child-vcmgq-3865638504 not yet completed" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.628Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.628Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:49.639Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10435392 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:59.629Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:59.629Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:59.629Z" level=info msg="node changed" namespace=argo new.message= new.phase=Running new.progress=0/1 nodeID=zargo-debug-child-vcmgq-3922648685 old.message= old.phase=Pending old.progress=0/1 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:59.629Z" level=info msg="Workflow step group node zargo-debug-child-vcmgq-3865638504 not yet completed" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:59.629Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:59.629Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:57:59.726Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10435445 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.782Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.782Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.782Z" level=warning msg="workflow uses legacy/insecure pod patch, see https://argoproj.github.io/argo-workflows/workflow-rbac/" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.782Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=zargo-debug-child-vcmgq-3922648685 old.message= old.phase=Running old.progress=0/1 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.782Z" level=info msg="Step group node zargo-debug-child-vcmgq-3865638504 successful" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.782Z" level=info msg="node zargo-debug-child-vcmgq-3865638504 phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.782Z" level=info msg="node zargo-debug-child-vcmgq-3865638504 finished: 2022-09-27 15:59:01.78294768 +0000 UTC" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.787Z" level=info msg="StepGroup node zargo-debug-child-vcmgq-2858834269 initialized Running" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.787Z" level=info msg="SG Outbound nodes of zargo-debug-child-vcmgq-3922648685 are [zargo-debug-child-vcmgq-3922648685]" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.788Z" level=info msg="Pod node zargo-debug-child-vcmgq-3116902745 initialized Pending" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.818Z" level=info msg="Created pod: zargo-debug-child-vcmgq[1].node2 (zargo-debug-child-vcmgq-work-3116902745)" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.818Z" level=info msg="Workflow step group node zargo-debug-child-vcmgq-2858834269 not yet completed" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.818Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.818Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.828Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10435675 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:01.834Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/zargo-debug-child-vcmgq-work-3922648685/labelPodCompleted
time="2022-09-27T15:59:11.820Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:11.820Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:11.820Z" level=info msg="node changed" namespace=argo new.message= new.phase=Running new.progress=0/1 nodeID=zargo-debug-child-vcmgq-3116902745 old.message= old.phase=Pending old.progress=0/1 workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:11.820Z" level=info msg="SG Outbound nodes of zargo-debug-child-vcmgq-3922648685 are [zargo-debug-child-vcmgq-3922648685]" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:11.821Z" level=info msg="Workflow step group node zargo-debug-child-vcmgq-2858834269 not yet completed" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:11.821Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:11.821Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T15:59:11.834Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10435730 workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=warning msg="workflow uses legacy/insecure pod patch, see https://argoproj.github.io/argo-workflows/workflow-rbac/" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=zargo-debug-child-vcmgq-3116902745 old.message= old.phase=Running old.progress=0/1 workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="SG Outbound nodes of zargo-debug-child-vcmgq-3922648685 are [zargo-debug-child-vcmgq-3922648685]" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="Step group node zargo-debug-child-vcmgq-2858834269 successful" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="node zargo-debug-child-vcmgq-2858834269 phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="node zargo-debug-child-vcmgq-2858834269 finished: 2022-09-27 16:00:13.417764045 +0000 UTC" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="Outbound nodes of zargo-debug-child-vcmgq-3116902745 is [zargo-debug-child-vcmgq-3116902745]" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="Outbound nodes of zargo-debug-child-vcmgq is [zargo-debug-child-vcmgq-3116902745]" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="node zargo-debug-child-vcmgq phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="node zargo-debug-child-vcmgq finished: 2022-09-27 16:00:13.41780197 +0000 UTC" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="Checking daemoned children of zargo-debug-child-vcmgq" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.417Z" level=info msg="Running OnExit handler: b72ca92baf93-exit-handler" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.419Z" level=info msg="Steps node zargo-debug-child-vcmgq-4002967377 initialized Running" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.419Z" level=info msg="StepGroup node zargo-debug-child-vcmgq-3620866933 initialized Running" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.419Z" level=info msg="Pod node zargo-debug-child-vcmgq-1020052388 initialized Pending" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.438Z" level=info msg="Created pod: zargo-debug-child-vcmgq.onExit[0].success (zargo-debug-child-vcmgq-b72ca92baf93-send-slack-1020052388)" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.438Z" level=info msg="Skipping zargo-debug-child-vcmgq.onExit[0].failure: when 'Succeeded != Succeeded' evaluated false" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.438Z" level=info msg="Skipped node zargo-debug-child-vcmgq-1275290133 initialized Skipped (message: when 'Succeeded != Succeeded' evaluated false)" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.438Z" level=info msg="Workflow step group node zargo-debug-child-vcmgq-3620866933 not yet completed" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.448Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10436003 workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:13.455Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/zargo-debug-child-vcmgq-work-3116902745/labelPodCompleted
time="2022-09-27T16:00:23.440Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.440Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.440Z" level=warning msg="workflow uses legacy/insecure pod patch, see https://argoproj.github.io/argo-workflows/workflow-rbac/" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.440Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=zargo-debug-child-vcmgq-1020052388 old.message= old.phase=Pending old.progress=0/1 workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Running OnExit handler: b72ca92baf93-exit-handler" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Step group node zargo-debug-child-vcmgq-3620866933 successful" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="node zargo-debug-child-vcmgq-3620866933 phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="node zargo-debug-child-vcmgq-3620866933 finished: 2022-09-27 16:00:23.441636451 +0000 UTC" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Outbound nodes of zargo-debug-child-vcmgq-1020052388 is [zargo-debug-child-vcmgq-1020052388]" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Outbound nodes of zargo-debug-child-vcmgq-1275290133 is [zargo-debug-child-vcmgq-1275290133]" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Outbound nodes of zargo-debug-child-vcmgq-4002967377 is [zargo-debug-child-vcmgq-1020052388 zargo-debug-child-vcmgq-1275290133]" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="node zargo-debug-child-vcmgq-4002967377 phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="node zargo-debug-child-vcmgq-4002967377 finished: 2022-09-27 16:00:23.44170012 +0000 UTC" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Checking daemoned children of zargo-debug-child-vcmgq-4002967377" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Updated phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Marking workflow completed" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Marking workflow as pending archiving" namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.441Z" level=info msg="Checking daemoned children of " namespace=argo workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.447Z" level=info msg="cleaning up pod" action=deletePod key=argo/zargo-debug-child-vcmgq-1340600742-agent/deletePod
time="2022-09-27T16:00:23.462Z" level=info msg="Workflow update successful" namespace=argo phase=Succeeded resourceVersion=10436076 workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.646Z" level=info msg="archiving workflow" namespace=argo uid=cdd6c373-96d9-4c98-b894-3ef1f02b18a2 workflow=zargo-debug-child-vcmgq
time="2022-09-27T16:00:23.650Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/zargo-debug-child-vcmgq-b72ca92baf93-send-slack-1020052388/labelPodCompleted
time="2022-09-27T16:00:23.669Z" level=info msg="Queueing Succeeded workflow argo/zargo-debug-child-vcmgq for delete in 18h0m0s due to TTL"
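Since the report says workflows only appear in the UI after a transition, it can help to pull just the node phase changes out of a controller log like the one above. This is a minimal sketch, assuming each entry is a single logfmt line (key=value pairs, values optionally double-quoted) as shown. `parse_logfmt` and `node_transitions` are hypothetical helper names, and the sample line is copied from the log.

```python
import shlex


def parse_logfmt(line):
    """Split one logfmt line (key=value, values optionally double-quoted) into a dict."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore stray tokens that are not key=value pairs
            fields[key] = value
    return fields


def node_transitions(lines):
    """Yield (time, nodeID, old phase, new phase) for every "node changed" entry."""
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("msg") == "node changed":
            yield (fields["time"], fields["nodeID"],
                   fields["old.phase"], fields["new.phase"])


sample = [
    'time="2022-09-27T15:57:59.629Z" level=info msg="node changed" namespace=argo '
    'new.message= new.phase=Running new.progress=0/1 '
    'nodeID=zargo-debug-child-vcmgq-3922648685 old.message= old.phase=Pending '
    'old.progress=0/1 workflow=zargo-debug-child-vcmgq',
]
for transition in node_transitions(sample):
    print(transition)
```

Comparing these timestamps with the moment a workflow first appears in the UI is one way to check whether appearance really coincides with a transition.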

Logs from your workflow's wait container

time="2022-09-27T15:58:52.538Z" level=info msg="Creating minio client using static credentials" endpoint=s3.amazonaws.com
time="2022-09-27T15:58:52.544Z" level=info msg="Saving file to s3" bucket=incrmntal-argo-artifactory endpoint=s3.amazonaws.com key=incrmntal-prod-two/argo/zargo-debug-child-vcmgq/2022-09-27-zargo-debug-child-vcmgq-work-3922648685/main.log path=/tmp/argo/outputs/logs/main.log
time="2022-09-27T15:58:52.587Z" level=info msg="Save artifact" artifactName=main-logs duration=49.392744ms error="<nil>" key=incrmntal-prod-two/argo/zargo-debug-child-vcmgq/2022-09-27-zargo-debug-child-vcmgq-work-3922648685/main.log
time="2022-09-27T15:58:52.587Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2022-09-27T15:58:52.587Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2022-09-27T15:58:52.599Z" level=info msg="Create workflowtaskresults 403"
time="2022-09-27T15:58:52.599Z" level=warning msg="failed to patch task set, falling back to legacy/insecure pod patch, see https://argoproj.github.io/argo-workflows/workflow-rbac/" error="workflowtaskresults.argoproj.io is forbidden: User \"system:serviceaccount:argo:argo\" cannot create resource \"workflowtaskresults\" in API group \"argoproj.io\" in the namespace \"argo\""
time="2022-09-27T15:58:52.618Z" level=info msg="Patch pods 200"
time="2022-09-27T15:58:52.621Z" level=info msg="Deadline monitor stopped"
time="2022-09-27T15:58:52.621Z" level=info msg="Alloc=7002 TotalAlloc=16520 Sys=24018 NumGC=5 Goroutines=9"
scravy added the type/bug and type/regression labels Sep 27, 2022
@terrytangyuan
Member

time="2022-09-27T15:58:52.599Z" level=warning msg="failed to patch task set, falling back to legacy/insecure pod patch, see https://argoproj.github.io/argo-workflows/workflow-rbac/" error="workflowtaskresults.argoproj.io is forbidden: User \"system:serviceaccount:argo:argo\" cannot create resource \"workflowtaskresults\" in API group \"argoproj.io\" in the namespace \"argo\""

You are missing some permissions on your service account.
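The quoted warning already names the denied permission. As a minimal sketch, assuming the error text keeps the wording of the line quoted above, the user, verb and resource can be extracted like this (`missing_permission` is a hypothetical helper, and the sample line is the one from the wait container log):

```python
import re
import shlex

# Pattern for the "forbidden" message as it appears in the quoted log line.
FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def missing_permission(line):
    """Return the denied user/verb/resource from a logfmt line, or None if absent."""
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        if key == "error":
            match = FORBIDDEN.search(value)
            return match.groupdict() if match else None
    return None


line = (
    'time="2022-09-27T15:58:52.599Z" level=warning '
    'msg="failed to patch task set, falling back to legacy/insecure pod patch, '
    'see https://argoproj.github.io/argo-workflows/workflow-rbac/" '
    'error="workflowtaskresults.argoproj.io is forbidden: User '
    '\\"system:serviceaccount:argo:argo\\" cannot create resource '
    '\\"workflowtaskresults\\" in API group \\"argoproj.io\\" in the namespace \\"argo\\""'
)
print(missing_permission(line))
```

For the line above this reports that the service account argo in namespace argo lacks the create verb on workflowtaskresults.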

@scravy
Contributor Author

scravy commented Sep 27, 2022

@terrytangyuan Are you saying this is the cause of the regression I am seeing?

The workflows do show up when a transition occurs. Completed workflows are also shown correctly; only running ones are missing.

Also, I installed Argo from the install.yaml of the release, so the roles and cluster roles are the vanilla ones from v3.4.0. Surely these should not be missing any permissions?

@terrytangyuan
Member

time="2022-09-27T16:00:23.646Z" level=info msg="archiving workflow" namespace=argo uid=cdd6c373-96d9-4c98-b894-3ef1f02b18a2 workflow=zargo-debug-child-vcmgq

This indicates that you are archiving the workflows. Could you check if they are in your list of archived workflows?

@scravy
Contributor Author

scravy commented Sep 27, 2022

First things first: I patched the respective role with the missing permissions. I changed the argo-cluster-role, as defined in the vanilla install.yaml at https://github.com/argoproj/argo-workflows/releases/download/v3.4.0/install.yaml, from

- apiGroups:
  - argoproj.io
  resources:
  - workflowtaskresults
  verbs:
  - list
  - watch
  - deletecollection

to

- apiGroups:
  - argoproj.io
  resources:
  - workflowtaskresults
  verbs:
  - list
  - watch
  - deletecollection
  - create
  - patch
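The difference between the two rules comes down to the added verbs. A minimal sketch, with the verb lists copied from the two rules above and `added_verbs` as a hypothetical helper:

```python
def added_verbs(old, new):
    """Return the verbs present in the new rule but not in the old one, sorted."""
    return sorted(set(new) - set(old))


# Verb lists copied from the workflowtaskresults rule before and after the patch.
before = ["list", "watch", "deletecollection"]
after = ["list", "watch", "deletecollection", "create", "patch"]
print(added_verbs(before, after))  # ['create', 'patch']
```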

The warnings are now gone. Here are the complete logs from a rerun. However, the problem I am describing persists unchanged.

Wait Container logs:

time="2022-09-27T16:50:02.568Z" level=info msg="Starting Workflow Executor" version=v3.4.0
time="2022-09-27T16:50:02.570Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2022-09-27T16:50:02.570Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=argo podName=zargo-debug-child-qnxdn-work-1688663510 template="{\"name\":\"work\",\"inputs\":{},\"outputs\":{},\"metadata\":{\"annotations\":{\"cluster-autoscaler.kubernetes.io/safe-to-evict\":\"false\"}},\"script\":{\"name\":\"\",\"image\":\"bash:5.2.0-alpine3.15\",\"command\":[\"bash\"],\"resources\":{},\"source\":\"echo \\\"Doing some work for 60 seconds\\\"\\nsleep 60\\n\"},\"archiveLocation\":{\"archiveLogs\":true,\"s3\":{\"endpoint\":\"s3.amazonaws.com\",\"bucket\":\"incrmntal-argo-artifactory\",\"region\":\"us-east-1\",\"insecure\":false,\"accessKeySecret\":{\"name\":\"argo-s3-config\",\"key\":\"accessKey\"},\"secretKeySecret\":{\"name\":\"argo-s3-config\",\"key\":\"secretKey\"},\"encryptionOptions\":{},\"key\":\"incrmntal-prod-two/argo/zargo-debug-child-qnxdn/2022-09-27-zargo-debug-child-qnxdn-work-1688663510\"}},\"serviceAccountName\":\"argo\"}" version="&Version{Version:v3.4.0,BuildDate:2022-09-19T03:47:58Z,GitCommit:047952afd539d06cae2fd6ba0b608b19c1194bba,GitTag:v3.4.0,GitTreeState:clean,GoVersion:go1.18.6,Compiler:gc,Platform:linux/amd64,}"
time="2022-09-27T16:50:02.570Z" level=info msg="Starting deadline monitor"
time="2022-09-27T16:51:03.582Z" level=info msg="Main container completed" error="<nil>"
time="2022-09-27T16:51:03.582Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2022-09-27T16:51:03.582Z" level=info msg="No output parameters"
time="2022-09-27T16:51:03.582Z" level=info msg="No output artifacts"
time="2022-09-27T16:51:03.582Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: incrmntal-prod-two/argo/zargo-debug-child-qnxdn/2022-09-27-zargo-debug-child-qnxdn-work-1688663510/main.log"
time="2022-09-27T16:51:03.582Z" level=info msg="Creating minio client using static credentials" endpoint=s3.amazonaws.com
time="2022-09-27T16:51:03.590Z" level=info msg="Saving file to s3" bucket=incrmntal-argo-artifactory endpoint=s3.amazonaws.com key=incrmntal-prod-two/argo/zargo-debug-child-qnxdn/2022-09-27-zargo-debug-child-qnxdn-work-1688663510/main.log path=/tmp/argo/outputs/logs/main.log
time="2022-09-27T16:51:03.637Z" level=info msg="Save artifact" artifactName=main-logs duration=54.307792ms error="<nil>" key=incrmntal-prod-two/argo/zargo-debug-child-qnxdn/2022-09-27-zargo-debug-child-qnxdn-work-1688663510/main.log
time="2022-09-27T16:51:03.637Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2022-09-27T16:51:03.637Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2022-09-27T16:51:03.661Z" level=info msg="Create workflowtaskresults 201"
time="2022-09-27T16:51:03.661Z" level=info msg="Deadline monitor stopped"
time="2022-09-27T16:51:03.661Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
time="2022-09-27T16:51:03.661Z" level=info msg="Alloc=7571 TotalAlloc=15996 Sys=23506 NumGC=5 Goroutines=9"

Workflow Controller Log

time="2022-09-27T16:50:00.763Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.776Z" level=info msg="Updated phase  -> Running" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.776Z" level=info msg="Steps node zargo-debug-child-qnxdn initialized Running" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.776Z" level=info msg="StepGroup node zargo-debug-child-qnxdn-2181329391 initialized Running" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.777Z" level=info msg="Pod node zargo-debug-child-qnxdn-1688663510 initialized Pending" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.809Z" level=info msg="Created pod: zargo-debug-child-qnxdn[0].node1 (zargo-debug-child-qnxdn-work-1688663510)" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.809Z" level=info msg="Workflow step group node zargo-debug-child-qnxdn-2181329391 not yet completed" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.809Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.809Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:00.820Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10468680 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:10.811Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:10.811Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:10.811Z" level=info msg="node changed" namespace=argo new.message= new.phase=Running new.progress=0/1 nodeID=zargo-debug-child-qnxdn-1688663510 old.message= old.phase=Pending old.progress=0/1 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:10.811Z" level=info msg="Workflow step group node zargo-debug-child-qnxdn-2181329391 not yet completed" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:10.811Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:10.811Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:50:10.824Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10468732 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.959Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.959Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=1 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.959Z" level=info msg="task-result changed" namespace=argo nodeID=zargo-debug-child-qnxdn-1688663510 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.960Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=zargo-debug-child-qnxdn-1688663510 old.message= old.phase=Running old.progress=0/1 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.960Z" level=info msg="Step group node zargo-debug-child-qnxdn-2181329391 successful" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.960Z" level=info msg="node zargo-debug-child-qnxdn-2181329391 phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.960Z" level=info msg="node zargo-debug-child-qnxdn-2181329391 finished: 2022-09-27 16:51:12.960547268 +0000 UTC" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.960Z" level=info msg="StepGroup node zargo-debug-child-qnxdn-2248586962 initialized Running" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.960Z" level=info msg="SG Outbound nodes of zargo-debug-child-qnxdn-1688663510 are [zargo-debug-child-qnxdn-1688663510]" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.961Z" level=info msg="Pod node zargo-debug-child-qnxdn-3431233478 initialized Pending" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.996Z" level=info msg="Created pod: zargo-debug-child-qnxdn[1].node2 (zargo-debug-child-qnxdn-work-3431233478)" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.996Z" level=info msg="Workflow step group node zargo-debug-child-qnxdn-2248586962 not yet completed" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.996Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:12.996Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:13.008Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10469033 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:13.013Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/zargo-debug-child-qnxdn-work-1688663510/labelPodCompleted
time="2022-09-27T16:51:22.998Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:22.998Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=1 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:22.998Z" level=info msg="node changed" namespace=argo new.message= new.phase=Running new.progress=0/1 nodeID=zargo-debug-child-qnxdn-3431233478 old.message= old.phase=Pending old.progress=0/1 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:22.998Z" level=info msg="SG Outbound nodes of zargo-debug-child-qnxdn-1688663510 are [zargo-debug-child-qnxdn-1688663510]" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:22.998Z" level=info msg="Workflow step group node zargo-debug-child-qnxdn-2248586962 not yet completed" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:22.998Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:22.999Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:51:23.011Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10469088 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=2 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="task-result changed" namespace=argo nodeID=zargo-debug-child-qnxdn-3431233478 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=zargo-debug-child-qnxdn-3431233478 old.message= old.phase=Running old.progress=0/1 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="SG Outbound nodes of zargo-debug-child-qnxdn-1688663510 are [zargo-debug-child-qnxdn-1688663510]" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="Step group node zargo-debug-child-qnxdn-2248586962 successful" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="node zargo-debug-child-qnxdn-2248586962 phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="node zargo-debug-child-qnxdn-2248586962 finished: 2022-09-27 16:52:25.256890468 +0000 UTC" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="Outbound nodes of zargo-debug-child-qnxdn-3431233478 is [zargo-debug-child-qnxdn-3431233478]" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="Outbound nodes of zargo-debug-child-qnxdn is [zargo-debug-child-qnxdn-3431233478]" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="node zargo-debug-child-qnxdn phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="node zargo-debug-child-qnxdn finished: 2022-09-27 16:52:25.256948516 +0000 UTC" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="Checking daemoned children of zargo-debug-child-qnxdn" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.256Z" level=info msg="Running OnExit handler: b72ca92baf93-exit-handler" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.257Z" level=info msg="Steps node zargo-debug-child-qnxdn-4154040042 initialized Running" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.257Z" level=info msg="StepGroup node zargo-debug-child-qnxdn-3486861020 initialized Running" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.257Z" level=info msg="Pod node zargo-debug-child-qnxdn-1717903073 initialized Pending" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.285Z" level=info msg="Created pod: zargo-debug-child-qnxdn.onExit[0].success (zargo-debug-child-qnxdn-b72ca92baf93-send-slack-1717903073)" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.285Z" level=info msg="Skipping zargo-debug-child-qnxdn.onExit[0].failure: when 'Succeeded != Succeeded' evaluated false" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.285Z" level=info msg="Skipped node zargo-debug-child-qnxdn-3455007452 initialized Skipped (message: when 'Succeeded != Succeeded' evaluated false)" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.285Z" level=info msg="Workflow step group node zargo-debug-child-qnxdn-3486861020 not yet completed" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.298Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=10469324 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:25.304Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/zargo-debug-child-qnxdn-work-3431233478/labelPodCompleted
time="2022-09-27T16:52:35.287Z" level=info msg="Processing workflow" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=3 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="task-result changed" namespace=argo nodeID=zargo-debug-child-qnxdn-1717903073 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=zargo-debug-child-qnxdn-1717903073 old.message= old.phase=Pending old.progress=0/1 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg=reconcileAgentPod namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="Running OnExit handler: b72ca92baf93-exit-handler" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="Step group node zargo-debug-child-qnxdn-3486861020 successful" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="node zargo-debug-child-qnxdn-3486861020 phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="node zargo-debug-child-qnxdn-3486861020 finished: 2022-09-27 16:52:35.288917456 +0000 UTC" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="Outbound nodes of zargo-debug-child-qnxdn-1717903073 is [zargo-debug-child-qnxdn-1717903073]" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="Outbound nodes of zargo-debug-child-qnxdn-3455007452 is [zargo-debug-child-qnxdn-3455007452]" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="Outbound nodes of zargo-debug-child-qnxdn-4154040042 is [zargo-debug-child-qnxdn-1717903073 zargo-debug-child-qnxdn-3455007452]" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="node zargo-debug-child-qnxdn-4154040042 phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="node zargo-debug-child-qnxdn-4154040042 finished: 2022-09-27 16:52:35.288986789 +0000 UTC" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.288Z" level=info msg="Checking daemoned children of zargo-debug-child-qnxdn-4154040042" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.289Z" level=info msg="Updated phase Running -> Succeeded" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.289Z" level=info msg="Marking workflow completed" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.289Z" level=info msg="Marking workflow as pending archiving" namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.289Z" level=info msg="Checking daemoned children of " namespace=argo workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.294Z" level=info msg="cleaning up pod" action=deletePod key=argo/zargo-debug-child-qnxdn-1340600742-agent/deletePod
time="2022-09-27T16:52:35.301Z" level=info msg="Workflow update successful" namespace=argo phase=Succeeded resourceVersion=10469386 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.515Z" level=info msg="archiving workflow" namespace=argo uid=6dceddd6-f757-4398-85e1-34333b8aaa13 workflow=zargo-debug-child-qnxdn
time="2022-09-27T16:52:35.520Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/zargo-debug-child-qnxdn-b72ca92baf93-send-slack-1717903073/labelPodCompleted
time="2022-09-27T16:52:35.554Z" level=info msg="Queueing Succeeded workflow argo/zargo-debug-child-qnxdn for delete in 18h0m0s due to TTL"

@scravy
Contributor Author

scravy commented Sep 27, 2022

@terrytangyuan Yes, we are archiving workflows. But the problem here is that the workflow is not listed while it is running. Once the workflow transitions to a new node while the workflow list is open in the UI, it shows up in the list.

...and yes, they are in the archived workflows, but only after they completed.

@sarabala1979
Member

@scravy Can you check if a namespace filter is populated with the cookie's default value?

@scravy
Contributor Author

scravy commented Sep 28, 2022

@sarabala1979 I am not sure what you mean by "the cookie's default value".

In the UI I do have a filter set for namespace = argo, but then again we do everything in that namespace and all workflows show up here, and it used to work in previous versions of Argo, so I'm not sure how this would affect anything.

I looked at the cookies and there are four cookies set: authorization, _ga_E659EW9J25, _gcl_au, and _ga.

@ese

ese commented Oct 3, 2022

It also happens to me.
When you first load the workflow page in the UI, it does not show all the available workflows, but if you wait 20 or 30 seconds they suddenly appear (without touching anything in the UI during the wait). Accessing the URL of a specific workflow directly always works fine; this only happens in the list view.
I'm not sure about the root cause but it seems more related to the UI than the server.

@terrytangyuan
Member

When you refresh the list view, do they appear as expected?

@sarabala1979
Member

@scravy Is it working as expected on v3.3?

@ese

ese commented Oct 3, 2022

When I refresh the page, it always shows an incomplete list, and after some time it loads all the workflows.

I noticed the issue is calling this endpoint:

https://argo/api/v1/workflow-events/argo-workflows?listOptions.fieldSelector=metadata.namespace=argo-workflows&fields=result.object.metadata.name,result.object.metadata.namespace,result.object.metadata.resourceVersion,result.object.metadata.creationTimestamp,result.object.metadata.uid,result.object.status.finishedAt,result.object.status.phase,result.object.status.message,result.object.status.startedAt,result.object.status.estimatedDuration,result.object.status.progress,result.type,result.object.metadata.labels,result.object.spec.suspend

As soon as this error appears in the browser console the UI loads all the workflows.

[screenshot: browser console error]
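For readers who want to replay that request, here is a minimal sketch of how the URL above is put together. The `watchUrl` helper is hypothetical and is not taken from the Argo UI source; the field paths are copied verbatim from the URL. The `fields` parameter appears to limit which parts of each event are returned, which is an assumption based on the URL alone.

```typescript
// Field paths copied from the URL above, in the same order.
const fields: string[] = [
    "result.object.metadata.name",
    "result.object.metadata.namespace",
    "result.object.metadata.resourceVersion",
    "result.object.metadata.creationTimestamp",
    "result.object.metadata.uid",
    "result.object.status.finishedAt",
    "result.object.status.phase",
    "result.object.status.message",
    "result.object.status.startedAt",
    "result.object.status.estimatedDuration",
    "result.object.status.progress",
    "result.type",
    "result.object.metadata.labels",
    "result.object.spec.suspend"
];

// Hypothetical helper: joins the pieces without percent-encoding,
// mirroring the unencoded URL shown in the comment above.
function watchUrl(baseUrl: string, namespace: string, fieldPaths: string[]): string {
    const selector = `metadata.namespace=${namespace}`;
    return `${baseUrl}/api/v1/workflow-events/${namespace}` +
        `?listOptions.fieldSelector=${selector}&fields=${fieldPaths.join(",")}`;
}

const url = watchUrl("https://argo", "argo-workflows", fields);
```

Note that `result.object.status.startedAt` is among the requested fields, so the list view does receive that timestamp whenever the server has one.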

@scravy
Contributor Author

scravy commented Oct 4, 2022

@terrytangyuan No, they do not appear when the list view is refreshed. On the contrary: when they appear and you then refresh, they are gone again until the workflow transitions to a new node and pops up again.

@sarabala1979 We actually made a version jump from 3.1 where it used to work to 3.4 where it doesn't. I can try setting up a 3.3 later today and see whether the problem exists or not.

@ese On my installation I do not actually see any errors in the UI / dev console. As I mentioned above, once a workflow event comes in, that workflow is shown, but not all workflows are. Are you sure all missing workflows are being shown once you hit that error?

@tonyo

tonyo commented Oct 5, 2022

Seeing the same issue.
After I enqueue some workflows (and while they stay in the "Pending" state), they are not initially displayed when I open the list view. After some time they appear (as "Pending"). If I refresh the page, they are gone again.
kubectl get workflow and argo list correctly show those disappearing workflows as Pending.

We upgraded from 3.3.8 to 3.4.1 recently.

@scravy
Contributor Author

scravy commented Oct 12, 2022

I set up 3.3 on our dev cluster and things work as expected. Both 3.4.0 and 3.4.1 show the regression described here.

@stale

stale bot commented Oct 29, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.

@stale stale bot added problem/stale This has not had a response in some time and removed problem/stale This has not had a response in some time labels Oct 29, 2022
@argoproj argoproj deleted a comment from scravy Nov 16, 2022
@alexec
Contributor

alexec commented Nov 16, 2022

There is an error in your console, but I don’t see the diagnostics.

Can you please add any Argo Server logs that show more details?

@sarabala1979 sarabala1979 added the P2 Important. All bugs with >=3 thumbs up that aren’t P0 or P1, plus: Any other bugs deemed important label Nov 21, 2022
@scravy
Contributor Author

scravy commented Dec 20, 2022

@alexec Here are the Argo Server logs from running the workflow that I showed at the beginning.

time="2022-12-20T11:19:18.594Z" level=info msg="not enabling pprof debug endpoints"
time="2022-12-20T11:19:18.594Z" level=info authModes="[sso client]" baseHRef=/ managedNamespace= namespace=argo secure=true ssoNamespace=argo
time="2022-12-20T11:19:18.594Z" level=info msg="Generating Self Signed TLS Certificates for Secure Mode"
time="2022-12-20T11:19:18.896Z" level=info msg="SSO configuration" clientId="{{google-oauth} client-id <nil>}" insecureSkipVerify=false issuer="https://accounts.google.com" issuerAlias=DISABLED redirectUrl="https://octoplane.incrmntal.net/oauth2/callback" scopes="[email openid]"
time="2022-12-20T11:19:18.997Z" level=info msg="SSO enabled"
time="2022-12-20T11:19:19.005Z" level=info msg="Starting Argo Server" instanceID= version=v3.4.4
time="2022-12-20T11:19:19.042Z" level=info msg="Creating event controller" asyncDispatch=false operationQueueSize=16 workerCount=4
time="2022-12-20T11:19:19.045Z" level=info msg="GRPC Server Max Message Size, MaxGRPCMessageSize, is set" GRPC_MESSAGE_SIZE=104857600
time="2022-12-20T11:19:19.045Z" level=info msg="Argo Server started successfully on https://localhost:2746" url="https://localhost:2746"
time="2022-12-20T11:19:36.723Z" level=info duration="79.831µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:37.998Z" level=info duration="59.801µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.022Z" level=info duration="67.497µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.574Z" level=info duration="67.56µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.575Z" level=info duration="51.682µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.575Z" level=info duration="250.961µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.579Z" level=info duration="43.963µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.579Z" level=info duration="60.938µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.579Z" level=info duration="64.415µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.579Z" level=info duration="54.907µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.784Z" level=info duration="84.558µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.785Z" level=info duration="52.762µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:39.825Z" level=info duration="53.165µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:40.404Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:19:41.233Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:19:41.283Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:19:44.807Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:19:45.483Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:19:45.484Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:19:46.796Z" level=info duration="78.009µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:47.060Z" level=info duration="61.868µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:48.360Z" level=info duration="81.547µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:48.893Z" level=info duration="60.056µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.057Z" level=info duration="78.274µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.086Z" level=info duration="92.317µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.562Z" level=info duration="76.271µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.582Z" level=info duration="59.7µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.620Z" level=info duration="93.988µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.621Z" level=info duration="48.601µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.678Z" level=info duration="71.587µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.678Z" level=info duration="50.859µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.705Z" level=info duration="65.62µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:49.723Z" level=info duration="65.859µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:50.408Z" level=info duration="76.328µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:51.366Z" level=info duration="72.342µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:53.001Z" level=info duration="84.05µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.024Z" level=info duration="62.148µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.581Z" level=info duration="65.373µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.581Z" level=info duration="45.289µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.581Z" level=info duration="71.962µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.581Z" level=info duration="40.367µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.581Z" level=info duration="54.945µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.581Z" level=info duration="58.917µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.581Z" level=info duration="40.969µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.782Z" level=info duration="66.14µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.787Z" level=info duration="55.769µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:54.809Z" level=info duration="63.441µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:56.723Z" level=info duration="63.826µs" method=GET path=index.html size=473 status=0
time="2022-12-20T11:19:57.654Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:19:57.677Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ResubmitWorkflow grpc.service=workflow.WorkflowService grpc.start_time="2022-12-20T11:19:57Z" grpc.time_ms=24.06 span.kind=server system=grpc
time="2022-12-20T11:19:57.678Z" level=info duration=25.971117ms method=PUT path=/api/v1/workflows/argo/zargo-debug-child/resubmit size=2591 status=0
time="2022-12-20T11:19:57.822Z" level=info duration=12.340447759s method=GET path=/api/v1/workflow-events/argo size=8594 status=0
time="2022-12-20T11:19:57.822Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2022-12-20T11:19:45Z" grpc.time_ms=12340.208 span.kind=server system=grpc
time="2022-12-20T11:19:57.835Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:20:00.907Z" level=info duration="24.72µs" method=GET path=index.html size=0 status=304
time="2022-12-20T11:20:01.031Z" level=info duration=16.226099745s method=GET path=/api/v1/workflow-events/argo size=84524 status=0
time="2022-12-20T11:20:01.031Z" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=WatchWorkflows grpc.service=workflow.WorkflowService grpc.start_time="2022-12-20T11:19:44Z" grpc.time_ms=16225.839 span.kind=server system=grpc
time="2022-12-20T11:20:01.471Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:20:01.499Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:20:01.499Z" level=info msg="tracking UI usage️️" email=REDACTED name=openedWorkflowList subject=REDACTED
time="2022-12-20T11:20:01.499Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=CollectEvent grpc.service=info.InfoService grpc.start_time="2022-12-20T11:20:01Z" grpc.time_ms=1.439 span.kind=server system=grpc
time="2022-12-20T11:20:01.499Z" level=info duration=30.505105ms method=POST path=/api/v1/tracking/event size=2 status=0
time="2022-12-20T11:20:01.575Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:20:01.575Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetInfo grpc.service=info.InfoService grpc.start_time="2022-12-20T11:20:01Z" grpc.time_ms=1.503 span.kind=server system=grpc
time="2022-12-20T11:20:01.576Z" level=info duration=1.979622ms method=GET path=/api/v1/info size=67 status=0
time="2022-12-20T11:20:01.577Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:20:01.577Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetUserInfo grpc.service=info.InfoService grpc.start_time="2022-12-20T11:20:01Z" grpc.time_ms=1.642 span.kind=server system=grpc
time="2022-12-20T11:20:01.577Z" level=info duration=2.005306ms method=GET path=/api/v1/userinfo size=110 status=0
time="2022-12-20T11:20:01.578Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:20:01.613Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListCronWorkflows grpc.service=cronworkflow.CronWorkflowService grpc.start_time="2022-12-20T11:20:01Z" grpc.time_ms=36.216 span.kind=server system=grpc
time="2022-12-20T11:20:01.614Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=ListWorkflowTemplates grpc.service=workflowtemplate.WorkflowTemplateService grpc.start_time="2022-12-20T11:20:01Z" grpc.time_ms=143.616 span.kind=server system=grpc
time="2022-12-20T11:20:01.614Z" level=info duration=38.550003ms method=GET path=/api/v1/cron-workflows/argo size=26549 status=0
time="2022-12-20T11:20:01.706Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED
time="2022-12-20T11:20:01.778Z" level=info duration=309.0362ms method=GET path=/api/v1/workflow-templates/argo size=1043014 status=0
time="2022-12-20T11:20:01.836Z" level=info msg="using the default service account for user" email=REDACTED subject=REDACTED

Since opening the ticket we upgraded to Argo 3.4.4 and the problem still persists.

Anything I can do to share more information with you? I was on sabbatical the past two months but can react quickly now.

@angapov

angapov commented Jan 15, 2023

I can confirm that this works as expected in 3.3.9 but not in 3.4.0.

@agilgur5
Member

agilgur5 commented Sep 1, 2023

I dug into the (very large) changelog for 3.4.0-rc1 and the one thing that seemed potentially related to this is #8596, which added a date range filter to the UI. Specifically, this added a new default to the UI (this line) to only show the past month.

argo list meanwhile does not have such a default.

When you remove the date filter, does this still occur?

With the Workflow only appearing after a transition, that sounds like it could be because one of the statuses does not yet exist. The UI seems to exclusively filter on status.startedAt. Meanwhile, the CLI filters on the metadata.creationTimestamp and status.finishedAt.
metadata.creationTimestamp may exist immediately, whereas status.startedAt may not, if I had to guess.
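
To illustrate the guess above, here is a minimal sketch, not the actual UI code. The `WorkflowSummary` shape and the `inRange` helper are hypothetical, and the sketch assumes a Pending workflow has no `startedAt` yet. Filtering on `startedAt` drops the Pending workflow, while filtering on `creationTimestamp` keeps it.

```typescript
// Hypothetical, simplified shape of a workflow as seen by a list view.
interface WorkflowSummary {
    name: string;
    creationTimestamp: string; // present as soon as the object exists
    startedAt?: string;        // assumption: absent until the workflow has started
}

// Returns true only if the timestamp exists and lies within [min, max].
function inRange(ts: string | undefined, min: Date, max: Date): boolean {
    if (ts === undefined) {
        return false; // a missing timestamp can never match a date range
    }
    const t = new Date(ts).getTime();
    return t >= min.getTime() && t <= max.getTime();
}

const min = new Date("2022-09-01T00:00:00Z");
const max = new Date("2022-09-30T23:59:59Z");

const workflows: WorkflowSummary[] = [
    {name: "running", creationTimestamp: "2022-09-27T16:50:00Z", startedAt: "2022-09-27T16:50:05Z"},
    {name: "pending", creationTimestamp: "2022-09-27T16:51:00Z"}
];

// Filtering on startedAt hides the pending workflow ...
const byStartedAt = workflows.filter(wf => inRange(wf.startedAt, min, max)).map(wf => wf.name);
// ... while filtering on creationTimestamp keeps both.
const byCreation = workflows.filter(wf => inRange(wf.creationTimestamp, min, max)).map(wf => wf.name);
```

If this is what happens, it would also explain why a workflow pops up after a transition: by then the status has been populated.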

I may be able to fix the UI transition issue based on the above, but the CLI and UI having different defaults seems like it may be a bit unintuitive.

@agilgur5
Member

agilgur5 commented Sep 1, 2023

Looks like this may have been partially fixed by #9909 (released in 3.4.4) for Pending workflows specifically.

@agilgur5
Member

agilgur5 commented Sep 4, 2023

Was unable to reproduce on latest 3.5.0-rc1+. This seems like it was indeed fixed by #9909 as only a Pending workflow would have no status.startedAt.

We may want to remove the new default in 3.4.0+ of showing only last month first if users are confused by that. Then the UI would by default match the CLI as well. I'll leave that decision to more tenured contributors + maintainers to decide.
@terrytangyuan @sarabala1979 @alexec any thoughts on UI vs. CLI defaults as mentioned above?

@agilgur5 agilgur5 closed this as completed Sep 4, 2023
@agilgur5
Member

agilgur5 commented Sep 10, 2023

Ok, I did find a pretty bad bug in the date filtering logic: when either of the date filters was cleared, no Workflows would show, as the filter was comparing dates against undefined, which always results in false. 😬
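
A minimal sketch of that failure mode, assuming a filter shaped roughly like this. Both functions are hypothetical illustrations and not the code from the UI or from the fix. In JavaScript, a relational comparison against `undefined` converts it to `NaN`, and every comparison with `NaN` is false.

```typescript
// Buggy variant: compares directly against the bounds. The `any` type stands in
// for values that reach the comparison without an undefined check.
function buggyInRange(startedAt: Date, min: any, max: any): boolean {
    return startedAt >= min && startedAt <= max;
}

// Guarded variant: a cleared (undefined) bound means "no limit on this side".
function guardedInRange(startedAt: Date, min?: Date, max?: Date): boolean {
    const afterMin = min === undefined || startedAt.getTime() >= min.getTime();
    const beforeMax = max === undefined || startedAt.getTime() <= max.getTime();
    return afterMin && beforeMax;
}

const started = new Date("2023-09-05T12:00:00Z");
const upper = new Date("2023-09-10T00:00:00Z");

// With the lower bound cleared, the buggy filter rejects the workflow ...
const buggyResult = buggyInRange(started, undefined, upper);
// ... while the guarded filter accepts it.
const guardedResult = guardedInRange(started, undefined, upper);
```

Treating a cleared bound as "no limit" is one reasonable reading of what a user expects when clearing a date filter.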

I submitted a fix for this in #11792

That should only have an impact if one of the date filters was cleared, which was something I didn't test for when I tried repro'ing in my previous comment (I only tested with date filters set).

@agilgur5
Member

> We may want to remove the new default in 3.4.0+ of showing only last month first if users are confused by that. Then the UI would by default match the CLI as well. I'll leave that decision to more tenured contributors + maintainers to decide.
> @terrytangyuan @sarabala1979 @alexec any thoughts on UI vs. CLI defaults as mentioned above?

I've gone ahead and done this in #11840. With that PR, UI and CLI should now match.
The date filter is client-side, so having a default doesn't save on any networking, and this issue and #11671 show that the current behavior is confusing and buggy. Hopefully that will all be cleared up now (although I was still unable to reproduce some of the bugs listed here and there).

@agilgur5 agilgur5 self-assigned this Sep 19, 2023
@agilgur5 agilgur5 changed the title from "UI does not show all running workflows (it used to do so) / UI and argo list differ (missing running workflows)" to "UI does not show all running workflows (it used to) / UI and argo list differ (missing running workflows)" Feb 11, 2024
@argoproj argoproj locked as resolved and limited conversation to collaborators Jul 12, 2024