
remote: pod logs: catch-up delay #16132

Closed
edsantiago opened this issue Oct 12, 2022 · 5 comments
Labels
- flakes: Flakes from Continuous Integration
- locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.
- remote: Problem is in podman-remote

Comments

@edsantiago
Member

As best I can tell, podman-remote pod logs sometimes doesn't actually catch up fully, so a system test is failing:

# podman-remote --url unix:/tmp/podman_tmp_tAJi logs 9f7628c570de2a8939936681617a216ebc6cc43943bc4f64874c2e7850c81f20
#/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#|     FAIL: test string received on container
#| expected: 'uwOjtSVXmcfjVrydfvPFF9zoVWdfov'
#|   actual: ''
#\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The code in question is doing echo teststring | nc localhost PORT, then immediately running podman logs on a container in a pod.

# send a random string to the container. This will cause the container
# to output the string to its logs, then exit.
teststring=$(random_string 30)
echo "$teststring" | nc 127.0.0.1 $port_out
# Confirm that the container log output is the string we sent it.
run_podman logs $cid
is "$output" "$teststring" "test string received on container"

Seems to have started 08-Sept, which suggests a real code problem because that test code hasn't changed since June 2020.
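
In hindsight (see the commits referenced later in this thread), the likely fix is to synchronize on container exit before reading logs. Here is a minimal sketch in the same system-test style; the added run_podman wait line is my assumption based on the later commit messages, and the eventual fix (#16437) actually reworked the test around podman start -a instead.

# send a random string to the container. This will cause the container
# to output the string to its logs, then exit.
teststring=$(random_string 30)
echo "$teststring" | nc 127.0.0.1 $port_out

# ASSUMPTION (sketch only): block until the container has actually
# exited, so its output has reached the log before we read it.
run_podman wait $cid

# Confirm that the container log output is the string we sent it.
run_podman logs $cid
is "$output" "$teststring" "test string received on container"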

@edsantiago added the flakes (Flakes from Continuous Integration) and remote (Problem is in podman-remote) labels on Oct 12, 2022
@github-actions bot removed the remote (Problem is in podman-remote) label on Oct 12, 2022
@edsantiago added the remote (Problem is in podman-remote) label on Oct 13, 2022
@edsantiago
Member Author

Recent flakes, to remind everyone that this is still not fixed:

@edsantiago
Member Author

Another one in remote f36-aarch64 root:

# podman-remote logs e622101a248a44fac7660fa0c95d354c7f519785e952e665c57a275af9d3ff6c
#/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#|     FAIL: test string received on container
#| expected: '8UhHQsrXptSI4strBWkKJTBSfaKaWK'
#|   actual: ''
#\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Although in CI I only see this on podman-remote tests, yesterday I saw this flake on a 1mt VM:

✗ podman logs - since journald
   (from function `is' in file test/system/helpers.bash, line 743,
    from function `_log_test_since' in file test/system/035-logs.bats, line 189,
    in test file test/system/035-logs.bats, line 211)
     `_log_test_since journald' failed
   # /root/go/podman/bin/podman rm -t 0 --all --force --ignore
   # /root/go/podman/bin/podman ps --all --external --format {{.ID}} {{.Names}}
   # /root/go/podman/bin/podman images --all --format {{.Repository}}:{{.Tag}} {{.ID}}
   quay.io/libpod/testimage:20221018 f5a99120db64
   # /root/go/podman/bin/podman run --log-driver=journald -d --name test quay.io/libpod/testimage:20221018 sh -c echo before_bDnJJB3Ydm_journald; trap 'echo after_DRQNIJ59TR_journald; exit' SIGTERM; while :; do sleep 1; done
   cf99c637a0d167926ab15d44e597ed4e9fc8e60d3ddfc3498d27531a39bece3e
   # /root/go/podman/bin/podman stop test
   test
   # /root/go/podman/bin/podman logs test
   before_bDnJJB3Ydm_journald
   #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
   #|     FAIL: podman logs test
   #| expected: 'before_bDnJJB3Ydm_journald
   after_DRQNIJ59TR_journald'
   #|   actual: 'before_bDnJJB3Ydm_journald'
   #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

No idea if there's a connection, and the VM is long gone.
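
Purely as a sketch (not the fix that eventually landed), a test-side guard here would be to retry the log read briefly, on the assumption that the final journald-backed line just lags the container's exit for a moment:

# Sketch only. Assumes the trailing log line lags container exit briefly;
# "after_DRQNIJ59TR_journald" is simply the value from the failing run above.
expected_line="after_DRQNIJ59TR_journald"
for i in $(seq 1 10); do
    run_podman logs test
    [[ "$output" == *"$expected_line"* ]] && break
    sleep 0.5
done
# ...then run the usual 'is "$output" ...' assertion on the collected output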

@edsantiago
Member Author

Here's another non-remote logs flake (in f37 root):

podman create with --mount flag
   ...
         # podman [options] create --log-driver k8s-file --name test --mount type=bind,src=/tmp/podman_test336222415/secrets,target=/create/test quay.io/libpod/alpine:latest grep /create/test /proc/self/mountinfo
         35df13069504efe8e7675e87256b9340f734768f2efbcb0ced8e3747cd90d5a5
         # podman [options] start test
         test
         # podman [options] logs test
         [no output from podman logs]
         ....
           Expected
               <string>: 
           to contain substring
               <string>: /create/test rw

I'm not saying the bug is in podman-logs, but maybe I am.

@edsantiago
Member Author

And another one in the integration tests, also a podman logs run that spits out nothing.

edsantiago added a commit to edsantiago/libpod that referenced this issue Nov 7, 2022
This one has been a thorn in my side: it's a podman-log issue,
but not remote, so I _almost_ retitled containers#16132 (removing "remote").

Nope, it's a bug in the tests themselves. One solution would be to
podman-wait, but I see no reason for logs to be involved, so I
went with podman start -a instead. This removes the k8s-log stuff
which is no longer necessary. Cleanup all around.

Signed-off-by: Ed Santiago <[email protected]>
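
Roughly, the reshaping that commit describes looks like this (sketch only; $IMAGE stands in for the real test image, and the flags are taken from the failing run earlier in this thread):

# Racy pattern the test used: the log read can run before the container's
# output has been flushed by the log driver.
podman create --log-driver k8s-file --name test $IMAGE grep /create/test /proc/self/mountinfo
podman start test
podman logs test

# Sketch of the start -a approach: attach on start, block until the
# container exits, and capture its output directly -- no log driver needed.
podman create --name test $IMAGE grep /create/test /proc/self/mountinfo
podman start -a test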
edsantiago added a commit to edsantiago/libpod that referenced this issue Nov 7, 2022
It looks like containers#16132 was my fault: a missing 'wait' for a container
to exit. Let's see if this fixes the flake.

And, while poking through flake logs, I found another missing wait.

And... in wait_for_output(), address a potential race.

Signed-off-by: Ed Santiago <[email protected]>
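
And a hedged sketch of the wait_for_output() race that commit mentions, assuming the helper polls a container's log for a string while the container runs: if the container exits between the last log read and the liveness check, the string can be missed, so the exit path should do one final read. Illustration only, not the actual helpers.bash code.

# Illustrative only -- not the real wait_for_output() from helpers.bash.
wait_for_output_sketch() {
    local expect=$1 cid=$2
    local tries=30
    while [[ $tries -gt 0 ]]; do
        run_podman logs $cid
        [[ "$output" == *"$expect"* ]] && return 0
        # Race guard: if the container has already exited, read the log one
        # last time before giving up, in case the string landed after our read.
        run_podman container inspect --format '{{.State.Running}}' $cid
        if [[ "$output" != "true" ]]; then
            run_podman logs $cid
            [[ "$output" == *"$expect"* ]] && return 0
            return 1
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}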
@edsantiago
Member Author

I'm going to hope that this was fixed in #16437.

edsantiago added a commit to edsantiago/libpod that referenced this issue Nov 16, 2022
[backport containers#16437]

It looks like containers#16132 was my fault: a missing 'wait' for a container
to exit. Let's see if this fixes the flake.

And, while poking through flake logs, I found another missing wait.

And... in wait_for_output(), address a potential race.

Signed-off-by: Ed Santiago <[email protected]>
@github-actions bot added the locked - please file new issue/PR label on Sep 10, 2023
@github-actions bot locked as resolved and limited conversation to collaborators on Sep 10, 2023