make run-all-tests fails #218

Open
mhanif opened this issue Sep 12, 2022 · 5 comments

@mhanif
Collaborator

mhanif commented Sep 12, 2022

On a clean clone of "main" made today, make run-all-tests fails with the following error. I'm not sure what I am doing wrong:

$ make run-all-tests
# Ensure P4Runtime server is listening
t=5; \
while [ ${t} -ge 1 ]; do \
	if sudo lsof -i:9559 | grep LISTEN >/dev/null; then \
		break; \
	else \
		sleep 1; \
		t=`expr $t - 1`; \
	fi; \
done; \
docker exec -w /tests/libsai/vnet_out simple_switch-mhanif ./vnet_out
lsof: no pwd entry for UID 4321
lsof: no pwd entry for UID 4321
GRPC call SetForwardingPipelineConfig 0.0.0.0:9559 => /etc/dash/dash_pipeline.json, /etc/dash/dash_pipeline_p4rt.txt
GRPC ERROR[7]: Not primary, GRPC call Write::INSERT ERROR:
table_id: 38960243 match { field_id: 1 exact { value: "\000\000<" } } action { action { action_id: 21912829 } }Failed to create Direction Lookup Entry
make: *** [Makefile:253: run-libsai-test] Error 1
@chrispsommers
Collaborator

Hi @mhanif, this looks like #186, which you closed yesterday. The symptoms are similar, except this time you also have lsof: no pwd entry for UID 4321, which must be related to the recently merged https://github.com/Azure/DASH/pull/202.

Like last time, I see GRPC ERROR[7]: Not primary, GRPC call Write::INSERT ERROR:, which I've seen in the past when there is already a P4Runtime client owning the session to the switch. Can you confirm there are no extraneous docker containers running which might have attached to the bmv2 switch? Please paste the output of docker ps -a here, thanks.
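
To make a long docker ps -a listing easier to scan for containers that could still be attached to the switch, a small filter like the one below may help. It is only a sketch: running_containers is a hypothetical helper name, and it assumes the default table layout where columns are separated by two or more spaces and no single field contains a double space.

```shell
# running_containers: read a `docker ps -a` style listing on stdin and
# print the NAMES column (last field) of every container whose STATUS
# (fifth field) starts with "Up".
# Assumption: columns are separated by two or more spaces, as in the
# default `docker ps` table output.
running_containers() {
    awk -F'  +' 'NR > 1 && $5 ~ /^Up/ { print $NF }'
}

# Example (requires docker, not run here):
#   docker ps -a | running_containers
```

Docker can also do the filtering itself with docker ps --filter status=running --format '{{.Names}}'; the awk version is mainly useful on a listing that has already been pasted somewhere.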

@mhanif
Collaborator Author

mhanif commented Sep 12, 2022

Hi @chrispsommers, thanks a lot for looking into this. Below is the requested output:

$ docker ps -a
CONTAINER ID   IMAGE                                     COMMAND                  CREATED         STATUS                       PORTS     NAMES
940c99d8af69   chrissommers/dash-saithrift-bldr:220819   "./saiserver"            2 hours ago     Up 2 hours                             dash-saithrift-server-mhanif
eeb81e8cff56   chrissommers/dash-bmv2-bldr:220819        "env LD_LIBRARY_PATH…"   2 hours ago     Up 2 hours                             simple_switch-mhanif
058107d1ad3e   hello-world                               "/hello"                 5 months ago    Exited (0) 5 months ago                agitated_sinoussi
14ba71aec719   nf:latest                                 "tail -f /dev/null"      10 months ago   Exited (137) 10 months ago             fw
beb0503f1b82   endpoint:latest                           "tail -f /dev/null"      10 months ago   Exited (137) 10 months ago             ext
1bb2dabe1b0f   endpoint:latest                           "tail -f /dev/null"      10 months ago   Exited (137) 10 months ago             int
36465ac7f57f   endpoint:latest                           "tail -f /dev/null"      10 months ago   Exited (137) 10 months ago             h1
9acb562fb0f7   8425a5f345b8                              "/bin/sh -c 'apt-get…"   10 months ago   Exited (1) 10 months ago               determined_jackson
16377b7c6997   ubuntu:trusty                             "sh"                     10 months ago   Exited (0) 10 months ago               affectionate_lovelace

@chrispsommers
Collaborator

Looks normal to me. I assume you executed the customary three commands in three consoles (make run-switch, make run-saithrift-server, make run-all-tests)? Does this happen every time?

@mhanif
Collaborator Author

mhanif commented Sep 12, 2022

Yes, I ran the 3 commands in 3 separate consoles. The others behave as expected but run-all-tests doesn't. I tried a couple of times after calling kill-all and re-running them, and I get the same result. Thanks!

@chrispsommers
Collaborator

Looks like it happened once in the CI pipeline, for the first time ever AFAIK: https://github.com/Azure/DASH/actions/runs/3075002621/jobs/4968124219#step:13:24

It passed on a subsequent retry. I suspect a race condition when this line is executed. We look for a listener on the P4Runtime server socket, but there may be a delay before the server is actually "ready." Could you try inserting a sleep 3 or similar before this step and see if it makes your test runs succeed? Another experiment is to run all the steps for make run-all-tests manually, i.e. init-switch and so forth. That would also help confirm the theory that it's a timing issue, which will vary based on CPU speed, environment, etc.
https://github.com/Azure/DASH/blob/17912b53472433b9bfef4d04651d8d816571c0e8/dash-pipeline/Makefile#L280
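
For the sleep experiment, something along these lines could stand in for the polling loop shown in the log above. This is a sketch only, not the Makefile's actual code: wait_for and p4rt_listening are hypothetical names, and the 3-second grace period is just the guess from this comment. Unlike the loop in the log, which proceeds to the test even when the timeout runs out, this version returns non-zero on timeout.

```shell
# wait_for TIMEOUT GRACE CMD [ARGS...]
# Run CMD once per second until it succeeds or TIMEOUT attempts are used up.
# On success, sleep GRACE more seconds (to give the server time to become
# ready after the socket appears), then return 0. On timeout, return 1.
wait_for() {
    timeout=$1
    grace=$2
    shift 2
    while [ "$timeout" -ge 1 ]; do
        if "$@" >/dev/null 2>&1; then
            sleep "$grace"
            return 0
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}

# Hypothetical probe mirroring the check in the Makefile recipe.
p4rt_listening() {
    sudo lsof -i:9559 | grep LISTEN
}

# Example (not run here): wait up to 5 s for the listener, then 3 s more.
#   wait_for 5 3 p4rt_listening && echo "P4Runtime server should be ready"
```

If the tests pass reliably with a grace period and fail without one, that points at the listener check being too optimistic rather than at a stale client session.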
