Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RouterTestSslInterRouter::test_connected_tls_sasl_routers, part of system_tests_ssl, takes a long time (>300s) #420

Closed
Tracked by #456
jiridanek opened this issue May 3, 2022 · 8 comments · Fixed by #423
Assignees
Labels
bug Something isn't working

Comments

@jiridanek
Copy link
Contributor

jiridanek commented May 3, 2022

https://github.com/skupperproject/skupper-router/runs/6253209682?check_suite_focus=true#step:26:3238

Note the time difference between previous passing test at 06:48:43 and test_connected_tls_sasl_routers passing at 06:53:52.

2022-05-02T06:48:43.2487811Z 40: ::RouterTestSslClient::test_tls_all �[32mPASSED�[0m
2022-05-02T06:48:43.2492787Z 40: Router QDR.A output file:
2022-05-02T06:48:43.2496696Z 40: >>>>
2022-05-02T06:48:43.2500996Z 40: -----------------------------------------------------
2022-05-02T06:48:43.2504875Z 40: Suppressions used:
2022-05-02T06:48:43.2508657Z 40:   count      bytes template
2022-05-02T06:48:43.2512522Z 40:       4        224 ^IoAdapter_init$
2022-05-02T06:48:43.2516288Z 40:     720     708025 /libpython3.*.so
2022-05-02T06:48:43.2520386Z 40: -----------------------------------------------------
2022-05-02T06:48:43.2524265Z 40: 
2022-05-02T06:48:43.2565966Z 40: 
2022-05-02T06:48:43.2566201Z 40: <<<<
2022-05-02T06:48:43.2566398Z 40: 

[...]

2022-05-02T06:53:52.3057972Z 40: ::RouterTestSslInterRouter::test_connected_tls_sasl_routers �[32mPASSED�[0m

[...]

32/32 Test #40: system_tests_ssl ......................   Passed  325.81 sec

It seems an invariable rule that the ssl tests always finish last.

After setting a 300s timeout per test (here meaning python test method), which is IMO still overly generous, the test gets reported as exceeding that timeout

https://github.com/skupperproject/skupper-router/runs/6263670044?check_suite_focus=true#step:26:3841

        # Wait till listener is running and all expected connectors are connected
        cls.routers[0].wait_ports()
        for router in cls.connected_tls_sasl_routers[1:]:
>           cls.routers[0].wait_router_connected(router)

/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_tests_ssl.py:744: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_test.py:818: in wait_router_connected
    retry(lambda: self.is_router_connected(router_id), **retry_kwargs)
/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_test.py:144: in retry
    delay = retry_delay(deadline, delay, max_delay)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

deadline = 1651524401.7453303, delay = 1, max_delay = 1

    def retry_delay(deadline, delay, max_delay):
        """For internal use in retry. Sleep as required
        and return the new delay or None if retry should time out"""
        remaining = deadline - time.time()
        if remaining <= 0:
            return None
>       time.sleep(min(delay, remaining))
E       Failed: Timeout >300.0s

/home/runner/work/skupper-router/skupper-router/skupper-router/tests/system_test.py:124: Failed
---------------------------------------------------------------------- Captured log setup ----------------------------------------------------------------------
INFO     proton:_reactor.py:1053 Connecting to Url('amqp://0.0.0.0:29029')...
INFO     proton:_reactor.py:1091 Connected to 0.0.0.0
INFO     proton:_reactor.py:1053 Connecting to Url('amqp://0.0.0.0:29029')...
INFO     proton:_reactor.py:1091 Connected to 0.0.0.0
INFO     proton:_reactor.py:1105 Disconnected, already closed
INFO     proton:_reactor.py:1053 Connecting to Url('amqp://0.0.0.0:29029')...
INFO     proton:_reactor.py:1091 Connected to 0.0.0.0
INFO     proton:_reactor.py:1105 Disconnected, already closed
======================================================================= warnings summary =======================================================================
@kgiusti kgiusti added this to the 2.1.0 milestone May 3, 2022
ganeshmurthy added a commit to ganeshmurthy/skupper-router that referenced this issue May 3, 2022
…ter QDR.A to connect to the other routers. This is not a fix for the issue
@ganeshmurthy
Copy link
Contributor

I added some prints in the associated PR. This is what I see locally -

44: QDR.A to  QDR.B connecting time: 7.25615668296814 seconds
44: QDR.A to  QDR.C connecting time: 0.021644115447998047 seconds
44: QDR.A to  QDR.D connecting time: 60.00566530227661 seconds
44: test_connected_tls_sasl_routers Execution time: 0.050333499908447266 seconds

Will be interesting to see what the CI reports.

ganeshmurthy added a commit to ganeshmurthy/skupper-router that referenced this issue May 3, 2022
…ter QDR.A to connect to the other routers. This is not a fix for the issue
@jiridanek
Copy link
Contributor Author

Such suspense, I could not resist and had a peek

https://github.com/ganeshmurthy/skupper-router/runs/6275527291?check_suite_focus=true#step:26:3209

40: ::RouterTestSslInterRouter::test_connected_tls_sasl_routers QDR.A to  QDR.B connecting time: 2.0318427085876465 seconds
40: QDR.A to  QDR.C connecting time: 0.07413935661315918 seconds
40: QDR.A to  QDR.D connecting time: 300.0044779777527 seconds
40: test_connected_tls_sasl_routers Execution time: 0.0192108154296875 seconds

Both 60 and 300 are values of QPID_SYSTEM_TEST_TIMEOUT. 60 is the default, and 300 is increased timeout for CI.

@ganeshmurthy ganeshmurthy added the bug Something isn't working label May 10, 2022
ganeshmurthy added a commit that referenced this issue May 31, 2022
…onnect to the other routers. This is not a fix for the issue (#423)
@jiridanek
Copy link
Contributor Author

Not resolved, test still takes long time

@jiridanek jiridanek reopened this May 31, 2022
@ganeshmurthy
Copy link
Contributor

Not resolved, test still takes long time

I did not use 'Fixes #420' in my commit message but that did not matter, the issue was still closed !!(??)!!

@jiridanek
Copy link
Contributor Author

I think that @kgiusti is responsible for that :P

image

image

@jiridanek
Copy link
Contributor Author

In any case, I get email when my issues get closed, and the issue list is still of manageable size, so I can remember context and scream "not fixed!"

@ganeshmurthy
Copy link
Contributor

Gotcha ! Thanks.

@ganeshmurthy ganeshmurthy modified the milestones: 2.1.0, 2.2.0 Sep 29, 2022
@ganeshmurthy ganeshmurthy modified the milestones: 2.2.0, 2.3.0 Nov 11, 2022
@ganeshmurthy ganeshmurthy modified the milestones: 2.3.0, 2.4.0 Feb 6, 2023
@ganeshmurthy ganeshmurthy modified the milestones: 2.4.0, 2.5.0 Apr 12, 2023
@ganeshmurthy
Copy link
Contributor

Closing this, might reopen if this problem shows up again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants