
Node HTTP outgoing request fails randomly #1470

Closed
aviramha opened this issue May 25, 2023 · 12 comments
Labels: bug (Something isn't working), retest

Comments

aviramha (Member) commented May 25, 2023

I believe this is an fd leak / resource leak issue, since it happens randomly. We also had a flaky e2e test with many outgoing HTTP requests, so it might be relevant.

Happens in a Linux container on a Mac M1 with Node.

eyalb181 added the bug (Something isn't working) label on May 25, 2023
aviramha (Member Author) commented:

Relevant:
#757

t4lz (Member) commented Jun 7, 2023

When running the test from @infiniteregrets's branch on Mac, but changing the test to send all the requests 30 times in a loop, at some point the agent sends a ConnectTimedOut back to the layer and the application gets an ENETUNREACH. Is this the bug reported here, or is this another, unrelated issue? (it does not happen when running that test app without mirrord)
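For context, the looped variant is roughly this (a minimal sketch, not the actual test code; `makeRequests()` is assumed to be the batch-sending helper from the e2e script mentioned later in this thread):

```js
// Sketch of the modified test: send the whole batch of requests 30 times in a row.
// makeRequests() is assumed to send one batch of outgoing HTTP requests and
// resolve once all of their responses (or errors) have been handled.
for (let i = 0; i < 30; i++) {
  await makeRequests();
}
```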

aviramha (Member Author) commented Jun 7, 2023

> When running the test from @infiniteregrets's branch on Mac, but changing the test to send all the requests 30 times in a loop, at some point the agent sends a ConnectTimedOut back to the layer and the application gets an ENETUNREACH. Is this the bug reported here, or is this another, unrelated issue? (it does not happen when running that test app without mirrord)

I don't think we managed to reproduce the issue locally. It only happens in the E2E tests (after the fix); before the fix we had another issue (or the same one?) that occurred locally.

t4lz (Member) commented Jun 7, 2023

But what was the error? Was it ENETUNREACH?

aviramha (Member Author) commented Jun 7, 2023

> But what was the error? Was it ENETUNREACH?

Before the fix, locally? It was connection reset.

t4lz (Member) commented Jun 7, 2023

Do we have the exact node error? Is it the same issue as #564?

aviramha (Member Author) commented Jun 7, 2023

Seems so, yes.

aviramha (Member Author) commented:

Does anyone know if this still happens, or how to check whether it still happens? I believe the internal proxy refactor should've solved it.

infiniteregrets (Contributor) commented:

I have stress-tested this on main locally and it does not happen. I will give it a try in CI with an e2e test; if that passes, I will add an integration test. Related to #1484.

gememma (Member) commented Sep 10, 2024

Looks like this passes with 30 loops of makeRequests() in CI now (https://github.com/metalbear-co/mirrord/actions/runs/10794249050/job/29938137749?pr=2748) but fails locally on more than about 10 loops - feels like it could be a workload problem where the sheer number of requests is increasing the latency?
Weirdly, sometimes test_outgoing_traffic_many_requests_disabled (running the test without mirrord) also fails with ECONNRESET, which makes me think this isn't a mirrord problem.

EDIT: After a few more runs I'm not convinced there is any pattern in the failures (other than more loops failing more often).
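For anyone reproducing this, the errors above surface on the client request object in Node; a minimal sketch of how to log them (the target URL is just a placeholder):

```js
import http from "node:http";

// Minimal sketch: make one outgoing request and log the error code/message if it
// fails. ECONNRESET shows up in err.code; "socket hang up" shows up in err.message
// (also with code ECONNRESET) when the connection dies before a response arrives.
const req = http.get("http://example.com/", (res) => {
  res.resume(); // drain the body so the socket is freed
  console.log("status:", res.statusCode);
});

req.on("error", (err) => {
  console.error("request failed:", err.code, "-", err.message);
});
```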

gememma (Member) commented Sep 12, 2024

A short(ish) summary of the investigation:

Methods attempted:

  • in CI, both with and without mirrord (the tests test_outgoing_traffic_many_requests_enabled/_disabled respectively)
  • locally on a Mac M1, macOS 14.6.1, via the tests above and by manually running the app with mirrord, as well as running Node without mirrord

All of the above used the script in tests/node-e2e/outgoing/test_outgoing_traffic_many_requests.mjs, changing the number of loops to make more requests.

Observations:

  • the error that occurs (socket hang up) is a timeout issue - the client has waited too long for a response since sending the request and force-closes the socket
  • mirrord does not emit any interesting logs that suggest unexpected errors or weird behaviour, even when using mirrord-console
  • in Node, all requests are sent before any responses are received, both with and without mirrord
  • I believe this reveals that the issue is caused by all the requests (the critical mass locally with mirrord was around 240 requests sent one after the other) competing for CPU time and choking each other out - mirrord can only do so many things at once
  • running locally without mirrord succeeds until around 2400 requests, at which point the same error occurs - any computer can only do so many things at once
  • running a similar script in Python did not encounter the same issue, probably because the script blocked on receiving the response from each request before making the next
  • in Node, the issue is completely fixed if you add a second of sleep() between each batch of 12 requests (see the sketch after this list)
  • sending requests to localhost in a cluster does not cause the same issue (I assume because it's so much faster that it never reaches the point of serious CPU time contention)
  • when running in CI, sending enough requests causes both test_outgoing_traffic_many_requests_enabled/_disabled to fail, just like running the script locally
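The batching workaround from the observations above looks roughly like this (a sketch, assuming a hypothetical sendRequest() helper that fires a single outgoing request; the batch size of 12 and the one-second pause match what worked locally):

```js
// Sketch of the workaround: send requests in batches of 12 and sleep for a
// second between batches so responses can be processed before the next burst.
// sendRequest(i) is a hypothetical helper that performs one outgoing HTTP request.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const TOTAL_REQUESTS = 240; // roughly where the failures started locally
const BATCH_SIZE = 12;

for (let start = 0; start < TOTAL_REQUESTS; start += BATCH_SIZE) {
  const batch = [];
  for (let i = start; i < Math.min(start + BATCH_SIZE, TOTAL_REQUESTS); i++) {
    batch.push(sendRequest(i));
  }
  await Promise.all(batch); // wait for this batch's responses
  await sleep(1000);        // one second of breathing room between batches
}
```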

eyalb181 (Member) commented:

Closing this as it seems it's not a bug on our end, just mirrord exacerbating timeouts with its (negligible) added latency, which is a known issue.

eyalb181 closed this as not planned (won't fix, can't repro, duplicate, stale) on Sep 12, 2024