Simultaneous wifi send & receive stalling #3011

Closed
b3nn0 opened this issue Mar 2, 2017 · 7 comments
b3nn0 commented Mar 2, 2017

Basic Infos

Hardware

Hardware: ESP-12 (NodeMCU)
Core Version: Tested 2.3.0 and Git

Description

I'm trying to implement a simple HTTP proxy server on my NodeMCU. It already works fine for very small transfers of only a few kb. However, when transferring larger files through the proxy, the connection transmits only the first few kb and then stalls forever in a WiFiClient::write() call.
I also tried several combinations of setNoDelay(), delay() calls, yield() calls, etc., all to no avail.
It feels as if adding a few delay() calls keeps the connection alive a bit longer, but I have not measured it.

Terminology:
Let "server" be the server that hosts a website,
let "client" be a browser that tries to contact the "server" via the NodeMCU proxy.

The code during a file download basically comes down to

void loop() {
  size_t len = serverConn.read(buff, 1460*2);
  if (len > 0)
    clientConn.write(buff, len);
}

The rest of the code is just connection handling, etc.
At some point, serverConn.read() reports that it has read some data, but the following clientConn.write() hangs forever.
If the data does not come from the server but is generated on the device (e.g. sending megabytes of '\0' arrays), the transmission rate is around 1Mb/s and stable. If the data is not forwarded to the client but simply discarded, it is also around 1Mb/s and stable.

My uneducated guess is that, since the server keeps sending data as fast as possible, the internal rx buffer of the ESP fills up.
clientConn.write() then waits for an ACK but can never receive it, because the buffer is already full of data from the server. No idea whether that makes sense, it just sounded plausible.
I also checked Wireshark on the client side, and it seems that the ESP retransmits multiple times and ignores the ACKs from the client, which also supports my guess.

Settings in IDE

Module: NodeMCU 1.0
Flash Size: 4MB
CPU Frequency: 80MHz
Flash Mode: ?
Flash Frequency: ?
Upload Using: SERIAL
Reset Method: nodemcu

Sketch

The complete sketch for local testing can be found here:
https://paste.ubuntu.com/24095234/
There are also the two functions dlPage() and servePage() I used to test the general throughput of the device.

Debug Messages

Connected to: xxx, IP address: 192.168.1.17
New connection from proxy client
Read host header: GET http://example.com/large/file.zip HTTP/1.1
Connecting to host: example.com, Port: 80
Reading from browser
Read 59 bytes
Wrote bytes to http client
Done something! - Bytes written: 59
Reading from server
Read 2920 bytes from http server
Sent 2920 to proxy client
Done something! - Bytes written: 5840
Reading from server
Read 2920 bytes from http server
Sent 2920 to proxy client
Done something! - Bytes written: 5840
Reading from server
Read 2920 bytes from http server
Sent 2920 to proxy client
Done something! - Bytes written: 5840
...
Reading from server
Read 2920 bytes from http server
Sent 2920 to proxy client
Done something! - Bytes written: 5840
Reading from server
Read 2920 bytes from http server

Note that it does not always hang after the same number of bytes. Sometimes it transmits only 5kb, sometimes 260kb, or anything in between (only rarely more than that).

Tested with curl:
http_proxy=http://192.168.1.17:8080 curl http://example.com/large/file.zip > /dev/null

@d-a-v
Collaborator

d-a-v commented Apr 14, 2017

Could you try #3129?

With:
http_proxy=http://192.168.1.101:8080 curl http://download.thinkbroadband.com/1GB.zip > /dev/null
I tried your sketch and got much further than 260kb,
but it eventually ended up in the same state (sitting and waiting).

This PR also adds a new call, WiFiClient::availableForWrite().
I modified your sketch this way:

      size_t avail = httpClients[i].availableForWrite();
      if (avail > BUFF_SIZE)
        avail = BUFF_SIZE;
      size_t lenClient = proxyClients[i].read((uint8_t*)buf, avail);

and

      size_t avail = proxyClients[i].availableForWrite();
      if (avail > BUFF_SIZE)
        avail = BUFF_SIZE;
      lenServer = httpClients[i].read((uint8_t*)buf + lenServer, avail);

and the sketch is still running happily after 6 min and 20MB.

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  2 1024M    2 21.8M    0     0  65621      0  4:32:42  0:05:48  4:26:54 72496
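
The idea behind the modification above (never read more from one side than the other side can accept right now) can be sketched as a small helper. This is only an illustration under assumptions, not the code from the PR: relayOnce() and MockConn are hypothetical names, and MockConn is a stand-in that lets the logic run off-device. On the ESP8266, the two WiFiClient objects would be passed instead, which assumes a core where availableForWrite() exists.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a WiFiClient-like endpoint (not part of any
// library). It only mimics the four calls the relay needs.
struct MockConn {
  std::vector<uint8_t> rx;  // bytes waiting to be read
  std::vector<uint8_t> tx;  // bytes accepted for sending
  size_t txWindow;          // simulated free space in the send buffer

  int available() const { return static_cast<int>(rx.size()); }
  size_t availableForWrite() const { return txWindow; }

  int read(uint8_t* buf, size_t n) {
    size_t k = std::min(n, rx.size());
    std::copy(rx.begin(), rx.begin() + k, buf);
    rx.erase(rx.begin(), rx.begin() + k);
    return static_cast<int>(k);
  }

  size_t write(const uint8_t* buf, size_t n) {
    size_t k = std::min(n, txWindow);
    tx.insert(tx.end(), buf, buf + k);
    txWindow -= k;
    return k;
  }
};

// Forward at most as many bytes as dst can take without blocking.
// Returns the number of bytes forwarded in this pass (possibly 0).
template <typename Src, typename Dst>
size_t relayOnce(Src& src, Dst& dst, uint8_t* buf, size_t cap) {
  size_t room = static_cast<size_t>(dst.availableForWrite());
  if (room > cap) room = cap;                      // never overrun buf
  if (room == 0 || src.available() <= 0) return 0; // nothing to do yet
  int n = src.read(buf, room);                     // read only what fits
  if (n <= 0) return 0;
  return dst.write(buf, static_cast<size_t>(n));
}
```

Called once per direction from loop(), this leaves unread data in the receive path whenever the destination has no room, instead of blocking inside write().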

@b3nn0
Author

b3nn0 commented Apr 15, 2017

Hi,
Indeed, the PR you mentioned seems to make things much more stable.
However, it also seems to make things really slow, if I'm not mistaken.

With the original network stack, transfer speeds were around 1Mb/s for serving data from memory or downloading data from remote. So the throughput I'd expect from the proxy would be around 400-500kb/s.

With the code from the PR, the proxy only reaches around 50-70kb/s. Serving data from memory is around 150-200kb/s.
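
Since the units in this thread are written loosely (kb/s, Mb/s, Mbits/s), a small helper can make such comparisons explicit. This is a minimal sketch: bytesPerSecond() and kilobitsPerSecond() are hypothetical names, and the example numbers come from the curl output earlier in the thread, assuming curl's "M" suffix means 1024*1024 bytes.

```cpp
#include <cstdint>

// Average throughput in bytes per second for a transfer of `bytes`
// that took `elapsedMs` milliseconds. Returns 0 for a zero duration.
double bytesPerSecond(uint64_t bytes, uint64_t elapsedMs) {
  if (elapsedMs == 0) return 0.0;
  return static_cast<double>(bytes) * 1000.0 / static_cast<double>(elapsedMs);
}

// The same rate expressed in kilobits per second (1 kbit = 1000 bits).
double kilobitsPerSecond(double bytesPerSec) {
  return bytesPerSec * 8.0 / 1000.0;
}
```

For example, 21.8M received in 5 min 48 s (348 s) works out to a little under 66,000 bytes per second, in line with the 65621 that curl reports as average speed. That is roughly 525 kbit/s.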

You can easily try that out by uncommenting the call to servePage() and running the sketch once without your PR applied and once with it. To apply your changes, I followed the instructions at https://github.com/d-a-v/esp8266-phy. Is that even correct? And is there a way to revert to the original network stack? Calling "make revert" doesn't seem to do anything: I can still call availableForWrite().

EDIT: Never mind about the last part. make revert did bring back the old behavior.

@d-a-v
Collaborator

d-a-v commented Apr 15, 2017 via email

@d-a-v
Collaborator

d-a-v commented Apr 18, 2017

I did not manage to serve data from memory with your sketch.

However, I have been using my own test sketch, echoer.ino.
With lwip-1.4, it stops quickly. With lwip-2:
With MSS=536, it runs at ~1Mbits/s.
With MSS=1460, it runs at ~3Mbits/s (at the cost of more RAM).
It could run even faster, but I have not yet figured out how to run particular C functions from IRAM instead of IROM without patching the lwip2 sources. It's a matter of ldscripts (like this: d-a-v/esp8266-phy@e9adb1f).

@d-a-v
Collaborator

d-a-v commented Sep 5, 2018

@b3nn0 In light of recent core updates, should this issue be closed?

@b3nn0
Author

b3nn0 commented Sep 5, 2018

Sorry, I'm currently not able to test it any more, as I don't have an ESP here; they are all in production use. So I can't really comment on this any more.

@d-a-v
Collaborator

d-a-v commented Sep 5, 2018

> So I can't really comment on this any more.

If you have further issues, feel free to report them. Closing this one.

@d-a-v d-a-v closed this as completed Sep 5, 2018