-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESP8266WebServer - Drop inactive connection when another is waiting to improve page load time #8216
Conversation
I was not able to reproduce this safari behavior with the However I'm not telling it cannot happen like you say. The logic of your PR is the following: if "the current connection has currently nothing to say" As a result, both connections are closed. I'm not sure that this is what you intend. Indeed, Did you intend to do what's below instead ? diff --git a/libraries/ESP8266WebServer/src/ESP8266WebServer-impl.h b/libraries/ESP8266WebServer/src/ESP8266WebServer-impl.h
index 8c5cdf40..d570e877 100644
--- a/libraries/ESP8266WebServer/src/ESP8266WebServer-impl.h
+++ b/libraries/ESP8266WebServer/src/ESP8266WebServer-impl.h
@@ -334,8 +334,9 @@ void ESP8266WebServerTemplate<ServerType>::handleClient() {
break;
} // switch _parseRequest()
} else {
- // !_currentClient.available(): waiting for more data
- if (millis() - _statusChange <= HTTP_MAX_DATA_WAIT) {
+ // !_currentClient.available():
+ // wait until timeout except if another connection is available
+ if (!_server.hasClient() && millis() - _statusChange <= HTTP_MAX_DATA_WAIT) {
keepCurrentClient = true;
}
else |
I see. I didn't realize that calling #define DEBUG_ESP_HTTP_SERVER
#include <ESP8266WiFi.h>
#include <ESP8266WebServer.h>
const char *ssid = "xxx";
const char *wifiPassword = "xxx";
ESP8266WebServer HTTPserver(80);
void setup() {
Serial.begin(115200);
// Initialize WiFi
Serial.print("Connecting to WiFi");
WiFi.mode(WIFI_STA);
WiFi.begin(ssid, wifiPassword);
unsigned long startConnect = millis();
while(WiFi.status() != WL_CONNECTED && millis()-startConnect < 30000) {
delay(1000);
Serial.print(".");
}
if(WiFi.status() != WL_CONNECTED) {
Serial.println("\nWiFi connection failed! Rebooting...");
delay(1000);
ESP.restart();
}
Serial.println(" success");
Serial.print("\tConnected to ");
Serial.println(ssid);
Serial.print("\tIP address: ");
Serial.println(WiFi.localIP());
HTTPserver.on("/", page);
HTTPserver.begin();
}
void loop() {
HTTPserver.handleClient();
}
void page() {
HTTPserver.send(200, "text/plain", "Time: "+String(millis()));
} This Wireshark capture of the unmodified library and the double Safari connections (ESP is 192.168.1.10, computer is 192.168.1.2, the HTTP GET is sent over the 53254 connection):
Like I tried to say in my first comment, this doesn't seem to happen every time, so these captures are of course only the ones that show the double connection issue. Here's what this becomes after my proposed modification.
From the Wireshark capture, it's clear that the act of calling Your suggestion of using
However, I was hoping to be able to verify that there was actually something available for reading from the new connection before dropping the current one (the
Or, equivalently, |
It is not currently implemented but may be possible. I haven't tested if it can indeed return a positive number but may help with the above patch if it does. diff --git a/libraries/ESP8266WiFi/src/WiFiServer.cpp b/libraries/ESP8266WiFi/src/WiFiServer.cpp
index 644f8b45..72f89d7b 100644
--- a/libraries/ESP8266WiFi/src/WiFiServer.cpp
+++ b/libraries/ESP8266WiFi/src/WiFiServer.cpp
@@ -109,6 +109,10 @@ bool WiFiServer::hasClient() {
return false;
}
+int WiFiServer::hasClientData() {
+ return _unclaimed? _unclaimed->getSize(): -1;
+}
+
WiFiClient WiFiServer::available(byte* status) {
(void) status;
if (_unclaimed) {
diff --git a/libraries/ESP8266WiFi/src/WiFiServer.h b/libraries/ESP8266WiFi/src/WiFiServer.h
index f0619383..e81efb89 100644
--- a/libraries/ESP8266WiFi/src/WiFiServer.h
+++ b/libraries/ESP8266WiFi/src/WiFiServer.h
@@ -82,6 +82,11 @@ public:
virtual ~WiFiServer() {}
WiFiClient available(uint8_t* status = NULL);
bool hasClient();
+ // hasClientData():
+ // -1: no client
+ // 0: client but no data
+ // >0: awaiting client with incoming data
+ int hasClientData();
void begin();
void begin(uint16_t port);
void begin(uint16_t port, uint8_t backlog);
I agree that the current default timeout set to 5 second might be a bit large.
This could be the start of a neverending optimization process: Should we look only the next one if the current one is empty ? By the way, AsyncWebserver might be worth a try (I'm not saying that we should not fix our webserver). One still needs to remember that memory is precious and if we can make our not-async-webserver responsive (ie, without useless timeout) it should be sufficient regarding the esp8266's capabilities. |
Your
I replaced my old commit with a new one that includes the new server function and a new timeout constant for this connection dropping behavior that I've set to 30ms. I agree that this solution is a bit weird since there could conceivably be any number of mute connections and we don't know why Safari is making this one in the first place. But if you think this specific case is worth addressing, this new commit does fix it.
I did look at it for the project I'm working on, but I wanted HTTPS which AsyncWebServer doesn't really seem to support. Your library looks like the best with HTTPS for the ESP8266. In fact, messing with HTTPS was what originally led me on to this problem. The ESP takes about 2-3 seconds to respond with its certificate, which I haven't really looked into yet (seems to be faster with the faster 160MHz CPU speed setting, so the crypto might just be computationally expensive which is unavoidable I guess). While I was trying to figure out what was causing that delay when I ran into this other 5-second problem. |
It looks nice ! |
Yeah, that seems like it would be a better solution than just checking the first connection in the list. I actually had another idea for a new version of the diff --git a/libraries/ESP8266WiFi/src/WiFiServer.cpp b/libraries/ESP8266WiFi/src/WiFiServer.cpp
index 72f89d7b..360de616 100644
--- a/libraries/ESP8266WiFi/src/WiFiServer.cpp
+++ b/libraries/ESP8266WiFi/src/WiFiServer.cpp
@@ -109,8 +109,15 @@ bool WiFiServer::hasClient() {
return false;
}
-int WiFiServer::hasClientData() {
- return _unclaimed? _unclaimed->getSize(): -1;
+size_t WiFiServer::hasClientData() {
+ ClientContext *next = _unclaimed;
+ while (next) {
+ size_t s = next->getSize();
+ // return the amount of data available from the first connection that has any
+ if (s) return s;
+ next = next->next();
+ }
+ return 0;
}
WiFiClient WiFiServer::available(byte* status) {
diff --git a/libraries/ESP8266WiFi/src/WiFiServer.h b/libraries/ESP8266WiFi/src/WiFiServer.h
index e81efb89..9efb5cca 100644
--- a/libraries/ESP8266WiFi/src/WiFiServer.h
+++ b/libraries/ESP8266WiFi/src/WiFiServer.h
@@ -83,10 +83,9 @@ public:
WiFiClient available(uint8_t* status = NULL);
bool hasClient();
// hasClientData():
- // -1: no client
- // 0: client but no data
- // >0: awaiting client with incoming data
- int hasClientData();
+ // returns the amount of data available from the first client
+ // or 0 if there is none
+ size_t hasClientData();
void begin();
void begin(uint16_t port);
void begin(uint16_t port, uint8_t backlog); This addresses the case in which we have 3 connections and the request is sent in the last one:
However, it seems like somewhere there's a limit of 6 connections maximum, so this partially fails if you open 7 or more connections and send the request on the last one. It gets stuck for 5 seconds on the first one before it closes it (meanwhile the server gets angry and spams it with TCP retransmission packets), accepts the last connection and sees that it has data, and quickly closes the ones in between to get to it. Your proposed solution doesn't have this problem, but it would skip any open connections even if there isn't any data to read. I guess the question comes down to which scenario is more likely: more than 6 empty connections before one with data, or multiple slow clients that take more than 30ms to send their data. Given that I seem to be the first one complaining about the mute connections, the latter is probably more likely?... Which solution do you prefer? I made this little python script to test empty connections if you want to play with it. import sys
import socket
import time
import signal
if len(sys.argv) != 4:
print("Usage: "+sys.argv[0]+" (ip) (number of connections) (connection to send request on)")
sys.exit(1)
connections = []
def closeConnections(sig = None, frame = None):
for connection in connections:
connection.close()
print("Closed connections")
sys.exit(0)
signal.signal(signal.SIGINT, closeConnections)
for x in range(int(sys.argv[2])):
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((sys.argv[1], 80))
connections.append(client)
print("Connections established")
if int(sys.argv[3]) >= 0:
connections[int(sys.argv[3])].send(("GET / HTTP/1.1\r\nHost:%s\r\n\r\n" % sys.argv[1]).encode())
print("Request sent on connection "+sys.argv[3])
else:
print("Not sending request")
time.sleep(30)
closeConnections() |
Given that fact, as you mentioned it, we can imagine a step further an angry safari making 10 connections with only the last one containing a non-void request. In that case, we'd have a deadlock for 5secs times (10 - backlog-size).
That was initially your proposal and yes: No data to read, no time to lose because this webserver can serve only one client at time. Empty connections would be skipped one by one. I don't know if 30ms is too long or short, but usually initial data are sent right away after opened connections. I'll mark this pull request for release 3.1 because the 15? vs 30 vs 50? ms has to be tested a bit more (and 3.0.2 is a bugfix release only). |
iirc, it is specific to the lwip2-feat variant, and is not enabled by default Arduino/tools/sdk/lwip2/include/lwipopts.h Line 1405 in 8a42163
Arduino/tools/sdk/lwip2/include/lwip/tcp.h Lines 444 to 454 in 8a42163
(happened to notice this when re-implementing basic server and trying to use lwip_backlog funcs) |
To be more precise, it is not enabled by default with platformIO (but it should, TCP is more stable when |
It's good to know that there's some kind of anti-DOS stuff built-in. With the Perhaps a compromise would be to use |
…nother has data Safari sometimes opens two connections when loading a page and only sends a request over the second one, resulting in a 5 second wait (HTTP_MAX_DATA_WAIT) before the request is processed. This commit drops the current connection after 30ms (HTTP_MAX_DATA_AVAILABLE_WAIT) when there is a new connection with data available or the buffer of pending TCP clients is full (currently 5).
I figured out how to access the number of connections waiting and added another helper function to WiFiServer (
The first 6 connections get accepted, the thing sees that the buffer is full, and it drops the first one. A second later, the python program retransmits the 7th connection (which wasn't ACKed since the buffer was full), which, now that the buffer is down to 4 connections, gets accepted. The python program then tries the 8th connection, but this doesn't geet ACKed again since the buffer is now up to 5. This repeats until it gets the last connection with the request in it, causing it to quickly skip the other 4 connections in the buffer and finally get to that last one. This still doesn't handle all 10 connections instantly, but now the speed limitation is the time it takes the client to send the retransmissions instead of the 5 second read timeout. The |
This issue is also reproducible in case of Chrome and ESP32. |
Am I right to conclude that @aWZHY0yQH81uOYvH "fixes", or any fixes for this issue, isn't released in 3.0.2? |
@NRollo Yes, I just checked 3.0.2 still exhibits this issue with multiple connections. Applying my changes from this PR to the 3.0.2 code fixes the problem. |
@aWZHY0yQH81uOYvH Thanks for the update! |
@NRollo The changes are under the files changed tab on this page, and the whole repository with my changes included is available on my fork. |
@d-a-v any objections? So, if we are dealing with timing, do we also want to expose timeout values as member variables and can be overridden by the user, not something hardcoded? As tests above only checked out lan, not something with worse latency conditions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aWZHY0yQH81uOYvH
Thank you, good job !
So, if we are dealing with timing, do we also want to expose timeout values as member variables and can be overridden by the user, not something hardcoded? As tests above only checked out lan, not something with worse latency conditions
I'm approving it for merging now then we can add an accessor in a further PR if that's OK with you
From the CI. Host tests have a custom WiFiServer class with sockets, hence the error And it would probably mean another test-case that has two clients, one stalling and another connecting? |
@aWZHY0yQH81uOYvH Thanks for pointing me in the right direction! |
@aWZHY0yQH81uOYvH If you agree with it, accepting the PR I made on your repository should make CI to pass here. |
@d-a-v Mock functions make sense, it's been merged. |
@d-a-v tnx! |
Will this patch be merged into ESP32 repo? |
@alambin No, one'd have to ask them or the same PR be made there. |
@mcspr Safari?'s requests would have to be reproduced reliably in order to achieve a host test. @aWZHY0yQH81uOYvH could you try compilation on host with your browser ? $ cd tests/host
$ make
$ #<copy-paste of one of the examples>, or
$ make path/to/sketch # builds path/to/sketch/sketch.ino into `tests/host/bin[32]/...`
$ ./bin/.../path/to/sketch/sketch |
@d-a-v The mock server does quickly respond to the weird Safari requests, but the behavior is quite different from on the real hardware due to the lack of a connection buffer (and therefore If you want to reliably simulate the Safari behavior, run the script I left here like Also, I couldn't get the host tests to compile on Mac with clang or gcc 11; it had a linker error complaining about not being able to find a symbol ( |
Thanks @aWZHY0yQH81uOYvH . Then a Regarding host and g++-11 I have no issue with ubuntu. That said |
When loading pages with Safari from an HTTP ESP8266WebServer, Safari appears to occasionally open up two TCP connections and sends the GET request over the second one (found using Wireshark). In its current state, the library waits for data on the first connection, and, since there isn't any data, it sits there until it times out after 5 seconds (
HTTP_MAX_DATA_WAIT
). After the timeout, it moves on to the second connection which has the GET request waiting then calls the configured callback functions to serve the page content. I don't know why Safari does this, but it causes pages to take 5 seconds to load when they should be almost instant.My fix is to simply check for more TCP connections while the server is waiting for more data from the one it's currently working with (connection is in the
HC_WAIT_READ
state). If there is another connection with available data, drop the current one and move on to the new one. This fixes the problem with Safari loading pages slowly as it gets rid of the seemingly pointless first connection once it receives the request on the second one. However, I am not particularly familiar with the other use cases of the library, so if this change in behavior is dumb, please close this PR.