ESP8266HttpClient never seems to finish large files when using lwip2 #4176
Comments
Could you please post a full sketch so the issue can be reproduced? |
This sketch downloaded 2Mb for me and then stalled.
This stalls with lwip2 but not lwip1.4 |
I have made my tests, no stall.
My downlink BW is ~20Mbits/s. |
I just ran it for 20 min until the HTTP connection closed automatically; it downloaded less than half of the file. When I tried lwip2/1460 it took about 670 s, so the problem seems to be specifically with lwip2/536. What is the best way to get a TCP dump for this? As a side note, is there something I can do to speed up the download? I get nowhere near your speeds. On the same router my computer gets 30Mb/s |
The website is 13 hops away from me. Try with
The farther away the peer is, the higher the chance of lost packets (among many other reasons for packet loss), and TCP has to retransmit them. TCP has its limits; designers configure TCP stacks with good parameters so that it just works. Under Linux, and I guess in all regular TCP stacks, the TCP window size (= the receive buffer) is at least 64KB. In this issue, the slowness could be due to the ESP's inability to store many or large holes, so a lot of retransmissions have to happen on a path with a higher probability of lost packets. In these cases the TCP retransmit-timeout algorithm is triggered more often, which dramatically decreases performance. If we want to understand precisely what is happening in this issue, TCP dumps are needed, but I'm afraid there is nothing more we can do than understand and/or dedicate all the RAM to TCP. To get TCP dumps, one way is to install and run tcpdump on your router (if it can), or Wireshark on your PC if you can route the ESP traffic through it (via a USB WLAN adapter in AP mode). |
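To put rough numbers on the window-size point: a TCP sender can have at most one receive window of unacknowledged data in flight per round trip, so throughput is bounded by window / RTT. The sketch below computes that ceiling. The function names are hypothetical, and the 100 ms round-trip time and the four-segment window used in the note afterwards are assumptions for illustration, not values measured in this issue.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on TCP throughput in bytes per second when the receive window
// is the limiting factor: at most one window can be in flight per round trip.
double windowLimitedBytesPerSecond(std::uint32_t windowBytes, double rttSeconds) {
    assert(rttSeconds > 0.0);  // a zero or negative RTT makes no sense here
    return static_cast<double>(windowBytes) / rttSeconds;
}

// The same bound expressed in megabits per second.
double windowLimitedMbps(std::uint32_t windowBytes, double rttSeconds) {
    return windowLimitedBytesPerSecond(windowBytes, rttSeconds) * 8.0 / 1e6;
}
```

With an assumed RTT of 0.1 s, a window of 4 × 536 = 2144 bytes caps out at about 21 KB/s, 4 × 1460 = 5840 bytes at about 58 KB/s, and a 64KB window at roughly 5.2 Mbit/s. Packet loss lowers the real figure further, since retransmission timeouts leave the window idle.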
I'm 15 hops away; do you think that could account for the difference between our downloads? My download being 5 times slower seems pretty bad. Is there anything else you did differently? I'll work on getting a TCP dump to see if I can get any more info, but I think I'll just stick with lwip1.4 for now, since it seems the faster lwip2 may be unstable? |
I did nothing different; I can't. Everybody would do it if it were possible. |
OK, I can try that. What makes these unstable? Just so I know what to look out for: is it just lower RAM? |
Can you please retry your sketch with replacing |
One other question about those constants: is it a bug that HTTP_TCP_BUFFER_SIZE does not always match TCP_MSS? |
Yes and no. It was initially set to MSS, and has not been changed since MSS changed.
|
Regarding my previous comment, MSS=1460 is not at all unstable (I was responsible for saying that and I've proven myself wrong). |
Thank you, the last two comments helped me to understand the meaning of those two constants. |
For context, this |
There is another related constant, WIFICLIENT_MAX_PACKET_SIZE, which is always 1460, regardless of which lwip option I select in the board configuration. Shouldn't that match TCP_MSS in the case of lwip v2? |
This constant has no real meaning either. It is used nowhere and should be removed. There is no notion of a packet in TCP. |
"A TCP connection is a stream, and the user needs to see it that way. " |
Very true. UDP, with its datagram/packet way of transferring data, is governed by the MTU, which is not defined as a constant in our core, but is accessible through lwIP's |
Thank you for the clarification, and for the good work of course. |
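For reference, the two MSS values discussed in this thread follow from common MTU values: for IPv4 without options, MSS = MTU − 20 (IP header) − 20 (TCP header). The sketch below only spells out that arithmetic; the constant and function names are hypothetical and are not part of the core or of lwIP.

```cpp
#include <cstdint>

constexpr std::uint16_t kIpv4HeaderBytes = 20;  // IPv4 header without options
constexpr std::uint16_t kTcpHeaderBytes = 20;   // TCP header without options

// Largest TCP payload that fits in one IPv4 packet of the given MTU.
constexpr std::uint16_t mssForMtu(std::uint16_t mtu) {
    return static_cast<std::uint16_t>(mtu - kIpv4HeaderBytes - kTcpHeaderBytes);
}

// 1500 is the usual Ethernet MTU; 576 is the datagram size every IPv4 host
// must be able to accept.
static_assert(mssForMtu(1500) == 1460, "Ethernet MTU gives MSS 1460");
static_assert(mssForMtu(576) == 536, "576-byte datagram gives MSS 536");
```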
I tried this again on the newer version and am still getting the same issue whenever I choose v2 |
I have the same problem, in this case with 5MB files downloaded from the internet to the SD Card. I could not trace where the problem originated. |
I can sometimes reproduce this issue, but not always, which makes it difficult to debug. |
I was able to test with V2 Low, V2 High and V1.4 High. Unfortunately I could not compile with V1.4 Compile from Source [exec: "make": executable file not found in %PATH%] for NodeMCU 1.0 (ESP-12E Module). In all cases I had the same problem: unstable file downloads. I'm guessing it's something related to WiFi encryption. When connecting through my cell phone's shared 4G connection, it worked without instabilities.
|
I never encountered the issue with v1.4
|
Same, it has never been a problem on 1.4 |
@d-a-v @schlaegerz I ran several tests and could not find a setup option that works. Does anyone else have the same problem? The only test that worked very well was using my phone's shared 4G connection. Is it something related to the type of WiFi encryption?
|
After running tcpdump again and again, it seems that those lags are due to SACK (Selective ACKs) missing in lwIP. What is great is that the lwIP folks implemented partial SACK in their v2.1 pre-release (lwip2 is currently using stable-2.0.3). Running with this version seems to make large transfers smooth (no longer hanging for seconds, nor stopping). Note: I have no idea why it works with lwip1.4 (I did not test that myself) |
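To illustrate why missing SACK hurts on a lossy path: with only a cumulative ACK the receiver can report just the first missing byte, so the sender learns nothing about data that arrived after a hole. With SACK the receiver also reports the ranges it already holds beyond that hole. The following is a simplified model that works on whole segment indices; it is not lwIP code, and all names are hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// received[i] is true when segment i has arrived at the receiver.
// Returns the index of the first missing segment, which is all that a
// plain cumulative ACK can tell the sender.
std::size_t cumulativeAck(const std::vector<bool>& received) {
    std::size_t i = 0;
    while (i < received.size() && received[i]) ++i;
    return i;
}

// Returns half-open [begin, end) ranges of segments received beyond the
// cumulative ACK point: the extra information a SACK option carries.
std::vector<std::pair<std::size_t, std::size_t>>
sackBlocks(const std::vector<bool>& received) {
    std::vector<std::pair<std::size_t, std::size_t>> blocks;
    std::size_t i = cumulativeAck(received);
    while (i < received.size()) {
        if (!received[i]) { ++i; continue; }
        const std::size_t begin = i;
        while (i < received.size() && received[i]) ++i;
        blocks.emplace_back(begin, i);
    }
    return blocks;
}

// Segments a SACK-aware sender knows it must resend: everything missing
// between the cumulative ACK and the end of the last SACK block.
std::vector<std::size_t> holesToRetransmit(const std::vector<bool>& received) {
    std::vector<std::size_t> missing;
    const auto blocks = sackBlocks(received);
    if (blocks.empty()) return missing;
    for (std::size_t i = cumulativeAck(received); i < blocks.back().second; ++i) {
        if (!received[i]) missing.push_back(i);
    }
    return missing;
}
```

For the arrival pattern `{1, 1, 0, 1, 1, 0, 1}` the cumulative ACK is 2, the SACK blocks are [3, 5) and [6, 7), and the holes are segments 2 and 5. Without SACK the sender only sees "2 is missing" and has to discover segment 5 later, typically after another round of duplicate ACKs or a retransmit timeout.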
Update: |
Hello, I'm happy with the news; at least we found the problem. Did the code example I mentioned above help? Any estimate of when a new esp8266 core with this problem solved will be released? |
@laercionit I did not try OTA, only this issue's sketch. Please allow some time to check for general side effects of moving from lwIP-2.0.3 to lwIP-2.1.0. |
Please test #5126
Revert to master:
edit: in case of updates in the PR, do this before restarting the above.
|
Hello everybody. #5126 seems to work for me too. I am only downloading files of about 250 kB, though. My test case is similar to the one I posted here. So far I haven't observed any obvious side effects, but of course I have tested very little, only running a single sketch a handful of times. |
Thanks for the feedback! |
…in menu) (#5126)
* update to lwIP-2.1.0rc1: partial SACK support, fix #4176
* hash fix
* get some flash back due to mistake in conf (fragmentation & reassembly was incorrectly enabled) (ahah I scared you)
* add missing include files
* update to lwip-2.1.0(release) + remove unused lwIP's include files
* lwIP release 2.1.0, SACK is now default, bigger, no-SACK is selectable
* fix ldscript
* pio
* rename 'sack' option to 'feat'ure option, + IP fragmentation/reassembly
* merge, fix pio
* change internal/hidden string
* pio: more lwip2 configuration: + without sack for no change in flash footprint
Even though I have experienced a great improvement since the fix (i.e. it doesn't get stuck "forever"), I still get several-seconds-long intervals with 0 bytes available way too often. Is there any test you can suggest that would help me figure out whether this is real network slowness (i.e. weak signal) or something fishy that suggests there's still a software bug? I find it hard to believe the connection can be so poor when every other device in the same area gets a very strong signal and the router is just a few meters away (of course it's still possible that the ESP8266 performs more poorly than a smartphone or a laptop). |
Well?? |
I have connected an antenna (this one) and the situation hasn't improved in the slightest. I am doing my tests a few meters away from the router, where every other device (laptops, smartphones) has an excellent connection. I am downloading a file about 250kB in size. As I mentioned, the patch fixed the issue where the download would get stuck forever, but I still very often observe very long intervals (several seconds) with zero bytes received. Then (unlike before the fix) the download resumes at a normal speed, but it gets stuck again for several seconds, and so on. Overall, it often takes several minutes to download the 250kB file, while normally (i.e. when the download proceeds at approximately constant speed) it takes about 30 seconds. The physical reception just can't be that bad. |
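One way to quantify these pauses, as a minimal sketch rather than a diagnosis tool: record the time between successive arrivals of data and count the gaps that exceed a threshold. The class below is plain C++ with hypothetical names. On the ESP8266 the timestamps could come from `millis()`, and the 1000 ms threshold in the note afterwards is an assumption. Comparing the logged gaps against a tcpdump of the same transfer would show whether the pauses line up with retransmissions.

```cpp
#include <cstddef>
#include <cstdint>

// Tracks gaps between successive arrivals of data during a download.
// All timestamps are in milliseconds.
class StallTracker {
public:
    explicit StallTracker(std::uint32_t stallThresholdMs)
        : thresholdMs_(stallThresholdMs) {}

    // Call once when the download starts.
    void start(std::uint32_t nowMs) { lastDataMs_ = nowMs; }

    // Call each time a read returns one or more bytes.
    void onData(std::uint32_t nowMs, std::size_t bytes) {
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        const std::uint32_t gap = nowMs - lastDataMs_;
        if (gap > longestGapMs_) longestGapMs_ = gap;
        if (gap >= thresholdMs_) {
            ++stallCount_;
            stalledMs_ += gap;
        }
        totalBytes_ += bytes;
        lastDataMs_ = nowMs;
    }

    std::uint32_t longestGapMs() const { return longestGapMs_; }
    std::uint32_t stallCount() const { return stallCount_; }
    std::uint32_t stalledMs() const { return stalledMs_; }
    std::size_t totalBytes() const { return totalBytes_; }

private:
    std::uint32_t thresholdMs_;
    std::uint32_t lastDataMs_ = 0;
    std::uint32_t longestGapMs_ = 0;
    std::uint32_t stallCount_ = 0;
    std::uint32_t stalledMs_ = 0;
    std::size_t totalBytes_ = 0;
};
```

In a download loop this would be used by calling `start()` before the first read and `onData()` after every read that returns data, then printing `stallCount()`, `longestGapMs()` and `stalledMs()` once the transfer ends. With a threshold of 1000 ms, a transfer that spends most of its time in a handful of multi-second gaps points at retransmission timeouts rather than a uniformly slow link.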
Basic Infos
Hardware
Hardware: NodeMCU 1.0 (ESP-12E)
Core Version: 2.4
Description
I am trying to download a file that is around 10MB. On lwip2 the download gets to around 1MB and then just stops. If I go back to 1.4 it finishes the download, but takes about 4 min.
Settings in IDE
Module: NodeMCU 1.0 (ESP-12E)
Flash Size: 4M
CPU Frequency: 160MHz
Flash Mode:
Upload Using: Serial
Code:
while (success &&
http.connected() &&
(downloadRemaining > 0 || downloadRemaining == -1)
) {
// get available data size