Skip to content

Commit

Permalink
bugfix for random crash when switching between presets
Browse files Browse the repository at this point in the history
this is a band-aid fix for random crashes when switching between presets with multiple segments - crossfade disabled.

!! adding type initializers fixed it for me on -S3, however I still see (less frequent) crashes on esp32, due to heap corruption.

It took me hours to get a meaningful stackdump:

assert failed: heap_caps_free heap_caps.c:360 (heap != NULL && "free() target pointer is outside heap areas")

Backtrace: 0x40084ee1:0x3ffb2570 0x4008e341:0x3ffb2590 0x40094709:0x3ffb25b0 0x4008534a:0x3ffb26e0 0x40094739:0x3ffb2700 0x400e9037:0x3ffb2720 0x400e917c:0x3ffb2740 0x400eaeeb:0x3ffb2760 0x40117ec5:0x3ffb27c0 0x401184ea:0x3ffb2800 0x4013509d:0x3ffb2820

  #0  0x40084ee1:0x3ffb2570 in panic_abort at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/esp_system/panic.c:402
  #1  0x4008e341:0x3ffb2590 in esp_system_abort at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/esp_system/esp_system.c:128
  #2  0x40094709:0x3ffb25b0 in __assert_func at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/newlib/assert.c:85
  #3  0x4008534a:0x3ffb26e0 in heap_caps_free at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/heap/heap_caps.c:360
      (inlined by) heap_caps_free at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/heap/heap_caps.c:345
  #4  0x40094739:0x3ffb2700 in free at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/newlib/heap.c:39
  #5  0x400e9037:0x3ffb2720 in Segment::deallocateData() at wled00/FX_fcn.cpp:189
  #6  0x400e917c:0x3ffb2740 in Segment::resetIfRequired() at wled00/FX_fcn.cpp:206
      (inlined by) Segment::resetIfRequired() at wled00/FX_fcn.cpp:203
  #7  0x400eaeeb:0x3ffb2760 in WS2812FX::service() at wled00/FX_fcn.cpp:1216 (discriminator 2)
  #8  0x40117ec5:0x3ffb27c0 in WLED::loop() at wled00/wled.cpp:115 (discriminator 3)
  #9  0x401184ea:0x3ffb2800 in loop() at C:/src/wled00/wled00.ino:20
  #10 0x4013509d:0x3ffb2820 in loopTask(void*) at C:/Users/user/.platformio/packages/framework-arduinoespressif32/cores/esp32/main.cpp:50

ELF file SHA256: 18c20b536f4c6ef4
  • Loading branch information
softhack007 committed Sep 8, 2023
1 parent c7dd4a7 commit 188956a
Show file tree
Hide file tree
Showing 2 changed files with 14 additions and 6 deletions.
8 changes: 4 additions & 4 deletions wled00/FX.h
Original file line number Diff line number Diff line change
Expand Up @@ -395,10 +395,10 @@ typedef struct Segment {
};
uint16_t _aux0T;
uint16_t _aux1T;
uint32_t _stepT;
uint32_t _callT;
uint8_t *_dataT;
uint16_t _dataLenT;
uint32_t _stepT = 0;

This comment has been minimized.

Copy link
@blazoncek

blazoncek Sep 8, 2023

Collaborator

This is also a typedef, so initializers shouldn't be present IMO.
Much better way (cleaner) is constructor.

uint32_t _callT = 0;
uint8_t *_dataT = nullptr;
uint16_t _dataLenT = 0;
} tmpsegd_t;

private:
Expand Down
12 changes: 10 additions & 2 deletions wled00/FX_fcn.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,7 @@ bool Segment::allocateData(size_t len) {
if (data && _dataLen == len) return true; //already allocated
//DEBUG_PRINTF("-- Allocating data (%d): %p\n", len, this);
deallocateData();
if (len == 0) return(false); // nothing to do
if (Segment::getUsedSegmentData() + len > MAX_SEGMENT_DATA) {
// not enough memory
DEBUG_PRINTF("!!! Effect RAM depleted: %d/%d !!!\n", len, Segment::getUsedSegmentData());
Expand All @@ -184,9 +185,13 @@ bool Segment::allocateData(size_t len) {
}

void Segment::deallocateData() {
if (!data) return;
if (!data) { _dataLen = 0; return; }
//DEBUG_PRINTF("--- Released data (%p): %d/%d -> %p\n", this, _dataLen, Segment::getUsedSegmentData(), data);
free(data);
if ((Segment::getUsedSegmentData() > 0) && (_dataLen > 0)) { // check that we don't have a dangling / inconsistent data pointer
free(data);

This comment has been minimized.

Copy link
@softhack007

softhack007 Sep 8, 2023

Author Collaborator

PS: "free() target pointer is outside heap areas" usually means there is a used-after-free pointer, maybe indirectly like

x.ptr = malloc(5);
y.ptr = x.ptr:
free (x.ptr); x.ptr = nullptr;
[....]
free (y.ptr);  // <-- target pointer is outside heap areas; memory was already freed by free (x.ptr);
} else {
DEBUG_PRINTF("---- Released data (%p): inconsistent UsedSegmentData (%d/%d), cowardly refusing to free nothing.\n", this, _dataLen, Segment::getUsedSegmentData());

This comment has been minimized.

Copy link
@blazoncek

blazoncek Sep 8, 2023

Collaborator

If you can, avoid very long texts with DEBUG_PRINTF() if the code remains active all the time, as the string is kept in RAM and hinders development on ESP8266.
One option is to comment out DEBUG_PRINTF (and only uncomment when doing actual debugging) or split into multiple DEBUG_PRINT(F("...")).

}
data = nullptr;
// WARNING it looks like we have a memory leak somewhere
Segment::addUsedSegmentData(_dataLen <= Segment::getUsedSegmentData() ? -_dataLen : -Segment::getUsedSegmentData());
Expand Down Expand Up @@ -338,6 +343,7 @@ void Segment::stopTransition() {
//DEBUG_PRINTF("-- Released duplicate data (%d): %p\n", _t->_segT._dataLenT, _t->_segT._dataT);
free(_t->_segT._dataT);
_t->_segT._dataT = nullptr;
_t->_segT._dataLenT = 0;
}
#endif
delete _t;
Expand All @@ -364,6 +370,7 @@ uint16_t Segment::progress() {
void Segment::swapSegenv(tmpsegd_t &tmpSeg) {
if (!_t) return;
//DEBUG_PRINTF("-- Saving temp seg: %p (%p)\n", this, tmpSeg);
if (nullptr == &tmpSeg) { DEBUG_PRINTLN(F("swapSegenv(): tmpSeg is nullptr !!")); return; } // sanity check

This comment has been minimized.

Copy link
@blazoncek

blazoncek Sep 8, 2023

Collaborator

I do not think a reference can be a null pointer.
Specifications say it is not possible.

This comment has been minimized.

Copy link
@softhack007

softhack007 Sep 8, 2023

Author Collaborator

I was also wondering why this changes anything.. in fact without this check I had crashes at the next assignment tmpSeg.xyz = abcd;
Will keep an eye on it. maybe we have a bigger problem with heap corruption or fragmentation...

This comment has been minimized.

Copy link
@blazoncek

blazoncek Sep 8, 2023

Collaborator

Look for places where swapSegenv() is called. There are only 2, I think.
One in startTransition() and another in service()

tmpSeg._optionsT = options;
for (size_t i=0; i<NUM_COLORS; i++) tmpSeg._colorT[i] = colors[i];
tmpSeg._speedT = speed;
Expand Down Expand Up @@ -404,6 +411,7 @@ void Segment::swapSegenv(tmpsegd_t &tmpSeg) {

void Segment::restoreSegenv(tmpsegd_t &tmpSeg) {
//DEBUG_PRINTF("-- Restoring temp seg: %p (%p)\n", this, tmpSeg);
if (nullptr == &tmpSeg) {DEBUG_PRINTLN(F("restoreSegenv(): tmpSeg is nullptr !!")); return;} // sanity check
if (_t && &(_t->_segT) != &tmpSeg) {
// update possibly changed variables to keep old effect running correctly
_t->_segT._aux0T = aux0;
Expand Down

5 comments on commit 188956a

@blazoncek
Copy link
Collaborator

@blazoncek blazoncek commented on 188956a Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If crossfading is disabled (i.e. transition time equals 0) there is no transition happening and _t structure is/remains uninitialised.
I would also recommend against using initialisers within struct definition and rather create a constructor that does that if needed.

deallocateData() is only called when segment is reset or being destroyed.

I also do not see any benefit by assigning _dataLenT to 0 as it is not the only decision making element. It is always compared together with actual pointer. Still, I am wondering that adding post-free assignment provided a resolution.

EDIT: I will have another look when I get back to my computer.

@softhack007
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to find a better solution, as you know the logic better than I do.

Actually the crash did happen in deallocateData(), with crossfade disabled in led settings.
The reason was uninitialized / random data in _dataLen and data.

@softhack007
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post-free assignment provided a resolution.

It was more a band aid. In some scenarios, its better to "clean up" if there are parallel tasks (like async webserver) that might access the same objects.

@blazoncek
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason was uninitialized / random data in _dataLen and data.

That can explain your behaviour. If both are non 0.

@blazoncek
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post-free assignment provided a resolution.

It was more a band aid. In some scenarios, its better to "clean up" if there are parallel tasks (like async webserver) that might access the same objects.

those are nonvolatile variables so they may not get loaded from memory each time (compiler optimizations)

Please sign in to comment.