RMT: using rmt_write_sample async with all 8 channels has significant staggered starts of last few channels #2885

Closed
Makuna opened this issue Jun 11, 2019 · 51 comments
Labels: Area: ESP-IDF related, Resolution: Expired, Type: For reference

Comments

@Makuna

Makuna commented Jun 11, 2019

Hardware:

Board: ESP32 Dev Module
Core Installation version: 1.0.2 (latest public release)
IDE name: Arduino IDE
Flash Frequency: 40MHz (default)
PSRAM enabled: no (default)
Upload Speed: 921600 (default)
Computer OS: Windows 10

Description:

When rmt_write_sample is called without waiting (async) on all 8 channels, the last few channels show a significant delay before their pulses start.

[image: logic analyser capture]
Channel 5 starts 1.5ms after the first channel, which is a significant stagger compared with the first four.
Channel 6 starts 14.5ms after the first channel (shown as the dx field in the capture above), which seems unreasonable. This appears to be related to a channel buffer becoming available. The timing shows that channel 0 is still sending its last pulse at that point, but only the extended low side of it, so translate will not be called again for that channel; that seemingly makes it available and lets channel 6 start sending.

Sketch:

extern "C"
{
#include <driver/rmt.h>
}

const size_t dataSize = 1500;
uint8_t* data;

// selected to be in order on the board for easy connection of the 
// logic analyser 
const uint8_t ChannelPins[] = {19, 18, 5, 17, 16, 4, 2, 15};


static void IRAM_ATTR _translate(const void* src,
        rmt_item32_t* dest,
        size_t src_size,
        size_t wanted_num,
        size_t* translated_size,
        size_t* item_num) {
    if (src == NULL || dest == NULL) {
        *translated_size = 0;
        *item_num = 0;
        return;
    }

    size_t size = 0;
    size_t num = 0;
    uint8_t *psrc = (uint8_t *)src;
    rmt_item32_t* pdest = dest;

    for (;;) {
        uint8_t data = *psrc;

        // convert a byte into rmt item timing
        // zero bit pulse = 200ns 1000ns = 8 cycles 40 cycles  = 0x0028 8008 as rmt item val
        // one bit pulse = 1000ns 200ns = 40 cycles 8 cycles  = 0x0008 8028 as rmt item val
        for (uint8_t bit = 0; bit < 8; bit++) {
            pdest->val = (data & 0x80) ? 0x00088028 : 0x00288008;
            pdest++;
            data <<= 1;
        }
        num += 8;
        size++;

        // if this is the last byte we need to adjust the length of the last pulse
        if (size >= src_size) {
            // extend the last bits LOW value to include the full reset signal length
            pdest--;
            pdest->duration1 = 20000; // 500us reset
            // and stop updating data to send
            break; 
        }

        if (num >= wanted_num) {
            // stop updating data to send
            break;
        }

        psrc++;
    }

    *translated_size = size;
    *item_num = num;
}

void initChannel(rmt_channel_t ch, gpio_num_t pin) {
    rmt_config_t config;

    config.rmt_mode = RMT_MODE_TX;
    config.channel = ch;
    config.gpio_num = pin;
    config.mem_block_num = 1;
    config.tx_config.loop_en = false;
        
    config.tx_config.idle_output_en = true;
    config.tx_config.idle_level = RMT_IDLE_LEVEL_LOW;

    config.tx_config.carrier_en = false;
    config.tx_config.carrier_level = RMT_CARRIER_LEVEL_LOW;

    config.clk_div = 2; 

    rmt_config(&config);
    rmt_driver_install(ch, 0, 0);
    rmt_translator_init(ch, _translate);
}

void writeChannel(rmt_channel_t ch) {
    // wait for the last send to complete
    if (ESP_OK == rmt_wait_tx_done(ch, 10000 / portTICK_PERIOD_MS)) {
        // then start a new async send
        rmt_write_sample(ch, data, dataSize, false);
    }
}

void waitForAllDone()
{
  bool done = false;

  while (!done) {
    done = true;
    for (uint8_t ch = 0; ch < RMT_CHANNEL_MAX; ch++) {
        if (ESP_OK != rmt_wait_tx_done(static_cast<rmt_channel_t>(ch), 0)) {
            done = false;
            yield();
            break;
        }
    }
  }
}

void setup() {
  data = (uint8_t*)malloc(dataSize);
  memset(data, 0x00, dataSize);

  for (uint8_t ch = 0; ch < RMT_CHANNEL_MAX; ch++) {
    initChannel(static_cast<rmt_channel_t>(ch), static_cast<gpio_num_t>(ChannelPins[ch]));
  }
}

void loop() {
  // start all channels, then wait for them to be sent, then start all channels again
  // 
  for (uint8_t ch = 0; ch < RMT_CHANNEL_MAX; ch++) {
    writeChannel(static_cast<rmt_channel_t>(ch));
  }
  waitForAllDone();
  for (uint8_t ch = 0; ch < RMT_CHANNEL_MAX; ch++) {
    writeChannel(static_cast<rmt_channel_t>(ch));
  }

  // repeat with a long delay between so it can easily be captured in the logic analyser
  //
  delay(5000);
}
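
The timing constants in the sketch above can be checked with a little arithmetic. The following is a minimal host-side sketch, not part of the original report. It assumes the RMT counter is fed from the 80 MHz APB clock, so that clk_div = 2 gives 25 ns per tick; makeItem and frameDataNs are hypothetical helper names used only for illustration.

```cpp
#include <cstdint>

// Assumption: 80 MHz APB clock, so clk_div = 2 yields a 40 MHz counter (25 ns per tick).
constexpr uint32_t kApbHz = 80000000;
constexpr uint32_t kClkDiv = 2;
constexpr uint32_t kTickNs = 1000000000u / (kApbHz / kClkDiv);

// Pack one RMT item for a high-then-low pulse:
// duration0 in bits 0-14, level0 in bit 15, duration1 in bits 16-30, level1 in bit 31.
constexpr uint32_t makeItem(uint32_t highNs, uint32_t lowNs) {
    return ((highNs / kTickNs) & 0x7FFF) | (1u << 15) |
           (((lowNs / kTickNs) & 0x7FFF) << 16);
}

// The values hard-coded in _translate().
static_assert(makeItem(200, 1000) == 0x00288008, "zero bit");
static_assert(makeItem(1000, 200) == 0x00088028, "one bit");
static_assert(20000 * kTickNs == 500000, "20000 ticks is a 500 us reset");

// Time to clock out the data portion of one frame (without the reset), in ns.
constexpr uint64_t frameDataNs(uint32_t bytes) {
    return static_cast<uint64_t>(bytes) * 8 * (200 + 1000);
}
static_assert(frameDataNs(1500) == 14400000, "1500 bytes take 14.4 ms");
```

Under these assumptions the data portion of a 1500-byte frame lasts 14.4 ms, which is close to the 14.5 ms stagger reported for channel 6. That would be consistent with channel 6 starting only once channel 0 has reached its final reset period.
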
@r1dd1ck

r1dd1ck commented Jun 16, 2019

I had a look at it, and it seems to be a scheduler issue rather than anything to do with buffers.

Each RMT channel has a fixed number of memory blocks allocated, which are exclusive to RMT (i.e. not shared outside). These blocks can be re-allocated within RMT (between channels), but you have to do so manually.

This issue - where the start of transmission on the last few channels is held back until transmission on other channels ends - happens almost exclusively in cases where you give RMT enough time between the "updates".

Ergo, as the issuing of the RMT transfers is pushed closer to the maximum rate, the transmissions start to run more and more in parallel over all 8 channels! - albeit at a slightly reduced "frame rate" (lagging behind).

I did tests with 1kB frames at 800kbit/s (the equivalent of running 250 RGBW LEDs) per channel, which gives transmission windows of 10ms per frame.

The result:
At update intervals longer than 20ms (roughly 2x the transfer window), the output looked very similar to the results you provided above 😕
... BUT!
At update intervals shorter than 20ms, the transmission on the affected last few channels began to run more and more in parallel (i.e. no more waiting for transfers on other channels to end!) 😉

There was, however, one ill effect ...

The resulting update rate lagged slightly behind. While pushing the data at 80 updates per second, only about 75 frames per second made it to the outputs. The good news is that the number of frames output over a given time period remains constant and consistent across all channels 👍
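
The figures in this comment can be reproduced with a short calculation. This is a minimal sketch, assuming 1kB means 1000 bytes (250 RGBW LEDs x 4 bytes) and a line rate of exactly 800,000 bits per second; frameTimeUs and intervalUs are hypothetical helper names.

```cpp
#include <cstdint>

// Assumption: exactly 800,000 bits per second on the wire.
constexpr uint32_t kBitsPerSecond = 800000;

// Microseconds needed to shift out `bytes` bytes on one channel.
constexpr uint32_t frameTimeUs(uint32_t bytes) {
    return static_cast<uint32_t>(
        static_cast<uint64_t>(bytes) * 8 * 1000000 / kBitsPerSecond);
}

// Microseconds between frames at a given frame rate (integer division).
constexpr uint32_t intervalUs(uint32_t framesPerSecond) {
    return 1000000 / framesPerSecond;
}

static_assert(frameTimeUs(1000) == 10000, "1000 bytes take 10 ms");
static_assert(intervalUs(80) == 12500, "80 updates/s means 12.5 ms apart");
static_assert(intervalUs(75) == 13333, "75 frames/s means about 13.3 ms apart");
```

Under these assumptions, updates were requested every 12.5 ms but frames appeared roughly every 13.3 ms, about 3.3 ms more than the raw 10 ms transmit time of one frame.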

The bottom line, though, is that this issue should be moved over to the ESP-IDF repo, as the problem is most likely not directly related to the Arduino ESP32 core.

Howgh.

@stale

stale bot commented Aug 15, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale Issue is stale stage (outdated/stuck) label Aug 15, 2019
@Makuna
Author

Makuna commented Aug 15, 2019

It is not stale. It is linked to an active issue in the espressif/Arduino-esp32 repo and an active issue in the Makuna/NeoPixelBus repo.
You might consider adding some smarts to stale bot to give it more time when there is a link to espressif repo.
Another solution would be to add a label for "dependent issue", mark this with it, and then update stale bot to extend the time for issues with dependent issues.

@stale stale bot removed the Status: Stale Issue is stale stage (outdated/stuck) label Aug 15, 2019
@atanisoft
Collaborator

@me-no-dev FYI...

@stale

stale bot commented Oct 14, 2019

[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale Issue is stale stage (outdated/stuck) label Oct 14, 2019
@atanisoft
Collaborator

.

@stale

stale bot commented Oct 14, 2019

[STALE_CLR] This issue has been removed from the stale queue. Please ensure activity to keep it open in the future.

@stale stale bot removed the Status: Stale Issue is stale stage (outdated/stuck) label Oct 14, 2019
@stale

stale bot commented Dec 13, 2019

[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale Issue is stale stage (outdated/stuck) label Dec 13, 2019
@Makuna
Author

Makuna commented Dec 15, 2019

Still waiting on the IDF fix to provide a workaround for the problem, as there is currently no exposed way to work around it.

@stale

stale bot commented Dec 15, 2019

[STALE_CLR] This issue has been removed from the stale queue. Please ensure activity to keep it open in the future.

@stale stale bot removed the Status: Stale Issue is stale stage (outdated/stuck) label Dec 15, 2019
@stale

stale bot commented Feb 13, 2020

[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale Issue is stale stage (outdated/stuck) label Feb 13, 2020
@Makuna
Author

Makuna commented Feb 13, 2020

Still waiting on an exposed solution.

@stale

stale bot commented Feb 13, 2020

[STALE_CLR] This issue has been removed from the stale queue. Please ensure activity to keep it open in the future.

@stale stale bot removed the Status: Stale Issue is stale stage (outdated/stuck) label Feb 13, 2020
@leonyuhanov

I have seen some issues with this as well.
I have 3 output channels, each with 125 WS2812 pixels (so 375 bytes, i.e. 3000 bits).
If I output to all 3 one after the other, in quick succession, there is an obvious break of some sort in the transmission to channels 2 and 3.
If I add a 3ms delay after each RMT_WRITE command the issue is gone, but this is unacceptable.

@embengnoob

I would also appreciate seeing this issue solved. I need to generate user-programmable pulses with nanosecond accuracy on 4 different pins at exactly the same time, as they drive multiple IGBT gate drivers. I haven't found any other way to generate precise pulses as short as 12.5 ns on the ESP32.

@leonyuhanov

It seems like if 1 channel hasn't completed TX before you ask the next to do so, the 1st TX gets borked, and so on. There needs to be a way to poll the RMT for a TX completed message (which seems possible via its internal flags, but each time I test it, it always says TX complete).

@stale

stale bot commented Jun 5, 2020

[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale Issue is stale stage (outdated/stuck) label Jun 5, 2020
@Makuna
Author

Makuna commented Jun 6, 2020

Keeping this alive while waiting on the IDF related bug to be fixed.

@stale

stale bot commented Jun 6, 2020

[STALE_CLR] This issue has been removed from the stale queue. Please ensure activity to keep it open in the future.

@VojtechBartoska
Collaborator

VojtechBartoska commented Nov 18, 2021

We will consider this issue when we work on refactoring RMT to use the ESP-IDF API.

@atanisoft
Collaborator

@VojtechBartoska this is also reproduced using the IDF API per @Makuna.

@SuGlider SuGlider self-assigned this Dec 15, 2021
@SuGlider
Collaborator

SuGlider commented Dec 17, 2021

@Makuna @VojtechBartoska
My understanding of this issue is:

1- It's not Arduino related. The code presented here is related to IDF only.
2- The IDF refactoring job (#6024) will neither make any difference to nor solve the issue presented here, mainly because the example sketch uses the pure IDF API, not the Arduino RMT API. There is no Arduino RMT API that solves this issue (which, IMO, is not a real issue but a characteristic of the SoC and RMT).
3- IDF will not solve the problem either, because each channel must be initialized separately and started (written) one by one. This can be seen in the example code:

  // from setup()
  for (uint8_t ch = 0; ch < RMT_CHANNEL_MAX; ch++) {
    initChannel(static_cast<rmt_channel_t>(ch), static_cast<gpio_num_t>(ChannelPins[ch]));
  }
  
  // from loop()
  for (uint8_t ch = 0; ch < RMT_CHANNEL_MAX; ch++) {
    writeChannel(static_cast<rmt_channel_t>(ch));
  }  

Therefore it will take time (CPU cycles) to actually write each channel in sequence, and this will always create a delayed output.

The closest way to make it work is to start all channels (by setting the RMT_TX_START_CH bit in RMT_CHnCONF1_REG) in a single CPU clock. But this is impossible because each channel has its own peripheral registers at different addresses.
So maybe the closest possible solution is to write the 64 bytes of the 2 x 32-bit RMT registers of all 8 channels as fast as possible, with the necessary register setup, enabling all channels at (almost) the same time, right after pre-populating the RMT RAM of all channels correctly.

In other words: not using IDF, but writing bare-metal code that manipulates the registers directly, as described in the TRM, using the SoC headers:

#include "soc/gpio_reg.h"
#include "soc/rmt_struct.h"

TRM: https://www.espressif.com/sites/default/files/documentation/esp32_technical_reference_manual_en.pdf
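
The idea of starting the channels with back-to-back register writes could look roughly like the sketch below. It is a minimal illustration, not the code from this thread. It assumes the original ESP32 register layout from soc/rmt_struct.h (fields conf_ch[n].conf1.tx_start and conf1.mem_rd_rst), that each channel is already configured and its RMT RAM filled, and it bypasses the IDF driver's own bookkeeping, so it is not a drop-in replacement for rmt_write_sample. startChannels is a hypothetical name, and the host-side stand-in struct exists only so the loop can be compiled off-target.

```cpp
#include <cstdint>

#if defined(ARDUINO_ARCH_ESP32)
#include "soc/rmt_struct.h"  // memory-mapped RMT register block, exposed as `RMT`
#else
// Host-side stand-in (hypothetical): mirrors only the two fields used below.
struct FakeConf1 { uint32_t tx_start : 1; uint32_t mem_rd_rst : 1; };
struct FakeChannel { uint32_t conf0; FakeConf1 conf1; };
struct FakeRmt { FakeChannel conf_ch[8]; };
static FakeRmt RMT = {};
#endif

// Start every channel whose bit is set in `mask`, one right after the other.
// Assumes each selected channel is configured and its RMT RAM is pre-populated.
inline void startChannels(uint8_t mask) {
    for (int ch = 0; ch < 8; ch++) {
        if (mask & (1u << ch)) {
            RMT.conf_ch[ch].conf1.mem_rd_rst = 1;  // rewind the RAM read pointer
            RMT.conf_ch[ch].conf1.mem_rd_rst = 0;
            RMT.conf_ch[ch].conf1.tx_start = 1;    // begin transmitting
        }
    }
}
```

Calling startChannels(0xFF) would request a start on all 8 channels. The starts are still sequential, one register write after another, so this can only narrow the stagger, not remove it.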

Another possible solution is to not use RMT, but instead use I2S in parallel mode, sending 8 bits to 8 GPIOs from a data buffer at a desired frequency (data rate). This will produce a perfectly parallel, in-sync signal on all 8 GPIOs. But this is a completely different way of solving the product/project.
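
To illustrate the parallel approach, the sketch below shows only the buffer layout: 8 per-channel byte streams are interleaved so that each output byte carries one bit for every GPIO. It is a minimal host-side example under assumed conventions (channel c maps to bit c of each output byte, bits are sent MSB first); interleave8 is a hypothetical name, and the I2S peripheral setup itself is not shown. A real LED waveform would likely also need each data bit expanded into several samples to shape the high and low phases.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave 8 per-channel byte streams into one parallel sample stream.
// Each output byte is one time slot: bit c holds the current bit of channel c.
// Bits of each source byte are emitted MSB first.
std::vector<uint8_t> interleave8(const uint8_t* const channels[8], size_t len) {
    std::vector<uint8_t> out;
    out.reserve(len * 8);
    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            uint8_t sample = 0;
            for (int c = 0; c < 8; c++) {
                sample |= static_cast<uint8_t>(((channels[c][i] >> bit) & 1u) << c);
            }
            out.push_back(sample);
        }
    }
    return out;
}
```

Because all 8 bits of a sample leave the peripheral in the same clock, the channels cannot drift apart the way separately started RMT channels can.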

Final words about the issue

This is not Arduino related and the IDF refactoring #6024 will make no difference to this issue.

@SuGlider
Collaborator

In my opinion, this issue should be closed.

@SuGlider
Collaborator

SuGlider commented Dec 17, 2021

@Makuna @VojtechBartoska This is a potential solution: use the FastLED library
https://github.com/FastLED/FastLED/blob/master/src/platforms/esp/32/clockless_rmt_esp32.cpp

While researching this, I found FastLED code that does exactly what I proposed in my earlier comment:
https://github.com/FastLED/FastLED/blob/master/src/platforms/esp/32/clockless_rmt_esp32.cpp#L284-L331

@Makuna
Author

Makuna commented Dec 17, 2021

Capturing this here was meant to track a side conversation at the time about the possibility of an Arduino API being exposed.

The linked IDF issue was just waiting on them to provide a unified "take action", so that I could change my code to only support either a single-channel show (with this issue still a problem) or a unified show (all channels start at once). But it kept getting closed, and I never got any message on how to proceed (there was no way for me to implement a unified show, since the required code was hidden behind the IDF API).

Unless I completely rewrite all my RMT code, going directly to hardware for only this one part will cause far too much maintenance for me when the IDF changes.

@SuGlider
Collaborator

SuGlider commented Dec 18, 2021

@Makuna
Thanks for the explanation. I now better understand the history of this issue, as you described it.

The ESP32 RMT can't start all channels at once because of its internal register structure. Each channel has its own register with an associated TX_Enable bit.

Starting all the channels at once would require a different register structure in the peripheral. Therefore, it is impossible for IDF, or any other software layer, to make it happen.

One way to do something very close to it, with ready-to-use code, would be to use the FastLED library, based on the example in these links:

https://github.com/FastLED/FastLED/blob/master/src/platforms/esp/32/clockless_rmt_esp32.cpp

https://github.com/FastLED/FastLED/blob/master/src/platforms/esp/32/clockless_rmt_esp32.h

I hope it helps you in your project.

@SuGlider
Collaborator

SuGlider commented Jan 11, 2022

@Makuna,

I put together a modified sketch that uses a modified RMT driver (all in the sketch folder)
I got it to start all 8 channels in less than 2 microseconds.

I think that this code will help you in accomplishing your project.
The sketch is at https://github.com/SuGlider/RMT_FastChannelStart

I hope it helps you and others.
Please let me know if this issue is solved by this code.

@SuGlider SuGlider added Resolution: Awaiting response Waiting for response of author Status: In Progress Issue is in progress Type: For reference Common questions & problems Area: ESP-IDF related ESP-IDF related issues and removed Status: Stale Issue is stale stage (outdated/stuck) labels Jan 11, 2022
@VojtechBartoska
Collaborator

I'm closing it as there is no feedback. The issue is labelled 'Type: For reference' and can be reopened if needed.

@VojtechBartoska VojtechBartoska added Resolution: Expired More info wasn't provided and removed Status: In Progress Issue is in progress Resolution: Awaiting response Waiting for response of author labels Feb 1, 2022
@Aircoookie

@SuGlider thank you for the modified driver! Are you aware of any limitations it introduces, or is it an improvement all around?

It might be hard to have changes merged at the driver level, and it would require a lot of testing, but it definitely looks very promising. (Sorry that I am ignorant of the inner workings of the RMT peripheral, I'm just a power user of @Makuna's NeoPixelBus library.)

@Makuna
Author

Makuna commented Feb 9, 2022

The issue I have seen with FastLED is that the RMT was not shareable with other libraries. There is an instance of my library being used with an IR library that also uses the RMT hardware. The project is an IR remote controlled LED light system, a rather common commercial product.

@SuGlider
Collaborator

SuGlider commented Feb 9, 2022

@Aircoookie

It is an add-on on top of the current IDF RMT driver.
It has all the same API plus the functions I have added.

The result is a faster way to start all RMT channels at almost the same time.

I don't think it will be incorporated into IDF or Arduino, so long-term support is a problem.

@BobLynas

I am using Arduino/PIO and also looking at a high-speed multi-output SPI using clock_div=1. So far my findings are good, albeit similar to the above.

I am able to start my RMT outputs within a few clock cycles of each other by using direct register writes, and I am able to send 6 bytes with clock and latch in ~2us.

The first three channels (clock, latch and dataA) all start fully synchronised - great!
The subsequent 5 channels are a bit messed up.

It's like the RMT struggles with outputting more than 3 channels at precisely the same time.

[image]
