Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes in Ticker cause UART to fail #14489

Closed
SibertEnovates opened this issue Mar 30, 2021 · 13 comments · Fixed by #14532
Closed

Changes in Ticker cause UART to fail #14489

SibertEnovates opened this issue Mar 30, 2021 · 13 comments · Fixed by #14532

Comments

@SibertEnovates
Copy link

SibertEnovates commented Mar 30, 2021

Description of defect

When upgrading from Mbed OS 6.1 to 6.7, we saw weird behavior in our serial communication where most of our frames (about 40 bytes) are invalid because it usually drops at least one byte somewhere in communication. By one by one disabling components in, we managed to pin down the issue to a ticker somehow. When the ticker is disabled or replaced (for example using the 'call_every' function of the mbed event queue), the UART issue is resolved.
After some further testing the issue seems to appear from Mbed OS 6.6 and onwards, which makes sense as 6.6 saw an overhaul of the ticker system. When checking out the latest version of Mbed (6.9 as of time of writing) and reverting the commits in the ticker PR, the issue also goes away. This is also discussed in this PR.

Target(s) affected by this defect ?

STM32F030RC

Toolchain(s) (name and version) displaying this defect ?

  • CMake (Clion bundled 3.17.5)
  • GNU ARM embedded toolchain (2020 q2) arm-none-eabi-g++/gcc

What version of Mbed-os are you using (tag or sha) ?

  • mbed-os-6.6.0 and higher

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

  • mbed-cli 1.10.5 (used for mbed deploy and export)
  • c++-17

How is this defect reproduced ?

UART (baud 115200) is started, with regular data being send to UART. When a Ticker that ticks each 3 ms is started, the incoming data gets dropped.

@ciarmcom
Copy link
Member

@SibertEnovates thank you for raising this issue.Please take a look at the following comments:

It would help if you could also specify the versions of any tools you are using?

NOTE: If there are fields which are not applicable then please just add 'n/a' or 'None'. This indicates to us that at least all the fields have been considered.
Please update the issue header with the missing information, the issue will not be mirrored to our internal defect tracking system or investigated until this has been fully resolved.

@ciarmcom
Copy link
Member

Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers.
Internal Jira reference: https://jira.arm.com/browse/IOTOSM-3738

@kjbracey
Copy link
Contributor

kjbracey commented Mar 31, 2021

Are you using a non-standard target? I'm not seeing "STM32F030RC" in the standard list.

Presumably your ticker is doing some work in the periodic interrupt - it could be that what you're doing is marginal in terms of interrupt latency.

Still, I would hope that the PR should not have made the system overhead worse to push you over the edge. I'm going to have to look at the generated code difference.

(The major optimisations would not have occurred in your configuration, but the non-optimised case should not be worse.)

@kjbracey
Copy link
Contributor

One thing you can try is to set target.custom-tickers to false. That should activate the full intended optimisation of that PR on an F030, meaning the code should be a lot smaller and faster than any previous version. It should make the problem firmly go away.

@SibertEnovates
Copy link
Author

Are you using a non-standard target? I'm not seeing "STM32F030RC" in the standard list.

It is not a standard target.

One thing you can try is to set target.custom-tickers to false. That should activate the full intended optimisation of that PR on an F030, meaning the code should be a lot smaller and faster than any previous version. It should make the problem firmly go away.

I did try that, but that didn't seem to resolve the issue. I will try again, to verify this.

@kjbracey
Copy link
Contributor

I've knocked up a dummy target based on MCU_STM32F030x8, hoping it's representative.

STM targets have period defines for USTICKER permitting compile-time optimisation if that's the only clock in the system.

MCU_STM32F030x8 lacks LPTICKER, so turning off target.custom-tickers should activate the full optimisation.

(The STM targets should be improved to have LPTICKER period defines permitting compile-time optimisation for LPTICKER+USTICKER being the only clocks, but don't yet).

@kjbracey
Copy link
Contributor

If you can attach a debugger to the standard (custom-tickers true) build, I'd be interested in seeing the state of the ticker_event_queue_t on entry to update_present_time. It's possible the clock initialisation has somehow messed up the assessment of the clock characteristics.

The initialisation is supposed to look at the period and figure out what the necessary multiplies/divides/shifts are for conversion to/from 1MHz.

In this case, the native rate should be 1MHz, ie exactly what we want, so it should end up doing shift-by-0.

The new version doesn't have a special case for exactly equal to 1MHz, so it does end up doing a run-time shift left by 0 the original doesn't (because I was optimising for code size). But given other optimisations, that change should be minimal - a shift is trivial.

But maybe it's accidentally doing a multiply or divide it shouldn't be? The state of the ticker_event_queue_t should indicate what it's doing.

@SibertEnovates
Copy link
Author

MCU_STM32F030x8 lacks LPTICKER, so turning off target.custom-tickers should activate the full optimisation.

I tried this once again, with no improvement

If you can attach a debugger to the standard (custom-tickers true) build, I'd be interested in seeing the state of the ticker_event_queue_t on entry to update_present_time. It's possible the clock initialisation has somehow messed up the assessment of the clock characteristics.

See the screenshot below of the ticker data on entering update_present_time
image

@kjbracey
Copy link
Contributor

kjbracey commented Apr 9, 2021

Okay, looks like we've got a smoking gun, at least for the custom-tickers true case.

period_num_shifts and period_den_shifts are -1 (meaning "use multiply and divide") when they should be 0.

So this will be doing a runtime multiply-by-1 and divide-by-1, which is probably not optimised. It works, but not as fast as intended.

Looks like the error is in exact_log2 in mbed_ticker_api.c - its loop should be

for (int i = 31; i >= 0; --i) {

Rather than i > 0.

That doesn't explain a failure with custom-tickers false though. Maybe that setting was failing to take effect? Could you show debugger output for that case? If your custom target didn't have the US_TICKER_PERIOD defines, then you'd have still ended up with runtime division - I would have expected it to be eliminated for a normal no-LPTICKER STM target.

@SibertEnovates
Copy link
Author

Looks like the error is in exact_log2 in mbed_ticker_api.c - its loop should be

for (int i = 31; i >= 0; --i) {

Rather than i > 0.

This seems to fix the issue! I'm going to try adding a US_TICKER_PERIOD as well because that might be a quicker fix for us, instead of waiting for this to be patched in an official release.

@kjbracey
Copy link
Contributor

kjbracey commented Apr 12, 2021

I'm going to try adding a US_TICKER_PERIOD as well

Ah, if you don't have that, then I think we have all the answers. custom-tickers: false won't do anything without it.

Having the define and setting that false is primarily a space saving. It greatly reduces the mbed_ticker_api.c code if it knows it only ever has to deal with a 1MHz hardware clock. It will be a minor further speed-up though.

The US_TICKER_PERIOD will also reduce the overhead in wait_us. That's why it was originally added - to move the "convert microseconds to hardware ticks" from runtime to compile-time for wait_us calls with a constant parameter (the usual case). See #7397 and #10609.

@SibertEnovates
Copy link
Author

Alright, thanks for the help! Great to see the Mbed-community helping and fixing the issue so quickly!

@jeromecoutant
Copy link
Collaborator

@kjbracey-arm @SibertEnovates
Maybe you could review in parallel my 2 commits in https://github.com/jeromecoutant/mbed/tree/DEV_STM32F0_LPTICKER_OPTIM
Thx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants