Realtek RTL8195AM - CMSIS-RTOS error: ISR Queue overflow (status: 0x2, task ID: 0x0, object ID: 0x30051484) #5640
Comments
@ARMmbed/team-realtek |
looking into this issue, will update ASAP. |
@Archcady |
I am trying to test and debug this issue, but it takes a long time to reproduce. Is there a way to make the handle_timer_click trigger faster so that I can test and debug the point where it fails? |
@Archcady You can try to reduce the wait period in main.cpp. Currently it's 25 secs per loop. updates.wait(25000); [Mirrored to Jira] |
I updated this to 1000, but it still takes a long time to trigger. [Mirrored to Jira] |
We are talking about hours, not days, even with the default value. [Mirrored to Jira] |
I have debugged this issue. The ISR queue overflow happens as a result of the call to "release()" in Semaphore.cpp, which in turn calls "osSemaphoreRelease". I checked our driver on the Realtek side, and we are not calling "release()" anywhere in our code. It would be great if anyone from the ARM side could help me understand this issue. These are my findings so far. P.S. We are unable to debug with pyOCD because it reports an error, so debugging is taking an extended amount of time. |
Hi, are you 100% sure you are not using semaphores anywhere else? I can see at least these calls using git grep under TARGET_Realtek.
Any imbalance between acquiring and releasing those semaphores could potentially cause an issue, right? [Mirrored to Jira] |
@Archcady is there any update on this? |
Hi Marcelo, Realtek team is still trying to narrow down where the semaphore mismatch might come from. |
Hi, I have reproduced the same problem. I found that using some library (temperature sensor), to get some real-world data, causes this problem as soon as that library gets called, say mcp9808.readTemp(). [Mirrored to Jira] |
So does this mean the ISR queue overflow is not platform dependent? @JanneKiiskila |
On the Realtek side, we are still debugging the issue, but I'll share some of my findings here in case they provide some pointers.
P.S. I am still debugging the issue, these are just my findings, if anyone from arm team could get some pointers from these findings and highlight anything that I am missing, please kindly help out. |
There is something that's board specific (or driver specific) - K64F does not have this issue. But, clearly it's now something that's impacting more than one board, if this happens also with NUCLEO-F746ZG. [Mirrored to Jira] |
Hi @JanneKiiskila, do you think commenting out osRtxErrorNotify in osRtxPostProcess, or switching to a software timer for the semaphore release, could be the fix? |
This is something that is quite easy to hit in RTX when using any RTOS operations from interrupt context. The RTOS work is always deferred onto this queue, so if you do 16 consecutive RTOS operations from interrupt before returning to thread context, it overflows. I've raised one issue here for RTX suggesting how this could be improved, at least for flags. Not sure if the same logic could apply to semaphores. Maybe? Pending any RTX improvement, it's usually best to work around the issue by including logic to make sure you don't signal multiple consecutive times from interrupt: some sort of "pending" flag which is cleared by whoever is monitoring the semaphore. Do we really have no information about where the interrupt-context semaphore release that triggers this is coming from? No backtrace? [Mirrored to Jira] |
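The "pending" flag workaround described above can be sketched in plain C++ without any RTOS dependency. This is only an illustrative model, not RTX or Mbed OS code: `IsrQueueModel`, `CoalescedSignal`, and the two burst helpers are hypothetical names, and the capacity of 16 is simply the figure quoted in the comment above (in this simplified model the overflow is reported on the first post that finds the queue full). In a real driver, the `post()` call would presumably be the semaphore release made from the interrupt handler.

```cpp
#include <atomic>
#include <cstddef>

// Stand-in for the RTX ISR post-processing queue (hypothetical model).
// The capacity of 16 is taken from the comment above, not from RTX sources.
struct IsrQueueModel {
    static constexpr std::size_t kCapacity = 16;
    std::size_t depth = 0;    // entries waiting for thread-context processing
    bool overflowed = false;  // set where RTX would report "ISR Queue overflow"

    // Called for every RTOS operation deferred from interrupt context.
    void post() {
        if (depth == kCapacity) {
            overflowed = true;
            return;
        }
        ++depth;
    }

    // Called once execution returns to thread context and the queue is processed.
    void drain() { depth = 0; }
};

// Hypothetical wrapper: posts at most one release until the consumer acknowledges.
class CoalescedSignal {
public:
    explicit CoalescedSignal(IsrQueueModel &queue) : queue_(queue) {}

    // Safe to call repeatedly from the (simulated) interrupt handler.
    void signal_from_isr() {
        // exchange() returns the previous value, so only the first caller posts.
        if (!pending_.exchange(true)) {
            queue_.post();  // in real code: the semaphore release
        }
    }

    // Called by the thread monitoring the semaphore, before it handles the event,
    // so that an interrupt arriving during handling triggers a fresh post.
    void acknowledge() { pending_.store(false); }

private:
    IsrQueueModel &queue_;
    std::atomic<bool> pending_{false};
};

// Fires `count` back-to-back interrupts, each posting unconditionally.
inline bool naive_burst_overflows(std::size_t count) {
    IsrQueueModel queue;
    for (std::size_t i = 0; i < count; ++i) {
        queue.post();
    }
    return queue.overflowed;
}

// Same burst, but routed through the pending flag.
inline bool coalesced_burst_overflows(std::size_t count) {
    IsrQueueModel queue;
    CoalescedSignal signal(queue);
    for (std::size_t i = 0; i < count; ++i) {
        signal.signal_from_isr();
    }
    return queue.overflowed;
}
```

With the flag in place, a burst of any length occupies a single queue slot until the monitoring thread calls `acknowledge()`. The trade-off is that the consumer sees "at least one event happened" rather than an exact count, so this only fits cases where the semaphore is used as a wake-up signal.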
@prashantrar @ARMmbed/team-realtek |
We are having difficulty taking backtraces because the stack is corrupt the moment the crash happens. The crash always originates from the semaphore release, but beyond that the backtrace cannot point to specific functions and usually just shows " ?? ()". I will try to get proper backtraces again tomorrow and update this ticket. |
@kjbracey-arm I am updating the latest backtrace with all the latest mbed-os components.
[Mirrored to Jira] |
@ARMmbed/team-realtek @JanneKiiskila Is this still a blocker, and is the issue still not fixed? |
@M-ichae-l , can you confirm if this issue has been addressed? |
Internal Jira reference: https://jira.arm.com/browse/IOTPART-5928 |
Closing as target won't be supported in Mbed 6 - #12775 |
Description
Bug
Target
REALTEK_RTL8195AM
Toolchain:
GCC_ARM
Toolchain version:
mbed cli Windows installed toolchain 0.43
gcc_arm - same with Linux as well.
mbed-cli version:
(mbed --version) 1.2.2
mbed-os sha:
(git log -n1 --oneline) 2e1c2a1 (HEAD -> master, origin/master, origin/feature-lorawan, origin/HEAD) Merge pull request #5538 from geky/littlefs-staging
41591eb Merge pull request #5602 from artokin/nanostack_release_v704
DAPLink version:
241 - this one is also a bit old
It would be nice if Realtek could update the official DAPLINK you can download via their website.
Expected behavior
mbed-os-example-client can run for a very long time. For example, reference testing was done with K64F, and it works fine.
Actual behavior
Steps to reproduce
(Have not tried other compilers, though).