-
Notifications
You must be signed in to change notification settings - Fork 795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SMP trace status #412
Comments
It would be good to understand better the strange issues that occur when trace is enabled and used, e.g., are the issues related to the operation of ThreadX or problems with the trace buffer?
From: yf13 ***@***.***>
Sent: Sunday, October 13, 2024 5:10 AM
To: eclipse-threadx/threadx ***@***.***>
Cc: Bill Lamie ***@***.***>; Mention ***@***.***>
Subject: [eclipse-threadx/threadx] SMP trace status (Issue #412)
I am debugging a SMP port and noticed the sample works fine w/o trace. However when trace is enabled they are some strange issues.
@billlamiework<https://github.com/billlamiework> and @goldscott<https://github.com/goldscott>, I am wondering status of trace APIs in SMP ThreadX? does it work with other SMP ports?
I noticed that core id is embedded in most sigificant byte of event id, but not sure if there are any more SMP events yet to be added? Also from TraceX, there is no special reports on SMP trace file as well.
—
Reply to this email directly, view it on GitHub<#412>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BDUEV3EJLWA27E7R4X6RUVDZ3JPKNAVCNFSM6AAAAABPC4KQDKVHI2DSMVQWIX3LMV43ASLTON2WKOZSGU2TMMJSGYZDMMA>.
You are receiving this because you were mentioned.Message ID: ***@***.******@***.***>>
|
@billamiework, thanks for asking. I think it more relates to ThreadX operations of my port. My sample rotates two trace buffers upon buffer full and a thread_8 is added to record full buffer to files. The sample quits when So console log of a successful run looks like below, with 97 trace buffer rotates:
Below is that for typical failure case with a weird exception:
I tried skipping the data recording, but the issues still occur, so it seems not like trace buffer contention from the recording thread. If trace have been used with other SMP ports before then it is likely an issue specific to my port. |
@billamiework, I noticed that trace feature has been used with other upstream SMP ports but they don't rotate trace buffers. So I commented out buffer full notifier registration in my sample and it works as good as trace disabled situation. So my buffer full handler is more suspicious: /* Trace buffer full handler, called within critical section */
void trace_buffer_rotate(void *buffer)
{
void *new;
tx_trace_disable();
/* Switch buffer */
if (buffer == &trace_buffer0) {
new = &trace_buffer1;
} else {
new = &trace_buffer0;
}
trace_buffer_setup(new);
/* Send messag to notify the logging thread */
tx_queue_send(&queue_1, &buffer, TX_NO_WAIT);
} What is the right way for buffer rotation? |
It is good that you have narrowed it down to this callback handler… It would be good to check the thread or ISR stack availability where this callback occurs. Memory corruption from stack overflow is an easy way to cause a system failure. It might be interesting to know if there is any consistency on where the callback is being called from when the crash occurs. Does it work sometimes, or does it crash every time this is called? Also, what is going on inside of “trace_buffer_setup”?
From: yf13 ***@***.***>
Sent: Tuesday, October 15, 2024 4:37 AM
To: eclipse-threadx/threadx ***@***.***>
Cc: Bill Lamie ***@***.***>; Mention ***@***.***>
Subject: Re: [eclipse-threadx/threadx] SMP trace status (Issue #412)
@billamiework, I noticed that trace feature has been used with other upstream SMP ports but they don't rotate trace buffers.
So I commented out buffer full notifier registration in my sample and it works as good as trace disabled situation.
So it seems that buffer rotation might be more suspicious:
/* Trace buffer full handler, called within critical section */
void trace_buffer_rotate(void *buffer)
{
void *new;
tx_trace_disable();
/* Switch buffer */
if (buffer == &trace_buffer0) {
new = &trace_buffer1;
} else {
new = &trace_buffer0;
}
trace_buffer_setup(new);
/* Send messag to notify the logging thread */
tx_queue_send(&queue_1, &buffer, TX_NO_WAIT);
}
Is this the right use of buffer rotation?
—
Reply to this email directly, view it on GitHub<#412 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BDUEV3B5BNHIM7QWXINS3VLZ3T445AVCNFSM6AAAAABPC4KQDKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMJTGY2DKOJXHA>.
You are receiving this because you were mentioned.Message ID: ***@***.******@***.***>>
|
@billlamiework, thanks again for continuing this.
For For rv64, stacks for app and timer threads are 2048 bytes, the system stacks are of 4096 bytes. here is one log of
The log shows that exception happened to different app threads. Is there a simple way to find where the callback is called?
The demo works sometimes, but often fails with two typical failure cases:
Here it is: void trace_buffer_setup(void *buffer)
{
UINT ret;
ret = tx_trace_enable(buffer, DEMO_TRACE_SIZE, DEMO_TRACE_OBJECTS);
if (ret != TX_SUCCESS){
printf("tx_trace_enable failed: %d", ret);
exit(1);
}
ret = tx_trace_buffer_full_notify(trace_buffer_rotate);
if (ret != TX_SUCCESS) {
printf("failed to install buffer full handler: %d", ret);
exit(2);
}
} I haven't found the API to set event buffer only and reuse existing header and object registry, so this is what I have now. |
The trace_buffer_setup processing is non-trivial and, as implemented, could be called from ISR processing. Instead, I suggest using the buffer full callback notification to wake up a thread, which will then perform the trace buffer swap. I don’t know if this is the actual problem… but it’s hard to tell without digging in with the debugger.
From: yf13 ***@***.***>
Sent: Tuesday, October 15, 2024 2:10 PM
To: eclipse-threadx/threadx ***@***.***>
Cc: Bill Lamie ***@***.***>; Mention ***@***.***>
Subject: Re: [eclipse-threadx/threadx] SMP trace status (Issue #412)
@billlamiework<https://github.com/billlamiework>, thanks again for continuing this.
It would be good to check the thread or ISR stack availability where this
callback occurs.
For rv32, stacks for app and timers threads are of 1536 bytes, as per the logs from program finish, they look fine so far. The system stacks are of 2048 bytes, I need see how to track the system stack usage, I might need add some code to track its depth for sure.
For rv64, stacks for app and timer threads are 2048 bytes, the system stacks are of 4096 bytes. here is one log of rv64 success case:
Copyright (c) 2024 Eclipse Foundation. * ThreadX SMP/RISC-V/GNU Version 6.4.x *
**** ThreadX RiscV Demonstration **** (c) 1996-2024 Eclipse Foundation
thread 0 events sent: 1000
thread 1 messages sent: 503
thread 2 messages received: 504
thread 3 obtained semaphore: 253
thread 4 obtained semaphore: 252
thread 5 events received: 1001
thread 6 mutex obtained: 251
thread 7 mutex obtained: 251
thread 8 messages received: 97
Stack depths: 1048, 1184, 1184, 1184, 1184, 1172, 1192, 1204, 768, 1096
SMP inter-core interrupts: 1424, 554
It might be interesting to know if there is any consistency on where the callback is being called from when the crash occurs.
The log shows that exception happened to different app threads. Is there a simple way to find where the callback is called?
Does it work sometimes, or does it crash every time this is called?
The demo works sometimes, but often fails. Typical failure behaviours are two: one is above mentioned exception, the other is stalled QEMU which I have to quit forcedly.
what is going on inside of “trace_buffer_setup"?
Here it is:
void trace_buffer_setup(void *buffer)
{
UINT ret;
ret = tx_trace_enable(buffer, DEMO_TRACE_SIZE, DEMO_TRACE_OBJECTS);
if (ret != TX_SUCCESS){
printf("tx_trace_enable failed: %d", ret);
exit(1);
}
ret = tx_trace_buffer_full_notify(trace_buffer_rotate);
if (ret != TX_SUCCESS) {
printf("failed to install buffer full handler: %d", ret);
exit(2);
}
}
I haven't find an API to just set the event buffer and keep reusing existing header and object registry, so this is what I have now.
—
Reply to this email directly, view it on GitHub<#412 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BDUEV3EW5F52VHOIKRB7Q5TZ3WACDAVCNFSM6AAAAABPC4KQDKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMJVGE3TCMZWGE>.
You are receiving this because you were mentioned.Message ID: ***@***.******@***.***>>
|
Thank you @billlamiework, I guess you mean to keep As a side topic, I am wondering if we shall enrich trace design later so that more SMP related events are supported? |
Sounds good. Please let me know how it goes. I’m happy to discuss other issues/improvements at ***@***.******@***.***>
From: yf13 ***@***.***>
Sent: Wednesday, October 16, 2024 3:30 PM
To: eclipse-threadx/threadx ***@***.***>
Cc: Bill Lamie ***@***.***>; Mention ***@***.***>
Subject: Re: [eclipse-threadx/threadx] SMP trace status (Issue #412)
Instead, I suggest using the buffer full callback notification to wake up a thread, which will then perform the trace buffer swap.
Thank you @billlamiework<https://github.com/billlamiework>, I guess you mean to keep tx_queue_send() only in the callback? Will give this a try later as I am using a bigger buffer currently.
As a side topic, I am wondering if we shall enrich trace design later so that more SMP related events are supported?
—
Reply to this email directly, view it on GitHub<#412 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BDUEV3B6KDBOFOTU3KQ4CLDZ33SFJAVCNFSM6AAAAABPC4KQDKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMJYGA3TMOBUGY>.
You are receiving this because you were mentioned.Message ID: ***@***.******@***.***>>
|
@billamiework, I tried following buffer full callback and off-load rotation to a thread: /* The buffer full callback */
void trace_buffer_full(void *buffer)
{
tx_trace_disable();
tx_queue_send(&queue_1, &buffer, TX_NO_WAIT);
}
/* Trace rotation for thread #8 */
void trace_buffer_rotate(void *buffer)
{
TX_INTERRUPT_SAVE_AREA
void *new;
TX_DISABLE
if (buffer == &trace_buffer0) {
new = &trace_buffer1;
} else {
new = &trace_buffer0;
}
trace_buffer_setup(new);
TX_RESTORE
}
/* Enable trace with given buffer */
void trace_buffer_setup(void *buffer)
{
UINT ret;
ret = tx_trace_enable(buffer, DEMO_TRACE_SIZE, DEMO_TRACE_OBJECTS);
if (ret != TX_SUCCESS){
printf("tx_trace_enable failed: %d", ret);
exit(1);
}
ret = tx_trace_buffer_full_notify(trace_buffer_full);
if (ret != TX_SUCCESS) {
printf("failed to install trace buffer fully handler: %d", ret);
exit(2);
}
} Unfortunately it doesn't help yet, there are more frequent stalls and errors. |
I am debugging a SMP port and noticed the sample works fine w/o trace. However when trace is enabled they are some strange issues.
@billlamiework and @goldscott, can you teach status of trace APIs in SMP ThreadX? do they work with any other SMP ports?
I noticed that core id is embedded in most sigificant byte of event id, but not sure if there are any more SMP events yet to be added? Also from TraceX, there is no special reports on SMP trace file as well.
The text was updated successfully, but these errors were encountered: