Task_stat calls seem to block the system

Hi Community

We implemented a system monitor task in our SYS/BIOS 6 system. This task monitors and prints the load and stack usage level of all registered tasks. All our SYS/BIOS tasks are created dynamically and register themselves with the system monitor. Every X seconds, the system monitor task runs through the list of registered tasks and determines the load and stack usage level of each task by calling Load_getTaskLoad and Task_stat.

The system monitor task is a SYS/BIOS task running at the lowest priority of all our tasks except the idle task, which means priority 1. About 30 tasks are monitored. Some days after the implementation we noticed gaps in the audio output of our system. Using the execution graph function of the TI System Analyzer, we narrowed the problem down to our system monitor task. It appears that the system monitor task blocks all other, higher-priority tasks until it has finished. Furthermore, it looks as if even HWI ISRs aren't running, although HWIs must have occurred. At this point we got stuck, so we scanned through the CDOC of our SYS/BIOS 6.32.05.54.

In the CDOC we found the following hint for Task_stat:
"Task_stat has a non-deterministic execution time. As such, it is not recommended to call this API from Swis or Hwis." What does the term "non-deterministic execution time" mean in this context? What determines the execution time of a Task_stat call?

Is there anything to be aware of when calling Load_getTaskLoad and Task_stat that many times in a row? Will these calls disable HWIs, SWIs or task scheduling?

Cheers
Jo + Colleagues

  • Jo,

    After reading your description of the issue, I do not believe it is a result of calling Task_stat().  Task_stat() is non-deterministic, meaning that sometimes it takes a short time and other times it takes a long time.  The reason is that it's trying to determine how much stack the Task has used.  Depending on how large your task stack is and how much of it you have used, the execution time of Task_stat can vary.

    Task_stat does not call or disable Hwis, Swis or other Tasks.  You might want to check whether any idle function disables interrupts.

    Judah

  • Hi Judah,

    Thanks for your reply.

    Is my understanding correct that the execution time of a Task_stat call grows with the amount of unused stack? As far as I can see, SYS/BIOS initializes the stack area with 0xBEBEBEBE at the beginning. I guess that during the Task_stat() call, SYS/BIOS checks how many of the 0xBEBEBEBEs have been overwritten by stack use. From which direction does SYS/BIOS start the 0xBEBEBEBE check? To be more precise, can we reduce the execution time of the call by reducing the amount of unused stack?

    Cheers Jo

  • Jo,

    Yes, if you reduce the amount of unused stack, it should reduce the execution time of the Task_stat() call.

    Judah

  • Hi Judah

    judahvang said:

    You might want to check whether any idle function disables interrupts.

    According to your advice, I checked our idle function. We are using the default SYS/BIOS idle function, that means there is no other function hooked into the idle function.

    judahvang said:

    Task_stat does not call or disable Hwis, Swis or other Tasks.

    How does SYS/BIOS make sure that no context switch happens during the Task_stat() call?

    Cheers
    Jo

  • Jo,

    The point of Task_stat() is to call it in a way that is not intrusive to your system.  You don't want to disable interrupts to call Task_stat() because that would be intrusive to your system.  You should allow context switches to happen during a Task_stat() call, especially since checking the stack usage can take a long time if the Task stack is huge!

    Judah

  • Judah,

    judahvang said:

    The point of Task_stat() is to call it in a way that is not intrusive to your system.  You don't want to disable interrupts to call Task_stat() because that would be intrusive to your system.  You should allow context switches to happen during a Task_stat() call, especially since checking the stack usage can take a long time if the Task stack is huge!

    I tried to make the Task_stat() call as unintrusive as possible by making the call in a SYS/BIOS task, called SystemMonitor in our case, with priority 1 and without disabling anything (HWI, SWI, task scheduling). However, I can see on the execution graph that the SystemMonitor is quite intrusive. Please find attached a screenshot of the execution graph. The SystemMonitor task function is called staticWrapSystemMonitor. As you can see, as long as the SystemMonitor task is running, it looks like the HWIs are disabled, or at least the ISR "edmaISR" stops running. But there is no Hwi_disable() call in the source code.

    Any idea what can cause such a behaviour?

    Cheers
    Jo

  • Jo,

    So you have a single Task that does nothing but call Task_stat()?  And you think it's somehow disabling interrupts?

    If I understand your CPU graph correctly, the only time the Task_stat() Task runs is when there's nothing else going on in the system (since it's the lowest priority Task).  So assuming this is the case, could it simply be that there's no work going on in the system, and when an EDMA ISR comes in, it does go and execute that?

    Another thing to try would be to simply add the function that calls Task_stat to the Idle Task to see if that makes any difference.

    What device are you using?  On a C6000 device you can check the GIE bit to make sure it's enabled.  It's the LSB in the TSR register.

    Judah

  • Hi Judah,

    judahvang said:

    If I understand your CPU graph correctly, the only time the Task_stat() Task runs is when there's nothing else going on in the system (since it's the lowest priority Task).  So assuming this is the case, could it simply be that there's no work going on in the system, and when an EDMA ISR comes in, it does go and execute that?

    We started to investigate the problem because of audio gaps in the system audio output. The other blue-colored task in the exec graph is the audio task. As you can see, it is called very periodically because it has to serve a ping/pong buffer for the EDMA that transfers the audio samples via McASP into the "world". If this task isn't able to run as planned, audio gaps/errors will appear. Furthermore, the edmaISR runs quite often, so there must have been at least one HWI while the SystemMonitor was running. But the exec graph shows no run of the edmaISR during that time.

    Please find attached some pseudo code that shows what the SysMon is calling and how. The method "SystemMonitor::getTaskStatus()" is called every X seconds by the SysMon task. Please have a look at this code. As I mentioned in the initial post, SysMon calls Task_ and Load_ APIs.

    Cheers
    Jo

    SystemMonitorPseudoCode.cpp
  • Jo,

    I'm quite sure that Task_stat() is not the root of your problem here.  Can you try commenting out that call?  I bet you still see the problem.

    Now, looking at your code, I'm curious what MockPrint() does?  Does this print go out to the CCS console window?  If it does, I believe this could be the root of your issue because when you do a print, the Target is halted.  Try commenting out those calls and see if that makes a difference.

    What device are you running on again?  You didn't answer this question from my last post and I'm curious.

    Judah

  • Judah,

    judahvang said:

    What device are you running on again?  You didn't answer this question from my last post and I'm curious.

    We are running on a C674x device, and the DSP is part of an SoC (ARM + DSP). The SoC is protected by an NDA.

    judahvang said:

    Now, looking at your code, I'm curious what MockPrint() does?  Does this print go out to the CCS console window?  If it does, I believe this could be the root of your issue because when you do a print, the Target is halted.  Try commenting out those calls and see if that makes a difference.

    MockPrint is just a placeholder. In the original source code there is a call to our own print implementation, which pushes the message to a queue. The queue is located in the shared memory area (shared between DSP and ARM). The OS on the ARM processor reads the message from the queue and prints it on a console. Therefore the target isn't halted during that print call.

    judahvang said:

    I'm quite sure that Task_stat() is not the root of your problem here.  Can you try commenting out that call?  I bet you still see the problem.

    After we commented out the Task_stat() call, our audio gaps disappeared. We had already tried that during our error analysis, which is why the Task_stat() call became suspicious to us. At the moment the SystemMonitor is running without the Task_stat() call and it works. We commented out just this one call; all other lines of code are untouched.

    Cheers
    Jo

  • Jo,

    At this point, I'm sort of lost as to why Task_stat() would be blocking off interrupts.

    Can you please take some screenshots of the disassembly code for Task_stat() and any function calls it makes?  I believe it makes just one function call, and that function is small too.

    Meanwhile, I'll also continue to look at the generated code on my end.

    Judah

  • Jo,

    I looked into the disassembly code for Task_stat() and did not find any disabling of interrupts.

    In our compile line we do have "-mi10" specified, which tells the compiler that the maximum interrupt disable time is 10 cycles.

    Do you know what your compile line looks like?

    Judah

  • Judah,

    judahvang said:

    At this point, I'm sort of lost as to why Task_stat() would be blocking off interrupts.

    We are also curious about what's going on. We checked the SYS/BIOS C code of Task_stat() [1] again. I guess the function call you mentioned is at line 911 in the attached file. We will gather more information and logs.

    For now, I would like to provide some more information regarding the execution graph that I posted above. During the runtime of the System Monitor task function (the blue bar in the exec graph), there was only one iteration of the for-loop "for (uint32_t i = 0; i < MAX_NUM_TSK; i++) {" in SystemMonitorPseudoCode.cpp. We put SYS/BIOS Log_print calls around all the API calls in the for-loop and saw that nearly all of the runtime, the blue bar, was consumed by the Task_stat() call. And during that time the HWIs seem to be disabled.

    Cheers
    Jo

    [1] /opt/ti/bios_6_32_05_54/packages/ti/sysbios/knl/Task.c

  • Judah,

    judahvang said:

    I looked into the disassembly code for Task_stat() and did not find any disabling of interrupts.

    We will follow up and check the assembler code generated on our side. We may not be able to follow up today.

    judahvang said:

    In our compile line we do have "-mi10" specified, which tells the compiler that the maximum interrupt disable time is 10 cycles.

    Do you know what your compile line looks like?

    There is no --interrupt_threshold[=num] or -mi flag in our compiler call. Our compiler call looks like the following: "-mv6740 --gcc  --abi=eabi --exceptions --rtti -O2 -g".

    Cheers
    Jo

  • Nothing jumps out at me still.  I would definitely try the "-mi10" compile option just to see if it makes any difference.

    Another thing would be simply to set a breakpoint in that Task, step through the code, and see if and when interrupts get disabled.

    I'm currently not able to reproduce this on my end but would like to understand what is causing the issue.

    Judah

  • Judah,

    Just a quick update. In the meantime we relaxed the realtime requirement of our system by increasing the size of some ping/pong buffers. It looks like that got rid of the audio gaps in our audio output. Furthermore, we commented out the Task_stat() call to play it safe for now. For me this is only a temporary solution, but at the moment there are other, higher-priority problems that need to be solved. I hope we will be able to continue the analysis in mid-August.

    Best regards
    Jo

  • Jo,

    Okay.  The next best thing would be to maybe create a cut-down version of the program that you can post here which reproduces the problem.

    I've tried creating my own program that calls Task_stat() with a fast periodic timer and I do not see the issue you mention here.  In my case, the Timer always preempts the Task_stat() call.

    Judah