RTOS/AM5746: Clock class time-out does not occur.

Part Number: AM5746
Other Parts Discussed in Thread: SYSBIOS

Tool/software: TI-RTOS

Hi, Experts,

Please give us your comments on the unexpected behavior below, which our customer found.

- Create the Master Task on Core0
- Create task_T0 on Core0
- Create task_T1 on Core1
- Create a Suspend Clock (3 ms) and a Resume Clock (100 ms) using the TI Clock class.
- Start the Suspend Clock and the Resume Clock.
- After 3 ms: the interrupt starts
- Task_setPri(Task T0, -1)
- Task_setPri(Task T1, -1)
   -> This Task_setPri() call never returns (it appears to hang)

The customer summarized this in a PDF, which I have attached here.

2500.task.pdf

In addition, the same behavior also occurred on the TMDXIDK5728. Here is the customer's project.

1643.LC_rtos_clk_not_timeout_app_am572x_a15_1024.zip

If you have any questions, please let me know.

Best regards.
Kaka

  • Hi,

    This is customer code using SYSBIOS and task functions. I have asked my colleagues on the TI-RTOS team whether they can support this.

    Regards, Eric
  • Hi Eric,

    I am waiting for your team's feedback. Would you please give us your comments?

    Best regards.
    Kaka
  • Hi, TI Experts,

    I am also waiting for your team's feedback.
    Could you please give us your comments?

    Best regards,
    Hajime-k

  • Hi Kaka (and Hajime-k),

    First of all, thank you for the wonderful write-up. We have someone on the SYS/BIOS team actively looking at this and we'll get back to you as soon as possible.

    Todd

  • Hi Kaka,

    I will attempt to reproduce this problem locally with your test and look into it.

    In the meantime, would you be able to gather some Log information to help us resolve this more quickly?  It will take some time for us to get set up and reproduce the problem.

    Adding the following code to your .cfg file will enable Task & Hwi event logs that can then be collected using ROV after the failure:

    BIOS.logsEnabled = true;
    var Diags = xdc.useModule('xdc.runtime.Diags'); 
    var LoggerBuf = xdc.useModule('ti.sysbios.smp.LoggerBuf'); 
    var LoggerBufParams = new LoggerBuf.Params();
    LoggerBufParams.numEntries = 8192;
    var logger = LoggerBuf.create(LoggerBufParams); 
    Task.common$.logger = logger;
    Task.common$.diags_USER1 = Diags.ALWAYS_ON;
    Task.common$.diags_USER2 = Diags.ALWAYS_ON;
    Task.common$.diags_USER3 = Diags.ALWAYS_ON; 

    var HalHwi = xdc.useModule('ti.sysbios.hal.Hwi');
    HalHwi.common$.logger = logger;
    HalHwi.common$.diags_USER1 = Diags.ALWAYS_ON;

    After the failure is detected, use ROV to browse the LoggerBuf "Records" view.

    Then copy/paste the entire output into a reply to this message.

    Regards,

    - Rob

  • Hi Kaka,

    You can ignore the Log output request above, we've been able to get help from another TI engineer in reproducing the problem.

    We think we have identified the issue with SYS/BIOS SMP Task_setPri() when being called from an ISR (which includes Clock functions).  The final solution is to fix SYS/BIOS and release new components, but that takes lots of time and is a cumbersome process.  We will certainly be fixing the current SYS/BIOS and back-porting the fix to SYS/BIOS versions that are important to our customer base (and you guys fall under that classification), but in the meantime we can endorse your "quick fix" to keep you moving forward on your development.

    The "quick fix" (or work-around) is to simply do as you did - perform the Task_setPri() calls in the opposite order, as you indicate is working in your .pdf write-up.  We need to confirm our suspicions about the bug, but if we're correct in our assumption then that means that doing the Task_setPri() on the core 0 task *last* in a sequence of Task_setPri() calls will always work and not crash the system:
        Task_setPri(core 1 task, -1);
        Task_setPri(core 0 task, -1);

    I hope this is a good enough work-around for now.  You did some excellent analysis of the problem as illustrated in your superb write-up in your .pdf file, and your observation of the opposite ordering of the Task_setPri() calls was key in allowing us to narrow down the problem.

    I will follow up on this post when we have confirmed our suspicions of the cause of the bug, but assuming we're correct (and I have a high level of confidence that we are), we can assure you that this work-around is solid.

    Regards,

    - Rob

  • Hi Rob,

    Thank you for your comments.
    I have passed your comments on to our customer. If they have more questions, I will let you know.

    Also, please let me know when this bug is fixed.

    Best regards.
    Kaka
  • Hi

    Have you already moved to SDK 5.1 (and SYS/BIOS 6.73.x)? If not, can you move?
  • Hi,

    Thank you very much for your kindness.

    I really appreciate your help.

     

    Our customer sent us their feedback.

    Currently, they use bios_6_52_00_12.
    - They are considering using the latest bug-fixed BIOS in their final product design.
    - But they cannot change the BIOS version at this time,
    - because changing the BIOS version would have a high impact on all of their project members.

    They require an "OFFICIAL WORKAROUND from TI" as evidence to proceed with their product development.
    They also require the release date of the bug-fixed BIOS version.
    -> Please consider providing an "OFFICIAL WORKAROUND" and a "Bug-Fix plan" to the customer.

    Detail:
    They understand that SYS/BIOS has a problem.
    Our customer found a "quick fix": changing the order of the Task_setPri() calls.
    But they cannot accept this "quick fix" as a "WORKAROUND",
    because they cannot verify all use cases, including side effects of this "quick fix".

    At a minimum, they require a detailed explanation for use cases such as the following:
       - They use "Task_setAffinity()=ANY". -> They think the "quick fix" cannot be applied.
       - They use Task_setPri() with a priority other than -1 (for example, 4) in an interrupt handler.
       - They plan to use the functions below. -> Do these have the same problem or not?

     Task_smp.c
      Task_stat(), Task_self(), Task_getAffinity(), Task_setPri()
     Semaphore.c
      Semaphore_post(), Semaphore_getCount()
     Event.c
      Event_post(), Event_getPostedEvents()
     Mailbox.c
      Mailbox_getNumFreeMsgs(), Mailbox_pend(BIOS_NO_WAIT), Mailbox_post(BIOS_NO_WAIT)

    Best regards, 

  • Hi matusan,

    FYI, SYS/BIOS 6.75.01 has been released and contains the bug fix for the Task_setPri() SMP problem.  I realize your customer can't switch now but wanted you & them to be aware.

    They raise a good point about using Task_setPri() with a task that has an affinity of ANY.  They can call Task_getAffinity() and take care to call Task_setPri() last for a task with core 0 affinity whenever a sequence of multiple Task_setPri() calls is performed.  This would work well if all their Task_setPri() calls are isolated to just one ISR thread.
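
    The "core 0 last" ordering can be sketched in plain C, independent of SYS/BIOS. This is only an illustration under stated assumptions: PriRequest and order_core0_last() are hypothetical names, not TI APIs, and the affinity field is assumed to hold a value obtained from Task_getAffinity(). In a real application each reordered request would then be applied with Task_setPri().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one pending Task_setPri() request.
 * 'affinity' is assumed to have been read with Task_getAffinity(). */
typedef struct {
    int      task_id;   /* placeholder for a Task_Handle */
    unsigned affinity;  /* core affinity value of the task */
    int      priority;  /* priority to apply, e.g. -1 */
} PriRequest;

/* Stable partition: requests for tasks on cores other than 0 come first,
 * core-0 requests are moved to the end. Relative order inside each group
 * is preserved. O(n^2), no allocation. */
void order_core0_last(PriRequest *reqs, size_t n)
{
    assert(reqs != NULL || n == 0);

    size_t write = 0;  /* next position for a non-core-0 request */
    for (size_t i = 0; i < n; i++) {
        if (reqs[i].affinity != 0) {
            PriRequest tmp = reqs[i];
            /* shift the core-0 requests in [write, i) right by one slot */
            for (size_t j = i; j > write; j--) {
                reqs[j] = reqs[j - 1];
            }
            reqs[write++] = tmp;
        }
    }
}
```

    The caller would build the request array, call order_core0_last(), and then issue the Task_setPri() calls in array order, all from the same ISR thread.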

    However, if they have Task_setPri() calls from various ISR threads (where an "ISR thread" is one of Hwi/Swi/Clock) then even this work-around could fail. If one ISR thread calls Task_setPri(task-on-core-0) and then gets preempted by a different ISR thread that calls Task_setPri(task-on-core-other-than-0) then the same bad situation has been reached and will likely crash the system.

    This scenario leads me to believe that there are other sequences that could result in this bug exposing itself.  For instance, if a Semaphore_post() is called from an ISR that readies a task on core 0, followed by a Task_setPri() for a task on a core other than 0, then I think the same problem will occur.  These API calls wouldn't even need to be in the same ISR thread.  The initial Semaphore_post() could be in a separate ISR thread that gets preempted by the Task_setPri() ISR thread.

    What sort of things would they need to have an "OFFICIAL WORKAROUND"?  Certainly something that's endorsed by TI, but what else?

    matusan said:

    - They plan to use the functions below. -> Do these have the same problem or not?

     Task_smp.c
      Task_stat(), Task_self(), Task_getAffinity(), Task_setPri()
     Semaphore.c
      Semaphore_post(), Semaphore_getCount()
     Event.c
      Event_post(), Event_getPostedEvents()
     Mailbox.c
      Mailbox_getNumFreeMsgs(), Mailbox_pend(BIOS_NO_WAIT), Mailbox_post(BIOS_NO_WAIT)

    I don't really understand the question about the above APIs.  This particular bug is isolated to Task_setPri() for SMP.  I assume they're worried about calling the above APIs from ISR threads.  We are not aware of any bugs associated with the above APIs, whether called from a Task thread or called from an ISR thread.  The Task_setPri() bug does not affect those other APIs at all.

    We don't have any bug-fix plan for older releases at the moment, but we will discuss the topic of releasing a new SYS/BIOS 6.52.01 with this bug fix, and will let you know the outcome of those discussions.

    Regards,

    - Rob

  • Hi,

    Thank you very much for the prompt reply.
    I really appreciate your help.

    Our understanding is as follows:
    For "Task_setAffinity()=ANY":
    One suggestion is to call Task_getAffinity() and take care to call Task_setPri() last for a task with core 0 affinity.

    For the listed functions:
    The Task_setPri() bug does not affect the other APIs at all.

    We would appreciate your comments on the following point:
    - Using Task_setPri() with a priority other than -1 (for example, 4) in an interrupt handler.

    We are also waiting for an update on the bug-fix plan.

    Best regards,

  • matusan said:
    We would appreciate if you tell us the comment for the below point.
    - use Task_setPri(Not -1: 4...) in Interrupt handler.

    Using Task_setPri(4) from an interrupt handler has the same issue as Task_setPri(-1); the value of priority doesn't matter to the bug.

    matusan said:
    We are also waiting for the update of bug-fix plan.

    We will update this thread when there is a plan.

    I would like to suggest an alternative method that could be used as a work-around for this Task_setPri() bug...

    Since the problem with Task_setPri() happens when it is called from an interrupt handler, the problem could be avoided by calling it from a Task instead.  A Task dedicated to calling Task_setPri() could be sent a Mailbox message from the ISR, and the Mailbox message could contain the task handle and the desired priority, for example:

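    /* Message carrying one Task_setPri() request from the ISR to the task. */
    struct setPriMsg {
        Task_Handle task;
        Int priority;
    };

    /* Dedicated Task: the only place where Task_setPri() is called. */
    Void setPriTask(UArg arg0, UArg arg1)
    {
        struct setPriMsg msg;

        for (;;) {
            Mailbox_pend(mailbox, &msg, BIOS_WAIT_FOREVER);
            Task_setPri(msg.task, msg.priority);
        }
    }

    /* Hwi/Swi/Clock function: posts the request instead of calling Task_setPri(). */
    Void interruptHandler(UArg arg)
    {
        struct setPriMsg msg;

        msg.task = task;          /* handle of the task to change */
        msg.priority = priority;  /* desired priority */
        Mailbox_post(mailbox, &msg, BIOS_NO_WAIT);
    }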

    If the setPriTask's priority was set to the highest priority value (usually 15, unless you've configured a different number of priority levels) then it would run as soon as the interrupt handler was finished.
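
    The deferral pattern can also be modelled without SYS/BIOS to see the control flow, including what happens when the queue is full. In the sketch below, ReqQueue, queue_post() and queue_pend() are hypothetical stand-ins for the Mailbox and its non-blocking post and pend; they are not TI APIs, QUEUE_DEPTH is an assumed value, and the model is single-threaded, so it does no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_DEPTH 4  /* assumed depth; size it for the worst-case burst */

/* Hypothetical request message, mirroring struct setPriMsg. */
typedef struct {
    int task_id;   /* placeholder for a Task_Handle */
    int priority;  /* desired priority */
} SetPriMsg;

/* Fixed-size FIFO standing in for the Mailbox. */
typedef struct {
    SetPriMsg slots[QUEUE_DEPTH];
    size_t head;   /* index of the oldest queued message */
    size_t count;  /* number of queued messages */
} ReqQueue;

/* Models a non-blocking post from the interrupt handler.
 * Returns false when the queue is full, so a dropped request is visible. */
bool queue_post(ReqQueue *q, const SetPriMsg *msg)
{
    assert(q != NULL && msg != NULL);
    if (q->count == QUEUE_DEPTH) {
        return false;
    }
    q->slots[(q->head + q->count) % QUEUE_DEPTH] = *msg;
    q->count++;
    return true;
}

/* Models a non-blocking pend in the dedicated task.
 * Returns false when there is nothing to process. */
bool queue_pend(ReqQueue *q, SetPriMsg *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0) {
        return false;
    }
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return true;
}
```

    In the real code, the equivalent check is the Bool returned by Mailbox_post(): with BIOS_NO_WAIT it returns FALSE if the mailbox is full, in which case the request was not queued and the handler should at least record that it was dropped.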

    Regards,

    - Rob

  • Can I mark this as resolved?

    Todd
  • Hi,

    Thank you very much for your kindness.
    I really appreciate your help.

    Our customer needs an answer to the question below.

    Question:
    - When will you decide the release date of the bug-fixed SYS/BIOS?

    Best regards,
  • Hi,

    With the holidays over, everyone is back in the office and we are trying to figure out the best way to handle this. We should have an update later this week.

    Todd
  • Hi,

    The issue was fixed in SYSBIOS 6.75.01, and we don't have a plan to fix it in the older SYSBIOS 6.52.

    The current Processor SDK RTOS release is 5.2, with SYSBIOS 6.73. We have a plan to take the latest version in Processor SDK 6.75. There is no plan to go back to an older baseline for Processor SDK releases. The 6.73 update did have many migration issues within Processor SDK, including a tool chain update. So please suggest that the customer migrate to the latest Processor SDK RTOS 5.2 with the SYSBIOS fix in 6.75.01. It is available at: software-dl.ti.com/.../

    Regards, Eric
  • Hi,

    Thank you very much for your kindness.
    I really appreciate your help.
    I will send the answer to the customer.

    Best regards,