DSPLINK prevents DSP task execution

Other Parts Discussed in Thread: OMAPL138

Hello,

We are facing a real-time problem in our product, which uses DSPLINK on an OMAP-L138 chip.

We only use the DSPLINK MSGQ component for communication between the ARM and DSP applications. Up to 5 MSGQs are used, with a single pool from which messages of various sizes are allocated. Within each application, one task receives messages from all MSGQs and delivers them. In contrast, several tasks may use the MSGQs directly for sending.

Under normal activity, several 1-second messages are exchanged from the ARM application to the DSP application. After several hours of running, we noticed that periodic tasks (P = 1 ms) within the DSP application are delayed by more than 2 ms and therefore miss their deadlines.

If I understood correctly, DSPLINK/MSGQ uses a task (combined with an ISR) to process received messages, and this task has the highest priority in the DSP application. The task uses MPCS to safely access data shared with the ARM. The MPCS_enter function implements Peterson's algorithm to guarantee exclusive access to this data. Here is the piece of code within MPCS_enter I am talking about:

Uint32     timeout = MPCS_CONTENTION_POLLCOUNT;
...
/* Wait while other process is using the resource and has the turn. */
while (   (mpcsHandle->gppMpcsObj.flag == MPCS_BUSY)
       && (mpcsHandle->turn != SELF)) {
    HAL_cacheInv ((Ptr) mpcsHandle, sizeof (MPCS_ShObj)) ;
#if defined (DDSP_PROFILE)
    conflictFlag = TRUE ;
#endif /* if defined (DDSP_PROFILE) */
    sleepcount++;
    if ((timeout != 0xFFFFFFFFl ) && (sleepcount == timeout)) {
#if defined (DSP_TSK_MODE)
        TSK_sleep (1);
#else /* if defined (DSP_TSK_MODE) */
#endif /* if defined (DSP_TSK_MODE) */
        sleepcount = 0;
        timeout = timeout >> 1;
    }
}

When the problem is detected, we have traced it to the DSPLINK task in the DSP application executing for more than 2 ms, spinning in MPCS_enter before calling TSK_sleep (because of the timeout value MPCS_CONTENTION_POLLCOUNT). Because this task has the highest priority in the application, it prevents the other tasks from executing.
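
The back-off in the loop quoted above can be modelled on a host PC to see how many polls are executed between sleeps. This is a minimal sketch, not DSPLINK code: `mpcs_spin_bursts` and `mpcs_total_polls` are hypothetical helper names, and the poll counts passed to them are illustrative inputs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the back-off in the quoted MPCS_enter loop: spin for `timeout`
 * polls, sleep one tick, halve `timeout`, repeat.
 * Stores the length of each spin burst in bursts[] (at most max_bursts
 * entries) and returns the number of entries written.
 * Hypothetical helper; the 0xFFFFFFFF "never sleep" value is not modelled.
 */
size_t mpcs_spin_bursts(uint32_t pollcount, uint32_t *bursts, size_t max_bursts)
{
    size_t n = 0;
    uint32_t timeout = pollcount;

    while (timeout != 0 && n < max_bursts) {
        bursts[n++] = timeout;  /* polls executed before the next TSK_sleep(1) */
        timeout >>= 1;          /* same halving as in the quoted loop */
    }
    return n;
}

/* Total number of polls spent spinning until `timeout` has decayed to 0. */
uint32_t mpcs_total_polls(uint32_t pollcount)
{
    uint32_t total = 0;

    for (uint32_t timeout = pollcount; timeout != 0; timeout >>= 1) {
        total += timeout;
    }
    return total;
}
```

With an initial value of 800 the model gives 10 bursts (800, 400, ..., 1) and 1597 polls in total; with 40 it gives 6 bursts and 78 polls. Going only by the quoted snippet, and assuming `sleepcount` is an unsigned counter that starts at 0, once `timeout` has been halved to 0 the test `sleepcount == timeout` cannot succeed again until `sleepcount` wraps around, so the loop would then spin without sleeping. This is worth verifying against the full MPCS_enter source.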

I understand the goal of this algorithm, but I'd like to know how MPCS_CONTENTION_POLLCOUNT was chosen. Because it leads to 2 ms of spinning, I'm planning to lower its value and so reduce the active wait time.

Do you think this is the right solution, or is there another way to prevent the DSPLINK task from executing for 2 ms? (We are currently investigating on the ARM side what can hold the MPCS for that long.)

Here are the software versions we use:

DSPLINK 1.65.01.06

DSPBIOS 5.42.00.7

C6000 compiler 7.4.2

XDC tools 3.23.00.32

Benoît

  • Hi Benoît,

    I think you are correct in your assessment of the situation. By default, MPCS_CONTENTION_POLLCOUNT is set to 800. If this is taking too long you can certainly reduce the poll count. I don't think the value was necessarily optimized for this platform, given I see the same value defined on platforms other than OMAPL138 which are running at different clock rates.

    Another approach you can try is to perhaps lower the priority of this DSPLINK TSK, so that the periodic TSKs that matter to you can run at a higher priority. Looking at the code in <DSPLINK_INSTALL_DIR>/dsp/src/msg/DspBios/zcpy_mqt.c, I think you can change the priority using ARGUMENT1 of the MQT configuration in <DSPLINK_INSTALL_DIR>/config/all/CFG_OMAPL138GEM_SHMEM.c:

    STATIC LINKCFG_Mqt LINKCFG_mqtObjects [] =
    {
        {
            "ZCPYMQT",               /* NAME : Name of the Message Queue Transport */
            (Uint32) SHAREDENTRYID1, /* MEMENTRY : Memory entry ID (-1 if not needed) */
            (Uint32) -1,             /* MAXMSGSIZE : Maximum message size supported (-1 if no limit) */
            1,                       /* IPSID : ID of the IPS used */
            0,                       /* IPSEVENTNO : IPS Event number associated with MQT */
            0x0,                     /* ARGUMENT1 : First MQT-specific argument */
            0x0                      /* ARGUMENT2 : Second MQT-specific argument */
        }
    } ;

    That said, you are doing the right thing in trying to find out why the ARM is holding onto the MPCS for a long time. Ultimately you want to avoid having such a scenario in your system.

    Best regards,

    Vincent

  • Hello Vincent,

    First, thank you for your quick answer. We are still searching for what could hold the MPCS in the ARM application.

    Within the ZCPYMQT_send function, we observed that an interrupt can occur between the IPS_notify and MPCS_leave calls. Servicing this interrupt may take some time (hundreds of µs in practice), which blocks the DSP task for that whole time.

    Do you think that disabling interrupts while ZCPYMQT_send runs could be a solution? I read another post on this topic (but for SYSLINK) that highlights issues within the Linux kernel. Here is the post I am talking about: http://e2e.ti.com/support/embedded/tirtos/f/355/p/250517/878554.aspx#878554

    One good point is that lowering the active wait loop count on the DSP side resolves the issue of the DSPLINK task blocking the whole application. But we fear that sending multiple messages within a short time could lead to the same problem. It would surely take a fairly continuous flow of messages for that to happen.

    We are a bit hesitant about lowering the DSPLINK task's priority, as we do not yet fully understand its effects. Do you have any thoughts on this?

    We would set it to one of the lowest priorities in the application. Besides the DSPLINK task, only one application task uses MSGQ for receiving, but several use the MSGQ send functions.

    Do you think that setting the priority low could lead to other issues, for example by blocking tasks on sending? We also haven't seen any priority inversion protection within the MSGQ code.

    Best regards,

    Benoît

  • Hi Benoît,

    Let me see if I can get your data paths straight:

    1. Every 1 ms

    - ARM calls MSGQ_put
    - DSP receives interrupt and causes DSPLINK TSK to run at priority 15
    - Medium pri (say 4) DSP TSK returns from MSGQ_get and processes message

    2. Every 1 ms

    - A periodic TSK wakes up at high priority (say 5) and does some processing. It is important for this TSK to complete within the 1 ms.

    3. Asynchronously

    - Low pri (say 3) DSP TSK calls MSGQ_put to send messages to ARM

    Is this correct?

    I don't think disabling interrupts around the time when the MPCS is held would solve your issue. It would only ensure the ISR does not run during that time, but the minute interrupts are re-enabled the ISR will run before the DSPLINK TSK or your periodic TSK has a chance to run. If interrupt processing takes too long you'd have to either make it more efficient or live with a higher period. Fundamentally you need to have enough time during every period to receive and process the incoming message, run your periodic TSK, service any other interrupts (e.g. timer) and send any relevant messages to the ARM.
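
    The per-period budget described above can be written down as a simple check. This is only a sketch: `period_budget_ok` is a hypothetical helper, and the worst-case costs fed to it would have to come from measurements on the real system.

    ```c
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Returns true if the sum of the worst-case costs (message reception and
     * processing, periodic TSK, interrupt servicing, message sending, ...)
     * fits within one period. All values are in microseconds.
     * Hypothetical helper for illustration only.
     */
    bool period_budget_ok(uint32_t period_us, const uint32_t *cost_us, size_t n)
    {
        uint64_t total = 0;  /* 64-bit sum so that large inputs cannot overflow */

        for (size_t i = 0; i < n; i++) {
            total += cost_us[i];
        }
        return total <= period_us;
    }
    ```

    For a 1 ms period, made-up costs of 200, 300, 100 and 150 µs fit within the budget, whereas replacing the last entry with 500 µs would not.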

    Multiple calls to MSGQ_put will hold onto the MPCS each time, but there will be a time gap between each call to service the other TSKs, if the latter are at a higher priority level.

    As for whether to decrease the priority of the DSPLINK TSK, here's the question you need to ask yourself: is there any more important TSK in the system that needs to be executed even when the DSPLINK TSK is ready? In any case, if you choose to reduce the poll count to a minimal amount you shouldn't have to worry about changing the priority level, since you wouldn't be spending time spinning in the DSPLINK TSK.

    Best regards,
    Vincent

  • Hi Vincent,

    That's correct, you guessed right.

    But I'm not sure we mean the same thing by disabling interrupts. We noticed that the ARM driver can be interrupted during the ZCPYMQT_send function (especially between IPS_notify and MPCS_leave), which blocks the DSPLINK TSK (on the DSP side) because the notification was sent but the mutex has not been released.

    That's why we thought about disabling interrupts within the ARM driver for the whole time between IPS_notify and MPCS_leave. This would prevent the DSPLINK TSK from being blocked in MPCS_enter.
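
    The situation above (notification sent, lock not yet released) can be reproduced with a small single-threaded sketch of a Peterson-style lock with a bounded spin. It is not the MPCS implementation: the `peterson_*` names and the `GPP`/`DSP` identifiers are hypothetical, C11 atomics stand in for the shared-memory and cache handling, and giving up after `max_polls` replaces the sleep and back-off of the real loop.

    ```c
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    enum { GPP = 0, DSP = 1 };  /* the two contenders (illustrative IDs) */

    typedef struct {
        atomic_bool busy[2];    /* busy[i]: party i wants or holds the lock */
        atomic_int  turn;       /* party that has priority when both are busy */
    } peterson_lock;

    void peterson_init(peterson_lock *l)
    {
        atomic_init(&l->busy[GPP], false);
        atomic_init(&l->busy[DSP], false);
        atomic_init(&l->turn, GPP);
    }

    /* Try to take the lock, polling at most max_polls times. */
    bool peterson_try_enter(peterson_lock *l, int self, uint32_t max_polls)
    {
        int other = 1 - self;

        atomic_store(&l->busy[self], true);  /* announce interest */
        atomic_store(&l->turn, other);       /* give way to the other party */

        for (uint32_t polls = 0; polls < max_polls; polls++) {
            /* Enter once the other party is idle or the turn is ours. */
            if (!atomic_load(&l->busy[other]) || atomic_load(&l->turn) != other) {
                return true;
            }
        }
        atomic_store(&l->busy[self], false); /* give up: withdraw interest */
        return false;
    }

    void peterson_leave(peterson_lock *l, int self)
    {
        atomic_store(&l->busy[self], false);
    }
    ```

    If the `GPP` side enters and is then delayed before calling `peterson_leave` (for example by an interrupt), a `peterson_try_enter` call from the `DSP` side uses up all of its polls and fails; the same call succeeds as soon as the `GPP` side has left.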

    Concerning DSP task priorities, the answer is yes: there is at least one task (the one at priority 5 in your example) that should be executed before the DSPLINK one.

    I agree with you about the poll count. I lowered it to 40 and started a long-running test last week. So far it has not raised any problem, which would confirm that lowering the poll count is sufficient, with no need to lower the task priority.
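
    As a rough order of magnitude for the first spin burst, the figures from this thread can be scaled linearly. This is an estimate under stated assumptions, not a measurement: it takes the 2 ms spin as corresponding to the default poll count of 800 and assumes every poll costs the same, and `estimate_spin_us` is a hypothetical helper.

    ```c
    #include <stdint.h>

    /*
     * Linear estimate of the spin time for a new poll count, given one
     * observed (time, poll count) pair. Assumes a constant cost per poll,
     * which is a simplification. Hypothetical helper.
     */
    uint32_t estimate_spin_us(uint32_t observed_us, uint32_t observed_polls,
                              uint32_t new_polls)
    {
        if (observed_polls == 0) {
            return 0;  /* no reference point, nothing to scale */
        }
        /* 64-bit intermediate so that the multiplication cannot overflow */
        return (uint32_t) (((uint64_t) observed_us * new_polls) / observed_polls);
    }
    ```

    Under these assumptions, `estimate_spin_us(2000, 800, 40)` gives 100, i.e. about 100 µs of spinning before the first TSK_sleep instead of about 2 ms. Since the observed delay was "more than 2 ms", this is a lower bound.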

    Thank you for your time,

    Benoît

  • Hi Benoît,


    Ahh, I didn't realize you were referring to disabling interrupts on the ARM side. Yes, disabling interrupts between IPS_notify and LDRV_MPCS_leave() would prevent Linux-side code from delaying the release of the MPCS after notifying the DSP, which presumably would cause the DSPLINK TSK to spin-wait at a high priority. This may introduce interrupt latency on the ARM side however, so you want to be careful about that (not sure how long it takes to run that code - it could be in the noise though).

    Best regards,

    Vincent