This thread has been locked.


Illegal priority call to llEnter(). NDK application for C6670

Dear all,

I'm running a NDK application which consists of two UDP daemons:

static void NetworkOpen()
{

    udp_rx_test = DaemonNew( SOCK_DGRAM, 0, 5002, udp_rx_test,
                             OS_TASKPRILOW, OS_TASKSTKHIGH, 0, 1 );
    udp_tx_test = DaemonNew( SOCK_DGRAM, 0, 5001, udp_tx_test,
                             OS_TASKPRILOW, OS_TASKSTKHIGH, 0, 1 );

}

One of these daemons sends UDP datagrams to a UDP server in a PC. The other receives UDP packets from a UDP client running in a PC. The purpose of this application is the measurement of networking performance.

When both daemons are running (not listening but active sending and receiving), I'm getting the following error randomly:

[C66xx_0] 00020.1662 Illegal priority call to llEnter()
[C66xx_0] 00020.1706 Illegal call to llExit()

And NDK is dead after this.

Additional details of my application are:

- XDCTools version: 3.24.7.73

- SYS/BIOS version: 6.35.1.29

- MCSDK version: 2.1.2.6 

- PDK for C6670 version: 1.1.2.6

- NDK version: 2.22.2.16

- The daemon / thread which sends UDP packets (sendto) calls Task_sleep(1).

- The daemon / thread which receives UDP packets calls the blocking recvncfrom().

- In order to increase performance, I have set the Clock tick period to 10 us (ten microseconds). Therefore, I have adjusted the NDK tick period to 10,000 ticks (100 milliseconds, as stated in the NDK user's guide).

- Both threads are low priority.

- I have set the heap size very large, and also the UDP receive buffers and NDK buffers (many megabytes). So I don't think the NDK is dying from lack of memory.

- Whether the network scheduler priority is low or high makes no difference with respect to the death of the NDK.

Please, any help is welcome.

Thanks in advance.

Kind regards,

Ricardo


  • Hi Ricardo,

    It looks like this message is coming out of llEnter:

    void llEnter()
    {
    ...

        /* Set this task's priority at kernel level */
        tmpOldPriority = TaskSetPri( TaskSelf(), OS_TASKPRIKERN );

        /* Verify this call was legal */
        if( tmpOldPriority >= OS_TASKPRIKERN )
        {
            if( InKernel )
                DbgPrintf(DBG_ERROR,"Illegal reentrant call to llEnter()");
            else
                DbgPrintf(DBG_ERROR,"Illegal priority call to llEnter()");
            return;
        }

    Can you put a break point at this code and see who's calling llEnter?

    Also, can you check ROV at this point?  What are the priorities of the tasks that are running in your system?

    Steve

  • Dear Steven,

    Thank you for your answer.

    "Can you put a break point at this code and see who's calling llEnter?"

    OK with the breakpoint. But with "who's calling llEnter?" I have an additional question.

    Is there something similar to "call trace" in CCS 5.2? How can I open it?

    What do you recommend in order to track who is calling llEnter? Just "step return" from this function and manually step through the calling functions? Is there a better way to achieve this?

    Thanks in advance.

    Regards,

    Ricardo

     

  • Ricardo,

    You can use the B3 register, which contains the return address on C6x CPUs.

    So, once you hit the break point, you open the CCS register view, find B3, copy the address and paste it into the dis/assembly window, it should take you to the address.  Then, you can scroll upward in the assembly code and find the function that you are in.

    Steve

  • Dear Steven,

    Thank you very much for your reply.

    I think that my problem is related to cache coherency.

    When I have the following sections in L2SRAM my NDK application runs perfectly.

    Program.sectMap[".nimu_eth_ll2"] = "L2SRAM";
    Program.sectMap[".resmgr_memregion"] = {loadSegment: "L2SRAM", loadAlign: 128}; /* QMSS descriptors region */
    Program.sectMap[".resmgr_handles"] = {loadSegment: "L2SRAM", loadAlign: 16};    /* CPPI/QMSS/PA Handles */
    Program.sectMap[".resmgr_pa"] = {loadSegment: "L2SRAM", loadAlign: 8};          /* PA Memory */

     

    It is when I move those sections to DDR3 that my application dies with those strange "Illegal priority call to llEnter()" messages.

    Could you please tell me if I'm right in this conclusion, that those memory sections MUST be in coherent memory?

    If I'm right, my NDK app works when those sections are in L2SRAM because of the Snoop Coherence Protocol described in the "TMS320C66x DSP Cache User Guide" (literature number SPRUGY8). To sum up, DMA accesses to L2SRAM are automatically kept coherent with L1D by hardware snooping.

    1) Could you please confirm this?

    2) Could you please tell me if all the four of those regions need to be in coherent memory or which ones?

    3) Could you please tell me what is the purpose of ".qmss" and ".cppi" memory sections? I can have both in DDR3 without any problems.

    4) How can I check if I have cache enabled for DDR3?

    5) If I need to allocate those regions to DDR3, how can I disable caching for those sections? In other words, how can I mark those sections as "non-cacheable"?

    Thank you very much.

    Regards,

    Ricardo


  • Hi Ricardo,

    1) Could you please confirm this?

    2) Could you please tell me if all the four of those regions need to be in coherent memory or which ones?

    There should be no hard dependency on those regions, except cache coherency may need to be maintained (usually through OSAL). I have not been able to confirm this yet; I have taken the client example from MCSDK and recompiled those regions to be in DDR3, but did not see the errors you saw. I will take a closer look before I confirm anything.

    3) Could you please tell me what is the purpose of ".qmss" and ".cppi" memory sections? I can have both in DDR3 without any problems.

    These are sections for data structures used by the QMSS/CPPI LLD. They should be in a shared memory space accessible by all cores to keep track of assigned resources between cores.

    4) How can I check if I have cache enabled for DDR3?

    5) If I need to allocate those regions to DDR3, how can I disable caching for those sections? In other words, how can I mark those sections as "non-cacheable"?

    I believe DDR3 should have caching enabled by default (in our GEL scripts); I may have to double-check if I'm wrong on this. You can enable/disable caching using the cache functions in the csl_cacheAux.h file, specifically CACHE_disableCaching and CACHE_enableCaching, which take the MAR region index to enable/disable as input.

    With that said, I'll begin digging deeper to see if I can reproduce your error on our other MCSDK examples.

    -Ivan

  • Dear Ivan,

    First of all, thanks for your reply.

    I think we can reproduce the problem I'm facing.

    Could you please try the following?

    1) Import client_evmc6670 example application

    2) Edit client.cfg source, and place every memory section into DDR3.

    3) In CCS Debug perspective -> Tools -> RTSC Tools -> Platform -> New, create a new platform based on EVM C6670, set L1D to 32KB, L1P to 32KB, L2 cache to "any value" (from 0 to 1MB), and place all Code memory, Data memory and Stack memory into DDR3

    Compile client_evmc6670 application using the platform defined in step 3 (project properties->General->RTSC and add and select the new platform).

    This "all in DDR3" application crashes within a few seconds if you just ping it or run send.exe (in winapps), testudp.exe, or any other simple UDP client.

    When it crashes, I can see two different behaviours:

    1) It prints those "Illegal priority call to llEnter()" messages

    2) If it does not print anything, EDITED: IP connectivity dies silently. I can see in ROV->Task that ti_ndk_config_Global_stackThread is Terminated, but I think this is normal behaviour

     

    Everything works fine when I leave default client.cfg (which means that some memory sections stay in L2SRAM, but everything else in DDR3; still using the newly defined platform).

    If this issue is not related to memory coherency, I have no idea what is going on. Could it be some DDR3 malfunction/misconfiguration?

    Could you please try to reproduce the steps above and check whether the application crashes?

    Thanks in advance,

    Ricardo

     

  • Ricardo,

    Strangely enough, I am not getting the same error. It does stop responding after a few seconds, but I never see the "Illegal priority call to llEnter()" messages. Have you tried setting L2 and L1 cache to 0 in your RTSC platform (while leaving the code/data/stack in DDR3) to see if that makes a difference?

    I'll continue looking into this, but I apologize in advance if this takes longer to debug than expected. 

    -Ivan

  • Dear Ivan,

    Thanks for your answer.

    It is "good" you are experiencing a similar problem.

    "Illegal priority call to llEnter()" messages are printed or not randomly under heavy UDP traffic.

    I have just started to take a look at the NIMU driver (nimu_eth.c file), and something interesting happens when debugging the all-in-DDR3 application: it is receiving packets above the MTU (MTU defined as 1514 in the NDK). I'm talking about the following piece of code:

    if ((pHostDesc->buffLen - 4) > (ptr_net_device->mtu + ETHHDR_SIZE)) {
        /* lets try the next one... we should record this as a too large.... */
        gRxDropCounter++;
        pHostDesc->buffLen = pHostDesc->origBufferLen;
        QMSS_QPUSH (gRxFreeQHnd, (Ptr)pHostDesc, pHostDesc->buffLen, SIZE_HOST_DESC, Qmss_Location_TAIL);
        continue;
    }

    In my current scenario, it makes no sense to receive packets longer than 1514 bytes, so I think something strange is happening. Moreover, when I place the memory sections in L2SRAM (when everything works as expected), the application does not seem to receive any packets bigger than 1514 bytes.

    Thanks for your help. It is important for us to track this down.

    Regards,

    Ricardo

  • Ricardo,

    I did further experiments and noticed that .resmgr_memregion is the only section that could not be in DDR (in my case, anyway). The application worked for me when it was in MSMCSRAM. When I checked gRxDropCounter, as you pointed out, I could also see packets bigger than 1514 bytes. This did not happen with .resmgr_memregion in MSMC or L2SRAM. When in DDR3, the packet length looks to be a stale value at times (origBufferLen).

    I suspect there may be problems with cache invalidate/write-back for DDR. In my case, I got this all-in-DDR3 version of the client example to work by modifying the NIMU to use the BIOS cache API instead of the CSL cache API. I attached my modified NIMU library here; can you try the BIOS cache API and see if it works better? (Granted, I am using a 6678 instead of a 6670, but the platforms are very similar.)

    Let me know your progress on this.

    -Ivan

    6052.nimu_test_6678_BIOS.zip

  • Dear Ivan,

    Better, but it still does not work properly.

    When I launch "send.exe" and "recv.exe" applications (found in winapps in NDK directory) at the same time, it dies with the following message:

    [C66xx_0] Network Removed: If-1:10.0.0.101

    [C66xx_0]

    [C66xx_0]   25:48   ( 39%)    17:96   ( 53%)    10:128  ( 41%)    17:256  ( 70%) 

    [C66xx_0]    1:512  ( 16%)     0:1536            0:3072        

    [C66xx_0] (18432/49152 mmAlloc: 94/0/94, mmBulk: 15/0/15)

    [C66xx_0]

    I have greatly increased the TCP buffers (from 8K to 80K, in the client.c file) in order to make sure this was not a new problem:

    // TCP Transmit buffer size
    rc = 81920;
    CfgAddEntry( hCfg, CFGTAG_IP, CFGITEM_IP_SOCKTCPTXBUF,
                 CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&rc, 0 );

    // TCP Receive buffer size (copy mode)
    rc = 81920;
    CfgAddEntry( hCfg, CFGTAG_IP, CFGITEM_IP_SOCKTCPRXBUF,
                 CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&rc, 0 );

    // TCP Receive limit (non-copy mode)
    rc = 81920;
    CfgAddEntry( hCfg, CFGTAG_IP, CFGITEM_IP_SOCKTCPRXLIMIT,
                 CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&rc, 0 );

    // UDP Receive limit
    rc = 81920;
    CfgAddEntry( hCfg, CFGTAG_IP, CFGITEM_IP_SOCKUDPRXLIMIT,
                 CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&rc, 0 );

     

    I have also greatly increased the number of frames in .far:NDK_PACKETMEM and the number of pages in .far:NDK_MMBUFFER:

    Global.pktNumFrameBufs = 1000;
    Global.memRawPageCount = 128;

    So this issue does not seem to be related to lack of memory.

     

    Please, could you try to reproduce this scenario? And please let me know of any progress with this issue.

    Thanks in advance,

    Ricardo

     

  • Ricardo,

    I can reproduce your error, but it only happens when running both the send and recv test apps at the same time. (For recv, I do get TCP retransmit timeout errors)

    I believe this may not be a bug, but a limitation of the DSP being hit by two bursting streams of TCP packets at once. I will continue to look into ways to confirm or work around this; what functionality are you looking for specifically?

    -Ivan

  • But it works smoothly when placing driver-related memory sections in L2SRAM...

    It would be great if we could investigate this deeper.

    For example, I can think of the following: you told me that the Cache_* functions in CSL may be buggy. These functions are also used in NDK applications apart from the NIMU driver. I think these functions should be replaced everywhere in the code, shouldn't they?

    Regards,

    Ricardo

  • It's possible you're running into the following issue:

    SDOCM00100363 mmBulkAlloc resets the stack if there's no memory available in the heap

    This could easily happen given that you're using large TCP send/receive buffer sizes.  When you create a new socket, it calls mmBulkAlloc to allocate the socket buffers - for both send and receive sides - using that large buffer size.

    If there's not enough heap available, then, due to the above bug, NDK will restart.

    The size of the heap is probably governed by the BIOS HeapMem module.  You can increase it by adding the following code to your BIOS configuration file (*.cfg):

    BIOS.heapSize = <insert heap size here>;

    You can also use ROV when the processor is halted to view the heap statistics (via the HeapMem module) to see things like how much free space is available at a given point in time.

    Finally, the above mentioned bug has been fixed in the latest NDK version (2.22.03.20, available on the NDK download page).

    Steve

  • Steven,

    I don't agree. With the same heap configuration but the memory sections in L2SRAM, it works perfectly, as I wrote in my previous post. I have also used ROV to look at the heap statistics, and the heap free size is huge.

    I'm going to download latest NDK version and give it a try.

    Thank you.

    Regards,

    Ricardo


  • Dear all,

    About the bug in the CSL cache functions: could you provide a patch, and could this be corrected in an official release of the PDK?

    I suppose there are lots of other pieces of code that use these functions too. I'm not sure, but for example SRIO, PCIe, AIF2 and other DMA-based peripherals.

    So I think it is not a good idea to trust the current version of the PDK.

    Could you please handle this through the standard/official channel?

    Thanks in advance.

    Regards,

    Ricardo

     

  • Ricardo,

    The BIOS cache API implements workarounds for two errata that we have noted on our C66x family. The CSL cache API, on the other hand, does not. In some cases outside of the NIMU that use the CSL cache API, the workarounds are implemented within the application.

    That said, there are two bug reports submitted to and accepted by the MCSDK team at the moment regarding the CSL cache API, and the NIMU using the CSL cache API. I cannot give an estimate of the next target release date containing these fixes. If need be, I suggest contacting your local TI FAE for any official patches or updates.

    Back to the current issue: I am going to need some more time to look into this further. I believe what Steve said is relevant and should be taken into account.

    -Ivan

  • Ricardo,

    Just syncing up to see if there is any news on this issue.

    When I was probing around, it looked like at some point when the error occurs, the descriptor loses the PBM handle to call PBM_free on. I have not identified the cause of the issue yet, but it does seem like a cache/memory-access related issue. I do not see the same error when the descriptors are in L2SRAM.

    -Ivan

  • Ivan,

    I can confirm those PBM_free error messages in console.

    I think those errors can be mitigated if the CSL cache functions in the resourcemgr.c file are replaced with the SYS/BIOS equivalents.

    In other words, the same thing we did for the NDK source code, now inside the resourcemgr.c file included in the example project.

    But not sure if this is enough. Could you please confirm this?

    Thanks.

    Regards,

    Ricardo


  • Ricardo,

    I believe you are right. When I traced down the missing PBM handle, it seemed to have been caused by QPUSH in resourcemgr.c pushing an invalid descriptor into the queues. After changing the CSL cache functions to the BIOS equivalents, I do not see the crash any more.

    In short, it seems sticking with the SYS/BIOS cache API throughout the entire application is the safest bet.

    -Ivan

    Hi,

    I created this error message:

    00008.605 Illegal priority call to llEnter()

    by running another task at a very high priority while running the NDK Stack Test. When I brought the other task down to priority 2, outside of the NDK's priority range, the error disappears.

    Hope this helps,

    Amanda