SRIO IPC Transport not working with >=3 cores per chip

Hi,

We have been trying for quite some time now to get IPC messaging over SRIO working with more than two cores active per chip, using two EVM6678 boards and a breakout board.

I used the srioIpcChipToChipExample supplied with the PDK as a starting point; it works well as is (two cores per chip sending messages). However, as soon as three or more cores per chip are configured, the SRIO transport breaks down.
Although initialization seems to be successful, the NameServerMessageQ is not functional and fails to resolve remote message-queue IDs.
We used EVMs for better reproducibility, but it behaves exactly the same on the final target hardware.

After digging for more information, we found the following, unfortunately unanswered, post describing what sounds like the same issue: http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/284083/996887.aspx

I've uploaded the output with 3 cores active here: http://pastebin.com/PdLPSMCt
The modified srioChipToChip example project can be downloaded here: 4274.srioIpcChipToChipExample3Cores.zip

We really need assistance with this issue; given the time already lost trying to get it working, it is becoming critical for our project.

Thank you in advance, Clemens

Versions used:
XDCTools: 3.25.4.60
CGT: 7.4.2
IPC: 1.24.3.32
PDK: 1.1.2.6
SYS/BIOS: 6.35.4.50

  • After further investigation we found that even the two-core srioChipToChip example has a failure rate of roughly 80%.
    I followed the steps mentioned in the Readme.txt precisely.

    When it fails, it behaves like this:
    1. Loaded producer-image on each core of board 1
    2. Loaded consumer-image on each core of board 2
    3. Started cores on board 1
    4. Started cores on board 2

    5. SRIO Ports are opened successfully
    6. IPC + SRIO transport initialize successfully
    7. The producer fails to open the queues created by the consumer, spinning endlessly in MessageQ_open() in sendPacketsOffChip(), where MessageQ_open() returns -5 (= MessageQ_E_NOTFOUND); see the sketch after this list.
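
    For reference, the open is retried in a loop along the following lines (a minimal sketch of the pattern we use in sendPacketsOffChip, including our added debug print; the back-off interval is our own choice, not the stock example's):

    #include <xdc/std.h>
    #include <xdc/runtime/System.h>
    #include <ti/ipc/MessageQ.h>
    #include <ti/sysbios/knl/Task.h>

    /* Sketch: poll MessageQ_open() until the remote queue name resolves. */
    static MessageQ_QueueId openRemoteQueue(String remoteQueueName)
    {
        MessageQ_QueueId remoteQueueId;
        Int status;

        do {
            status = MessageQ_open(remoteQueueName, &remoteQueueId);
            if (status < 0) {
                /* This is where we spin: status stays -5 (MessageQ_E_NOTFOUND)
                   because the remote NameServer lookup never succeeds. */
                System_printf("MsQ open failed, status: %d\n", status);
                Task_sleep(1);  /* back off one tick before retrying */
            }
        } while (status < 0);

        return remoteQueueId;
    }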

    I enabled the Osal_srioLog function in bench_osal in order to get more useful debugging output; unfortunately, it did not provide any useful insights. Please see the log messages at the end of this post.

    Any help would be greatly appreciated.

    Thank you in advance, Clemens


    [C66xx_0] Local Core ("CORE0") starting
    Local Core ID: 0
    Global Core ID: 0

    -----------------------Initializing---------------------------
    Core 0 : L1D cache size 4. L2 cache size 0.
    Core 0 : Memory region 0 inserted
    [C66xx_1] Local Core ("CORE1") starting
    Local Core ID: 1
    Global Core ID: 1
    Core 1: Waiting for SRIO to be initialized.
    [C66xx_8] Local Core ("CORE0") starting
    Local Core ID: 0
    Global Core ID: 2

    -----------------------Initializing---------------------------
    Core 2 : L1D cache size 4. L2 cache size 0.
    Core 2 : Memory region 0 inserted
    [C66xx_9] Local Core ("CORE1") starting
    Local Core ID: 1
    Global Core ID: 3
    Core 3: Waiting for SRIO to be initialized.
    [C66xx_8] Port 0 is okay
    Port 1 is okay
    Port 2 is okay
    [C66xx_0] Port 0 is okay
    Port 1 is okay
    Port 2 is okay
    Port 3 is okay
    Core 0: SRIO Driver has been initialized
    [C66xx_1] Core 1: SRIO can now be used.
    [C66xx_0] Debug: AppConfig Tx Queue: 0x65 Flow Id: 8552584
    Debug: SRIO Driver Instance 0x@0083b754 has been created
    [C66xx_1] Debug: AppConfig Tx Queue: 0x65 Flow Id: 8552584
    Debug: SRIO Driver Instance 0x@0083b754 has been created
    localQueueName=CORE1
    remoteQueueName=CORE3
    Core 1: tsk0 starting
    [C66xx_0] localQueueName=CORE0
    remoteQueueName=CORE2
    Core 0: tsk0 starting
    Global Core 0: Sending packets to an off-chip core.
    Global Core 0 attempting to open remote board Queue CORE2
    [C66xx_1] Global Core 1: Sending packets to an off-chip core.
    Global Core 1 attempting to open remote board Queue CORE3
    [C66xx_8] Port 3 is okay
    Core 2: SRIO Driver has been initialized
    [C66xx_9] Core 3: SRIO can now be used.
    [C66xx_8] Debug: AppConfig Tx Queue: 0x65 Flow Id: 8551624
    Debug: SRIO Driver Instance 0x@0083b394 has been created
    [C66xx_9] Debug: AppConfig Tx Queue: 0x65 Flow Id: 8551624
    Debug: SRIO Driver Instance 0x@0083b394 has been created
    localQueueName=CORE3
    Core 3: tsk0 starting
    [C66xx_8] localQueueName=CORE2
    Core 2: tsk0 starting
    Global Core 2: Receiving packets from an off-chip core.
    [C66xx_9] Global Core 3: Receiving packets from an off-chip core.
    [C66xx_0] MsQ open failed, status: -5  //Additional debug output added to sendPacketsOffChip
    [C66xx_1] MsQ open failed, status: -5
    [C66xx_0] MsQ open failed, status: -5
    [C66xx_1] MsQ open failed, status: -5
    [C66xx_0] MsQ open failed, status: -5
    [C66xx_1] MsQ open failed, status: -5

  • Hi Clemens,

    The SrioChipToChipProducer and SrioChipToChipConsumer examples in pdk_C6678_1_1_2_6 are broken due to code changes in ipc_1_24_03_32; specifically, the length of the instance name in the function NameServerMessageQ_get() in ipc_1_24_03_32\packages\ti\sdo\ipc\nsremote\NameServerMessageQ.c. Please see the details in this thread:

    http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/295667/1112316.aspx#1112316
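
    To illustrate the kind of failure only (a hypothetical sketch, not the actual NameServerMessageQ.c source; the real code and fix are in the linked thread): when a remote-name lookup copies the instance name into a fixed-size field, a length bound that disagrees with the names the application actually uses makes every lookup fail, so the caller's MessageQ_open() keeps returning MessageQ_E_NOTFOUND.

    /* Hypothetical sketch -- NOT the real IPC source; names and sizes
       are made up for illustration. */
    #include <string.h>

    #define NAME_FIELD_LEN 16        /* fixed-size name field in the request */

    typedef struct {
        char instanceName[NAME_FIELD_LEN];
    } LookupRequest;

    static int buildLookup(LookupRequest *req, const char *instanceName)
    {
        /* If this bound is wrong for the queue names the application
           uses, the lookup fails before it is ever sent. */
        if (strlen(instanceName) >= NAME_FIELD_LEN) {
            return -1;
        }
        strcpy(req->instanceName, instanceName);
        return 0;
    }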

    Regards, Garrett

  • Hi Garrett,

    Unfortunately, this seems to be a different bug.
    The problem is that TransportSrioSetup_setupRxDescBufs() in TransportSrioSetup.c gets called multiple times when more than two cores are used.

    I fixed the problem by guarding the function with a static flag so that the setup runs only once per core:

    Int TransportSrioSetup_setupRxDescBufs (UArg arg, UInt16 input)
    {
      SharedRegion_Entry entry;
      GateMP_Handle gateMpHandle;
      GateMP_Params gateMpParams;
      HeapBufMP_Handle heapHandle;
      HeapBufMP_Params heapBufParams;
      Int status;
      /* With more than two cores this function is entered more than once;
         guard it so the setup below runs only once per core. */
      static char initDone = 0;

      if (initDone)
        return (0);

      SharedRegion_getEntry(TransportSrioSetup_descMemRegion, &entry);

      /* Check if owner of memory region */
      if (entry.ownerProcId == MultiProc_self())
      {
        /* Owner creates the gate and the shared heap */
        GateMP_Params_init (&gateMpParams);
        gateMpParams.localProtect = GateMP_LocalProtect_INTERRUPT;
        gateMpHandle = GateMP_create (&gateMpParams);

        HeapBufMP_Params_init(&heapBufParams);
        heapBufParams.sharedAddr = TransportSrioSetup_module->reservedMemAddr;
        /* Should be same as what was defined in TransportSrioSetup_Module_Startup */
        heapBufParams.numBlocks = TransportSrioSetup_numRxDescBuffs;
        heapBufParams.blockSize = TransportSrio_srioMaxMtuSizeBytes;
        heapBufParams.gate = gateMpHandle;
        heapHandle = HeapBufMP_create(&heapBufParams);
      }
      else
      {
        /* Open the heap created by the other processor. Loop until opened. */
        do {
          status = HeapBufMP_openByAddr(TransportSrioSetup_module->reservedMemAddr,
                                        &heapHandle);
        } while (status < 0);
      }

      /* Register this heap with MessageQ */
      MessageQ_registerHeap((IHeap_Handle)heapHandle, TransportSrioSetup_messageQHeapId);

      /* Initialize the receive side descriptors with MsgQ msgs */
      TransportSrio_initRxDescBufs();

      initDone = 1;
      return (0);
    }
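
    Note that the static initDone flag is per core, so each core still executes the setup exactly once: the shared-region owner creates the heap, the other cores open it, and the redundant calls that broke the configurations with more than two cores now return immediately.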

    Ralf