J722SXH01EVM: Issues getting tisp/dmautils to work with FreeRTOS

Part Number: J722SXH01EVM

Hello,

I've developed a basic bare-metal application on the C7x that uses TISP to test the hardware and develop our algorithms. This has been working fine, but I now need to implement IPC with the Linux host, and that appears to require FreeRTOS, even though I would prefer to stay bare metal. (The NoRTOS library is missing for the C7x.)

To move to FreeRTOS, my first step is to link the FreeRTOS library. Because the library changes the entry point, I also have to change the linker file to add sections for the MMU and BSS; I based these on a reference linker file from an example. The FreeRTOS entry point also seems to depend on the MMU configuration in ti_dpl_config.c, which is generated by SysConfig, so I include that file as well.

With these changes in place, my project builds again, and the main function runs as before, but TISP, in particular the use of the DRU, is broken. The loop checking for completion of the transfer now hangs forever.

In particular, this line of DmaUtilsAutoInc3d_wait() hangs forever:
while ((eflRegisterVal & waitWord) != waitWord) {
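
While debugging a hang like this, it can help to temporarily bound the poll so the program reports which completion bits never arrived instead of spinning forever. The following is a minimal sketch of that idea, not the DMAUtils implementation: `poll_with_timeout`, `fake_efl_register` and the poll limit are hypothetical names, and an ordinary variable stands in for the hardware register that `eflRegisterVal` is read from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the hardware event flag register. Assumption: on the target
 * this value comes from a C7x register; here it is an ordinary variable so
 * the sketch can run anywhere. */
volatile uint64_t fake_efl_register = 0;

/* Poll *reg until every bit of waitWord is set, or give up after maxPolls
 * reads. Returns 0 on success and -1 on timeout. On timeout, *missing holds
 * the requested bits that were still clear. */
int poll_with_timeout(volatile uint64_t *reg, uint64_t waitWord,
                      uint32_t maxPolls, uint64_t *missing)
{
    assert(reg != NULL && missing != NULL);

    uint64_t value = *reg;
    uint32_t polls = 1;

    while ((value & waitWord) != waitWord) {
        if (polls >= maxPolls) {
            *missing = waitWord & ~value;
            return -1;
        }
        value = *reg;
        polls++;
    }

    *missing = 0;
    return 0;
}
```

On a timeout, the `missing` mask shows which channel events were never signalled, which helps separate "the transfer never started" from "the completion event was never delivered".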

Nothing in my main function has changed - the only change is the addition of FreeRTOS library and the adjustment of the entry point, and associated linker changes. For reference, here is my old linker file:

-heap  0xF0000   // 960 kB
-stack 0x4000    //  16 kB
--cinit_compression=off
--args 0x1000
--diag_suppress=10068 // "no matching section"

MEMORY
{

  L2SRAM_CINIT (RWX)  : org = 0x7E000000, len = 0x000100 //for 256byte init     c7x_0 = 7E000000, c7x_1 = 7E200000
  L2SRAM (RWX)        : org = 0x7E000100, len = 0x1fff00 //for 2MBytes  EL2
  L2SRAMAUX   (RWX): org = 0x7F000000, len = 0x040000   // for 256 KBytes J7AEN c7x_0 = 7F000000, c7x_1 = 7F800000

  EXTMEM_STATIC   (RWX): org = 0x80000000, len = 0x4000000
  EXTMEM          (RWX): org = 0x84000000, len = 0x200000

}

SECTIONS
{
  .sram_start START(_sram_start) > L2SRAM NOINIT

//  .kernel: {
//    *.obj (.text:optimized) { SIZE(_kernel_size) }
//  } > EXTMEM

//  .kernel_data SIZE(_data_size)
  .l2mem            > L2SRAM
  .l1dmemory        > L2SRAM
  .l2dmemory        > L2SRAM
  .text:            > L2SRAM
  .text:touch:      > L2SRAM
  .text:_c_int00:   > L2SRAM_CINIT
  .neardata:        > L2SRAM
  .rodata:          > L2SRAM
  .bss:             > L2SRAM
  .init_array:      > L2SRAM
  .far:             > L2SRAM
  .fardata:         > L2SRAM
  .neardata         > L2SRAM
  .rodata           > L2SRAM
  .data:            > L2SRAM
  .switch:          > L2SRAM
  .args:            > L2SRAM align = 0x4, fill = 0 {_argsize = 0x200; }
  .sysmem:          > L2SRAM
  .cinit:           > EXTMEM
  .const:           > L2SRAM START(const_start) SIZE(const_size)
  .pinit:           > L2SRAM
  .cio:             > L2SRAM
  .stack:           > L2SRAM
  .ddrData          > EXTMEM_STATIC
  .staticData       > EXTMEM_STATIC
  .l2sramaux        > L2SRAMAUX
  xdc.meta:        > L2SRAM, type = COPY
}

And here is the new linker file. It's basically straight from an example, but I've enlarged the heap and moved most of the local program data to SRAM.

--ram_model
-heap  0x20000
-stack 0x20000
--args 0x1000
--diag_suppress=10068 /* to suppress no matching section error */
--cinit_compression=off
-e _c_int00_secure

#define DDR0_ALLOCATED_START  0xA3000000

#define C7X_ALLOCATED_START DDR0_ALLOCATED_START

#define C7X_RESOURCE_TABLE_BASE (C7X_ALLOCATED_START + 0x00100000)
#define C7X_IPC_TRACE_BUFFER    (C7X_ALLOCATED_START + 0x00100400)
#define C7X_BOOT_BASE           (C7X_ALLOCATED_START + 0x00200000)
#define C7X_VECTOR_BASE         (C7X_ALLOCATED_START + 0x00400000)
#define C7X_DDR_SPACE_BASE      (C7X_ALLOCATED_START + 0x00410000)

MEMORY
{
    L2SRAM (RWX):  org = 0x7E000000,                len = 0x200000
    DDR0_RESERVED: org = 0x80000000,                len = 0x19800000         /*  Reserved for A53 OS */
    C7X_IPC_D:     org = C7X_ALLOCATED_START,       len = 0x00100000         /*  1MB DDR */
    C7X_BOOT_D:    org = C7X_BOOT_BASE,             len = 0x400              /*  1024B DDR */
    C7X_VECS_D:    org = C7X_VECTOR_BASE,           len = 0x4000             /*  16KB DDR */
    C7X_CIO_MEM:   org = C7X_DDR_SPACE_BASE,        len = 0x1000             /*  4KB */
    C7X_DDR_SPACE: org = C7X_DDR_SPACE_BASE+0x1000, len = 0x00BF0000-0x1000  /*  11.9MB - 4KB DDR  */
    /* For resource table */
    C7X_RT_D:      org = C7X_RESOURCE_TABLE_BASE, len = 0x400         /*  1024B DDR */
    /* IPC trace buffer */
    LINUX_IPC_TRACE_BUFFER: org = C7X_IPC_TRACE_BUFFER, len = 0xFFC00 /* 1023KB DDR */
    LOG_SHM_MEM             : ORIGIN = 0xA7000000, LENGTH = 0x40000
    /* Shared memory for RTOS NORTOS IPC */
    RTOS_NORTOS_IPC_SHM_MEM: org = 0xA5000000, len = 0x1C00000  /* 28MB DDR */
}

SECTIONS
{
    boot:
    {
      boot.*<boot.oe71>(.text)
    } load > C7X_BOOT_D
    .vecs       >       C7X_VECS_D
    .secure_vecs    >   C7X_DDR_SPACE ALIGN(0x100000)
    .text:_c_int00_secure > C7X_DDR_SPACE ALIGN(0x200000)
    .text       >       C7X_DDR_SPACE ALIGN(0x100000)

    .bss        >       L2SRAM  /* Zero-initialized data */
    RUN_START(__BSS_START)
    RUN_END(__BSS_END)

    .l2mem      >       L2SRAM
    .ddrData    >       C7X_DDR_SPACE
    .data       >       L2SRAM  /* Initialized data */

    .cinit      >       L2SRAM  /* could be part of const */
    .init_array >       L2SRAM  /* C++ initializations */
    .stack      >       L2SRAM ALIGN(0x2000)
    .args       >       L2SRAM
    .cio        >       C7X_CIO_MEM
    .const      >       L2SRAM
    .switch     >       L2SRAM /* For exception handling. */
    .sysmem     >       L2SRAM /* heap */

    GROUP:              >  C7X_DDR_SPACE
    {
        .data.Mmu_tableArray          : type=NOINIT
        .data.Mmu_tableArraySlot      : type=NOINIT
        .data.Mmu_level1Table         : type=NOINIT
        .data.gMmu_tableArray_NS       : type=NOINIT
        .data.Mmu_tableArraySlot_NS   : type=NOINIT
        .data.Mmu_level1Table_NS      : type=NOINIT
    }

    .benchmark_buffer:     > C7X_DDR_SPACE ALIGN (32)

    /* This is the resource table used by linux to know where the IPC "VRINGs" are located */
    .resource_table: { __RESOURCE_TABLE = .;} > C7X_RT_D
    /* This IPC log can be viewed via ROV in CCS and when linux is enabled, this log can also be viewed via linux debugfs */
    .bss.debug_mem_trace_buf    : {} palign(128)    > LINUX_IPC_TRACE_BUFFER
    /* this is used when Debug log's to shared memory is enabled, else this is not used */
    .bss.log_shared_mem  (NOLOAD) : {} > LOG_SHM_MEM
    /* this is used only when IPC RPMessage is enabled */
    .bss.ipc_vring_mem   (NOLOAD) : {} > RTOS_NORTOS_IPC_SHM_MEM
}
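
When a memory map like the one above is edited by hand, a quick host-side check of the region table can rule out overlapping or mis-sized regions before looking at the MMU. The sketch below is only an illustration: the `MemRegion` type and the helper functions are hypothetical, and the addresses and lengths are transcribed from the MEMORY block of the FreeRTOS linker file in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint64_t    org;   /* start address */
    uint64_t    len;   /* length in bytes */
} MemRegion;

#define C7X_ALLOCATED_START 0xA3000000ULL

/* Values transcribed from the MEMORY block of the FreeRTOS linker file. */
const MemRegion kRegions[] = {
    { "L2SRAM",                  0x7E000000ULL,                        0x200000ULL },
    { "DDR0_RESERVED",           0x80000000ULL,                        0x19800000ULL },
    { "C7X_IPC_D",               C7X_ALLOCATED_START,                  0x00100000ULL },
    { "C7X_RT_D",                C7X_ALLOCATED_START + 0x00100000ULL,  0x400ULL },
    { "LINUX_IPC_TRACE_BUFFER",  C7X_ALLOCATED_START + 0x00100400ULL,  0xFFC00ULL },
    { "C7X_BOOT_D",              C7X_ALLOCATED_START + 0x00200000ULL,  0x400ULL },
    { "C7X_VECS_D",              C7X_ALLOCATED_START + 0x00400000ULL,  0x4000ULL },
    { "C7X_CIO_MEM",             C7X_ALLOCATED_START + 0x00410000ULL,  0x1000ULL },
    { "C7X_DDR_SPACE",           C7X_ALLOCATED_START + 0x00411000ULL,  0x00BF0000ULL - 0x1000ULL },
    { "RTOS_NORTOS_IPC_SHM_MEM", 0xA5000000ULL,                        0x1C00000ULL },
    { "LOG_SHM_MEM",             0xA7000000ULL,                        0x40000ULL },
};
const size_t kNumRegions = sizeof(kRegions) / sizeof(kRegions[0]);

/* Two half-open ranges [org, org + len) overlap if each one starts
 * before the other one ends. */
int regions_overlap(const MemRegion *a, const MemRegion *b)
{
    return (a->org < b->org + b->len) && (b->org < a->org + a->len);
}

/* Count the overlapping pairs among n regions. */
size_t count_overlaps(const MemRegion *regions, size_t n)
{
    size_t overlaps = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (regions_overlap(&regions[i], &regions[j])) {
                overlaps++;
            }
        }
    }
    return overlaps;
}

/* Size of a region in KiB, handy for checking the size comments. */
uint64_t region_kib(const MemRegion *r)
{
    return r->len / 1024u;
}
```

For the table above, `count_overlaps(kRegions, kNumRegions)` finds no overlapping pairs, and `region_kib` reports 28672 KiB (28 MiB) for RTOS_NORTOS_IPC_SHM_MEM.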

My only guess is this could be related to the MMU, but I don't know. Any help is appreciated.

  • Hello Tyler,

    First, TISP is not verified or supported on J722S, so anything you have gotten to work so far is excellent. However, what you are attempting has not been verified by the TI team, and you must use the TI PSDK RTOS release for J722S with FreeRTOS if you want IPC to work with Linux.

    Also, the TI PSDK RTOS does not provide NoRTOS/bare-metal support.

    Thanks.

  • Evidently I shouldn't have mentioned TISP, because it is irrelevant to the current problem.

    I am trying to get DMAUtils to work under FreeRTOS, which is definitely something TI should support, because you have provided no other guidance on how to use the DRU.

    Let me describe the problem. I will include source code that reproduces it without any TISP.

    The first problem I run into is that drivers lib and dmautils lib have conflicting copies of udma.c in them, so to get the program to build I have to remove one of the udma.c from one of the libraries. They are clearly very different implementations, so it is strange to me that they are both being offered in the same package. The one under DMAUtils includes support for DRU, so I include that one and comment the other one out of the drivers.lib makefile.

    Now the build works, and I can run a simple program (below) that triggers the DRU and waits for it to complete. This works fine without FreeRTOS.

    __attribute__((section(".ddrData"), aligned(128))) uint8_t ddrBuffer[DDR_SIZE];
    
    __attribute__((section(".l2mem"), aligned(128))) float pInputBlock[128*16*2];
    __attribute__((section(".l2mem"), aligned(128))) float pOutputBlock[128*16*2];
    
    Udma_DrvHandle udmaHandle;
    int main() {
        printf("Starting program\r\n");
    
        struct Udma_DrvObj udmaDrvObj;
    
        Udma_InitPrms initPrms;
    
        uint32_t instId;
        uint32_t retVal;
    
        Udma_DrvHandle drvHandle = &udmaDrvObj;
    
        instId = UDMA_INST_ID_MAIN_0;
        UdmaInitPrms_init(instId, &initPrms);
        initPrms.printFxn     = &testDmaAutoIncPrintf;
        initPrms.virtToPhyFxn = &testDmaAutoIncVirtToPhyFxn;
        retVal = Udma_init(drvHandle, &initPrms);
    
        if (UDMA_SOK != retVal) {
           printf("[Error] UDMA init failed!!\n");
        }
    
        udmaHandle = drvHandle;
    
        int numChannels = 2;
        uint8_t *pTrMemCh[2];
        int32_t  chIdIn[2];  // Input channel IDs
        int32_t  chIdOut[2]; // Output channel IDs
        uint8_t *dmaUtilsContext = (uint8_t *) memalign(128, DmaUtilsAutoInc3d_getContextSize(numChannels));
    
    
    
        for (size_t ch = 0; ch < numChannels; ch++) {
           pTrMemCh[ch] = (uint8_t *) memalign(128, DmaUtilsAutoInc3d_getTrMemReq(1));
        }
    
        retVal = dmaUtils<float>::init(dmaUtilsContext, numChannels, udmaHandle);
    
        if (retVal != UDMA_SOK) {
           printf("[Error] DMAUtils init failed!!\n");
        }
    
        uint32_t transferSize = DMAUTILSAUTOINC3D_SYNC_2D;
        DmaUtilsAutoInc3d_TransferDim transferDimIn;  /*!< Structure to hold transfer properties of input DMA channel */
        DmaUtilsAutoInc3d_TransferDim transferDimOut; /*!< Structure to hold transfer properties of output DMA channel */
    
        /***********************/
        /* Configure input DMA */
        /***********************/
    
        // set transfer dimension structure
        dmaUtils<float>::dmaAutoIncSetupXferPropIn2D(128, 128, 128, 16, 128*sizeof(double), 128*sizeof(double),
                                                       &transferDimIn);
        // assign channel number
        chIdIn[0] = 0;
        /* chIdIn[0] = dmaChOffset::globalChOffset; */
        /* dmaChOffset::globalChOffset += 1; */
    
        // configure channel
        retVal = dmaUtils<float>::configure_channel(dmaUtilsContext, chIdIn[0], pTrMemCh, (uint8_t *) ddrBuffer,
                                                      (uint8_t *) pInputBlock, transferSize, &transferDimIn);
    
        if (retVal != UDMA_SOK) {
           printf("[Error] Input channel configuration failed!!\n");
        }
    
        /***********************/
        /* Configure output DMA */
        /***********************/
    
        // set transfer dimension structure
        dmaUtils<float>::dmaAutoIncSetupXferPropOut2D(128, 128, 128, 16, 128*sizeof(double), 128*sizeof(double),
                                                         &transferDimOut);
        // assign channel number
        chIdOut[0] = 1;
        /* chIdOut[0] = dmaChOffset::globalChOffset; */
        /* dmaChOffset::globalChOffset += 1; */
    
        // configure channel
        retVal = dmaUtils<float>::configure_channel(dmaUtilsContext, chIdOut[0], pTrMemCh, (uint8_t *) pOutputBlock,
                                                       (uint8_t *) ddrBuffer, transferSize, &transferDimOut);
    
        if (retVal != UDMA_SOK) {
           printf("[Error] Output channel configuration failed!!\n");
        }
    
        dmaUtils<float>::trigger(dmaUtilsContext, chIdIn, 1);
    
        dmaUtils<float>::wait(dmaUtilsContext, chIdIn, 1);
    
    
        printf("Goodbye\n");
    
    
    }

    Now, if I include the FreeRTOS library `freertos.j722s.c75x.ti-c7000.debug.lib`, the build breaks because my entry point is _c_int00. If I simply change the entry point to _c_int00_secure, it builds again, but the program now hangs on the very last "wait" line.

    Please help me understand what is going wrong here. "We don't support it" or "It is not verified" won't fly - being able to use the DRU is fundamental for any DSP application that would run on this processor. If I am taking the wrong approach please let me know. I am using the 'ipc_rpmsg_echo_linux' application as my baseline, which I know works and can communicate with linux.

  • Just to add a bit more of my suspicion -

    I found the source for _c_int00_secure, and as far as I can tell the only thing it does is call MmuP_init(), which suggests that my MMU configuration is breaking the DRU.

    I can't find any reasonable documentation on how the DRU works or how it is connected to the MMU. My MMU configuration gives the C7x access to the required SRAM and DDR regions, as well as to the DRU register space. I can perform the same copy operation the DRU is attempting from the C7x using a for loop, and it succeeds.
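
The CPU-side check mentioned above can be written as a small strided copy plus a row-by-row comparison. This is a minimal sketch under stated assumptions: `cpu_copy_2d` and `first_mismatched_row` are hypothetical helpers, not part of DMAUtils or TISP, and the caller chooses the row size and pitches in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `rows` rows of `rowBytes` bytes each. The pitches are the byte
 * distances between the starts of consecutive rows in each buffer. */
void cpu_copy_2d(uint8_t *dst, const uint8_t *src, size_t rowBytes,
                 size_t rows, size_t dstPitch, size_t srcPitch)
{
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);

    for (size_t r = 0; r < rows; r++) {
        memcpy(dst + r * dstPitch, src + r * srcPitch, rowBytes);
    }
}

/* Return the index of the first row that differs between the two buffers,
 * or -1 if every row matches. Only the rowBytes payload of each row is
 * compared; the padding between rows is ignored. */
long first_mismatched_row(const uint8_t *a, const uint8_t *b, size_t rowBytes,
                          size_t rows, size_t aPitch, size_t bPitch)
{
    for (size_t r = 0; r < rows; r++) {
        if (memcmp(a + r * aPitch, b + r * bPitch, rowBytes) != 0) {
            return (long)r;
        }
    }
    return -1;
}
```

If the CPU copy verifies correctly but the DRU transfer over the same buffers never completes, the CPU's view of the memory is at least consistent, which narrows the search to how the DRU itself reaches those addresses.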

    If I remove any of these three regions from the MMU configuration, the program will either crash or refuse to start. So I know the MMU settings are being applied and they seem to work to some degree.

    So I believe somehow the MMU has given access to the c7x core but not to the DRU? I'd appreciate any insight on the topic.

    I tried setting the enableMmu parameter from syscfg to 0, but it looks like in MmuP_c75.c this feature is not yet supported.

  • We need to understand what you are trying to get running on C7x outside the TI RTOS SDK's current support, which is to use C7x for TIDL/Deep Learning. We suggest you discuss your case with the TI field representative so that we can communicate effectively.

    In the interim, we'll ask our expert to comment on your DRU question.

    Thanks.

  • Hi Tyler,

    It looks like the DRU stops working once you add FreeRTOS support, which you did because the IPC example requires it. As a matter of fact, IPC is OS agnostic: it should work with both FreeRTOS and NoRTOS. Here is an example from the AM263Px MCU+ SDK 10.00.00.35: /ipc_notify_echo_am263px-cc_r5fss0-1_nortos_ti-arm-clang. Attached are the two files from that example that may interest you:

    ipc_notify_echo.c
    /*
     *  Copyright (C) 2021 Texas Instruments Incorporated
     *
     *  Redistribution and use in source and binary forms, with or without
     *  modification, are permitted provided that the following conditions
     *  are met:
     *
     *    Redistributions of source code must retain the above copyright
     *    notice, this list of conditions and the following disclaimer.
     *
     *    Redistributions in binary form must reproduce the above copyright
     *    notice, this list of conditions and the following disclaimer in the
     *    documentation and/or other materials provided with the
     *    distribution.
     *
     *    Neither the name of Texas Instruments Incorporated nor the names of
     *    its contributors may be used to endorse or promote products derived
     *    from this software without specific prior written permission.
     *
     *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
     *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
     *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
     *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
     *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
     *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
     *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
     *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
     *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
     *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
     *  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     */
    #include <stdio.h>
    #include <inttypes.h>
    #include <kernel/dpl/ClockP.h>
    #include <kernel/dpl/SemaphoreP.h>
    #include <drivers/ipc_notify.h>
    #include "ti_drivers_open_close.h"
    #include "ti_board_open_close.h"
    
    /* This example shows message exchange between multiple cores.
     *
     * One of the cores is designated as the 'main' core
     * and the other cores are designated as 'remote' cores.
     *
     * The main core initiates IPC with the remote cores by sending each of them a message.
     * The remote cores echo the same message to the main core.
     *
     * The main core repeats this for gMsgEchoCount iterations.
     *
     * In each iteration of message exchange, the message value is incremented.
     *
     * When iteration count reaches gMsgEchoCount, a semaphore is posted and
     * the pending thread/task on that core is unblocked.
     *
     * When a message or its echo is received, a user registered callback is invoked.
     * The message is echoed from within the user callback itself.
     *
     * This is an example message exchange; in final systems, users can implement more
     * sophisticated message exchanges as needed for their applications.
     */
    
    /* number of iterations of message exchange to do */
    uint32_t gMsgEchoCount = 1000000u;
    /* client ID that is used to send and receive messages */
    uint32_t gClientId = 4u;
    
    #if defined(SOC_AM64X)
    /* main core that starts the message exchange */
    uint32_t gMainCoreId = CSL_CORE_ID_R5FSS0_0;
    /* remote cores that echo messages from main core, make sure to NOT list main core in this list */
    uint32_t gRemoteCoreId[] = {
        CSL_CORE_ID_R5FSS0_1,
        CSL_CORE_ID_R5FSS1_0,
        CSL_CORE_ID_R5FSS1_1,
        CSL_CORE_ID_M4FSS0_0,
        CSL_CORE_ID_A53SS0_0,
        CSL_CORE_ID_MAX /* this value indicates the end of the array */
    };
    #endif
    
    #if defined(SOC_AM243X)
    /* main core that starts the message exchange */
    uint32_t gMainCoreId = CSL_CORE_ID_R5FSS0_0;
    /* remote cores that echo messages from main core, make sure to NOT list main core in this list */
    uint32_t gRemoteCoreId[] = {
        CSL_CORE_ID_R5FSS0_1,
        CSL_CORE_ID_R5FSS1_0,
        CSL_CORE_ID_R5FSS1_1,
        CSL_CORE_ID_M4FSS0_0,
        CSL_CORE_ID_MAX /* this value indicates the end of the array */
    };
    #endif
    
    #if defined (SOC_AM263X) || defined (SOC_AM263PX)
    /* main core that starts the message exchange */
    uint32_t gMainCoreId = CSL_CORE_ID_R5FSS0_0;
    /* remote cores that echo messages from main core, make sure to NOT list main core in this list */
    uint32_t gRemoteCoreId[] = {
        CSL_CORE_ID_R5FSS0_1,
        CSL_CORE_ID_R5FSS1_0,
        CSL_CORE_ID_R5FSS1_1,
        CSL_CORE_ID_MAX /* this value indicates the end of the array */
    };
    #endif
    
    #if defined(SOC_AM273X) || defined(SOC_AWR294X)
    /* main core that starts the message exchange */
    uint32_t gMainCoreId = CSL_CORE_ID_R5FSS0_0;
    /* remote cores that echo messages from main core, make sure to NOT list main core in this list */
    uint32_t gRemoteCoreId[] = {
        CSL_CORE_ID_R5FSS0_1,
        CSL_CORE_ID_C66SS0,
        CSL_CORE_ID_MAX /* this value indicates the end of the array */
    };
    #endif
    
    #if defined(SOC_AM261X)
    /* main core that starts the message exchange */
    uint32_t gMainCoreId = CSL_CORE_ID_R5FSS0_0;
    /* remote cores that echo messages from main core, make sure to NOT list main core in this list */
    uint32_t gRemoteCoreId[] = {
        CSL_CORE_ID_R5FSS0_1,
        CSL_CORE_ID_MAX /* this value indicates the end of the array */
    };
    #endif
    
    /* semaphores used to indicate that the main core has finished all message exchanges */
    SemaphoreP_Object gMainDoneSem[CSL_CORE_ID_MAX];
    
    /* semaphore used to indicate that a remote core has finished all message exchanges */
    SemaphoreP_Object gRemoteDoneSem;
    
    void ipc_notify_msg_handler_main_core(uint32_t remoteCoreId, uint16_t localClientId, uint32_t msgValue, int32_t crcStatus, void *args)
    {
        /* increment msgValue and send it back until gMsgEchoCount iterations are done */
        if(msgValue != (gMsgEchoCount-1))
        {
            /* send a new message to the remote core that echoed our message */
            msgValue++;
            IpcNotify_sendMsg(remoteCoreId, gClientId, msgValue, 1);
        }
        else
        {
            /* there is one semaphore for each core ID, so post the semaphore for the remote core that
             * has finished all message exchange iterations
             */
            SemaphoreP_post(&gMainDoneSem[remoteCoreId]);
        }
    }
    
    void ipc_notify_echo_main_core_start(void)
    {
        int32_t status;
        uint32_t i, numRemoteCores;
    
        /* create completion semaphores for all cores */
        for(i=0; i < CSL_CORE_ID_MAX; i++)
        {
            SemaphoreP_constructBinary(&gMainDoneSem[i], 0);
        }
    
        /* register a handler to receive messages */
        status = IpcNotify_registerClient(gClientId, ipc_notify_msg_handler_main_core, NULL);
        DebugP_assert(status==SystemP_SUCCESS);
    
        /* wait for all cores to be ready */
        IpcNotify_syncAll(SystemP_WAIT_FOREVER);
    
        DebugP_log("[IPC NOTIFY ECHO] Message exchange started by main core !!!\r\n");
    
        for(i=0; gRemoteCoreId[i]!=CSL_CORE_ID_MAX; i++)
        {
            uint32_t msgValue = 0;
            /* send messages to all participating cores, wait for each message to be put in the HW FIFO */
            status = IpcNotify_sendMsg(gRemoteCoreId[i], gClientId, msgValue, 1);
            DebugP_assert(status==SystemP_SUCCESS);
        }
    
        /* wait for all messages to be echo'ed back */
        numRemoteCores = 0;
        for(i=0; gRemoteCoreId[i]!=CSL_CORE_ID_MAX; i++)
        {
            SemaphoreP_pend(&gMainDoneSem[ gRemoteCoreId[i] ], SystemP_WAIT_FOREVER);
            numRemoteCores++;
        }
    
        DebugP_log("[IPC NOTIFY ECHO] All echoed messages received by main core from %d remote cores !!!\r\n", numRemoteCores);
        DebugP_log("[IPC NOTIFY ECHO] Messages sent to each core = %d \r\n", gMsgEchoCount);
        DebugP_log("[IPC NOTIFY ECHO] Number of remote cores = %d \r\n", numRemoteCores);
        DebugP_log("All tests have passed!!\r\n");
    }
    
    void ipc_notify_msg_handler_remote_core(uint32_t remoteCoreId, uint16_t localClientId, uint32_t msgValue, int32_t crcStatus, void *args)
    {
        /* on remote core, we have registered handler on the same client ID and current core client ID */
        IpcNotify_sendMsg(remoteCoreId, localClientId, msgValue, 1);
    
        /* if all messages received then post semaphore to exit */
        if(msgValue == (gMsgEchoCount-1))
        {
            SemaphoreP_post(&gRemoteDoneSem);
        }
    }
    
    void ipc_notify_echo_remote_core_start(void)
    {
        int32_t status;
    
        SemaphoreP_constructBinary(&gRemoteDoneSem, 0);
    
        /* register a handler to receive messages */
        status = IpcNotify_registerClient(gClientId, ipc_notify_msg_handler_remote_core, NULL);
        DebugP_assert(status==SystemP_SUCCESS);
    
        /* wait for all cores to be ready */
        IpcNotify_syncAll(SystemP_WAIT_FOREVER);
    
        DebugP_log("[IPC NOTIFY ECHO] Remote Core waiting for messages from main core ... !!!\r\n");
    
        /* wait for all messages to be echo'ed back */
        SemaphoreP_pend(&gRemoteDoneSem, SystemP_WAIT_FOREVER);
    
        DebugP_log("[IPC NOTIFY ECHO] Remote core has echoed all messages !!!\r\n");
    }
    
    void ipc_notify_echo_main(void *args)
    {
        Drivers_open();
        Board_driversOpen();
    
        if(IpcNotify_getSelfCoreId()==gMainCoreId)
        {
            ipc_notify_echo_main_core_start();
        }
        else
        {
            ipc_notify_echo_remote_core_start();
        }
    
        Board_driversClose();
        /* We dont close drivers to let the UART driver remain open and flush any pending messages to console */
        /* Drivers_close(); */
    }
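
The message flow described in the comment block of ipc_notify_echo.c can be traced without any hardware. The following single-process simulation is an illustration only: `simulate_echo` and `EchoStats` are hypothetical names, plain counters stand in for `IpcNotify_sendMsg()` and the semaphores, and only one remote core is modelled. It is not the SDK's implementation.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t sentByMain;      /* messages the main core sent */
    uint32_t echoedByRemote;  /* messages the remote core echoed back */
    int      mainDone;        /* stands in for posting gMainDoneSem */
    int      remoteDone;      /* stands in for posting gRemoteDoneSem */
} EchoStats;

/* Simulate the exchange between the main core and one remote core for
 * echoCount iterations, following the two handlers in the example. */
EchoStats simulate_echo(uint32_t echoCount)
{
    EchoStats s = { 0, 0, 0, 0 };
    uint32_t msgValue = 0;

    if (echoCount == 0) {
        return s;  /* guard added for the sketch: nothing to exchange */
    }

    for (;;) {
        /* main core sends msgValue to the remote core */
        s.sentByMain++;

        /* remote handler: echo the same value, then check for the last one */
        s.echoedByRemote++;
        if (msgValue == echoCount - 1) {
            s.remoteDone = 1;
        }
        assert(s.sentByMain == s.echoedByRemote);

        /* main handler: increment and resend, or finish on the last echo */
        if (msgValue != echoCount - 1) {
            msgValue++;
        } else {
            s.mainDone = 1;
            break;
        }
    }
    return s;
}
```

The point of the trace is that the exchange itself only needs a send primitive, a receive callback and a completion flag, so in this example the OS dependency comes from the semaphores rather than from the message protocol.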
    

    main.c
    /*
     *  Copyright (C) 2022-23 Texas Instruments Incorporated
     *
     *  Redistribution and use in source and binary forms, with or without
     *  modification, are permitted provided that the following conditions
     *  are met:
     *
     *    Redistributions of source code must retain the above copyright
     *    notice, this list of conditions and the following disclaimer.
     *
     *    Redistributions in binary form must reproduce the above copyright
     *    notice, this list of conditions and the following disclaimer in the
     *    documentation and/or other materials provided with the
     *    distribution.
     *
     *    Neither the name of Texas Instruments Incorporated nor the names of
     *    its contributors may be used to endorse or promote products derived
     *    from this software without specific prior written permission.
     *
     *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
     *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
     *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
     *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
     *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
     *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
     *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
     *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
     *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
     *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
     *  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     */
    
    #include <stdlib.h>
    #include "ti_drivers_config.h"
    #include "ti_board_config.h"
    
    void ipc_notify_echo_main(void *args);
    
    int main(void)
    {
        System_init();
        Board_init();
    
        ipc_notify_echo_main(NULL);
    
        Board_deinit();
        System_deinit();
    
        return 0;
    }
    

    Best regards,

    Ming

  • Hi Ming,

    I did try to replicate the examples, but I found they depend on the FreeRTOS library. In particular for the example, SemaphoreP is defined in FreeRTOS lib. There is no NoRTOS lib in the J722S SDK for c7x processors, only for the r5f processors. But if I include the FreeRTOS lib to make use of these methods, the entry point is changed and suddenly DRU no longer works.

    I'd appreciate any further insight you have.

  • Hi Ming,

    Does my issue make sense, or is there something obvious I am missing?

  • Hi Tyler,

    Is it possible for you to share the project files for the code you provided a couple of replies ago (the DMAUtils test program that hangs in the final wait() call)?

    I would like to look at this from my side and try to replicate.

    Best,

    Daniel

  • Hi Daniel,

    I don't know how to properly send you a project, but I've zipped the CCS project directory here. Let me know if I should do it differently.

    You will find that as it is now, the code will hang on the final wait() call. If you remove the freertos library and change the entry point from _c_int00_secure to _c_int00, then it will work fine.

    As I mentioned before, I think it has something to do with the MMU because the only thing the _c_int00_secure entry does differently is to initialize the MMU, but your input will be appreciated!

    basic-project.zip

  • Hi Tyler, 

    Thank you for the project; this should be enough to proceed. We'll keep you updated on our findings.

    Best,

    Daniel

  • Hi Tyler,

    I understand the issue you are having and I have also noticed the lack of NoRTOS libraries for the C7x cores. I am currently in contact with the development team regarding this and to see how we can provide a potential solution.

    Thanks,

    Neehar

  • Hi Tyler,

    I am currently still in contact with the development team regarding baremetal IPC examples running on C7x.

    I was able to generate the binaries for the IPC notify echo application running on MCU R5 with baremetal and I am currently testing them. It may be helpful to look at the implementation in that example, as it provides a reference for your development.

    Thanks,

    Neehar

  • Hi Neehar,

    Thanks for the update. Assuming you are able to get IPC working on bare metal C7x and it does not break the DRU (like RTOS does), that will be sufficient for us. Looking forward to further updates.

    - Tyler

  • Hey all,

    What's the status on this?

    - Tyler

  • Hi Tyler,

    Sorry for the delay. I am still going back and forth with the development team on whether we can generate and provide you with a binary/patch with the bare metal IPC example on C7x. We currently do not support baremetal on C7x, and because of this they have been reluctant.

    Additionally, I am looking into when we will have FreeRTOS support for TISP.
    I can still provide the MCU R5 baremetal IPC example in the meantime, with the end goal of providing it on C7x.

    There may be further delays due to US holidays.

    Thanks,

    Neehar

  • Hi Neehar,

    If it's not clear, the example applications are not important to us. We simply need either 1) the bare minimum nortos libraries to allow us to compile our own app using IPC, or 2) a way of compiling a FreeRTOS application without breaking the DRU.

    The R5 is not helpful to us, the application needs to run on c7x.

    The right person looking at this could solve it in an hour, why is your team so reluctant to help us with this simple matter?

  • Hi Tyler,

    Sorry for the delays, as I have also been out of office. I have generated the baremetal (nortos) library for the C75-0 core and attached it below. Please take a look and get back to me after testing. I am still working on testing your application code with this nortos library, but I wanted to provide this update to you. Additionally, I am working to root cause the issues you encounter with FreeRTOS.

    nortos.j722s.c75ss0-0.ti-c7000.release.lib 

    Thanks,

    Neehar

  • Hi Neehar,

    Thanks for the initial library. It doesn't quite work, but hopefully we can quickly iterate on it to get moving.

    I am essentially trying to rebuild the 'ipc_rpmsg_echo_linux' example using the new nortos library, and we run into linking issues pretty quickly, with the inclusion of the `RPMessage_waitForLinuxReady(SystemP_WAIT_FOREVER);` line

    There are several methods missing from the library, listed below.

     undefined            first referenced                                                                                                                                                                
      symbol                  in file                                                                                                                                                                     
     ---------            ----------------                                                                                                                                                                
     ClockP_getTimerCount /home/tmiddlet/ti-processor-sdk-rtos-j722s-evm-10_00_00_05/mcu_plus_sdk_j722s_10_00_00_25/source/kernel/nortos/lib/nortos.j722s.c75ss0-0.ti-c7000.release.lib<ClockP_nortos.obj>
     ClockP_init          ./syscfg-generated/ti_dpl_config.obj                                                                                                                                            
     HwiP_destruct        /home/tmiddlet/ti-processor-sdk-rtos-j722s-evm-10_00_00_05/mcu_plus_sdk_j722s_10_00_00_25/source/kernel/nortos/lib/nortos.j722s.c75ss0-0.ti-c7000.release.lib<ClockP_nortos.obj>
     HwiP_disable         ./syscfg-generated/ti_dpl_config.obj                                                                                                                                            
     HwiP_restore         /home/tmiddlet/ti-processor-sdk-rtos-j722s-evm-10_00_00_05/mcu_plus_sdk_j722s_10_00_00_25/source/kernel/nortos/lib/nortos.j722s.c75ss0-0.ti-c7000.release.lib<DebugP_log.obj>   
     Hwi_vectorTableBase  /home/tmiddlet/ti-processor-sdk-rtos-j722s-evm-10_00_00_05/mcu_plus_sdk_j722s_10_00_00_25/source/kernel/nortos/lib/nortos.j722s.c75ss0-0.ti-c7000.release.lib<HwiP_c75.obj>     
     xPortInIsrContext    /home/tmiddlet/ti-processor-sdk-rtos-j722s-evm-10_00_00_05/mcu_plus_sdk_j722s_10_00_00_25/source/kernel/nortos/lib/nortos.j722s.c75ss0-0.ti-c7000.release.lib<HwiP_c75.obj>     


    Let me know if you can add these methods, or think of a workaround.

  • Hello Neehar, thank you for providing the initial library. Do you have any updates for us on FreeRTOS compatibility or an updated nortos library?

    Thanks,

    David

  • Hi David,

    I have generated an updated noRTOS library that I am currently testing. I will provide it here when I am able to successfully run an example with it. The issues Tyler sees occur because there are ClockP and HwiP APIs that are not implemented for noRTOS on C75.

    Thanks,

    Neehar

  • Hi Neehar,

    How is this going? Does this approach seem promising?

    - Tyler

  • Hello Neehar, great to hear that there is a library in testing. Do you have any updates for us here?

    Thank you,

    David

  • Hi Tyler and David,

    I am still working towards getting a working noRTOS library for C7x. I am having issues as some HwiP APIs have only been implemented for FreeRTOS and do not work in a baremetal environment. I am currently working on filling in these APIs.

    Thanks,

    Neehar

  • Hi Neehar,

    I just want to reiterate that all we need is for IPC and DRU to work in the same environment - FreeRTOS and noRTOS are both fine for our purposes. I don't know if it is easier for you to pursue this path of making IPC work with no RTOS, or if it would be simpler to fix the DRU boot issue that occurs with FreeRTOS. It seems like the path you are taking might be more overall work, but again I don't know how hard it is to fix the DRU boot issue. I hope you will choose the path of least resistance.

    Thanks,

    Tyler

  • Hello Neehar, do you have an update on baremetal implementation of these APIs? Our DSP team is blocked by this issue and this is our highest priority to resolve in order to move on with development.

    Thanks,

    David

    Hello Neehar, we are following up again on this issue. I believe it has been 3-4 months since we first surfaced this with TI, and we consider DRU + IPC to be bare minimum functionality on TDA4. We appreciate your hard work here to get these features running, and we would like an update on the latest status, as our team and our roadmap are being blocked by this.

    Thanks,

    David

  • Hi David,

    IPC is already supported on the TDA4AEN device. On which cores are you looking for IPC? 

    Also regarding DRU usage, on which core do you want to access DRU?

    DRU is currently not supported in the vision apps/utils folder, and we are looking into enabling it in the next release (or the one after).

    Regards,

    Brijesh 

  • Hi Brijesh,

    This is all explained in the question history, but I'll give a quick overview.

    DRU was not supported on c7x, which means fast transfers between DDR and SRAM were not possible. This is critical for DSP applications, so we asked TI to implement it. They responded with a new SDK version providing a prototype TISP library and limited support for DRU.

    This was fine, until I tried to integrate it with IPC, which is also critical for any useful application on c7x. IPC on c7x only works if you use FreeRTOS, but for some reason FreeRTOS breaks the limited DRU support that was previously provided.

    We need IPC and DRU to both be working on c7x. It doesn't matter if FreeRTOS is present or not, so there are two solutions to this problem. Either fix DRU on FreeRTOS, or implement the NoRTOS methods required by IPC.

    Let me know if further clarification is needed.

    - Tyler

  • Hi Tyler,

    FreeRTOS is supported on the C7x and it's validated in the SDK release. If you use vision apps, even IPC is working fine on C7x. I have tested the echo test on C7x with vision apps and it's working fine, so I could not understand exactly what issue you are facing with IPC. IPC with FreeRTOS is supported and is working fine on J722S.

    Regarding DRU, I need to check. It wasn't supported through the MCU+ driver, so it should be supported in dmautils, which is what TIDL uses internally, but yes, there could be surprises.

    Let's first look at your IPC issue and then check on DRU.

    Regards,

    Brijesh

    Hi Brijesh, we are running a custom DSP application on C7x that is not vision or TIDL. We are using TISP (C7x DSP), which is a brand new SDK component in SDK 10.0, and it is TISP's usage of DRU that has buggy compatibility with IPC. The relevant info, including code snippets, is available above in this thread, and Neehar seems to be on the case for us, which we greatly appreciate.

    Our software team is flexible and we would be happy with either of these solutions:

    Linux on A53 + TISP & FreeRTOS on C7x

    Linux on A53 + TISP & Bare Metal on C7x

    To the TI team, if you would like to have a short meeting so we can discuss/root cause the issue and share current status that would be greatly appreciated. 

    Thanks,

    David

  • Hello Brijesh and Neehar, do you have any updates for us on this issue? Our team (and others) are still blocked on this on the J722S platform. I believe the status 2 weeks ago was Neehar had a preliminary solution but it did not implement some necessary HwiP APIs for bare metal usage.

    Thanks,

    David

  • Hi David,

    I am having issues with the newly built noRTOS library and I am currently working through fixing them. I will follow up in a couple of days with an update.

    Thanks,

    Neehar

  • Thanks Neehar, we are looking forward to your updates.

    Regards

    David

  • Following up so the thread does not lock.

    Thanks,

    Neehar