Linux/PROCESSOR-SDK-OMAPL138: Loading DSP into the L2RAM of the OMAP-L138 when using remoteproc.0 from Linux

Part Number: PROCESSOR-SDK-OMAPL138
Other Parts Discussed in Thread: OMAP-L138, DA8XX

Tool/software: Linux

Hi,

I'm trying to load DSP firmware from Linux with the DSP's sections placed in L2RAM/IRAM.
If everything is placed in DDR it works, but if any section is placed in L2RAM, remoteproc.0 won't load the firmware.

Here is my config.bld:

/*
 *  ======== config.bld ========
 *
 */
var Build = xdc.useModule('xdc.bld.BuildEnvironment');
var Pkg = xdc.useModule('xdc.bld.PackageContents');

/* when constructing a release, release everything */
Pkg.attrs.exportAll = true;

/* Uncomment this to build the app with debug support */
Pkg.attrs.profile = "debug";

/* bin/ is a generated directory that 'xdc clean' should remove */
Pkg.generatedFiles.$add("bin/");

Build.platformTable["ti.platforms.evmOMAPL138:dsp"] = {
    externalMemoryMap: [
        [ "DDR", {
            name: "DDR", space: "code/data", access: "RWX",
            base: 0xC3100000, len: 0x800000,
            comment: "DSP Program Memory (8 MB)"
        }]
    ],
    codeMemory:  "DDR",
    dataMemory:  "DDR",
    stackMemory: "DDR",
    l1DMode: "32k",
    l1PMode: "32k",
    l2Mode: "0k"
};


/*
 *  ======== ti.targets.elf.C674 ========
 */
var C674 = xdc.useModule('ti.targets.elf.C674');
C674.ccOpts.suffix += " -mi10 -mo ";

Setting any of the memory regions to IRAM doesn't work when trying to load the firmware.

The address of the IRAM is 0x18000000 and its size is 0x40000.

I have tried the following:

1. Adding a DEVMEM entry to the resource table:

    {
        TYPE_DEVMEM, IRAM_DA, IRAM_DA, IRAM_SIZE, 0, 0, "DSP_IRAM",
    },

2. Adding a memory pool to the DTS and using a carveout entry in the resource table instead:

	reserved-memory {
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		dsp_memory_region: dsp-memory@c3000000 {
			compatible = "shared-dma-pool";
			reg = <0xc3000000 0x1000000>;
			reusable;
			status = "okay";
		};

		dsp_memory_region2: dsp-iram@18000000 {
			compatible = "shared-dma-pool";
			reg = <0x18000000 0x00040000>;
			reusable;
			status = "okay";
		};
	};
    {
         TYPE_CARVEOUT, IRAM_DA, IRAM_DA, IRAM_SIZE, 0, 0, "DSP_IRAM",
    },

None of this worked.

I need my DSP memory to reside in IRAM/L2RAM.


Here is the full resource table:

/*
 *  ======== rsc_table_omapl138.h ========
 *
 *  Include this table in each base image, which is read from remoteproc on
 *  host side.
 *
 */

#ifndef _RSC_TABLE_OMAPL138_H_
#define _RSC_TABLE_OMAPL138_H_

#include "rsc_types.h"

#define IRAM_DA                 0x18000000
#define IRAM_SIZE               SZ_256K

#define DATA_DA                 0xc3100000

#ifndef DATA_SIZE
#  define DATA_SIZE  (SZ_8M)
#endif

#define RPMSG_VRING0_DA         0xc3000000
#define RPMSG_VRING1_DA         0xc3004000

#define CONSOLE_VRING0_DA       0xc3008000
#define CONSOLE_VRING1_DA       0xc300C000

#define BUFS0_DA                0xc3040000
#define BUFS1_DA                0xc3080000

/*
 * sizes of the virtqueues (expressed in number of buffers supported,
 * and must be power of 2)
 */
#define RPMSG_VQ0_SIZE          256
#define RPMSG_VQ1_SIZE          256

#define CONSOLE_VQ0_SIZE        256
#define CONSOLE_VQ1_SIZE        256

/* flip up bits whose indices represent features we support */
#define RPMSG_DSP_FEATURES      1

struct my_resource_table {
    struct resource_table base;

    UInt32 offset[4];

    /* rpmsg vdev entry */
    struct fw_rsc_vdev rpmsg_vdev;
    struct fw_rsc_vdev_vring rpmsg_vring0;
    struct fw_rsc_vdev_vring rpmsg_vring1;

    /* data carveout entry */
    struct fw_rsc_carveout data_cout;

    /* IRAM devmem entry */
    struct fw_rsc_devmem iram;

    /* trace entry */
    struct fw_rsc_trace trace;
};
extern char xdc_runtime_SysMin_Module_State_0_outbuf__A;
#define TRACEBUFADDR (UInt32)&xdc_runtime_SysMin_Module_State_0_outbuf__A
#define TRACEBUFSIZE 0x8000

#pragma DATA_SECTION(ti_ipc_remoteproc_ResourceTable, ".resource_table")
#pragma DATA_ALIGN(ti_ipc_remoteproc_ResourceTable, 4096)

struct my_resource_table ti_ipc_remoteproc_ResourceTable = {
    1, /* we're the first version that implements this */
    4, /* number of entries in the table */
    0, 0, /* reserved, must be zero */
    /* offsets to entries */
    {
        offsetof(struct my_resource_table, rpmsg_vdev),
        offsetof(struct my_resource_table, data_cout),
        offsetof(struct my_resource_table, iram),
        offsetof(struct my_resource_table, trace),
    },

    /* rpmsg vdev entry */
    {
        TYPE_VDEV, VIRTIO_ID_RPMSG, 0,
        RPMSG_DSP_FEATURES, 0, 0, 0, 2, { 0, 0 },
        /* no config data */
    },
    /* the two vrings */
    { RPMSG_VRING0_DA, 4096, RPMSG_VQ0_SIZE, 1, 0 },
    { RPMSG_VRING1_DA, 4096, RPMSG_VQ1_SIZE, 2, 0 },

    {
        TYPE_CARVEOUT, DATA_DA, DATA_DA, DATA_SIZE, 0, 0, "DSP_MEM_DATA",
    },

    {
        TYPE_DEVMEM, IRAM_DA, IRAM_DA, IRAM_SIZE, 0, 0, "DSP_IRAM",
    },

    {
        TYPE_TRACE, TRACEBUFADDR, TRACEBUFSIZE, 0, "trace:dsp",
    },
};

#endif /* _RSC_TABLE_OMAPL138_H_ */
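
Before the kernel looks at individual entries, it checks the table header. The sketch below is a userspace approximation of those checks, assuming the standard remoteproc layout (version, entry count, two reserved words, then the offset array). The name `check_rsc_table` and the word-indexed view of the table are illustrative only, not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

#define RSC_HDR_WORDS 4u /* ver, num, reserved[0], reserved[1] */

/*
 * Hypothetical checker: returns 0 if the header and every offset look
 * sane for a table of table_size bytes, or -1 on the first problem.
 * Each entry must leave room for at least its 32-bit type field.
 */
int check_rsc_table(const uint32_t *words, size_t table_size)
{
    size_t max_entries;
    uint32_t num, i;

    if (table_size < RSC_HDR_WORDS * sizeof(uint32_t))
        return -1;
    /* version must be 1 and both reserved words must be zero */
    if (words[0] != 1 || words[2] != 0 || words[3] != 0)
        return -1;

    /* the offset array itself has to fit inside the table */
    num = words[1];
    max_entries = table_size / sizeof(uint32_t) - RSC_HDR_WORDS;
    if (num > max_entries)
        return -1;

    for (i = 0; i < num; i++) {
        uint32_t off = words[RSC_HDR_WORDS + i];

        if (off > table_size - sizeof(uint32_t))
            return -1;
    }
    return 0;
}
```

In the table above, the entry count is 4 and `offset[4]` is filled with four `offsetof()` values, so the header looks self-consistent. That suggests a failure during resource processing is more likely to come from one of the individual entries than from the header.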


Declaring the IRAM as either a CARVEOUT or a DEVMEM did not fix my issue.

This is the full log that Linux gives through dmesg:

[ 3128.896586] davinci-rproc davinci-rproc.0: pm_clk_notify() 6
[ 3128.897324] PM: Removing info for No Bus:remoteproc0
[ 3128.920801] firmware_class: fw_name_devm_release: fw_name-rproc-dsp-fw devm-c0ca0fb8 released
[ 3128.920868] remoteproc remoteproc0: releasing dsp
[ 3128.934502] platform davinci-rproc.0: pm_clk_notify() 7
[ 3128.936532] bus: 'platform': driver_probe_device: matched device davinci-rproc.0 with driver davinci-rproc
[ 3128.936587] bus: 'platform': really_probe: probing driver davinci-rproc with device davinci-rproc.0
[ 3128.936681] davinci-rproc davinci-rproc.0: no pinctrl handle
[ 3128.936756] davinci-rproc davinci-rproc.0: pm_clk_notify() 4
[ 3128.936922] devices_kset: Moving davinci-rproc.0 to end of list
[ 3128.937193] davinci-rproc davinci-rproc.0: assigned reserved memory node dsp-memory@c3000000
[ 3128.997060] device: 'remoteproc0': device_add
[ 3128.997340] PM: Adding info for No Bus:remoteproc0
[ 3129.005493] remoteproc remoteproc0: dsp is available
[ 3129.009591] firmware_class: __allocate_fw_buf: fw-rproc-dsp-fw buf=c26ac5c0
[ 3129.009728] remoteproc remoteproc0: loading /lib/firmware/updates/4.14.40-g4796173fc5/rproc-dsp-fw failed with error -2
[ 3129.009808] remoteproc remoteproc0: loading /lib/firmware/updates/rproc-dsp-fw failed with error -2
[ 3129.009877] remoteproc remoteproc0: loading /lib/firmware/4.14.40-g4796173fc5/rproc-dsp-fw failed with error -2
[ 3129.053666] driver: 'davinci-rproc': driver_bound: bound to device 'davinci-rproc.0'
[ 3129.053763] davinci-rproc davinci-rproc.0: pm_clk_notify() 5
[ 3129.054058] bus: 'platform': really_probe: bound device davinci-rproc.0 to driver davinci-rproc
[ 3129.134009] remoteproc remoteproc0: direct-loading rproc-dsp-fw
[ 3129.134100] firmware_class: fw_set_page_data: fw-rproc-dsp-fw buf=c26ac5c0 data=cbec1000 size=2845088
[ 3129.134138] remoteproc remoteproc0: powering up dsp
[ 3129.137771] firmware_class: batched request - sharing the same struct firmware_buf and lookup for multiple requests
[ 3129.137805] firmware_class: fw_set_page_data: fw-rproc-dsp-fw buf=c26ac5c0 data=cbec1000 size=2845088
[ 3129.137841] remoteproc remoteproc0: Booting fw image rproc-dsp-fw, size 2845088
[ 3129.232656] remoteproc remoteproc0: Failed to process resources: -22
[ 3129.262848] firmware_class: __fw_free_buf: fw-rproc-dsp-fw buf=c26ac5c0 data=cbec1000 size=2845088
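
For readers decoding the log above: the numbers after "failed with error" and "Failed to process resources" are negated Linux errno values. A minimal sketch that maps the two codes seen here (the helper name `kernel_err_name` is made up for illustration):

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: map a negative kernel return code from the log
 * to its symbolic errno name. Only the two codes seen in this thread
 * are handled; anything else returns NULL.
 */
const char *kernel_err_name(int ret)
{
    switch (-ret) {
    case ENOENT:
        return "ENOENT"; /* 2: no such file or directory */
    case EINVAL:
        return "EINVAL"; /* 22: invalid argument */
    default:
        return NULL;
    }
}
```

The three -2 lines appear to be the firmware loader walking its search path, since "direct-loading rproc-dsp-fw" follows and the image is then booted. The failure that matters is the -22 (EINVAL) from resource processing.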



  • Hi, Dmitriy,

    Sorry, I just came back to the office and need some time to digest your info. I may need to discuss it with a DSP expert. Usually this kind of failure is caused by an inconsistent configuration. I'll get back to you as soon as I have more information.

    Rex
  • Hi, Dmitriy,

    I read through your post. You seem to have either configured the carveout or modified the resource table. Have you tried doing both?
    The carveout reserves the memory for use, while the resource table notifies the MMU so that the memory areas can be accessed.

    The section "Changing the DSP memory Map" in the IPC for AM57x chapter of the IPC User Guide, software-dl.ti.com/.../Foundational_Components_IPC.html, may have the info you need.

    I included the DSP expert in case he has something to add.

    Rex
  • Hi Rex,

    I want to assure you that I have looked through "Changing the DSP memory Map" very thoroughly and searched multiple threads on this topic.
    I will try with both a carveout and the memory map, since I think I carved out the wrong memory region before. But please add any further information that could help. It's imperative for my project that this works, since I REQUIRE all of the DSP's memory to reside in the internal DSP memories (L1/L2) and not in DDR.

    EDIT:

    I edited the CARVEOUT in the DTS to have the proper memory region:

    	reserved-memory {
    		#address-cells = <1>;
    		#size-cells = <1>;
    		ranges;
    
    		dsp_memory_region: dsp-memory@c3000000 {
    			compatible = "shared-dma-pool";
    			reg = <0xc3000000 0x1000000>;
    			reusable;
    			status = "okay";
    		};
    
    		dsp_memory_region2: dsp-memory@11800000 {
    			compatible = "shared-dma-pool";
    			reg = <0x11800000 0x40000>;
    			reusable;
    			status = "okay";
    		};
    	};

    Set my build to use all IRAM (L2RAM):

    Build.platformTable["ti.platforms.evmOMAPL138:dsp"] = {
        externalMemoryMap: [
            [ "DDR", {
                name: "DDR", space: "code/data", access: "RWX",
                base: 0xC3100000, len: 0x800000,
                comment: "DSP Program Memory (8 MB)"
            }]
        ],
        codeMemory:  "IRAM",
        dataMemory:  "IRAM",
        stackMemory: "IRAM",
        l1DMode: "32k",
        l1PMode: "32k",
        l2Mode: "0k"
    };

    And have modified the resource table:

    /*
     *  ======== rsc_table_omapl138.h ========
     *
     *  Include this table in each base image, which is read from remoteproc on
     *  host side.
     *
     */
    
    #ifndef _RSC_TABLE_OMAPL138_H_
    #define _RSC_TABLE_OMAPL138_H_
    
    #include "rsc_types.h"
    
    #define IRAM_DA                 0x11800000
    #define IRAM_SIZE               SZ_256K
    
    #define DATA_DA                 0xc3100000
    
    #ifndef DATA_SIZE
    #  define DATA_SIZE  (SZ_8M)
    #endif
    
    #define RPMSG_VRING0_DA         0xc3000000
    #define RPMSG_VRING1_DA         0xc3004000
    
    #define CONSOLE_VRING0_DA       0xc3008000
    #define CONSOLE_VRING1_DA       0xc300C000
    
    #define BUFS0_DA                0xc3040000
    #define BUFS1_DA                0xc3080000
    
    /*
     * sizes of the virtqueues (expressed in number of buffers supported,
     * and must be power of 2)
     */
    #define RPMSG_VQ0_SIZE          256
    #define RPMSG_VQ1_SIZE          256
    
    #define CONSOLE_VQ0_SIZE        256
    #define CONSOLE_VQ1_SIZE        256
    
    /* flip up bits whose indices represent features we support */
    #define RPMSG_DSP_FEATURES      1
    
    struct my_resource_table {
        struct resource_table base;
    
        UInt32 offset[4];
    
        /* rpmsg vdev entry */
        struct fw_rsc_vdev rpmsg_vdev;
        struct fw_rsc_vdev_vring rpmsg_vring0;
        struct fw_rsc_vdev_vring rpmsg_vring1;
    
        /* data carveout entry */
        struct fw_rsc_carveout data_cout;
    
        /* IRAM carveout entry */
        struct fw_rsc_carveout iram;
    
        /* trace entry */
        struct fw_rsc_trace trace;
    };
    extern char xdc_runtime_SysMin_Module_State_0_outbuf__A;
    #define TRACEBUFADDR (UInt32)&xdc_runtime_SysMin_Module_State_0_outbuf__A
    #define TRACEBUFSIZE 0x8000
    
    #pragma DATA_SECTION(ti_ipc_remoteproc_ResourceTable, ".resource_table")
    #pragma DATA_ALIGN(ti_ipc_remoteproc_ResourceTable, 4096)
    
    struct my_resource_table ti_ipc_remoteproc_ResourceTable = {
        1, /* we're the first version that implements this */
        4, /* number of entries in the table */
        0, 0, /* reserved, must be zero */
        /* offsets to entries */
        {
            offsetof(struct my_resource_table, rpmsg_vdev),
            offsetof(struct my_resource_table, data_cout),
            offsetof(struct my_resource_table, iram),
            offsetof(struct my_resource_table, trace),
        },
    
        /* rpmsg vdev entry */
        {
            TYPE_VDEV, VIRTIO_ID_RPMSG, 0,
            RPMSG_DSP_FEATURES, 0, 0, 0, 2, { 0, 0 },
            /* no config data */
        },
        /* the two vrings */
        { RPMSG_VRING0_DA, 4096, RPMSG_VQ0_SIZE, 1, 0 },
        { RPMSG_VRING1_DA, 4096, RPMSG_VQ1_SIZE, 2, 0 },
    
        {
            TYPE_CARVEOUT, DATA_DA, DATA_DA, DATA_SIZE, 0, 0, "DSP_MEM_DATA",
        },
    
        {
            TYPE_CARVEOUT, IRAM_DA, IRAM_DA, IRAM_SIZE, 0, 0, "DSP_MEM_IRAM",
        },
    
        {
            TYPE_TRACE, TRACEBUFADDR, TRACEBUFSIZE, 0, "trace:dsp",
        },
    };
    
    #endif /* _RSC_TABLE_OMAPL138_H_ */
    

    My configuration for the multiproc for SYS/BIOS:

        var MultiProc = xdc.useModule('ti.sdo.utils.MultiProc');
        MultiProc.setConfig("DSP", ["HOST", "DSP"]);
        
        // Enable Memory Translation module that operates on the Resource Table
        var Resource = xdc.useModule('ti.ipc.remoteproc.Resource');
        Resource.customTable = true;
        Resource.loadSegment = "DDR";
        //Resource.loadSegment = Program.platform.dataMemory;
        
        /*  Use SysMin because trace buffer address is required for Linux/QNX
         *  trace debug driver, plus provides better performance.
         */
    	System.SupportProxy = SysMin;
    	SysMin.bufSize  = 0x8000;
    	
    	Program.sectMap[".tracebuf"] = "DDR";
    	Program.sectMap[".errorbuf"] = "DDR";

     

    OUTCOME:

    Linux now accepts the resource table and loads the firmware, BUT it doesn't work.

    Even my basic start-up GPIO code, which works when loaded from DDR, no longer works.

    So I'm assuming the firmware is no longer actually being loaded into the DSP at all.

    I also tried Resource.loadSegment = "IRAM"; but the outcome is still the same.

    I have also tried using only one carveout in the resource table, just for the IRAM, and the outcome is the same.

    Please, any help would be greatly appreciated.

  • Hi, Dmitriy,

    I got info back from the developer that loading DSP code into IRAM/L2RAM is not supported on OMAP-L138 at the moment. It is only supported on KS2 and AM57x DSPs. Sorry for the confusion.

    Rex
  • So there is no way at all to load the DSP code into the IRAM/L2RAM from Linux? I don't really care about tracing or IPC at this point.
    I just need the code to be loaded into the proper memory regions.

    Could this be done through U-Boot instead of Linux, and then restrict Linux from using a certain range of the DDR so it doesn't overlap?
  • Hi, Dmitriy,

    We haven't tried it that way. The kernel drivers would need to be rewritten.

    Rex
  • I found a link with someone else talking about it being possible:
    e2e.ti.com/.../1929519

    I'm not sure why the Linux drivers would need to be rewritten, since I don't really need Linux to do anything.
    Linux exists simply for the LAN and a filesystem. The only cross-communication is done through the SRAM,
    but the ARM and DSP exist independently in my system for the most part.

    I was just wondering whether anyone else on the TI side has tried or encountered
    the approach in the link provided.
  • Hi, Dmitriy,

    I am not sure if the link you mentioned is still valid, but if you want, you can give it a try by following Rahul's first suggestion: rebuild U-Boot with the DSP image included so that the code is loaded onto the DSP. Rahul's second suggestion, the integrated ELF loader, refers to SysLink. SysLink is no longer supported and has been migrated to IPC, so this method may not work.

    As to Titus's suggestion, I am not sure whether that command is supported in the Processor SDK and available in TI U-Boot.

    As Rahul mentioned, loading from U-Boot is not a conventional approach, and TI didn't verify it in the SDK releases. You can try it and see if it works.

    Rex
  • Hi Rex,

    I will try embedding the firmware into U-Boot, as this seems the most feasible approach.
    I think the custom command Titus was referring to is only available on the specific mitty board, which is made by someone else.
    The OP of that thread said it has been done, so I'm going to attempt it.

    Thank you for verifying that Linux remoteproc does not support loading into internal memory. It has saved me from
    spending more time trying to use it.
  • Hi, Dmitriy,

    I had a discussion internally. It appears that OMAP-L138 needs a patch that adds da_to_va, like the one in
    git.ti.com/.../

    or the equivalent code from keystone_remoteproc.c.

    If you get blocked on the integrated ELF loader path, you may want to try this route and modify da8xx_remoteproc.c.

    Rex
  • Hello Rex,

    Is there a guide on how to apply a patch to the kernel, or on how to do what you propose? I would love to
    try this approach and see how well it works. I'm just not familiar enough with the kernel.
  • Hi, Dmitriy,

    The patch at that URL is for OMAP devices, so you can't simply apply it. I took a look at the OMAP patch and the Keystone code, and they are similar, but rproc->priv points to different rproc structures. You want to do a similar address translation, from device address to kernel virtual address, so that the internal RAM can be accessed.

    Rex
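
To make the suggestion above concrete: a `da_to_va` hook takes an address as the DSP sees it and returns a pointer the ARM side can write through, or NULL when the range is not inside any region the driver knows about. The following is a userspace model only. `mem_region` and `model_da_to_va` are hypothetical names, and in a real driver `cpu_addr` would presumably come from mapping the internal RAM during probe.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: one entry per internal RAM. */
struct mem_region {
    uint64_t dev_addr;  /* address as seen by the DSP */
    uint64_t size;      /* region length in bytes */
    uint8_t *cpu_addr;  /* host-side mapping of the same memory */
};

/*
 * Translate [da, da + len) to a host pointer, or return NULL if the
 * range is empty or not fully contained in one region.
 */
void *model_da_to_va(const struct mem_region *mem, size_t num_mems,
                     uint64_t da, uint64_t len)
{
    size_t i;

    if (len == 0)
        return NULL;

    for (i = 0; i < num_mems; i++) {
        uint64_t off;

        if (da < mem[i].dev_addr)
            continue;
        off = da - mem[i].dev_addr;
        /* Compare lengths rather than end addresses to avoid overflow. */
        if (off >= mem[i].size || len > mem[i].size - off)
            continue;
        return mem[i].cpu_addr + off;
    }
    return NULL;
}
```

With a single 0x40000-byte region registered at device address 0x11800000, a request at 0x18000000 falls outside the window and yields NULL, as does any request that runs past the end of the region.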
  • Rex, I have tried adding da_to_va to the remoteproc driver:

    /*
     * Internal Memory translation helper
     *
     * Custom function implementing the rproc .da_to_va ops to provide address
     * translation (device address to kernel virtual address) for internal RAMs
     * present in a DSP or IPU device. The translated addresses can be used
     * either by the remoteproc core for loading, or by any rpmsg bus drivers.
     */
    static void *da8xx_rproc_da_to_va(struct rproc *rproc, u64 da, int len,
    				 u32 flags)
    {
    	struct da8xx_rproc *drproc = rproc->priv;
    	void *va = NULL;
    	int i;
    	u32 offset;
    
    	if (len <= 0)
    		return NULL;
    
    	if (!drproc->num_mems)
    		return NULL;
    
    	for (i = 0; i < drproc->num_mems; i++) {
    		if (da >= drproc->mem[i].dev_addr && da + len <=
    		    drproc->mem[i].dev_addr +  drproc->mem[i].size) {
    			offset = da -  drproc->mem[i].dev_addr;
    		
    			/* __force to make sparse happy with type conversion */
    			va = (__force void *)(drproc->mem[i].cpu_addr + offset);
    			break;
    		}
    	}
    
    	return va;
    }
    
    static const struct rproc_ops da8xx_rproc_ops = {
    	.start = da8xx_rproc_start,
    	.stop = da8xx_rproc_stop,
    	.kick = da8xx_rproc_kick,
    	.da_to_va = da8xx_rproc_da_to_va,
    };
    

    I modified the struct and added the function, but it made no difference.

    With or without a carveout present, it still does not load and gives the following output:

    davinci-rproc davinci-rproc.0: assigned reserved memory node dsp-memory@c3000000
    remoteproc remoteproc0: dsp is available
    root@omapl138-lcdk:~/DSP_Apps# remoteproc remoteproc0: powering up dsp
    remoteproc remoteproc0: Booting fw image rproc-dsp-fw, size 2919648
    remoteproc remoteproc0: erroneous trace resource entry
    remoteproc remoteproc0: Failed to process resources: -22
    
    

    I'm at a bit of a loss. I guess there are loader solutions, but I'm not aware of any of them.
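
A note on the "erroneous trace resource entry" line: in the mainline remoteproc core of that kernel generation, as far as I can tell, this message is emitted when the trace buffer's device address cannot be translated. The core first asks the driver's `da_to_va` hook and, if that returns NULL, falls back to the carveouts it has already processed from the resource table. The model below is a simplified userspace sketch of that two-step lookup; all type and function names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a carveout the core has already set up. */
struct carveout {
    uint64_t da;   /* device address from the resource table */
    uint64_t len;  /* carveout length in bytes */
    uint8_t *va;   /* host-side buffer backing the carveout */
};

/* Driver hook type: returns NULL when the driver cannot translate. */
typedef void *(*da_to_va_fn)(uint64_t da, uint64_t len);

void *model_core_lookup(da_to_va_fn hook, const struct carveout *co,
                        size_t num_co, uint64_t da, uint64_t len)
{
    size_t i;

    /* Step 1: give the driver hook the first chance. */
    if (hook) {
        void *p = hook(da, len);

        if (p)
            return p;
    }
    /* Step 2: fall back to the list of known carveouts. */
    for (i = 0; i < num_co; i++) {
        if (da < co[i].da)
            continue; /* address below this carveout */
        if (len > co[i].len || da - co[i].da > co[i].len - len)
            continue; /* range runs past the end */
        return co[i].va + (da - co[i].da);
    }
    return NULL; /* a trace entry in this state would be rejected */
}
```

If the trace buffer (the SysMin output buffer in this setup) is linked at an address that neither the hook nor any carveout covers, the lookup returns NULL. Since the posted hook returns NULL whenever `num_mems` is zero, it may be worth checking that `mem[]` and `num_mems` are actually populated during probe, and where the linker placed the trace buffer.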

  • Hi, Dmitriy,

    I'm not sure what else would be involved. Have you tried Rahul's first suggestion, rebuilding U-Boot with the DSP image included so that the code is loaded onto the DSP? By the way, is your L2 SRAM address correct? You have 0x18000000, but according to the datasheet it starts at 0x11800000. I am not sure whether this is a typo or you meant to locate it at 0x18000000.

    Rex

  • There are a lot of other things involved in rebuilding U-Boot, as well as in altering the kernel so it does not reset the DSP on boot-up. So for now that point is moot, and I'm looking for outside help to resolve this. The address was a typo; it is 0x11800000, and correcting it made no difference to this issue. I'm rather surprised TI has no support or services for dealing with this kind of issue.
  • Hi, Dmitriy,

    It isn't a requirement. I'll have to submit a feature request for it, and will do it now.

    Rex
  • Hi, Dmitriy,

    Since this is now being handled and evaluated internally, I'll close this thread for now.

    Rex