Writing to DSP cache from Linux/ARM on OMAP-L138

I would like to use some or all of the DSP L2 memory directly for high-performance functions; however, I have been unable to load sections into this memory from Linux running on the ARM.

I have entries for these memory locations in my resource table (using the deprecated RSC_INTMEM), and have created memory sections (called L2_RAM) and code segments (.l2_ram). So I just need remoteproc_elf_loader.c to "memcpy" the code into the correct location.

This is what I have tried so far:

In /arch/arm/mach-davinci/da850.c I added some code to reserve and map the memory locations to virtual addresses.

#define DA850_SHARED_RAM_DSP_BASE       0x80000000
#define DA850_SHARED_RAM_DSP_SIZE       SZ_128K

#define DA850_L2_RAM_DSP_BASE           0x11800000
#define DA850_L2_RAM_DSP_SIZE           SZ_256K

// SNIP

// SHARED Ram structs
static struct resource da850_shared_ram_dsp_resources[] = {
        { /* registers */
                .start  = DA850_SHARED_RAM_DSP_BASE,
                .end    = DA850_SHARED_RAM_DSP_BASE + DA850_SHARED_RAM_DSP_SIZE - 1,
                .flags  = IORESOURCE_MEM,
        },
};

static struct platform_device da850_shared_ram_dsp_device = {
        .name           = "davinci_shared_ram_dsp",
        .id             = -1,
        .num_resources  = ARRAY_SIZE(da850_shared_ram_dsp_resources),
        .resource       = da850_shared_ram_dsp_resources,
};

// L2 Ram structs
static struct resource da850_l2_ram_dsp_resources[] = {
        { /* registers */
                .start  = DA850_L2_RAM_DSP_BASE,
                .end    = DA850_L2_RAM_DSP_BASE + DA850_L2_RAM_DSP_SIZE - 1,
                .flags  = IORESOURCE_MEM,
        },
};

static struct platform_device da850_l2_ram_dsp_device = {
        .name           = "davinci_l2_ram_dsp",
        .id             = -1,
        .num_resources  = ARRAY_SIZE(da850_l2_ram_dsp_resources),
        .resource       = da850_l2_ram_dsp_resources,
};

// SNIP
        // shared memory 0x80000000
        ret = platform_device_register(&da850_shared_ram_dsp_device);
        if(ret){
                printk("da850_register_misc_ram_dsp: error platform_device_register %s\n", da850_shared_ram_dsp_device.name);
                return ret;
        }
        da850_shared_ram_dsp_device.dev.platform_data = ioremap_nocache(DA850_SHARED_RAM_DSP_BASE, DA850_SHARED_RAM_DSP_SIZE);
        if (!da850_shared_ram_dsp_device.dev.platform_data) {
                printk("da850_register_misc_ram_dsp: error ioremap_nocache %s\n", da850_shared_ram_dsp_device.name);
                return -ENOMEM;
        }

        // l2 ram dsp
        ret = platform_device_register(&da850_l2_ram_dsp_device);
        if(ret){
                printk("da850_register_misc_ram_dsp: error platform_device_register %s\n", da850_l2_ram_dsp_device.name);
                return ret;
        }
        da850_l2_ram_dsp_device.dev.platform_data = ioremap_nocache(DA850_L2_RAM_DSP_BASE, DA850_L2_RAM_DSP_SIZE);
        if (!da850_l2_ram_dsp_device.dev.platform_data) {
                printk("da850_register_misc_ram_dsp: error ioremap_nocache %s\n", da850_l2_ram_dsp_device.name);
                return -ENOMEM;
        }

This creates memory locations in /proc/iomem.
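
As a sanity check on the two ranges, here is a minimal userspace sketch (not part of da850.c) that computes the inclusive start-end pairs in roughly the form /proc/iomem prints them. The `struct mem_range` type and the `range_end()` and `print_range()` helpers are hypothetical names used only for illustration, and SZ_128K and SZ_256K are redefined locally with the values they carry in the kernel.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for the kernel's size constants. */
#define SZ_128K 0x00020000u
#define SZ_256K 0x00040000u

#define DA850_SHARED_RAM_DSP_BASE 0x80000000u
#define DA850_SHARED_RAM_DSP_SIZE SZ_128K
#define DA850_L2_RAM_DSP_BASE     0x11800000u
#define DA850_L2_RAM_DSP_SIZE     SZ_256K

/* Hypothetical descriptor mirroring the .start/.end resource fields. */
struct mem_range {
        const char *name;
        uint32_t base;
        uint32_t size;
};

/* Inclusive end address, matching ".end = BASE + SIZE - 1". */
uint32_t range_end(const struct mem_range *r)
{
        return r->base + r->size - 1u;
}

/* Print one line in an iomem-like "start-end : name" layout. */
void print_range(const struct mem_range *r)
{
        printf("%08" PRIx32 "-%08" PRIx32 " : %s\n",
               r->base, range_end(r), r->name);
}
```

With the sizes above, the L2 range works out to 0x11800000 through 0x1183ffff, which agrees with the 0x00040000 length of L2_RAM in the linker map.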

Next, the function "rproc_elf_load_sections()" in remoteproc_elf_loader.c can correctly memcpy() code to the "shared_ram_dsp" section. The DSP can then access and execute this code.

However, if I attempt to load code into the "l2_ram_dsp" region, it fails. I do a memcmp() immediately following the memcpy(), and the memory area reads back as all zeros. For all the memcpy() and memcmp() calls I am using the virtual address as stored in the carveout list.

I have tried using both  "ioremap_nocache()" and "ioremap()" in the allocation/mapping function.  I have also tried "memcpy_toio()" as well as "memcpy()" in the "rproc_elf_load_sections()" function.  No combination worked.
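
To make the check concrete, this is a rough userspace sketch of the copy-then-read-back test described above, not the actual remoteproc code. The helper names `copy_and_verify()` and `is_all_zero()` are hypothetical; in the kernel the byte loops would presumably be replaced by memcpy_toio() and memcpy_fromio() on the ioremapped address. The destination is treated as volatile so the read-back is not optimized away.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy len bytes to dst, then read them back.
 * Returns the number of bytes that did not read back as written
 * (0 means the copy landed intact).
 */
size_t copy_and_verify(volatile uint8_t *dst, const uint8_t *src, size_t len)
{
        size_t i, mismatches = 0;

        for (i = 0; i < len; i++)
                dst[i] = src[i];

        for (i = 0; i < len; i++)
                if (dst[i] != src[i])
                        mismatches++;

        return mismatches;
}

/*
 * Returns 1 if every byte in the region reads as zero. An all-zero
 * read-back after a failed copy suggests the writes never reached the
 * memory at all, rather than being partially corrupted.
 */
int is_all_zero(const volatile uint8_t *buf, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++)
                if (buf[i] != 0)
                        return 0;
        return 1;
}
```

Against ordinary RAM both checks pass trivially; the symptom in this post corresponds to `copy_and_verify()` reporting mismatches while `is_all_zero()` still returns 1 for the destination.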

How can I copy code sections (or anything, for that matter) from the ARM into L1P RAM or L2 RAM?

-Douglas

NB: I found this thread:

Can OMAP-L138 boot loader access DSP Cache? - Processors forum - TI E2E support forums

but it does not answer the question.

  • Dear Douglas,
    I think all (or part) of the L2 memory might have been configured as cache.
    If so, you may need to modify the L2CFG register to access the L2 memory as RAM.

    www.ti.com.cn/.../sprufk5a.pdf
  • I am writing to the DSP memory directly from the ARM while the DSP is in reset (prior to any DSP code executing), and the default for L2CFG register bits 7:0 is 0, which should set the L2 cache size to 0.

    Also, according to this:

    http://www.ti.com/lit/ds/symlink/omap-l138.pdf

    The ARM is not allowed to access addresses at 0x0184 0000, which is where L2CFG resides. So even if that value were wrong, I could not change it from the ARM.

    This may not be relevant since, as stated earlier, I am trying to write before the DSP runs, but I do have all the cache areas configured to 0k in Platform.xdc.

    l2Mode: "0k",
    l1PMode: "0k",
    l1DMode: "0k",

    I also tried setting them to 0k in a custom config.bld file and in my custom dsp.cfg file as well. It appeared to work, as performance drops dramatically when not using cache, and the map file shows memory blocks like this:

    MEMORY CONFIGURATION
    
             name            origin    length      used     unused   attr    fill
    ----------------------  --------  ---------  --------  --------  ----  --------
      IROM                  11700000   00100000  00000000  00100000  R  X
      L2_RAM                11800000   00040000  00000000  00040000  RW X
      L1PSRAM               11e00000   00008000  00000000  00008000  RW X
      L1DSRAM               11f00000   00008000  0000008c  00007f74  RW  
      L3_CBA_RAM            80000000   00040000  00000000  00040000  RW X
      DDR_ARM               c0000000   03000000  00000000  03000000  RW X
      DDR_LIBRARY           c3100000   00900000  00000000  00900000  RW X
      DDR_DSP               c4000000   00800000  0019dbd4  0066242c  RW X

    So, any other ideas as to why I can't write to the L2 memory location?

    -Doug

  • OK, additionally I printed out what L2CFG was at DSP startup (printed from code running on the DSP). It is 0x1010000.
    This shows L2MODE is 0, i.e. the L2 cache is disabled.
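
For reference, the decoding of that value can be sketched as below. This assumes L2MODE sits in the low-order bits of L2CFG; the three-bit mask and the helper names `l2cfg_l2mode()` and `l2cfg_cache_disabled()` are assumptions for illustration and should be checked against the megamodule reference guide linked earlier in the thread.

```c
#include <stdint.h>

/* Assumed field layout: L2MODE in the lowest three bits of L2CFG. */
#define L2CFG_L2MODE_MASK 0x7u

/* Extract the (assumed) L2MODE field from a raw L2CFG value. */
unsigned l2cfg_l2mode(uint32_t l2cfg)
{
        return (unsigned)(l2cfg & L2CFG_L2MODE_MASK);
}

/* L2MODE == 0 is taken to mean no L2 is configured as cache. */
int l2cfg_cache_disabled(uint32_t l2cfg)
{
        return l2cfg_l2mode(l2cfg) == 0;
}
```

For the reported value 0x1010000, only upper bits are set, so the low-order mode field decodes to 0 under this assumption.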