CCS/66AK2L06: mpmcl load fails

Part Number: 66AK2L06
Other Parts Discussed in Thread: SYSBIOS

Tool/software: Code Composer Studio

Hello,

I have built a C66 RTSC application binary "dsp.out" for our custom K2L board. The size of the file is about 2MB and I can load it from the Arago Linux command prompt by typing

root@k2l-evm:~/tests# mpmcl reset dsp0

reset succeeded

root@k2l-evm:~/tests# mpmcl load dsp0 dsp.out

load succeeded

But when I try with a bigger, say 15MB file, it fails and the mpm-daemon crashes:

root@k2l-evm:~/tests# mpmcl reset dsp0

reset succeeded

root@k2l-evm:~/tests# mpmcl load dsp0 dsp2.out

Timeout in reading from socket
load failed (error: 0)
root@k2l-evm:~/tests# mpmcl load dsp0 testAntiTree_dsp.out

can't send data to /var/run/mpm/mpm_daemon (error: Connection refused)
load failed (error: 0)
root@k2l-evm:~/tests#

In the u-boot I have set the environment variable "mem_reserve" to "512M".

The boot log displays only 24MB:

[ 0.000000] Switching physical address space to 0x800000000
[ 0.000000] Reserved memory: created CMA memory pool at 0x000000081f800000, size 8 MiB
[ 0.000000] Reserved memory: initialized node dsp_common_cma_pool, compatible id shared-dma-pool
[ 0.000000] Reserved memory: created DMA memory pool at 0x0000000820000000, size 32 MiB
[ 0.000000] Reserved memory: initialized node dsp_reserved_mpm_area, compatible id shared-dma-pool
[ 0.000000] cma: Reserved 24 MiB at 0x000000085e800000

Do I understand correctly that this should also show 512 MiB?

The generated map file shows a memory configuration like this:

MEMORY CONFIGURATION

         name            origin    length      used     unused   attr    fill
----------------------  --------  ---------  --------  --------  ----  --------
  L2SRAM                00800000   00100000  00000258  000ffda8  RW X
  MSMCSRAM              0c000000   00600000  00000000  00600000  RW X
  DDR3                  80000000   80000000  0b5c12c4  74a3ed3c  RWIX

Is this OK?

When I load binaries to several DSP cores, do I need to modify the memory configuration for each of them or is the mpm-daemon clever enough to relocate the code and data into a free area in the memory in all 3 memory sections defined in the map file like above?

The size of the MSMCSRAM above is wrong. K2L has only 1MB of that memory, but I don't put any code into it anyway so it does not matter, right?

In addition, do I need to set some special linker flag, like "Produce a relocatable output module" or should I leave the Linker Output selections to their defaults?

Best regards,

Ari

  • Hi Ari,

    I've forwarded this to the software experts. Their feedback should be posted here.

    BR
    Tsvetolin Shulev
  • Hi Ari,

    The "mem_reserve" U-boot variable only tells Linux how much memory to use for itself. It does not affect the DSP memory. But you would use this to reduce the size of the ARM memory so that the DSP memory can follow.

    The DSP memory is allocated based on the DTS nodes in Linux. You can see our examples here:

    git.ti.com/.../k2l.dtsi

    Take a look primarily at the one called 'dspmem'. There is also a configuration file for the DSPs located at /etc/mpm/mpm_config.json.

     One thing that I noticed is that your image is located at memory address 0x8000_0000, the ARM will be using this for Linux. If you use our examples for MPM DSP memory you will want to locate that image at 0xa000_0000. Otherwise, there is a conflict.

    Regards,

    Mike

  • Hello Mike!

    Thanks for the reply. The device tree setting is now clear to me. I will study that.

    But this also means that I need to build a different .out file for each of the DSP CPUs in order to load them to certain absolute addresses, right? Or is there a way to auto-locate them?

    I need to run the same application on 3 CPUs. I intend to load and run them using mpmcl, either from the Linux command prompt, from a shell script, or from an ARM application running on Linux. Do I really need 3 .out files (= 3 copies of the same binary differing only in load address), one for each CPU, or can I do with only one .out file?

    So far I haven't found a way to change the DDR3 addresses for the DSP applications in CCS 6 or CCS7. They both always force it to 0x8000 0000. In CCS3 this was much easier with the .cmd file.

    I tried using a .cmd file in CCS6, but then it conflicts because CCS takes the DDR3 start address from some template. I suspected the addresses were defined in the target configuration file (.ccxml), but it lets me change the DDR3 address for the ARM CPUs only, not for the DSPs. Why is that?

    Please help me to figure this out.

    Best regards,

    Ari

  • Hello!

    Could somebody please tell me how I can make my DSP application .out use memory starting from 0xa000 0000 instead of the default 0x8000 0000?

    BRS, Ari

  • Ahoy! Anybody home?

    Is this too trivial a question, too difficult a question, or am I too stupid a user to answer?

    If it is a trivial question, one could expect a quick reply. What is taking time?

    - Ari

  • Attachment: /cfs-file/__key/communityserver-discussions-components-files/791/3527.Snapshots_5F00_CCS_5F00_MemoryMap_5F00_AM437x_5F00_Sysbios.docx

    Ari, I just returned from a short vacation and I am eager to respond!

    You are correct. If your project uses an RTSC system (you see a .cfg file in your project), the memory is defined in the target configuration. Your question is how to change or build a new target configuration.

    Enclosed is a file that shows how to change the memory. These are screenshots for a different device (AM57X DSP), but you can use them to modify your target or define a new target.

    So please define the external memory as starting at 0xa000 0000 (and modify the size accordingly), make sure that you indeed have physical memory on the board, and try.

    If you need more help get back to this e2e

    Ran

  • Hello Ran,

    I have tried editing the existing platform and creating a new platform for the project, both SYS/BIOS and XDCTools, for CPU and GPP, cleaned and rebuilt the project, and also restarted CCS several times in between. Nothing helps. The DDR3 address is always 0x8000 0000 in the generated .map file. The map file is rewritten every time, but always with the same contents.

    Finally the CCS began complaining that the K2L target is not supported so I uninstalled the SYS/BIOS 6.45.1.29 and XDCTools 3.30.5.60 which I have been using because I probably got them corrupted with some desperate brute-force attempts. Now I am trying to get them back. :-)

    I am out of further ideas.

    BRS, Ari

  • OK

    I guess that you modified one platform but your system uses a different platform (with the same name) from another location. But I understand that you have already spent too much time and want some quick and dirty trick to continue, so here is what I do in similar situations:
    I define a very large array with a size of 0x1000 0000 bytes (if the elements are floating point, the number of elements is 0x1000 0000 divided by 4) and put it at the top of main.c with an alignment that corresponds to 0x1000 0000 (look for information on the data alignment pragma; the number in the pragma is the alignment in bytes and must be a power of 2). Call it junk or something, and verify that the linker puts it at 0x8000 0000 and then everything else after it, starting at 0x9000 0000.
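
A minimal sketch of the padding trick described above, assuming the TI C6000 compiler and its C-language pragma form `#pragma DATA_ALIGN(symbol, constant)`. The names `g_reserved`, `RESERVED_BYTES` and `reserved_size_bytes` are illustrative only, and whether the linker really places the array first in DDR3 still has to be checked in the .map file.

```c
#include <stdint.h>

/* 0x1000 0000 bytes = 256 MB of padding at the start of DDR3. */
#define RESERVED_BYTES 0x10000000u

#if defined(__TI_COMPILER_VERSION__)
/* TI C syntax: the pragma names the symbol, and the constant is the
 * alignment in bytes. Other compilers never see this pragma. */
#pragma DATA_ALIGN(g_reserved, 0x10000000)
#endif

/* A byte array keeps the element count equal to the byte count. With
 * 4-byte elements (int, float) the element count would have to be
 * RESERVED_BYTES / 4 to cover the same 256 MB. */
volatile uint8_t g_reserved[RESERVED_BYTES];

/* Size of the padding in bytes, usable for a sanity check at startup. */
unsigned long reserved_size_bytes(void)
{
    return (unsigned long)sizeof(g_reserved);
}
```

Note that an array declared as `int name[256 * 1024 * 1024]` is four times larger than this (1 GB), because each element occupies 4 bytes.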


    Try it, see if it works for you, and get back to me with your observations. We will continue from there.

    Ran
  • Hello Ran,

    > I guess that you modify one platform but your system uses a different platform (with the same name) from other location.

    I guess you are right. It behaves just like that and I cannot figure out what's going on.

    > But I understand that you already spent too much time and you want some dirty fast trick how to continue

    This is exactly my feeling and thought! You are reading me like an open book! :D

    I understand how your "dirty trick" works. I have done something similar also myself in another project long ago.

    But can you please show me how exactly you do the "dirty trick" in the beginning of main.c. It just does not work correctly for me. Here is how I try to do it:

    // dirty trick to force the memory be used starting from 0x9000 0000 instead of 0x8000 0000:
    #pragma DATA_SECTION(".sysmem");
    #pragma DATA_ALIGN(256*1024*1024);
    int g_reserved[256 * 1024 * 1024];
    //#pragma DATA_ALIGN(256*1024*1024);
    //int g_reserved2[256 * 1024 * 1024];

    #include "../../dspPlatform.h"
    #include "../../mcmControl.h"
    #include "../../interrupts.h"
    #include "../../dspLink.h"
    #include <xdc/runtime/System.h>
    #include <stdio.h>
    #include <ti/sysbios/BIOS.h>
    #include <ti/platform/resource_mgr.h>
    #include <ti/csl/cslr_emif16.h>

    As you see, my dirty trick is in the beginning of the main. But still it puts something (I guess heap) into the 0x8000 0000 making ARM Linux crash when I load the program with CCS, for instance.

    And here is a snippet from the .map file showing what actually happened:

    ...

    00822428   __TI_CINIT_Base
    00822458   __TI_CINIT_Limit
    80000000   __ASM__
    80000000   g_reserved
    80000000   ti_sysbios_heaps_HeapMem_Instance_State_0_buf__A
    80000068   __ISA__
    80000080   __PLAT__
    800000b0   __TARG__
    800000d8   __TRDR__
    90000000   xdc_runtime_SysMin_Module_State_0_outbuf__A
    90000400   ti_sysbios_knl_Task_Instance_State_0_stack__A
    ...

    I checked the contents of the DDR3 memory at 0x8000 0000 before and after loading the program and I can see that loading the program wrote the heap stuff starting from 0x8000 0000 and then filled the memory up to 0x9000 0000 with zeros. Originally there was some Linux stuff at 0x8000 0000.

    So we need to try something else.

    BRS, Ari

  • OK Ari let's look at the options:

    1. Find out what platform you are using. For example, do not change your platform; define a new one with a different name. From the same tab as Edit Platform, use New and give it a name of the form ti.platforms.myName. Import the values from the platform you always use, see if you can save the new platform, and then try to use it. It is possible that you will have to find where the new platform is located and add that path to the list of repositories.

    2. Move the heap to, say, L2 (I assume it is not a shared heap) and see what happens. By the way, we may need to define g_reserved as volatile. I would suggest moving it to MSMC memory, but I think that Linux uses it to communicate with the DSP (I may be wrong, though).

    3. Use the MPAX registers to map (for the DSP) logical address 0x8000 0000 to physical address 0xA000 0000. If you use RTSC to define memory before main, then the MPAX registers should be configured before main. (I can show you how, but first understand how it works.)

    Ran

  • Hello Ran,

    I finally got the "option 1 working". I can now set up the memory map as I wish. Thanks very much! This is a good trick for using later as well.

    But this did not solve my original problem: I cannot load the application with mpmcl if it uses DDR3. I got a hint from an earlier post that I should move the start address 0x8000 0000 to 0xa000 0000.

    > One thing that I noticed is that your image is located at memory address 0x8000_0000, the ARM will be using this for Linux. If you use our examples for MPM DSP memory you will want to locate that image at 0xa000_0000. Otherwise, there is a conflict.

    Now it is done, but the problem still exists. Command "mpmcl load dsp0 dspMain" gives the following error:

    Timeout in reading from socket

    load failed (error: 0)

    After that trying to run the application with "mpmcl run dsp0" gives the following error:

    can't send data to /var/run/mpm/mpm_daemon (error: Connection refused)

    run failed (error: 0)

    Here is the memory configuration of the application I cannot load with the mpmcl:

    OUTPUT FILE NAME:   <dspMain.out>
    ENTRY POINT SYMBOL: "_c_int00"  address: b00251a0


    MEMORY CONFIGURATION

             name            origin    length      used     unused   attr    fill
    ----------------------  --------  ---------  --------  --------  ----  --------
      L2SRAM                00800000   00100000  00000400  000ffc00  RW X
      MSMCSRAM              0c000000   00200000  00000000  00200000  RW X
      DDR3                  a0000000   60000000  1005214c  4ffadeb4  RW X

    To be more sure about not conflicting with Linux I tried with another address space, but still the same failure:

    OUTPUT FILE NAME:   <dspMain.out>
    ENTRY POINT SYMBOL: "_c_int00"  address: f80251a0


    MEMORY CONFIGURATION

             name            origin    length      used     unused   attr    fill
    ----------------------  --------  ---------  --------  --------  ----  --------
      L2SRAM                00800000   00100000  00000400  000ffc00  RW X
      MSMCSRAM              0c000000   00200000  00000000  00200000  RW X
      DDR3                  f0000000   10000000  0805214c  07fadeb4  RW X

    And here is the memory configuration of another application which loads and runs fine, not using the DDR3 at all:

    OUTPUT FILE NAME:   <testDspLink.out>
    ENTRY POINT SYMBOL: "_c_int00"  address: 0080ec80


    MEMORY CONFIGURATION

             name            origin    length      used     unused   attr    fill
    ----------------------  --------  ---------  --------  --------  ----  --------
      L2SRAM                00800000   00100000  00018b42  000e74be  RW X
      MSMCSRAM              0c000000   00200000  00000000  00200000  RW X
      DDR3                  80000000   80000000  00000000  80000000  RWIX

    It seems that I have some issue with loading the application to any DDR3 address. The mpm-daemon crashes when doing so.

    I haven't touched the mpm and dsp settings in the device tree or /etc/mpm/mpm_config.json. They are the same as the EVM board factory settings.

    Do you have some ideas what might be wrong?

    Best regards,

    Ari

  • OK, so this is a memory issue. In the "old" days when I wrote code for K2H, I remember that if mpmcl was unhappy with the DSP .out file, it would not load it, but it did not print the message about the socket. So this may have changed since then.

    If the kernel uses the memory, mpmcl will not load into it. I think that this was the original problem. I suggest two things:

    Make sure that mpmcl reset dsp0 works without errors.

    The DSP DDR3 code should be inside the mem_reserve area. Either search for mem_reserve and see how it is configured, or stop the boot in U-Boot, run printenv, and see where mem_reserve is. This is where your DSP code should be.

    Tell me if it works

    Ran

  • One more thing

    Look at processors.wiki.ti.com/.../MCSDK_UG_Chapter_Exploring
    for the discussion about mem_reserve, and see if there is a conflict.

    Ran
  • Hello Ran,

    I have been busy with other more urgent tasks, but now I have again some time for facing the mpmcl challenges. :-)

    Resetting a DSP core has no problems. Issuing a command:

    mpmcl reset dsp0

    works just fine and, as I said long ago, I can load .out files into internal memory and run them there, so mpmcl basically works all right. I can also load .out files into DDR3 with CCS through JTAG, but doing so crashes Linux, although I can still debug the loaded .out program with CCS without any problems.

    In the u-boot I modified the mem_reserve to 1536, which should make the DDR3 memory area 0xA000 0000 - 0xffff ffff available for DSP cores.

    The "mem_lpae" is set to 1.

    I also checked the memory usage in Linux with

    cat /proc/iomem

    It reports that Kernel code is at

    0x8 0000 8000 - 0x8 008a a073

    and the Kernel data at

    0x8 008f 2000 - 0x8 0097 cfc3

    But now I notice that Linux has mapped its memory to a wider (36-bit?) address space, not 32-bit. Is this what is causing the problem now? You mentioned the MPAX registers earlier. Should I program them in order to do a similar mapping for my DSP applications as well?

    During the kernel boot it says:

    [    0.000000] Switching physical address space to 0x800000000

    and later:

    [    12.802699] keystone-dsp-mem a0000000.dspmem: registered misc device dspmem

    If I need to program the MPAX registers, then could you please briefly tell me what to write on them?

    I want Linux to run somewhere at 0x8000 0000 - 0x9fff ffff and all the rest of the DDR3 at 0xa000 0000 - 0xffff ffff to be available to all the DSP CPUs, with the ARM having access to the same area (0xa000 0000 - 0xffff ffff) as well.

    Best regards,

    Ari

  • OK   let's be clear on the 36 bits and the 32-bits.  

    First I attach a short presentation on the MPAX registers.   Look at the start of the presentation and see if you know already what it says.  If you have any question post here.

    And yes, the default setting of the DSP is that address 0x8000 0000 is mapped to 0x8 0000 0000. If you look at the MPAX registers you can see what the default setting is.
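
As a small worked example of that default mapping, the sketch below converts a 32-bit DSP address in the DDR3 window to its 36-bit physical address. It assumes only the default mapping stated above (0x8000 0000 to 0x8 0000 0000); the function and macro names are made up for illustration.

```c
#include <stdint.h>

#define DSP_DDR_LOGICAL_BASE 0x80000000u     /* 32-bit DSP view */
#define DDR_PHYSICAL_BASE    0x800000000ull  /* 36-bit SoC view */

/* Translate a DSP logical DDR3 address to its 36-bit physical address
 * under the default mapping. Only meaningful for addresses at or above
 * 0x8000 0000. */
uint64_t dsp_logical_to_physical(uint32_t logical)
{
    return DDR_PHYSICAL_BASE + (uint64_t)(logical - DSP_DDR_LOGICAL_BASE);
}
```

With this mapping, DSP address 0xa000 0000 corresponds to physical address 0x8 2000 0000, which is the address the kernel boot log reports for the reserved DMA memory pool.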

    Obviously the ARM uses part of the physical memory that the DSP uses. The issue is how to find it. It can be defined in the device tree or in the U-Boot environment. It can be part of the kernel code, or it may be cmem code that is used by user space.

    Just to be sure: the DSP does not use the MSMC memory at all.

    Read the attached document, and then you can do several things:

    1. Understand what (36-bit) physical memory the DSP code and data occupy. Understand what 36-bit physical addresses the ARM uses. Verify whether there is indeed no collision. (There must be one, otherwise we cannot explain what you see.)

    2. Before loading the DSP, look at the memory that is assigned to the DSP and make sure that the ARM does not use any of it. (For example, from CCS connect to the DSP, fill the DSP physical addresses with something, and then see if the ARM code is still working.)

    Get back here after you verify all the physical addresses and we will continue from there.

    Ran

  • Hello Ran,

    I have read the presentation you sent last time.

    According to it I must set the MPAX HI and MPAX LO to certain values. I tried the 1GB segment example on page 29, which seems to be closest to my needs, but could not accomplish what I wanted with it.

    Right after boot the XMC registers at 0x0800 0000 are:

    0: XMC_xmpax_xmpaxl: 000000BF
    0: XMC_xmpax_xmpaxh: 0000001E
    1: XMC_xmpax_xmpaxl: 800000BF
    1: XMC_xmpax_xmpaxh: 8000001E
    2: 00000080
    2: 00000000
    3: 00000080
    3: 00000000
    4: 00000080
    4: 00000000

    I cannot edit the register pairs 0 and 1, but the pairs 2, 3, 4,... I can. Are the 0 and 1 for ARM and the rest of them for DSP0, DSP1, etc or what is their meaning?

    In a gel file I have a function for setting the XMC registers for the pair 2 (at address 0x08000010 and 0x08000014). It writes LO = 0x121010FF, HI = 0x2101000B, but they do not help.

    I then tried with the values from the example, 0x1d, 2 and 0x20, for pair 2: LO = 20FF, HI = 201D. They did not do any good either.

    I have created a custom platform in order to change the DSP code image load address from the default 0x8000 0000 to 0xa000 0000. I have verified from the .map file, and by debugging with CCS and an emulator, that the code and data are in the 0xa000 0000 area.

    In the device tree I have the following settings:

    dsp_common_mpm_area : dsp_reserved_mpm_area

    <8    0x2000 0000    0x0000 0000    0x2000 0000>

    cma_pool

    <8    0x1f80 0000    0    0x800000>

    mpm_mem : dspmem @ 0xa000 0000

    <0xa000 0000    0x2000 0000>

    cmem_block_mem0

    <8    0x2200 0000    0    0x1e00 0000>

    cmem_block_mem1

    <0    0x0c08 0000    0    0x000c 0000>

    cmem_block0

    <1    0    0x1e00 0000>

    cmem_block1

    ....

    What next?

    BRS, Ari

  • Hi Ari, I apologize but I did not tell you an important fact (you may already know it)
    If the same logical address is covered by two MPAX registers, the translation to physical is done using the higher number register

    So for example, if MPAX 0 maps logical address 0x8000 0000 to address 0x8 0000 0000 and MPAX 14 maps logical address 0x8000 0000 to physical address 0x9 1234 0000 then when the CPU addresses address 0x8000 0000 the physical address will be 0x9 1234 0000

    So as a rule, I never change the default MPAX register values (MPAX 0 and MPAX 1). I just define a higher-numbered register that maps the way I want.
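
The priority rule can be sketched as a lookup that scans the register pairs from the highest index down. The field layout used here is an assumption taken from my reading of the C66x CorePac documentation, not from this thread: XMPAXH holds the logical base in bits 31:12 and the segment size code in bits 4:0 (segment size = 2^(SEGSZ+1) bytes, codes below 0x0B meaning disabled), and XMPAXL holds physical address bits 35:12 in bits 31:8. The type and function names are hypothetical. The two default pairs are the values read out after boot earlier in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define MPAX_NO_MATCH UINT64_MAX

typedef struct {
    uint32_t xmpaxl; /* assumed: RADDR in bits 31:8, PERM in bits 7:0   */
    uint32_t xmpaxh; /* assumed: BADDR in bits 31:12, SEGSZ in bits 4:0 */
} mpax_pair;

/* Pairs 0 and 1 as read out after boot earlier in this thread. */
const mpax_pair mpax_boot_defaults[2] = {
    { 0x000000BFu, 0x0000001Eu },
    { 0x800000BFu, 0x8000001Eu },
};

/* Translate a 32-bit logical address to a 36-bit physical address.
 * The highest-numbered enabled pair that covers the address wins. */
uint64_t mpax_translate(const mpax_pair *regs, size_t count, uint32_t logical)
{
    for (size_t i = count; i-- > 0; ) {
        uint32_t segsz = regs[i].xmpaxh & 0x1Fu;
        if (segsz < 0x0Bu)
            continue;                       /* segment disabled */
        uint64_t size = 1ull << (segsz + 1u);
        uint64_t base = regs[i].xmpaxh & 0xFFFFF000u;
        uint64_t phys = (uint64_t)(regs[i].xmpaxl >> 8) << 12;
        base &= ~(size - 1);                /* bases align to the size */
        phys &= ~(size - 1);
        if (logical >= base && logical - base < size)
            return phys + (logical - base);
    }
    return MPAX_NO_MATCH;
}
```

Under these assumptions the two default pairs alone translate 0xa000 0000 to 0x8 2000 0000. Adding a pair 14 with XMPAXH = 0x8000000F and XMPAXL = 0x912340BF (a 64 KB segment, chosen only for illustration) makes 0x8000 0000 resolve to 0x9 1234 0000, while 0x8001 0000 still resolves through pair 1.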

    Now, the above was about the MPAX registers that sit between the DSP and the MSMC. Similar registers are on the TeraNet access to the MSMC. But in the case of the TeraNet, each host ID has 8 registers for external access (addresses 0x8000 0000 and higher) and 8 registers for internal access (addresses less than 0x8000 0000).

    On top of it, the TeraNet access is for other hosts, like EDMA. The ARM translation is done by the MMU.
    So what is next -
    You can look in the LINUX device tree and see what DDR memory the ARM does not use (or change the device tree to have reserve area for the DSP)

    TI encourages users to use OpenCL to dispatch DSP code from the ARM. You can look, for example, at www.ti.com/.../tidep0046, which shows a TI design with OpenCL, and you should read the following: downloads.ti.com/.../ddr-partition.html. By the way, even though the last document does not talk about mpmcl, you may find some hints there.

    Update me with what is going on

    Best regards

    Ran
  • The LINUX expert says the following:

    LINUX kernel defines what memories can the DSP use in the DTS file as a cma_pool node

    Look at the keystone-k2l-evm.dts file in the release, read it and look for cma_pool

    Ran
  • Hello Ran,

    in the "cma_pool" I have:

    < 0x00000008    0x1f800000    0x00000000    0x800000>

    From what you said I understand that I should put there

    < 0x00000008    0x20000000    0x00000000    0x20000000>

    But that would overlap with the "mpm_area":

    < 0x00000008    0x20000000    0x00000000    0x20000000>
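
To make the overlap concrete, here is a small sketch that turns the four `reg` cells into a 64-bit start and size and tests two regions for overlap. It assumes the nodes use two address cells and two size cells, as the four-cell entries above suggest; the type and function names are made up for illustration.

```c
#include <stdint.h>

typedef struct {
    uint64_t start;
    uint64_t size;
} dt_region;

/* Combine <addr_hi addr_lo size_hi size_lo> into a 64-bit region. */
dt_region dt_region_from_cells(uint32_t addr_hi, uint32_t addr_lo,
                               uint32_t size_hi, uint32_t size_lo)
{
    dt_region r;
    r.start = ((uint64_t)addr_hi << 32) | addr_lo;
    r.size  = ((uint64_t)size_hi << 32) | size_lo;
    return r;
}

/* Return 1 if the half-open ranges [start, start + size) intersect. */
int dt_regions_overlap(dt_region a, dt_region b)
{
    return a.start < b.start + b.size && b.start < a.start + a.size;
}
```

With the current values, the cma_pool covers 0x8 1f80 0000 up to 0x8 2000 0000 and ends exactly where the mpm_area begins, so the two do not overlap. Moving the cma_pool to `<8 0x20000000 0 0x20000000>` would make it coincide with the mpm_area.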

    I am now a bit confused how to proceed.

    What should I set for the "mpm_area" then, if I change the "cma_pool" to be located in that area?

    How can I know where the dsp.out is actually loaded in the physical 36-bit memory space? Is there some handy tool in Linux side for that or can it be found from the mpmsrv log? So far I have not found such a log, but I guess it is somewhere. Or do I just need to do some memory browsing to spot the familiar looking bytes in the DDR3?

    Best regards,

    Ari

  • Well, I am not a Linux expert, so my suggestion is to try the cma_pool in the same area as the mpm_area, and then try it in a different location. If I had to guess, I would say that the mpm_area is used for the communication between the ARM and the DSP and the cma_pool is dedicated to the DSP.

    And you have to look at the DSP memory map and the configuration of the MPAX registers (before you load the DSP code to the DSP memory) to verify that the DSP code and data is indeed mapped into the cma_pool area.

    Try and tell me what you see

    Ran
  • Hello Ran,

    thanks for your quick reply. I will try what you suggest and let you know the results.

    By the way, I have noticed that when loading an RTOS application for a DSP core in CCS, the EVM_init function (set up in the .cfg as the "Startup.firstFxns" function) is called already at load time, before reaching the main() function. Does the same happen with the "mpmcl load" command from the Linux command prompt? I mean, could there be something in EVM_init which crashes the system?

    In the EVM_init() I have the following;

    void EVM_init(void)
    {
        platform_info info;
        platform_init_flags sFlags;
        platform_init_config sConfig;
        int32_t pform_status;

        platform_uart_init();
        platform_uart_set_baudrate(UART_BAUDRATE);
        (void) platform_write_configure(PLATFORM_WRITE_ALL);
        Uint32 uartBaudDivisor = (DSP_CPU_OSC_FREQ_Hz/6) / (16 * UART_BAUDRATE);
        CSL_UartRegs* uartRegs = (CSL_UartRegs*)UART_BASE_ADDRESS;
        uartRegs->DLL = uartBaudDivisor;
        uartRegs->DLH = 0;

        memset( (void *) &sFlags, 0, sizeof(platform_init_flags));
        memset( (void *) &sConfig, 0, sizeof(platform_init_config));

        sFlags.pll = 1; // 0 /* 1 = update PLLs for clocking */
        sFlags.ddr = 0; /* External memory */
        sFlags.tcsl = 1; /* Time stamp counter */
        sFlags.phy = 0; /* Ethernet */
        sFlags.ecc = 0; /* Memory ECC */
        sConfig.pllm = 0x14;//0; // Use custom PLLM instead of library default clock divisor

        pform_status = platform_init(&sFlags, &sConfig);

        platform_get_info(&info);

        platform_write("Platform info:\n");
        platform_write("Board: %s\n", info.board_name);
        platform_write("Revision: %d\n", info.board_rev);
        platform_write("Frequency: %u MHz\n", info.frequency);
        platform_write("S/N: %s\n", info.serial_nbr);
        platform_write("Version: %s\n", info.version);
        platform_write("CPU\n");
        platform_write("\tcount: %d\n", info.cpu.core_count);
        platform_write("\tendianness: %d\n", info.cpu.endian);
        platform_write("\tID: %d\n", info.cpu.id);
        platform_write("\tmegamodule revision: %d.%d\n", info.cpu.megamodule_revision_major, info.cpu.megamodule_revision_minor);
        platform_write("\tname: %s\n", info.cpu.name);
        platform_write("\trevision: %d\n", info.cpu.revision_id);
        platform_write("\tsilicon revision: %d.%d\n", info.cpu.silicon_revision_major, info.cpu.silicon_revision_minor);
        platform_write("EMAC\n");
        Uint8* mac = info.emac.eeprom_mac_address;
        platform_write("\tEeprom MAC address: %02x:%02x:%02x:%02x:%02x:%02x\n",
            mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
        mac = info.emac.efuse_mac_address;
        platform_write("\tEfuse MAC address: %02x:%02x:%02x:%02x:%02x:%02x\n",
            mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
        platform_write("\tport count: %d\n", info.emac.port_count);
    }

    Best regards,

    Ari

  • Yes, the startup function is part of the executable before main and thus it runs when you load the code from mpmcl as well

    This is the way to configure the MPAX registers before main starts. The reason is that c_init initializes memory before main, and if the MPAX registers are not configured yet, the initialization may happen in the wrong physical memory
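
A hedged sketch of what such a startup function could compute and write. The field layout is an assumption from my reading of the C66x CorePac documentation (XMPAXH: logical base in bits 31:12, size code in bits 4:0 with size = 2^(SEGSZ+1) bytes; XMPAXL: physical address bits 35:12 in bits 31:8, permissions in bits 7:0). The register addresses follow the pair 2 addresses quoted earlier in this thread (0x08000010 and 0x08000014). The names and the order of the two writes are illustrative and should be checked against the CorePac user guide.

```c
#include <stdint.h>

typedef struct {
    uint32_t xmpaxl;
    uint32_t xmpaxh;
} mpax_value;

/* Build the register pair for one segment. log2_size is 12 for 4 KB up
 * to 32 for 4 GB; both bases must be aligned to the segment size. */
mpax_value mpax_encode(uint32_t logical_base, uint64_t physical_base,
                       unsigned log2_size, uint8_t perm)
{
    mpax_value v;
    v.xmpaxh = (logical_base & 0xFFFFF000u) | ((log2_size - 1u) & 0x1Fu);
    v.xmpaxl = ((uint32_t)(physical_base >> 12) << 8) | perm;
    return v;
}

#if defined(_TMS320C6X)
/* Target-only: write pair `index` (0..15), assuming XMPAXL(n) sits at
 * 0x08000000 + 8 * n and XMPAXH(n) four bytes above it. */
void mpax_program_pair(unsigned index, mpax_value v)
{
    volatile uint32_t *pair = (volatile uint32_t *)(0x08000000u + 8u * index);
    pair[0] = v.xmpaxl;
    pair[1] = v.xmpaxh;
}
#endif
```

For example, `mpax_encode(0x80000000u, 0x820000000ull, 29, 0xBF)` describes a 512 MB segment that would make DSP address 0x8000 0000 land at physical 0x8 2000 0000. It yields XMPAXH = 0x8000001C and XMPAXL = 0x820000BF. The permission byte 0xBF is simply copied from the boot defaults shown earlier.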

    Ran

  • Hello Ran,

    I tried changing the "cma_pool" address in the device tree. The result was that the kernel did not boot at all. :-(

    I loaded my modified device tree in U-Boot with the "tftpboot" command and then just ran the kernel, and it did not start. It displayed nothing after "Starting kernel..."

    Then I did the same again (after power off-on of course), loaded again my modified device tree but then modified the original address of the "cma_pool" back and booted and the kernel booted normally.

    So modifying the address is not a good idea, or at least using the reg = < 8 0x2000000 0 0x800000> is not good.

    With the default MPAX settings, this 36-bit physical address (8 0x2000 0000) corresponds to the 32-bit address 0xa000 0000, am I right?

    I just wonder why this has to be so difficult. Using the DDR3 in DSP applications should be just basic stuff. Why should I need to modify some "hard-to-understand" device tree settings? Shouldn't they be already set up so that the end user can easily start using the system?

    I asked earlier if there is some easy way to know where in the 36-bit address space the "mpmcl" tries to load the image. That would help me to understand what happens. Can you name some tool or a log file in the target system?

    If I put a printf or platform_write into the "EVM_Init" function, should it print into the console during the "mpmcl load"? That way I possibly could print out some pointers etc.

    Best regards,

    Ari

  • I asked LINUX expert to look at this.  Help will come (Hopefully)

    Ran

  • Hello Ari,

    Can you please tell me the SDK you are using and the version?

    (Are you using Processor SDK or MCSDK and the version)

    This will help me guide you through this accordingly.

    with regards,

    Sam

  • Hello Sam,
    I am using the u-boot and kernel for K2L EVM board from the Processor SDK version 03.02.00.05 from 15th December 2016.
    The kernel version found in this package is 4.4.32.

    BRS, Ari
  • Thanks for the information.
    Based on your log earlier
    [ 0.000000] Reserved memory: created DMA memory pool at 0x0000000820000000, size 32 MiB
    [ 0.000000] Reserved memory: initialized node dsp_reserved_mpm_area, compatible id shared-dma-pool
    >> I observe that the memory at physical address 0x820000000 (DSP-side address 0xa0000000, assuming default MPAX), length 32 MB, is reserved for MPM to use for DSP code/data download (0xa0000000-0xa2000000 on the DSP side).
    (NOTE: This is different from what is configured in the default device tree in Processor SDK. I assume you are customizing this.)
    Given this device tree configuration, the DSP image needs to match this memory map.
    Based on the following line in the memory map, I see that the memory layout in the DSP image is not configured to match this.
    DDR3 80000000 80000000 0b5c12c4 74a3ed3c RWIX
    It is using memory at 0x80000000, which is outside the range 0xa0000000-0xa2000000.

    At first glance, this needs to be corrected before looking at other issues.
    Please let me know if this makes sense.

    With regards,
    Sam
  • Hello Sam,

    I don't understand why the 0x8000 0000 range is still in the post. It has come from some very early stage of experimenting. I have been trying with the 0xa000 0000 range for a long time already.

    See the snippet of my latest .map file:

    ENTRY POINT SYMBOL: "_c_int00"  address: a8025880


    MEMORY CONFIGURATION

             name            origin    length      used     unused   attr    fill
    ----------------------  --------  ---------  --------  --------  ----  --------
      L2SRAM                00800000   00100000  000005c0  000ffa40  RW X
      MSMCSRAM              0c000000   00200000  00000000  00200000  RW X
      DDR3                  a0000000   60000000  08052c08  57fad3f8  RW X

    But as told before, it just does not work. The Linux crashes if I load the dsp.out from the CCS and if I load it from Linux using "mpmcl", it says after a long wait (one minute or so):

    root@k2l-evm:~/tests# mpmcl load dsp0 dsp.out

    Timeout in reading from socket
    load failed (error: 0)


    One more attempt gives:

    root@k2l-evm:~/tests# mpmcl load dsp0 dsp.out

    can't send data to /var/run/mpm/mpm_daemon (error: Connection refused)

    load failed (error: 0)
    root@k2l-evm:~/tests#

    Once more: I simply want to load most of my DSP code and data into DDR3 so that Linux keeps running on the ARM. There should be absolutely nothing special in this, right? Still, doing so seems to be amazingly complex!

    I also want to use some of the L2 SRAM for time-critical code. For that I have configured the L2 to be disabled, letting me freely use the L2 area at 0x0080 0000.

    Now, done so far:

    1. in the u-boot the environment variable "mem_reserve" (or how was it called, can't remember exactly now) has been set to 1536 MB. I have understood this tells Linux not to touch the memory at 0xa000 0000 - 0xffff ffff. Is this correct?

    2. I have created a custom platform, so that CCS builds the .out files into the 0xa000 0000 area with it. This works according to the .map files.

    3. I have read the training material of how to setup the MPAX registers for mapping the 32-bit addresses into segments in 36-bit physical memory. I think I understand this, and I have managed to set my own values on those registers, but it just has no effect on the crashing.

    4. I have tried to set the "cma_pool" in the device tree. So far every attempt has crashed Linux during boot.

    5. I have checked where the Linux is running:

    "cat /proc/iomem" reports that

    Kernel code is at

    0x8 0000 8000 - 0x8 008a a073

    and the Kernel data at

    0x8 008f 2000 - 0x8 0097 cfc3

    So, the area 0xa000 0000 (physically 0x8 2000 0000, right?) should be completely free for DSP.
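
    The address translation asked about here can be sketched as plain offset arithmetic. This is a minimal sketch that assumes the default linear mapping of the 32-bit DDR3 window at 0x8000 0000 onto the 36-bit physical base 0x8 0000 0000; that assumption matches the kernel addresses quoted above, but it does not hold once the MPAX registers are changed. The function and constant names are illustrative only.

```python
# Assumption (not confirmed in the thread): with default MPAX settings the
# 32-bit DDR3 window at 0x8000_0000 maps linearly onto 36-bit 0x8_0000_0000.
ALIAS_BASE_32 = 0x8000_0000
ALIAS_TOP_32 = 0xFFFF_FFFF
PHYS_BASE_36 = 0x8_0000_0000


def to_physical(addr32: int) -> int:
    """Translate a 32-bit DDR3 alias address into its 36-bit physical address."""
    if not ALIAS_BASE_32 <= addr32 <= ALIAS_TOP_32:
        raise ValueError(f"{addr32:#x} is outside the 32-bit DDR3 window")
    return addr32 - ALIAS_BASE_32 + PHYS_BASE_36


def to_alias(phys36: int) -> int:
    """Translate a 36-bit physical DDR3 address back into the 32-bit view."""
    addr32 = phys36 - PHYS_BASE_36 + ALIAS_BASE_32
    if not ALIAS_BASE_32 <= addr32 <= ALIAS_TOP_32:
        raise ValueError(f"{phys36:#x} has no 32-bit DDR3 alias")
    return addr32


print(hex(to_physical(0xA000_0000)))  # 0x820000000
print(hex(to_alias(0x8_0000_8000)))   # 0x80008000
```

    Under that assumption, 0xa000 0000 does correspond to 0x8 2000 0000, and the kernel code start at 0x8 0000 8000 appears at 0x8000 8000 in the 32-bit view.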

    All this has taken already some weeks. I think it is too much, since my project deadline approaches and I am lost = out of ideas for now.

    I only wish somebody could simply tell me:

    1. what to put into MPAX if needed to change the defaults. I suppose the defaults are OK since I have set the address to 0xa000 0000 in the CCS already and Linux is not using that area, right?

    2. how to edit the device tree, if needed at all.

    If these settings are already OK, then I am "barking up the wrong tree" and the mistake must be somewhere else.

    Best regards,

    Ari

  • Ari,

    Let me try to break it down and answer your questions one at a time.

    Thanks for the update on the new memory map. I did not catch the update in your later post. The new memory map shows:

    DDR3 a0000000 60000000 08052c08 57fad3f8 RW X

    >> I notice, that still does not fit within the reserved memory for dsp ( which based on the linux log I understand is 0xa0000000-0xa2000000). Let me know if this changed in the device tree side.

    1. On mem_reserve: This variable is now deprecated. You don't need to set it, because in Processor SDK the device tree takes care of the reservation: the dsp_common_mpm_area entry, together with the mpm_mem entry, now handles this.
    2. I understand you are taking care of the memory map, but you probably need to align it with the Linux side so that the DSP memory area fits within what is reserved by the Linux kernel.
    3. MPAX: I would recommend using the default MPAX settings for now and resolving the other issues first. We can revisit this once the base issues are resolved.
    4. Regarding "cma-pool": are you changing the right CMA pool? The section in question is dsp_common_mpm_area. I hope this is the one you are changing. Please confirm.
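
    The size mismatch noted above can be checked with a few lines of arithmetic. This is a minimal sketch using the DDR3 origin and "used" values from the .map snippet and the 32 MiB reservation from the boot log. It assumes the linker fills the DDR3 region contiguously from its origin, which the .map excerpt does not prove; the Region type and fits() helper are illustrative names.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int   # first byte of the region
    length: int  # size in bytes

    @property
    def end(self) -> int:
        """Exclusive end address of the region."""
        return self.start + self.length


def fits(inner: Region, outer: Region) -> bool:
    """True if inner lies entirely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


# From the .map snippet: DDR3 origin a0000000, used 08052c08.
# Assumption: the used bytes are laid out contiguously from the origin.
dsp_image = Region("DSP DDR3 usage", 0xA000_0000, 0x0805_2C08)

# From the Linux boot log: 32 MiB reserved, i.e. 0xa0000000-0xa2000000.
reserved = Region("reserved MPM area", 0xA000_0000, 32 * 2**20)

print(fits(dsp_image, reserved))  # False
overshoot_mib = (dsp_image.end - reserved.end) / 2**20
print(f"image ends {overshoot_mib:.1f} MiB past the reserved area")
```

    The entry point at a8025880 also lies beyond 0xa2000000, which is consistent with the image not fitting in the default reservation.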

    Sorry that I got involved in this so late. I will try to help you get things up and running ASAP.
    with regards,
    Sam
  • Hello Sam,

    good news at last.

    mpmcl suddenly began working after I moved one of the CMEM blocks and the "common MPM area" block to the end of the DDR3 by modifying the device tree.

    Memory map from the 32-bit perspective:

    Block                      Before           Now
    -------------------------  ---------------  ---------------
    0x0c00 0000 - 0x0c07 ffff  msmram0          msmram0
    0x0c08 0000 - 0x0c13 ffff  CMEM             CMEM
    0x0c14 0000 - 0x0c1f 7fff  free ?           free ?
    0x0c1f 8000 - 0x0c1f ffff  msmram1          msmram1
    0x8000 8000 - 0x8097 cfc3  Kernel           Kernel
    0x8097 cfc4 - 0x9f7f ffff  free ?           free ?
    0x9f80 0000 - 0x9fff ffff  common CMA pool  common CMA pool
    0xa000 0000 - 0xa1ff ffff  common MPM area  System RAM
    0xa200 0000 - 0xbfff ffff  CMEM             System RAM
    0xc000 0000 - 0xdfff ffff  System RAM       System RAM
    0xe000 0000 - 0xe1ff ffff  System RAM       common MPM area
    0xe200 0000 - 0xffff ffff  System RAM       CMEM

    This setup leaves me a 1 GB contiguous memory block for the DSP cores to use, which should be enough (for now).

    In order to use the MSMC SRAM at 0x0c00 0000 more efficiently I could probably reorganize it a bit as well, right?

    Please verify that the table above is all right.
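
    The 1 GB figure can be cross-checked by merging adjacent rows of the "Now" column. This is a minimal sketch that only verifies the arithmetic of the table; it says nothing about whether Linux actually leaves that range alone. The merge_runs() helper is an illustrative name.

```python
# "Now" column of the table above, 32-bit view, inclusive end addresses.
NOW = [
    (0xA000_0000, 0xA1FF_FFFF, "System RAM"),
    (0xA200_0000, 0xBFFF_FFFF, "System RAM"),
    (0xC000_0000, 0xDFFF_FFFF, "System RAM"),
    (0xE000_0000, 0xE1FF_FFFF, "common MPM area"),
    (0xE200_0000, 0xFFFF_FFFF, "CMEM"),
]


def merge_runs(rows):
    """Merge address-adjacent rows that carry the same label into single runs."""
    runs = []
    for start, end, label in sorted(rows):
        if runs and runs[-1][2] == label and runs[-1][1] + 1 == start:
            runs[-1] = (runs[-1][0], end, label)  # extend the previous run
        else:
            runs.append((start, end, label))
    return runs


for start, end, label in merge_runs(NOW):
    size_mib = (end - start + 1) // 2**20
    print(f"{start:#010x} - {end:#010x}  {size_mib:5d} MiB  {label}")
```

    The first merged run spans 0xa0000000 - 0xdfffffff, which is 1024 MiB, followed by 32 MiB for the common MPM area and 480 MiB for CMEM.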

    Some more questions:

    Please ignore my CMEM and CMA questions if you think I do not need to take care of them and I should only let them be like they now are. I do not want to waste your time for nothing.

    1. When applications allocate memory on the Linux side, will they take it from the CMEM or from the "System RAM"? Or is the CMEM area for the "cmemk.ko" kernel module's use only?

    2. When running applications in Linux side, will the Kernel data + code grow, or do the applications use the System RAM? I mean that do I need to take care the kernel does not grow over the "common CMA pool"?

    3. The Linux boot messages name the "common MPM area" (32 MB) as "DMA memory pool". What does it have to do with DMA? Does Linux on the ARM side do some DMA transfers, and do I need this area if I wish to use DMA only on the DSP side?

    4. What is CMA pool anyway? I tried to find explanation from the net but all I found was that I could use either CMEM or CMA since the CMA is only kind of addition and the CMEM can still do all the stuff. Do I then need the CMA at all?

    Thanks very much for helping me out of the problem!

    Best regards,

    Ari

  • Hello Ari,

    Glad to hear you have it working now.

    I am not clear on the intention behind your new memory map. Is your new DSP DDR memory map using code/data at 0xe000 0000 - 0xe1ff ffff?


    Anyway, let me try to answer your specific questions:

    1. When applications allocate memory on the Linux side, will they take it from the CMEM or from the "System RAM"? Or is the CMEM area for the "cmemk.ko" kernel module's use only?

    >> The CMEM area is reserved for cmemk.ko use only.


    2. When running applications in Linux side, will the Kernel data + code grow, or do the applications use the System RAM? I mean that do I need to take care the kernel does not grow over the "common CMA pool"?

    >> What is carved out for the common CMA pool is reserved for CMA allocation. There is no need to worry about other kernel allocations using that memory.

    3. The Linux boot messages name the "common MPM area" (32 MB) as "DMA memory pool". What does it have to do with DMA? Does Linux on the ARM side do some DMA transfers, and do I need this area if I wish to use DMA only on the DSP side?

    >> The DMA name may be misleading. It is just used to reserve the area. There is no DMA involved.

    4. What is CMA pool anyway? I tried to find explanation from the net but all I found was that I could use either CMEM or CMA since the CMA is only kind of addition and the CMEM can still do all the stuff. Do I then need the CMA at all?

    >> CMEM can allocate both from the CMA pool and from its own carved-out memory. But there may be other applications that allocate from the CMA pool as well.

    Hope this answers your questions.

    with regards,

    Sam

  • Hello Sam,

    thanks for clarifying all my questions 1-4. They are now clear enough for me.

    To answer your question about my new memory map, the Linux reports now:

    root@k2l-evm:~# cat /proc/iomem

    ...

    80000000-dfffffff : System RAM (boot alias)

    800000000-85fffffff : System RAM

      800008000-8008aa073 : Kernel code

      8008f2000-80097cfc3 : Kernel data

    862000000-87fffffff : CMEM
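
    One way to see what this /proc/iomem excerpt leaves unlisted is to look for holes between the top-level entries. This is a minimal sketch that parses only the lines quoted above and assumes 2 GiB of DDR3 at 36-bit physical 0x8 0000 0000 - 0x8 7fff ffff (inferred from the 32-bit table, not stated by the kernel). The helper names are illustrative.

```python
import re

# Lines copied from the /proc/iomem excerpt above (the "..." part is unknown).
IOMEM = """\
80000000-dfffffff : System RAM (boot alias)
800000000-85fffffff : System RAM
  800008000-8008aa073 : Kernel code
  8008f2000-80097cfc3 : Kernel data
862000000-87fffffff : CMEM
"""

LINE = re.compile(r"^(\s*)([0-9a-f]+)-([0-9a-f]+) : (.+)$")


def top_level(text):
    """Return (start, end, name) for the non-indented /proc/iomem lines."""
    regions = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m and not m.group(1):  # skip nested (indented) entries
            regions.append((int(m.group(2), 16), int(m.group(3), 16), m.group(4)))
    return regions


def gaps(regions, lo, hi):
    """Return inclusive (start, end) holes within [lo, hi] not covered by regions."""
    holes, cursor = [], lo
    for start, end, _ in sorted(r for r in regions if r[1] >= lo and r[0] <= hi):
        if start > cursor:
            holes.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= hi:
        holes.append((cursor, hi))
    return holes


# Assumption: 2 GiB of DDR3 in the 36-bit physical address space.
DDR_LO, DDR_HI = 0x8_0000_0000, 0x8_7FFF_FFFF

for start, end in gaps(top_level(IOMEM), DDR_LO, DDR_HI):
    print(f"{start:#x}-{end:#x}  {(end - start + 1) // 2**20} MiB not listed")
```

    With these inputs the only hole is 0x860000000-0x861ffffff (32 MiB), which matches where the table places the common MPM area at 0xe000 0000 - 0xe1ff ffff in the 32-bit view.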

    I am not going to use the "Common MPM Area" at 0xe000 0000 - 0xe1ff ffff at all, since I don't know exactly what it is and it is far too small (32 MB) to be used for almost anything in my case.

    In the original memory map one of the CMEM blocks was placed (in the 32-bit address space) at 0xA200 0000 - 0xBFFF FFFF, badly splitting the otherwise available area starting from 0xA000 0000. I cannot understand why it was done so in the first place as a default setting. There seems to be no sense in it at all!

    But with this new setup I can load and run my DSP applications in the area 0xA000 0000 - 0xDFFF FFFF, which gives 1 GB of contiguous memory.

    For me this is just excellent, but is there some issue in this? Should I, for example, enlarge the "Common MPM area" to fill the "System RAM" to make it right?

    At least my applications now (seem to) load and run perfectly.

    BRS, Ari

    Ari,
    The "Common MPM Area" is expected to match exactly what is in the "mpm_mem" entry in the device tree.
    This area is meant for the code, initialized data and constants of the DSP application. So you also need to make sure that the code, initialized data and constants in the DSP memory map fall within this region.
    Please make sure this is well understood and that the three items mentioned above ("Common MPM Area", "mpm_mem" entry and DSP memory map) are aligned.
    Otherwise this may result in random failures (System RAM overlapping with DSP memory).
    By default, the use case is OpenCL, which uses only limited code and initialized data; hence the small area reserved for this.
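
    The overlap risk described here can be illustrated with a simple interval check. This is a minimal sketch that compares the System RAM range Linux reported in /proc/iomem with the DSP DDR3 usage from the earlier .map snippet. It assumes the default linear alias from 32-bit 0x8000 0000 to 36-bit 0x8 0000 0000 and that the DSP image is laid out contiguously from the DDR3 origin; overlap() is an illustrative helper.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Return the inclusive overlap of two inclusive ranges, or None if disjoint."""
    start, end = max(a_start, b_start), min(a_end, b_end)
    return (start, end) if start <= end else None


# System RAM as Linux reported it in /proc/iomem (36-bit physical addresses).
system_ram = (0x8_0000_0000, 0x8_5FFF_FFFF)

# DSP DDR3 usage from the .map snippet (origin a0000000, used 08052c08).
# Assumption: default alias 0x8000_0000 -> 0x8_0000_0000, contiguous layout.
dsp_start = 0xA000_0000 - 0x8000_0000 + 0x8_0000_0000
dsp_end = dsp_start + 0x0805_2C08 - 1

hit = overlap(*system_ram, dsp_start, dsp_end)
if hit:
    print(f"DSP image overlaps System RAM at {hit[0]:#x}-{hit[1]:#x}")
else:
    print("no overlap")
```

    With these numbers the whole DSP image falls inside the range Linux regards as System RAM, which is the situation warned about above.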
    with regards,
    Sam
  • Hello Sam,

    In the device tree I defined the "Common MPM area" to be located at 0xa000 0000 - 0xdfff ffff. The "mpm_mem" entry also covers exactly the same area.

    The CMEM block I relocated to 0xe200 0000 - 0xffff ffff as I did already before.

    The Linux boots and the "mpmcl load" works on the area starting from 0xa000 0000.

    The boot says it created a DMA pool at 0x0000000820000000, size 1024 MiB.

    I guess it is now OK. Do you agree?

    Best regards,

    Ari

  • Yes. If the memory areas are aligned, there should not be any issues.

    Sam

  • Hello Sam,

    thanks for the "quick" reply. ;-)

    The mpmcl has now been working perfectly. There is an issue only if I put something into the MSMCSRAM area in the .out files, but I can live with that. I need to put data in there only in run-time, so that's not a problem at all.

    Thank you for the great help!

    Best regards,

    Ari