Can the bootloader download the image to all of the cores?

Other Parts Discussed in Thread: SYSBIOS

I boot my C6678 from SPI NOR flash.

I have only one project, so only one .out file. I convert it to a .dat file and write it to the flash.

When the C6678 boots, will the bootloader automatically download the image to core 0 only, or to all 8 cores?

The CMD file of my project uses logical addresses, so all of the cores have the same memory map, and I don't think I need MAD.

  • Hi,

    According to sprugy5a (Bootloader), the SPI mode can process a boot table produced by the hex6x.exe utility. The boot table is composed of sections to be copied somewhere into the DSP memory, so the result depends on your boot image (boot table). If your application is allocated in shared memory (for instance MSMC), it can be executed by all the cores; otherwise it cannot. In any case, it is the code you run on core 0 that has to launch execution on the other cores:

    • If you link the code into local L2 RAM (0x00800000), it will be loaded only into the local L2 of core 0 and will not be executable by the other cores.
    • If you link the code into MSMC, it will be accessible by all cores. The RBL never downloads the image once per core; it is the address of the image that is accessible by all the cores.
    • If you link the code into DDR, it will be accessible by all the cores, but the download will fail since the RBL doesn't initialize the DDR controller.
    • If you link the code at a shared L2 RAM address, for instance 0x10800000 (core 0 L2 SRAM), it will be accessible by all the cores, but in this case you have restrictions on the use of the local L2 RAM of core 0 (it overlays the shared addresses).

    At which address have you chosen to allocate your code? MSMC? L2 RAM?

  • Thank you very much.

    1. If I link the code into local L2 RAM (0x00800000), DNUM is meaningless and the project is essentially a single-core project, and the L2 RAM of the other cores is wasted. Is that right?

    2. If I link the code into MSMC, do all 8 cores share the same stack in MSMC? And is the local L2 RAM of every core wasted (unless it is used as cache)?

    3. Since the RBL doesn't initialize the DDR, how can I link the code into DDR?

    4. If I link the code at a shared L2 RAM address, all 8 cores share the same memory and we have 8 CPUs doing the task (using DNUM). Is that right?

    5. Quote: "you have restrictions in the use of the local L2RAM of core 0". What does that mean?

  • If I have 8 projects and want to link the code of each one into the local L2 RAM of core 0 to core 7, can I do that?

  • xi su said:

    1. If I link the code into local L2 RAM (0x00800000), DNUM is meaningless and the project is essentially a single-core project, and the L2 RAM of the other cores is wasted. Is that right?

    In my reply I was talking about what the RBL can do for you. It is up to your code to do more. For instance, your start routine can copy the code into the slave cores' L2 RAM (a sort of integrated second-level boot loader) and then launch them.

    2. If I link the code into MSMC, do all 8 cores share the same stack in MSMC? And is the local L2 RAM of every core wasted (unless it is used as cache)?

    Yes, but you can allocate code in MSMC and data in L2 RAM (you have to use the ROM autoinitialization model), or you can write a start routine that uses memory aliasing to remap the data segments of each core to different memory locations (this is what I do).

    3. Since the RBL doesn't initialize the DDR, how can I link the code into DDR?

    You can include in your boot image a special section to initialize the DDR; see sprugy5a, para. 2.2.

    4. If I link the code at a shared L2 RAM address, all 8 cores share the same memory and we have 8 CPUs doing the task (using DNUM). Is that right?

    Well, this is a particular case. To really share the code, one core (the one that has the code loaded into its L2) cannot use its L2 for the data, and you have to allocate the data somewhere else. I suggest you forget about it. I mentioned it only to show you that the RBL can also load into the L2 memory of every core.

    You can link to run at one address and load at another: the RBL will load at the load address (for example shared L2 RAM) and the code will then run at the local L2 address. Your image can hold the code for all cores. The RBL doesn't care about that. Again, it is up to your startup code to do something to make it usable.


  • My God! It seems so hard! I am a newbie...

    I have set up only one project, and it runs 8 tasks, one per core, selected by DNUM. So I generate only one .hex file and boot it from SPI NOR flash.

    I hope that when the C6678 boots, the bootloader will load my code into the L2 SRAM of each of the 8 cores, so that every core has the same code and data, and then each core does its own task according to DNUM.

    What should I do to make this work? I have tried many times, but my EVM still does not work as I hope.

    SOS! In the name of charity!

  • Alberto Chessa said:
    • If you link the code into local L2 RAM (0x00800000), it will be loaded only into the local L2 of core 0 and will not be executable by the other cores.

            Are you sure it is local L2 RAM (0x00800000)? I think that if I link the code into local L2 RAM (0x00800000), it will be accessible by all the cores.

            0x00800000 is a global address, isn't it?

  • xi su said:

            Are you sure it is local L2 RAM (0x00800000)? I think that if I link the code into local L2 RAM (0x00800000), it will be accessible by all the cores.

            0x00800000 is a global address, isn't it?

    Look at the memory map in the Data Manual, para. 2.3: 0x00800000 is "Local L2 SRAM". It is the same address for every core but maps to a different, core-private memory area, while, for instance, 0x10800000 is "CorePac0 L2 SRAM" and every core can write into it (but you have cache coherency problems).

    I'm sorry if I confused things: the important thing to understand is that the RBL does nothing special from the multicore point of view. Essentially it only copies from your boot image into memory (as specified in the boot image) and launches core 0. Everything else is left to your code (and the addresses in the boot image).

    It is similar to connecting with CCS and loading the image only to core 0. Then it is up to your main() to arrange to run the other cores (you can experiment with CCS in this way). Alternatively, you can use the MAD utility to do the "dirty" work.

    First you have to choose how to use the memory: can you use L2 RAM at least for the data? The easiest way is to link the application so that the constant sections are in MSMC (.text, .switch, .const, .cinit) and the data sections are in L2 SRAM (local to every core: .bss, .far, .rodata, ...). In your main() you can launch the other cores.

    This works if you link for the ELF ABI with the ROM autoinitialization model. I have never tried with COFF, and I suppose ELF "RAM autoinitialization" doesn't work either. Try with CCS first (connecting only to core 0) and then move to RBL boot.
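
The relation between the local L2 alias and the per-core global L2 addresses mentioned above can be sketched in a few lines of C. This is only an illustration that runs on any host: the helper name `l2_local_to_global` and the macro names are hypothetical, and the 512 KB L2 size and the 0x01000000 spacing between CorePacs are taken from the addresses quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define L2_LOCAL_BASE   0x00800000u      /* "Local L2 SRAM": same address on every core */
#define L2_GLOBAL_BASE  0x10800000u      /* "CorePac0 L2 SRAM": global view of core 0 L2 */
#define L2_CORE_STRIDE  0x01000000u      /* CorePac n L2 starts at 0x1n800000 */
#define L2_SIZE         (512u * 1024u)   /* L2 size per core used in this thread */
#define NUM_CORES       8u

/* Hypothetical helper: translate a local L2 address into the global
 * address that names the same byte inside the L2 of the given core. */
static uint32_t l2_local_to_global(uint32_t local_addr, uint32_t core)
{
    assert(core < NUM_CORES);
    assert(local_addr >= L2_LOCAL_BASE);
    assert(local_addr - L2_LOCAL_BASE < L2_SIZE);
    return L2_GLOBAL_BASE + core * L2_CORE_STRIDE + (local_addr - L2_LOCAL_BASE);
}
```

For example, the local address 0x00800010 seen from core 1 corresponds to the global address 0x11800010, which any other core can use to reach that byte.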

  • Alberto Chessa said:

    I'm sorry if I confused things: the important thing to understand is that the RBL does nothing special from the multicore point of view. Essentially it only copies from your boot image into memory (as specified in the boot image) and launches core 0. Everything else is left to your code (and the addresses in the boot image).

    That's the point! I have one project, and I want to load the same code into the L2 of each of the 8 cores, and then have all the cores work together after boot.

    How can I do this? How should I link the project?

    The framework of my project looks like this:

    int main()
    {
        Init();

        switch (DNUM)   /* DNUM: number of the core executing this code */
        {
            case 0: main0(); break;
            case 1: main1(); break;
            /* ... */
        }

        return 0;
    }

  • Hi,

    As I said, it is easiest if you place the constant part (code) in MSMC and only the data in L2. Anyway, you can do something like this:

    #include <c6x.h>     // DNUM
    #include <string.h>  // memcpy

    extern void core0_main(void);
    extern void core_slave_main(void);
    extern void _c_int00(void); // ELF ABI entry point (from the RTS library): initializes the data segments and jumps to main()

    #define IPCGR_REG   ((volatile unsigned int*)0x02620240)  // IPCGR0..IPCGR7, one 32-bit register per core
    #define REBOOT_BASE 0x1087FFFF  // where the RBL looks for the entry point (core 0 window)
    #define REBOOT_GAP  0x01000000  // distance between the global L2 windows of consecutive cores

    int main_()
    {
      switch(DNUM)
      {
      case 0:
        {
          int i;
          for(i=1; i<8; ++i)  // copy core 0 L2 to core 1..7 L2 (using global addresses, not cached)
          {
            unsigned int magic=REBOOT_BASE+(REBOOT_GAP*i);
            memcpy((void*)(0x10800000+0x01000000*i), (void*)0x00800000, 512*1024);
            *((volatile unsigned int*)magic)=(unsigned int)_c_int00;  // set the slave core entry point
          }
          for(i=1; i<8; ++i)
          {
            IPCGR_REG[i]=1;  // send an IPC interrupt to the slave core
          }
          core0_main();
        }
        break;
      default:
        core_slave_main();
        break;
      }

      return 0;
    }


  • Hi,

    Sorry, in my previous message replace main_() with main().

  • Thank you very much! You helped me a lot.

    By the way...

    Alberto Chessa said:

            memcpy((void*)(0x10800000+0x01000000*i), (void*)0x00800000, 512*1024);

    This seems strange. Is there some way to let the bootloader do this task?


  • xi su said:

      memcpy((void*)(0x10800000+0x01000000*i), (void*)0x00800000, 512*1024);

      This seems strange. Is there some way to let the bootloader do this task?

    The memcpy simply copies all of core 0's L2 (local address) to the L2 of cores 1..7 using the global addresses. The destination address expands to 0x11800000 (core 1), 0x12800000 (core 2) and so on (look at the memory map of the C6678). I copy the maximum (512 KB) since I don't know exactly where the code is allocated (you can obtain that information automatically from the linker).

    WARNING: there is a mistake: you must not overwrite the last 0xD23F bytes of the L2 of each core, so fix the memcpy to copy only 512*1024-0xD23F bytes.

    I suppose you could arrange to build a boot image with 8 copies of the same code, each copy with the proper global address, but I think it requires some post-link work and gives no real benefit (IMHO).
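
The corrected copy size from the warning above can be written down as a small helper. The following is a host-side sketch rather than target code: it copies between ordinary buffers instead of real L2 addresses, and the names `l2_copy_len`, `copy_l2_image` and `L2_RESERVED_TAIL` are hypothetical. The reserved size 0xD23F is the value given in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define L2_SIZE           (512u * 1024u)  /* L2 size per core used in this thread */
#define L2_RESERVED_TAIL  0xD23Fu         /* last bytes of L2 that must not be overwritten */

/* Number of bytes that may be copied, counted from the start of L2. */
static size_t l2_copy_len(void)
{
    return (size_t)(L2_SIZE - L2_RESERVED_TAIL);
}

/* Copy the usable part of one L2 image to another and leave the reserved
 * tail of the destination untouched. Both pointers refer to the start of
 * an L2-sized region. */
static void copy_l2_image(uint8_t *dst, const uint8_t *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, l2_copy_len());
}
```

On the target, `dst` would be the global L2 address of a slave core and `src` the local L2 address of core 0; the copy length works out to 470465 bytes.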

  • Thank you! But excuse me...

    Alberto Chessa said:
    • If you link the code into local L2 RAM (0x00800000), it will be loaded only into the local L2 of core 0 and will not be executable by the other cores.

           If I link the code into local L2 RAM (0x00800000) and then memcpy it to the other cores, the code will be executable by the other cores. Is that right?

    Alberto Chessa said:
    • If you link the code into MSMC, it will be accessible by all cores. The RBL never downloads the image once per core; it is the address of the image that is accessible by all the cores.

            If I link the code into MSMC, I should place the stack in local L2 RAM and then copy it to the other cores. Is that right? I think it will fail if all the cores use one stack.

    Alberto Chessa said:
    • If you link the code at a shared L2 RAM address, for instance 0x10800000 (core 0 L2 SRAM), it will be accessible by all the cores, but in this case you have restrictions on the use of the local L2 RAM of core 0 (it overlays the shared addresses).

             I cannot understand this. If I link the code at a shared L2 RAM address, for instance 0x10800000, I think it will only be executed by core 0. Even if I copy the L2 of core 0 to the other cores, the other cores cannot work either, because the address is linked to core 0.

  • xi su said:

           If I link the code into local L2 RAM (0x00800000) and then memcpy it to the other cores, the code will be executable by the other cores. Is that right?

    I was talking about what the RBL does. It is up to your code to make it accessible (that is, to copy it to the L2 RAM of the other cores).

            If I link the code into MSMC, I should place the stack in local L2 RAM and then copy it to the other cores. Is that right? I think it will fail if all the cores use one stack.

    All the data sections must be placed in local L2. The stack is an uninitialized area and does not need to be copied. The other initialized data sections (such as .bss, .far, ...) are initialized by _c_int00 (ELF ABI, with the ROM autoinitialization option of the linker). So you don't need to copy the data.

             I cannot understand this. If I link the code at a shared L2 RAM address, for instance 0x10800000, I think it will only be executed by core 0. Even if I copy the L2 of core 0 to the other cores, the other cores cannot work either, because the address is linked to core 0.

    If you link at a global address, every core can fetch from it, so every core can execute the code (you don't have to copy it). As in the other cases, the data sections should be placed in local RAM.

  • Thank you!

    Alberto Chessa said:

    All the data sections must be placed in local L2. The stack is an uninitialized area and does not need to be copied. The other initialized data sections (such as .bss, .far, ...) are initialized by _c_int00 (ELF ABI, with the ROM autoinitialization option of the linker). So you don't need to copy the data.

    So the data sections must be linked to local L2 RAM (0x00800000) and cannot be linked to 0x1x800000 in my project. Is that right?

    Alberto Chessa said:

             I cannot understand this. If I link the code at a shared L2 RAM address, for instance 0x10800000, I think it will only be executed by core 0. Even if I copy the L2 of core 0 to the other cores, the other cores cannot work either, because the address is linked to core 0.

    If you link at a global address, every core can fetch from it, so every core can execute the code (you don't have to copy it). As in the other cases, the data sections should be placed in local RAM.

    If I link at a global address and write the _c_int00 address of core 0 to the magic address of the other cores at boot, then each core will do the same thing. Is my description right? But there is a problem: all 8 cores will share the same stack, because _c_int00 will initialize B15 with the global address, not the local L2 address.

  • xi su said:

    So the data sections must be linked to local L2 RAM (0x00800000) and cannot be linked to 0x1x800000 in my project. Is that right?

    Yes. All the data sections (including stack, heap and .rodata) must be mapped to a private area.

    xi su said:

    If I link at a global address and write the _c_int00 address of core 0 to the magic address of the other cores at boot, then each core will do the same thing. Is my description right? But there is a problem: all 8 cores will share the same stack, because _c_int00 will initialize B15 with the global address, not the local L2 address.

    As said before, you HAVE TO map the data to a private area that has the same logical address for each core, that is, local L2 SRAM (you can use the MPAX registers to use another area, but it is more complicated). So _c_int00 initializes B15 with the local L2 address: it will be the same logical value for each core but will map to different physical memory.

    Note that if you use a core's global L2 RAM, you have to take care not to overlap the local L2 used by the data. For instance, if you map code at 0x10800000 and data at 0x00800000, it doesn't work since the code and data of core 0 overlap (but the code and data of cores 1..7 don't). You have, for instance, to map code at 0x10800000 and data at 0x00800000+size_of_code, where size_of_code can be a constant or calculated at link time.
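
The overlap condition described above can be checked mechanically. The sketch below is host-side C with hypothetical names (`global_to_l2_offset`, `code_data_overlap`); it assumes the 512 KB L2 size and the 0x01000000 spacing between CorePacs used elsewhere in this thread, and it assumes the data address lies inside the local L2 window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define L2_LOCAL_BASE   0x00800000u
#define L2_GLOBAL_BASE  0x10800000u
#define L2_CORE_STRIDE  0x01000000u
#define L2_SIZE         (512u * 1024u)
#define NUM_CORES       8u

/* Offset of a global address inside the L2 of `core`, or -1 if the
 * address does not belong to that core's L2. */
static int64_t global_to_l2_offset(uint32_t global_addr, uint32_t core)
{
    uint32_t base = L2_GLOBAL_BASE + core * L2_CORE_STRIDE;
    if (global_addr < base || global_addr - base >= L2_SIZE)
        return -1;
    return (int64_t)(global_addr - base);
}

/* True if code linked at a global L2 address collides, on `core`, with
 * data linked at a local L2 address. */
static bool code_data_overlap(uint32_t code_global, uint32_t code_size,
                              uint32_t data_local, uint32_t data_size,
                              uint32_t core)
{
    assert(core < NUM_CORES);
    assert(data_local >= L2_LOCAL_BASE);

    int64_t code_off = global_to_l2_offset(code_global, core);
    if (code_off < 0)
        return false;  /* the code lives in another core's L2 */

    int64_t data_off = (int64_t)data_local - (int64_t)L2_LOCAL_BASE;
    return code_off < data_off + (int64_t)data_size &&
           data_off < code_off + (int64_t)code_size;
}
```

With code at 0x10800000 and data at 0x00800000 the check reports a collision on core 0 only; moving the data to 0x00800000 plus the code size removes it.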

  • Thank you very much!

    You are so patient and have helped me a lot!

  • Hi,

    This seems like an old post, but I'm currently facing a similar issue.

    I'm using an EVM6678L. I created a new project, copied the code over and made some modifications, but I receive an error when I debug.

    Also, why is REBOOT_BASE 0x1087FFFF? According to the data sheet, CorePac0 L2 SRAM is 0x10800000 - 0x1087FFFF, so shouldn't REBOOT_BASE be 0x10800000 instead? And why is IPCGR_REG 0x02620240, when the chip-level registers are at 0x02620000 - 0x026207FF?
    PS: I've tried changing REBOOT_BASE to 0x10800000 but I still receive the error.

    My code:

    int main()
    {
      int i;

      IPC_start();
      platform_uart_init(); // etc.: platform init code, platform write and so on
      if (MultiProc_self() == 0)
      {
        for(i=1; i<8; ++i)  // copy core 0 L2 to core 1..7 L2 (using global addresses, not cached)
        {
          memcpy((void*)(0x10800000+0x01000000*i), (void*)0x00800000, 512*1024);
          unsigned int magic=REBOOT_BASE+(REBOOT_GAP*i);
          *((volatile unsigned int*)magic)=(unsigned int)_c_int00;
        }
        for(i=1; i<8; ++i)
        {
          IPCGR_REG[i]=1;  // send an IPC interrupt to the slave core
        }
        master_Main(); // set led0 to blink
      }
      else
        slave_Main(); // led 1/2/3 blinks when core 1/2/3 runs; ignore 4/5/6/7 for now
    }

    My .cfg file:

    //respective var
    Program.sectMap [".const"] = "MSMCSRAM";
    Program.sectMap [".text"] = "MSMCSRAM";
    Program.sectMap [".code"] = "MSMCSRAM";
    Program.sectMap [".data"] = "MSMCSRAM";
    Program.sectMap [".system"] = "MSMCSRAM";
    Program.sectMap ["platform_lib"] = "MSMCSRAM";

    When I debug, I resume core 0 (no error), then core 1 (no error yet). But from core 2 onwards, every core I resume outputs the following error.

    Error:
    A12=0x802938 A13=0x804594
    A14=0x804588 A15=0x804590
    A16=0x8045dc A17=0x0
    A18=0x8045d6 A19=0x0
    A20=0x480010 A21=0xc06c0732
    A22=0x5e0308 A23=0x300d
    A24=0x40040 A25=0x4498001d
    A26=0x2084844 A27=0x3001c000
    A28=0x8080000 A29=0x20
    A30=0x803128 A31=0x0
    B0=0x0 B1=0x0
    B2=0x804444 B3=0x2
    B4=0xc03316d B5=0x2
    B6=0x0 B7=0xff
    B8=0x0 B9=0x8014
    B10=0x804588 B11=0x804590
    B12=0x802938 B13=0x804594
    B14=0xc022830 B15=0x804548
    B16=0x30 B17=0x8045d4
    B18=0x8045c0 B19=0x1
    B20=0x8045b4 B21=0x8045a4
    B22=0x8045bc B23=0x8045ac

    B24=0x8045b8 B25=0x8045a4
    B26=0x1 B27=0x8045cc
    B28=0x8045a8 B29=0x8045c8
    B30=0x804598 B31=0x8045a0
    NTSR=0x8045a0
    ITSR=0x8045a0
    IRP=0x804590
    SSR=0xc00a414
    AMR=0x8044c8
    RILC=0xd
    ILC=0x804580
    Exception at 0x0
    EFR=0x2 NRP=0x0
    Internal Exception: IERR=0x1
    Instruction fetch exception
    ti.sysbios.family.c64p.Exception:line 255: E_exceptionMin: pc=0x0c032974, sp=00804548

    What is this error I'm facing? The code looks fine to me; even the address and the magic address look correct. I hope someone can clarify this for me.

    Regards,
    Jon

  • Jonathan Lin said:

    Also, why is REBOOT_BASE 0x1087FFFF? According to the data sheet, CorePac0 L2 SRAM is 0x10800000 - 0x1087FFFF, so shouldn't REBOOT_BASE be 0x10800000 instead?

    REBOOT_BASE is where the RBL (ROM Boot Loader) looks for the start address of cores 1..7.

    The RBL waits for an IPC interrupt, then loads the address from REBOOT_BASE and jumps to it. So it must be at 0x1087FFFF+(0x01000000*core_num). This requirement is imposed by the Texas Instruments RBL.

    And why is IPCGR_REG 0x02620240, when the chip-level registers are at 0x02620000 - 0x026207FF?

    I don't understand the problem: IPCGR_REG is one of the chip-level registers and it is located at 0x02620240 (that is, inside the chip-level register range).

    Program.sectMap [".const"] = "MSMCSRAM";

    Program.sectMap [".text"] = "MSMCSRAM";
    Program.sectMap [".code"] = "MSMCSRAM";
    Program.sectMap [".data"] = "MSMCSRAM";
    Program.sectMap [".system"] = "MSMCSRAM";
    Program.sectMap ["platform_lib"] = "MSMCSRAM";

    The scenario of this discussion assumes that your application runs from L2 RAM, so each core runs in its own private L2 RAM. The initialization code copies the L2 of core 0 to the L2 RAM of the slave cores to make the code and the other initialized sections ready for use by the slave cores.

    In your example all application segments except the stacks are mapped in MSMCSRAM, that is, all the cores use the same memory. The const section can usually be shared (like the .text section), but sharing the data sections can give rise to various issues, depending on what the application is doing.
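
The two addresses core 0 touches to wake a slave core, as described in this reply, can be computed as follows. This is a host-side sketch with hypothetical helper names (`ipcgr_addr`, `boot_magic_addr`). One assumption goes beyond the thread: since the entry point is written as a 32-bit word, the sketch aligns REBOOT_BASE down to a 4-byte boundary (0x1087FFFC for core 0) instead of using the byte address 0x1087FFFF directly. Please check the device data manual for the exact boot magic address before relying on it.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CORES       8u
#define IPCGR0_ADDR     0x02620240u  /* IPCGR_REG in the thread */
#define REBOOT_BASE     0x1087FFFFu  /* last byte of CorePac0 L2, as used in the thread */
#define REBOOT_GAP      0x01000000u  /* spacing between the global L2 windows */

/* Address of the IPC generation register of `core`; the thread indexes
 * them as consecutive 32-bit registers starting at IPCGR0. */
static uint32_t ipcgr_addr(uint32_t core)
{
    assert(core < NUM_CORES);
    return IPCGR0_ADDR + 4u * core;
}

/* Word-aligned address where the entry point for `core` is stored.
 * Assumption: the entry point occupies the last 32-bit word of L2. */
static uint32_t boot_magic_addr(uint32_t core)
{
    assert(core < NUM_CORES);
    return (REBOOT_BASE & ~3u) + core * REBOOT_GAP;
}
```

Both results stay inside the ranges discussed here: `ipcgr_addr` stays within 0x02620000 - 0x026207FF, and `boot_magic_addr` stays within the global L2 window of the selected core.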

  • Alberto Chessa said:

    REBOOT_BASE is where the RBL (ROM Boot Loader) looks for the start address of cores 1..7.

    The RBL waits for an IPC interrupt, then loads the address from REBOOT_BASE and jumps to it. So it must be at 0x1087FFFF+(0x01000000*core_num). This requirement is imposed by the Texas Instruments RBL.

    Yeah, I sort of figured that out, haha. But the data sheet states that the CorePac0 L2 starts at 0x10800000 and ends at 0x1087FFFF. So what I was trying to find out is: why is REBOOT_BASE the last address?

    I don't understand the problem: IPCGR_REG is one of the chip-level registers and it is located at 0x02620240 (that is, inside the chip-level register range).

    My bad, I must be blind; I must have missed it in the data sheet.

    Program.sectMap [".const"] = "MSMCSRAM";

    The scenario of this discussion assumes that your application runs from L2 RAM, so each core runs in its own private L2 RAM. The initialization code copies the L2 of core 0 to the L2 RAM of the slave cores to make the code and the other initialized sections ready for use by the slave cores.

    In your example all application segments except the stacks are mapped in MSMCSRAM, that is, all the cores use the same memory. The const section can usually be shared (like the .text section), but sharing the data sections can give rise to various issues, depending on what the application is doing.

    Well, I changed some of them to DDR3. It works okay so far. Though when I moved my code from main into a task, I seem to get an exception error. But I suppose that belongs in another thread. Thanks anyway =]

  • Jonathan Lin said:

    Yeah, I sort of figured that out, haha. But the data sheet states that the CorePac0 L2 starts at 0x10800000 and ends at 0x1087FFFF. So what I was trying to find out is: why is REBOOT_BASE the last address?

    The data sheet reports the L2 memory address range (start and end), not the address at which the code starts running.

    I suppose the RBL designers chose the last address because application code usually uses the memory from the start. By placing it at the end, the RBL makes it unlikely to overlap a location that is in use.

    Well, I changed some of them to DDR3. It works okay so far. Though when I moved my code from main into a task, I seem to get an exception error. But I suppose that belongs in another thread. Thanks anyway =]

    It is not so important whether you map onto MSMC or onto DDR. The important thing is to use separate addresses for each core. If you use the same memory, the task created by core 1 will overwrite the memory of the task created by core 2 (and so on).
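
One simple way to give each core its own addresses inside a shared memory is to split the region into equal slices indexed by the core number. The helper below is a hypothetical host-side sketch (`per_core_slice_base` is not a TI API); on the target the core index would come from DNUM, and the region base and size would come from the linker or platform configuration.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CORES 8u

/* Start address of the slice of a shared region reserved for `core`,
 * when the region is divided into NUM_CORES equal slices. */
static uint32_t per_core_slice_base(uint32_t region_base,
                                    uint32_t region_size,
                                    uint32_t core)
{
    assert(core < NUM_CORES);
    uint32_t slice_size = region_size / NUM_CORES;
    return region_base + core * slice_size;
}
```

As a usage example, and assuming a 4 MB shared region starting at 0x0C000000, core 2 would get the slice starting at 0x0C100000, so data placed there is not overwritten by the other cores.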

  • Oh wow, thanks for the detailed explanation.