Efficient C6748 booting?

Other Parts Discussed in Thread: OMAP-L138, OMAPL138

Hi,

We have developed our own custom C6748 board for one of our products. We are using the following software components:

- DSP/BIOS 5.41

- CCS 5.2

- Codegen tools 7.3.4

- AISgen 1.9

- Modified version of the CCS NORWriter flasher app that came with OMAP-L138_FlashAndBootUtils_2_40

Our application is now nearing completion, and we need an efficient way of booting our image. Given our *.out file, I have managed to generate a *.bin image with AISgen, which is then written to flash using the modified NORWriter app. However, we found that the initial image did not boot. The reason is that we configured the PLL for the maximum CPU frequency (456 MHz) in AISgen, but due to hardware limitations the core voltage is still somewhere below 1.3 V when the DSP attempts a NOR boot. Because the PLL is configured for >100 MHz, the operating point is undefined, and the boot fails.

The simplest solution is to initially configure the PLL in AISgen for <100 MHz and then reconfigure the PLL for 456 MHz in the application itself. The problem with this is that the actual image then takes far too long to boot, which is unacceptable from the user's perspective.

The better option is thus to write a secondary bootloader, which first delays for a few milliseconds to allow the core voltage to stabilize at 1.3 V, then reconfigures the PLL for 456 MHz, copies the main application image from flash to RAM, and finally branches to the main application.

This brings me to my questions:

1) I'm assuming I will/can only use AISgen to prepare the image for my secondary bootloader?

2) Can my secondary bootloader also be a DSP/BIOS application?

3) How should I prepare the image of my main DSP/BIOS application, given the *.out file? I am assuming I should use the hex6x.exe utility to generate a HEX image to be written to flash? I am aware that section 11 of this document, http://www.ti.com/lit/ug/spru186w/spru186w.pdf, provides the details of hex6x.exe, but since I'm inexperienced with this utility, I would really appreciate some pointers as to which format to use.

Your help would be greatly appreciated!

Kind regards

Reinier

  • Hi Reinier,

    From a software perspective, the easiest approach would have been to set up your hardware for a 1.3 V core voltage at boot, so that 456 MHz could be configured directly from AISgen. Given that this is not achievable, you will need to create a secondary bootloader that allows the hardware to bring the core voltage to 1.3 V before switching to the 456 MHz required by your application.

    My recommendation would be to start with the StarterWare bootloader and use the out2rprc tool for your application, as described here:

    http://processors.wiki.ti.com/index.php/OMAPL138_StarterWare_Booting_And_Flashing

    This is a non-BIOS secondary bootloader that can parse the RPRC image produced by out2rprc and load the sections of your application binary into device memory. The difference between this tool and hex6x is that hex6x will just give you a flat binary that you copy from flash to device memory, whereas most applications have multiple sections and may have holes between them.

    Please also refer to the pointers from Dean, Brandon and Norman in this thread, where the same topic is being discussed:
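
    To put rough numbers on the effect of holes, the following sketch compares the bytes needed for a single contiguous image with the bytes needed when only the section payloads are stored. The Section type, the function names, and the example addresses are assumptions made for illustration; they are not taken from hex6x or out2rprc.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loadable section. */
typedef struct {
    uint32_t load_addr; /* where the section must end up in RAM */
    uint32_t size;      /* section length in bytes */
} Section;

/* Bytes needed when only the section payloads are stored (holes skipped). */
uint32_t packed_size(const Section *s, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += s[i].size;
    return total;
}

/* Bytes needed for one contiguous image that spans from the lowest
 * section start to the highest section end, holes included. */
uint32_t flat_size(const Section *s, size_t n)
{
    if (n == 0)
        return 0;
    uint32_t lo = s[0].load_addr;
    uint32_t hi = s[0].load_addr + s[0].size;
    for (size_t i = 1; i < n; i++) {
        uint32_t end = s[i].load_addr + s[i].size;
        if (s[i].load_addr < lo)
            lo = s[i].load_addr;
        if (end > hi)
            hi = end;
    }
    return hi - lo;
}
```

    For two sections of 4 KB and 8 KB placed 1 MB apart, packed_size() gives 12 KB while flat_size() gives a little over 1 MB, which is the extra flash space and copy time a flat image would cost in that case.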

    http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/115/p/244700/863160.aspx#863160

    Please let us know if you have any follow-up questions.

    Regards,

    Rahul

  • Rahul,

    Thank you for your response.

    I'm assuming you're suggesting that I do not use a DSP/BIOS-based secondary bootloader?

    An additional requirement of our bootloader, which I perhaps should have mentioned earlier, is that in-field updates should be possible over the UART. Basically, we will provide the client with a PC app that connects to our product over an RS-232 port. Given the latest image of our code, the secondary bootloader will receive the new image over the UART and write it to flash. If the bootloader does not detect the PC app within a certain time period, it will simply continue to boot the main application from flash. We may also have multiple applications stored in flash and boot different applications, depending on some configuration.

    The implication of this requirement is that I would probably have to make significant changes (i.e. hacking) to the StarterWare bootloader to accommodate it. Although this option is not off the table, I would prefer to write the secondary bootloader from scratch and streamline it for our purpose, as I believe this will make the code easier to maintain in the long run. However, I do plan to at least use the UART driver from StarterWare.
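
    The detection window described above can be sketched as a bounded polling loop. Everything here is hypothetical: the HostPollFn hook stands in for whatever UART handshake check the bootloader ends up using, max_polls stands in for a real time-out, and poll_countdown exists only so the logic can be exercised on a PC.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { BOOT_APPLICATION, BOOT_UPDATE } BootAction;

/* Hypothetical hook: returns true once the PC tool's handshake was seen. */
typedef bool (*HostPollFn)(void *ctx);

/* Poll for the host a bounded number of times; if it never shows up,
 * fall through to booting the application from flash. */
BootAction choose_boot_action(HostPollFn poll, void *ctx, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (poll(ctx))
            return BOOT_UPDATE;
    }
    return BOOT_APPLICATION;
}

/* Stand-in hook for host testing: reports the host after *ctx polls. */
bool poll_countdown(void *ctx)
{
    uint32_t *remaining = (uint32_t *)ctx;
    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

    On the target, the poll hook would check the UART receive path, and the loop bound would more likely be derived from a timer than from a raw iteration count.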

    Can you please elaborate on the advantages of using out2rprc over hex6x?

    By a flat binary, I'm assuming you mean that the hex6x utility fills the holes between sections with some value, which means that the image stored in flash is larger than necessary and that unnecessary data is copied to the device's RAM during boot? Is it not possible to generate a contiguous binary image with hex6x?

    Regards

      Reinier

  • Reinier,

    Yes, I am suggesting that you write the secondary bootloader using low-level code such as StarterWare rather than as a DSP/BIOS-based application, to keep things simple.

    Also, your understanding of a flat binary is what I intended in the first place. George elaborates on the hex6x utility in the following forum thread, which you may want to refer to. There is additional information on this tool in Chapter 11 of the C6000 compiler documentation.

    http://e2e.ti.com/support/development_tools/compiler/f/343/t/80866.aspx

    Out2rprc does a little more heavy lifting than the hex6x tool. It organizes the sections in the output binary and attaches a header to the binary image. The header starts with a key word that marks the start of the image and contains information such as the number of sections in the binary, the start address of each section in flash, and the load address of each section. If you refer to the file OMAPL138_StarterWare_1_10_03_03\bootloader\src\bl_copy_rprc.c, you will get an idea of the format of the binary generated by the out2rprc tool.

    If you have the device memory to support a flat binary, you could choose to use the hex6x utility. For the most efficient way to generate a binary using hex6x, you may want to post on the compiler forums, as I haven't used the utility in a while and there may be new features in the tool that I have overlooked.

    As far as field upgrades are concerned, I think that once you have the custom bootloader finalized, you should only plan to replace the application binary, as the custom bootloader will be generic enough to load all kinds of application images. Currently the secondary bootloader in StarterWare hard-codes the image offset for the application. You could introduce custom control logic there to pick from different image offsets based on the images you have in your flash. As you might already know, we provide a Windows utility for flashing images over UART as part of the serial flash and boot utilities. To enable this on your custom hardware, you will need a provision to switch between UART boot mode (for flashing over UART) and flash boot mode (for booting your application).
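
    A loader for a sectioned image of this kind might look roughly like the sketch below. The ImgHeader and ImgSection layouts, the load_image name, and the simulated RAM window are simplifications assumed for illustration; the actual RPRC layout is the one in bl_copy_rprc.c. The magic value is likewise an assumption (the ASCII key word "RPRC").

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMG_MAGIC 0x43525052u /* assumed key word: "RPRC" as little-endian ASCII */

/* Simplified, hypothetical image header. */
typedef struct {
    uint32_t magic;         /* key word marking the start of the image */
    uint32_t entry;         /* address to branch to after loading */
    uint32_t section_count; /* number of section records that follow */
} ImgHeader;

/* Simplified, hypothetical per-section record, followed by its payload. */
typedef struct {
    uint32_t load_addr; /* destination address of the payload */
    uint32_t size;      /* payload length in bytes */
} ImgSection;

/* Parse img and copy each section into a RAM window that starts at
 * ram_base. Returns 0 and sets *entry on success, -1 on a malformed
 * image or a section that falls outside the window. */
int load_image(const uint8_t *img, size_t img_len,
               uint8_t *ram, uint32_t ram_base, size_t ram_len,
               uint32_t *entry)
{
    ImgHeader hdr;
    size_t pos = 0;

    if (img_len < sizeof hdr)
        return -1;
    memcpy(&hdr, img, sizeof hdr);
    pos += sizeof hdr;
    if (hdr.magic != IMG_MAGIC)
        return -1;

    for (uint32_t i = 0; i < hdr.section_count; i++) {
        ImgSection sec;
        if (img_len - pos < sizeof sec)
            return -1;
        memcpy(&sec, img + pos, sizeof sec);
        pos += sizeof sec;

        if (img_len - pos < sec.size || sec.load_addr < ram_base)
            return -1;
        size_t off = sec.load_addr - ram_base;
        if (off > ram_len || ram_len - off < sec.size)
            return -1;

        /* Only the payload is copied; holes between sections are skipped. */
        memcpy(ram + off, img + pos, sec.size);
        pos += sec.size;
    }
    *entry = hdr.entry;
    return 0;
}
```

    On the target, the RAM window would be replaced by direct writes to the load addresses, and img would point at the image in flash.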
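
    The control logic for picking between several stored images could be as small as the sketch below. The ImageSlot table, its valid flag, and the fallback to the first valid slot are assumptions for illustration; the StarterWare bootloader itself uses a single hard-coded offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry describing one application image in flash. */
typedef struct {
    uint32_t flash_offset; /* start of the image in flash */
    uint8_t  valid;        /* nonzero if the slot holds a usable image */
} ImageSlot;

/* Pick the requested slot if it is valid, otherwise fall back to the
 * first valid slot. Returns 0 and sets *offset on success, -1 if no
 * slot is usable. */
int select_image(const ImageSlot *slots, size_t n, size_t wanted,
                 uint32_t *offset)
{
    if (wanted < n && slots[wanted].valid) {
        *offset = slots[wanted].flash_offset;
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        if (slots[i].valid) {
            *offset = slots[i].flash_offset;
            return 0;
        }
    }
    return -1;
}
```

    The wanted index would come from whatever configuration source the product uses, for example a flag stored in flash or a jumper read over GPIO.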

    Hope this helps.

    Regards,

    Rahul

  • Hi Rahul,

    After some experimenting, I managed to write a bootloader that is able to successfully boot an out2rprc image. However, I found that some of my simple DSP/BIOS test apps boot fine, while others don't. I narrowed it down to DSP/BIOS apps that make use of TSK_sleep(). My bootloader is in fact using Timer0 for its delay, so I suspect that it does not leave the timer peripheral in a proper state for when DSP/BIOS initializes.

    Can you please explain the proper way to "clean-up" the timers (or any other peripherals the bootloader might use) before transferring control to the main DSP/BIOS application? If possible, some sample code would also be very helpful!

    Regards

      Reinier

  • Rahul,

    I've sorted out the issue with the timer peripheral: I simply wrote the default values of the peripheral registers back, as specified in the Technical Reference Manual.

    I've managed to successfully boot the entire image (about 11 MB in flash), but the boot time is still too slow (around 12 seconds). I've made some optimizations in the copy loop of bl_copy_rprc.c, but I believe the answer lies in using the cache. I've managed to turn on the L1 cache with no problems (which brought the boot time down from 19 seconds to the current 12 seconds).

    However, when I turn on L2 cache, the main app won't boot anymore. My main app is DSP/BIOS based and I know DSP/BIOS uses some of the IRAM for its functionality, so I can't use the entire L2 as cache. I moved most of the DSP/BIOS objects to either L3 or DDR and I want to at least configure L2 cache as 128KB. I'm using the cache functions implemented in cache.c in StarterWare.
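
    A clean-up of that kind can be sketched as follows. The TimerRegs struct is a simplified stand-in and not the real register layout, and the all-zero defaults are an assumption; the actual offsets and reset values should be taken from the Technical Reference Manual.

```c
#include <stdint.h>

/* Simplified stand-in for a timer register block (not the real layout). */
typedef struct {
    volatile uint32_t TIM12; /* counter, lower half */
    volatile uint32_t TIM34; /* counter, upper half */
    volatile uint32_t PRD12; /* period, lower half */
    volatile uint32_t PRD34; /* period, upper half */
    volatile uint32_t TCR;   /* timer control */
    volatile uint32_t TGCR;  /* timer global control */
} TimerRegs;

/* Write the assumed reset values back so that the next owner of the
 * timer sees it as if it had never been used. */
void timer_restore_defaults(TimerRegs *t)
{
    t->TCR   = 0; /* stop counting before touching anything else */
    t->TGCR  = 0; /* assumed default: both timer halves held in reset */
    t->TIM12 = 0;
    t->TIM34 = 0;
    t->PRD12 = 0;
    t->PRD34 = 0;
}
```

    On the target, t would point at the timer's base address. Any interrupt status the bootloader may have left pending would need to be cleared as well, which this sketch does not cover.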
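
    One common copy-loop optimization, not necessarily the one applied here, is to move 32-bit words instead of single bytes whenever both pointers are word aligned. The sketch below shows the idea; copy_words is a hypothetical name, and whether it helps in practice depends on the flash interface width and on the compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy len bytes from src to dst, using 32-bit accesses when both
 * pointers are 4-byte aligned and byte accesses for the remainder. */
void copy_words(void *dst, const void *src, size_t len)
{
    uint8_t *d = (uint8_t *)dst;
    const uint8_t *s = (const uint8_t *)src;

    if ((((uintptr_t)d | (uintptr_t)s) & 3u) == 0) {
        uint32_t *dw = (uint32_t *)d;
        const uint32_t *sw = (const uint32_t *)s;
        size_t words = len / 4;

        for (size_t i = 0; i < words; i++)
            dw[i] = sw[i];

        d += words * 4;
        s += words * 4;
        len -= words * 4;
    }

    /* Tail bytes, or the whole buffer if it was not aligned. */
    while (len > 0) {
        *d++ = *s++;
        len--;
    }
}
```

    Section payloads and load addresses would need to be word aligned for the fast path to be taken.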

    I enable the cache with the following call:

    CacheEnable(L1PCFG_L1PMODE_32K | L1DCFG_L1DMODE_32K | L2CFG_L2MODE_256K);

    I have also added the following functions to invalidate L2 cache:

    void CacheInvL2All (void)
    {
        All(DSPCACHE_L2INV);
    }

    void CacheWBInvAll (void)
    {
        All(DSPCACHE_L2WBINV);
    }

    I call these functions just before I branch to the main application to invalidate the L2 cache.

    Can you please explain the proper way of configuring and invalidating the L2 cache, or am I missing something else here?

    Also, is there some additional way of optimizing the copying from flash to RAM?

    Regards

      Reinier

  • Hello,

    I am also currently developing a fast second-stage bootloader to boot from SPI flash. I am using a Micron 25Q032A flash, which can be used in quad mode, meaning that a 4-bit data bus can be used instead of the 1-bit SPI data bus. Moreover, it supports clocks up to 108 MHz. So this is possibly the fastest solution you can get.

    I am going to use the StarterWare bootloader as a starting point: C:\Program Files\Texas Instruments\pdk_C6748_2_0_0_0\C6748_StarterWare_1_20_03_03\bootloader

    Some questions have arisen while digging through the e2e forum.

    1. According to http://processors.wiki.ti.com/index.php/OMAPL138_StarterWare_Booting_And_Flashing the bootloader must be generated using AISgen and the application must be generated using out2rprc. Does this mean that AISgen sets up pinmuxing, PSC, etc. and the application uses the same settings that were set up in the AISgen bootloader image? For example, if I would like to run the bootloader at 456 MHz (to boot as fast as possible), then this must be set up correctly with AISgen, and before giving control to the application the clock should be changed to 300 MHz if that is needed.

    2. I can see that the StarterWare bootloader does not use interrupts. If I would like to use an interrupt-driven solution, is there any fundamental limitation that restricts a bootloader to non-interrupt-driven code? For example, with microcontrollers there is typically a problem with maintaining separate interrupt vector tables for the bootloader and for the app. I changed the StarterWare bootloader to use SPI0 instead of SPI1, but for some reason the SPI clock signal is not correct. I made the same changes to C:\Program Files\Texas Instruments\pdk_C6748_2_0_0_0\C6748_StarterWare_1_20_03_03\examples\evmC6748\spi\spiflash.c, which uses an interrupt-driven solution, and there it seemed OK. So, to get going quickly, I would like to use the same interrupt-driven approach.

    I attached a stripped StarterWare bootloader that only uses SPI0 (it was SPI1 before) and UART1 for logging: http://e2e.ti.com/cfs-file.ashx/__key/communityserver-discussions-components-files/42/6433.tr_2D00_bootloader.7z Maybe some configuration went wrong during the migration from SPI1 to SPI0 SCS pin 3. PS: currently I am using a GEL file to configure all the settings.

    Andres

  • I think I have found answers to almost all my questions. First, I used a different SPI clock with the interrupt-based application: 2 MHz instead of the 20 MHz used in the bootloader. That is why the bootloader's clock signal looked messed up: my logic analyzer was not fast enough. The second mistake was the SCS pin: I used SCS_3, but I actually meant to use SCS_0, because the RBL of course must use the SCS_0 pin.

    After those fixes I am able to boot my application using the modified StarterWare bootloader.

    1. The bootloader binary was generated with AISgen (the entry point was 0xC1080000; it was 0 in the example. I guess it specifies where the secondary bootloader image is loaded).
    2. The application image was generated using out2rprc.exe.
    3. The images were uploaded to flash via JTAG using SPIWriter. For the app image load address I used 0x100000, and for the app image entry point address I used C1080000. Both numbers are in hex, but SPIWriter does not accept the 0x... notation.
    Andres
  • With a 300 MHz DSP clock, the SPI clock can be at most 50 MHz according to this thread: http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/t/50281.aspx

    As an easy way to improve the boot time, I changed my SPI clock to 50 MHz in the StarterWare secondary bootloader.

    I set the logic analyzer trigger on the UART TX pin so I could measure the boot time. What I saw was a bit strange: there is a 0.8 s gap between "Jumping to StarterWare Application...\r\n\n" and the first output from the main application, which is actually a SYS/BIOS application.
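
    The divider setting and the resulting raw transfer time can be estimated with a few lines of arithmetic. The sketch assumes the usual relation SPI clock = module clock / (prescale + 1) and, as an example value, a 150 MHz SPI module clock; the function names are hypothetical, and command and address overhead on the bus is ignored.

```c
#include <stdint.h>

/* Smallest prescale value whose SPI clock does not exceed max_hz,
 * assuming spi_clk = module_hz / (prescale + 1).
 * The caller must pass max_hz > 0. */
uint32_t spi_prescale_for(uint32_t module_hz, uint32_t max_hz)
{
    if (module_hz <= max_hz)
        return 0;
    /* Round the divider up so the result never exceeds max_hz. */
    uint32_t divider = (module_hz + max_hz - 1) / max_hz;
    return divider - 1;
}

/* SPI clock that results from a given prescale value. */
uint32_t spi_clock_hz(uint32_t module_hz, uint32_t prescale)
{
    return module_hz / (prescale + 1);
}

/* Raw time in microseconds to shift `bytes` over a 1-bit SPI bus. */
uint64_t transfer_time_us(uint32_t bytes, uint32_t spi_hz)
{
    uint64_t bits = (uint64_t)bytes * 8u;
    return bits * 1000000u / spi_hz;
}
```

    With these assumptions, a 50 MHz limit gives a prescale of 2, and reading a 617 KB image at 50 MHz takes roughly 0.1 s of raw bus time.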

    int main(void)
    {
        /* Configures PLL and DDR controller */
        //BL_PLATFORM_Config();

        /* Initializing the UART instance for serial communication. */
        UARTStdioInit();
        UARTPuts("StarterWare ", -1);
        UARTPuts(deviceType, -1);
        UARTPuts(" Boot Loader\n\r", -1);

        /* Copies application from non-volatile flash memory to RAM */
        ImageCopy();
        UARTPuts("Jumping to StarterWare Application...\r\n\n", -1);

        /* Do any post-copy config before leaving boot loader */
        BL_PLATFORM_ConfigPostBoot();

        /* Giving control to the application */
        appEntry = (void (*)(void)) entryPoint;
        (*appEntry)( );

        return 0;
    }
    
    
    Is it normal for a SYS/BIOS application to take this long to start? I would expect calling (*appEntry)( ); to hand control immediately to the main application (currently built with the debug configuration), which runs from DDR.
    PS: I am using the C6748.
    Could I get some advice on how to find out what is causing this "gap"?
    Andres
  • The test mentioned above was done using JTAG, meaning that I ran the secondary bootloader code from within CCS; in this configuration I saw the 0.8 s gap. However, things are much slower with a full boot (RBL -> secondary bootloader -> app): then the gap is 2.8 s instead of 0.8 s.

    I am very worried about this. I could do the initial read from SPI flash to DDR very quickly if the 4-bit interface is used, but I have no idea where this 2.8 s gap comes from.

    I will try to figure out whether this is application-specific behavior (next I will try a simple StarterWare app instead of the complex SYS/BIOS one).

    Andres

  • I will continue the monologue.

    SPIWriter is used to load the bootloader and the app. When it asks for the app image entry point address, I give it this address from the .map file:

    ENTRY POINT SYMBOL: "_c_int00"  address: c185d360

    The map file shows that it refers to:

    c185d360    000000a0     boot.ae674 : boot.oe674 (.text:_c_int00)

    As far as I understand, the bootloader does not call main directly; some initialization happens before control is given to main.

    I also tried a simple StarterWare app, and it started immediately after the call from the bootloader. Then I tried another toy SYS/BIOS app whose image was made bigger using some very large arrays, giving a 670 KB image, which is a little larger than my actual production image (617 KB). I noticed that the size did not matter much: the app started in 0.19 s at 68 KB and in 0.23 s at 670 KB (these times are measured from the bootloader's jump; SPI loading time is not included).

    So something special must be happening in my app's boot.ae674; otherwise I do not know where this 2.8 s delay comes from.

  • Andres,

    I do not understand what you are trying to achieve here; you are not really asking any questions.

    You hijacked the thread to have a conversation with yourself, and in the process you are spamming me with every post you make, since I was the original poster.

    Please start your own thread!