IPU load time

Hi There,


I am using DRA7xx platform (Jacinto 6) for automotive application enablement.

We are using the IPU2 for early video decode and we observed strange timing behaviour in IPU2 firmware loading.

Version - GLSDK 6.04.00.02 and corresponding IPC version is ipc_3_21_00_07.

Actual issue -

Loading dra7-ipu2-fw.xem and receiving the rpmsg-dce message takes ~650 ms. Is that fixed, or is there some opportunity to optimize it? We are not using the late attach feature, as that is not an option for us.

It seems most of the time is spent in M4 startup, before it reaches the M4-side main function. Is this due to the loading of several static modules, or to SYS/BIOS initialization time?

Any help would be great to reduce this load and trigger time.

Thanks & Regards,

-Harshit

  • Hi Harshit,

    I am not familiar with the GLSDK itself, so I'll leave it to someone else to comment on that aspect, but from an IPC standpoint there are some things you can look into.

    The loader in remoteproc is pretty much provided as-is in the Linux kernel, so there is not much you can do to speed it up afaik, short of contributing to the kernel code yourself. However, you can play some tricks, like putting the image file in faster memory (e.g. a ramdisk). On the IPU2 side, if as you said most of the time is spent in having the M4 run from the entry point all the way to main(), you should look into optimizing the SYSBIOS configuration to make sure it is not doing anything extra at startup. For example, you can disable HWI stack initialization by setting Hwi.initStackFlag to false: http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/t/290700.aspx. Or you can look into placing the SYSBIOS code into faster memory, and see if it makes a significant difference in your application.

    Best regards,

    Vincent

  • Hi Vincent,

    Thanks a lot for the response.

    All Linux-side optimizations have already been attempted, with only small improvements.

    Hwi.initStackFlag does not exist in the M4 Hwi module (ti.sysbios.family.arm.m3.Hwi).

    The M4-side stack and heap sizes are already set to their most optimal values. The binary is stripped and only 600 KB, yet the roughly 600 ms from the A15 triggering the load to the M4 landing at main() seems too much.

    All our code and data sections are placed in DDR only.

    Regards,

    Harshit

  • Hi Harshit,

    My bad. Yes you are correct that the M4 Hwi module does not have the initStackFlag property. Sorry for the confusion.

    Could you elaborate on how you are doing the time measurements? The binary size may affect the load time, but not the time it takes for the IPU2 to run from its entry point to main() - how did you figure out that the latter is the primary contributor to the delay? Assuming all of your measurements are accurate, here are some more suggestions on how to reduce this time:

    1. The .const section in the IPU executable must be read-only.
    2. Turn off cinit compression. E.g.

      ipu_tgt.lnkOpts.prefix += " --cinit_compression=off";
    3. Set heap sections as no-zero-init. E.g.

      Program.sectMap[".systemHeap"].type = "NOINIT";
    4. Ensure that the IPU is running at normal operating point (not some low-power mode).
    5. Review your IPU's .cfg file to avoid bringing in any unnecessary modules, which reduces the amount of time spent in initialization.

    Best regards,

    Vincent

  • Hi,

    Optimizations 2 & 3 are already there in the IPU code.

    For (4), the IPU is always clocked at 425 MHz (212.8 MHz per core).

    One query:

    What is the current measured time, and what is the required target time?

    Regards

    Pradeep

  • Hi Pradeep,

    The current measured time is around 600 ms, and we should be able to achieve our milestone if it comes down to somewhere between 300 and 400 ms. I assume that should be possible.

    Vincent,

    We are logging the 32-bit "COUNTER_32 Timer" register (address 0x4AE0 4030) at each stage, from A15 boot through firmware load, IPU boot, and landing at main(). We have also put logs in the Startup function of each module to lay down a full trace.

    Thanks a lot again for useful tips.

    Regards,

    Harshit
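
The measurement approach described above can be sketched as follows. This is a minimal illustration only, assuming the counter is free-running, 32 bits wide, and ticks at 32.768 kHz (the rate is an assumption not stated in the thread and should be checked against the device TRM). The names COUNTER_HZ, elapsed_ticks, and ticks_to_ms are hypothetical helpers, not part of any TI API.

```c
#include <stdint.h>

/* ASSUMPTION: the counter ticks at 32.768 kHz; verify against the TRM. */
#define COUNTER_HZ 32768u

/*
 * Ticks elapsed between two raw counter samples.
 * Unsigned 32-bit subtraction stays correct across a single wraparound.
 */
static inline uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a tick count to milliseconds, rounded to the nearest ms. */
static inline uint32_t ticks_to_ms(uint32_t ticks)
{
    /* 64-bit intermediate avoids overflow in ticks * 1000. */
    return (uint32_t)(((uint64_t)ticks * 1000u + COUNTER_HZ / 2u) / COUNTER_HZ);
}
```

Sampling the counter on the A15 just before the firmware load is triggered and again on the M4 at the top of main(), then passing both samples through these helpers, gives the interval under discussion. At the assumed rate one tick is about 30.5 µs, which is fine enough resolution for delays of hundreds of milliseconds.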

  • Hi Harshit,

    To give you and others on this forum thread an update, internally we have verified that boot time on the IPU can be reduced from 600 ms to approximately 150 ms by having SYSBIOS turn on the cache earlier in its own boot sequence. This fix is available today in SYSBIOS 6.37.04 and 6.41.00, and is planned to be made available on SYSBIOS 6.40.04 later this month as well.

    Best regards,

    Vincent