Write Virtual Address to PLE c11, c5 error

Dear supporters:

For performance reasons, I'd like to copy data from RAM into a Cortex-A8 L2 cache way via the Preload Engine (PLE).

I am currently stuck on writing the virtual address to the PLE (3.2.64 c11). 

I assume that the general-purpose registers r0, r1, ..., rN can be used to pass arguments to an assembly function, so I wrote a simple *.asm file that writes the start address to the PLE c5 register, as below:

.global Write_PLEStartAddress

Write_PLEStartAddress:

mcr p15,#0,r0,c11,c5,#0

The address of the buffer lut_data, which is the argument passed in the call

Write_PLEStartAddress(lut_data);

is 0x80240158 (r0 shows this value correctly in the CCS register view).

However, after writing the PLE start address and reading it back from the register with the instructions

mrc p15,#0,r0,c11,c5,#0

bx lr

I got 0x80240178 or 0x8024016C, but never 0x80240158.

Does anyone have any idea why this is happening?

My platform info: TMS320DM8148 (Vision-Mid) 

600-MHz ARM® Cortex™-A8 RISC MPU

500-MHz C674x™ VLIW DSP

200-MHz M3-ISS/M3-HDVPSS 

Best regards,

Joey from Altek

  • I have no expertise with chip-level details like the preload engine.  It looks like you are having trouble writing an assembly routine that you call from C.

    I presume you are using the TI ARM compiler.  That means you are building according to the EABI (embedded application binary interface).  That interface is standard for all ARM compilers.  A web search on "ARM EABI calling convention" yields this document from ARM as the first result.  That document contains all the details on writing an assembly routine that can be called by any ARM compiler, including the TI ARM compiler.  The important detail is that the first argument is passed in R0.

    Hope this helps ...

    -George
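To make the calling-convention point concrete, here is a minimal C-side sketch. It assumes the write and the readback are split into two assembly routines; `Read_PLEStartAddress` and `ple_start_selfcheck` are hypothetical names that do not appear in the thread. The two function bodies are host-only stand-ins so that the sketch compiles off-target. On the device they would be replaced by the assembly routines containing the `mcr` and `mrc` instructions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * C-side view of the assembly routines. Under the ARM procedure call
 * standard the first integer or pointer argument arrives in r0, and a
 * 32-bit return value is handed back in r0.
 */
void     Write_PLEStartAddress(const void *start);  /* start arrives in r0 */
uint32_t Read_PLEStartAddress(void);                /* hypothetical; result in r0 */

/* Host-only stand-in for the c11, c5 register (assumption for illustration). */
static uint32_t fake_ple_start;

/* Stand-in body; on the device this is the "mcr ... ; bx lr" routine. */
void Write_PLEStartAddress(const void *start)
{
    fake_ple_start = (uint32_t)(uintptr_t)start;
}

/* Stand-in body; on the device this is the "mrc ... ; bx lr" routine. */
uint32_t Read_PLEStartAddress(void)
{
    return fake_ple_start;
}

/* Write a start address, read it back, and trap if the two values differ. */
void ple_start_selfcheck(const void *start)
{
    Write_PLEStartAddress(start);
    assert(Read_PLEStartAddress() == (uint32_t)(uintptr_t)start);
}
```

On the host stand-in the check always passes. On hardware showing the behaviour described in the first post, the assertion in `ple_start_selfcheck` would fire.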

  • Hi, experts:

    According to the Cortex-A8 spec, revision r3p2, the start address requires 64-bit alignment. How can I align the virtual address of an array that was assigned by the OS in the first place?

    If I prepare another buffer whose address is 64-bit aligned, doesn't that cost extra CPU cycles to move the data back and forth? Could someone suggest a better solution to the alignment issue?

    Thanks in advance,

    Joey from Altek
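On the alignment question above, one possible approach is to request the alignment where the buffer is defined, so that no second buffer or copy is needed. The sketch below assumes a C11 compiler (`alignas` from `<stdalign.h>`). A toolchain without C11 support would need its own mechanism, for example the DATA_ALIGN pragma that the TI compilers document. `PLE_ALIGN`, `is_aligned`, `align_up` and `lut_data_aligned` are hypothetical names used only for illustration, and the value 8 simply restates the 64-bit figure quoted in the post.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define PLE_ALIGN 8u  /* 64 bits = 8 bytes, taken from the post above */

/* Non-zero when addr is a multiple of align (align must be a power of two). */
int is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align (align must be a power of two). */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical table, aligned at its definition so it never has to be copied. */
alignas(PLE_ALIGN) uint8_t lut_data_aligned[256];
```

With these helpers, `is_aligned(0x80240158u, PLE_ALIGN)` is non-zero, because 0x80240158 is already a multiple of 8. For memory obtained at run time, `align_up` can be applied to the start of a slightly over-sized allocation instead of copying the data into a second buffer.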

  • Joey,

    I see this text is extracted from the ARM TRM:

    http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344k/DDI0344K_cortex_a8_r3p2_trm.pdf

    8.4.1 Configuring the preload engine

    Thus the right place to ask about this text is the ARM forum.

    I can also point you to an example of programming the Cortex-A8 on the OMAP3 device; it should be similar for the Cortex-A8 on the DM814x device:

    http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/716/p/341640/1193445.aspx#1193445

    Regards,
    Pavel

  • Here is a thread from the ARM forum that discusses the DM814x Cortex-A8 PLE and L2:

    http://community.arm.com/thread/2135

    Regards,
    Pavel