TMS570LS3137 HDK - Running software into SDRAM via the EMIF - Stuck in prefetchEntry state at first instruction

Other Parts Discussed in Thread: TMS570LS3137, HALCOGEN, TMS570LC4357, RM57L843

Hello,

I am using the TMS570LS3137 HDK with CCS version 6.0.1.00040 and HALCoGen 04.01.00. I want to run a Dhrystone benchmark from the 8MB SDRAM on the HDK.

Everything looks fine up to the call: the map file has the correct entries, and after the copy the SDRAM contains the same code as the flash.

However, when I call the function by name, the program ends up in prefetchEntry. Following the call sequence in the disassembly window, the trampoline is called and then jumps to prefetchEntry instead of my function.

My project is based on the "example_emif_sdram" example provided by HALCoGen:
• EMIF timings, clock and pinmux are OK
• The MPU is enabled for the SDRAM region, type: STRONGLYORDERED_SHAREABLE | permission: PRIV_RW_USER_RW_EXEC
• Dhrystone runs correctly when executed normally


The offending assembly:

123         DMIPS = dhrystone();
          $C$L3:
0000829c:   EB000260 BL              $Tramp$AA$L$PI$$dhrystone
000082a0:   EE10CA10 VMOV            R12, S0
000082a4:   E58DC560 STR             R12, [R13, #1376]

...

          $Tramp$AA$L$PI$$dhrystone():
00008c24:   E300C000 MOVW            R12, #0
00008c28:   E348C000 MOVT            R12, #32768     
00008c2c:   E12FFF1C BX              R12             ; branch to 0x80000000 (first SDRAM address)

...and then execution is stuck in prefetchEntry.
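The trampoline exists because a plain BL cannot reach from flash to 0x80000000 (its range is roughly ±32MB), so the linker builds the full 32-bit target in R12 with a MOVW/MOVT pair. As a rough sketch of what those two instructions compute, the C below models them; the helper names `movw`, `movt` and `trampoline_target` are made up for illustration and do not come from the project.

```c
#include <stdint.h>

/* MOVW Rd, #imm16: writes the low halfword and clears the upper halfword. */
static uint32_t movw(uint16_t imm16)
{
    return (uint32_t)imm16;
}

/* MOVT Rd, #imm16: writes the upper halfword and keeps the low halfword. */
static uint32_t movt(uint32_t reg, uint16_t imm16)
{
    return (reg & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}

/* Models the trampoline from the disassembly:
 *   MOVW R12, #0
 *   MOVT R12, #32768
 *   BX   R12
 * and returns the address that BX would branch to. */
static uint32_t trampoline_target(void)
{
    uint32_t r12 = movw(0u);
    r12 = movt(r12, 32768u);
    return r12;
}
```

So the trampoline itself is doing the right thing: it branches to the first SDRAM address, and the fault happens on the instruction fetch at that address.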

Here is the relevant code:



Linker


MEMORY
{
    VECTORS (X)  : origin=0x00000000 length=0x00000020
    FLASH0  (RX) : origin=0x00000020 length=0x0017FFE0
    FLASH1  (RX) : origin=0x00180000 length=0x00180000
    STACKS  (RW) : origin=0x08000000 length=0x00001500
    RAM     (RW) : origin=0x08001500 length=0x0003EB00

/* USER CODE BEGIN (2) */
    SDRAM  (RWX) : origin=0x80000000 length=0x02000000
/* USER CODE END */
}

/* USER CODE BEGIN (3) */
/* USER CODE END */


/*----------------------------------------------------------------------------*/
/* Section Configuration                                                      */

SECTIONS
{
    .intvecs : {} > VECTORS
    .text    : {} > FLASH0 | FLASH1
    .const   : {} > FLASH0 | FLASH1
    .cinit   : {} > FLASH0 | FLASH1
    .pinit   : {} > FLASH0 | FLASH1
    .bss     : {} > RAM
    .data    : {} > RAM
    .sysmem  : {} > RAM


/* USER CODE BEGIN (4) */
    .dhrystone_section : RUN = SDRAM, LOAD = FLASH0 | FLASH1
        LOAD_START(DhrystoneLoadStart), LOAD_END(DhrystoneLoadEnd), LOAD_SIZE(DhrystoneSize),
        RUN_START(DhrystoneStartAddr), RUN_END(DhrystoneEndAddr)
/* USER CODE END */
}



sys_main.c


#include "sys_common.h"

/* USER CODE BEGIN (1) */
#include <string.h>
#include <stdio.h>
#include "system.h"
#include "sci.h"
#include "rti.h"
#include "emif.h"
#include "sys_core.h"
#include "user_log.h"
#include "dhry.h"
#include "sys_pmu.h"
#include "sys_mpu.h"

extern float dhrystone();
extern uint32 DhrystoneLoadStart;
extern uint32 DhrystoneLoadEnd;
extern uint32 DhrystoneSize;
extern uint32 DhrystoneStartAddr;
extern uint32 DhrystoneEndAddr;

void main(void)
{
    int i;
    float DMIPS;
    uint32 size=(uint32)&DhrystoneSize;

    _disable_interrupt_();          /* Firstly disable all pending interrupts */
    _esmCcmErrorsClear_();          /* Clear pending CCM ESM errors */
    sciInit();
    emif_SDRAMInit();               /* Initializes the emif Driver for SDRAM */
    _pmuInit_();
    _pmuEnableCountersGlobal_();
    _pmuSetCountEvent_(pmuCYCLE_COUNTER, PMU_CYCLE_COUNT); // PMU_INST_ARCH_EXECUTED
    _enable_interrupt_();           /* enable IRQ and FIQ */

    for (i = 0; i < size; i++)
    {
        ((char *)&DhrystoneStartAddr)[i] = ((char *)&DhrystoneLoadStart)[i];
    }

    i = 0;
    while (1)
    {
        DMIPS = dhrystone();
        printf("DHRYSTONE %3d : %5.2f DMIPS", i, DMIPS);
        i++;
    }

/* USER CODE END */
}



Dhrystone.c


#pragma CODE_SECTION(dhrystone, ".dhrystone_section")
float dhrystone()
{

...
return DMIPS;
}



I hope you can help me find out why execution gets stuck at this instruction.

(I edited the original post for aesthetic purposes.)

  • Benjamin,
    Thank you for the detailed information.
    One question for you is this: Doesn't your SDRAM section overlap STACK and RAM sections? If so, I wonder if you are over-writing stuff you need.
    Best Regards,
    Kevin Lavery
  • You should confirm that the code is copied to the SDRAM correctly first.

    Make sure you can open a memory window to view SDRAM and see the code get copied.

    If you have a timing problem or configuration problem in your SDRAM setup in HalCoGen, it may either get copied incorrectly or with errors.


    The HDK also has a jumper on it which physically disables the SDRAM; you should make sure this jumper is off.


    Performance when running code from SDRAM is going to be poor on this device. SDRAM is included mainly for data storage / large buffers for DMA-based peripherals.

    If you need to execute code from SDRAM, the TMS570LC4357 or RM57L843 are the parts I would suggest from the Hercules family. The 32KB instruction cache makes execution from SDRAM a lot more practical.

    -Anthony

  • Kevin Lavery said:

    One question for you is this: Doesn't your SDRAM section overlap STACK and RAM sections? If so, I wonder if you are over-writing stuff you need.
     

    Hi Kevin,

    Thanks for your response.

    According to the MEMORY section of my linker command file, the SDRAM does not seem to overlap any of the other regions. RAM and STACKS start at base address 0x0800 0000, and the SDRAM is at 0x8000 0000.

    Hi Anthony,

    The memory window and a software check are both OK: the code is copied correctly from "DhrystoneLoadStart" (0x0000 0020, the first address of FLASH0) to "DhrystoneStartAddr" (0x8000 0000, the first address of SDRAM).

    for (i = 0; i < size; i++)
    {
        ((char *)&DhrystoneStartAddr)[i] = 0x00;
    }
    UART_PRINT("Memory reseted");
    for (i = 0; i < size; i++)
    {
        ((char *)&DhrystoneStartAddr)[i] = ((char *)&DhrystoneLoadStart)[i];
    }
    UART_PRINT("Memory Copied");
    for (i = 0; i < size; i++)
    {
        if (((char *)&DhrystoneStartAddr)[i] != ((char *)&DhrystoneLoadStart)[i])
        {
            UART_PRINT("BAD COPY !!!");
        }
    }
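The copy-then-compare check above can also be folded into one helper that reports where the first mismatch occurs, which tends to be more useful than a repeated "BAD COPY" message when diagnosing EMIF timing problems. This is only a sketch: `copy_and_verify` is a hypothetical name, and the destination is declared volatile on the assumption that the read-back should really hit the external memory rather than be optimized away.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies `size` bytes from the load address to the run address, then reads
 * the destination back and compares it byte by byte.
 * Returns `size` if every byte matches, otherwise the index of the first
 * mismatching byte. */
static size_t copy_and_verify(volatile uint8_t *run, const uint8_t *load, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++)
    {
        run[i] = load[i];
    }

    for (i = 0; i < size; i++)
    {
        if (run[i] != load[i])
        {
            return i;   /* first byte that did not survive the copy */
        }
    }

    return size;        /* all bytes match */
}
```

On the target it would presumably be called with the linker-generated symbols from the original post, for example `copy_and_verify((volatile uint8_t *)&DhrystoneStartAddr, (const uint8_t *)&DhrystoneLoadStart, (size_t)&DhrystoneSize)`, since the value of each of those symbols is carried by its address.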

    The jumper is also clear.

    Regarding the hardware question: I am aware that the EMIF is not well suited to running code (about 10x slower), but I need to measure that performance.

    I did some digging through online ARM abort troubleshooting guides and read that a Prefetch Abort occurs when you try to execute code from a non-existent memory region. Why is 0x8000 0000 considered a non-existent memory region? Is it because the EMIF address width is 16 bits?

    After the abort occurs:

    LR (R14) = 0x80000004
    CPSR     = 0x60000397

    I hope this information is useful for solving my problem.
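Those two register values can be decoded by hand. A minimal sketch in C follows, assuming the standard ARMv7 CPSR layout (mode in bits 4:0, T in bit 5, I in bit 7) and the usual rule that on a Prefetch Abort the banked LR holds the address of the aborted instruction plus 4. The helper names are invented for this example.

```c
#include <stdint.h>

#define CPSR_MODE_MASK  0x1Fu        /* bits 4:0: processor mode   */
#define CPSR_MODE_ABT   0x17u        /* Abort mode                 */
#define CPSR_T_BIT      (1u << 5)    /* Thumb state                */
#define CPSR_I_BIT      (1u << 7)    /* IRQ masked                 */

/* Extracts the processor mode field from a CPSR value. */
static uint32_t cpsr_mode(uint32_t cpsr)
{
    return cpsr & CPSR_MODE_MASK;
}

/* Nonzero if the CPSR value shows the core in Abort mode. */
static int cpsr_is_abort_mode(uint32_t cpsr)
{
    return cpsr_mode(cpsr) == CPSR_MODE_ABT;
}

/* Nonzero if the CPSR value shows the core in Thumb state. */
static int cpsr_is_thumb(uint32_t cpsr)
{
    return (cpsr & CPSR_T_BIT) != 0u;
}

/* Nonzero if IRQs are masked in the given CPSR value. */
static int cpsr_irq_masked(uint32_t cpsr)
{
    return (cpsr & CPSR_I_BIT) != 0u;
}

/* For a Prefetch Abort, LR_abt = address of the aborted instruction + 4. */
static uint32_t prefetch_abort_address(uint32_t lr_abt)
{
    return lr_abt - 4u;
}
```

Applied to the values above, CPSR 0x60000397 decodes to Abort mode in ARM state with IRQs masked, and LR 0x80000004 points back to 0x80000000. That is consistent with the very first instruction fetched from SDRAM being the one that aborted.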

    Best regards,

    Benjamin.

  • This may be the culprit.

    From section A3.5.7 of the ARMv7-A/R Architecture Reference Manual:

     Any instruction fetch must access only Normal memory. If it accesses Device or Strongly-ordered
    memory, the result is UNPREDICTABLE. For example, instruction fetches must not be performed to an
    area of memory that contains read-sensitive devices, because there is no ordering requirement
    between instruction fetches and explicit accesses.

    In that case, if you want to execute from SDRAM, you would need to change the memory type to Normal.

  • And if you do change to normal, you probably need a DSB barrier instruction between the last data write after the 'copy' and the first instruction fetch...
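As a rough illustration of that suggestion, the sketch below copies the code and then issues the barriers before anything branches into the copied region. It assumes GCC/Clang-style inline assembly, which is not what the TI compiler in CCS uses; with the TI toolchain a small assembly routine would be the likely equivalent. The ISB after the DSB is my own addition to discard anything already prefetched. `sync_after_code_copy` and `load_code` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Makes sure the copied code is visible to instruction fetch.
 * On 32-bit ARM builds this issues DSB then ISB; on any other host it only
 * acts as a compiler barrier so the sketch still builds and runs there. */
static inline void sync_after_code_copy(void)
{
#if defined(__arm__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ volatile ("dsb\n\tisb" ::: "memory");
#elif defined(__GNUC__) || defined(__clang__)
    __asm__ volatile ("" ::: "memory");
#endif
}

/* Copies a code image from its load address to its run address and
 * synchronizes before the caller branches to it.
 * Returns the run address for convenience. */
static void *load_code(void *run_addr, const void *load_addr, size_t size)
{
    memcpy(run_addr, load_addr, size);
    sync_after_code_copy();
    return run_addr;
}
```

The point of the design is ordering only: every data write of the copy completes before the first instruction fetch from the destination. It does not change memory attributes or permissions.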
  • Nice one, that could have been the solution.

    I tried these different configurations in HALCoGen:

    type : STRONGLYORDERED_SHAREABLE | Permission : PRIV_RW_USER_RW_EXEC

    type : NORMAL_OINC_NONSHARED     | Permission : PRIV_RW_USER_RW_EXEC

    type : NORMAL_OINC_SHARED        | Permission : PRIV_RW_USER_RW_EXEC

    None of these seem to work.

    I attached my HALCoGen config: 8446.HalCoGen Config RunInSDRAM.zip

    Edit: I just saw your DSB suggestion. I was not aware of this behavior. I tried the following, with no result:

        .text
        .arm
        .def _DSB_
        .asmfunc
    _DSB_
            dsb
            bx      lr      ; return to the caller
        .endasmfunc

  • Hi Benjamin,

    Would it be possible for you to send the whole project so we can try to execute / debug it?
  • Sure,

    Here is my project: 6177.Benchmark_In_Memory.zip

    Thank you very much for your help, by the way.

  • Hi Benjamin,

    Thanks. It was easy to reproduce.

    To me it looks like the issue is that the MPU isn't initialized. It's configured OK (I believe) in HALCoGen, but I don't see a call to _mpuInit_, and in fact this function doesn't even appear to be linked into the object.

    When I ran the code, the Instruction Fault Status Register (IFSR) read 0xD, which indicates a permission fault.

    But then, checking the CP15 System Control Register, bit 0 is '0', indicating that the MPU isn't enabled.

    If you check the Cortex-R4 TRM, there is a table listing the default memory map used when the MPU is not enabled.
    The default memory map has the address range of our EMIF marked Execute Never.

    I didn't try fixing the issue, but I think adding a call to _mpuInit_() will solve the problem.

    -Anthony
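The three checks described above can be written down as small decoding helpers that take raw register values, for example as read in the CCS register view. This is a sketch under stated assumptions: the IFSR status is taken as bit 10 plus bits 3:0, only a few status codes are named, and the 0x80000000 boundary for Execute Never in the default map is my reading of Anthony's description and should be checked against the Cortex-R4 TRM table. All function names are hypothetical.

```c
#include <stdint.h>

#define SCTLR_M_BIT  0x1u   /* CP15 System Control Register bit 0: MPU enable */

/* Nonzero if the MPU is enabled according to the given SCTLR value. */
static int mpu_enabled(uint32_t sctlr)
{
    return (sctlr & SCTLR_M_BIT) != 0u;
}

/* Builds the 5-bit fault status from IFSR bit 10 and bits 3:0. */
static uint32_t ifsr_status(uint32_t ifsr)
{
    return ((ifsr >> 6) & 0x10u) | (ifsr & 0x0Fu);
}

/* Names a few fault status codes; everything else is reported as "other". */
static const char *ifsr_status_name(uint32_t ifsr)
{
    switch (ifsr_status(ifsr))
    {
    case 0x00u: return "background";
    case 0x01u: return "alignment";
    case 0x0Du: return "permission";
    default:    return "other";
    }
}

/* Assumption: with the MPU disabled, addresses from 0x80000000 upward are
 * Execute Never in the default memory map. */
static int default_map_execute_never(uint32_t addr)
{
    return addr >= 0x80000000u;
}
```

With the values from this thread, an IFSR of 0xD decodes to a permission fault, an SCTLR with bit 0 clear means the MPU is off, and the SDRAM base 0x80000000 falls in the Execute Never part of the default map. On the target, the corresponding fix would be to call `_mpuInit_()` during startup, before the first branch into SDRAM.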
  • Anthony,

    Thank you very much!
    That was indeed my problem.

    I wasn't aware that the MPU needed to be initialized.

    (1) I see that the "example_emif_sdram.c" provided by HALCoGen is written without MPU initialization. Is this an omission?

    ________________________________________________________________________________
    I have two more questions, though maybe they deserve their own topic...
    - I have to develop an application using this TMS570 with a 128Mb SDRAM via the EMIF.
    - I need a way to access the Embedded Trace Macrocell (ETM) traces.

    As far as I can tell:
    - ETM[8:31] and the EMIF share the same pins, so I can't have full access to the ETM signals.
    - I saw on this forum that with such a configuration (19 ETM pins and a 330MHz CPU) the FIFO overflows when tracing full data accesses.
    - The Hercules products do not include an Embedded Trace Buffer (ETB), so I can't access the traces via JTAG.

    (2) Can you confirm that it is possible to access some of the ETM traces with only 8 pins?
    (3) Is there a way to allocate a memory space as an ETB (buffer)?




    Best Regards,

    Benjamin
  • Hi Benjamin,

    Great questions. Unfortunately, though, you already have a good understanding of the issue.

    I'll look into whether the example has an omission or not. I think the MPU initialization may not be *needed* for that example, but it is probably a good idea given the issue you ran into.

    You've got a clear understanding of EMIF vs. ETM, I'm afraid. There is no on-chip ETB and no way to direct the ETM port at RAM.
    The ETM has 8 dedicated data lines, so you'll get lots of FIFO overruns if you try to trace everything.

    In your specific case, if you are actually executing from SDRAM, your program may run so slowly on the 3137 that 8 trace data pins would be fine. I think this is probably the case.

    The 330MHz device actually has been designed for EMIF || ETM, but in a larger package that we have not yet designed :(. In the 337ZWT it has the same limitations on ETM/EMIF.

    Best Regards,
    Anthony