Hi,
I am trying to use NEON on an OMAP3530 on a DevKit8000 board. I run my program without an OS, directly from the U-Boot bootloader. My toolchain has worked fine so far. Now, when I try to use NEON, my program hangs on NEON instructions. Debugging with an XDS100v2 debugger showed that the program runs up to a NEON instruction (VLD1.32 {D16, D17}, [R7], to be precise) and then jumps to address 0x00014004, which is the Undefined Instruction exception handler (CPSR.M == 0x1B after the jump). NEON is definitely turned on.
The strange thing is that the program's behavior depends on the location and size of the array I want to work with. I haven't found a clear rule, but I managed to run the program almost successfully with a small array on the stack: it hung on the second NEON instruction, not the first. In one case the program couldn't be run at all: U-Boot hung after the GO command without printing "## Starting application at 0x80000000". It looks like a stack problem, but without NEON instructions the program works fine. I tried both NEON intrinsics and auto-vectorization, but the result is the same.
I initialize NEON with this function:
void NeonInit(void)
{
    RegisterSet(&PM_PWSTCTRL_MPU, 0x3, 2, 16);  // L2 Cache memory is ON when domain is ON
    RegisterSet(&PM_PWSTCTRL_MPU, 1, 1, 8);     // L2 Cache memory is retained when domain is in RETENTION state
    RegisterSet(&CM_CLKSTCTRL_NEON, 0, 2, 0);   // Automatic transition of clock state is disabled
    RegisterSet(&PM_PWSTCTRL_MPU, 1, 1, 2);     // Logic & L1 Cache are retained when domain is in RETENTION state
    RegisterSet(&PM_PWSTCTRL_MPU, 0x3, 2, 0);   // Power state control: ON
    RegisterSet(&PM_WKDEP_NEON, 1, 1, 1);       // NEON domain is woken up upon MPU domain wake-up
    RegisterSet(&PM_PWSTCTRL_NEON, 0x3, 2, 0);  // Power state control: ON
    if (((CM_IDLEST_NEON >> 0) & 0x1) == 1)
        print("\r\nNEON standby mode");
    else
        print("\r\nNEON is active");
    if (((PM_PWSTST_NEON >> 0) & 0x3) == 3)
        print("\r\nNEON Power is ON");
    else
        print("\r\nNEON Power is OFF");
}
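RegisterSet is not shown in this post. Judging from the calls above, it appears to take (register address, field value, field width in bits, bit offset) and to perform a read-modify-write of that bit field. The following is only a sketch of that assumed behavior, under the hypothetical name RegisterSetSketch, not the actual implementation:

```c
#include <stdint.h>

/* Hypothetical stand-in for RegisterSet (assumed semantics, not the real code):
 * write the low `width` bits of `value` into the field that starts at bit
 * `shift` of *reg, leaving all other bits unchanged. */
void RegisterSetSketch(volatile uint32_t *reg, uint32_t value,
                       unsigned width, unsigned shift)
{
    /* Build a mask of `width` ones; guard the width == 32 case, because
     * shifting a 32-bit value by 32 is undefined in C. */
    uint32_t mask = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint32_t tmp = *reg;              /* read */

    tmp &= ~(mask << shift);          /* clear the target field */
    tmp |= (value & mask) << shift;   /* insert the new field value */
    *reg = tmp;                       /* write back */
}
```

With these assumed semantics, RegisterSetSketch(&reg, 0x3, 2, 16) sets bits 17:16 of reg to 0b11 and touches nothing else.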
A function with NEON instructions looks like this:
void NeonArrayInit(int * xOutputArray, int * xInitValArray, int xLength)
{
    int i;
    printc('a');
    uint32x4_t z4 = vld1q_u32((uint32_t *) xInitValArray);
    printc('b');
    uint32_t *ptrz = (uint32_t *) xOutputArray;
    printc('c');
    for (i = 0; i < (xLength / 4); i++)
    {
        vst1q_u32(ptrz, z4);
        ptrz += 4;
    }
    printc('d');
}
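For comparison while debugging, the same operation can be written without NEON. This is a sketch under the hypothetical name ArrayInitReference. It copies the first four values of the init array into the output in blocks of four and, like the NEON loop above, leaves the last xLength % 4 elements untouched:

```c
#include <stdint.h>

/* Hypothetical plain-C reference for NeonArrayInit (name is illustrative).
 * Fills `out` with repeating copies of init[0..3], four elements at a time.
 * Only whole blocks of four are written, matching the NEON loop, so the
 * trailing (length % 4) elements keep their previous contents. */
void ArrayInitReference(int *out, const int *init, int length)
{
    int i;
    int whole = (length / 4) * 4;   /* number of elements in full blocks */

    for (i = 0; i < whole; i++)
        out[i] = init[i % 4];
}
```

Running both versions on the same buffers and comparing the results would show whether a NEON store completed before the exception was taken.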
I compile the code with Sourcery G++ Lite:
...
arm-none-eabi-gcc -c MPU_neon.c -o MPU_neon.o -Wall -march=armv7-a -mtune=cortex-a8 -mfpu=neon -ftree-vectorize -mfloat-abi=softfp
...
arm-none-eabi-gcc main.o startup.o [other files] MPU_neon.o -T script.ls -o main.out -Xlinker -Map=main.map.txt -mfpu=neon -ftree-vectorize -mfloat-abi=softfp
arm-none-eabi-objcopy -O binary main.out main.bin
The linker script looks like this:
SECTIONS
{
    . = 0x80000000;
    .main . : { main.o }
    .text : { *(.text) }
    .data : { *(.data) }
    .bss :
    {
        PROVIDE (__bss_start = .);
        *(.bss)
        PROVIDE (__bss_end = .);
    }
    .bss_f : { bss_file.o (.bss)}
    .stack ALIGN(256) :
    {
        . += 0x1000;
        PROVIDE (_stack = .);
        PROVIDE (_stack_top = .);
    }
}
These settings work fine when I don't use NEON instructions. I tried switching the compiler to gcc-none-linux-gnueabi, because I saw that U-Boot is compiled with it, but it produced many errors, so I gave up.
The program compiles without any errors or warnings. I don't see anything unusual in the MAP file.
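One thing NeonInit above does not touch is coprocessor access. On ARMv7-A, NEON instructions are treated as undefined unless CP10 and CP11 have full access in CPACR and the EN bit is set in FPEXC. Whether U-Boot leaves these enabled on this board is an assumption that would need checking. Below is a minimal diagnostic sketch with hypothetical helper names. The decode functions are plain C, and the register reads are compiled only for ARM targets:

```c
#include <stdint.h>

/* CPACR bits [23:20]: access rights for coprocessors CP10 and CP11 (VFP/NEON). */
#define CPACR_CP10_CP11_MASK (0xFu << 20)
/* FPEXC bit 30 (EN): global enable for the VFP/NEON unit. */
#define FPEXC_EN             (1u << 30)

/* Returns 1 when both CP10 and CP11 are set to full access (0b11 each). */
int CpacrAllowsNeon(uint32_t cpacr)
{
    return (cpacr & CPACR_CP10_CP11_MASK) == CPACR_CP10_CP11_MASK;
}

/* Returns 1 when the FPEXC.EN bit is set. */
int FpexcEnabled(uint32_t fpexc)
{
    return (fpexc & FPEXC_EN) != 0;
}

#if defined(__arm__)
/* Hypothetical helper: returns 1 if NEON instructions may execute, else 0.
 * Must run in a privileged mode. */
int NeonAccessEnabled(void)
{
    uint32_t cpacr, fpexc;

    /* Read CPACR (CP15, c1, c0, 2). */
    __asm__ volatile ("mrc p15, 0, %0, c1, c0, 2" : "=r"(cpacr));
    if (!CpacrAllowsNeon(cpacr))
        return 0;   /* reading FPEXC now would itself be undefined */

    /* Read FPEXC; needs an FPU-aware assembler setting such as -mfpu=neon. */
    __asm__ volatile ("vmrs %0, fpexc" : "=r"(fpexc));
    return FpexcEnabled(fpexc);
}
#endif
```

Printing the result of NeonAccessEnabled() right after NeonInit() would show whether the Undefined Instruction exception comes from disabled coprocessor access rather than from the power or clock setup.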
What can cause this problem?