startup_ARMCA15.S in CCS 6.1.3 can cause an Undefined Instruction Exception when enabling the VFP

Other Parts Discussed in Thread: 66AK2H14, AM5728, AM4379, AM3359

Using CCS 6.1.3.00033, I created a new project for the Cortex-A15 in a 66AK2H14 using the GNU v4.9.3 compiler. After creating the project, no startup file or linker script had been added by CCS, so I:

- Copied the ccsv6\ccs_base\arm\include\startup_ARMCA15.S file into the project

- Copied the ccsv6\ccs_base\arm\include\66AK2Gxx.lds file into the project, and changed it to place the program in DDR0 (the 66AK2Gxx.lds linker script was the closest example for the target device)

In the CCS project properties, -mfloat-abi was set to "hard" and -mfpu was set to "vfp4" to enable hardware floating point, which then causes the startup_ARMCA15.S file to enable the NEON extensions and the VFP.
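The startup file keys off the __ARM_PCS_VFP macro, which GCC predefines when the hard-float (VFP) variant of the procedure call standard is selected. As a minimal sketch for checking which way a given build is configured (the function name is hypothetical, not something from CCS), the same macro can be tested from C:

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether this translation unit was compiled
 * for the hard-float (VFP) calling convention. GCC predefines
 * __ARM_PCS_VFP in that case, and the startup file tests the same macro. */
bool built_for_hard_float_abi(void)
{
#if defined(__ARM_PCS_VFP)
    return true;
#else
    return false;
#endif
}
```

If this returns false in a project, the .if __ARM_PCS_VFP == 1 block in the startup file would be expected to be skipped as well.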

The target board is an EVMK2H, with the boot mode set to "SLEEP W/ SLOW PLL & SLOW ARM PLL" and the xtcievmk2x_arm.gel GEL script used on the Cortex-A15. This means that on a board reset the U-Boot in flash isn't executed, and the GEL script is used to initialize the device.

On starting debug sessions, sometimes the program failed to reach main. E.g. the following sequence is repeatable:

1) Execute a reboot command in EVMK2H BMC to perform a full reset (without having to power-cycle).

2) Start a debug session, and the program doesn't reach main. When the program is suspended, an Undefined Instruction Exception has occurred on the Cortex-A15.

3) Terminate the failed debug session, and start a new debug session. This time the program reaches main.

Using the ARM Advanced Features to enable "Break on Undefined Instruction" and PC trace showed that the undefined instruction occurs on the FMXR  FPEXC,r0 instruction in the following code in the startup_ARMCA15.S file:

.if __ARM_PCS_VFP == 1
@
@ Enable NEON extensions and the VFP. Must be done before entering user mode.
@
        MRC   p15, #0, r3, c1, c0, #2    @ Read CPACR
        ORR   r3, r3, #0x00F00000        @ Enable access to CP10 and CP11
        MCR   p15, #0, r3, c1, c0, #2    @ Write CPACR
        MOV   r3, #0
        MOV   r0,#0x40000000
        FMXR  FPEXC,r0                   @ Set FPEXC bit 30 to enable VFP
        MCR   p15, #0, r3, c7, c5, #4    @ flush prefetch buffer because of FMXR above
.endif

The Enabling Advanced SIMD and VFP extensions section of the ARM Cortex-A15 MPCore Processor Technical Reference Manual shows an ISB instruction after enabling access to CP10 and CP11, which "ensures that all of the CP15 register changes from the previous steps have been committed".

After modifying the startup_ARMCA15.S file to add an ISB instruction after the write to CPACR the program now reaches main() on the first debug session following a full reset:

.if __ARM_PCS_VFP == 1
@
@ Enable NEON extensions and the VFP. Must be done before entering user mode.
@
        MRC   p15, #0, r3, c1, c0, #2    @ Read CPACR
        ORR   r3, r3, #0x00F00000        @ Enable access to CP10 and CP11
        MCR   p15, #0, r3, c1, c0, #2    @ Write CPACR
        ISB                              @ Ensure that all of the CP15 register changes from the previous steps have been committed.
        MOV   r3, #0
        MOV   r0,#0x40000000
        FMXR  FPEXC,r0                   @ Set FPEXC bit 30 to enable VFP
        MCR   p15, #0, r3, c7, c5, #4    @ flush prefetch buffer because of FMXR above
.endif
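As a cross-check on the two constants used in this sequence, here is a minimal host-side sketch in C. The function and macro names are hypothetical; the field layout assumed is the ARMv7-A one, where the CPACR access field for coprocessor n sits in bits [2n+1:2n] and FPEXC.EN is bit 30.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPACR_CP10_CP11_FULL 0x00F00000u /* mask ORed into CPACR by the startup code */
#define FPEXC_EN             0x40000000u /* value the startup code writes to FPEXC */

/* Extract the 2-bit access field for coprocessor cp (0..13) from a CPACR
 * value. 0b00 means access denied, 0b11 means full access. */
unsigned cpacr_access_field(uint32_t cpacr, unsigned cp)
{
    assert(cp <= 13);
    return (cpacr >> (2u * cp)) & 0x3u;
}

/* True when both CP10 and CP11 are set to full access. */
bool cpacr_cp10_cp11_full_access(uint32_t cpacr)
{
    return cpacr_access_field(cpacr, 10) == 0x3u &&
           cpacr_access_field(cpacr, 11) == 0x3u;
}

/* True when the EN bit (bit 30) is set in an FPEXC value. */
bool fpexc_en_set(uint32_t fpexc)
{
    return ((fpexc >> 30) & 1u) != 0;
}
```

With these helpers, cpacr_cp10_cp11_full_access(0x00F00000) holds and fpexc_en_set(0x40000000) holds, which matches the comments in the listing.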

I suggest that the default startup_ARMCA15.S in CCS be modified accordingly.

Also, it looks like the startup_ARMCA8.S and startup_ARMCA9.S have the same problem, although I haven't tested them to see if an Undefined Instruction Exception can occur.

  • Chester Gillon said:
    On starting debug sessions, sometimes the program failed to reach main. E.g. the following sequence is repeatable:

    1) Execute a reboot command in EVMK2H BMC to perform a full reset (without having to power-cycle).

    2) Start a debug session, and the program doesn't reach main. When the program is suspended, an Undefined Instruction Exception has occurred on the Cortex-A15.

    3) Terminate the failed debug session, and start a new debug session. This time the program reaches main.

    What was happening was:

    1) After a full reset, the CPACR register was zero when the entry point was reached, meaning access to the CP10 and CP11 co-processor registers was disabled.

    2) While the code in startup_ARMCA15.S did write to the CPACR register to enable access to the CP10 and CP11 co-processor registers, the lack of the ISB synchronization instruction meant that when the FMXR  FPEXC,r0 instruction attempted to write to the FPEXC an Undefined Instruction Exception occurred.

    The ARMv7-A Architecture Reference Manual notes that the FMXR  FPEXC,r0 instruction will generate an Undefined Instruction Exception if access to the CP10 and CP11 co-processor registers is disabled.

    3) After terminating the failed debug session, when a new debug session was started (without performing a full reset), the CPACR register was 0x00F00000 when the entry point was reached, meaning access to the CP10 and CP11 co-processor registers was enabled (as a result of the previous run). Therefore, on this run the FMXR  FPEXC,r0 instruction no longer generates an Undefined Instruction Exception.
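The three steps above can be sketched as a toy model in C. This is only an illustration under stated assumptions, not a description of the real Cortex-A15 pipeline: the toy_core type and all function names are hypothetical, and the model assumes the worst case in which a CPACR write is never visible to the following FMXR unless an ISB has executed, whereas on real hardware this depends on timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPACR_CP10_CP11_FULL 0x00F00000u /* mask ORed into CPACR by the startup code */
#define FPEXC_EN             0x40000000u /* value the startup code writes to FPEXC */

/* Hypothetical toy core, not a model of the real Cortex-A15 pipeline. */
typedef struct {
    uint32_t cpacr_written; /* last value written with MCR */
    uint32_t cpacr_visible; /* value that following instructions act on */
    uint32_t fpexc;
} toy_core;

/* Full reset: CPACR and FPEXC are zero at the entry point. */
void toy_full_reset(toy_core *core)
{
    assert(core != NULL);
    core->cpacr_written = 0;
    core->cpacr_visible = 0;
    core->fpexc = 0;
}

/* ISB: commit the pending CPACR write. */
void toy_isb(toy_core *core)
{
    assert(core != NULL);
    core->cpacr_visible = core->cpacr_written;
}

/* FMXR FPEXC: returns false to signal an Undefined Instruction Exception. */
bool toy_write_fpexc(toy_core *core, uint32_t value)
{
    assert(core != NULL);
    if ((core->cpacr_visible & CPACR_CP10_CP11_FULL) != CPACR_CP10_CP11_FULL)
        return false;
    core->fpexc = value;
    return true;
}

/* One debug session running the startup sequence; true means main is reached. */
bool toy_run_startup(toy_core *core, bool with_isb)
{
    assert(core != NULL);
    core->cpacr_written |= CPACR_CP10_CP11_FULL; /* MRC, ORR, MCR */
    if (with_isb)
        toy_isb(core);
    bool reached_main = toy_write_fpexc(core, FPEXC_EN);
    /* Assumption: by the time the session ends the write has taken effect,
     * and the value survives until the next full reset. */
    core->cpacr_visible = core->cpacr_written;
    return reached_main;
}
```

In this model, calling toy_run_startup() without the ISB fails on the first session after toy_full_reset() and succeeds on the second, while the version with the ISB succeeds immediately after a full reset.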

    Chester Gillon said:
    Also, it looks like the startup_ARMCA8.S and startup_ARMCA9.S have the same problem, although I haven't tested them to see if an Undefined Instruction Exception can occur.

    The startup_ARMCA9.S also omits the ISB instruction after writing to the CPACR register:

    .if __ARM_PCS_VFP == 1
    @
    @ Enable NEON extensions and the VFP. Must be done before entering user mode.
    @
            MRC   p15, #0, r3, c1, c0, #2    @ Read CPACR
            ORR   r3, r3, #0x00F00000        @ Enable access to CP10 and CP11
            MCR   p15, #0, r3, c1, c0, #2    @ Write CPACR
            MOV   r3, #0
            MOV   r0,#0x40000000
            FMXR  FPEXC,r0                   @ Set FPEXC bit 30 to enable VFP
            MCR   p15, #0, r3, c7, c5, #4    @ flush prefetch buffer because of FMXR above
    .endif
    

    Using a Cortex-A9 in an OMAP4430, when the CPACR register was zero at the entry point after a power-on reset, the code was able to enable the VFP without generating an Undefined Instruction Exception, i.e. unlike on the Cortex-A15, I couldn't reproduce the exception. Whether omitting the ISB synchronization instruction causes a run-time failure is probably implementation-dependent, due to the internal timing of how instructions are prefetched. The Cortex-A9 NEON Media Processing Engine Technical Reference Manual also shows an ISB instruction after writing to the CPACR register, which is sufficient justification to modify startup_ARMCA9.S to add an ISB after the write to the CPACR register.

  • Chester,

    Thanks for reporting this. At one point I recall having a delay to guarantee that the registers were properly updated before the next instruction, but somehow this disappeared from the latest implementation.

    >>Using a Cortex-A9 in an OMAP4430, when the CPACR register was zero at the entry point after a power-on reset, the code was able to enable the VFP without generating an Undefined Instruction Exception, i.e. unlike on the Cortex-A15, I couldn't reproduce the exception. Whether omitting the ISB synchronization instruction causes a run-time failure is probably implementation-dependent, due to the internal timing of how instructions are prefetched.

    I agree with you; I have been testing this code with the Sitara devices (AM3359 for CA8, AM4379 for CA9 and AM5728 for CA15) and did not experience any trouble with the current implementation.

    I will update the files and perform tests on some Keystone II boards I have here.

    Thank you again for reporting this and I apologize for the trouble,
    Rafael

  • desouza said:
    I will update the files and perform tests on some Keystone II boards I have here.

    Did you get a chance to perform any tests?

    Just noticed that the ccsv7\ccs_base\arm\include\startup_ARMCA15.S installed by CCS 7.2.0.00013 and TI Emulators 7.0.48.0 still omits the ISB instruction before enabling VFP.