Stellaris LM4F120; how to generate floating point instructions.

Hello, 

I am trying to generate floating point instructions for a Stellaris LM4F120 board. Can someone give me a simple example of how to do this?

System info:

OS: Ubuntu 12.04 LTS

Microcontroller: Stellaris LM4F120

Software: Code Composer Studio 5.4

Thanks very much

  • Hi,

    Difficult task... I'm joking. There are some examples in TivaWare. One that uses some floating point operations is the qs-rgb application for the Launchpad, but I think it is too light for your purposes.

    Another one is sine-demo for the ek-lm4f232 board, but that one uses the graphical display. One idea is to look at that example, take the formula found there ("SIN(2pi*t/4)*0.5"), generate a number of samples per period yourself (32, 64 or 128), and then printf them to a console.

    One hint if you get lost with this: try the formula first in a small PC application written in C (GCC may be the only option on Linux), and then move the code to main on your micro.

    Petrei

  • For clarification: since I am looking to perform a lot of calculations efficiently, I want to verify that I am generating hardware floating point instructions. How can I verify that in the disassembly listing?

    Thanks.

  • Hi,

    Paragraph 2.8 of your micro's data sheet lists all the asm instructions. The floating point ones start with v, so if you link with the right library you will see instructions such as vadd.f32 {Sx,} Sy, Sm in the listing.

    Petrei

  • This is what I'm looking at:

    ....
    00005864:   EE000A10 FMSR            S0, R0
    00005868:   4858     LDR             R0, $C$CON104
    0000586a:   ED800A00 FSTS            S0, [R0, #0]
     828                  accelerations_in[1] = y/1716.00407747 ;
    0000586e:   9805     LDR             R0, [SP, #0x14]
    00005870:   F006F9BA BL              __aeabi_f2d
    00005874:   A49B     ADD             R4, PC, #0x26C $C$FL6
    00005876:   E894000C LDMIA.W         R4, {R2, R3}
    0000587a:   F004FC07 BL              __aeabi_ddiv
    0000587e:   F006F811 BL              __aeabi_d2f
    00005882:   EE000A10 FMSR            S0, R0
    00005886:   489C     LDR             R0, $C$CON105
    00005888:   ED800A00 FSTS            S0, [R0, #0]
     829                  accelerations_in[2] = z/1716.00407747 ;
    0000588c:   9806     LDR             R0, [SP, #0x18]
    0000588e:   F006F9AB BL              __aeabi_f2d
    00005892:   A494     ADD             R4, PC, #0x250 $C$FL6
    ....

    I see that the divide is done through a library call. This is not a hardware floating point instruction, right? How can I make the compiler generate hardware floating point instructions?

    Thanks very much, Petrei, for your help.

  • Which TI ARM compiler version are you using, and what are the target processor version (--silicon_version, -mv), floating point support (--float_support) and optimization level (--opt_level, -O) options set to?

    Looking at the following code targeting a LM4F120H5QR with a target processor version of "7M4", floating point support of "FPv4SPD16" and optimization level of "2":

    float w, e, mu, energy;

     w = (e * mu) / (energy + 0.000000119209289f);

    The TI ARM compiler v5.0.5 generated the following floating point instructions:
    VADD.F32 S5, S1, S0 ; [DPU_LIN_PIPE] |260| 
    VMUL.F32 S4, S2, S4 ; [DPU_LIN_PIPE] |260|
    VDIV.F32 S4, S4, S5 ; [DPU_LIN_PIPE] |260|
  • CJ Wilkerson said:
      accelerations_in[1] = y/1716.00407747 ;

    On further consideration, those constants will be implicitly treated as double by the compiler, which will then promote the divide to double precision. The LM4F only has single-precision hardware floating point support, so the implicit double constants force the software floating point library to be used. Try defining the constants as single precision (with an 'f' suffix) to allow the compiler to use hardware floating point:
    accelerations_in[1] = y/1716.00407747f ;
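
    The promotion can be checked on any C compiler without looking at the disassembly, because sizeof reports the type of the whole expression. This is only a sketch; the two function names are made up for illustration.

```c
#include <stddef.h>

/* Unsuffixed constant: it has type double, so y is promoted and the
 * whole division expression has type double. sizeof does not evaluate
 * the expression, it only looks at its type. */
static size_t division_size_double_constant(float y)
{
    return sizeof(y / 1716.00407747);
}

/* 'f' suffix: both operands are float, so the expression stays float. */
static size_t division_size_float_constant(float y)
{
    return sizeof(y / 1716.00407747f);
}
```

    The first function returns sizeof(double) and the second returns sizeof(float), which is the difference that decides between the software library call and a single hardware divide on the LM4F.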
  • Chester Gillon said:

    Which TI ARM compiler version are you using, and what are the target processor version (--silicon_version, -mv), floating point support (--float_support) and optimization level (--opt_level, -O) options set to?

    Looking at the following code targeting a LM4F120H5QR with a target processor version of "7M4", floating point support of "FPv4SPD16" and optimization level of "2":

    float w, e, mu, energy;

     w = (e * mu) / (energy + 0.000000119209289f);

    The TI ARM compiler v5.0.5 generated the following floating point instructions:
    VADD.F32 S5, S1, S0 ; [DPU_LIN_PIPE] |260| 
    VMUL.F32 S4, S2, S4 ; [DPU_LIN_PIPE] |260|
    VDIV.F32 S4, S4, S5 ; [DPU_LIN_PIPE] |260|

    Target processor: 7M4

    Floating point support: FPv4SPD16

    Optimization level: 0

    Compiler version: TI v5.0.4

    With these settings the compiler did not generate those instructions; instead it generated FMUL, FADD and FDIV.

  • Hello again.

    We still have not been able to generate hardware floating point instructions. We have tinkered with a lot of settings and tried many other things but are still unable to make any progress on this issue.

    Does anyone have any other ideas or things to try?

    Thanks.

  • Hi,

    Do you have a small test program that shows what you tried? Zip it and post it, and don't forget to include the Debug folder (generated code and listing). I can help by first looking at the results and then compiling it under Windows to see the differences.

    Petrei

  • CJ Wilkerson said:
    We still have not been able to generate hardware floating point instructions. We have tinkered with a lot of settings and tried many other things but are still unable to make any progress on this issue.

    To follow up on my previous comment about implicit double conversions, I created the following example for a LM4F120H5QR:

    float float_divide (float y)
    {
     return y / 1716.00407747f;
    }

    float implicit_double_divide (float y)
    {
     return y / 1716.00407747;
    }

    int main(void)
    {
     return float_divide (1000.0f) + implicit_double_divide (1000.0f);
    }

    Hardware floating point instructions were created for the float_divide function:

    float_divide:
    ;* --------------------------------------------------------------------------*
            SUB       SP, SP, #8            ; [DPU_3_PIPE]
            VSTR.32   S0, [SP, #0]          ; [DPU_LIN_PIPE] |6|
    ;----------------------------------------------------------------------
    ;   7 | return y / 1716.00407747f;                                            
    ;----------------------------------------------------------------------
            LDR       A1, $C$FL1            ; [DPU_3_PIPE] |7|
            VMOV      S1, A1                ; [DPU_LIN_PIPE] |7|
            VLDR.32   S0, [SP, #0]          ; [DPU_LIN_PIPE] |7|
            VDIV.F32  S0, S0, S1            ; [DPU_LIN_PIPE] |7|
            ADD       SP, SP, #8            ; [DPU_3_PIPE]
            BX        LR                    ; [DPU_3_PIPE]

    Whereas for implicit_double_divide the double constant caused the divide to be implicitly performed in double precision, which caused the software double-precision library to be called:

    implicit_double_divide:
    ;* --------------------------------------------------------------------------*
            PUSH      {A4, LR}              ; [DPU_3_PIPE]
            VSTR.32   S0, [SP, #0]          ; [DPU_LIN_PIPE] |11|
    ;----------------------------------------------------------------------
    ;  12 | return y / 1716.00407747;                                             
    ;----------------------------------------------------------------------
            LDR       A1, [SP, #0]          ; [DPU_3_PIPE] |12|
            BL        __aeabi_f2d           ; [DPU_3_PIPE] |12|
            ; CALL OCCURS {__aeabi_f2d }     ; [] |12|
            ADR       A3, $C$FL2            ; [DPU_3_PIPE] |12|
            LDMIA     A3, {A3,A4}           ; [DPU_3_PIPE] |12|
            BL        __aeabi_ddiv          ; [DPU_3_PIPE] |12|
            ; CALL OCCURS {__aeabi_ddiv }    ; [] |12|
            BL        __aeabi_d2f           ; [DPU_3_PIPE] |12|
            ; CALL OCCURS {__aeabi_d2f }     ; [] |12|
            VMOV      S0, A1                ; [DPU_LIN_PIPE] |12|
            POP       {A4, PC}              ; [DPU_3_PIPE]

    Does your C code for which floating point instructions are not being generated contain any implicit or explicit double precision variables, constants or functions?
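
    As a hedged illustration of what to look for, the sketch below puts three common sources of hidden double arithmetic next to a single-precision version. All names are hypothetical and are not taken from the project discussed here. The same applies to the math library: sqrt() and sin() work in double, while sqrtf() and sinf() work in float. When the code is first tried on a PC with GCC, the -Wdouble-promotion warning can help find such places; whether the TI compiler offers an equivalent diagnostic is not something this thread confirms.

```c
/* Hypothetical helper with a double parameter: any float passed in is
 * promoted to double before the call. */
static double scale_double(double v) { return v * 0.5; }

/* The same helper kept in single precision. */
static float scale_float(float v) { return v * 0.5f; }

/* Three hidden sources of double arithmetic in one function. */
static float convert_with_hidden_doubles(float raw)
{
    double offset = 0.25;                   /* 1: double variable */
    float shifted = raw + offset;           /*    add done in double, then narrowed */
    float scaled  = scale_double(shifted);  /* 2: helper that takes double */
    return scaled / 1716.00407747;          /* 3: unsuffixed constant, double divide */
}

/* Same calculation with every operand kept as float, so an FPU that
 * only supports single precision can handle all of it in hardware. */
static float convert_single_precision(float raw)
{
    float offset  = 0.25f;
    float shifted = raw + offset;
    float scaled  = scale_float(shifted);
    return scaled / 1716.00407747f;
}
```

    Both functions return nearly the same value; the difference is only in which instructions or library calls the compiler has to emit.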

  • Alright. I took the project that I was trying to get to generate hardware floating point instructions, copied it, and gutted the copy. All that is left is a simple test made up of the code that Chester provided (which has been shown to generate hardware floating point instructions) plus the initialization of the FPU and the clock. I compiled it and ran it to double check that it did NOT generate hardware FP instructions before attaching it here. Any information you may need is probably stated in previous replies, but if you need something, tell me where to find it and I'll post it here.

    Thank you both for your help so far. Hopefully we can get this all figured out.

    CJ

    Attached is the compressed testproject

    4834.testproject.zip

  • CJ Wilkerson said:
    I compiled it and ran it to double check that it did NOT generate hardware FP instructions before attaching it here

    How did you determine that it did NOT generate hardware FP instructions?

    I imported your testproject into a CCS 5.4 workspace under Windows XP, changed the Unix-style StellarisWare references to point to my Windows installation and ran it on a Stellaris Launchpad. The CCS disassembler showed floating point instructions being generated for the single-precision calculations:

    40        void testfpu(float arg) {
              testfpu:
    00000f44:   B500     PUSH            {LR}
    00000f46:   F1AD0D14 SUB.W           R13, R13, #20
    00000f4a:   ED8D0A00 FSTS            S0, [R13, #0]
    42         e = arg;
    00000f4e:   9800     LDR             R0, [SP]
    00000f50:   9002     STR             R0, [SP, #0x8]
    43         mu = arg;
    00000f52:   9800     LDR             R0, [SP]
    00000f54:   9003     STR             R0, [SP, #0xC]
    44         energy = arg;
    00000f56:   9800     LDR             R0, [SP]
    45         w = (e * mu) / (energy + 0.000000119209289f);
    00000f58:   ED9D1A03 FLDS            S2, [R13, #12]
    44         energy = arg;
    00000f5c:   9004     STR             R0, [SP, #0x10]
    45         w = (e * mu) / (energy + 0.000000119209289f);
    00000f5e:   482E     LDR             R0, $C$FL1
    00000f60:   EE000A90 FMSR            S1, R0
    00000f64:   ED9D0A04 FLDS            S0, [R13, #16]
    00000f68:   EE300A80 FADDS           S0, S1, S0
    00000f6c:   EDDD0A02 FLDS            S1, [R13, #8]
    00000f70:   EE610A20 FMULS           S1, S2, S1
    00000f74:   EE800A80 FDIVS           S0, S1, S0
    00000f78:   ED8D0A01 FSTS            S0, [R13, #4]
    46         w = sqrtf(e);
    00000f7c:   ED9D0A02 FLDS            S0, [R13, #8]
    00000f80:   F001F8A0 BL              sqrtf
    00000f84:   ED8D0A01 FSTS            S0, [R13, #4]
    00000f88:   B005     ADD             SP, #0x14
    00000f8a:   BD00     POP             {PC}
    49        {
              float_divide:
    00000f8c:   F1AD0D08 SUB.W           R13, R13, #8
    00000f90:   ED8D0A00 FSTS            S0, [R13, #0]
    50         return y / 1716.00407747f;
    00000f94:   4821     LDR             R0, $C$FL2
    00000f96:   EE000A10 FMSR            S0, R0
    00000f9a:   EDDD0A00 FLDS            S1, [R13, #0]
    00000f9e:   EE800A80 FDIVS           S0, S1, S0
    00000fa2:   B002     ADD             SP, #0x8
    00000fa4:   4770     BX              R14
    54        {
              implicit_double_divide:
    00000fa6:   B51C     PUSH            {R2, R3, R4, LR}
    00000fa8:   ED8D0A00 FSTS            S0, [R13, #0]
    55         return y / 1716.00407747;
    00000fac:   9800     LDR             R0, [SP]
    00000fae:   F000FF29 BL              __aeabi_f2d
    00000fb2:   A41B     ADD             R4, PC, #0x6C $C$FL3
    00000fb4:   E894000C LDMIA.W         R4, {R2, R3}
    00000fb8:   F7FFFF28 BL              __aeabi_ddiv
    00000fbc:   F000FD55 BL              __aeabi_d2f
    00000fc0:   EE000A10 FMSR            S0, R0
    00000fc4:   BD1C     POP             {R2, R3, R4, PC}
    58        int main(void) {
              main:
    00000fc6:   B508     PUSH            {R3, LR}
    60         init() ;
    00000fc8:   F000F817 BL              init
    61         testfpu(0.5f);
    00000fcc:   EEB60A00 VMOVS           S0, #5.000000e-01
    00000fd0:   F7FFFFB8 BL              testfpu
    62         return float_divide (1000.0f) + implicit_double_divide (1000.0f);
    00000fd4:   4814     LDR             R0, $C$FL4
    00000fd6:   EE000A10 FMSR            S0, R0
    00000fda:   F7FFFFD7 BL              float_divide
    00000fde:   4812     LDR             R0, $C$FL4
    00000fe0:   EEF00A40 VMOVS           S1, S0
    00000fe4:   EE000A10 FMSR            S0, R0
    00000fe8:   F7FFFFDD BL              implicit_double_divide
    00000fec:   EE300A20 FADDS           S0, S0, S1
    00000ff0:   EEBD0AC0 FTOSIZS         S0, S0
    00000ff4:   EE100A10 FMRS            R0, S0
    00000ff8:   BD08     POP             {R3, PC}
    67        void init() {

    Notes:

    a) Single stepping into the sqrtf library function shows the hardware square root instruction FSQRTS being used, after the input has been validated (an invalid argument will cause an exception to be raised)

    b) CCS doesn't currently display the contents of the floating point registers, which is the subject of the enhancement request "CCS5.4 doesn't display the floating point registers for a Stellaris LM4F120H5QR"

  • CJ Wilkerson said:
    Did not generate those instructions, instead it generated FMUL, FADD, FDIV.

    I now realise that some of the confusion may come from my quoting instructions from a mixture of the CCS debugger disassembler and the assembler listing produced by the TI ARM compiler, where the two can display the same instructions differently!

    E.g. the TI ARM compiler assembler listing was displaying the following, which matches the instruction names given in the TI LM4F datasheet and the ARM Cortex®-M4 Technical Reference Manual:

            VMOV      S1, A1                ; [DPU_LIN_PIPE] |45|
            VLDR.32   S0, [SP, #16]         ; [DPU_LIN_PIPE] |45|
            VADD.F32  S0, S1, S0            ; [DPU_LIN_PIPE] |45|
            VLDR.32   S1, [SP, #8]          ; [DPU_LIN_PIPE] |45|
            VMUL.F32  S1, S2, S1            ; [DPU_LIN_PIPE] |45|
            VDIV.F32  S0, S1, S0            ; [DPU_LIN_PIPE] |45|
            VSTR.32   S0, [SP, #4]          ; [DPU_LIN_PIPE] |45|
    Whereas the CCS debugger disassembler displays the same instructions as:
    00000f60:   EE000A90 FMSR            S1, R0
    00000f64:   ED9D0A04 FLDS            S0, [R13, #16]
    00000f68:   EE300A80 FADDS           S0, S1, S0
    00000f6c:   EDDD0A02 FLDS            S1, [R13, #8]
    00000f70:   EE610A20 FMULS           S1, S2, S1
    00000f74:   EE800A80 FDIVS           S0, S1, S0
    00000f78:   ED8D0A01 FSTS            S0, [R13, #4]
  • So the instructions FADD, FMUL, etc... are hardware floating point instructions?

  • CJ Wilkerson said:
    So the instructions FADD, FMUL, etc... are hardware floating point instructions?

    Yes, from looking at the ARMv7-M Architecture Reference Manual:

    a) The assembler listing from the TI ARM compiler is displaying the Unified Assembler Language (UAL) mnemonics

    b) The CCS disassembly is displaying the "legacy" Pre-UAL assembler mnemonics.
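
    For reference, the pairs that can be read off the two listings in this thread can be collected in a small lookup table. This is only a sketch with made-up names (mnemonic_pair, legacy_to_ual). The UAL spellings follow the TI assembler listing quoted above; the FSQRTS, FMRS and FTOSIZS rows are not paired in the thread and are filled in from the ARM architecture documentation, so treat them as an assumption to verify.

```c
#include <stddef.h>
#include <string.h>

/* One legacy (pre-UAL) mnemonic and the UAL spelling of the same instruction. */
struct mnemonic_pair {
    const char *legacy;
    const char *ual;
};

static const struct mnemonic_pair fp_mnemonics[] = {
    { "FADDS",   "VADD.F32"     },
    { "FMULS",   "VMUL.F32"     },
    { "FDIVS",   "VDIV.F32"     },
    { "FLDS",    "VLDR.32"      },
    { "FSTS",    "VSTR.32"      },
    { "FMSR",    "VMOV"         },  /* core register to S register */
    { "FMRS",    "VMOV"         },  /* S register to core register (assumption) */
    { "FSQRTS",  "VSQRT.F32"    },  /* assumption: not paired in the thread */
    { "FTOSIZS", "VCVT.S32.F32" },  /* assumption: not paired in the thread */
};

/* Return the UAL spelling for a legacy mnemonic, or NULL if it is not
 * one of the floating point mnemonics listed above. */
static const char *legacy_to_ual(const char *legacy)
{
    size_t count = sizeof(fp_mnemonics) / sizeof(fp_mnemonics[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(fp_mnemonics[i].legacy, legacy) == 0) {
            return fp_mnemonics[i].ual;
        }
    }
    return NULL;
}
```

    A NULL result for something like BL __aeabi_ddiv is the useful signal here: that line is a library call, not a hardware floating point instruction.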