VFP in Windows Embedded Compact 7

Richard Leong

Other Parts Discussed in Thread: OMAP3530

Hi, All,

I asked about "VFP support in WinCE 6" in my previous post.
It seems that CE6 can cause some floating-point arithmetic errors, and it was suggested that WinCE 7 has already fixed the issue.
So I have now ported my BSP package to WinCE 7.
However, I still see the issue.

I'm currently using vfp2fpcrt.dll to implement the math functions.
As DavidVescovi mentioned, WinCE 6 did not allocate enough registers to save and restore state correctly on context switches between NEON and VFP.
So what do I need to change in CE7 to fix this issue?
Or do I need to switch to a VFPv3 library for CE7?
If that is the case, could you please provide that library?

Thanks for your help,

Richard

over 14 years ago

Frank Walzer over 14 years ago

Richard,

the VFP libs in WinCE6 are only needed because the MS tools did not support generating floating-point code for the VFPv3 on the Cortex-A8. This should be solved by the tools for WinCE7. So it is a question for MS which compiler arguments are required to produce VFPv3 or NEON code that uses the hardware floating-point unit. VFPv3 code should be fully accurate. NEON has some limitations, but those are documented on the ARM web site.

Regards.

Richard Leong over 14 years ago in reply to Frank Walzer

Hi, Frank,

Thanks for answering my question.

According to MSDN, I can enable or disable the VFP feature with the /QRFpe parameter.
So I took a look at the assembly.
The compiled code seems to use the VFP instruction set for the floating-point calculations.
Does that mean I have enabled the VFP now?

e.g.
122:      double myDoubleA = 48.4;
00011318    ldr         r3,| (000114bc)|
0001131C    vldr        d0,[r3]
00011320    vstr        d0,[sp,#myDoubleA]
123:      double myDoubleB = 48.7;
00011324    ldr         r3,| (000114b8)|
00011328    vldr        d0,[r3]
0001132C    vstr        d0,[sp,#myDoubleB]
124:      int myIntA = (int) myDoubleA;
00011330    vldr        d0,[sp,#myDoubleA]
00011334    vcvtr.s32.f64 s0,d0
00011338    vmov        r3,s0
0001133C    str         r3,[sp,#myIntA]
125:      int myIntB = (int) myDoubleB;
00011340    vldr        d0,[sp,#myDoubleB]
00011344    vcvtr.s32.f64 s0,d0
00011348    vmov        r3,s0
0001134C    str         r3,[sp,#myIntB]
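A quick sanity check that may help with a listing like this: C and C++ define a double-to-int cast as truncation toward zero, so both values above should convert to 48. In ARM's own syntax, VCVTR converts using the rounding mode currently set in the FPSCR, while plain VCVT to integer rounds toward zero, but the mnemonic a given disassembler prints may not follow that convention, so checking the result at run time is safer than reading the mnemonic. The sketch below is only an illustration. The function name and the extra negative test value are my own, not from the original code.

```cpp
// Hypothetical helper (not part of the original test program).
// Returns true when (int) casts truncate toward zero, as C and C++ require.
bool casts_truncate_toward_zero()
{
    // volatile keeps the compiler from folding the conversions at build
    // time, so the generated conversion instruction is actually executed.
    volatile double myDoubleA = 48.4;
    volatile double myDoubleB = 48.7;
    volatile double myNegative = -48.7;  // assumed extra case

    int myIntA = (int)myDoubleA;   // expected 48
    int myIntB = (int)myDoubleB;   // expected 48; 49 would mean round-to-nearest
    int myIntC = (int)myNegative;  // expected -48; -49 would mean rounding

    return myIntA == 48 && myIntB == 48 && myIntC == -48;
}
```

Calling this once from the test application and logging the result would show whether the conversions behave as the language requires on the target.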

Furthermore, the performance is quite strange.
I used the following code to test it.

    for (int i=0; i<50000; i++)
    {
        test = tan(cos(3.4578236482) + sin(83.9374658) / asin(0.123)) * log(36.123);
        test = log(fabs(test)) * pow(fabs(test), 0.578);
    }

In WinCE6 with vfp2fpcrt.dll, the calculation time is around 0.17 secs.
In WinCE7 with VFP enabled (/QRarch7 /QRfpe-), the calculation time is around 0.92 secs.
In WinCE7 with VFP disabled (/QRarch5), the calculation time is around 0.38 secs.

Could you help me understand why it is slower than in CE6?
And why is CE7 with VFP enabled slower than with VFP disabled?
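For anyone repeating this measurement, here is a minimal self-contained sketch of the test. It makes two assumptions that are not in my original code: the inputs are held in volatile variables so the compiler cannot fold the constant expression at build time, and the timing uses the standard clock() call. The function names are made up for this sketch. On Windows CE the CRT may not provide clock(), in which case GetTickCount() around the call is the usual substitute.

```cpp
#include <cmath>
#include <ctime>

// Runs the same two statements as the loop above.
// The operands are volatile (an assumption of this sketch) so every
// iteration really calls the math library instead of reusing a constant.
double run_math_loop(int iterations)
{
    volatile double a = 3.4578236482;
    volatile double b = 83.9374658;
    volatile double c = 0.123;
    volatile double d = 36.123;
    volatile double e = 0.578;

    double test = 0.0;
    for (int i = 0; i < iterations; i++)
    {
        test = std::tan(std::cos(a) + std::sin(b) / std::asin(c)) * std::log(d);
        test = std::log(std::fabs(test)) * std::pow(std::fabs(test), e);
    }
    return test;
}

// Returns the elapsed CPU time in seconds and stores the last computed
// value in *result so the work cannot be optimized away.
double time_math_loop(int iterations, double* result)
{
    std::clock_t start = std::clock();
    *result = run_math_loop(iterations);
    std::clock_t stop = std::clock();
    return double(stop - start) / CLOCKS_PER_SEC;
}
```

Running time_math_loop(50000, &value) under each compiler setting and printing both the time and the value lets the three builds be compared on speed and on the numeric result together.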

Thanks Again,

Richard

Frank Walzer over 14 years ago in reply to Richard Leong

Richard,

you are welcome, but I am not sure how much further I can help. I do not have the MS tools for WinCE7.

It seems you are creating VFP code now, but to be sure you would need to check the binary instructions created. The ARM assembly is somewhat confusing, as I have seen different mnemonics for the same instructions (it depends on the compiler and disassembly tools, I guess).
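One way to check the binary encoding independently of the mnemonics, sketched under some assumptions: in the 32-bit ARM instruction set, VFP instructions sit in the coprocessor instruction space and use coprocessor numbers 10 and 11. If a listing file prints the instruction word in hex next to each mnemonic, a small helper can classify the words. The function name is hypothetical, the test covers only the ARM (not Thumb) encoding, and it does not cover the unconditional NEON data-processing encodings.

```cpp
// Hypothetical helper: true if a 32-bit ARM-state instruction word lies in
// the coprocessor space and targets coprocessor 10 or 11 (the VFP ones).
// unsigned int is assumed to be 32 bits wide here.
bool is_vfp_coprocessor_word(unsigned int word)
{
    unsigned int cond   = (word >> 28) & 0xF;  // condition field
    unsigned int op     = (word >> 24) & 0xF;  // bits 27:24
    unsigned int coproc = (word >> 8)  & 0xF;  // coprocessor number

    // 110x: coprocessor load/store and 64-bit transfers
    // 1110: coprocessor data processing and register transfers
    bool coprocessor_space = (op == 0xC || op == 0xD || op == 0xE);

    return cond != 0xF && coprocessor_space && (coproc == 10 || coproc == 11);
}
```

For example, 0xE3A00000 (mov r0,#0) and 0xEB000000 (bl) are rejected, while 0xED930B00 and 0xEE103A10 are accepted. I hand-encoded those last two as vldr d0,[r3] and vmov r3,s0, so treat them as assumptions rather than output from the MS tools.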

In your example benchmark you are using floating-point math functions. Are you sure they are compiled with floating-point support enabled? Otherwise you might link a soft-float implementation from a standard library that does not actually use the VFPv3 for most of the calculations. If these functions are provided by the VFP lib on WinCE6, they will be faster there.

Another thing to check is whether the functions are single or double precision.
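To illustrate the precision point: an unsuffixed literal such as 3.4578236482 is a double, so the calls in the benchmark run in double precision. Single precision needs float operands and, in standard C++, the float overloads from <cmath> (cosf and friends in C). The sketch below is only a hedged illustration with function names of my own, and it assumes the toolchain's <cmath> provides the standard overloads.

```cpp
#include <cmath>

// sizeof on a call expression reports the size of the result type without
// evaluating the call, which shows which overload was selected.
bool cos_of_double_returns_double()
{
    return sizeof(std::cos(0.5)) == sizeof(double);
}

bool cos_of_float_returns_float()
{
    return sizeof(std::cos(0.5f)) == sizeof(float);
}

// Explicit double and single precision versions of the same call.
double cos_double(double x) { return std::cos(x); }
float  cos_single(float x)  { return std::cos(x); }
```

If the single precision version is fast and the double precision version is not, the difference is in the library routines and not in whether the VFP is enabled.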

Regards.

Richard Leong over 14 years ago in reply to Frank Walzer

Hi, Frank,

Would you mind telling me how to check the binary instructions?
After compiling I have .exe, .obj, .cod, .pch, .rel, .map and .mac files.
Which file contains the binary instructions you mentioned?

As for floating-point support in WinCE 7, I can see that it jumps into the VFP version of the math function.
e.g.

; 126 : cos(90);
  0029c e3a00000 mov r0,#0
  002a0 e59f1168 ldr r1,|$LN18@InitFloat| ; =0x40568000
  002a4 eb000000 bl cos

When I step into "bl cos", it checks register R12 to see whether the VFP is enabled or not.
If disabled, it jumps to the normal cos assembly code.
If enabled, it jumps to the cos_vfp assembly code.
I don't know of any other way to verify whether I have compiled with hardware floating-point support or with a software floating-point implementation.
Do you have any suggestion where I should look to confirm it?

Thank you very much,

Richard

Atul Verma over 14 years ago in reply to Richard Leong

Richard,

I am not sure which WinCE7 BSP you are using or where you got it from, but please make sure that OEMInit() calls VfpOemInit() to initialize VFP support.

thanks

Atul

Richard Leong over 14 years ago in reply to Atul Verma

Hi, Atul,

I'm using the default BSP package in WinCE7 for the TI OMAP 3530 (TI_EVM_3530).
In bsp_version.h, its version number is 6.13.0.

In OEMInit(), I have the line VfpOemInit(g_pOemGlobal, VFP_AUTO_DETECT_FPSID);
According to MSDN, the default implementations are based on the ARM common VFP subarchitecture.
So does that mean I can use the common CE7 VFP library to initialize the OMAP 3530 VFP?

Thanks,

Richard

Madhvi over 14 years ago in reply to Richard Leong

Hi Richard

If you are calling VfpOemInit() from OEMInit() and the MS-provided VFP library is included in your NK.bin, then you should be able to use the VFP on OMAP3530 for your floating-point calculations, provided your code is correctly generated. We have tested with the QBench (Linpack, Whetstone, Dhrystone) benchmark tests, and we see improved performance in CE7 when using the VFP library compared to software-emulated functions.

-Madhvi
