How to load single-precision data in linear assembly code?

may may92122

Expert 1030 points

I wrote a linear assembly function as follows:

.def _test

_test: .cproc a_0

.reg val_a0

LDW *a_0++,val_a0

.return val_a0

.endproc

Then, in another file in the project, I call this function:

void main()

{

int a = 1;

float b = 1.0;

int temp1 = 0;

int temp2 = 0.0;

temp1 = test(&a);

printf("%d\n",temp1);

temp2 = test(&b);

printf("%f\n",temp2);

}

I ran the above code on EVM6678, and the following result appeared in the console window:

1065353216.00000

It seems that LDW doesn't work correctly with single-precision data. Can somebody tell me why?

over 13 years ago

0 Chad Courtney over 13 years ago

TI__Mastermind 30825 points

You have temp2 declared as an int, so when the value is returned, either the assignment to temp2 or your printf ends up converting an int to a float.

Best Regards,
Chad

0 may may92122 over 13 years ago in reply to Chad Courtney

Expert 1030 points

Chad, thanks. I checked, and I mistyped the code here. The original code is as follows:

.def _test

_test: .cproc a_0

.reg val_a0

LDW *a_0++,val_a0

.return val_a0

.endproc

Then, in another file in the project, I call this function:

void main()

{

int a = 1;

float b = 1.0;

int temp1 = 0;

float temp2 = 0.0;

temp1 = test(&a);

printf("%d\n",temp1);

temp2 = test(&b);

printf("%f\n",temp2);

}

I ran the above code on EVM6678, and the following result appeared in the console window:

1065353216.00000

It seems that LDW doesn't work correctly with single-precision data. Can somebody tell me why?

0 Allen Lee over 13 years ago in reply to may may92122

Genius 3770 points

Hi May,

Did you declare the prototype of the function? Such as:

float test(void *a);

If not, the return value of test(&b) will be treated as an integer and then converted to float by an INTSP instruction.

Allen

0 Chad Courtney over 13 years ago in reply to Allen Lee

TI__Mastermind 30825 points

I agree with Allen's comments.

That said, back to the basic question of the LDW assembly instruction. It's just going to return the 32-bit value that's stored at the location being pointed to. It doesn't care if it's a float, an int, two packed 16-bit values, etc. It simply returns the 32 bits exactly as they're stored in memory. It's your type casting and declarations in C that affect how this data is treated.

If you want to verify this, single-step into the assembly code and find the register used for a_0 (it will be A4, since the first argument of a function is passed in A4). Look at the memory location A4 points to: display it as an SP float in a memory window and see what you observe, then display it again as a plain hex value. Step through the code until the LDW has executed; the data lands in the register 4 single steps after the LDW (I assume it will be the B4 register, but you'll have to look at the disassembly to see). You'll see that the register holds exactly the same 32-bit hex data that was in memory, and this is what gets returned.

Best Regards,
Chad

0 may may92122 over 13 years ago in reply to Chad Courtney

Expert 1030 points

Thanks Allen and Chad,

With your help, I got the right result. On the other hand, I'm disappointed with the outcome. I studied linear assembly in order to improve the processing speed of my code, but I found that I failed.

The length of the array in my test is 264. When no optimization level was chosen, the C code took 11,923 CPU cycles and the linear assembly code took 8,350; but after the o2 optimization level was chosen, the C code took 440 CPU cycles and the linear assembly code took 962, which is more than twice the C code! Does it mean that the code is that hard to optimize? Following spru187t, I tried the optimization methods in section 3 and section 4, but apart from the optimization levels that come with the compiler, none of the methods worked. If I badly need to optimize it further, what can I do?

Best regards,

May

0 Allen Lee over 13 years ago in reply to may may92122

Genius 3770 points

Hi May,

In this situation, I think you need hand-written assembly. You would assign the registers and arrange the pipeline yourself in order to use the computational resources as fully as possible. It will be more complex and time-consuming than linear assembly, but also more effective.

Allen

0 RandyP over 13 years ago in reply to Allen Lee

TI__Guru* 84110 points

May,

This thread shows a specific linear assembly test routine and a specific C-code benchmarking routine. Your original questions and the insightful answers were all for those specific code examples.

may may92122 said:
The length of the array in my test is 264, ... after o2 optimization level was chosen , the CPU cycle of c code is 440, and the CPU cycle of linear assembly code is 962, ...

You are now talking about completely different program code, both the linear assembly and the main() function in C. The linear assembly example was a trivial one that you would never use in a real application.

And now you seem to have 2 versions of the same routine, one in C and one in linear assembly. This has not been shown in any of your posts for this thread.

It is no longer clear what your question is, at least not to me. Chad and Allen may know exactly what you are doing, but I do not.

Regards,
RandyP

0 Chad Courtney over 13 years ago in reply to RandyP

TI__Mastermind 30825 points

Randy,

I'd have to concur, it's difficult to tell specifically what's being referenced since it's not the code that was originally being discussed here.

May,

You may want to start another thread regarding the optimization, but you'll want to do so in the C/C++ Compiler Forum, which includes coverage for the assembler and linear assembly as well. That said, I'll note that linear assembly still requires you to 'unroll' the loop to give the tools the flexibility to build optimal code. The compiler itself is designed to generate highly optimized code if given the freedom to do so, and it's recommended not to go to assembly or linear assembly unless necessary, to keep your code as portable as possible.

Best Regards,

Chad

0 may may92122 over 13 years ago in reply to Chad Courtney

Expert 1030 points

I see now, thanks everyone.
