How to load single-precision data in linear assembly code?

I wrote a linear assembly function as follows:

          .def    _test

_test:    .cproc  a_0
          .reg    val_a0
          LDW     *a_0++, val_a0
          .return val_a0
          .endproc

Then, in another file in the project, I call this function:

void main()
{
    int   a = 1;
    float b = 1.0;
    int   temp1 = 0;
    int   temp2 = 0.0;

    temp1 = test(&a);
    printf("%d\n", temp1);

    temp2 = test(&b);
    printf("%f\n", temp2);
}

I ran the above code on EVM6678, and the following result appeared in the console window:

1

1065353216.00000

It seems that LDW does not work correctly with single-precision data. Can somebody tell me why?

  • You have declared temp2 as an int, so when the value is returned, either the assignment to temp2 or your printf is treating an int as a float.

    Best Regards,
    Chad

  • Chad, thanks, but I checked the code and found that I mistyped it in my post. The original code is as follows:

              .def    _test

    _test:    .cproc  a_0
              .reg    val_a0
              LDW     *a_0++, val_a0
              .return val_a0
              .endproc

    Then, in another file in the project, I call this function:

    void main()
    {
        int   a = 1;
        float b = 1.0;
        int   temp1 = 0;
        float temp2 = 0.0;

        temp1 = test(&a);
        printf("%d\n", temp1);

        temp2 = test(&b);
        printf("%f\n", temp2);
    }

    I ran the above code on EVM6678, and the following result appeared in the console window:

    1

    1065353216.00000

    It seems that LDW does not work correctly with single-precision data. Can somebody tell me why?

  • Hi May,

    Did you declare a prototype for the function? For example:

    float test(void *a);

    If not, the return value of test(&b) is treated as an int and then converted to float by the INTSP instruction.

    Allen
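Allen's point can be reproduced in plain C on a host machine. The sketch below is not the project code from this thread: the function names are hypothetical, and it assumes a 32-bit IEEE 754 float. It illustrates that taking the bit pattern of 1.0f as an integer and then converting that integer by value yields the 1065353216 seen in the console output, while keeping the bits unchanged yields 1.0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the raw 32-bit pattern of a float, i.e. what a
   32-bit load such as LDW would place in a register. */
uint32_t raw_bits_of_float(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);   /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);    /* copy the bits, no conversion */
    return bits;
}

/* Models a call without a prototype: the returned word is taken to be an
   int and is then converted by value to float. */
float as_if_int_return(float stored)
{
    int returned = (int)raw_bits_of_float(stored);
    return (float)returned;            /* value conversion, like INTSP */
}

/* Models a call with `float test(void *a);` declared: the returned bits
   are used as a float without any conversion. */
float as_if_float_return(float stored)
{
    uint32_t bits = raw_bits_of_float(stored);
    float out;
    memcpy(&out, &bits, sizeof out);
    return out;
}
```

Calling as_if_int_return(1.0f) gives 1065353216.0, because 0x3F800000 is the bit pattern of 1.0f, while as_if_float_return(1.0f) gives 1.0.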

  • I agree with Allen's comments.

    That said, back to the basic question about the LDW assembly instruction. It simply returns the 32-bit value stored at the location being pointed to. It doesn't care whether that value is a float, an int, two packed 16-bit values, etc. It returns the 32 bits exactly as they are stored in memory. It's your type casts and declarations in C that affect how this data is treated.

    If you want to, single-step into the assembly code and find the a_0 register (it will be A4, since the first argument of a function is passed in A4). Look at the memory location pointed to by A4, display it as an SP float in a memory window and see what you observe, then display it again as a plain hex value. Step through the code until the LDW has executed; the data lands in the register 4 single steps after the LDW (I assume it would be the B4 register, but you'll have to look at the code in disassembly to see). You'll see that this is the exact same 32-bit hex data as was in memory, and this is what gets returned.

    Best Regards,
    Chad
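Chad's description of a type-agnostic 32-bit load can be mimicked in portable C. This is only a sketch under stated assumptions: word_views and view_word are hypothetical names, memcpy stands in for the load, and float is assumed to be a 32-bit IEEE 754 type. The same word is exposed as an int, as a float, and as two 16-bit halves without changing a single bit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for several interpretations of one 32-bit word. */
typedef struct {
    int32_t  as_int;    /* the word read as a signed integer */
    float    as_float;  /* the same bits read as a single-precision float */
    uint16_t hi16;      /* upper 16 bits of the word */
    uint16_t lo16;      /* lower 16 bits of the word */
} word_views;

/* Hypothetical helper: interpret one 32-bit word in several ways.
   The bits are copied, never converted. */
word_views view_word(uint32_t word)
{
    word_views v;
    assert(sizeof(float) == sizeof word);    /* assumption: 32-bit float */
    memcpy(&v.as_int, &word, sizeof word);
    memcpy(&v.as_float, &word, sizeof word);
    v.hi16 = (uint16_t)(word >> 16);
    v.lo16 = (uint16_t)(word & 0xFFFFu);
    return v;
}
```

For the word 0x3F800000, the integer view is 1065353216 and the float view is 1.0; which one you see depends only on the declared type, as described above.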

  • Thanks Allen and Chad,

    With your help, I got the right result. On the other hand, I'm disappointed with the outcome. I studied linear assembly in order to improve the processing speed of my code, but I found that I failed.

    The length of the array in my test is 264. With no optimization level selected, the C code takes 11,923 CPU cycles and the linear assembly code takes 8,350. With the -o2 optimization level selected, the C code takes 440 CPU cycles and the linear assembly code takes 962, which is more than twice the C code! Does this mean it is that hard to optimize the code? Following spru187t, I tried the optimization methods in section 3 and section 4, but apart from the optimization levels that come with the compiler, none of the methods worked. If I badly need to optimize it further, what can I do?

    Best regards,

    May

  • Hi May,

    In this situation, I think you need hand-written assembly code. You have to assign the registers and schedule the pipeline yourself in order to use the computational resources as fully as possible. It is more complex and time-consuming than linear assembly, but also more effective.

    Allen

  • May,

    This thread shows a specific linear assembly test routine and a specific C-code benchmarking routine. Your original questions and the insightful answers were all for those specific code examples.

    may may92122 said:
    The length of the array in my test is 264, ... after o2 optimization level  was chosen , the CPU cycle of c code is 440, and the CPU cycle of linear assembly code is 962, ...

    You are now talking about completely different program code, both the linear assembly and the main() function in C. The linear assembly example was a trivial one that you would never use in a real application.

    And now you seem to have two versions of the same routine, one in C and one in linear assembly. Neither has been shown in any of your posts in this thread.

    It is no longer clear what your question is, at least not to me. Chad and Allen may know exactly what you are doing, but I do not.

    Regards,
    RandyP

  • Randy,

    I'd have to concur. It's difficult to tell specifically what's being referenced, since it's not the code that was originally being discussed here.

    May,

    You may want to post another thread about the optimization, but you'll want to do so in the C/C++ Compiler Forum, which also covers the assembler and linear assembly. That said, I'll note that linear assembly still requires you to 'unroll' the loop to give the tools the flexibility to build optimal code. The compiler itself is designed to generate highly optimized code if given the freedom to do so, so it's recommended not to go to assembly or linear assembly unless necessary, which also keeps your code as portable as possible.

    Best Regards,

    Chad
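May's actual routine is not shown in this thread, so the following is only a hypothetical kernel (a sum over a float array) meant to illustrate what 'unrolling' a loop looks like when staying in C. The name sum_unrolled4, the unroll factor of 4, and the requirement that the length be a multiple of 4 (true for the 264 elements mentioned above) are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: sum n floats, with the loop body unrolled by 4.
   Four independent accumulators give the scheduler four additions per
   iteration that do not depend on each other. */
float sum_unrolled4(const float *restrict x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;

    assert(n % 4 == 0);   /* assumption: length is a multiple of 4 */

    for (size_t i = 0; i < n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    /* Combine the partial sums once, after the loop. */
    return (s0 + s1) + (s2 + s3);
}
```

Note that splitting a floating-point sum across several accumulators changes the order of the additions, so the result can differ in the last bits from a simple sequential loop. Whether this form actually beats the compiler's own output at -o2 has to be measured on the target.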

  • I see now, thanks everyone.