MAC function in IQmath library

Other Parts Discussed in Thread: TMS320F2812

Dear all,

My customer wants to implement some functions using the MAC instruction and would like to know whether the IQmath library has a corresponding IQ function, such as IQmac.

If not, do you know whether we plan to add one?

The original message is attached below.

I need to use the MAC instruction directly from C code. A while ago you gave me a tip about __IQmpy(x,y,32) and __IQmpy(x,y,31) for writing efficient, optimized code for normalized multiplication. Now I'm wondering whether a library already exists that lets me use the MAC instruction (and MAC with saturation). I expected to find a function like __IQmac(x,y,acc) or similar, but I haven't found anything. Could you suggest a way to write an inline function (not a real function: I want to avoid the call instruction) that uses this optimized instruction? Or, if one exists, could you point me to an appropriate library? It needs to work like LibIQMath, because the parameters could be local, global, or constant.

Cheers,

 

Wei

  • Does anybody know this?

    Thanks

  • Wei,

    There is no _IQmac()-like function. The C28x CPU does not have direct hardware support for an IQmac operation. The most efficient way to code it in C using the IQ library is to do an _IQmpy() followed by the add. This is what an _IQmac() library function would do anyway, if one existed.
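
    As a rough illustration of the multiply-then-add approach, here is a minimal portable C sketch. The names iq_t, iq_mpy, iq_mac, iq_mac_sat and the IQ() macro are hypothetical and are not part of the IQmath library; Q24 is assumed as the format. On the C28x itself, the multiply would be the library's _IQmpy() rather than the 64-bit emulation shown here.

```c
#include <assert.h>
#include <stdint.h>

#define Q 24                      /* assumed Q format for this sketch */

typedef int32_t iq_t;             /* hypothetical stand-in for an IQ value */

/* Convert a floating-point constant to the assumed Q format. */
#define IQ(x) ((iq_t)((x) * (double)(1L << Q)))

/* Portable stand-in for an IQ multiply: 32x32 -> 64-bit product, scaled back.
   Assumes >> on a negative int64_t is an arithmetic shift (true on common compilers). */
static inline iq_t iq_mpy(iq_t a, iq_t b)
{
    return (iq_t)(((int64_t)a * (int64_t)b) >> Q);
}

/* "MAC" written as multiply followed by add. No overflow protection:
   the caller must make sure the sum fits in 32 bits. */
static inline iq_t iq_mac(iq_t acc, iq_t a, iq_t b)
{
    return acc + iq_mpy(a, b);
}

/* Saturating variant: form the sum in 64 bits, then clamp to the int32 range. */
static inline iq_t iq_mac_sat(iq_t acc, iq_t a, iq_t b)
{
    int64_t sum = (int64_t)acc + (((int64_t)a * (int64_t)b) >> Q);
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (iq_t)sum;
}

/* Small self-check of the sketch. */
static inline void iq_mac_self_check(void)
{
    assert(iq_mac(IQ(1.0), IQ(2.0), IQ(3.0)) == IQ(7.0));            /* 1 + 2*3 */
    assert(iq_mac_sat(IQ(100.0), IQ(10.0), IQ(10.0)) == INT32_MAX);  /* clamps high */
}
```

    Because the helpers are static inline, an optimizing build would normally expand them at the call site, which avoids the call instruction the customer is concerned about. Whether that happens depends on the compiler and its options, so check the generated assembly.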

    Regards,

    David

  • Thanks David,

    I've just got an update from the customer. Actually, he wants the C compiler to emit the MAC instruction in its output. Do you know whether it is possible to make the compiler do this?

    Cheers,

    Wei

    Original customer's message:

    The TMS320F2812 DSP supports the MAC instruction, as you can see from “SPRU160C” (CPU and Instruction Set).

    Moreover, browsing the documentation, I just read that with -O3 and (mainly) -mt optimization the compiler generates MAC instructions (see SPRU541B, page 2-17). I remember trying -O3 optimization as suggested by some documentation, but it didn’t work (sorry, I can no longer find the document I read).

    So the “specific question” is how to obtain really optimized code (if that is possible with this compiler), because MAC instructions (and others) don’t appear in the assembly generated by the C compiler.

    As I said in my last mail, I have tried passing compiler options such as -mt and -ms, but they have no effect: my code remains the same. Could DSP/BIOS be the cause? DSP/BIOS forces me to use “-ml”, and I haven’t understood exactly whether this prevents the generation of MAC instructions and some other optimizations.

  • Wei,

    The compiler can generate the MAC instructions, but the C-code has to be written just right.  Here is an example that will generate MAC with -o2 or greater optimizer level (and -mt option):

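    //------------------------
    // int16, int32 and Uint16 are typedefs from the TI device header files.
    extern int16 *xp, *yp;   // 16-bit input operands
    extern int32 *zp;        // 32-bit result

    void MacTest(void)
    {
        Uint16 i;
        int32 tmp;           // local accumulator

        tmp = 0;
        for(i=0; i<100; i++)
        {
            tmp += (int32)*xp++ * (int32)*yp++;
        }
        *zp = tmp;           // single store after the loop
    }
    //------------------------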

    The generated assembly code (compiler v6.0.1) is:

            MOVW      DP,#_yp               ; [CPU_U]
            MOVL      XAR5,@_yp             ; [CPU_] |12|
            MOVW      DP,#_xp               ; [CPU_U]
            MOVL      XAR4,@_xp             ; [CPU_] |12|
            MOVB      ACC,#0                ; [CPU_]
    ;----------------------------------------------------------------------
    ;  14 | tmp += (int32)*xp++ * (int32)*yp++;                                   
    ;----------------------------------------------------------------------
            MOV       P,#0                  ; [CPU_]
            MOVL      XAR7,XAR4             ; [CPU_]
            RPT       #99
    ||      MAC      P,*XAR5++,*XAR7++     ; [CPU_]
            ADDL      ACC,P                 ; [CPU_]
            MOVL      @_xp,XAR4             ; [CPU_]
            MOVW      DP,#_yp               ; [CPU_U]
            MOVL      @_yp,XAR5             ; [CPU_]
    ;----------------------------------------------------------------------
    ;  17 | *zp = tmp;                                                            
    ;----------------------------------------------------------------------
            MOVW      DP,#_zp               ; [CPU_U]
            MOVL      XAR4,@_zp             ; [CPU_] |17|
            MOVL      *+XAR4[0],ACC         ; [CPU_] |17|


    The above code is very deliberate. For example, notice that a temporary local variable 'tmp' holds the accumulation and is assigned to '*zp' only at the end. If tmp were global, or if the accumulation were done directly on the global result, the compiler generates an MPY, ADD, BANZ construct instead of RPT || MAC. I'm a little surprised by this, since the result is not declared volatile (which would force the compiler to store it after each accumulation). But that is what I found when I tested this morning.
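
    To make the two source shapes concrete, here is a small host-side sketch. The names z_global, mac_local_acc, mac_global_acc and the test data are hypothetical, and the functions take array parameters rather than the global pointers used in the example above. Both shapes compute the same value; only the code the compiler generates for them may differ, and that has to be confirmed in the assembly listing for the compiler version in use.

```c
#include <assert.h>
#include <stdint.h>

#define N 4

/* Hypothetical test data for the sketch. */
static const int16_t x_data[N] = { 1, -2, 3, 4 };
static const int16_t y_data[N] = { 5, 6, -7, 8 };

int32_t z_global;   /* global result */

/* Shape 1: accumulate in a local variable, store to the global once. */
void mac_local_acc(const int16_t *x, const int16_t *y, uint16_t n)
{
    int32_t tmp = 0;
    uint16_t i;

    for (i = 0; i < n; i++) {
        tmp += (int32_t)x[i] * (int32_t)y[i];
    }
    z_global = tmp;
}

/* Shape 2: accumulate directly in the global on every iteration. */
void mac_global_acc(const int16_t *x, const int16_t *y, uint16_t n)
{
    uint16_t i;

    z_global = 0;
    for (i = 0; i < n; i++) {
        z_global += (int32_t)x[i] * (int32_t)y[i];
    }
}

/* Self-check: both shapes give 1*5 + (-2)*6 + 3*(-7) + 4*8 = 4. */
void mac_shapes_self_check(void)
{
    mac_local_acc(x_data, y_data, N);
    assert(z_global == 4);
    mac_global_acc(x_data, y_data, N);
    assert(z_global == 4);
}
```

    Compiling each function at -o2 or higher and comparing the two listings is a quick way to see which shape the compiler maps to a repeated MAC.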

    Regards,

    David

  • David,

    Thank you for posting this disassembly. Could you confirm whether the same code would compile to MAC instructions if no optimization is enabled, or if the large memory model (-ml) is used? In other words, apart from the code being specifically written to be MAC-friendly, what other conditions must be met for the compiler to emit MACs?

    Thank you.

  • S,

    You need at least -o2 optimization in my example to get the MAC instruction (this is with compiler v6.0.1). I cannot speak for other code or for older or newer compilers, although the behavior is probably the same. The memory model shouldn't matter, although you should always use the large memory model: all TI libraries are written for it, and it is really what we support.

    The best thing to do is to experiment with the coding yourself and see what the compiler produces for your particular use case.

    Note that you can select build options on a per-file basis in CCS (i.e., you can set the default for the project to be no optimization for example, but set -o2 for the file containing the MAC function).

    Regards,

    David