Intrinsic

I was going through the intrinsics manual for the 66xx. I want to know if there is any intrinsic that accumulates 16-bit complex additions into a 32-bit complex vector.

For example: e + if = (a + ib) + (c + id), where a, b, c, d are 16-bit signed integers and e and f are 32-bit signed integers.

Thanks

Ronak

  • Hi Ronak,

    There are a couple of instructions that do that. For details, see the Instruction Set User's Guide at http://www.ti.com/litv/pdf/sprugh7. These are also available as intrinsics.

    CMPY,CMPYR, CMPYR1, DOTP2, DOTPN2

    Kind regards,

    one and zero

  • To answer your question about complex-number addition precisely: you can use _add2().


  • Hi 

    @Venugopala, _add2 gives a 16-bit output for the addition, without saturation. I want the sum to be stored as a 32-bit signed value.

    Also, CMPY, CMPYR, CMPYR1, DOTP2, and DOTPN2 are all intrinsics for complex multiplication. I want an intrinsic that does a 16-bit complex addition and stores the result as a 32-bit value. I don't want to saturate the result, because I need the precision.

    Regards

    Ronak

  • Hi Ronak,

    Sorry, I overlooked that you are not doing a multiplication. At first glance your formula looked like a complex multiplication.

    Anyhow, _add2 can be used for complex additions, but it stores the result as 16 bits again. So the only option I see is to use the unpack instructions plus the regular 32-bit add.
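The unpack-then-add approach can be sketched in portable C, as a model of the arithmetic rather than C66x code. It assumes a packing the thread does not state (real part in the lower 16 bits, imaginary part in the upper 16 bits of each 32-bit word). The names `cplx32_t`, `sign_extend16`, and `cplx16_add_wide` are hypothetical. On the device, the sign-extension step would be done by whichever unpack instruction the Instruction Set User's Guide provides.

```c
#include <stdint.h>

/* 32-bit complex result: one full 32-bit word per component. */
typedef struct {
    int32_t re;
    int32_t im;
} cplx32_t;

/* Sign-extend the low 16 bits of v to 32 bits (the "unpack" step). */
static int32_t sign_extend16(uint32_t v)
{
    int32_t h = (int32_t)(v & 0xFFFFu);
    return (h ^ 0x8000) - 0x8000;
}

/* Add two packed 16-bit complex values and keep the full 32-bit sums.
   Assumed packing: imaginary part in bits 31..16, real part in bits 15..0. */
static cplx32_t cplx16_add_wide(uint32_t x, uint32_t y)
{
    cplx32_t sum;
    sum.re = sign_extend16(x) + sign_extend16(y);             /* regular 32-bit add */
    sum.im = sign_extend16(x >> 16) + sign_extend16(y >> 16); /* regular 32-bit add */
    return sum;
}
```

Because each component is widened before the add, the worst-case inputs (for example 32767 + 32767) are held exactly instead of wrapping as they would in a packed 16-bit add.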

    Kind regards,

    one and zero

  • Thanks. I also felt that was the only way to do it, but thought I would check once whether an intrinsic was available.

    -Ronak

  • As you stated it, a single int16 addition grows the result by at most one bit. Assuming your inputs leave at least one bit of headroom before saturation, int16 is still exact for your sum. The only reason to go wider might be related to further processing, but I know nothing about that. So I'd first ask whether you really need a 32-bit sum.
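The bit-growth argument can be checked with a small portable C sketch. The helper names `sum_fits_int16` and `growth_bits` are hypothetical, and the growth figure is a worst-case bound assuming every term can reach full scale.

```c
#include <stdint.h>

/* Returns 1 if a + b is exactly representable in int16_t, 0 otherwise. */
static int sum_fits_int16(int16_t a, int16_t b)
{
    int32_t wide = (int32_t)a + (int32_t)b;
    return wide >= INT16_MIN && wide <= INT16_MAX;
}

/* Worst-case number of extra bits needed to hold a sum of n terms:
   ceil(log2(n)). One addition (n = 2) needs one extra bit. */
static unsigned growth_bits(uint32_t n)
{
    unsigned bits = 0;
    while (bits < 32 && ((uint32_t)1 << bits) < n) {
        bits++;
    }
    return bits;
}
```

By this bound, two operands that each stay within 15 bits can be added in int16 without loss, while a 32-bit accumulator covers up to 65536 full-scale 16-bit terms (16 + 16 bits).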

  • For the sake of ideas (not sure if this would be the best one :-), one option might be to take a dot product of your data with constant vectors chosen to select the re/im parts (e.g., 0x00000001 for the real part, 0x00010000 for the imaginary part). Something like this:

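    /* DATA_SIZE (number of packed 32-bit complex samples) must be defined
       elsewhere; the loop consumes four samples per iteration. */
    void accumulate_imre(int *restrict data, int *restrict re, int *restrict im)
    {

        register __x128_t b1,b2;
        long long sr,si,vec1,vec2;
        int sumr=0,sumi=0;
        int i;

        /* Selector constants: a 1 in the halfword to keep, 0 in the other. */
        b1 = _ito128(0x00000001,0x00000001,0x00000001,0x00000001); /* real part */
        b2 = _ito128(0x00010000,0x00010000,0x00010000,0x00010000); /* imaginary part */
        #pragma UNROLL(2)
        for (i=0;i<DATA_SIZE;i+=4)
        {
          /* Aligned 64-bit loads: two packed complex samples each. */
          vec1 = _amem8(&data[i]);
          vec2 = _amem8(&data[i+2]);
          /* Each result holds two 32-bit partial sums (see _hill/_loll below). */
          sr = _ddotpsu4h(_llto128(vec1,vec2),b1);
          si = _ddotpsu4h(_llto128(vec1,vec2),b2);
          sumr += _hill(sr)+_loll(sr);
          sumi += _hill(si)+_loll(si);
        }
        
        *re = sumr;
        *im = sumi;
        return;
    }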
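A scalar reference model can help when checking the intrinsic version on a host machine. The sketch below is plain C, not C66x code, and assumes the packing implied by the selector constants (real part in the lower 16 bits, imaginary part in the upper 16 bits). The name `accumulate_imre_ref` and the explicit `count` parameter are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model: sum the real and imaginary 16-bit parts of `count`
   packed complex samples into 32-bit accumulators.
   Assumed packing: imaginary part in bits 31..16, real part in bits 15..0. */
static void accumulate_imre_ref(const uint32_t *data, size_t count,
                                int32_t *re, int32_t *im)
{
    int32_t sumr = 0;
    int32_t sumi = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        int32_t lo = (int32_t)(data[i] & 0xFFFFu);
        int32_t hi = (int32_t)(data[i] >> 16);
        sumr += (lo ^ 0x8000) - 0x8000;   /* sign-extend the real part */
        sumi += (hi ^ 0x8000) - 0x8000;   /* sign-extend the imaginary part */
    }

    *re = sumr;
    *im = sumi;
}
```

If the packing assumption holds, feeding the same buffer to both routines and comparing the two sums is one way to validate the dot-product trick before tuning it.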

    -Matti