Division intrinsic for two-way or four-way SIMD?

Other Parts Discussed in Thread: MATHLIB

Hi, I am using intrinsics (e.g., dshl2, dsadd2, dccmpyr1) for two-way SIMD on the TMS320C6600. However, I could not find any division intrinsic for two-way or four-way SIMD. Are there any intrinsics for division?

Thanks 

  • Using intrinsics means using assembly instructions through the C compiler's syntax. The Compiler User's Guide has lists of all of the intrinsics available for each of the device instruction set architectures. You can look through there to find all available intrinsics.

    The C66x CPU & Instruction Set Reference Guide has all of the instructions that can execute on the C66x core. Most of those instructions are available as intrinsics, with the possible exception of some instructions that are very commonly used for ordinary operations.

    Division is not a native instruction of the C66x because it is a complex multi-cycle operation, not a RISC-type operation.

    You may search the Compiler User's Guide for "division" to see how that is implemented in the C context.

    Regards,
    RandyP

  • Thanks for the reply.

    I have one more question regarding division.

    Consider the following loop:

    #pragma MUST_ITERATE(2, 600, 2)
    for (i = 0; i < Len; i++)
    {
        itemp0 = _amem4(&inputPtr0[i]);
        itemp1 = _amem4(&inputPtr1[i]);
        //itemp3 = __c6xabi_divi(itemp0, itemp1);
        itemp3 = itemp0 / itemp1;
        _amem4(&outputPtr0[i]) = itemp3;
    }

    I expected this loop to be software pipelined, but it was not. The generated .asm file reports "Disqualified loop: Loop contains a call".

    Is there any way to get software pipelining for a loop that contains a division?

    Regards,
    JP

  • JP,

    Please search the Compiler User's Guide for "division" to see advice on this topic. Section 3.14 of the version of the Compiler User's Guide I am using shows how to get advice for optimization from the compiler, and one example discusses division in particular.

    If your example was made up just to evaluate the compiler, a practical example may be better suited to pipeline optimization; this one probably is not. DSP algorithms commonly avoid division for this reason, but that may be a separate discussion.

    Perhaps you can adapt the division routine (which probably contains a loop) to make a dual-path version, since you always want to pass through your loop an even number of times.

    In your example, you should be able to avoid using _amem4 everywhere by making sure that the three pointer variables are of type (int *). It would make the code more readable and more C-like.

    Regards,
    RandyP
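
The two suggestions in the reply above (plain typed pointers instead of _amem4, and a dual-path division routine) could be sketched together roughly as follows. This is only an illustration in portable C, not the TI run-time support routine: the names udiv32_x2 and divide_arrays_u32 are hypothetical, the operands are assumed to be unsigned 32-bit values with non-zero divisors, and whether the C66x compiler software-pipelines the result has not been verified here, so the generated .asm file should still be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a TI routine): restoring shift-subtract division
 * of two independent unsigned pairs in one fixed 32-step loop.
 * Assumption: both divisors are non-zero. */
static inline void udiv32_x2(uint32_t n0, uint32_t d0,
                             uint32_t n1, uint32_t d1,
                             uint32_t *q0, uint32_t *q1)
{
    uint64_t r0 = 0, r1 = 0;   /* 64-bit remainders so the shift cannot overflow */
    uint32_t a0 = 0, a1 = 0;   /* quotient accumulators */

    for (int bit = 31; bit >= 0; bit--) {
        /* Bring down the next numerator bit on both paths. */
        r0 = (r0 << 1) | ((n0 >> bit) & 1u);
        r1 = (r1 << 1) | ((n1 >> bit) & 1u);

        /* Subtract the divisor where it fits and set the quotient bit. */
        if (r0 >= d0) { r0 -= d0; a0 |= (uint32_t)1 << bit; }
        if (r1 >= d1) { r1 -= d1; a1 |= (uint32_t)1 << bit; }
    }
    *q0 = a0;
    *q1 = a1;
}

/* Plain typed pointers replace the _amem4 accesses. The trip count is
 * assumed to be even, matching the MUST_ITERATE(2, 600, 2) promise. */
void divide_arrays_u32(const uint32_t *restrict in0,
                       const uint32_t *restrict in1,
                       uint32_t *restrict out, size_t len)
{
    assert(len >= 2 && len % 2 == 0);

    for (size_t i = 0; i < len; i += 2) {
        /* No function call remains once the helper is inlined. */
        udiv32_x2(in0[i], in1[i], in0[i + 1], in1[i + 1],
                  &out[i], &out[i + 1]);
    }
}
```

The fixed 32-step loop trades speed on small operands for a predictable trip count with no early exit. Signed operands, as in the original loop, would need their signs stripped before the call and reapplied to the quotients afterwards.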

  • Jaeyong Park said:

    Are there any methods for applying software pipeline in division case?

    You can try the division routine from mathlib. The sources are available as part of the MCSDK (mathlib_c66x_3_0_0_0\packages\ti\mathlib\src\divsp\c66\divsp_i.h) and can be inlined. I think its precision is lower than that of the default libc routine, but it may be adequate for your needs. It is based on the RCPSP intrinsic, which computes an approximate 32-bit float reciprocal (the closest thing to a hardware division that the C6678 offers).
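
The general idea of dividing through an approximate reciprocal can be sketched as below. This is not the mathlib source: div_by_recip and approx_recip are hypothetical names, the refinement is the standard Newton-Raphson step x = x * (2 - b * x), and the use of the _TMS320C6600 macro and the c6x.h header to reach _rcpsp is an assumption about the TI toolchain. On any other compiler the sketch falls back to a host stand-in that deliberately truncates an exact reciprocal to 8 mantissa bits, purely so the refinement can be exercised. Zero, infinite, and denormal divisors are not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(_TMS320C6600)
#include <c6x.h>                  /* assumed to provide the _rcpsp intrinsic */
#define approx_recip(x) _rcpsp(x)
#else
/* Host stand-in (an assumption, for testing only): an exact reciprocal with
 * the mantissa truncated to 8 bits, to mimic a coarse hardware estimate. */
static inline float approx_recip(float x)
{
    float r = 1.0f / x;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFF8000u;          /* keep sign, exponent, top 8 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}
#endif

/* Hypothetical name: a / b from a reciprocal estimate refined by two
 * Newton-Raphson steps. The divisor must be non-zero and finite. */
static inline float div_by_recip(float a, float b)
{
    assert(b != 0.0f);

    float x = approx_recip(b);
    x = x * (2.0f - b * x);       /* each step roughly doubles the correct bits */
    x = x * (2.0f - b * x);
    return a * x;
}
```

Because the body is straight-line code with no call, a loop built around it has no call to disqualify it from software pipelining. For the integer loop in the question, note that single precision carries only 24 mantissa bits, so quotients of large 32-bit operands would need an extra correction step.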
