[FAQ] TDA4AH-Q1: How to Generate an Assembly file and perform analysis

Part Number: TDA4VH-Q1


Hi Team,

How can I generate an assembly file from C7x code and calculate the total cycle count using the software pipeline information found in the assembler file?

Regards,

Betsy Varughese

  • Hi,

    The steps below show how to generate an assembly file from C7x code and how to estimate the total cycle count from the software pipeline information that the compiler writes into that file.

    Reference link: https://www.ti.com/lit/ug/spruig8g/spruig8g.pdf

    To generate the assembly file from C7x code, you can use the --keep_asm compiler option. This option ensures that the assembly language (.asm) file is retained in the respective folders after compilation.

    Example: As part of the J784S4 SDK (https://www.ti.com/tool/PROCESSOR-SDK-J784S4), the dsplib includes several kernels. One such example is the DSPLIB_addConstant kernel, located at "dsplib/src/DSPLIB_addConstant".

    The code can be built using the standard SDK build commands. For detailed steps, please refer to the FAQ: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1549465/faq-build-and-run-instructions-for-mathlib-dsplib-and-fftlib.

    A sample code snippet from the DSPLIB_addConstant kernel is attached for reference:

    Assembly file generation is enabled in the CMake build. Once the build completes for the selected SoC, the corresponding .asm file is generated automatically and placed in the build folder ("dsplib/build/bin/DSPLIB_addConstant/DSPLIB_addConstant_ci.asm").

    Below is the .asm file corresponding to the above code snippet.

    ;*----------------------------------------------------------------------------*
    ;*  SOFTWARE PIPELINE INFORMATION
    ;*    Loop found in file               : /home/betsy/workspace/ti_c7x_sdk/C7x_dsplib/dsplib/src/DSPLIB_addConstant/DSPLIB_addConstant_ci.cpp
    ;*    Loop source line                 : 196
    ;*    Loop opening brace source line   : 196
    ;*    Loop closing brace source line   : 205
    ;*
    ;*    Summary:
    ;*      - The compiler successfully software pipelined this loop.
    ;*
    ;*    Known Minimum Iteration Count    : 1
    ;*    Known Max Iteration Count Factor : 1
    ;*    Loop Carried Dependency Bound(^) : 0
    ;*    Unpartitioned Resource Bound     : 1
    ;*    Partitioned Resource Bound       : 1 (pre-sched)
    ;*
    ;*    Searching for software pipeline schedule at ...
    ;*      ii = 1  Schedule found with 2 iterations in parallel
    ;*
    ;*    Partitioned Resource Bound(*)    : 1 (post-sched)
    ;*
    ;*      Constant Extension #0 Used [C0]  : 0
    ;*      Constant Extension #1 Used [C1]  : 0
    ;*
    ;*
    ;*    Resource Partition (may include "post-sched" split/spill moves):
    ;*
    ;*                                                A-side   B-side
    ;*
    ;*      .L units                                     0        0     
    ;*      .S units                                     0        0     
    ;*      .M units                                     0        0     
    ;*      .N units                                     0        0     
    ;*      .D units                                     1        -     
    ;*      .C units                                     -        0     
    ;*      .P units                                     -        0     
    ;*
    ;*      .M/.N units                                  0        0     
    ;*      .L/.S units                                  0        0     
    ;*      .L/.S/.C units                               0        0     
    ;*      .L/.S/.C/.M units                            0        1     
    ;*      .L/.S/.C/.M/.D units                         1        0     
    ;*
    ;*      .X cross paths                               0        0     
    ;*
    ;*      Bound(.C2)                                   -        0     
    ;*      Bound(.P2)                                   -        0     
    ;*      Bound(.D)                                    1*       -     
    ;*      Bound(.M .N .MN)                             0        0     
    ;*      Bound(.L .S .LS)                             0        0     
    ;*      Bound(.L .S .C .LS .LSC)                     0        0     
    ;*      Bound(.L .S .C .M .LS .LSC .LSCM)            0        1*    
    ;*      Bound(.L .S .C .M .D .LS .LSC .LSCM .LSCMD)  1*       1*    
    ;*
    ;*    Register Usage Tables:
    ;*
    ;*      +---+------------------+----------+----------+------------------+----------+----------+
    ;*      | # |       Axx        |   ALx    |   AMx    |       VBxx       |   VBLx   |   VBMx   |
    ;*      +---+------------------+----------+----------+------------------+----------+----------+
    ;*      |   | 0000000000111111 | 00000000 | 00000000 | 0000000000111111 | 00000000 | 00000000 |
    ;*      |   | 0123456789012345 | 01234567 | 01234567 | 0123456789012345 | 01234567 | 01234567 |
    ;*      +---+------------------+----------+----------+------------------+----------+----------+
    ;*      | 0 | ***              |          |          | *                | *        |          |
    ;*      +---+------------------+----------+----------+------------------+----------+----------+
    ;*      +---+------------------+----------+-------+
    ;*      | # |       Dxx        |    Px    | CUCRx |
    ;*      +---+------------------+----------+-------+
    ;*      |   | 0000000000111111 | 00000000 | 0000  |
    ;*      |   | 0123456789012345 | 01234567 | 0123  |
    ;*      +---+------------------+----------+-------+
    ;*      | 0 | *                |          |       |
    ;*      +---+------------------+----------+-------+
    ;*    Done
    ;*
    ;*    Collapsed epilog stages       : 1
    ;*    Collapsed prolog stages       : 1
    ;*    Max amt of load speculation   : 0 bytes
    ;*
    ;*    Minimum safe iteration count    : 1
    ;*    Min. prof. iteration count (est.) : 2
    ;*
    ;*    Mem bank conflicts/iter(est.) : { min 0.000, est 0.000, max 0.000 }
    ;*    Mem bank perf. penalty (est.) : 0.0%
    ;*
    ;*
    ;*    Total cycles (est.)         : 1 + trip_cnt * 1
    ;*----------------------------------------------------------------------------*
    ;*      SINGLE SCHEDULED ITERATION
    ;*
    ;*      ||$C$C87||:
    ;*   0              VADDH   .L2     SE1++,VBL0,VB0    ; [B_L2] |199| 
    ;*     ||   [ A0]   ADDW    .L1     A0,0xffffffff,A0  ; [A_L1] |196| 
    ;*   1              VST32H  .D2     VB0,*D0[SA0++]    ; [A_D2] |204| 
    ;*     ||   [ A0]   B       .B1     ||$C$C87||        ; [A_B] |196| 
    ;*   2              ; BRANCHCC OCCURS {||$C$C87||}    ; [] |196| 
    ;*----------------------------------------------------------------------------*
    ||$C$L1||:    ; PIPED LOOP PROLOG
    ;** --------------------------------------------------------------------------*
    ||$C$L2||:    ; PIPED LOOP KERNEL
    ;          EXCLUSIVE CPU CYCLES: 1
    
       [ A2]   ADDW    .S1     A2,0xffffffff,A2  ; [A_S1] <0,1> collapsing predicate control
    || [ A1]   ADDW    .D1     A1,0xffffffff,A1  ; [A_D1] <0,1> collapsing predicate control
    || [ A0]   B       .B1     ||$C$L2||         ; [A_B] |196| <0,1> 
    || [!A2]   VST32H  .D2     VB0,*D0[SA0++]    ; [A_D2] |204| <0,1> 
    || [ A0]   ADDW    .L1     A0,0xffffffff,A0  ; [A_L1] |196| <1,0> 
    || [ A1]   VADDH   .L2     SE1++,VBL0,VB0    ; [B_L2] |199| <1,0> 
    
    ;** --------------------------------------------------------------------------*
    ||$C$L3||:    ; PIPED LOOP EPILOG
    ;          EXCLUSIVE CPU CYCLES: 1
               PROTCLR                            ; [A_U] 
    ;** --------------------------------------------------------------------------*
    ||$C$L4||:    
    ;          EXCLUSIVE CPU CYCLES: 1
    ;***	-----------------------g4:
    ;*** 206	-----------------------    __se_close(1u);
    ;*** 207	-----------------------    __sa_close(0u);
    ;*** 209	-----------------------    return 0;
    
               RET     .B1     ; [A_B] 
    ||         SACLOSE .C2     0                 ; [B_C] |207| 
    ||         SECLOSE .D2     0x1               ; [A_D2] |206| 
    ||         MVKU32  .L1     0,A4              ; [A_L1] |209| 
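
    The two lines in the listing above that matter most for cycle estimation are "ii = 1  Schedule found ..." and "Total cycles (est.) : 1 + trip_cnt * 1". As a rough sketch, they could be pulled out of a .asm file with a small host-side helper like the one below. The names PipelineInfo and parsePipelineInfo are hypothetical (they are not part of the SDK or the compiler tools), and the regular expressions assume the comment lines are formatted exactly as in this listing.

```cpp
#include <regex>
#include <string>

// Numbers taken from one SOFTWARE PIPELINE INFORMATION comment block.
struct PipelineInfo {
    int ii;             // initiation interval from the "ii = N  Schedule found" line
    int fixedCycles;    // constant term of the "Total cycles (est.)" line
    int cyclesPerTrip;  // multiplier of trip_cnt in the "Total cycles (est.)" line
};

// Hypothetical helper: searches the text of a .asm listing for the first
// "ii = N  Schedule found" line and the first "Total cycles (est.)" line.
// Returns false if either line is missing (for example, when the loop was
// not software pipelined). Assumes the line layout shown in this thread.
bool parsePipelineInfo(const std::string& asmText, PipelineInfo& out) {
    static const std::regex iiRe(R"(ii\s*=\s*(\d+)\s+Schedule found)");
    static const std::regex totalRe(
        R"(Total cycles \(est\.\)\s*:\s*(\d+)\s*\+\s*trip_cnt\s*\*\s*(\d+))");

    std::smatch iiMatch;
    std::smatch totalMatch;
    if (!std::regex_search(asmText, iiMatch, iiRe) ||
        !std::regex_search(asmText, totalMatch, totalRe)) {
        return false;
    }
    out.ii = std::stoi(iiMatch[1].str());
    out.fixedCycles = std::stoi(totalMatch[1].str());
    out.cyclesPerTrip = std::stoi(totalMatch[2].str());
    return true;
}
```

    Reading the whole .asm file into a std::string and passing it to parsePipelineInfo would fill in ii, fixedCycles, and cyclesPerTrip for the first pipelined loop it finds. A file with several loops has several such blocks, so a real tool would need to split the text per loop first.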

    The assembly file can also be generated from the CCS IDE, provided the code is available as a CCS workspace project or as standalone code.

    Steps to generate an assembly file using CCS IDE:

    To retain the generated assembly (.asm) file, enable the option "Keep the generated assembly language (.asm) file (--keep_asm, -k)" under Project Properties > Build > C7000 Compiler > Advanced Options > Assembler Options.

    Once the build is complete, you will find the .asm file in the Debug or Release folder of the project.

    How do I interpret the total cycles reported in the SOFTWARE PIPELINE INFORMATION block of the assembly file?

    Reference document: https://software-dl.ti.com/codegen/docs/c7000/optimization_guide/5_understand_opt/compopt_under_pipelining.html

    A software-pipelined loop consists of three parts: prolog, kernel (steady state), and epilog.

    Initiation interval (ii): the number of cycles between the start of one loop iteration and the start of the next.
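
    To make the initiation interval concrete, here is a small host-side sketch of an idealized schedule in which iteration k starts at cycle k * ii and stays in flight for stages * ii cycles. The names IterationSpan and scheduleIterations are hypothetical, and the model is an assumption for illustration only: it ignores prolog/epilog collapsing, stalls, and memory effects.

```cpp
#include <vector>

// Cycle range occupied by one loop iteration; endCycle is exclusive.
struct IterationSpan {
    int startCycle;
    int endCycle;
};

// Idealized software-pipeline timeline (assumption, not compiler output):
// a new iteration starts every `ii` cycles, and each iteration is in
// flight for `stages * ii` cycles, so `stages` iterations overlap in
// steady state.
std::vector<IterationSpan> scheduleIterations(int tripCount, int ii, int stages) {
    std::vector<IterationSpan> spans;
    for (int k = 0; k < tripCount; ++k) {
        const int start = k * ii;
        spans.push_back({start, start + stages * ii});
    }
    return spans;
}
```

    With ii = 1 and stages = 2 (the listing reports "ii = 1  Schedule found with 2 iterations in parallel"), four iterations start at cycles 0, 1, 2, and 3, and the last one ends at cycle 5. In this simple model, that agrees with the listing's estimate of 1 + trip_cnt * 1 for trip_cnt = 4.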

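    Total cycles = prolog + (iteration count * ii) + epilog, where the iteration count is the loop trip count (trip_cnt in the listing). For the loop above, ii = 1 and the compiler's estimate is "1 + trip_cnt * 1".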
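
    The formula can be evaluated with a few lines of host-side code once the loop's iteration count (trip_cnt in the listing) is known. The sketch below uses hypothetical names (LoopCycles, totalCycles). The listing only reports the combined constant term 1, so how that constant is split between prolog and epilog in the usage note is an assumption.

```cpp
#include <cstdint>

// Cycle parameters of one software-pipelined loop.
struct LoopCycles {
    std::int64_t prolog;  // cycles spent before the kernel reaches steady state
    std::int64_t ii;      // initiation interval of the kernel
    std::int64_t epilog;  // cycles spent draining the pipeline after the kernel
};

// Evaluates: total cycles = prolog + (tripCount * ii) + epilog.
// This is an estimate; it does not account for stalls or cache misses.
std::int64_t totalCycles(const LoopCycles& loop, std::int64_t tripCount) {
    return loop.prolog + tripCount * loop.ii + loop.epilog;
}
```

    For the DSPLIB_addConstant loop above, taking prolog = 0, ii = 1, and epilog = 1 (assumed split of the constant term), totalCycles(LoopCycles{0, 1, 1}, 1024) gives 1025 cycles, the same as evaluating "1 + trip_cnt * 1" with trip_cnt = 1024.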

    Regards,

    Betsy Varughese