C6748: using the resource partitions outside loops

Hi everyone,

I'm in the middle of an optimization session within our DSP code.

We came up with the idea of making use of the A/B resource partitions that the compiler reports when it optimizes loops in the code:

*----------------------------------------------------------------------------*
* Resource Partition:
*                                A-side   B-side
* .L units                          2        3
* .S units                          4        4
* .D units                          1        0
* .M units                          0        0
* .X cross paths                    1        3
* .T address paths                  1        0
* Long read paths                   0        0
* Long write paths                  0        0
* Logical ops (.LS)                 0        1     (.L or .S unit)
* Addition ops (.LSD)               6        3     (.L or .S or .D unit)
* Bound(.L .S .LS)                  3        4
* Bound(.L .S .D .LS .LSD)          5*       4
*----------------------------------------------------------------------------*

My questions are:

1. Does the compiler use these partitions "behind the scenes" without informing the user when it comes to code blocks which aren't within any loop?

a. if not, does this mean that only one pipeline partition is used for code which is outside of a loop and 2 partitions are only used within loops?

b. if yes, is there a way to see the asm output of how these partitions are used for each block of code?

2. Is it possible to use them manually in case I have a code block outside of a loop that can be optimized using both partitions?

a. if not, does this mean that I have to "fool" the compiler by putting this code block into a dummy loop format in order to utilize the resource partitions better?

Thanks in advance!

Yehonadav.

  • Hi,

    Thanks for your post.

    To answer your question #1: yes, the compiler uses these partitions. It provides some feedback by default, and additional feedback can be generated with the -mw option.

    In general, the feedback is located in the .asm file that the compiler generates. To view it, you have to enable the -k option, which retains a copy of the compiler's .asm output.

    The section titled "Understanding Feedback" in the C6000 Programmer's Guide, linked below, explains how to read these compiler-generated comments:

    http://www.ti.com/lit/ug/spru198k/spru198k.pdf

    Through understanding feedback, you can quickly tune your C code to obtain the highest possible performance.

    A sample of the assembly output is shown below for reference:

    ;*----------------------------------------------------------------------------*

    ;*   SOFTWARE PIPELINE INFORMATION

    ;*

    ;*      Loop found in file               : abs_diff.cc

    ;*      Loop source line                 : 19

    ;*      Loop opening brace source line   : 19

    ;*      Loop closing brace source line   : 22

    ;*      Loop Unroll Multiple             : 2x

    ;*      Known Minimum Trip Count         : 512                    

    ;*      Known Max Trip Count Factor      : 16

    ;*      Loop Carried Dependency Bound(^) : 0

    ;*      Unpartitioned Resource Bound     : 2

    ;*      Partitioned Resource Bound(*)    : 2

    ;*      Resource Partition:

    ;*                                A-side   B-side

    ;*      .L units                     0        2*    

    ;*      .S units                     0        0    

    ;*      .D units                     1        2*    

    ;*      .M units                     0        0    

    ;*      .X cross paths               0        2*    

    ;*      .T address paths             1        2*    

    ;*      Long read paths              0        0    

    ;*      Long write paths             0        0    

    ;*      Logical  ops (.LS)           0        0     (.L or .S unit)

    ;*      Addition ops (.LSD)          0        0     (.L or .S or .D unit)

    ;*      Bound(.L .S .LS)             0        1    

    ;*      Bound(.L .S .D .LS .LSD)     1        2*    

    ;*

    ;*      Searching for software pipeline schedule at ...

    ;*         ii = 2  Schedule found with 4 iterations in parallel

    ;*

    ;*      Register Usage Table:

    ;*          +-----------------------------------------------------------------+

    ;*          |AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA|BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB|

    ;*          |00000000001111111111222222222233|00000000001111111111222222222233|

    ;*          |01234567890123456789012345678901|01234567890123456789012345678901|

    ;*          |--------------------------------+--------------------------------|

    ;*       0: |   **                           |     ** **                      |

    ;*       1: |   ***                          |    ******                      |

    ;*          +-----------------------------------------------------------------+

    ;*

    ;*      Done

    ;*

    ;*      Loop will be splooped

    ;*      Collapsed epilog stages       : 0

    ;*      Collapsed prolog stages       : 0

    ;*      Minimum required memory pad   : 0 bytes

    ;*

    ;*      Minimum safe trip count       : 1 (after unrolling)

    ;*      Min. prof. trip count  (est.) : 2 (after unrolling)

    ;*

    ;*      Mem bank conflicts/iter(est.) : { min 0.000, est 0.250, max 1.000 }

    ;*      Mem bank perf. penalty (est.) : 11.1%

    ;*

    ;*      Effective ii                : { min 2.00, est 2.25, max 3.00 }

    ;*

    ;*

    ;*      Total cycles (est.)         : 6 + trip_cnt * 2        

    ;*----------------------------------------------------------------------------*

    ;*        SINGLE SCHEDULED ITERATION

    ;*

    ;*        $C$C21:

    ;*   0              LDDW    .D2T2   *B8++,B7:B6       ; |21|

    ;*     ||           LDDW    .D1T1   *A3++,A5:A4       ; |21|

    ;*   1              NOP             4

    ;*   5              SUBABS4 .L2X    B7,A5,B5          ; |21|

    ;*   6              SUBABS4 .L2X    B6,A4,B4          ; |21|

    ;*   7              STDW    .D2T2   B5:B4,*B9++       ; |21|

    ;*     ||           SPBR            $C$C21

    ;*   8              ; BRANCHCC OCCURS {$C$C21}        ; |19|

    ;*----------------------------------------------------------------------------*
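
    For context, here is a minimal C sketch of the kind of loop that could lead to feedback like the above. It is not the actual source of abs_diff.cc: the function name, the signature, and the MUST_ITERATE values are assumptions chosen for illustration, and the build command in the comment is only an assumed invocation. The pragma is guarded so that the same file also compiles with a host compiler.

```c
#include <stdlib.h>

/*
 * Hypothetical byte-wise absolute-difference kernel (illustrative only).
 * Assumed build on the target, keeping the .asm file with verbose
 * software-pipelining feedback:
 *     cl6x -mv6740 -o3 -k -mw abs_diff.c
 */
void abs_diff(const unsigned char *restrict a,
              const unsigned char *restrict b,
              unsigned char *restrict out,
              int n)
{
    int i;

#ifdef _TMS320C6X
    /* Promise to the TI compiler (assumed values): at least 512
       iterations, and the trip count is a multiple of 16. */
    #pragma MUST_ITERATE(512, , 16)
#endif
    for (i = 0; i < n; i++) {
        /* Operands are promoted to int, so the subtraction cannot wrap. */
        out[i] = (unsigned char)abs(a[i] - b[i]);
    }
}
```

    The restrict qualifiers tell the compiler that the three buffers do not overlap, which removes the memory dependencies that would otherwise limit the schedule. On the target, the MUST_ITERATE promise has to hold for every call, so callers with short or odd-sized buffers would need different values or no pragma at all.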

    To address question #2, it is not possible to use them manually.

    Thanks & regards,

    Sivaraj K
