sRIO Test - CPU cycle consumed is high

Hi,

I am trying to benchmark a few sRIO tests on Advantech's "DSPA-8901" card, which runs custom software. The test spans 2 DSP chips: the first DSP sends a 720p YUV stream to the second DSP over the sRIO interface. The 2 DSPs share a direct link, and my test uses sRIO Type 11 message passing.

When I measured the cycles spent receiving the 720p YUV on the second DSP, it turned out that this DSP consumes around 80-85% of the CPU. The numbers are clearly high, and while I try to understand the reason, I would like to know if there are any published benchmarking figures. Could you also suggest a few data points to look at, given the above problem?

Regards,

Niks   

  • Note: the Advantech "DSPA-8901" card has C6678 DSPs.
  • Please take a look at the KeyStone Throughput Performance Guide for SRIO throughput:
    www.ti.com/.../sprabk5a.pdf

    Thanks,

  • Thanks Ganapathi, I will take a look at the benchmarking numbers.

    However, my real concern is the cycles it takes to read the 720p YUV data on DSP 2. For one 720p YUV stream (330 Mbps) it is ~50% of the CPU, and reading two 720p YUV streams (660 Mbps) pushes the cycle usage up to 80%. Do you think this could be related to a port routing issue? I will take a closer look at the TI examples, but if you have any recommendations, I would like to consider those first.

    Thanks,
    Niks
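The figures quoted above can be turned into a few data points worth comparing against the published benchmarks. The sketch below is only illustrative: the thread does not state the pixel format, frame rate, or DSP clock, so it assumes YUV 4:2:0 at 1280x720 and 30 frames per second (which happens to land close to the quoted 330 Mbps), a 1 GHz core clock, and Type 11 messages filled to the 4096-byte maximum payload (16 segments of 256 bytes). All function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stream parameters; none of these are stated in the thread. */
#define FRAME_WIDTH         1280u
#define FRAME_HEIGHT        720u
#define FRAMES_PER_SECOND   30u
#define ASSUMED_CPU_HZ      1000000000ull /* 1 GHz core clock (assumption) */
#define TYPE11_MAX_PAYLOAD  4096u         /* 16 segments x 256 bytes */

/* Bytes in one YUV 4:2:0 frame: full-size luma plus two quarter-size chroma planes. */
static uint64_t yuv420_frame_bytes(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height * 3u / 2u;
}

/* Raw payload bit rate of the video stream, without any sRIO overhead. */
static uint64_t stream_bits_per_second(uint32_t width, uint32_t height, uint32_t fps)
{
    return yuv420_frame_bytes(width, height) * 8u * fps;
}

/* Number of messages needed to carry one frame, rounded up. */
static uint64_t messages_per_frame(uint64_t frame_bytes, uint32_t payload_bytes)
{
    assert(payload_bytes > 0u);
    return (frame_bytes + payload_bytes - 1u) / payload_bytes;
}

/* Average CPU cycles spent per received byte at a given CPU load. */
static double cycles_per_byte(uint64_t cpu_hz, unsigned load_percent,
                              uint64_t bytes_per_second)
{
    assert(bytes_per_second > 0u);
    return ((double)cpu_hz * load_percent / 100.0) / (double)bytes_per_second;
}
```

Under these assumptions one stream is 331,776,000 bit/s, each frame needs 338 messages (about 10,140 messages per second per stream), and a 50% load works out to roughly 12 cycles per received byte. A per-byte cost of that size would be worth checking against where the payload is copied or touched by the CPU on the receive path.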
  • Hi,

    TI does not provide a direct throughput test example for the DSPA-8901 board. Please take a look at the MCSDK SRIO throughput test code. It supports an external SRIO switch.

    MCSDK Path: C:\ti\pdk_C6678_x_x_x_x\packages\ti\drv\exampleProjects\SRIO_TputBenchmarkingTestProject

    Refer to sections "5.4 Single EVM looped back externally using an external SRIO switch" and "9.3 Setting up C-S-C connection mode (core to core, with a SRIO switch)" in the SRIO_Benchmarking_Example_Code_Guide document.

    Doc Path: \ti\pdk_C6678_x_x_x_x\packages\ti\drv\srio\test\tput_benchmarking\docs\SRIO_Benchmarking_Example_Code_Guide.

    I have tested the SRIO example project (SRIO_TputBenchmarkingTestProject) on two C6678 EVMs and compared the results with the values in the throughput document; the results mostly match.

    Thanks,

  • Hi Ganapathi,

    Thank you for your inputs. I started on the Tput benchmarking example projects and ran into a problem.

    1. I did the internal loopback test (C-I-C) and this test worked.

    2. Using a break-out card (CI2EVM_Boc), I then connected my single C6678LE EVM to the BoC and tried the "C-E-C" test. I tweaked the "benchmarking.h" file to set the following, as I am only interested in Type-11 at 3.125G with the port width set to four 1x ports:

    #define SRIO_LANE_SPEED                   srio_lane_rate_3p125Gbps

    #define SRIO_PORT_WIDTH                   srio_lanes_form_four_1x_ports

    #define TESTS_TO_RUN                      srio_type11_tests

    #define USE_LOOPBACK_MODE                 FALSE
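As a rough sanity check on this configuration, the sketch below estimates how much of a single 1x port at 3.125 GBaud the video traffic would occupy. It assumes 8b/10b line coding (8 data bits per 10 line bits) and ignores packet headers, control symbols, and acknowledgements, so the real usable rate is lower; the measured figures in the throughput guide are the better reference. The function names are hypothetical and not part of the PDK.

```c
#include <assert.h>
#include <stdint.h>

/* Data rate of one lane after 8b/10b coding: 8 data bits per 10 line bits. */
static uint64_t lane_data_rate_bps(uint64_t baud)
{
    return baud / 10u * 8u;
}

/* Data rate of a port built from 1, 2, or 4 lanes. */
static uint64_t port_data_rate_bps(uint64_t baud, unsigned lanes)
{
    assert(lanes == 1u || lanes == 2u || lanes == 4u);
    return lane_data_rate_bps(baud) * lanes;
}

/* Share of the port consumed by a stream, in whole percent (rounded down). */
static unsigned port_utilization_percent(uint64_t stream_bps, uint64_t port_bps)
{
    assert(port_bps > 0u);
    return (unsigned)(stream_bps * 100u / port_bps);
}
```

With these assumptions a 1x port at 3.125 GBaud carries at most 2.5 Gbps of data, so a 660 Mbps load uses about 26% of one port before protocol overhead.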

    This test, however, failed for me. I am using pdk_C6678_1_1_2_5, and in the logs I see "[C66xx_1] Error: Transmit is unable to send packets. Packet count: 32". Looking through the logs, I also see that SRIO ports 0 to 3 are reported as NOT operational.

    Output:
    
    [C66xx_1] ********************************
    [C66xx_1] *********** PRODUCER ***********
    [C66xx_1] ********************************
    [C66xx_1] WARNING: Please ensure that the CONSUMER is executing before running the PRODUCER!!
    [C66xx_1] Debug(Core 1): Waiting for SRIO to be initialized.
    [C66xx_0] ********************************
    [C66xx_0] *********** CONSUMER ***********
    [C66xx_0] ********************************
    [C66xx_0] WARNING: Please ensure that the CONSUMER is executing before running the PRODUCER!!
    [C66xx_0] Debug: Waiting for module reset...
    [C66xx_0] Debug: Waiting for module local reset...
    [C66xx_0] Debug: Waiting for SRIO ports to be operational...  
    [C66xx_0] Debug: SRIO port 0 is NOT operational.
    [C66xx_0] Debug: SRIO port 1 is NOT operational.
    [C66xx_0] Debug: SRIO port 2 is NOT operational.
    [C66xx_0] Debug: SRIO port 3 is NOT operational.
    [C66xx_0] Debug:   Lanes status shows lanes formed as four 1x ports
    [C66xx_0] Debug: AppConfig Tx Queue: 0x2a0 Flow Id: 0
    [C66xx_0] Debug: SRIO Driver Instance 0x@00861840 has been created
    [C66xx_0] Debug: Running test in polled mode.
    [C66xx_0] Debug: SRIO Driver handle 0x861840.
    [C66xx_0] 
    [C66xx_0] 
    [C66xx_1] Debug: AppConfig Tx Queue: 0x2a1 Flow Id: 1
    [C66xx_1] Debug: SRIO Driver Instance 0x@00861750 has been created
    [C66xx_1] Debug: Running test in polled mode.
    [C66xx_1] Debug: SRIO Driver handle 0x861750.
    [C66xx_1] 
    [C66xx_1] 
    [C66xx_0] Throughput: (RX side, Type-11, 3.125GBaud, 1X, tab delimited)
    [C66xx_1] Error: Transmit is unable to send packets. Packet count: 32
    [C66xx_1] Debug:   SRIO status: 0x01000000.
    [C66xx_1] Debug:     Packet Response Time-out
    [C66xx_1] Debug:   Port response timeout: 0xff0fff
    [C66xx_1] Debug:   SRIO is an Agent or Slave device.
    [C66xx_1] Debug:   SRIO processing element can issue requests.
    [C66xx_1] Debug:   SRIO device has not been previously discovered.
    [C66xx_1] Debug: Various Register Display:
    [C66xx_1]                                      32211100
    [C66xx_1]                                      17395173
    [C66xx_1]                                      ||||||||
    [C66xx_1]                                      22211000
    [C66xx_1]                                      84062840
    [C66xx_1]                                      --------
    [C66xx_1]                    PCR: 0x02900004:0x00000005
    [C66xx_1]           PER_SET_CNTL: 0x02900014:0x01053800
    [C66xx_1]          PER_SET_CNTL1: 0x02900018:0x00000000
    [C66xx_1]      ERR_RST_EVNT_ICSR: 0x029001e0:0x00000000
    [C66xx_1]              LSUx_Reg6: 0x02900d18:0x00000010
    [C66xx_1]          LSU_STAT_REG0: 0x02900de8:0x00000000
    [C66xx_1]              SP_RT_CTL: 0x0290b124:0xff0fff00
    [C66xx_1]             SP_GEN_CTL: 0x0290b13c:0x40000000
    [C66xx_1]   Port n Control 2 CSR: 0x0290b154:0x02aa0000
    [C66xx_1]           SPn_ERR_STAT: 0x0290b158:0x00000001
    [C66xx_1]                SPn_CTL: 0x0290b15c:0x00600001
    [C66xx_1]            SPx_ERR_DET: 0x0290c040:0x00000000
    [C66xx_1]           SPx_ERR_RATE: 0x0290c068:0x80000000
    [C66xx_1]            LANEn_STAT0: 0x0290e010:0x00004f08
    [C66xx_1]            LANEn_STAT1: 0x0290e014:0x00000000
    [C66xx_1]   PLM Port(n)Implement: 0x0291b080:0x00000000
    [C66xx_1]       PLM_SP(n)_Status: 0x0291b090:0x00000000
    [C66xx_1]        EM_PW_PORT_STAT: 0x0291b928:0x00000000
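One way to read the SPn_ERR_STAT value in the dump above (0x00000001) is to decode its low status bits. The sketch below assumes the standard RapidIO Port n Error and Status CSR layout with bit 0 as the least significant bit: port uninitialized at bit 0, port OK at bit 1, and port error at bit 2. That bit assignment is an assumption here and should be verified against the KeyStone SRIO user guide; the macro, type, and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed SPn_ERR_STAT status bits (LSB = bit 0); verify against the SRIO user guide. */
#define SP_ERR_STAT_PORT_UNINIT (1u << 0)
#define SP_ERR_STAT_PORT_OK     (1u << 1)
#define SP_ERR_STAT_PORT_ERR    (1u << 2)

typedef enum {
    PORT_STATE_OK,
    PORT_STATE_UNINIT,
    PORT_STATE_ERROR,
    PORT_STATE_UNKNOWN
} port_state_t;

/* Classify a port from its error/status register value; an error takes priority. */
static port_state_t decode_port_state(uint32_t sp_err_stat)
{
    if (sp_err_stat & SP_ERR_STAT_PORT_ERR) {
        return PORT_STATE_ERROR;
    }
    if (sp_err_stat & SP_ERR_STAT_PORT_OK) {
        return PORT_STATE_OK;
    }
    if (sp_err_stat & SP_ERR_STAT_PORT_UNINIT) {
        return PORT_STATE_UNINIT;
    }
    return PORT_STATE_UNKNOWN;
}

/* Human-readable name for logging. */
static const char *port_state_name(port_state_t state)
{
    switch (state) {
    case PORT_STATE_OK:     return "ok";
    case PORT_STATE_UNINIT: return "uninitialized";
    case PORT_STATE_ERROR:  return "error";
    default:                return "unknown";
    }
}
```

Under that assumed layout, 0x00000001 decodes as "uninitialized", meaning the port never completed link training, which would be consistent with the "NOT operational" messages earlier in the log.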
    
    I am attaching the complete logs of the failing test.

    Note, I also tried one of the suggestions made on the forum, and that did not help either.

    Could you please let me know what I am missing? I see some jumper settings on the BoC card, and I have left them at factory defaults.

    Thanks,

    Niks

  • Hi Ganapathi,
    Sorry for the bad formatting; inserting the attachment garbled my text. Here is what I meant to say:
    I see the following errors:
    1. "[C66xx_1] Error: Transmit is unable to send packets. Packet count: 32" and

    2. [C66xx_0] Debug: SRIO port 0 is NOT operational.
    [C66xx_0] Debug: SRIO port 1 is NOT operational.
    [C66xx_0] Debug: SRIO port 2 is NOT operational.
    [C66xx_0] Debug: SRIO port 3 is NOT operational.

    PFA the "sRIO-Tput-Type11-Output.txt" that contains the test output.

    Regards,
    Niks
  • Hi,

    Based on your test log above, the SRIO ports are not connected properly. Have you referred to the SRIO benchmarking document (SRIO_Benchmarking_Example_Code_Guide)?

    The SMA connections are looped back for each of the SRIO ports on the breakout card. If external line loopback is enabled, data received on the RX “line” is sent directly back to the TX “line”. Refer to section 2.6.3, "External line loopback", in the SRIO Programming and Performance Data document: 6825.SRIO_Programming_Performance.pdf

    Thanks,

  • Hi,

    Can you please elaborate on what you mean by "the SRIO port are not connected properly"? Is some change needed outside the Tput application code? Section 9.2 in "SRIO_Benchmarking_Example_Code_Guide.doc" says:

    • This mode is for core to core transfers on the same EVM using external loopback. In this mode only one version of the .out file is needed. The .out file will be run on core 0 (the consumer) and core 1 (the producer). It is assumed that the board is properly looped back over the external interface via a SMA break-out board.

    Can you explain what the statement in bold above means?

    Thanks,

    Niks

  • Hi,

    In external line loopback mode, each SRIO port's RX “line” is connected directly back to its TX “line”.

    Refer to section 2.6.3, "External line loopback", in the SRIO Programming and Performance Data document: 6825.SRIO_Programming_Performance.pdf

    Thanks,
  • Niks-

    For P2020 to C66x communication (not inter-C66x as you are asking about), we have a modified high performance SRIO driver for 8901 we use with our DirectCore software.  If you want to experiment with DirectCore on the 8901 board let me know.

    -Jeff
    Signalogic