66AK2H12: EDMA speed performance decrease due to SRIO

Part Number: 66AK2H12

Hi Champs,

We ran performance tests on SRIO communication and EDMA.

EDMA reads and writes between DDR3A and C66 L2 cache.

SRIO writes data to DDR3A.

<Experiment combinations>

①: Measure the transfer speed with each EDMA channel active (CC0-TC0, CC0-TC1, CC1-TC2, CC1-TC3, CC2-TC2, CC2-TC3) and SRIO disabled.

②: Measure the transfer speed with each EDMA channel active (CC0-TC0, CC0-TC1, CC1-TC2, CC1-TC3, CC2-TC2, CC2-TC3) while receiving data on SRIO.

③: Compare the EDMA speed with SRIO active and inactive.

<Result>

CC0-TC1, CC1-TC2, and CC1-TC3 each slowed down by 0-5% with the SRIO module active.

CC0-TC0, CC2-TC2, and CC2-TC3 each slowed down by 20-25% with the SRIO module active.
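
For reference, the percentages above can be read as relative throughput drops. A minimal sketch of that calculation, assuming throughput is measured in the same unit in both runs (the function name and the sample numbers in the note below are hypothetical, not measured data):

```c
#include <assert.h>

/* Relative throughput drop, in percent, between a baseline run
 * (SRIO disabled) and a loaded run (SRIO receiving data).
 * Both values must use the same unit, e.g. MB/s. */
static double slowdown_percent(double baseline, double loaded)
{
    assert(baseline > 0.0);
    return 100.0 * (baseline - loaded) / baseline;
}
```

For example, a channel that moved 4000 MB/s alone and 3000 MB/s with SRIO active would give `slowdown_percent(4000.0, 3000.0)`, i.e. a 25% drop.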

<Question>

According to the experiments above, the Bridge_SES1 transfer speed did not decrease. On the other hand, the Bridge_SES0 speed decreased significantly.

Could you please tell us what factors could cause the Bridge_SES0 transfer speed to decrease when SRIO is active?

The SRIO communication path is TeraNet3_L→TeraNet3_A→Bridge_9→TeraNet3_C→Bridge_SES2→MSMC→DDR3A.

So we don't think it should affect Bridge_SES0.

  • Hi,

    Please share which software you are using. Which Processor SDK RTOS version, and which example did you base your tests on?

    Also, could you explain what Bridge_SES0 is? I suppose it is an acronym for something, but I cannot find it in the RTOS SW developer's guide or in the K2H documentation. Could you please clarify?

    Best Regards,
    Yordan

  • Hi Yordan,

    Thanks for your reply. Actually, the customer is using a 3rd-party RTOS and measuring on their own board.

    Sorry for the confusion. "Bridge_SES" means BR_SES_x, the module indicated below.

    The customer would simply like to know why activating SRIO affects the "BR_SES_0" transfer speed.

    They would like to understand the background.

  • Hi Yordan,

    Thank you for your support.

    Could you please give us an update?

  • Hi,

    From the K2H datasheet, Table 9-1: SRIO ---- SES2 ---- MSMC. From Figures 9-1 and 9-3: TeraNet3_A→Bridge_9→TeraNet3_C→Bridge_SES2→MSMC→DDR3A. I am not sure how you know it is bridge_9 and not bridge_5, 6, 7, 8, or 10?

    In the same Table 9-1, only EDMA2 CC0 and CC1 access DDR3A via SES2. The rest go through either SES0 or SES1. From your test cases, it looks like there is no resource contention at the TeraNet level.

    However, do you know whether there is any contention at the DDR3A endpoint? What is the DDR3A SODIMM chip speed? Is it 1600 MT/s (12.8 GB/s) or a lower one like 1333 MT/s? When you use EDMA to move data between DDR3A and L2, how many DMA channels do you have running in parallel? I believe 1 channel can create 5-6 GB/s of traffic, so 3 DMA channels can saturate the DDR bandwidth. Meanwhile, how fast is SRIO writing data into DDR?
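
    As a rough check of the arithmetic above, here is a minimal sketch. It assumes a 64-bit DDR3 data bus, which is what 12.8 GB/s at 1600 MT/s implies, and it treats the 5-6 GB/s per-channel figure as an estimate. The function names are illustrative and not from any TI library:

```c
#include <assert.h>
#include <stdint.h>

/* Theoretical peak bandwidth of a DDR interface in MB/s (1 MB = 10^6 bytes).
 * Each transfer moves bus_width_bits / 8 bytes. */
static uint32_t ddr_peak_mbytes_per_s(uint32_t mega_transfers_per_s,
                                      uint32_t bus_width_bits)
{
    assert(bus_width_bits % 8u == 0u);
    return mega_transfers_per_s * (bus_width_bits / 8u);
}

/* Smallest number of equal-rate streams whose combined rate reaches
 * or exceeds the peak (ceiling division). */
static uint32_t streams_to_saturate(uint32_t peak_mbytes_per_s,
                                    uint32_t per_stream_mbytes_per_s)
{
    assert(per_stream_mbytes_per_s > 0u);
    return (peak_mbytes_per_s + per_stream_mbytes_per_s - 1u)
           / per_stream_mbytes_per_s;
}
```

    With these assumptions, `ddr_peak_mbytes_per_s(1600, 64)` gives 12800 MB/s, and `streams_to_saturate(12800, 5000)` gives 3, so a few EDMA channels plus SRIO writes could plausibly contend at the DDR3A endpoint.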

    If you use EDMA to move data between L2 memories (this does not use SES2) while SRIO writes to DDR, do you still see the EDMA slow down?

    Regards, Eric  

  • Any update?

    Regards, Eric

  • Hi Eric,

    Sorry, I have asked the customer and am waiting for their response.

  • Sure, no problem. Just a note that I will be out of office for a while and will come back 07/22. My colleague will help on this topic during my absence.

    Regards, Eric

  • Hi Eric,

    Sorry for the late reply.

    >From the K2H datasheet, Table 9-1: SRIO ---- SES2 ---- MSMC. From Figures 9-1 and 9-3: TeraNet3_A→Bridge_9→TeraNet3_C→Bridge_SES2→MSMC→DDR3A. I am not sure how you know it is bridge_9 and not bridge_5, 6, 7, 8, or 10?

    - Please see the red circle in Table 9-1. We identified bridge_9 from the yellow highlight.

    >What is the DDR3A SODIMM chip speed? Is it 1600 MT/s (12.8 GB/s) or a lower one like 1333 MT/s?

    - 1600 MT/s

    >When you use EDMA to move data between DDR3A and L2, how many DMA channels do you have running in parallel?

    - 16 DMA channels in total.

    >I believe 1 channel can create 5-6 GB/s of traffic, so 3 DMA channels can saturate the DDR bandwidth. Meanwhile, how fast is SRIO writing data into DDR?

    - 1600 MT/s

    >If you use EDMA to move data between L2 memories (this does not use SES2) while SRIO writes to DDR, do you still see the EDMA slow down?

    - They believe the EDMA slows down in this case, which is why they are asking this question.

    Is this enough information for you to investigate?

    Regards,

    Kz777

  • Hi Omori-san

    Eric is out of office for the next 2 weeks.

    I think there is potentially a common point for the traffic within MSMC and DDR3A. Have they tried keeping the EDMA traffic on DDR3A and the SRIO traffic on DDR3B, for example, to see whether that addresses the issue?

    Regards

    Mukul 

  • Hi,

    Sorry for my late response.

    Actually, the customer's system cannot connect SRIO to the DDR3B module.

    Instead, I am providing more detailed data. Please refer to it. If you have any additional suspect points, please let us know.

    Also, we are using Direct I/O. However, we are concerned that "Message" transfer may have been set by mistake.

    Could you please tell us which register values we should check for "Message" transfer? We would like to confirm this as a backup too.

  • Hi Omori-san

    I am hoping to get hold of the SoC architecture expert next Tuesday, and I will see whether they have additional guidance on the information you have shared in the table.

    For completeness, it would be great if you could add the SRC and DST for each EDMA TC and for the SRIO transfers. I know you mentioned transfers from L2/DDR, but it would be good to clearly summarize SRC and DST.

    Additionally, I did not understand your question here:

    >Also, we are using Direct I/O. However, we are concerned that "Message" transfer may have been set by mistake. Could you please tell us which register values we should check for "Message" transfer? We would like to confirm this as a backup too.

    Can you please elaborate further on the exact concern?

  • Hi Mukul,

    >Can you please elaborate further on the exact concern?

    I wanted to check whether SRIO was set up for message passing in addition to Direct I/O at the same time.

    If so, performance would decrease when SRIO and EDMA run at the same time, because some data would go to the Multicore Navigator.

    However, when I looked into the user's guide, it turned out that message passing cannot be enabled with the default settings.

    Do you have any idea why SRIO affects EDMA performance in the performance test?

    The customer uses the same EDMA and L3 addresses in each test.

  • Omori-san

    I was not able to catch the chip architect to discuss the system interconnect (TeraNet) specifics. I am out of office for the remainder of the week and will try to get you an update sometime in the middle of next week.

    I apologize for the delay.

    Regards

    Mukul 

  • Hi Mukul,

    Thank you for your support. Do you have any update?

  • Hi Omori-san

    No update yet. I need more time to understand the design spec.

    It will likely be another day or two.

    Sorry for the delay.

    Regards

    Mukul 

  • Hi Omori-san

    As I previously requested, I need to understand, for each TC, what the source is and what the destination is.

    Your original email said 

    >>EDMA reads and writes between DDR3A and C66 L2 cache.

    So for all EDMA TC, is it read from DDR3A (SRC) and write to C66 L2 (DST)?

    Please confirm.

    Regards

    Mukul 

  • Hi Mukul,

    Thanks for the comment.

    My understanding is that all EDMA TCs transfer between DDR3A and L2. Sometimes they read from DDR3A (SRC) and write to C66 L2 (DST); sometimes they read from C66 L2 (SRC) and write to DDR3A (DST).

  • Hi Omori-san,

    We checked with the design team to understand the connections from the different EDMA CC/TCs and SRIO into MSMC or GEM. While we are still trying to find the path difference across SES_0 and SES_1, I want to check with you the EDMA and SRIO bus priorities you used for the test setup.

    For EDMA CC/TC, the bus priority is set by the QUEPRI registers; the reset value is 0, which means the highest priority. There are CSL APIs to set/get those:

    static inline void CSL_edma3SetEventQueuePriority
    (
        CSL_Edma3Handle hModule,
        Uint8           eventQueue,
        Uint8           priority
    )

    static inline Uint8 CSL_edma3GetEventQueuePriority
    (
        CSL_Edma3Handle hModule,
        Uint8           eventQueue
    )

    For SRIO, the VBUS transaction priority is set in RIO_PER_SET_CNTL; the reset value is 4, with 0 the highest and 7 the lowest. There are also CSL calls to get/set it:

    CSL_SRIO_GetTransactionPriority (CSL_SrioHandle hSrio, Uint8 *priority)

    CSL_SRIO_SetTransactionPriority (CSL_SrioHandle hSrio, Uint8 priority)

     

    Can you confirm which priority settings you used? Also, if you set SRIO to 7 and EDMA to 0, does it help the EDMA traffic get through?
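
    A minimal sketch of the suggested experiment (EDMA event queue at priority 0, SRIO at priority 7). So that it compiles on a host PC, the handle types and the four CSL functions in it are stand-in stubs that only mirror the signatures quoted above. The `void` return type of the SRIO calls, the `FakeEdma3`/`FakeSrio` structs, the queue count of 8, and the helper `favor_edma_over_srio` are assumptions for illustration. On the target, the real CSL headers would replace the stub section.

```c
#include <assert.h>
#include <stdint.h>

/* ---- Host-side stand-ins (assumptions, NOT the real CSL) ---- */
typedef uint8_t Uint8;
typedef struct { Uint8 quepri[8]; } FakeEdma3;   /* hypothetical register model */
typedef struct { Uint8 txn_priority; } FakeSrio; /* hypothetical register model */
typedef FakeEdma3 *CSL_Edma3Handle;
typedef FakeSrio  *CSL_SrioHandle;

static void CSL_edma3SetEventQueuePriority(CSL_Edma3Handle hModule,
                                           Uint8 eventQueue, Uint8 priority)
{
    assert(eventQueue < 8u);
    hModule->quepri[eventQueue] = priority;
}

static Uint8 CSL_edma3GetEventQueuePriority(CSL_Edma3Handle hModule,
                                            Uint8 eventQueue)
{
    assert(eventQueue < 8u);
    return hModule->quepri[eventQueue];
}

static void CSL_SRIO_SetTransactionPriority(CSL_SrioHandle hSrio, Uint8 priority)
{
    hSrio->txn_priority = priority;
}

static void CSL_SRIO_GetTransactionPriority(CSL_SrioHandle hSrio, Uint8 *priority)
{
    *priority = hSrio->txn_priority;
}
/* ---- End of stand-ins ---- */

#define BUS_PRI_HIGHEST 0u  /* 0 is the highest bus priority */
#define BUS_PRI_LOWEST  7u  /* 7 is the lowest bus priority  */

/* Give one EDMA event queue the highest bus priority and SRIO the lowest,
 * then read both settings back. Returns 1 if the read-back matches. */
static int favor_edma_over_srio(CSL_Edma3Handle hEdma, Uint8 eventQueue,
                                CSL_SrioHandle hSrio)
{
    Uint8 srioPri = 0;

    CSL_edma3SetEventQueuePriority(hEdma, eventQueue, BUS_PRI_HIGHEST);
    CSL_SRIO_SetTransactionPriority(hSrio, BUS_PRI_LOWEST);

    CSL_SRIO_GetTransactionPriority(hSrio, &srioPri);
    return CSL_edma3GetEventQueuePriority(hEdma, eventQueue) == BUS_PRI_HIGHEST
           && srioPri == BUS_PRI_LOWEST;
}
```

    Reading the values back after writing them is a cheap way to confirm that the test actually ran with the intended priorities before comparing throughput numbers.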

    Regards, Eric 

  • Hi Omori-san

    As Eric mentioned, we have a follow-up question with the design team.

    From my review of the internal spec, it appears that the path for SRIO should be as follows:

    SRIO→TeraNet3_L→TeraNet3_A→Bridge_8→TeraNet3_C→Bridge_SES1→TNET_SES→MSMC→DDR3A.

    I need to confirm this with the design team and see whether the datasheet is incorrect in showing Bridge 9 and SES2 instead.

    It almost appears that the CC/TC combinations that use SES1 (same as SRIO) work better and interleave better than the CC/TCs that use SES0.

    Hence it would be good to see whether Eric's query on making SRIO higher or equal priority makes any difference.

    Regards

    Mukul 

  • Hi Mukul, Eric,

    Thanks for the good information.

    I will ask the customer and get back to you.

    Regards,

    Kz777

  • Hi Eric,

    Sorry for my late reply.

    I asked the customer to check the priorities as you advised.

    >I want to check with you the EDMA and SRIO bus priorities you used for the test setup.

    1) EDMA = 7, SRIO = 0

    2) We also tried changing to SRIO = 7, EDMA = 0

    I will send the detailed performance data to you by private message.

    Please let us know if you find anything.

    Regards,

    Kaz

  • Kaz-san 

    Limited progress on reviewing the data you shared offline.

    Is this still an open issue with the customer?

    Regards

    Mukul 

  • Hi Mukul,

    Yes, it is still open. If you need additional information, please let us know.

    Regards,

    Kz777

  • Hi Omori-san

    From the data you shared offline, it does appear that putting EDMA at higher priority significantly improves EDMA performance compared to the case where SRIO was running at higher priority.

    From the topology, the common point for SRIO and EDMA accesses to/from DDR is the SES bridges to MSMC.

    So some delay will be incurred, and the paths are not completely independent.

    So I think the updated results look OK overall?

    Let us know if there are more follow-up questions from you or the customer.

    Regards

    Mukul