DM814x EMAC 3PGSWRXINT0 IRQ delay

Hello all,

I have an issue with the timing of the 3PGSWRXINT0 (EMAC Switch Receive) IRQ of DM814x.

We are using a custom board with a DM8147 processor and a DP83848 PHY. Our firmware is based on SYS/BIOS 6.35.06.56 and NDK 2.25.00.09.

Among other things, we are running a Profinet stack on this device with a cycle period of 1 ms.

Sometimes I see an interval of 2 to 3 ms between two consecutive Profinet packets in the "EMAC Switch Receive" IRQ, although they should arrive 1 ms apart. Using a network TAP, I verified that the interval between the packets is exactly 1 ms on the cable.

When this happens, a single "EMAC Switch Receive" IRQ typically delivers two Profinet packets (which were actually transmitted 1 ms apart on the cable) plus one to four other TCP packets (usually TCP ACKs of a parallel TCP transfer) from the EMAC. That IRQ call takes approximately 30-50 µs, including the time-expensive logging added to analyze this issue. This means the first of these two Profinet packets is received with a delay of at least 1 ms. (If this delay grows to 2 ms, it causes a logical timeout in the Profinet stack, which is a serious problem for our application.)
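A gap like the one described above can be caught at run time with a small inter-arrival monitor called from the receive path. The following is only a sketch under assumptions: `gap_monitor_t` and its functions are hypothetical names (not part of SYS/BIOS or the NDK), the timestamp is assumed to come from a free-running 32-bit microsecond counter, and the 1500 µs limit used in the note below is just an example threshold for a 1 ms cycle.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper, not part of SYS/BIOS or the NDK. */
typedef struct {
    uint32_t last_us;     /* timestamp of the previous cyclic frame */
    uint32_t max_gap_us;  /* largest gap observed so far */
    uint32_t late_count;  /* number of gaps above the limit */
    bool     primed;      /* false until the first frame has been seen */
} gap_monitor_t;

static void gap_monitor_init(gap_monitor_t *m)
{
    m->last_us = 0u;
    m->max_gap_us = 0u;
    m->late_count = 0u;
    m->primed = false;
}

/*
 * Record the arrival of one cyclic frame.
 * now_us is assumed to be a free-running microsecond counter; the
 * unsigned subtraction keeps the gap correct across a counter wrap.
 * Returns true if the gap to the previous frame exceeded limit_us.
 */
static bool gap_monitor_update(gap_monitor_t *m, uint32_t now_us,
                               uint32_t limit_us)
{
    if (!m->primed) {
        m->last_us = now_us;
        m->primed = true;
        return false;
    }

    uint32_t gap = now_us - m->last_us;
    m->last_us = now_us;

    if (gap > m->max_gap_us) {
        m->max_gap_us = gap;
    }
    if (gap > limit_us) {
        m->late_count++;
        return true;
    }
    return false;
}
```

Calling `gap_monitor_update(&m, now, 1500u)` for each Profinet frame taken out of the receive queue would flag the 2-3 ms intervals without the cost of per-packet logging.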

I already checked whether there are any conflicts in handling the IRQ, and everything seems to be fine. There should be no other IRQ and no "IRQ disable" during that interval that could block handling of the "EMAC Switch Receive" IRQ for such a long time.

What could be the reason for an obviously delayed "EMAC Switch Receive" IRQ?

Is there any known issue in the receiver HW?

What could I check to find out what happens in detail?

What could I do to get the "EMAC Switch Receive" IRQ in time?

Thanks and best regards,

Lars

  • Hi,

    Are you running Linux on the ARM?

    Thank you

    Cesar

  • Hi,

    no, we are running SYS/BIOS and NDK on the ARM, SYS/BIOS on the DSP and IPC on both cores.

    All network communication is handled by the ARM core.

    Lars
  • Lars,

    There is no erratum (known issue) for the EMAC RX HW. The delay might occur because, on a heavily loaded system, a very large number of interrupts occurs within a given period of time. You can handle this case with the "interrupt pacing" feature. You can find more information in the DM814x TRM, sections:

    9.2.5.1 Transmit Packet Completion Pulse Interrupt (TX_PULSE)
    9.2.6 Interrupt Pacing

    Also the below wiki pages discuss EMAC pacing:

    processors.wiki.ti.com/.../TI81XX_PSP_04.04.00.02_Feature_Performance_Guide
    processors.wiki.ti.com/.../Linux_Core_CPSW_User's_Guide

    Regards,
    Pavel
  • Hi Pavel,

    thanks for the hints about interrupt frequency and interrupt pacing, but I'm sure this is not the problem.

    Ethernet is operating at 100 Mbit/s, so the number of packets to handle is quite limited anyway.
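The upper bound on the packet rate can be worked out from the link speed. The sketch below assumes standard Ethernet framing (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap on top of each frame); the function name is hypothetical.

```c
#include <stdint.h>

/* Per-frame overhead on the wire in standard Ethernet framing. */
#define ETH_PREAMBLE_SFD_BYTES 8u
#define ETH_IFG_BYTES          12u

/*
 * Maximum number of frames per second a link can carry when every
 * frame has the given size (frame_bytes includes header and FCS).
 */
static uint32_t max_frames_per_second(uint32_t link_bps, uint32_t frame_bytes)
{
    uint32_t wire_bits =
        (frame_bytes + ETH_PREAMBLE_SFD_BYTES + ETH_IFG_BYTES) * 8u;
    return link_bps / wire_bits;
}
```

For a 100 Mbit/s link this gives 148809 frames per second with minimum-size 64-byte frames (about 148 per millisecond) and 8127 frames per second with 1518-byte frames.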

    Using the HWI hook set, I checked which interrupts occur during the 1 ms in which I miss an Rx IRQ. Typically it looks like this:

    • 6x EMAC TX IRQ (42)
    • 1x TIMER1 IRQ (67)
    • 2x TIMER3 IRQ (69)

    Even in the 1 ms before this period it looks quite similar:

    • 2x EMAC RX IRQ (41)
    • 5x EMAC TX IRQ (42)
    • 1x TIMER1 IRQ (67)
    • 2x TIMER3 IRQ (69)

    Each IRQ is handled within 10 to 25 µs.

    During these 2 ms I also see two SWIs running for approx. 10 µs and approx. 20-25 task switches. During the last 250 µs before the "delayed" EMAC Rx IRQ, the IDLE task is running.

    I believe this is not too much load and would not require interrupt pacing. I'm even afraid interrupt pacing could make it worse, since the application needs to be notified of received Profinet packets very quickly. I also checked that interrupt pacing is actually off, so it can't be the reason for the delayed delivery of the EMAC Rx IRQ.
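The interrupt counts listed above can be turned into a rough worst-case load figure. This is a sketch only: the struct and function names are hypothetical, and it pessimistically charges every interrupt with the 25 µs upper bound quoted above.

```c
#include <stdint.h>

/* One line of the per-window interrupt tally (hypothetical type). */
typedef struct {
    uint16_t irq;    /* interrupt number, e.g. 41 = EMAC RX */
    uint16_t count;  /* occurrences within the window */
} irq_count_t;

/*
 * Worst-case time spent in interrupt handlers during one window,
 * assuming every interrupt takes max_us_per_irq microseconds.
 */
static uint32_t worst_case_busy_us(const irq_count_t *entries, uint32_t n,
                                   uint32_t max_us_per_irq)
{
    uint32_t total = 0u;
    for (uint32_t i = 0u; i < n; i++) {
        total += entries[i].count;
    }
    return total * max_us_per_irq;
}
```

With the figures from this post, the window with the missing Rx IRQ (6 + 1 + 2 = 9 interrupts) comes to at most 225 µs, and the window before it (2 + 5 + 1 + 2 = 10 interrupts) to at most 250 µs, i.e. about a quarter of each 1 ms window even in the worst case.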

    Any other ideas?

    Best regards,
    Lars

  • Lars,

    Several points:

    - try with running the DM814x device at higher OPP, thus increasing the frequency of A8, DDR3, L3/L4 interconnect, EMAC

    - configure higher priority for EMAC at device level

    - configure higher priority for RX packets at EMAC level, see registers CPSW_THRU_RATE, CPDMA_RX_CH_MAP, section 9.2.1.3, table 9-13

    - enable EMAC statistics, see section 9.2.8, register CPSW_STAT_PORT_EN, register group CPSW_STATS

    - try with DMA instead of IRQ

    - disable all other applications, leave only EMAC active and see if this will fix the issue. Configure higher priority for EMAC appl and EMAC RX irq. Disable all other irqs, leave EMAC RX irq only.

    Regards,

    Pavel

  • Hi Pavel,

    thanks for your response. It will take me a while to check all your ideas, especially as I won't be in the office during the next days.

    What do you mean by "try with DMA instead of IRQ"? I thought received data is always transferred from PHY/EMAC to RAM by DMA, and that I am notified by the Rx IRQ when a packet has been received (i.e. the DMA transfer is finished). Is there any way to configure the use of DMA? Is there any way to find out what happens on the DMA, i.e. when a transfer starts, finishes or is stalled?

    Best regards,
    Lars

  • Hi Pavel,

    I checked / tried some of the points you suggested:

    Pavel Botev said:


    - try with running the DM814x device at higher OPP, thus increasing the frequency of A8, DDR3, L3/L4 interconnect, EMAC



    We are already running at highest OPP166.

    Pavel Botev said:


    - configure higher priority for EMAC at device level



    I set "3PGSW initiator priority" to 3 in bits 1-0 of the INIT_PRIORITY_1 register (0x4814060C). There was no change in behavior. Is that what you meant?
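For reference, a read-modify-write of such a register field could look like the following sketch. The address and the bit position (bits 1-0) are taken from the post above; the macro and function names are hypothetical and the code is not taken from any TI driver.

```c
#include <stdint.h>

/* Address and field position as quoted in the post (not verified here). */
#define INIT_PRIORITY_1_ADDR  0x4814060Cu
#define PRIO_3PGSW_SHIFT      0u  /* bits 1-0 */
#define PRIO_3PGSW_WIDTH      2u

/*
 * Return reg with the field [shift + width - 1 : shift] replaced by
 * value; all other bits are preserved. width must be less than 32.
 */
static uint32_t field_insert(uint32_t reg, uint32_t shift, uint32_t width,
                             uint32_t value)
{
    uint32_t mask = ((1u << width) - 1u) << shift;
    return (reg & ~mask) | ((value << shift) & mask);
}

/* Read-modify-write of one field in a memory-mapped register. */
static void reg_update_field(volatile uint32_t *reg, uint32_t shift,
                             uint32_t width, uint32_t value)
{
    *reg = field_insert(*reg, shift, width, value);
}
```

On the target, the call would be along the lines of `reg_update_field((volatile uint32_t *)INIT_PRIORITY_1_ADDR, PRIO_3PGSW_SHIFT, PRIO_3PGSW_WIDTH, 3u)`, assuming the register is accessible from the current privilege mode. Reading the register back afterwards confirms that the write took effect.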

    Pavel Botev said:


    - configure higher priority for RX packets at EMAC level, see registers CPSW_THRU_RATE, CPDMA_RX_CH_MAP, section 9.2.1.3, table 9-13



    Frankly, I didn't understand what the parameters SL_RX_THRU_RATE and CPDMA_THRU_RATE exactly mean or how they could influence the behavior, but I tried the combinations 1/1, 1/7, 7/1 and 7/7. There was no change in behavior either.

    I didn't try changing CPDMA_RX_CH_MAP since we are using only one DMA channel, so I expect it would make no difference.

    Pavel Botev said:


    - enable EMAC statistics, see section 9.2.8, register CPSW_STAT_PORT_EN, register group CPSW_STATS



    I did so and checked the stats after my problem occurred, but the statistics didn't look odd to me. All error counters were 0.

    Pavel Botev said:


    - try with DMA instead of IRQ



    As I already mentioned, we are using DMA (one fixed channel for both Rx and Tx). So far I haven't managed to use different DMA channels for sending and receiving.

    Since the Ethernet packets carry no priority or VLAN tags and we are using only one EMAC port, I'm afraid there is no way to control packet handling through the priority features. As I understand the TRM, all priority configuration is based on packet/VLAN priorities and associated ports.

    Pavel Botev said:


    - disable all other applications, leave only EMAC active and see if this will fix the issue. Configure higher priority for EMAC appl and EMAC RX irq. Disable all other irqs, leave EMAC RX irq only.



    Well, that's not so easy to do - I need the framework to get the 1 ms Profinet transfer running in parallel with other TCP communication. I would have to set up a completely new test framework both on DM8147 and PC (or some other ...). I'm afraid I won't get the resources to do so.



    Is there a way (preferably a simple one ;-)) to observe what the DMA transfer does, e.g. when a transfer is ready and when it actually starts and finishes?


    Thanks and best regards,
    Lars

  • Lars,

    I checked this with the EMAC experts, below is the feedback:


    The interrupt would be delayed if the packet data transfer is delayed because something else is using the bus. Make the cpsw have a higher priority on the bus and see if that takes care of it.


    Regards,
    Pavel
  • Hi Pavel,

    Pavel Botev said:
    Lars,

    The interrupt would be delayed if the packet data transfer is delayed because something else is using the bus. Make the cpsw have a higher priority on the bus and see if that takes care of it.

    As I wrote, I already tried setting "3PGSW initiator priority" to 3 in bits 1-0 of the INIT_PRIORITY_1 register (0x4814060C), without any success.

    Is there anything else I could do to increase the CPSW priority on the bus?

    What could I do to find out what's blocking the bus?

    Best regards,
    Lars

  • Lars,

    There is nothing about the CPSW that changes when the interrupt happens. It happens after the packet is sent. The only priority adjustment is at device level. Can you also try the DMM_PEG_PRIO6 register (0x4E000638), field PRIO48? A value of 0 means highest priority. Please give the CPSW the highest priority there, and also decrease the priority of other DDR initiators that request memory at the same time as the CPSW.

    Regards,
    Pavel
  • Lars,

    Two more hints from the EMAC expert:

    - check the priority of the CPSW transactions on the VBUS
    - make sure it’s getting the transaction bandwidth that is needed

    Regards,
    Pavel