How to flush TMS570LC4357 data cache before EMAC dma transfer?

Other Parts Discussed in Thread: HALCOGEN, TMS570LC4357

Hi,

I've been working on integrating FreeRTOS-Plus-TCP with HALCoGen.

My problem is that the EMAC sends empty packets over Ethernet, even though the TX buffer descriptor and the packet buffer look good.

The MPU is enabled and I've tried several options for the RAM region without any success. The only thing that works is disabling the cache, so I think I should flush the dCache before initiating the EMAC transfer.

Could you give me a clue how I should do it?

Thank you,

Szilard

  • Szilard,

    You'll want to refer to the CPU TRM, http://infocenter.arm.com/help/topic/com.arm.doc.ddi0460d/DDI0460D_cortex_r5_r1p2_trm.pdf, or the online HTML version at infocenter.arm.com, where you navigate down to the Cortex-R5 in the left pane.

    See "Cache Maintenance Operations" under section 8.5, "About the caches".

  • Dear Anthony,

    Thank you for the prompt answer. I've tried to keep the memory cache-coherent before initiating the EMAC transfer with the following function:

    _dataSyncBarrier_
            stmfd sp!, {r0}                  ; save r0
            MOV   R0,#0                      ; the value written should be zero
            MCR   P15, #0, R0, C7, C10, #4 ; Data Synchronization Barrier operation
            ldmfd sp!, {r0}                  ; restore r0
            bx      lr
            .endasmfunc

    The only things that work for me so far:

    - Using the off-chip 8 MB SDRAM as the packet buffer with the NORMAL_OIWBWA_SHARED, PRIV_RW_USER_RW_NOEXEC settings.

    - Disabling the dCache

    Do you have any idea about this?

    Thank You very much, best regards:

    Szilard

  • Hi Szilard,

    The data synchronization barrier operation (from ARM DDI 0363C) does this:

    "The purpose of the Data Synchronization Barrier operation is to ensure that all
    outstanding explicit memory transactions complete before any following instructions
    begin. This ensures that data in memory is up to date before the processor executes any
    more instructions."


    But that's not the same as cleaning out the cache.

    If you write data to cached memory and then want the external device (EMAC) to be able to access these values you need to either perform a cache clean operation, or configure the cache as write-through for the region of memory that contains the EMAC buffers.
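The difference between a barrier and a cache clean can be illustrated with a toy model. The sketch below is not Cortex-R5 code: it assumes a single write-back, write-allocate cache line in front of RAM, and every name in it (`toy_cache_t`, `cpu_write`, `dma_read`, and so on) is hypothetical. It only shows why a DMA master such as the EMAC keeps seeing old RAM contents until the line is cleaned, and why the CPU keeps seeing old cached contents after a DMA write until the line is invalidated.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32u

/* Toy model (assumption): one write-back cache line in front of RAM. */
typedef struct {
    uint8_t ram[LINE_SIZE];   /* what a DMA master sees */
    uint8_t line[LINE_SIZE];  /* what the CPU sees while the line is valid */
    bool valid;
    bool dirty;
} toy_cache_t;

/* Load the line from RAM on a miss (write-allocate behaviour). */
static void toy_fill(toy_cache_t *c)
{
    if (!c->valid) {
        memcpy(c->line, c->ram, LINE_SIZE);
        c->valid = true;
        c->dirty = false;
    }
}

/* CPU accesses go through the cache. */
static void cpu_write(toy_cache_t *c, unsigned i, uint8_t v)
{
    toy_fill(c);
    c->line[i] = v;
    c->dirty = true;          /* RAM is now stale */
}

static uint8_t cpu_read(toy_cache_t *c, unsigned i)
{
    toy_fill(c);
    return c->line[i];
}

/* DMA accesses bypass the cache and touch RAM directly. */
static uint8_t dma_read(const toy_cache_t *c, unsigned i) { return c->ram[i]; }
static void dma_write(toy_cache_t *c, unsigned i, uint8_t v) { c->ram[i] = v; }

/* Clean: write a dirty line back to RAM (needed before a DMA read, e.g. TX). */
static void cache_clean(toy_cache_t *c)
{
    if (c->valid && c->dirty) {
        memcpy(c->ram, c->line, LINE_SIZE);
        c->dirty = false;
    }
}

/* Invalidate: drop the line so the next CPU read refetches from RAM
 * (needed after a DMA write, e.g. RX). */
static void cache_invalidate(toy_cache_t *c)
{
    c->valid = false;
    c->dirty = false;
}
```

In this model a barrier would have no counterpart at all: it orders accesses but moves no data, so only `cache_clean` makes a CPU write visible to `dma_read`.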

  • Dear Anthony,
    your answer was really helpful - thank you very much!

    I was able to flush the cache with the following function:

    (From billauer.co.il/.../)

    ;-------------------------------------------------------------------------------
    ; dcacheCleanRange
    ; void _dcacheCleanRange_(unsigned int startAddress, unsigned int endAddress);
    ; Cleans (writes back) every data cache line in [startAddress, endAddress).
    .def _dcacheCleanRange_
    .asmfunc
    _dcacheCleanRange_
    BIC R0, R0, #7 ; align start address down (data cache line size - 1)
    loop: MCR P15, #0, R0, C7, C10, #1 ; clean D entry by address
    ADD R0, R0, #8 ; advance by data cache line size
    CMP R0, R1
    BLO loop ; repeat while address < endAddress
    MCR P15, #0, R0, C7, C10, #4 ; Data Synchronization Barrier
    BX LR
    .endasmfunc
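The address arithmetic of such a range clean can also be sketched in C and checked on a host. This is only an illustration under assumptions: the 32-byte line size is the value documented for the Cortex-R5 data cache (the routine above steps by 8 bytes, which also works but touches each line more than once), and `line_op_fn`, `dcache_for_each_line` and `count_line` are hypothetical names. On the target, the callback would wrap the `MCR P15, #0, Rx, C7, C10, #1` instruction; here it is a plain function pointer so the loop can be tested without hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte data cache line (Cortex-R5). */
#define DCACHE_LINE_SIZE 32u

/* Hypothetical hook: performs one maintenance operation on the cache line
 * that starts at line_addr. */
typedef void (*line_op_fn)(uintptr_t line_addr, void *ctx);

/* Apply op to every cache line overlapping [start, start + len).
 * Returns the number of lines visited. Wrap-around at the top of the
 * address space is not handled in this sketch. */
static size_t dcache_for_each_line(uintptr_t start, size_t len,
                                   line_op_fn op, void *ctx)
{
    if (len == 0u) {
        return 0u;
    }

    /* Align the first address down to a line boundary. */
    uintptr_t addr = start & ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
    uintptr_t end = start + len;      /* exclusive end address */
    size_t count = 0u;

    for (; addr < end; addr += DCACHE_LINE_SIZE) {
        op(addr, ctx);
        count++;
    }
    return count;
}

/* Host-side stand-in for the real clean operation: just counts calls. */
static void count_line(uintptr_t line_addr, void *ctx)
{
    (void)line_addr;
    (*(size_t *)ctx)++;
}
```

For example, a 64-byte packet that starts 4 bytes into a line spans three lines, so `dcache_for_each_line(0x08001004u, 64u, count_line, &n)` visits three addresses. After the loop, the target code would still issue the Data Synchronization Barrier before starting the EMAC transfer.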

    As for the other approach, the first thing I tried was setting the memory type to write-through (and shared), but it still does not work and I don't know why.

    Best Regards,
    Szilard
  • Hi Szilard,

    Super - did the cache clean work for you?

    The MPU memory type change should work, but be aware that a higher MPU region overrules a lower region when regions overlap. So let's say MPU regions #10 and #5 both cover the area of RAM that contains your buffer. MPU #10 is higher priority, so it'll control the attributes of your buffer region, which means that changing just MPU #5 wouldn't have any effect.

    #5 and #10 are just made-up examples. You should check all of the regions that you have enabled, see which ones cover the DMA buffer, and make sure you changed the highest priority one.

    Also note that this provides you a way to make some of the L2RAM WT but the rest WB (for speed). You can for example set up MPU #10 to cover only 1/8th of the RAM as WT, and MPU#9 to cover the rest of the RAM as WB. Then as long as you keep the DMA buffers in the 1/8th that's WT you should be fine, and your stack and heap operations in the WB area will get a little speed boost.
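The overlap rule can be modelled in a few lines of C. This is a simplified sketch, not the HALCoGen API: `mpu_region_t`, `mpu_effective_region` and the attribute tags are hypothetical, sub-region disable bits and the background region are ignored, and the base addresses and sizes in the usage note are illustrative. It only shows that the highest-numbered enabled region covering an address decides the attributes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute tags for illustration only. */
enum { ATTR_WB = 0, ATTR_WT = 1, ATTR_NONCACHED = 2 };

/* Simplified description of one MPU region (assumption: no sub-regions,
 * size below 4 GB so it fits in 32 bits). */
typedef struct {
    uint32_t base;     /* region base address */
    uint32_t size;     /* region size in bytes */
    bool     enabled;
    int      attr;     /* one of the ATTR_* tags above */
} mpu_region_t;

/* Return the index of the region whose attributes apply to addr, or -1 if
 * no enabled region covers it. Regions are searched from the highest index
 * down because a higher-numbered region overrules a lower one. */
static int mpu_effective_region(const mpu_region_t *regions, int count,
                                uint32_t addr)
{
    for (int i = count - 1; i >= 0; i--) {
        /* Unsigned subtraction wraps for addr < base, so one compare
         * checks both bounds. */
        if (regions[i].enabled && (addr - regions[i].base) < regions[i].size) {
            return i;
        }
    }
    return -1;
}
```

With region 3 covering 512 KB at 0x08000000 as write-back and region 5 covering the first 64 KB of the same range as write-through, an address in the first 64 KB resolves to region 5, while the rest of the RAM resolves to region 3. If a third, higher-numbered region also covered the buffer, it would win regardless of what region 5 says, which is the situation worth ruling out when a write-through setting appears to have no effect.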
  • Hi Anthony,

    Yes, flushing cache before EMAC TX and invalidating after EMAC RX works well!

    About the WT solution:

    I've read about the policy for overlapping memory regions, and this was my first attempt in HALCoGen:

    Region 3 covers the internal RAM address space, and changes to it are disabled by default in the TMS570LC4357_FreeRTOS template.

    - First, I created region 5, which overlaps region 3, with a "write through, shared" policy. I checked the generated HL_sys_mpu.asm and it was OK.

    - Second, I tried changing the policy of the region directly in HL_sys_mpu.asm, without any success.

    I checked CP15_CCSIDR, so WT is supported.

    The bonus: in the case of the external SDRAM (region 6), changing the policy works as expected (WT with DMA is OK).

    Another question you might be able to help with: do the Hercules and DaVinci families have the same EMAC circuit?

    Thank You, Best Regards:

    Szilard

  • Hi Szilard,
    Instead of WT, maybe you can make the EMAC TX region non-cacheable. If that doesn't work, then there is definitely an issue with some other MPU region.
    WT has worked for others who have tried it, so I suspect something else is going on in the program.
    Yes, Hercules has basically the same EMAC as DaVinci, and exactly the same as the OMAPL1x / C674x DSPs.
    Thanks and Best Regards, Anthony