EMAC Cache Coherency Issue on DM6435

I have been tracking down an issue that appears on only a handful of builds of our product: Ethernet packets are sent out of the EMAC on the DM6435 with incorrect checksums. This can happen at a rate as high as 1 in every 10,000 packets. We are using the TI NDK v1.94.1 with the standard dm643x EMAC packet driver.

This issue only appears when we are also acquiring images from an onboard CMOS imager, using the CCDC and VPFE to transfer image data into DDR2 memory. 

We have run several tests which seem to vary the behavior to some extent:

1)       We have disabled interrupts during the call to HwPktTxNext. This did not make the issue occur any more or less often.

2)       We have tried calling OEMCacheCleanSynch prior to the call to OEMCacheClean to see if another cache sync operation might be interfering with the EDMA driver’s, but this also did not change the equation.

3)       We have tried adding a 1us delay between the OEMCacheCleanSynch call and the code block which adds the transmit descriptor to the queue.  This causes the issue to occur much less frequently ( 1 in 10 million ), but it does not go away.

4)       We have tried moving the OEMCacheCleanSynch call to earlier in the function (before the transmit descriptor is filled out).  This causes the issue to occur far less frequently (1 in 100 million), but it does not completely go away.

5)       We have tried adding a dummy read after the OEMCacheCleanSynch operation of 4 bytes from the beginning of the packet, and so far we have run for 25 million cycles without incident.
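The ordering tried in test 5 can be sketched as follows. This is only an illustration under stated assumptions: the two `stub_` functions are hypothetical host-side stand-ins for the NDK's OEMCacheClean and OEMCacheCleanSynch so that the sketch compiles off-target, and `flush_packet_for_dma` is a made-up helper name, not part of the packet driver. Whether the read-back actually guarantees that the write-back has reached DDR2 on the hardware is exactly the open question of this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the NDK cache calls, so the sketch builds on a host.
 * On the DM6435 the real OEMCacheClean / OEMCacheCleanSynch would be called. */
static int g_clean_pending = 0;

static void stub_cache_clean(void *addr, size_t len)
{
    (void)addr;
    (void)len;
    g_clean_pending = 1;   /* a write-back has been started */
}

static void stub_cache_clean_synch(void)
{
    g_clean_pending = 0;   /* the write-back is reported complete */
}

/* Hypothetical helper: write back the packet, wait for completion, then do a
 * dummy read of the first 4 bytes before the descriptor is queued. */
static uint32_t flush_packet_for_dma(volatile uint8_t *pkt, size_t len)
{
    assert(pkt != NULL && len >= 4);

    stub_cache_clean((void *)pkt, len);
    stub_cache_clean_synch();

    /* Byte-wise read avoids any alignment requirement on pkt. */
    return  (uint32_t)pkt[0]
         | ((uint32_t)pkt[1] << 8)
         | ((uint32_t)pkt[2] << 16)
         | ((uint32_t)pkt[3] << 24);
}
```

The transmit descriptor would be added to the queue only after this helper returns; the returned value is discarded in practice and exists only so the compiler cannot drop the read.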

Generally this issue does not seem to present itself in all builds of this product, but certain builds have shown this issue.  At this point, we are unsure what might be different about each of these builds, but they are populated with the same hardware and should be completely equivalent.

We would like to confidently resolve this issue, and we were hoping to get some input from you as it relates to cache coherency issues.

1)       Is there any possibility that after a Cache Clean operation is triggered, and a read from the status indicates that it has completed, that the cache has not been written out to DDR2 memory yet?  Would a very active DMA from the Video Port increase the likelihood of this?

2)       If this situation is possible, what is the best way to resolve it?  Would a dummy read back from a cacheable DDR2 address resolve this?

Thanks for your help,

Dale Peterson 

  • Here are a few things to investigate

     

    1) Make sure data buffers are cache aligned and that only one master (e.g. CPU or DMA) accesses the data at a time.  If both must access the data, then ensure the cache is coherent with physical memory

    2) This may be a long shot, but sometimes instruction prefetching can play a role in putting data into cache; therefore it is a good idea to put a few cache pages between all code and data sections

    3) Confirm that the PBBPR hardware register is set to 0x20 or lower to prevent command starvation

    4) Confirm EMAC has higher priority than VPFE

    5) Route the EMAC and VPFE interrupts to different HWIs if possible.
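For item 1 above, a minimal sketch of what "cache aligned" can mean in code. It assumes a 128-byte line size (the C64x+ L2 line size; treat the constant as an assumption and check it against the cache configuration in use), and both function names are hypothetical. The idea is that a buffer shared with a DMA master should start on a line boundary and occupy whole lines, so that no cache line holds both packet data and unrelated CPU data.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128-byte cache lines. Adjust to the actual configuration. */
#define CACHE_LINE_BYTES 128u

/* Returns 1 if the buffer starts on a cache-line boundary and spans whole lines. */
static int is_cache_aligned(const void *p, size_t len)
{
    return ((uintptr_t)p % CACHE_LINE_BYTES == 0) &&
           (len % CACHE_LINE_BYTES == 0);
}

/* Rounds a buffer length up to the next multiple of the cache-line size. */
static size_t round_up_to_cache_line(size_t len)
{
    return (len + CACHE_LINE_BYTES - 1u) / CACHE_LINE_BYTES * CACHE_LINE_BYTES;
}
```

For example, a buffer for a 1514-byte Ethernet frame would be allocated as 1536 bytes under this assumption.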

     

  • Hi Juan,

    Regarding #1, the EMAC must access data at the same time that the VPFE is copying image data to DDR, as well as at the same time that the CPU is copying data, to maximize total throughput.

    I still need to investigate #2 above.

    Regarding #3, the PBBPR register was set to 0x20 when we saw the issue.  Reducing it to 0x10 did not make the issue go away, but reducing it to 0x04 seems to have cleared it up.  It seems that setting this value so low could reduce the overall bandwidth of the DDR2.  Can you recommend whether 0x04 would be an advisable value for the PBBPR?
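A rough sketch of how the PBBPR value in these experiments might be changed with a read-modify-write. Assumptions: the count field occupies the low 8 bits of the register (check the DDR2 memory controller documentation for the device), and `set_pr_old_count` is a hypothetical helper, not a TI API. The register is passed in as a pointer so the same code runs against a plain variable on a host.

```c
#include <stdint.h>

/* Assumption: the priority-raise count lives in bits 7:0 of PBBPR. */
#define PBBPR_COUNT_MASK 0xFFu

/* Hypothetical helper: update only the count field, preserving the other bits. */
static void set_pr_old_count(volatile uint32_t *pbbpr, uint8_t count)
{
    uint32_t v = *pbbpr;
    v = (v & ~(uint32_t)PBBPR_COUNT_MASK) | (uint32_t)count;
    *pbbpr = v;
}
```

On the target, the pointer would be the PBBPR address from the device documentation cast to `volatile uint32_t *`, and the call would be repeated with 0x20, 0x10 and 0x04 to reproduce the three cases described above.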

    Regarding #4, the VPFE currently has priority 0 and the EMAC has priority 4 (the default).  I tried setting the VPFE to 3 and the EMAC to 2, but this did not clear up the issue.  After thinking about this a little more, it seems that it would be the CPU commands that are getting starved (the packet data comes from the CPU, and that data is not arriving in the DDR before the EMAC sends it).  Would it be advisable to give the CPU priority over the EMAC?

    Regarding #5, the EMAC and VPFE are using different HWI currently.

    Thank you for your help on this issue.