AM335x NAND ECC interface between GPMC and ELM "magic"

We have an AM335x-based product with 512MB of NAND memory (4k page size) attached to the GPMC bus. In reviewing the NAND driver, we have identified some "magic" that must be happening between the GPMC and the ELM to correct ECC errors. We are hoping someone can shed some light on this "magic".

We have seen that the driver currently uses BCH8 and reads the checksums from the GPMC registers. The checksums calculated by the GPMC and those stored in the sectors appear to match. The driver reads the checksums stored in the sectors and compares them with those calculated by the GPMC; if they don't match, there is an error, which is then passed on to the ELM.
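
To make the comparison step concrete, here is a minimal sketch of what we understand the driver to be doing. It is not the driver's actual code: the function name, the flat ECC buffer layout, and the figures of 8 sectors per page (4096 / 512) and 13 ECC bytes per sector (BCH8 over a 512-byte sector gives 8 x 13 = 104 bits) are our assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTORS_PER_PAGE 8   /* assumed: 4096-byte page / 512-byte sectors */
#define ECC_BYTES        13  /* assumed: BCH8 over 512 bytes = 104 bits */

/*
 * Hypothetical helper: compare the ECC read back from the NAND with the ECC
 * calculated by the GPMC, one sector at a time. Bit n of the returned mask
 * is set when sector n differs and would need to be handed to the ELM.
 */
static unsigned find_mismatched_sectors(const uint8_t *stored,
                                        const uint8_t *calculated)
{
    unsigned mask = 0;

    assert(stored != NULL && calculated != NULL);

    for (size_t s = 0; s < SECTORS_PER_PAGE; s++) {
        const size_t off = s * ECC_BYTES;

        if (memcmp(stored + off, calculated + off, ECC_BYTES) != 0)
            mask |= 1u << s;
    }
    return mask;
}
```

A return value of 0 means every sector's checksum matched; any set bit marks a sector whose checksum would go to the ELM.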
 
 
As far as we can see, the only information the driver passes to the ELM is the checksum calculated by the GPMC. Something seems to be missing: neither the checksum stored in the sectors nor any data read from the sectors is sent to the ELM. So how does the ELM locate the error when, as far as we can see, it has nothing to compare against? Is some hidden information being sent from the GPMC to the ELM? We have read the AM335x datasheet, and it seems vague in this area.
 
 
Can someone please explain how this process works in correcting the NAND memory?
 
 
Thank you for your support.
  • Hi Alex,

    The ELM module is covered in detail in section 7.4 of the AM335X Technical Reference Manual, Rev. K. Have you read it?

  • The ELM is getting the raw data from the GPMC interface when data blocks are read or written.

    The position of the error (which bit to flip) is encoded in the checksum. Correction is done by software.

    regards

    Wolfgang

  • Thank you both Biser and Wolfgang.

    After reading through the TRM (with Wolfgang's comments in mind), the most pertinent aspects appear to be in section 7.4.1.

    Concerning what is passed to the ELM:

    "The general-purpose memory controller (GPMC) probes data read from an external NAND flash and uses this to compute checksum-like information, called syndrome polynomials..." and "The error-location module (ELM) extracts error addresses from these syndrome polynomials". 

    What the ELM does with that "checksum":

    "Based on the syndrome polynomial value, the ELM can detect errors, compute the number of errors and give the location of each error bit. The actual data is not required..."

    Based on the above, the ELM never actually receives any data, only the "checksum" (syndrome polynomial), which allows it to identify the error location. That location is then passed back to the GPMC to handle the actual correction.

    We have seen where the driver source code makes the actual correction, but it is still a little unclear how the GPMC decides to correct an error reported by the ELM. Is there a "high-level" explanation of this process somewhere?

    Thanks again.
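
For anyone following the thread, a toy example may help show why "the actual data is not required" to locate an error, and what Wolfgang's "correction is done by software" amounts to. The sketch below is not the BCH8 code used on the AM335x. It is a single-error-correcting BCH code of length 15 over GF(2^4), chosen only because it fits in a few lines, and every function name in it is hypothetical. Real BCH8 over a 512-byte sector works in a much larger field (GF(2^13)) with several syndromes, and the ELM performs the location step in hardware. The principle is the same: the syndrome of a valid codeword is zero, so the syndrome of a corrupted word depends only on the error, and the locator needs nothing else.

```c
#include <assert.h>
#include <stdint.h>

#define CODE_LEN 15    /* codeword length in bits, 2^4 - 1 */
#define MSG_LEN  11    /* message length in bits */
#define GF_POLY  0x13  /* x^4 + x + 1, primitive over GF(2) */

static uint8_t gf_exp[CODE_LEN];      /* gf_exp[i] = alpha^i */
static uint8_t gf_log[CODE_LEN + 1];  /* gf_log[alpha^i] = i; index 0 unused */

/* Build the power and log tables for GF(2^4). Call once before use. */
static void gf_init(void)
{
    uint8_t x = 1;

    for (int i = 0; i < CODE_LEN; i++) {
        gf_exp[i] = x;
        gf_log[x] = (uint8_t)i;
        x = (uint8_t)(x << 1);   /* multiply by alpha */
        if (x & 0x10)
            x ^= GF_POLY;        /* reduce modulo the field polynomial */
    }
    assert(x == 1);              /* alpha^15 == 1 for a primitive polynomial */
}

/* Encode an 11-bit message as c(x) = m(x) * g(x), with g(x) = GF_POLY. */
static uint16_t encode(uint16_t msg)
{
    uint16_t cw = 0;

    for (int i = 0; i < MSG_LEN; i++)
        if (msg & (1u << i))
            cw ^= (uint16_t)(GF_POLY << i);
    return cw;
}

/* "GPMC" role: evaluate the received word at alpha to get the syndrome. */
static uint8_t syndrome(uint16_t word)
{
    uint8_t s = 0;

    for (int i = 0; i < CODE_LEN; i++)
        if (word & (1u << i))
            s ^= gf_exp[i];
    return s;
}

/*
 * "ELM" role: the only input is the syndrome. For a single flipped bit at
 * position i the syndrome equals alpha^i, so the log table gives i directly.
 * Returns -1 when the syndrome is zero (no error detected).
 */
static int locate_error(uint8_t s)
{
    return s ? gf_log[s] : -1;
}

/* "Software" role: flip the bit that the locator reported. */
static int correct_single_error(uint16_t *word)
{
    int pos = locate_error(syndrome(*word));

    if (pos >= 0)
        *word ^= (uint16_t)(1u << pos);
    return pos;
}
```

After `gf_init()`, flipping any one bit of `encode(msg)` and passing the result to `correct_single_error()` returns the flipped position and restores the codeword. Note that `locate_error()` never sees the data, only the syndrome value.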