How to verify DDR3 ECC function on AM572x

Other Parts Discussed in Thread: X5777AXGABC

Hi All,

My customer wants us to prove that the DDR3 ECC function is working on AM572x. Is there any way to verify ECC in software? Thank you.

Regards,

Allen 

  • Hi Allen,

    ECC functionality requires additional ECC DDR memory to be connected to the EMIF, as shown in Figure 15-45 (AM571x/AM572x TRM). Unless this memory configuration is designed into the board, ECC functionality will not be available. Additional information can be found in section 15.3.4.14 of the TRM.
  • Hi Biser,

    Thank you. Maybe my post was not clear. We have a board with ECC memory connected, and the EMIF and PHY registers are configured correctly. I want to verify ECC functionality, but how can I deliberately introduce an error in DDR3?

    Regards,
    Allen
  • I cannot say. I will ask the DDR experts.
  • There isn't any kind of "bit injection logic" to introduce a bit error, though simply doing a non-quanta sized write should be sufficient to trigger an error (i.e. write a single byte and that will cause a problem).
  • Thanks, Brad. But how did you know the ECC function was correct in design and verification?
  • Allen35065 said:
    Thanks, Brad. But how did you know the ECC function was correct in design and verification?

    I don't see this any differently than anything else in the chip. The design team can simulate all the logic to make sure it is correct. I'm not a designer so I don't know all their methods.

    If you write non quanta sized data you will get an error.

  • Brad,

    Thanks. One more question: the TRM says 'Once ECC is enabled, the entire protected region must be initialized with data.' Does that mean we have to memset the entire protected DDR3 region to 0 (or other data) after enabling ECC in the EMIF_ECC_CTRL_REG register?

    Regards,
    Allen
  • Correct. You must write to the entire space so that you have valid ECC data. Otherwise a read would result in an error.
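
The initialization Brad describes can be sketched as a simple fill loop. This is a minimal sketch, not TI code: `ecc_region_init` is a hypothetical helper, and the 64-bit store width is an assumption, since the thread does not state the ECC quantum size (check the TRM before relying on it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill [base, base + bytes) with `pattern` using aligned 64-bit stores.
 * Hypothetical helper: on the target, `base` would point at the start of
 * the ECC-protected DDR3 region. The 64-bit store width is an assumption. */
static void ecc_region_init(volatile uint64_t *base, size_t bytes,
                            uint64_t pattern)
{
    assert(((uintptr_t)base % sizeof(uint64_t)) == 0); /* aligned start    */
    assert((bytes % sizeof(uint64_t)) == 0);           /* whole words only */

    for (size_t i = 0; i < bytes / sizeof(uint64_t); ++i) {
        base[i] = pattern; /* every word gets valid data and valid ECC */
    }
}
```

On the target this would be called once, after ECC is enabled and before any read of the protected range, for example with the region base and size used elsewhere in this thread.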

  • Hi Brad,

    Thank you. I'm trying to run a test on the VAYU EVM, which has an ECC chip on EMIF1. How do I generate a non-quanta write to DDR?

    I enabled ECC

    hCtrlCoreWkup->EMIF1_SDRAM_CONFIG_EXT = 0x0001C127U;

    and LISA

    /* DMM_LISA_MAP_i */
    hDmmCfg->LISA_MAP[0U] = 0x80700100;
    hDmmCfg->LISA_MAP[1U] = 0x80700100;

    and set up the DDR registers

    {
        Uint32 regVal = 0U;

        //Enable ECC from 0x80000000 to 0xA0000000
        hEmif->regs->ECC_ADDRESS_RANGE_1 = 0x8FFF8000;
        //Enable ECC from 0xA0000000 to 0xC0000000
        hEmif->regs->ECC_ADDRESS_RANGE_2 = 0x9FFF9000;
        //Enable ECC within RANGE 1&2
        hEmif->regs->ECC_CTRL_REG = 0xC0000003;

        memset((unsigned int *)(0x80000000), 0, 0x20000000);

        regVal = hEmif->regs->B_ECC_ERR_CNT;
        hEmif->regs->B_ECC_ERR_CNT = regVal;
        hEmif->regs->B_ECC_ERR_DIST_1 = 0xFFFFFFFF;
        hEmif->regs->B_ECC_ERR_ADDR_LOG2 = 0x1;
        hEmif->regs->SYSTEM_OCP_INTERRUPT_STATUS |= 0x1C;
    }

    I tried to write a byte like this

    *(unsigned char*)(0x80000001) = 0x55;

    But I did not see any error recorded.

    EMIF1_EMIF_1B_ECC_ERR_CNT
    0x00000000
    EMIF1_EMIF_1B_ECC_ERR_THRSH
    0x00000000
    EMIF1_EMIF_1B_ECC_ERR_DIST_1
    0x00000000
    EMIF1_EMIF_1B_ECC_ERR_ADDR_LOG
    0x00000000
    EMIF1_EMIF_2B_ECC_ERR_ADDR_LOG
    0x00000000

    Regards,
    Allen
  • Are you sure the corresponding memory range is non-cacheable as defined by your MMU page table? If the cache is operating on the corresponding memory range then it will always be accessing full cache lines.

  • Yes. I tried setting DDR3 to non-cacheable and also running with no MMU. In both cases no error was recorded.
  • I also found that there is an ecc_test in the K2 MCSDK U-Boot. Do you know how it works? Thank you.
    processors.wiki.ti.com/.../MCSDK_UG_Chapter_Exploring
  • Brad,

    I checked the DDR3 controller sections for K2 and AM572x. It looks like verification and RMW are not supported on AM572x, so the method used to verify ECC on K2 cannot be used on AM572x. Do you have example code for the non-quanta read test? Thank you.

    Regards,
    Allen
  • Allen35065 said:
    //Enable ECC with in RANGE 1&2
    hEmif->regs->ECC_CTRL_REG = 0xC0000003;

    You combined several steps into this one operation.  I can't say for sure that will break things, but given that you're having some issues I recommend following the initialization instructions precisely.

    Allen35065 said:
    *(unsigned char*)(0x80000001) = 0x55;

    That should be sufficient to trigger an error.  So are you not seeing EMIF_SYSTEM_OCP_INTERRUPT_STATUS[3] WR_ECC_ERR_SYS set to one?

  • I tried performing the ECC configuration steps separately and still did not get an error. Both EMIF_SYSTEM_OCP_INTERRUPT_STATUS and EMIF_SYSTEM_OCP_INTERRUPT_RAW_STATUS read 0x00000000.
  • Which silicon revision are you using?
  • I was looking for some code snippets for enabling ECC. Here's one from a GEL file:

    WR_MEM_32(0x4AE0C144, (RD_MEM_32(0x4AE0C144)|0x00010000));

    /* EMIF_ECC_ADDRESS_RANGE_1 - 0x80000000 to 0x90000000 */
    WR_MEM_32(0x4C000114, 0x0FFF0000);
    /* EMIF_ECC_ADDRESS_RANGE_2 - 0x90000000 to 0xA0000000 */
    WR_MEM_32(0x4C000118, 0x1FFF1000);

    /* EMIF_ECC_CTRL_REG - Enable ECC on both ranges */
    WR_MEM_32(0x4C000110, 0xC0000003);

    So related to your earlier sequence:
    1. It looks like combining those other writes was ok, as evidenced by the final write above.
    2. Why aren't you doing a read-modify-write on CTRL_WKUP_EMIF1_SDRAM_CONFIG_EXT_1?
    3. *IMPORTANT* The address range used is given from "the EMIF perspective". In particular, consider the start of memory to be 0x00000000 and not 0x80000000. I suspect this is why the ECC doesn't appear to be doing anything.
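
Point 3 can be made concrete with a small helper that converts system addresses into the EMIF-relative range value. This is a sketch only: `ecc_range_value` is a hypothetical function, and the field layout (end offset in bits [31:16], start offset in bits [15:0], both in 64 KiB units) is inferred from the two GEL values above rather than quoted from the TRM.

```c
#include <assert.h>
#include <stdint.h>

/* Build an EMIF_ECC_ADDRESS_RANGE_x value from system addresses.
 * Assumed layout (inferred from the GEL values in this thread):
 *   bits [31:16] = last protected offset >> 16
 *   bits [15:0]  = first protected offset >> 16
 * Offsets are relative to the start of DDR as the EMIF sees it. */
static uint32_t ecc_range_value(uint32_t sys_start,
                                uint32_t sys_end_exclusive,
                                uint32_t ddr_sys_base)
{
    uint32_t start_off = sys_start - ddr_sys_base;
    uint32_t last_off  = sys_end_exclusive - ddr_sys_base - 1u;

    assert(sys_start >= ddr_sys_base && sys_end_exclusive > sys_start);
    assert((start_off & 0xFFFFu) == 0u);      /* start on a 64 KiB boundary */
    assert((last_off & 0xFFFFu) == 0xFFFFu);  /* end on a 64 KiB boundary   */

    return (last_off & 0xFFFF0000u) | (start_off >> 16);
}
```

With a DDR system base of 0x80000000, the range 0x80000000 to 0x90000000 yields 0x0FFF0000 and the range 0x90000000 to 0xA0000000 yields 0x1FFF1000, matching the GEL file.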
  • Brad,

    I don't know my silicon revision, as I am using the VAYU EVM. The P/N on the chip is OMAP X5777AXGABC.

    I tried using the 0x00000000 address, but then I could not read or write DDR3. When using 0x80000000 I can access (write and read back) DDR3 data correctly.

    According to the TRM, the CTRL_WKUP_EMIF1_SDRAM_CONFIG_EXT_1 register is used for EMIF1_PHY_REG_READ_DATA_EYE_LVL.

    Regards,

    Allen 

  • Allen35065 said:
    I tried using the 0x00000000 address, but then I could not read or write DDR3. When using 0x80000000 I can access (write and read back) DDR3 data correctly.

    The correct value is 0x00000000.  If you can't read/write DDR3, that's likely due to the fact that there is something else wrong with your ECC setup.  Programming this register incorrectly simply tricks the EMIF into thinking that you're not trying to write to the protected region (i.e. it does not attempt to utilize the ECC lanes).

    My suspicion would be that this relates to leveling.  Are you doing hardware leveling or software leveling?  By the way, is this a gel file you're using or C code (such as u-boot, etc.)?

  • Hi Brad,

    Thank you. You are right. After moving the ECC init code before leveling, I can get an error now.

    But I have one question: the TRM says we need to initialize the ECC-protected region after ECC is enabled, but at that point leveling has not started yet. Is that a problem?

    Regards,
    Allen
  • Allen35065 said:
    Thank you. You are right. After moving the ECC init code before leveling, I can get an error now.

    OK, I think we're getting there. Your ECC initialization code looks to be correct with the latest change. However, there is a problem related to leveling. Leveling should be done prior to ECC configuration, so if leveling is breaking ECC, then I think something is wrong with your leveling. Are you using software leveling or hardware leveling?

  • Yes. I use software leveling as the TRM describes. I just don't know whether the leveling results will be correct for the ECC chip if we do leveling before ECC is enabled.
  • When you perform the software leveling, do you follow this note from the TRM:

    The xxx_RATIO8 and xxx_RATIO9 bit fields are associated with the ECC data PHY and
    must be loaded with values same as in the xxx_RATIO0 through xxx_RATIO7 bit fields
    from registers EMIF_EXT_PHY_CONTROL_xx.
  • Did you perform the steps that Brad suggested?

    I'm not sure which OS the customer plans to use, but checking the ECC feature can be done using a simple GEL file. I'd like to suggest another option to check for ECC errors, in addition to the one that Brad suggested earlier.

    - Enable ECC (This can be done using the ECC_CTRL_REG register)
    - Set appropriate address range for ECC (this should be relative value as Brad suggested earlier)
    - Perform initialization with ECC enabled, including all the leveling steps. Let us know if you have any difficulty with this.
    - Fill the memory with known patterns after the initialization is done
    - Turn off the ECC (This again can be done by using the ECC_CTRL_REG register)
    - Introduce a 1-bit error by writing into the previously filled memory locations
    - Enable ECC back (Same as 1st step)
    - Perform a READ on the address location where a 1-bit error was introduced
    - Verify if the ECC functionality worked as expected

    Let us know if this helps.

    Regards, Siva
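
The disable, corrupt, re-enable, read part of Siva's procedure can be sketched in C as follows. This is a hedged illustration, not TI code: `ecc_inject_1bit` is a hypothetical helper, `ecc_ctrl` stands for a pointer to EMIF_ECC_CTRL_REG, and writing 0 to that register to turn ECC off is an assumption of the sketch. The target word is assumed to be uncached and already filled while ECC was enabled.

```c
#include <assert.h>
#include <stdint.h>

/* Flip one bit of *target while ECC is disabled, then restore the ECC
 * configuration and read the word back.
 * Assumptions: `ecc_ctrl` points at EMIF_ECC_CTRL_REG, and writing 0
 * disables ECC. Returns the value read after ECC is re-enabled. */
static uint32_t ecc_inject_1bit(volatile uint32_t *ecc_ctrl,
                                volatile uint32_t *target, unsigned bit)
{
    assert(bit < 32u);

    uint32_t saved_ctrl = *ecc_ctrl;  /* remember the enabled configuration  */
    *ecc_ctrl = 0u;                   /* turn ECC off                        */
    *target ^= (UINT32_C(1) << bit);  /* corrupt data; stored ECC goes stale */
    *ecc_ctrl = saved_ctrl;           /* turn ECC back on                    */

    return *target;                   /* this read should exercise the check */
}
```

On real hardware with working ECC, the returned value would be expected to match the original pattern (the error being corrected) while EMIF_1B_ECC_ERR_CNT increments. Memory barriers and cache maintenance are omitted here and would likely be needed in practice.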
  • Hi Siva,

    I completed the ECC test according to your comments. I saw that the 1-bit error was inserted and corrected. Thank you.

    Regards,

    Allen

  • Hi Siva,

    Now that the ECC function is verified, I want to generate an interrupt from an ECC event. Based on the ECC test case, I enabled the EMIF event in EMIF_SYSTEM_OCP_INTERRUPT_ENABLE_SET. I then tried to trigger a 1-bit error by:

    - Turn off the ECC (This again can be done by using the ECC_CTRL_REG register)

    - Introduce a 1-bit error by writing into the previously filled memory locations

    - Enable ECC back (Same as 1st step)

    Then I checked the ECC status registers:

    0x4C0000A4 EMIF1_EMIF_SYSTEM_OCP_INTERRUPT_RAW_STATUS is 0x0

    0x4C0000B4 EMIF_SYSTEM_OCP_INTERRUPT_ENABLE_SET 0x39

    0x4C000130 EMIF1_EMIF_1B_ECC_ERR_CNT is 0x1

    It looks like the ECC error was counted but no event was generated.

    My question is: can we generate an EMIF ECC event this way? If yes, how should these registers be set?

    Thank you.

    Regards,

    Allen

  • Allen35065 said:

    I enabled the EMIF event in EMIF_SYSTEM_OCP_INTERRUPT_ENABLE_SET. I then tried to trigger a 1-bit error by:

    - Turn off the ECC (This again can be done by using the ECC_CTRL_REG register)

    - Introduce a 1-bit error by writing into the previously filled memory locations

    - Enable ECC back (Same as 1st step)

    Then I checked the ECC status registers:

    Did you read the location with the 1-bit error in it?

    Allen35065 said:

    0x4C0000A4 EMIF1_EMIF_SYSTEM_OCP_INTERRUPT_RAW_STATUS is 0x0

    0x4C0000B4 EMIF_SYSTEM_OCP_INTERRUPT_ENABLE_SET 0x39

    0x4C000130 EMIF1_EMIF_1B_ECC_ERR_CNT is 0x1

    What was the ERR_CNT value before performing the test?  Had you cleared out the error count from the previous test?

  • Yes. I cleared the status before enabling the EMIF ECC event, as below.

    *(unsigned int *)(0x4C000130) = *(unsigned int *)(0x4C000130); //clear 1B ECC counter
    *(unsigned int *)(0x4C000138) = 0xFFFFFFFF; //clear 1B ECC distribution
    *(unsigned int *)(0x4C000140) = 0x1; //clear 2B ECC error address
    *(unsigned int *)(0x4C0000AC) |= 0x7 << 3; //clear error status

    I checked the 1B ECC counter and it was 0x0. After disabling ECC, modifying the data, and re-enabling ECC, the 1B ECC counter (0x4C000130) was 0x1, and the 1B ECC distribution was also correct at the bit I had modified in the data. EMIF_SYSTEM_OCP_INTERRUPT_ENABLE_SET was 0x39. But no bit was set in the EMIF interrupt raw status or status registers.

  • Have you programmed EMIF_1B_ECC_ERR_THRSH[31:24]? By default the value is 0 (disabled).
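
Programming the threshold field Brad mentions amounts to a read-modify-write of bits [31:24]. The helper below is a minimal sketch: `ecc_thrsh_set` is a hypothetical name, and only the field position comes from this thread. The register address is not given here and should be taken from the TRM.

```c
#include <assert.h>
#include <stdint.h>

/* Return `reg` with bits [31:24] replaced by `threshold`, leaving the
 * lower 24 bits untouched. A threshold of 0 leaves the feature disabled,
 * per the default described in this thread. */
static uint32_t ecc_thrsh_set(uint32_t reg, uint32_t threshold)
{
    assert(threshold <= 0xFFu); /* the field is 8 bits wide */
    return (reg & 0x00FFFFFFu) | (threshold << 24);
}
```

On the target this might be used as `*thrsh = ecc_thrsh_set(*thrsh, 1u);`, where `thrsh` is an assumed `volatile uint32_t *` pointing at EMIF_1B_ECC_ERR_THRSH.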
  • Brad,

    I'm on a business trip now and will try your suggestion when I am back in the office. Thank you.

    Regards,
    Allen