AM4372: EMIF Setting

Part Number: AM4372

Sitara Support Team,

My customer is facing a problem with a custom board with an AM4372.
Please answer questions A and B in the detailed status report below.

(1) Defects
-CPU exceptions occur irregularly (in the shortest cases, about 30 s after startup).
-Checking with a JTAG ICE when a CPU exception occurs,
we found that the code area of the program loaded into RAM (DDR3) had been unintentionally rewritten.
The RAM contents are rewritten even when the area is set to read-only in the MMU.
→ We confirmed that when a program intentionally writes to the area set as read-only,
a Permission Fail exception occurs and the contents are not rewritten.
The customer disabled all masters other than the MPU (TPTC (EDMA), USB, etc.),
but confirmed with the ICE that the RAM contents were still rewritten.
After changing the EMIF timing (see (4) below), the symptoms no longer occur.

(2) Configuration
Product name; Model (Manufacturer)
CPU; AM4372BZDNA60 (TI)
DDR3-SDRAM; IS43TR16128C-125K (ISSI)   Remarks: 400 MHz, 1 pc used

(3) EMIF register setting value
It was set up with reference to EMIF_CONFIGURATION_TOOL_e2e201207.xlsx.
 *The file will be attached later.

(4) Findings at this stage on EMIF
"For the parameters to be set in Step2-DDR Timing of EMIF_CONFIGURATION_TOOL,"
After increasing the Final Bit Value of every item in the attached settings by 1 tCK,
the frequently occurring symptoms described above are no longer observed.
DDR compliance test: only address line A(0) was measured,
and the customer confirmed that it PASSED with the Step 1 and Step 3 settings.
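
To make the size of the "+1 tCK" change concrete, here is a minimal sketch of the arithmetic. It assumes the 400 MHz DDR clock from (2). The timing value `EXAMPLE_T_PS` is a hypothetical figure chosen only for illustration; it is not taken from the attached spreadsheet, and the sketch does not model how the EMIF tool encodes register fields.

```python
# Clock period and cycle counts for a 400 MHz DDR3 clock.
# All times are in picoseconds so the arithmetic stays in integers.
DDR_CLOCK_MHZ = 400                      # from the configuration in (2)
TCK_PS = 1_000_000 // DDR_CLOCK_MHZ      # one clock period in ps


def cycles_for(t_ps: int, tck_ps: int = TCK_PS) -> int:
    """Smallest whole number of clocks that covers t_ps (ceiling division)."""
    return -(-t_ps // tck_ps)


def relaxed(cycles: int, extra_tck: int = 1) -> int:
    """Cycle count after adding a margin of extra_tck clocks."""
    return cycles + extra_tck


# Hypothetical timing requirement, for illustration only.
EXAMPLE_T_PS = 13_750

base = cycles_for(EXAMPLE_T_PS)
print(f"tCK = {TCK_PS} ps")
print(f"{EXAMPLE_T_PS} ps -> {base} clocks, relaxed to {relaxed(base)} clocks "
      f"({relaxed(base) * TCK_PS} ps)")
```

At 400 MHz one clock is 2.5 ns, so relaxing a parameter by 1 tCK adds 2.5 ns of margin to that parameter.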


(5) Questions
A. If you know of any possible causes of this bug, workarounds,
 or cases where similar symptoms have occurred, please let me know.

B. About EMIF registers
If there are any status registers that can be used for debugging this case,
please let me know how to use them as well.
Also, if any other bits in the attached file need to be set or changed, please let me know.

Best regards,
Kanae

  • Kanae, 

    You mention that the "program intentionally rewrites the area set as Read Only".  Are you sure the program is doing this, or is the memory just getting corrupted?  If you can confirm that, for example, a full area of memory is getting rewritten by the program, then this implies some issue with the addr/cmd signals.  If the memory is corrupted, it might be more of a data issue.

    Post the spreadsheet and I can take a look.  Since you can get things working by relaxing the timing values, there may be an issue with some of the spreadsheet inputs or possibly the DDR PLL configuration.

    Can they also run these two debug scripts:  

    https://git.ti.com/cgit/sitara-dss-files/am43xx-dss-files/tree/am43xx-ddr-analysis.dss

    https://git.ti.com/cgit/sitara-dss-files/am43xx-dss-files/tree/am43xx-ctt.dss

    and post the results.  Instructions for running the scripts using CCS can be found here:  https://git.ti.com/cgit/sitara-dss-files/am43xx-dss-files/tree/README

    These scripts will also give insight into the DDR and PLL configuration

    Regards,

    James

  • Hi James,

    Thank you for your reply.
    I will share your comments with my customer.

    Here is the spreadsheet.

    EMIF_CONFIGURATION_TOOL_e2e201208.xlsx

    I will ask my customer to run the scripts, and post the results here.

    Best regards,
    Kanae

  • Hi James,

    Here is additional information from the customer.

    ======================================================================

    About Debug Script;

    I'm afraid the scripts you sent can only be run with TI's ICE.
    As our environment does not allow us to run them,
    could you suggest another way of checking?

    I am attaching the results of dumping the addresses listed in the scripts,
    so please confirm whether you can analyze the problem with this information alone.
    If these results are insufficient or incomplete, please contact us.

    am43xx-ddr-config_20201208.csv

    am43xx-ctt_20201208.csv

    About the EMIF status register

    I am attaching a dump of the EMIF status/error registers and the other readable registers.
    The values vary depending on the conditions. Can you find out anything from this data?

    20201208_EMIF4D STATUS REG E2E.xlsx

    ======================================================================

    If you need more information from my customer to check the details, please let me know.

    Best regards,
    Kanae

  • Hi Kanae, thanks for the info.  The register dumps are good.  Here is what I found:

    1. Here is one difference I found, which could be the main cause of the exceptions:

    EMIF Tool:  DDR_PHY_CTRL_1=0x48009

    Reg Dump: DDR_PHY_CTRL_1=0x48008

    This sets the read latency, and certainly can cause the issues you are seeing.  The fact that things improved when they adjusted values by +1 also points to this being the main problem.  I would have them change just this value and test again.

    2. I also noticed differences in the IOCTRL registers, which control driver impedance and slew rate for the DDR IOs.  Is there a reason these were changed?

    EMIF Tool: ADDRCTRL_IOCTRL = DDR_DATAx_IOCTRL: 0x84

    Reg Dump: ADDRCTRL_IOCTRL = DDR_DATAx_IOCTRL: 0x87 

    3. Line 16 in the "System Details" section was also changed to a higher memory drive strength.  Was this changed to try to solve the exception errors?  I would suggest reverting this to RZQ/6 if the higher drive strength is not needed.

    I checked the PLL and its configuration looks fine.  There is nothing significant in the EMIF status registers that I can see.

    In summary, I think #1 is the main problem; #2 and #3 may be minor, but they are other differences I saw.
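
The comparison between the EMIF tool output and the register dump can be sketched as a bitwise diff. This is only an illustration using the two value pairs quoted in this post; the helper name `differing_bits` is hypothetical, and the sketch makes no assumption about which register fields the differing bits belong to.

```python
def differing_bits(expected: int, actual: int) -> list[int]:
    """Return the bit positions at which the two register values differ."""
    diff = expected ^ actual
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# (EMIF tool value, register dump value) as quoted in the post above.
COMPARISON = {
    "DDR_PHY_CTRL_1": (0x48009, 0x48008),
    "ADDRCTRL_IOCTRL / DDR_DATAx_IOCTRL": (0x84, 0x87),
}

for name, (tool_value, dump_value) in COMPARISON.items():
    bits = differing_bits(tool_value, dump_value)
    print(f"{name}: tool=0x{tool_value:X} dump=0x{dump_value:X} "
          f"differing bits={bits}")
```

For the values above this reports bit 0 for DDR_PHY_CTRL_1 and bits 0 and 1 for the IOCTRL registers.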

    Regards,

    James

  • Hi James,

    Thank you for your support.
    I will check the points you raised with my customer and post an update here.

    In the meantime, I would like to share the results of running the scripts,
    which I received from the customer after he found that he had an XDS560v2.

    ScriptResults.zip

    Best regards,
    Kanae

  • Hi James,

    Here are my customer's comments.

    ==================================================================================

    Regarding #1.

    I tried changing the setting values of the registers, but the symptoms remained the same.
    The result of the script: 20201211_TI_DebugScriptResult.zip

    Regarding #2. and #3.

    I originally set them to the TI-recommended values, but adjusted them during DDR compliance testing.
    Since it was not possible to change the clk/addr settings individually in the EMIF tool,
    the actual settings differ from the recommended values.
    I apologize for not informing you in advance.

    In addition, I measured the Addr_0 signal today.
    There was no waveform disturbance between startup and malfunction.

    ==================================================================================

    If there are any other points that should be checked or changed, please let us know.
    Also, my customer has been able to avoid the problem by adjusting the values by +1,
    but could you please tell me why you do not recommend this measure?

    Best regards,
    Kanae

  • Hi Kanae, there is nothing wrong with their "value+1" configuration.  It is just strange that the spreadsheet didn't yield the correct configuration for them.  If the "value+1" config is more robust, then that can be their valid configuration.

    The only other thing I noticed is this:

    SDRAM_REFRESH_CTRL = 0x10000C30

    The leading 1 (bit 28, highlighted in red in the original post) should be 0.  That bit causes an SDRAM initialization sequence, which is not necessary after the first init sequence upon powerup.  This may possibly be causing their issue.  I would have them change this and test again.

    Were you ever able to get more details on the failure?  Can they tell if just random bits are failing, or if whole blocks of memory are incorrectly getting written?

    Regards,

    James  

  • Hi James,

    Thank you for your support.

    My customer's reply and additional questions are here.

    ============================================================

    I set ASR[28] of SDRAM_REFRESH_CTRL to 0
    (SDRAM_REFRESH_CTRL = 0x00000C30) and checked,
    but the symptoms did not change.

    [Questions and request]

    1. Isn't the initialization sequence triggered by INITREF_DIS[31] in SDRAM_REFRESH_CTRL?

    2. At what timing in the initialization sequence does ASR[28] need to be set?

    3. Is it possible to get the source code of the EMIF initialization sequence for reference?
      I am asking this in order to check if there is any problem in the initialization procedure.
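
The two SDRAM_REFRESH_CTRL values discussed here can be decoded with a short sketch. The positions of ASR[28] and INITREF_DIS[31] are taken from the text above. Treating the low 16 bits as a refresh interval counted in DDR clocks is an assumption made for this sketch, as is the 2.5 ns clock period (400 MHz); please check both against the device's reference manual.

```python
INITREF_DIS_BIT = 31          # bit position as written in question 1 above
ASR_BIT = 28                  # bit position as written above (ASR[28])
REFRESH_RATE_MASK = 0xFFFF    # assumption: low 16 bits hold the refresh interval
TCK_PS = 2500                 # assumption: 400 MHz DDR clock, 2.5 ns period


def decode_refresh_ctrl(value: int) -> dict:
    """Split a 32-bit SDRAM_REFRESH_CTRL value into the fields named above."""
    return {
        "initref_dis": (value >> INITREF_DIS_BIT) & 1,
        "asr_bit28": (value >> ASR_BIT) & 1,
        "refresh_rate": value & REFRESH_RATE_MASK,
    }


def clear_bit(value: int, bit: int) -> int:
    """Return value with the given bit cleared, kept within 32 bits."""
    return value & ~(1 << bit) & 0xFFFFFFFF


for raw in (0x10000C30, clear_bit(0x10000C30, ASR_BIT)):
    fields = decode_refresh_ctrl(raw)
    interval_ps = fields["refresh_rate"] * TCK_PS
    print(f"0x{raw:08X}: {fields}, refresh interval = {interval_ps} ps")
```

Under these assumptions, both values carry a refresh field of 0xC30 (3120 clocks, or 7.8 µs at 2.5 ns per clock), and they differ only in bit 28.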

    [Details of the problem]

    The red text in the attached file shows the addresses and data that were rewritten (before→rewritten).
    memory_rewritten_e2e.zip
    The same binary data also exists at another address in memory,
    so it appears that the data is not being corrupted by random bit errors,
    but is being replaced by data from another address.

    The overwritten addresses are not entirely random.
    There seems to be some regularity, but I haven't found it.
    Each time, 16 bytes are rewritten.

    Here's an example.
    Rewritten address
    Address    Data
    80299700 95FFFFEB7440BFE6001090E20110A013

    The address where the same binary is located.
    Address    Data
    801d8700 95FFFFEB7440BFE6001090E20110A013

    It appears that 80299700 has been overwritten with the contents of 801d8700.
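
One way to look for regularity is to compare the two addresses bit by bit and to relate the 16-byte size to a DDR3 burst. The sketch below uses only the two addresses from the example. The 16-bit bus width is an assumption based on the single x16 memory device listed in (2), and the sketch does not attempt to map CPU address bits to DDR row, column, or bank bits, because that mapping depends on the EMIF configuration.

```python
CORRUPTED_ADDR = 0x80299700   # address that was rewritten
SOURCE_ADDR = 0x801D8700      # address holding the same binary data

BUS_WIDTH_BITS = 16           # assumption: one x16 DDR3 device on the bus
BURST_LENGTH = 8              # DDR3 burst length of 8 transfers


def differing_bits(a: int, b: int) -> list[int]:
    """Return the bit positions at which the two addresses differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


xor = CORRUPTED_ADDR ^ SOURCE_ADDR
burst_bytes = (BUS_WIDTH_BITS // 8) * BURST_LENGTH

print(f"XOR of addresses : 0x{xor:08X}")
print(f"Differing bits   : {differing_bits(CORRUPTED_ADDR, SOURCE_ADDR)}")
print(f"Bytes per burst  : {burst_bytes}")
```

The two addresses differ in only four bits (12, 18, 20, and 21), and under the bus-width assumption one burst transfers 16 bytes, the same size as each rewritten block. Running the same comparison over every pair in the attached file may show whether the same address bits are involved each time.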

    ==============================================================

    I appreciate your continued support.

    Best regards,
    Kanae

  • Hi James,

    Could you please reply to my customer's questions dated Dec-17th?
    If anything is unclear, please let me know.

    I appreciate your continued support.

    Best regards,
    Kanae

  • Kanae, the initialization sequence is kicked off by a write to the SDRAM_CONFIG register.  They should not need to write to the ASR bit separately.  They should be using the programming sequence found in our software offerings, either the Linux SDK or RTOS.  Altering the sequence can result in incorrect initialization.  What software are they using?

    The errors shown in the zip seem to be slight timing issues in the address.  Have they checked the layout as well and followed the routing guidelines in the datasheet?

    Regards,

    James

  • Hi James,

    Thank you for your reply.
    I will check the items you pointed out with my customer.

    I appreciate your continued support.

    Best regards,
    Kanae