TMS320TCI6616: MSMC Error Detection and Correction mechanism

Hi,

After an MSMC SRAM non-correctable EDC error has been reported, the specific sub-bank on which the error resides is no longer accessible. Do you know the reason for this behavior?

Furthermore, is there any way to clear or recalculate the parity for the reported faulty line in MSMC? I would like to give the DSP an opportunity to continue execution if the error affects a data segment (which would not lead to a DSP crash).

MSMC registers when EDC is reported:

[smcerrar=0x00000000, smcerrxr=0x00000000, smncerrar=0x0c000800, smncerrxr=0x00000190, smsecc=0x80b60000]
[smestat=0x00000004, smirstat=0x00000005, smedcc=0x44000001]

CIC0 is used for MSMC SRAM Non-correctable EDC error handling.
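
To make the "continue only if the error hit a data segment" idea concrete, here is a minimal sketch of how a handler might classify the address captured in smncerrar. It assumes that smncerrar holds the full system address of the failing access (the dumped value 0x0c000800 suggests this, but it is not confirmed in this thread), that MSMC SRAM starts at 0x0C000000, and that it is 2 MB in size. The mem_region_t type, the region bounds, and classify_edc_address are hypothetical names for illustration; real bounds would come from the linker map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MSMC SRAM window. Base and size are assumptions for this sketch. */
#define MSMC_SRAM_BASE 0x0C000000u
#define MSMC_SRAM_SIZE 0x00200000u

/* Hypothetical region descriptor; real bounds come from the linker map. */
typedef struct {
    uint32_t start; /* first byte of the region */
    uint32_t size;  /* region length in bytes */
} mem_region_t;

typedef enum {
    EDC_OUTSIDE_MSMC, /* address is not in the MSMC SRAM window */
    EDC_IN_CODE,      /* address falls in the code region */
    EDC_IN_DATA,      /* address falls in the data region */
    EDC_IN_UNKNOWN    /* in MSMC SRAM, but in neither region */
} edc_class_t;

/* True if addr lies inside [start, start + size). */
static bool addr_in_region(uint32_t addr, const mem_region_t *r)
{
    return addr >= r->start && (addr - r->start) < r->size;
}

/* Classify the address read from smncerrar (assumed to be a full address). */
edc_class_t classify_edc_address(uint32_t smncerrar,
                                 const mem_region_t *code,
                                 const mem_region_t *data)
{
    const mem_region_t msmc = { MSMC_SRAM_BASE, MSMC_SRAM_SIZE };

    assert(code != NULL && data != NULL);

    if (!addr_in_region(smncerrar, &msmc)) {
        return EDC_OUTSIDE_MSMC;
    }
    if (addr_in_region(smncerrar, code)) {
        return EDC_IN_CODE;
    }
    if (addr_in_region(smncerrar, data)) {
        return EDC_IN_DATA;
    }
    return EDC_IN_UNKNOWN;
}
```

A handler could call classify_edc_address(smncerrar, &code, &data) and attempt to continue only when the result is EDC_IN_DATA. Whether the affected line can actually be repaired afterwards is the open question of this thread.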

BR/Ante

  • Hi Ante,

    I've forwarded this to the memory experts. Their feedback should be posted here.

    BR
    Tsvetolin Shulev
  • Hi Ante

    Do you use SYS/BIOS or bare-metal code? I think that in SYS/BIOS, when an exception occurs, there is an exception routine that captures the error and goes into an infinite loop.

    Based on your answer we will continue from there

    Ran
  • Hi,

    From my point of view, we are using bare metal programming.
    We have OS_PROCESS (the exception handler), which is called when an exception occurs.
    It verifies the exception state, clears the EXC bit, checks which kind of interrupt was reported, performs interrupt-specific actions, and so on.

    Our implemented MSMC interrupt servicing sequence:
      /* CIC0 host channel and system interrupt for the MSMC non-correctable EDC error */
      cic0_channel = 5U;
      system_int = CSL_INTC0_MSMC_DEDC_NC_ERROR;
      CSL_CPINTC_Handle csl_handle = CSL_CPINTC_open(0);
     
      /* Keep all host interrupts disabled while the mapping is changed */
      CSL_CPINTC_disableAllHostInterrupt(csl_handle);
      CSL_CPINTC_setNestingMode(csl_handle, CPINTC_NO_NESTING);
      CSL_CPINTC_mapSystemIntrToChannel(csl_handle, system_int, cic0_channel);
      /* Clear any stale status before enabling the system interrupt */
      CSL_CPINTC_clearSysInterrupt(csl_handle, system_int);
      CSL_CPINTC_enableSysInterrupt(csl_handle, system_int);
      CSL_CPINTC_enableHostInterrupt(csl_handle, cic0_channel);
      CSL_CPINTC_enableAllHostInterrupt(csl_handle);

      When the interrupt occurs:
      OS_PROCESS (exception handler) is called
      check CIC0 system interrupt status enabled/clear registers
      clear CIC0 system interrupt status enabled/clear registers flag
      CSL_CPINTC_clearSysInterrupt(csl_handle, system_int);
      CSL_CPINTC_enableSysInterrupt(csl_handle, system_int);

    BR/Ante

  • OK your question is really interesting. May I ask you to do the following experiment?

    Change the exception routine to an infinite loop, and then try to access the MSMC memory (the same bank where the problem occurred) and the MSMC registers from CCS. Post here whether you were able to access the memory or not.

    I assume that you have read chapters 2.5 and 2.6 of www.ti.com/.../sprugw7a.pdf and that you have not manipulated any of the MPAX registers.

    Ran
  • Hi,

    First of all, I don't have CCS.
    Where do you want the infinite loop: before or after the exception-specific actions?

    I have read the mentioned document, and we don't manipulate the MPAX registers in the exception handling routine.


    BR/Ante
  • So how do you debug DSP code?

    My suggestion is to put an infinite loop inside the exception routine, so you can look at the memory before returning from the exception. I know how to do it with CCS, but I am not sure how to do it with the debugger that you use.
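
One debugger-independent way to do this is to spin on a volatile flag that any external agent with memory access can clear. The following is only a sketch: exc_hold, exc_snapshot, msmc_snapshot_t, and exc_hold_until_released are hypothetical names, and the snapshot fields simply mirror three of the registers dumped earlier in this thread. The max_spins bound exists so the routine can also be exercised off-target; passing 0 spins until the flag is cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the MSMC error registers of interest. */
typedef struct {
    uint32_t smncerrar;
    uint32_t smncerrxr;
    uint32_t smestat;
} msmc_snapshot_t;

/* An external agent writes 0 here to release the handler. */
volatile uint32_t exc_hold = 1u;

/* Last captured register values, kept where an external agent can read them. */
volatile msmc_snapshot_t exc_snapshot;

/*
 * Save the register values, then spin until exc_hold becomes 0 or
 * max_spins iterations have elapsed (max_spins == 0 means no limit).
 * Returns the number of iterations spent spinning.
 */
uint32_t exc_hold_until_released(const msmc_snapshot_t *regs, uint32_t max_spins)
{
    uint32_t spins = 0u;

    assert(regs != NULL);

    /* Copy field by field because the destination is volatile. */
    exc_snapshot.smncerrar = regs->smncerrar;
    exc_snapshot.smncerrxr = regs->smncerrxr;
    exc_snapshot.smestat   = regs->smestat;

    while (exc_hold != 0u && (max_spins == 0u || spins < max_spins)) {
        spins++;
    }
    return spins;
}
```

Calling this at the start of the exception routine, before any clearing or re-enabling, leaves the MSMC state untouched while the memory and registers are inspected from outside.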

    Ran

  • Hi,

    It's hard to explain my debugging methods.

    In the meantime, could you give me some advice regarding my second question: "is there any possibility to clear/recalculate the parity for reported faulty line in MSMC?"


    BR/Ante
  • Hi,


    I've tried your proposal, with an infinite loop implemented in the exception handler, in several scenarios.

    1) EDC exception cleared; fetch value from MSMC (in DSP code):

    EDC exception raised again.

    2) EDC exception cleared but not enabled; fetch value from MSMC (in DSP code):

    Access violation exception detected

    3) EDC exception cleared but not enabled; fetch value from MSMC (outside the DSP code):

    sRIO event detected (Direct IO read failed)

    4) EDC exception not cleared; fetch value from MSMC (in DSP code):

    Access violation exception detected

    BR/Ante

  • I am not sure if you have answered your own question.

    Ran