TMS320C6678: Test EDC/ECC mechanism

Part Number: TMS320C6678

Hello,

I am developing a software sequence whose goal is to test the EDC/ECC mechanism of the TMS320C6678 DSP.

The sequence is the following:

  1. Enabling EDC (in L1P, L2 and MSM) and ECC in DDR3

  2. Initialization of L2, MSM and DDR3 to 0

  3. Initialization of Memory protection, interrupts

  4. Running EDC/ECC test

The L2 EDC test consists of:

5-1/ Writing to an arbitrary L2 address (0x00868000 in my case) value 0xFFFFFFFF

N.B: 0x00868000 is an arbitrary address. The only condition is to choose an address where R/W/X accesses are allowed.

5-2/ Suspend L2 EDC computation by setting bit 3 of the L2EDCMD register to 1

5-3/ Write 0xFFFFFFF3 at address 0x00868000.

N.B: The value 0xFFFFFFF3 is chosen because it differs from the previous value 0xFFFFFFFF in only two bits.

N.B.2: The goal of this step is to simulate corruption of bits 2 and 3 at address 0x00868000. The parity RAM is not updated because the EDC mechanism is suspended.

5-4/ Enable L2 EDC mechanism by setting bit 0 of L2EDCMD register to 1

5-5/ Perform a read access at address 0x00868000

5-6/ Wait for L2 EDC interrupt
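
Steps 5-1 to 5-5 can be sketched in C as follows. This is a minimal sketch, not the code actually run on the board: the macro and function names are illustrative, the bit positions are the ones quoted in the steps above, and the register and target locations are passed in as pointers so that the sequence can also be exercised on a host with plain variables. On the device they would be the L2EDCMD address from the CorePac documentation and 0x00868000.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as quoted in steps 5-2 and 5-4 (names are illustrative). */
#define L2EDCMD_EN    (1u << 0)   /* step 5-4: enable EDC  */
#define L2EDCMD_SUSP  (1u << 3)   /* step 5-2: suspend EDC */

#define GOOD_VALUE     0xFFFFFFFFu
#define CORRUPT_VALUE  0xFFFFFFF3u  /* GOOD_VALUE with bits 2 and 3 cleared */

/* Number of bit positions in which a and b differ. */
static unsigned bit_diff(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;
    unsigned n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Steps 5-1 to 5-5. Returns the value read back in step 5-5. */
static uint32_t l2_edc_inject(volatile uint32_t *l2edcmd,
                              volatile uint32_t *target)
{
    assert(l2edcmd && target);
    *target  = GOOD_VALUE;      /* 5-1: write the reference value           */
    *l2edcmd = L2EDCMD_SUSP;    /* 5-2: suspend EDC                         */
    *target  = CORRUPT_VALUE;   /* 5-3: "corrupt" two bits while suspended  */
    *l2edcmd = L2EDCMD_EN;      /* 5-4: enable EDC again                    */
    return *target;             /* 5-5: read access that should be checked  */
}
```

Step 5-6 (waiting for CorePac event 117) depends on the interrupt setup and is left out of the sketch.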

However, this sequence does not work (the L2 EDC interrupt corresponding to CorePac event 117 is never raised).

I suspect that the L2 write in step 5-1 is not performed as a 128-bit access. I know that my instruction is a 32-bit write, but initially I thought that the DSM memory controller would always perform the access in 128 bits. That does not seem to be the case. If this is confirmed, the corresponding parity RAM is not updated as expected (because the access is narrower than 128 bits, as explained in §11 of the C66x CorePac datasheet); only the valid bit is updated.
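
The hypothesis can be illustrated with a small host-side model. This is a toy sketch only, not a description of the real hardware: all names are invented for the illustration, a simple XOR fold stands in for the real syndrome, and the rule "a narrow write only invalidates the check bits" is exactly the assumption being tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one 128-bit L2 line with its check bits (not the real hardware). */
typedef struct {
    uint32_t word[4];      /* 128 bits of data                    */
    uint32_t check;        /* stand-in for the stored syndrome    */
    bool     check_valid;  /* "valid" bit for the check field     */
} line_t;

/* Stand-in checksum: XOR fold of the four words (the real code is SECDED). */
static uint32_t fold(const line_t *l)
{
    return l->word[0] ^ l->word[1] ^ l->word[2] ^ l->word[3];
}

/* Full 128-bit write: data and check bits are written together. */
static void write128(line_t *l, const uint32_t v[4], bool edc_suspended)
{
    for (int i = 0; i < 4; i++)
        l->word[i] = v[i];
    if (!edc_suspended) {
        l->check = fold(l);
        l->check_valid = true;
    }
}

/* Narrow 32-bit write: data changes, check bits are only marked invalid. */
static void write32(line_t *l, unsigned idx, uint32_t v, bool edc_suspended)
{
    assert(idx < 4);
    l->word[idx] = v;
    if (!edc_suspended)
        l->check_valid = false;
}

/* Read with EDC enabled: an error is reported only if the check bits are valid. */
static bool read_reports_error(const line_t *l)
{
    return l->check_valid && l->check != fold(l);
}
```

In this model, a line written with `write32` and then modified while suspended never reports an error on read, whereas a line written with `write128` does.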

So I tried replacing step 5-1 of the previous sequence with the following one:

5-1-bis/ Program the IDMA to write the value 0xFFFFFFFF to the range [0x00868000--0x00868010] by setting the IDMA registers as follows:

  1. IDMA1_SOURCE is set to 0xFFFFFFFF

  2. IDMA1_DEST is set to 0x00868000

  3. IDMA1_COUNT is set to 0x00010010
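
The register values of 5-1-bis can be written as a small C helper. This is a sketch under assumptions: the struct is a hypothetical view of the three registers rather than the real memory map, and the field layout (fill-mode flag at bit 16, byte count in the low 16 bits as a multiple of 4) is my reading of the value 0x00010010 and should be checked against the CorePac documentation.

```c
#include <assert.h>
#include <stdint.h>

#define IDMA1_COUNT_FILL  (1u << 16)  /* assumed: fill mode, SOURCE holds the fill value */
#define IDMA1_COUNT_MASK  0xFFFCu     /* assumed: byte count, multiple of 4              */

/* Hypothetical view of the three channel-1 registers used in 5-1-bis. */
typedef struct {
    volatile uint32_t source;
    volatile uint32_t dest;
    volatile uint32_t count;
} idma1_regs_t;

/* Build the COUNT value for a fill of nbytes bytes. */
static uint32_t idma1_fill_count(uint32_t nbytes)
{
    assert((nbytes & ~IDMA1_COUNT_MASK) == 0);
    return IDMA1_COUNT_FILL | nbytes;
}

/* Program a fill of nbytes bytes of `value` starting at `dest`. */
static void idma1_fill(idma1_regs_t *r, uint32_t value,
                       uint32_t dest, uint32_t nbytes)
{
    r->source = value;
    r->dest   = dest;
    r->count  = idma1_fill_count(nbytes);  /* written last */
}
```

With these assumptions, `idma1_fill(regs, 0xFFFFFFFF, 0x00868000, 16)` produces the three values listed above, i.e. a 16-byte (one 128-bit line) fill.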

With this single modification, my test passes (i.e. CorePac event 117 is generated).

This seems to confirm my analysis that the memory access is not done in 128 bits.

I performed a third test as follows:

5-1-ter/ Write 0xFFFFFFFF at address 0x00868000, 0x00868004, 0x00868008 and 0x0086800C in 4 assembly instructions

Result: The test fails and event 117 is not generated. This seems to confirm that every DSP access is done in 32 bits.

So my test works (using IDMA), but I have the following questions:

Q1- If DSP R/W instructions are done in 32 bits, then the parity RAM will never be updated and only the valid bit will be set to invalid. Could you explain how an L2 memory corruption is supposed to be detected/corrected in this case?

Q2- There is a scrubbing technique described in §11.3.3 of the C66x CorePac datasheet. Is it the only method of detecting an L2 data corruption?

Q3- Is there a way to force every R/W access coming from the DSP core to be done in 128 bits? If that is already the case, can you explain what I am doing wrong in my test sequence?

Q4- I am trying to perform the same kind of test on MSM memory, but it seems complicated because, as explained in §2.5.2.1 of SPRUGW7A, “Setting ECM=1 enables error correction; however, any further writes to the ECM bit are ignored (that is, once enabled, error correction stays enabled until the MSMC is reset).” Since the ECC mechanism cannot be deactivated, it is impossible to test the MSM EDC mechanism with my technique, because I cannot “simulate” an MSM data corruption. Am I right, or is there another way to test it?

Q5- Same question for DDR3: is it possible to test the DDR3 ECC mechanism (with the test sequence described above or in another way)?

Thanks in advance for your answer.


Regards,

  • Hi Tiago,

    I've forwarded this to the SW team. Their feedback should be posted here.

    BR
    Tsvetolin Shulev
  • Hi,

    OK, I am waiting for the SW team's answer.

    Could you please ask them to give me an answer as soon as possible?

    It is becoming very urgent for me.

    I can give you additional information if necessary.

    Best Regards

    Tiago
  • Hi,

    Sorry, I missed this thread.

    EDC (Error Detection & Correction) in L2: The EDC functionality is meant to protect the SRAM against random, transient memory cell errors, typically caused by high-energy particles (cosmic particles, radiation, etc.). The L2 EDC protection works at a 128-bit granularity and provides 1-bit correction and 2-bit detection. It does this by generating and checking against a signature called the syndrome. This syndrome may be referred to as ‘Parity’ information in some specs. The syndrome is generated when 128 bits of data are stored in the L2, and is written to the SRAM with the data. On reads, the syndrome is regenerated from the data and compared against the stored syndrome. Any miscompare indicates an error.

    At reset, EDC is disabled. Coming out of reset, the L2 controller does not initialize the L2SRAM, meaning that after reset the contents of the L2SRAM are unknown (as opposed to all zeroes). Thus, there may be data in the L2SRAM where the actual data and the syndrome (parity signature) are not in sync, and if the CPU were to read the contents of this SRAM at this time, the L2 controller would detect that the (random) data and (random) syndrome are not consistent and take a 2-bit EDC exception.

    The IDMA scrub is a mechanism to regenerate fresh syndrome information for the L2SRAM data. This is how it is done (Section 11.3.2-3 in SPRUGW0B).

    DSP R/W instructions are done in 32 bits, not 128 bits. You need to scrub the memory range you intend to protect so that L2 errors can be detected and corrected.
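
    A scrub over a range could be sketched in C as below. This is only a sketch under assumptions, not TI code: it assumes the scrub is an IDMA1 copy whose source and destination are the same address, that one transfer is limited to a 16-bit byte count, and it uses a hypothetical `idma1_copy_fn` hook in place of the real register programming so that the chunking can be tried on a host. Check the section cited above for the actual procedure.

```c
#include <assert.h>
#include <stdint.h>

#define EDC_LINE_BYTES   16u      /* 128-bit EDC granularity                          */
#define SCRUB_MAX_BYTES  0xFFF0u  /* assumed per-transfer limit, kept a multiple of 16 */

/* Hypothetical hook that starts one IDMA1 copy and waits for it to finish. */
typedef void (*idma1_copy_fn)(uint32_t src, uint32_t dst,
                              uint32_t nbytes, void *ctx);

/* Scrub [base, base + len): copy each chunk onto itself so that every
 * 128-bit line is rewritten in full. Returns the number of transfers issued. */
static unsigned l2_scrub(uint32_t base, uint32_t len,
                         idma1_copy_fn copy, void *ctx)
{
    unsigned transfers = 0;
    assert(base % EDC_LINE_BYTES == 0 && len % EDC_LINE_BYTES == 0);
    while (len > 0) {
        uint32_t n = (len > SCRUB_MAX_BYTES) ? SCRUB_MAX_BYTES : len;
        copy(base, base, n, ctx);   /* source == destination */
        base += n;
        len  -= n;
        transfers++;
    }
    return transfers;
}

/* Host-side stand-in for the hook: only adds up the bytes requested. */
static void host_count_bytes(uint32_t src, uint32_t dst,
                             uint32_t nbytes, void *ctx)
{
    assert(src == dst);
    *(uint32_t *)ctx += nbytes;
}
```

    On the device, the hook would program IDMA1_SOURCE, IDMA1_DEST and IDMA1_COUNT and poll for completion; `host_count_bytes` is there only to exercise the loop off-target.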

    Regards, Eric