F28M36H33B2: bus or hard fault on RAM read access in the LRAM3 area

Part Number: F28M36H33B2


Hi C2K champs,

A bus or hard fault occurs very rarely on a RAM read access on an F28M36 device. The problem happens on a read access in the LRAM3 area (0x20018000 to 0x2002FFFF); no specific RAM address has been identified. This address range is accessed very frequently by the firmware.

Q1 : In case a bus or hard fault occurs and then returns to normal, will the fault be cleared?

Q2 : What are the possible causes of bus or hard fault that occurs after accessing (reading) RAM data?

Thank you!

Best regards,

Guillaume

  • Guillaume,

    Please answer the following questions. You can also send the information to me privately.

    1. How long has this design been in production? Or is it yet to go into production?
    2. If this design is in production, how many boards have been shipped with this device to date?
    3. In how many boards is this problem seen? What is the DPPM value?
    4. In a board where the problem is seen, is the problem intermittent or permanent?
    5. If the problem is intermittent, how often is it seen?
    6. Is it possible to reproduce the problem easily and consistently?
    7. Is this seen in a device that has been working fine for some time? If so, how long? 

    If the problem is seen in just one device, it could be due to a latent defect.

  • Here are the answers that TI provided to my first two questions, based on the problem context:

    Q1: If a bus or hard fault occurs and then returns to normal, will the fault be cleared?

    The fault may not be cleared automatically. The fault handler in the user application needs to take care of this.

     

    Q2: What are the possible causes of bus or hard fault that occurs after accessing (reading) RAM data?

    Most probably it is due to an uncorrectable ECC error. The user should be able to check the flag register to confirm this. ECC errors can be caused by noise on the board.

    Regards,

    Guillaume
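Since the answers above leave both checking and clearing the fault to the application's fault handler, here is a minimal sketch of that step. It assumes the standard Cortex-M3 System Control Block registers (CFSR at 0xE000ED28, HFSR at 0xE000ED2C, BFAR at 0xE000ED38) and does not touch the device-specific ECC flag registers, whose addresses should be taken from the device documentation. The names `fault_info_t`, `decode_fault` and `busfault_clear_value` are hypothetical.

```c
#include <stdint.h>

/* Cortex-M3 System Control Block addresses (architectural, not TI-specific). */
#define SCB_CFSR_ADDR 0xE000ED28u /* Configurable Fault Status Register */
#define SCB_HFSR_ADDR 0xE000ED2Cu /* HardFault Status Register */
#define SCB_BFAR_ADDR 0xE000ED38u /* BusFault Address Register */

/* BusFault status bits inside CFSR (bits 15:8). */
#define CFSR_PRECISERR     (1u << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR   (1u << 10)  /* imprecise data bus error */
#define CFSR_BFARVALID     (1u << 15)  /* BFAR holds the faulting address */
#define CFSR_BUSFAULT_MASK 0x0000FF00u
#define HFSR_FORCED        (1u << 30)  /* another fault escalated to HardFault */

/* Hypothetical summary of what the handler found. */
typedef struct {
    int is_bus_fault;   /* any BusFault status bit set */
    int precise;        /* fault is tied to a specific instruction */
    int address_valid;  /* BFAR can be read for the faulting address */
    int escalated;      /* HardFault was forced by another fault */
} fault_info_t;

/* Decode snapshots of CFSR and HFSR taken at handler entry. */
fault_info_t decode_fault(uint32_t cfsr, uint32_t hfsr)
{
    fault_info_t info;
    info.is_bus_fault  = (cfsr & CFSR_BUSFAULT_MASK) != 0u;
    info.precise       = (cfsr & CFSR_PRECISERR) != 0u;
    info.address_valid = (cfsr & CFSR_BFARVALID) != 0u;
    info.escalated     = (hfsr & HFSR_FORCED) != 0u;
    return info;
}

/* The status bits are write-1-to-clear: writing back the bits that
 * were set clears exactly those BusFault flags and nothing else. */
uint32_t busfault_clear_value(uint32_t cfsr)
{
    return cfsr & CFSR_BUSFAULT_MASK;
}

/* On target (assumed handler body, not from the thread):
 *   volatile uint32_t *cfsr = (volatile uint32_t *)SCB_CFSR_ADDR;
 *   uint32_t snapshot = *cfsr;
 *   *cfsr = busfault_clear_value(snapshot);
 */
```

Logging the decoded flags together with BFAR (when valid) before clearing them would help narrow down which LRAM3 address was being read when the fault occurred.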

  • Thank you Guillaume,

    Regarding Q2, if an uncorrectable ECC error occurs, is it possible to recover from this error?

    If so, how should I do it?

    Best regards,

  • It is suggested to reset the device in case of an uncorrectable error, because the error cannot be corrected.
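As an illustration of the suggested reset, the sketch below builds the value that requests a system reset through the standard Cortex-M3 AIRCR register (0xE000ED0C). This is an assumption based on the Cortex-M3 architecture; whether this is the preferred reset method on the F28M36 should be checked against the device documentation. The helper name `aircr_reset_request` is hypothetical.

```c
#include <stdint.h>

/* Cortex-M3 Application Interrupt and Reset Control Register. */
#define SCB_AIRCR_ADDR      0xE000ED0Cu
#define AIRCR_VECTKEY       (0x05FAu << 16) /* key required for any write */
#define AIRCR_PRIGROUP_MASK (0x7u << 8)     /* priority grouping, preserved */
#define AIRCR_SYSRESETREQ   (1u << 2)       /* request a system reset */

/* Build the word to write to AIRCR: the write key, the current
 * priority grouping, and the reset request bit. */
uint32_t aircr_reset_request(uint32_t current_aircr)
{
    return AIRCR_VECTKEY
         | (current_aircr & AIRCR_PRIGROUP_MASK)
         | AIRCR_SYSRESETREQ;
}

/* On target (assumed usage at the end of the fault handler):
 *   volatile uint32_t *aircr = (volatile uint32_t *)SCB_AIRCR_ADDR;
 *   *aircr = aircr_reset_request(*aircr);
 *   for (;;) { }   // wait for the reset to take effect
 */
```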

  • Hi Vivek,

    Thank you for your reply.

    This CPU has the M3 Uncorrectable Error Flag Clear register (MUECLR). Can this register be used to clear this error?

    Of course, I understand that the data in RAM may no longer be valid.

     

    Best regards,

  • Hi Vivek,

    Q1. Are there any ways to recover from an uncorrectable error other than a reset?

    Q2. The error occurs just after the instruction highlighted in red below (LDRH R7,[R6]); R0 contains the RAM address (e.g. 0x2002EE22).

    Are there any problems with this source code?

          <Source code>

             PUSH    {R4-R9}            ;Save working registers
             CPSID   I                  ;Disable interrupts
             MOV     R4,#0x5555         ;Check pattern 1
             MOV     R5,#0xAAAA         ;Check pattern 2
             MOV     R6,R0              ;RAM test address
             LDRH    R7,[R6]            ;Memory data save
             STRH    R4,[R6]            ;Write check pattern 1

    Thank you!

    Best regards,

    Guillaume

  • This CPU has the M3 Uncorrectable Error Flag Clear register (MUECLR). Can this register be used to clear this error?

    Yes, it will clear the error flag.

    Vivek Singh

  • Q1. Are there any ways to recover from an uncorrectable error other than a reset?

    It's a difficult question to answer because it depends on whether it's a data error or a program fetch error. Our recommendation is to reset the device and restart the application.

    Are there any problems with this source code?

    This should be compiler generated code so I don't expect any issue with it. 

    Vivek Singh

  • Thank you Vivek,

    This assembly code was NOT generated by the compiler. This code was written by our engineer.

    So I would like you to check this code.

    This code is for RAM checking.

    The process is as follows:

    Save 1 word of data (original data) -> Write 0xAAAA -> Check the data -> Write 0x5555 -> Check the data -> Restore the original data

    If the data doesn't match during a check, an error is raised.

    The error occurs when executing the save of 1 word of data (original data).

    Best regards,

    Tatsuya
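For reference, the sequence described above can be sketched in C as follows. This is a hypothetical equivalent, not the code from the product: the function name `ram_check_halfword` is made up, the pattern order follows the prose description (the posted assembly fragment stores 0x5555 first), and interrupt masking (the `CPSID I` in the assembly) is left to the caller. The initial read still performs a load from the tested address, so this sketch presumably would not avoid a fault caused by an ECC error at that address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Non-destructive pattern test of one 16-bit RAM location.
 * The caller is assumed to have disabled interrupts so that nothing
 * else uses the location while it holds a test pattern. */
bool ram_check_halfword(volatile uint16_t *addr)
{
    bool ok = true;

    /* Save the original data (the LDRH R7,[R6] step in the assembly). */
    const uint16_t saved = *addr;

    /* Write 0xAAAA and read it back. */
    *addr = 0xAAAAu;
    if (*addr != 0xAAAAu) {
        ok = false;
    }

    /* Write 0x5555 and read it back. */
    *addr = 0x5555u;
    if (*addr != 0x5555u) {
        ok = false;
    }

    /* Restore the original data regardless of the result. */
    *addr = saved;

    return ok;
}
```

The `volatile` qualifier keeps the compiler from removing or merging the repeated writes and read-backs, which is the main reason such a test can be expressed in C at all.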

  • Is there any specific reason to write this code in assembly?

    Also, does the error occur every time this code is executed, or only sometimes?

    Would it be possible to write the same code in C and try it?

    Vivek Singh

  • We need to save RAM data in one of the general-purpose registers (e.g. R6) in order to implement this RAM checking procedure.

    So we have written this code in assembly. Therefore, I think it would be difficult to write this code in C.

    This error occurs very rarely (the issue occurred after more than three months under continuous power supply conditions).

    Best regards,

    Tatsuya

  • This error occurs very rarely (the issue occurred after more than three months under continuous power supply conditions).

    In that case the assembly code should be fine, but I don't see the sequence below in this assembly code.

    Save 1 word of data (original data) -> Write 0xAAAA -> Check the data -> Write 0x5555 -> Check the data -> Restore the original data

    Vivek Singh

  • Hi Vivek,

    The posted assembly code is a part of this sequence.

    I'd like to know if there is something wrong with the use of this mnemonic (LDRH R7,[R6]).

    Best regards,

    Tatsuya

  • As I mentioned earlier, I don't see any issue with that mnemonic but I'll loop in our compiler team also to confirm the same.

    Vivek Singh

  • I'd like to know if there is something wrong with the use of this mnemonic (LDRH R7,[R6]).

    I'm not sure what I'm checking for.  That is a valid Arm instruction.  When you single step through the code, does it do what you expect?

    Thanks and regards,

    -George

  • Hi George, Hi Vivek,

    Thank you for checking and for your support.

    This code normally works fine. But a bus or hard fault occurs very rarely in this code on our product.

    Therefore, I wanted to know if there were any problems with this mnemonic usage.

    It seems likely that an ECC error has occurred.

    If this ECC error is due to a program fetch error, is it possible to resolve this issue by clearing the error using the MUECLR register and re-reading the RAM?

    Best regards,

    Tatsuya

  • Hi,

    If this ECC error is due to a program fetch error, is it possible to resolve this issue by clearing the error using the MUECLR register and re-reading the RAM?

    This will clear the error flag, but since it is an uncorrectable error, re-reading will not correct the error; hence returning to the same address may cause the fault again.

    Regards,

    Vivek Singh