TMS320F280041: F280041 ADC Interrupt Overflow Issue

Part Number: TMS320F280041

Hi, Champs,

I'm writing to request urgent support for my OBC customer on an F280041 ADC interrupt overflow issue.

The F280041 OBC platform has gone to mass production and will ramp up soon. However, they have hit a critical issue at their end customer's site: the ADC interrupt overflows and is then no longer serviced.

We spent the weekend in their lab trying to identify the root cause and find a solution, but we still need your help to clarify the root cause and explain why the device did not behave as expected.

Please find the attached issue summary and share your suggestions, or let us know if you need more information to understand the details.

F280041 ADC Interrupt Overflow Issue Summary.pdf

Best Regards,

Ricky Zhang

  • Hi Ricky,

    In general, if you clear all the overflow flags in the ADC and make sure everything is cleared out of the PIE, it should be possible to resume servicing subsequent interrupts. Since the document says that interrupts fail to resume when the ISR duration is around an integer multiple of the ADC trigger period, I'd have to assume that they are running into a race condition where the new interrupt coming in is causing the ADCINT flag to be set shortly after the attempt to clear the flag and its overflow. This results in the ISR exiting with the ADCINT flag still pending. Subsequent setting of the ADCINT flag causes an overflow instead of an ISR.

    You can verify the above hypothesis by adding some code at the very end of the ISR to check if any of the ADCINT flags and/or ADCINT overflow flags are set. I suspect that they are in the case where the application fails to resume.

    An easy solution might be to disable the CPU timer trigger to the ADC when you are inside the ISR (either always do this, or at least do this when you are doing the operation that results in the long ISR). This will prevent additional results from resetting the ADCINT flag while you are still in the ISR. This should also actually prevent the overflow from occurring in the first place.
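
A minimal sketch of the end-of-ISR check suggested above. It is written as host-side C so it can be compiled anywhere: the `AdcFlagSnapshot` struct and the function name are hypothetical stand-ins, and on the device the two fields would be read from the ADC's ADCINTFLG and ADCINTOVF registers (assumed here to carry ADCINT1 in bit 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side copy of the two ADC status registers.
 * Assumption: bit 0 corresponds to ADCINT1 in both registers. */
typedef struct {
    uint16_t intflg;  /* value read from ADCINTFLG */
    uint16_t intovf;  /* value read from ADCINTOVF */
} AdcFlagSnapshot;

#define ADCINT1_MASK 0x0001u

/* Returns true if the ISR is about to exit with ADCINT1 still latched
 * or overflowed, i.e. the state in which no further ISR is expected
 * in non-continuous mode. */
static bool adcint1_stuck_at_isr_exit(const AdcFlagSnapshot *s)
{
    assert(s != NULL);
    return ((s->intflg & ADCINT1_MASK) != 0u) ||
           ((s->intovf & ADCINT1_MASK) != 0u);
}
```

Calling a check like this as the last statement of the ISR, and incrementing a debug counter or toggling a spare GPIO when it returns true, would show whether the ISR ever exits with the flag still pending.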
  • Devin,

    Thanks for your prompt response and advice.

    As the document describes, I think we already have a solution based on Tommy's reply to another post, but we now want to understand what happens at that moment and why.

    I have 4 questions based on your reply.

    1. The customer didn't clear the ADCINT overflow flag in the ISR, but a new ISR can still be serviced as long as the ISR duration is NOT around an integer multiple of the ADC trigger period. Is this normal operation for the PIE on our chip?

    2. If it is normal operation, why is the new ISR not serviced right after the old ISR, and why does it instead start executing at a point that is an integer multiple of the ADC trigger period?

    3. Even without an ADCINT overflow, a new ISR might not be serviced if the ISR duration is 1x the ADC trigger period. Does that make sense?

    4. How do the CPU and PIE behave when the ISR duration is an integer multiple of the ADC trigger period, such that new ISRs are never resumed?

    We have already checked the ADCINT flags and ADCINTOVF flags in the CCS register view and know that both are set.

    Best Regards,

    Ricky Zhang

  • Hi Ricky,

    The CPU should immediately service an interrupt from the PIE if a flag is pending when the CPU returns from another ISR.  In other words, servicing the ISR is not a pulse-triggered event -  as long as the flag is pending the CPU will eventually get around to servicing the ISR.  

    If the ADCINT flag and overflow flag are not cleared, the only way for the ADC to generate further interrupts is if the continuous run mode is set.  In this case, the event that propagates to the PIE and sets the PIE flag pending is a pulse.  The ADCINT flag can be set, but this will have no effect on the PIE.  The PIE pending flag is only set when the SOC ends and the ADCINT pulse propagates to the PIE.  If this was not the case the ADC ISR would continually re-enter as soon as it was exited because the ADCINT flag is always set and is never cleared in continuous  mode.   

    I'd definitely recommend that you clear the ADCINT and ADCINTOVF flags, at least for debug (even though it sounds like you are relying on the continuous INT mode of the ADC to trigger the PIE). This way you can confirm that further ADCINT events are still occurring. If you can confirm that this is the case, then the issue is in the PIE configuration (probably one of the PIE enable or global interrupt enable flags is somehow getting cleared).

    Along the lines of the last sentence of the previous paragraph, at any point in the ISR or background code are you toggling on/off the global interrupt enable or the PIE interrupt enables (perhaps to create an atomic section)? If so this is probably the key to the race condition that is resulting in the system ending up in a state where it won't service ISRs.

    You might also try my suggested workaround (disable the ADC SOC trigger during the ISR) because I think it should be pretty robust against an overflow/race condition.
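
The pulse gating described in this reply can be sketched as a small host-side model. This is an illustration only, not TI driver code: the `AdcIntModel` struct and `adc_int_on_eoc` are hypothetical names, and the model assumes that the overflow flag is pure status while only the ADCINT flag gates the pulse in non-continuous mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a single ADCINT line. */
typedef struct {
    bool flag;        /* models the ADCINT flag in ADCINTFLG */
    bool overflow;    /* models the matching bit in ADCINTOVF */
    bool continuous;  /* models the continuous-mode setting */
} AdcIntModel;

/* Applies one end-of-conversion event.
 * Returns true if an interrupt pulse would reach the PIE. */
static bool adc_int_on_eoc(AdcIntModel *m)
{
    assert(m != NULL);
    bool was_set = m->flag;
    if (was_set) {
        m->overflow = true;  /* EOC arrived while the flag was still latched */
    }
    m->flag = true;
    /* Non-continuous: pulse only when the flag was clear.
     * Continuous: a pulse is sent on every EOC. */
    return m->continuous || !was_set;
}
```

In this model, two back-to-back EOC events without a clear in between produce one pulse in non-continuous mode and two pulses in continuous mode, with the overflow bit set in both cases.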

  • Devin,

    Thanks for your detailed explanations. I will ask the customer to run some tests based on your suggestions: checking and clearing the ADCINTOVF flags, disabling the ADC SOC trigger, and confirming whether any code enables or disables the PIE IER and PIEIER registers.

    We have already confirmed that the ADCINT flag is cleared at the end of the ISR. It also seems the customer has enabled only this one ADCINT, so there should be no other PIE handling in their background code or ISR.

    With ADCINTOVF left uncleared in a long ISR, we still can't understand why a new ISR can be serviced as long as the previous ISR duration is NOT an integer multiple of the ADC trigger period. As you mentioned above, further ADCINT events should not be triggered and propagated to the PIE when continuous run mode for ADCINT is not enabled.

    We also know that enabling continuous run mode for ADCINT fixes this issue, but the customer did not do that in their original code. So we still need to understand why the ISR behaves like this: in most cases it resumes, and it fails to resume only when the previous ISR duration is an integer multiple of the ADC trigger period.

    Best Regards,

    Ricky Zhang

  • Hi Ricky,

    The integer multiple issue is almost certainly a race condition:  If the ISR duration is some integer multiple of the triggering period, this means that the ISR is ending at the same time the next ISR trigger pulse is coming in.  In this case the particular timing sequence of clears/acknowledgements/enables/disables is leaving the device in a state where interrupts are not enabled or not being triggered.  My questions are trying to help you determine which particular setting is subject to the race.  

  • Devin,

    I agree that the integer-multiple issue is certainly a race condition. However, testing and code inspection show that some of the conditions you describe are definitely not met. The facts are:

    - ADCINTOVF is not cleared in the customer code;

    - ADCINTOVF is not set when a new ISR is serviced, even if the previous ADCINT ISR execution time is longer than a trigger period;

    - There is no peripheral interrupt enable/disable or PIE enable/disable in the customer code, except for the required PIE acknowledgement. (We know DINT/EINT and the PIE IER are handled when exiting an ISR, but that is done by the hardware or compiler. The document attached to my original post shows how the customer's code enters and exits an ISR.)

    I appreciate your questions aimed at identifying possible reasons for the following 2 issues:

    - when the execution time is an integer multiple of the trigger period, a new ISR cannot be serviced;

    - when the execution time exceeds one trigger period but is not an integer multiple of it, a new ISR is serviced at the next integer-multiple point of the trigger period;

    However, our experiments in the customer's lab have still not identified the root cause. Instead, this appears to be inherent device behavior: as long as the ISR execution time is configured appropriately, it should be easy to reproduce on an experimenter's kit.

    In that case, would you mind trying it out on your side and helping us understand exactly how the chip behaves in both scenarios?

    Thanks for your time and efforts to get this issue closed.

    Best Regards,

    Ricky Zhang

  • Hi Ricky,

    I believe we've answered question (1) from the initial report already: during the ISR, interrupts are disabled; since continuous mode is not enabled the ADC interrupt flag is overflowed so no further ISRs are generated (and because no ISRs are generated, the ADC flags are never cleared).

    For (2) from the initial report:
    See normal interrupt operation here: processors.wiki.ti.com/.../Interrupt_Nesting_on_C28x
    The PIE is disabled during the ISR, so the pulse interrupt from the ADC is lost because an ISR is executing. The ISR does not resume right after the previous one ends because nothing got set in the PIE because it was disabled. The next interrupt request pulse from the ADC goes through because no ISR is executing at that time.
  • Devin,

    Regardless of the ISR execution time and whether the PIE is disabled or enabled in the ISR, the setting of the ADCINTFLG and ADCINTOVF flags is a behavior internal to the ADC module, correct?

    However, your answers to questions #1 and #2 below do not seem consistent with each other:

    Devin Cottier said:

     
    I believe we've answered question (1) from the initial report already: during the ISR, interrupts are disabled; since continuous mode is not enabled the ADC interrupt flag is overflowed so no further ISRs are generated (and because no ISRs are generated, the ADC flags are never cleared).

    Here you indicate that the ADCINTOVF flag will be set.

    Devin Cottier said:

     
    For (2) from the initial report:
    See normal interrupt operation here: processors.wiki.ti.com/.../Interrupt_Nesting_on_C28x
    The PIE is disabled during the ISR, so the pulse interrupt from the ADC is lost because an ISR is executing. The ISR does not resume right after the previous one ends because nothing got set in the PIE because it was disabled. The next interrupt request pulse from the ADC goes through because no ISR is executing at that time.

    Here the ADC interrupt flag is not overflowed, so a new ADCINTFLG can be set and the pulse can go through?

    I still think these strange behaviors have not been explained clearly enough, and that we should run some bench tests to look into how and why they happen. This should have nothing to do with the customer's application code; it is simply a timing issue that makes the PIE behave like this, and we should be able to reproduce it on our experimenter's kit.

    Best Regards,

    Ricky Zhang

  • Hi Ricky,

    My understanding is that continuous mode is enabled in case 2, but not in case 1. In both cases the ADCINT will overflow but pulses can still propagate out to the PIE in case 2.
  • Devin,

    No, unfortunately both cases below are observed under the same conditions, as described in the "Background and Issue Description - Basic configuration" section of the document in my original post (continuous mode is mentioned only in the Recommendation and Solutions section):

    - when the execution time is an integer multiple of the trigger period, a new ISR cannot be serviced;

    - when the execution time exceeds one trigger period but is not an integer multiple of it, a new ISR is serviced at the next integer-multiple point of the trigger period;

    Could you please help work out the difference here?

    Best Regards,

    Ricky Zhang

  • Hi Ricky,

    I'm in the process of handing this issue off to the PIE expert. Expect an update soon.
  • Devin,

    Thank you so much for following up. I look forward to your response.

    Best Regards,

    Ricky Zhang

  • Devin,

    I hope you had a nice holiday.

    Do you think we can have an update for this issue now?

    Best Regards,

    Ricky Zhang

  • Devin,

    Can we expect an update by this week?

    The customer is pushing us harder because they are still under pressure from their end customer, and we need to meet with them next week to explain how and why our chip behaves like this.

    Thanks in advance for the team's great support.

    Best Regards,

    Ricky Zhang

  • Hi Ricky,

    Sorry for the late reply. Basically we are looking at the following two queries:

    - when the execution time is an integer multiple of the trigger period, a new ISR cannot be serviced;

    Have you tried this on a setup of your own, or is this something only the customer is seeing? I really don't see any relation between PIE operation and ISR execution time. I tend to agree with Devin that this is some kind of race condition. If you clear the ADCINTFLG and ADCINTOVF flags at the end of the ISR, which is the correct way to write the ISR, you should not see this issue. If that is not done, unexpected behavior is possible.

    - when the execution time exceeds one trigger period but is not an integer multiple of it, a new ISR is serviced at the next integer-multiple point of the trigger period;

    Would it be possible to provide sample code that reproduces this issue (preferably using NOPs)?

    Regards,

    Vivek Singh

  • Vivek,

    Thanks for your response.

    I haven't tried it on my own laptop or EVM.

    The customer did clear the ADCINTFLG bit but did not clear the ADCINTOVF flag in the original code where this issue occurred. We now need to understand why and how this happens.

    Furthermore, when we ran the next test on the same setup, clearing both the ADCINTFLG and ADCINTOVF flags, nothing changed: the issue still exists.

    Only with ADC continuous mode enabled is the ISR never missed, and this does not depend on how the flags are handled: whether or not you clear the ADCINTOVF flag, the ISR is always serviced.

    I asked the customer for sample code and he sent some, but it was blocked by security. I will post it once I receive it another way, or ask the customer to post it here directly.

    Best Regards,

    Ricky Zhang

  • Vivek,

    Sample code is attached for your reference. Please let me know if you have any questions.

    You can modify the variable g_u16Test in the ISR to change the ISR execution time.

    For reference, based on the customer's tests, when its value is 80, 163, 246 or 330, the ISR execution time corresponds to 1x, 2x, 3x and 4x the trigger period (12.5 us) respectively, and the new ISR is missed.
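
As a rough cross-check of those numbers, the sketch below fits a straight line through the reported 1x and 4x points and measures how close a given g_u16Test value lands to a trigger edge. The linear relationship, the 150 ns-per-count slope and 500 ns offset derived from it, and both function names are assumptions made for illustration; they are not taken from the customer's code.

```c
#include <assert.h>
#include <stdint.h>

#define TRIGGER_PERIOD_NS 12500u  /* 12.5 us trigger period from the post */

/* Assumed linear fit through the reported end points (80 -> 1x, 330 -> 4x):
 *   slope  = (4 - 1) * 12500 ns / (330 - 80) counts = 150 ns per count
 *   offset = 12500 ns - 80 * 150 ns                 = 500 ns            */
static uint32_t estimated_isr_time_ns(uint16_t test_count)
{
    return 500u + 150u * (uint32_t)test_count;
}

/* Distance from an ISR duration to the nearest multiple of the period. */
static uint32_t distance_to_trigger_edge_ns(uint32_t isr_time_ns,
                                            uint32_t period_ns)
{
    assert(period_ns > 0u);
    uint32_t rem = isr_time_ns % period_ns;
    uint32_t to_next = period_ns - rem;
    return (rem < to_next) ? rem : to_next;
}
```

Under this fit, the values 163 and 246 land about 50 ns and 100 ns away from the 2x and 3x edges, which is consistent with them being reported as failing, while a value such as 120 sits several microseconds away from any edge.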

    GPIO33 is used to mark where an ISR is entered and exited: it is set to 1 on entry and cleared to 0 on exit.

    VAILD6043C2U39_App_INTTEST.7z

    Best Regards,

    Ricky Zhang

  • Vivek,

    Sorry to push, but do you think we can expect an update within the next 2 days?

    The customer has been waiting a long time for an explanation and needs to discuss it with their end customer, so your help would be highly appreciated.

    Best Regards,

    Ricky Zhang

  • Hi Ricky,

    Thanks for the code. It helped us debug the issue. As expected, this is caused by a race condition. We are checking some implementation details with the design team, but it looks like if another EOC event happens at the same time as the ADC INT flag is being cleared (AdcaRegs.ADCINTFLGCLR.bit.ADCINT1 = 1;), the flag remains set and the overrun flag is also set (which is expected). Since the flag is already set, the ADC does not generate a new interrupt, so the code never enters the ISR again.

    There are a couple of ways to solve this issue:

    1. Use the continuous mode of the ADC. I assume you used it and it resolved the issue.
    2. Disable the ADC interrupt (AdcaRegs.ADCINTSEL1N2.bit.INT1E = 0;) before clearing the ADC interrupt and overrun flags, then enable the interrupt again after clearing them. This avoids the race condition (the flag being set and cleared at the same time).
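
The ordering in option 2 can be sketched as follows. This is a host-side mock rather than device code: `MockAdcA`, `mock_eoc` and `clear_adcint1_guarded` are hypothetical names, the comments show which register write each step is assumed to stand for, and the model assumes (as the reply implies) that an EOC cannot set the flag while INT1E is 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ADCA bits involved in the sequence. */
typedef struct {
    bool int1e;      /* models ADCINTSEL1N2.INT1E */
    bool int1_flag;  /* models ADCINTFLG.ADCINT1 */
    bool int1_ovf;   /* models ADCINTOVF.ADCINT1 */
} MockAdcA;

/* Models an EOC. Assumption: with INT1E = 0 the event is ignored. */
static void mock_eoc(MockAdcA *adc)
{
    assert(adc != NULL);
    if (!adc->int1e) {
        return;
    }
    if (adc->int1_flag) {
        adc->int1_ovf = true;
    }
    adc->int1_flag = true;
}

/* Option 2: disable, clear both flags, re-enable.
 * eoc_during_clear injects an EOC at the worst possible moment. */
static void clear_adcint1_guarded(MockAdcA *adc, bool eoc_during_clear)
{
    assert(adc != NULL);
    adc->int1e = false;      /* AdcaRegs.ADCINTSEL1N2.bit.INT1E = 0;   */
    if (eoc_during_clear) {
        mock_eoc(adc);       /* cannot set the flag while disabled     */
    }
    adc->int1_flag = false;  /* AdcaRegs.ADCINTFLGCLR.bit.ADCINT1 = 1; */
    adc->int1_ovf = false;   /* AdcaRegs.ADCINTOVFCLR.bit.ADCINT1 = 1; */
    adc->int1e = true;       /* AdcaRegs.ADCINTSEL1N2.bit.INT1E = 1;   */
}
```

In this model an EOC that lands inside the guarded window is simply dropped, so the ISR misses that one sample but the next EOC finds the flag clear and raises an interrupt normally.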

    I'll update you further after I get some more clarification from the design team.

    Regards,

    Vivek Singh

  • Vivek,

    Thanks for your confirmation. Here are some comments:

    1. We already have a solution that addresses this issue: as you and Devin suggested, use ADC continuous mode and clear the ADCINTOVF flag.

    2. The customer simply wants to understand the root cause for 2 scenarios:

    a. Why and how is the INT missed when the execution time is an integer multiple of the trigger period?

    b. Why is the INT not missed when the execution time is not an integer multiple of the trigger period?

    We look forward to more details on this from the design team. Thanks for following up to get this issue closed as soon as possible.

    Best Regards,

    Ricky Zhang

  • Ricky,

    As I mentioned, the issue happens when the setting and the clearing of the flag occur in the same cycle. That only happens at specific times, which explains both questions. The design team has confirmed that SET has priority over CLEAR, so the flag does not get cleared and no new interrupt is generated.
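
A simplified tick-based model of this explanation is sketched below. It is hypothetical and heavily idealised: one tick stands for one cycle, an EOC occurs every `period` ticks, the ISR clears the flag exactly `isr_len` ticks after entry and returns at once, the ADC is in non-continuous mode, and the overflow flag is left out because it does not affect the outcome in this model.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how many times the ISR is entered within total_ticks. */
static int count_isr_entries(int period, int isr_len, int total_ticks)
{
    assert(period > 0 && isr_len > 0);

    bool flag = false;         /* models ADCINTFLG.ADCINT1 */
    bool pie_pending = false;  /* interrupt latched in the PIE */
    bool in_isr = false;
    int clear_tick = -1;       /* tick at which the ISR clears the flag */
    int entries = 0;

    for (int t = 0; t < total_ticks; ++t) {
        bool eoc = (t % period) == 0;
        bool clear = in_isr && (t == clear_tick);

        if (eoc) {
            if (!flag) {
                flag = true;
                pie_pending = true;  /* pulse only when the flag was clear */
            }
            /* Flag already set: overflow, no pulse. A clear in this same
             * tick is lost because SET has priority over CLEAR. */
        } else if (clear) {
            flag = false;
        }
        if (clear) {
            in_isr = false;          /* ISR returns right after its clear */
        }

        if (!in_isr && pie_pending) {
            pie_pending = false;
            in_isr = true;
            clear_tick = t + isr_len;
            ++entries;
        }
    }
    return entries;
}
```

With a period of 10 ticks over 100 ticks, an ISR length of 4 is entered on every trigger, a length of 15 is entered on every second trigger (always at a multiple of the period), and lengths of 10 or 20 are entered once and never again, which mirrors the two scenarios raised in the thread.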

    Vivek Singh

  • Vivek,

    Okay, thanks for your confirmation.

    I have forwarded the post to my customer and will let you know if they have further questions.

    I will close this post now. Thank you again for your response.

    Best Regards,

    Ricky Zhang