ROM_SysCtlDelay() Fault

I have noticed something strange when using either the ROM_SysCtlDelay() or SysCtlDelay() functions on an LM4F232H5QC.

If I pass a value over (approx) 110000 to either function, a Fault Interrupt is triggered. The Fault Status Register has the following flags set:

NVIC_FAULT_STAT_BFARV   = 1  (Bus Fault Address Register Valid)
NVIC_FAULT_STAT_PRECISE = 1  (Precise Data Bus Error)

Has anyone else experienced this before?
Could anyone shed some light on this perhaps?
  • We spun a custom PCB for the 4F231H5QR - repeatedly use "ROM_SysCtlDelay(2000000);" (or larger) w/o issue.   Are you using the latest version of StellarisWare (believe 8555) and is your toolset fully/properly upgraded?   From memory - believe that we used this or larger values on the EK232 eval board w/o issue too.  Suggest that you search for an "official" TI project w/in the EK232 directory - and then experiment with your data value.

    I note that ROM_SysCtlDelay() is defined w/in rom.h and that we #include "inc/hw_sysctl.h" and "driverlib/sysctl.h" along w/ "driverlib/rom.h".  Be sure that all of these include paths are enabled.  (a minimal sketch of this setup follows below)

    We use paid IAR and our C Preprocessor is set up as:

    ewarm
    PART_LM4F231H5QR
    TARGET_IS_BLIZZARD_RA1

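    For reference, a minimal sketch of the setup described above - assuming the Blizzard ROM define and a 50 MHz PLL clock; the delay value is purely a placeholder:

    /* Assumes PART_LM4F231H5QR (or PART_LM4F232H5QC) and TARGET_IS_BLIZZARD_RA1
       are defined for the preprocessor, as listed above. */
    #include "inc/hw_types.h"
    #include "inc/hw_sysctl.h"
    #include "driverlib/sysctl.h"
    #include "driverlib/rom.h"

    int main(void)
    {
        /* 50 MHz from the PLL - adjust to your own clock setup */
        ROM_SysCtlClockSet(SYSCTL_SYSDIV_4 | SYSCTL_USE_PLL |
                           SYSCTL_OSC_MAIN | SYSCTL_XTAL_16MHZ);

        while(1)
        {
            /* SysCtlDelay() burns 3 cycles per loop count, so this is
               roughly 120 ms at 50 MHz. */
            ROM_SysCtlDelay(2000000);
        }
    }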

  • Yes, I have StellarisWare 8555, and the correct header files and device definitions.

    I suspect that the problem may not necessarily be with the libraries/drivers, as the delay functions have been working for me for some time. I added some more code to my project today, and after that I am having the above-mentioned problem. It doesn't make sense to me why this would be happening, as I don't see how the two would interfere with each other.

    Weird

    Thanks.

  • I think I may have found the cause of the problem, but I'm not sure what the best approach would be to fix it.

    I am calling ROM_SysCtlDelay() from the SysTick interrupt as well as from functions in the main loop (roughly as sketched below). I am assuming this function is non-re-entrant, which could possibly corrupt the delay decrement value, and possibly the return address, if the interrupt were to occur whilst ROM_SysCtlDelay() is being executed by the main loop?

    I will also be calling ROM_GPIOPinTypeGPIOOutput() from an interrupt and the main loop.

    Is there a workaround I could use to solve this problem?
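
    For context, a much-simplified sketch of the structure in question (pin and delay values are placeholders only):

    #include "inc/hw_types.h"
    #include "inc/hw_memmap.h"
    #include "driverlib/gpio.h"
    #include "driverlib/rom.h"

    /* SysTick handler - the name must match the startup file's vector table */
    void SysTickIntHandler(void)
    {
        ROM_GPIOPinWrite(GPIO_PORTJ_BASE, GPIO_PIN_0, GPIO_PIN_0);
        ROM_SysCtlDelay(100);              /* short delay inside the ISR  */
        ROM_GPIOPinWrite(GPIO_PORTJ_BASE, GPIO_PIN_0, 0);
    }

    int main(void)
    {
        /* ... clock, GPIO and SysTick setup ... */
        while(1)
        {
            ROM_SysCtlDelay(200000);       /* delay also called from main */
            /* ... other work, including ROM_GPIOPinTypeGPIOOutput() ...  */
        }
    }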

  • Preferred usage sees interrupt handlers as "short/sweet" - get in, do the job, and exit.  Thus delays w/in handlers may cause issues.

    Like your use of SysTick() - bet you that some re-think will enable a re-craft/ordering of your code so that SysTick() may replace problematic calls to ROM_SysCtlDelay() - rough sketch below.  (as you know - SysTick() is less disruptive to your code execution)
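
    Something along these lines is what's meant - an untested sketch, with the tick period and handler name as assumptions (the handler name must match your startup file's vector table):

    #include "inc/hw_types.h"
    #include "driverlib/rom.h"
    #include "driverlib/sysctl.h"
    #include "driverlib/systick.h"

    static volatile unsigned long g_ulTicks = 0;

    /* Keep the handler short: just count. */
    void SysTickIntHandler(void)
    {
        g_ulTicks++;
    }

    /* Non-blocking wait, used from main only - no delay inside the ISR. */
    static void WaitTicks(unsigned long ulTicks)
    {
        unsigned long ulStart = g_ulTicks;
        while((g_ulTicks - ulStart) < ulTicks)
        {
            /* spin here, or do other main-loop work */
        }
    }

    void SetupTick(void)
    {
        ROM_SysTickPeriodSet(ROM_SysCtlClockGet() / 1000);  /* 1 ms tick */
        ROM_SysTickIntEnable();
        ROM_SysTickEnable();
    }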

  • Thanks cb1_mobile,

    I agree with keeping the interrupts simple and quick. It won't be too complex for me to do this.

    My concern is what will happen when calling the GPIO pin-write and similar functions from interrupts and the main loop?

  • We've not (yet) experienced issues (4F or M3 ARM) when writing/reading GPIO either from main or an interrupt handler.  (in fact - by setting a GPIO @ top of the handler and then clearing that same pin @ bottom - a reasonable measure of the interrupt's duration is provided)

    Recall that StellarisWare has a clever mechanism to 1st read a GPIO pin and then conditionally alter that pin - based upon the read - all achieved w/in a single StellarisWare call.   (this provides an efficient means to toggle a GPIO - a rough sketch follows below)

    It's unclear as to why you're concerned about GPIO behavior with calls from interrupts and/or main.  Our tests have not identified any differences in GPIO behavior when called from either...
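
    A rough sketch of both points - port and pin choices are examples only:

    #include "inc/hw_types.h"
    #include "inc/hw_memmap.h"
    #include "driverlib/gpio.h"
    #include "driverlib/rom.h"

    #define TIMING_PORT  GPIO_PORTF_BASE   /* example port/pin only */
    #define TIMING_PIN   GPIO_PIN_2

    void SomeIntHandler(void)
    {
        /* Pin high at entry... */
        ROM_GPIOPinWrite(TIMING_PORT, TIMING_PIN, TIMING_PIN);

        /* ... handler work ... */

        /* ...pin low at exit - scope the pin to measure handler duration. */
        ROM_GPIOPinWrite(TIMING_PORT, TIMING_PIN, 0);
    }

    /* Toggle by reading the pin back and writing the complement.  The pin
       mask means only TIMING_PIN is altered - other pins on the port are
       untouched by this write. */
    void TogglePin(void)
    {
        ROM_GPIOPinWrite(TIMING_PORT, TIMING_PIN,
                         ~ROM_GPIOPinRead(TIMING_PORT, TIMING_PIN));
    }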

  • Are the GPIO pin writes performed atomically by the CPU?

    If they are, then there should never be an issue. But if they aren't, then what is the risk of a corrupt value being written to the port in the following scenario:

    What if an interrupt were to interrupt the main loop mid-execution of ROM_GPIOPinWrite(), and this interrupt also calls ROM_GPIOPinWrite() (or any other StellarisWare function, for that matter)?

  • This has moved fairly far from the original question - now outside of my instant recall.  Suspect that a forum search and/or a good re-read of GPIO and/or bit-banding may enlighten.

  • GPIOs are really a special case here since the hardware does allow you to do atomic writes to any group of pins. In general, however, as the owner of the application, you have to be responsible and design your application to ensure that you do not include code that accesses the same resource from different execution contexts without some form of protection. That protection could be as simple as temporarily turning off interrupts while calling a function that you also call from an interrupt handler (a sketch follows below), or, if you are using an RTOS, it could involve mutexes or semaphores.

    As far as SysCtlDelay is concerned, I find it very hard to believe that this function itself is faulting since it's a trivially simple loop. I suspect, however, that what you are seeing is some kind of race condition that is exacerbated by the introduction of a particular delay somewhere. This kind of problem can be very tricky to track down and is one of the reasons why very careful design is vital at the start of a project to ensure that you have considered all contexts that need to use a given resource and made appropriate design choices to ensure that they can never trample on each other.
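
    A minimal sketch of that simplest form of protection, using the DriverLib master interrupt calls (the shared function here is hypothetical):

    #include "inc/hw_types.h"
    #include "driverlib/interrupt.h"

    /* Hypothetical function that both main and an ISR need to call. */
    extern void UpdateSharedOutputs(void);

    /* Called from main: mask interrupts around the shared call so an ISR
       cannot run it at the same time. */
    void UpdateSharedOutputsFromMain(void)
    {
        /* IntMasterDisable() returns true if interrupts were already off,
           so only re-enable them if this function turned them off. */
        tBoolean bWasDisabled = IntMasterDisable();

        UpdateSharedOutputs();

        if(!bWasDisabled)
        {
            IntMasterEnable();
        }
    }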

  • Hi Dave,

    Thanks for the reply.


    I solved the delay problem by adding a simple for loop instead of calling the ROM delay function - I needed a very quick delay between toggling a pin from the SysTick interrupt. But I would still like to further simplify the SysTick interrupt routine, and rather have the main loop execute code flagged by the SysTick interrupt event.

    Looking at the disassembly in CCS, there are around four ASM instructions (using a few R registers) to write to a port using either the ROM function call or something like this: GPIO_PORTJ_DATA_R = 0;


    At this stage I'm not sure how CCS handles the stack when entering and exiting an interrupt; if these registers are pushed/popped on the stack then I guess it should be OK. Some further debugging will be necessary to get a definite answer.


    Does CCS have any pragma options to specify re-entrant functions? If I remember correctly, Keil has something like this. As you suggested, turning interrupts off for critical code is also a good idea (provided that code executes quickly and efficiently too, I guess).

  • All functions in DriverLib are re-entrant in the sense that we don't use any global data and that parameters and locals are held on the stack or in registers that are stacked. Any function that performs a read-modify-write, however, will not be safely re-entrant unless you use some serialization mechanism to prevent two contexts calling it in an overlapping way. As I mentioned before, this is a problem you need to fix by designing your code correctly - it's not something you can just flip a compiler switch for and expect everything to work.

    If you have multiple contexts that need to access a given device or resource, your design should ensure that they can never clash or step on each other's toes. One pretty easy way of doing this is to have a single context that accesses the resource and a signalling mechanism that you use to allow the other contexts to tell the resource owner to perform whatever action is needed (a bare-bones sketch follows below). If you are using an RTOS, you can also use mutexes or semaphores to protect the resource, but this won't work if you want to access the resource from an interrupt handler since you can't typically make blocking RTOS calls from within an ISR.

    The fix you mention above waves a big red flag for me. Any time you have a problem that goes away when you add or remove a delay, this is almost certainly an indication of a serious design flaw somewhere else in the code. Usually it's related to the kind of thing you are asking about - poorly serialized access to some resource. I would strongly encourage you to spend the time determining the real root cause of your problem rather than just assuming that it doesn't exist because a delay made it go away. Most likely, the delay just reduced the frequency of the occurrence and this just makes debugging the issue a lot harder. Spending the time to get to the bottom of this now is a whole lot better than waiting until your product is in the field and customers start having problems with it!
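
    A bare-bones sketch of that signalling idea (names are made up) - the ISR only raises a flag, and main, as the single owner of the resource, does the actual work:

    #include "inc/hw_types.h"

    /* Set by the SysTick handler, consumed only by main - main is the one
       context that ever touches the GPIO/resource. */
    static volatile tBoolean g_bToggleRequest = false;

    void SysTickIntHandler(void)
    {
        g_bToggleRequest = true;    /* just signal; do the work in main */
    }

    int main(void)
    {
        /* ... clock, GPIO and SysTick setup ... */

        while(1)
        {
            if(g_bToggleRequest)
            {
                g_bToggleRequest = false;
                /* perform the pin write / resource access here, in one context */
            }
        }
    }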

  • Thanks Dave. Makes sense.

  • Dave,

    Yours is a most valuable, pertinent and well-detailed post - thank you - much appreciated.  In one place you've provided an excellent guideline and clearly identified just why "band-aid" techniques - rather than causal-nexus (root cause, i.e. real understanding) - are a false comfort.  (see ticking time bomb...)   Your experience and clear direction are sure to help many - again, thanks...

  • Thank you, sir! In my 25 years of doing this stuff, the problems that always take the longest to debug are these types of resource contentions or race conditions. In my youth, I would be happy sticking in a delay loop and hoping that the problem would go away but, over time, it's become very clear that that kind of approach is a false economy since those problems ALWAYS come back to bite you later. It's far better to take a 2-day hit early in development and properly diagnose and fix this kind of problem than to hack around it and then spend 3 weeks working on it when a customer notices it 6 months later.