This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

  • Resolved

RTOS/LAUNCHXL-CC1310: Task_sleep(1500) never returns?

Part Number: LAUNCHXL-CC1310

Tool/software: TI-RTOS

Hi,

Today I'm having a problem where my task deadlocks in Task_sleep. I need to wait 15ms for an external chip to complete a measurement, but when I wait using Task_sleep(1500) my task never runs again.
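For readers following the numbers: Task_sleep takes its timeout in Clock ticks, not milliseconds. The post's figures (1500 ticks for 15 ms) imply a 10 µs tick period; that value is inferred from the post, not confirmed by a configuration dump. A minimal host-side sketch of the conversion, where `ms_to_ticks` is a hypothetical helper and not part of TI-RTOS:

```c
#include <stdint.h>

/* Hypothetical helper (not a TI-RTOS API): convert milliseconds to Clock
 * ticks, rounding up so the sleep is never shorter than requested.
 * tick_period_us is the Clock tick period in microseconds; 10 is the value
 * assumed from the post's 1500 ticks == 15 ms. */
static uint32_t ms_to_ticks(uint32_t ms, uint32_t tick_period_us)
{
    uint64_t us = (uint64_t)ms * 1000u;   /* 64-bit to avoid overflow */
    return (uint32_t)((us + tick_period_us - 1u) / tick_period_us);
}
```

With a 10 µs tick, `ms_to_ticks(15, 10)` gives 1500, matching the call in the question. In an application the period would normally come from the Clock module's configured tick period rather than a literal.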

If I check in ROV, it says the task is blocked on Task_sleep, and the call stack clearly shows that it entered Task_sleep and scheduled another task, which runs for a little and then pends on a semaphore. However, TI-RTOS never resumes my task, even though the timeout has long since passed!

Currently, TI-RTOS has been waiting 15 minutes for a 15ms timeout to expire.

How can I find out why this is happening?

  • With some experimentation, it appears that it only deadlocks if there's another task to switch to. If no other tasks are ready, it just snoozes for 15ms and returns as expected.
  • In reply to Michael Moon:

    Apparently, *all* tasks that use Task_sleep when this issue occurs will deadlock.

    They will all show as "blocked by Task_sleep(undefined)" in ROV, and remain that way apparently indefinitely.

    I'm using BIOS in flash because I find it really difficult to debug stuff with BIOS in ROM.

    I'm using CCS 7.1.0.00016, ti-rtos 2.21.00.06, and TI compiler (although I've experienced essentially the same sort of problems with GNU toolchain as well)

    I've tried clean rebuild of project and restarting CCS to no avail - sometimes this actually helps although I've no idea why it would.
  • In reply to Michael Moon:

    I tried swapping Task_sleep for Semaphore_pend(&sem, timeout) but this also deadlocks until/unless the semaphore is posted. The pend never times out despite being passed a timeout value.
  • In reply to Michael Moon:

    When this state occurs, it seems that ti_sysbios_knl_Clock_Module__state__V.ticks never changes. I'm not sure if it's supposed to, or if it's supposed to wake up via a Hwi when the RTC hits nextScheduledTick, or if it works some other way.

    Here's a dump from the Expression viewer on the Clock_Module symbol:

    ti_sysbios_knl_Clock_Module__state__V	struct ti_sysbios_knl_Clock_Module_State__	{ticks=8923581,swiCount=0,timer=0x200044A0 {__fxns=0x00000000 {__base=0x20004D00 ...,__sysp=...,__label=...,swi=...	0x20004664	
    	ticks	unsigned int	8923581	0x20004664	
    	swiCount	unsigned int	0	0x20004668	
    	timer	struct ti_sysbios_interfaces_ITimer___Object *	0x200044A0 {__fxns=0x00000000 {__base=0x20004D00 {base=0xBEBEBEBE {base=???},__sysp=...,__label=...	0x2000466C	
    		*(timer)	struct ti_sysbios_interfaces_ITimer___Object	{__fxns=0x00000000 {__base=0x20004D00 {base=0xBEBEBEBE {base=???},__sysp=0x0000BD81 ...,getNumTimers=...,__label=...	0x200044A0	
    			__fxns	struct ti_sysbios_interfaces_ITimer_Fxns__ *	0x00000000 {__base=0x20004D00 {base=0xBEBEBEBE {base=???},__sysp=0x0000BD81 {__create=...,getNumTimers=...	0x200044A0	
    			__label	unsigned int	1	0x200044A4	
    	swi	struct ti_sysbios_knl_Swi_Object *	0x20004634 {qElem={next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C ...,prev=...,prev=...,prev=...,prev=...,fxn=...	0x20004670	
    		*(swi)	struct ti_sysbios_knl_Swi_Object	{qElem={next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C {next=...,prev=...,prev=...,prev=...,prev=...,fxn=...	0x20004634	
    			qElem	struct ti_sysbios_knl_Queue_Elem	{next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C ...,prev=...,prev=...,prev=...,prev=...,prev=...	0x20004634	
    			fxn	void (*)(unsigned int,unsigned int)	0x00005E49	0x2000463C	
    			arg0	unsigned int	0	0x20004640	
    			arg1	unsigned int	0	0x20004644	
    			priority	unsigned int	5	0x20004648	
    			mask	unsigned int	32	0x2000464C	
    			posted	unsigned short	0	0x20004650	
    			initTrigger	unsigned int	0	0x20004654	
    			trigger	unsigned int	0	0x20004658	
    			readyQ	struct ti_sysbios_knl_Queue_Object *	0x2000462C {elem={next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C ...,prev=...,prev=...,prev=...,prev=...}	0x2000465C	
    				*(readyQ)	struct ti_sysbios_knl_Queue_Object	{elem={next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C {next=...,prev=...,prev=...,prev=...,prev=...}	0x2000462C	
    					elem	struct ti_sysbios_knl_Queue_Elem	{next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C ...,prev=...,prev=...,prev=...,prev=...,prev=...	0x2000462C	
    						next	struct ti_sysbios_knl_Queue_Elem *	0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C {next=...,prev=...,prev=...,prev=...,prev=...	0x2000462C	
    						prev	struct ti_sysbios_knl_Queue_Elem *	0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C {next=0x2000462C {next=...,prev=...,prev=...,prev=...,prev=...	0x20004630	
    			hookEnv	void * *	0x00000000 {0x20004D00}	0x20004660	
    				*(hookEnv)	void *	0x20004D00	0x00000000	
    					*(*(hookEnv))	unknown	cannot load from non-primitive location	
    	numTickSkip	unsigned int	50	0x20004674	
    	nextScheduledTick	unsigned int	8923834	0x20004678	
    	maxSkippable	unsigned int	3240050766	0x2000467C	
    	inWorkFunc	unsigned short	0	0x20004680	
    	startDuringWorkFunc	unsigned short	0	0x20004682	
    	ticking	unsigned short	1	0x20004684	
    	Object_field_clockQ	struct ti_sysbios_knl_Queue_Object__	{elem={next=0x2000477C {next=0x200041C4 {next=0x200041E8 {next=0x2000420C {next=...,prev=...,prev=...,prev=...,prev=...}	0x20004688	
    		elem	struct ti_sysbios_knl_Queue_Elem	{next=0x2000477C {next=0x200041C4 {next=0x200041E8 {next=0x2000420C {next=0x20004230 ...,prev=...,prev=...,prev=...,prev=...,prev=...	0x20004688	
    			next	struct ti_sysbios_knl_Queue_Elem *	0x2000477C {next=0x200041C4 {next=0x200041E8 {next=0x2000420C {next=0x20004230 {next=...,prev=...,prev=...,prev=...,prev=...	0x20004688	
    			prev	struct ti_sysbios_knl_Queue_Elem *	0x200020C8 {next=0x20004688 {next=0x2000477C {next=0x200041C4 {next=0x200041E8 {next=...,prev=...,prev=...,prev=...,prev=...	0x2000468C	
    

    I'm not sure exactly what it's supposed to look like, but it looks sensible enough to me.
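One way to sanity-check the dump above is to compute how far nextScheduledTick lies ahead of ticks. The sketch below is host-side C with hypothetical helper names (`ticks_until`, `ticks_to_us`); the 10 µs tick period is an assumption inferred from 1500 ticks being described as 15 ms.

```c
#include <stdint.h>

/* Hypothetical helper: ticks remaining until the next scheduled tick.
 * Unsigned subtraction stays correct when the 32-bit counter wraps. */
static uint32_t ticks_until(uint32_t next_scheduled, uint32_t now)
{
    return next_scheduled - now;
}

/* Hypothetical helper: convert a tick count to microseconds for a given
 * tick period (assumed 10 us here, not confirmed by the dump). */
static uint32_t ticks_to_us(uint32_t ticks, uint32_t tick_period_us)
{
    return ticks * tick_period_us;
}
```

For the values in the dump, `ticks_until(8923834u, 8923581u)` is 253 ticks, roughly 2.5 ms under the assumed period. If `ticks` holds the same value across several halts that are seconds apart, the pending wakeup is long overdue, which suggests the timer interrupt is not being serviced.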

    How can I find out why TI-RTOS's clock suddenly breaks?

    Usually I'd just set a hardware watchpoint somewhere, but the TI-RTOS source seems very convoluted, with huge amounts of symbol redirection, and I haven't managed to find any sensible place to put a watchpoint yet.

    I tried putting a breakpoint in Clock_tick but CCS spat errors at me about Clock_doTickFunc(0); not being an executable line of code.

    So, how do I debug stuff when (as always) everything looks like TI-RTOS itself is choking?

  • In reply to Michael Moon:

    ROV/BIOS/Scan for errors says:
    ,ti.sysbios.knl.Clock,Basic,ti.sysbios.knl.Clock@200020c8,N/A,Caught exception in view init code: "./xdctools_3_32_00_06_core/packages/xdc/rov/StructureDecoder.xs", line 518: java.lang.Exception: Target memory read failed at address: 0x2000462c, length: 32. This read is at an INVALID address according to the application's section map. The application is likely either uninitialized or corrupt.
    ,ti.sysbios.knl.Clock,Module,N/A,N/A,Caught exception in view init code: "./xdctools_3_32_00_06_core/packages/xdc/rov/StructureDecoder.xs", line 518: java.lang.Exception: Target memory read failed at address: 0x2000462c, length: 32. This read is at an INVALID address according to the application's section map. The application is likely either uninitialized or corrupt.

    This is straight after a fresh build from clean.

    0x200020C8 is inside my task's stack. It holds the value 0x20004300, which is Object_field_clockQ inside ti_sysbios_knl_Clock_Module__state__V (from TI-RTOS), which just has next=0x200043F4 and prev=0x200020C8 (the address listed by BIOS).

    These struct ti_sysbios_knl_Queue_Elem seem to be about 8 array elements in a circular linked list, with 0x200020C8 being the odd one out.

    I assume Task_sleep has added something to the Clock's queue so it can be woken after the timeout.

    Now, why Expression viewer and Memory Browser can see this memory, but ROV spits Java errors about it being inaccessible I have no idea.

    Also I have no idea if this is relevant to the problem I'm having, I'm basically clutching at straws here because I'm finding it so difficult to navigate and debug TI-RTOS internals, which all of my numerous issues point to.

    Whenever I pause my application, it's always stopped at address 0x10000486 which is in an unmapped part of the internal ROM - even though I've set my cfg to use TI-BIOS in flash for easier debugging.

    Is it possible that whatever code is in this piece of ROM is messing up TI-RTOS internal state since the linker might not make room for its RAM usage if I select BIOS in flash?

    Then again, I get the same sort of problems with BIOS in ROM, but they're far harder to debug because I get either no symbols at all, or messed up symbols everywhere else since rtos_rom.xem3 has stuff outside the 0x1001nnnn range.

    Any ideas? Even suggestions for further debugging steps would be most welcome
  • In reply to Michael Moon:

    Michael,

    First, do out of the box examples work for you? For example UART Echo.

    Do you have power enabled or disabled in your application? It's in PowerCC26XX_config.

    Can you disable power management to see if it is related?

    You can set a breakpoint in ti_sysbios_knl_Clock_doTick__I to see if the clock tick is happening.

    Finally, can you let me know the size and peak usage of all the task stacks and the Hwi stack? You can get these from ROV. You may have to enable stack initialization first so that ROV can work out the peaks. This is done by setting the following in the .cfg file.
    var halHwi = xdc.useModule('ti.sysbios.hal.Hwi');
    var Task = xdc.useModule('ti.sysbios.knl.Task');
    Task.initStackFlag = true;
    halHwi.initStackFlag = true;

    Todd
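As background on the stack-peak request: a peak can only be measured if the stack was pre-filled with a known pattern, which is what the initStackFlag settings enable. The sketch below is a host-side illustration of that idea, not TI-RTOS code. The fill byte 0xBE is an assumption (it matches the 0xBEBEBEBE pattern visible in the earlier memory dump), and `stack_peak_bytes` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xBEu  /* assumed fill byte, see lead-in */

/* Hypothetical helper: estimate peak stack usage in bytes.
 * 'stack' points at the lowest address of the stack buffer. The stack
 * grows downward on Cortex-M, so untouched fill bytes sit at the low end;
 * everything from the first overwritten byte upward counts as used. */
static size_t stack_peak_bytes(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && stack[untouched] == STACK_FILL) {
        untouched++;
    }
    return size - untouched;
}
```

A peak equal to the full stack size means the fill pattern was overwritten all the way to the bottom, which is the usual sign of an overflow.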
  • In reply to ToddMullanix:

    I set a breakpoint on ti_sysbios_knl_Clock_doTick__I while my app was deadlocked, it did not trigger.

    When I restart with the breakpoint set, after hammering away on F8 for a while, the breakpoint stopped triggering. My application continued to run on other interrupt sources (e.g. radio, SCS), since the breaking had upset its timing and it didn't want to sleep anymore.

    Curiously, before the breakpoint stopped triggering, it would trigger while pending on radio commands. After it stopped, the radio commands still worked fine but presumably the pend wouldn't time out if they somehow got stuck.

    After switching power policy to doWFI, I seem to have regular calls to doTick and I no longer get deadlocks in Task_sleep.

    I disabled the breakpoint so it could free-run for a while, and it seems fine.

    Surprisingly, after switching back to standbyPolicy, it also isn't deadlocking anymore!

    All I've done is set some breakpoints, change power policy, disable the breakpoints and change power policy back, and suddenly everything works fine... what effect could that have, that multiple debugger+target powercycles, a clean build and CCS restart doesn't?

    Since it seems to work now, there's not much point in me checking examples?

    Checking task stacks is one of the first things I do - as soon as I worked out that ROV requires the Task_struct to be global rather than static or dynamically allocated that is. I don't bother to post here if a stack has overflowed, that would be daft ;)
  • In reply to Michael Moon:

    Aaand just as suddenly it's deadlocking again.

    Images: deadlocked, ROV task view

    As can be seen, neither task has overflowed its stack and they're both deadlocked in Task_sleep. The sleeps are both less than 100 milliseconds so they certainly shouldn't stop for longer than the several minutes I waited!

    The Clock state thing looks ok, with nextScheduledTick being a little ahead of ticks, but doTick never gets called when the deadlock occurs.

    So, presumably something somewhere is breaking the timer somehow? How can I find what's happening to it? How/where does the clock choose which timer to use?

    I tried to get a dump of GPT0's registers after it deadlocked, but Memory Browser apparently can't read it for some reason.

    ROV says the only Timer is the RTC, which is apparently disabled. It's also disabled when timeouts are working fine (after initial startup) so whatever is being used as a timing source isn't appearing in ROV's Timer pane.

  • In reply to Michael Moon:

    Forgot to mention, it's still deadlocking with power policy doWFI, so I don't think it's a power management issue.

    All I've been able to find out so far is that whatever is supposed to call Clock_doTick() stops doing so after a minute or two of normal running.
  • In reply to Michael Moon:

    I suspect that interrupts have been disabled.

    When it's locked up can you print out the contents of the CTRL_FAULT_BASE_PRI register under the Core registers view:

    CTRL_FAULT_BASE_PRI 0x02000000 CM3 Special Registers [Core]

    Above is how it should look with interrupts enabled.

    CTRL_FAULT_BASE_PRI 0x02002000 CM3 Special Registers [Core]

    Above is how it looks with an unbalanced Hwi_disable() call.

    Alan
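To make the two register values easier to compare, here is a small host-side decoder. It assumes the debugger's composite CTRL_FAULT_BASE_PRI value uses the ARMv7-M debug layout (CONTROL in bits 31:24, FAULTMASK in 23:16, BASEPRI in 15:8, PRIMASK in 7:0); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the four special registers. */
typedef struct {
    uint8_t control;
    uint8_t faultmask;
    uint8_t basepri;
    uint8_t primask;
} SpecialRegs;

/* Split the composite value, assuming the ARMv7-M debug layout:
 * [31:24] CONTROL, [23:16] FAULTMASK, [15:8] BASEPRI, [7:0] PRIMASK. */
static SpecialRegs decode_special(uint32_t v)
{
    SpecialRegs r;
    r.control   = (uint8_t)(v >> 24);
    r.faultmask = (uint8_t)(v >> 16);
    r.basepri   = (uint8_t)(v >> 8);
    r.primask   = (uint8_t)v;
    return r;
}

/* Nonzero if any of the three masking registers blocks some interrupts. */
static int interrupts_masked(SpecialRegs r)
{
    return (r.primask & 1u) || (r.faultmask & 1u) || (r.basepri != 0u);
}
```

Under this reading, 0x02000000 decodes to BASEPRI = 0 (nothing masked), while 0x02002000 decodes to BASEPRI = 0x20, meaning interrupts at that priority value or numerically higher stay blocked. That would include the timer interrupt driving the Clock if it runs at such a priority, which fits the symptom of Clock_doTick never being called.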
