Shadow task?

Chris Thomas

A project I have been working on has developed a bizarre problem; maybe it is a known issue.

The code base is long-lived and stable. A few months back I took a branch off to develop a complicated new feature, so I have not really been testing that all the original code continued to work while adding the new code. As a result I have a lot of edits (and compiler/tools upgrades) between now and the last known good.

Before I go off on a bisect spree, perhaps someone has seen this.

One of my threads, created in the point and click interface thus:

var parm_9 = new TSK.Params();
parm_9.instance.name = "ces_task";
parm_9.vitalTaskFlag = false;
parm_9.stackSection = "stk_cestask";
parm_9.stackSize = 1792;
Program.global.ces = TSK.create('&ces_task', parm_9);

It seems to be executing twice, with the same task ID. I tried log statements etc. to see what was going on, and found that the two instances of this thread had a parallel set of globals. It goes so far that I created a semaphore (counting, initial value 1) which both instances are able to obtain in parallel, with both instances reporting the same semaphore address.

It is not double logging: I put TSCL into the log strings and get different times.

And from behavior I can see the code going into self test routines and tripping over itself in just the way I would expect if I had coded 2 identical tasks.

Confused...

Chris

DM6433 - custom board.
CCS 5.4.0.00091
XDC 3.25.2.70
PSP 1.10.3
CGT 7.4.5
BIOS 6.34.4.22
EDMA 2.11.07.04

over 12 years ago

0 ToddMullanix over 12 years ago

TI__Guru* 96960 points

Have you looked in ROV to see if there are two instances?

Todd

0 Chris Thomas over 12 years ago in reply to ToddMullanix

Expert 1310 points

Right now I am writing my own task dump utility to see. (ROV/JTAG etc. are real last-resort things: the code has to be built quite differently to enable them, and I have to deal with loads of BSODs.)

If two tasks had got started somehow, wouldn't TSK_self() give a different number? Might ROV just lump them together? Hence the DIY approach, to be sure...

Chris

0 ToddMullanix over 12 years ago in reply to Chris Thomas

TI__Guru* 96960 points

They would definitely have a different handle (return of Task_self). You can have the same entry function for different Task instances. ROV will show each one. Can you put a Log_print (or System_printf) with the Task_self return at the beginning of the entry function?

Todd

0 Chris Thomas over 12 years ago in reply to ToddMullanix

Expert 1310 points

I put my version of logging into the code at the start of the entry function and logged TSK_self(), TSCL and the address of the semaphore.

I logged before and after pending on the semaphore. I did not bother posting the semaphore, so the second task should have hung.

I got the same TSK_self(), the same semaphore address and different ticks.

I also had a couple of globals which I incremented with interrupts disabled for protection, and got to see the same global go from 0-->1 twice.

To me it feels like some kind of stack overflow type issue, but it runs remarkably well if this is the case.

0 Chris Thomas over 12 years ago in reply to Chris Thomas

Expert 1310 points

With my thread list I can now see that the thread in question appears only once in the list. It has a unique stack size which is not shared by any others on the list either.

0 ToddMullanix over 12 years ago in reply to Chris Thomas

TI__Guru* 96960 points

What is the stack usage in ROV-Task->detailed? Also, what was the initial count on the semaphore you created?

0 Chris Thomas over 12 years ago in reply to ToddMullanix

Expert 1310 points

It is a different beast with all the debug stuff turned on. However, in release my task dump code (essentially this http://e2e.ti.com/support/embedded/bios/f/355/p/259653/908416.aspx#908416) shows stack use.

The first time in, the stack use is 1252 out of 1792; the second time in, it is 1380 out of 1792. I have increased the stack to 8K and am building it now. No other thread was over 10% of its available stack.

0 Chris Thomas over 12 years ago in reply to Chris Thomas

Expert 1310 points

With extra space the stack usage is just 1148/8192 so that would seem clear.

This has got to be some sort of tool issue - I cannot think how I could code 2 threads with the same id, not sharing global variables.

0 Naoki Kawada over 12 years ago in reply to Chris Thomas

Guru 19890 points

Hi Chris,

Two Task instances with the same handle should not exist at the same time.

I think this is a system crash caused by a stack overflow or something similar. In addition to the task stack usage, you should check whether the system stack is large enough. You can increase the system stack, for example, with the following config script:

Program.stack = 0x2000; // 0x2000 bytes system stack

Regards,

Kawada

0 Chris Thomas over 12 years ago in reply to Naoki Kawada

Expert 1310 points

Hi Kawada,

In this system I have lots of free RAM, so I have default stacks of 32K. Most threads are on this and are nowhere near clashing.

I have a couple of critical threads which have small stacks in L1Data. At the point of failure one has 68 bytes used out of 1280, and will be pretty much idle because no data is flowing. The other, the bad thread as it happens, was showing about 1200 bytes used.

I expanded the stack in the bad thread to 8K in system RAM, from 2K in L1Data. It made no difference and showed the same bytes used.

I don't think it is as simple as a stack overflow.

But who knows, I have never seen anything like this...

0 Naoki Kawada over 12 years ago in reply to Chris Thomas

Guru 19890 points

As you know, each task instance can have its own dedicated stack memory, but the BIOS kernel has a single system stack which is shared by the other thread types (such as Hwi and Swi). It can be increased as I suggested before. Have you tried that?

If the system is wrongly being started twice, you may hit a breakpoint at c_int00 again after the executable is running. Can you check that?

Kawada

0 Chris Thomas over 12 years ago in reply to Naoki Kawada

Expert 1310 points

Hi Kawada,

I shunted some stuff out of L1Data and doubled the program stack from 3K to 6K; it made no difference. Is there a programmatic way to read the stack usage in release mode?

Chris

0 Naoki Kawada over 12 years ago in reply to Chris Thomas

Guru 19890 points

Hmm.. I don't know a programmatic way, but you can check the usage of the system stack via the CCS memory browser. Please look up the address of the stack memory in your map file and see whether the top of the stack is still filled with 0xbe. That would indicate the stack is large enough in your system. If the stack is not filled with a fixed value such as 0xbe, please fill it with something using CCS just before running the executable and then check its usage. ROV may also be able to verify that.
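On the question of reading stack usage in code: the 0xbe fill check can in principle be done programmatically as well as in the memory browser. The following is a minimal C sketch, not a BIOS API. The names `STACK_FILL`, `stack_prefill` and `stack_bytes_used` are made up for illustration, and it assumes the stack grows downward from the highest address of the region, so untouched fill bytes sit at the lowest addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fill pattern, matching the 0xbe value mentioned in the thread. */
#define STACK_FILL 0xBEu

/* Paint a stack region with the fill pattern before it is used.
 * Only needed if the runtime has not already filled it. */
void stack_prefill(uint8_t *stack_base, size_t stack_size)
{
    memset(stack_base, STACK_FILL, stack_size);
}

/* Estimate peak usage of a downward-growing stack.
 * stack_base is the LOWEST address of the region (assumption).
 * Bytes still holding the fill pattern, counted from the low end,
 * are taken to have never been touched. */
size_t stack_bytes_used(const uint8_t *stack_base, size_t stack_size)
{
    size_t untouched = 0;

    while (untouched < stack_size && stack_base[untouched] == STACK_FILL) {
        untouched++;
    }
    return stack_size - untouched;
}
```

On the target, the base address and size would come from the map file. If the deepest byte the program wrote happens to be 0xbe itself, the result reads slightly low, so treat it as an estimate.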

Regards,

Kawada

0 Chris Thomas over 12 years ago in reply to Naoki Kawada

Expert 1310 points

After some fun, I can say that the stack has only 968 bytes used, well inside the 3K I had originally assigned.

All of which did not help.

0 Chris Thomas over 12 years ago in reply to Chris Thomas

Expert 1310 points

Now that is embarrassing. This board has a spare, unused DSP fitted that has always been held in reset by the FPGA - until now.

Aaarrrgggghhhhhhhhhhhhhh!

0 Naoki Kawada over 12 years ago in reply to Chris Thomas

Guru 19890 points

So, is the problem solved? Do I understand correctly that the system was being launched a second time because of the reset signal?

Regards,
Kawada

0 Chris Thomas over 12 years ago in reply to Naoki Kawada

Expert 1310 points

Yes, the FPGA on the board was changed for other reasons. Obvious with hindsight.
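With hindsight, one cheap hint that a log is being written by two processors rather than one is that TSCL values stop increasing from line to line, since each DSP has its own free-running counter. Below is a minimal host-side C sketch, assuming the 32-bit TSCL values have already been parsed out of the log in the order the lines arrived. `count_backward_steps` is a hypothetical helper, not part of any TI tool.

```c
#include <stddef.h>
#include <stdint.h>

/* Count adjacent log entries whose 32-bit timestamp steps backwards.
 * ts holds the TSCL values in arrival order (assumption). */
size_t count_backward_steps(const uint32_t *ts, size_t n)
{
    size_t backward = 0;
    size_t i;

    for (i = 1; i < n; i++) {
        /* Unsigned subtraction wraps modulo 2^32; viewing the result as
         * signed lets a single counter wrap still count as a forward step. */
        int32_t delta = (int32_t)(ts[i] - ts[i - 1]);

        if (delta < 0) {
            backward++;
        }
    }
    return backward;
}
```

A count of zero proves nothing, because two parts released from reset together would have near-identical counters. A non-zero count from what should be a single task is worth a second look.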

0 Naoki Kawada over 12 years ago in reply to Chris Thomas

Guru 19890 points

Good to hear that. Please mark your answer as verified if you are OK with it.

Regards,
Kawada
