Another post inspired by some customer support issues this week...

The reported issue was that they were able to successfully create 7 instances of their remote JPEG Encoding algorithm, but attempts to create instance #8 failed.  The reported CE_DEBUG-generated trace log contained this subtle (but revealing!) set of lines:

[DSP] @26,301,689tk: [+4 T:0x8f078b44] OT - Thread_create > name: "jpegenc#8", pri:  -1, stack size:  40960, stack seg: 0
[DSP] @26,301,790tk: [+0 T:0x8f078b44] ti.sdo.ce.image1.IMGENC1 - IMGENC1_delete> Enter (handle=0x8f07da98)

From the trace log, it was clear to me (and probably no one else!) that they were failing during a thread creation - likely running out of memory for their algorithm's stack.  What's happening is that all the resources the algorithm requests - DSKT2-based memory, RMAN-based DMA, etc. - are allocated before actually creating the remote thread that will run the algorithm.  All that resource allocation succeeded... but the final thread creation failed... and the code then cleaned up all those resource allocations by internally calling IMGENC1_delete().  This was fresh in my mind because the same error had been reported and tracked down a few months earlier, and we had added better tracing for it in CE 2.25.02.11.  The trace for this issue in newer releases looks something like this (the extra detail is the "CN - NODE_create" line):

[DSP] @0,041,437tk: [+4 T:0x8780030c] OT - Thread_create > name: "videnc1_copy#1", pri:  -1, stack size:  65536, stack seg: 0
[DSP] @0,041,509tk: [+6 T:0x8780030c] CN - NODE_create: Thread_create() failed!
[DSP] @0,041,553tk: [+0 T:0x8780030c] ti.sdo.ce.video1.VIDENC1 - VIDENC1_delete> Enter (handle=0x87812408)

While we don't report explicitly that there wasn't enough stack (as we don't get that much detail from Thread_create()!), we do at least now report that Thread_create() failed.

Ok, now that we suspect what the issue is, the next question is how to fix it.  How do we increase the memory available for remote algorithms' stacks?  On BIOS, a thread's stack is allocated from a given "memory segment ID", and CE's Server config scripts are where this ID gets set.  The customer's Server .cfg script, which contains the alg config, looked like this (the stackMemId setting is the key part):

Server.algs = [
    /* ... */
    {name: "jpegenc", mod: JPEGENC , threadAttrs: {
            stackMemId: 0, priority: Server.MINPRI + 1}, groupId : 0,
    }
];

In most cases, the algorithm's .stackMemId field is (and should be!) set to zero.  So, we suspected the memory being exhausted was from "stackMemId 0".  Next question... where is stackMemId 0's size configured, and how big was it?

In BIOS 5, memory segment zero is a 'special segment'.  It's the segment from which all of BIOS's internal objects are allocated.  Because of its special-ness, it has its own config param in the BIOS .tcf script - bios.MEM.BIOSOBJSEG.  From the BIOS man page for MEM.BIOSOBJSEG:

Segment For DSP/BIOS Objects. The default memory segment to contain objects created at run-time with an XXX_create function. The XXX_Attrs structure passed to the XXX_create function can override this default. If you select MEM_NULL for this property, creation of DSP/BIOS objects at run-time via the XXX_create functions is disabled.

Tconf Name: BIOSOBJSEG    Type: Reference

Example: bios.MEM.BIOSOBJSEG = prog.get("myMEM");

Digging through the customer's .tcf script, we found the following lines:

bios.DDR2.createHeap = true;                 /* put a heap in this memory segment */
bios.DDR2.heapSize   = 0x20000;              /* make this heap 128K */
bios.MEM.BIOSOBJSEG = bios.DDR2;             /* use DDR2 as the special 'segment zero' where all BIOS objects are allocated from */

Fortunately, in this customer's particular case, we were able to provide more memory for the heap (and so for the thread stacks allocated from it) by increasing the .heapSize setting.  This had the side effect of shrinking the DDR2 memory available for DSP-side code/data (which is also typically placed in DDR2), but this customer's system had enough to spare.
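
As a rough sketch, the .tcf change would look something like the lines below.  The 0x40000 value is purely illustrative (an assumption for this example, not the size the customer actually ended up with); the right number depends on how many instances you need and how big their stacks are.

```javascript
/* Sketch only: 0x40000 is a hypothetical value, not the customer's actual setting */
bios.DDR2.createHeap = true;                 /* put a heap in this memory segment */
bios.DDR2.heapSize   = 0x40000;              /* make this heap 256K (was 0x20000, 128K) */
bios.MEM.BIOSOBJSEG = bios.DDR2;             /* unchanged: DDR2 is still the special 'segment zero' */
```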

Chris
