Stackoverflow

Other Parts Discussed in Thread: SYSBIOS

Hi,

I work with the C6678. In every work cycle I receive data over PCIe and afterwards send the results back. I have a plugin system that gives every core some work to do (the results are saved in DDR3). When the cores are finished, they report this to the master core via MessageQ; afterwards EDMA sends the results over PCIe to the FPGA.

After one successful cycle I get a stack overflow. I don't allocate any additional memory, and the same plugin functions are run each time.

Console output is

A0=0x0 A1=0x8

[C66xx_0] A2=0x0 A3=0x1d1a0300

[C66xx_0] A4=0x1 A5=0xf8

[C66xx_0] A6=0x800002f9 A7=0x0

[C66xx_0] A8=0xc042860 A9=0x80c780

[C66xx_0] A10=0x80c87c A11=0x80c310

[C66xx_0] A12=0x0 A13=0x0

[C66xx_0] A14=0xc04f284 A15=0x0

[C66xx_0] A16=0xc20a408 A17=0x8

[C66xx_0] A18=0x804e34 A19=0x20

[C66xx_0] A20=0x0 A21=0x0

[C66xx_0] A22=0x40500211 A23=0x6a80212c

[C66xx_0] A24=0x20000005 A25=0x141801ac

[C66xx_0] A26=0x0 A27=0x0

[C66xx_0] A28=0x400 A29=0x1

[C66xx_0] A30=0x400 A31=0xc1d2

[C66xx_0] B0=0x1 B1=0x813d38

[C66xx_0] B2=0x0 B3=0xc02b96c

[C66xx_0] B4=0x1 B5=0x15000103

[C66xx_0] B6=0x1 B7=0x1

[C66xx_0] B8=0x1 B9=0x0

[C66xx_0] B10=0x80d8c8 B11=0x0

[C66xx_0] B12=0x0 B13=0x0

[C66xx_0] B14=0x80e184 B15=0x813ce8

[C66xx_0] B16=0x0 B17=0x40000000

[C66xx_0] B18=0x80000000 B19=0x3ff001f0

[C66xx_0] B20=0x83d0b5f4 B21=0x1

[C66xx_0] B22=0xf B23=0x0

[C66xx_0] B24=0x810344 B25=0x10a000

[C66xx_0] B26=0x11ab4406 B27=0x1

[C66xx_0] B28=0x0 B29=0x144f

[C66xx_0] B30=0x3f800000 B31=0xffffffff

[C66xx_0] NTSR=0x1020d

[C66xx_0] ITSR=0x20d

[C66xx_0] IRP=0xc003e54

[C66xx_0] SSR=0x0

[C66xx_0] AMR=0x0

[C66xx_0] RILC=0x0

[C66xx_0] ILC=0x0

[C66xx_0] Exception at 0x0

[C66xx_0] EFR=0x2 NRP=0x0

[C66xx_0] Internal exception: IERR=0x1

[C66xx_0] Instruction fetch exception

[C66xx_0] ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0x0c003e54, sp = 0x00813ce8.

[C66xx_0] xdc.runtime.Error.raise: terminating execution
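
For readers decoding a dump like the one above by hand: the handler's text ("Internal exception", "Instruction fetch exception") follows from the EFR and IERR bit fields. Below is a minimal sketch of that decoding. The bit names are listed as I recall them from the C66x CPU documentation, so treat the table as an assumption and check it against the reference guide; `ierr_first_cause` and `efr_is_internal` are hypothetical helper names, not SYS/BIOS APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* IERR bit names, lowest bit first (assumed layout, verify before relying on it). */
static const char *const ierr_names[] = {
    "Instruction fetch exception",  /* bit 0 */
    "Fetch packet exception",       /* bit 1 */
    "Execute packet exception",     /* bit 2 */
    "Opcode exception",             /* bit 3 */
    "Resource conflict exception",  /* bit 4 */
    "Resource access exception",    /* bit 5 */
    "Privilege exception",          /* bit 6 */
    "Loop buffer exception",        /* bit 7 */
};

/* Return the name of the lowest set IERR bit in bits 0..7, or NULL if none is set. */
const char *ierr_first_cause(uint32_t ierr)
{
    size_t i;
    for (i = 0; i < sizeof ierr_names / sizeof ierr_names[0]; i++) {
        if (ierr & (1u << i)) {
            return ierr_names[i];
        }
    }
    return NULL;
}

/* EFR bit 1 flags an internal exception (EFR=0x2 in the dump). */
int efr_is_internal(uint32_t efr)
{
    return (int)((efr >> 1) & 1u);
}
```

With the values from the dump, `efr_is_internal(0x2)` is true and `ierr_first_cause(0x1)` gives the instruction fetch case.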

After each plugin the cores write the cache back to DDR3 memory. I read in an earlier post that I should disable the Idle task, and I do this on all cores. The problem now occurs less often than at the beginning (more than one cycle passes), but it is not solved.

Task.enableIdleTask = false;

Task.allBlockedFunc = Idle.run;

I increased the stack size of my worker task, but that doesn't solve the problem.

Program.global.workerTask = Task.create("&workerTaskFxn");

Program.global.workerTask.stackSize = 0x4000;

Program.global.workerTask.priority = 0x6;
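
Before raising the stack size further, it can help to measure how much of the stack is actually used. The sketch below shows the usual watermark technique in plain C: the stack is pre-filled with a known byte, and the peak usage is whatever is no longer filled. The fill value 0xBE and the helper name `stack_peak_used` are assumptions for illustration; SYS/BIOS exposes comparable information through `Task_stat()` and ROV, which should be preferred on the target.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xBEu  /* assumed fill pattern written at task creation */

/*
 * Peak stack usage in bytes for a stack that grows downward:
 * bytes at the lowest addresses that still hold the fill pattern
 * were never touched, everything above them has been used.
 */
size_t stack_peak_used(const uint8_t *stack_base, size_t stack_size)
{
    size_t untouched = 0;
    while (untouched < stack_size && stack_base[untouched] == STACK_FILL) {
        untouched++;
    }
    return stack_size - untouched;
}
```

If the reported peak stays well below 0x4000, the stack size itself is unlikely to be the cause.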

I hope you have some further tips.

Regards

  • This doesn't appear to have anything to do with the stack. It's an Instruction Fetch Exception, specifically at 0x0C003E54 (in MSMC SRAM/shared L2 space), which isn't near your stack (the B15 register indicates the stack is in local L2 memory space).

    Look at the disassembly window at that address. Also look at the disassembly window at the address pointed to by the B3 register (B3 holds the return address, so this is where it branched from in the last function call). If you can take a screenshot of the disassembly windows with about 10 instructions before and after these addresses, we can take a look.

    Best Regards,

    Chad
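
The reasoning in the reply above (pc in MSMC SRAM, sp in local L2) can be reproduced with a small address classifier. This is only a sketch: the ranges are the C6678 memory map as I recall it from the data manual (local L2 at 0x00800000, 512 KB; MSMC SRAM at 0x0C000000, 4 MB; DDR3 from 0x80000000), and `classify_addr` is a hypothetical helper name.

```c
#include <stdint.h>

typedef enum {
    REGION_UNKNOWN,
    REGION_LOCAL_L2,
    REGION_MSMC,
    REGION_DDR3
} region_t;

/* Map a 32-bit address to a coarse memory region (assumed C6678 ranges). */
region_t classify_addr(uint32_t addr)
{
    if (addr >= 0x00800000u && addr <= 0x0087FFFFu) {
        return REGION_LOCAL_L2;   /* core-local L2 SRAM */
    }
    if (addr >= 0x0C000000u && addr <= 0x0C3FFFFFu) {
        return REGION_MSMC;       /* shared MSMC SRAM */
    }
    if (addr >= 0x80000000u) {
        return REGION_DDR3;       /* external DDR3 */
    }
    return REGION_UNKNOWN;        /* e.g. address 0x0 */
}
```

For the first dump, `classify_addr(0x0C003E54)` lands in MSMC SRAM and `classify_addr(0x00813CE8)` in local L2, while address 0x0 falls into none of these regions.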

  • Hi,

    Thank you for the fast answer. I really thought that it was a stack overflow.

    I took a screenshot of the disassembly around this address. I hope it helps you. You can find it in the attachment.

    Regards

  • Was this capture after it failed or was it after a fresh load and run?  I see a breakpoint so I suspect it would have been after a fresh load and run.

    I'll need the capture after the failure to see if we have memory corruption (being overwritten.)

    Also, grab a capture at the IRP (0xC003E54)

    Taking another look at this, I see that the exception is actually at 0x0. I should have noticed that before. That's the address it tried to branch to and execute from. There is no valid memory there. This goes along with my earlier theory of an uninitialized variable possibly causing the issue.

    Best Regards,

    Chad
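
A branch to 0x0 is what a call through a NULL (never assigned or overwritten) function pointer looks like. Since the application dispatches plugins, one defensive option is to validate the pointer before calling it. The sketch below is an assumption about how such a plugin table might look; `plugin_fxn`, `plugin_call` and `plugin_double` are hypothetical names, not taken from the original code.

```c
#include <stddef.h>

/* Hypothetical plugin signature: takes an opaque argument, returns a status/result. */
typedef int (*plugin_fxn)(void *arg);

/* Example plugin used only for illustration: doubles the int it is given. */
int plugin_double(void *arg)
{
    return 2 * *(int *)arg;
}

/*
 * Call table[idx] only if the slot exists and is not NULL.
 * Returns 0 and stores the plugin's return value in *result on success,
 * or -1 if the call was refused.
 */
int plugin_call(const plugin_fxn *table, size_t count, size_t idx,
                void *arg, int *result)
{
    if (table == NULL || result == NULL || idx >= count || table[idx] == NULL) {
        return -1;
    }
    *result = table[idx](arg);
    return 0;
}
```

A check like this turns a jump to 0x0 into an error code that can be logged, which also shows which slot was empty. It does not catch pointers that were overwritten with a non-NULL garbage value.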

  • Hi,

    Oh sorry, the capture was taken after a fresh run. I tried to reproduce the same error, but I get different ones.

    I get this exception after MessageQ_get.

    console output

    [C66xx_0] A0=0x0 A1=0x0

    [C66xx_0] A2=0x1 A3=0x80b3a8

    [C66xx_0] A4=0x80c018 A5=0x80c008

    [C66xx_0] A6=0x0 A7=0xffffffff

    [C66xx_0] A8=0x157c0000 A9=0x803d70

    [C66xx_0] A10=0x0 A11=0x0

    [C66xx_0] A12=0x0 A13=0x15000103

    [C66xx_0] A14=0x10 A15=0x1

    [C66xx_0] A16=0x803d70 A17=0x20

    [C66xx_0] A18=0x803d70 A19=0x0

    [C66xx_0] A20=0x6c A21=0x4c

    [C66xx_0] A22=0x40500211 A23=0x42800128

    [C66xx_0] A24=0x0 A25=0x341801ac

    [C66xx_0] A26=0x0 A27=0x0

    [C66xx_0] A28=0x400 A29=0x1

    [C66xx_0] A30=0x400 A31=0xc04f274

    [C66xx_0] B0=0x803e40 B1=0x0

    [C66xx_0] B2=0x0 B3=0xc243580

    [C66xx_0] B4=0xfffff2c4 B5=0x806000

    [C66xx_0] B6=0x159 B7=0x1

    [C66xx_0] B8=0x80c050 B9=0x3a

    [C66xx_0] B10=0x80c681a8 B11=0xc031088

    [C66xx_0] B12=0x80c681a8 B13=0x0

    [C66xx_0] B14=0x80cca4 B15=0x805ea0

    [C66xx_0] B16=0x30 B17=0x803f9c

    [C66xx_0] B18=0x9c869536 B19=0x3feffafb

    [C66xx_0] B20=0x80c67e74 B21=0x1

    [C66xx_0] B22=0x1 B23=0x0

    [C66xx_0] B24=0x80ee44 B25=0x10a800

    [C66xx_0] B26=0x110b4406 B27=0x0

    [C66xx_0] B28=0x0 B29=0x1

    [C66xx_0] B30=0x0 B31=0x803ec4

    [C66xx_0] NTSR=0x1000e

    [C66xx_0] ITSR=0xf

    [C66xx_0] IRP=0xc03f73a

    [C66xx_0] SSR=0x0

    [C66xx_0] AMR=0x0

    [C66xx_0] RILC=0x0

    [C66xx_0] ILC=0x0

    [C66xx_0] Exception at 0xc2435a0

    [C66xx_0] EFR=0x2 NRP=0xc2435a0

    [C66xx_0] Internal exception: IERR=0x8

    [C66xx_0] Opcode exception

    [C66xx_0] ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0x0c03f73a, sp = 0x00805ea0.

    [C66xx_0] xdc.runtime.Error.raise: terminating execution

    disassemble

  • Yeah, that appears to be garbage there. Do you have the .map file from the build? Let's use it to see whether this is actually supposed to be code space or not.

    Another question, just to make sure we don't have a bad piece of silicon: have you run other code without issue, such as TI example code? Have you tried this on another EVM?

    Best Regards,
    Chad

  • Hi,

    I have included the map file as an attachment.

    I haven't run other test code on this DSP yet; I will test that now. I have some other custom boards here and one EVM board. But on the EVM board I can't run the whole application, because there I don't have the PCIe connection to the FPGA.

    1145.tof_multicore_master.txt

    It was not possible to upload a .map file, so I changed the extension to .txt.

    Regards

  • Not sure why it's not allowing you to upload it, especially if you changed it to a .txt file.

    Can you look at the map file and see if there are functions around the failure address?

    Best Regards,
    Chad

  • Hi,

    I have tested the software on other custom boards and get the same error. I tried to test the software on an EVM board, but I have to switch off too many things to get it running there and cannot use all the plugins to test the software.

    I edited my last post and included the map file. I hope you have some ideas to help me.



    Regards

  • As I suspected, your code is jumping into the middle of uninitialized space (i.e. jumping into nothing).

    ti.sdo.ipc.SharedRegion_0
    *          0    0c200000    00200000     NOLOAD SECTION
                      0c200000    00200000     --HOLE--

    The question is why it is doing this. The most common reason would be accessing uninitialized variables. Another possibility is corruption of the stack itself: data overwrites the stack locations that hold pointers which need to be preserved and are later used to jump to locations in code.

    You can go back and use some of the example code provided in the MCSDK on your platform to confirm that it runs perfectly fine. But your code isn't running fine.

    Usually the compiler will give you warnings when you build code with uninitialized variables.

    Best Regards,
    Chad
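
The map-file check in the last reply comes down to a range test: does the failing address fall inside a section that holds no code? Here is a minimal sketch using the SharedRegion_0 origin and length quoted above; the struct and function names (`map_section`, `addr_in_section`) are hypothetical.

```c
#include <stdint.h>

/* One entry from the linker map: section name, origin and length in bytes. */
typedef struct {
    const char *name;
    uint32_t origin;
    uint32_t length;
} map_section;

/* Non-zero if addr lies in [origin, origin + length). Written to avoid overflow. */
int addr_in_section(uint32_t addr, const map_section *s)
{
    return addr >= s->origin && (addr - s->origin) < s->length;
}
```

With `{"ti.sdo.ipc.SharedRegion_0", 0x0c200000, 0x00200000}`, both the exception address 0xc2435a0 and the B3 value 0xc243580 from the second dump fall inside the NOLOAD shared region, so the core was executing from shared-region data rather than from code.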