C6678 x IdleTask

weber de souza calixto

Other Parts Discussed in Thread: SYSBIOS

Hi all,

I'm currently working with the EVMC6678LE, and I keep running into a problem related to IdleTask, on both core_0 and core_1..7.

The error happens a few milliseconds after I load the .out files and run all the cores:

[C66xx_0] MCSDK IMAGE PROCESSING DEMONSTRATION
[C66xx_0]
[C66xx_0] QMSS successfully initialized
[C66xx_0] CPPI successfully initialized
[C66xx_0] PA successfully initialized
[C66xx_0] EVM in StaticIP mode at 192.168.0.225
[C66xx_0] Set IP address of PC to 192.168.0.98
[C66xx_0] Starting webfiles_add
[C66xx_0] Done webfiles_add
[C66xx_0] HTTP Status Added
[C66xx_0] Debug Message Level Added
[C66xx_0] Last while
[C66xx_0] PASS successfully initialized
[C66xx_0] Ethernet subsystem successfully initialized
[C66xx_0] Ethernet eventId : 48 and vectId (Interrupt) : 7
[C66xx_0] Registration of the EMAC Successful, waiting for link up ..
[C66xx_0] Network Added: If-1:192.168.0.225
[C66xx_0] Service Status: HTTP : Enabled : : 000
[C66xx_6] A0=0xc8e41000 A1=0x0
[C66xx_6] A2=0x0 A3=0x804a5254
[C66xx_6] A4=0x803823e0 A5=0x804a5600
[C66xx_6] A6=0x0 A7=0x0
[C66xx_6] A8=0x15a00000 A9=0x1
[C66xx_6] A10=0x803823e0 A11=0x1d750300
[C66xx_6] A12=0x801c607c A13=0x1d750300
[C66xx_6] A14=0x0 A15=0x2
[C66xx_6] A16=0xc21a980 A17=0xc
[C66xx_6] A18=0x804b996c A19=0x20
[C66xx_6] A20=0x0 A21=0x0
[C66xx_6] A22=0x804c03cc A23=0x804c0184
[C66xx_6] A24=0x804c0184 A25=0x804c0184
[C66xx_6] A26=0x804c0184 A27=0x804c0184
[C66xx_6] A28=0x804c0184 A29=0x804a5010
[C66xx_6] A30=0x0 A31=0x80381a2c
[C66xx_6] B0=0x804b9928 B1=0x0
[C66xx_6] B2=0x0 B3=0x0
[C66xx_6] B4=0x80382398 B5=0x1
[C66xx_6] B6=0x159 B7=0x1
[C66xx_6] B8=0x80382388 B9=0x15000102
[C66xx_6] B10=0x801d4158 B11=0x80382398
[C66xx_6] B12=0xe B13=0x80383270
[C66xx_6] B14=0x801000 B15=0x804b90b8
[C66xx_6] B16=0x1 B17=0x2baa8059
[C66xx_6] B18=0x80384e28 B19=0x8
[C66xx_6] B20=0x80384dc0 B21=0x2
[C66xx_6] B22=0xf B23=0x0
[C66xx_6] B24=0x812cda7a B25=0x40010203
[C66xx_6] B26=0x14c00 B27=0x804c02b8
[C66xx_6] B28=0x804c0ca8 B29=0x804c02b8
[C66xx_6] B30=0x0 B31=0x8042
[C66xx_6] NTSR=0x1020e
[C66xx_6] ITSR=0xf
[C66xx_6] IRP=0x801d6c88
[C66xx_6] SSR=0x0
[C66xx_6] AMR=0x0
[C66xx_6] RILC=0x0
[C66xx_6] ILC=0x0
[C66xx_6] Exception at 0x80382414
[C66xx_6] EFR=0x2 NRP=0x80382414
[C66xx_6] Internal exception: IERR=0x18
[C66xx_6] Opcode exception
[C66xx_6] Resource conflict exception
[C66xx_6] ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0x801d6c88, sp = 0x804b90b8.
[C66xx_6] To see more exception detail, use ROV or set 'ti.sysbios.family.c64p.Exception.enablePrint = true;'
[C66xx_6] xdc.runtime.Error.raise: terminating execution

Opening the disassembly window at address 0x80382414 shows that it seems to be related to ti.sysbios.knl.Task:

ti_sysbios_knl_Task_Object__table__V:
80382340: 803827A8 || [ A1] MVK.S1 0x704f,A0
80382344: 803827A8 [ A1] MVK.S1 0x704f,A0
80382348: 00000005 LDHU.D1T1 *-A0[0],A0
8038234c: 00000020 || BPOS.S1 ti_sysbios_knl_Task_Object__table__V (PC+0 = 0x80382340),A0
80382350: 804B8A58 [ A1] CMPEQ.L1 -4,A18,A0
80382354: 00000001 NOP
80382358: 00000000 || NOP
8038235c: 00002000 NOP 2
80382360: 804B6AA0 [ A1] CMPLTDP.S1 A27:A26,A19:A18,A0
80382364: 00000000 NOP
80382368: 801C2C80 [ A1] MPY.M1 A1,A7,A0
8038236c: 00000000 NOP
80382370: 00000000 NOP
80382374: 00000000 NOP
80382378: 804A55E0 [ A1] SUB.S1X A18,B18,A0
8038237c: 00000001 NOP
80382380: 803827A8 || [ A1] MVK.S1 0x704f,A0
80382384: 8048A21A [ A1] ADDSP.L2 B5,B18,B0
80382388: 80382780 [ A1] MPYHLU.M1 A1,A14,A0
8038238c: 80382780 [ A1] MPYHLU.M1 A1,A14,A0
80382390: 00000000 NOP
80382394: 00000001 NOP
80382398: 804B9048 || [ A1] EXT.S1 A18,28,16,A0
8038239c: 00000001 NOP
803823a0: 00000000 || NOP
803823a4: 00000800 MPY32.M1 A0,A0,A0
803823a8: 804B8AA0 [ A1] CMPLTDP.S1 A29:A28,A19:A18,A0
803823ac: 00000000 NOP
803823b0: 801EC780 [ A1] MPYHLU.M1 A22,A7,A0
803823b4: 00000000 NOP
803823b8: 00000000 NOP
803823bc: 00000000 NOP
803823c0: 804A55F0 [ A1] MPYSP2DP.M1X A18,B18,A1:A0
803823c4: 00000001 NOP
803823c8: 80382780 || [ A1] MPYHLU.M1 A1,A14,A0
803823cc: 8048A225 [ A1] LDB.D1T1 *+A18[5],A0
803823d0: 80382788 || [ A1] SET.S1 A14,1,7,A0
803823d4: 80382788 [ A1] SET.S1 A14,1,7,A0
803823d8: 00000001 NOP
803823dc: 00000002 || NOP
803823e0: 804B9A58 [ A1] CMPEQ.L1X -4,B18,A0
803823e4: 00000001 NOP
803823e8: 00000000 || NOP
803823ec: 00000800 MPY32.M1 A0,A0,A0
803823f0: 804B92A0 [ A1] XOR.S1X -4,B18,A0
803823f4: 00000000 NOP
803823f8: 801DD680 [ A1] MPYHULS.M1X A14,B7,A0
803823fc: 00000000 NOP
80382400: 00000000 NOP
80382404: 00000000 NOP
80382408: 804A5600 [ A1] MPYID.M1X -14,B18,A1:A0
8038240c: 00000001 NOP
80382410: 80382788 || [ A1] SET.S1 A14,1,7,A0
80382414: 8048A242 .word 0x8048a242
80382418: 72547663 [!B2] SHRU2.S2X A21,0x3,B4
8038241c: 00656361 || .word 0x00656361

Removing the default idle task solves the problem. In the .cfg file:

Task.enableIdleTask = false;
Task.allBlockedFunc = Idle.run;

Am I looking at a SYS/BIOS bug here? This happens every time the IdleTask is enabled and I increase the stack and heap sizes and relocate them to DDR3 in order to fit.

over 13 years ago

0 Robert Tivy over 13 years ago

TI__Mastermind 18260 points

weber de souza calixto said:
ti_sysbios_knl_Task_Object__table__V:

Your program has jumped into data. The label that you disassembled contains the table of Task_Objects.

Note that your highlighted .word at the NRP memory location is the first data in this table that doesn't correspond to a valid assembly instruction. If your code jumped to any address between ti_sysbios_knl_Task_Object__table__V and 0x80382414 then the CPU would happily execute all those "nonsense" assembly instructions until it hits the .word value, at which point it generates an exception.
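
As an illustration of how the dump reports the cause, the IERR=0x18 value can be decoded bit by bit. This is a minimal sketch: the bit names below reflect the IERR layout of the C64x+/C66x CPU as commonly documented and should be treated as an assumption to verify against the CPU and Instruction Set Reference Guide. The helper name decode_ierr is hypothetical.

```python
# Assumed IERR bit layout for the C64x+/C66x CPU; verify against the
# CPU and Instruction Set Reference Guide before relying on it.
IERR_BITS = {
    0: "instruction fetch exception",
    1: "fetch packet exception",
    2: "execute packet exception",
    3: "opcode exception",
    4: "resource conflict exception",
    5: "resource access exception",
    6: "privilege exception",
    7: "loop buffer exception",
    8: "missed stall exception",
}


def decode_ierr(value):
    """Return the names of all exception causes whose bit is set in IERR."""
    return [name for bit, name in sorted(IERR_BITS.items()) if value & (1 << bit)]


# IERR value taken from the exception dump in the first post.
causes = decode_ierr(0x18)
for cause in causes:
    print(cause)
```

With 0x18, bits 3 and 4 are set, which matches the "Opcode exception" and "Resource conflict exception" lines that SYS/BIOS printed.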

In order to determine how you got to this bogus location, you can combine your knowledge of the code that has executed just prior to the exception and the value of the registers in the exception dump. There are quite a few registers that contain a value between 0x80382340 and 0x80382414 in your exception dump, and if any one of those have been used as the target of a branch instruction then you will end up in the same situation.
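
The register check described above can be done mechanically. The sketch below uses a few lines copied from the exception dump (not the full dump) and lists every register whose value lies between the start of the table and the faulting address. The function name registers_in_range is hypothetical.

```python
import re

# A subset of the exception dump from the first post.
DUMP = """\
[C66xx_6] A4=0x803823e0 A5=0x804a5600
[C66xx_6] A10=0x803823e0 A11=0x1d750300
[C66xx_6] A30=0x0 A31=0x80381a2c
[C66xx_6] B2=0x0 B3=0x0
[C66xx_6] B4=0x80382398 B5=0x1
[C66xx_6] B8=0x80382388 B9=0x15000102
[C66xx_6] B10=0x801d4158 B11=0x80382398
[C66xx_6] B12=0xe B13=0x80383270
[C66xx_6] IRP=0x801d6c88
[C66xx_6] EFR=0x2 NRP=0x80382414
"""

TABLE_START = 0x80382340  # ti_sysbios_knl_Task_Object__table__V
FAULT_ADDR = 0x80382414   # NRP, where the .word was hit


def registers_in_range(dump, lo, hi):
    """Map register name -> value for every NAME=0x... pair within [lo, hi]."""
    found = {}
    for name, value in re.findall(r"(\w+)=0x([0-9a-fA-F]+)", dump):
        addr = int(value, 16)
        if lo <= addr <= hi:
            found[name] = addr
    return found


suspects = registers_in_range(DUMP, TABLE_START, FAULT_ADDR)
for name, addr in suspects.items():
    print(f"{name} = 0x{addr:08x}")
```

Any register this reports is only a candidate; it still has to be matched against the code that ran just before the exception.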

Normally, B3 is a good breadcrumb, holding the address to which a bogus function call is expected to return. However, in your case, it has 0 so it is of no value. You need to somehow determine what code was executing just before the bogus branch. Typically, a bogus branch is taken when a function pointer (perhaps a callback) has been corrupted in memory, so in finding out how you branched to this data area, you might need to then find out how the function pointer got overwritten.

You ask if there might be a bug in BIOS idle processing (or just a general bug in BIOS). While this is certainly possible, it's not likely, as BIOS is used extensively enough that a problem in idle task processing would have been identified by now. Since you mention DDR3, the cache needs to be considered, since caching issues can easily result in corrupted data.

Regards,

- Rob

0 weber de souza calixto over 13 years ago in reply to Robert Tivy

Expert 1700 points

Hi,

Thanks for the information. The weird thing about this error is that turning off the idle task seems to solve it, and that it happens every time I place the stack and heap of a thread in DDR3. That is why I asked whether this could be a SYS/BIOS error.

You mention that corrupted data can lead to nasty errors. The problem is, I'm running code that far exceeds the cache memory capacity (both for .stack and heap) for only one core (~2 MB heap and 1 MB stack for each slave core, plus a 6 MB heap for the master core). Therefore, I must place these symbols in DDR3.

In my case, I'm running a proprietary library for OCR. When called, the library tries to allocate lots of data space, which requires lots of stack and heap. What is the 'etiquette' for these cases? How can I check for corrupted data? Is there any document that explains in depth how I should proceed regarding big stack and heap allocations and DDR3 problems?

Weber

0 Robert Tivy over 13 years ago in reply to weber de souza calixto

TI__Mastermind 18260 points

It shouldn't be a problem to have a large heap and stack and large data allocations from those. I believe the problem is more related to the thing that you have to do to support the large heap/stack: moving to DDR3. When your heap and stack are in L2 memory, there is no caching in L2 cache, and the L1 caching is probably OK since it's handled automatically. When you move to DDR3, the L2 cache comes into play, and the cacheability of DDR3 accesses depends on the MAR register settings of the C6x+.

As an experiment, you could try turning off all MAR bits, so no caching occurs for DDR3 accesses. This will ensure that no stale cache data is used and all memory accesses go straight to DDR3. This might also drastically slow down performance, which might in itself mask the problem, or move it to a different place, but it's worth a shot IMO. If the problem goes away when disabling the MAR bits, then that tells you there might be a caching issue (where some flushing or invalidating might need to be done, especially if you're sharing DDR3 across cores).
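
To see which MAR registers such an experiment would touch, the sketch below maps an address range to MAR indices. It assumes the usual CorePac layout, where each MAR covers 16 MB of address space and MAR0 sits at 0x01848000 with one 32-bit register per index; both numbers are assumptions to confirm in the C66x CorePac User Guide. The function name mar_registers is hypothetical, and the code only computes addresses; it does not write any register.

```python
MAR_BASE = 0x01848000         # assumed address of MAR0
MAR_SPAN = 16 * 1024 * 1024   # assumed address range covered by one MAR


def mar_registers(origin, length):
    """Return (index, register_address) for every MAR covering the range."""
    first = origin // MAR_SPAN
    last = (origin + length - 1) // MAR_SPAN
    return [(n, MAR_BASE + 4 * n) for n in range(first, last + 1)]


# DDR3 window as it appears in the linker MEMORY maps quoted in this thread.
ddr3 = mar_registers(0x80000000, 0x10000000)
print(f"MAR{ddr3[0][0]}..MAR{ddr3[-1][0]} ({len(ddr3)} registers)")
print(f"first register at 0x{ddr3[0][1]:08x}")
```

SYS/BIOS is understood to expose the MAR settings through its c66 Cache module, but the exact call and its parameters should be checked in the documentation of the installed version rather than taken from this sketch.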

Regards,

- Rob

0 weber de souza calixto over 13 years ago in reply to Robert Tivy

Expert 1700 points

Hi,

How do I disable the MAR bits on C66 devices? This wiki refers to C64x http://processors.wiki.ti.com/index.php/Enabling_64x%2B_Cache, but I can't find anything similar for C66.

I think that someone at Texas Instruments should take a careful look at this problem. The user Jean-Baptiste Theou reports similar errors when using a big heap/stack. As for myself, the error always happens with a big heap/stack placed outside the processor's internal memory.

In this post http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/186916/674511.aspx he describes the same problem.

0 weber de souza calixto over 13 years ago in reply to weber de souza calixto

Expert 1700 points

Robert,

Could you (or someone else from TI) take a careful look at this problem?

Jean reported the same problem I did: after increasing the heap size, weird errors happen.

He posted a disassembly from a crash. There we can see that the NRP points to a NOP instruction, but the next instruction is a .word, just as in my disassembly here.

http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/186916/674729.aspx

I'm awaiting your response.

Weber

0 Chad Courtney over 13 years ago in reply to weber de souza calixto

TI__Mastermind 30825 points

Robert,

I've been working on the other thread mentioned. It would appear that a buffer (not sure if it's the heap or what) is being overflowed and the code space is being overwritten; thus, when they jump back to that code space, they get an exception.

Can you suggest a way for them to track if they're overflowing the buffer and which buffer it is?

Best Regards,

Chad

0 judahvang over 13 years ago

TI__Mastermind 32475 points

Weber,

Are you running a single image on all cores or do you have different executables and memory maps for each core?

If you have a single image that is to run on all cores, then no data can be placed in shared memory. This includes any heaps.

Data must be placed in local memory or you need to be able to separate the shared memory between each core.
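
One way to picture "separating the shared memory between each core" is to carve a shared window into equal, aligned, per-core slices. The sketch below is only arithmetic under stated assumptions: the 256 MB window, the 1 MB alignment, and the names Region and partition are all hypothetical, and how a build actually applies the resulting addresses (for example per-core linker files) is not shown.

```python
from collections import namedtuple

Region = namedtuple("Region", "core origin length")

ALIGN = 0x100000  # assumed 1 MB slice alignment


def partition(origin, length, cores):
    """Split [origin, origin + length) into equal, aligned, disjoint slices."""
    cores = list(cores)
    slice_len = (length // len(cores)) // ALIGN * ALIGN
    return [Region(core, origin + i * slice_len, slice_len)
            for i, core in enumerate(cores)]


# Hypothetical: share one DDR3 window among the 7 slave cores (cores 1..7).
slices = partition(0x80000000, 0x10000000, range(1, 8))
for s in slices:
    print(f"core {s.core}: 0x{s.origin:08x} .. 0x{s.origin + s.length - 1:08x}")
```

Each slice here is 36 MB, which leaves room for the roughly 2 MB heap plus 1 MB stack per slave core mentioned earlier in the thread.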

Judah

0 weber de souza calixto over 13 years ago in reply to judahvang

Expert 1700 points

My project is based on the Image Processing Demo project for the EVM6678.

It has two .out images: one goes on the master core, the other runs on each of the 7 slave cores. So there is a single image that runs on 7 cores (slaves), but the master core runs a different image.

Here are my .map files for the master and slave projects:

8420.evmc6678l_memory_maps.zip

Weber

0 judahvang over 13 years ago in reply to weber de souza calixto

TI__Mastermind 32475 points

Weber,

I notice that DDR3 is the same for both programs. I think this is a problem. Could you please split up DDR3 so that the two programs don't overlap?
Then see if your program still crashes.

Judah

0 weber de souza calixto over 13 years ago in reply to judahvang

Expert 1700 points

Hi,

Are there any errata for the EVMC6678 regarding DDR?

I'm doing a few DDR tests, and it seems that the default range is incorrect:

MEMORY
{
L2SRAM (RWX) : org = 0x800000, len = 0x40000
MSMCSRAM_MASTER (RWX) : org = 0xc000000, len = 0x100000
MSMCSRAM_SLAVE (RWX) : org = 0xc100000, len = 0x100000
MSMCSRAM_IPC (RWX) : org = 0xc200000, len = 0x200000
DDR3 (RWX) : org = 0x80000000, len = 0x1000000
}

This gives 256 MB for DDR3; however, the EVM should have 512 MB. When I try to use

DDR3 (RWX) : org = 0x80000000, len = 0x2000000

the EVM goes really slow.

Weber

0 Chad Courtney over 13 years ago in reply to weber de souza calixto

TI__Mastermind 30825 points

All errata are documented in the Errata document for each product, which is the second document listed on the product page for the C6678.

That said, there are no performance-related errata for the DDR interface on the C6678. As long as you've programmed it correctly, you should be getting the expected performance levels from it. Also make sure the PLL is set correctly so you're not running it slow; if you used the default GEL file for configuration, this should be fine.

Also, make sure you have appropriate caches turned on.

You may also want to clarify what you mean by 'EVM goes really slow.'

DDR throughput performance is also documented in the Throughput Performance Guide SPRABK5.

Best Regards,

Chad

0 weber de souza calixto over 13 years ago in reply to Chad Courtney

Expert 1700 points

Chad,

I'm working with the demo project for the EVM6678: "Multicore Image Processing Demo". Following Judah's suggestion, I split DDR3 into two parts, one for each project:

DDR3 (RWX) : org = 0x80000000, len = 0x10000000 for the slave core project and

DDR3 (RWX) : org = 0x90000000, len = 0x10000000 for the master core project. This way I can guarantee that each project runs in an isolated memory range.
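
Whether two MEMORY ranges are really isolated can be checked with a simple interval test. This is a minimal sketch using the origin and length values quoted in this post; the function name overlaps is hypothetical.

```python
def overlaps(a, b):
    """True if half-open ranges a and b share an address; each is (origin, length)."""
    a_start, a_len = a
    b_start, b_len = b
    return a_start < b_start + b_len and b_start < a_start + a_len


slave = (0x80000000, 0x10000000)          # DDR3 range of the slave project
master_split = (0x90000000, 0x10000000)   # master project after the split
master_shared = (0x80000000, 0x10000000)  # master project before the split

print("split layout overlaps: ", overlaps(slave, master_split))
print("shared layout overlaps:", overlaps(slave, master_shared))
```

The check only covers the two images against each other; the seven slave cores still load the same image and therefore the same DDR3 addresses.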

By "the EVM runs really slow" I meant that the multicore border detection that the demo implements runs 10x slower than when I run with

DDR3 (RWX) : org = 0x80000000, len = 0x10000000 for the two projects.

Length 0x10000000 means ~256 MB, right? However, the EVM has 512 MB of DDR3. That's why I asked about errata regarding DDR3.
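
The length arithmetic is easy to get wrong by one hex digit, so here is a small worked check of the len values that appear in this thread. The helper name size_mib is hypothetical.

```python
MIB = 1024 * 1024


def size_mib(length):
    """Convert a linker 'len' value in bytes to mebibytes."""
    return length / MIB


# len values as they appear in the MEMORY entries quoted in this thread.
for length in (0x1000000, 0x2000000, 0x10000000, 0x20000000):
    print(f"0x{length:08x} = {size_mib(length):g} MiB")
```

So 0x10000000 is indeed 256 MiB and 0x20000000 would be 512 MiB, while the seven-digit values 0x1000000 and 0x2000000 quoted earlier are only 16 MiB and 32 MiB.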

I'm also using the default GEL file. In fact, right now most of the project is the default provided with the EVM, so I can test the memory allocation before putting that big library to run.
