Hello!
Sometimes I get this exception, and its root cause is totally unclear.
It worries me a lot, because I found the threads linked below, and one of them says this can be a HW problem. In any case, it does not seem to be my direct fault, so I can't just fix it myself.
http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/112/t/85262.aspx
http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/t/238045.aspx
So I will provide as much information as possible, and will appreciate any help.
-----------------------------------
1) System info:
Target CPU: C6748.
DSP lib: dsplib_c674x_3_1_1_1.
CGTools: 7.4.1.
SYS/BIOS: bios_6_34_02_18.
CCS: 5.3.0.00090.
PRUSS is not used.
L2 Cache disabled.
Output format: COFF, little endian.
Debugger: Spectrum Digital XDS510 USB
2) Compiler switches:
-mv6740 --abi=coffabi -g --include_path="C:/Programs/ti/ccsv5/tools/compiler/c6000_7.4.1/include" --include_path="F:/1 _ Sandbox/PROJECT/Project/Pxxxxx/SW/DSP/App/WPL_DSP" --include_path="F:/Sandbox/PROJECT/Project/Pxxxxx/SW/DSP/App/WPL_DSP/Inc" --include_path="C:/Programs/ti/dsplib_c674x_3_1_1_1/inc" --gcc --define=c6748 --display_error_number --diag_warning=225
3) Exception output
Console:
A8=0xa7065293 A9=0x1
A10=0x40467b40 A11=0x40467878
A12=0x0 A13=0x0
A14=0x4046d0b0 A15=0x0
A16=0xa67cd6ac A17=0x3b076c70
A18=0x3b466e0e A19=0x800038a8
A20=0x0 A21=0x0
A22=0x0 A23=0x0
A24=0x54 A25=0x800035c0
A26=0x80003818 A27=0x24
A28=0x4c A29=0x0
A30=0x0 A31=0x800036e0
B0=0x0 B1=0x0
B2=0x80000000 B3=0x802ec4
B4=0x0 B5=0x0
B6=0x0 B7=0x80003398
B8=0x0 B9=0x0
B10=0x40467a00 B11=0x0
B12=0x0 B13=0x0
B14=0x400001c0 B15=0x819f80
B16=0x0 B17=0x0
B18=0x80000000 B19=0x80000000
B20=0x0 B21=0x0
B22=0x800031e8 B23=0x70
B24=0x50 B25=0x0
B26=0x0 B27=0xffffffff
B28=0x0 B29=0x0
B30=0x10 B31=0x800031b8
NTSR=0x1420e
ITSR=0x20f
IRP=0x802ed0
SSR=0x0
AMR=0x0
RILC=0x0
ILC=0x95
Exception at 0x80e2e8
EFR=0x2 NRP=0x80e2e8
Internal exception: IERR=0x180
Loop buffer exception
Missed stall exception
ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0x0080e2e8, sp = 0x00819f80.
To see more exception detail, use ROV or set 'ti.sysbios.family.c64p.Exception.enablePrint = true;'
xdc.runtime.Error.raise: terminating execution
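For readers decoding such a dump by hand: the two messages printed after IERR=0x180 correspond to individual bits of that register. Below is a minimal C sketch of the decoding, assuming the C64x+/C674x IERR bit layout as I understand it from the CPU and Instruction Set Reference Guide (bit 7 = loop buffer exception, bit 8 = missed stall exception). The table and the function name `decode_ierr` are hypothetical and are not part of SYS/BIOS.

```c
#include <string.h>

/* Assumed C64x+/C674x IERR bit layout; verify against the CPU reference guide. */
static const char *const ierr_bit_names[9] = {
    "Instruction fetch exception",  /* bit 0 */
    "Fetch packet exception",       /* bit 1 */
    "Execute packet exception",     /* bit 2 */
    "Opcode exception",             /* bit 3 */
    "Resource conflict exception",  /* bit 4 */
    "Resource access exception",    /* bit 5 */
    "Privilege exception",          /* bit 6 */
    "Loop buffer exception",        /* bit 7 */
    "Missed stall exception"        /* bit 8 */
};

/* Hypothetical helper: writes the names of all set IERR bits into buf,
 * separated by "; ", and returns how many bits were set.
 * buflen must be at least 1; names that do not fit are truncated. */
int decode_ierr(unsigned ierr, char *buf, size_t buflen)
{
    int count = 0;
    unsigned bit;

    buf[0] = '\0';
    for (bit = 0; bit < 9; bit++) {
        if (ierr & (1u << bit)) {
            if (count > 0)
                strncat(buf, "; ", buflen - strlen(buf) - 1);
            strncat(buf, ierr_bit_names[bit], buflen - strlen(buf) - 1);
            count++;
        }
    }
    return count;
}
```

Under these assumptions, `decode_ierr(0x180, buf, sizeof buf)` reports exactly the two conditions printed in the console output: bits 7 and 8 set, nothing else.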
4) Behavior:
It always crashes in the same place: inside the function DSPF_sp_fir_gen(). I get this exception a few times a day when halting execution from the debugger (suspend button, inserting a new breakpoint, or clearing the "skip all breakpoints" flag). Very rarely it also occurs while the target is simply running. I have never seen it without the debugger connected; however, I currently work with the debugger almost all the time, so I don't have enough statistics for that case.
C code:
old_CSR = Hwi_disable();
DSPF_sp_fir_gen(mFreqSignal, smHilbertCoef, ortho, c_FIR_HILBERT_LEN, c_FREQ_PROCES_LEN); // nr = 120, nh = 80, cycles = 5108
Hwi_restore(old_CSR);
Disassembly:
DSPF_sp_fir_gen() function and exception address
DSPF_sp_fir_gen:
0080e180: 0C181FD8 OR.L1X 0,B6,A24
0080e184: 03E10800 MPY32.M1 A8,A24,A7
0080e188: 00006000 NOP 4
0080e18c: 089C9DA2 SHR.S2X A7,0x4,B17
0080e190: EC91 ADD.L2 B17,-1,B17
0080e192: 4CE7 SPLOOPD 10
0080e194: 07A7 || MVK.L2 0,B23
0080e196: D217 || MV.D2X A4,B22
0080e198: D8EF || MVC.S2 B17,ILC
0080e19a: 2D67 SPMASK L1,S1
0080e19c: EE082E00 .fphead n, h, W, BU, nobr, nosat, 1110000
0080e1a0: 0E00A359 || ^ MVK.L1 0,A28
0080e1a4: 0C9016A1 || ^ OR.S1X 0,B4,A25
0080e1a8: 03DAFC42 || ADDAW.D2 B22,B23,B7
0080e1ac: 0FE79C41 ADDAW.D1 A25,A28,A31
0080e1b0: 0A1C03E4 || LDDW.D2T1 *+B7[0],A21:A20
0080e1b4: 0E7081A1 ADD.S1 4,A28,A28
0080e1b8: 087C0364 || LDDW.D1T1 *+A31[0],A17:A16
0080e1bc: 0A1C23E6 LDDW.D2T2 *+B7[1],B21:B20
0080e1c0: 021C43E7 LDDW.D2T2 *+B7[2],B5:B4
0080e1c4: 00E38AF8 || CMPLT.L1 A28,A24,A1
0080e1c8: 0BDC8942 ADD.D2 B23,0x4,B23
0080e1cc: 00000000 NOP
0080e1d0: 02D20E00 MPYSP.M1 A16,A20,A5
0080e1d4: 000B0001 SPMASK L2
0080e1d8: 0C1B805B || SUB.L2 B6,4,B24
0080e1dc: 03DAFC43 || ADDAW.D2 B22,B23,B7
0080e1e0: 01D23E00 || MPYSP.M1X A17,B20,A3
0080e1e4: 2C67 SPMASK L1
0080e1e6: 4F46 || ^ MV.L1 A6,A26
0080e1e8: 047C2365 || LDDW.D1T1 *+A31[1],A9:A8
0080e1ec: 9BE2E5E3 || [!A1] SUB.S2 B23,B24,B23
0080e1f0: 03562E00 || MPYSP.M1 A17,A21,A6
0080e1f4: 03560E01 MPYSP.M1 A16,A21,A6
0080e1f8: 9E000041 || [!A1] MVK.D1 0,A28
0080e1fc: E040000C .fphead n, l, W, BU, nobr, nosat, 0000010
0080e200: 07CF || MV.S2 B7,B8
0080e202: 0C6E NOP 1
0080e204: 09C2BE03 MPYSP.M2X B21,A16,B19
0080e208: 049008F2 || OR.D2 0,B4,B9
0080e20c: 048740F1 MVD.M1 A1,A9
0080e210: 0214C219 || ADDSP.L1 A6,A5,A4
0080e214: 0846BE02 || MPYSP.M2X B21,A17,B16
0080e218: 092408F1 OR.D1 0,A9,A18
0080e21c: E0200000 .fphead n, l, W, BU, nobr, nosat, 0000001
0080e220: 0A553E00 || MPYSP.M1X A9,B21,A20
0080e224: 01986219 ADDSP.L1 A3,A6,A3
0080e228: 03551E01 || MPYSP.M1X A8,B21,A6
0080e22c: 09429E02 || MPYSP.M2X B20,A16,B18
0080e230: 03511E01 MPYSP.M1X A8,B20,A6
0080e234: 04453E02 || MPYSP.M2X B9,A17,B8
0080e238: 031407B2 ROTL.M2 B5,0x0,B6
0080e23c: 00000000 NOP
0080e240: 020CC219 ADDSP.L1 A6,A3,A4
0080e244: 082042E7 || LDW.D2T2 *+B8[2],B16
0080e248: 03213E03 || MPYSP.M2X B9,A8,B6
0080e24c: 024A021A || ADDSP.L2 B16,B18,B4
0080e250: 0290CE19 ADDSP.S1 A6,A4,A5
0080e254: 0320DE03 || MPYSP.M2X B6,A8,B6
0080e258: 024D021A || ADDSP.L2 B8,B19,B4
0080e25c: 02A65E01 MPYSP.M1X A18,B9,A5
0080e260: 0948DE02 || MPYSP.M2X B6,A18,B18
0080e264: 01A740F0 MVD.M1 A9,A3
0080e268: 0210C21A ADDSP.L2 B6,B4,B4
0080e26c: 02968E19 ADDSP.S1 A20,A5,A5
0080e270: 034A1E03 || MPYSP.M2X B16,A18,B6
0080e274: 0290C21A || ADDSP.L2 B6,B4,B5
0080e278: 0C6E NOP 1
0080e27a: 6F66 SPMASK S1,S2,D1
0080e27c: E8002000 .fphead n, l, W, BU, nobr, nosat, 1000000
0080e280: 0F000029 || ^ MVK.S1 0x0000,A30
0080e284: 0C80002B || ^ MVK.S2 0x0000,B25
0080e288: 0E800041 || ^ MVK.D1 0,A29
0080e28c: 0190A218 || ^ ADDSP.L1 A5,A4,A3
0080e290: EE67 SPMASK L1,S2,D1,D2
0080e292: 07A6 || ^ MVK.L1 0,A7
0080e294: 0D00002B || ^ MVK.S2 0x0000,B26
0080e298: 08800043 || ^ MVK.D2 0,B17
0080e29c: E2000300 .fphead n, l, W, BU, nobr, nosat, 0010000
0080e2a0: 0D800040 || ^ MVK.D1 0,A27
0080e2a4: 0314C21B ADDSP.L2 B6,B5,B6
0080e2a8: 09EB7C41 || ADDAW.D1 A26,A27,A19
0080e2ac: 04924E1B || ADDSP.S2 B18,B4,B9
0080e2b0: 01CE || MV.S1 A3,A0
0080e2b2: 0C6E NOP 1
0080e2b4: 0EF4A218 ADDSP.L1 A5,A29,A29
0080e2b8: 0F786E18 ADDSP.S1 A3,A30,A30
0080e2bc: E2000000 .fphead n, l, W, BU, nobr, nosat, 0010000
0080e2c0: 0CE52E1B ADDSP.S2 B9,B25,B25
0080e2c4: 0D68C21A || ADDSP.L2 B6,B26,B26
0080e2c8: DDEC8940 [!A0] ADD.D1 A27,0x4,A27
0080e2cc: 00000000 NOP
0080e2d0: DB7406A1 [!A0] OR.S1 0,A29,A22
0080e2d4: DBF808F0 || [!A0] OR.D1 0,A30,A23
0080e2d8: DE9C06A1 [!A0] OR.S1 0,A7,A29
0080e2dc: DB4C0345 || [!A0] STDW.D1T1 A23:A22,*+A19[0]
0080e2e0: D36406A3 || [!A0] OR.S2 0,B25,B6
0080e2e4: D3E808F2 || [!A0] OR.D2 0,B26,B7
0080e2e8: 04C34001 SPKERNEL 0x13
0080e2ec: DF1C0FD9 || [!A0] OR.L1 0,A7,A30
0080e2f0: DCC406A3 || [!A0] OR.S2 0,B17,B25
0080e2f4: DD4408F3 || [!A0] OR.D2 0,B17,B26
0080e2f8: D34C2346 || [!A0] STDW.D1T2 B7:B6,*+A19[1]
0080e2fc: 008CA362 BNOP.S2 B3,5
DSPF_sp_biquad:
Calling code:
00802e86: 0627 || MVK.L2 0,B4
00802e88: 03001428 || MVK.S1 0x0028,A6
C$RL29:
00802e8c: 020403E2 MVC.S2 CSR,B4
00802e90: 10004000 DINT
00802e94: 9E45 STW.D2T2 B4,*B15[16]
00802e96: BC5D LDW.D2T2 *B15[1],B5
00802e98: 01B84228 MVK.S1 0x7084,A3
00802e9c: E440000C .fphead n, l, W, BU, nobr, nosat, 0100010
00802ea0: 021AE02A MVK.S2 0x35c0,B4
00802ea4: 01800BE8 MVKH.S1 0x170000,A3
00802ea8: 9353 MVK.S2 84,B6
00802eaa: 72B0 ADD.L1X A3,B5,A3
00802eac: 0240006B || MVKH.S2 0x80000000,B4
00802eb0: 033C62E4 || LDW.D2T1 *+SP[3],A6
00802eb4: 10165C13 CALLP.S2 DSPF_sp_fir_gen (PC+45792 = 0x0080e180),B3
00802eb8: 020C0265 || LDW.D1T1 *+A3[0],A4
00802ebc: E0800020 .fphead n, l, W, BU, nobr, nosat, 0000100
00802ec0: 04002228 || MVK.S1 0x0044,A8
C$RL30:
00802ec4: 9E4D LDW.D2T2 *B15[16],B4
00802ec6: 6C6E NOP 4
00802ec8: 009003A2 MVC.S2 B4,CSR
00802ecc: 00000000 NOP
5) Possible root-causes:
Some time ago I noticed that modifying code like this:
unsigned hackety_hack1[10] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
unsigned old_CSR;// = Hwi_disable();
unsigned hackety_hack2[10] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
old_CSR = Hwi_disable();
DSPF_sp_fir_gen(mFreqSignal, smHilbertCoef, ortho, c_FIR_HILBERT_LEN, c_FREQ_PROCES_LEN); // nr = 120, nh = 80, cycles = 5108
Hwi_restore(old_CSR);
makes the exception rarer. However, there seems to be no direct relation; the modification just changes the codebase (and the generated assembly). I wonder whether the library function DSPF_sp_fir_gen() corrupts the stack and, beyond that, whether such corruption could lead to this kind of exception at all.
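One way to test the corruption suspicion more directly than with padding arrays is to surround a buffer with guard words and verify them after the call. The sketch below is only an illustration under stated assumptions: `fir_ref` is a plain-C stand-in and not the DSPLIB implementation, the sizes come from the "nr = 120, nh = 80" comment in the call, and the struct and helper names are hypothetical. It guards the output buffer; the same pattern could be applied to local variables on the stack. Alignment of the buffer is not handled here.

```c
#define GUARD_WORDS   8
#define GUARD_PATTERN 0xDEADBEEFu
#define NR 120  /* number of outputs, from the comment on the original call */
#define NH 80   /* number of coefficients, from the same comment */

/* Output buffer sandwiched between guard words (hypothetical layout). */
typedef struct {
    unsigned pre[GUARD_WORDS];
    float    out[NR];
    unsigned post[GUARD_WORDS];
} guarded_out_t;

/* Fill both guard regions with the known pattern. */
static void guards_arm(guarded_out_t *g)
{
    int i;
    for (i = 0; i < GUARD_WORDS; i++) {
        g->pre[i]  = GUARD_PATTERN;
        g->post[i] = GUARD_PATTERN;
    }
}

/* Returns 1 if every guard word is intact, 0 if any was overwritten. */
static int guards_intact(const guarded_out_t *g)
{
    int i;
    for (i = 0; i < GUARD_WORDS; i++) {
        if (g->pre[i] != GUARD_PATTERN || g->post[i] != GUARD_PATTERN)
            return 0;
    }
    return 1;
}

/* Plain-C stand-in for the FIR call: r[j] = sum over i of x[i + j] * h[i].
 * x must hold nr + nh - 1 samples. Not the library code. */
static void fir_ref(const float *x, const float *h, float *r, int nh, int nr)
{
    int i, j;
    for (j = 0; j < nr; j++) {
        float sum = 0.0f;
        for (i = 0; i < nh; i++)
            sum += x[i + j] * h[i];
        r[j] = sum;
    }
}

/* Runs the filter into the guarded buffer; returns 1 if the guards survived. */
int run_guarded_fir(const float *x, const float *h, guarded_out_t *g)
{
    guards_arm(g);
    fir_ref(x, h, g->out, NH, NR);  /* on target: the real library call */
    return guards_intact(g);
}
```

If `run_guarded_fir` ever returned 0 on the target with the real library call substituted, that would point at an out-of-bounds write; a result of 1 over many runs would make the corruption theory less likely, though it would not rule it out.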
Also, there are some silicon errata regarding CPU stalls on L2 and L1 memories due to priority settings and DMA activity:
Advisory 2.1.1 —DMA Access to L2 RAM Can Stall When DMA and C674x CPU Command Priorities Are Equal.
Advisory 2.1.17 —SDMA Activity Can Corrupt L1D When L2 Is Configured as Mixed/Cache/SRAM.
I have 2 concurrent continuous EDMA activities, but they don't access L2 (actually, L2 is not mapped for EDMA, so I can't access it by EDMA; I use L3, the on-chip RAM starting at 0x80000000).
And the L2 cache is disabled. All code is loaded and executed from there.
6) So, I have no more guesses.
What can cause this exception, more specifically than "resource conflict"? Can it be caused by the cache, EDMA, SDRAM, the TI DSP lib, or maybe the debugger? HW? Power to the CPU?
As mentioned in the two threads you referenced, this is likely a hardware issue that we folks in the compiler forum cannot help much with. I will move your thread to the processor forum, as they may have ideas and suggestions to help you.
Iaroslav,
[ed - RP - I have changed my understanding of the SPLOOP operation and added correcting comments in post below.]
I counted 37 execution packets in the SPLOOPD to SPKERNEL creation of the loop. The SPLOOP buffer only has space for 14 execution packets, so this implementation is not valid.
Where did this assembly come from? It should not be generated by our C Compiler, and it would be surprising and embarrassing to have this in the precompiled library from TI.
Regards,
RandyP
Hello! Thank you for reply!
As I wrote, I use TI dsplib_c674x_3_1_1_1 ( http://software-dl.ti.com/sdoemb/sdoemb_public_sw/dsplib/latest/index_FDS.html,
http://software-dl.ti.com/sdoemb/sdoemb_public_sw/dsplib/latest/exports/dsplib_c674x_3_1_1_1_Win32.exe ).
DSPF_sp_fir_gen() function is taken from there, and this is the latest version of this TI lib.
I use precompiled library ...\dsplib_c674x_3_1_1_1\lib\dsplib.a674 (size 42492 creation 11.10.12 05:00).
So, if this is the wrong library, what is the solution for me (C6748 target)?
Hello!
So what's the result of this discussion?
Is this a bug in the TI DSP library?
Should I wait for a new version?
Iaroslav,
After reading through some of the SPLOOP documentation, probably for the 20th time, I think the code is not wrong, but I cannot explain what is happening.
As for the 14-execution-packet limitation, I now believe this is not the same as the total number of execution packets shown within the SPLOOP/SPKERNEL construct. Instead, there is a statement in the C674x CPU & Instruction Set Reference Guide that says "The loop buffer can accommodate a SPLOOP body of up to 48 cycles." This is different from the 14-execute-packet limitation which I was considering before.
My understanding now is that the code is not built wrong since there are fewer than 48 cycles required to load the code into the SPLOOP loop buffer. There will be fewer than 14 total execution packets because many of the packets copied into the loop buffer will be executed in parallel.
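To make the two limits concrete, here is a small C sketch that checks a loop against them. It assumes that the 14-execute-packet capacity bounds the iteration interval (the operand of SPLOOPD, which is 10 in the disassembly above) and that the 48-cycle limit bounds the length of one iteration's body. That interpretation and the function names are assumptions for illustration only; they are not taken from any TI tool.

```c
#define SPLOOP_MAX_II     14  /* execute packets the buffer holds (figure from this thread) */
#define SPLOOP_MAX_BODY   48  /* cycles in one loop body (figure quoted from the guide) */

/* Returns 1 if a loop with iteration interval ii and a body of
 * body_cycles cycles stays within both assumed limits, else 0. */
int sploop_fits(int ii, int body_cycles)
{
    if (ii < 1 || ii > SPLOOP_MAX_II)
        return 0;
    if (body_cycles < 1 || body_cycles > SPLOOP_MAX_BODY)
        return 0;
    return 1;
}

/* Number of overlapped iterations in flight: ceil(body_cycles / ii). */
int sploop_stages(int ii, int body_cycles)
{
    return (body_cycles + ii - 1) / ii;
}
```

Taking the 37 execution packets counted earlier as a rough body length and ii = 10, `sploop_fits(10, 37)` passes and `sploop_stages(10, 37)` gives 4 overlapped iterations, which is consistent with the view that the library code is not built wrong.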
What is happening on the occasions when this exception occurs?
Are you debugging with breakpoints?
Are you accessing very slow memory or peripherals?
Have you found a way to work around this in the long time since our last exchange?
Regards,
RandyP
Hello RandyP!
First of all, I haven't seen this issue for a very long time: 2 months, I think. Our board now runs in test production without any debugger connected, and everything is OK. So this issue was reproducible only when the debugger was connected. It occurred mostly when stopping at a breakpoint, and sometimes at runtime, meaning the debugger is connected but the board is free-running without any breakpoints. So this answers the question "Are you debugging with breakpoints".
Then, "Are you accessing very slow memory or peripherals": no. All code and data are in the internal DSP memory, and only that memory is accessed.
"Have you found a way to work around this in the long time since our last exchange": the issue is not reproducible without the debugger connected, so no workaround is needed. Even during debugging I haven't seen the issue for a long time, but as I wrote, it seems to depend strongly on the whole codebase.
Now I have two points to suspect:
1) We had some threading errors (e.g. concurrent access to a buffer), which are now fixed. Maybe this was the root cause. However, this only brings back my first question, "What can be the root cause of this exception?", because the documentation says nothing about how to treat it (I mean, which "high-level" errors can cause it).
2) Maybe this really is only a debugger-related issue.
Anyway, we can continue the investigation on our side only if the exception appears at runtime (in production) without the debugger.