The issue was not resolved in the following post: https://e2e.ti.com/support/dsp/omap_applications_processors/f/42/t/438821
I have reproduced the issue with simple code that uses the correct argument ordering. The SUB instruction does not work correctly when it runs on the same functional unit as the preceding ADDSP or SUBSP instruction and there is a NOP 2 between the two instructions.
For example:
06B4F218 ADDSP.L1X A7,B13,A13
00002000 NOP 2
060440D8 SUB.L1 2,A1,A12
06B4FE3A SUBSP.S2X B7,A13,B13
00002000 NOP 2
0604A5A2 SUB.S2 5,B1,B12
Please see the attached file for the full assembly listing:
07BE9DC2 SUBAW.D2 SP,0x14,SP
07BC18F1 OR.D1X 0,SP,FP
07BC22F4 || STW.D2T1 FP,*+SP[1]
073C42F5 STW.D2T1 A14,*+SP[2]
073CE276 || STW.D1T2 DP,*+FP[7]
06BC62F5 STW.D2T1 A13,*+SP[3]
06BD0276 || STW.D1T2 B13,*+FP[8]
063C82F5 STW.D2T1 A12,*+SP[4]
063D2276 || STW.D1T2 B12,*+FP[9]
05BCA2F5 STW.D2T1 A11,*+SP[5]
05BD4276 || STW.D1T2 B11,*+FP[10]
053CC2F5 STW.D2T1 A10,*+SP[6]
053D6276 || STW.D1T2 B10,*+FP[11]
071808F1 OR.D1 0,A6,A14
051816A1 || OR.S1X 0,B6,A10
05183D42 || ADDAW.D2 B6,0x1,B10
029018F3 OR.D2X 0,A4,B5
000002AA || MVK.S2 0x0005,B0
04383764 LDDW.D1T1 *A14++[1],A9:A8
03280265 LDW.D1T1 *+A10[0],A6
032802E6 || LDW.D2T2 *+B10[0],B6
04383764 LDDW.D1T1 *A14++[1],A9:A8
03A84265 LDW.D1T1 *+A10[2],A7
03A842E6 || LDW.D2T2 *+B10[2],B7
00000000 NOP
06A0BE03 MPYSP.M2X B5,A8,B13
06911E00 || MPYSP.M1X A8,B4,A13
LOOP:
008000A9 MVK.S1 0x0001,A1
008000AA || MVK.S2 0x0001,B1
00008000 NOP 5
06A0BE02 MPYSP.M2X B5,A8,B13
00008000 NOP 5
06B4F218 ADDSP.L1X A7,B13,A13
00002000 NOP 2
060440D8 SUB.L1 2,A1,A12
00008000 NOP 5
0119B21A ADDSP.L2X B13,A6,B2
00002000 NOP 2
060440DA SUB.L2 2,B1,B12
00008000 NOP 5
06B4FE18 ADDSP.S1X A7,B13,A13
00002000 NOP 2
060465A0 SUB.S1 3,A1,A12
00008000 NOP 5
0119BE1A ADDSP.S2X B13,A6,B2
00002000 NOP 2
060465A2 SUB.S2 3,B1,B12
00008000 NOP 5
06911E00 MPYSP.M1X A8,B4,A13
00008000 NOP 5
0134D2B8 SUBSP.L1X B6,A13,A2
00002000 NOP 2
060480D8 SUB.L1 4,A1,A12
00008000 NOP 5
06B4F23A SUBSP.L2X B7,A13,B13
00002000 NOP 2
060480DA SUB.L2 4,B1,B12
00008000 NOP 5
0119BEB8 SUBSP.S1X B6,A13,A2
00002000 NOP 2
0604A5A0 SUB.S1 5,A1,A12
00008000 NOP 5
06B4FE3A SUBSP.S2X B7,A13,B13
00002000 NOP 2
0604A5A2 SUB.S2 5,B1,B12
00008000 NOP 5
000029C2 SUB.D2 B0,0x1,B0
2FFFED92 [ B0] B.S2 LOOP (PC-148 = 0x118118cc)
00008000 NOP 5
07BC18F0 OR.D1X 0,SP,FP
053CC2E5 LDW.D2T1 *+SP[6],A10
053D6266 || LDW.D1T2 *+FP[11],B10
05BCA2E5 LDW.D2T1 *+SP[5],A11
05BD4266 || LDW.D1T2 *+FP[10],B11
063C82E5 LDW.D2T1 *+SP[4],A12
063D2266 || LDW.D1T2 *+FP[9],B12
06BC62E5 LDW.D2T1 *+SP[3],A13
06BD0266 || LDW.D1T2 *+FP[8],B13
073C42E5 LDW.D2T1 *+SP[2],A14
073CE267 || LDW.D1T2 *+FP[7],DP
000C0362 || B.S2 B3
07BC22E4 LDW.D2T1 *+SP[1],FP
178014FE ADDAW.D2 B15,20,SP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
If the two instructions are back to back (no NOP), or if the NOP between them is any length other than 2 cycles, everything works fine. The failure seems to occur only when the SUB is issued exactly 3 cycles after the ADDSP or SUBSP instruction.
In addition, the ADD instruction fails in the same way.
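To make the "3 cycles after" condition concrete, here is a small Python sketch of the cycle arithmetic. It is only an illustrative model, not a confirmed explanation of the failure: it assumes that ADDSP and SUBSP have three delay slots and that ADD and SUB are single-cycle instructions with no delay slots. The names `DELAY_SLOTS`, `result_cycle`, and `results_coincide` are hypothetical helpers made up for this sketch.

```python
# Assumed delay slots per instruction (an assumption of this sketch,
# not something measured on the hardware).
DELAY_SLOTS = {"ADDSP": 3, "SUBSP": 3, "ADD": 0, "SUB": 0}


def result_cycle(issue_cycle, mnemonic):
    """Return the cycle in which the instruction's result becomes available."""
    return issue_cycle + DELAY_SLOTS[mnemonic]


def results_coincide(first, nop_count, second):
    """Return True if `second`, issued after `first` and `nop_count` NOP cycles,
    delivers its result in the same cycle as `first`.

    nop_count == 0 means the two instructions are back to back.
    """
    first_issue = 0
    # One cycle for `first` itself, then nop_count idle cycles.
    second_issue = first_issue + 1 + nop_count
    return result_cycle(first_issue, first) == result_cycle(second_issue, second)


if __name__ == "__main__":
    for n in range(6):
        hit = results_coincide("ADDSP", n, "SUB")
        print(f"ADDSP; NOP {n}; SUB -> results due in the same cycle: {hit}")
```

Under these assumptions, NOP 2 is the only spacing at which both results are due in the same cycle, which matches the one spacing that fails when both instructions use the same unit. Whether that coincidence is actually the cause is the open question of this post.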
For example:
0119B21A ADDSP.L2X B13,A6,B2
00002000 NOP 2
0604405A ADD.L2 2,B1,B12
0119BEB8 SUBSP.S1X B6,A13,A2
00002000 NOP 2
0604A1A0 ADD.S1 5,A1,A12
Please see the attached file for the full assembly listing:
07BE9DC2 SUBAW.D2 SP,0x14,SP
07BC18F1 OR.D1X 0,SP,FP
07BC22F4 || STW.D2T1 FP,*+SP[1]
073C42F5 STW.D2T1 A14,*+SP[2]
073CE276 || STW.D1T2 DP,*+FP[7]
06BC62F5 STW.D2T1 A13,*+SP[3]
06BD0276 || STW.D1T2 B13,*+FP[8]
063C82F5 STW.D2T1 A12,*+SP[4]
063D2276 || STW.D1T2 B12,*+FP[9]
05BCA2F5 STW.D2T1 A11,*+SP[5]
05BD4276 || STW.D1T2 B11,*+FP[10]
053CC2F5 STW.D2T1 A10,*+SP[6]
053D6276 || STW.D1T2 B10,*+FP[11]
071808F1 OR.D1 0,A6,A14
051816A1 || OR.S1X 0,B6,A10
05183D42 || ADDAW.D2 B6,0x1,B10
029018F3 OR.D2X 0,A4,B5
000002AA || MVK.S2 0x0005,B0
04383764 LDDW.D1T1 *A14++[1],A9:A8
03280265 LDW.D1T1 *+A10[0],A6
032802E6 || LDW.D2T2 *+B10[0],B6
04383764 LDDW.D1T1 *A14++[1],A9:A8
03A84265 LDW.D1T1 *+A10[2],A7
03A842E6 || LDW.D2T2 *+B10[2],B7
00000000 NOP
06A0BE03 MPYSP.M2X B5,A8,B13
06911E00 || MPYSP.M1X A8,B4,A13
LOOP:
008000A9 MVK.S1 0x0001,A1
008000AA || MVK.S2 0x0001,B1
00008000 NOP 5
06A0BE02 MPYSP.M2X B5,A8,B13
00008000 NOP 5
06B4F218 ADDSP.L1X A7,B13,A13
00002000 NOP 2
06044058 ADD.L1 2,A1,A12
00008000 NOP 5
0119B21A ADDSP.L2X B13,A6,B2
00002000 NOP 2
0604405A ADD.L2 2,B1,B12
00008000 NOP 5
06B4FE18 ADDSP.S1X A7,B13,A13
00002000 NOP 2
060461A0 ADD.S1 3,A1,A12
00008000 NOP 5
0119BE1A ADDSP.S2X B13,A6,B2
00002000 NOP 2
060461A2 ADD.S2 3,B1,B12
00008000 NOP 5
06911E00 MPYSP.M1X A8,B4,A13
00008000 NOP 5
0134D2B8 SUBSP.L1X B6,A13,A2
00002000 NOP 2
06048058 ADD.L1 4,A1,A12
00008000 NOP 5
06B4F23A SUBSP.L2X B7,A13,B13
00002000 NOP 2
0604805A ADD.L2 4,B1,B12
00008000 NOP 5
0119BEB8 SUBSP.S1X B6,A13,A2
00002000 NOP 2
0604A1A0 ADD.S1 5,A1,A12
00008000 NOP 5
06B4FE3A SUBSP.S2X B7,A13,B13
00002000 NOP 2
0604A1A2 ADD.S2 5,B1,B12
00008000 NOP 5
000029C2 SUB.D2 B0,0x1,B0
2FFFED92 [ B0] B.S2 LOOP (PC-148 = 0x118118cc)
00008000 NOP 5
07BC18F0 OR.D1X 0,SP,FP
053CC2E5 LDW.D2T1 *+SP[6],A10
053D6266 || LDW.D1T2 *+FP[11],B10
05BCA2E5 LDW.D2T1 *+SP[5],A11
05BD4266 || LDW.D1T2 *+FP[10],B11
063C82E5 LDW.D2T1 *+SP[4],A12
063D2266 || LDW.D1T2 *+FP[9],B12
06BC62E5 LDW.D2T1 *+SP[3],A13
06BD0266 || LDW.D1T2 *+FP[8],B13
073C42E5 LDW.D2T1 *+SP[2],A14
073CE267 || LDW.D1T2 *+FP[7],DP
000C0362 || B.S2 B3
07BC22E4 LDW.D2T1 *+SP[1],FP
178014FE ADDAW.D2 B15,20,SP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
Furthermore, I have reproduced the issue on C6713.
Please see the attached files for the full assembly listings (the SUB variant first, then the ADD variant):
07BE9DC2 SUBAW.D2 SP,0x14,SP
07BC11A1 MV.S1X SP,FP
07BC22F4 || STW.D2T1 FP,*+SP[1]
073C42F5 STW.D2T1 A14,*+SP[2]
073CE276 || STW.D1T2 DP,*+FP[7]
06BC62F5 STW.D2T1 A13,*+SP[3]
06BD0277 || STW.D1T2 B13,*+FP[8]
00000000 || NOP
063C82F5 STW.D2T1 A12,*+SP[4]
063D2276 || STW.D1T2 B12,*+FP[9]
05BCA2F5 STW.D2T1 A11,*+SP[5]
05BD4276 || STW.D1T2 B11,*+FP[10]
053CC2F5 STW.D2T1 A10,*+SP[6]
053D6277 || STW.D1T2 B10,*+FP[11]
00000001 || NOP
00000000 || NOP
07180941 MV.D1 A6,A14
051811A1 || MV.S1X B6,A10
05183D42 || ADDAW.D2 B6,0x1,B10
0290105B MV.L2X A4,B5
000002AA || MVK.S2 0x0005,B0
04383764 LDDW.D1T1 *A14++[1],A9:A8
03280265 LDW.D1T1 *+A10[0],A6
032802E6 || LDW.D2T2 *+B10[0],B6
04383764 LDDW.D1T1 *A14++[1],A9:A8
03A84265 LDW.D1T1 *+A10[2],A7
03A842E6 || LDW.D2T2 *+B10[2],B7
00000000 NOP
06A0BE03 MPYSP.M2X B5,A8,B13
06911E00 || MPYSP.M1X A8,B4,A13
LOOP:
008000A9 MVK.S1 0x0001,A1
008000AA || MVK.S2 0x0001,B1
00008000 NOP 5
06A0BE02 MPYSP.M2X B5,A8,B13
00008000 NOP 5
06B4F218 ADDSP.L1X A7,B13,A13
00002000 NOP 2
060440D8 SUB.L1 2,A1,A12
00008000 NOP 5
0119B21A ADDSP.L2X B13,A6,B2
00002000 NOP 2
060440DA SUB.L2 2,B1,B12
00008000 NOP 5
06911E00 MPYSP.M1X A8,B4,A13
00008000 NOP 5
0134D2B8 SUBSP.L1X B6,A13,A2
00002000 NOP 2
060480D8 SUB.L1 4,A1,A12
00008000 NOP 5
06B4F23A SUBSP.L2X B7,A13,B13
00002000 NOP 2
060480DA SUB.L2 4,B1,B12
00008000 NOP 5
000029C2 SUB.D2 B0,0x1,B0
2FFFF712 [ B0] B.S2 LOOP (PC-72 = 0x00011f38)
00008000 NOP 5
07BC11A0 MV.S1X SP,FP
053CC2E5 LDW.D2T1 *+SP[6],A10
053D6266 || LDW.D1T2 *+FP[11],B10
05BCA2E5 LDW.D2T1 *+SP[5],A11
05BD4266 || LDW.D1T2 *+FP[10],B11
063C82E5 LDW.D2T1 *+SP[4],A12
063D2267 || LDW.D1T2 *+FP[9],B12
00000000 || NOP
06BC62E5 LDW.D2T1 *+SP[3],A13
06BD0266 || LDW.D1T2 *+FP[8],B13
073C42E5 LDW.D2T1 *+SP[2],A14
073CE267 || LDW.D1T2 *+FP[7],DP
000C0362 || B.S2 B3
07BC22E4 LDW.D2T1 *+SP[1],FP
07BE9D42 ADDAW.D2 SP,0x14,SP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP

07BE9DC2 SUBAW.D2 SP,0x14,SP
07BC11A1 MV.S1X SP,FP
07BC22F4 || STW.D2T1 FP,*+SP[1]
073C42F5 STW.D2T1 A14,*+SP[2]
073CE276 || STW.D1T2 DP,*+FP[7]
06BC62F5 STW.D2T1 A13,*+SP[3]
06BD0277 || STW.D1T2 B13,*+FP[8]
00000000 || NOP
063C82F5 STW.D2T1 A12,*+SP[4]
063D2276 || STW.D1T2 B12,*+FP[9]
05BCA2F5 STW.D2T1 A11,*+SP[5]
05BD4276 || STW.D1T2 B11,*+FP[10]
053CC2F5 STW.D2T1 A10,*+SP[6]
053D6277 || STW.D1T2 B10,*+FP[11]
00000001 || NOP
00000000 || NOP
07180941 MV.D1 A6,A14
051811A1 || MV.S1X B6,A10
05183D42 || ADDAW.D2 B6,0x1,B10
0290105B MV.L2X A4,B5
000002AA || MVK.S2 0x0005,B0
04383764 LDDW.D1T1 *A14++[1],A9:A8
03280265 LDW.D1T1 *+A10[0],A6
032802E6 || LDW.D2T2 *+B10[0],B6
04383764 LDDW.D1T1 *A14++[1],A9:A8
03A84265 LDW.D1T1 *+A10[2],A7
03A842E6 || LDW.D2T2 *+B10[2],B7
00000000 NOP
06A0BE03 MPYSP.M2X B5,A8,B13
06911E00 || MPYSP.M1X A8,B4,A13
LOOP:
008000A9 MVK.S1 0x0001,A1
008000AA || MVK.S2 0x0001,B1
00008000 NOP 5
06A0BE02 MPYSP.M2X B5,A8,B13
00008000 NOP 5
06B4F218 ADDSP.L1X A7,B13,A13
00002000 NOP 2
06044058 ADD.L1 2,A1,A12
00008000 NOP 5
0119B21A ADDSP.L2X B13,A6,B2
00002000 NOP 2
0604405A ADD.L2 2,B1,B12
00008000 NOP 5
06911E00 MPYSP.M1X A8,B4,A13
00008000 NOP 5
0134D2B8 SUBSP.L1X B6,A13,A2
00002000 NOP 2
06048058 ADD.L1 4,A1,A12
00008000 NOP 5
06B4F23A SUBSP.L2X B7,A13,B13
00002000 NOP 2
0604805A ADD.L2 4,B1,B12
00008000 NOP 5
000029C2 SUB.D2 B0,0x1,B0
2FFFF712 [ B0] B.S2 LOOP (PC-72 = 0x00011f38)
00008000 NOP 5
07BC11A0 MV.S1X SP,FP
053CC2E5 LDW.D2T1 *+SP[6],A10
053D6266 || LDW.D1T2 *+FP[11],B10
05BCA2E5 LDW.D2T1 *+SP[5],A11
05BD4266 || LDW.D1T2 *+FP[10],B11
063C82E5 LDW.D2T1 *+SP[4],A12
063D2267 || LDW.D1T2 *+FP[9],B12
00000000 || NOP
06BC62E5 LDW.D2T1 *+SP[3],A13
06BD0266 || LDW.D1T2 *+FP[8],B13
073C42E5 LDW.D2T1 *+SP[2],A14
073CE267 || LDW.D1T2 *+FP[7],DP
000C0362 || B.S2 B3
07BC22E4 LDW.D2T1 *+SP[1],FP
07BE9D42 ADDAW.D2 SP,0x14,SP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
00000000 NOP
Why do the ADD and SUB instructions not work?
Best regards,
Daisuke