
CPU Hanging while executing SPKERNEL instruction.



I'm running code on a 6455, developing using CGT 7.2.0B2 with DSP/BIOS 5.41.09.34.  I'm using CCS V5.0.1.201101102000, which is a little long in the tooth....

I've been developing an application, and it was working as well as expected until recently.  After making some "trivial" code changes in "unrelated" code, my application has been hanging.  When I use the Blackhawk emulator to inspect things, the first attempt to connect fails with a complaint; after a retry, I see this in the disassembly view:

00840eb6:   4DE7                SPLOOPD       12
00840eb8:   069813A2 ||         MVC.S2X       A6,ILC
00840ebc:   E5260842            .fphead       n, l, DW/NDW, NW, nobr, nosat, 0101001
312               s = survivor[s][n];
          C$L14, C$DW$L$_ViterbiEqualiser$22$B:
00840ec0:   2CE7                SPMASK        L1,L2
00840ec2:   B247     || ^       MV.L2X        A4,B5
00840ec4:   04012829 ||         MVK.S1        0x0250,A8
00840ec8:   02480FD8 || ^       OR.L1         0,A18,A4
00840ecc:   04110570            MPYLI.M1      A8,A4,A9:A8
00840ed0:   4C6E                NOP           3
00840ed2:   2D66                SPMASK        S1
00840ed4:   0220A079 ||         ADD.L1        A5,A8,A4
00840ed8:   DA8E     || ^       MV.S1X        B21,A6
00840eda:   C230                ADD.L1        A6,A4,A3
00840edc:   EA200203            .fphead       n, l, W, BU, nobr, nosat, 1010001
00840ee0:   020C0264            LDW.D1T1      *+A3[0],A4
00840ee4:   4C6E                NOP           3
00840ee6:   2C67                SPMASK        L1
00840ee8:   03AC0FD8 || ^       OR.L1         0,A11,A7
313               VE_Iest[n-1] = VE_Symbols[s][0];
00840eec:   019C9E40            ADDAD.D1      A7,A4,A3
00840ef0:   018C0334            LDNW.D1T1     *+A3[0],A3
00840ef4:   2C6E                NOP           2
310           while(n > 0)
00840ef6:   8ED0                ADD.L1        A5,-4,A5
313               VE_Iest[n-1] = VE_Symbols[s][0];
00840ef8:   82C7                MV.L2         B5,B4
00840efa:   1FE6                SPKERNEL      0x 7752677664,8166
00840efc:   EC402008            .fphead       n, l, W, BU, nobr, nosat, 1100010
00840f00:   8ED1     ||         ADD.L2        B5,-4,B5
00840f02:   0235     ||         STNW.D2T1     A3,*B4[0]
          C$L16, C$L15, C$DW$L$_ViterbiEqualiser$22$E:
00840f04:   1586                MV.L1X        B11,A0
00840f06:   2627     ||         MVK.L2        1,B4
          C$L17:
00840f08:   D23C42F7     [!A0]  STW.D2T2      B4,*+SP[2]

The PC is 0x00840efa, i.e.

00840efa:   1FE6                SPKERNEL      0x 7752677664,8166

which is obviously rubbish. If I reload the executable and don't let it run to the problem point, I see a much more reasonable disassembly:

          C$L13:
00840eb6:   4DE7                SPLOOPD       12
00840eb8:   069813A2 ||         MVC.S2X       A6,ILC
00840ebc:   E5260842            .fphead       n, l, DW/NDW, NW, nobr, nosat, 0101001
312               s = survivor[s][n];
          C$L14, C$DW$L$_ViterbiEqualiser$22$B:
00840ec0:   2CE7                SPMASK        L1,L2
00840ec2:   B247     || ^       MV.L2X        A4,B5
00840ec4:   04012829 ||         MVK.S1        0x0250,A8
00840ec8:   02480FD8 || ^       OR.L1         0,A18,A4
00840ecc:   04110570            MPYLI.M1      A8,A4,A9:A8
00840ed0:   4C6E                NOP           3
00840ed2:   2D66                SPMASK        S1
00840ed4:   0220A079 ||         ADD.L1        A5,A8,A4
00840ed8:   DA8E     || ^       MV.S1X        B21,A6
00840eda:   C230                ADD.L1        A6,A4,A3
00840edc:   EA200203            .fphead       n, l, W, BU, nobr, nosat, 1010001
00840ee0:   020C0264            LDW.D1T1      *+A3[0],A4
00840ee4:   4C6E                NOP           3
00840ee6:   2C67                SPMASK        L1
00840ee8:   03AC0FD8 || ^       OR.L1         0,A11,A7
313               VE_Iest[n-1] = VE_Symbols[s][0];
00840eec:   019C9E40            ADDAD.D1      A7,A4,A3
00840ef0:   018C0334            LDNW.D1T1     *+A3[0],A3
00840ef4:   2C6E                NOP           2
310           while(n > 0)
00840ef6:   8ED0                ADD.L1        A5,-4,A5
313               VE_Iest[n-1] = VE_Symbols[s][0];
00840ef8:   82C7                MV.L2         B5,B4
00840efa:   1FE6                SPKERNEL      0,7
00840efc:   EC402008            .fphead       n, l, W, BU, nobr, nosat, 1100010

i.e. 00840efa:   1FE6                SPKERNEL      0,7

but as far as I can see the memory contents around this point are identical to the previous result.  I only see the odd disassembly after a crash.  The fstg value seems to vary somewhat but is always enormous, typically starting with 77 followed by 8 digits.

Note that the Fcyc value 8166 is the decimal value of the opcode as a whole, 0x1FE6.

It seems obvious that the disassembler has screwed-up, but what information is different in the hung case to the reloaded one?

My code in this area is written in C, and the loop that it is trying to execute is:

    while(n > 0)
    {
        s = survivor[s][n];
        VE_Iest[n-1] = VE_Symbols[s][0];
        n--;
    }
 

I'm fairly confident that my application is hanging in this loop, from previous occasions where I added code to monitor the CPU's progress through the routine.

My changes are typically adding more instrumentation, i.e. text output, but are not in the immediate vicinity of this code in terms of execution in this thread, although the application is multi-tasking and I guess that my changes could be having an effect nearby in terms of time.  Unfortunately I can't pin down a single change that breaks the application. I've tried undoing them in sequence, but can find no pattern as to which changes the application will work with and which it will not.  I'm currently assuming that one or more changes affect timing.  Note that interrupts are running and EDMAs are active, but I can't say if they are coincident with this SPLOOP.  It doesn't fail the first time through this code; it must have been run 100s of times before the crash.  The compiler also produces 2 other SPLOOPs in the function being executed, but every time I've seen it fail it has been in this "while (n>0)" loop.

The source code was compiled with the following options:

-mv64+
-g
-O3
--relaxed_ansi
--gcc
--define="_DEBUG" --define=MINI_NESIE_HW --define="CHIP_6416"
--include_path="C:/TI_CCS_V5.0.1/ccsv5/tools/compiler/c6000/include" ....

--display_error_number
--issue_remarks
--diag_warning=225 --diag_warning=270 --diag_warning=183
--optimize_with_debug
--interrupt_threshold=960
--abi=coffabi
--opt_for_speed=5
--printf_support=nofloat
--preproc_with_compile
--preproc_dependency=...

I've checked that I haven't overflowed any of the stacks in the application.

Are there any known issues with interrupts or task pre-emption and SPLOOPs, particularly on the 6455?

Paul Bray

  • Hi,

    Thanks for  your post.

    In general, the build option --disable_software_pipelining tells the compiler not to generate any software-pipelined loops, so SPLOOP is not used, but you shouldn't have to do that. Please refer to section 7.13.1, titled "Interrupt the Loop Buffer", in the C6000 CPU user guide below:

    http://www.ti.com/lit/ug/spru732j/spru732j.pdf

    If SPLOOP is in action and is interrupted, and if SPLOOP buffer is used to process other loop in the interrupt, the previous partial loop information will be gone because there is "stack" that can back up the state of the SPLOOP buffer. The key is to keep ISR short or brief.

    In general, interrupt service routines must save and restore the ITSR or NTSR, ILC, and RILC registers. A B IRP instruction copies ITSR to TSR, and a B NRP restores TSR from NTSR.

    The compiler returns from an interrupt function with B IRP, so a compiler-generated interrupt routine which contains SPLOOP should preserve ITSR, ILC, and RILC.  It turns out interrupt code generated by the compiler does preserve ILC and RILC, but not ITSR. So, a compiler generated interrupt routine can use SPLOOP.

    Please read the below E2E post:

    https://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/112/p/11376/99894#pi317286=2

    For more on SPLOOP, please refer to section 3.4 of the C6000 optimizing compiler user guide below:

    http://www.ti.com/lit/ug/spru187u/spru187u.pdf

    There are also C6000 optimization workshop materials covering C64x+ optimization techniques. Please refer to the wiki pages below:

    http://processors.wiki.ti.com/index.php/Optimization_Techniques_for_the_TI_C6000_Compiler

    http://processors.wiki.ti.com/index.php/TMS320C6000_DSP_Optimization_Workshop

    http://processors.wiki.ti.com/index.php/Optimized_Sort_Algorithms_For_DSP#Sort_Algorithms

    There are also some C64x/C64x+ compiler optimization tricks. For more on these, please refer to the wiki presentation below:

    http://processors.wiki.ti.com/images/6/6e/C64p_cgt_optimization.pdf

    Thanks & regards,

    Sivaraj K


  • Sivaraj,

    Thanks for the fast response.

    All my interrupt routines are written in C, and attached via the DSP/BIOS, so I would like to think that they save and restore the necessary registers.  In fact there is no "user" assembly code in the application.  Since there is no "user" assembly, at no point have we optimized an SPLOOP, all the SPLOOP code is generated by the compiler.

    I'm afraid I'm a little confused by your reply in places; perhaps you can expand your explanations a little.

    1) "If SPLOOP is in action and is interrupted, and if SPLOOP buffer is used to process other loop in the interrupt, the previous partial loop information will be gone because there is "stack" that can back up the state of the SPLOOP buffer"

    "the previous partial loop information will be gone" suggests that it is lost and we won't be able to restart the loop where we left it when we return from the interrupt handler, but the description then goes on to talk about a stack that can back up the state of the SPLOOP buffer, so that would rescue the situation perhaps.  The stack must have finite size, so at some point it could get filled, is that the issue that requires ISRs to be brief or short?  That's a good objective anyway.  Is there any way to inspect the stack to see if it has overflowed?

    If interrupts are allowed in an SPLOOP then I presume that task preemption is also possible, so the task context would have to contain the necessary information to continue the loop on return to the preempted task.  Does task switching save the SPLOOP stack, and restore it from a saved version for the new task?

    7.13.1 of spru732 suggests that the SPLOOP buffer is drained before the interrupt can be handled and is reinstated on return, which seems sensible.  But that would suggest that the ISR could use the SPLOOP without too much concern.

    2) "It turns out interrupt code generated by the compiler does preserve ILC and RILC, but not ITSR. So, a compiler generated interrupt routine can use SPLOOP"

    Isn't the lack of preservation of the ITSR an issue?  Earlier you say that an ISR must save and restore ITSR or(?) NTSR, ILC and RILC.

    I forgot to mention in my initial post that my application is loaded via the PCI interface from a host processor.  When the hang occurs I connect up to the DSP via my Blackhawk BH-USB-510L emulator pod, and inspect the CPU registers, memory contents etc.  When attempting to connect the first time, CCS gives me an error message, typically "C64XP_0: Error connecting to the target: CPU hangs, but driver forces it ready and issues an halt. Retry connection again!"  On a retry, connection is possible.

    If I load and run the application through the debugger I don't see the hang.  But maybe that's due to subtly different timing.

    Do you know what value the PC register would report when an SPLOOP is active?  Would it be the address of the SPKERNEL instruction?

    Are there any registers that I should inspect that might give a clue as to what is going wrong? 

    I note ILC is 0x53 which is plausible. In the loop that it is executing, C version shown below,  n would be 87:

        while(n > 0)
        {
            s = survivor[s][n];
            VE_Iest[n-1] = VE_Symbols[s][0];
            n--;
        }

    Core Registers    Core Registers    
        PC    0x00840EFA    Core Register    
        CLK    0x00000000    Core Register    
        SP    0x008B20D0    Core Register    
        FP    0x008B20D0    Core Register    
        A0    0x00000000    Core Register    
        A1    0x00000000    Core Register    
        A2    0x00000000    Core Register    
        A3    0x4CB64E2C    Core Register    
        A4    0x00020003    Core Register    
        A5    0x00000154    Core Register    
        A6    0x00872F58    Core Register    
        A7    0x0087E330    Core Register    
        A8    0x4C2F1D80    Core Register    
        A9    0x000000AE    Core Register    
        A10    0x00871738    Core Register    
        A11    0x0087E330    Core Register    
        A12    0x0087A9F0    Core Register    
        A13    0x00000058    Core Register    
        A14    0x008C7718    Core Register    
        A15    0x0087E370    Core Register    
        A16    0x0087E3FC    Core Register    
        A17    0x00000250    Core Register    
        A18    0x00000009    Core Register    
        A19    0x0000015C    Core Register    
        A20    0x00871CD8    Core Register    
        A21    0x0087E3F8    Core Register    
        A22    0x00871738    Core Register    
        A23    0x00000000    Core Register    
        A24    0x00000058    Core Register    
        A25    0x00000000    Core Register    
        A26    0x00000250    Core Register    
        A27    0x00872D08    Core Register    
        A28    0x0000001C    Core Register    
        A29    0x00000250    Core Register    
        A30    0x00000000    Core Register    
        A31    0x0087E3A8    Core Register    
        B0    0x00000000    Core Register    
        B1    0x00000000    Core Register    
        B2    0x008741F8    Core Register    
        B3    0x00840984    Core Register    
        B4    0x00871BDC    Core Register    
        B5    0x00871BD8    Core Register    
        B6    0x00871A88    Core Register    
        B7    0x00871BE8    Core Register    
        B8    0x00873304    Core Register    
        B9    0x0087E37A    Core Register    
        B10    0x0087AC50    Core Register    
        B11    0x00000000    Core Register    
        B12    0x00872F58    Core Register    
        B13    0x00871CD8    Core Register    
        B14    0x008CE0C8    Core Register    
        B15    0x008B20D0    Core Register    
        B16    0x0087E3F8    Core Register    
        B17    0x0087E378    Core Register    
        B18    0x00000000    Core Register    
        B19    0xFFFFF000    Core Register    
        B20    0x0087E330    Core Register    
        B21    0x00872F58    Core Register    
        B22    0x0000015C    Core Register    
        B23    0x00000000    Core Register    
        B24    0x00000003    Core Register    
        B25    0x00872868    Core Register    
        B26    0x00000000    Core Register    
        B27    0x00871F28    Core Register    
        B28    0x00000094    Core Register    
        B29    0x008731A8    Core Register    
        B30    0x00000000    Core Register    
        B31    0x008730B4    Core Register    
        AMR    0x00000000    Core Register: Addressing mode register    
        CSR    0x10000103    Core Register: Control status register    
        IFR    0x00004090    Core Register: Interrupt Flag Register    
        ISR    0x00004090    Core Register: Interrupt Set Register    
        ICR    0x00000000    Core Register: Interrupt clear Register    
        IER    0x000078DB    Core Register: Interrupt enable Register    
        ISTP    0x008D0080    Core Register: Interrupt service table pointer    
        IRP    0x008542C0    Core Register: Interrupt return pointer    
        NRP    0x00000000    Core Register: Non maskable interrupt    
        ERP    0x00000000    Core Register: Exception return pointer    
        TSCL    0xA8D9CF47    Core Register: Low half of 64-bit timestamp    
        TSCH    0x00000031    Core Register: High half of 64-bit timestamp    
        ARP    0x00000000    Core Register: Analysis return pointer    
        ILC    0x00000053    Core Register: Inner loop SPL buffer count    
        RILC    0x0000000C    Core Register: Reload Inner loop SPL buffer count    
        PCE1    0x00840EFA    Core Register: Program counter E1 phase    
        DNUM    0x00000000    Core Register: DSP Number    
        SSR    0x00000000    Core Register: Saturation status register    
        GPLYA    0x00000000    Core Register: GMPY polynomial for A-side    
        GPLYB    0x00000000    Core Register: GMPY polynomial for B-side    
        GFPGFR    0x0700001D    Core Register: Galois field multiply control register    
        DIER    0x00000000    Core Register: Debug interrupt enable register    
        TSR    0x0000420F    Core Register: Task state register    
        ITSR    0x0000020F    Core Register: Interrupt task state register    
        NTSR    0x00010000    Core Register: Non maskable TSR snapshot    
        ETSR    0x00010000    Core Register: Exception TSR snapshot    
        EFR    0x00000000    Core Register: Exception Flag Register    
        ECR    0x00000000    Core Register: Exception clear register    
        IERR    0x00000000    Core Register: Internal exception cause register    

    TSR is interesting. I believe that 0x0000420F would indicate that the SPLOOP is active and that we are currently processing an interrupt.  The PC shows 0x00840EFA, which is the SPKERNEL instruction in non-interrupt code.  Not sure if it is relevant, but IRP is 0x008542C0, which is in the memset function; maybe that is left over from the previous interrupt.
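The TSR reading above can be sketched as a small decoder. This is only an illustration: the bit positions (GIE at bit 0, INT at bit 9, SPLX at bit 14) are assumptions that are consistent with the interpretation given here and should be checked against the TSR field table in SPRU732, and the names `tsr_flags` and `decode_tsr` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions; verify against the TSR field table in SPRU732. */
#define TSR_GIE_BIT    0u  /* global interrupt enable */
#define TSR_INT_BIT    9u  /* interrupt processing in progress */
#define TSR_SPLX_BIT  14u  /* SPLOOP executing */

/* Hypothetical container for the three flags of interest. */
struct tsr_flags {
    bool gie;
    bool in_interrupt;
    bool sploop_active;
};

/* Extract the three flags from a raw TSR (or ITSR) value. */
static void decode_tsr(uint32_t tsr, struct tsr_flags *out)
{
    assert(out != NULL);
    out->gie           = ((tsr >> TSR_GIE_BIT)  & 1u) != 0u;
    out->in_interrupt  = ((tsr >> TSR_INT_BIT)  & 1u) != 0u;
    out->sploop_active = ((tsr >> TSR_SPLX_BIT) & 1u) != 0u;
}
```

Under these assumed positions, the TSR value 0x0000420F decodes with all three flags set, while the ITSR value 0x0000020F from the same dump decodes with the SPLOOP flag clear.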

    Could the processor be in the "Branch to Interrupt, Pipe-Down Sequence" as per 7.13.4 of SPRU732J?

    IFR is 0x00004090, i.e. INTs 4, 7 & 14 are flagged. 4 & 7 are user interrupts that I have connected; I'm not too sure why 14 would be active though.

    IER is 0x000078DB; I'm surprised at how many interrupts are enabled here.
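Reading IFR and IER by eye is error-prone, so here is a minimal sketch of a helper that lists the set bits of a 16-bit interrupt mask. The function name `set_bits` is hypothetical, and the sketch only assumes that bit k of the mask corresponds to interrupt k, as in the reading above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Collect the indices of the set bits in the low 16 bits of an
 * interrupt mask (e.g. IFR or IER), lowest bit first.
 * Returns how many indices were written to out[]. */
static size_t set_bits(uint32_t mask, unsigned out[16])
{
    size_t count = 0;
    assert(out != NULL);
    for (unsigned bit = 0; bit < 16; ++bit) {
        if (mask & (1u << bit)) {
            out[count++] = bit;
        }
    }
    return count;
}
```

Applied to the dump, 0x00004090 gives bits 4, 7 and 14, and 0x000078DB gives bits 0, 1, 3, 4, 6, 7, 11, 12, 13 and 14.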

    Paul

     

  • OK, the issue lies in our C code.

    It is possible for s to be an illegal value (9) when we enter this loop:

        while(n > 0)
        {
            s = survivor[s][n];
            VE_Iest[n-1] = VE_Symbols[s][0];
            n--;
        }

    we have

    int survivor[8][148];

    int VE_Symbols[8][2];

    The initial read with s = 9 will almost certainly be from valid memory, but of course it allows the following s to be any 32-bit signed value.  Typically it is 0, so we don't see a problem, but it may be 0x00010001 or 0x00030003 etc., depending on what has been written into the location that has been accessed.  Once it gets that wrong we can be reading from illegal memory locations, or reading even wilder values.  I'm assuming that my "unrelated", "trivial" changes may have altered the memory mapping and/or timing so that we read "nasty" values rather than more benign ones.
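To make the out-of-range read concrete, the sketch below computes where survivor[s][n] would land relative to the start of the array, using arithmetic only (no memory is accessed). The helper names are hypothetical. With a 4-byte int, one row is 148 * 4 = 592 = 0x250 bytes, which appears consistent with the MVK 0x0250 / MPYLI pair in the disassembly.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_STATES = 8, NUM_STEPS = 148 };  /* int survivor[8][148] */

/* Byte offset of survivor[s][n] from the start of the array.
 * Pure arithmetic: s is deliberately allowed to be out of range. */
static size_t survivor_offset(size_t s, size_t n)
{
    assert(n < NUM_STEPS);  /* the column index itself is assumed valid */
    return (s * NUM_STEPS + n) * sizeof(int);
}

/* Non-zero if the offset lies inside the array. */
static int offset_in_bounds(size_t offset)
{
    return offset < (size_t)NUM_STATES * NUM_STEPS * sizeof(int);
}
```

For s = 9 and n = 87 the element index is 9 * 148 + 87 = 1419, past the last valid index of 8 * 148 - 1 = 1183, so the read comes from whatever the linker placed after the array.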

    The underlying cause was the unexpected condition of all-zero data being fed into the function. Taking a belt-and-braces approach, I've fixed my code to a) avoid that condition, and b) handle it should it ever arise again, i.e. s on entry must now be a valid value.
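A guard of the kind described in b) might look like the sketch below. It is not the actual fix from the application: the function name `traceback`, its signature and the error return are assumptions made for illustration. It validates s on entry and again after every table read, so a bad value can never be used as an index.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_STATES = 8, NUM_STEPS = 148 };

/* Non-zero if s is a legal row index for survivor[][] and symbols[][]. */
static int state_is_valid(int s)
{
    return s >= 0 && s < NUM_STATES;
}

/* Guarded traceback (hypothetical wrapper around the loop in question).
 * iest must have room for at least n elements.
 * Returns 0 on success, -1 if n or s is ever out of range. */
static int traceback(int survivor[NUM_STATES][NUM_STEPS],
                     int symbols[NUM_STATES][2],
                     int *iest, int s, int n)
{
    assert(survivor != NULL && symbols != NULL && iest != NULL);
    if (n < 0 || n >= NUM_STEPS || !state_is_valid(s)) {
        return -1;              /* reject bad input before any indexing */
    }
    while (n > 0) {
        s = survivor[s][n];
        if (!state_is_valid(s)) {
            return -1;          /* corrupted table entry: stop here */
        }
        iest[n - 1] = symbols[s][0];
        n--;
    }
    return 0;
}
```

The extra comparison per iteration may cost some of the software pipelining benefit; if that matters, validating s once on entry and guaranteeing elsewhere that survivor[][] only ever holds values 0 to 7 would be an alternative.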

    Thanks for your assistance with software pipelined loops. Although it wasn't really relevant to the problem, I feel I've learnt something.

    Paul

  • Hi Paul,

    Glad to hear you resolved the issue.
    Thanks for sharing the solution with the community.