Compiler/OMAPL138B-EP: Floating Point conversion crashes system

Part Number: OMAPL138B-EP
Other Parts Discussed in Thread: OMAPL138, SYSBIOS

Tool/software: TI C/C++ Compiler

Hello all,

I am working on the OMAPL138 DSP in a system with multiple tasks, all of which use single-precision floating-point calculations.

I need to add the following code in the main() task:

typedef struct
{
       float32 InputVal;
       int32   Position;
} _Sample_t;

typedef struct /*union*/
{
       _Sample_t Sample[4096];
       double    Val[4096];
} Buffer_t;

static Buffer_t Buffer[3];

[…]

// main Task function

float32 InputVal;
int32 Position;
usign32 i;

for (i = 0; i < 4096; i++)
{
    InputVal = Buffer[AxisID].Sample[i].InputVal;
    Position = Buffer[AxisID].Sample[i].Position;

    _disable_interrupts();
    Buffer[AxisID].Val[i] = InputVal;
    _enable_interrupts();
}

This code works only with the interrupt lock as shown. When I remove the lock, the calculations in a fast task are disturbed.

I was not able to find out what exactly happens. I also failed to create a simpler test case that shows the problem.

Thanks for any help.

Alexander

e2e.ti.com/.../272989

  • Please describe your setup in a little more detail. Are you using an RTOS, or is this bare-metal/no-OS code? What interrupt sources are enabled in the system? How does your code save and restore context when an interrupt occurs? What are the compiler settings?

    Regards,

    Rahul

  • I'm sorry.

    We are using the DSP without an OS.

    The compiler settings are:

    "C:/DevTools/TI/CCS/c6000_7.4.20/bin/cl6x" -mv6740 --abi=eabi -O3 -ms3 -g --optimize_with_debug=on --disable_push_pop --include_path="C:/DevTools/TI/CCS/c6000_7.4.20/include" --include_path="C:/pro/1400/soft/00c/work/OMAP_DSP/include" --include_path="C:/pro/1400/soft/00c/work/common/include" --include_path="C:/pro/1400/soft/00c/work/common/include/ti/pspiom/cslr" --gcc --define=OMAPL138 --define=OMAPL138_DSP --define=PARA_CPU_DSP --diag_warning=225 --diag_error=225 --diag_error=145 --display_error_number --mem_model:data=far --opt_for_speed=2 --printf_support=full --output_all_syms --preproc_with_compile --preproc_dependency="main.pp" --obj_directory="MotionOne/Encoder_(ENC)" "main.c"

    The interrupt source of the fastest task is GPIO; the others are triggered via software interrupt.

    intcVectorTable:

    _vector0:  VEC_ENTRY _c_int00         ; RESET
    _vector1:  VEC_ENTRY DSP_EXCEPTION    ; NMI
    _vector2:  VEC_ENTRY vec_dummy        ; DSP non-Maskable INT2 : Empty
    _vector3:  VEC_ENTRY vec_dummy        ; DSP non-Maskable INT3 : Empty
    _vector4:  VEC_ENTRY DSP_MPU_IRQ      ; DSP Maskable INT4 : Empty
    _vector5:  VEC_ENTRY SYNC_isr         ; DSP Maskable INT5 : Sthe DMA process
    _vector6:  VEC_ENTRY EDMA3_INT_isr    ; DSP Maskable INT6 : End-of-DMA-Interrupt
    _vector7:  VEC_ENTRY EDMA_CONTROL_isr ; DSP Maskable INT7 : called by Software interrupt ISR = ...
                                          ; this task is disturbed
    _vector8:  VEC_ENTRY vec_dummy        ; DSP Maskable INT8 : Empty
    _vector9:  VEC_ENTRY vec_dummy        ; DSP Maskable INT9 : Empty
    _vector10: VEC_ENTRY DSP_1ms_Handler  ; DSP Maskable INT10: Empty
    _vector11: VEC_ENTRY DSP_4ms_Handler  ; DSP Maskable INT11: Empty
    _vector12: VEC_ENTRY DSP_10ms_Handler ; DSP Maskable INT12: Empty
    _vector13: VEC_ENTRY vec_dummy        ; DSP Maskable INT13: Empty
    _vector14: VEC_ENTRY vec_dummy        ; DSP Maskable INT14: Empty
    _vector15: VEC_ENTRY vec_dummy        ; DSP Maskable INT15: Empty

     

    We did nothing special to save the context. The critical task is:

     

    #pragma CODE_SECTION( EDMA_CONTROL_isr, "<L2 RAM>")
    interrupt void EDMA_CONTROL_isr (void)
    {
        usign32 IRP_loc;
        usign32 ITSR_loc;
        usign32 IER_loc;

        IRP_loc = IRP;
        ITSR_loc = ITSR;
        IER_loc = IER;
        IER = IER_loc & (ISR_CONTROL - 1);
        _enable_interrupts();

        [...]

        _disable_interrupts();
        IRP = IRP_loc;
        ITSR = ITSR_loc;
        IER = IER_loc;

        // clear Interrupt flag
        ICR = ISR_CONTROL;
    }

     

    best regards

    Alexander

     

  • Alexander,

    With the information provided, it is difficult to pinpoint what may be happening here. Also, there is no clear indication that the issue is directly related to the compiler. One piece of general advice: keep your ISR as short as possible to avoid missing interrupts. You mention doing some calculations in the "fast task"; do you mean in the ISR? If so, I would recommend not doing any time-consuming processing inside the ISR, but moving it to the main task.

    Without a reproducible test case it is difficult for us to provide a more precise answer, but I will forward this to the device experts to see if they have anything else to add here.

  • Alexander,

    I agree with Aarti's comments, though will offer a few other thoughts:

    1. I was a little confused by your usage of "task" and "software interrupt".  You mentioned you were running bare metal, but those are RTOS terms.  Do you have your own proprietary RTOS that you're running?  If so, does each task have its own independent stack?

    2. Your interrupt vector table shows a VEC_ENTRY for _c_int00.  I don't know all the details of how you're booting, but if that vector actually gets executed as part of the startup process it could be problematic.  In particular, the VEC_ENTRY definition pushes and pops a register surrounding the function call.  That's fine at run-time, but for startup before the stack pointer has been initialized you end up writing and then reading from an undefined and often random memory location.  I've seen many an issue of systems not starting up reliably that have been traced to that specific mistake.

    3. Where are you allocating the structures/arrays?  In particular, given that the size is 4k doubles, that's a big chunk of memory.  If you declared it as a local variable somewhere that can quickly overrun the stack and cause the strange type of issues you're reporting.
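To put a number on the memory footprint raised in point 3, here is a minimal host-side sketch that reproduces the structures from the first post and computes their sizes. The `float32` and `int32` typedefs are assumed stand-ins (the thread does not show the project's own definitions), and `Sample_t` is used instead of `_Sample_t` only to stay clear of reserved identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the project's own typedefs (not shown in the thread). */
typedef float   float32;
typedef int32_t int32;

#define SAMPLE_COUNT 4096u
#define AXIS_COUNT   3u

typedef struct {
    float32 InputVal;
    int32   Position;
} Sample_t;

typedef struct {
    Sample_t Sample[SAMPLE_COUNT];
    double   Val[SAMPLE_COUNT];
} Buffer_t;

/* Size in bytes of one buffer (struct variant, as posted). */
size_t buffer_bytes(void)
{
    /* The figures quoted below assume 4-byte float and 8-byte double. */
    assert(sizeof(float32) == 4 && sizeof(double) == 8);
    return sizeof(Buffer_t);
}

/* Size in bytes of the whole Buffer[3] array. */
size_t all_buffers_bytes(void)
{
    return AXIS_COUNT * buffer_bytes();
}
```

Under those size assumptions, one `Buffer_t` is 64 KiB and the three-element array is 192 KiB, which is why such an object would overrun a typical stack if it were declared as a local variable.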

    I encourage you to consider using sysbios for your development.  It provides a very well tested RTOS kernel for your development and can help avoid a lot of issues.

    Best regards,
    Brad

  • Hi Alexander

    Were you able to make progress on this with the guidance provided?

    I am marking this thread as resolved since we have not heard back from you. Please reply as needed to reopen the thread.

    Regards

    Mukul 

  • Hi Brad,

    Sorry for the misunderstanding.

    By "task" I mean an interrupt routine whose interrupt is triggered cyclically.

    "Software interrupt" means ISR = DSPINTC_IST_INTxx.

    The buffer is a global variable in DDR RAM.

    I can't tell exactly how the startup process works. However, the whole system is well balanced and has been working for years now.

    Floating-point arithmetic is used mainly in the fastest task. We rarely use double (64-bit) floating point. My problem is not system stability. I simply get wrong float (32-bit) computations in the fastest task when I use the 32-bit to 64-bit conversion in the slow task without the interrupt lock.

    What is special about 64-bit arithmetic? Are there floating-point registers that I need to save and restore? (However, in that case I would expect errors in the task that was interrupted.)

    Do I have to put the interrupt lock around every 64-bit computation?

    Thanks

    Alexander

  • Alexander Baehr said:
    What is special about 64-bit arithmetic? Are there floating-point registers that I need to save and restore? (However, in that case I would expect errors in the task that was interrupted.)

    The sysbios dispatcher preserves the GPLYA and GPLYB registers.  I suspect these are your missing registers, i.e. you need to explicitly preserve them in your ISR.

  • Sorry, just realized I misspelled the register names. Should be GPLYA and GPLYB.

  • Hi Brad,

    I saved and restored GPLYA and GPLYB in the interrupt function. In addition, I initialized both to zero after saving. This did not change the problem.

    Looking at the C6000 manual, those registers seem to be used for some integer polynomial calculation...?

    Alexander

  • In short, most control registers are not preserved by the interrupt keyword, so the multiplication routine must be modifying one or more of them.  My next best guesses would be FADCR, FAUCR and FMCR.  Another one might be AMR.

  • Alexander Baehr said:

    _disable_interrupts();
    Buffer[AxisID].Val[i] = InputVal;
    _enable_interrupts();
    }

    Alexander,

    I was looking at your original problem description again.  Can you provide the corresponding disassembly from this code above?  I would expect this to basically consist of a load, SPDP, store...  There shouldn't be any manipulation of floating point control registers or anything like that.  This operation seems so simple that I can hardly imagine what could be going wrong!  How big is your stack?  I suspect there's something more fundamental going wrong such as a stack overflow, or possibly incorrect procedure of re-enabling interrupts in the first place to allow for nested interrupts.

  • Hi there,

    GPLYA and GPLYB are registers that store the generator polynomials used by the Galois multiplication intrinsic _gmpy(). To my knowledge, these registers are not used for any other purpose, so saving and restoring them might be overkill.

    As for register context save/restore, I suggest skimming through clause 7.6.2 of the C6000 compiler user guide. It says that for an ISR written in C/C++ and marked with the 'interrupt' keyword, the compiler takes care of saving and restoring any register used within the ISR. So you may be hunting for a bug where there is none.

  • Hi Brad,

    You were right, the cause is 'more fundamental'.

    The assembly code for the loop without the interrupt lock is as follows. The compiler places a DINT/RINT around the whole loop, which is too long for our system. What is the reason for this?

    c03991b4:   01400429            MVK.S1        0xffff8008,A2
    c03991b8:   0095EF8A ||         SET.S2        B5,15,15,B1
    c03991bc:   E1C080E0            .fphead       n, l, W, BU, br, nosat, 0001110
    c03991c0:   69A2                SET.S1        A3,11,11,A3
    c03991c2:   30D1     ||         ADD.L2X       B1,A1,B5
    c03991c4:   10004000 ||         DINT         
    c03991c8:   01604869            MVKH.S1       0xc0900000,A2
    c03991cc:   ED80     ||         ADD.L1        A3,-1,A0
    c03991ce:   4250                ADD.L1        A2,A4,A5
    c03991d0:   021111A0 ||         ADD.S1X       8,B4,A4

    c03991d4:   001096E6            LDW.D2T2      *B4++[4],B0
    c03991d8:   000000A2            SPDP.S2       B0,B1:B0
    c03991dc:   E1300083            .fphead       p, l, W, BU, nobr, nosat, 0001001
    c03991e0:   0C6E                NOP           1
    c03991e2:   3C85                STDW.D2T2     B1:B0,*B5++[2]
    c03991e4:   01909664            LDW.D1T1      *A4++[4],A3
    c03991e8:   00000000            NOP          
    c03991ec:   C07FB020     [ A0]  BDEC.S1       0xC03991DA (PC-6 = 0xc03991da),A0
    c03991f0:   00002000            NOP           2
    c03991f4:   030C00A0            SPDP.S1       A3,A7:A6
    c03991f8:   0C6E                NOP           1
    c03991fa:   2CE4                STDW.D1T1     A7:A6,*A5++[2]
    c03991fc:   E8240000            .fphead       n, l, DW/NDW, W, nobr, nosat, 1000001

    c0399200:   0014A35B            MVK.L2        5,B0
    c0399204:   10006000 ||         RINT         

    With the hand-written interrupt lock, the overall interrupt lock is gone:

    c0399586:   1010                ADD.L1X       A0,B0,A1
    c0399588:   2C8C     ||         LDW.D1T1      *A5++[2],A0
    c039958a:   0093     ||         MVK.S2        0,B1
    c039958c:   E8A3                SET.S2        B1,15,15,B1
    c039958e:   0013                MVK.S2        0,B0
    c0399590:   30C0                ADD.L1X       A1,B1,A4
    c0399592:   8823     ||         SET.S2        B0,12,12,B0

    c0399594:   10004000            DINT         
    c0399598:   030000A1            SPDP.S1       A0,A7:A6
    c039959c:   E3E08118            .fphead       n, l, W, BU, br, nosat, 0011111
    c03995a0:   018403E2 ||         MVC.S2        CSR,B3
    c03995a4:   008C2FDA            OR.L2         1,B3,B1
    c03995a8:   008403A3            MVC.S2        B1,CSR
    c03995ac:   0C64     ||         STDW.D1T1     A7:A6,*A4++[1]
    c03995ae:   EC01                ADD.L2        B0,-1,B0
    c03995b0:   2FFAA121     [ B0]  BNOP.S1        (PC-12 = 0xc0399594),5
    c03995b4:   20145665 ||  [ B0]  LDW.D1T1      *A5++[2],A0
    c03995b8:   3014A35A ||  [!B0]  MVK.L2        5,B0

  • Alexander Baehr said:

    The compiler places a DINT/ RINT around the whole loop,

    There's nothing inherently wrong with the compiler using DINT/RINT surrounding the loop.  The fact that it seems to be problematic makes me suspect that something is incorrect with regard to how you are re-enabling interrupts to allow for nesting.  In addition to using the interrupt keyword for your ISRs, are you also following the steps from the CPU and Instruction Set Guide Section 5.6.2 "Nested Interrupts"?  Can you please post some code snippets for review?  Side note, this is all fully vetted using sysbios.  Why aren't you using sysbios?

    Alexander Baehr said:

    which is too long for our system. What is the reason for this ??

    Please see Section 2.12 "Interrupt Flexibility Options (--interrupt_threshold Option)" of the C6000 Compiler Guide.
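As a rough illustration of that section, the limit can also be set per function with the TI-specific `FUNC_INTERRUPT_THRESHOLD` pragma instead of for the whole build. The sketch below is an assumption-laden example, not code from this thread: `widen_samples` is a hypothetical helper, and the threshold value of 1 (which, as far as the compiler guide describes it, asks for the function to stay always interruptible) should be checked against the guide for the compiler version in use. The pragma is guarded so the file also builds with non-TI compilers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * TI-specific pragma, guarded so the file still builds with other compilers.
 * It must appear before the function is declared or defined.
 * The threshold value 1 is an illustrative assumption.
 */
#ifdef __TI_COMPILER_VERSION__
#pragma FUNC_INTERRUPT_THRESHOLD(widen_samples, 1)
#endif

/* Hypothetical helper: widen n single-precision values to double precision. */
void widen_samples(const float *in, double *out, size_t n)
{
    size_t i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < n; i++) {
        /* Every float value is exactly representable as a double. */
        out[i] = (double)in[i];
    }
}
```

Keeping the long loop in its own function limits the threshold to that loop, so the rest of the file can still be optimized without the interruptibility constraint.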

  • The --interrupt_threshold option solved the problem, thanks.

    Our system is hard real-time, performing multiple interrupts and DMAs in a 125µs cycle. We do not think this is possible with the overhead of an operating system.

  • Is everything working now?  Or are you still trying to solve the original problem?

    Can you share any more details or code snippets regarding how you are enabling pre-emption?  There are a lot of subtle ways to have issues there.  For example, are you modifying IER prior to re-enabling interrupts?  I encourage you to limit pre-emption as much as possible.  For example, if there is one particular interrupt that absolutely must run the moment it occurs, then perhaps that is the one and only interrupt that you leave enabled in the IER prior to re-enabling interrupts.  If you're not touching IER at all, let's talk about that in more detail, because there are several related issues.
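The IER handling discussed here can be sketched in portable C as plain bit arithmetic. On the C6000 CPU, bit n of IER enables INTn and a lower interrupt number has higher priority, so for an ISR on INTn the expression `(1 << n) - 1` selects the enable bits of everything with higher priority. This matches the `IER_loc & (ISR_CONTROL - 1)` line in the ISR posted earlier, under the assumption that `ISR_CONTROL` is the single-bit mask for INT7. The function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Enable bits of all interrupts with a lower number (higher priority)
 * than int_number. IER is 16 bits wide, so int_number must be below 16. */
uint32_t higher_priority_mask(unsigned int_number)
{
    assert(int_number < 16u);
    return (UINT32_C(1) << int_number) - UINT32_C(1);
}

/* IER value to load while the ISR on int_number allows nesting:
 * keep only the already-enabled interrupts of higher priority. */
uint32_t nested_ier(uint32_t saved_ier, unsigned int_number)
{
    return saved_ier & higher_priority_mask(int_number);
}
```

For example, with INT4 to INT7 and INT10 to INT12 enabled (IER = 0x1CF3 including the two low bits), `nested_ier(0x1CF3, 7)` yields 0x73, which leaves only INT4 to INT6 able to preempt.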

  • Everything is fine now. The --interrupt_threshold option solved the problem.

    We re-enable all interrupts that have higher priority than the running task. See my second post for our interrupt handling.

    Alexander

  • For your requirements, I can see how --interrupt_threshold will be very important, but I don't understand how that solved your original issue of failures in casting floating point numbers.  Do you watermark your stack on startup?  I highly recommend doing that so you can quantify your peak stack usage.
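Stack watermarking as suggested here can be sketched as follows. This is a host-side illustration that treats the stack as an ordinary array of words; the function names and the fill value are hypothetical. On the real target the region bounds would come from the linker configuration (not shown in this thread), and the fill has to run before the region is in use, or skip the part that is already occupied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN UINT32_C(0xDEADBEEF)  /* arbitrary marker value */

/* Fill the whole stack region with the marker. Call once at startup. */
void stack_watermark_init(uint32_t *stack_low, size_t words)
{
    size_t i;

    assert(stack_low != NULL);

    for (i = 0; i < words; i++) {
        stack_low[i] = STACK_FILL_PATTERN;
    }
}

/* For a stack that grows toward lower addresses: count the untouched
 * words from the low end. Everything above them has been written at
 * some point, so the remainder is the peak usage in words. */
size_t stack_peak_usage_words(const uint32_t *stack_low, size_t words)
{
    size_t untouched = 0;

    assert(stack_low != NULL);

    while (untouched < words && stack_low[untouched] == STACK_FILL_PATTERN) {
        untouched++;
    }
    return words - untouched;
}
```

A peak usage close to the full region size is a strong hint of an overflow. The scan can miss usage only if the code happened to store the marker value itself at the deepest point, which is unlikely with an arbitrary pattern.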