debugging run-time error on C6746

Other Parts Discussed in Thread: SYSBIOS

Hello,

I'm working through run-time issues that have come up as a result of porting from DSP-BIOS to the latest SYS-BIOS on a C6746.  I keep getting an exception, but I'm still clueless as to why.  If I post the exception dump from ROV below, can someone give me some pointers on what might be wrong, or next steps to try?

Thanks,

Robert

  • The team is notified. They will post their feedback directly here.

    BR
    Tsvetolin Shulev
  • I think this is telling me the exception occurred when it went to run HWI 15, which is the clock ISR?

  • Why is it executing HWI 15, when there is nothing in the dispatch table?

  • Can you instead directly view the CPU registers to look at NRP, EFR, and IERR? The values shown in ROV look bogus to me.

    Or another interpretation might be that so much destruction has already occurred that it has destroyed much of the evidence. For example, NRP shows the location at which the exception occurred and it has the address 0x34323531 which corresponds to an undefined memory range (i.e. you shouldn't ever be there in the first place). The EFR should show what type of exception occurred, and it has all bits set. EFR[29:2] is expected to always be 0, so that is unexpected. Similarly IERR[31:9] should always be 0, but they also show everything set.

    Oftentimes these issues are a cascade of problems that starts with accesses to undefined memory spaces. One way to catch them is to place a data watchpoint on a range of memory addresses. The goal is for the debugger to halt you before so much damage gets done. Then perhaps we can understand what is happening.
  • Here are those registers, at the point of the exception

  • That looks a lot more like I was expecting. The PC was at address 0x11800524 when the exception occurred. The EFR indicates "internal exception", and then IERR decodes to "opcode exception". In other words, the CPU executed an illegal opcode at address 0x11800524. Based on your screenshot of that address, that looks like data, not instructions, so that's why you're getting an exception.

    I'm not sure exactly what would lead to that error. Perhaps you have an improper usage of a Swi handle somewhere? Was there somewhere that you got a compiler warning/error about types and you perhaps cast it to the expected type instead of changing to the proper type? Perhaps you can show how you created your SWI and any associated references to that SWI in your code.
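The register decoding described above can be sketched as a small host-side helper. This is only an illustration: the bit names and positions are assumptions based on the C674x CPU documentation and should be checked against the reference guide for your device, and the function names are made up for this example. The reserved-bit masks follow the ranges quoted earlier in the thread (EFR[29:2] and IERR[31:9] should read as zero).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed EFR flag positions (verify against the CPU reference guide). */
#define EFR_SXF       (UINT32_C(1) << 0)   /* software exception */
#define EFR_IXF       (UINT32_C(1) << 1)   /* internal exception */
#define EFR_EXF       (UINT32_C(1) << 30)  /* external exception */
#define EFR_NXF       (UINT32_C(1) << 31)  /* NMI exception      */
#define EFR_RESERVED  UINT32_C(0x3FFFFFFC) /* EFR[29:2]          */
#define IERR_RESERVED UINT32_C(0xFFFFFE00) /* IERR[31:9]         */

/* Non-zero reserved bits suggest the value is not a real register snapshot. */
static int efr_looks_bogus(uint32_t efr)   { return (efr & EFR_RESERVED) != 0; }
static int ierr_looks_bogus(uint32_t ierr) { return (ierr & IERR_RESERVED) != 0; }

/* Name the highest-priority flag found in EFR (hypothetical helper). */
static const char *efr_cause(uint32_t efr)
{
    if (efr & EFR_NXF) return "NMI exception";
    if (efr & EFR_EXF) return "external exception";
    if (efr & EFR_IXF) return "internal exception";
    if (efr & EFR_SXF) return "software exception";
    return "no exception flagged";
}

/* Name the lowest set cause bit in IERR (assumed bit order, bits 0..8). */
static const char *ierr_cause(uint32_t ierr)
{
    static const char *const names[9] = {
        "instruction fetch exception", "fetch packet exception",
        "execute packet exception",    "opcode exception",
        "resource conflict exception", "resource access exception",
        "privilege exception",         "loop buffer exception",
        "missed stall exception"
    };
    for (size_t bit = 0; bit < 9; ++bit) {
        if (ierr & (UINT32_C(1) << bit)) {
            return names[bit];
        }
    }
    return "no cause flagged";
}
```

With these assumptions, an EFR of 0x00000002 decodes as an internal exception and an IERR of 0x00000008 as an opcode exception, while an all-ones value such as the one ROV displayed trips the reserved-bit check.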
  • One other thought, perhaps have a look at the SWI info in ROV. Maybe there's a clue as to what SWI was most recently called/posted. That might give a clue as to the specific SWI causing an issue.
  • The more I look into this, the more I don't think it is something within an SWI that is causing it, but something else in the system ... either BIOS-related or a different part of my application. I use "Live Session" to see the modules running. It enters one of my SWI's but never leaves before the exception. However, I essentially made that SWI a shell, commenting out everything except a simple counter at the end.
  • Don't know if this helps at all, but I see the SWI posted, but I never get there before the exception (breakpoint at beginning never reached).

  • How many times does that SWI run before it has an issue? So did it crash after hitting that SWI? Perhaps the underlying issue is that the SWI structure is being corrupted. Perhaps we need to look at that.
  • Brad Griffis said:
    How many times does that SWI run before it has an issue? So did it crash after hitting that SWI? Perhaps the underlying issue is that the SWI structure is being corrupted. Perhaps we need to look at that.

    It never gets to the SWI at all.  I'm currently trying to step through some of the BIOS code, to see what throws it off.  I'm not sure that will work, though, since it may be some combination of events that causes the exception, which I'll not see while single-stepping.  It does appear to get through the Swi_post() ok, though.  So probably somewhere between the post and getting to the actual Swi function, something goes wrong.

    Robert

  • Hello,

    The exception occurs on the first call to the SWI associated with it, which is the first of any SWIs to run in the application, so I can set a breakpoint in Swi.c right before it occurs.  Here is the sequence found so far:

    1) after program load, hit F8.  It comes to line 116 in Swi.c/Swi_run(), as shown in the first figure.  The second and third figures show the fxn and arguments, which all make sense.

    2) hit F8.  In this case, it hits the breakpoint at line 114, but not 118, as shown in the fourth figure.  This tells me it re-entered the Swi_run function without ever exiting or moving on from the first entry to it.

    3) hit F8.  It hits the break point at line 116

    4) hit F8, and it goes to exception, without hitting any of the breakpoints at 114, 116, or 118, nor the breakpoint I set at the very beginning of the associated SWI function handler.

    Is this some sort of re-entrancy problem?

    Any ideas for additional steps would be appreciated.

    Robert

  • Can you please post a screenshot of your SWI definition as well as its associated usage (i.e. the "post")?
  • Brad Griffis said:
    Can you please post a screenshot of your SWI definition as well as its associated usage (i.e. the "post")?

    Want to make sure I provide what's requested ... can you provide a few more words what you're looking for in each of those two cases?

    Thanks,

    Robert

  • Think this is what you're looking for:

  • Can you try making the second parameter for the get/setAttrs NULL? It will just leave the function alone in that case. I think your getAttrs may actually be corrupting the function because it's attempting to write the result to the address you provided, which is the function you're running. Simply using NULL for that field in both the getAttrs/setAttrs should fix it.
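To make the out-parameter behaviour concrete, here is a host-side mock. It is not the SYS/BIOS implementation: the `Mock*` types, `mock_getAttrs`, and `handler` are hypothetical stand-ins that only model the idea that the second argument is an address the call writes to, and that NULL skips the write.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t MockUArg;                        /* stand-in for UArg        */
typedef void (*MockFuncPtr)(MockUArg, MockUArg);   /* stand-in for Swi_FuncPtr */

typedef struct { MockUArg arg0, arg1; } MockParams;
typedef struct { MockFuncPtr fxn; MockUArg arg0, arg1; } MockSwi;

/* fxn is an OUT parameter: when it is non-NULL, the object's function
 * pointer is stored at the address it points to. */
static void mock_getAttrs(const MockSwi *swi, MockFuncPtr *fxn, MockParams *params)
{
    if (fxn != NULL) {
        *fxn = swi->fxn;
    }
    params->arg0 = swi->arg0;
    params->arg1 = swi->arg1;
}

static void handler(MockUArg a0, MockUArg a1) { (void)a0; (void)a1; }

/* Safe: the write lands in a local variable of the right type. */
static MockFuncPtr read_fxn_into_local(const MockSwi *swi)
{
    MockFuncPtr fxn = NULL;
    MockParams params;
    mock_getAttrs(swi, &fxn, &params);
    return fxn;
}

/* Safe: NULL means "do not report the function", so nothing is written. */
static MockUArg read_arg0_only(const MockSwi *swi)
{
    MockParams params;
    mock_getAttrs(swi, NULL, &params);
    return params.arg0;
}

/* Unsafe, shown only as a comment and never to be run:
 *     mock_getAttrs(swi, (MockFuncPtr *)handler, &params);
 * The cast makes the call store a pointer-sized value at the address of
 * handler's own code, overwriting the first instructions of the function. */
```

In this model, overwriting the start of the handler with a data word would be consistent with the opcode exception reported earlier, though the mock cannot confirm what the real kernel does internally.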

  • Here is the associated config.  

    var Swi = xdc.useModule( 'ti.sysbios.knl.Swi' );

    Swi.numPriorities = 9;

    var swi_params_edma_acq_left = new Swi.Params();
    swi_params_edma_acq_left.priority = 7;
    swi_params_edma_acq_left.instance.name = "swi_edma_acq_left";
    Program.global.SWI_edma_acq_left = Swi.create( '&swi_fxn_edma_acq_left', swi_params_edma_acq_left );

    Funny thing happened, though.  I changed some things about diags in the config file, and the exception doesn't appear to occur anymore.

    I made this change ... see commented and un-commented lines

    //Main.common$.diags_INFO = Diags.ALWAYS_ON;
    //Main.common$.diags_ANALYSIS = Diags.ALWAYS_OFF;
    //Main.common$.diags_STATUS = Diags.ALWAYS_OFF;
    //Main.common$.diags_ENTRY = Diags.ALWAYS_OFF;
    //Main.common$.diags_EXIT = Diags.ALWAYS_OFF;
    //Main.common$.diags_LIFECYCLE = Diags.ALWAYS_OFF;
    //Main.common$.diags_INTERNAL = Diags.ALWAYS_OFF;
    //Main.common$.diags_ASSERT = Diags.ALWAYS_OFF;
    //Main.common$.diags_USER1 = Diags.ALWAYS_OFF;
    //Main.common$.diags_USER2 = Diags.ALWAYS_OFF;
    //Main.common$.diags_USER3 = Diags.ALWAYS_OFF;
    //Main.common$.diags_USER4 = Diags.ALWAYS_OFF;
    //Main.common$.diags_USER5 = Diags.ALWAYS_OFF;
    //Main.common$.diags_USER6 = Diags.ALWAYS_OFF;

    Main.common$.diags_INFO = Diags.ALWAYS_ON;
    Main.common$.diags_ANALYSIS = Diags.ALWAYS_ON;
    Main.common$.diags_STATUS = Diags.ALWAYS_ON;
    Main.common$.diags_ENTRY = Diags.ALWAYS_ON;
    Main.common$.diags_EXIT = Diags.ALWAYS_ON;
    Main.common$.diags_LIFECYCLE = Diags.ALWAYS_ON;
    Main.common$.diags_INTERNAL = Diags.ALWAYS_ON;
    Main.common$.diags_ASSERT = Diags.ALWAYS_ON;
    Main.common$.diags_USER1 = Diags.ALWAYS_ON;
    Main.common$.diags_USER2 = Diags.ALWAYS_ON;
    Main.common$.diags_USER3 = Diags.ALWAYS_ON;
    Main.common$.diags_USER4 = Diags.ALWAYS_ON;
    Main.common$.diags_USER5 = Diags.ALWAYS_ON;
    Main.common$.diags_USER6 = Diags.ALWAYS_ON;

    and 

    //Defaults.common$.diags_INFO = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_ANALYSIS = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_STATUS = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_ENTRY = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_EXIT = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_LIFECYCLE = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_INTERNAL = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_ASSERT = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_USER1 = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_USER2 = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_USER3 = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_USER4 = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_USER5 = Diags.ALWAYS_OFF;
    //Defaults.common$.diags_USER6 = Diags.ALWAYS_OFF;

    Defaults.common$.diags_INFO = Diags.ALWAYS_ON;
    Defaults.common$.diags_ANALYSIS = Diags.ALWAYS_ON;
    Defaults.common$.diags_STATUS = Diags.ALWAYS_ON;
    Defaults.common$.diags_ENTRY = Diags.ALWAYS_ON;
    Defaults.common$.diags_EXIT = Diags.ALWAYS_ON;
    Defaults.common$.diags_LIFECYCLE = Diags.ALWAYS_ON;
    Defaults.common$.diags_INTERNAL = Diags.ALWAYS_ON;
    Defaults.common$.diags_ASSERT = Diags.ALWAYS_ON;
    Defaults.common$.diags_USER1 = Diags.ALWAYS_ON;
    Defaults.common$.diags_USER2 = Diags.ALWAYS_ON;
    Defaults.common$.diags_USER3 = Diags.ALWAYS_ON;
    Defaults.common$.diags_USER4 = Diags.ALWAYS_ON;
    Defaults.common$.diags_USER5 = Diags.ALWAYS_ON;
    Defaults.common$.diags_USER6 = Diags.ALWAYS_ON;

    I was trying to keep control over diag related overhead, so turned off everything except Main info.  But turning them all on seems to have changed something, so the exception doesn't occur anymore, and I can see the SWI being entered and exited.  

    Robert

  • Whittling away, to see which is the issue ... I can set the Main diags to all off, except info, like I had originally. But not Defaults. Will try each one of them now.
  • ASSERT ... can't be turned off in Defaults. The below worked (no exception, application appears to be running), which is all that I had before for diags, both Main and Defaults, except that Defaults ASSERT is turned on

    Defaults.common$.diags_INFO = Diags.ALWAYS_OFF;
    Defaults.common$.diags_ANALYSIS = Diags.ALWAYS_OFF;
    Defaults.common$.diags_STATUS = Diags.ALWAYS_OFF;
    Defaults.common$.diags_ENTRY = Diags.ALWAYS_OFF;
    Defaults.common$.diags_EXIT = Diags.ALWAYS_OFF;
    Defaults.common$.diags_LIFECYCLE = Diags.ALWAYS_OFF;
    Defaults.common$.diags_INTERNAL = Diags.ALWAYS_OFF;
    Defaults.common$.diags_ASSERT = Diags.ALWAYS_ON;
    Defaults.common$.diags_USER1 = Diags.ALWAYS_OFF;
    Defaults.common$.diags_USER2 = Diags.ALWAYS_OFF;
    Defaults.common$.diags_USER3 = Diags.ALWAYS_OFF;
    Defaults.common$.diags_USER4 = Diags.ALWAYS_OFF;
    Defaults.common$.diags_USER5 = Diags.ALWAYS_OFF;
    Defaults.common$.diags_USER6 = Diags.ALWAYS_OFF;
  • Hi Robert,

    I think Brad nailed it. Can you try either of the following (changes in bold)?

    void swi_fxn_edma_acq_left(UArg arg0, UArg arg1)
    {
        blah blah;
    }

    bool_t swi_edma_acq_left_notched(int32_t *buffer_rx_ping, int32_t *buffer_tx_ping)
    {
        Swi_Params swi_params_edma_acq_left;
        Swi_FuncPtr swiFxn;

        Swi_getAttrs(SWI_edma_acq_left, &swiFxn, &swi_params_edma_acq_left);
        swi_params_edma_acq_left.arg0 = (Arg)buffer_rx_ping;
        swi_params_edma_acq_left.arg1 = (Arg)buffer_tx_ping;
        Swi_setAttrs(SWI_edma_acq_left, swiFxn, &swi_params_edma_acq_left);

        ...

    or even better the following since you are not changing the function.

    bool_t swi_edma_acq_left_notched(int32_t *buffer_rx_ping, int32_t *buffer_tx_ping)
    {
        Swi_Params swi_params_edma_acq_left;

        Swi_getAttrs(SWI_edma_acq_left, NULL, &swi_params_edma_acq_left);
        swi_params_edma_acq_left.arg0 = (Arg)buffer_rx_ping;
        swi_params_edma_acq_left.arg1 = (Arg)buffer_tx_ping;
        Swi_setAttrs(SWI_edma_acq_left, NULL, &swi_params_edma_acq_left);

        ...

    Additionally, I'd make the Swi_Params variable a local instead of a global. Your implementation is not currently thread-safe. Could multiple threads be calling this? If so, a higher priority thread could preempt the lower priority in the middle of the global variable update and mayhem would occur. You may not be having the problem now, but it's rather subtle and if the design changed in the future, it could be an issue then.

    I think changing the Diags setting is just moving memory around and making the corruption non-catastrophic. If you make the above change, can you play with the Diags settings to verify that assumption?

    Todd

  • ToddMullanix said:

    Additionally, I'd make the Swi_Params variable a local instead of a global. 

    The reason I didn't is that in the implementations you showed, the params are never init'd.  I have an init procedure, before any Swi runs, that comes through and calls Swi_Params_init() on those global Swi_Params, so they're in some known state before everything starts up.  Important, no?  I guess I can just do the init each time before the Swi_post, along with my associated update of the params args.  That'd certainly be worth it to make things more thread-safe.

    Robert

  • ToddMullanix said:

    bool_t swi_edma_acq_left_notched(int32_t *buffer_rx_ping, int32_t *buffer_tx_ping)
    {
        Swi_Params swi_params_edma_acq_left;
        Swi_FuncPtr swiFxn;

        Swi_getAttrs(SWI_edma_acq_left, &swiFxn, &swi_params_edma_acq_left);
        swi_params_edma_acq_left.arg0 = (Arg)buffer_rx_ping;
        swi_params_edma_acq_left.arg1 = (Arg)buffer_tx_ping;
        Swi_setAttrs(SWI_edma_acq_left, swiFxn, &swi_params_edma_acq_left);

        ...

    or even better the following since you are not changing the function.

    bool_t swi_edma_acq_left_notched(int32_t *buffer_rx_ping, int32_t *buffer_tx_ping)
    {
        Swi_Params swi_params_edma_acq_left;

        Swi_getAttrs(SWI_edma_acq_left, NULL, &swi_params_edma_acq_left);
        swi_params_edma_acq_left.arg0 = (Arg)buffer_rx_ping;
        swi_params_edma_acq_left.arg1 = (Arg)buffer_tx_ping;
        Swi_setAttrs(SWI_edma_acq_left, NULL, &swi_params_edma_acq_left);

    In these examples, how does the Swi_post() know which function to call?  Or is it because that function is already tied to the Swi handle via the config file settings?  I guess I would wonder why the function pointer is even required then, in the API, but probably something/use case I'm overlooking.

    Thanks,

    Robert

  • Good point on the Params_init. The Swi_getAttrs API currently sets the following fields in the supplied params structure: arg0, arg1, priority, and trigger. The Swi_setAttrs API currently sets only those same fields. However, we don't document that, so it could change in the future. So to be safe, call Swi_Params_init first. So it's

    bool_t swi_edma_acq_left_notched(int32_t *buffer_rx_ping, int32_t *buffer_tx_ping)
    {
        Swi_Params swi_params_edma_acq_left;

        Swi_Params_init(&swi_params_edma_acq_left);
        Swi_getAttrs(SWI_edma_acq_left, NULL, &swi_params_edma_acq_left);
        swi_params_edma_acq_left.arg0 = (Arg)buffer_rx_ping;
        swi_params_edma_acq_left.arg1 = (Arg)buffer_tx_ping;
        Swi_setAttrs(SWI_edma_acq_left, NULL, &swi_params_edma_acq_left);

        ...

    Fyi...your temporary "work of art" comment gave me a good chuckle!

    Todd
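The reason for initializing a local params structure before reading the attributes can be shown with a host-side mock. Everything here is hypothetical (`MockParams`, `mock_Params_init`, `mock_getAttrs`, and the `other` field are invented for illustration); the only behaviour taken from the thread is that the get call fills in arg0, arg1, priority, and trigger, and nothing else.

```c
#include <stdint.h>

typedef uintptr_t MockUArg;      /* stand-in for UArg */

#define MOCK_OTHER_DEFAULT 42    /* arbitrary default for the extra field */

typedef struct {
    MockUArg arg0, arg1;
    int      priority;
    unsigned trigger;
    int      other;   /* hypothetical field that the get call never touches */
} MockParams;

typedef struct {
    MockUArg arg0, arg1;
    int      priority;
    unsigned trigger;
} MockSwi;

/* Puts every field, including ones the get call skips, into a known state. */
static void mock_Params_init(MockParams *p)
{
    p->arg0 = 0;
    p->arg1 = 0;
    p->priority = -1;
    p->trigger = 0;
    p->other = MOCK_OTHER_DEFAULT;
}

/* Mirrors the behaviour described above: only four fields are filled in. */
static void mock_getAttrs(const MockSwi *swi, MockParams *p)
{
    p->arg0 = swi->arg0;
    p->arg1 = swi->arg1;
    p->priority = swi->priority;
    p->trigger = swi->trigger;
}

/* Local params, initialized first, then read, then updated for this post. */
static MockParams params_for_post(const MockSwi *swi, MockUArg rx, MockUArg tx)
{
    MockParams p;
    mock_Params_init(&p);      /* without this, p.other would be indeterminate */
    mock_getAttrs(swi, &p);
    p.arg0 = rx;
    p.arg1 = tx;
    return p;
}
```

Because the structure lives on the caller's stack, two contexts calling `params_for_post` at different priorities never share state, which is the thread-safety point raised earlier.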

  • Some people like to change the Swi function (not a common use case though). The Handle points to a Swi_Object structure. This structure maintains the function pointer, args, etc. for each Swi instance. So the getAttrs is reading from the object and the setAttrs is setting the object fields.

    Please note the following when using Swi_setAttrs: Swi_setAttrs() must not be used on a Swi that is preempted or is ready to run.

    Todd
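As a rough model of the object described above, the mock below keeps the function pointer and arguments inside the instance, lets a NULL function leave the bound function alone, and dispatches through the stored pointer on post. It is a sketch only: the names are hypothetical, the setter takes the two arguments directly instead of a params structure for brevity, and the real kernel schedules the Swi by priority instead of calling it immediately.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t MockUArg;                        /* stand-in for UArg        */
typedef void (*MockFuncPtr)(MockUArg, MockUArg);   /* stand-in for Swi_FuncPtr */

/* Per-instance state: this is what the handle points to in the model. */
typedef struct {
    MockFuncPtr fxn;
    MockUArg    arg0, arg1;
} MockSwi;

/* NULL for fxn keeps the function bound at creation; the args are replaced. */
static void mock_setAttrs(MockSwi *swi, MockFuncPtr fxn, MockUArg a0, MockUArg a1)
{
    if (fxn != NULL) {
        swi->fxn = fxn;
    }
    swi->arg0 = a0;
    swi->arg1 = a1;
}

/* The mock runs the handler right away, using whatever the object holds. */
static void mock_post(const MockSwi *swi)
{
    swi->fxn(swi->arg0, swi->arg1);
}

/* Example handler: records the sum of its arguments so a caller can check it. */
static MockUArg last_sum;
static void sum_handler(MockUArg a0, MockUArg a1)
{
    last_sum = a0 + a1;
}
```

In this model the post never needs to be told which function to run, because the binding was made when the instance was created, just as the cfg file does for the real Swi.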
  • ToddMullanix said:

    Good point on the Params_init. The Swi_getAttrs API currently sets the following fields in the supplied params structure: arg0, arg1, priority, and trigger. The Swi_setAttrs API currently sets only those same fields. However, we don't document that, so it could change in the future. So to be safe, call Swi_Params_init first. So it's

    bool_t swi_edma_acq_left_notched(int32_t *buffer_rx_ping, int32_t *buffer_tx_ping)
    {
        Swi_Params swi_params_edma_acq_left;

        Swi_Params_init(&swi_params_edma_acq_left);
        Swi_getAttrs(SWI_edma_acq_left, NULL, &swi_params_edma_acq_left);
        swi_params_edma_acq_left.arg0 = (Arg)buffer_rx_ping;
        swi_params_edma_acq_left.arg1 = (Arg)buffer_tx_ping;
        Swi_setAttrs(SWI_edma_acq_left, NULL, &swi_params_edma_acq_left);

        ...

    Ok, will do.  I actually added in some more of my application Swi's (with similar get/set Attrs on globals), and started getting an exception again.  So I'll go through and change them all accordingly, and see if the exception goes away.

    ToddMullanix said:

    Fyi...your temporary "work of art" comment gave me a good chuckle!

    Todd

    Edit button - good for tidying up posts, or removing extraneous comments after the fact  :)

  • ToddMullanix said:


    Please note the following when using Swi_setAttrs: Swi_setAttrs() must not be used on a Swi that is preempted or is ready to run.

    Todd

    Ok, noted.  I should be safe, since the Swi_setAttrs() is not called from an Swi ... it's a Hwi handler.

    Thanks,

    Robert

  • All SWI's running so far, without crash, after making these changes. We'll call it resolved.

    Thanks,
    Robert
  • Just so I'm clear -- the diags had nothing to do with this right? To answer your earlier question, the function that runs due to the Swi_post was configured at build time as part of your cfg file. That's why I was suggesting to simply use NULL for that associated parameter and leave it alone. Was that what you used for a solution? Or did you do it along the lines of what Todd suggested?
  • Brad Griffis said:
    Just so I'm clear -- the diags had nothing to do with this right? To answer your earlier question, the function that runs due to the Swi_post was configured at build time as part of your cfg file. That's why I was suggesting to simply use NULL for that associated parameter and leave it alone. Was that what you used for a solution? Or did you do it along the lines of what Todd suggested?

    Sorry, just realized I missed your earlier reply suggesting the NULL.  Ok, yeah thought it was the config linkage. 

    I was taking my cues from these pages, where it indicated the need for the function in both Attr calls, first as a pointer in the getAttrs:

    and without pointer for the setAttrs:

    Unless I'm overlooking something, someone else referencing these online APIs might get tripped up by the same thing?

    Diags ended up having nothing to do with it.  After making these final changes, I put everything back, so that ASSERT was turned off in Defaults, and still no exception.

    Here's the final implementation I went with, which is what Todd had last suggested, including the NULL you pointed out; plus the update to include the params init

    Swi_Params t_swi_params;

    Swi_Params_init( &t_swi_params );
    Swi_getAttrs( SWI_edma_acq_left, NULL, &t_swi_params );
    t_swi_params.arg0 = (Arg)buffer_rx_ping;
    t_swi_params.arg1 = (Arg)buffer_tx_ping;
    Swi_setAttrs( SWI_edma_acq_left, NULL, &t_swi_params );

    Swi_post( SWI_edma_acq_left );

    Thanks,

    Robert

  • Thanks for confirming that the diags were unrelated. That had me a bit confused, so now it's making sense.

    The documentation is correct. That's what I used to diagnose your issue. The second argument of Swi_getAttrs is supposed to be of type Swi_FuncPtr*. In other words, it's a pointer to a type Swi_FuncPtr. You needed to do something like Todd showed:

    Swi_FuncPtr swiFxn;

    Swi_getAttrs(SWI_edma_acq_left, &swiFxn, &swi_params_edma_acq_left);

    I imagine the compiler tried to warn you, but there was an improper cast in your getAttrs call that was masking the error. The other solution (also noted in the documentation) is to pass NULL for those parameters, which avoids the issue altogether.

    Brad
  • Brad Griffis said:
    Thanks for confirming that the diags were unrelated. That had me a bit confused, so now it's making sense.

    The documentation is correct. That's what I used to diagnose your issue. The second argument of Swi_getAttrs is supposed to be of type Swi_FuncPtr*. In other words, it's a pointer to a type Swi_FuncPtr. You needed to do something like Todd showed:

    Swi_FuncPtr swiFxn;

    Swi_getAttrs(SWI_edma_acq_left, &swiFxn, &swi_params_edma_acq_left);

    I imagine the compiler tried to warn you, but there was an improper cast in your getAttrs call that was masking the error. The other solution (also noted in the documentation) is to pass NULL for those parameters which avoided the issue altogether.

    Brad

    To be honest, that seems meaningless to me.  swiFxn in that context is some local function declaration, tied to nothing.  When reading it, I would see that it's asking for a Swi_Func, which I've declared/associated with this Swi handle, ergo that is the one to put in there.  I wouldn't imagine I'd be the only one reading it that way (but who knows, maybe).  From reviewing the DSP-BIOS implementation, the attr calls did take the actual SWI function call (unless I was doing that incorrectly there too, and it just happened to work), leading further to that misinterpretation.

    Robert

  • Brad Griffis said:


    I imagine the compiler tried to warn you, but there was an improper cast in your getAttrs call that was masking the error. 

    The compiler was saying that it didn't like a (void) or (void *), when a (Swi_FuncPtr) or (Swi_FuncPtr *) was needed (for set/get respectively), because my Swi function is a type void.  Ok, it wants some fancy pointer type for the Swi func ... no problem, so yes, put in the casts ;)  (Swi_FuncPtr *) for (void *) of the function address/getAttrs, and (Swi_FuncPtr) for the (void) of the setAttrs.  I don't think that would be viewed as destructive type casting.

    Robert
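One way to avoid needing those casts at all is to declare the handler with the two-argument signature shown earlier in the thread, so the function already has the pointer type the API expects and the compiler keeps checking it. The sketch below uses hypothetical mock types (`MockUArg`, `MockFuncPtr`) and invented names, not the SYS/BIOS headers, and passes the buffers through the two arguments as the thread's code does.

```c
#include <stdint.h>

typedef uintptr_t MockUArg;                        /* stand-in for UArg        */
typedef void (*MockFuncPtr)(MockUArg, MockUArg);   /* stand-in for Swi_FuncPtr */

/* Where the mock handler records the buffers it was handed. */
static int32_t *seen_rx;
static int32_t *seen_tx;

/* Declared with exactly the expected signature; the buffers arrive as args. */
static void mock_swi_fxn(MockUArg arg0, MockUArg arg1)
{
    seen_rx = (int32_t *)arg0;
    seen_tx = (int32_t *)arg1;
}

/* No cast needed: the types match, so a wrong function or a wrong level of
 * indirection here would still be diagnosed by the compiler. */
static MockFuncPtr bound_fxn = mock_swi_fxn;

/* Hypothetical caller: only the data pointers are converted, never the
 * function itself. */
static void call_bound(int32_t *rx, int32_t *tx)
{
    bound_fxn((MockUArg)rx, (MockUArg)tx);
}
```

The only conversions left are on the buffer pointers, which is where the thread's own code already casts.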