BUSERR running out of SDRAM on OMAP-L137



I have some code running on the DSP side of the L137 part. It's pretty solid, stable code, with many thousands of hours of run time on it. It runs from the device's internal RAM and boots out of SPI-based flash. I wanted to move it to the external SDRAM, so I relocated everything to reside in SDRAM. It's based on DSP/BIOS 5.

Now a few strange things happen:

The code always runs through the main() function and then gets into the various threads which were created. In every case, the thread is blocked, usually on a TSK_sleep() call.

1) The Timer1 interrupt for the kernel isn't happening. The timer itself is running, and the interrupts are enabled, but if I set a breakpoint in the vector table for INT15, it doesn't get hit unless I manually set the interrupt in the DSP core registers. DSP/BIOS sets up and handles the timer.

2) Occasionally interrupt event 127 shows up as triggered. It's not enabled, so the flag just remains set until I reset the system. The docs say it's EMC_BUSERR, but there's next to no info about what that actually means. The best I can find is in the Misc registers in the megamodule: BUSERR = 0x80000e01, which decodes to a CFG write status error, a transaction ID of 0xE, and an addressing error. This appears to occur somewhere inside the kernel: all the threads are blocked, and it happens after the last call to TSK_sleep(). Is there any way to find out what the bad address is, or what code address tried to make the bad access? What is a transaction ID, and what is significant about 'E'?

3) Once the system gets into this state, if I download the version of the code that runs out of internal RAM, it will also hang most of the time. I have to perform a reset through the emulator to get the internal RAM version working again.

I have compared all the obvious registers from the register view (timers, gpio config, system config, interrupt controller, etc) between the two versions at the point it hangs, and everything is pretty much the same.

The code uses the EMAC controller and the EDMA3 controller, but neither one is initialized yet at the point where it hangs. There is nothing in EMIFA or EMIFB registers that looks like something went wrong.

Any info on debugging this BUSERR issue or why running out of external RAM would behave differently would be appreciated.


Thanks.

  • Hi,

    The following concepts are applicable only to C66x devices (multicore DSP), not to C674x.

    We refer to them here only to help analyze the EMC_BUSERR problem.

    EMC_BUSERR is the CFG bus error event.


    The CFG bus error event (EMC_BUSERR) and the CFG bus error register (ECFGERR) signal errors for transactions on the external configuration bus.

    The EMC (external memory controller) monitors transaction errors that occur on the configuration bus when the DSP core accesses global configuration space. Bus errors trigger an event that is routed to input event 127 of the DSP core interrupt controller.
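
    To check in the debugger whether event 127 is latched, it helps to know which event flag register and bit to look at. The following is a minimal sketch, assuming the interrupt controller groups events 32 per register (EVTFLAG0 to EVTFLAG3); the type and function names are made up for illustration.

```c
#include <stdint.h>

/* Location of one system event inside the event flag registers.
 * Assumption: 32 events per EVTFLAGn register. */
typedef struct {
    unsigned reg;   /* index n of EVTFLAGn */
    unsigned bit;   /* bit position inside that register */
    uint32_t mask;  /* mask to AND with the register value */
} evt_loc_t;

/* Hypothetical helper: map an event number to its flag register and bit. */
static evt_loc_t evt_locate(unsigned event)
{
    evt_loc_t loc;
    loc.reg  = event / 32u;
    loc.bit  = event % 32u;
    loc.mask = (uint32_t)1u << loc.bit;
    return loc;
}
```

    Under that assumption, event 127 maps to register index 3, bit 31, so the mask to test is 0x80000000.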

    Also, as you said, the difference between the working and the non-working application is where it is stored (external SDRAM versus internal RAM).

    So can you please check the SDRAM configuration and initialization sequence in your code?

    You can also view the ECFGERR and ECFGERRCLR registers to check what type of transaction error occurred.
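
    As an illustration, the value quoted in the question can be split into its fields in code. This is only a sketch: the field positions (ERR in bits 31:29, XID in bits 11:8, STAT in bits 2:0) are an assumption inferred from the decode given in the question and should be checked against the register description for your device. The type and function names are hypothetical.

```c
#include <stdint.h>

/* Fields of the bus error register.
 * Assumed layout: ERR = bits 31:29, XID = bits 11:8, STAT = bits 2:0. */
typedef struct {
    unsigned err;   /* error type, e.g. 4 = CFG write status error per the question */
    unsigned xid;   /* transaction ID of the failing access */
    unsigned stat;  /* status code, e.g. 1 = addressing error per the question */
} buserr_fields_t;

/* Hypothetical helper: split a raw register value into its fields. */
static buserr_fields_t buserr_decode(uint32_t value)
{
    buserr_fields_t f;
    f.err  = (value >> 29) & 0x7u;
    f.xid  = (value >> 8)  & 0xFu;
    f.stat = value & 0x7u;
    return f;
}
```

    With the value from the question, buserr_decode(0x80000e01) gives ERR = 4, XID = 0xE and STAT = 1, which agrees with the reading "CFG write status error, transaction ID E, addressing error".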

    Regarding your question "What is a transaction ID and what is significant about 'E'?", please refer to SPRUGW0:

    http://www.ti.com/general/docs/lit/getliterature.tsp?baseLiteratureNumber=sprugw0&fileType=pdf

  • Thanks for the information.

    I'm still confused about the "transaction ID", though. Is it just a sequence number, or does it refer to a specific source/bus master? The only references I could find in SPRUGW0 (pages 4-37 and 6-4) describe a transaction ID field in a register, but say nothing about what it actually means.

  • Hi PeteG,

    This is my understanding, but it may be wrong.

    The transaction ID is not a sequence number. It probably refers to the privilege ID that a data I/O master attaches to each transfer. KeyStone devices appear to support 16 PrivID values in total.

    As per this document, http://www.ti.com/lit/wp/spry150a/spry150a.pdf, each transaction carries a privilege ID, and the transaction ID field may hold that ID when a read/write error is detected during the transaction.

    (Note: This explanation is applicable only to KeyStone devices.)

    Regards,

    Shankari


  • XID and PrivID are two different things. The XID is essentially a "sequence" number that doesn't have much meaning in isolation. It is part of the internal handshaking: it is how a given master correlates a request (from master to slave) with the associated response (from slave to master).
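
    To make the idea concrete, here is a purely conceptual sketch of tag-based request/response matching. It is not how the hardware is implemented; the 16-entry table (assuming a 4-bit XID) and all names are invented for illustration.

```c
#include <stdint.h>

#define XID_COUNT 16u  /* assumption: 4-bit XID, so 16 possible tags */

/* Bookkeeping a master could keep for its outstanding requests. */
typedef struct {
    uint32_t addr[XID_COUNT];  /* address of the request using each XID */
    uint8_t  busy[XID_COUNT];  /* 1 while a response is still pending */
} xid_table_t;

/* Issue a request: take the lowest free XID and remember the address.
 * Returns the XID, or -1 if all tags are in flight. */
static int xid_issue(xid_table_t *t, uint32_t addr)
{
    for (unsigned x = 0; x < XID_COUNT; ++x) {
        if (!t->busy[x]) {
            t->busy[x] = 1;
            t->addr[x] = addr;
            return (int)x;
        }
    }
    return -1;
}

/* A response arrives carrying an XID: find the matching request and
 * free the tag. Returns 0 on success, -1 if no such request is pending. */
static int xid_complete(xid_table_t *t, unsigned xid, uint32_t *addr_out)
{
    if (xid >= XID_COUNT || !t->busy[xid]) {
        return -1;
    }
    *addr_out = t->addr[xid];
    t->busy[xid] = 0;
    return 0;
}
```

    The tag only has meaning to the master that issued it, and it is reused once the response comes back, which is why an XID value such as 'E' says little on its own.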

    Regards,

    Kyle