Code memory corruption on C6488 tri-core Faraday DSP

Hi,

I'm trying to debug a rather strange error in which the code tries to access a reserved memory location (0x00010000). The Program Counter is then unable to fetch a valid instruction, which causes a fatal exception. I tried to debug the issue using memory dumps and TI's XDS560 'Tracepod', but have not been able to gather much information about which part of the application code is getting corrupted (the traces only contain platform calls, which do not trace back to the application code).

Any tips on debugging this would be much appreciated.

Regards,

Sandip

  • Sandip,

    How far along in your debugging are you?  If this is still the early stages, using a trace pod at this point is only going to make debugging more difficult.  I'd suggest narrowing the problem down further, until you can pinpoint a small region of code where you know the problem is occurring, and then use trace to capture the relevant data.

    So far, it seems you have identified the memory address that is causing the problem.  This is a good start.  Now, is the access being made a read or a write?  You can set a hardware watchpoint to break when 0x10000 is accessed.  At that point, you should be able to identify which instruction is causing the access.  My best guess is that this is a corrupted function pointer.  If so, the next step would be to break whenever the value 0x10000 is written to the address of the pointer.  From there, you should be able to get an indication of where the pointer is being overwritten.

    Once you have narrowed down this type of information, you can then use trace to monitor program/data accesses in a very narrow window that should point you to the issue.
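
    If a data-value hardware watchpoint is not convenient to set up, the "break when 0x10000 lands in the pointer" idea can be approximated in software. What follows is only a minimal sketch, not a description of any TI API: `g_handler`, `check_handler` and the checkpoint names are hypothetical, and the suspect pointer in the real application would take the place of `g_handler`. The check is called at a few points along the suspected path, with a normal software breakpoint set inside the if-body.

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* Reserved address the program counter ended up at in the report above. */
    #define BAD_TARGET ((uintptr_t)0x00010000u)

    typedef void (*handler_t)(void);

    /* Hypothetical stand-in for the function pointer suspected of corruption. */
    static handler_t volatile g_handler;

    /* Name of the first checkpoint at which the bad value was seen (NULL if none). */
    static const char *g_first_bad_checkpoint;

    /* Returns 1 if the pointer currently holds BAD_TARGET, 0 otherwise.
     * The first detection is recorded and printed; set a breakpoint there. */
    static int check_handler(const char *checkpoint)
    {
        if ((uintptr_t)g_handler == BAD_TARGET) {
            if (g_first_bad_checkpoint == NULL) {
                g_first_bad_checkpoint = checkpoint;   /* <-- breakpoint here */
                printf("handler corrupted, first seen at: %s\n", checkpoint);
            }
            return 1;
        }
        return 0;
    }

    /* Placeholder for the function the pointer is supposed to reference. */
    static void real_handler(void) { }
    ```

    Moving the `check_handler("...")` calls closer together between a checkpoint that passes and one that fails narrows the window in which the overwrite happens, and that window is then small enough for trace to be useful.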

     

    Regards,
    Dan

     

  • Hi Dan,

    Thanks a lot for your response! I will try out the approach you suggested.

    A quick question: How do I break whenever the value 0x10000 is written to the address of the pointer?

     

    Regards,
    Sandip

  • Hi Sandip,

    I think you might have L1P enabled. Could you try to check the L1P memory fault registers? If you are enabling cache, try disabling all caches (L1P, L1D and L2 caches), as a quick workaround to see if this fixes the issue.

    To help avoid such issues, please enable exception handling and set memory protection for all pages. Another general good practice: if you are using global memory addressing for L2 memory, make sure to put unused DSP cores into IDLE after bootup.
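
    For checking the L1P memory fault registers, a small decoder for the fault status word can help when reading the value out of a memory window or dump. This is a hedged sketch only: the register addresses and bit positions below are my recollection of the C64x+ megamodule memory protection layout and should be treated as assumptions to verify against the megamodule reference guide for your device. The names `mpfsr_t`, `decode_mpfsr` and `print_fault` are made up for illustration.

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* ASSUMED C64x+ megamodule addresses; verify before use on target. */
    #define L1PMPFAR_ADDR 0x0184A400u  /* L1P fault address register */
    #define L1PMPFSR_ADDR 0x0184A404u  /* L1P fault status register  */
    #define L1PMPFCR_ADDR 0x0184A408u  /* L1P fault clear register   */

    /* Decoded view of a fault status word (ASSUMED bit layout). */
    typedef struct {
        unsigned fid;    /* bits 15-9: ID of the faulting requestor      */
        unsigned local;  /* bit 8: 1 if the access came from local CPU   */
        unsigned sr, sw, sx;  /* bits 5,4,3: supervisor read/write/exec  */
        unsigned ur, uw, ux;  /* bits 2,1,0: user read/write/exec        */
    } mpfsr_t;

    static mpfsr_t decode_mpfsr(uint32_t v)
    {
        mpfsr_t s;
        s.fid   = (v >> 9) & 0x7Fu;
        s.local = (v >> 8) & 1u;
        s.sr    = (v >> 5) & 1u;
        s.sw    = (v >> 4) & 1u;
        s.sx    = (v >> 3) & 1u;
        s.ur    = (v >> 2) & 1u;
        s.uw    = (v >> 1) & 1u;
        s.ux    = v & 1u;
        return s;
    }

    /* Prints a one-line summary of a fault address/status pair.
     * On target the inputs would come from something like
     * *(volatile uint32_t *)L1PMPFAR_ADDR and *(volatile uint32_t *)L1PMPFSR_ADDR. */
    static void print_fault(uint32_t fault_addr, uint32_t status)
    {
        mpfsr_t s = decode_mpfsr(status);
        printf("fault at 0x%08lX: fid=%u local=%u SR=%u SW=%u SX=%u UR=%u UW=%u UX=%u\n",
               (unsigned long)fault_addr, s.fid, s.local,
               s.sr, s.sw, s.sx, s.ur, s.uw, s.ux);
    }
    ```

    Under the assumed layout, a status word with the local and supervisor-execute bits set would point at the CPU itself fetching from a page it is not allowed to execute, which is what a jump through a corrupted pointer would look like once protection is enabled.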

    Regards,

    Justin

  • Hi Justin,

    Thanks for your reply. Yes, I do have L1P, L1D and L2 cache enabled. I think I have identified the cause of the problem: the issue is seen just after a "soft recovery" of the DSP, when the process is being created. I am using a process management API which seems to be having trouble assigning a pid to the restarted process. Memory protection is available for the DSP cores, but it has been disabled (the reason beats me). All 3 cores are being used, so putting unused cores in IDLE mode does not apply here.

    Regards,
    Sandip