PRU loading fails

scoutu75

Hi,

I'm using the C6747. I've been using the PRU for some time to handle SPI and this works. The code in the DSP that loads and starts the PRU comes from the PRU package v1.03 (host\common\src subdirectory).

Recently, a bug in my product was isolated to the code that loads the PRU instruction memory space with the compiled PRU code. It doesn't happen when the product is electrically reset and booted, only when the DSP's reset pin is toggled after the DSP application was already loaded and running correctly at least once. Also, the bug is not systematic; it happens only occasionally.

The DSP hangs in the PRU_load() function in pru.c. It is precisely in the following loop:

for(i=0; i<codeSizeInWords; i++)
{
    pruIram[i] = pruCode[i];
}

I found this out by setting a port pin before the assignment and clearing it afterwards inside the loop, then watching the pin with a logic analyzer. I could not reproduce the problem with my XDS100-class emulator. My PRU code is 520 words. When the bug manifests itself, the port pin stops toggling after approx. 128 word transfers (could be a little more; if this is important I could redo the tests).
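For anyone wanting to repeat the measurement, the pin instrumentation described above can be sketched roughly as follows. This is only an illustration: debug_pin_set() and debug_pin_clear() are hypothetical hooks (on the target they would write the GPIO set/clear registers for whatever pin is free), and here they just count pulses so the sketch can be built on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debug hooks. On the target these would drive a GPIO pin;
 * in this sketch they only record the pin level and count the pulses. */
static volatile uint32_t pin_level;
static volatile uint32_t pin_pulses;

static void debug_pin_set(void)   { pin_level = 1u; }
static void debug_pin_clear(void) { pin_level = 0u; pin_pulses++; }

/* Copy the PRU image word by word and pulse the debug pin once per word.
 * The destination is volatile so that every store is really issued. */
static void pru_copy_instrumented(volatile uint32_t *pruIram,
                                  const uint32_t *pruCode,
                                  size_t codeSizeInWords)
{
    size_t i;

    for (i = 0; i < codeSizeInWords; i++) {
        debug_pin_set();           /* rising edge: about to write word i */
        pruIram[i] = pruCode[i];
        debug_pin_clear();         /* falling edge: word i was written */
    }
}
```

On the logic analyzer, the number of pulses seen before the pin goes quiet is the number of words that were written before the hang.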

After countless hours trying to pinpoint the exact cause of the bug, I think I've found a variable in the equation. The PRU code table is constant data placed in Shared RAM (L3). I also tried placing the code table in external SDRAM (EMIFB) and it gave the same result. But when I placed the table in L2 RAM, the bug disappeared!

Could there be an issue with transferring data from L3 or external memory to the PRU instruction memory space? Looking at SPRUFK4D, 3.2 System Interconnect Block Diagram, the Shared RAM and EMIFB modules are separated from the core by System Interconnects and Bridges. Are there registers that need a special (non-default) configuration? I looked at the errata but did not see anything pertaining to this issue.

Notes:

- I'm not using the Memory Protection units.

- All interrupts are off.

- L1D and L1P are all cache, L2 is all RAM.

- Shared RAM (L3) and external SDRAM are set to cacheable.

Let me know what you think. If there's an issue with this type of mem-to-mem transfer, I'll have to check other parts of the DSP code to see if other conflicts could arise.

Thanks,

over 13 years ago

scoutu75 over 13 years ago

Hi,

I kept thinking about this issue and I'm not sure it's a mem-to-mem copy problem. I have a feeling that an event happens at this moment and it just so happens that it's during the loading (copying) of the PRU code and the combination of both may cause the DSP to hang.

Coming back to the toggling of the port pin: ever since I started using this method, I have seen that after approx. 128 toggles there is a period of approx. 1.5 us where the toggling stops, after which it restarts and continues until the end of the copy of the PRU code table. It is during this "stall" period that the DSP may hang. I thought this might be a cache issue, so before the loop I invalidated all of L1P and the PRU code table in L2 (both calls wait until invalidation is complete). These calls add approx. 8.5 us of processing time before the loop starts. I then observed that the "stall" is not shifted in time by +8.5 us and no longer falls at about 128 iterations; instead it occurs approx. 39 us after the call to the first of the two cache invalidate functions. In other words, I don't think the "stall" is a function of the number of loop iterations but of the time elapsed since DSP reset.

To continue on this path, I added "nops" in the loop so that every iteration would be longer. I observe that the "stall" is about at the same instant since reset and is not coupled to the number of iterations.

I went on doing more tests, each with a different build of my DSP application. This changes the moment after reset at which the application calls the PRU load function. If the load function is called after the "stall", there are no problems.

I don't have a precise measurement but this "stall" (I don't know how else to qualify it) seems to occur about 400 to 430 ms after the start of my DSP application. My app does not use the watchdog.

Note also that I have my own bootloader that reads a serial SPI flash containing the DSP app in AIS format, loads it and finally jumps to it. This custom bootloader exists because it also allows the reprogramming of the SPI flash. The bootloader is loaded in Shared RAM (L3); the L1P and L1D memories remain in full cache configuration. The bootloader does not use the watchdog but uses one interrupt for SPI Rx. Interrupts are disabled before jumping into the DSP application.

Could there be something left running or not correctly reset/configured in the bootloader that can cause this problem? Any other ideas?

Thanks,

Mukul Bhatnagar over 13 years ago

This is going to be a difficult problem to diagnose over email :).

A few suggestions/questions:

1) Do you think the issue would be reproducible if you were to replace the PRU code with just a simple DSP mem to mem copy to see if the problem still exists?

2) When you say you are not able to reproduce this with the emulator, what do you mean by that? With the emulator connected, the problem does not occur?

3) Once the toggling stops, have you tried connecting to the processor via the emulator to see where the DSP program counter is? And if it is truly in the weeds, does it give you an error connecting to JTAG? If there is an error, what is it?

4) The difference between L2, Shared RAM and SDRAM could simply reflect a difference in access latency (L2 having the shortest and SDRAM the longest) or in caching. Can you see whether decreasing the speed of the processor has any effect on the failure?

5) Can you try disabling caching for Shared RAM and EMIFB? This can be done by controlling the MAR bits for those memory regions.

6) Are you absolutely sure the interrupts/exceptions are all disabled?

7) What is the DSP doing at the time of the failure? Are there events/data transfers happening in the vicinity of the hang?
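As a rough illustration of suggestion 5, the sketch below works out which MAR register covers a given address. It assumes the usual C64x+/C674x layout (MAR0 at 0x01848000, one 32-bit MAR per 16 MB region, cacheability controlled by the PC bit in bit 0) and the C6747 base addresses 0x80000000 for Shared RAM and 0xC0000000 for EMIFB SDRAM; please check these against the megamodule guide and the device data sheet. The helper names are made up for this example.

```c
#include <stdint.h>

#define MAR_BASE_ADDR     0x01848000u  /* assumed address of MAR0 */
#define MAR_REGION_SHIFT  24           /* each MAR covers 2^24 bytes = 16 MB */
#define MAR_PC_BIT        0x1u         /* "permit copies": 1 = cacheable */

/* Index n of the MARn register that covers the given byte address. */
static uint32_t mar_index(uint32_t addr)
{
    return addr >> MAR_REGION_SHIFT;
}

/* Address of the MAR register that covers the given byte address. */
static uint32_t mar_reg_addr(uint32_t addr)
{
    return MAR_BASE_ADDR + 4u * mar_index(addr);
}

/* MAR value with caching turned off, leaving the other bits untouched. */
static uint32_t mar_value_uncached(uint32_t current)
{
    return current & ~MAR_PC_BIT;
}

/* On the target the write could then look like this (not run here):
 *
 *   volatile uint32_t *mar = (volatile uint32_t *)mar_reg_addr(0x80000000u);
 *   *mar = mar_value_uncached(*mar);
 */
```

With these assumptions, Shared RAM falls under MAR128 and the start of EMIFB SDRAM under MAR192. It is probably safest to write back and invalidate the caches before changing a MAR bit, so that no stale lines from the region remain.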

Regards

Mukul

Gagan Maur over 13 years ago in reply to Mukul Bhatnagar

In addition,

What do you mean by ‘DSP hangs’? Is the connection to JTAG lost, or does the DSP end up in a code area where it shouldn’t be?
If you disable interrupts across the code that loads the PRU image, do you see any difference in behavior?
Have you experimented with disabling the DSP cache, and is the behavior the same?
What platform are you using? Is it your own board or are you using a TI EVM?
Is it possible to simplify the application code to a simple test case that we can recreate on an EVM and debug?

Cheers,

Gagan

scoutu75 over 13 years ago in reply to Gagan Maur

Hi,

Thank you both for your suggestions. I tried replying here in the two weeks following your replies but my message could not be posted (nothing happened when I clicked "post").

I finally tracked down the problem. There was a bug in the bootloader where interrupts were not always disabled. The main application code was then booted and all required peripherals were configured with interrupts enabled... with the effect of unexpected behaviour!

I fixed the bootloader and added interrupt disabling at the beginning of the application code.
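For readers who land here with the same symptom, the kind of start-up guard described above might look something like the sketch below. It is not the actual code from my product. It assumes the TI C6000 compiler on the target (the _TMS320C6X predefine, the _disable_interrupts() intrinsic, and the IER and ICR control registers from c6x.h); the sim_* variables and the name app_quiesce_interrupts() are made up so the sketch can also be built on a PC.

```c
#include <stdint.h>

#define MASKABLE_INTS 0xFFF0u    /* INT4..INT15 bits in IER/IFR/ICR */

#ifdef _TMS320C6X
#include <c6x.h>                 /* TI compiler: IER and ICR control registers */
#define GIE_DISABLE()    ((void)_disable_interrupts())
#define IER_MASK_ALL()   (IER &= ~MASKABLE_INTS)
#define IFR_CLEAR_ALL()  (ICR = MASKABLE_INTS)
#else
/* Host stand-ins (assumption, for off-target builds only). They start in
 * the state a careless bootloader might leave behind. */
static uint32_t sim_gie = 1u;        /* global interrupt enable still set */
static uint32_t sim_ier = 0x0032u;   /* NMIE plus INT4 and INT5 enabled */
static uint32_t sim_ifr = 0x0020u;   /* INT5 pending */
#define GIE_DISABLE()    (sim_gie = 0u)
#define IER_MASK_ALL()   (sim_ier &= ~MASKABLE_INTS)
#define IFR_CLEAR_ALL()  (sim_ifr &= ~MASKABLE_INTS)
#endif

/* Call first thing in the application, before any peripheral setup. */
static void app_quiesce_interrupts(void)
{
    GIE_DISABLE();      /* clear the global interrupt enable */
    IER_MASK_ALL();     /* mask every maskable CPU interrupt */
    IFR_CLEAR_ALL();    /* drop anything already pending */
}
```

This only quiets the CPU side. Interrupt enables left set inside a peripheral by the bootloader (the SPI Rx interrupt in my case) still have to be cleared in that peripheral's own registers.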

Best regards,
