OMAP L137 hangs when reading address 0xBC000000

ernesto colizzi

Other Parts Discussed in Thread: SYSBIOS

Dear TI experts,

apparently there is a condition that causes a complete hang of the OMAP L137.
By "complete hang" I mean the state for which my XDS560v2 STM emulator declares:

"Trouble Halting Target CPU:
(Error -1032 @ 0x3FE)
Data bus is 'not ready'. Choose 'Abort to try to abort the pending
transaction. Chose 'Force' to try to force the bus ready state.
(Emulation package 5.0.747.0)"

when I try to connect to the ARM core

"Trouble Halting Target CPU:
(Error -1060 @ 0x0)
Device is not responding to the request.
Reset the device, and retry the operation.
If error persists, confirm configuration, power-cycle the board, and/or try more reliable JTAG settings *e.g. lower TCLK).
(Emulation package 5.0.747.0)"

when I try to connect to the DSP core

The only way to exit this state is a complete power cycle of the OMAP device.
A reset is not enough, and neither is any of the operations suggested by the emulator.

The condition under which I can create this state is the following:
- the ARM core executes instructions located in SDRAM (addresses 0xC000xxxx)
- the ARM core runs a simple endless loop that reads a valid location in the internal shared RAM.
- the DSP core runs instructions located in the internal RAM (addresses 0x1180xxxx)
- the DSP core runs a simple endless loop that reads a location at address 0xBC000000
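For readers who want to reproduce the setup, the read loops can be sketched roughly as follows. This is only an illustration, not the code actually used: `poll_word` and `DSP_TEST_ADDR` are hypothetical names, and the iteration count exists only so the function can be exercised safely on a host; on the target the loop would simply never end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address read by the DSP loop in the post. Never dereference it on a host PC. */
#define DSP_TEST_ADDR ((uintptr_t)0xBC000000u)

/*
 * Read one 32-bit word `iterations` times and return the last value read.
 * `volatile` keeps the compiler from hoisting or removing the reads.
 */
static uint32_t poll_word(const volatile uint32_t *addr, size_t iterations)
{
    uint32_t last = 0;

    assert(addr != NULL);
    for (size_t i = 0; i < iterations; ++i) {
        last = *addr;
    }
    return last;
}
```

On the DSP the call would look like `poll_word((const volatile uint32_t *)DSP_TEST_ADDR, n)` inside a `for (;;)` loop, while the ARM would pass the address of a valid word in the shared RAM.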

Note that the address 0xBC000000 corresponds to a memory region where nothing should be present, according to L137 specs.
The access to this memory location mimics a bug that was present in my application code (an uninitialized pointer).
I tried other addresses around 0xBC000000 and found, for example:
- 0xB0000000 works fine
- 0xBCDE0000 fails

Note that if I let either of the two cores run alone, no issue occurs.
As soon as I run both cores together, the OMAP enters the "complete hang" state.

Please help me understand why this is happening and, if you confirm my diagnosis, how I can prevent the device from entering this state.

Thank you very much in advance for your help.

regards

Ernesto

over 9 years ago

0 Shankari G over 9 years ago

TI__Mastermind 43955 points

Hi Ernesto,

We will try to reproduce this issue and get back.

Regards,

Shankari

0 ernesto colizzi over 9 years ago in reply to Shankari G

Prodigy 70 points

Hi Shankari,

thank you for your prompt feedback.

To make the issue easier to reproduce, I thought it would help you to know the exact configuration of the device I am using.

The attached file describes the basic settings (clocks, EMIFA, EMIFB), the DSP cache usage and the ARM cache usage.

Please ask if you need further information.

Thanks

regards

Ernesto

0068.OCC lockup.pptx

0 ernesto colizzi over 9 years ago in reply to ernesto colizzi

Prodigy 70 points

Hi Shankari,

Let me add some observations that might be useful when trying to reproduce the issue.

To reach the failing state, the DSP access to address 0xBC00-0000 must happen at the same time as an ARM instruction fetch from the EMIFB SDRAM.

This seems trivial to implement, but care must be taken in designing the two endless loops (one on the ARM, the other on the DSP). The accesses to 0xBC00-0000 and to the SDRAM are very fast and take up only a small fraction of each loop's execution time. This means that two situations should be avoided:

a) The two loops are perfectly synchronous: in this case there is the risk that the two accesses always occur at different times and never overlap.

b) The two loops are apparently asynchronous but they can overlap only at fixed, periodic instants. In this case it may happen that the overlap of the two accesses never occurs.

I had a hard time designing the two loops and found it useful to route some GPIO signals to a logic analyzer. The basic problem is that the ARM and DSP clocks are the same, so it is very easy to fall into condition (b).
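Conditions (a) and (b) can be illustrated with a toy model in which each loop performs its critical access during exactly one tick per period. The sketch below is only a host-side simulation under that simplifying assumption; `count_overlaps` and its parameters are hypothetical and do not model the real bus timing.

```c
#include <assert.h>

/*
 * Count the ticks in [0, ticks) at which both loops perform their access.
 * Loop A accesses when t % period_a == phase_a, loop B likewise.
 */
static unsigned count_overlaps(unsigned period_a, unsigned phase_a,
                               unsigned period_b, unsigned phase_b,
                               unsigned ticks)
{
    unsigned overlaps = 0;

    assert(period_a > 0 && period_b > 0);
    for (unsigned t = 0; t < ticks; ++t) {
        if (t % period_a == phase_a && t % period_b == phase_b) {
            ++overlaps;
        }
    }
    return overlaps;
}
```

In this model, equal periods with different phases (case a) never overlap, and periods that share a common factor can also miss each other forever (case b, for example periods 10 and 4 with phases 0 and 1). Choosing loop lengths with no common factor guarantees an overlap once per least common multiple of the two periods, which is one way to make the collision actually happen on the bench.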

regards

Ernesto

0 Titusrathinaraj Stalin over 9 years ago

TI__Guru** 116100 points

Hi Ernesto,

"Trouble Halting Target CPU:
(Error -1032 @ 0x3FE)
Data bus is 'not ready'. Choose 'Abort to try to abort the pending
transaction. Chose 'Force' to try to force the bus ready state.
(Emulation package 5.0.747.0)"

when I try to connect to the ARM core

"Trouble Halting Target CPU:
(Error -1060 @ 0x0)
Device is not responding to the request.
Reset the device, and retry the operation.
If error persists, confirm configuration, power-cycle the board, and/or try more reliable JTAG settings *e.g. lower TCLK).
(Emulation package 5.0.747.0)"

Could you please try lowering the TCLK on JTAG?

Please refer to the following links.

http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/p/145860/551153.aspx

http://processors.wiki.ti.com/index.php/XDS100#Q:_How_can_I_turn_on_adaptive_clocking.3F

- the DSP core runs a simple endless loop that reads a location at address 0xBC000000

- the ARM core runs a simple endless loop that reads a valid location in the internal shared RAM.

Did you get something like "No source available for 0xBC000000"?

Could you please share a screenshot of this?

0 Shankari G over 9 years ago

TI__Mastermind 43955 points

Hi Ernesto,

We just tried running the rCSL examples on both the cores, ARM and DSP using CCS 5.5 with SDI XDS510USB emulator.

On DSP side, UART_hyperterminal_dspL137 and ARM side, GPIO_multi_led_interrupt_armL137.

We were able to run the examples on both cores.

What type of project are you running on the two cores, and which peripherals are used by the two applications?

Regards,

Shankari

0 one and zero over 9 years ago in reply to Shankari G

TI__Genius 17676 points

Hi Ernesto,

looking at your initial post, I believe the behavior you're reporting occurs because you're accessing reserved memory space.

You can see this in the datasheet (SPRS563G - page 23) : “Note: Read/Write accesses to illegal or reserved addresses in the memory map may cause undefined behavior.”

0xBCDE0000 is reserved memory, so undefined behavior is expected.
0xB0000000 belongs to the EMIFB control registers, so accessing it is allowed.

Kind regards,

one and zero

0 ernesto colizzi over 9 years ago in reply to one and zero

Prodigy 70 points

Hi One and Zero,

thank you very much for your answer.

I saw the warning on page 23 of the document you mentioned, but I could not find an explicit definition of "reserved memory" for the grey areas of the table shown on the same page.

If you therefore confirm that:

a) the grey areas of Table 2-4 on page 23 correspond to "reserved memory"

b) the behavior I am observing and describing is part of what you call "undefined behavior"

then we can close the analysis.

Could you suggest a way of trapping and handling accesses to this memory range, in order to prevent the application code from reading from it?

I tried to use the memory protection units available on the L137 but I could not find any way to apply them to my case.

Thanks in advance

regards

Ernesto

0 ernesto colizzi over 9 years ago in reply to Shankari G

Prodigy 70 points

Hi Shankari,

thanks for your comment.

The applications I am running are simply two trivial endless loops, as indicated in my original post.

They represent a subset of the activities performed by the real applications.

There are several peripherals connected to the L137, but the only one involved in the two endless loops and in the phenomenon I described is the external SDRAM connected to EMIFB.

For convenience, I enclose the values of all OMAP registers (as seen by the ARM and the DSP) before the failing event takes place.

Regards

Ernesto

3010.OMAP registers before lockup.xlsx

0 ernesto colizzi over 9 years ago in reply to Titusrathinaraj Stalin

Prodigy 70 points

Hi Titus,

thanks for your post.

The emulator and the target board work fine, despite the message displayed.

This is not a JTAG clock issue, unfortunately.

The issue arises only in a very specific case: when the two cores together run the application code I described in my original post.

thanks again

regards

Ernesto

0 Titusrathinaraj Stalin over 9 years ago in reply to ernesto colizzi

TI__Guru** 116100 points

Hi Ernesto,

What type of code are you running?

Is it based on the CSL package, BSL, SYSBIOS, etc.?

What is the size of the .out files for both programs (ARM and DSP)?

0 Mukul Bhatnagar over 9 years ago in reply to ernesto colizzi

TI__Guru* 78435 points

Hi Ernesto

One and Zero's answer is correct. In general, it is not recommended that software make any accesses to reserved memory; the note at the top of the memory map is there specifically to highlight this.

Usually we design the architecture so that accesses to reserved memory do not cause a hang. On the OMAP-L137 specifically, however, an access to the region between the EMIFB MMRs and the EMIFB data memory can cause a hang when it happens simultaneously with other masters accessing other portions of the memory map.

Typically, an access to truly reserved memory space will also cause the BUSERR interrupt to be generated, and software can use it to take corrective action. However, I do not think the BUSERR triggers for this specific memory region.

I think you will essentially need to ensure in software that there are no accesses to this region in order to prevent any hang. I would also recommend having a watchdog timer in the system if you are worried about scenarios like this where the cores become unresponsive.
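One way to keep software from touching this region is to route suspicious pointer accesses (for example during debugging) through a range check. The following is a minimal sketch, not a TI-provided API: the function names are hypothetical, and the gap bounds are assumptions based only on the addresses mentioned in this thread (EMIFB control registers at 0xB0000000, SDRAM at 0xC000xxxx). In particular, the assumed end of the register block must be verified against the memory map in the datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED bounds of the hazardous gap; verify against the datasheet memory map. */
#define EMIFB_GAP_START 0xB0008000u /* assumed first address after the EMIFB control registers */
#define EMIFB_GAP_END   0xBFFFFFFFu /* last address before the SDRAM data at 0xC0000000 */

/* True if a 32-bit target address falls inside the assumed gap. */
static bool addr_in_emifb_gap(uint32_t addr)
{
    return addr >= EMIFB_GAP_START && addr <= EMIFB_GAP_END;
}

/*
 * Hypothetical checked read for target code: refuses addresses inside the gap
 * and otherwise performs the 32-bit read. Returns false if the read was refused.
 */
static bool checked_read32(uint32_t addr, uint32_t *out)
{
    assert(out != NULL);
    if (addr_in_emifb_gap(addr)) {
        return false; /* caller can log the offending address or trap here */
    }
    *out = *(const volatile uint32_t *)(uintptr_t)addr;
    return true;
}
```

A check like this only catches accesses that go through the wrapper, so it is a debugging aid for tracking down stray pointers rather than a replacement for initializing every pointer before use.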

You are correct in your assessment that this is not a JTAG issue etc.

Regards

Mukul

0 ernesto colizzi over 9 years ago in reply to Mukul Bhatnagar

Prodigy 70 points

Thanks Mukul,

Unfortunately, I could not find any way to trigger the bus error.

We do have a watchdog, but it is only able to issue an OMAP reset.

It does not perform a complete power cycle, unfortunately.

thanks anyway for your answer and comments

best regards

Ernesto

0 Mukul Bhatnagar over 9 years ago in reply to ernesto colizzi

TI__Guru* 78435 points

Hi Ernesto

Understood. The only additional thing I can suggest is that the WDT also toggles RESETOUT, which can be propagated to other components on the board. If you are not doing that already, though, it implies a board change.

Regards

Mukul
