GBL_versionMismatch + stuck in loop during dsp/bios initialization

Ethan Dunn

I am having some problems moving some of our product code for a 6416 DSP from a Cygwin build environment to a Linux build environment. The change in build environments has required upgrading from the .cdb config format to the .tcf config format, which I did using the cdb2tcf utility, and upgrading DSP/BIOS from version 5.31.02 to version 5.41.11.38.

I am able to successfully build a loadable image; however, the image does not run. I used an XDS510 JTAG emulator/debugger to step through the startup code, and I have found how it differs from a known good image from the previous build environment. Here is the sequence of symbols/sections that the execution follows that I find suspect:

BIOS_init -> versret -> GBL_versionMismatch -> gbl_cslcacheinit_ret -> loop0 -> loop1

The execution seems to be stuck in either loop0 or loop1. I did not see the known good image enter the versret section or the GBL_versionMismatch section. The name of the GBL_versionMismatch section makes me think that there is a problem either with the BIOS upgrade or the conversion to TCF, but I have not found much documentation to give me an idea of what to look for. Any input on what could be going on would be greatly appreciated.

over 9 years ago

0 ToddMullanix over 9 years ago

TI__Guru* 96960 points

If you are in loop0 and loop1, you are past the check for a mismatch. If the versions are mismatched, you go into an infinite loop on the GBL_versionMismatch label.

loop0 and loop1 are initializing the GBL table. Did you set a breakpoint after the two loops to see if it was truly stuck? I'm asking because single stepping through these two loops would take a looong time (stack being initialized to 0x00c0ffee, etc).

What's in the .gblinit section? There should be sets of triples: nwords, addr, value. The two loops finish when a nword value is 0. Do you see a zero terminated triple?
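For readers following along, here is a minimal Python model of the table walk described above. It is only a sketch of the behavior as stated in this post (triples of nwords, addr, value, ending at a zero nwords), not the actual gbl.h62 source. The function name run_gblinit, the dict used as memory, and the 4-byte address step are assumptions made for illustration.

```python
def run_gblinit(table, memory):
    """Apply (nwords, addr, value) triples until an nwords of 0 is read."""
    i = 0
    while True:
        nwords = table[i]            # outer loop: load the next word count
        if nwords == 0:              # a zero count terminates the table
            break
        addr, value = table[i + 1], table[i + 2]
        for _ in range(nwords):      # inner loop: write nwords 32-bit words
            memory[addr] = value
            addr += 4                # assumed: byte addresses, 4 bytes per word
        i += 3
    return memory


# Hypothetical two-entry table, followed by the zero terminator.
mem = run_gblinit([2, 0x100, 0xBEBEBEBE, 1, 0x200, 0xFFFFFFFF, 0], {})
```

If the terminating zero is missing or gets overwritten, a walk like this keeps consuming whatever words follow as new triples.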

Todd

0 Ethan Dunn over 9 years ago in reply to ToddMullanix

Prodigy 100 points

I set breakpoints at the end of loop1 and at the beginning of done, and I never saw it hit either of them. I was assuming the "done" section would be executed next; maybe that isn't the case?

Here is my .gblinit section (from a disassembly):
DATA Section .gblinit (Little Endian), 0x4C bytes at 0xFD300
000fd300 000001f0 .word 0x000001f0
000fd304 000fcbe8 .word 0x000fcbe8
000fd308 00c0ffee .word 0x00c0ffee
000fd30c 000000e7 .word 0x000000e7
000fd310 000fbac0 .word 0x000fbac0
000fd314 bebebebe .word 0xbebebebe
000fd318 000000e7 .word 0x000000e7
000fd31c 000fbec0 .word 0x000fbec0
000fd320 bebebebe .word 0xbebebebe
000fd324 000000e7 .word 0x000000e7
000fd328 000fc2c0 .word 0x000fc2c0
000fd32c bebebebe .word 0xbebebebe
000fd330 000001e7 .word 0x000001e7
000fd334 000fae20 .word 0x000fae20
000fd338 bebebebe .word 0xbebebebe
000fd33c 00000040 .word 0x00000040
000fd340 000fd200 .word 0x000fd200
000fd344 ffffffff .word 0xffffffff
000fd348 00000000 .word 0x00000000

I have also noticed that after the image runs, my EMIFA registers get all messed up and I can no longer read external memory. One of the values they get overwritten with is 0x00c0ffee. Could that have something to do with it, or is it just a coincidence?
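To make the dump above easier to read, the words can be grouped into (nwords, addr, value) triples. The Python sketch below does that, assuming each count is in 32-bit words so that a region ends at addr + 4 * nwords. The names GBLINIT_WORDS and decode_triples are made up for this example.

```python
# Words copied from the .gblinit disassembly above, in address order.
GBLINIT_WORDS = [
    0x000001F0, 0x000FCBE8, 0x00C0FFEE,
    0x000000E7, 0x000FBAC0, 0xBEBEBEBE,
    0x000000E7, 0x000FBEC0, 0xBEBEBEBE,
    0x000000E7, 0x000FC2C0, 0xBEBEBEBE,
    0x000001E7, 0x000FAE20, 0xBEBEBEBE,
    0x00000040, 0x000FD200, 0xFFFFFFFF,
    0x00000000,
]


def decode_triples(words):
    """Return (nwords, start, end, value) per triple, stopping at nwords == 0."""
    triples = []
    for i in range(0, len(words), 3):
        if words[i] == 0:
            break
        nwords, addr, value = words[i:i + 3]
        triples.append((nwords, addr, addr + 4 * nwords, value))  # end is exclusive
    return triples


triples = decode_triples(GBLINIT_WORDS)
for nwords, start, end, value in triples:
    print(f"{nwords:#06x} words at {start:#08x}..{end:#08x} <- {value:#010x}")
```

The dump decodes to six triples plus the zero terminator, which accounts for all 0x4C bytes of the section.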

0 Ethan Dunn over 9 years ago in reply to Ethan Dunn

Prodigy 100 points

More info - It looks like in my known good image, there is a bunch of initialization that happens before I get to loop1 and loop0.
It looks like there is some CSL initialization, IRQ initialization, HWI initialization and so on that is not happening in the image from my new build environment.

0 ToddMullanix over 9 years ago in reply to Ethan Dunn

TI__Guru* 96960 points

I'd like to figure out which triple the loops are hanging on. Can you load the program and put a breakpoint on the mvkh gblinit, a4 instruction right before loop0 in gbl.h62? Run to this breakpoint. Now put another one on done. Run the target. After a bit, halt it and see what values are in b0, a0, and b2.

0 Ethan Dunn over 9 years ago in reply to ToddMullanix

Prodigy 100 points

I am having a bit of trouble with this: when I halt the program, the debugger disconnects from the processor, so I can't read any of the register values. I do know that as far as I have been able to step through the loop (by weighting down the F5 key), B2 stays at 0x00C0FFEE, A0 is at 0x00C1015E, and B0 is at 0x00C0FF92. From what I can tell, A0, B0, and B2 are all 0x00C0FFEE in loop0 at the line [!B0] B.S2 done$162$ (PC+52 = 0x000ffb94). That would mean the loop has to execute 12,648,430 times. Does that seem right?

0 ToddMullanix over 9 years ago in reply to Ethan Dunn

TI__Guru* 96960 points

It looks like that first triple is causing problems. The following are the values based on your previous post:

nword (b0): 0x000001f0
address (a0): 0x000fcbe8
value (b2): 0x00c0ffee

The address register is incremented in loop1. The nword register is decremented in the same loop until it gets to 0. It sure looks like loop1 is writing the value (0x00c0ffee) in the wrong spot(s) and corrupting stuff.

Can you look in the .map file and see what is at 0x000fcbe8 thru 0x000fd3a8?

It looks like the length (b0) and address (a0) got written with 0x00c0ffee also. Then it kept writing 92 words (0x00c0ffee - 0x00c0ff92), corrupting addresses 0x00c0ffee through 0x00c1015e.

Todd
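The arithmetic in this post can be checked with a few lines of Python. This is only a sanity check on numbers already quoted in the thread (the .gblinit header at 0xFD300 with 0x4C bytes, and the first triple); the variable names are made up, and it assumes the fill advances 4 bytes per word.

```python
GBLINIT_START, GBLINIT_SIZE = 0x000FD300, 0x4C   # from the disassembly header
NWORDS, ADDR = 0x1F0, 0x000FCBE8                 # first triple in .gblinit

# End (exclusive) of the region the first triple fills.
fill_end = ADDR + 4 * NWORDS

# Does that region run into the .gblinit table itself?
overlaps = ADDR < GBLINIT_START + GBLINIT_SIZE and GBLINIT_START < fill_end

# Progress of the second, bogus fill at the moment the registers were read.
words_written = 0x00C0FFEE - 0x00C0FF92          # drop in b0
bytes_written = 0x00C1015E - 0x00C0FFEE          # advance of a0

print(hex(fill_end), overlaps, words_written, bytes_written)
```

Under these assumptions the first fill ends at 0xFD3A8, past the start of .gblinit at 0xFD300, which is consistent with the next triple being read back as 0x00c0ffee.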

0 Ethan Dunn over 9 years ago in reply to ToddMullanix

Prodigy 100 points

0x000fcbe8 is the start of my stack. After that are my .sem, .tsk, .swi, .sys, .gblinit, .log, and .trcdata sections. These sections are all filled with stuff from the *cfg_cfg.o file. Now that I know what those registers mean, I followed the loop from the initial values for nword, address, and value you posted above until b0 got to 0x00000000. At that point, it went back to loop0 and loaded b0 and then a0 with 0x00c0ffee. The value in A4 after a0 is set is 0x000fd318, which should be in the .gblinit section. Furthermore, when a0 is 0xfcddc, which should be the end of the stack, b0 is still 0x174, so it has dropped by only 0x7c, which is exactly one quarter of the initial counter value. It looks like a0 is incremented by 0x4 but the counter only decrements by 1, which is correct according to my known good image. This makes me think the initial counter value is wrong, but I'm not sure what could be affecting that. Maybe some sort of compiler/linker flag?

0 Ethan Dunn over 9 years ago in reply to Ethan Dunn

Prodigy 100 points

Actually, the nword value in my known good image is also 0x1f0, so I think that is correct. So I must be incrementing the address by the wrong amount?

0 ToddMullanix over 9 years ago in reply to Ethan Dunn

TI__Guru* 96960 points

Can you attach the mapfile?

0 ToddMullanix over 9 years ago in reply to ToddMullanix

TI__Guru* 96960 points

Actually, the exported project would be even better.

0 Ethan Dunn over 9 years ago in reply to Ethan Dunn

Prodigy 100 points

I have solved it. As it turns out, the behavior of the -stack option must have changed. Before, the argument was -stack500, which gave me a stack of length 0x800. With the newer compiler it was giving me a stack length of 0x1f4. Changing the argument to -stack2048 has solved the problem.
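As a quick check of why the stack size matters here, the Python sketch below compares the fill length from the first .gblinit triple (0x1f0 words) against the two stack sizes mentioned in this thread. It assumes 4 bytes per word and that the fill starts at the stack base 0x000fcbe8; the names are illustrative only.

```python
WORD_BYTES = 4
STACK_BASE = 0x000FCBE8
FILL_BYTES = 0x1F0 * WORD_BYTES            # bytes written by the first triple


def overrun(stack_bytes):
    """Bytes the stack fill writes past the end of a stack of this size."""
    return max(0, FILL_BYTES - stack_bytes)


small_stack_end = STACK_BASE + 0x1F4       # end of the 0x1f4-byte stack
for stack_bytes in (0x1F4, 0x800):
    print(f"stack {stack_bytes:#x}: fill overruns by {overrun(stack_bytes):#x} bytes")
```

With a 0x1f4-byte stack the fill runs well past the stack end (0xfcddc, matching the address seen earlier), while a 0x800-byte stack contains it.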
