TMS320F28388D: MCAN data corruption in bootloader

Thomas Craddock

Part Number: TMS320F28388D
Other Parts Discussed in Thread: C2000WARE

Hello,

I am having an issue developing a custom bootloader that boots from flash. It seems that adding code to support branching is causing my firmware update function to fail, and I suspect the failure is in the MCAN module. The problem manifests as a corruption in the data sent from and received by the MCAN.

The bootloader is configured to idle until it receives instructions over DCAN from an external controller (for now, a PC). It will be instructed either 1) to accept a new firmware update over DCAN and pass it on over MCAN to several submodules (up to 10 F28003x devices) or 2) to branch to the main application. The firmware update portion of the code was developed first and has been working well for some time.

The issue arises when I introduce code for option 2 as described above. I added a handshaking loop to put the f28388D and the submodule(s) in a branch-ready state before branching. The handshaking loop takes place after the main loop, which contains the check on received data to determine if the f28388D should enter its firmware update function or not. If a branch command is received by the f28388D, the program exits the main loop and enters the handshaking loop. Simply including the handshaking loop in my program causes the firmware update to fail. That is to say that I can omit the handshaking loop and do firmware updates without issue, but if I include the handshaking loop and make no other changes to my code, the data sent and received by the MCAN is corrupted. The MCAN data I receive in the f28388D does not match the data sent by the submodule(s), and the submodule(s) don't see some messages sent by the MCAN. This was determined by running both programs in debug sessions and monitoring the data in the expressions window.

I would like some help understanding why my firmware update code or the MCAN module would behave differently when I add in code unrelated to the part that works.

Other issues that may be related to this: My program will often get stuck in CAN_initRam at seemingly random times. This problem did not arise until I started trying to include this handshaking loop to control branching. I also have seen that the MCAN stops sending and receiving messages whenever I hit a breakpoint or pause my program for debugging purposes. I typically need to re-start my program every time I hit a breakpoint, and I find this very inconvenient.

Thanks,

Tom

3 months ago

0 QJ Wang 3 months ago

TI Guru 197426 points

Hi Tom,

You use the F28388 DCAN to receive new firmware from a host (PC), and transfer the firmware to the F28003x devices through MCAN. Is the handshaking loop done by DCAN or MCAN?

0 Thomas Craddock 3 months ago in reply to QJ Wang

Prodigy 20 points

The handshaking loop is done through MCAN.

0 QJ Wang 3 months ago in reply to Thomas Craddock

TI Guru 197426 points

CAN_initRam() initializes the DCAN message RAM for DCAN communication between the F28388 and the host. The handshaking uses MCAN communication between the F28388 and the F28003x devices after the DCAN communication has finished. The handshaking should not impact the DCAN RAM init operation.

Are any MCAN configurations changed by the handshaking code? Is any (low-priority) handshaking MCAN message transmitted or received during the firmware update? I cannot imagine how the handshaking code causes the firmware update to fail.

0 Thomas Craddock 3 months ago in reply to QJ Wang

Prodigy 20 points

I understand that the DCAN initialization should not be impacted by the MCAN; I just wanted to report another observation I had made. I had thought that maybe something in the MCAN was affecting the clock, but it isn't clear to me from Figure 3.6 in the TRM where the MCAN clock comes from.

Nothing about the MCAN configuration is changing in the handshaking loop. The loop sends one message to all submodules, delays for 5 milliseconds, then checks for new data. It checks the incoming message ID to verify that the correct submodule is responding, sets a bit to note that the submodule of interest responded, and increments the submodule index. The loop repeats, starting from transmitting the message, until all submodules have replied with an ACK bit in their data. I am using the XIDAM register to mask incoming buffer IDs, which is why I am checking the incoming message's ID.
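For readers following along, the bookkeeping described above can be sketched roughly as below. This is only an illustration, not the actual bootloader code: the names `HsState`, `hsInit`, `hsProcessReply`, `hsDone`, `HS_BASE_ID` and `HS_ACK_BIT` are all hypothetical, and the sketch assumes that submodule N replies with ID `HS_BASE_ID + N` and carries its ACK in bit 0 of the first data word. The transmit, the 5 ms delay, and the MCAN read are left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define HS_BASE_ID 0x100UL /* hypothetical: reply ID of submodule 0 */
#define HS_ACK_BIT 0x0001U /* hypothetical: ACK flag in first data word */

/* Handshake bookkeeping; supports up to 16 submodules (one mask bit each). */
typedef struct {
    uint16_t ackMask;       /* bit N set once submodule N has ACKed */
    uint16_t index;         /* submodule currently being waited on */
    uint16_t numSubmodules; /* total submodules expected to reply */
} HsState;

/* Explicit runtime initialization, so nothing relies on startup code. */
void hsInit(HsState *s, uint16_t numSubmodules)
{
    s->ackMask = 0U;
    s->index = 0U;
    s->numSubmodules = numSubmodules;
}

/* True once every expected submodule has ACKed. */
bool hsDone(const HsState *s)
{
    uint16_t all = (uint16_t)((1UL << s->numSubmodules) - 1UL);
    return s->ackMask == all;
}

/*
 * Process one received message. The reply counts only if it comes from the
 * submodule currently being waited on and has the ACK bit set. Returns true
 * when the handshake is complete.
 */
bool hsProcessReply(HsState *s, uint32_t id, uint16_t data0)
{
    if (!hsDone(s) &&
        (id == (HS_BASE_ID + s->index)) &&
        ((data0 & HS_ACK_BIT) != 0U)) {
        s->ackMask |= (uint16_t)(1U << s->index);
        s->index++;
    }
    return hsDone(s);
}
```

In this sketch the caller would repeat "send the branch-ready message, wait 5 ms, read any new message, call `hsProcessReply`" until it returns true. A bounded retry count around that loop is probably worth adding so a missing submodule cannot stall the bootloader indefinitely.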

0 QJ Wang 3 months ago in reply to Thomas Craddock

TI Guru 197426 points

Hi Thomas,

You could try dumping the values of the MCAN registers and the contents of the MCAN ID filter elements in message RAM from CCS, once with the handshaking loop included and once without it, and then compare the two files.
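If the two dumps are loaded as arrays of 32-bit words, the comparison itself can be sketched as below. This is a generic helper under that assumption, not a CCS or driverlib feature; the name `diffDumps` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compare two equally sized dumps word by word. The indices of the first
 * maxDiffs mismatching words are stored in diffIdx; the return value is the
 * total number of mismatches, which may be larger than maxDiffs.
 */
size_t diffDumps(const uint32_t *withLoop, const uint32_t *withoutLoop,
                 size_t numWords, size_t *diffIdx, size_t maxDiffs)
{
    size_t found = 0U;
    size_t i;

    for (i = 0U; i < numWords; i++) {
        if (withLoop[i] != withoutLoop[i]) {
            if (found < maxDiffs) {
                diffIdx[found] = i; /* word offset from the dump start */
            }
            found++;
        }
    }
    return found;
}
```

Each reported index is a word offset from the start of the dumped region, so it can be mapped back to a register or message RAM element by adding the dump's start address.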

0 Thomas Craddock 3 months ago in reply to QJ Wang

Prodigy 20 points

I think I found the problem, but I believe it raises other questions. The macros used to set up the message RAM use the function MCAN_getMsgObjSize, which returns a value from a global array objSize. In my bootloader, I am using a copy of f2838x_codestartbranch_cpu1.asm from another example that comments out _c_int00, so that global array is not initialized. I made my own version of MCAN_getMsgObjSize in a separate file that uses a local version of objSize rather than the global array, and that seemed to fix my problem. The message RAM is now being set up properly. Is there a reason that objSize is a global variable? It only seems to get used in MCAN_getMsgObjSize.
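A rough sketch of this kind of workaround is below. It is not the function from my project or from driverlib: `getMsgObjSizeLocal` is a hypothetical name, and the values assume the usual M_CAN convention that a data field size code of 0 to 7 selects an 8, 12, 16, 20, 24, 32, 48 or 64 byte payload, with each element taking two header words plus the payload, counted in 32-bit words. Using a `switch` instead of a table in RAM means the result should not depend on the C startup code having initialized any variables.

```c
#include <stdint.h>

/*
 * Hypothetical replacement for a table-based size lookup. Maps an M_CAN
 * data field size code (0..7) to the element size in 32-bit words
 * (2 header words + payload bytes / 4). Returns 0 for an invalid code.
 */
uint32_t getMsgObjSizeLocal(uint32_t elemSizeCode)
{
    switch (elemSizeCode) {
        case 0U: return 4U;  /*  8-byte payload */
        case 1U: return 5U;  /* 12-byte payload */
        case 2U: return 6U;  /* 16-byte payload */
        case 3U: return 7U;  /* 20-byte payload */
        case 4U: return 8U;  /* 24-byte payload */
        case 5U: return 10U; /* 32-byte payload */
        case 6U: return 14U; /* 48-byte payload */
        case 7U: return 18U; /* 64-byte payload */
        default: return 0U;  /* invalid code */
    }
}
```

The values should be checked against the `objSize` table in the driverlib version actually in use before relying on them.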

The new question I have is why another fix I had tried earlier didn't fix this. I made a local version of _c_int00 called _c_int00_BL that is just copied and pasted from the original _c_int00, but I comment out the call to _exit and introduce the _ExitBoot function from that copy of f2838x_codestartbranch_cpu1.asm. The goal is to allow my bootloader project to use initialized global variables and not get caught in the abort loop when I try to branch, but that doesn't seem to be doing what I expect it to. I don't have any background in assembly, so maybe I'm doing something wrong.

In summary, swapping out the MCAN_getMsgObjSize function for a version that uses a local objSize array as opposed to a global objSize array seems to fix my problem both with and without _c_int00 called. This is confusing, as I would have expected the original MCAN_getMsgObjSize to behave differently with _c_int00 included vs _c_int00 not included.

0 QJ Wang 3 months ago in reply to Thomas Craddock

TI Guru 197426 points

Thomas Craddock said:
Is there a reason that objSize is a global variable?

No, you can define it as a local variable if it is not used in other files.

c_int00 is the entry point of the program. It sets up the initial stack pointer and initializes global variables, so c_int00 should be called.

0 Thomas Craddock 3 months ago in reply to QJ Wang

Prodigy 20 points

I agree that _c_int00 should be called, but in several example projects provided by TI, it is commented out. In these examples, main is called from the local codestartbranch.asm file. For example, flash_kernel_ex4_can_flash_kernel from the examples for the f28003x from c2000ware version 5.01 has it commented out in the file flash_kernel_ex3_codestartbranch.asm on line 102. To me, this seems like an issue, because any code based on these examples needs to either a) avoid using ANY global variables, including ones buried in TI's driverlib support files, or b) have some sort of user-created solution that initializes global variables, sets up the stack pointer, and handles exiting from main and branching to a different application. I think that TI should provide a more robust example of how this process is intended to work. Merely commenting out _c_int00 to avoid calling _exit() after main terminates is not sufficient.

I'll mark this resolved for now, because I think the MCAN issue is resolved, but I would ask that someone at TI make an example bootloader project that allows for initialized global variables and branching to different applications.
