TMS320F28377D: Illegal_isr() triggered when merging code

Part Number: TMS320F28377D
Other Parts Discussed in Thread: TMS320F28379D, C2000WARE

Hi,

Good day. I hope you are well.

Our customer has run into a problem integrating two projects into one final version for the TMS320F28379D control card. Their current situation is described below.

"The first project uses CPU1 mainly for two functions:
(1) ADC sampling interrupt using ADCA1: defined as “__interrupt void epwmADC_isr(void)” starts from line 613.
(2) Control interrupt using EPWM10: defined as “__interrupt void epwmcontrol_isr(void)” starts from line 697.
(The first project is zipped in CPU1_CONTROL.zip attached).

The second project is a modified version of the bitfield RAM_MANAGEMENT_CPU01 and RAM_MANAGEMENT_CPU02 examples. Specifically:
- CPU1 creates some data and puts it into shared memory (shareData1to2_32, "SHARERAMGS2");
- CPU2 sends shareData1to2_32 through the SPIA TX FIFO as an SPI slave;
- CPU2 receives data from the SPIA RX FIFO and puts it into shared memory (shareData2to1_32, "SHARERAMGS3");
- CPU1 and CPU2 use IPC to control reads and writes of shareData1to2_32 and shareData2to1_32.

Both projects work perfectly on their own. They share the same directories for linkers and includes (I put all of the linkers and includes in two separate folders, attached below in LinkersANDIncludes.zip).

Now I am trying to integrate the first project into the second one, but it causes an ILLEGAL_ISR() issue. However, if I comment out lines 1228-1854 in epwmcontrol_isr on CPU1 (RAM_MANAGEMENT_CPU01.c), both cores work fine (CPU01 enters the ADC and EPWM interrupts, and CPU02 enters the SPI and CPU TIMER0 interrupts). The integrated project is attached in DUAL_CONTROL_IPC_SharedMemory.zip (it uses the same includes and linkers as the first and second projects, LinkersANDIncludes.zip).

I am wondering what causes the ILLEGAL_ISR(). My guess is that epwmcontrol_isr is running short of execution time. (But compared to the first project, I did not add much to CPU1 in the integrated project.)"

Thank you for your help on this matter.
LinkersANDIncludes.zip
CPU1_CONTROL.zip
DUAL_CONTROL_IPC_SharedMemory.zip


Regards,

Cedrick

  • Hi Cedrick,

    Thanks for your question. The ILLEGAL_ISR() issue typically comes from an illegal opcode, a stack overflow, an array or pointer problem, or something similar. It is raised when the CPU fetches an illegal instruction during code execution. To debug it, I would recommend stepping through the code in CCS until the ILLEGAL_ISR() occurs. You can skip the portions you mentioned are already working by putting a breakpoint right before the section at lines 1228-1854, then stepping through until the error occurs. The last instruction executed before the error is the place to start looking for the source of the issue.

    We also recommend keeping the register/expression windows open in CCS, as this lets you see whether the code is accidentally overwriting registers or causing similar issues.

    Regards,

    Vince
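
Stack overflow is one of the causes listed above, and it is fairly cheap to rule out. A minimal sketch of the usual watermark approach, assuming the stack region can be painted before it is in use: fill it with a known pattern, run the application, then scan for how far the pattern has been overwritten. The names `stack_paint`, `stack_high_water`, and `STACK_FILL` are made up for this illustration and are not from C2000Ware; the plain array pointer stands in for the `.stack` section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xBEEFu   /* arbitrary pattern, assumed unlikely as real data */

/* Paint the whole stack region with the fill pattern.
 * Assumption: this runs before the region is in use as a stack. */
static void stack_paint(uint16_t *base, size_t words)
{
    assert(base != NULL);
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Return how many words, counted from the base, have been overwritten.
 * The C28x stack grows toward higher addresses, so scan down from the top
 * until the first word that no longer holds the fill pattern. */
static size_t stack_high_water(const uint16_t *base, size_t words)
{
    assert(base != NULL);
    size_t used = words;
    while (used > 0 && base[used - 1] == STACK_FILL) {
        used--;
    }
    return used;
}
```

On the target, `base` and `words` would come from wherever the linker command file places `.stack`. If the reported high-water mark reaches the full size of the region, the stack has very likely run into whatever follows it.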

  • Hi Vince,

    Thank you for the feedback.

    One more quick question: they use "2837xD_RAM_lnk_shared_cpu1.cmd" for the memory assignment. They are wondering whether using the shared memory SHARERAMGS0/SHARERAMGS1 affects the performance of the regular memory (e.g. RAMGS0, RAMLS1, ...).

    Thank you.

    Regards,

    Cedrick

  • Hi Cedrick,

    I appreciate the follow up! However, in order to better route you to the correct expert (for memory/CCS cmd files), and to help people who also search for similar questions, would you be able to post this follow-up as a new post?

    Thanks and kind regards,

    Vince

  • Hello Vince,

    Thank you for the suggestion; however, I was not able to locate the right forum for this.

    When I key in the part number "TMS320F28377D", the only forum option is "C2000™ microcontrollers forum".

    Can you please help redirect us to the correct forum? Thank you.


    Regards,

    Cedrick

  • Hi Cedrick,

    I apologize for the confusion; I misunderstood the follow-up. I will put you in touch with the correct owners. In the meantime, could you provide more details on the particular performance impact you are seeing?

    Regards,

    Vince

  • Hi Vince,

    Good day.

    I am sorry for the late reply. I have attached the modified .cmd file they used for their project. They used SHARERAMGS1/SHARERAMGS2/SHARERAMGS3/SHARERAMGS4, and they combined some RAM blocks for .text and .cinit. They are wondering whether the modifications are correct, or whether they made mistakes that cause conflicts in the memory assignment.

    Thank you so much for your help!


    Regards

    Cedrick

    MEMORY
    {
    PAGE 0 :
       /* BEGIN is used for the "boot to SARAM" bootloader mode   */
    
       BEGIN           	: origin = 0x000000, length = 0x000002
       RAMM0           	: origin = 0x000122, length = 0x0002DE
       RAMD0           	: origin = 0x00B000, length = 0x000800
       /*
       RAMLS0          	: origin = 0x008000, length = 0x000800
       RAMLS1          	: origin = 0x008800, length = 0x000800
       RAMLS2      		: origin = 0x009000, length = 0x000800
       */
       RAMLS02          : origin = 0x008000, length = 0x001800
    
       RAMLS3      		: origin = 0x009800, length = 0x000800
       RAMLS4      		: origin = 0x00A000, length = 0x000800
    
       RAMGS14          : origin = 0x01A000, length = 0x001000     // Only Available on F28379D, F28377D, F28375D devices. Remove line on other devices.
       RAMGS15          : origin = 0x01B000, length = 0x001000     // Only Available on F28379D, F28377D, F28375D devices. Remove line on other devices.
    
    
       RESET           	: origin = 0x3FFFC0, length = 0x000002
    
    PAGE 1 :
    
       BOOT_RSVD       : origin = 0x000002, length = 0x000120     /* Part of M0, BOOT rom will use this for stack */
       RAMM1           : origin = 0x000400, length = 0x000400     /* on-chip RAM block M1 */
       RAMD1           : origin = 0x00B800, length = 0x000800
    
       RAMLS5      : origin = 0x00A800, length = 0x000800
    
       //....RAMGS0      : origin = 0x00C000, length = 0x001000
       // remap RAMGS0 to small segments that can save the measurements in floating format (32 bits)
       // RAMGS0 is 0x1000 (4096) 16-bit words long; each 0x20-word segment below holds 16 32-bit values
    
       RAMGS0_0    : origin = 0X00C000, length = 0x000020	// each segment is 0x20 (32) 16-bit words long
       RAMGS0_1    : origin = 0X00C020, length = 0x000020
       RAMGS0_2    : origin = 0X00C040, length = 0x000020
       RAMGS0_3    : origin = 0X00C060, length = 0x000020
       RAMGS0_4    : origin = 0X00C080, length = 0x000020
       RAMGS0_5    : origin = 0X00C0A0, length = 0x000020
       RAMGS0_6    : origin = 0X00C0C0, length = 0x000020
       RAMGS0_7    : origin = 0X00C0E0, length = 0x000020
       RAMGS0_8    : origin = 0X00C100, length = 0x000020
       RAMGS0_9    : origin = 0X00C120, length = 0x000020
       //------ above is 10 32-bit shared ram segments----//
       //------ (if more needed, follow the pattern and make more)---//
       RAMGS0_REST : origin = 0X00C140, length = 0x000EC0	// assign rest of the RAM to a big section
    
       RAMGS1      : origin = 0x00D000, length = 0x001000
       RAMGS2      : origin = 0x00E000, length = 0x001000
       RAMGS3      : origin = 0x00F000, length = 0x001000
       RAMGS4      : origin = 0x010000, length = 0x001000
       RAMGS5      : origin = 0x011000, length = 0x001000
       RAMGS6      : origin = 0x012000, length = 0x001000
       RAMGS7      : origin = 0x013000, length = 0x001000
       RAMGS8      : origin = 0x014000, length = 0x001000
       RAMGS9      : origin = 0x015000, length = 0x001000
       RAMGS10     : origin = 0x016000, length = 0x001000
       RAMGS11     : origin = 0x017000, length = 0x001000
       RAMGS12     : origin = 0x018000, length = 0x001000     /* Only Available on F28379D, F28377D, F28375D devices. Remove line on other devices. */
       RAMGS13     : origin = 0x019000, length = 0x001000     /* Only Available on F28379D, F28377D, F28375D devices. Remove line on other devices. */
    
       CANA_MSG_RAM     : origin = 0x049000, length = 0x000800
       CANB_MSG_RAM     : origin = 0x04B000, length = 0x000800
    }
    
    SECTIONS
    {
       codestart        : > BEGIN,     PAGE = 0
       //.text            : >>RAMD0  |  RAMLS0 | RAMLS1 | RAMLS2 | RAMLS3 | RAMLS4 | RAMGS14 | RAMGS15,   PAGE = 0 // assign RAMGS14 to this
       .text            : >>RAMD0  |  RAMLS02 | RAMLS3 | RAMLS4 | RAMGS14 | RAMGS15,   PAGE = 0 // assign RAMGS14 to this
       //.cinit           : >RAMM0  |  RAMLS0,     PAGE = 0    // give extra ram (RAMLS4) to .cinit
       .cinit           : >RAMD0  |  RAMLS02 | RAMLS3 | RAMLS4 | RAMGS14 | RAMGS15,   PAGE = 0
       .pinit           : > RAMM0,     PAGE = 0
       .switch          : > RAMM0,     PAGE = 0
       .reset           : > RESET,     PAGE = 0, TYPE = DSECT /* not used, */
    
       .stack           : > RAMM1,     PAGE = 1
       .ebss            : > RAMGS5 | RAMGS6 | RAMGS7 | RAMGS8 | RAMGS9 | RAMGS10 | RAMGS11 | RAMGS12 | RAMGS13,     PAGE = 1
       .econst          : > RAMGS5 | RAMGS6 | RAMGS7 | RAMGS8 | RAMGS9 | RAMGS10 | RAMGS11 | RAMGS12 | RAMGS13,     PAGE = 1
       .esysmem         : > RAMLS5,     PAGE = 1
       Filter_RegsFile  : > RAMGS0_REST,	   PAGE = 1
    
       //.... SHARERAMGS0		: > RAMGS0,		PAGE = 1
       // This SHAREMGS0 is also divided into small segments the same as RAMGS0 to save data
       SHARERAMGS0_0	: > RAMGS0_0,		PAGE = 1
       SHARERAMGS0_1	: > RAMGS0_1,		PAGE = 1
       SHARERAMGS0_2	: > RAMGS0_2,		PAGE = 1
       SHARERAMGS0_3	: > RAMGS0_3,		PAGE = 1
       SHARERAMGS0_4	: > RAMGS0_4,		PAGE = 1
       SHARERAMGS0_5	: > RAMGS0_5,		PAGE = 1
       SHARERAMGS0_6	: > RAMGS0_6,		PAGE = 1
       SHARERAMGS0_7	: > RAMGS0_7,		PAGE = 1
       SHARERAMGS0_8	: > RAMGS0_8,		PAGE = 1
       SHARERAMGS0_9	: > RAMGS0_9,		PAGE = 1
       SHARERAMGS0_REST : > RAMGS0_REST,    PAGE = 1
    
       SHARERAMGS1		: > RAMGS1,		PAGE = 1
       SHARERAMGS2		: > RAMGS2,		PAGE = 1
       SHARERAMGS3		: > RAMGS3,		PAGE = 1
       SHARERAMGS4		: > RAMGS4,		PAGE = 1
       SHARERAMGS5		: > RAMGS5,		PAGE = 1
       SHARERAMGS6		: > RAMGS6,		PAGE = 1
       SHARERAMGS7		: > RAMGS7,		PAGE = 1
       SHARERAMGS8		: > RAMGS8,		PAGE = 1
       SHARERAMGS9		: > RAMGS9,		PAGE = 1
       SHARERAMGS10		: > RAMGS10,	PAGE = 1
       SHARERAMGS11		: > RAMGS11,	PAGE = 1
       SHARERAMGS12		: > RAMGS12,	PAGE = 1     /* Only Available on F28379D, F28377D, F28375D devices. Remove line on other devices. */
       SHARERAMGS13		: > RAMGS13,	PAGE = 1     /* Only Available on F28379D, F28377D, F28375D devices. Remove line on other devices. */
       SHARERAMGS14		: > RAMGS14,	PAGE = 0     /* Only Available on F28379D, F28377D, F28375D devices. Remove line on other devices. */
       SHARERAMGS15		: > RAMGS15,	PAGE = 0     /* Only Available on F28379D, F28377D, F28375D devices. Remove line on other devices. */
    
    #ifdef __TI_COMPILER_VERSION__
       #if __TI_COMPILER_VERSION__ >= 15009000
        .TI.ramfunc : {} > RAMM0,      PAGE = 0
       #else
       ramfuncs         : > RAMM0      PAGE = 0   
       #endif
    #endif
    
    
    }
    
    /*
    //===========================================================================
    // End of file.
    //===========================================================================
    */
    

  • Cedrick,

    Sorry for the late reply, I was OOO yesterday and still catching up :).  

    I looked through the attached file and made four small changes in the attached version (search for MP in the comments).

    Two were small one-word changes to the stack pointer allocation and its remainder, to match what we currently have in C2000Ware. The fact that our linker file has this aligned to an odd word boundary makes me hesitant to change it.

    The other changes were to RAMGS15 and RAMM1, making both 8 words shorter. There is an erratum related to this: the prefetch mechanism can fetch beyond a valid memory range on the device if the last 8 words are used, which could cause an ITRAP0 issue if that ties back to what the customer observed. The RAMGS15 change was already in our linker file; the RAMM1 change was not. I will file a bug on this one.

    Errata is here   

    There's nothing incorrect with the customer splitting RAMGS0 into smaller blocks per se. I would just keep in mind that all of GS0 is still the same physical block. So if something in the code (or across cores) tried to access GS0_0 and GS0_1 (for example) at the same time, there would be a memory stall for one of the cores. This shouldn't cause an issue like a bad fetch; it is just something to keep in mind. I think the arbitration scheme is such that other HW with access permission would get access to the block in a round-robin fashion if there is contention with another HW block, rather than one requester stalling continuously.

    Best,

    Matthew

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/171/2837xD_5F00_RAM_5F00_lnk_5F00_shared_5F00_cpu1_5F00_reallocateHFB_5F00_TIfeedback.cmd-_2800_1_2900_
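
The 8-word adjustment described above is simple arithmetic on the MEMORY entries. A small sketch, assuming the origins and lengths from the posted file and using hypothetical helper names (`mem_range_t`, `trim_for_prefetch`, `last_usable_word`); the modified .cmd attachment itself is not reproduced here, so the resulting values are derived from the description rather than copied from it.

```c
#include <assert.h>
#include <stdint.h>

#define PREFETCH_GUARD_WORDS 8u  /* from the reply: leave the last 8 words unused */

typedef struct {
    const char *name;
    uint32_t origin;   /* word address */
    uint32_t length;   /* length in 16-bit words */
} mem_range_t;

/* Return the range shortened so that its last 8 words are left unallocated. */
static mem_range_t trim_for_prefetch(mem_range_t r)
{
    assert(r.length > PREFETCH_GUARD_WORDS);
    r.length -= PREFETCH_GUARD_WORDS;
    return r;
}

/* Last word address the linker may still allocate within the range. */
static uint32_t last_usable_word(mem_range_t r)
{
    assert(r.length > 0u);
    return r.origin + r.length - 1u;
}
```

For the posted file this gives RAMGS15 a length of 0x000FF8 (last usable word 0x01BFF7) and RAMM1 a length of 0x0003F8 (last usable word 0x0007F7).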

  • Hello Matthew,

    Thank you for your support here.

    I have shared the information above with our customer, and they said that they still encounter the problem. I've copied their response below.

    "I tried the modified memory allocation file you shared with me yesterday, and it still does not solve my problem of triggering illegal_isr().

    I am wondering whether there would be any performance difference for CPU1 between 1) running CPU1 only and 2) booting CPU2 from CPU1 and giving peripheral control rights to CPU2 from CPU1. In other words, would running both cores affect the performance of CPU1 compared to CPU1 running alone?

    I observed that my control code (ADCA and EPWM interrupts) works perfectly on one core without connecting to CPU2 (and booting up CPU2 in the main C code of CPU1). But when I merged it into the shell that runs both cores, without adding excessive functionality on CPU1, it triggers the illegal_isr()."

    "One more question: since I am using both cores, should I also make these changes (with MP comments) in the .cmd file for CPU2?"

    I hope you could still help us on this matter. Thank you.


    Regards,

    Cedrick

  • Cedrick,

    Yes, I would make the changes to the CPU2 side as well.

    The only difference when running both cores is that there could be simultaneous access of both memories which would cause one of the cores to stall until the access of the other is complete.  This shouldn't cause an illegal ISR.

    From the linker file, the shared GS14 and GS15 are being used to store code (PAGE = 0) on CPU1. Has the customer made sure to either not use GS14/GS15 for CPU .text or define these separately as TYPE = NOLOAD on CPU2? If not, CPU2 could overwrite these memories and the contents won't match what CPU1 expects. I think in our examples we tend to keep PAGE 0 memories as the local Flash or RAMs only.

    Note that this could apply to some of the data memory (PAGE = 1) as well, although that wouldn't cause an illegal ISR, just unexpected variable values.

    Best,

    Matthew
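
One way to check the GS14/GS15 concern is to compare the ranges CPU1 uses for code against the ranges the CPU2 linker file allocates. The sketch below assumes the ranges are copied by hand from the two .cmd files; `range_t`, `ranges_overlap`, `count_conflicts`, and `total_words` are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t origin;  /* word address */
    uint32_t length;  /* length in 16-bit words */
} range_t;

/* True if two half-open ranges [origin, origin + length) share any word. */
static bool ranges_overlap(range_t a, range_t b)
{
    return a.origin < b.origin + b.length && b.origin < a.origin + a.length;
}

/* Count the CPU1 code ranges that CPU2 also allocates. */
static size_t count_conflicts(const range_t *cpu1_code, size_t n1,
                              const range_t *cpu2_used, size_t n2)
{
    assert(cpu1_code != NULL && cpu2_used != NULL);
    size_t conflicts = 0;
    for (size_t i = 0; i < n1; i++) {
        for (size_t j = 0; j < n2; j++) {
            if (ranges_overlap(cpu1_code[i], cpu2_used[j])) {
                conflicts++;
                break;  /* one hit is enough to flag this CPU1 range */
            }
        }
    }
    return conflicts;
}

/* Sum of the lengths of the first n ranges, in 16-bit words. */
static uint32_t total_words(const range_t *r, size_t n)
{
    assert(r != NULL);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += r[i].length;
    }
    return sum;
}
```

With the CPU1 file posted earlier, the four local .text ranges (RAMD0, RAMLS02, RAMLS3, RAMLS4) add up to 0x3000 words, so any .text and .cinit content beyond that has to land in RAMGS14/RAMGS15, which is where a CPU2 allocation of the same addresses would matter.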

  • Hi Matthew,

    Good day. Thank you for your continuous support.

    We got an update from our customer which can be found below. 

    "I checked my .cmd file for CPU2, and I did not "use GS14/GS15 for CPU .text or define these separately as TYPE = NOLOAD on CPU2".

    I also attached my current .cmd files for CPU1 and CPU2 modified based on your feedback.
    One more small request: could the expert engineer share the linker files he is using for CPU1 and CPU2 (libc.a, rts2800_fpu32.lib, F2837xD_Headers_nonBIOS_cpu1.cmd, and F2837xD_Headers_nonBIOS_cpu2.cmd)? I realize that those files might differ between versions of CCS or the compiler.

    Besides the memory issue, I am wondering what else could trigger the illegal_isr() when the code works on one core (the code works on one core without connecting to CPU2 (and booting up CPU2 in the main C code of CPU1 and giving the SPI rights to CPU2)) but fails once I merge it into the shell that runs two CPUs, without adding other functions on CPU1 besides booting CPU2 and giving it rights."
    2837xD_RAM_lnk_shared_cpu1_reallocateHFB_TIfeedback(1).zip

    Thank you for your support.


    Regards,

    Cedrick

  • Cedrick,

    Let's have the customer look at this post(go to the 2nd page where the resolved answer is).  The thread explores a similar issue that is caused by the GEL files, and what happens if there are some out of order processes.  I know from previous experience that during emulation there are some steps that have to be taken when using both CPUs to mimic a normal standalone process flow.

    To clarify, do we know whether CPU1 or CPU2 is issuing the illegal ISR? If the above doesn't resolve it, we may need to single-step through whichever CPU is giving the illegal ISR and backtrack to the offending instruction. Since an illegal ISR is caused by executing code, we could just set a breakpoint and single-step to narrow it down.

    Best,
    Matthew

  • Hi Matthew,

    I've got this feedback from our customer.

    "Thank you for your reply. The illegal_isr() is triggered when I start the debugger to run CPU1 (before starting CPU2). I ran the code line by line, and the illegal_isr() is triggered by different lines depending on the "blocks of code" I put in. (And those "blocks" work properly when I test them separately.)

    And the IPC works properly when I comment out some blocks of my code, as I mentioned previously. I commented out different blocks at different times. I found that as long as the total length of the uncommented part is within some range, it won't trigger illegal_isr()."


    Regards,

    Cedrick

  • Cedrick,

    I'm going to loop in some others in the team to see if we have observed this behavior before.  I'm a bit stumped at the moment.

    Best,

    Matthew

  • Cedrick,

    Others will chime in as well as Matt mentioned, and I have not personally seen this behavior before, but my guess would be that an overflow is happening somewhere. I would suggest to track which variables (specifically pointers, arrays, and any variable that has free range of the memory space) are in the blocks of code that are causing the problem. It’s possible it is just one variable that, once other parts of the code increment it, it starts modifying memory that it shouldn’t.

    Again,  others will provide more suggestions but this could be one possibility.

    Regards,

    Vince
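
To test the overflow theory without single-stepping every line, a suspect array can be bracketed with guard words and written through a checked helper. This is only a sketch: `guarded_buf_t`, `BUF_LEN`, `GUARD_WORD`, and the helper names are hypothetical and would need to be adapted to the customer's actual buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_WORD 0xA5A5A5A5uL  /* arbitrary marker value */
#define BUF_LEN    16u           /* hypothetical buffer size */

typedef struct {
    uint32_t guard_lo;        /* sits just below the data */
    float    data[BUF_LEN];
    uint32_t guard_hi;        /* sits just above the data */
} guarded_buf_t;

/* Set both guard words and clear the data. */
static void guarded_init(guarded_buf_t *b)
{
    assert(b != NULL);
    b->guard_lo = GUARD_WORD;
    b->guard_hi = GUARD_WORD;
    for (size_t i = 0; i < BUF_LEN; i++) {
        b->data[i] = 0.0f;
    }
}

/* True while neither guard word has been overwritten. */
static bool guarded_intact(const guarded_buf_t *b)
{
    assert(b != NULL);
    return b->guard_lo == GUARD_WORD && b->guard_hi == GUARD_WORD;
}

/* Store only if the index is in range; report failure otherwise. */
static bool checked_store(guarded_buf_t *b, size_t index, float value)
{
    assert(b != NULL);
    if (index >= BUF_LEN) {
        return false;
    }
    b->data[index] = value;
    return true;
}
```

Checking `guarded_intact()` at the end of each ISR (or watching the guard words in the CCS expressions window) shows whether something wrote past either end. Because this project runs its code from RAM, a stray write that lands in .text could turn into an illegal instruction later, which would be consistent with the symptom.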

  • Hi Vince,

    Thank you for the response.

    I would like to know if there is an update on this. Any idea when we should receive the input from other colleagues? Thank you.


    Regards,

    Cedrick

  • Hi Cedrick,

    Please expect a response by the middle of next week, as we have someone looking into this issue. Thanks for your patience!

    In the meantime, were you able to get a list of all pointer/array variables in the problem part of the code, and then individually remove them for testing? This could help with debugging in the meantime (it is likely to be a single pointer causing this if overflow related). Essentially, you could track the values and see if they go outside of expected ranges.

    Regards,

    Vince

  • Hi Cedrick,

    I apologize for the delay. After speaking offline, the others were unable to determine any definitive reason for this to occur based on prior experience, but we came up with a couple of possibilities that could be the cause:

    1. Given that the code nests interrupts in the ISR, you could be accidentally causing spurious interrupts from undefined ISRs that lead to this ILLEGAL_ISR() being called.

    Basically, if you modify PieCtrlRegs.PIEIER* for a different group, you will trigger the wrong interrupt. If that interrupt does not have an ISR, you will get an ILLEGAL_ISR().

    Please check every place in the code where PieCtrlRegs.PIEIER* is changed, and make sure it is only changed for that interrupt's own group. For example, the interrupt for EPWM10 should ONLY modify PieCtrlRegs.PIEIER3, because it is in group 3.

    2. One additional possibility mentioned is that if the user overwrites or uses flash during execution, it is possible that calls to the flash write/read could be conflicting. Although you did not mention this, if the code is modifying flash (or even large chunks of memory) at any point, then it is another potential cause of what you are seeing.

    Thanks again for your patience.

    Regards,

    Vince
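
To illustrate the first point: a nested ISR should save, mask, and restore only the PIEIER register of its own group. The sketch below models that on a host with a mock register struct, so it is not device code; on the F2837xD the fields would correspond to `PieCtrlRegs.PIEIER1.all` and `PieCtrlRegs.PIEIER3.all`, and the IER, PIEACK, and EINT/DINT steps of a real nested ISR are left out. `mock_pie_t`, `EPWM10_BIT`, and `epwm10_isr_nesting` are invented names, and the return value exists only so the behavior can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-ins for two PIE enable registers (not the real headers). */
typedef struct {
    uint16_t PIEIER1;   /* group 1, e.g. ADCA1 */
    uint16_t PIEIER3;   /* group 3, e.g. EPWM10 */
} mock_pie_t;

/* INT3.10 (EPWM10) is assumed to map to bit 9 of PIEIER3. */
#define EPWM10_BIT ((uint16_t)(1u << 9))

/* Nesting pattern for an ISR in group 3: touch PIEIER3 and nothing else.
 * allow_mask selects which group-3 sources may interrupt this ISR.
 * Returns the PIEIER3 value that was in effect while the body ran. */
static uint16_t epwm10_isr_nesting(mock_pie_t *pie, uint16_t allow_mask)
{
    assert(pie != NULL);
    uint16_t saved = pie->PIEIER3;                        /* save own group only */
    pie->PIEIER3 = (uint16_t)(pie->PIEIER3 & allow_mask); /* mask within group 3 */
    uint16_t during = pie->PIEIER3;
    /* ... control code would run here with interrupts re-enabled ... */
    pie->PIEIER3 = saved;                                 /* restore before return */
    return during;
}
```

If an ISR in group 3 wrote to PIEIER1 instead (or restored a value saved from a different group), it could enable a source that has no handler installed, which is the situation described in point 1.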