Customer with interrupt latency problem

Other Parts Discussed in Thread: TMS320C6727, CCSTUDIO

All:

I got this e-mail from a customer. I have asked him to provide the BIOS version.

Nick Lethaby

Hello Mr. Lethaby,

My name is Christian Fuchs. I’m working together with my colleague Franz Sommerauer for Voith Turbo in Austria. We are using TMS320C6727 with DSP/BIOS for a motor control unit for trams.

We currently have a problem with the start of an ISR being delayed by 20µs. This causes a missed deadline when we run at the maximum interrupt rate (150µs).

I hope you can give us a hint. We have been searching and debugging for 2 months already.

 

Here is a more detailed explanation of our system:

The DSP gets three interrupts (HWI6, HWI11 and HWI12) generated by an FPGA. Interrupt nesting is allowed and necessary (e.g. HWI6 can interrupt HWI11 and HWI12, but not the other way round). All HWIs are configured to use the DSP/BIOS dispatcher (because of nesting and calls to DSP/BIOS functions). All three can have different rates.

If HWI6 and HWI11 both run at the same highest rate with a delay of 30µs between them, the interrupt latency is about 2µs. But if both interrupts occur within a short time span, the latency of both HWIs rises to 20µs.

Can this be explained with the DSP/BIOS overhead of interrupt handling?

Or is there another problem which I’m not aware of?

Thank you in advance.

 

  • Hello,

    we are using DSP/BIOS 5.41.09.34 with CCS v3.3 and CGT7.0.4

    Thank you

    Best regards

    Christian Fuchs

  • I suppose the latency of 20us for HWI11 can be explained by the fact that HWI6 can't be interrupted by HWI11.  There are two scenarios for situation where HWI6 & HWI11 "occur in a short time span"...

    1) HWI6 followed shortly by HWI11: If HWI11 happens shortly after HWI6 (by shortly, I mean just after DSP/BIOS has started HWI6 processing) then HWI11 will be held off for the entire duration of HWI6 (including the DSP/BIOS post-processing, which involves complete unrolling of the stack).  This means that the response latency for HWI11 depends on the sum of DSP/BIOS HWI pre-/post-processing plus the execution time of the HWI6 body itself.

    2) HWI11 followed shortly by HWI6: If HWI6 happens shortly after HWI11 starts to be processed by DSP/BIOS then HWI6 will be held off only for the duration of the DSP/BIOS HWI pre-processing (i.e., dispatcher) of HWI11, but HWI6 would start getting serviced by DSP/BIOS before the body of HWI11 executes (since you've said that HWI6 can interrupt HWI11).  Again, the response latency for HWI11 is dependent on the execution time of the HWI6 body plus the DSP/BIOS pre-/post-processing.

    Neither of these scenarios would explain a 20us latency for HWI6

    One possibility that occurs to me...DSP/BIOS HWI processing involves pushing a "HWI context" onto the stack that's in use when the interrupt occurs, followed by a switch to the HWI stack (which can also be referred to as the ISR stack, which is also used for all SWI processing).  For scenario 2) above, HWI11's context will be pushed on the "stack in use" (assuming idle TSK stack) and then HWI6's context will be saved on the HWI stack, so if the HWI stack is in very slow, uncacheable memory then this 2nd level of HWI processing will be much slower than the first level.  Here I assume that the 1st level of HWI processing happens while on a fast stack, since your nominal processing latency of 2us is low.

    If I've made some incorrect assumptions about your system, please clarify and I can re-evaluate.

    Regards,

    - Rob

  • Rob: how can I check where the stacks of my HWIs are linked (we also have SDRAM on our board)? Is there a separate section for this? (The .stack section in the BIOS config is configured in IRAM.)

    For completeness I should mention that we also have configured HWI4 (BIOS timer), HWI5 (timer1 interrupt), HWI8 (Dmax transfer completion interrupt) and HWI13 (external interrupt).

    In our last test we changed the FPGA logic so that no HWI11 interrupt occurs within 10ms before or 10ms after an HWI6 interrupt. But even then we see this long delay.

    We have also tried disabling the dispatcher for HWI6 (and adding the interrupt keyword to the ISR), but this didn't change anything either.

    Can it be that DSP/BIOS disables HWI under some circumstances? In our code there is no call to HWI_disable().

    Thanks and regards,

    Christian

  • Markus Glasl said:

    Rob: how can I check where the stacks of my HWIs are linked (we also have SDRAM on our board)? Is there a separate section for this? (The .stack section in the BIOS config is configured in IRAM.)

    Yes, the .stack section is specified for holding the HWI stack.  There are also a few symbols that reflect the HWI stack - HWI_STKTOP/GBL_stackbeg for the base address of the stack block, and HWI_STKBOTTOM/GBL_stackend for the end address of the stack block (the HWI one is SP-aligned, the GBL one is the exact byte-based end).  You can see the value of these symbols in the generated .map file, which should be at <appdir>/package/cfg/<subpath-of-executable>/<executable-name>.map.

    Markus Glasl said:

    For completeness I should mention that we also have configured HWI4 (BIOS timer), HWI5 (timer1 interrupt), HWI8 (Dmax transfer completion interrupt) and HWI13 (external interrupt).

    The presence of all these active interrupts will definitely affect the overall system interrupt response latency.  If your tight-deadline interrupt (e.g., HWI11) fires and starts getting serviced by DSP/BIOS, and it allows nested interrupts, then other interrupts that come shortly thereafter will hold off execution of HWI11, and such "HWI preemption" can occur several times, leading to many HWIs getting serviced before DSP/BIOS eventually gets back to servicing HWI11.

    Markus Glasl said:

    Can it be that DSP/BIOS disables HWI under some circumstances? In our code there is no call to HWI_disable().

    DSP/BIOS will certainly be disabling HWI during certain critical sections within it.  The key for applications is the "worst case HWI disabling", and DSP/BIOS strives to keep this as low as possible.  Your DSP/BIOS product should come with some published "benchmarks", which, in addition to "worst case HWI disabling", contain benchmarks such as TSK response latency and SEM_post-to-TSK execution times (cycle counts).

    Regards,

    - Rob

  • Markus --

    How fast is your 674x device running?  I want to figure out how many instructions 20us corresponds to.   Can you send some code snips for your HWI_dispatchPlug() calls?  And some screen shots of the configuration (or zip up and attach your .tcf file) if you are plugging them via the configuration tool?   You've implied that you have set up all interrupts to allow nesting except for interrupt 6. I just want to make sure that you are doing the "all" and "self" options correctly.

    Do you have interrupt  4 or 5 in your system (0 and 1 are reset and NMI)?   If all of these interrupts (4, 5, and 6) are flagged at the same time, then the CPU would handle them in this priority order:

    interrupt 4, run dispatcher until dispatcher has a chance to reenable other interrupts

    interrupt 5, run dispatcher until dispatcher has a chance to reenable other interrupts

    interrupt 6, will finally run.

    You might want to try setting all of your interrupt masks for the other ISRs to disable all other interrupts except interrupt 6.  This will minimize the chance of the other ISRs in your system getting in the way of 6.

    You can also map your most important interrupt to interrupt 4 so that it will always be highest priority when more than one is pending.

    One last thing.  Are you using RTDX?   You should disable RTDX if possible.

    Regards,
    -Karl-

  • Karl Wechsler said:

    How fast is your 674x device running?  I want to figure out how many instructions 20us corresponds to.   Can you send some code snips for your HWI_dispatchPlug() calls?  And some screen shots of the configuration (or zip up and attach your .tcf file) if you are plugging them via the configuration tool?   You've implied that you have set up all interrupts to allow nesting except for interrupt 6. I just want to make sure that you are doing the "all" and "self" options correctly.

    Our device (btw. it's a C6727) is running at 250MHz. We are using the DSP/BIOS graphical configuration tool to set up all HWIs. We don't have calls to HWI_dispatchPlug() in our code. Do we need to? I've attached the .tcf file. 0172.int_3.zip

    Karl Wechsler said:

    Do you have interrupt  4 or 5 in your system (0 and 1 are reset and NMI)?

    Yes, we have HWI 4 (Timer0 for DSP/BIOS), 5 (Timer1 for some tasks), 6 (external Int), 8 (DMax transfer complete int) ,11 (external Int), 12 (external Int), 13 (external Int) active.

    Karl Wechsler said:

    You might want to try setting all of your interrupt masks for the other ISRs to disable all other interrupts except interrupt 6.

    HWI 6 has already Interruptmask = All. I think this should prevent any other HWI to run? HWI 11 only allows HWI6 to preempt, and HWI12 allows HWI6 and HWI11 to preempt. All other HWIs have Interruptmask = Self. Is this ok?

    Karl Wechsler said:

    You can also map your most important interrupt to interrupt 4 so that it will always be highest priority when more than one is pending.

    Is that possible on C6727 device? I thought that HWI4 is always Timer0 Int.

    Karl Wechsler said:

    One last thing.  Are you using RTDX?   You should disable RTDX if possible.

    Yes we are using it. And we have also tried to disable it, but with no positive effect.

    Regards,
    Christian
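
As an illustration of the mask arrangement described in this post, the following sketch builds a dispatcher mask that blocks everything except a chosen set of interrupts. It assumes that a set bit n in the mask means "keep interrupt n disabled while this ISR runs" and that the mask covers 16 interrupt lines; the helper name `mask_all_except` is hypothetical and not part of DSP/BIOS.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bit n set = interrupt n stays disabled while the ISR runs. */
#define ALL_INTERRUPTS 0xFFFFu

/* Build a mask that blocks every interrupt except those listed in allowed[]. */
static uint16_t mask_all_except(const int *allowed, int count)
{
    uint16_t mask = ALL_INTERRUPTS;
    int i;

    for (i = 0; i < count; i++) {
        assert(allowed[i] >= 0 && allowed[i] < 16);
        mask &= (uint16_t)~(1u << allowed[i]);   /* leave this interrupt enabled */
    }
    return mask;
}
```

Under these assumptions, HWI11 (only HWI6 may preempt) would get 0xFFBF and HWI12 (HWI6 and HWI11 may preempt) would get 0xF7BF.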

     

  • During my last debug session I noticed something interesting:

    If HWI6 (which disables all other HWIs while running) preempts HWI11 (which disables all HWIs except HWI6), there is no delay of HWI6. But if HWI6 doesn't preempt HWI11, there is a delay. For this debug session I disabled HWI5 and also RTDX and NMI, so only HWI4 (Timer0 Int for DSP/BIOS; interrupt mask=self) remains that could prevent HWI6 from running.

    Could this be the case?

    Regards

    Christian

  • Christian --

    I made a mistake in my earlier post.  The 672x has a fixed interrupt mapping and it is not possible to remap different interrupt sources. I was thinking you were on the 674x which allows mapping different external events to any of  the 12 core interrupts.

    20us corresponds to 5000 instructions at 250MHz so something is out of whack here.   BIOS latency should be 1-200 cycles unless code is running from very slow external memory with lots of wait states.

    What compiler options are you using to build your application code?   The compiler has an option to specify the max time that the compiler will disable interrupts.   We use -mi10 to build BIOS, which means that the compiler will not disable interrupts for more than 10 instructions.  On the 6x, the compiler can generate long loops that cannot be interrupted.

    (1) I would try changing your interrupt mask to 0xffbf for _all_ interrupt sources except for INT6 which I'd set to all.

    (2)  I would add -mi10 or -mi100 (where # corresponds to your tolerance) to your compile line and recompile all of your source files.

    Regards,
    -Karl-
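
The arithmetic behind the "5000 at 250MHz" figure can be sketched as a small helper. It assumes one CPU cycle per clock period, so a delay in microseconds times the clock in MHz gives a cycle count; the function name `us_to_cycles` is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a delay in microseconds to CPU cycles at a given clock in MHz.
 * 1 MHz = 1 cycle per microsecond, so the result is a plain product. */
static uint32_t us_to_cycles(uint32_t microseconds, uint32_t cpu_mhz)
{
    assert(cpu_mhz > 0);
    return microseconds * cpu_mhz;
}
```

For example, `us_to_cycles(20, 250)` gives 5000, and the nominal 2µs latency reported earlier corresponds to 500 cycles.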

  • Karl,

    Karl Wechsler said:

    What compiler options are you using to build your application code?   The compiler has an option to specify the max time that the compiler will disable interrupts.   We use -mi10 to build BIOS, which means that the compiler will not disable interrupts for more than 10 instructions.  On the 6x, the compiler can generate long loops that cannot be interrupted.

    Here are our compiler options:

    -o2 -fr"C:\temp\INT_3_ccslink_von_Franz\CustomMW" -i"c:\temp\INT_3_ccslink_von_Franz" -d"MODEL=INT_3" -d"NUMST=1" -d"NCSTATES=0" -d"HAVESTDIO=" -d"ONESTEPFCN=0" -d"TERMFCN=0" -d"MAT_FILE=0" -d"MULTI_INSTANCE_CODE=0" -d"INTEGER_CODE=0" -d"MT=0" -d"TID01EQ=0" -d"C6727" -d"__TICCSC__" -d"RT" -d"USE_RTMODEL" -mi -ms3 -mv67p

    Karl Wechsler said:

    (1) I would try changing your interrupt mask to 0xffbf for _all_ interrupt sources except for INT6 which I'd set to all.

    (2)  I would add -mi10 or -mi100 (where # corresponds to your tolerance) to your compile line and recompile all of your source files.

    The new interrupt masks didn't help. As you can see in our compiler options we have already used -mi flag (without a threshold). Changing the threshold to 10 or 100 also didn't help.

    Is it possible to disable the Timer0 (DSP/BIOS) interrupt? Just to check that it doesn't influence the system!

    Regards,
    Christian

  • Christian  --

    You can disable the BIOS timer with the following lines in your .tcf file:

     

    bios.PRD.USECLK = 0;

    bios.CLK.ENABLECLK = 0;

     

    BIOS does not require the timer interrupt, but if you have code that relies on timeout (TSK_sleep() or SEM_pend() with timeout), then you might have to rework your application.

    It is also possible to drive the PRD module (which is normally driven by CLK) from another periodic source.  If you have another periodic ISR, you can add a call to PRD_tick() from that ISR to give the PRD module its tick.

    Regards,
    -Karl-

  • Karl,

    deactivating BIOS timer brought no improvement. But thanks for the explanation.

    I've now made a small program with 3 Interrupts each calculating a for-loop with 1000 iterations and one task also calculating a for-loop. Here I can see that I'm getting a delay of the interrupts if they preempt a running task. If the task is blocked (idle task is running) there is no delay.

    The very strange thing is that if I activate full symbolic debug, I don't get the delay even when the ISR interrupts the task. I've also deactivated optimization. Here are the compiler options I used:

    -fr"C:\temp\INT_3_ccslink_von_Franz\CustomMW" -i"c:\temp\INT_3_ccslink_von_Franz" -i"C:\CCStudio_v3.3\C6000\csl_C672x_03_00_09_00\dsp\inc" -i"C:\CCStudio_v3.3\C6000\csl_C672x_intc_03_00_09_00\dsp\inc" -d"MODEL=INT_3" -d"NUMST=1" -d"NCSTATES=0" -d"HAVESTDIO=" -d"ONESTEPFCN=0" -d"TERMFCN=0" -d"MAT_FILE=0" -d"MULTI_INSTANCE_CODE=0" -d"INTEGER_CODE=0" -d"MT=0" -d"TID01EQ=0" -d"C6727" -d"__TICCSC__" -d"RT" -d"USE_RTMODEL" -mi100 -mv67p

    Unfortunately we cannot disable optimization and enable full symbolic debug in our project, because then it becomes too big and the run time grows too large.

    So, what does the compiler do differently when "full symbolic debug" is active compared to when it is not? Could this be a compiler/optimizer issue?

    Regards,

    Christian

  • Is it possible to send the test case across?   zip it up and post to this site?   Please be sure to include pre-built .out file and .map file and .c files.  If this test case runs on the 6727 EVM, even better, but not required.   I can take a look. 

    A few more questions/ideas:

    (1)  what does the for loop in the task look like?  Is it a small loop?   Can you look at the assembly code in CCS and see if there's anything unusual about the assembly code?  On C6x, interrupts are disabled for the 4 or 5 (I forget) instructions right after a branch instruction.  If all of the instructions in the loop are in shadow of a branch, then no interrupts can come in until the loop completes.   The -mi100 option is supposed to prevent the compiler from generating code like this.

    (2) can you try adding a single asm(" nop"); instruction within the loop?  This will change the compiled code slightly but would be a good data point.  Be sure to include a space before the nop instruction since the assembler needs leading space and will otherwise treat nop as a symbol.

    (3)  do you have any assembly code of your own in this project?

    Thanks,
    -Karl-

  • Karl Wechsler said:

    Is it possible to send the test case across?   zip it up and post to this site?   Please be sure to include pre-built .out file and .map file and .c files.  If this test case runs on the 6727 EVM, even better, but not required.   I can take a look. 

    I've created a project without any code from our real project. This I'm posting here: 0312.INT_3.zip

    This software runs on C6727 EVM from DSP Weuffen. A .gel file needed for the EVM is also in the zip. You can trigger HWI6 with a pulse on pin UHPI_HAS. You can see if HWI6 ISR is running on pin UHPI_HD18 (goes to low if ISR is running) and TaskA on pin UHPI_HD17 (goes to low if TaskA is running).

    In this project the delaytime has reduced to 11µs (worst).

    Karl Wechsler said:

    (1)  what does the for loop in the task look like?  Is it a small loop?   Can you look at the assembly code in CCS and see if there's anything unusual about the assembly code?  On C6x, interrupts are disabled for the 4 or 5 (I forget) instructions right after a branch instruction.  If all of the instructions in the loop are in shadow of a branch, then no interrupts can come in until the loop completes.   The -mi100 option is supposed to prevent the compiler from generating code like this.

    Yes, it's a small loop, but it doesn't look strange. I've set the -mi option without a threshold.

    Karl Wechsler said:

    (2) can you try adding a single asm(" nop"); instruction within the loop?  This will change the compiled code slightly but would be a good data point.  Be sure to include a space before the nop instruction since the assembler needs leading space and will otherwise treat nop as a symbol.

    Tried it, but delay still remains.

    Karl Wechsler said:

    (3)  do you have any assembly code of your own in this project?

    Do you mean assembly code of our motor control project? If yes, I'm not allowed to post it here, sorry.

    Thanks,
    Christian

  • Karl,

    have you had time to look at the demo project yet? Did you find anything?

    Thanks,

    Christian

  • I was out of the office part of last week and didn't have time to review this completely.  I reviewed the code and reviewed the disassembly output from 'dis6x' to look for any non-interruptible sections of code but the loops look OK.  I don't have any great ideas.  I will look into this some more tomorrow and get back if I see anything.

    -Karl-

  • I looked into this more today but still no answers.  10us corresponds to 2500 instruction cycles at 250MHz.   This is a lot.   Your setup looks correct.

    [1] How are you measuring the jitter?  Are you triggering on the signal that generates the interrupt and then monitoring the I/O pin flipped by the INT6 ISR?  And you are seeing 10us of jitter in this?

    [2] I see that you are doing this in your code:

      pUHPI_REG[5] &= ~0x00020000;

    This translates into a read, modify and write operation in assembly language.   And it is possible for this to be interrupted in this window.  I'm not sure if there's a problem with this.  I've thought it through and don't think there is a problem, but I wonder if you could surround these calls with something like this to make sure.

     

    key = HWI_disable();

      pUHPI_REG[5] &= ~0x00020000;

    HWI_restore(key);

     

    I don't have the same EVM as you.  I have a different EVM.  It would take me some time to reproduce on my EVM.

    [3] are you certain that you are running at 250MHz?  Is there a chance that you are running slower?     You can use CLK_getltime() on the target to get the current time.  By default, this tick increments every 1ms.
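
To make the read-modify-write window in point [2] concrete, here is a host-side sketch that spells out the three steps hidden in `reg &= ~0x00020000`. The variable `fake_reg` is a stand-in for the memory-mapped register (which register `pUHPI_REG[5]` actually is has not been stated in this thread), and `clear_pin_bit` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define PIN_MASK 0x00020000u   /* bit 17, the value used in the posted code */

/* Stand-in for the memory-mapped register; on the target this would be
 * the location addressed as pUHPI_REG[5]. */
volatile uint32_t fake_reg = 0xFFFFFFFFu;

/* The single C statement "reg &= ~PIN_MASK" expands to these three steps. */
static void clear_pin_bit(volatile uint32_t *reg)
{
    uint32_t value;

    assert(reg != 0);
    value = *reg;         /* 1. read   */
    value &= ~PIN_MASK;   /* 2. modify */
    *reg = value;         /* 3. write  */
}
```

If an ISR changed other bits of the same register between steps 1 and 3, step 3 would overwrite that change. That is the window the HWI_disable()/HWI_restore() pair closes.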

  • Karl Wechsler said:

    [1] How are you measuring the jitter?  Are you triggering on the signal that generates the interrupt and then monitoring the I/O pin flipped by the INT6 ISR?  And you are seeing 10us of jitter in this?

    Correct! But not every interrupt has 10µs jitter. Only if the isr preempts a task.

     

    Karl Wechsler said:

    [2] I see that you are doing this in your code:

     

      pUHPI_REG[5] &= ~0x00020000;

    This translates into a read, modify and write operation in assembly language.   And it is possible for this to be interrupted in this window.  I'm not sure if there's a problem with this.  I've thought it through and don't think there is a problem, but I wonder if you could surround these calls with something like this to make sure.

     key = HWI_disable();

      pUHPI_REG[5] &= ~0x00020000;

    HWI_restore(key);

    I've tried it, but it didn't help.

    Karl Wechsler said:

    [3] are you certain that you are running at 250MHz?  Is there a chance that you are running slower?     You can use CLK_getltime() on the target to get the current time.  By default, this tick increments every 1ms.

    Yes, I'm sure. I've measured the SDRAM clock, which is configured to be 1/3 of SYSCLK. The oscilloscope reads 83.33333333MHz.

  • Is the external/INT6 interrupt periodic?  If it is periodic, or if you can make it periodic, then we might be able to catch the problem as follows:

     

    Use CLK_gethtime() to get the high resolution time in the ISR.   And use a variable to keep track of the difference since the last interrupt.  If the delta is greater than some threshold, halt the processor.  The ‘IRP’ register should contain the interrupt return address.   I think this will point to the point where interrupts were reenabled after the long disable time.   If you can catch this, then please attach another .zip with the updated project (including the .out file), along with a screenshot of the CPU registers – IRP and B3 are key but all of them would be ideal.   We can disassemble your .out file and look at the code at/near IRP and see if there’s a clue.

     

    Code should be something like the following which I didn’t actually compile, but enclosed for reference (sorry, but formatting got messed up with copy-paste)

     

    #include <std.h>
    #include <clk.h>

    #define THRESHOLD 12345   /* replace with real threshold */

    void isr()
    {
        /* CLK_gethtime() returns an unsigned value (LgUns) */
        static LgUns oldtime = 0;
        LgUns newtime;

        newtime = CLK_gethtime();

        /* handle the initial case */
        if (oldtime == 0) {
            oldtime = newtime;
            return;
        }

        /* unsigned subtraction stays correct if the counter wraps */
        if ((newtime - oldtime) > THRESHOLD) {
            asm("stop: nop");        /* put a breakpoint here */
        }

        oldtime = newtime;
    }

  • I've set the HWI to occur periodically every 200µs and added your code to measure the time between ISR calls. I've also added a conversion from CLK_gethtime() to µs. The threshold is 210µs. If the delta time is bigger than that threshold, the CPU stops at a breakpoint as you suggested.

    I attach the project here (it's not for the EVM because it was easier for me to test on our custom board, but the code is essentially the same): 5164.INT_3_time_measure.zip

    In the zip file there is also a register.png screenshot which shows the registers after breaking. The IRP points somewhere in taskA() function. I've repeated the measurement 20 and more times and the IRP is always pointing to the same function.
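
The conversion from high-resolution counts to µs mentioned above is not shown in the thread; one way it could look is sketched below. The helper names are hypothetical, and `counts_per_ms` is assumed to come from the target (DSP/BIOS is understood to provide CLK_countspms() for this, but check the API reference for your version). The value 125000 in the example is purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Counts elapsed between two readings of a free-running 32-bit counter.
 * Unsigned subtraction wraps modulo 2^32, so one wraparound is handled. */
static uint32_t elapsed_ticks(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}

/* Convert counts to microseconds given the number of counts per millisecond.
 * The 64-bit intermediate avoids overflow in ticks * 1000. */
static uint32_t ticks_to_us(uint32_t ticks, uint32_t counts_per_ms)
{
    assert(counts_per_ms > 0);
    return (uint32_t)(((uint64_t)ticks * 1000u) / counts_per_ms);
}
```

With an illustrative 125000 counts per millisecond, a delta of 25000 counts is 200µs and a delta of 26250 counts is 210µs, matching the period and threshold described in this post.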

  • Hi Markus --

    Would it be possible for you to host a WebEx where you could reproduce the issue and we can debug it with you live?   If so, please send me a friend request and we can connect and I can send you conference # and WebEx information.   Can we do this on Thursday morning 8AM (California time).  I think this is late afternoon for you?


    In the meantime, I have some more questions and ideas to continue to narrow this down.

    (1)  You have commented out the calls to SEM_post() for taskB and taskC.  I think this removes them from the equation which is good.

    (2)  Do you require ISR11 and ISR12 to reproduce this?  Are those 2 ISRs being actively triggered?  If not, can you remove them, just to simplify like you did with taskB/C?

    (3)  Can you please add some global variables to your program.   2 for each ISR?   t_ISR5_enter, t_ISR5_exit, t_ISR6_enter, etc?   And save CLK_gethtime into these variables as the first and last thing you do in each ISR?   Would be useful to review these variables when the problem occurs to see if there's some relationship to ISRs happening atop one another.  

    (4)  It would also be useful to store IER, IRP and CSR into similar global variables at beginning of each ISR.   This might help us see if the ISRs are preempting one another as expected.   And that IER is as we expect.   You can get to these registers from 'C' using the following:

    extern cregister volatile unsigned int IRP;
    extern cregister volatile unsigned int IER;
    extern cregister volatile unsigned int CSR;

    (5)  I notice that the stack for taskA is in SDRAM, while the stack for the tBaseRate task is in IRAM.   Both of these tasks are pri=4.  And tBaseRate task simply posts a semaphore that taskA is pending on.  It would be interesting to add a delay in tBaseRate (similar to what you have in taskA) to see if the IRP when you have the problem is ever in tBaseRate task.  You say it is always in taskA which might be a clue. 

    (6)  It would also be interesting to try placing the stack for taskA in IRAM to see if that makes a difference.

     

    Please try to do 1-6 before Thursday if you can.   We can review on this forum ahead of the call on Thu and debug further on call if we need to.

    Thanks,
    -Karl-
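
Points (3) and (4) could be combined into one trace record per ISR, roughly as sketched here. The struct and function names are hypothetical. On the target, the caller would pass CLK_gethtime() and the IER, IRP and CSR control registers; the sketch takes them as plain arguments so that it also compiles on a host.

```c
#include <assert.h>
#include <stdint.h>

/* One record per ISR: entry/exit timestamps plus a register snapshot. */
typedef struct {
    uint32_t t_enter;
    uint32_t t_exit;
    uint32_t ier;
    uint32_t irp;
    uint32_t csr;
} IsrTrace;

/* Call as the first thing in the ISR. */
static void trace_enter(volatile IsrTrace *t, uint32_t now,
                        uint32_t ier, uint32_t irp, uint32_t csr)
{
    assert(t != 0);
    t->t_enter = now;
    t->ier = ier;
    t->irp = irp;
    t->csr = csr;
}

/* Call as the last thing in the ISR. */
static void trace_exit(volatile IsrTrace *t, uint32_t now)
{
    assert(t != 0);
    t->t_exit = now;
}

/* Returns 1 if 'inner' ran entirely between the entry and exit of 'outer',
 * i.e. it preempted 'outer'. Differences are taken relative to the outer
 * entry time so a single counter wraparound does not break the comparison. */
static int ran_inside(const volatile IsrTrace *inner,
                      const volatile IsrTrace *outer)
{
    uint32_t outer_len   = outer->t_exit  - outer->t_enter;
    uint32_t inner_start = inner->t_enter - outer->t_enter;
    uint32_t inner_end   = inner->t_exit  - outer->t_enter;

    return inner_start <= inner_end && inner_end <= outer_len;
}
```

When the breakpoint hits, comparing the records in a watch window shows whether one ISR was running on top of another at the time of the long delay.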

  • Karl,

    your last point, "(6) placing the stack for taskA in IRAM", solved the problem for the demo project. The delay is reduced to 1µs. This is great!

    So I've tried the same thing in our real project (putting the stack of every task in IRAM). Here it also helped a lot: the delay went from 20µs down to 5µs. This is very helpful, but is 5µs still too long?

    I think debugging my demo project doesn't make sense anymore (because it's working fine now), but if it is OK for you, I'll set up our real project with the suggested changes from your last post, and we can debug that.

    I've sent you a friend request. But I'm not familiar with WebEx. Do I need a special application for that?

    Thanks in advance

    Christian

    PS: I'm using the account of my colleague Markus Glasl. Every post until now was from me.

  • Hi Christian --

    I wish I would have thought of this sooner.   Rob alluded to this earlier but we didn't drive that path.   I think you should make every effort to have all your stacks and critical real-time data reside in on-chip memory.   The 672x has a program cache which will help if you have code in external memory, but I think you should work to keep data in internal memory.   Can you review your application and .map file to make sure you have all data in on-chip memory? 

    I'm not sure if a WebEx makes sense at this point.  I'm not a hardware guy and don't know what we'd debug.   I think the root cause of the problem is data memory placement.  Can you review your memory .map and try to get all data into on-chip memory?   The next thing to do would be to move the .bios section and your critical ISR code to on-chip memory.  This would avoid any cache misses for .bios and your ISRs.  You can place code in specific sections using #pragma CODE_SECTION (or some such -- check compiler manual) and then place your critical sections in on-chip memory using second linker .cmd file.  The linker can accept multiple linker .cmd files, so you can place your sections in separate file.  If you place all your critical code in on-chip memory and all stacks and critical data on-chip, I think your latencies will be good.

    Regards,
    -Karl-

    P.S. BTW, WebEx is an app that you run on your machine that allows you to share applications with people at remote locations.   You install a small app on your side and enable external access.
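
The section placement described in this reply might look roughly like the sketch below. The section names `.critical_text` and `.critical_data`, the function `hwi6_isr` and the variable `isr_count` are made up for illustration, and the memory segment name IRAM is taken from the BIOS configuration mentioned earlier in the thread. The pragmas are guarded so that the file also builds with a non-TI compiler; check the compiler manual for the exact pragma syntax in your CGT version.

```c
/* Assumption: __TI_COMPILER_VERSION__ is predefined by the TI code
 * generation tools; other compilers skip the pragmas entirely. */
#ifdef __TI_COMPILER_VERSION__
#pragma CODE_SECTION(hwi6_isr, ".critical_text")
#pragma DATA_SECTION(isr_count, ".critical_data")
#endif

volatile unsigned int isr_count = 0;

/* Time-critical ISR body, intended to be linked into on-chip memory. */
void hwi6_isr(void)
{
    isr_count++;
}

/* Hypothetical second linker command file, passed to the linker alongside
 * the generated one, placing both sections in on-chip RAM:
 *
 *   SECTIONS
 *   {
 *       .critical_text > IRAM
 *       .critical_data > IRAM
 *   }
 */
```

After linking, the .map file should show both sections at IRAM addresses, which is a quick way to confirm the placement took effect.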

  • Hi Karl,

    We've already placed .bios section and ISR code in IRAM. Also all stacks (system stack and all task stacks) are in IRAM.

    Unfortunately our project is too big to fit into IRAM completely. The code sections of the tasks and most of the data sections are in external RAM. We have to find a compromise between external RAM usage and latency.

    I hope it is OK if we come back to you if we have problems again.

    Thank you

    Regards

    Christian

  • Hi Karl,

    I still don't understand what instruction cache (C6727 doesn't have a data cache) has to do with stack in internal/external RAM?

    You wrote: "put all code and data into on-chip memory to avoid cache misses". But stack isn't in program cache. So the only reason that stack in IRAM helps is that IRAM has a faster access time.

    Another question regarding the instruction cache: If I have a function in external RAM which calls another function in IRAM, do I get a cache miss at the time the call occurs? If yes, how does it affect the run time of my program?

    Regards

    Christian

  • Hi Christian --

    The instruction cache has nothing to do with the stack placement.   I was more referring to the 672x architecture in general.  The 672x has an instruction cache which should help when code is placed externally.  It will make such code run faster since it will be loaded in cache which has fast access times (after initial slow read).   The 672x does not have a data cache.  So I think you should try to place critical data and stacks on chip and use remaining on chip for .bios and critical ISRs.  Then less critical code can go external.  External code will be cached to help with performance.

    Code in external memory will be loaded to cache (if not already in the cache).  Calling code that resides in IRAM should not cause a cache miss. 

    -Karl-