Other Parts Discussed in Thread: TMS320C6748
Hi,
We are developing an application for a custom production board based on the C6742 DSP, and we are having trouble fitting our application code into the internal SRAM of this DSP. We launch the application through an AIS NOR boot, which by default configures the internal memories L1D and L1P as cache (32KB each) and L2 as RAM (64KB). In addition to these internal memories, our board has an external SDRAM connected through the EMIF (running at around 24MHz).
During development we have been using the TMS320C6748 DSP Development Kit (LCDK), whose C6748 DSP has a larger L2 memory (256KB) than the C6742. We have run several tests with all the code allocated in this larger L2 memory, and they satisfy the timing constraints defined for the application.
The problem is that our application code (around 200KB) is far too big to fit completely in the 64KB L2 SRAM of the final C6742 DSP, and we want to minimize the code placed in the external SDRAM because of the timing constraints mentioned above. We have roughly identified the critical sections of the application code, the ones to which those timing restrictions mostly apply, and they possibly will not fit completely into L2 memory. So, assuming we will need to move part of this code out of L2, we have considered two possibilities:
- Place the most critical sections of the code in internal L2 memory and move the rest to the external SDRAM. This will leave part of the critical code (not the most critical, but still timing-sensitive) in a significantly slower memory, which will possibly affect the timing performance.
- Configure part of the L1D and L1P memories as RAM so that all critical code fits within the internal memories (L1 and L2). The goal would be to keep enough L1 memory available as cache so that code execution does not slow down, while avoiding placing critical code in the external SDRAM.
The second possibility could be the optimal one, but it raises several questions:
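To put rough numbers on the two options, here is a small budget sketch using the figures quoted above. It is only an illustration under simplifying assumptions: it treats the whole 64KB of L2 as available for code (ignoring data, stack and heap), it assumes L1P RAM can hold executable code, and the function name `code_spill_bytes` is made up for this post.

```c
#include <assert.h>

enum { KB = 1024 };

/* Figures quoted in this post. */
#define APP_CODE_BYTES  (200 * KB)  /* approximate application code size */
#define L2_RAM_BYTES    (64 * KB)   /* C6742 L2 RAM */
#define L1P_TOTAL_BYTES (32 * KB)   /* L1P, all cache by default */

/* Bytes of code that still have to live in external SDRAM when
 * l1p_cache_bytes of L1P are kept as cache and the rest is used as RAM.
 * Assumption: L2 is used for code only (no data, stack or heap). */
static long code_spill_bytes(long l1p_cache_bytes)
{
    long internal;
    long spill;

    assert(l1p_cache_bytes >= 0 && l1p_cache_bytes <= L1P_TOTAL_BYTES);

    internal = L2_RAM_BYTES + (L1P_TOTAL_BYTES - l1p_cache_bytes);
    spill = APP_CODE_BYTES - internal;
    return spill > 0 ? spill : 0;
}
```

With these assumptions, keeping all of L1P as cache leaves about 136KB of code in SDRAM, and even giving up all of L1P as RAM still leaves about 104KB there, so only the critical part can realistically stay internal.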
1. As far as I know, there are two ways to change the default configuration of L1 from cache to RAM: either instructing the bootloader to change that configuration, or doing it at runtime in the application code itself. I do not know whether the former is possible with the AIS NOR boot; at least I did not see any reference to it in the bootloader app note. I have seen some alternatives in similar threads on this topic, such as a secondary bootloader, but I think this would mean discarding the AIS generation tool, am I wrong? On the other hand, I have seen an example of the runtime approach in the Cache User Guide, section 2.7, but I do not know whether it is applicable to our case. In any case, I am wondering how tricky this would be. Do you have any advice on that?
2. Assuming the configuration can be done, what would be the criteria for allocating code and data to the L1P and L1D memories? Is it enough to copy, for instance, .text sections (executable code) to L1P and .far sections (static variables) to L1D, or should other considerations be taken into account?
3. I have seen in some GEL and .cmd files, and also in the datasheet of the C6742, that the L2, L1P and L1D memories are often referred to by two different sets of addresses: [0x0080 0000, 0x00E0 0000, 0x00F0 0000] and [0x1180 0000, 0x11E0 0000, 0x11F0 0000]. The second group of addresses is referred to as "shared" or "mirror". What is the difference between them?
4. Would you have any estimate of how much of a performance gain, if any, the approach using part of L1 as SRAM would offer over keeping the whole L1 as cache?
Any advice or help will be appreciated.
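To make question 1 more concrete, this is roughly what I imagine the runtime approach would look like. It is only a sketch: the register addresses and the size-to-mode encoding are my assumptions from reading about the C674x megamodule and need to be checked against the datasheet and the Cache User Guide, the helper names are made up, and any cache write-back or invalidate step the guide may require before resizing is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register addresses; verify against the device datasheet. */
#define L1PCFG_ADDR 0x01840020u
#define L1DCFG_ADDR 0x01840040u

/* Assumed encoding of the 3-bit mode field: 0, 4, 8, 16 or 32 KB of cache.
 * Returns -1 for a size that has no encoding. */
static int l1_mode_from_kb(unsigned cache_kb)
{
    switch (cache_kb) {
    case 0:  return 0;
    case 4:  return 1;
    case 8:  return 2;
    case 16: return 3;
    case 32: return 4;
    default: return -1;
    }
}

/* Set the cache size through the given config register and read it back,
 * so the write has taken effect before execution continues.
 * Returns 0 on success, -1 if the size is not supported. */
static int l1_set_cache_kb(volatile uint32_t *cfg_reg, unsigned cache_kb)
{
    int mode = l1_mode_from_kb(cache_kb);

    assert(cfg_reg != 0);
    if (mode < 0) {
        return -1;
    }
    *cfg_reg = (*cfg_reg & ~0x7u) | (uint32_t)mode;
    (void)*cfg_reg; /* read back */
    return 0;
}
```

On the target the call would be something like `l1_set_cache_kb((volatile uint32_t *)L1PCFG_ADDR, 16)` to keep 16KB of L1P as cache; on a PC the same function can be exercised by passing the address of an ordinary variable.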
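For question 2, the kind of placement I have in mind is sketched below, using the TI compiler's CODE_SECTION and DATA_SECTION pragmas. The section names, the function and the buffer are hypothetical, and the sections would still have to be mapped to the L1P and L1D RAM ranges in the linker command file. The pragmas are guarded so the same file also builds with a host compiler.

```c
#include <assert.h>

#define FILTER_TAPS 4u

/* Hypothetical section names; they must match output sections that the
 * linker command file maps to L1P RAM (code) and L1D RAM (data). */
#ifdef __TI_COMPILER_VERSION__
#pragma CODE_SECTION(critical_filter, ".text:l1p_critical")
#pragma DATA_SECTION(filter_state, ".far:l1d_critical")
#endif

int filter_state[FILTER_TAPS] = {0, 0, 0, 0};
static unsigned filter_idx = 0;

/* Example of a time-critical routine: running sum of the last 4 samples. */
int critical_filter(int sample)
{
    unsigned i;
    int sum = 0;

    assert(filter_idx < FILTER_TAPS);
    filter_state[filter_idx] = sample;
    filter_idx = (filter_idx + 1u) % FILTER_TAPS;

    for (i = 0; i < FILTER_TAPS; ++i) {
        sum += filter_state[i];
    }
    return sum;
}
```

What I do not know is whether tagging sections like this is enough, or whether the code and data also need to be copied into L1 explicitly after boot.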
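Regarding question 3, the only thing I can tell from the numbers themselves is that the two groups differ by a constant offset of 0x1100 0000. The helper below just captures that arithmetic; the name `local_to_global` and the assumption that the offset applies to the whole 0x0080 0000 to 0x00FF FFFF range are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Offset between the two address groups quoted in question 3. */
#define GLOBAL_ALIAS_OFFSET 0x11000000u

/* Translate an address from the first group into the second
 * ("shared" or "mirror") group. Assumes the same offset holds for the
 * whole internal memory range. */
static uint32_t local_to_global(uint32_t local_addr)
{
    assert(local_addr >= 0x00800000u && local_addr < 0x01000000u);
    return local_addr + GLOBAL_ALIAS_OFFSET;
}
```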
Thanks,
David