So, having a mostly-working single-core application, I wanted to start using the same application image across multiple cores on a C6678. This raises the question of how to configure the memory map. I am using the MAD tools to create a single Ethernet boot image. Guided by the MAD documentation, I set this up as an execute-in-place shared program image. I had days of grief with this - tearing hair out in frustration - and have concluded that it's a very bad idea. More on that below.
As far as I can see, the options for an application running on more than one core are as follows:
1. Keep all code and static data in L2SRAM, and use DDR and MSMCSRAM only for heaps. Not very viable for larger applications.
2. Use shared code and keep stack and non-shared data in L2SRAM. Additional data can be allocated using MP-safe heaps. This is what the Image Processing demo's slaves do, and that demo is the most "grown up" multicore example provided in the MCSDK. This is fine as long as your data fits in L2 and/or you're happy with the heap approach. It has the nice feature that you can just load the image on all cores using the CCS ELF loader without doing anything else.
3. Build a separate project for each core, relocating code and data to non-overlapping addresses. This seems like a lot of extra work and pain, though you can then use the MAD prelinker-bypass mode so the MAD step is simpler. It also means you're addressing memory without remapping, which may be simpler to understand.
4. Use shared code, and have private data arranged using XMC - each core sees its own data at the same virtual address, but in different physical RAM. This is the model the MAD tool examples configure for you.
5. Use XMC to make private copies of code and private copies of static data for each core, at the same virtual address. Again, the MAD tools can arrange this for you. (There's a sketch of the XMC idea behind (4) and (5) just after this list.)
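To make the XMC idea concrete, here's a minimal sketch of the kind of MPAX programming involved, written as C that each core would run early on. The window address, size and segment number are placeholders I've made up (they're not from the MAD examples), and the register layout is as described in the C66x CorePac user guide - treat it as an illustration, not a drop-in:

    #include <c6x.h>   /* exposes DNUM and the other C66x control registers */

    #define XMPAX_BASE 0x08000000u  /* XMC MPAX registers (C66x CorePac)       */
    #define PRIV_VIRT  0xA0000000u  /* virtual window every core links against */
    #define PRIV_SEG   3u           /* an MPAX segment I'm assuming is free    */

    void map_private_ddr(void)
    {
        /* Replacement-address field is the 36-bit physical address >> 12.
         * DDR starts at physical 0x8:0000:0000, so 0x800000; each core then
         * gets its own 16MB slice (16MB >> 12 = 0x1000). */
        unsigned int raddr = 0x800000u + (DNUM * 0x1000u);
        volatile unsigned int *xmpaxl =
            (volatile unsigned int *)(XMPAX_BASE + 8u * PRIV_SEG);
        volatile unsigned int *xmpaxh = xmpaxl + 1;

        *xmpaxh = PRIV_VIRT | 0x17u;     /* BADDR | SEGSZ: 2^(0x17+1) = 16MB  */
        *xmpaxl = (raddr << 8) | 0x3Fu;  /* RADDR | SR/SW/SX/UR/UW/UX allowed */
    }

The point is just that every core runs the same image linked against PRIV_VIRT, and the MPAX segment quietly steers each core to its own slice of DDR (a real version would also want to invalidate the XMC prefetch buffer after rewriting a segment).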
I was trying to use (4). The problem I had with both (4) and (5) is that you can't just load the app on every core in CCS "out of the box", because you have to configure the memory mapping first, and in any case loading apps on multiple cores is a bit tedious. So I tried loading the MAD image into RAM and jumping into it, as described in the MCSDK docs.
This worked as far as it went. Unfortunately my program didn't work, because of some bugs in my IPC setup (there turned out to be an interrupt conflict, for starters). This meant I wanted to step through a multi-core program.
Loading symbols "the MAD way" reliably crashed CCS 5.2. (I have a separate forum thread about this, which nobody has bitten at yet).
Nevertheless I managed to use DSS to automate the whole thing: loading the BBLOB image, loading symbols on all cores, jumping into it, and breaking all cores at main().
That's when the weirdness started. Despite being started by a script, runs would behave in non-repeatable ways. CIO would sometimes work and sometimes not. I have stepped through function calls and seen them do nothing, because there was a software breakpoint (SWBP) still lurking there. (You can see it in the disassembly window.) It took me a long time to blame the debugger, but others seem to have similar issues, e.g. in this thread:
http://e2e.ti.com/support/development_tools/code_composer_studio/f/81/p/206790/802889.aspx#802889
Even when I didn't set my own breakpoints, or limited myself to hardware breakpoints, I had oddities that may have been down to the presence of CIO breakpoints.
The other thing I found difficult was that, in the shared-memory model, a software breakpoint should logically break any core that hits it - which mostly, but not entirely, happened. I think some of the problems I had were because each breakpoint actually belongs to a single target [core], but it isn't obvious in the default CCS view which core "owns" the breakpoint. (Once I discovered the "group by debug context" option, things became a lot clearer.)
So in the end I decided I needed option (5), giving each core its own private virtualised copy of the code segment. This works perfectly, without weirdness, and with the great bonus that you can set per-core breakpoints properly (again, I recommend grouping the breakpoint view as above or this is very confusing).
Another bonus is that all the symbols are exactly where they were compiled to be, so you don't need any of the symbol-relocation malarkey you have with the prelinker.
The downside is that, if you launch the program from a MAD image, it's difficult to break in main(). You have to load symbols for code that doesn't exist yet, set a hardware breakpoint on main(), jump into the MAD image, and then probably reload symbols again to get the CIO breakpoints etc. in place once it breaks. It was all getting to be quite hard work.
In the end, I thought of configuring the XMC mapping I want in GEL, so that I can then "just load" the app using the CCS ELF loader on each core. This works brilliantly. However, I don't remember seeing this idea mentioned anywhere, and I wouldn't have known how to do it if I hadn't spent a day stepping through the MAD NML loader, trying to debug my MAD config in excruciating detail. I'm a bit worried I'm hiking off-trail here ("bush-bashing" as we used to say in my youth in Australia).
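For what it's worth, the GEL ends up being just a couple of register writes per core. This is a hedged sketch rather than my exact file - it assumes the same made-up window as the earlier snippet (16MB of private DDR per core at 0xA0000000, via an MPAX segment I'm assuming the boot defaults leave free):

    /* Sketch: give the current core a private 16MB DDR window, so a plain
     * CCS ELF load of the shared image "just works" on every core. */
    hotmenu Map_Private_DDR()
    {
        /* XMPAXH3 = virtual base | SEGSZ (0x17 -> 2^(0x17+1) = 16MB) */
        *(int *)(0x08000000 + 8*3 + 4) = 0xA0000000 | 0x17;

        /* XMPAXL3 = (36-bit physical >> 12) << 8 | full permissions.
         * DDR is physical 0x8:0000:0000 -> 0x800000, plus one 16MB
         * slice per core, indexed by the DNUM register. */
        *(int *)(0x08000000 + 8*3) = ((0x800000 + (DNUM * 0x1000)) << 8) | 0x3F;
    }

Hook it into OnTargetConnect() in the target configuration's GEL file (or run it from the Scripts menu on each core before loading), and after that the ordinary CCS ELF load behaves as if each core had its own private memory.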
What do other people do? Are people using option (2) or (3), or something I haven't thought of? Do Real Men not need to step through in a debugger? Is my XMC trick so obvious that you're all rolling your eyes?
Regards,
Gordon