Hello, everyone.
I've been developing a FIR filter on the TMS320C6670 DSP for a while, and I need to increase the processing frequency as much as possible. To do so, I use DSPLIB for the filtering (it has good benchmarks) together with SYS/BIOS. My idea is to split the input data buffer into 4 segments and process each one on a different core. To avoid copying the data locally, the input and output buffers are placed in shared memory, which I allocate dynamically at "initialization" using the HeapBufMP module. I then use a MessageQ to send the shared pointers from the "master" Core 0 to the other cores.
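A minimal sketch of the partitioning described above, in plain C that also builds on a host. It is only an illustration, not the attached code: `Segment`, `split_block` and `NUM_CORES` are hypothetical names, and it assumes the filter routine reads `nh - 1` input samples beyond the number of outputs it produces, so neighbouring segments overlap on the input side.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CORES 4  /* the four cores mentioned above */

/* Hypothetical descriptor of the slice handled by one core. */
typedef struct {
    size_t in_offset;   /* index of the first input sample read   */
    size_t in_count;    /* input samples read: out_count + nh - 1 */
    size_t out_offset;  /* index of the first output sample       */
    size_t out_count;   /* output samples produced by this core   */
} Segment;

/* Split nr output samples of an nh-tap FIR across NUM_CORES cores.
 * Returns 0 on success, -1 if the block cannot be split. */
int split_block(size_t nr, size_t nh, Segment seg[NUM_CORES])
{
    size_t base, extra, start = 0, c;

    assert(seg != NULL);
    if (nh == 0 || nr < NUM_CORES)
        return -1;

    base  = nr / NUM_CORES;   /* outputs every core gets           */
    extra = nr % NUM_CORES;   /* leftovers go to the first cores   */

    for (c = 0; c < NUM_CORES; c++) {
        size_t n = base + (c < extra ? 1 : 0);
        seg[c].out_offset = start;
        seg[c].out_count  = n;
        seg[c].in_offset  = start;       /* same index as the outputs   */
        seg[c].in_count   = n + nh - 1;  /* overlap into next segment   */
        start += n;
    }
    return 0;
}
```

Each core would then be handed the shared input pointer plus `in_offset` and the shared output pointer plus `out_offset`, so no data has to be copied between cores.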
The problem occurs when the main loop begins. Previously, I tried to synchronize the four cores by having each core send a MessageQ message back to Core 0 when it finished filtering a block; the master core then sent a new message with the new pointer to control the flow (the input buffers are "ping-ponged"). However, that introduced very high latency (a round trip of messages, not counting the time needed to update the input buffers), 4 or 5 times longer than the filtering itself.
To fix that, I now send just one message at the beginning, containing the shared location of a variable (also allocated with HeapBufMP) called dataRep. This struct has an array named coreStates in which each core writes a flag, in its own position only, to indicate whether it is ready to process data; before processing, a core polls all the flags until every other core is ready. This should work and remove the latency, but both in simulation and when running on the DSP, the program flow stops at the flag polling: the contents of the supposedly shared variable are different on each core, so the cores cannot sync.
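One possible reading of this flag scheme, sketched in plain C. Only the names dataRep and coreStates come from the description above; the `DataRep` layout, the helper functions and `NUM_CORES` are hypothetical, and the attached code may differ.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CORES 4

/* Hypothetical layout of the shared dataRep structure. */
typedef struct {
    /* Slot c is written only by core c and read by every core. */
    volatile uint32_t coreStates[NUM_CORES];
} DataRep;

/* Core `core` publishes whether it is ready (1) or not (0). */
void set_state(DataRep *rep, unsigned core, uint32_t ready)
{
    assert(core < NUM_CORES);
    rep->coreStates[core] = ready;
}

/* Returns 1 when every core has set its flag, 0 otherwise. */
int all_ready(const DataRep *rep)
{
    unsigned c;
    for (c = 0; c < NUM_CORES; c++) {
        if (rep->coreStates[c] == 0)
            return 0;
    }
    return 1;
}

/* Busy-wait until all flags are set (the polling step). */
void wait_all_ready(const DataRep *rep)
{
    while (!all_ready(rep)) {
        /* spin */
    }
}
```

This host-side sketch does not model the caches of the real device: `volatile` only forces the compiler to re-read the flags, it does not guarantee that one core observes what another core has written.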
I don't know the reason for this, but I guess it could be some kind of caching performed by the BIOS, because the error does not seem to happen with the processing data, which are longer buffers.
My questions, then, are:
a) Is there a way to prevent SYS/BIOS from caching the contents of a specific variable or memory section?
b) Is the synchronization method I'm using valid?
c) Is there a way to send messages using SYS/BIOS APIs with lower overhead?
Thanks for any help.
-- Adrian
Note: I attach my main .c file and the RTSC .cfg file.