Ways of doing fast memory transfer

Hi,

I am working on a PADK board. We take audio input, apply some processing (AM modulation) to it, and send the output to the McASP output port. Everything works with 1 channel (1 channel = 1 left and 1 right input), but problems appear as soon as I switch to 4 channels. The current program flow is:

#define FRAME_SIZE 80

#define DECIMATE_FACTOR 4

int dmaxDacBuffer[PINGPONG][STEREO][NUM_CHANNEL][DECIMATE_FACTOR*FRAME_SIZE];
int dmaxAdcBuffer[PINGPONG][STEREO][NUM_CHANNEL][FRAME_SIZE];
float processBufInL[FRAME_SIZE];
float processBufInR[FRAME_SIZE];
float processBufOutL[DECIMATE_FACTOR*FRAME_SIZE];
float processBufOutR[DECIMATE_FACTOR*FRAME_SIZE]; 

// CH_0
for (i=0; i<FRAME_SIZE; i++)
{
    processBufInL[i] = (float)(dmaxAdcBuffer[!ppAdc][LEFT][CH_0][i] >> 16) / 0x8000;
    processBufInR[i] = (float)(dmaxAdcBuffer[!ppAdc][RIGHT][CH_0][i] >> 16) / 0x8000;
}
Process();
DSPF_fltoq15(processBufOutL,tempBufL,DECIMATE_FACTOR*FRAME_SIZE);
DSPF_fltoq15(processBufOutR,tempBufR,DECIMATE_FACTOR*FRAME_SIZE);
for (i=0; i<DECIMATE_FACTOR*FRAME_SIZE; i++)
{
    dmaxDacBuffer[!ppDac][LEFT][CH_0][i] = tempBufL[i] << 16;
    dmaxDacBuffer[!ppDac][RIGHT][CH_0][i] = tempBufR[i] << 16;
}

// CH_1
for (i=0; i<FRAME_SIZE; i++)
{
    processBufInL[i] = (float)(dmaxAdcBuffer[!ppAdc][LEFT][CH_1][i] >> 16) / 0x8000;
    processBufInR[i] = (float)(dmaxAdcBuffer[!ppAdc][RIGHT][CH_1][i] >> 16) / 0x8000;
}
Process();
DSPF_fltoq15(processBufOutL,tempBufL,DECIMATE_FACTOR*FRAME_SIZE);
DSPF_fltoq15(processBufOutR,tempBufR,DECIMATE_FACTOR*FRAME_SIZE);
for (i=0; i<DECIMATE_FACTOR*FRAME_SIZE; i++)
{
    dmaxDacBuffer[!ppDac][LEFT][CH_1][i] = tempBufL[i] << 16;
    dmaxDacBuffer[!ppDac][RIGHT][CH_1][i] = tempBufR[i] << 16;
}

// CH_2
for (i=0; i<FRAME_SIZE; i++)
{
    processBufInL[i] = (float)(dmaxAdcBuffer[!ppAdc][LEFT][CH_2][i] >> 16) / 0x8000;
    processBufInR[i] = (float)(dmaxAdcBuffer[!ppAdc][RIGHT][CH_2][i] >> 16) / 0x8000;
}
Process();
DSPF_fltoq15(processBufOutL,tempBufL,DECIMATE_FACTOR*FRAME_SIZE);
DSPF_fltoq15(processBufOutR,tempBufR,DECIMATE_FACTOR*FRAME_SIZE);
for (i=0; i<DECIMATE_FACTOR*FRAME_SIZE; i++)
{
    dmaxDacBuffer[!ppDac][LEFT][CH_2][i] = tempBufL[i] << 16;
    dmaxDacBuffer[!ppDac][RIGHT][CH_2][i] = tempBufR[i] << 16;
}

// CH_3
for (i=0; i<FRAME_SIZE; i++)
{
    processBufInL[i] = (float)(dmaxAdcBuffer[!ppAdc][LEFT][CH_3][i] >> 16) / 0x8000;
    processBufInR[i] = (float)(dmaxAdcBuffer[!ppAdc][RIGHT][CH_3][i] >> 16) / 0x8000;
}
Process();
DSPF_fltoq15(processBufOutL,tempBufL,DECIMATE_FACTOR*FRAME_SIZE);
DSPF_fltoq15(processBufOutR,tempBufR,DECIMATE_FACTOR*FRAME_SIZE);
for (i=0; i<DECIMATE_FACTOR*FRAME_SIZE; i++)
{
    dmaxDacBuffer[!ppDac][LEFT][CH_3][i] = tempBufL[i] << 16;
    dmaxDacBuffer[!ppDac][RIGHT][CH_3][i] = tempBufR[i] << 16;
}
  

As the code shows, all 4 channels share a single set of processing buffers. Can anyone suggest a way to speed this up (perhaps by adding more buffers, using DMA, or something similar)? Since I am new to this platform, any suggestions would be very helpful.
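
For reference, the four per-channel blocks in the flow above differ only in the channel index, so the same flow can be written as one loop over channels. Below is a minimal sketch of that, not the actual project code. It makes several assumptions: the values of PINGPONG, STEREO, NUM_CHANNEL, LEFT and RIGHT are guesses, Process() is a hypothetical stand-in that just repeats each input sample DECIMATE_FACTOR times, and floatToDacWord() is a portable stand-in for the DSPF_fltoq15 call plus the << 16 shift (its rounding may differ from the library routine). Merging the conversion and the shift into one pass would also remove the intermediate tempBufL/tempBufR copy.

```c
/* Assumed values: the post does not show these definitions. */
#define PINGPONG        2
#define STEREO          2
#define NUM_CHANNEL     4
#define LEFT            0
#define RIGHT           1
#define FRAME_SIZE      80
#define DECIMATE_FACTOR 4
#define OUT_SIZE        (DECIMATE_FACTOR * FRAME_SIZE)

int dmaxDacBuffer[PINGPONG][STEREO][NUM_CHANNEL][OUT_SIZE];
int dmaxAdcBuffer[PINGPONG][STEREO][NUM_CHANNEL][FRAME_SIZE];
float processBufInL[FRAME_SIZE];
float processBufInR[FRAME_SIZE];
float processBufOutL[OUT_SIZE];
float processBufOutR[OUT_SIZE];

/* Hypothetical stand-in for the real Process(): it only repeats each
   input sample DECIMATE_FACTOR times so the sketch can run on its own. */
static void Process(void)
{
    int i;
    for (i = 0; i < OUT_SIZE; i++) {
        processBufOutL[i] = processBufInL[i / DECIMATE_FACTOR];
        processBufOutR[i] = processBufInR[i / DECIMATE_FACTOR];
    }
}

/* Stand-in for DSPF_fltoq15 followed by << 16: scale to Q15, saturate,
   then place the 16-bit value in the upper half of a 32-bit word.
   Assumes int is 32 bits wide. */
static int floatToDacWord(float x)
{
    float scaled = x * 32768.0f;
    if (scaled > 32767.0f)  scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    return (int)scaled * 65536;
}

/* Process every channel of one frame. adcIdx and dacIdx are the
   ping-pong halves to read from and write to. */
void processAllChannels(int adcIdx, int dacIdx)
{
    const float scale = 1.0f / 32768.0f;  /* same as dividing by 0x8000 */
    int ch, i;

    for (ch = 0; ch < NUM_CHANNEL; ch++) {
        /* Unpack the upper 16 bits of each ADC word into floats. */
        for (i = 0; i < FRAME_SIZE; i++) {
            processBufInL[i] = (float)(dmaxAdcBuffer[adcIdx][LEFT][ch][i] >> 16) * scale;
            processBufInR[i] = (float)(dmaxAdcBuffer[adcIdx][RIGHT][ch][i] >> 16) * scale;
        }

        Process();

        /* Convert and pack in a single pass, without a temporary buffer. */
        for (i = 0; i < OUT_SIZE; i++) {
            dmaxDacBuffer[dacIdx][LEFT][ch][i]  = floatToDacWord(processBufOutL[i]);
            dmaxDacBuffer[dacIdx][RIGHT][ch][i] = floatToDacWord(processBufOutR[i]);
        }
    }
}
```

With the variables from my code this would be called as processAllChannels(!ppAdc, !ppDac). I do not know whether this alone is enough for 4 channels, which is why I am asking about DMA and extra buffers.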