OMAP3530EVM DSPLINK FFT/IFFT Problem

Hello, 

I am using the OMAP3530EVM by Mistral and I am familiarizing myself with DSPLINK and DSPLIB. 

I took the readwrite example and modified it so that the ARM captures an input buffer from the analog audio input, passes it to the DSP, reads it back unaltered, and sends it to the analog audio output. The ARM processor uses ALSA drivers for the audio I/O.

The readwrite example works fine when the DSP does nothing to the received buffer.

Then I tried to do an FFT immediately followed by an IFFT on the DSP side, just to check that everything compiles and runs correctly. This is where the problems begin.

On the DSP side, the only modifications are in tskReadwrite.c.

I have defined the necessary buffers and double-word aligned them as follows:

#define TIMEN 2048u
#define FREQN 4096u

#pragma DATA_ALIGN(forward_twiddle_factors,8);
short forward_twiddle_factors[FREQN];

#pragma DATA_ALIGN(reverse_twiddle_factors,8);
short reverse_twiddle_factors[FREQN];

#pragma DATA_ALIGN(fftBuf,8);
short fftBuf[FREQN];

#pragma DATA_ALIGN(cmplxBuf,8);
short cmplxBuf[FREQN];

#pragma DATA_ALIGN(invBuf,8);
short invBuf[FREQN];

Then, in the TSR_Execute function, I added the following code between the HAL_cacheInv and HAL_cacheWbInv calls that handle the GPP/DSP transfers:

HAL_cacheInv ((Ptr) readBuf, size) ;

for (j = 0; j < FREQN/2; j++) {
    cmplxBuf[2*j] = readBuf[j];
    cmplxBuf[2*j+1] = 0;
}

DSP_fft16x16 (forward_twiddle_factors, FREQN, cmplxBuf, fftBuf);
DSP_ifft16x16(reverse_twiddle_factors, FREQN, fftBuf, cmplxBuf);

for (j = 0; j < FREQN/2; j++) {
    writeBuf[j] = cmplxBuf[2*j];
}

HAL_cacheWbInv ((Ptr)(msg->dspWriteAddr), size) ;

I have used different twiddle factors for the FFT and the IFFT. As you can see, I am constructing a "fake" complex input for the FFT by setting all odd samples of cmplxBuf to zero and all even ones to the values of my input buffer. The problem is that after the IFFT, cmplxBuf contains mostly zeros. When I fetch it back on the GPP and send it to the audio output, there is almost no audio except some strange noise.

I also tried using a different array for the output of the IFFT, as follows:

DSP_fft16x16 (forward_twiddle_factors, FREQN, cmplxBuf, fftBuf);
DSP_ifft16x16(reverse_twiddle_factors, FREQN, fftBuf, invBuf);

for (j = 0; j < FREQN/2; j++) {
    writeBuf[j] = invBuf[2*j];
}

This time I get an error, the whole board halts, and I have to reset it. The error is:

DSP MMU Error Fault! MMU_IRQSTATUS=[0x1]. Virtual DSP addr reference that generated the interrupt = [0x93f398e0].

I suppose I should mention that I ran readwrite on the address 2280855450, which in my configuration is inside the POOLMEMORYADDR.

What am I doing wrong? I have searched through the forums extensively and I have seen that there are some scaling issues that may distort audio after FFT/IFFT. However, my problem is that there is no audio at all after the IFFT.

I would appreciate some support. I am also attaching the tskReadwrite.c file (2146.tskReadwrite.txt).

  • Hi Elias,

    I looked at your example, and I think there are several possibilities:

    1. readBuf may not be large enough to hold FREQN/2 short values. So when you copied readBuf into cmplxBuf, you may not end up with valid data in cmplxBuf. This would give you invalid output.

    2. writeBuf may not be large enough to hold FREQN/2 short values. So you may be running over the end of the buffer when copying data from cmplxBuf to writeBuf. Also the call to HAL_cacheWbInv is writing back 'size' bytes. So if size is < FREQN/2, only part of the array will make it into external memory. And when you play the audio on the ARM side, you only hear a partial stream.

    3. The FFT/IFFT functions are doing the wrong thing; perhaps something needs to be done differently from the DSPLIB perspective.

    I am not a DSPLIB expert, so what I want to do first is to make sure you are not running into problems due to #1 and #2 and that you are starting with the correct data in cmplxBuf and that the resulting output from ifft16x16 corresponds to what you are hearing on the ARM side.

    First, verify that the buffer size you pass in on the command line when invoking readwrite is at least as large as FREQN. This should ensure that a correct size is used for readBuf and writeBuf. Next, it'd be good to double-check that the data you are getting in readBuf on the DSP is the same as what was on the ARM and that it is correctly transferred to cmplxBuf. If you have CCS, you can connect to the DSP using JTAG by following these instructions:

    http://processors.wiki.ti.com/index.php/Debugging_the_DSP_side_of_a_DSPLink_application_on_OMAP_using_CCS

    It talks about how to connect to the DSP side of the message sample, but the same procedure applies to readwrite. You can use the memory view in CCS to inspect contents in readBuf. You can also inspect the data in the writeBuf to make sure it is the same data you see on the ARM side. And if you wish to chase the other use case, you can step through the code and find out on which line the MMU fault occurs.

    If you do not have CCS, you can try adding up all values in the input and output arrays to produce two checksums on the DSP and send them back to the ARM side at the end of writeBuf (make sure the buffer size is made larger). Or better yet, use the scalingFactor field in the status message to exchange checksum data, since you are not using it. Then compare these checksums on the ARM as a quick and dirty way to ensure data is indeed exchanged correctly between the ARM and the DSP.
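A minimal sketch of the checksum idea, assuming 16-bit samples and a plain 32-bit additive sum. The function name `checksum16` is made up for illustration and is not part of DSPLINK. Note that a sum only covers the elements that both sides add up, so it cannot reveal a transfer that was shorter than intended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a DSPLINK API): additive checksum over
 * `count` 16-bit samples. Each sample is reinterpreted as unsigned so
 * that the wrap-around behaviour is well defined on both processors. */
static uint32_t checksum16(const int16_t *buf, size_t count)
{
    uint32_t sum = 0;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < count; i++) {
        sum += (uint16_t)buf[i];
    }
    return sum;
}
```

Running the same routine over the same number of samples on the ARM and on the DSP, and comparing the two results, gives a quick consistency check for each buffer.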

    Let us know how it goes. Once you have ensured that the DSP and the ARM are indeed looking at the same input and output data I can get someone more familiar with DSPLIB to take a look.

    Best regards,

    Vincent

  • Dear Vincent,

    Thank you for your answer.

    I have used the checksum method to verify the buffers that I read/write between the ARM and the DSP. 

    I should note that size=FREQN/2, hence all the buffers are of sufficient size.

    The readBuf size is 2048, and the cmplxBuf size is 4096 because I want to create a "complex" buffer as input to DSP_fft16x16. The writeBuf size is 2048 because I take only the "real" part of the DSP_ifft16x16 output.

    So, when I omit the FFT step the checksums of readBuf, cmplxBuf and writeBuf are exactly the same and I hear the audio correctly, as I said in my previous post.

    For the FFT step, I noticed an error in my code. I was doing a 4096-point FFT instead of the proper 2048-point one, and I corrected this.
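A minimal sketch of the size relationship behind this correction, assuming (as I understand the DSPLIB convention) that the length argument of DSP_fft16x16 counts complex points, so that an N-point transform works on 2*N interleaved shorts. The helper names and the `FFT_POINTS` and `CMPLX_LEN` macros are hypothetical and only illustrate the bookkeeping; no DSPLIB function is called.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the FFT length counts complex points, so a buffer of
 * interleaved re/im shorts needs 2 * npoints elements. */
#define FFT_POINTS 2048u
#define CMPLX_LEN  (2u * FFT_POINTS)

/* Hypothetical helper: copy npoints real samples into an interleaved
 * complex buffer (re, im, re, im, ...) with zero imaginary parts. */
static void pack_real_as_complex(const short *in, short *out,
                                 size_t npoints, size_t out_len)
{
    size_t j;

    /* Guards against mixing up point count and buffer length. */
    assert(out_len >= 2u * npoints);
    for (j = 0; j < npoints; j++) {
        out[2u * j]      = in[j];
        out[2u * j + 1u] = 0;
    }
}

/* Hypothetical helper: keep only the real parts of an interleaved buffer. */
static void unpack_real_part(const short *in, short *out, size_t npoints)
{
    size_t j;

    for (j = 0; j < npoints; j++) {
        out[j] = in[2u * j];
    }
}
```

With these names, a 4096-short buffer such as cmplxBuf corresponds to FFT_POINTS = 2048 complex points, which is the value that would be passed as the transform length.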

    Now, when I include the FFT step, readBuf and cmplxBuf have exactly the same checksums before DSP_fft16x16, which means I send the correct values to the FFT. The checksum of the "real part" (even indices) of cmplxBuf after DSP_ifft16x16 is exactly the same as that of writeBuf, which means I return the output of the IFFT correctly to the ARM side. However, the audio is pure noise.

    I suppose it has something to do with the DSPLIB FFT routines. I have read in the forums that I should use scaling to prevent overflow. I have tried scaling (dividing) the input to DSP_fft16x16 by 2^log2(2048) = 2048 (that is, the FFT size), and all I get at the output is pure silence. I have also tried scaling (multiplying) the DSP_ifft16x16 output by 2048, and I get noise.

    Am I missing something really obvious here? 

  • Hi Elias,

    Just be aware that functions such as HAL_cacheInv ((Ptr) readBuf, size) and HAL_cacheWbInv ((Ptr)(msg->dspWriteAddr), size) assume that size is the number of 8-bit bytes, whereas you have made readBuf and writeBuf arrays of 'short', which I believe is 16 bits wide on the C6000. So you need to take that into account when specifying the size on the command line. If you use a size of 2048 bytes thinking that it would hold 2048 shorts, the arrays might be too small.
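A minimal sketch of the bytes-versus-samples conversion, assuming 16-bit samples (int16_t stands in for 'short' here). The helper names are hypothetical and are not part of DSPLINK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte count needed for nsamples 16-bit samples. */
static size_t samples_to_bytes(size_t nsamples)
{
    return nsamples * sizeof(int16_t);
}

/* Hypothetical helper: number of whole 16-bit samples in nbytes. */
static size_t bytes_to_samples(size_t nbytes)
{
    /* A byte count that is not a multiple of the sample size would
     * split a sample in half, so treat it as a programming error. */
    assert(nbytes % sizeof(int16_t) == 0);
    return nbytes / sizeof(int16_t);
}
```

Under this assumption, 2048 shorts need a size of 4096 bytes, and a size of 2048 bytes covers only 1024 shorts.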

    If you are indeed using a large enough size and you are still seeing matching checksums on both the ARM code and the DSP code (please confirm), then I think the problem may be with your usage of dsplib. I will try to contact someone who can help.

    Best regards,

    Vincent

  • Hello Vincent,

    Thanks again for your reply.

    I see your point. However, when I don't perform any operation on the data, I receive the correct data and hear the correct audio, so I don't believe it is an issue with the array size. In any case, if you could point me to some detailed documentation on the HAL_ functions, I would take a look.

    I don't think it is an issue of buffer sizes. If you could provide some more help on the FFT functions of DSPLIB I would be thankful.

  • Elias, have you referred to the FFT example that is part of the DSPLIB package?

    C:\Program Files\Texas Instruments\dsplib_c64Px_3_x_y_z\examples

    The example demonstrates correct usage of the FFT function. I recommend trying it with an artificial input (a sine wave) first to ensure the function is working correctly.

    Cheers,
    Gagan 

  • Dear Vincent and Gagan,

    I am posting an update on my situation.

    First, it seems Vincent was right after all about the read/write process losing some samples. The buffers were indeed large enough, but PROC_read, PROC_write, HAL_cacheInv and HAL_cacheWbInv needed to read "more" samples in order to cover the whole buffer. This didn't show up in the checksum debugging.

    So I am now correctly sending data back and forth between the ARM and the DSP.

    For the FFT, I looked into the example that came with DSPLIB. I should note that the gen_twiddle_fft16x16 function in this example is different from the one included in the DSP_fft16x16 source code folder.

    There was also a problem with the IFFT twiddle factors; I now use the gen_twiddle_ifft16x16_sa function to get correct results.

    The only problem now is scaling. I am using an analog audio input of a sine wave with a frequency which is an integer multiple of my block size (2048). I pass this to the DSP, scale (divide) by 2048, do the FFT/IFFT, scale back (multiply) by 2048 and get the output. The output is a sine of the same frequency with a lot of quantization noise. As I understand it, scaling by 2048 is the maximum needed to prevent any overflow. This means that my data are effectively 5-bit after the scaling. How can I make better use of the dynamic range?
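A minimal sketch of why this scaling costs so much resolution, using plain integer arithmetic only (no DSPLIB calls). Dividing a 16-bit sample by 2048 = 2^11 discards its 11 least significant bits, which leaves 16 - 11 = 5 bits. The function name `scale_round_trip` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: divide a 16-bit sample by `factor`, then
 * multiply back. Whatever the integer division discarded is lost,
 * which is the quantization error described above. */
static int16_t scale_round_trip(int16_t x, int16_t factor)
{
    int16_t scaled;

    assert(factor > 0);
    scaled = (int16_t)(x / factor);    /* truncates toward zero */
    return (int16_t)(scaled * factor); /* only multiples of factor survive */
}
```

For example, with a factor of 2048 the sample 12345 comes back as 12288, and any sample with a magnitude below 2048 comes back as 0.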

    Also, I would be grateful if you could point me to some detailed documentation about PROC_read, PROC_write, HAL_cacheInv and HAL_cacheWbInv.

    Best regards,

    Elias

  • Hi Elias,

    Regarding the documentation of the DSPLINK API,

    PROC_read, PROC_write: Open dsplink_1_xx_xx_xx/dsplink/doc/source_documentation/html/index.html. On the left-hand side, click on 'File List' and look for 'proc.h' in the drop-down list. Click on it and the API reference for the PROC module will show up.

    HAL_cacheInv, HAL_cacheWbInv: You can check out the header file dsplink_1_xx_xx_xx/dsplink/dsp/src/base/hal/hal_cache.h. These are convenience functions that call the OS-dependent versions of the functions that perform cache invalidation and writeback.

    Best regards,

    Vincent