We are using dsp_fir_gen from DSPLIB and found that it requires the output buffer to be word aligned, so we use a pragma to achieve that. A second restriction is that the output length has to be a multiple of 4. Since that is not always guaranteed, we filter the first few samples with generic C code until the number of remaining samples is a multiple of 4, and then filter those with dsp_fir_gen. For example, if the output length is 23, we filter the first 3 samples with a C loop, then the final 20 samples with dsp_fir_gen. In this case, the output pointer passed to dsp_fir_gen is the buffer base (word aligned) plus 3 (not word aligned). My idea is to have the user pass in a buffer that is one sample longer (24 samples in this example), with the first slot left unused, so that y[3] is word aligned:
     Non-Aligned         Aligned
     +-------+           +-------+
   0 | y[0]  |         0 |   -   |
     +-------+           +-------+
   1 | y[1]  |         1 | y[0]  |
     +-------+           +-------+
   2 | y[2]  |         2 | y[1]  |
     +-------+           +-------+
   3 | y[3]  |         3 | y[2]  |
     +-------+           +-------+
   4 | y[4]  |         4 | y[3]  |
     +-------+           +-------+
         .                   .
         .                   .
         .                   .
     +-------+           +-------+
  21 | y[21] |        21 | y[20] |
     +-------+           +-------+
  22 | y[22] |        22 | y[21] |
     +-------+           +-------+
                      23 | y[22] |
                         +-------+
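In code, the padding we would ask the user to apply works out roughly as follows. This is only a sketch: it assumes 16-bit samples, a 4-byte word, and a block size of 4 output samples, and the helper names head_count, pad_count, and is_word_aligned are made up for illustration; they are not part of DSPLIB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch (not taken from the DSPLIB documentation):
 * samples are 16-bit, a "word" is 4 bytes, and the library routine
 * needs the output length to be a multiple of 4. */
enum { BLOCK = 4, WORD_BYTES = 4 };

/* Number of leading output samples handled by the generic C loop. */
static size_t head_count(size_t n_out)
{
    return n_out % BLOCK;
}

/* Number of unused samples to leave at the start of a word-aligned
 * buffer so that y[head_count(n_out)] lands on a word boundary. */
static size_t pad_count(size_t n_out)
{
    size_t samples_per_word = WORD_BYTES / sizeof(int16_t);
    size_t head = head_count(n_out);

    assert(samples_per_word > 0);
    return (samples_per_word - head % samples_per_word) % samples_per_word;
}

/* Returns nonzero if p sits on a word boundary. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p % WORD_BYTES) == 0;
}
```

For an output length of 23 this gives 3 head samples and 1 pad sample, so the user allocates 24 samples and y[0] starts at index 1, matching the "Aligned" column above.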
Before we update our user documentation with instructions on how to manually align the key sample, I want to confirm with you that the restriction is valid. The reason I ask is that our unit test passes even with the non-aligned memory arrangement! Did we just get "lucky"?